Planet Primates

May 02, 2016


On classes $AWPP$ and $APP$

(1) $PP$ contains problems like deciding whether $\mathrm{perm}(M)>k$. So what problems do $AWPP$ and $APP$ contain with respect to the permanent?

(2) Since it is not known whether $NP$ is in $AWPP$ or $APP$, is there a candidate problem in $NP$ conjectured not to be in $AWPP$, other than the $NP$-complete problems?

by Turbo at May 02, 2016 09:11 PM


Daycount Actual/Actual AFB example

This question is about the following example in Wikipedia about time factor using the Actual/Actual AFB daycount.

Assume that $t_1=\text{28 Feb 2004}$ and $t_2=\text{29 Feb 2008}$.

There are $4$ years in between. According to the rule, we define $t_3=\text{29 Feb 2004}$ and add the number of days between $t_1$ and $t_3$ divided by either $365$ or $366$ depending on whether there is a $\text{29Feb}$ between ...

Here is one of the points on which I am not clear: should it be between $t_1$ and $t_2$, or from $t_1$ to $t_3$?

The other part I am not clear on is whether the interval should be considered open at the right endpoint. That is, should the $\text{29 Feb}$ be sought in $[t_1,t_2)$ or in $[t_1,t_2]$, for the case in which $t_2$ is the relevant endpoint?

In Wikipedia, the example currently obtains the same daycount: $4 + \frac{1}{366}$ regardless of whether the ISDA convention is applied or not.
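If it helps, the arithmetic of this particular example can be checked with Python's datetime. This is a sketch of this one case only, not a general Actual/Actual AFB implementation; the denominator 366 reflects the 29 Feb falling inside the stub period, as described in the rule quoted above.

```python
from datetime import date

# Four whole years are counted back from t2 = 29 Feb 2008 to t3 = 29 Feb 2004;
# the stub then runs from t1 = 28 Feb 2004 to t3.
t1 = date(2004, 2, 28)
t3 = date(2004, 2, 29)

stub_days = (t3 - t1).days            # 1 day
year_fraction = 4 + stub_days / 366   # 366 because a 29 Feb lies in the stub

print(year_fraction)  # 4 + 1/366
```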

by myfirsttime1 at May 02, 2016 09:09 PM


Path finding trivial problems

Are there any trivial pathfinding problems like the TSP?
I know that it is possible to solve the TSP using A*, best-first search, spanning trees and other heuristics, but are there other such problems well known to the AI community?

by Matheus Silva at May 02, 2016 09:05 PM

How to calculate Huffman code length without generating the actual huffman code?

Given a vector of elements, how can I calculate the length of the Huffman codewords without generating the Huffman code itself?

Using Matlab, I was able to compute the Huffman code and get the lengths of the codewords, but is it possible to get the lengths alone without computing the codewords?
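For what it's worth, the lengths can be computed from the symbol frequencies alone: run the usual greedy Huffman merging on a heap, but track only each symbol's depth instead of building codewords. A Python sketch (the function name is my own):

```python
import heapq
import itertools

def huffman_code_lengths(freqs):
    """Codeword length per symbol, without constructing the codewords."""
    tiebreak = itertools.count()
    # heap entries: (subtree weight, tiebreak, symbols in that subtree)
    heap = [(w, next(tiebreak), [s]) for s, w in freqs.items()]
    heapq.heapify(heap)
    depth = dict.fromkeys(freqs, 0)
    while len(heap) > 1:
        w1, _, syms1 = heapq.heappop(heap)
        w2, _, syms2 = heapq.heappop(heap)
        for s in syms1 + syms2:   # merging pushes every symbol down one level
            depth[s] += 1
        heapq.heappush(heap, (w1 + w2, next(tiebreak), syms1 + syms2))
    return depth

# well-known textbook example: a:45 b:13 c:12 d:16 e:9 f:5
lengths = huffman_code_lengths({'a': 45, 'b': 13, 'c': 12, 'd': 16, 'e': 9, 'f': 5})
```

The depth of a symbol in the final tree equals its codeword length, so counting how many merges each symbol participates in is enough.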

by Jay Mar at May 02, 2016 09:02 PM


Erratic behavior of train_test_split() in scikit-learn

Python 3.5 (anaconda install) SciKit 0.17.1

I just can't understand why train_test_split() has been giving me what I consider unreliable splits of a list of training cases.

Here's an example. My list trnImgPaths has 3 classes, each one with 67 images (total 201 images):

   ... thru ...
   ... thru ...
   ... thru ...

My list of targets, trnImgTargets, matches this perfectly in length, and the classes themselves align perfectly with trnImgPaths.

In[148]: len(trnImgPaths)
Out[148]: 201
In[149]: len(trnImgTargets)
Out[149]: 201

If I run:

[trnImgs, testImgs, trnTargets, testTargets] = \
    train_test_split(trnImgPaths, trnImgTargets, test_size=141, train_size=60, random_state=42)


[trnImgs, testImgs, trnTargets, testTargets] = \
    train_test_split(trnImgPaths, trnImgTargets, test_size=0.7, train_size=0.3, random_state=42)


[trnImgs, testImgs, trnTargets, testTargets] = \
    train_test_split(trnImgPaths, trnImgTargets, test_size=0.7, train_size=0.3)

In all three cases I end up getting:

In[150]: len(trnImgs)
Out[150]: 60
In[151]: len(testImgs)
Out[151]: 141
In[152]: len(trnTargets)
Out[152]: 60
In[153]: len(testTargets)
Out[153]: 141

I never get a perfect split of 20 - 20 - 20 for the training set. I can tell both by manual checking and by a sanity check with a confusion matrix. Here are the results for each experiment above, respectively:

[[19  0  0]
 [ 0 21  0]
 [ 0  0 20]]

[[19  0  0]
 [ 0 21  0]
 [ 0  0 20]]

[[16  0  0]
 [ 0 22  0]
 [ 0  0 22]]

I expected the split to be perfectly balanced. Any thoughts why this is happening?

It even appears it may be misclassifying a few cases a priori, because there should never be n=22 training cases for a given class.
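For reference, as far as I understand, plain train_test_split only shuffles and slices; exact per-class balance requires the stratify argument (available since scikit-learn 0.17, where the function lives in sklearn.cross_validation; from 0.18 it is in sklearn.model_selection). A sketch against synthetic labels mirroring the 3 x 67 setup:

```python
from collections import Counter
from sklearn.model_selection import train_test_split  # sklearn.cross_validation in 0.17

X = list(range(201))
y = [0] * 67 + [1] * 67 + [2] * 67   # 3 classes, 67 samples each

# Without stratify=y, the 60 training samples are drawn purely at random,
# so per-class counts only hover around 20; with it, they are exact.
trnX, tstX, trnY, tstY = train_test_split(
    X, y, train_size=60, test_size=141, stratify=y, random_state=42)

print(Counter(trnY))   # exactly 20 of each class
```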

by pepe at May 02, 2016 09:02 PM



On power of $P/poly$

(1) We know that $EXP ⊄ P/poly ⇒ BPP$ is in $SUBEXP$. Does $SUBEXP ⊄ P/poly$ mean $P=BPP$ or anything close?

(2) We know that if $NP$ is in $P/poly$ then $PH$ collapses to second level. What is the consequence if $\oplus P$ is in $P/poly$?

by Turbo at May 02, 2016 09:00 PM


Deep Neural Network

Every deep network is typically trained to carry out a particular task (sentence prediction, image classification or speech recognition, for example). However, a single human brain network is capable of performing all these tasks simultaneously. Describe how you would go about training a “Jack-of-all-trades” network that can do many tasks reasonably well. What kind of issues do you expect while training such a model? Describe what regularization methods, weight decay methods, and loss functions you would consider in your approach.

by rohan at May 02, 2016 09:00 PM


decision tree for significant variables

How can I use a decision tree graph to determine the significant variables? I know that the variable with the largest information gain (that is, the smallest entropy) should be at the root of the tree. Given my graph, how can I interpret which variables are significant?

Thanks for the help.

[image: decision tree graph]

by 404 at May 02, 2016 08:55 PM


Equivalence of data-flow analysis, abstract interpretation and type inference?

@Babou's answer to a recent question reminds me that at one time I think I read a paper about the equivalence (in terms both of the facts that can be inferred or proved and the time complexity of running the inference algorithm) of data-flow analysis, abstract interpretation, and type inference.

In some sub-cases (like between forward context-sensitive interprocedural data-flow analysis and abstract interpretation) the equivalence is relatively obvious to me, but the question seems more subtle for other comparisons. For example, I can't figure out how Hindley-Milner type inference could be used to prove some of the properties that can be proved with flow-sensitive data-flow analysis.

What are the seminal references discussing the equivalences (or differences) between data-flow analysis, abstract interpretation and type inference?

by Wandering Logic at May 02, 2016 08:38 PM

How can I explain to my parents that I study programming languages? (soft question)

(the title is oversimplified, but hey, it is just a title)

I am currently finishing my MSc in computer science. I am interested in programming languages, especially in type systems. I got interested in research in this field and next semester I will start a PhD on the subject.

Now here is the real question: how can I explain what I (want to) do to people with no previous knowledge in either computer science or related fields?

The title comes from the fact that I am not even able to explain what I do to my parents, friends and so on. Yeah, I can say "the whole point is to help software developers write better software", but I do not think it is really useful: they are not aware of "programming"; they have no clue what it means. It feels like I am describing being an auto mechanic to someone from the Middle Ages: they simply do not know what I am talking about, let alone how to improve it.

Does anyone have good analogies with the real world? Enlightening examples causing "a-ha" moments? Should I actually show a short and simple snippet of code to a 60+ year-old with no computer science (or academic) background? If so, which language should I use? Did anyone here face similar issues?

I know it is a broad question, maybe border-line with the policy of this site, but I think that some good, straight-to-the-point answers could come. If you know how to rephrase the question to better fit the policies, please edit or leave a comment.

(Is there a soft-question tag? Maybe its absence means this is not really a good question for this site...)

by effeffe at May 02, 2016 08:37 PM



Machine learning libraries support to spark streaming

I am working on a Spark Streaming project where machine learning needs to be implemented. I have seen that a couple of algorithms are present in Spark, but mostly for batch processing only. For streaming, only streaming linear regression and streaming k-means are present. Is that true? Can I use other machine learning algorithms (random forest, for example) in streaming?

by Sonam at May 02, 2016 08:26 PM

LSTM network learning

I have attempted to program my own LSTM (long short-term memory) neural network. I would like to verify that the basic functionality is working. I have implemented a backpropagation through time (BPTT) algorithm to train a single-cell network.

Should a single cell LSTM network be able to learn a simple sequence, or are more than one cells necessary? The network does not seem to be able to learn a simple sequence such as 1 0 0 0 1 0 0 0 1 0 0 0 1.

I am sending the sequence of 1's and 0's one by one, in order, into the network, and feeding it forward. I record each output for the sequence.

After running the whole sequence through the LSTM cell, I feed the mean error signals back into the cell, saving the weight changes internal to the cell in a separate collection. After running all the errors through one by one and calculating the new weights after each error, I average the new weights together to get the final value for each weight in the cell.

Am I doing something wrong? I would greatly appreciate any advice.

Thank you so much!

by bfengineer at May 02, 2016 08:24 PM


Why is determining if there is a solution to a Battleship puzzle NP-Complete?

This paper says that the decision problem, "Given a particular puzzle, is there a solution?" is NP-Complete. I don't understand why this can't be done in polynomial time. Given constraints that no two ships can be orthogonally or diagonally adjacent, why not just create a grid where there are 2 times as many columns as "bins" with enough rows to put a "separator" run in between every ship. I've seen the reduction demonstrated this way and it seems like it could be done in polynomial time.

by Derek at May 02, 2016 08:16 PM

Backward vs Forward Data-flow Analysis

I understand how both forward and backward data-flow analysis work but in what situations would we use them? Why do we need to be able to do it in both ways? Do compilers of certain types perform one way more efficiently than the other?

by Haych at May 02, 2016 08:11 PM


approximating fBm stochastic integral

Suppose I have the following stochastic integral:

$$\int_a^b f(t)dB_H(t)$$

with the term $dB_H(t)$ denoting fractional Brownian motion with associated Hurst parameter $H$.

Is it true that for $H \in (1/2,1)$, we have the following result?

$$\int_a^b f(t)\,dB_H(t) := \lim_{\Delta t_k \rightarrow 0} \sum_k f(t_k)\left[B_H(t_{k+1})-B_H(t_k)\right]$$

by baluch_stan at May 02, 2016 08:11 PM


Regression prediction with Azure

I have a date column and a "value" column (with values 1 or 0). I want to predict the value column given a date (when it will be 1 and when it will be 0).

I have created a regression model, and obtained the same scored label, 0.698689, for every input.

I have created the predictive model and added the project column as you said, so that it does not ask me for the value input. Now it asks me for the date and I obtain the scored label, but the value is not obtained. I understand that the scored label is the accuracy, not the predicted value.

Could you help me?


by Gabriel Bursztyn at May 02, 2016 08:00 PM



Microsoft Tay-like learning algorithm?

What kind of algorithm does a system like Microsoft Tay use to "learn new knowledge"?

I just started studying machine learning (ML) and natural language processing (NLP), and they seem to be all about classifications, finding relations, calculating similarities, but nothing about actual understanding/meaning.

I know that "understanding" is a complex topic, so my question is: how was an autonomous system like Tay able to respond to user statements and keep up a dialog?

By using naive Bayes classification, for example, I can find the document most related to the user statement and then inspect each paragraph, find the one most related, and present it to the user. This would be a super-primitive QA system, but it is infinitely far from a conversation.

Tools like word2vec also help in finding the best matches and most related documents, but this is also not a conversation.

I imagine that there might be a combination of state machines, maybe some logic programming with Prolog (to keep some structured knowledge), some NLP filtering and classification to process user statements, and some other strategies to promote an intention or objective that keeps the conversation wheel spinning.

(I know that Tay's algorithms are closed, but is there any open-source arrangement or paper I could look at?)

by weeanon at May 02, 2016 07:50 PM


Restart specific racoon tunnel

I have several gif* interfaces on my FreeBSD box. They are representing tunnels, encrypted using racoon+ipsec. If, at some moment, one of the tunnels hangs up, I am forced to reset racoon this way:

/usr/local/etc/rc.d/racoon restart

But in that case all tunnels are reset, which leads to a short absence of connectivity on all my tunnels (3-5 seconds, but nevertheless).

Is there any method to reset one specific gif tunnel, while not touching any other tunnels?

by Alexander Tarasov at May 02, 2016 07:44 PM


Minmax vs Maxmin

I'm reading this paper about building a combat simulator for 8 unit vs 8 unit mini combats in StarCraft: Brood War. The basic idea is to build a search tree simulating these small combats in order to determine next best move in combat scenarios in StarCraft game play. Section 3.2 (which probably is necessary to understand my question) is the part I am having trouble with, where he talks about approximating a combat (where simultaneous moves occur) with a version where alternating move occur. This allows him to use minmax trees and alpha-beta search.

The part I don't understand is when he describes how building the minmax tree gives an advantage to one player, while a maxmin tree gives an advantage to the other player. For one, his diagram labels the tree where "max" goes first as "maxmin", but his text (and wikipedia :P) describe the tree where max goes first as "minmax". Perhaps that is just a mislabelling.

The main part I am confused about is when he goes on to say:

Proposition 1: For stacked matrix games G, we have $$mini(G) \leq Nash(G) \leq maxi(G)$$

My understanding of this is that $Nash(G)$ is the real final game state taking into account simultaneous moves. Then $mini(G)$ is the final game state if we approximate with MAX moving first, and $maxi(G)$ is the final game state if we approximate with MIN moving first. Besides the Nash equilibrium stuff, which is a bit beyond my education so far, I don't understand how the inequality $$mini(G) \leq maxi(G)$$ can be true. Below are 4 examples. The leaf node values come, I believe, from an evaluation function of that game state's value to MAX:

[image: four example game trees]

The two on the left are maxmin, meaning MIN goes first. The two on the right are minmax, with MAX going first. The top pair contains one set of leaf node values that lead to $$5=maxi(G) \lt mini(G)=8$$ This goes against the proposition above. However, the bottom pair contain a different set of leaf node values that lead to $$4=maxi(G) \gt mini(G)=3$$ Here the proposition seems to hold. I believe I could also construct a scenario where $mini(G) = maxi(G)$.

What am I missing here? Is there really a hard relation between $mini(G)$ and $maxi(G)$ or doesn't it just depend on the leaf node values?

by xdhmoore at May 02, 2016 07:44 PM


Is the money market account (MMA) numeraire and the forward measure equivalent?

Suppose we have a risk-neutral measure $\tilde{\mathbb{P}}$. The money market account is given as $M(t) = e^{\int^t_0 R(s) ds}$, while the price of the zero-coupon bond at time $t$ that matures at $T$ is denoted $B(t,T)$.

So, the forward measure is defined to be the measure with $B(t,T)$ taken as the numeraire. However, I am curious if taking $M(t)$ will also make the measure into a forward measure. If this is not true in general, does it work when the interest rate is constant, $R(t) = r$? This would imply that $B(t,T) = e^{-r(T-t)}$, and $B(0,T) = \frac{1}{M(T)}$ and $B(T,T) = \frac{1}{M(0)}$, which seems to imply somewhat of a connection between the two measures just by looking at the Radon-Nikodym derivative, $Z$.
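For what it's worth, with a constant (or, more generally, deterministic) short rate the two measures actually coincide: using $B(t,T) = e^{-r(T-t)}$, the Radon-Nikodym derivative of the $T$-forward measure with respect to the risk-neutral measure is

$$\frac{d\mathbb{P}^T}{d\tilde{\mathbb{P}}} = \frac{B(T,T)/B(0,T)}{M(T)/M(0)} = \frac{e^{rT}}{e^{rT}} = 1,$$

so the forward and risk-neutral measures agree, and forward-measure pricing reduces to risk-neutral pricing with the deterministic discount factor pulled outside the expectation.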

Also, I have an additional question about the usefulness of the forward measure. It seems that forward measures are useful in options pricing because we can take the discount out for the risk-neutral pricing formula so that $V(t) = D(t) \tilde{\mathbb{E}}^F[V(T) | {\cal{F}}(t)]$. But are there any other advantages of using the forward measure?

by Astaboom at May 02, 2016 07:44 PM


Patience Sort+ ping pong merge implementation

A recent paper out of Microsoft Research describes a new, faster implementation of the patience sort algorithm. A key part of the implementation is an improved merging strategy dubbed the "ping-pong" merge. I am confused as to why this merge strategy uses two arrays to perform the merging, instead of just using a single array and always performing a "blind merge" as described in the paper. It seems that always performing blind merges, and thus only using a single array to perform the merge, would cut down memory usage with no change in runtime.
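For readers wanting a baseline to compare the merge strategies against, plain patience sort (piles found by binary search over the pile tops, then a heap-based k-way merge, with none of the paper's ping-pong machinery) fits in a few lines; this is my own sketch, not the paper's implementation:

```python
import bisect
import heapq

def patience_sort(seq):
    """Deal each element onto the leftmost pile whose top is >= it,
    then k-way merge the piles (each pile is a descending run)."""
    piles = []   # each pile is a list with its current minimum at the end
    tops = []    # tops[i] mirrors piles[i][-1] so we can binary search
    for x in seq:
        i = bisect.bisect_left(tops, x)
        if i == len(piles):
            piles.append([x])
            tops.append(x)
        else:
            piles[i].append(x)
            tops[i] = x
    # reversed(pile) is an ascending run, so heapq.merge can combine them
    return list(heapq.merge(*(reversed(p) for p in piles)))
```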

by dcb at May 02, 2016 07:42 PM


How to use Theano/TensorFlow/Keras for running SGD without neural nets?

Given a model equation, a specific loss function and gradients (that I've already derived), how do I use something like Theano/TensorFlow (or Keras since it's more generic) to train the model without using neural nets?

I simply want to use SGD to minimize the regularized logistic loss. Is this a good example: ?

Equations (1) and (2) of are, for instance, something I'm trying to work with.
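In the meantime, the objective is small enough to prototype framework-free; below is a minimal plain-NumPy SGD for the L2-regularized logistic loss (my own sketch with a hand-derived per-sample gradient; in Theano/TensorFlow the gradient line would be replaced by autodiff):

```python
import numpy as np

def sgd_logistic(X, y, lam=0.01, lr=0.1, epochs=200, seed=0):
    """SGD on mean log-loss + (lam/2)*||w||^2, with labels y in {0, 1}."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        for i in rng.permutation(n):                 # one pass in random order
            p = 1.0 / (1.0 + np.exp(-X[i] @ w))      # predicted probability
            w -= lr * ((p - y[i]) * X[i] + lam * w)  # per-sample gradient step
    return w

# tiny sanity check: separable points; the second column acts as a bias feature
X = np.array([[1.0, 1.0], [2.0, 1.0], [-1.0, 1.0], [-2.0, 1.0]])
y = np.array([1.0, 1.0, 0.0, 0.0])
w = sgd_logistic(X, y)
preds = (X @ w) > 0
```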

by Nilesh at May 02, 2016 07:32 PM

Creating a filterable list with RxJS

I'm trying to get into reactive programming. I use array functions like map, filter and reduce all the time and love that I can do array manipulation without creating state.

As an exercise, I'm trying to create a filterable list with RxJS without introducing state variables. In the end it should work similar to this:

[animated screenshots of the filterable list omitted]

I would know how to accomplish this with naive JavaScript or AngularJS/ReactJS but I'm trying to do this with nothing but RxJS and without creating state variables:

var list = [ /* list items elided */ ];

Rx.Observable.fromEvent(document.querySelector('#filter'), 'keyup')
  .map(function(e) { return e.target.value; });

// I need to get the search value in here somehow:
Rx.Observable.from(list).filter(function() {});

Now how do I get the search value into my filter function on the observable that I created from my list?

Thanks a lot for your help!

by Macks at May 02, 2016 07:28 PM


SABR Calibration: Normal vs Log-Normal Market Data

This question is about getting some clarification as to how to understand market quotes for normal & log-normal vols together with certain model assumptions.

So let us define

  1. $$C_{BS}(F_0,K,T,\sigma,\beta)=\mathbb{E}[(F_T-K)^+]\quad \text{with}\quad dF_t=\sigma F_t^\beta dW_t$$

  2. $C_{SABR}(F_0,K,T,\sigma_0,\beta,\nu,\rho)=\mathbb{E}[(F_T-K)^+]$ $$\text{with}\quad dF_t=\sigma_t F_t^\beta dW_t,\quad d\sigma_t=\nu \sigma_t dZ_t,\quad dW_t\,dZ_t = \rho\, dt$$

And for any given combination of $F_0,K,T,\sigma_0,\beta,\nu,\rho$ the SABR-implied vol $v_{SABR}$ is the quantity such that the following relationship holds

$$C_{BS}(F_0,K,T,v_{SABR},1) = C_{SABR}(F_0,K,T,\sigma_0,\beta,\nu,\rho)$$

See right-hand side of page 89.

Now let us assume that for a fixed expiry/tenor we are given a set of volatility market quotes:

[image: table of normal and log-normal volatility quotes]

Ideally, I want to calibrate the SABR model to it. So when I set $\beta=1$ and calibrate $\sigma_0,\nu,\rho$ to the log-normal vols, I get a very nice fit:

[image: log-normal calibration fit]

However, when I set $\beta=0$ and calibrate $\sigma_0,\nu,\rho$ to the normal vols, I get a very poor fit:

[image: normal calibration fit]

So I have two questions:

  1. Is my definition of the SABR vol $v_{SABR}$ correct? For example, would $$C_{BS}(F_0,K,T,v_{SABR},\beta) = C_{SABR}(F_0,K,T,\sigma_0,\beta,\nu,\rho)$$ perhaps be more correct? Note that the difference here is the $\beta$ in $C_{BS}$ as opposed to having a 1 there.
  2. Is maybe my normal vol market data of an atypical shape causing SABR to only find a poor fit? Or is my SABR implementation faulty?

by Tom at May 02, 2016 07:20 PM



In Keras model.fit_generator() method, what is the generator queue controlled parameter "max_q_size" used for?

I built a simple generator that yields a tuple (inputs, targets) with only a single item in the inputs and targets lists, basically crawling the data set one sample at a time.

I pass this generator into:

model.fit_generator(generator,
                    samples_per_epoch=...,
                    nb_epoch=...,
                    max_q_size=1)  # defaults to 10

I get that:

  • nb_epoch is the number of times the training batch will be run
  • samples_per_epoch is the number of samples trained with per epoch

But what is max_q_size for and why would it default to 10? I thought the purpose of using a generator was to batch data sets into reasonable chunks, so why the additional queue?

by Ray at May 02, 2016 07:09 PM



Implementing NEAT with TensorFlow

I want to use TensorFlow to teach an agent to play a particular multiplayer game. TensorFlow is obviously really young. Would this be possible? Other thoughts?

by Gunnar Norred at May 02, 2016 06:57 PM


Why are loops faster than recursion?

I understand that any recursion can be written as a loop (and vice versa?), and if we measure on actual computers we find that loops are faster than recursion for the same problem. But is there any theory of what makes this difference, or is it mainly empirical?
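As an illustration of the mechanical direction, any recursive traversal can be rewritten with an explicit stack, which also hints at where the measured gap usually comes from: call-frame setup, argument passing, and locality, not asymptotics. A sketch of both versions of the same function (my own toy example):

```python
def sum_tree_recursive(node):
    """node is (value, [children]); recursion uses the implicit call stack."""
    value, children = node
    return value + sum(sum_tree_recursive(c) for c in children)

def sum_tree_loop(node):
    """Same traversal with an explicit stack: no call-frame overhead."""
    total, stack = 0, [node]
    while stack:
        value, children = stack.pop()
        total += value
        stack.extend(children)
    return total

# value 1 with children 2 and 3, where 3 has child 4
tree = (1, [(2, []), (3, [(4, [])])])
```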

by Programmer 400 at May 02, 2016 06:57 PM

Distance vector in a weighted graph

I have a weighted, connected, directed graph $G$ on $n$ vertices. There is a vector called the distance vector, $Dv \in \mathbb{N}^n$, in which $Dv_i$ is the shortest distance from vertex $1$ to vertex $i$. All edge weights are positive integers. I have to show that every distance vector $Dv$ satisfies:

  1. $Dv_1 = 0$.

  2. For all $j \neq 1$ there exists $i$ such that $Dv_j = Dv_i + w(i,j)$.

  3. For all $i,j$ it holds that $Dv_j \leq Dv_i + w(i,j)$.

I think that 1 is trivial: from 1 to 1 you have no distance. But what about the rest? Can you give me an idea of how to prove 2 and 3?
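While this is no substitute for a proof, the three properties can be sanity-checked by computing $Dv$ with Dijkstra's algorithm on a toy graph (graph and code are my own illustration); the checks mirror conditions 1-3 exactly:

```python
import heapq

def shortest_from(source, adj):
    """Dijkstra; adj maps u -> {v: w(u, v)} with positive integer weights."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float('inf')):
            continue  # stale entry
        for v, w in adj[u].items():
            if d + w < dist.get(v, float('inf')):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

adj = {1: {2: 4, 3: 1}, 2: {4: 1}, 3: {2: 2, 4: 5}, 4: {}}
Dv = shortest_from(1, adj)

assert Dv[1] == 0                                             # property 1
for j in Dv:                                                  # property 2
    if j != 1:
        assert any(Dv[j] == Dv[i] + w
                   for i in adj for v, w in adj[i].items() if v == j)
for i in adj:                                                 # property 3
    for j, w in adj[i].items():
        assert Dv[j] <= Dv[i] + w
```

Property 2 is essentially the statement that some shortest path to $j$ ends with an edge $(i,j)$, which is the step the proof has to formalize.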

by Asker at May 02, 2016 06:53 PM

What is the fastest online sorting algorithm?

Quoting Online algorithm from Wikipedia:

In computer science, an online algorithm[1] is one that can process its input piece-by-piece in a serial fashion, i.e., in the order that the input is fed to the algorithm, without having the entire input available from the start.

One is Insertion Sort, but it runs in horrible $O(n^2)$ time.
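For intuition, here is the online pattern with Python's bisect: each arriving element is placed with $O(\log n)$ comparisons, though the underlying list shift is still $O(n)$, so matching the $O(n\log n)$ of offline sorts would require swapping the list for a balanced BST or skip list. A sketch (names are mine):

```python
import bisect

def online_sort(stream):
    """Consume a stream one item at a time; after each arrival,
    everything seen so far is already in sorted order."""
    seen = []
    for x in stream:
        bisect.insort(seen, x)   # O(log n) comparisons, O(n) list shift
        yield list(seen)         # snapshot available at any moment

snapshots = list(online_sort([3, 1, 2]))
```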

by CrazyPython at May 02, 2016 06:52 PM



Orderability of Belief States in a POMDP?

Consider a POMDP with integer states $1,2,\ldots,N$, where $N$ is finite. We thus have a complete order over the states.

It seems reasonable to think that belief states for this POMDP may be orderable in some partial order sense.

Does this orderability translate into any structure of the optimal policy? Anyone have any relevant literature?

by jonem at May 02, 2016 06:45 PM

Algorithmic intuition for logarithmic complexity

I believe I have a reasonable grasp of complexities like $\mathcal{O}(1)$, $\Theta(n)$ and $\Theta(n^2)$.

In terms of a list, $\mathcal{O}(1)$ is a constant lookup, so it's just getting the head of the list. $\Theta(n)$ is where I'd walk the entire list, and $\Theta(n^2)$ is walking the list once for each element in the list.

Is there a similar intuitive way to grasp $\Theta(\log n)$ other than just knowing it lies somewhere between $\mathcal{O}(1)$ and $\Theta(n)$?
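One common intuition: $\Theta(\log n)$ is the "keep halving" pattern. Binary search discards half of the remaining list per probe, so a million-element list needs only about 20 probes; a sketch that counts them:

```python
def binary_search_steps(sorted_list, target):
    """Return (found, number of probes); each probe halves the search range."""
    lo, hi, steps = 0, len(sorted_list) - 1, 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_list[mid] == target:
            return True, steps
        if sorted_list[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return False, steps

found, steps = binary_search_steps(list(range(1_000_000)), 999_999)
print(found, steps)  # found after roughly log2(1e6) = 20 probes
```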

by Khanzor at May 02, 2016 06:45 PM


gbm() package in R

I'm using R's gbm() package to do a boosted classification problem, where my response variable is a binary variable taking values of 0 and 1. I have 11 predictors in my data set. After running the gbm() procedure, I then obtained a plot of the relative importance for each variable. Basically, it says that my variable X2 has a relative influence of 95.5%.

My problem is that I tried to actually confirm that this is true, or rather, visualize the accuracy of this result. I therefore created a scatter plot of X2 vs. the binary response variable, but there is no clear correlation between the 2 variables, despite a very strong relative influence. Perhaps a logistic regression would work, but the pseudo-R^2 values were almost zero.

Does anyone know in this case what a 95.5% relative influence would even mean? Thanks!

by Thomas Moore at May 02, 2016 06:45 PM


How to describe algorithms, prove and analyse them?

Before reading The Art of Computer Programming (TAOCP), I had not considered these questions deeply. I would use pseudocode to describe algorithms, understand them, and estimate the running time only up to orders of growth. TAOCP thoroughly changed my mind.

TAOCP uses English mixed with steps and goto to describe algorithms, and uses flow charts to picture them more readily. It seems low-level, but I find there are some advantages, especially with flow charts, which I had largely ignored. We can label each of the arrows with an assertion about the current state of affairs at the time the computation traverses that arrow, and make an inductive proof for the algorithm. The author says:

It is the contention of the author that we really understand why an algorithm is valid only when we reach the point that our minds have implicitly filled in all the assertions, as was done in Fig.4.

I have not experienced such an approach before. Another advantage is that we can count the number of times each step is executed; this is easy to check with Kirchhoff's first law. I had not analysed running time exactly before, so some $\pm1$ might have been omitted when I was estimating it.

Analysis of orders of growth is sometimes not enough. For example, we cannot distinguish quicksort from heapsort because both have $E(T(n))=\Theta(n\log n)$, where $E(X)$ denotes the expected value of the random variable $X$; so we should analyse the constants, say $E(T_1(n))=A_1n\lg n+B_1n+O(\log n)$ and $E(T_2(n))=A_2n\lg n+B_2n+O(\log n)$, and thus compare $T_1$ and $T_2$ better. Also, sometimes we should compare other quantities, such as variances. Only a rough analysis of orders of growth of running time is not enough. TAOCP translates the algorithms into assembly language and calculates the running time, but that is too hard for me, so I want to know some techniques for analysing the running time a bit more roughly, in a way that is also useful for higher-level languages such as C, C++, or pseudocode.

And I want to know what style of description is mainly used in research works, and how to treat these problems.

by Frank Science at May 02, 2016 06:43 PM



Retrieve candidate attributes for a node in Decision Tree using R

I am using R for creating a decision tree using CART. I did it using

feature_vectors <- read.table("C:/Users/DVS/Desktop/TagMe!-Data/Train/feature_vectors.txt", quote="\"")
ind <- sample(2, nrow(feature_vectors), replace=TRUE, prob=c(0.7, 0.3))
trainData <- feature_vectors[ind==1,]
testData <- feature_vectors[ind==2,]
myFormula <- quality ~ fixed.acidity + volatile.acidity + citric.acid + residual.sugar + chlorides + free.sulfur.dioxide + total.sulfur.dioxide + density + pH + sulphates + alcohol
table(predict(wine_ctree), trainData$quality)

Now, I need to print a list of candidate attributes for the root node, i.e. the nodes with minimal deviation in (im)purity values from the selected root node. Is there a way to do this with built-in functions, or do I have to modify the source?

by dvs at May 02, 2016 06:33 PM


Maximum weight matching and submodular functions

Given a bipartite graph $G = (U \cup V, E)$ with positive weights let $f: 2^U \rightarrow \mathbb{R}$ with $f(S)$ equal to the maximum weight matching in the graph $G[S\cup V]$.

Is it true that $f$ is a submodular function?

by George Octavian Rabanca at May 02, 2016 06:30 PM


When would the worst case for Huffman coding occur?

I am doing a project on Huffman coding and wanted to know when it wouldn't be ideal to use, or rather, when Huffman coding would produce low compression. Since it mainly revolves around the frequencies of the characters present in the input text, I believe the answer is also going to be related to that.

This is what I got from Wikipedia:

The worst case for Huffman coding can happen when the probability of a symbol exceeds 2^(−1) = 0.5, making the upper limit of inefficiency unbounded.

So if a character appears more often in our input text, does that ensure good or bad compression? According to Wikipedia, if the probability of a symbol exceeds 0.5 (meaning that one character dominates the input text), it would produce bad compression. But from what I understand, in Huffman coding, the more a character appears, the better the compression we get, right?

Anyway, I just want to know for which types of files or text data Huffman coding would get bad compression. Maybe I am overthinking it, but a little help would be great.
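To make the Wikipedia remark concrete, here is the arithmetic for a two-symbol source where one symbol has probability 0.99 (the numbers are my own illustration): Huffman must still spend a whole bit per symbol, while the entropy is far lower, so the code is far from the compression bound even though the frequent symbol gets the shortest possible codeword.

```python
import math

p = 0.99   # probability of the dominant symbol (illustrative)
entropy = -(p * math.log2(p) + (1 - p) * math.log2(1 - p))  # ~0.08 bits/symbol
avg_len = 1.0    # a Huffman code for two symbols must assign 1 bit to each

inefficiency = avg_len / entropy
print(entropy, inefficiency)   # the code spends many times the entropy bound
```

This is exactly the case the quoted sentence describes; grouping symbols into blocks (or arithmetic coding) is the usual remedy.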


P.S.: I don't think this question is subjective; I know that there is a specific answer to it. All I want is a worst case, which I don't presume is subjective.

by rohitkrishna094 at May 02, 2016 06:25 PM

How to find a value in R with attributes? [on hold]

For example, I have the following dataset:

[image: ratings dataset]

From the above, I can see that for user U1, the score they rated for "Life so far" is 3.

How can I write code to find the value under X.4 when User = U1?

by Brenda Zhao at May 02, 2016 06:19 PM

Construct DFA given oracle access to the language

I was given the following question.

Given a minimal DFA $A$ with $m$ states over some alphabet $\Sigma$ which is a "black box" (you can only run words through it and it tells you whether it accepts or not):

a) describe an algorithm that allows you to find the equivalence classes of the Myhill-Nerode relation and then,

b) describe how you can reconstruct $A$ using the equivalence classes.

I managed to solve the first part using a brute-force algorithm that checks all words of length at most $m$. So now I have the equivalence classes, but I don't know how to solve (b): how to reconstruct the DFA and its transition function. My idea is to look at the representative words of the classes in lexicographic order, then assign the $m$ states from left to right corresponding to those words as an encoding; the accepting states will then be the states whose representative words the black-box automaton accepts, and $q_0$ will be the state of $\epsilon$'s equivalence class, which clearly is an equivalence class. But the hard part is the transition function.
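One standard way to finish (b), in the spirit of Angluin's L* observation table: identify the class of any word by its "signature" of oracle answers over a set of distinguishing suffixes (which the brute-force search in part (a) can also supply); then the transition from class $[u]$ on letter $c$ is simply the class whose signature matches $uc$. A Python sketch against a toy black box (everything here, the oracle, representatives, and suffixes, is a hypothetical example):

```python
from itertools import product

# Toy black box standing in for A: accepts iff the number of a's is even.
def oracle(word):
    return word.count('a') % 2 == 0

alphabet = ['a', 'b']

# Assume part (a) produced one representative word per Myhill-Nerode class,
# plus suffixes that distinguish the classes.
reps = ['', 'a']
suffixes = ['']

def signature(word):
    """A word's class is identified by the oracle's answers on word + suffix."""
    return tuple(oracle(word + s) for s in suffixes)

sig_to_rep = {signature(r): r for r in reps}

# Transition function: delta(class of u, c) is the class that u + c falls into.
delta = {(r, c): sig_to_rep[signature(r + c)]
         for r, c in product(reps, alphabet)}

start = ''                                   # class of the empty word
accepting = [r for r in reps if oracle(r)]   # classes the black box accepts
```

Here the membership oracle alone suffices because the suffixes pin down which class $uc$ belongs to, with no extra equivalence queries.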

by Lior at May 02, 2016 06:10 PM




How to implement equi-recursive types in PLT Redex?

I believe that I understand both equi-recursive and iso-recursive types quite well. Hence, I've been trying to implement a type checker for ISWIM with equi-recursive types in PLT Redex. However, for the life of me I can't figure out how to make type equivalence work. Everything else works great.

This is my language:

(define-language iswim
  [X  ::= variable-not-otherwise-mentioned]
  [b  ::= number true false unit]
  [O  ::= + - * =]
  [M  ::= b X (λ (X : T) M) (M M) (if M M M) (O M M)
      (pair M M) (fst M) (snd M) (inL M T) (inR M T)
      (match M (λ (X : T) M) (λ (X : T) M))]
  [V  ::= b (λ (X : T) M) (pair V V) (inL V T) (inR V T)]
  [T  ::= X Unit Bool Num (T -> T) (T + T) (T × T) (μ (X) T)]
  [Γ  ::= () (X T Γ)]
  #:binding-forms
  (λ (X : T) M #:refers-to X)
  (μ (X) T #:refers-to X))

The type checker is a judgment form (I think the "App" case is wrong):

(define-judgment-form iswim
  #:mode (types I I O)
  #:contract (types Γ M T)

  [-------------------- "Number"
   (types Γ number Num)]

  [-------------------- "True"
   (types Γ true Bool)]

  [-------------------- "False"
   (types Γ false Bool)]

  [-------------------- "Unit"
   (types Γ unit Unit)]

  [(where T (lookup Γ X))
   -------------------- "Var"
   (types Γ X T)]

  [(types (X T_1 Γ) M T_2)
   -------------------- "Abs"
   (types Γ (λ (X : T_1) M) (T_1 -> T_2))]

  [(types Γ M_1 T_1)
   (types Γ M_2 T_2)
   (equiv-types T_1 (T_2 -> T_3))
   -------------------- "App"
   (types Γ (M_1 M_2) T_3)]

  [(types Γ M_1 Bool)
   (types Γ M_2 T)
   (types Γ M_3 T)
   -------------------- "If"
   (types Γ (if M_1 M_2 M_3) T)]

  [(types Γ M_1 Num)
   (types Γ M_2 Num)
   (where T (return-type O))
   -------------------- "Op"
   (types Γ (O M_1 M_2) T)]

  [(types Γ M_1 T_1)
   (types Γ M_2 T_2)
   -------------------- "Pair"
   (types Γ (pair M_1 M_2) (T_1 × T_2))]

  [(types Γ M (T_1 × T_2))
   -------------------- "First"
   (types Γ (fst M) T_1)]

  [(types Γ M (T_1 × T_2))
   -------------------- "Second"
   (types Γ (snd M) T_2)]

  [(types Γ M T_1)
   -------------------- "Left"
   (types Γ (inL M T_2) (T_1 + T_2))]

  [(types Γ M T_2)
   -------------------- "Right"
   (types Γ (inR M T_1) (T_1 + T_2))]

  [(types Γ M_3 (T_1 + T_2))
   (types (X_1 T_1 Γ) M_1 T_3)
   (types (X_2 T_2 Γ) M_2 T_3)
   -------------------- "Match"
   (types Γ (match M_3
              (λ (X_1 : T_1) M_1)
              (λ (X_2 : T_2) M_2))
          T_3)])

Type equivalence is another judgment form (I put all of the blame on this code):

(define-judgment-form iswim
  #:mode (equiv-types I I)
  #:contract (equiv-types T T)

  [-------------------- "Refl"
   (equiv-types T T)]

  [(equiv-types T_1 T_3)
   (equiv-types T_2 T_4)
   -------------------- "Fun"
   (equiv-types (T_1 -> T_2) (T_3 -> T_4))]

  [(equiv-types T_1 T_3)
   (equiv-types T_2 T_4)
   -------------------- "Sum"
   (equiv-types (T_1 + T_2) (T_3 + T_4))]

  [(equiv-types T_1 T_3)
   (equiv-types T_2 T_4)
   -------------------- "Prod"
   (equiv-types (T_1 × T_2) (T_3 × T_4))]

  [(where X_3 ,(variable-not-in (term (T_1 T_2)) (term X_2)))
   (equiv-types (substitute T_1 X_1 X_3) (substitute T_2 X_2 X_3))
   -------------------- "Mu"
   (equiv-types (μ (X_1) T_1) (μ (X_2) T_2))]

  [(equiv-types (substitute T_1 X (μ (X) T_1)) T_2)
   -------------------- "Mu Left"
   (equiv-types (μ (X) T_1) T_2)]

  [(equiv-types T_1 (substitute T_2 X (μ (X) T_2)))
   -------------------- "Mu Right"
   (equiv-types T_1 (μ (X) T_2))])

Here are my meta-functions:

(define-metafunction iswim
  lookup  : Γ X -> T or #f
  [(lookup () X)        #f]
  [(lookup (X T Γ) X)   T]
  [(lookup (X T Γ) X_1) (lookup Γ X_1)])

(define-metafunction iswim
  return-type : O -> T
  [(return-type +) Num]
  [(return-type -) Num]
  [(return-type *) Num]
  [(return-type =) Bool])

Any help will be appreciated.

by Aadit M Shah at May 02, 2016 05:56 PM

What is the difference between supervised learning and unsupervised learning?

In terms of artificial intelligence and machine learning, can you provide a basic, easy explanation with an example?

by TIMEX at May 02, 2016 05:53 PM


how to choose a price adjustment, a roll date and a data center for my trading strategy?

I have many doubts about which roll date and price adjustment I should use. I need to backtest around 50 different futures: 6 indices (mini S&P 500, Nikkei 225…), 10 agricultural (soybeans, oats, corn…), 3 meats (live cattle, lean hogs, feeder cattle), 8 currencies (yen, Australian dollar, pound, Swiss franc…), 5 metals (silver, gold, palladium…), Treasury notes (10-year, 5-year…), the 30-year US bond, and some more…

My backtest covers 15 years, from 2000 to 2015. I have chosen the backward Panama canal method, rolling on the open-interest switch and with a depth of #1 for all of them.

My question is: is that correct? Or should I use different methods for the different kinds of futures (agriculturals, metals, currencies…)?
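As a sanity check on what backward Panama adjustment actually does to a series, here is a toy sketch (hypothetical and heavily simplified: each contract's last close and its successor's first close are assumed to fall on the same roll date, e.g. the date chosen by the open-interest switch):

```python
def panama_back_adjust(segments):
    """Backward ("Panama canal") adjustment of a futures price history.

    segments -- one list of closes per contract, oldest contract first.
    Simplifying assumption (hypothetical): each contract's last close
    and its successor's first close fall on the same roll date.

    The newest contract's prices are kept as-is; every older segment is
    shifted by the cumulative roll gap so the stitched series has no
    artificial jumps at the rolls.
    """
    out = list(segments[-1])
    offset = 0.0
    for i in range(len(segments) - 2, -1, -1):
        offset += segments[i + 1][0] - segments[i][-1]   # gap at the roll
        # shift the older contract and drop its roll-date close, which
        # is superseded by the newer contract's close on that date
        out = [p + offset for p in segments[i][:-1]] + out
    return out
```

With backward adjustment the most recent prices stay at tradeable levels while the early history absorbs the cumulative gaps, which is also why very old back-adjusted prices can even go negative.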

Another issue is that the SCF futures charts for some contracts have gaps. Several show these gaps between 2009 and 2012 (the majority of the currency and agricultural futures). The example below is the yen future.

(image: chart of the yen future showing the gaps)

I don’t know what produces these gaps or discontinuous bars.

Thank you very much for your time.

by Manuel Botias at May 02, 2016 05:52 PM


Quality of an estimation function

I am trying to find a way to determine the quality of an estimation function.

I have a dictionary that contains int values.

The total "sum" of this dictionary is the sum of Key * Value over all entries.

public int RealValue => Items.Sum(x => x.Key * x.Value);

The estimated sum of the Dictionary is calculated by using windows and weights.

    public int EstimatedValue =>
        Items.Where(x => x.Key < window1).Sum(x => weight1 * x.Value) +
        Items.Where(x => x.Key >= window1 && x.Key < window2).Sum(x => weight2 * x.Value) +
        Items.Where(x => x.Key >= window2 && x.Key < window3).Sum(x => weight3 * x.Value);

Now I want to assign a rating to this estimation function, i.e. to the quality of the chosen windows and weights.

The estimation function is good if it can successfully determine which of two dictionaries contains the greater value. It does not matter how close the estimate is to the real sum. Of course, the estimation function is supposed to work with any random pair of candidate dictionaries.

What would be a good approach to solve the above problem?
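Since the stated goal is ordering pairs correctly rather than numeric accuracy, one natural rating is pairwise ranking accuracy: sample many candidate pairs and count how often the estimate and the real sum agree on which dictionary is larger. A rough sketch (in Python for brevity; all names hypothetical):

```python
def real_value(d):
    """True total: sum of key * value over all entries."""
    return sum(k * v for k, v in d.items())

def estimated_value(d, windows, weights):
    """Window/weight estimate: keys in [0, windows[0]) count with
    weights[0], keys in [windows[0], windows[1]) with weights[1], etc."""
    total, lo = 0, 0
    for hi, wt in zip(windows, weights):
        total += sum(wt * v for k, v in d.items() if lo <= k < hi)
        lo = hi
    return total

def ranking_accuracy(dict_pairs, windows, weights):
    """Fraction of pairs on which the estimate and the true total agree
    about which dictionary is larger (ties are skipped)."""
    hits = trials = 0
    for a, b in dict_pairs:
        true_diff = real_value(a) - real_value(b)
        est_diff = (estimated_value(a, windows, weights)
                    - estimated_value(b, windows, weights))
        if true_diff == 0 or est_diff == 0:
            continue
        trials += 1
        hits += (true_diff > 0) == (est_diff > 0)
    return hits / trials if trials else 0.0
```

If you want to penalize gross inversions more than near-misses, weight each pair by the absolute real difference instead of counting it as 0/1.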

by Postlagerkarte at May 02, 2016 05:47 PM


CloudFormation Linting with cfn-nag

Over the last 3 years I’ve done a lot of CloudFormation work and while it’s an easy enough technology to get to grips with, the mass of JSON can become a bit of a blur when you’re doing code reviews. It’s always nice to get a second pair of eyes, especially an unflagging, automated set that has insight into some of the easily overlooked security issues you can accidentally add to your templates. cfn-nag is a Ruby gem that attempts to sift through your code and present guidelines on a number of frequently misused, and omitted, resource properties.

gem install cfn-nag

Once the gem and its dependencies finish installing you can list all the rules it currently validates against.

$ cfn_nag_rules
IAM policy should not apply directly to users.  Should be on group

I found reading through the rules to be quite a nice context refresher. While there are a few I don’t agree with, there are also some I wouldn’t have thought to single out in code review, so it’s well worth having a read through the possible anti-patterns. Let’s check our code with cfn-nag.

cfn_nag --input-json-path . # all .json files in the directory
cfn_nag --input-json-path templates/buckets.json # single file check

The default output from these runs looks like:

| Resources: ["AssetsBucketPolicy"]
| It appears that the S3 Bucket Policy allows s3:PutObject without server-side encryption

Failures count: 0
Warnings count: 1

| Resources: ["ELB"]
| Elastic Load Balancer should have access logging configured

Failures count: 0
Warnings count: 1

If you’d like to reprocess the issues in another part of your tooling / pipelining then the json output formatter might be more helpful.

cfn_nag --input-json-path . --output-format json

{
  "type": "WARN",
  "message": "Elastic Load Balancer should have access logging configured",
  "logical_resource_ids": [
    "ELB"
  ],
  "violating_code": null
}

While the provided rules are useful it’s always a good idea to have an understanding of how easy a linting tool makes adding your own checks. In the case of cfn-nag there are two types of rules: some use JSON and jq, and the others are pure Ruby code. Let’s add a simple pure Ruby rule to ensure all our security groups have descriptions. At the moment this requires you to drop code directly into the gem’s contents, but I imagine this will be fixed in the future.

First we’ll create our own rule:

# first we find where the gem installs its custom rules
$ gem contents cfn-nag | grep custom_rules


Then we’ll add a new rule to that directory

touch $full_path/lib/custom_rules/security_group_missing_description.rb

Our custom check looks like this -

class SecurityGroupMissingDescription

  def rule_text
    'Security group does not have a description'
  end

  def audit(cfn_model)
    logical_resource_ids = []

    cfn_model.security_groups.each do |security_group|
      unless security_group.group_description
        logical_resource_ids << security_group.logical_resource_id
      end
    end

    if logical_resource_ids.size > 0
      Violation.new(type: Violation::FAILING_VIOLATION,
                    message: rule_text,
                    logical_resource_ids: logical_resource_ids)
    end
  end
end

The code above was heavily ‘borrowed’ from an existing check and a little bit of object exploration was done using pry. Once we have our new rule we need to plumb it in to the current rule loading code. This is currently a little unwieldy but it’s worth keeping an eye on the docs for when this is fixed. We need to edit two locations in the $full_path/lib/cfn_nag.rb file. Add a require to the top of the file along side the other custom_rules and add our new classes name to the custom_rule_registry at the bottom.

--- ./.rbenv/versions/2.3.0/lib/ruby/gems/2.3.0/gems/cfn-nag-0.0.19/lib/cfn_nag.rb  2016-05-01 18:00:14.123226626 +0100
+++ ./.rbenv/versions/2.3.0/lib/ruby/gems/2.3.0/gems/cfn-nag-0.0.19/lib/cfn_nag.rb  2016-05-02 09:55:16.842675430 +0100
@@ -1,4 +1,5 @@
 require_relative 'rule'
+require_relative 'custom_rules/security_group_missing_description'
 require_relative 'custom_rules/security_group_missing_egress'
 require_relative 'custom_rules/user_missing_group'
 require_relative 'model/cfn_model'
@@ -175,6 +176,7 @@

   def custom_rule_registry
+      SecurityGroupMissingDescription,

We can then add a simple CloudFormation security group resource and test our code when it does, and does not include a “description” property.

cat single-sg.json

{
  "Resources": {
    "my_sg": {
      "Type": "AWS::EC2::SecurityGroup",
      "Properties": {
        "GroupDescription": "some_group_desc",
        "SecurityGroupIngress": {
          "CidrIp": "",
          "FromPort": 34,
          "ToPort": 34,
          "IpProtocol": "tcp"
        },
        "VpcId": "vpc-12345678"
      }
    }
  }
}

If you run cfn_nag over that template then you shouldn’t see our new rule mentioned. Now go back and remove the GroupDescription line and run it again.

| Resources: ["my_sg"]
| Security group does not have a description

It’s quite early days for the project and there are a few gaps in functionality (controlling which rule sets to apply and easier addition of custom rules are the two I’d like to see), but considering how easy it is to install and run cfn-nag over your templates, I think it’s well worth giving your code an occasional once-over with a second pair of (automated) eyes. I don’t think I’d add it to my build/deploy pipelines until it addresses that missing functionality, but as a small automated code-review helper I can see it being quite handy.

by Dean Wilson at May 02, 2016 05:46 PM


Interest rate modelling [on hold]

Assume that the price of a stock is given as

(formula image omitted: the stock price dynamics)

  1. where $r_t$ is the interest rate process in the Ho-Lee model and $B_t$ is a second Brownian motion, which is correlated with the Brownian motion driving the spot rate process, the correlation coefficient between the two Brownian motions being $\rho$. Set up an algorithm that prices a down-and-out barrier call option on the stock (you may use pseudocode). Explain all steps.

  2. where $r_t$ is the interest rate process in the Vasicek model and $B_t$ is a Brownian motion, which is correlated, with correlation coefficient $\rho$, with the Brownian motion that drives the spot rate process. Set up an algorithm that prices an Asian call option on the stock. Explain all steps.
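The stock-price formula here is an image that did not survive, so any concrete algorithm has to assume the dynamics. Under the common assumption of risk-neutral GBM for the stock with a Ho-Lee short rate ($dr = \theta\,dt + \sigma_r\,dW$) and correlated drivers, a Monte-Carlo sketch for the down-and-out barrier call of part 1 might look like this (all parameter names hypothetical):

```python
import random
from math import exp, sqrt

def down_and_out_call_mc(S0, K, B, T, r0, sigma_S, sigma_r, theta, rho,
                         n_paths=20000, n_steps=250, seed=0):
    """Monte-Carlo price of a down-and-out barrier call.

    Assumed (hypothetical) dynamics: Ho-Lee short rate
    dr = theta dt + sigma_r dW, and risk-neutral GBM for the stock
    driven by a second Brownian motion with correlation rho to dW.
    """
    rng = random.Random(seed)
    dt = T / n_steps
    total = 0.0
    for _ in range(n_paths):
        S, r, disc, alive = S0, r0, 0.0, True
        for _ in range(n_steps):
            z1 = rng.gauss(0.0, 1.0)                   # drives the rate
            z2 = rho * z1 + sqrt(1 - rho * rho) * rng.gauss(0.0, 1.0)
            disc += r * dt                             # integral of r dt
            r += theta * dt + sigma_r * sqrt(dt) * z1  # Ho-Lee step
            S *= exp((r - 0.5 * sigma_S ** 2) * dt + sigma_S * sqrt(dt) * z2)
            if S <= B:                                 # barrier breached
                alive = False
                break
        if alive:
            total += exp(-disc) * max(S - K, 0.0)
    return total / n_paths
```

An Asian call under Vasicek (part 2) has the same skeleton: replace the rate update with a mean-reverting step and the payoff with max(average of S over the path minus K, 0).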

by Student M at May 02, 2016 05:23 PM

Local and Stochastic volatility [on hold]

a. What is a volatility surface and how does it point, in general, to the limitations of the Black-Scholes model? Discuss.

b. Describe the algorithm based on the Newton method to compute implied volatilities.

c. Briefly explain the main differences and common features of local and stochastic volatility models.
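For reference, the Newton iteration of part (b) solves BS(σ) = market price by repeatedly dividing the pricing error by the vega. A minimal self-contained Python sketch (function names are my own):

```python
from math import exp, log, pi, sqrt, erf

def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(S, K, T, r, sigma):
    """Black-Scholes price of a European call."""
    d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)

def vega(S, K, T, r, sigma):
    """dPrice/dSigma, the denominator of the Newton step."""
    d1 = (log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    return S * sqrt(T) * exp(-0.5 * d1 ** 2) / sqrt(2.0 * pi)

def implied_vol(price, S, K, T, r, sigma0=0.2, tol=1e-8, max_iter=100):
    """Newton: sigma <- sigma - (BS(sigma) - price) / vega(sigma)."""
    sigma = sigma0
    for _ in range(max_iter):
        diff = bs_call(S, K, T, r, sigma) - price
        if abs(diff) < tol:
            break
        sigma -= diff / vega(S, K, T, r, sigma)
    return sigma
```

Because the vega of a deep in- or out-of-the-money option is tiny, practical implementations guard the step size or fall back to bisection.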

by Student M at May 02, 2016 05:15 PM


Register Now – AWS DevDay in San Francisco

I am a firm believer in the value of continuing education. These days, the half-life on knowledge of any particular technical topic seems to be less than a year. Put another way, once you stop learning, your knowledge base will be just about obsolete within 2 or 3 years!

In order to make sure that you stay on top of your field, you need to decide to learn something new every week. Continuous learning will leave you in a great position to capitalize on the latest and greatest languages, tools, and technologies. By committing to a career marked by lifelong learning, you can be sure that your skills will remain relevant in the face of all of this change.

Keeping all of this in mind, I am happy to be able to announce that we will be holding an AWS DevDay in San Francisco on June 21st. The day will be packed with technical sessions, live demos, and hands-on workshops, all focused on some of today’s hottest and most relevant topics. If you attend the AWS DevDay, you will also have the opportunity to meet and speak with AWS engineers and to network with the AWS technical community.

Here are the tracks:

  • Serverless – Build and run applications without having to provision, manage, or scale infrastructure. We will demonstrate how you can build a range of applications from data processing systems to mobile backends to web applications.
  • Containers – Package your application’s code, configurations, and dependencies into easy-to-use building blocks. Learn how to run Docker-enabled applications on AWS.
  • IoT – Get the most out of connecting IoT devices to the cloud with AWS. We will highlight best practices using the cloud for IoT applications, connecting devices with AWS IoT, and using AWS endpoints.
  • Mobile – When developing mobile apps, you want to focus on the activities that make your app great and not the heavy lifting required to build, manage, and scale the backend infrastructure. We will demonstrate how AWS helps you easily develop and test your mobile apps and scale to millions of users.

We will also be running a series of hands-on workshops that day:

  • Zombie Apocalypse Workshop: Building Serverless Microservices.
  • Develop a Snapchat Clone on AWS.
  • Connecting to AWS IoT.

Registration and Location
There’s no charge for this event, but space is limited and you need to register quickly in order to attend.

All sessions will take place at the AMC Metreon at 135 4th Street in San Francisco.





by Jeff Barr at May 02, 2016 05:09 PM


Functional Programming - Simple For Loop For Incrementing Counter

We don't use for loops in functional programming; instead, we use higher-order functions like map, filter, reduce etc. These are fine for iterating through an array.

However, I wonder how do I do a simple counter loop.

for (let i = 0; i < 10; i++) {
  console.log("functional programming is a religion");
}

So, how would one do this in functional programming?
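One common functional answer is to stop treating the counter as mutable state: either recurse on n, or turn the range 0..9 into a sequence and iterate over it with a higher-order function. A small sketch of both shapes, in Python for concreteness (the same ideas map directly to JavaScript):

```python
def repeat(n, action):
    """Run action n times with recursion instead of a mutable counter."""
    if n > 0:
        action()
        repeat(n - 1, action)

repeat(10, lambda: print("functional programming is a religion"))

# Alternatively, treat the counter as a sequence to iterate over:
lines = ["functional programming is a religion" for _ in range(10)]
```

In JavaScript the same two shapes are a recursive function or something like Array.from({length: 10}).forEach(...).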

by Kayote at May 02, 2016 05:07 PM



How does a language abstract away from underlying byte machines?

So as the title suggests,

How does a programming language, such as Java or C, abstract itself away from the underlying byte machine?

I'm curious because I know that static/dynamic typing, data abstraction and control abstraction play a role. But I don't understand it, so I was wondering if anyone could shed some light on this question!

Thanks a lot in advance!

EDIT: I currently have this as an explanation so far..

Programming languages tend to abstract away from the underlying byte machine to provide a high-level view for the programmer. This can be for multiple reasons, ranging from making code easier to write and debug to solving platform issues.

One way languages abstract themselves away is through typing. Using Java as an example, it uses predefined base types of a certain size, including but not limited to int, float and char. It also uses object/structured types which have a dynamic size, such as Object and String. Like some other languages, it provides garbage collection and automatic memory management. This abstracts it even further away, as you can worry less about allocating memory for data, compared to a language like C.

by madcrazydrumma at May 02, 2016 05:03 PM


Land of the Free, Home of the Brave: In den USA hat ...

Land of the Free, Home of the Brave: In the USA, a judge has compelled a woman to unlock her smartphone with her fingerprint. Now, one would think that this is a completely clear-cut case for the right to refuse testimony, and that you don't have to give the state apparatus access to your smartphone. But in the USA that apparently hasn't been settled so clearly yet.

May 02, 2016 05:00 PM


Design models using adjusted or unadjusted stock prices (time series prediction)?

I'm creating a predictive model for the closing price of stocks (using neural networks and support vector machines). Is it appropriate to use adjusted or unadjusted prices for this prediction purpose? My inputs are technical indicators (plus lags of the price) and my output is trend-deterministic (up if the price increases and down if it decreases).
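For what it's worth, the choice matters mainly because unadjusted prices jump at splits and dividends, which would inject spurious up/down moves into a trend-deterministic target. A tiny sketch of building that target from adjusted closes (hypothetical helper):

```python
def trend_labels(adjusted_closes):
    """Trend-deterministic target from a series of adjusted closes:
    1 when the next close is higher, 0 ("down") otherwise. Adjusted
    prices are used so splits/dividends don't create fake moves."""
    return [1 if b > a else 0
            for a, b in zip(adjusted_closes, adjusted_closes[1:])]
```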

by user2991243 at May 02, 2016 04:52 PM


On the definition of NL^O for oracles O

This paper proposes that, among other things, NL$^{\hspace{-0.04 in}O}$ be defined by giving the machine
an O(1)-height stack of oracle tapes which do not count toward the space bound,
and requiring that the machine be deterministic when the oracle stack is non-empty.
What would happen if one modified that definition to replace "deterministic" with
"non-deterministic ∩ co-non-deterministic", where that's formalized as follows:

Regard the machines as either rejecting or giving an output from {NO,YES}. ​ (So, outputting NO is different from rejecting.) ​ Say a non-deterministic machine "strongly" does something if and only if [[there's a way for it to do that thing] and [the only way for it to not do that thing is rejecting]].
Let "niwsi" be short for "its next interaction with the stack is". ​ Now, the condition is that,
whenever the oracle stack is non-empty, one of the following holds:

strongly, it gives an output without any further interaction with the stack
strongly, niwsi creating a new oracle tape
strongly, niwsi querying the content of the top tape
there is [a bit $b$] and [a tape on the stack] such that strongly, niwsi writing $b$ to that tape

Would the modified definition be equivalent to that paper's definition?
If no, would the resulting definition have any bad consequences?
(such as a reasonable uniformity notion with respect to which NL $\subseteq$ AC$^1$
relativizes for their definition but not for the modified definition)

by Ricky Demer at May 02, 2016 04:46 PM

Edward Z. Yang

Announcing cabal new-build: Nix-style local builds

cabal new-build, also known as “Nix-style local builds”, is a new command inspired by Nix that comes with cabal-install 1.24. Nix-style local builds combine the best of non-sandboxed and sandboxed Cabal:

  1. Like sandboxed Cabal today, we build sets of independent local packages deterministically and independent of any global state. new-build will never tell you that it can't build your package because it would result in a “dangerous reinstall.” Given a particular state of the Hackage index, your build is completely reproducible. For example, you no longer need to compile packages with profiling ahead of time; just request profiling and new-build will rebuild all its dependencies with profiling automatically.
  2. Like non-sandboxed Cabal today, builds of external packages are cached globally, so that a package can be built once, and then reused anywhere else it is also used. No need to continually rebuild dependencies whenever you make a new sandbox: dependencies which can be shared, are shared.

Nix-style local builds work with all versions of GHC supported by cabal-install 1.24, which currently is GHC 7.0 and later. Additionally, cabal-install is on a different release cycle than GHC, so we plan to be pushing bugfixes and updates on a faster basis than GHC's yearly release cycle.

Although this feature is only in beta (there are bugs, see “Known Issues”, and the documentation is a bit sparse), I’ve been successfully using Nix-style local builds exclusively to do my Haskell development. It's hard to overstate my enthusiasm for this new feature: it “just works”, and you don't need to assume that there is a distribution of blessed, version-pegged packages to build against (e.g., Stackage). Eventually, new-build will simply replace the existing build command.

Quick start

Nix-style local builds “just work”: there is very little configuration that needs to be done to start working with it.

  1. Download and install cabal-install 1.24:

    cabal update
    cabal install cabal-install

    Make sure the newly installed cabal is in your path.

  2. To build a single Cabal package, instead of running cabal configure; cabal build, you can use Nix-style builds by prefixing these commands with new-; e.g., cabal new-configure; cabal new-build. cabal new-repl is also supported. (Unfortunately, other commands are not yet supported, e.g. new-clean (#2957) or new-freeze (#2996).)

  3. To build multiple Cabal packages, you need to first create cabal.project file in some root directory. For example, in the Cabal repository, there is a root directory with a folder per package, e.g., the folders Cabal and cabal-install. Then in cabal.project, specify each folder:

    packages: Cabal/
              cabal-install/

    Then, in the directory for a package, you can say cabal new-build to build all of the components in that package; alternately, you can specify a list of targets to build, e.g., package-tests cabal asks to build the package-tests test suite and the cabal executable. A component can be built from any directory; you don't have to be cd'ed into the directory containing the package you want to build. Additionally, you can qualify targets by the package they came from, e.g., Cabal:package-tests asks specifically for the package-tests component from Cabal. There is no need to manually configure a sandbox: add a cabal.project file, and it just works!

Unlike sandboxes, there is no need to add-source; just add the package directories to your cabal.project. And unlike traditional cabal install, there is no need to explicitly ask for packages to be installed; new-build will automatically fetch and build dependencies.

There is also a convenient script you can use for hooking up new-build to your Travis builds.

How it works

Nix-style local builds are implemented with these two big ideas:

  1. For external packages (from Hackage), prior to compilation, we take all of the inputs which would influence the compilation of a package (flags, dependency selection, etc.) and hash it into an identifier. Just as in Nix, these hashes uniquely identify the result of a build; if we compute this identifier and we find that we already have this ID built, we can just use the already built version. These packages are stored globally in ~/.cabal/store; you can list all of the Nix packages that are globally available using ghc-pkg list --package-db=$HOME/.cabal/store/ghc-VERSION/package.db.
  2. For local packages, we instead assign an inplace identifier, e.g., foo-0.1-inplace, which is local to a given cabal.project. These packages are stored locally in dist-newstyle/build; you can list all of the per-project packages using ghc-pkg list --package-db=dist-newstyle/packagedb. This treatment applies to any remote packages which depend on local packages (e.g., if you vendored some dependency which your other dependencies depend on.)

Furthermore, Nix local builds use a deterministic dependency solving strategy, by doing dependency solving independently of the locally installed packages. Once we've solved for the versions we want to use and have determined all of the flags that will be used during compilation, we generate identifiers and then check if we can improve packages we would have needed to build into ones that are already in the database.
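The first idea can be sketched in a few lines: hash every input that influences the build into an identifier and use it as the cache key (a toy Python illustration of the principle, not cabal's actual code):

```python
import hashlib
import json

store = {}  # identifier -> built artifact (stands in for ~/.cabal/store)

def build_id(name, version, flags, dep_ids):
    """Hash everything that influences compilation into one identifier,
    so identical inputs always map to the same store entry."""
    payload = json.dumps({"name": name, "version": version,
                          "flags": sorted(flags.items()),
                          "deps": sorted(dep_ids)}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()[:16]

def build(name, version, flags, dep_ids, compile_fn):
    """Compile only if this exact combination of inputs is new."""
    bid = build_id(name, version, flags, dep_ids)
    if bid not in store:          # cache miss: actually compile
        store[bid] = compile_fn()
    return bid
```

cabal's real identifiers also fold in the compiler version, source hashes, and the full transitive dependency solution.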


Commands

new-configure FLAGS

Overwrites cabal.project.local based on FLAGS.

new-build [FLAGS] [COMPONENTS]

Builds one or more components, automatically building any local and non-local dependencies (where a local dependency is one where we have an inplace source code directory that we may modify during development). Non-local dependencies which do not have a transitive dependency on a local package are installed to ~/.cabal/store, while all other dependencies are installed to dist-newstyle.

The set of local packages is read from cabal.project; if none is present, it assumes a default project consisting of all the Cabal files in the local directory (i.e., packages: *.cabal), and optional packages in every subdirectory (i.e., optional-packages: */*.cabal).

The configuration of the build of local packages is computed by reading flags from the following sources (with later sources taking priority):

  1. ~/.cabal/config
  2. cabal.project
  3. cabal.project.local (usually generated by new-configure)
  4. FLAGS from the command line

The configuration of non-local packages is only affected by package-specific flags in these sources; global options are not applied to the build. (For example, if you --disable-optimization, this will only apply to your local inplace packages, and not their remote dependencies.)

new-build does not read configuration from cabal.config.


Here is a handy phrasebook for how to do existing Cabal commands using Nix local build:

old-style                            new-style
cabal configure                      cabal new-configure
cabal build                          cabal new-build
cabal clean                          rm -rf dist-newstyle cabal.project.local
cabal run EXECUTABLE                 cabal new-build; ./dist-newstyle/build/PACKAGE-VERSION/build/EXECUTABLE/EXECUTABLE
cabal repl                           cabal new-repl
cabal test TEST                      cabal new-build; ./dist-newstyle/build/PACKAGE-VERSION/build/TEST/TEST
cabal benchmark BENCH                cabal new-build; ./dist-newstyle/build/PACKAGE-VERSION/build/BENCH/BENCH
cabal haddock                        does not exist yet
cabal freeze                         does not exist yet
cabal install --only-dependencies    unnecessary (handled by new-build)
cabal install                        does not exist yet (for libraries new-build should be sufficient; for executables, they can be found in ~/.cabal/store/ghc-GHCVER/PACKAGE-VERSION-HASH/bin)

cabal.project files

cabal.project files actually support a variety of options beyond packages for configuring the details of your build. Here is a simple example file which displays some of the possibilities:

-- For every subdirectory, build all Cabal files
-- (project files support multiple Cabal files in a directory)
packages: */*.cabal
-- Use this compiler
with-compiler: /opt/ghc/8.0.1/bin/ghc
-- Constrain versions of dependencies in the following way
constraints: cryptohash < 0.11.8
-- Do not build benchmarks for any local packages
benchmarks: False
-- Build with profiling
profiling: true
-- Suppose that you are developing Cabal and cabal-install,
-- and your local copy of Cabal is newer than the
-- distributed hackage-security allows in its bounds: you
-- can selectively relax hackage-security's version bound.
allow-newer: hackage-security:Cabal

-- Settings can be applied per-package
package cryptohash
  -- For the build of cryptohash, instrument all functions
  -- with a cost center (normally, you want this to be
  -- applied on a per-package basis, as otherwise you would
  -- get too much information.)
  profiling-detail: all-functions
  -- Disable optimization for this package
  optimization: False
  -- Pass these flags to GHC when building
  ghc-options: -fno-state-hack

package bytestring
  -- And bytestring will be built with the integer-simple
  -- flag turned off.
  flags: -integer-simple

When you run cabal new-configure, it writes out a cabal.project.local file which saves any extra configuration options from the command line; if you want to know how a command line arguments get translated into a cabal.project file, just run new-configure and inspect the output.

Known issues

As a tech preview, the code is still a little rough around the edges. Here are some more major issues you might run into:

  • Although dependency resolution is deterministic, if you update your Hackage index with cabal update, dependency resolution will change too. There is no cabal new-freeze, so you'll have to manually construct the set of desired constraints.
  • A new feature of new-build is that it avoids rebuilding packages when there have been no changes to them, by tracking the hashes of their contents. However, this dependency tracking is not 100% accurate (specifically, it relies on your Cabal file accurately reporting all file dependencies ala sdist, and it doesn't know about search paths). There's currently no UI for forcing a package to be recompiled; however you can induce a recompilation fairly easily by removing an appropriate cache file: specifically, for the package named p-1.0, delete the file dist-newstyle/build/p-1.0/cache/build.

If you encounter other bugs, please let us know on Cabal's issue tracker.

by Edward Z. Yang at May 02, 2016 04:45 PM


Hot Startups on AWS – April 2016 – Robinhood, Dubsmash, Sharethrough

Continuing with our focus on hot AWS-powered startups (see Hot Startups on AWS – March 2016 for more info), this month I would like to tell you about:

  • Robinhood – Free stock trading to democratize access to financial markets.
  • Dubsmash – Bringing joy to communication through video.
  • Sharethrough – An all-in-one native advertising platform.

The founders of Robinhood graduated from Stanford and then moved to New York to build trading platforms for some of the largest financial institutions in the world. After seeing that these institutions charged investors up to $10 to place trades that cost almost nothing, they moved back to California with the goal of democratizing access to the markets and empowering personal investors.

Starting with the idea that a technology-driven brokerage could operate with significantly less overhead than a traditional firm, they built a self-serve service that allows customers to sign up in less than 4 minutes. To date, their customers have transacted over 3 billion dollars while saving over $100 million dollars in commissions.

After a lot of positive pre-launch publicity, Robinhood debuted with a waiting list of nearly a million people. Needless to say, they had to pay attention to scale from the very beginning. Using 18 distinct AWS services, an initial team of just two DevOps engineers built the entire system. They use AWS Identity and Access Management (IAM) to regulate access to services and to data, simplifying their all-important compliance efforts. The Robinhood data science team uses Amazon Redshift to help identify possible instances of fraud and money laundering. Next on the list is international expansion, with plans to make use of multiple AWS Regions.

The founders of Dubsmash had previously worked together to create several video-powered applications. As the cameras in smartphones continued to improve, they saw an opportunity to create a platform that would empower people to express themselves visually. Starting simple, they built their first prototype in a couple of hours. The functionality was minimal: play a sound, select a sound, record a video, and share. The initial response was positive and they set out to build the actual product.

The resulting product, Dubsmash, allows users to combine video with popular sound bites and to share the videos online – with a focus on modern messaging apps. The founders began working on the app in the summer of 2014 and launched the first version the following November. Within a week it reached the top spot in the German App Store. As often happens, early Dubsmash users have put the app to use in intriguing and unanticipated ways. For example, Eric Bruce uses Dubsmash to create entertaining videos of him and his young son Jack to share with Priscilla (Eric’s wife / Jack’s mother) (read Watch A Father and His Baby Son Adorably Master Dubsmash to learn more).

Dubsmash uses Amazon Simple Storage Service (S3) for video storage, with content served up through Amazon CloudFront.  They have successfully scaled up from their MVP and now handle requests from millions of users. To learn more about their journey, read their blog post, How to Serve Millions of Mobile Clients with a Single Core Server.

Way back in 2008, a pair of Stanford graduate students were studying the concept of virality and wanted to create ads that would deserve your attention rather than simply stealing it. They created Sharethrough, an all-in-one native advertising platform for publishers, app developers, and advertisers. Today the company employs more than 170 people and serves over 3 billion native ad impressions per month.

Sharethrough includes a mobile-first content-driven platform designed to engage users with quality content that is integrated into the sites where it resides. This allows publishers to run premium ads and to maintain a high-quality user experience. They recently launched an AI-powered guide that helps to maximize the effectiveness of ad headlines.

Sharethrough’s infrastructure is hosted on AWS, where they make use of over a dozen high-bandwidth services including Kinesis and Dynamo, for the scale of the technical challenges they face. Relying on AWS allows them to focus on their infrastructure-as-code approach, utilizing tools like Packer and Terraform for provisioning, configuration and deployment. Read their blog post (Ops-ing with Packer and Terraform) to learn more.




by Jeff Barr at May 02, 2016 04:24 PM


Edge-partitioning cubic graphs into claws and paths

Again an edge-partitioning problem whose complexity I'm curious about, motivated by a previous question of mine.

Input: a cubic graph $G=(V,E)$

Question: is there a partition of $E$ into $E_1, E_2, \ldots, E_s$, such that the subgraph induced by each $E_i$ is either a claw (i.e. $K_{1,3}$, often called a star) or a $3$-path (i.e. $P_4$)?

I think I saw a paper one day where this problem was proven to be NP-complete, but I cannot find it anymore, and I don't remember whether that result applied to cubic graphs. On a related matter, I'm aware that edge-partitioning a bipartite graph into claws is NP-complete (see Dyer and Frieze). Does anyone have a reference for the problem I describe, or something related (i.e. the same problem on another graph class, that I could then try to reduce to cubic graphs)?
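For experimenting with small instances, here is a brute-force verifier (my own sketch; the function names and edge encoding are assumptions, not from any paper). It uses the fact that among simple graphs with exactly three edges, the degree sequence $[1,1,1,3]$ characterizes the claw $K_{1,3}$ and $[1,1,2,2]$ characterizes the $3$-path $P_4$:

```python
from collections import Counter

def classify(part):
    """Classify a set of 3 edges as 'claw' (K_{1,3}), 'P4' (path on 4
    vertices), or None. Edges are 2-tuples of vertices."""
    if len(part) != 3:
        return None
    degs = sorted(Counter(v for e in part for v in e).values())
    if degs == [1, 1, 1, 3]:
        return "claw"   # one centre of degree 3, three leaves
    if degs == [1, 1, 2, 2]:
        return "P4"     # two endpoints, two inner vertices
    return None

def is_valid_partition(edges, parts):
    """True iff `parts` partitions `edges` and every part is a claw or P4."""
    flat = [e for p in parts for e in p]
    if sorted(map(tuple, map(sorted, flat))) != sorted(map(tuple, map(sorted, edges))):
        return False
    return all(classify(p) is not None for p in parts)
```

For example, the (cubic) $K_4$ splits into the two $3$-paths $0{-}1{-}2{-}3$ and $2{-}0{-}3{-}1$.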

by Anthony Labarre at May 02, 2016 04:12 PM


Comparing asymptotic complexity of functions $\log{n}$, $(\log{n})^c$ and $\sqrt{n}$ [duplicate]

This question already has an answer here:

I usually follow the approach of taking logs, substituting an arbitrarily large power of $2$ for $n$, and reducing the given function to some constant value for large $n$. So in this case I did it as follows:

  • $\log{n}$

    Putting $n=2^{2^{2^{16}}}$ and taking $\log$


  • $(\log{n})^c$

    Putting $n=2^{2^{2^{16}}}$, $c=2$ and taking $\log$



  • $\sqrt{n}$

    Putting $n=2^{2^{2^{16}}}$ and taking $\log$


So it looks like:

$\log{n}=\sqrt{n}<(\log {n})^c$

However I checked that $\log{8}>\sqrt{8}$. But, $\log{(2^{2^{2^{16}}})}=64$, whereas $\sqrt{2^{2^{2^{16}}}}$ is much larger number making equality between the two unlikely.

  1. So where I am making mistakes in above calculations?
  2. How above three functions compare asymptotically?
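For reference, a worked substitution (my addition, not part of the original question) that avoids plugging in one fixed number: write $n = 2^m$ and compare exponents.

```latex
\log n = m, \qquad (\log n)^c = m^c, \qquad \sqrt{n} = 2^{m/2}.
```

Since $m^c = 2^{c \log m}$ while $\sqrt{n} = 2^{m/2}$, and $c \log m = o(m/2)$, for every constant $c > 1$ we get $\log n = o\bigl((\log n)^c\bigr)$ and $(\log n)^c = o\bigl(\sqrt{n}\bigr)$, i.e. asymptotically $\log n \prec (\log n)^c \prec \sqrt{n}$. Small-$n$ comparisons such as $\log 8 > \sqrt{8}$ do not contradict this, since asymptotic statements only constrain sufficiently large $n$.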

by anir at May 02, 2016 04:11 PM

How are hash tables O(1) taking into account hashing speed?

Hash tables are said to be amortized $\Theta(1)$ using say simple chaining and doubling at a certain capacity.

However, this assumes the lengths of the elements are constant. Computing the hash of an element requires going through the element, taking $\Theta(l)$ time where $l$ is the length.

But to discriminate between $n$ elements, we need the elements to have length at least $\lg n$ bits; otherwise by pigeonhole principle they won't be distinct. The hash function going through $\lg n$ bits of element is going to take $\Theta(\lg n)$ time.

So can we instead say that the speed of a hash table, taking into account a reasonable hash function which uses all parts of the input, is actually $\Theta(\lg n)$? Why, then, are hash tables in practice efficient for storing variable-length elements, such as strings and large integers?
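For concreteness, here is a minimal sketch of the chaining-and-doubling scheme the question refers to (illustrative only; the class name and the load-factor threshold are my choices, not a standard). The $\Theta(l)$ hashing cost the question worries about sits inside `_index`:

```python
class ChainedHashTable:
    """Separate chaining with doubling once the load factor exceeds 1."""
    def __init__(self):
        self.capacity = 8
        self.size = 0
        self.buckets = [[] for _ in range(self.capacity)]

    def _index(self, key):
        # hash(key) itself costs Theta(l) for a key of length l --
        # exactly the cost the question is about.
        return hash(key) % self.capacity

    def put(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)
                return
        bucket.append((key, value))
        self.size += 1
        if self.size > self.capacity:          # load factor > 1: double
            old = [kv for b in self.buckets for kv in b]
            self.capacity *= 2
            self.buckets = [[] for _ in range(self.capacity)]
            for k, v in old:
                self.buckets[self._index(k)].append((k, v))

    def get(self, key):
        for k, v in self.buckets[self._index(key)]:
            if k == key:
                return v
        raise KeyError(key)
```

The amortized analysis charges each insert $O(1)$ *table* work (rehashing on doubling is paid for by earlier inserts); the per-operation cost of hashing the key itself is on top of that.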

by user54609 at May 02, 2016 04:10 PM


What happens when a big-data fetishist, one of those ...

What happens when a big-data fetishist, one of those startup hype generators, gets to manage a country can be observed in Estonia.

May 02, 2016 04:00 PM


High Scalability

Gone Fishin'

Well, not exactly Fishin', but I'll be on a month long vacation starting today. I won't be posting (much) new content, so we'll all have a break. Disappointing, I know. Please use this time for quiet contemplation and other inappropriate activities. See you on down the road...

by Todd Hoff at May 02, 2016 03:56 PM


Computing a histogram with the number of extant values not known in advance

(This may be more fitting for CSTheory, I'm not sure.)

I'm looking for practical or theoretical work (academic papers, online jottings, pseudocode, or code) on efficient algorithms for the following problem:

Unknown-Number-of-Bins Histogram


Input:

  • An array of integers $a$, of length $n$.

Output:

  • An array of integers $\text{bins}$ of length $m \le n$.
  • An array of unsigned integers $\text{counts}$, also of length $m$.

Output Requirements:

  • For every $i \in \{0,\ldots,m-1\}$ it must be the case that

    $\bigl|\bigl\{ j \in \{0,\ldots,n-1\} \mid a_j = \text{bins}_i \bigr\}\bigr| = \text{counts}_i$

    In other words, $\text{bins}$ and $\text{counts}$ constitute a histogram of $a$, with one bin for every unique value in $a$.

  • It is not required for $\text{bins}$ or $\text{counts}$ to be sorted.

Other Notes:

  • Complexity is considered as a function of both $n$ and $m$.
  • Low time complexity is required both asymptotically and for relatively low values of $m$ - but it is not required for low values of $n$.
  • No hiding monstrosities in the $\mathop{O}(\cdot)$ constants please! This should be usable in practice.
  • A parallel(izable) approach? You are most welcome :-)
  • Low space complexity is a plus.
  • Deterministic algorithms preferred, and barring that, go easy on those coin flips.

Clearly, there are many ways to go about this, some very straightforward, e.g. "sort the input, then build a sorted histogram in a single pass", in $\mathop{O}(n \log{n})$ time. Of course I'm interested in something better....
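For comparison, the standard hash-based alternative (a sketch of mine, with the caveat that it is randomized in the sense that hashing has adversarial worst cases, which may violate the "go easy on coin flips" preference) runs in expected $\mathop{O}(n)$ time and $\mathop{O}(m)$ extra space:

```python
def histogram(a):
    """Single pass; one bin per distinct value of a.
    Expected O(n) time, O(m) extra space; bins come out unsorted,
    which the problem statement explicitly allows."""
    counts = {}
    for x in a:
        counts[x] = counts.get(x, 0) + 1
    bins = list(counts.keys())
    return bins, [counts[b] for b in bins]
```

It also parallelizes naturally: give each of $p$ workers a private table over a slice of $a$, then merge the $p$ tables in $\mathop{O}(pm)$ time.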

by einpoklum at May 02, 2016 03:51 PM

How to find maxflow with minimum number of edges?

I am struggling with the following problem:

You are given a source $s$, a sink $t$, and a bipartite graph $G$. All vertices $v$ in the left half are connected to the source $s$ with a given capacity $C[v]$. Likewise, all vertices $u$ in the right half of the graph are connected to the sink $t$ with a capacity $B[u]$. Each of these edges costs 0 units of money (it is free).

Finally, each vertex $v$ in the left half of the graph is connected to each vertex $u$ in the right half with capacity equal to infinity. Each of these edges costs 1 unit of money.

All edges are directed.

What we want to find is the maximum flow and the least amount of money to create it (the minimum sum of costs of all the edges we have selected). Also we want to print which edges we have selected.

Number of nodes <= 20
Number of edges <= 400

I thought of using mincost maxflow (It is reasonably good at time complexity) but I have no idea how to apply it.
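One caveat first: as stated, the objective charges 1 per *selected* middle edge (a fixed charge), which a per-unit-cost min-cost max-flow does not directly capture, since every unit of flow crosses exactly one middle edge. Still, the routine one would start from is a textbook successive-shortest-paths min-cost max-flow; here is a sketch (names and structure are mine), where the middle edges get a large integer capacity standing in for infinity and the flow on each input edge is read off the reverse residual arcs:

```python
def min_cost_max_flow(n, edges, s, t):
    """edges: list of (u, v, cap, cost). Returns (flow, cost, flow_per_edge).
    Successive shortest augmenting paths via Bellman-Ford."""
    INF = float("inf")
    graph = [[] for _ in range(n)]
    to, cap, cost = [], [], []
    def add(u, v, c, w):
        graph[u].append(len(to)); to.append(v); cap.append(c); cost.append(w)
        graph[v].append(len(to)); to.append(u); cap.append(0); cost.append(-w)
    for u, v, c, w in edges:
        add(u, v, c, w)
    flow = total_cost = 0
    while True:
        # cheapest s-t path in the residual graph (handles negative arcs)
        dist = [INF] * n; dist[s] = 0; prev = [-1] * n
        for _ in range(n - 1):
            changed = False
            for u in range(n):
                if dist[u] == INF:
                    continue
                for e in graph[u]:
                    if cap[e] > 0 and dist[u] + cost[e] < dist[to[e]]:
                        dist[to[e]] = dist[u] + cost[e]; prev[to[e]] = e
                        changed = True
            if not changed:
                break
        if dist[t] == INF:
            break
        push, v = INF, t                    # bottleneck along the path
        while v != s:
            e = prev[v]; push = min(push, cap[e]); v = to[e ^ 1]
        v = t
        while v != s:
            e = prev[v]; cap[e] -= push; cap[e ^ 1] += push; v = to[e ^ 1]
        flow += push; total_cost += push * dist[t]
    used = [cap[2 * i + 1] for i in range(len(edges))]  # reverse residual = flow
    return flow, total_cost, used
```

The edges with `used[i] > 0` are the selected ones. Counting them (rather than `total_cost`) would give the fixed-charge objective, but note that minimizing that count is a genuinely different, harder problem.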

by Lepluto at May 02, 2016 03:37 PM


Dynamically changing weights in TensorFlow

In TensorFlow, I'm trying to change weights during training, but get no change in the results. I've tried to disrupt the weights (set to zero), but it seems to do nothing (other than take longer to complete). What am I missing? Is there a way to manipulate W like a regular matrix/tensor during session?

from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

import tensorflow as tf
sess = tf.InteractiveSession()
x = tf.placeholder(tf.float32, shape=[None, 784])
y_ = tf.placeholder(tf.float32, shape=[None, 10])
W = tf.Variable(tf.zeros([784,10]), trainable=True)
W2 = tf.Variable(tf.zeros([784,10]), trainable=False)
b = tf.Variable(tf.zeros([10]))

y = tf.nn.softmax(tf.matmul(x,W) + b)
loss = tf.reduce_mean(tf.square(y_ - y))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(loss)

for i in range(1000):
#try to change W during training
  W = W2

  W = tf.Variable(tf.zeros([784,10]))


  batch = mnist.train.next_batch(1)
  train_step.run(feed_dict={x: batch[0], y_: batch[1]})

correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))

accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

print(accuracy.eval(feed_dict={x: mnist.test.images, y_: mnist.test.labels}))

Accuracy remains the same (0.82).

by Danny at May 02, 2016 03:34 PM

Computing ROC curve for K-NN classifier

As you probably know, in K-NN, the decision is usually taken according to the "majority vote", and not according to some threshold - i.e. there is no parameter to base a ROC curve on.

Note that in the implementation of my K-NN classifier, the votes don't have equal weights. I.e. the weight for each "neighbor" is e^(-d), where d is the distance between the tested sample and the neighbor. This measure gives higher weights for the votes of the nearer neighbors among the K neighbors.

My current decision rule is that if the sum of the scores of the positive neighbors is higher than the sum of the scores of the negative samples, then my classifier says POSITIVE, else, it says NEGATIVE.

But - There is no threshold.

Then, I thought about the following idea:

Deciding on the class of the samples which has a higher sum of votes, could be more generally described as using the threshold 0, for the score computed by: (POS_NEIGHBORS_SUMMED_SCORES - NEG_NEIGHBORS_SUMMED_SCORES)

So I thought of changing my decision rule to use a threshold on that measure, and plotting a ROC curve based on thresholds over the values of that score difference.
Does it sound like a good approach for this task?
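That construction can be sketched directly (illustrative code with made-up scores, not data from the question): compute score = positive-sum minus negative-sum for each test sample, then sweep a threshold over the observed scores to trace out (FPR, TPR) pairs:

```python
def roc_points(scores, labels):
    """labels: 1 = positive, 0 = negative. Returns the (fpr, tpr) pairs
    obtained by thresholding `scores` at each distinct value, from high
    to low (so the curve moves from (0, 0) towards (1, 1))."""
    P = sum(labels)
    N = len(labels) - P
    points = []
    for th in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= th and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= th and y == 0)
        points.append((fp / N, tp / P))
    return points
```

Here `scores[i]` would be `pos_sum - neg_sum` for test sample `i`, so the original majority rule corresponds to the single operating point at threshold 0.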

by SomethingSomething at May 02, 2016 03:26 PM

Generate new observations that meet past trend

I have 100 observations, each containing ten samples.

Observation 1: 0.7 0.6 0.9 0.5 1.2 1.6 0.98 0.65 1.34 1.22
Observation 2: ...
Observation 100: ...

The underlying assumption is that all 100 observations follow some common trend.

Based on the above data, I want to apply machine learning and come up with new observations each time that meet that trend. How is that possible?

More concretely, how should I frame the machine learning problem? What should my dependent/independent variables be?

by Learner at May 02, 2016 03:22 PM


Is $LOGSPACE \subsetneq QMA$ an open problem?

Having read some chapters of Computational Complexity: A Modern Approach, I see no time or space hierarchy theorem which applies to this case. As far as I can see, we know the following inclusions:

$L \subseteq NL \subseteq P \subseteq BPP \subseteq BQP \subseteq QMA$

but no inclusion is known to be strict, and it is not known whether at least one of these inclusions is strict. The statements $L=P, P=BQP, BQP=QMA$ are all unsolved, so I am free to conjecture that they are all true, or all false. I know that it is absolutely obvious that $L\subsetneq QMA$, because to go from one to the other one must (1) remove the logarithmic space bound, (2) introduce randomness to the algorithm, (3) add quantum and (4) add nondeterminism. This question is interesting because each of these augmentations is believed but not proved to add computational power (with the possible exception of (2)), they correspond to their respective flagship inclusions above, and this question captures all of them simultaneously, showcasing an enormously long distance unsolved relationship. It is also interesting because to my knowledge it is the longest-distance unresolved inclusion between two natural complexity classes (classes that arise naturally in studying theory and which capture fundamental notions of resource-bounded computation, but this point is subjective).

The only strict inclusions I can find are $L\subsetneq Space(n) \not= NP$ and $L \subsetneq PSPACE \supseteq P$. The first is not satisfying because $Space(n) \subsetneq NP$ is open.

This also raises the question: granted that these augmentations are individually hard to reason about, surely you cannot add them all at once without changing your complexity class? And is it easier to prove such weaker strict inclusions?

(and, tongue in cheek, if the answer to my question is yes, what have complexity theorists been up to the past few decades?)

by Lieuwe Vinkhuijzen at May 02, 2016 03:06 PM


Are side effects everything that cannot be found in a pure function?

Is it safe to say that the following dichotomy holds:

Each given function is

  • either pure
  • or has side effects

If so, side effects (of a function) are anything that can't be found in a pure function.
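A minimal illustration of the contrast (my example, not the asker's):

```python
def pure_add(x, y):
    # Pure: the result depends only on the arguments, and nothing
    # outside the function is observed or modified.
    return x + y

log = []

def impure_add(x, y):
    # Side effect: mutates state (the `log` list) outside the function.
    log.append((x, y))
    return x + y
```

One wrinkle worth noting: a function can fail to be pure without *writing* anything, e.g. by reading a global mutable variable or the clock. So "pure" is usually taken to mean referentially transparent, which is strictly stronger than merely "performs no writes", and the dichotomy holds only under that broader reading of "side effects".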

by Aleksey Bykov at May 02, 2016 03:03 PM


Unexpected data windfall at the AfD: names and addresses ...

Unexpected data windfall at the AfD: the names and addresses of 2000 attendees of the AfD party convention have been leaked. You have probably seen it already.

I'm not linking to it, because I condemn it.

Would we approve of this the other way around, if AfD sympathizers did the same with a party convention of the Left Party or the SPD?

No, we would not.

If the AfD were anti-constitutional, then such a list could be compiled, but then it should not be some internet vigilantes doing it; it should be the responsible authorities.

Whether you like the AfD's positions or not: it is a party, and thus part of our democracy. Engage with them on the substance, not with demonstrations in front of politicians' private homes or by leaking personal data.

Always this damned pitchfork mentality; for my taste it is far too close to the Americans' torture-prison kidnappings. Just because I consider someone dangerous does not mean they have lost their human rights!

May 02, 2016 03:00 PM


Learning curves - Why does the training accuracy start so high, then suddenly drop?

I implemented a model in which I use Logistic Regression as classifier and I wanted to plot the learning curves for both training and test sets to decide what to do next in order to improve my model.

Just to give you some information: to plot the learning curve I defined a function that takes a model, a pre-split dataset (train/test X and Y arrays, NB: using the train_test_split function), and a scoring function as input, iterates through the dataset training on n exponentially spaced subsets, and returns the learning curves.
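The described procedure can be sketched as follows (the model, data, and scorer here are stand-ins of mine, not the asker's setup):

```python
def learning_curve(model_factory, X_train, y_train, X_test, y_test,
                   score, n_points=5):
    """Train on exponentially spaced prefixes of the training set and
    record train/test scores for each size."""
    sizes, train_scores, test_scores = [], [], []
    n = len(X_train)
    size = max(2, n // (2 ** (n_points - 1)))
    while size <= n:
        model = model_factory()
        model.fit(X_train[:size], y_train[:size])
        sizes.append(size)
        train_scores.append(score(model.predict(X_train[:size]), y_train[:size]))
        test_scores.append(score(model.predict(X_test), y_test))
        size *= 2
    return sizes, train_scores, test_scores

class MajorityClassifier:
    """Trivial stand-in model: always predicts the most common label."""
    def fit(self, X, y):
        self.label = max(set(y), key=y.count)
    def predict(self, X):
        return [self.label] * len(X)

def accuracy(pred, truth):
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)
```

With a real classifier in place of `MajorityClassifier`, plotting `train_scores` and `test_scores` against `sizes` gives exactly the curves discussed here.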

My results are shown in the image below.

I wonder why the training accuracy starts so high, then suddenly drops, and then starts to rise again as the training set size increases, and why the test accuracy behaves conversely. I thought the extremely good initial accuracy and the subsequent fall were due to noise from the small training subsets at the beginning, and that once the subsets became larger the curves started to rise, but I am not sure. Can someone explain this?

And finally, can we assume that these results mean a low variance/moderate bias (70% accuracy in my context is not that bad) and so to improve my model I must resort to ensemble methods or extreme feature engineering?

by DiamondDogs95 at May 02, 2016 02:59 PM


can we have a virtual destructor in a class? [on hold]

What are the different uses of a virtual constructor and a virtual destructor in a class? Can a virtual constructor or a virtual destructor be used in a class? A destructor cannot return any value. Are they important for deallocation?

by amruta at May 02, 2016 02:58 PM

Planet Emacsen

sachachua: 2016-05-02 Emacs News

Links from /r/orgmode, Hacker News, YouTube, the changes to the Emacs NEWS file, and emacs-devel.

Past Emacs News round-ups

The post 2016-05-02 Emacs News appeared first on sacha chua :: living an awesome life.

by Sacha Chua at May 02, 2016 02:48 PM


Finding one face in planar graph

Given a planar graph (represented using adjacency lists) we want to find a set of vertices which are around one (random) face. We know that the graph contains at least one triangle.

How do we find such a face?

by keram at May 02, 2016 02:46 PM


Evidence of containment of $PH$

We know that $PH$ is in $P^{PP}$ or in $P^{\#P}$ and we do not know if $PH$ is in $PP$. We know $AWPP$ and $APP$ are weakening of $PP$ where $AWPP$ is in $APP$ is in $PP$.

(1) Is it possible if $PH$ is in $P^{AWPP}$ or $P^{APP}$ or is there any consequences if $PH$ is in $P^{AWPP}$ or $P^{APP}$? Would it make it more plausible $PH$ is in $PP$?

(2) Is there an analog of the decision version of $\#P$, which is $PP$, for the classes $AWPP$ or $APP$?

by Turbo at May 02, 2016 02:33 PM


find a minimum-cost pair of arc-disjoint paths, both within a given restricted distance

Is there a polynomial algorithm that can find a pair of arc-disjoint paths in a directed graph that has a minimum total cost, subject to the condition that both paths are within the same distance.

Given a distance $D$ and a graph $G(V,E)$, where $V$ is a set of nodes and $E$ is a set of directed links (or arcs). Each link has two additive metrics: a cost (binary, $0$ or $1$) and a distance (a positive integer). For a source-destination pair $(s,t)$, find a set of two paths $P_1$ and $P_2$, where $P_1$ and $P_2$ are arc-disjoint and the path distances satisfy $L(P_1)\le D$ and $L(P_2)\le D$, such that the total cost is minimized. Note that the distance and the cost of a path are the sums of the distances and the costs, respectively, of the links it traverses.

I know that when the cost of each arc is an integer or a real number, this is an NP-complete problem; see GCSDP($k$). I do not know whether it is still NP-complete when the cost of each arc is limited to $\{0,1\}$. In any case, I want to know if there is a polynomial solution for it.

by Along Chare at May 02, 2016 02:25 PM


Features construction packages/source code

I'm currently working on feature construction algorithms; I have read many papers on them (FRINGE [Pagallo & Haussler, 1990], CITRE [Matheus and Rendel, 90], GALA [Yuh-Jyh Hu, Dennis Kibler, 1996], Genetic Programming, ...). I looked for implementations, and as they are not very popular, I didn't find many interesting results. I found this library for RapidMiner, and this one for Weka, but nothing more. Do you have links to feature construction implementations (packages, source code, tools)? Also, do you know whether some of those algorithms are only applicable to specific problems, or whether they can be applied to any problem?

Thanks in advance for any help

by Cedric FOTSO at May 02, 2016 02:07 PM


What are you working on this week?

This is the weekly thread to discuss what you have done recently and are working on this week.

Please be descriptive and don’t hesitate to champion your accomplishments or ask for help, advice or other guidance.

by caius at May 02, 2016 02:05 PM

Planet Theory

TCC 2016-B Call For Papers

(Posting this here as a backup, since the main TCC website is temporarily down. The submission server is up and running though.) The Fourteenth Theory of Cryptography Conference will be held in Beijing, China, sponsored by the International Association for Cryptologic Research (IACR). Papers presenting original research on foundational and theoretical aspects of cryptography are […]

by adamdsmith at May 02, 2016 02:02 PM


A short comment from state television on the subject of TTIP: Not a ...

A short comment from state television on the subject of TTIP:
Not a word about the fact that, since the start of the TTIP negotiations in 2013 and the criticism that was absolutely justified back then, an enormous amount has changed in terms of transparency: the texts that Greenpeace put online at 11:00 have already been available for at least a month to the members of the Bundestag as well as of other national parliaments and of the European Parliament.
NO, what an outrage! Wow, that really gets under my skin. Did they seriously find someone who still defends the Sun King governing style on TTIP!?

That's what makes paying your broadcasting fees worthwhile!1!!

May 02, 2016 02:00 PM


How to construct a running kd-tree?

I have a stream of 3-tuples of type (x,y,t) where x and y are in the range 0-127 and t is time, therefore monotonic increasing.

I want to be able to quickly search for points around a query point and think that constructing a kd-tree from the data is a good way to achieve that.

Now I can hardly construct a kd-tree by inserting each new point since, due to the monotonically increasing t, it will inevitably become unbalanced.

Is there a way to create a "running" kd-tree, that is inserting elements on the one t-side, while dropping elements on the other t-side without the tree becoming unbalanced?
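Since x and y are bounded (0-127), one simpler alternative to a self-balancing kd-tree is a fixed grid of buckets plus a global time-ordered queue for expiry. This is my own sketch of that idea, not an answer from the thread; it exploits the monotone t for O(1) amortized insert/expire and answers box queries by scanning only nearby cells:

```python
from collections import deque

class RunningGrid:
    """Points live in a 128x128 grid of deques; a global FIFO of (x, y, t)
    drops points older than a time window, exploiting monotone t."""
    def __init__(self, window):
        self.window = window
        self.grid = [[deque() for _ in range(128)] for _ in range(128)]
        self.fifo = deque()

    def insert(self, x, y, t):
        self.grid[x][y].append(t)
        self.fifo.append((x, y, t))
        # Expire points older than the window; since both the global FIFO
        # and each cell's deque are in time order, the globally oldest
        # point is also the oldest in its cell.
        while self.fifo and self.fifo[0][2] < t - self.window:
            ox, oy, _ = self.fifo.popleft()
            self.grid[ox][oy].popleft()

    def query(self, x, y, r):
        """All live points within Chebyshev distance r of (x, y)."""
        out = []
        for i in range(max(0, x - r), min(128, x + r + 1)):
            for j in range(max(0, y - r), min(128, y + r + 1)):
                out.extend((i, j, t) for t in self.grid[i][j])
        return out
```

Whether this beats a rebalancing kd-tree depends on the query radius and point density, but for a 128x128 domain the constant factors are hard to argue with.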

by fho at May 02, 2016 01:50 PM


Teaching investment portfolio maths

I don't know whether this question is in order here. I do a bit of teaching and I am preparing my own notes, but I thought that this should not be necessary.

In which book/pdf on the web can we find a basic but rigorous treatment of the notions

  • return (log,geometric)
  • expected return (arithmetic/geometric)
  • volatility (annualizing, ...)
  • Sharpe ratio
  • maybe more (e.g. draw down)

both in the case of one asset and in the portfolio setting (where matrix algebra can be applied).

I would love to have that one paper from the net that contains this short intro. It would answer 10% of the questions posted here, too.

If it is not on the web - let us write it ;)

by Richard at May 02, 2016 01:44 PM


p2k16 Hackathon Report: tb@ on documentation, ports, wireless

The second p2k16 report comes from first time hackathon attendee Theo Buehler, who writes:

Earlier this year gilles@ invited me to attend p2k16 in Nantes. This was going to be my first hackathon. Despite the fact that it is in the middle of the semester, I could arrange to take a week off and thus got the opportunity to finally meet a few members of the project.

May 02, 2016 01:42 PM

Planet Emacsen

Irreal: Building a Hugo Blog with Org Mode

Chris Bonnell has a post up on how he blogs using Org and the Hugo engine. It's another static page solution like the Jekyll and Nikola ones that I've written about before.

This solution is a little trickier because Hugo doesn't support Org markdown but Bonnell shows how to get around that. You'll probably need to follow the two links he gives to understand his setup. If you're trying to decide on a static blogging platform, give Bonnell's post a read.

by jcs at May 02, 2016 01:41 PM

Planet Theory

Handbook of Computational Social Choice

We are delighted to announce that the Handbook of Computational Social Choice has now been published with Cambridge University Press.

handbook_cscDescription: The rapidly growing field of computational social choice, at the intersection of computer science and economics, deals with the computational aspects of collective decision making. This handbook, written by thirty-six prominent members of the computational social choice community, covers the field comprehensively. Chapters devoted to each of the field’s major themes offer detailed introductions. Topics include voting theory (such as the computational complexity of winner determination and manipulation in elections), fair allocation (such as algorithms for dividing divisible and indivisible goods), coalition formation (such as matching and hedonic games), and many more. Graduate students, researchers, and professionals in computer science, economics, mathematics, political science, and philosophy will benefit from this accessible and self-contained book.

A PDF of the book is freely available on the Cambridge University Press website. Click on the Resources tab, then on “Resources” under “General Resources”, and you will find a link called “Online Version”. The password is cam1CSC.

Alternatively, the book can be purchased through Cambridge University Press, Amazon, and other retailers.

We hope that the book will become a valuable resource for the computational social choice community, and the CS-econ community at large.

Best regards,
Felix Brandt, Vince Conitzer, Ulle Endriss, Jerome Lang, and Ariel Procaccia (the editors)

by felixbrandt at May 02, 2016 01:39 PM


Initiating new orders with active "order-session" only?

Is it a must to establish a quote session and subscribe to quotes/market data before initiating a New Order Single (Market, GTC)? I actually can't see any use of the quote session for trading activities, and my FIX bridge opens single new orders using the order session only.

The reasons I want to avoid the quote session: 1. It can add to overall processing time/latency. 2. I can log on to the quote session and use the quote flow from a different application. Any thoughts on these?

by Reza Str at May 02, 2016 01:32 PM



Implications of Halting Problem being unsolvable?

I came across a confusing situation when reducing the Halting Problem (HP) to the Blank Tape Accepting Problem (BP).

We know that since HP can be reduced to BP, BP is decidable $\implies$ HP is solvable. By contrapositive, we get that if HP is undecidable, then BP is undecidable as well. But now this creates an issue.

BP is just a special case of HP where the input to HP is a Turing machine and a blank tape. Also, HP is a harder version of BP, since we can get all possible tapes as input. So, just because the harder problem is unsolvable, how can we conclude that the special case is also unsolvable?

It might be possible that we have some TM that can solve BP but cannot solve HP.

I know that both BP and HP are undecidable, but my main question is: how can we conclude via reduction that the undecidability of HP implies the undecidability of BP?

by Banach Tarski at May 02, 2016 01:21 PM


Python generic functions dispatching

Is there any way to dispatch a function whose arguments may be:

  • single lambda-function (i.e. func(lambda x: x))
  • kwargs (i.e. func(a='some sting', b='some other string'))

As I see it, singledispatch decorator from functools only supports dispatching on the first argument, which in my case won't work. Am I missing something?
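You are not missing anything: `functools.singledispatch` dispatches on the type of the first positional argument only. One workaround (a hand-rolled sketch of mine, not a stdlib feature) is a wrapper that inspects how it was called and forwards to the appropriate implementation:

```python
def func(*args, **kwargs):
    """Dispatch on call shape: a single callable vs. keyword arguments."""
    if args and callable(args[0]) and not kwargs:
        return func_from_callable(args[0])
    if kwargs and not args:
        return func_from_kwargs(**kwargs)
    raise TypeError("expected a single callable or keyword arguments only")

def func_from_callable(f):
    # Hypothetical handler: apply the callable to a sample value.
    return "callable: %r" % f(1)

def func_from_kwargs(**kwargs):
    # Hypothetical handler: report the keyword names received.
    return "kwargs: %s" % sorted(kwargs)
```

`func(lambda x: x)` and `func(a='some string', b='some other string')` then take the two different paths.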

by Egor Biriukov at May 02, 2016 01:18 PM


Satisfiability 2 CNF-SAT to 3 CNF-SAT transformation/reduction

This reduction is trying to prove that 2CNF-SAT is also NP-complete, after proving that 3CNF-SAT is NP-complete.

Suppose we have a reduction that, given an instance of 2CNF-SAT with $k$ clauses over some number of variables, creates an instance of 3CNF-SAT with $2k$ clauses by introducing, for each clause $i$, a new variable $y_i$; then for the $i$'th 2SAT clause we generate two 3SAT clauses. This is a reduction from 2CNF-SAT to 3CNF-SAT.

Is this not a correct reduction, given that all of the other clauses after the transformation would still be 2CNF clauses except for the $i$'th one?
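The padding step itself can be written down concretely (a sketch with my own literal encoding: DIMACS-style nonzero integers, negatives for negated variables). Each 2-clause $(a \lor b)$ becomes $(a \lor b \lor y_i)$ and $(a \lor b \lor \lnot y_i)$, which together are satisfiable exactly when $(a \lor b)$ is, since $y_i$ appears in no other clause:

```python
def pad_2sat_to_3sat(clauses, n_vars):
    """clauses: list of 2-tuples of nonzero ints (DIMACS-style literals).
    Returns a 3-CNF with 2*len(clauses) clauses over
    n_vars + len(clauses) variables."""
    out = []
    for i, (a, b) in enumerate(clauses):
        y = n_vars + i + 1          # fresh variable for clause i
        out.append((a, b, y))
        out.append((a, b, -y))
    return out
```

Note the direction, though: this maps 2CNF-SAT *into* 3CNF-SAT, which only shows 2CNF-SAT is no harder than 3CNF-SAT; it does not transfer NP-hardness to 2CNF-SAT.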

by theCoder at May 02, 2016 01:09 PM


Telepolis on Cologne's New Year's Eve. The situation ...

Telepolis on Cologne's New Year's Eve. The situation was even worse than previously reported: large-scale failure by the authorities wherever you look, and afterwards, as always, everyone was busy only with covering their own backsides and moving themselves out of the spotlight. Like cockroaches when you lift the stone.

Money quote:

Around midnight it suddenly became so crowded in the middle of the bridge that people climbed in panic from the footpath over fences onto the tracks. The operations commander of the federal police recalled before the committee: "This is like Duisburg here," someone screamed, alluding to the Duisburg Love Parade (2010). With the cry for help "Save my son!" a father held his 5-year-old boy out to him; others screamed: "Either we jump onto the tracks or into the Rhine." The federal police thereupon stopped train traffic.
Holy crap, that is bad!

The state government is apparently still keeping documents about the matter secret.

On the other hand: in NRW, the Red-Green coalition is in power. The SPD has never produced proper planning, only "it'll be fine" fair-weather risk management.

Update: one more tidbit from the article, for the archive:

According to press reports, Kraft merely stated that disclosure could impair the "functional capacity of the government".

May 02, 2016 01:00 PM


How do I solve this? [on hold]

How do I start with this problem? Q) Design a DFA for the set of strings over {a,b} in which there are at least two occurrences of b between any two occurrences of a.
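One possible DFA (my construction; any equivalent automaton works) tracks how many b's have been seen since the last a: a safe state, two counting states, and a dead state. A quick simulation makes it easy to check against examples:

```python
# States: 'S'  = start, or >= 2 b's since the last a (safe to read an a),
#         'A0' = just saw an a,  'A1' = one b since the last a,
#         'D'  = dead (two a's with fewer than two b's between them).
DELTA = {
    ('S', 'a'): 'A0',  ('S', 'b'): 'S',
    ('A0', 'a'): 'D',  ('A0', 'b'): 'A1',
    ('A1', 'a'): 'D',  ('A1', 'b'): 'S',
    ('D', 'a'): 'D',   ('D', 'b'): 'D',
}
ACCEPTING = {'S', 'A0', 'A1'}   # everything except the dead state

def accepts(w):
    state = 'S'
    for ch in w:
        state = DELTA[(state, ch)]
    return state in ACCEPTING
```

The condition only constrains strings with two or more a's, which is why every non-dead state is accepting.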

by sky4597 at May 02, 2016 12:48 PM


What are the variables involved in constructing an ROC curve?

Say I have a classifier and I achieve FAR of 10% and FRR of 15%. What would I need to do with these to construct an ROC curve?

I'm having trouble seeing what they actually represent and the situation in which they are used. I don't seem to have an important variable the shifts the FAR and FRR in one direction or the other. Can I still use ROC?

by mino at May 02, 2016 12:47 PM

Should I detect the different sides of an object as a single object or as different objects?

Should I detect the different sides of an object as a single object or as different objects?

For example, I want to detect cars using one of the region-based convolutional deep neural network approaches:

I want to detect a car from any side, and I am not interested in which side I see.


All 3 sides of the car look very different, like dissimilar objects. Will it improve the detection if I detect each of the 3 sides as 3 different objects?

Or do the last few fully connected layers completely solve this problem in CNNs?

And how many fully connected layers, and how many neurons in each, are needed for the detection of 3 sides of each of 20 objects?

by Alex at May 02, 2016 12:36 PM


Global Min Var Portfolio Weights for each period (restricted model) [on hold]

I have daily returns for 5 assets from 1990 to 2015.

I am trying to see the evolution of the weights of the global minimum-variance portfolio under 2 restrictions: 1) all portfolio weights are strictly positive; 2) the sum of the portfolio weights equals 1.

I have code that gives me the global min-var weights for a given period, but to plot the weights for all periods I would need a loop around the function. I already tried a loop but it doesn't work. Would you be able to help me and tell me the code for the loop?

The code for the function is:

A = [];
b = [];
weights = fmincon(fun,x0,A,b,Aeq,beq,lb,ub);

Thanks in advance!

by Alice at May 02, 2016 12:27 PM


What is this recurrence doing [on hold]

I'm having a hard time understanding what is going on here. It seems like there are two indices i and j, and some kind of minimum is found recursively? I'm also not sure what role k plays.

$H[i,j] = \min_{i \le k < j} \bigl\{ H[i,k-1] + H[k+1,j] \bigr\} + \sum_{s=i}^{j} p_s$, with $H[i,i] = 0$
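This has the shape of the optimal-binary-search-tree / matrix-chain family of interval DPs: k ranges over split points of the interval i..j (e.g. the root chosen for keys i..j), and the sum adds the weights p_i..p_j of the interval. Here is a direct implementation of the standard form (the interpretation as optimal-BST cost, and the use of $i \le k \le j$ with empty intervals costing 0, are my assumptions, since the question gives no context):

```python
def interval_dp(p):
    """H[i][j] = min over i<=k<=j of H[i][k-1] + H[k+1][j] + sum(p[i..j]);
    empty intervals cost 0. This is the optimal-BST cost recurrence."""
    n = len(p)
    # Extra row/column of zeros so that H[i][i-1] and H[j+1][j] (empty
    # intervals) read as 0; column index -1 wraps to the padding column n.
    H = [[0.0] * (n + 1) for _ in range(n + 2)]
    prefix = [0.0] * (n + 1)
    for i in range(n):
        prefix[i + 1] = prefix[i] + p[i]
    for length in range(1, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            w = prefix[j + 1] - prefix[i]          # sum of p[i..j]
            H[i][j] = min(H[i][k - 1] + H[k + 1][j]
                          for k in range(i, j + 1)) + w
    return H[0][n - 1]
```

For probabilities p = [0.25, 0.5, 0.25] the minimum is achieved by splitting at the middle element, giving a total of 1.5.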

by Carlo at May 02, 2016 12:25 PM


Does anyone here use PGP/GPG? (Ctrl-F Apple) :-) A bit ...

Does anyone here use PGP/GPG? (Ctrl-F Apple) :-)

A bit of context. Money quote:

Special thanks to Mr. D. J. Bernstein for refinements to the algorithm that allowed us to reduce the required workload considerably
These are, by the way, the same people who were also behind this report. Back then the resolution was that those were bit flips, which then also had no valid signature, and which therefore could not even be used accidentally.

May 02, 2016 12:00 PM


How to choose a GARCH model which delivers iid standardized residuals?

For my thesis I first need to examine nine financial time series and fit a conditional volatility model such that the obtained standardized residuals ($z_t = \epsilon_t / \sigma_t$) are approximately iid with mean 0 and variance 1.

Whereas GARCH(1,1) succeeds in delivering iid standardized residuals for five of these series, and GJR-GARCH(1,1) achieves iid standardized residuals for another two series, I've not been able to get iid $z_t$ for the remaining two series using GARCH, GJR-GARCH, Threshold-GARCH, EGARCH, NAGARCH, or CSGARCH.

When I shared my results with my thesis supervisor, he said I've probably done something wrong since GARCH, GJRGARCH and ThresholdGARCH usually succeed wrt this goal. The problem is I don't understand what I could have done wrong.

The series in question are two SPDR ETFs (XLF and XLU). Closing prices can be found here (this is my first question, so I'm sorry if this isn't the way you're supposed to share data):



After obtaining log returns and demeaning them, I use the first 1766 observations to estimate all parameters and obtain standardized residuals. I then conduct ARCH tests on standardized residuals (at lags 1, 5, and 10), which for these two series reject homoscedasticity. Therefore I don't obtain iid residuals and can't go on with my analysis.
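For reference, the ARCH LM test itself is just an auxiliary regression of $z_t^2$ on its own lags, with statistic $n R^2 \sim \chi^2_{\text{lags}}$ under the null of no remaining ARCH. A self-contained numpy sketch (lag count and simulated series are assumptions; in practice you would use a packaged test such as the ones in the arch or statsmodels libraries):

```python
import numpy as np

def arch_lm(resid, nlags=5):
    """ARCH LM test: regress e_t^2 on its first nlags lags; LM = n * R^2.
    Under H0 (no remaining ARCH), LM ~ chi^2(nlags); the 5% critical value
    for 5 lags is about 11.07."""
    e2 = np.asarray(resid) ** 2
    y = e2[nlags:]
    # design matrix: intercept plus nlags lags of the squared residuals
    X = np.column_stack([np.ones_like(y)] +
                        [e2[nlags - k: len(e2) - k] for k in range(1, nlags + 1)])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    u = y - X @ beta
    r2 = 1.0 - (u @ u) / ((y - y.mean()) @ (y - y.mean()))
    return len(y) * r2

rng = np.random.default_rng(1)
z = rng.standard_normal(2000)
iid_lm = arch_lm(z)                     # iid noise: typically small

# an ARCH(1) series with sigma_t^2 = 0.1 + 0.5 * e_{t-1}^2: clear rejection
e = np.zeros(2000)
for t in range(1, 2000):
    e[t] = z[t] * np.sqrt(0.1 + 0.5 * e[t - 1] ** 2)
arch_series_lm = arch_lm(e)
```

If standardized residuals from a correctly specified GARCH model still produce an LM statistic far above the critical value, the conditional variance dynamics are not fully captured.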

Any help would be greatly appreciated.

PS: Is there any other test that is more adequate for testing iid residuals? I think I've read somewhere that, while most people still use ARCH tests on residuals, these tests were designed for raw data and should not be applied to residuals.

by Kondo at May 02, 2016 11:52 AM


Is there an implementation of higher kinded types in typed lambda calculus?

I can see that we can do higher kinded types ( * -> *) -> * in Scala and Haskell and other languages. I'm looking for a simpler vanilla implementation of just the basic static type checking system - perhaps in Prolog or MiniKanren.

My question is: Is there an implementation of higher kinded types in typed lambda calculus?
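Not a pointer to an existing implementation, but to show how small the kinding fragment is on its own: below is a hypothetical Python sketch of the System $F_\omega$ kinding judgement, with simple kinds $*$ and $k_1 \to k_2$ and type application. This is the check a language performs when it accepts a constructor of kind (* -> *) -> *. All representation choices here are assumptions:

```python
# Kinds: "*" or ("->", k1, k2).  Types: named constructors with declared
# kinds, or applications ("app", f, a).  Kinding rule: if f : k1 -> k2 and
# a : k1, then (f a) : k2 -- the application rule of System F-omega.

STAR = "*"

def arrow(k1, k2):
    return ("->", k1, k2)

def kind_of(ty, env):
    if isinstance(ty, str):                 # named constructor/variable
        return env[ty]
    tag, f, a = ty
    assert tag == "app"
    kf, ka = kind_of(f, env), kind_of(a, env)
    if not (isinstance(kf, tuple) and kf[0] == "->"):
        raise TypeError("applied a type of kind *")
    if kf[1] != ka:
        raise TypeError("kind mismatch in application")
    return kf[2]

env = {
    "Int":  STAR,
    "List": arrow(STAR, STAR),                  # * -> *
    "Fix":  arrow(arrow(STAR, STAR), STAR),     # (* -> *) -> *, higher-kinded
}
```

With this, `kind_of(("app", "Fix", "List"), env)` kinds the higher-kinded application Fix List, while applying Int to anything is rejected.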

by hawkeye at May 02, 2016 11:35 AM

Recursive Algorithm design for finding the permutation of a string

Recently I have been reading about induction and found that recursion and induction are two sides of the same coin. For example, induction gives recurrences like:

$$ R_{n} = R_{n-1} + n$$

i.e. the nth solution can be built up from the (n-1)th solution. Fine, so let's come to finding the permutations of a given sequence, which in my case is a string.

For a single letter a, its permutation will be a hence we can consider it as the base case.

Moving forward, for two letters ab its permutations would be ab and ba; hence, it seems that we need to insert the second character b at every possible position in the previous sequence. So I think I have my inductive step here, which is: $$R_{n-1} + \text{insert the current character at every possible position}$$

Now, since it's an inductive definition, we can convert it to a recursive algorithm (as far as I know).
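The inductive step above translates almost line-for-line into code; a Python sketch:

```python
def perms(s):
    # Base case: a single character has exactly one permutation.
    if len(s) <= 1:
        return [s]
    # Inductive step: take every permutation of s[:-1] (the R_{n-1}
    # solutions) and insert the last character at every possible position.
    last = s[-1]
    return [p[:i] + last + p[i:]
            for p in perms(s[:-1])
            for i in range(len(p) + 1)]
```

For "ab" this inserts 'b' into "a" at positions 0 and 1, giving "ba" and "ab", exactly as argued above.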


Is this the correct way of thinking about algorithms? How do I decide when an algorithm calls for recursive/inductive thinking?

PS: The author here gives a very good recursive solution, but it seems very difficult to reason about in the first place: video link.

by CodeYogi at May 02, 2016 11:31 AM


Does anyone know how to get a value from one Haskell program to another

I am a newbie to functional programming. How can I pass a value from one Haskell file to another? Suppose we have

  • in a file called Addition.hs:

    Addition = value+10
  • in a file called Value.hs:


With PHP, for example, if I include a file in another, I can access its classes. Is that possible in Haskell? It would make my program clearer.

by fahad khalid at May 02, 2016 11:04 AM


Particle locating/collision prediction in bounded (two-dimensional) environments

I believe that many physics engines, particle simulators, and even video games use discrete-event simulation to determine where a particle or object is at any moment, and the direction in which it is moving. Even to detect collisions, I have read that such systems check each instant to see whether or not the coordinates of two objects/particles overlap. Some are so regressive that they even check, in two dimensions, whether the sprites of two objects/particles have overlapped. My questions are:

  1. How true is this? As in, do modern physics engines and particle simulators perform repeated checks every $s$ seconds (probably very small, less than $10^{-3}$ I would guess?) to determine where a particle is $now$, what its velocity currently is, and whether or not two particles are colliding at the current moment? If not, are they able to determine wholly, without any uncertainty or approximation, exactly where a particle will be at some time $t=t_0$, or when two such specified particles will collide (if at all)?
  2. Continuing from the previous point; do these engines have any way of mathematically proving - for $n$-dimensional bounded environments - whether or not two particles will $ever$ collide?
  3. What are the names of such engines or simulators, and what are they primarily used for? What are the advantages and disadvantages of each?

If you would like clarification on any of the above terms, do let me know, and I will make my explanation more lucid. It would be best if answers contained references to other reliable sources, but regular responses would also be all right. Thank you very much for the help.

(In the above questions, it is assumed that

  • The required initial conditions (position and velocity of the particle at $t=0$) are given
  • Particles are points without mass or volume, where a collision is defined as an overlap in coordinates (all components of the position vector in the Cartesian plane are equal)
  • Rebounds against the wall are perfectly elastic, and the angle of incidence = angle of reflection.
  • Velocity (vector) need be neither constant nor linearly changing.)
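Under exactly these assumptions (point particles moving linearly between bounces), the collision time of a pair can be solved in closed form rather than checked every time step, which is the essence of event-driven simulation. A Python sketch for one linear segment of motion:

```python
import numpy as np

def collision_time(p1, v1, p2, v2, eps=1e-9):
    """Earliest t >= 0 at which two points moving linearly coincide, or
    None if they never do.  p1 + t*v1 == p2 + t*v2  <=>  dp + t*dv == 0."""
    dp = np.asarray(p1, float) - np.asarray(p2, float)
    dv = np.asarray(v1, float) - np.asarray(v2, float)
    if np.allclose(dv, 0):
        # identical velocities: they collide only if already coincident
        return 0.0 if np.allclose(dp, 0) else None
    t = -np.dot(dp, dv) / np.dot(dv, dv)   # time of closest approach
    if t < 0:
        return None                        # closest approach is in the past
    if np.linalg.norm(dp + t * dv) > eps:
        return None                        # closest approach, but no contact
    return t
```

An event-driven simulator computes such times for all candidate pairs (plus wall-bounce times), processes the earliest event, and recomputes only the affected pairs, instead of stepping the clock uniformly.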

by vincemathic at May 02, 2016 11:00 AM


Augmented Dickey-Fuller Questions

I've been searching the literature about this test applied to an AR(p) model. $$Q(L)(Y_{t})=c+\epsilon_{t}$$

Where $L$ represents the lag operator and $Q(x)=1-\phi_{1}x-\cdots-\phi_{p}x^{p}$ is the polynomial associated with the model.

I know that if $Q(r)=0$ implies $|r|>1$ (all roots outside the unit circle), then the process is stationary (at least in the weak sense).

My question is: why is the null hypothesis of the Augmented Dickey-Fuller test stated as "$r=1$ is a root of the polynomial"? Does rejecting that hypothesis imply that every single root of $Q$ lies outside the unit circle?

I'm new at this area so every recommendation or suggestion will be useful. Thanks.
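To make the null concrete: the (non-augmented) Dickey-Fuller regression estimates $\Delta y_t = c + \gamma y_{t-1} + \epsilon_t$ and tests $\gamma = 0$, which corresponds exactly to the root $r = 1$; the t-statistic must be compared to Dickey-Fuller critical values, not normal ones. A numpy sketch of the mechanics (simulated data; a real ADF implementation also adds lagged differences to soak up serial correlation):

```python
import numpy as np

def df_tstat(y):
    """t-statistic of gamma in  dy_t = c + gamma * y_{t-1} + e_t.
    gamma = 0 corresponds to the unit root r = 1."""
    y = np.asarray(y, float)
    dy = np.diff(y)
    X = np.column_stack([np.ones(len(dy)), y[:-1]])
    beta, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ beta
    s2 = resid @ resid / (len(dy) - 2)
    cov = s2 * np.linalg.inv(X.T @ X)
    return beta[1] / np.sqrt(cov[1, 1])

rng = np.random.default_rng(0)
eps = rng.standard_normal(1000)
random_walk = np.cumsum(eps)          # unit root: t-stat near zero
stationary = np.zeros(1000)           # AR(1) with phi = 0.5: gamma ~ -0.5
for t in range(1, 1000):
    stationary[t] = 0.5 * stationary[t - 1] + eps[t]
```

For the stationary series the t-statistic is far below the 5% Dickey-Fuller critical value (about -2.86 with a constant), so the unit-root null is rejected; for the random walk it is not.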

by Ivan Rey at May 02, 2016 10:51 AM


Hill Climbing Search - 8 queens

[figure: 8x8 board of heuristic values for the 8-queens hill-climbing example]

In that picture, I am having trouble counting why there are 17 pairs of queens. I can only count 14. I counted 10 diagonal pairs and 5 horizontal pairs. Can someone please elaborate on this?

I am not exactly sure how it was numbered, but I believe the numbers on the board represent the new number of pairs that are attacking each other if a queen in that column were to move to that square.
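Counting the pairs can be automated, which helps when checking the book's numbers; a Python sketch where board[c] is the row of the queen in column c (one queen per column, as in the usual hill-climbing formulation):

```python
from itertools import combinations

def attacking_pairs(board):
    """board[c] = row of the queen in column c.  Counts pairs sharing a
    row or a diagonal (one queen per column, so no column attacks)."""
    return sum(1
               for c1, c2 in combinations(range(len(board)), 2)
               if board[c1] == board[c2]                      # same row
               or abs(board[c1] - board[c2]) == c2 - c1)      # same diagonal
```

Note that "pairs" counts every attacking pair once, even when other queens sit between them; missed through-pieces pairs are the usual reason a hand count comes out low.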

by Christian at May 02, 2016 10:38 AM

Fred Wilson

The Business Blockchain

I’ve been reading The Business Blockchain this weekend. It was written by AVC community member William Mougayar.

This book started out as a Kickstarter project which I blogged about at the time. If you backed that project you will get a copy of this book. If not, you might want to get a copy on Amazon.

I am not done with it yet, but the book makes a complex subject, blockchain technology, accessible for the non-technical. It also lays out some of the more obvious use cases for the technology and explains how the blockchain technology market is evolving.

If you think you might want to start a business based on blockchain technology or if you think blockchain technology is going to reshape a market you are working in, or if you just want to understand this thing that your son or daughter is obsessed about, then this is a great book to read.

I am also quite proud that the conversations we have had on this blog on this topic over the past five years have shaped William’s work and certainly had something to do with his interest and his growing expertise and reputation in this area.

This blog community is a talented group and we have helped each other grow and develop. This book is just one of many examples of that.

by Fred Wilson at May 02, 2016 10:31 AM


Java equivalent to Python's itertools.groupby()?

Here is an example use case of itertools.groupby() in Python:

from itertools import groupby

# field indices (the middle field looks like a tenor)
FLD_COUNTRY, FLD_TENOR, FLD_CONSIDERATION = 0, 1, 2

Positions = [   ('AU', '1M', 1000),
                ('NZ', '1M', 1000),
                ('AU', '2M', 4000),
                ('AU', 'O/N', 4500),
                ('US', '1M', 2500),
            ]

Pos = sorted(Positions, key=lambda x: x[FLD_COUNTRY])
for country, pos in groupby(Pos, lambda x: x[FLD_COUNTRY]):
    print(country, sum(p[FLD_CONSIDERATION] for p in pos))

# -> AU 9500
# -> NZ 1000
# -> US 2500

Is there any language construct or library support in Java that behaves or can achieve what itertools.groupby() does above?

by Anthony Kong at May 02, 2016 10:22 AM


Is there any WordPress widget that I can add to my website for customized stocks? [on hold]

Is there any WordPress widget that I can add to my website for customized stocks? From what I have searched so far, all the stock ticker widgets use data provided by Google Finance, Yahoo Finance, etc.

However, I want to create customized tickers, since the market I am targeting has just started and the data are not available in Google and Yahoo Finance yet.

Any idea how I can do it?

by Nyan Tun Zaw at May 02, 2016 09:59 AM

Accuracy Rebonato Swaption Approximation Formula among Different Strikes

Can somebody explain me if the Rebonato swaption volatility approximation formula is accurate for only ATM strikes, and if yes why? Can it also be used for ITM and OTM strikes?

My findings:

Let $0 < T_0 < T_1 < \ldots < T_N$ be a tenor structure. Consider a payer swaption that gives the right to enter into a payer interest rate swap at $T_0$ with payments of both the floating and the fixed leg on $T_1,\ldots,T_N$. The fixed rate is set to $K$.

I have implemented the Rebonato swaption volatility approximation formula in Matlab as $$\upsilon^{REB}= \sqrt{\frac{\sum_{n=0}^{N-1}\sum_{k=0}^{N-1}w_n\left(0\right)w_k\left(0\right)L_n\left(0\right)L_k\left(0\right)\rho_{n,k}\int_0^{T_0}\sigma_n\left(t\right)\sigma_k\left(t\right)dt}{SwapRate\left(0\right)^2}}\\ =\sqrt{\frac{\sum_{n=0}^{N-1}\sum_{k=0}^{N-1}w_n\left(0\right)w_k\left(0\right)L_n\left(0\right)L_k\left(0\right)\rho_{n,k}\int_0^{T_0}\sigma_n\left(t\right)\sigma_k\left(t\right)dt}{\left(\sum_{n=0}^{N-1}w_n\left(0\right)L_n\left(0\right)\right)^2}},$$ where $L_n\left(0\right):=L\left(0;T_n,T_{n+1}\right)$ represents the initial Libor curve and $w_n\left(0\right)$ are the weights defined as $$w_n\left(t\right) = \frac{\tau_n P\left(t,T_{n+1}\right)}{\sum_{r=0}^{N-1} \tau_r P\left(t,T_{r+1}\right)},$$
with $\tau_n =T_{n+1}-T_n$.

The instantaneous volatilities $\sigma_n\left(t\right)$ are given by the following parametrization; $$\sigma_n\left(t\right) = \phi_n\left(a+b\left(T_n-t\right)\right)e^{-c\left(T_n-t\right)}+d.$$

To get the swaption price at time $0$, I have used this swaption approximation as an input in Black's formula; $$V_{swaption}\left(0\right) = Black\left(K,SwapRate\left(0\right),\upsilon^{REB}\right)\\ =Black\left(K,\sum_{n=0}^{N-1}w_n\left(0\right)L_n\left(0\right),\upsilon^{REB}\right)$$
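For concreteness, the formula above is a few lines of numpy once the integrated covariances $C_{n,k} = \int_0^{T_0} \sigma_n(t)\sigma_k(t)\,dt$ are precomputed (they have a closed form under the $a,b,c,d$ parametrization, omitted here). The toy inputs below are assumptions:

```python
import numpy as np

def rebonato_vol(w, L, rho, C, swap_rate):
    """v_REB^2 = sum_{n,k} w_n w_k L_n L_k rho_{nk} C_{nk} / SwapRate(0)^2,
    where C[n, k] = int_0^{T0} sigma_n(t) sigma_k(t) dt (precomputed)."""
    wL = np.asarray(w) * np.asarray(L)
    return float(np.sqrt(wL @ (np.asarray(rho) * np.asarray(C)) @ wL
                         / swap_rate ** 2))

# degenerate one-Libor sanity check: the formula collapses to sqrt(C[0,0])
v = rebonato_vol([1.0], [0.05], [[1.0]], [[0.04]], swap_rate=0.05)
```

Such sanity checks on degenerate cases are useful before comparing against Monte Carlo, since they isolate implementation bugs from genuine approximation error.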

In order to assess the accuracy of the Rebonato approximation formula, I have compared the prices of various swaptions obtained by plugging the approximation volatility into Black's formula (as above) with the prices obtained by a Monte Carlo evaluation with 1,000,000 simulations.

I was particularly interested in the accuracy among different strikes $K$. To illustrate this, consider the 4Y10Y swaption and its corresponding ATM , ATM+1%, ATM+2% and ATM+3% strikes (ATM strike is $K=SwapRate\left(0\right)$).

My findings were that as you move further away from the ATM strike, the approximation gets worse (the difference between the Monte Carlo price and the price using the Rebonato approximation volatility increases). In concrete numbers, the difference for the ATM strike is 9 bp and for ATM+3% it is 36 bp.

I have searched the literature for an explanation but cannot find any. As far as I understand, no assumptions involving the strike are made in deriving the Rebonato formula.

Brigo and Mercurio also perform an accuracy test of the Rebonato formula in their book 'Interest Rate Models - Theory and Practice', namely:

"The results are based on a comparision of Rebonato's formula with the volatilities that plugged into Black's formula lead to the Monte Carlo prices of the corresponding at-the-money swaptions. "

Furthermore, Jäckel and Rebonato analyze in their paper 'Linking Caplet and Swaption Volatilities in a BGM/J Framework: Approximate Solutions' how well the approximation performs by comparing the ATM swaption prices obtained by the Rebonato volatility and the Monte Carlo ATM prices.

Is it a coincidence that I can only find results for ATM swaptions, or does Rebonato's swaption volatility approximation formula really not perform well for ITM and OTM swaptions?

Any help is appreciated. Thanks in advance.

by Tinkerbell at May 02, 2016 09:55 AM


Fine Tuning Results

I am using the pre-trained GoogLeNet model downloaded from this link.

I did the fine-tuning but the results are not promising. I am only getting an accuracy of 50% after 5000 iterations. My training dataset has 3k images, validation has 1k images, and evaluation has 1k images. I set base_lr to 0.001 and max_iter to 10000. Is this accuracy normal, or did I do something wrong? I can also share the changes I made in the files, if needed.

Any suggestions would be of great help.

by Ashutosh Singla at May 02, 2016 09:54 AM


Connected Graph with Permutation Characterization

Consider a directed graph $G$ with $N$ vertices and adjacency matrix $A$ such that for any sequence of vertices $s \in \{1,\dots,N\}^k$ with $k>0$, there is a path $p$ in $G$ with $p = P(s)$, where $P$ is a permutation operation.

Is there a name for such a graph $G$ and how can it be characterized via the adjacency matrix $A$?

by Sebastian Schlecht at May 02, 2016 09:34 AM


Recommendations for a good (rigorous) text to study Computational Complexity.

I am looking for a good text to learn the basics of computational complexity.

I've read parts of the first two chapters of "Computational Complexity: A Modern Approach" by Sanjeev Arora and Boaz Barak; it's readable and gives the big picture, but I find it not rigorous enough for me. I'm a mathematics student, so I want a more rigorous treatment of the subject. The text I am looking for should:

1- Define Turing machines formally and prove the basic results in a formal way, not using hand-waving arguments. For example, that book's treatment of Turing machines is not rigorous, and I'd love to see a more precise treatment.

2- Give proofs that are as rigorous as those I read in mathematics texts, not just "overviews".

What are your recommendations? Which texts satisfy these needs as closely as possible?

by Maths Lover at May 02, 2016 09:34 AM


Using libFM MCMC -- test file formatting? (NaN evaluation metric error)

I'm trying to create a very basic recommender system (using some kind of factorization machine). In both my training and testing files, I have users, items, and ratings. However, the ratings column in the testing file is blank.

I'm using this tutorial with the libFM library:

But of course I've subbed the input data values for my own data values.

My concern is that they seem to expect a y_values matrix (of ratings) in the test file. But isn't it expected that the test file does not contain any ratings, since these are what is to be predicted? I'm brand new to machine learning and this is my first project, so I'm unsure how to proceed. Right now, when I run my code I get a NaN error for the evaluation metric, which I'm sure is because the error is very high since my test ratings matrix is all zeros. What should I do?

import numpy as np
from sklearn.feature_extraction import DictVectorizer
from pyfm import pylibfm

def loadDataTrain():
    data = []  # feature dicts
    y = []     # ratings
    users = set()
    songs = set()

    f = np.genfromtxt('trainy.csv', delimiter=",")
    fa = f.tolist()
    for i in range(1, 701):
        line = fa[i]
        (song, user, rating) = line[0], line[1], line[2]
        data.append({"user": str(user), "song": str(song)})
        y.append(rating)   # without this, y stays empty and training fails
        users.add(user)
        songs.add(song)
    return (data, np.array(y), users, songs)

def loadDataTest():
    data = []  # feature dicts
    y = []     # ratings
    users = set()
    songs = set()

    f = np.genfromtxt('testy.csv', delimiter=",")
    fa = f.tolist()
    for i in range(1, 701):
        line = fa[i]
        (song, user, rating) = line[0], line[1], line[2]
        data.append({"user": str(user), "song": str(song)})
        y.append(rating)   # blank in the test file, hence the NaN metric
        users.add(user)
        songs.add(song)
    return (data, np.array(y), users, songs)

(train_data, y_train, train_users, train_items) = loadDataTrain()
(test_data, y_test, test_users, test_items) = loadDataTest()
v = DictVectorizer()
X_train = v.fit_transform(train_data)
X_test = v.transform(test_data)

# Build and train a Factorization Machine
fm = pylibfm.FM(num_factors=200, num_iter=10, verbose=True,
                task="regression", initial_learning_rate=0.001)
fm.fit(X_train, y_train)   # the constructor alone does not train the model

preds = fm.predict(X_test)
from sklearn.metrics import mean_squared_error
print("RMSE: ", mean_squared_error(y_test, preds) ** 0.5)

by 1.mkn at May 02, 2016 09:22 AM

Approaches used for sentiment analysis in R [on hold]

I am doing sentiment analysis using R. I have used kaify's R code for my project. I just want to know which approach is used there: there are different methods of doing sentiment analysis, e.g. the machine learning approach, the lexicon approach, and the hybrid approach. Which one is used here? Thanks

by Kavya at May 02, 2016 09:13 AM


Looking for a use case of a $k$-$d$ tree with a norm other than $L^2$

In Python's implementation of $k$-$d$ tree it is possible to manually change the norm used for computing distances from $L^2$ to $L^p$.

When would one use a norm other than $L^2$ in a $k$-$d$ tree?
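One concrete case is $L^1$ (Manhattan distance), which is natural for grid-like data and more robust to outlying coordinates; in SciPy this is the p argument of cKDTree.query. The norms can genuinely disagree about which neighbour is nearest, as this small numpy example shows:

```python
import numpy as np

query = np.array([0.0, 0.0])
points = np.array([[0.0, 3.0],    # L1 distance 3.0, L2 distance 3.0
                   [2.0, 2.0]])   # L1 distance 4.0, L2 distance ~2.83

l1 = np.abs(points - query).sum(axis=1)
l2 = np.sqrt(((points - query) ** 2).sum(axis=1))

nearest_l1 = int(np.argmin(l1))   # under L1, the axis-aligned point wins
nearest_l2 = int(np.argmin(l2))   # under L2, the diagonal point wins
```

Other common choices are $L^\infty$ (Chebyshev) for chessboard-style moves and fractional $p$ for high-dimensional data, where smaller $p$ can make distances more discriminative.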

by Mikhail at May 02, 2016 09:09 AM


discritization with mdlp function

I used the function mdlp in the {discretization} package to discretize my data. I see that this method is very good, but when I use big data such as the waveform dataset from UCI, I see that many variables end up with only one level. Is this normal? What is the problem? Many thanks in advance.


'data.frame':   5000 obs. of  41 variables:
 $ V1 : Factor w/ 1 level "1": 1 1 1 1 1 1 1 1 1 1 ...
 $ V2 : Factor w/ 3 levels "1","2","3": 1 2 3 3 3 3 2 2 3 3 ...
 $ V3 : Factor w/ 5 levels "1","2","3","4",..: 4 2 3 2 2 2 5 2 5 1 ...
 $ V4 : Factor w/ 5 levels "1","2","3","4",..: 3 2 4 4 3 3 5 2 5 1 ...
 $ V5 : Factor w/ 6 levels "1","2","3","4",..: 2 4 5 3 6 4 6 1 6 1 ...
 $ V6 : Factor w/ 6 levels "1","2","3","4",..: 1 3 5 1 6 4 6 1 6 2 ...
 $ V7 : Factor w/ 7 levels "1","2","3","4",..: 5 3 5 1 7 4 7 3 7 1 ...
 $ V8 : Factor w/ 8 levels "1","2","3","4",..: 2 6 7 1 4 6 7 6 8 3 ...
 $ V9 : Factor w/ 5 levels "1","2","3","4",..: 1 2 5 1 5 4 3 3 4 2 ...
 $ V10: Factor w/ 6 levels "1","2","3","4",..: 1 2 4 1 5 4 5 6 5 4 ...
 $ V11: Factor w/ 4 levels "1","2","3","4": 1 1 1 1 3 3 2 4 2 2 ...
 $ V12: Factor w/ 7 levels "1","2","3","4",..: 3 5 2 6 2 6 2 7 1 7 ...
 $ V13: Factor w/ 7 levels "1","2","3","4",..: 6 5 3 5 3 5 1 7 2 6 ...
 $ V14: Factor w/ 6 levels "1","2","3","4",..: 6 6 1 6 2 2 1 3 1 4 ...
 $ V15: Factor w/ 7 levels "1","2","3","4",..: 7 6 2 7 3 6 2 3 1 6 ...
 $ V16: Factor w/ 7 levels "1","2","3","4",..: 6 6 1 7 2 2 5 5 4 7 ...
 $ V17: Factor w/ 6 levels "1","2","3","4",..: 5 5 1 6 5 2 1 4 2 6 ...
 $ V18: Factor w/ 6 levels "1","2","3","4",..: 6 5 3 3 3 3 2 3 3 5 ...
 $ V19: Factor w/ 4 levels "1","2","3","4": 2 4 2 4 1 1 2 3 1 4 ...
 $ V20: Factor w/ 4 levels "1","2","3","4": 4 2 2 3 4 2 4 2 2 4 ...
 $ V21: Factor w/ 1 level "1": 1 1 1 1 1 1 1 1 1 1 ...
 $ V22: Factor w/ 1 level "1": 1 1 1 1 1 1 1 1 1 1 ...
 $ V23: Factor w/ 1 level "1": 1 1 1 1 1 1 1 1 1 1 ...
 $ V24: Factor w/ 1 level "1": 1 1 1 1 1 1 1 1 1 1 ...
 $ V25: Factor w/ 1 level "1": 1 1 1 1 1 1 1 1 1 1 ...
 $ V26: Factor w/ 1 level "1": 1 1 1 1 1 1 1 1 1 1 ...
 $ V27: Factor w/ 1 level "1": 1 1 1 1 1 1 1 1 1 1 ...
 $ V28: Factor w/ 1 level "1": 1 1 1 1 1 1 1 1 1 1 ...
 $ V29: Factor w/ 1 level "1": 1 1 1 1 1 1 1 1 1 1 ...
 $ V30: Factor w/ 1 level "1": 1 1 1 1 1 1 1 1 1 1 ...
 $ V31: Factor w/ 1 level "1": 1 1 1 1 1 1 1 1 1 1 ...
 $ V32: Factor w/ 1 level "1": 1 1 1 1 1 1 1 1 1 1 ...
 $ V33: Factor w/ 1 level "1": 1 1 1 1 1 1 1 1 1 1 ...
 $ V34: Factor w/ 1 level "1": 1 1 1 1 1 1 1 1 1 1 ...
 $ V35: Factor w/ 1 level "1": 1 1 1 1 1 1 1 1 1 1 ...
 $ V36: Factor w/ 1 level "1": 1 1 1 1 1 1 1 1 1 1 ...
 $ V37: Factor w/ 1 level "1": 1 1 1 1 1 1 1 1 1 1 ...
 $ V38: Factor w/ 1 level "1": 1 1 1 1 1 1 1 1 1 1 ...
 $ V39: Factor w/ 1 level "1": 1 1 1 1 1 1 1 1 1 1 ...
 $ V40: Factor w/ 1 level "1": 1 1 1 1 1 1 1 1 1 1 ...
 $ class : Factor w/ 3 levels "0","1","2": 3 1 2 1 2 2 1 3 1 3 ...

by Kamel at May 02, 2016 09:01 AM


Why would we want a self-hosting compiler?

I understand that a self-hosting compiler is a compiler written in the same language that it compiles, but I don't understand why we would want to do that. What are the benefits (and drawbacks) of a compiler which is self-hosting?

by Haych at May 02, 2016 08:35 AM


Incorporating user feedback in production deployed DeepLearning ML model

I have developed an ML model using deep learning (LSTM in Keras on top of Theano) and deployed it in a production environment. The model performs binary classification (0/1) for an NLP task. The model's prediction is displayed to users, and the users have the option to give feedback (whether the prediction was right or wrong).

How can I continuously incorporate this feedback into my model? From a UX standpoint you don't want a user to correct/teach the system more than two or three times for a specific input; the system should learn fast, i.e. the feedback should be incorporated quickly. (Google's Priority Inbox does this in a seamless way.)

How does one incorporate the feedback? I have searched a lot on the net but could not find relevant material. Any pointers would be of great help.

Please don't say "retrain the model from scratch by including the new data points". That's surely not how Google and Facebook build their smart systems.

To further explain my question: think of Google's spam detector, their Priority Inbox, or their recent "smart replies" feature. It is well known that these are built using deep nets and have the ability to learn from and incorporate user feedback quickly.

All the while, when it incorporates the user feedback quickly (i.e. the user has to teach the system the correct output at most 2-3 times per data point before the system starts to give the correct output for that data point), it also ensures it retains old learnings and does not start to give wrong outputs on older data points (where it was giving the right output earlier) while incorporating the learning from the new data point.

I have not found any blog/literature/discussion on how to build such systems: an intelligent system that uses deep learning and can incorporate user feedback quickly.
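For what it's worth, one standard pattern (an assumption on my part, not necessarily what Google does) is: keep the deployed weights, and on each feedback event take a few low-learning-rate gradient steps on the new example mixed with a small replay buffer of old examples, so the new point is learned quickly without catastrophic forgetting. A logistic-regression-sized numpy sketch of the idea:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def feedback_update(w, x_fb, y_fb, replay_X, replay_y, lr=0.5, steps=20):
    """A few SGD steps on the feedback point mixed with replayed old
    examples.  The replay buffer is what guards against forgetting."""
    for _ in range(steps):
        X = np.vstack([x_fb[None, :], replay_X])
        y = np.append(y_fb, replay_y)
        grad = X.T @ (sigmoid(X @ w) - y) / len(y)   # logistic loss gradient
        w = w - lr * grad
    return w

# toy setup: the model starts out undecided about the feedback point
w = np.zeros(2)
x_fb, y_fb = np.array([1.0, 1.0]), 1.0
replay_X = np.array([[1.0, 0.0], [0.0, 1.0]])
replay_y = np.array([1.0, 1.0])

before = sigmoid(x_fb @ w)   # 0.5: undecided
w = feedback_update(w, x_fb, y_fb, replay_X, replay_y)
after = sigmoid(x_fb @ w)    # pushed toward the user's label
```

The same recipe applies to a deep net by fine-tuning only the last layer(s) with a small learning rate; the replay buffer (or an equivalent regularizer toward the old weights) is the part that keeps old data points from being unlearned.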

Hope my question is a little clearer now.

Update: Some related questions I found are:

by Anuj Gupta at May 02, 2016 08:28 AM


How to determine portion of portfolio's risks from components?

Say I have a portfolio of 3 stocks $A,B,C$ with $\mu_A = 5\%$, $\mu_B = 10\%$, $\mu_C = 15\%$ and volatilities $\sigma_A = 10\%$, $\sigma_B = 15\%$, and $\sigma_C = 25\%$. Let us also say that the correlations are $\rho_{AC} = 0.7$, $\rho_{AB} = 0.3$, and $\rho_{BC} = -0.1$. Say the total portfolio value is 1 and it is composed of $A,B,C$ equally by value. How would I calculate the corresponding risk exposure that I have to each of the three underlying securities?

Portfolio $\mu_{total} = \frac{1}{3} \times \mu_A + \frac{1}{3} \times \mu_B + \frac{1}{3} \times \mu_C$.

Portfolio $\sigma_{total} = \sqrt{\frac{1}{9}(\sigma_A^2+\sigma_B^2+\sigma_C^2 + 2\rho_{AC}\sigma_A\sigma_C+2\rho_{AB}\sigma_A\sigma_B+2\rho_{BC}\sigma_B\sigma_C)}$

How would you divide up $\sigma_{total}$ or is it not possible?
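It is possible: because $\sigma(w)$ is homogeneous of degree 1 in the weights, Euler's theorem gives $\sigma_{total} = \sum_i w_i\, \partial\sigma/\partial w_i$, and the $i$-th term $w_i (\Sigma w)_i / \sigma_{total}$ is the risk contribution of asset $i$. With the numbers from the question:

```python
import numpy as np

# inputs from the question
sig = np.array([0.10, 0.15, 0.25])
rho = np.array([[1.0, 0.3, 0.7],
                [0.3, 1.0, -0.1],
                [0.7, -0.1, 1.0]])
w = np.array([1 / 3, 1 / 3, 1 / 3])

cov = np.outer(sig, sig) * rho          # covariance matrix
port_vol = np.sqrt(w @ cov @ w)         # sigma_total

# Euler decomposition: contribution of asset i = w_i * (cov @ w)_i / sigma
rc = w * (cov @ w) / port_vol           # these sum exactly to port_vol
```

Here C contributes the most risk despite the equal weights, because of its high volatility and its 0.7 correlation with A; this additivity is exactly what risk-parity and risk-budgeting allocations exploit.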

by bob at May 02, 2016 08:05 AM


How can I use stanford parser in my java program for effective sentiment analysis

I have to develop a project in Java that uses the Stanford Parser. What is an effective way to use the Stanford Parser for sentiment analysis? How could it be integrated with some machine learning algorithm to improve the accuracy of the predicted sentiment? I have to get the sentiment on a scale of 0 to 5 (very negative, negative, ..., very positive).

by Vishal Kharde at May 02, 2016 07:57 AM



How to apply Elliott wave priciple to any Time Series?

I'm strongly interested in computing Elliott waves for any given time series. Has anybody tried?

Is there any Python library to do that?

I'm looking for an algorithm such that if I give it a time series, it labels each movement according to the Elliott wave principle.
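I am not aware of a standard library for this, and Elliott wave labeling is notoriously subjective; most implementations start from a zig-zag (swing) filter that keeps only reversals larger than a threshold and then apply the wave rules to the resulting pivot sequence. A minimal Python swing filter (the 5% threshold is an assumption):

```python
def zigzag(prices, threshold=0.05):
    """Indices of swing pivots: a pivot is confirmed once price reverses by
    more than `threshold` (fractional) from the running extreme."""
    pivots = [0]
    ext_val, ext_idx = prices[0], 0
    direction = 1 if prices[1] >= prices[0] else -1
    for i in range(1, len(prices)):
        p = prices[i]
        if direction == 1:
            if p >= ext_val:
                ext_val, ext_idx = p, i            # extend the up-swing
            elif p <= ext_val * (1 - threshold):   # reversal confirmed
                pivots.append(ext_idx)
                direction, ext_val, ext_idx = -1, p, i
        else:
            if p <= ext_val:
                ext_val, ext_idx = p, i            # extend the down-swing
            elif p >= ext_val * (1 + threshold):
                pivots.append(ext_idx)
                direction, ext_val, ext_idx = 1, p, i
    pivots.append(ext_idx)                         # close the last swing
    return pivots
```

The pivot sequence is then the input to whatever wave-counting rules you choose (impulse vs. corrective, retracement ratios, etc.), which is where the subjective part of the Elliott principle lives.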

by sparkle at May 02, 2016 07:24 AM


How to calculate the number of parameters of Convolutional Neural Networks(CNNs) correctly?

I can't reproduce the correct number of parameters of AlexNet or VGG Net.

For example, to calculate the number of parameters of a conv3-256 layer of VGG Net, the answer is 0.59M = (3*3)*(256*256), that is (kernel size) * (product of the channel counts of the two joined layers); however, that way I can't get the 138M total parameters.

So could you please show me where my calculation is wrong, or show me the right calculation procedure?
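The missing 138M comes almost entirely from the fully connected layers, which a conv-only tally misses (each layer also has one bias per output unit). A short Python check that reproduces VGG-16's well-known total of 138,357,544 parameters:

```python
# VGG-16 (configuration D), 224x224 input, 1000 classes.
# conv params = 3*3*in*out + out (biases); fc params = in*out + out.
convs = [(3, 64), (64, 64),
         (64, 128), (128, 128),
         (128, 256), (256, 256), (256, 256),
         (256, 512), (512, 512), (512, 512),
         (512, 512), (512, 512), (512, 512)]
fcs = [(512 * 7 * 7, 4096), (4096, 4096), (4096, 1000)]  # 7x7x512 after 5 poolings

conv_params = sum(3 * 3 * cin * cout + cout for cin, cout in convs)
fc_params = sum(cin * cout + cout for cin, cout in fcs)
total = conv_params + fc_params        # 138,357,544

# the single conv3-256 layer from the question, weights only:
conv3_256 = 3 * 3 * 256 * 256          # 589,824, i.e. ~0.59M
```

Note that the first FC layer alone (25088 x 4096) holds over 100M parameters, which is why summing only conv layers falls so far short of 138M.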

by Eric at May 02, 2016 07:17 AM


Find cell neighbors of a given edge in a 2D grid

In the figure below, cells are labeled row-wise, and edges are labeled counter-clockwise, as follows:

  • vertices 1' and 2' form edge #1
  • vertices 2' and 5' form edge #2
  • vertices 4' and 5' form edge #3
  • vertices 1' and 4' form edge #4
  • vertices 2' and 3' form edge #5
  • vertices 3' and 6' form edge #6
  • vertices 6' and 5' form edge #7
  • vertices 5' and 8' form edge #8
  • vertices 7' and 8' form edge #9
  • vertices 4' and 7' form edge #10
  • vertices 6' and 9' form edge #11
  • vertices 8' and 9' form edge #12

Question: given an edge number (and/or the vertices that form it), is there a way to algorithmically compute its corresponding cell neighbors?

[figure: grid of cells with numbered vertices and edges]

by user107904 at May 02, 2016 07:13 AM



How to increase the dataset?

I am doing a project on face recognition. I have a dataset containing images of 21 actors (150 each). Now I want to increase the number of images of each actor to 300+ for training purposes. How can I do this using MATLAB? One solution is to vary the contrast/brightness level of each image, but what other transformations could I use to increase the number of images?
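Besides brightness/contrast, standard augmentations include horizontal flips (faces are roughly symmetric), small crops/rotations, and additive noise, each of which turns one original into several variants. The question asks for MATLAB, but the operations translate directly; a numpy sketch of the idea (the jitter ranges are assumptions):

```python
import numpy as np

def augment(img, rng):
    """Return a list of simple variants of one image (H x W, float in [0,1])."""
    out = [np.fliplr(img)]                                   # horizontal flip
    out.append(np.clip(img * rng.uniform(0.8, 1.2), 0, 1))   # contrast jitter
    out.append(np.clip(img + rng.uniform(-0.1, 0.1), 0, 1))  # brightness shift
    out.append(np.clip(img + rng.normal(0, 0.02, img.shape), 0, 1))  # noise
    h, w = img.shape
    out.append(img[2:h - 2, 2:w - 2])                        # small crop
    return out

rng = np.random.default_rng(0)
img = rng.random((64, 64))          # stand-in for one face image
variants = augment(img, rng)        # 150 originals x 5 variants -> 750 extras
```

In MATLAB the same operations are fliplr, imadjust/imnoise, and imcrop/imrotate.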

by jackson at May 02, 2016 06:37 AM


RWA Calculations Formulae

I am working as an IT developer for an investment bank; I joined recently and this is a completely new domain to me. While I am still learning about the domain, I was looking for a short explanation of how RWA is calculated, and the formulae for it. I know what RWA is; I am just looking for some info on how it is calculated. TIA
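At its simplest, under the Basel standardised approach for credit risk, each exposure is multiplied by a regulatory risk weight, $RWA = \sum_i EAD_i \times RW_i$, and the minimum capital requirement is 8% of RWA (the IRB approach instead derives risk weights from PD/LGD/maturity formulas). A toy illustration; the exposures and weights below are invented for the example:

```python
# standardised-approach toy example: EAD (exposure at default, in millions)
# and regulatory risk weights are assumptions for illustration only
exposures = {
    "sovereign_AAA": (100.0, 0.00),
    "bank_A":        (50.0,  0.20),
    "corporate_BBB": (80.0,  1.00),
    "retail":        (40.0,  0.75),
}

rwa = sum(ead * rw for ead, rw in exposures.values())
capital_requirement = 0.08 * rwa        # Basel minimum: 8% of RWA
```

Market and operational risk have their own RWA components (computed differently and then summed with credit RWA), so in a real bank the figure above is only one of several terms.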

by Viru at May 02, 2016 06:19 AM


The range of significance in Type Theory

What exactly does "Types as ranges of significance of propositional functions. In modern terminology, types are domains of predicates" mean?

Update: I found a passage in this paper (page 14, or 234) by Russell where he defines what ranges of significance are, though for propositions rather than types exactly.

by jonaprieto at May 02, 2016 05:57 AM

A query on complexity class $PP$

We know that $PH$ is in $P^{PP}$. Are $PP$ and $PH$ directly comparable? That is, we know that $NP$ is in $PP$ and that $BPP$ is in $\Sigma_2$.

Do we know whether $PH$, which contains all of these, is in $PP$?

What consequences would there be if this happened?

by Turbo at May 02, 2016 05:11 AM


using jruby to manipulate algorithms in weka

How do I begin using JRuby with Weka? I am new to programming. My aim is to manipulate machine learning algorithms in Weka using JRuby to achieve active learning.

For instance, active learning with SVMs: how, and where do I begin?

Thanks a lot for your time and effort

by user2372728 at May 02, 2016 04:58 AM

Computational Learning Theory [on hold]

Consider the class $C$ of concepts of the form $(a \le x_1 \le b) \wedge (c \le x_2 \le d)$. Note that each concept in this class corresponds to a rectangle in 2 dimensions. Let $a, b$ be integers in the range $[0, 199]$ and $c, d$ be integers in the range $[0, 99]$. Give an upper bound on the number of training examples sufficient to assure that for any target concept $c \in C$, any consistent learner using $H = C$ will, with probability 0.99, output a hypothesis with error at most 0.05.

I am trying to solve the above problem but couldn't make any headway. Please help!
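A standard route is the finite-hypothesis consistent-learner bound $m \ge \frac{1}{\epsilon}\left(\ln|H| + \ln\frac{1}{\delta}\right)$ with $\epsilon = 0.05$ and $\delta = 0.01$; counting hypotheses as choices of $a \le b$ and $c \le d$ is an assumption (some texts count ordered pairs, which only changes $\ln|H|$ slightly). In code:

```python
import math

# |H|: choices of (a, b) with 0 <= a <= b <= 199, times (c, d) with
# 0 <= c <= d <= 99 (the counting convention is an assumption)
H = (200 * 201 // 2) * (100 * 101 // 2)

eps, delta = 0.05, 0.01
# sufficient sample size for a consistent learner over a finite H
m = math.ceil((math.log(H) + math.log(1 / delta)) / eps)
```

The key observation is that the bound grows only with $\ln|H|$, so even the roughly $10^8$ rectangles here translate into a sample size in the hundreds.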

by John Pal at May 02, 2016 04:42 AM

How does Azure's Matchbox Recommender differ from other SVD and PMF recommenders?

I'm trying to understand how Azure's Matchbox Recommender differs from other PMF recommenders and if there are specific benefits to Microsoft's model. Also, are there any existing open source implementations of this algorithm?

by not.K at May 02, 2016 04:03 AM



Why is Battleship not in P complexity class?

I would assume that the battleship puzzle is NP-complete and not in P because the time to calculate every possible positioning of the ships is above polynomial. However, I don't see how to calculate the actual run-time complexity of finding all possible positions. Given an $A \times B$ grid, there would be the sum of $\binom{10}{i}$ from $i = 1$ to $10$ possible ways to place pegs in each row. Extrapolating this to the entire grid, there would be $A \times$ (the previous sum) possible peg placements. How do I solve the problem of counting the number of possible placements for battleship?

I am trying to show that the problem of finding the positioning of ships on a grid is NOT in P because algorithms enumerating every possible placement would have greater-than-polynomial running time. The problem, as seen in many newspapers and puzzle books, has been proved to be NP-complete, so I do not think there is an issue with stating that it is NP-complete.

by Battleship at May 02, 2016 02:58 AM



Quantum complexity of maximum inner product search

Given two matrices $X \in \mathbb{R}^{m \times k}$, $Y \in \mathbb{R}^{n \times k}$, maximum inner product search (MIPS) asks for the largest $l$ entries of $X Y^T$. Typically $k \ll m, n$ (many small dot products). In the approximate version, we ask for entries within a factor $\alpha < 1$ of the true largest.

Unfortunately, Ahle et al. show that an approximate solution to MIPS in subquadratic time $O(k^{O(1)} (mn)^{1-\epsilon})$ for $\epsilon > 0$ would contradict the strong exponential time hypothesis. Terminology note: Ahle et al. calls it inner product similarity join (IPS) instead of MIPS.

Ahle et al. consider only classical complexity. However, it is easy to beat the quadratic limit using a quantum computer: applying Grover search to the $m \times n$ search space gives $\tilde{O}(k \sqrt{m n})$ for $l = 1$.

Question: Is $\tilde{O}(k \sqrt{m n})$ the right quantum complexity for $l = 1$? What about for larger $l \ll m, n$?

by Geoffrey Irving at May 02, 2016 02:41 AM


Need advice about distributed backtesting architecture

We are working on a fairly complex distributed trading system where several components run on different physical machines.

Unfortunately, I'm stuck on the backtesting part. Originally we were planning to use the tick as a synchronization marker. The idea worked until we added more complicated interaction logic, when components started to interact with each other in a feedback loop.

I'm sure that this problem has been solved many times before and I don't want to reinvent the wheel. Can anybody share at least basic information about this topic?

by rimas at May 02, 2016 02:36 AM


What mathematical background is enough to go through the entire Knuth's TAOCP?

I am an undergrad. I can do mathematics fine and am very interested in exploring more.
I turned to TAOCP because it is highly mathematically oriented, hoping to excel at maths and programming at the same time.
But I am stuck. I have only studied calculus, numerical analysis and probability (at undergrad level), and the first few pages of the first volume are already too difficult to get through.
Also, the problems are ranked according to their difficulty level. What can I do to be able to solve those problems? I really need to get better at mathematics.

What mathematics books should one read to be able to understand TAOCP? I am ready to go through a series of books just to get better at it.

Also, any suggestions on how to learn/practice MIX (the TAOCP assembly language)?

by piepi at May 02, 2016 02:36 AM


Adding negative EV position to portfolio for diversification?

Say I have a portfolio with expected return $10\%$ and volatility $20\%$. Suppose I have another asset that is either:

  1. Negatively correlated
  2. Positively correlated
  3. Uncorrelated

with negative expected return $\mu < 0$ and volatility $\sigma$. Intuitively, I think that if we are allowed to use leverage, we should add this asset to the portfolio under scenarios 1 and 3 to reduce risk (and apply leverage to achieve the desired rate of return). Is this true? How would I size this position if I want to target $10\%$?

Is this scenario similar to the case of shorting one asset and buying another that is positively correlated with it? In both instances (long/short positively correlated, or long/long negatively or zero correlated), the position should be risk reducing. And if we're allowed to use leverage, we should be able to achieve the target return at lower risk? Though this also depends on the bounds of the expected return and correlation?
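As a sanity check of the intuition, here is a minimal two-asset sketch of the standard portfolio-variance formula (the $-2\%$/$15\%$ numbers for the second asset are made up, and financing costs for leverage are ignored):

```python
import math

def combined_vol(w1, mu1, s1, w2, mu2, s2, rho):
    """Expected return and volatility of a two-asset portfolio with
    weights w1, w2, returns mu1, mu2, vols s1, s2, correlation rho."""
    mu = w1 * mu1 + w2 * mu2
    var = (w1 * s1) ** 2 + (w2 * s2) ** 2 + 2 * w1 * w2 * rho * s1 * s2
    return mu, math.sqrt(var)

# Portfolio from the question (10% return, 20% vol) plus a 10% allocation
# to a hypothetical asset with -2% return and 15% vol.
for rho in (-0.5, 0.0, 0.5):
    mu, vol = combined_vol(0.9, 0.10, 0.20, 0.1, -0.02, 0.15, rho)
    print(f"rho={rho:+.1f}: return={mu:.3f}, vol={vol:.3f}")
```

The expected return drops identically in all three cases, so whether the diluted-and-releveraged portfolio beats the original comes down to how much the volatility falls, i.e. to $\rho$ and $\sigma$.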

by bob at May 02, 2016 02:12 AM


Voronoi Diagram of Lines

Let $S$ be a set of lines (or line segments) in $\mathbb{R}^3$. Consider the Voronoi diagram $VD(S)$. The best lower bound on the complexity of $VD(S)$ is $\Omega(n^2)$ and the best upper bound is $O(n^{3+\epsilon})$ [1]. What is the current state of research on this problem? Is anything more recent than the references in [2] known?

[1] Almost tight upper bounds for lower envelopes in higher dimensions.


by user36641 at May 02, 2016 02:08 AM

Looking for a use case of a $k$-$d$ tree with a norm other than $L^2$ [migrated]

In Python's implementation of the $k$-$d$ tree, it is possible to manually change the norm used for computing distances from $L^2$ to $L^p$.

When would one use a norm other than $L^2$ in a $k$-$d$ tree?
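A tiny brute-force illustration (not the tree itself, but the metric behaves the same way inside one): the choice of norm can change which point is "nearest", which is why $L^1$ is often preferred for grid-like or taxicab-distance data:

```python
def nearest(points, q, p):
    """Index of the nearest neighbour of q under the L^p (Minkowski) norm,
    by brute force; a k-d tree queried with the same metric agrees."""
    dist = lambda a: sum(abs(ai - qi) ** p for ai, qi in zip(a, q)) ** (1.0 / p)
    return min(range(len(points)), key=lambda i: dist(points[i]))

pts = [(0.8, 0.8), (1.3, 0.0)]
q = (0.0, 0.0)
print(nearest(pts, q, 2))  # L2 picks (0.8, 0.8): dist ~1.13 vs 1.3
print(nearest(pts, q, 1))  # L1 picks (1.3, 0.0): dist  1.3  vs 1.6
```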

by Mikhail at May 02, 2016 02:01 AM


Calculating the cost for each operation in amortized analysis

According to what I've read in the CLRS book, we calculate the amortized cost for a complete sequence of operations, not for a single operation. But in an exam question, the amortized cost of a single operation was asked for, and no information about the analysis function was given.

In these scenarios, how would one calculate the cost of a single operation? For example, in the Fibonacci heap, the cost of extraction is considered. Is there any standard for this, or is the exam question wrong?
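The usual way to assign a cost to a single operation is to amortize over a worst-case sequence (aggregate, accounting, or potential method). A quick simulation of the textbook dynamic-array example shows the gap between one operation's actual cost and its amortized cost:

```python
def append_cost_trace(n):
    """Simulate n appends to a doubling dynamic array and return the
    list of per-operation costs (1 write, plus copies on each resize)."""
    costs, size, cap = [], 0, 1
    for _ in range(n):
        if size == cap:              # full: copy all elements, double capacity
            costs.append(size + 1)
            cap *= 2
        else:
            costs.append(1)
        size += 1
    return costs

costs = append_cost_trace(1000)
print(max(costs))                # a single append can cost Theta(size) ...
print(sum(costs) / len(costs))   # ... yet the average stays below 3
```

Under the accounting method this is exactly the "charge 3 per append" argument; the per-operation amortized cost the exam asks for is that charge, not the actual cost of any particular operation.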


by user3371603 at May 02, 2016 01:47 AM

arXiv Networking and Internet Architecture

User Selection and Power Allocation in Full Duplex Multi-Cell Networks. (arXiv:1604.08937v1 [cs.NI])

Full duplex (FD) communications has the potential to double the capacity of a half duplex (HD) system at the link level. However, in a cellular network, FD operation is not a straightforward extension of half duplex operations. The increased interference due to a large number of simultaneous transmissions in FD operation and realtime traffic conditions limits the capacity improvement. Realizing the potential of FD requires careful coordination of resource allocation among the cells as well as within the cell. In this paper, we propose a distributed resource allocation, i.e., joint user selection and power allocation for a FD multi-cell system, assuming FD base stations (BSs) and HD user equipment (UEs). Due to the complexity of finding the globally optimum solution, a sub-optimal solution for UE selection, and a novel geometric programming based solution for power allocation, are proposed. The proposed distributed approach converges quickly and performs almost as well as a centralized solution, but with much lower signaling overhead. It provides a hybrid scheduling policy which allows FD operations whenever it is advantageous, but otherwise defaults to HD operation. We focus on small cell systems because they are more suitable for FD operation, given practical self-interference cancellation limits. With practical self-interference cancellation, it is shown that the proposed hybrid FD system achieves nearly two times throughput improvement for an indoor multi-cell scenario, and about 65% improvement for an outdoor multi-cell scenario compared to the HD system.

by <a href="">Sanjay Goyal</a>, <a href="">Pei Liu</a>, <a href="">Shivendra Panwar</a> at May 02, 2016 01:30 AM

Provision of Public Goods on Networks: On Existence, Uniqueness, and Centralities. (arXiv:1604.08910v1 [cs.GT])

We consider the provision of public goods on networks of strategic agents. We study different effort outcomes of these network games, namely, the Nash equilibria, Pareto efficient effort profiles, and semi-cooperative equilibria (effort profiles resulting from interactions among coalitions of agents). We first identify necessary and sufficient conditions on the structure of the network for the uniqueness of these effort profiles. We show that our finding unifies (and strengthens) existing results in the literature. We also identify conditions for the existence of these effort profiles for the subclasses of games at the two extremes of our model, namely games of strategic complements and games of strategic substitutes. All identified conditions are based only on the network structure, and therefore, are independent of the selected solution concept, or the mechanism/negotiations that lead to the effort profile. We provide a graph-theoretical interpretation of agents' efforts by linking an agent's decision to her centrality in the interaction network. Using this connection, we separate the effects of incoming and outgoing edges on agents' efforts and uncover an alternating effect over walks of different length in the network.

by <a href="">Parinaz Naghizadeh</a>, <a href="">Mingyan Liu</a> at May 02, 2016 01:30 AM

SPARQL query processing with Apache Spark. (arXiv:1604.08903v1 [cs.DB])

The number of linked data sources and the size of the linked open data graph keep growing every day. As a consequence, semantic RDF services are more and more confronted to various "big data" problems. Query processing is one of them and needs to be efficiently addressed with executions over scalable, highly available and fault tolerant frameworks. Data management systems requiring these properties are rarely built from scratch but are rather designed on top of an existing cluster computing engine. In this work, we consider the processing of SPARQL queries with Apache Spark. We propose and compare five different query processing approaches based on different join execution models and Spark components. A detailed experimentation, on real-world and synthetic data sets, emphasizes that two approaches tailored for the RDF data model outperform the other ones on all major query shapes, i.e., star, snowflake, chain and hybrid.

by <a href="">Hubert Naacke</a>, <a href="">Olivier Cur&#xe9;</a>, <a href="">Bernd Amann</a> at May 02, 2016 01:30 AM

Computational Higher Type Theory I: Abstract Cubical Realizability. (arXiv:1604.08873v1 [cs.LO])

In this paper we give a "meaning explanation" of a computational higher type theory in the style of Martin-L\"{o}f and of Constable and Allen. Such an explanation starts with a dimension-stratified collection of terms endowed with a deterministic operational semantics defining what it means to evaluate closed terms of any dimension to canonical form. The semantics of types is given by specifying, at each dimension, when canonical elements are equal, when general elements are equal, and when these definitions capture the structure of an $\infty$-groupoid, namely when they are cubical and satisfy the uniform Kan condition of Bezem, Coquand, and Huber. The goal of this work is to develop a computation-based account of higher-dimensional type theory for which canonicity at observable types is true by construction. For the sake of clarity, we illustrate this method for a simple type theory with higher inductive types, one line between types given by an equivalence, and closed under function and product types. The main technical result is a canonicity theorem for closed terms of boolean type.

by <a href="">Carlo Angiuli</a>, <a href="">Robert Harper</a>, <a href="">Todd Wilson</a> at May 02, 2016 01:30 AM

Improved IKE Protocol Design Based On PKI/ECC. (arXiv:1604.08810v1 [cs.CR])

This Paper proposes an ECDH key exchange method and an ECsig Digital Signature Authentication method based on group with Koblits curve, man-in-the-middle attack prevention method for SA payload and initiator identification payload to design high intensity IKE that can be implemented in portable devices.

by <a href="">Pak Song-Ho</a>, <a href="">Pak Myong-Suk</a>, <a href="">Jang Chung-Hyok</a> at May 02, 2016 01:30 AM

Undecidability of Two-dimensional Robot Games. (arXiv:1604.08779v1 [cs.GT])

Robot game is a two-player vector addition game played on the integer lattice $\mathbb{Z}^n$. Both players have sets of vectors and in each turn the vector chosen by a player is added to the current configuration vector of the game. One of the players, called Eve, tries to play the game from the initial configuration to the origin while the other player, Adam, tries to avoid the origin. The problem is to decide whether or not Eve has a winning strategy. In this paper we prove undecidability of the robot game in dimension two answering the question formulated by Doyen and Rabinovich in 2011 and closing the gap between undecidable and decidable cases.
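The game mechanics in the abstract can be sketched as a bounded-depth search (the move sets below are made up for illustration; the undecidability result means no fixed depth bound suffices in general):

```python
from functools import lru_cache

# Bounded-depth check of Eve's winning strategy in a robot game on Z^2.
# Eve wins if one of her moves reaches the origin, or if every reply of
# Adam leaves a position she wins within the remaining budget.
EVE = ((-1, -1),)          # hypothetical move set for Eve
ADAM = ((0, 1), (1, 0))    # hypothetical move set for Adam

@lru_cache(maxsize=None)
def eve_wins(x, y, depth):
    for ex, ey in EVE:
        nx, ny = x + ex, y + ey
        if (nx, ny) == (0, 0):
            return True
        if depth > 0 and all(eve_wins(nx + ax, ny + ay, depth - 1)
                             for ax, ay in ADAM):
            return True
    return False

print(eve_wins(1, 1, 0))   # Eve reaches the origin in one move
print(eve_wins(2, 2, 5))   # Adam can always steer away within this budget
```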

by <a href="">Reino Niskanen</a>, <a href="">Igor Potapov</a>, <a href="">Julien Reichert</a> at May 02, 2016 01:30 AM

Helly $\mathbf{EPT}$ graphs on bounded degree trees: forbidden induced subgraphs and efficient recognition. (arXiv:1604.08775v1 [math.CO])

The edge intersection graph of a family of paths in a host tree is called an $EPT$ graph. When the host tree has maximum degree $h$, we say that $G$ belongs to the class $[h,2,2]$. If, in addition, the family of paths satisfies the Helly property, then $G \in$ Helly $[h,2,2]$.

The time complexity of the recognition of the classes $[h,2,2]$ inside the class $EPT$ is open for every $h > 4$. Golumbic et al. wonder whether the only obstructions for an $EPT$ graph belonging to $[h,2,2]$ are the chordless cycles $C_n$ for $n > h$. In the present paper, we give a negative answer to that question: we present a family of $EPT$ graphs which are forbidden induced subgraphs for the classes $[h,2,2]$.

Using them we obtain a total characterization by forbidden induced subgraphs of the classes Helly $[h,2,2]$ for $h \geq 4$ inside the class $EPT$. As a byproduct, we prove that Helly $EPT \cap [h,2,2] =$ Helly $[h,2,2]$. We characterize Helly $[h,2,2]$ graphs by their atoms in the decomposition by clique separators. We give an efficient algorithm to recognize Helly $[h,2,2]$ graphs.

by <a href="">Liliana Alc&#xf3;n</a>, <a href="">Marisa Gutierrez</a>, <a href="">Mar&#xed;a P&#xed;a Mazzoleni</a> at May 02, 2016 01:30 AM

Dynamic Clustering and Sleep Mode Strategies for Small Cell Networks. (arXiv:1604.08758v1 [cs.NI])

In this paper, a novel cluster-based approach for optimizing the energy efficiency of wireless small cell networks is proposed. A dynamic mechanism based on the spectral clustering technique is proposed to dynamically form clusters of small cell base stations. Such clustering enables intra-cluster coordination among the base stations for optimizing the downlink performance through load balancing, while satisfying users' quality-of-service requirements. In the proposed approach, the clusters use an opportunistic base station sleep-wake switching mechanism to strike a balance between delay and energy consumption. The inter-cluster interference affects the performance of the clusters and their choices of active or sleep state. Due to the lack of inter-cluster communications, the clusters have to compete with each other to make decisions on improving the energy efficiency. This competition is formulated as a noncooperative game among the clusters that seek to minimize a cost function which captures the tradeoff between energy expenditure and load. To solve this game, a distributed learning algorithm is proposed using which the clusters autonomously choose their optimal transmission strategies. Simulation results show that the proposed approach yields significant performance gains in terms of reduced energy expenditures up to 40% and reduced load up to 23% compared to conventional approaches.

by <a href="">Sumudu Samarakoon</a>, <a href="">Mehdi Bennis</a>, <a href="">Walid Saad</a>, <a href="">Matti Latva-aho</a> at May 02, 2016 01:30 AM

Opportunistic Sleep Mode Strategies in Wireless Small Cell Networks. (arXiv:1604.08756v1 [cs.NI])

The design of energy-efficient mechanisms is one of the key challenges in emerging wireless small cell networks. In this paper, a novel approach for opportunistically switching ON/OFF base stations to improve the energy efficiency in wireless small cell networks is proposed. The proposed approach enables the small cell base stations to optimize their downlink performance while balancing the load among each another, while satisfying their users' quality-of-service requirements. The problem is formulated as a noncooperative game among the base stations that seek to minimize a cost function which captures the tradeoff between energy expenditure and load. To solve this game, a distributed learning algorithm is proposed using which the base stations autonomously choose their optimal transmission strategies. Simulation results show that the proposed approach yields significant performance gains in terms of reduced energy expenditures up to 23% and reduced load up to 40% compared to conventional approaches.

by <a href="">Sumudu Samarakoon</a>, <a href="">Mehdi Bennis</a>, <a href="">Walid Saad</a>, <a href="">Matti Latva-aho</a> at May 02, 2016 01:30 AM

Enabling Relaying Over Heterogeneous Backhauls in the Uplink of Wireless Femtocell Networks. (arXiv:1604.08744v1 [cs.NI])

In this paper, we develop novel two-tier interference management strategies that enable macrocell users (MUEs) to improve their performance, with the help of open-access femtocells. To this end, we propose a rate-splitting technique using which the MUEs optimize their uplink transmissions by dividing their signals into two types: a coarse message that is intended for direct transmission to the macrocell base station and a fine message that is decoded by a neighboring femtocell and subsequently relayed over a heterogeneous (wireless/wired) backhaul. For deploying the proposed technique, we formulate a non-cooperative game between the MUEs in which each MUE can decide on its relaying femtocell while maximizing a utility function that captures both the achieved throughput and the expected backhaul delay. Simulation results show that the proposed approach yields up to 125% rate improvement and up to 2 times delay reduction with wired backhaul and, 150% rate improvement and up to 10 times delay reduction with wireless backhaul, relative to classical interference management approaches, with no cross-tier cooperation.

by <a href="">Sumudu Samarakoon</a>, <a href="">Mehdi Bennis</a>, <a href="">Walid Saad</a>, <a href="">Matti Latva-aho</a> at May 02, 2016 01:30 AM

Verifying Buchberger's Algorithm in Reduction Rings. (arXiv:1604.08736v1 [cs.SC])

In this paper we present the formal, computer-supported verification of a functional implementation of Buchberger's critical-pair/completion algorithm for computing Gr\"obner bases in reduction rings. We describe how the algorithm can be implemented and verified within one single software system, which in our case is the Theorema system.

In contrast to existing formal correctness proofs of Buchberger's algorithm in other systems, e. g. Coq and ACL2, our work is not confined to the classical setting of polynomial rings over fields, but considers the much more general setting of reduction rings; this, naturally, makes the algorithm more complicated and the verification more difficult.

The correctness proof is essentially based on some non-trivial results from the theory of reduction rings, which we formalized and formally proved as well. This formalization already consists of more than 800 interactively proved lemmas and theorems, making the elaboration an extensive example of higher-order theory exploration in Theorema.

by <a href="">Alexander Maletzky</a> at May 02, 2016 01:30 AM

System Level Performance Evaluation of LTE-V2X Network. (arXiv:1604.08734v1 [cs.NI])

Vehicles are among the fastest growing type of connected devices. Therefore, there is a need for Vehicle-to-Everything (V2X) communication i.e. passing of information from a Vehicle-to-Vehicle (V2V) or Vehicle-to-Infrastructure (V2I) and vice versa. In this paper, the main focus is on the communication between vehicles and road side units (RSUs) commonly referred to as V2I communication in a multi-lane freeway scenario. Moreover, we analyze network related bottlenecks such as the maximum number of vehicles that can be supported when coverage is provided by the Long Term Evolution Advanced (LTE-A) network. The performance evaluation is assessed through extensive system-level simulations. Results show that new resource allocation and interference mitigation techniques are needed in order to achieve the required high reliability requirements, especially when network load is high.

by <a href="">Petri Luoto</a>, <a href="">Mehdi Bennis</a>, <a href="">Pekka Pirinen</a>, <a href="">Sumudu Samarakoon</a>, <a href="">Kari Horneman</a>, <a href="">Matti Latva-aho</a> at May 02, 2016 01:30 AM

Exploring Social Networks for Optimized User Association in Wireless Small Cell Networks with Device-to-Device Communications. (arXiv:1604.08727v1 [cs.NI])

In this paper, we propose a novel social network aware approach for user association in wireless small cell networks. The proposed approach exploits social relationships between user equipments (UEs) and their physical proximity to optimize the network throughput. We formulate the problem as a matching game between UEs and their serving nodes (SNs). In our proposed game, the serving node can be a small cell base station (SCBS) or an important node with device-to-device capabilities. In this game, the SCBSs and UEs maximize their respective utility functions capturing both the spatial and social structures of the network. We show that the proposed game belongs to the class of matching games with externalities. Subsequently, we propose a distributed algorithm using which the SCBSs and UEs interact and reach a stable matching. We show the convergence of the proposed algorithm and study the properties of the resulting matching. Simulation results show that the proposed socially-aware user association approach can efficiently offload traffic while yielding a significant gain reaching up to 63% in terms of data rates as compared to the classical (social-unaware) approach.

by <a href="">Muhammad Ikram Ashraf</a>, <a href="">Mehdi Bennis</a>, <a href="">Walid Saad</a>, <a href="">Marcos Katz</a> at May 02, 2016 01:30 AM

"Knowing value" logic as a normal modal logic. (arXiv:1604.08709v1 [cs.AI])

Recent years witness a growing interest in nonstandard epistemic logics of "knowing whether", "knowing what", "knowing how" and so on. These logics are usually not normal, i.e., the standard axioms and reasoning rules for modal logic may be invalid. In this paper, we show that the conditional "knowing value" logic proposed by Wang and Fan (2013) can be viewed as a disguised normal modal logic by treating the negation of Kv operator as a special diamond. Under this perspective, it turns out that the original first-order Kripke semantics can be greatly simplified by introducing a ternary relation $R_i^c$ in standard Kripke models which associates one world with two $i$-accessible worlds that do not agree on the value of constant $c$. Under intuitive constraints, the modal logic based on such Kripke models is exactly the one studied by Wang and Fan (2013,2014). Moreover, there is a very natural binary generalizations of the "knowing value" diamond, which, surprisingly, does not increase the expressive power of the logic. The resulting logic with the binary diamond has a transparent normal modal system which sharpens our understanding of the "knowing value" logic and simplifies some previous hard problems.

by <a href="">Tao Gu</a>, <a href="">Yanjing Wang</a> at May 02, 2016 01:30 AM

A Modest Proposal for Open Market Risk Assessment to Solve the Cyber-Security Problem. (arXiv:1604.08675v1 [cs.CR])

We introduce a model for a market based economic system of cyber-risk valuation to correct fundamental problems of incentives within the information technology and information processing industries. We assess the makeup of the current day marketplace, identify incentives, identify economic reasons for current failings, and explain how a market based risk valuation system could improve these incentives to form a secure and robust information marketplace for all consumers by providing visibility into open, consensus based risk pricing and allowing all parties to make well informed decisions.

by <a href="">Timothy J. O&#x27;Shea</a>, <a href="">Adam Mondl</a>, <a href="">T. Charles. Clancy</a> at May 02, 2016 01:30 AM

Dependence between External Path-Length and Size in Random Tries. (arXiv:1604.08658v1 [math.CO])

We study the size and the external path length of random tries and show that they are asymptotically independent in the asymmetric case but strongly dependent with small periodic fluctuations in the symmetric case. Such an unexpected behavior is in sharp contrast to the previously known results that the internal path length is totally positively correlated to the size and that both tend to the same normal limit law. These two examples provide concrete instances of bivariate normal distributions (as limit laws) whose correlation is $0$, $1$ and periodically oscillating.

by <a href="">Michael Fuchs</a>, <a href="">Hsien-Kuei Hwang</a> at May 02, 2016 01:30 AM

Licensed-Assisted Access to Unlicensed Spectrum in LTE Release 13. (arXiv:1604.08632v1 [cs.NI])

Exploiting the unlicensed spectrum is considered by 3GPP as one promising solution to meet the ever-increasing traffic growth. As a result, one major enhancement for LTE in Release 13 has been to enable its operation in the unlicensed spectrum via Licensed-Assisted Access (LAA). In this article, we provide an overview of the Release 13 LAA technology including motivation, use cases, LTE enhancements for enabling the unlicensed band operation, and the coexistence evaluation results contributed by 3GPP participants.

by <a href="">Hwan-Joon (Eddy) Kwon</a>, <a href="">Jeongho Jeon</a>, <a href="">Abhijeet Bhorkar</a>, <a href="">Qiaoyang Ye</a>, <a href="">Hiroki Harada</a>, <a href="">Yu Jiang</a>, <a href="">Liu Liu</a>, <a href="">Satoshi Nagata</a>, <a href="">Boon Loong Ng</a>, <a href="">Thomas Novlan</a>, <a href="">Jinyoung Oh</a>, <a href="">Wang Yi</a> at May 02, 2016 01:30 AM

Throughput and range characterization of IEEE 802.11ah. (arXiv:1604.08625v1 [cs.NI])

The most essential part of Internet of Things (IoT) infrastructure is the wireless communication system that acts as a bridge for the delivery of data and control messages. However, the existing wireless technologies lack the ability to support a huge amount of data exchange from many battery driven devices spread over a wide area. In order to support the IoT paradigm, the IEEE 802.11 standard committee is in process of introducing a new standard, called IEEE 802.11ah. This is one of the most promising and appealing standards, which aims to bridge the gap between traditional mobile networks and the demands of the IoT. In this paper, we first discuss the main PHY and MAC layer amendments proposed for IEEE 802.11ah. Furthermore, we investigate the operability of IEEE 802.11ah as a backhaul link to connect devices over a long range. Additionally, we compare the aforementioned standard with previous notable IEEE 802.11 amendments (i.e. IEEE 802.11n and IEEE 802.11ac) in terms of throughput (with and without frame aggregation) by utilizing the most robust modulation schemes. The results show an improved performance of IEEE 802.11ah (in terms of power received at long range while experiencing different packet error rates) as compared to previous IEEE 802.11 standards.

by <a href="">Victor Ba&#xf1;os-Gonzalez</a>, <a href="">M. Shahwaiz Afaqui</a>, <a href="">Elena Lopez-Aguilera</a>, <a href="">Eduard Garcia-Villegas</a> at May 02, 2016 01:30 AM

Stringer: Balancing Latency and Resource Usage in Service Function Chain Provisioning. (arXiv:1604.08618v1 [cs.NI])

Network Functions Virtualization, or NFV, enables telecommunications infrastructure providers to replace special-purpose networking equipment with commodity servers running virtualized network functions (VNFs). A service provider utilizing NFV technology faces the SFC provisioning problem of assigning VNF instances to nodes in the physical infrastructure (e.g., a datacenter), and routing Service Function Chains (sequences of functions required by customers, a.k.a. SFCs) in the physical network. In doing so, the provider must balance between various competing goals of performance and resource usage. We present an approach for SFC provisioning, consisting of three elements. The first element is a fast, scalable round-robin heuristic. The second element is a Mixed Integer Programming (MIP) based approach. The third element is a queueing-theoretic model to estimate the average latency associated with any SFC provisioning solution. Combined, these elements create an approach that generates a set of SFC provisioning solutions, reflecting different tradeoffs between resource usage and performance.

by <a href="">Freddy C. Chua</a>, <a href="">Julie Ward</a>, <a href="">Ying Zhang</a>, <a href="">Puneet Sharma</a>, <a href="">Bernardo A. Huberman</a> at May 02, 2016 01:30 AM

On the Inefficiency of Standard Multi-Unit Auctions. (arXiv:1303.1646v3 [cs.GT] UPDATED)

We study two standard multi-unit auction formats for allocating multiple units of a single good to multi-demand bidders. The first one is the Discriminatory Auction, which charges every winner his winning bids. The second is the Uniform Price Auction, which determines a uniform price to be paid per unit. Variants of both formats find applications ranging from the allocation of state bonds to investors, to online sales over the internet, facilitated by popular online brokers. For these formats, we consider two bidding interfaces: (i) standard bidding, which is most prevalent in the scientific literature, and (ii) uniform bidding, which is more popular in practice. In this work, we evaluate the economic inefficiency of both multi-unit auction formats for both bidding interfaces, by means of upper and lower bounds on the Price of Anarchy for pure Nash equilibria and mixed Bayes-Nash equilibria. Our developments improve significantly upon bounds that have been obtained recently in [Markakis, Telelis, ToCS 2014] and [Syrgkanis, Tardos, STOC 2013] for submodular valuation functions. Moreover, we consider for the first time bidders with subadditive valuation functions for these auction formats. Our results signify that these auctions are nearly efficient, which provides further justification for their use in practice.

by <a href="">Bart de Keijzer</a>, <a href="">Evangelos Markakis</a>, <a href="">Guido Sch&#xe4;fer</a>, <a href="">Orestis Telelis</a> at May 02, 2016 01:30 AM

A partition of the hypercube into maximally nonparallel Hamming codes. (arXiv:1210.0010v2 [cs.IT] UPDATED)

By using the Gold map, we construct a partition of the hypercube into cosets of Hamming codes such that for every two cosets the corresponding Hamming codes are maximally nonparallel, that is, their intersection cardinality is as small as possible to admit nonintersecting cosets.

by <a href="">Denis Krotov</a> (Sobolev Institute of Mathematics, Novosibirsk, Russia) at May 02, 2016 01:30 AM


Extracting features for texture classification

I'm a beginner in the field of pattern recognition and computer vision. I'm working on a project right now to classify t-shirt patterns into three categories, i.e. solids, stripes and checks. I have close-up training images of the t-shirts. A sample shirt image looks like this

I have looked at a bank of Gabor filter features, but they are computationally expensive. It would be of great help if someone could point me in the right direction for moving forward. Any help is appreciated.

EDIT - I found the solution in D.W.'s answer below, though my solution is not very good. I'm classifying solid patterns by counting the number of line segments in the image: if the count falls below a certain threshold, I classify the image as solid. If not, I further classify it into stripes or checks using HoG features and a linear SVM. The accuracy achieved was around 91%. It was a little low due to some misclassified samples in the training set.
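For what it's worth, a bare-bones gradient-orientation histogram (a pure-Python stand-in for HoG, no library assumed) already separates solids, which have almost no edge energy, from oriented patterns:

```python
import math

def orientation_histogram(img, bins=8):
    """Histogram of gradient orientations (folded mod 180 degrees),
    weighted by gradient magnitude -- a crude HoG-style feature."""
    h = [0.0] * bins
    rows, cols = len(img), len(img[0])
    for r in range(rows - 1):
        for c in range(cols - 1):
            gx = img[r][c + 1] - img[r][c]
            gy = img[r + 1][c] - img[r][c]
            mag = math.hypot(gx, gy)
            if mag == 0:
                continue
            ang = math.atan2(gy, gx) % math.pi       # fold to [0, pi)
            h[min(int(ang / math.pi * bins), bins - 1)] += mag
    return h

# Vertical stripes: all gradients are horizontal, so one bin dominates.
stripes = [[(c // 2) % 2 for c in range(16)] for _ in range(16)]
hist = orientation_histogram(stripes)
print(hist.index(max(hist)))  # bin 0 (horizontal gradient direction)
```

On a solid patch the histogram is all zeros, which matches the line-segment-count idea above: thresholding total edge energy filters out solids before the stripes/checks stage.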

by m_amber at May 02, 2016 01:29 AM


Planet Theory

On the Complexity of Solving Zero-Dimensional Polynomial Systems via Projection

Authors: Cornelius Brand, Michael Sagraloff
Download: PDF
Abstract: Given a zero-dimensional polynomial system consisting of n integer polynomials in n variables, we propose a certified and complete method to compute all complex solutions of the system as well as a corresponding separating linear form l with coefficients of small bit size. For computing l, we need to project the solutions into one dimension along O(n) distinct directions but no further algebraic manipulations. The solutions are then directly reconstructed from the considered projections. The first step is deterministic, whereas the second step uses randomization, thus being Las-Vegas.

The theoretical analysis of our approach shows that the overall cost for the two problems considered above is dominated by the cost of carrying out the projections. We also give bounds on the bit complexity of our algorithms that are exclusively stated in terms of the number of variables, the total degree and the bitsize of the input polynomials.

May 02, 2016 12:41 AM

Complexity Hierarchies and Higher-Order Cons-Free Rewriting

Authors: Cynthia Kop, Jakob Grue Simonsen
Download: PDF
Abstract: Constructor rewriting systems are said to be cons-free if, roughly, constructor terms in the right-hand sides of rules are subterms of constructor terms in the left-hand side; the computational intuition is that rules cannot build new data structures. It is well-known that cons-free programming languages can be used to characterize computational complexity classes, and that cons-free first-order term rewriting can be used to characterize the set of polynomial-time decidable sets.

We investigate cons-free higher-order term rewriting systems, the complexity classes they characterize, and how these depend on the order of the types used in the systems. We prove that, for every k $\geq$ 1, left-linear cons-free systems with type order k characterize E$^k$TIME if arbitrary evaluation is used (i.e., the system does not have a fixed reduction strategy).

The main difference with prior work in implicit complexity is that (i) our results hold for non-orthogonal term rewriting systems with possible rule overlaps with no assumptions about reduction strategy, (ii) results for such term rewriting systems have previously only been obtained for k = 1, and with additional syntactic restrictions on top of cons-freeness and left-linearity.

Our results are apparently among the first implicit characterizations of the hierarchy E = E$^1$TIME $\subseteq$ E$^2$TIME $\subseteq$ .... Our work confirms prior results that having full non-determinism (via overlaps of rules) does not directly allow characterization of non-deterministic complexity classes like NE. We also show that non-determinism makes the classes characterized highly sensitive to minor syntactic changes such as admitting product types or non-left-linear rules.

May 02, 2016 12:40 AM


Greenpeace says they want to publish TTIP today. The ...

Greenpeace says they want to publish TTIP today.

The press already has it, too.

But somehow I have no appetite left for the press on this one. Just look at what they made of the Panama Papers. Extracted a few dirty scoops, took one potshot in Russia's direction, and then dropped the whole thing like a hot potato.

No comparison to the published raw data and the search engine that Wikileaks has always built for things like this.

Lately, things like this make me feel as if someone were trying to lead me around by a nose ring. No, dear SZ, I would like to do my own thinking. You can keep your thoughts. Put the data on the internet and leave the rumor-mongering and muttering to stinking internet forums. They are better at it than you are anyway.

You are also welcome to do a bit of contextualizing, but not from a "we are the only ones who have the data" position.

Update: What Greenpeace has is only about half of TTIP.

May 02, 2016 12:00 AM

HN Daily

Planet Theory

Designing optimal- and fast-on-average pattern matching algorithms

Authors: Gilles Didier, Laurent Tichit
Download: PDF
Abstract: Given a pattern $w$ and a text $t$, the speed of a pattern matching algorithm over $t$ with regard to $w$ is the ratio of the length of $t$ to the number of text accesses performed to search for $w$ in $t$. We first propose a general method for computing the limit of the expected speed of pattern matching algorithms, with regard to $w$, over iid texts. Next, we show how to determine the greatest speed which can be achieved among a large class of algorithms, together with an algorithm achieving this speed. Since the complexity of this determination makes it impossible to deal with patterns of length greater than 4, we propose a polynomial heuristic. Finally, our approaches are compared with pre-existing pattern matching algorithms from both a theoretical and a practical point of view, i.e. both in terms of limit expected speed on iid texts, and in terms of observed average speed on real data. In all cases, the pre-existing algorithms are outperformed.

May 02, 2016 12:00 AM

Ortho-polygon Visibility Representations of Embedded Graphs

Authors: Emilio Di Giacomo, Walter Didimo, William S. Evans, Giuseppe Liotta, Henk Meijer, Fabrizio Montecchiani, Stephen K. Wismath
Download: PDF
Abstract: An ortho-polygon visibility representation of an $n$-vertex embedded graph $G$ (OPVR of $G$) is an embedding preserving drawing of $G$ that maps every vertex to a distinct orthogonal polygon and each edge to a vertical or horizontal visibility between its end-vertices. The vertex complexity of an OPVR of $G$ is the minimum $k$ such that every polygon has at most $k$ reflex corners. We present polynomial time algorithms that test whether $G$ has an OPVR and, if so, compute one of minimum vertex complexity. We argue that the existence and the vertex complexity of an OPVR of $G$ are related to its number of crossings per edge and to its connectivity. Namely, we prove that if $G$ has at most one crossing per edge (i.e. $G$ is a $1$-plane graph) an OPVR of $G$ always exists while this may not be the case if two crossings per edge are allowed. Also, if $G$ is a $3$-connected $1$-plane graph, we can compute in $O(n)$ time an OPVR of $G$ whose vertex complexity is bounded by a constant. However, if $G$ is a $2$-connected $1$-plane graph, the vertex complexity of any OPVR of $G$ may be $\Omega(n)$. In contrast, we describe a family of $2$-connected $1$-plane graphs for which, in $O(n)$ time, an embedding that guarantees constant vertex complexity can be computed. Finally, we present the results of an experimental study on the vertex complexity of ortho-polygon visibility representations of $1$-plane graphs.

May 02, 2016 12:00 AM

A Linear Time Parameterized Algorithm for Node Unique Label Cover

Authors: Daniel Lokshtanov, M. S. Ramanujan, Saket Saurabh
Download: PDF
Abstract: The optimization version of the Unique Label Cover problem is at the heart of the Unique Games Conjecture which has played an important role in the proof of several tight inapproximability results. In recent years, this problem has been also studied extensively from the point of view of parameterized complexity. Cygan et al. [FOCS 2012] proved that this problem is fixed-parameter tractable (FPT) and Wahlstr\"om [SODA 2014] gave an FPT algorithm with an improved parameter dependence. Subsequently, Iwata, Wahlstr\"om and Yoshida [2014] proved that the edge version of Unique Label Cover can be solved in linear FPT-time. That is, there is an FPT algorithm whose dependence on the input-size is linear. However, such an algorithm for the node version of the problem was left as an open problem. In this paper, we resolve this question by presenting the first linear-time FPT algorithm for Node Unique Label Cover.

May 02, 2016 12:00 AM

Optimal Computation of Avoided Words

Authors: Yannis Almirantis, Panagiotis Charalampopoulos, Jia Gao, Costas S. Iliopoulos, Manal Mohamed, Solon P. Pissis, Dimitris Polychronopoulos
Download: PDF
Abstract: The deviation of the observed frequency of a word $w$ from its expected frequency in a given sequence $x$ is used to determine whether or not the word is avoided. This concept is particularly useful in DNA linguistic analysis. The value of the standard deviation of $w$, denoted by $std(w)$, effectively characterises the extent of a word by its edge contrast in the context in which it occurs. A word $w$ of length $k>2$ is a $\rho$-avoided word in $x$ if $std(w) \leq \rho$, for a given threshold $\rho < 0$. Notice that such a word may be completely absent from $x$. Hence computing all such words na\"{\i}vely can be a very time-consuming procedure, in particular for large $k$. In this article, we propose an $O(n)$-time and $O(n)$-space algorithm to compute all $\rho$-avoided words of length $k$ in a given sequence $x$ of length $n$ over a fixed-sized alphabet. We also present a time-optimal $O(\sigma n)$-time and $O(\sigma n)$-space algorithm to compute all $\rho$-avoided words (of any length) in a sequence of length $n$ over an alphabet of size $\sigma$. Furthermore, we provide a tight asymptotic upper bound for the number of $\rho$-avoided words and the expected length of the longest one. We make available an open-source implementation of our algorithm. Experimental results, using both real and synthetic data, show the efficiency of our implementation.

May 02, 2016 12:00 AM

An I/O-efficient Generator for Massive Complex Networks with Explicit Community Structure

Authors: Michael Hamann, Manuel Penschuck
Download: PDF
Abstract: The LFR benchmark is a popular benchmark graph model used to evaluate community detection algorithms. We present the first external memory algorithm that is able to generate massive complex networks following the LFR model. Its most expensive component is the generation of random graphs with prescribed degree sequences, which can be divided into two steps: the graphs are first materialized as deterministic graphs using the Havel-Hakimi algorithm and then randomized. Our main contributions are HP-HH and ES-TFP, two I/O-efficient external memory algorithms for these two steps. In an experimental evaluation we demonstrate their performance: our implementation is able to generate graphs with more than 10 billion edges on a single machine, is competitive with a parallel massively distributed algorithm, and on smaller instances is faster than a state-of-the-art internal memory implementation.
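The Havel-Hakimi materialization step mentioned in the abstract can be sketched generically; this is only a small in-memory illustration of the classical algorithm, not the paper's I/O-efficient HP-HH:

```python
def havel_hakimi(degrees):
    """Realize a degree sequence as an edge list, or return None if not graphical."""
    nodes = sorted(enumerate(degrees), key=lambda t: -t[1])
    edges = []
    while nodes and nodes[0][1] > 0:
        v, d = nodes.pop(0)                 # take the highest-degree vertex
        if d > len(nodes):
            return None                     # not enough partners left
        for i in range(d):                  # connect it to the d next-highest vertices
            u, du = nodes[i]
            if du == 0:
                return None                 # would need a partner with no capacity
            edges.append((v, u))
            nodes[i] = (u, du - 1)
        nodes.sort(key=lambda t: -t[1])
    return edges

print(havel_hakimi([2, 2, 2]))  # three edges forming a triangle
print(havel_hakimi([3, 1, 1]))  # None: not a graphical sequence
```

The randomization step of the paper then shuffles such a deterministic realization via edge swaps.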

May 02, 2016 12:00 AM

Relative Convex Hull Determination from Convex Hulls in the Plane

Authors: P. Wiederhold, H. Reyes
Download: PDF
Abstract: A new algorithm for the determination of the relative convex hull in the plane of a simple polygon A with respect to another simple polygon B which contains A, is proposed. The relative convex hull is also known as geodesic convex hull, and the problem of its determination in the plane is equivalent to find the shortest curve among all Jordan curves lying in the difference set of B and A and encircling A. Algorithms solving this problem known from Computational Geometry are based on the triangulation or similar decomposition of that difference set. The algorithm presented here does not use such decomposition, but it supposes that A and B are given as ordered sequences of vertices. The algorithm is based on convex hull calculations of A and B and of smaller polygons and polylines, it produces the output list of vertices of the relative convex hull from the sequence of vertices of the convex hull of A.

May 02, 2016 12:00 AM

On Approximating Functions of the Singular Values in a Stream

Authors: Yi Li, David P. Woodruff
Download: PDF
Abstract: For any real number $p > 0$, we nearly completely characterize the space complexity of estimating $\|A\|_p^p = \sum_{i=1}^n \sigma_i^p$ for $n \times n$ matrices $A$ in which each row and each column has $O(1)$ non-zero entries and whose entries are presented one at a time in a data stream model. Here the $\sigma_i$ are the singular values of $A$, and when $p \geq 1$, $\|A\|_p^p$ is the $p$-th power of the Schatten $p$-norm. We show that when $p$ is not an even integer, to obtain a $(1+\epsilon)$-approximation to $\|A\|_p^p$ with constant probability, any $1$-pass algorithm requires $n^{1-g(\epsilon)}$ bits of space, where $g(\epsilon) \rightarrow 0$ as $\epsilon \rightarrow 0$ and $\epsilon > 0$ is a constant independent of $n$. However, when $p$ is an even integer, we give an upper bound of $n^{1-2/p} \textrm{poly}(\epsilon^{-1}\log n)$ bits of space, which holds even in the turnstile data stream model. The latter is optimal up to $\textrm{poly}(\epsilon^{-1} \log n)$ factors.

Our results considerably strengthen lower bounds in previous work for arbitrary (not necessarily sparse) matrices $A$: the previous best lower bound was $\Omega(\log n)$ for $p\in (0,1)$, $\Omega(n^{1/p-1/2}/\log n)$ for $p\in [1,2)$ and $\Omega(n^{1-2/p})$ for $p\in (2,\infty)$. We note for $p \in (2, \infty)$, while our lower bound for even integers is the same, for other $p$ in this range our lower bound is $n^{1-g(\epsilon)}$, which is considerably stronger than the previous $n^{1-2/p}$ for small enough constant $\epsilon > 0$. We obtain similar near-linear lower bounds for Ky-Fan norms, SVD entropy, eigenvalue shrinkers, and M-estimators, many of which could have been solvable in logarithmic space prior to our work.

May 02, 2016 12:00 AM

On the Erdos-Szekeres convex polygon problem

Authors: Andrew Suk
Download: PDF
Abstract: Let $ES(n)$ be the smallest integer such that any set of $ES(n)$ points in the plane in general position contains $n$ points in convex position. In their seminal 1935 paper, Erdos and Szekeres showed that $ES(n) \leq {2n - 4\choose n-2} + 1 = 4^{n -o(n)}$. In 1960, they showed that $ES(n) \geq 2^{n-2} + 1$ and conjectured this to be optimal. Despite the efforts of many researchers, no improvement in the order of magnitude has ever been made on the upper bound over the last 81 years. In this paper, we nearly settle the Erdos-Szekeres conjecture by showing that $ES(n) =2^{n +o(n)}$.

May 02, 2016 12:00 AM

Derivative-free Efficient Global Optimization on High-dimensional Simplex

Authors: Priyam Das
Download: PDF
Abstract: In this paper, we develop a novel derivative-free deterministic greedy algorithm for global optimization of any objective function of parameters belonging to a unit simplex. The main principle of the proposed algorithm is making jumps of varying step-sizes within the simplex parameter space and searching for the best direction to move in a greedy manner. Unlike most other existing methods of constrained optimization, here the objective function is evaluated at independent directions within an iteration. Thus incorporation of parallel computing makes it even faster. The requirement of parallelization grows only in the order of the dimension of the parameter space, which makes it more convenient for solving high-dimensional optimization problems in simplex parameter space using parallel computing. A comparative study of the performance of this algorithm and other existing algorithms is shown for some moderate and high-dimensional optimization problems, along with some transformed benchmark test-functions on the simplex. An improvement of around 20 to 300 fold in computation time has been achieved using the proposed algorithm over the genetic algorithm, with more accurate solutions.
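To illustrate the general idea only (this is not the paper's algorithm; the step sizes and stopping rule below are arbitrary choices), a derivative-free greedy search on the unit simplex might look like this. Each candidate move is a convex combination with a vertex, which keeps the iterate on the simplex by construction, and the inner evaluations are independent, which is where the abstract's parallelism would come in:

```python
import numpy as np

def greedy_simplex_minimize(f, dim, steps=(0.5, 0.1, 0.02), iters=200):
    # Start at the simplex centre and greedily take the best single jump.
    x = np.full(dim, 1.0 / dim)
    fx = f(x)
    for _ in range(iters):
        best_f, best_x = fx, x
        for s in steps:
            for j in range(dim):
                cand = (1 - s) * x
                cand[j] += s          # move toward vertex e_j; stays on the simplex
                fc = f(cand)
                if fc < best_f:
                    best_f, best_x = fc, cand
        if best_f >= fx:              # no jump improved the objective: stop
            break
        fx, x = best_f, best_x
    return x, fx

target = np.array([0.2, 0.3, 0.5])
x, fx = greedy_simplex_minimize(lambda v: ((v - target) ** 2).sum(), 3)
print(x, fx)  # ends close to the target point on the simplex
```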

May 02, 2016 12:00 AM

Decomposing Cubic Graphs into Connected Subgraphs of Size Three

Authors: Laurent Bulteau, Guillaume Fertin, Anthony Labarre, Romeo Rizzi, Irena Rusu
Download: PDF
Abstract: Let $S=\{K_{1,3},K_3,P_4\}$ be the set of connected graphs of size 3. We study the problem of partitioning the edge set of a graph $G$ into graphs taken from any non-empty $S'\subseteq S$. The problem is known to be NP-complete for any possible choice of $S'$ in general graphs. In this paper, we assume that the input graph is cubic, and study the computational complexity of the problem of partitioning its edge set for any choice of $S'$. We identify all polynomial and NP-complete problems in that setting, and give graph-theoretic characterisations of $S'$-decomposable cubic graphs in some cases.

May 02, 2016 12:00 AM

May 01, 2016


how is scan implemented in a purely functional way?

Functional reactive programming uses scan on streams/observables. But it also follows the functional principle of being stateless. How is this implemented without state, especially in languages like Haskell?

EDIT: by stateless, I meant no mutable values. I know you can use recursion for reduce, but it seems like that would be harder with scan, because the function needs to pause until the next element comes in.
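For what it's worth, a scan can be written with plain recursion and no mutable values at all; here is a minimal Python sketch mirroring Haskell's scanl, where `scanl f z (x:xs) = z : scanl f (f z x) xs`:

```python
def scan(f, z, xs):
    # purely functional scan: each recursive call receives a fresh accumulator,
    # nothing is ever mutated
    if not xs:
        return [z]
    return [z] + scan(f, f(z, xs[0]), xs[1:])

print(scan(lambda a, b: a + b, 0, [1, 2, 3, 4]))  # [0, 1, 3, 6, 10]
```

On streams, the same recursion works lazily: the recursive call simply is not forced until the next element arrives, so there is no need to "pause" explicitly.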

by revanistT3 at May 01, 2016 11:44 PM


Mathematically: How does increasing the number of assets reduce idiosyncratic risk?

As part of an Asset Pricing Module I'm currently taking, whilst looking at APT Ross (1974), we looked at how according to this model, risk originates from both systematic and idiosyncratic asset specific sources.

We first considered an N-asset portfolio with equal weights to show how increasing the number of assets N decreases the idiosyncratic residual variance (here $e_P$ denotes the residual error term of the portfolio P):

$Var(e_P) = \frac{1}{N}\bar{\sigma}^2_e$

It is clear to see that as N increases, the variance of $e_P$ decreases.

However, my question is about the case where the N assets are held, but not in equal proportions. We end up with the following expression for $Var(e_P)$:

$Var(e_P) = \sum_{i=1}^{N} w_i^2 \sigma_{e_i}^2 + \text{all covariance terms}$

From our assumptions at the outset, the idiosyncratic risk of asset i does not affect asset j, so all the covariance terms in the above equation equal 0.

My question, in the below equation where $w_i$ is the weight of asset $i$:

$Var(e_P) = \sum_{i=1}^{N} w_i^2 \sigma_{e_i}^2$

How can we see here that increasing N reduces idiosyncratic risk?
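To see the effect numerically, here is a small sketch (assuming, purely for illustration, unit idiosyncratic variance for every asset and random, non-equal weights): as long as no single weight stays large, the sum of squared weights shrinks as N grows.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma2 = 1.0  # illustration only: every asset gets unit idiosyncratic variance

vals = []
for n in (10, 100, 1000):
    w = rng.random(n)
    w /= w.sum()                      # weights sum to 1 but are not equal
    var_eP = np.sum(w ** 2) * sigma2  # Var(e_P) once the covariance terms vanish
    vals.append(var_eP)
    print(n, round(var_eP, 4))
```

Formally, since each $w_i \le \max_j w_j$ and the weights sum to 1, we get $\sum_i w_i^2 \le \max_j w_j$, so the idiosyncratic variance vanishes whenever the largest portfolio weight goes to zero as N grows; equal weighting is just the special case $\max_j w_j = 1/N$. It does not vanish if one weight stays bounded away from zero.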


by Curious Student at May 01, 2016 11:34 PM



Why do I get good accuracy with IRIS dataset with a single hidden node?

I have a minimal example of a neural network with a back-propagation trainer, testing it on the IRIS data set. I started off with 7 hidden nodes and it worked well.

I lowered the number of nodes in the hidden layer to 1 (expecting it to fail), but was surprised to see that the accuracy went up.

I set up the experiment in Azure ML, just to validate that it wasn't my code. Same thing there: 98.3333% accuracy with a single hidden node.

Can anyone explain to me what is happening here?

by jan at May 01, 2016 11:14 PM


Topological properties of classes

Is there any reasonable, natural way to think of closure properties of complexity classes, such as the closure of $PP$ or $PH$ being $PSPACE$?

Is $P/poly$ in the interior of the class $PP$, and so on?

I found this link.

Is there a reasonable way to concoct a topological view of the complexity landscape from this?

by Turbo at May 01, 2016 11:11 PM


Keras. ValueError: I/O operation on closed file

I use Jupyter notebook with Anaconda. This is my first time using Keras, and I can't get through the tutorial. There are two threads about this issue on Stack Overflow, but no solution was found.

My code:

model = Sequential()
model.add(Dense(1, input_dim=1, activation='softmax'))

X_train_shape = X_train.reshape(len(X_train), 1)
Y_train_shape = Y_train.reshape(len(Y_train), 1), Y_train, nb_epoch=5, batch_size=32)

And I get an error; it's somewhat random, and sometimes one or two epochs complete before it appears:

Epoch 1/5 4352/17500 [======>.......................]

--------------------------------------------------------------------------- ValueError Traceback (most recent call last) in () 2 # of 32 samples 3 #sleep(0.1) ----> 4, Y_train, nb_epoch=5, batch_size=32) 5 #sleep(0.1)

C:\Anaconda3\envs\py27\lib\site-packages\keras\models.pyc in fit(self, x, y, batch_size, nb_epoch, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, **kwargs) 395 shuffle=shuffle, 396 class_weight=class_weight, --> 397 sample_weight=sample_weight) 398 399 def evaluate(self, x, y, batch_size=32, verbose=1,

C:\Anaconda3\envs\py27\lib\site-packages\keras\engine\training.pyc in fit(self, x, y, batch_size, nb_epoch, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight) 1009 verbose=verbose, callbacks=callbacks, 1010
val_f=val_f, val_ins=val_ins, shuffle=shuffle, -> 1011 callback_metrics=callback_metrics) 1012 1013 def evaluate(self, x, y, batch_size=32, verbose=1, sample_weight=None):

C:\Anaconda3\envs\py27\lib\site-packages\keras\engine\training.pyc in _fit_loop(self, f, ins, out_labels, batch_size, nb_epoch, verbose, callbacks, val_f, val_ins, shuffle, callback_metrics) 753 batch_logs[l] = o 754 --> 755 callbacks.on_batch_end(batch_index, batch_logs) 756 757 epoch_logs = {}

C:\Anaconda3\envs\py27\lib\site-packages\keras\callbacks.pyc in on_batch_end(self, batch, logs) 58 t_before_callbacks = time.time() 59 for callback in self.callbacks: ---> 60 callback.on_batch_end(batch, logs) 61 self._delta_ts_batch_end.append(time.time() - t_before_callbacks) 62 delta_t_median = np.median(self._delta_ts_batch_end)

C:\Anaconda3\envs\py27\lib\site-packages\keras\callbacks.pyc in on_batch_end(self, batch, logs) 187 # will be handled by on_epoch_end 188 if self.verbose and self.seen < self.params['nb_sample']: --> 189 self.progbar.update(self.seen, self.log_values) 190 191 def on_epoch_end(self, epoch, logs={}):

C:\Anaconda3\envs\py27\lib\site-packages\keras\utils\generic_utils.pyc in update(self, current, values) 110 info += ((prev_total_width - self.total_width) * " ") 111 --> 112 sys.stdout.write(info) 113 sys.stdout.flush() 114

C:\Anaconda3\envs\py27\lib\site-packages\ipykernel\iostream.pyc in write(self, string) 315 316 is_child = (not self._is_master_process()) --> 317 self._buffer.write(string) 318 if is_child: 319 # newlines imply flush in subprocesses

ValueError: I/O operation on closed file

by Владислав Михайлов at May 01, 2016 11:10 PM

hubertf's NetBSD blog

Bootstrap pkgsrc under 'bash on Windows'

Much brouhaha was made about Windows running a Linux userland recently. Leaving aside the fact that emulating other operating systems is something that NetBSD has done for ages, there is one real challenge that every Linux user faces once he has set up his operating system: getting software installed easily. And of course there is only one truly portable answer to that question: use pkgsrc, of course!

The process is pretty much straightforward, and Ryo ONODERA has verified the required Windows versions and Linux packages, and has sent instructions on how to bootstrap pkgsrc on Windows 10. Now who will be the first to post a screenshot with the output of pkgsrc/misc/cowsay running "cowsay hello pkgsrc"? :-)

May 01, 2016 10:57 PM


Maximum set packing and minimum set cover duality

I read that the maximum set packing and minimum set cover problems are duals of each other when formulated as linear programming problems. By the strong duality theorem, the optimal solutions to the primal and dual LP problems should have the same value.

However, consider a universe $U = \{1, 2, 3, 4, 5\}$ and a collection of sets: $S = \{ \{1, 2, 3\}, \{3, 4, 5\}, \{1\}, \{2\}, \{3\}\}$. From what I understand, a minimum set cover consists of the first 2 sets in $S$, while the maximum set packing consists of the last 3 sets. These solutions are not in accordance with the statement of the strong duality theorem.

Given that, I don't understand how the 2 problems can be duals of each other. What am I missing?

Thank you very much.

by abc at May 01, 2016 10:46 PM


Converting maths equation to Java

In this paper, section 2.1, they provide an approach to derive new reasoned probabilities from a set of classifier results. I understand the concepts but am having difficulty completing the implementation in Java. Here is my approach so far:

public double[] sftmx(double[] ovaProbabilities) {
    double[] newProbs = new double[ovaProbabilities.length];

    double[] zArray = generateZArray(ovaProbabilities);
    double zSum = 0;
    for (int j = 0; j < zArray.length; j++) {
        zSum += zArray[j];
    }

    for (int i = 0; i < zArray.length; i++) {
        newProbs[i] = zArray[i] / zSum;
    }

    return newProbs;
}

private double[] generateZArray(double[] ovaPrbs) {
    double[] zArray = new double[ovaPrbs.length];
    for (int i = 0; i < ovaPrbs.length; i++) {
        double wk = 0;
        double wko = 0;

        // working out wk, wko and C values and equation minimization

        zArray[i] = Math.exp((wk * ovaPrbs[i]) + wko);
    }
    return zArray;
}
I primarily don't see why/how an equation needs minimising and what relevance it has. I hope the code is readable, I've been trying to get it working but this is as far as I've got!


by mino at May 01, 2016 10:40 PM


Decidability with an exponential number of solutions

I am trying to understand this: if a problem has an exponential number of candidate solutions, such as 2^2^n, is it decidable? To my understanding, as long as it's verifiable, no matter how many solutions there are, it's decidable.

Thanks for clarifying.

by Someguy at May 01, 2016 10:39 PM

why should I go for logistic regression?

I am a student working on a database management project with a bit of Python coding involved. The project is about review analysis. Basically, I am trying to read a review and determine how good or bad it is.

I have a file of good and bad words with scores, like: good = 2, better = 3, best = 4.

From this, I have written code for sentiment analysis of a review, which gives me the score of the review considering every single word.

Since the range varied from -infinity to +infinity, I had to scale it down to a range of 0 to 1. A graphical representation must be done on this data, i.e. the score of the review, whether from -infinity to +infinity or from 0 to 1.

I was advised by my guide to go for a logistic regression plot. Why logistic regression? Is there a better way?
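On the scaling point: the logistic (sigmoid) function is the standard way to squash a score ranging over the whole real line into (0, 1), which is one reason logistic regression is a natural suggestion here. A minimal sketch:

```python
import math

def logistic(score):
    """Map an unbounded sentiment score into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-score))

for s in (-5, 0, 5):
    # very negative -> near 0, zero -> 0.5, very positive -> near 1
    print(s, round(logistic(s), 3))
```

Logistic regression goes one step further: instead of fixing the word scores by hand, it fits the weights from labelled examples, so the squashed output can be read as the probability that the review is good.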

by Shreya B at May 01, 2016 10:25 PM


What does foldr do in Haskell?

So I came across the foldr function in Haskell which, from what I gather, you can use to calculate the product and sum of a list:

foldr f x xs

foldr (*) 1 [1..5] = 120
foldr (+) 0 [1..5] = 15

And changing the x argument adds to the overall sum or multiplies into the final product.

What does foldr actually do, and why would anyone use it instead of the built-in functions 'sum' or 'product', etc.?
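In Haskell, foldr f z [a, b, c] expands to f a (f b (f c z)): it replaces every (:) in the list with f and the empty list with z, so sum and product are just specializations. A Python sketch of the same expansion (for finite lists; Haskell's laziness also lets foldr work on infinite ones):

```python
def foldr(f, z, xs):
    # foldr f z [a, b, c]  ==  f(a, f(b, f(c, z)))
    if not xs:
        return z
    return f(xs[0], foldr(f, z, xs[1:]))

print(foldr(lambda a, b: a * b, 1, [1, 2, 3, 4, 5]))  # 120, like product
print(foldr(lambda a, b: a + b, 0, [1, 2, 3, 4, 5]))  # 15, like sum
```

The point of foldr is generality: sum and product are one-liners on top of it, but so are map, filter, and many other list functions.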

by HJGBAUM at May 01, 2016 10:24 PM


Consequences of bipartite perfect matching not in NL?

Are any significant consequences known of $\text{BPM} \not\in \textsf{NL}$?

I'm interested in the status of the following well-studied decision problem, in particular whether it is known to be in $\textsf{NL}$:

Bipartite Perfect Matching (BPM)
Input: bipartite $2n$-vertex graph $G$
Question: does $G$ contain a matching with $n$ edges?

Maximum Bipartite Matching is the version where the required size of a matching is given as part of the input. By Chandra-Stockmeyer-Vishkin, this is equivalent to BPM via $\textsf{AC}^0$-reductions, and these problems are hard for $\textsf{NL}$ via $\textsf{AC}^0$-reductions.

Recall the following sequence of inclusions (see the Zoo): $$\textsf{L} \subseteq \textsf{UL} \subseteq \textsf{NL} \subseteq \textsf{NC}^2 \subseteq \textsf{RNC}^2.$$

Looking at a more general problem, maximum matching for general graphs is in $\textsf{RNC}^2$, by Mulmuley-Vazirani-Vazirani (preprint), and according to Allender-Reinhardt-Zhou this problem was (in 1999) not known to be in NL. The latter paper also shows that BPM is in a nonuniform version of $\textsf{SPL}$, but $\textsf{SPL}$ is not known to be comparable to $\textsf{NL}$. (The Zoo also claims that BPM is in $\textsf{coRNC}$ by Karloff, although it is not obvious to me how this follows from the results in Karloff's paper.)

All of the usual approaches to finding a maximum bipartite matching are polynomial-time. Moreover, these algorithms have polynomial time bounds of quite low degree. In particular, an $O(n^{2.5})$ time upper bound follows from the reduction of Hopcroft-Karp and Karzanov to maximum flow. So it is not impossible that BPM (and perhaps even maximum matching) is in $\textsf{NL}$.
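For concreteness, a sketch of the standard augmenting-path approach (Kuhn's simpler O(V·E) variant rather than Hopcroft-Karp, for brevity); the per-search `seen` array over the vertices is exactly the kind of bookkeeping that makes these algorithms look space-intensive:

```python
def max_bipartite_matching(adj, n_right):
    """Kuhn's augmenting-path algorithm; adj maps left vertices to right neighbours."""
    match_r = [-1] * n_right  # match_r[v] = left partner of right vertex v

    def try_augment(u, seen):
        for v in adj[u]:
            if not seen[v]:
                seen[v] = True
                # v is free, or its current partner can be re-routed elsewhere
                if match_r[v] == -1 or try_augment(match_r[v], seen):
                    match_r[v] = u
                    return True
        return False

    return sum(try_augment(u, [False] * n_right) for u in adj)

adj = {0: [0, 1], 1: [0, 2], 2: [1]}   # a 3+3-vertex bipartite graph
print(max_bipartite_matching(adj, 3))  # 3: a perfect matching exists
```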

However, all the approaches I've looked at seem to be rather space-intensive. The usual algorithms essentially keep track of subsets of the vertices, and therefore seem to require something like $\Omega(n)$ bits (although proving such a bound rigorously seems likely to be difficult).

Looking at a more specific problem, if the input graphs are further restricted to be planar as well as bipartite, then Planar BPM is in $\textsf{UL}$ and hence $\textsf{NL}$, by Datta-Gopalan-Kulkarni-Tewari.

So BPM is $\textsf{NL}$-hard, but is in some complexity classes "not much larger than" $\textsf{NL}$, and is in $\textsf{NL}$ for restricted classes of bipartite graphs. It therefore seems reasonable to ask: if $\text{BPM} \not\in\textsf{NL}$, would anything interesting follow?

by András Salamon at May 01, 2016 10:16 PM


newff and train functions of python's neurolab is giving inconsistent results for same code and input

While the input is the same and the code is the same, I get two different results when run multiple times. There are only two unique outputs, though. I do not know what part of the code is randomized, and I'm having a hard time figuring out where the error is. Is this a known bug in neurolab, by any chance?

I've attached the complete code below. Please run it some 9-10 times to see the two different outputs. I have also attached the output from five runs of the same code; the error output takes two different values across the five runs. Please help.

Code: --------

import neurolab as nl
import numpy as np

# Create train samples
N = 200
x1 = [0]*(N+1)
for ii in range(-N/2, N/2+1, 1):
    x1[ii+N/2] = ii
x1_arr = np.array(x1)
y1 = -2 + 3*x1_arr
y = [0]*len(y1)
for ii in range(len(y1)):
    if(y1[ii] > 15):
        y[ii] = 1
l = len(y)
x0 = [1]*l
x0_arr = np.array(x0)
x_arr = np.concatenate(([x0_arr], [x1_arr]), axis=0)
x = x1_arr
y_arr = np.array(y)
size = l
inp = x.reshape(size, 1)
tar = y_arr.reshape(size, 1)

# Create network with 2 layers and random initialization
net =[[-N/2, N/2]], [1, 1])
net.trainf = nl.train.train_gd

# Train network
error = net.train(inp, tar, epochs=100, show=100, goal=0.02, lr=0.001)

# Simulate network
out = net.sim(inp)

Output ---------

========= RESTART: D:/Python_scripts/ML/nn_neurolab/ =========
Epoch: 100; Error: 2.49617137968;
The maximum number of train epochs is reached
========= RESTART: D:/Python_scripts/ML/nn_neurolab/ =========
Epoch: 100; Error: 2.49617137968;
The maximum number of train epochs is reached
========= RESTART: D:/Python_scripts/ML/nn_neurolab/ =========
Epoch: 100; Error: 2.66289633422;
The maximum number of train epochs is reached
========= RESTART: D:/Python_scripts/ML/nn_neurolab/ =========
Epoch: 100; Error: 2.49617137968;
The maximum number of train epochs is reached
========= RESTART: D:/Python_scripts/ML/nn_neurolab/ =========
Epoch: 100; Error: 2.66289633422;
The maximum number of train epochs is reached

Thanks and Cheers!
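A plausible culprit (an assumption worth verifying against neurolab's source) is that draws the initial weights from numpy's global random generator, so each run starts from one of a few random initializations and gradient descent converges to a correspondingly different error. The effect is easy to reproduce in isolation; seeding the RNG before building the network should make runs repeatable:

```python
import numpy as np

def random_init():
    # stand-in for a library that draws initial weights from numpy's global RNG
    return np.random.uniform(-0.5, 0.5, size=4)

np.random.seed(42)
first = random_init()
np.random.seed(42)
second = random_init()
print(np.allclose(first, second))  # True: same seed, identical "initial weights"
```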

by Baalzamon at May 01, 2016 10:12 PM

Why can't I get a correct answer on HackerRank? Python 3 [on hold]

I am creating a program to solve the problem proposed in Test Case #1. But I always get a wrong answer on Test Case #2, because I never know what Test Case #2 is!

Why am I getting it wrong in the second case?

Here is the problem proposed and my answer:

Task Complete the code in the editor below. The variables i, d, and s are already declared and initialized for you. You must declare 3 variables: one of type int, one of type double, and one of type String. Then you must read 3 lines of input from stdin and initialize your 3 variables. Finally, you must use the + operator to perform the following operations:

Print the sum of i plus your int variable on a new line. Print the sum of d plus your double variable to a scale of one decimal place on a new line. Concatenate s with the string you read as input and print the result on a new line.

Test Case #1 - Output expected:



HackerRank is the best place to learn and practice coding!

Test Case #2 - Output expected:



is my favorite plataform!

Here's my code:

def test_case_1():
        ii = int()
        dd = float()
        ss = str()
        i = 4
        d = 4.0
        s = 'HackerRank '
        ii = 12
        dd = 4.0
        ss = "is the best place to learn and practice coding!"
        total_int = i + ii
        print (total_int)
        total_double = d + dd
        print (s + ss)
        memory = open('memory.txt', 'w')
def test_case_2():
        ii = int()
        dd = float()
        ss = str()
        i = 4
        d = 4.0
        s = 'HackerRank '
        ii = 3
        dd = 2.8
        ss = "is my favorite plataform!"
        total_int = i + ii
        print (total_int)
        total_double = d + dd
        print (s + ss)
        memory = open('memory.txt', 'w')
def starting():
    memory = open('memory.txt', 'r')
    ted = []
    for line in memory:

    return ted

def test(ted, main):
    if ted == ['0'] :

    elif ted == ['1'] :

def main():
        memory = open('memory.txt', 'r')

            memory = open('memory.txt', 'w')

    pronto = starting()
    test(pronto, main)


by Misael Viríssimo de Moura at May 01, 2016 09:59 PM

Planet Theory

Some more bits from the Gathering for Gardner

I posted about the Gathering for Gardner conference and about some of the talks I saw here. Today I continue with a few more talks.

Playing Penney's game with Roulette by Robert Vallen. Penney's game is the following: let k be fixed. Alice and Bob pick different elements of {H,T}^k. They flip a coin until one of their sequences shows up, and that person wins. Which sequences have the best probability of winning?
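The well-known punchline is that the game is non-transitive: for any sequence your opponent picks, there is another that beats it (against b1b2b3, the sequence (not b2)b1b2 wins). A quick simulation sketch:

```python
import random

def penney_round(a, b, rng):
    # flip until one player's pattern appears as a suffix of the flip history
    hist = ""
    while True:
        hist += rng.choice("HT")
        if hist.endswith(a):
            return "A"
        if hist.endswith(b):
            return "B"

rng = random.Random(0)
trials = 10000
wins = sum(penney_round("THH", "HHT", rng) == "A" for _ in range(trials))
print(wins / trials)  # close to 3/4: THH beats HHT about three times out of four
```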

New Polyhedral dice by Robert Fathauer, Henry Segerman, Robert Bosch. This is a good example of how my mentality (and possibly yours) differs from others. When I hear ``60-sided dice'' I think ``p1,...,p60, which are all between 0 and 1 and add up to 1.'' I also thought that only the Platonic solids could be used to form fair dice (so only 4-sided, 6-sided, 8-sided, 12-sided, and 20-sided dice can be made). NOT so. These authors actually MAKE real dice, and they do not have to be Platonic solids. Here is their website.

Numerically balanced dice by Robert Bosch (paper is here). Why do opposite sides of dice sum to the same number? Read the paper to find out!

Secret messages in juggling and card shuffling by Erik Demaine. Erik Demaine was one of about 4 theoretical computer scientists I met at the conference, though Erik is so well rounded that calling him a theoretical computer scientist doesn't seem quite right. I had never met him before which surprised me. In this talk he showed us some new fonts- one using juggling. See here for an example of juggling fonts, co-authored with his father Martin.

Fibonacci Lemonade by Andrea Johanna Hawksley. Put in the lemon and sugar in Fibonacci-number increments. Here is their website. In my first post I said the talks were on a variety of topics and then presented mostly math talks. This talk is an example of that variety. There were other talks involving the Fibonacci numbers. I was surprised by this since they aren't that special (see here).

Penis Covers and Puzzles: Brain Injuries and Brain Health by Gini Wingard-Phillips. She recounted having various brain injuries and how working on mathematical puzzles of the type Martin Gardner popularized HELPED HER RECOVER! As for the title: people with brain injuries sometimes have a hard time finding the words for things, so they use other words. In this case she wanted her husband to buy some condoms but couldn't think of the word, so she said Penis Covers instead.

Loop- Pool on an Ellipse by Alex Bellos. Similar in my mind to the Polyhedral dice talk (you'll see why). We all know that if you built an elliptical pool table with a hole at one focus, then a ball placed at the other focus and hit hard enough WILL go into the hole. But Alex Bellos actually MAKES these pool tables (see here if you want to buy one for $20,000). He told us the history: someone else tried to make one in 1962 but nobody bought them (I wonder if anyone is going to buy his), and Alex had problems with friction, as you may recall that it only works on a frictionless surface. So his game does require some skill. The similarity to dice is that I (and you?) are used to thinking about dice and ellipses abstractly, not as objects people actually build.

This post is getting long so I'll stop here and report more in a later post. Why so many posts? Six-minute talks that I can actually understand and am delighted to tell you about!

by GASARCH at May 01, 2016 09:56 PM


Calculation of Bond Carry from Synthetic future prices

I have only government bond yields with different maturities. How can I obtain synthetic future prices on bonds? After obtaining the future prices, I am supposed to compute the return and carry returns.

by user20280 at May 01, 2016 09:37 PM

On the reflection of a stochastic integral

Let ${(I_t)}_{t\geq 0}$ be a stochastic integral defined by $$ I_t=\int_{0}^{t}\theta_s\,dW_s, $$ where $W$ is a standard Brownian motion defined on $(\Omega,\mathcal{F},{(\mathcal{F}_t)}_{t\geq 0},\mathbb{P})$ and $\theta$ a stochastic process adapted to $\mathcal{F}_t$ satisfying the following integrability condition $$ E\left(\int_{0}^{t}\theta_s^2 ds\right)<\infty\;\;\ \forall t> 0. $$

We define the first passage time at $a$ for Brownian motion $W$ by the following random variable $$ \tau_a = \inf\{t\geq 0,W_t\geq a\}, $$ where $a>0$.

It is possible to show that $\tau_a$ is a stopping time. Moreover, By virtue of the reflection principle, we know that the following process

\begin{equation*} Z_t = \begin{cases} W_t \qquad & if \qquad 0 \leq t \leq \tau_a \\ 2a-W_t \qquad & if \qquad t > \tau_a \end{cases} \end{equation*}

also follows a standard Brownian motion under $\mathbb{P}$.

My question is as follows : Is it possible to rewrite the process $I$ in relation to the process $Z$?

I would like your opinion on this issue, thank you in advance.

by Mohamed Amine Kacef at May 01, 2016 09:32 PM


What are the Correct Conditions for Akra-Bazzi Master Theorem? (Cross-Post) [on hold]

NOTE: this question is cross-posted

The Akra-Bazzi method solves recurrences of the form:

$$T(n) = g(n) + \sum\limits_{i=1}^k a_iT(b_in + h_i(n))$$

In the Wikipedia article about the topic, it says that the condition on $g(n)$ is:

$$g(n) \in O(x^c)$$

for some constant $c$, and the $O$ is Big Oh notation.

However, elsewhere it says that the condition on $g(.)$ is "polynomial growth" (for example here):

We say that $g(n)$ satisfies the polynomial-growth condition if there exist positive constants $c_1,c_2$ such that for all $n \ge 1$, for all $1 \le i \le k$, and for all $u \in [b_in, n],$ $$c_1g(n) \le g(u) \le c_2g(n).$$

My question is this:

Are these two conditions equivalent? Does one imply the other? Or is Wikipedia's statement of the theorem simply incorrect?

EDIT: What confuses me about how the two conditions relate to each other, is that the Big Oh condition obviously compares the growth of the function to another function which is a polynomial, while the "polynomial growth condition" compares the growth of the function to itself and not to any polynomial.

So why is the "polynomial growth condition" even called that? If it is equivalent to the Big Oh condition, then why don't all references state it like that? Isn't the first condition much easier to understand?

I checked the Wikipedia references and googled Akra-Bazzi, but all of the other papers I found gave the "polynomial growth condition", so I have no idea where the condition on Wikipedia came from.

The Wikipedia condition worked correctly for my homework (using $g(n)=n^2 \log n$), so I feel like it is probably sufficient but not necessary for the "polynomial growth condition", but I'm not sure because I can't find any sources corroborating that and I keep screwing up the argument when I attempt it myself.
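For what it's worth, the polynomial-growth condition can be checked directly for the driver $g(n) = n^2 \log n$ used in the homework (a sketch; constants are not optimized). For $u \in [bn, n]$ with $0 < b < 1$:

$$g(u) = u^2 \log u \;\le\; n^2 \log n \;=\; g(n),$$

$$g(u) \;\ge\; (bn)^2 \log(bn) \;=\; b^2 n^2 (\log n + \log b) \;\ge\; \frac{b^2}{2}\, n^2 \log n \;=\; \frac{b^2}{2}\, g(n),$$

where the last inequality holds once $n \ge b^{-2}$ (so that $\log n \ge 2\log(1/b)$). Taking $b = \min_i b_i$, the constants $c_1 = b^2/2$ and $c_2 = 1$ witness the condition for all sufficiently large $n$.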

by William at May 01, 2016 09:27 PM

Complexity of Hamiltonian path and clique problem

I came across this question: if we want to check whether a graph contains both a Hamiltonian path and a clique, would this problem be NPC?

I know that a clique contains a Hamiltonian path and that both problems are NPC, but I am uncertain whether something would be different if we check both at the same time.

by Someguy at May 01, 2016 09:24 PM

Planet Theory

“Largely just men doing sums”: My review of the excellent Ramanujan film

[Warning: This movie review contains spoilers, as well as a continued fraction expansion.]

These days, it takes an extraordinary occasion for me and Dana to arrange the complicated, rocket-launch-like babysitting logistics involved in going out for a night at the movies.  One such occasion was an opening-weekend screening of The Man Who Knew Infinity—the new movie about Srinivasa Ramanujan and his relationship with G. H. Hardy—followed by a Q&A with Matthew Brown (who wrote and directed the film), Robert Kanigel (who wrote the biography on which the film was based), and Fields Medalist Manjul Bhargava (who consulted on the film).

I read Kanigel’s The Man Who Knew Infinity in the early nineties; it was a major influence on my life.  There were equations in that book to stop a nerdy 13-year-old’s pulse, like

$$1+9\left( \frac{1}{4}\right) ^{4}+17\left( \frac{1\cdot5}{4\cdot8}\right)
^{4}+25\left( \frac{1\cdot5\cdot9}{4\cdot8\cdot12}\right) ^{4}+\cdots
=\frac{2^{3/2}}{\pi^{1/2}\Gamma\left( 3/4\right) ^{2}}$$

$$\cfrac{1}{1+\cfrac{e^{-2\pi}}{1+\cfrac{e^{-4\pi}}{1+\cfrac{e^{-6\pi}}{1+\cdots}}}}=\left( \sqrt{\frac{5+\sqrt{5}}{2}}-\frac{\sqrt{5}+1}{2}\right) e^{2\pi/5}$$

A thousand pages of exposition about Ramanujan’s mysterious self-taught mathematical style, the effect his work had on Hardy and Littlewood, his impact on the later development of analysis, etc., could never replace the experience of just staring at these things!  Popularizers are constantly trying to “explain” mathematical beauty by comparing it to art, music, or poetry, but I can best understand art, music, and poetry if I assume other people experience them like the above identities.  Across all the years and cultures and continents, can’t you feel Ramanujan himself leaping off your screen, still trying to make you see this bizarre aspect of the architecture of reality that the goddess Namagiri showed him in a dream?

Reading Kanigel’s book, I was also entranced by the culture of early-twentieth-century Cambridge mathematics: the Tripos, Wranglers, High Table.  I asked, why was I here and not there?  And even though I was (and remain) at most $1729^{-1729}$ of a Ramanujan, I could strongly identify with his story, because I knew that I, too, was about to embark on the journey from total scientific nobody to someone whom the experts might at least take seriously enough to try to prove wrong.

Anyway, a couple years after reading Kanigel’s biography, I went to the wonderful Canada/USA MathCamp, and there met Richard K. Guy, who’d actually known Hardy.  I couldn’t have been more impressed had Guy visited Platonic heaven and met π and e there.  To put it mildly, no one in my high school had known G. H. Hardy.

I often fantasized—this was the nineties—about writing the screenplay myself for a Ramanujan movie, so that millions of moviegoers could experience the story as I did.  Incidentally, I also fantasized about writing screenplays for Alan Turing and John Nash movies.  I do have a few mathematical biopic ideas that haven’t yet been taken, and for which any potential buyers should get in touch with me:

  • Radical: The Story of Évariste Galois
  • Give Me a Place to Stand: Archimedes’ Final Days
  • Mathématicienne: Sophie Germain In Her Prime
  • The Prime Power of Ludwig Sylow
    (OK, this last one would be more of a limited-market release)

But enough digressions; how was the Ramanujan movie?

Just as Ramanujan himself wasn’t an infallible oracle (many of his claims, e.g. his formula for the prime counting function, turned out to be wrong), so The Man Who Knew Infinity isn’t a perfect movie.  Even so, there’s no question that this is one of the best and truest movies ever made about mathematics and mathematicians, if not the best and truest.  If you’re the kind of person who reads this blog, go see it now.  Don’t wait!  As they stressed at the Q&A, the number of tickets sold in the first couple weeks is what determines whether or not the movie will see a wider release.

More than A Beautiful Mind or Good Will Hunting or The Imitation Game, or the play Proof, or the TV series NUMB3RS, the Ramanujan movie seems to me to respect math as a thing-in-itself, rather than just a tool or symbol for something else that interests the director much more.  The background to the opening credits—and what better choice could there be?—is just page after page from Ramanujan’s notebooks.  Later in the film, there’s a correct explanation of what the partition function P(n) is, and of one of Ramanujan’s and Hardy’s central achievements, which was to give an asymptotic formula for P(n), namely $$ P(n) \approx \frac{e^{π \sqrt{2n/3}}}{4\sqrt{3}n}, $$ and to prove the formula’s correctness.
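The asymptotic formula quoted above is easy to sanity-check numerically; a small sketch (function names are mine) comparing exact partition numbers, computed by a standard dynamic program, against the leading term:

```python
from math import exp, pi, sqrt

def partitions(n):
    """Exact partition numbers p(0..n): count partitions of each total
    using parts 1, 2, ..., n (coin-change style dynamic program)."""
    p = [1] + [0] * n
    for part in range(1, n + 1):
        for total in range(part, n + 1):
            p[total] += p[total - part]
    return p

def hardy_ramanujan(n):
    """Leading-order Hardy-Ramanujan asymptotic for p(n)."""
    return exp(pi * sqrt(2 * n / 3)) / (4 * sqrt(3) * n)

exact = partitions(100)[100]
print(exact)                          # 190569292
print(hardy_ramanujan(100) / exact)   # ratio approaches 1, slowly
```

At n = 100 the leading term already lands within about 5% of the true value.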

The film also makes crystal-clear that pure mathematicians do what they do not because of applications to physics or anything else, but simply because they feel compelled to: for the devout Ramanujan, math was literally about writing down “the thoughts of God,” while for the atheist Hardy, math was a religion-substitute.  Notably, the movie explores the tension between Ramanujan’s untrained intuition and Hardy’s demands for rigor in a way that does them both justice, resisting the Hollywood urge to make intuition 100% victorious and rigor just a stodgy punching bag to be defeated.

For my taste, the movie could’ve gone even further in the direction of “letting the math speak”: for example, it could’ve explained just one of Ramanujan’s infinite series.  Audiences might even have liked some more T&A (theorems and asymptotic bounds).  During the Q&A that I attended, I was impressed to see moviegoers repeatedly pressing a somewhat-coy Manjul Bhargava to explain Ramanujan’s actual mathematics (e.g., what exactly were the discoveries in his first letter to Hardy?  what was in Ramanujan’s Lost Notebook that turned out to be so important?).  Then again, this was Cambridge, MA, so the possibility should at least be entertained that what I witnessed was unrepresentative of American ticket-buyers.

From what I’ve read, the movie is also true to South Indian dress, music, religion, and culture.  Yes, the Indian characters speak to each other in English rather than Tamil, but Brown explained that as a necessary compromise (not only for the audience’s sake, but also because Dev Patel and the other Indian actors didn’t speak Tamil).

Some reviews have mentioned issues with casting and characterization.  For example, Hardy is portrayed by Jeremy Irons, who’s superb but also decades older than Hardy was at the time he knew Ramanujan.  Meanwhile Ramanujan’s wife, Janaki, is played by a fully-grown Devika Bhise; the real Janaki was nine (!) when she married Ramanujan, and fourteen when Ramanujan left for England.  J. E. Littlewood is played as almost a comic-relief buffoon, so much so that it feels incongruous when, near the end of the film, Irons-as-Hardy utters the following real-life line:

I still say to myself when I am depressed and find myself forced to listen to pompous and tiresome people, “Well, I have done one thing you could never have done, and that is to have collaborated with Littlewood and Ramanujan on something like equal terms.”

Finally, a young, mustachioed Bertrand Russell is a recurring character.  Russell and Hardy really were friends and fellow WWI pacifists, but Hardy seeking out Bertie’s advice about each Ramanujan-related development seems like almost certainly just an irresistible plot device.

But none of that matters.  What bothered me more were the dramatizations of the prejudice Ramanujan endured in England.  Ramanujan is shown getting knocked to the ground, punched, and kicked by British soldiers barking anti-Indian slurs at him; he then shows up for his next meeting with Hardy covered in bruises, which Hardy (being aloof) neglects to ask about.  Ramanujan is also depicted getting shoved, screamed at, and told never to return by a math professor who he humiliates during a lecture.  I understand why Brown made these cinematic choices: there’s no question that Ramanujan experienced prejudice and snobbery in Cambridge, and that he often felt lonely and unwelcome there.  And it’s surely easier to show Ramanujan literally getting beaten up by racist bigots, than to depict his alienation from Cambridge society as the subtler matter that it most likely was.  To me, though, that’s precisely why the latter choice would’ve been even more impressive, had the film managed to pull it off.

Similarly, during World War I, the film shows not only Trinity College converted into a military hospital, and many promising students marched off to their deaths (all true), but also a shell exploding on campus near Ramanujan, after which Ramanujan gazes in horror at the bleeding dead bodies.  Like, isn’t the truth here dramatic enough?

One other thing: the movie leaves you with the impression that Ramanujan died of tuberculosis.  More recent analysis concluded that it was probably hepatic amoebiasis that he brought with him from India—something that could’ve been cured with the medicine of the time, had anyone correctly diagnosed it.  (Incidentally, the film completely omits Ramanujan’s final year, back in India, when he suffered a relapse of his illness and slowly withered away, yet with Janaki by his side, continued to do world-class research and exchanged letters with Hardy until the very last days.  Everyone I read commented that this was “the right dramatic choice,” but … I dunno, I would’ve shown it!)

But enough!  I fear that to harp on these defects is to hold the film to impossibly-high, Platonic standards, rather than standards that engage with the reality of Hollywood.  An anecdote that Brown related at the end of the Q&A session brought this point home for me.  Apparently, Brown struggled for an entire decade to attract funding for a film about a turn-of-the-century South Indian mathematician visiting Trinity College, Cambridge, whose work had no commercial or military value whatsoever.  At one point, Brown was actually told that he could get the movie funded, if he’d agree to make Ramanujan fall in love with a white nurse, so that a British starlet who would sell tickets could be cast as his love interest.  One can only imagine what a battle it must have been to get a correct explanation of the partition function onto the screen.

In the end, though, nothing made me appreciate The Man Who Knew Infinity more than reading negative reviews of it, like this one by Olly Richards:

Watching someone balancing algorithms or messing about with multivariate polynomials just isn’t conducive to urgently shovelling popcorn into your face.  Difficult to dislike, given its unwavering affection for its subject, The Man Who Knew Infinity is nevertheless hamstrung by the dryness of its subject … Sturdy performances and lovely scenery abound, but it’s still largely just men doing sums; important sums as it turns out, but that isn’t conveyed to the audience until the coda [which mentions black holes] tells us of the major scientific advances they aided.

On behalf of mathematics, on behalf of my childhood self, I’m grateful that Brown fought this fight, and that he won as much as he did.  Whether you walk, run, board a steamship, or take taxi #1729, go see this film.

Addendum: See also this review by Peter Woit, and this in Notices of the AMS by Ramanujan expert George Andrews.

by Scott at May 01, 2016 09:22 PM


Implication of polynomial solution to subset sum problem

Hi, I am trying to get the concept right. If subset sum, an NPC problem, yields a polynomial-time solution, does that mean P=NP?

Thanks for clarity.

by Someguy at May 01, 2016 09:01 PM



How to write the turing machine processing operations?

I have this Turing machine example given in my book:

[image]

For the language $0^n1^n$. I understand how it works because it's very similar to a Finite State Machine.

But what I want to know is the following:

[image]

What the $\vdash$ symbol is for, and what the states $q_0$, $q_1$ are doing in the middle of everything. Unfortunately my textbook doesn't explain any of it.

by user12132 at May 01, 2016 08:39 PM

Planet Emacsen

Pragmatic Emacs: Using the zenburn theme

A few people have asked about the theme I use in Emacs, as seen in the screenshots and code snippets I use. It is zenburn, and I find it very easy on the eye, but it also has good contrast and a nice colour palette for syntax highlighting.

Install and activate the theme by adding the following to your emacs config file

(use-package zenburn-theme
  :ensure t
  :config
  (load-theme 'zenburn t))

by Ben Maughan at May 01, 2016 08:17 PM


Prove Undecidability Without Using Rice's Theorem

Show that checking if a TM accepts some input string of length greater than some constant $k$ is undecidable. Here the constant $k$ is publicly known.

I tried solving this problem by trying to reduce the "Acceptance Problem" i.e. the problem where we have to check whether a TM $M$ accepts a string $w$ or not.

One idea I had was that we can modify the input $M$ to $M'$ such that $M'$ simply increases the length of the string to become greater than $k$, but that is wrong, since by input we strictly mean what was provided, and anything that the TM does cannot be considered part of the input.

How can I solve this?

by Banach Tarski at May 01, 2016 08:06 PM



What does it mean for an option strategy to be leveraged

Probably a newbie question, but what do traders mean when they say that an option strategy is leveraged? And when can we say that this is the case?
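One common quantitative reading (a sketch of one convention, not the only usage of the word): an option position is "leveraged" because its percentage moves are a multiple of the underlying's percentage moves, measured by the elasticity $\lambda = \Delta \cdot S / V$. The numbers below are hypothetical:

```python
def option_leverage(spot, option_price, delta):
    """Elasticity (lambda / omega): approximate percentage change in the
    option's value per 1% change in the underlying, i.e. delta * S / V."""
    return delta * spot / option_price

# A call worth 2.50 on a 100 stock with delta 0.5 moves ~20x the stock in % terms.
print(option_leverage(100.0, 2.50, 0.5))  # 20.0
```

On this reading, a strategy is leveraged whenever its elasticity exceeds 1, i.e. its dollar delta exposure exceeds the capital committed.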

by BS. at May 01, 2016 07:44 PM

reference for elementary mortgage math

I have a student doing a project on default rates and prepayment rates for mortgages. She would like to include a section on how these quantities affect pricing, and so would like to reference a formula that gives the value of a mortgage in terms of them, say for constant values of them. I think it used to be possible to find such formulas in one or both of Fabozzi's books (the fixed income handbook and the mortgage handbook), but the editions she found in the library did not in fact have them. So the QUESTION is: does someone know a readily available source that has a formula for the value of a mortgage in terms of these quantities, or basically the same thing (PSA, SMM, etc.)?
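While the question asks for a citable reference, the cash-flow construction behind such formulas can be sketched directly. A hedged toy version (my own function and conventions: monthly decimal rates, constant SMM, zero defaults, payment re-amortized from the surviving balance each month):

```python
def mortgage_value(balance, note_rate, n_months, smm, disc_rate):
    """PV of a level-payment mortgage under a constant single monthly
    mortality (SMM) prepayment rate and no defaults."""
    pv, B = 0.0, balance
    c, y = note_rate, disc_rate
    for t in range(1, n_months + 1):
        rem = n_months - t + 1
        pmt = B * c / (1 - (1 + c) ** -rem)   # level payment on surviving balance
        interest = c * B
        sched_prin = pmt - interest
        prepay = smm * (B - sched_prin)       # prepayment on post-scheduled balance
        pv += (interest + sched_prin + prepay) / (1 + y) ** t
        B -= sched_prin + prepay
    return pv

# Discounting at the note rate returns par regardless of prepayment speed.
print(round(mortgage_value(100000, 0.005, 360, 0.02, 0.005), 2))
```

The par-at-note-rate property is a handy correctness check; a constant monthly default rate with a recovery assumption extends the loop in the obvious way.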

by michael hogan at May 01, 2016 07:37 PM


What is wrong with this reasoning that finding the genus of a degree 3 bipartite graph is NP-complete?

Finding the genus of a bipartite graph is $NP$-complete and finding the genus of a degree-$3$ graph is $NP$-complete, and so finding the genus of a degree-$3$ bipartite graph is $NP$-complete.

Though the implication could be right, is there any harm in reasoning this way about $NP$-completeness?

by Turbo at May 01, 2016 07:36 PM


Tensorflow conv2d_transpose size error "Number of rows of out_backprop doesn't match computed"

I am creating a convolution autoencoder in tensorflow. I got this exact error:

tensorflow.python.framework.errors.InvalidArgumentError: Conv2DBackpropInput: Number of rows of out_backprop doesn't match computed: actual = 8, computed = 12 [[Node: conv2d_transpose = Conv2DBackpropInput[T=DT_FLOAT, data_format="NHWC", padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/cpu:0"](conv2d_transpose/output_shape, Variable_1/read, MaxPool_1)]]

Relevant code:

l1d = tf.nn.relu(tf.nn.conv2d_transpose(l1da, w2, [10, 12, 12, 32], strides=[1, 1, 1, 1], padding='SAME'))


w2 = tf.Variable(tf.random_normal([5, 5, 32, 64], stddev=0.01))

I checked the shape of the input to conv2d_transpose, i.e. l1da, and it is correct (10x8x8x64). The batch size is 10, the input to this layer is of shape 8x8x64, and the output is supposed to be 12x12x32.

What am I missing?

by goluhaque at May 01, 2016 07:35 PM


Problem with external keyboard and numluck [on hold]

I’ve got some problems with my laptop numeric keys. In fact, I bought an external keyboard for my laptop because it didn’t have separate num lock keys. The num lock keys are embedded in the keyboard; for example, j is 3. The problem is that when I turn num lock on, the external keyboard works well but I can’t use my laptop keyboard completely because some keys type numbers, and when I turn num lock off, the external keyboard doesn’t work. Thanks in advance. Sorry for the bad English.

by AmiPourmand at May 01, 2016 07:13 PM


Differences between editions of Security Analysis by Graham and Dodd?

Where can I find a comparison of the contents, a list of everything that changed or the differences among the different editions of the book Security Analysis by Benjamin Graham & David Dodd?

There are six editions of the book: 1934, 1940, 1951, 1962, 1988, and 2008.

Do I have to read all of them and compare myself or did someone already do that?

Edit: I already read The Intelligent Investor and the sixth edition of Security Analysis. This question is about the differences between the different editions. The sixth edition, if I understand it correctly, is based on the 1940 edition.

by tomsv at May 01, 2016 07:07 PM


A Question of Two Avahi

I have two boxen with avahi 0.6.31 installed: one runs Ubuntu 14.04, and I am in the process of setting up the other with FreeBSD 10.3. I also have a Mac running OS X 10.4.

The Ubuntu and FreeBSD machines can resolve each other's addresses, as can the Ubuntu and Mac. The Mac and the FreeBSD box, however, cannot resolve each other. Can anyone tell me what I should look at to remedy this problem?

by Darwin von Corax at May 01, 2016 07:04 PM


Machine learning algorithm for predicting binary decisions on a large, underrepresented dataset

I would like to create a classifier which works on a relatively large (about 30k samples) dataset with circa 20 attributes and a binary decision, but one which contains a relatively small number of samples with, say, a "Yes" decision. That is, the data for building a classifier seems underrepresented.

My question is, are there any algorithms which work well with these kinds of datasets? So far I have tried C4.5 (J48 actually), some basic SVM algorithms, Naive Bayes and MLP, however each method failed to learn the dependencies in the data well (accuracy was at the level of about 90%, and this underrepresented decision was... unrepresented by the classifier too). I'm using Weka, if this makes any difference.
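One remedy that applies to most of the algorithms listed is cost-sensitive reweighting: Weka ships a CostSensitiveClassifier meta-learner, and the usual "balanced" weighting heuristic is a one-liner (a sketch; function name is mine):

```python
from collections import Counter

def balanced_class_weights(labels):
    """'Balanced' heuristic: weight each class inversely to its
    frequency, w_c = n_samples / (n_classes * count_c)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * nc) for c, nc in counts.items()}

# Roughly the shape described: 30k samples with a rare "yes" class.
labels = ["no"] * 27000 + ["yes"] * 3000
print(balanced_class_weights(labels))
```

Passing such weights (or the equivalent cost matrix) makes the rare "Yes" class expensive to misclassify; resampling (e.g. SMOTE, also available as a Weka package) is the other standard route.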

by Jules at May 01, 2016 06:55 PM

Christian Neukirchen


by Christian Neukirchen at May 01, 2016 06:31 PM


Scikit-Learn manually specifying .max_features in RFECV()-how many features get ranked?

I have followed this Scikit-Learn example in Python to obtain .feature_importances_ from a forest estimator. In that example, ExtraTreesClassifier() was used with its default hyperparameter settings - this would mean max_features='auto'. The output of this example is a plot of importances for 10 features.

Question 1:

When I re-run this example with max_features=2, the plot still shows feature importances for all 10 features. Should it only show the importances for 2 features?

Question 2:

Now, I would like to use ExtraTreesClassifier(max_features=2) with RFECV(). From the RFECV() docs, it indicates RFECV() assigns the best features a rank of 1 - we can see this in the .ranking_ attribute of RFECV(). However, if I specify the estimator to be ExtraTreesClassifier(max_features=2), then does RFECV() use 2 features in its estimator and only return ranks for 2 features? Or does it ignore max_features and return ranks for all the features?

by W R at May 01, 2016 06:30 PM


What are some real world applications of the Battleship puzzle?

Aside from data reconstruction, what would be some real world applications similar to that of solving the Battleship puzzle?

by Battleship at May 01, 2016 06:21 PM


Scraping options data and returning ticker symbols of companies meeting certain criteria [on hold]

This is my first post on this site and I am looking forward to becoming more involved as my skills in coding increase. My first question involves scraping options (calls and puts) data from the web using Yahoo or Google Finance. I am interested in running code, in R, which will look at the asking price for a call/put on a given stock. Specifically, I hope to run code which examines all optionable stocks (a csv file containing all symbols can be found at ) and returns a list of stock symbols corresponding to those stocks which have an asking price for a given call/put at or below a certain price, with a strike price within a given percentage of the current trading price. For example, assume that I am looking at the hypothetical company ABC, which is currently trading at 100 dollars a share. If there exists a put or call option on this stock for a 1 dollar premium or less with a strike price between 90 and 110 dollars (e.g. within 10% of the current price), I would be interested in examining options on this stock. Expanding this, I would be interested in generating an algorithm which would search all optionable stocks and return a list of stock symbols corresponding to those stocks meeting these criteria.

I have extensively searched through the available resources on this site and have found some insight into methods of scraping options data, specifically using the script as presented below (gathering data for Apple) and described in


getOptionQuote <- function(symbol){
    output = list()
    url = paste('', symbol, '&output=json', sep = "")
    x = getURL(url)
    fix = fixJSON(x)
    json = fromJSON(fix)
    numExp = dim(json$expirations)[1]
    for(i in 1:numExp){
        # download each expiration's data
        y = json$expirations[i,]$y
        m = json$expirations[i,]$m
        d = json$expirations[i,]$d
        expName = paste(y, m, d, sep = "_")
        if (i > 1){
            url = paste('', symbol, '&output=json&expy=', y, '&expm=', m, '&expd=', d, sep = "")
            json = fromJSON(fixJSON(getURL(url)))
        }
        output[[paste(expName, "calls", sep = "_")]] = json$calls
        output[[paste(expName, "puts", sep = "_")]] = json$puts
    }
    output
}

fixJSON <- function(json_str){
    stuff = c('cid','cp','s','cs','vol','expiry','underlying_id','underlying_price')
    for(i in 1:length(stuff)){
        replacement1 = paste(',"', stuff[i], '":', sep = "")
        replacement2 = paste('\\{"', stuff[i], '":', sep = "")
        regex1 = paste(',', stuff[i], ':', sep = "")
        regex2 = paste('\\{', stuff[i], ':', sep = "")
        json_str = gsub(regex1, replacement1, json_str)
        json_str = gsub(regex2, replacement2, json_str)
    }
    json_str
}

aapl_opt = getOptionQuote("AAPL")

However, this code does not support examining multiple stocks at once and thus has not been successful for my application. Any help would be greatly appreciated!


by brichard11 at May 01, 2016 06:18 PM


Tensorflow LSTM model testing

I'm new to LSTM and Tensorflow, and I'm trying to use an LSTM model to learn and then classify some huge data set that I have. (I'm not worried about the accuracy; my intention is to learn.) I tried to implement the model in a similar way as in the PTB word prediction tutorial that uses LSTM. The code in the tutorial ( ) uses the below line to run the session using the model:

 cost, state, _ =[m.cost, m.final_state, eval_op],
                                 {m.input_data: x,
                                  m.targets: y,
                                  m.initial_state: state})

I modified this for my example as below (to get the logits and work with it):

  cost, state, _,output,logits =[m.cost, m.final_state, eval_op, m.output,m.logits],
                                 {m.input_data: x,
                                  m.targets: y,
                                  m.initial_state: state})

So my questions if someone could help are as below:

  • How can the model built during training be used for testing? What exactly is happening when three models are used by the tutorial, one each for train, test and validation?
  • What about the targets while testing (if I don't know them, say in a classification problem)? What changes can be made in run_epoch() so as to use the model built during training?
  • Just another question: it's difficult to debug tensorflow graphs ( ) and I found it difficult to understand the tensorboard visualizer too. I didn't find a good resource for learning tensorflow (the website seems to be lacking structure/documentation). What other resources/debugging methods are there?


by Prabhanjan Bhat at May 01, 2016 06:02 PM

Is there any python library or API like deeplearning4j?

I need to do some deep learning work in Python, mainly image-processing-based work. Does Python have any standard library or API which works like deeplearning4j?

by Razik at May 01, 2016 05:59 PM



Difference between Probabilistic kNN and Naive Bayes

I'm trying to modify a standard kNN algorithm to obtain the probability of belonging to a class instead of just the usual classification. I haven't found much information about Probabilistic kNN, but as far as I understand, it works similarly to kNN, with the difference that it calculates the percentage of examples of every class inside the given radius.

So I wonder, what's the difference then between Naive Bayes and Probabilistic kNN? I can only spot that Naive Bayes takes into consideration the prior probability, while PkNN does not. Am I getting it wrong?

Thanks in advance!

by vandermies at May 01, 2016 05:45 PM


Proving a binary heap has $\lceil n/2 \rceil$ leaves

I'm trying to prove that a binary heap with $n$ nodes has exactly $\left\lceil \frac{n}{2} \right\rceil$ leaves, given that the heap is built in the following way:

Each new node is inserted via percolate up. This means that each new node must be created at the next available child. What I mean by this is that children are filled level-down, and left to right. For example, the following heap:

    0
   / \
  1   2

would have to have been built in this order: 0, 1, 2. (The numbers are just indexes, they give no indication of the actual data held in that node.)

This has two important implications:

  1. There can exist no node on level $k+1$ without level $k$ being completely filled

  2. Because children are built left to right, there can be no "empty spaces" between the nodes on level $k+1$, or situations like the following:

        0
       / \
      1   2
     / \   \
    3  4    6

(This would be an illegal heap by my definition.) Thus, a good way to think of this heap is as an array implementation of a heap, where there can't be any "jumps" in the indices of the array.

So, I was thinking induction would probably be a good way to do this... perhaps something dealing with the even and odd cases for $n$. For example, some induction using the fact that heaps built in this fashion must have an internal node with one child for even $n$, and no such nodes for odd $n$. Ideas?
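Before the induction, the claim is easy to sanity-check using the array representation mentioned above, where node $i$ (0-indexed) is a leaf exactly when its left-child index $2i+1$ falls outside the array:

```python
import math

def count_leaves(n):
    # array-based complete binary heap, 0-indexed: node i is a leaf
    # iff its left child index 2*i + 1 falls outside the array
    return sum(1 for i in range(n) if 2 * i + 1 >= n)

# check the claimed identity for small heaps
assert all(count_leaves(n) == math.ceil(n / 2) for n in range(1, 500))
```

This is not a proof, but it confirms the target identity and suggests the inductive step: going from $n$ to $n+1$ either turns one internal node's missing child into a leaf or adds a leaf under an existing leaf.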

by varatis at May 01, 2016 05:44 PM

Analysis of Weighted Quick Union with Path Compression

I have searched the internet for an analysis of why WQUPC is amortized $O(m \alpha(n))$ for $m$ operations on $n$ nodes ($\alpha(n)$ is the inverse Ackermann function).

I understand why it is $O ( m \log^{\ast} ( n ) ) $ as it is proven on wikipedia.

So the question is, why is WQUPC $O( m \alpha (n) ) $? I would prefer a rigorous analysis, but an intuitive explanation would also be useful.

by Sid T at May 01, 2016 05:34 PM

What is the difference between centralised, decentralised, distributed, fully-distributed?

What is the difference between centralised, decentralised, distributed, fully-distributed, partially-centralised and partially-decentralised systems? Which type or topology does the system in the picture below have? For instance, A is a master node (coordinator) of A1, A2, A3, which makes this part a centralised system. A, B, C and D are all connected to each other, which makes this system a decentralised one. But then, what are fully-distributed, partially-centralised and partially-decentralised systems?

[image: system topology diagram]

by Andrei at May 01, 2016 05:33 PM


Binary Search Tree insert function in OCaml

Suppose we want to insert a smaller value. It will go to Node (insert x left, k, right). I don't understand how we can have insert x left when the insert function is declared as taking only one argument, the key. How can left also be passed to the insert function?

type 'a bst_t =  
| Leaf
| Node of 'a bst_t * 'a * 'a bst_t

let rec insert x = function  
  | Leaf -> Node (Leaf, x, Leaf) 
  | Node (left, k, right) ->
    if x < k then Node (insert x left, k, right) 
    else Node (left, k, insert x right) 

by power_output at May 01, 2016 05:32 PM


Quantum algorithms for QED computations related to the fine structure constants

My question is about quantum algorithms for QED (quantum electrodynamics) computations related to the fine structure constants. Such computations (as explained to me) amounts to computing Taylor-like series $$\sum c_k\alpha^k,$$ where $\alpha$ is the fine structure constant (around 1/137) and $c_k$ is the contribution of Feynman diagrams with $k$-loops.

This question was motivated by Peter Shor's comment (about QED and the fine structure constant) in a discussion regarding quantum computers on my blog. For some background, here is a relevant Wikipedia article.

It is known that: a) the first few terms of this computation give very accurate estimations for relations between experimental outcomes, which are in excellent agreement with experiments; b) the computations are very heavy, and computing more terms is beyond our computational powers; c) at some point the computation will explode; in other words, the radius of convergence of this power series is zero.

My question is very simple: can these computations be carried out efficiently on a quantum computer?

Question 1

1) Can we actually efficiently compute (or well-approximate) the coefficients $c_k$ with a quantum computer?

2) (Weaker) Is it at least feasible to compute the estimates given by the QED computation in the regime before these coefficients explode?

3) (Even weaker) Is it at least feasible to compute the estimates given by these QED computations as long as they are relevant? (Namely, for those terms in the series that give a good approximation to the physics.)

A similar question applies to QCD computations for computing properties of the proton or neutron. (Aram Harrow made a related comment on my blog on QCD computations, and the comments by Alexander Vlasov are also relevant.) I would be happy to learn the situation for QCD computations as well.

Following Peter Shor's comment:

Question 2

Can quantum computation give the answer more accurately than is possible classically because the coefficients explode?

In other words, will quantum computers allow us to model the situation and to efficiently give approximate answers for the actual physical quantities?

Another way to ask it:

Can we compute using quantum computers more and more digits of the fine structure constant, just like we can compute with a digital computer more and more digits of e and $\pi$?

(Ohh, I wish I was a believer :) )

more background

The hope that computations in quantum field theory can be carried out efficiently with quantum computers was (perhaps) one of Feynman's motivations for QC. Important progress towards quantum algorithms for computations in quantum field theories was achieved in this paper: Stephen Jordan, Keith Lee, and John Preskill, Quantum Algorithms for Quantum Field Theories. I don't know if the work by Jordan, Lee, and Preskill (or some subsequent work) implies an affirmative answer to my question (at least in its weaker forms).

A related question on the physics side

I am also curious whether there are estimates for how many terms in the expansion we get before we witness the explosion. (To put it on more formal ground: are there estimates for the minimum $k$ for which $\alpha c_{k+1}/c_k > 1/5$, say?) And what is the quality of the approximation we can expect when we use these terms? In other words, how much better results can we expect from these QED computations with unlimited computational power?

Here are two related questions on the physics sister site. QED and QCD with unlimited computational power - how precise they are going to be?; The fine structure constant - can it genuinely be a random variable?

by Gil Kalai at May 01, 2016 05:31 PM



Terraform Modules – My Sharing Wishlist

I’ve been writing a few Terraform modules recently with the aim of sharing them among a few different teams and there are a couple of things missing that I think would make reusable modules much more powerful.

The first and more generic issue is the inability to use more complex data structures. After you've spent a while using Terraform with AWS resources you'll develop the urge to just create a hash of tags and use it nearly everywhere, ideally with the ability to override a key/value or two when actually using the hash. If your teams are using tags, and you really should be, it's very hard to write a reusable module if the tag names in use by each team are not identical. Because you can only (currently) pass strings around, and you're unable to use a variable as a tag name, you're stuck with requiring everyone to use exactly the same tag names or not providing any at all. There's no middle ground available.

tags {
    "${}" = "Baz"
}

# creates a Tag called literally '${}'

My second current pain point, and the one I’m more likely to have missed a solution to, is the ability to conditionally add or remove resource attributes. The most recent time this has bitten me is when trying to generalise a module that uses Elastic Load Balancers. Sometimes you’ll want an ELB with a cert and sometimes you won’t. Using the current module system there’s no way to handle this case.

If I was to do the same kind of thing in CloudFormation I’d use the AWS::NoValue pseudo parameter.

    "DBSnapshotIdentifier" : {
        "Fn::If" : [
                {"Ref" : "DBSnapshotName"},
                {"Ref" : "AWS::NoValue"}
        ]
    }
If DBSnapshotName has a value the DBSnapshotIdentifier property is present and set to that value. If it’s not defined then the property is not set on the resource.

As an aside, after chatting with @andrewoutloud, it’s probably worth noting that you can make entire resources optional using a count and setting it to 0 when you don’t want the resource to be included. While this is handy and worth having in your Terraform toolkit it doesn’t cover my use case.

variable "include_rds" {
    default = 0
    description = "Should we include an aws_db_instance? Set to 1 to include it"
}

resource "aws_db_instance" "default" {
    count = "${var.include_rds}" # this serves as an if

    # ... snip ...
}
I’m sure these annoyances will be ironed out in time but it’s worth considering them and how they’ll impact the reusability of any modules you’d like to write or third party code you’d want to import. At the moment it’s a hard choice between rewriting everything for my own use and getting all the things I need or vendoring everything in and maintaining a branch with things like my own tagging scheme and required properties.

by Dean Wilson at May 01, 2016 05:15 PM


How do decompose this relation from 3NF to BCNF?

Given a relation with attributes $\{A, X, Y, Z\}$ and functional dependencies $AX\rightarrow Y, Y\rightarrow A, Z\rightarrow B$, I would think that this would be in 3NF because there are no partial or transitive dependencies. What I don't understand is how to decompose the relation to BCNF by ensuring no relation has a determinant that is not a candidate key.

by DBmass at May 01, 2016 05:14 PM

Recurrence for number of ways to write n as the sum [migrated]

I'm trying to find the recurrence for this problem:

Given m (odd), let O(n,m) be the number of ways to write n as the sum of
odd positive integers at most m (distinctness not required). Objective:
come up with a recurrence for O(n,m).

Here's my attempt:

ways(m,n) = 0    if m > n
ways(m,n) = 1    if m = n

ways(m,n) = ways(m+1,n) + ways(m, n - m)    otherwise

However, I don't think this is correct as it doesn't account for n having to be the sum of odd positive integers (at most m). It doesn't account for oddness.
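For comparison, here is a hedged sketch of one recurrence that does account for oddness: split on whether the largest allowed odd part k is used at least once. The function name and base cases are my own choices, not the assignment's:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def ways(n, m):
    # number of ways to write n as a sum of odd positive integers <= m
    if n == 0:
        return 1
    if m <= 0:
        return 0
    k = m if m % 2 == 1 else m - 1   # largest odd part allowed
    if k > n:
        return ways(n, n)
    # either use at least one copy of k, or use only odd parts < k
    return ways(n - k, k) + ways(n, k - 2)
```

For example, ways(5, 5) counts 5, 3+1+1 and 1+1+1+1+1, i.e. 3 ways.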

by Carlo at May 01, 2016 05:07 PM



Building an undecidable T-Grammar

I am asked, "Show that these T-Grammars constitute a set of languages that are undecidable. Do this by building a T-Grammar for a Turing machine description. For a starting point you might think about machine configurations."

I am not sure how configurations are supposed to help, but this is the grammar that I built:

$$ abc \to def; d \to a; e \to b; f \to c $$

Is this undecidable?

by matttm at May 01, 2016 04:47 PM


Black Scholes Constant Implied Volatility

I hope someone can clarify my ideas about the constant implied volatility in the classical Black Scholes framework.

As well known, market practitioners quote the prices of vanilla call and put options in terms of implied volatilities. For inputs $K$, $S$, $r$, $T$ and the price of the option $V$, one can determine the implied volatility $σ$ such that

$V=BS(K,S,r,T,σ)$ (1)
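Numerically, the inversion in (1) can be done with a one-dimensional root search, since the Black-Scholes price is increasing in volatility. A minimal Python sketch for a call (erf for the normal CDF, bisection for the solve; the function names are my own):

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S, K, r, T, sigma):
    # Black-Scholes price of a European call, i.e. BS(K, S, r, T, sigma) in (1)
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)

def implied_vol(V, S, K, r, T, lo=1e-6, hi=5.0):
    # bisection works because the call price is increasing in sigma
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if bs_call(S, K, r, T, mid) < V:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

In practice each quoted strike comes with its own market price, and the smile is the plot of these per-strike implied volatilities.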

When the market-quoted implied volatilities are plotted against different strike prices for a fixed maturity $T$, the graph typically exhibits a 'smile' shape, hence the name volatility smile.

Theory says that this implies a deficiency in the Black Scholes model since it assumes a constant volatility parameter, not depending on $K$ nor $T$. Hence the volatility smile would be flat.

Here my ideas get confused. Assuming that $S$, $r$ and $T$ remain constant, for a fixed market price $V$ of a vanilla option the implied volatility will vary depending on the value of the strike $K$ under the Black-Scholes model (1). Hence, if the implied volatility is plotted against different strikes for a fixed $V$, it will indeed show a smile behaviour, which contradicts what the theory states.

Furthermore, do the market quoted implied volatilities that form the volatility smile according to the theory correspond to a fixed vanilla option price $V$ and with varying $K$?

I think I am making a mistake in my reasoning but I do not understand where. I would be glad if someone can point me in the right thinking direction.

Thanks in advance.

by Tinkerbell at May 01, 2016 04:46 PM


Convert non-integer decimal to octal

I follow a course in computer architecture and I'm doing exercises on number conversions. One of the questions asks me to convert 251.5625 to octal and hexadecimal base. No further info is given. Does this mean that I have to convert it and just ignore the point? Or what's the convention on something like this?
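The usual convention is to keep the point and convert the two parts separately: repeated division by the base for the integer part, repeated multiplication by the base for the fraction. A small sketch (digit sets beyond hexadecimal are not handled):

```python
DIGITS = "0123456789abcdef"

def to_base(x, base, max_frac_digits=12):
    # convert a non-negative number: integer part by repeated division,
    # fractional part by repeated multiplication by the base
    whole, frac = int(x), x - int(x)
    int_digits = ""
    while whole:
        int_digits = DIGITS[whole % base] + int_digits
        whole //= base
    int_digits = int_digits or "0"
    frac_digits = ""
    while frac and len(frac_digits) < max_frac_digits:
        frac *= base
        d = int(frac)
        frac_digits += DIGITS[d]
        frac -= d
    return int_digits + ("." + frac_digits if frac_digits else "")
```

With this, 251.5625 comes out as 373.44 in octal and FB.9 in hexadecimal; since 8 = 2^3 and 16 = 2^4, both can also be read off the binary form 11111011.1001 in groups of 3 or 4 bits from the point.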

by Pieter Verschaffelt at May 01, 2016 04:38 PM


Kolmogorov-Smirnov test for Generalized Pareto Distribution

I've fitted my data to a generalized pareto distribution as to model the returns in the tails more accurately. The interior is fitted with kernel distributions.

I would now like to test whether the original returns conform to the hypothesized distribution (i.e. the generalized Pareto distribution). Can I do this with the Kolmogorov-Smirnov test? I've already made QQ-plots, but I would like to conduct a statistical significance test on top. Can someone help? I'm struggling to implement it in Matlab.
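One caveat worth flagging: when the GPD parameters are estimated from the same sample being tested, the standard KS critical values are too lenient, and a bootstrap of the test statistic is typically needed. The statistic itself is simple; a language-neutral sketch in Python (Matlab's kstest similarly accepts a sample together with a hypothesized CDF):

```python
def gpd_cdf(x, xi, sigma):
    # generalized Pareto CDF for threshold 0 and shape xi != 0
    return 1.0 - (1.0 + xi * x / sigma) ** (-1.0 / xi)

def ks_statistic(sample, cdf):
    # one-sample KS statistic D_n = sup_x |F_n(x) - F(x)|,
    # evaluated just before and just after each order statistic
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        fx = cdf(x)
        d = max(d, (i + 1) / n - fx, fx - i / n)
    return d
```

To get valid p-values with estimated parameters, refit and recompute this statistic on many simulated GPD samples and compare the observed value against that bootstrap distribution.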


by Peter Miller at May 01, 2016 04:35 PM


Turing Machine that always returns a blank tape

Is it possible to construct a Turing machine such that, given any finite input on a tape, it clears the tape in a finite amount of time?

I have used such a TM as an intermediate step to show a reduction from the state entry problem to the $w \in L(M)$ problem, but I don't know if it is feasible to construct one.

Even if we assume that the head of the TM always starts at the leftmost character on the tape and keeps moving right, clearing each symbol it encounters, if the tape is infinite, how will we know when to stop moving right?
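For what it's worth, the machine never needs to "know" where the tape ends: the input is finite by definition, so the head reaches a blank after finitely many steps and can halt there. A toy Python sketch of that behaviour, where the list stands for the non-blank prefix of the tape:

```python
def clear_tape(tape, blank=None):
    # the list models the non-blank prefix of the tape; the head erases
    # rightwards and halts the first time it reads a blank symbol
    i = 0
    while i < len(tape) and tape[i] != blank:
        tape[i] = blank
        i += 1
    return i  # number of cells erased before halting
```

This relies on the input word being contiguous; if blanks could appear inside the input, a different halting criterion would be needed.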

by Banach Tarski at May 01, 2016 04:24 PM


AssertionError: Mismatch between dataset size and units in output layer

I want to do NN classification using scikit-neuralnetwork. I have 5 classes, so in the output layer I have units=5, but I am getting this error: Mismatch between dataset size and units in output layer. I reshaped my y_train and applied the "Sigmoid" function to the output layer according to the documentation:

If you want to do multi-label classification, simply fit using a y array of integers that has multiple dimensions, e.g. shape (N, 3) for three different classes. Then, make sure the last layer is Sigmoid instead.

y_train shape is (2115, 5) and X_train shape is (2115, 343). This is the code:

import sknn.mlp as mlp
from sknn.mlp import Classifier
ip_layer = mlp.Layer('Sigmoid', units=1)
hidden_layer = mlp.Layer('Tanh', units=100)
op_layer = mlp.Layer('Sigmoid', units=5) 

nn = Classifier(
    layers=[ip_layer, hidden_layer, op_layer])
nn.fit(X_train, y_train)

by sameh habboubi at May 01, 2016 04:13 PM


Pro tip: if you appear in porn or work as a sex ...

Pro tip: if you appear in porn or work as a sex worker, and you are embarrassed about it in front of the people around you or keep it secret from them, then don't upload photos of yourself to social media sites.

In Russia, "concerned citizens" are currently using a facial recognition app to deanonymize porn performers and sex workers to their families. A thoroughly disgusting business.

On the other hand, one can of course ask why a public facial recognition API with an app has to exist in the first place.

May 01, 2016 04:00 PM

Iran's new parliament has more women than mullahs. However ...

Iran's new parliament has more women than mullahs.
However clerical numbers have steadily fallen since 1980 with 153 elected in the second parliament, 85 in the 3rd, 67 in the 4th and 52 in the 5th.

The outgoing legislature had only 27 men of the cloth. Of the 16 who will enter parliament next month 13 have conservative political leanings and 3 are reformists.

May 01, 2016 04:00 PM

You know the health system is in bad shape ...

You know the health system is in bad shape when hospital staff go on strike not for more money but for more nursing staff, because the wards are so badly overloaded.

May 01, 2016 04:00 PM

Old and busted: shell companies in Panama. New hotness: ...

Old and busted: shell companies in Panama.

New hotness: shell companies in Mauritius.

With the Deutsche Bank seal of approval!

The branch office has existed since 1996, and currently a good 200 employees work there.
Oh really? 200 employees in Mauritius? It will surely be fun to ask what they officially do there!
Deutsche Bank needs them there to offer its services to large pension funds, insurers and private equity firms that invest in Asia and Africa. They handle presumably unexciting matters such as account keeping and custody services. Internal processes and administrative tasks are also handled there for the bank, according to Frankfurt.
Oh I see, THAT's what you do there!1!! Yeah, right, sure.

May 01, 2016 04:00 PM

Planet Theory

TR16-071 | Unprovability of circuit upper bounds in Cook's theory PV | Jan Krajicek, Igor Carboni Oliveira

We establish unconditionally that for every integer $k \geq 1$ there is a language $L \in P$ such that it is consistent with Cook's theory PV that $L \notin SIZE(n^k)$. Our argument is non-constructive and does not provide an explicit description of this language.

May 01, 2016 03:59 PM


Restart/reload IPFW remotely via ssh without losing connection

Is it possible to restart IPFW or reload its script remotely via an ssh connection without losing the current connection?

by b.mazgarov at May 01, 2016 03:34 PM


Neural Networks to Upscale & Stylize Pixel Art

A small 550-line program is used to upscale Minecraft textures.


by meskarune at May 01, 2016 03:25 PM


Image classification using Convolutional neural network

I'm trying to classify hotel image data using a convolutional neural network.

Below are some highlights:

  1. Image preprocessing:

    • converting to gray-scale
    • resizing all images to same resolution
    • normalizing image data
    • finding pca components
  2. Convolutional neural network:

    • Input- 32*32
    • convolution- 16 filters, 3*3 filter size
    • pooling- 2*2 filter size
    • dropout- dropping with 0.5 probability
    • fully connected- 256 units
    • dropout- dropping with 0.5 probability
    • output- 8 classes
  3. Libraries used:

    • Lasagne
    • nolearn

But I'm getting low accuracy on the test data, only around 28%.

Any possible reason for such low accuracy? Any suggested improvements?

Thanks in advance.

by Chetan Borse at May 01, 2016 03:19 PM


Rigorous Books on Algorithms

I thoroughly enjoyed my algorithms class but I felt that it lacked rigor. Most of the time I was able to intuitively understand why the algorithms presented worked and why they had the time complexity that was presented but I'd like to be able to prove such things. As such, I'd like a book that goes over lots of common algorithms and has a focus on proving the correctness and time complexity of the algorithms. Any good recommendations?

by Budge at May 01, 2016 03:04 PM



What do you do when you cannot make progress on the problem you have been working on?

I am a 2nd-year graduate student in theory. I have been working on a problem for the last year (in graph theory/algorithms). Until yesterday I thought I was doing well (I was extending a theorem from a paper). Today I realized that I have made a simple mistake, and that it will be much harder than I thought to do what I intended to do. I am so disappointed that I am thinking about leaving grad school.

Is it common for a researcher to notice that her idea is not going to work only after a considerable amount of work?

What do you do when you realized that an approach you had in mind is not going to work and the problem seems too difficult to solve?

What advice would you give to a student in my situation?

by Kumar at May 01, 2016 02:54 PM


modeling for asset value by Automata

I want to model asset values and their relationships. I model one asset's value like this: [image: two-state automaton for an asset's value]

state A : when asset value decrease one unit

state B: when asset value increase one unit

My problem is how to show the relationship between two assets' values. Imagine we have two assets, each with a value model like the image above. When asset S decreases in value, another asset related to it, say Q, decreases in value simultaneously. How can I show this simultaneous transition?
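One standard way to express "both automata transition at the same time" is the synchronous product: a shared event fires a transition in each component simultaneously. A minimal sketch (the toy transition relation is my own stand-in for the model in the figure):

```python
def sync_product(delta_s, delta_q):
    # delta_*: dict mapping (state, event) -> next state for one asset;
    # the synchronous product fires both components on the same shared
    # event, which models the two values changing simultaneously
    def step(pair, event):
        s, q = pair
        return (delta_s[(s, event)], delta_q[(q, event)])
    return step

# two copies of a toy two-state value model with "down"/"up" events
delta = {("A", "up"): "B", ("B", "up"): "B",
         ("A", "down"): "A", ("B", "down"): "A"}
step = sync_product(delta, delta)
```

If the assets should only be partially coupled, one can instead synchronise on a subset of shared events and let each automaton move independently on its private events.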

by ghasedak at May 01, 2016 02:43 PM


How to generate Extended Finite State Machines Randomly with some properties?

This is related to my academic project

An extended finite state machine is a tuple $SM=(I,S,T)$ (simplified):

  • $I$ is the set of identifiers, divided into two sets, inputs and outputs; for simplification we will just consider boolean variables. Let $\sigma$ be the evaluation function $\sigma :I \to \{0,1\}$ which associates to every identifier its value, and let $\Sigma$ be the set of all evaluations.
  • $S$ is the set of states
  • $T\subseteq S\times S \times G\times A$ the set of transitions, a transition $t=(s_1,s_2,g,a)$ means :

    • $s_1$ the source state
    • $s_2$ the target state
    • $g$ the guard, which is usually expressed in the guard language; here we can just consider its semantics $g:\Sigma \to \{0,1\}$, where $g(\sigma)$ is an element of $\{0,1\}$.

    • $a$ the action, usually expressed in the action language; we will identify it with its semantics $a:\Sigma \to \Sigma$, where $a(\sigma)$ is another evaluation function.

Now a configuration of the state machine $SM$ is a tuple $(s_i,\sigma)$ where $s_i$ is the current state and the $\sigma$ the current evaluation function. A run of the machine is : $$(\sigma_0,s_0)\to(\sigma_1,s_{1})\to (\sigma_2,s_2)\to \cdots (\sigma_k,s_k)\to(\sigma_{k+1},s_{k+1})\to \cdots $$ where for every $k$ we have $(s_k,s_{k+1},g,a)$ is a transition and $g(\sigma_k)=1$ (meaning true) and $a(\sigma_k)=\sigma_{k+1}$

Example of action : we denote an action by $Rep(a)=[e_1,\cdots,e_n] $ where $e_i$ is a boolean expression over the variables in $I=[o_1,\cdots,o_n,i_1,\cdots ,i_k]$ where $i_j$ are the inputs of the system and are determined by the environment. And from this representation of the action we can have $a(\sigma)(x)=e_j(\sigma) $ if $x=o_j$ and $a(\sigma)(x)=i_j$ if $x=i_j$

The guards are represented by expressions over the set of the variables $I$

What I am able to do : I was able to generate some boolean expressions that are valid or satisfiable using either well known examples, or using the threshold density of boolean formulas. Hence I am able to generate the guards and the actions (which are random, and have some properties like satisfiable or valid)

Question : how could I generate a random Extended State machine with the property that: Every state is reachable and every transition can be executed

you can use that fact that I am able to generate a formula that evaluates to $1$ or $0$ and I can generate a satisfiable or unsatisfiable formula
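Not a full answer, but one construction I believe meets the reachability requirement: wire a random spanning tree rooted at the initial state first, then sprinkle extra random transitions. A hedged Python sketch, with plain tuples for transitions and placeholder strings where your generated guards and actions would go; with satisfiable (here constant-true) guards on the tree edges, every transition is also executable:

```python
import random
from collections import deque

def random_efsm(n_states, extra_edges, seed=0):
    # every state s > 0 gets an incoming transition from some earlier
    # state, forming a spanning tree rooted at state 0, so all states
    # are reachable; "True"/"id" are placeholders for generated guards
    # and actions
    rng = random.Random(seed)
    transitions = []
    for s in range(1, n_states):
        transitions.append((rng.randrange(s), s, "True", "id"))
    for _ in range(extra_edges):
        transitions.append((rng.randrange(n_states),
                            rng.randrange(n_states), "True", "id"))
    return transitions

def all_reachable(n_states, transitions):
    # BFS from the initial state 0 over the transition graph
    adj = {s: [] for s in range(n_states)}
    for src, dst, _, _ in transitions:
        adj[src].append(dst)
    seen, todo = {0}, deque([0])
    while todo:
        for nxt in adj[todo.popleft()]:
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return len(seen) == n_states
```

The extra edges can carry arbitrary satisfiable guards; only the tree edges need guards that the generated runs can actually enable.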

Any other ideas to do tests with Extended State Machines? Thanks

Edit 29/04/2016

As my question is either not very clear or difficult; I have an alternative question:

Question 2: what are some well known random Extended State Machines for which every state is reachable.

The purpose of my project is to test the efficiency of an algorithm

by Elaqqad at May 01, 2016 02:42 PM


Overpricing Bermudan swaption using Shifted LMM

I am trying to model a callable range accrual note linked to the EUR CMS spread, 20Y-10Y, with cap and floor. The note is Bermudan, callable starting in year 3 and every 3 years thereafter until maturity at 30 years. We plan on using a shifted LMM for the EUR rate.

We plan to calibrate the Libor correlations to CMS 20-10 spread options of 1Y maturity, because those are the liquid ones, and the vols to vanillas. A colleague told me that we will still overprice the trade.

I don't understand why. I understand that Bermudans will trade at a discount to Europeans, but I don't understand why the modeling will generally overprice the trade. Any help or links to papers or books would be greatly appreciated, as would links to books and papers that explain how to remedy the issue.


by Amatya at May 01, 2016 02:37 PM


Halting problem reducing to the blank tape halting problem

I was going through the proof in my book and I find its definition very confusing, so I would like someone to help me understand it.

  • The blank tape problem takes a machine and an empty tape and asks whether the machine halts or not
  • We prove it is unsolvable by reducing the halting problem to it

Whenever I read about this online, I read that we write the input on the tape and run this through the halting problem.

  • How do we construct the reduction?
  • Why do we write the input on the tape?
  • Isn't this all about having an empty tape?

Update: I think my misunderstanding is in the definition of input and tape.

by revisingcomplexity at May 01, 2016 02:36 PM


short selling with collateral accounting

I don't know how the accounting works for short selling with collateral:

For example, suppose a stock is \$10 a share and turns out to be $15 a share a week later.

At time 0, you borrow and sell 10 shares and get total proceeds $100

If collateral requirement is 50%: you have to keep $50 in the bank, and any potential losses are deducted from there first.

A week later, your position is worth 10 * $15 = $150. If you close, you have a net loss of \$100 - \$150 = -$50.

This basically wipes out your collateral account entirely: \$50-$50=0.

So you still take home $50 which you got from the original sale but weren't required to put into the collateral account.

Doesn't this show you still take home a positive amount of money? Whereas if there were no collateral accounting, you very clearly sold for \$100 and bought for $150, so your loss is very clear.

For shorting with collateral it appears you still have $50... very confused.
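The confusion usually comes from assuming you "take home" the sale proceeds at time 0. Under the common margin convention (hedging here: exact rules vary by broker and jurisdiction), the \$100 of proceeds stays with the broker as collateral and the 50% margin is posted on top of it, so your own cash outlay is \$50 and, in this example, nothing comes back. A toy sketch of that accounting:

```python
def short_trade(shares, p0, p1, margin_rate=0.5):
    # toy accounting for a covered short: the sale proceeds are held by
    # the broker as collateral, and the trader posts margin_rate * proceeds
    # on top out of their own pocket
    proceeds = shares * p0             # held by the broker, not taken home
    own_margin = margin_rate * proceeds
    cost_to_cover = shares * p1
    pnl = proceeds - cost_to_cover     # economic profit/loss of the short
    returned = proceeds + own_margin - cost_to_cover
    return pnl, returned
```

With shares=10, p0=10 and p1=15, the trader deposits \$50 of their own money and gets \$0 back, a \$50 loss, which matches the sold-at-100/bought-at-150 view exactly.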

by MoronicHero at May 01, 2016 02:32 PM

How can I compute zero coupon bond prices from dirty/clean prices of coupon bonds?

I am having problems with computing zero-coupon bond prices. The question is the following:

Today is $t$=14.4.2016 and I know dirty and clean prices of coupon bonds expiring at maturities: 4.7.2016, 4.7.2017, 4.7.2018,4.7.2019, 4.7.2020,4.7.2021. Coupons are paid annually on the date of maturity.

How can we determine the term structure of zero-coupon bond prices?

My idea is simply to apply the formula:

$ P_{dirty}(t) = \sum_i^n c_i P(t,T_i)$

Starting from the bond A expiring on the 4th July 2016, we should have

$P_{dirty}^A (t)=c^A P(t,4.7.2016)$

from which we can compute the first discount factor. Then, considering the other bonds, recursively, we should be able to compute them all.

Finally, employing these results and the method of least squares (assuming a parametric form of the term structure) we should be able to estimate the term structure.
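The recursion described above can be sketched as follows (the cashflow layout is my own assumption, quoted per 100 notional):

```python
def bootstrap_discounts(bonds):
    # bonds: list of (dirty_price, [(T_i, cashflow_i), ...]) sorted by
    # maturity; note that the FINAL cashflow of each bond must include the
    # principal redemption, not just the coupon. Forgetting the principal
    # is a common way to end up with discount factors above 1.
    P = {}
    for dirty_price, flows in bonds:
        pv_known = sum(cf * P[T] for T, cf in flows[:-1])
        T_last, cf_last = flows[-1]
        P[T_last] = (dirty_price - pv_known) / cf_last
    return P
```

For instance, a one-year 4% bond at dirty price 101 gives P(t,T_1) = 101/104, and the two-year bond's discount factor follows from stripping out the first coupon at that factor.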

The issue is that the discount factors $P(t,T_i)$ turn out to be bigger than $1$, which is impossible. Can you help me?

Is the formula above correct?

I can provide you with all the numerical values, if you need them.

Thank you very much for any help you can provide!!

by Fred G. at May 01, 2016 02:22 PM

The relation between exchange rate SDE and respective interest rates

The exchange rate between a domestic currency money market and a foreign currency money market can be expressed as $$ dQ(t) = (r_d - r_f)Q(t)dt + \sigma Q(t)d\tilde{W}(t) $$ where $r_d$ is the interest rate for the domestic market, and $r_f$ for foreign.

In my head, I believe that the exchange rate should decrease when the domestic interest rate goes up, indicating the domestic currency is strengthening. For example if the Fed were to increase rates, then $EUR/USD$ should decrease, given that the ECB doesn't do much. So, if $EUR/USD$ was 1.14 yesterday, it should be below 1.14 today.

To my understanding, the SDE for $Q(t)$ doesn't seem to reflect this fact. It seems that $Q(t)$ would increase if $r_d$ were to go up. I would like to resolve this contradiction, so any help would be appreciated.

by Astaboom at May 01, 2016 02:18 PM

Fred Wilson


I like to look at Google Trends from time to time to see what it can tell me about things. I realize that search keyword activity is only one data point in a complex system and that with the move to mobile, it is less important than it was in the web only era. And people search for things when they want them. Once they have them, the search volume goes down. But I still think Google Trends can reveal some interesting things.

Here are some queries I ran today:

Facebook and Google are battling it out for video supremacy, but this query really doesn’t tell us very much about where that battle is going and how it will end. It is interesting to note that YouTube has been a mature but stable business for a long time now.

Twitter and the smartphone seem to have risen with a similar curve and are now in decline, with Twitter falling a bit faster than smartphones.

We see a similar shaped curve with Facebook, but the order of magnitude is quite different which is why I did not combine it with the previous chart.

December 2013 sure seems like the high water mark for the mobile social sector.

But not all boats go out with the receding tide.

Here is Snapchat and Instagram, with Twitter thrown in for scale comparison

It will be interesting to see when Instagram and Snapchat start flattening off. My gut tells me Instagram may already be there but we just don’t see it in the data yet.

Moving on from the past to the future, here are some of the sectors that entrepreneurs and VCs are betting on as the next big thing:

If you take out the VR term and look at the other three, you see something that looks like the NCAA football rankings over the course of a season. Each team/term has had a moment at the top but it remains unclear who is going to prevail.

If we look at one of the most interesting coming battles in tech, the voice interface race, the data is less clear.

I think we haven’t really gotten going on this one. But it is an important one as Chris Dixon explained in a really good blog post last week.

My semi-regular Google Trends session today confirms what I've known for a while and have written here before. We are largely moving on from mobile and social in terms of big megatrends, video is playing out now, and it's not yet clear what will emerge as the next big thing. Google is betting on AI and I tend to agree with them on that. Voice interfaces may be a good proxy for that trend.

by Fred Wilson at May 01, 2016 01:29 PM



How many labels are acceptable before using regression over classification

I have a problem where I'm trying to use supervised learning in Python. I have a series of x,y coordinates which I know belong to a label in one data set. In the other I have only the x,y coordinates. I am going to use one set to train on the other; my approach is supervised learning with a classification algorithm (linear discriminant analysis), since the number of labels is discrete. Although they are discrete, they are large in number (n = ~80,000). My question: at what number of labels should I consider regression over classification, given that regression is better suited to continuous labels? I'm using SciKit as my machine learning package and astronml.org's excellent tutorial as a guide.

by mapping dom at May 01, 2016 01:05 PM

DragonFly BSD Digest

Lazy Reading for 2016/05/01

Cinco De Mayo is coming up.

Your unrelated link of the week: What was the weirdest 911 call ever received?  (via)


by Justin Sherrill at May 01, 2016 12:58 PM


Python keras how to transform a dense layer into a convolutional layer

I have a problem finding the correct mapping of the weights in order to transform a dense layer into a convolutional layer.

This is an excerpt of a ConvNet that I'm working on:

model.add(Convolution2D(512, 3, 3, activation='relu'))
model.add(MaxPooling2D((2,2), strides=(2,2)))
model.add(Dense(4096, activation='relu'))

After the MaxPooling, the input is of shape (512,7,7). I would like to transform the dense layer into a convolutional layer to make it look like this:

model.add(Convolution2D(512, 3, 3, activation='relu'))
model.add(MaxPooling2D((2,2), strides=(2,2)))
model.add(Convolution2D(4096, 7, 7, activation='relu'))

However, I don't know how I need to reshape the weights in order to correctly map the flattened weights to the (4096,512,7,7) structure that is needed for the convolutional layer? Right now, the weights of the dense layer are of dimension (25088,4096). I need to somehow map these 25088 elements to a dimension of (512,7,7) while preserving the correct mapping of the weights to the neurons. So far, I have tried multiple ways of reshaping and then transposing but I haven't been able to find the correct mapping.

An example of what I have been trying would be this:

weights[0] = np.transpose(np.reshape(weights[0],(512,7,7,4096)),(3,0,1,2))

but it doesn't map the weights correctly. I verified whether the mapping is correct by comparing the output for both models. If done correctly, I expect the output should be the same.
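One way to sanity-check a candidate mapping is on a toy example (a sketch with made-up shapes C=2, H=W=3, OUT=4, not the real VGG-sized layers above): under C-order, channel-first flattening, the dense weight column for output j, reshaped to (C, H, W), is exactly the full-size filter the convolution needs.

```python
import numpy as np

# Toy stand-in for the real (512, 7, 7) -> 4096 case; shapes are assumptions.
C, H, W, OUT = 2, 3, 3, 4
rng = np.random.default_rng(0)

x = rng.normal(size=(C, H, W))               # pooled feature map
w_dense = rng.normal(size=(C * H * W, OUT))  # dense weights, shape (18, 4)

# Dense layer output: flatten in C order (row-major), then matrix multiply.
dense_out = x.reshape(-1) @ w_dense          # shape (OUT,)

# Candidate conv weights: one (C, H, W) filter per output unit.
w_conv = w_dense.T.reshape(OUT, C, H, W)

# A full-size "valid" convolution over an (H, W) input reduces to a dot
# product of each filter with the whole input.
conv_out = np.array([(w_conv[j] * x).sum() for j in range(OUT)])

assert np.allclose(dense_out, conv_out)
```

If this holds for your backend's flatten order, the analogous mapping to try would be `weights[0].T.reshape(4096, 512, 7, 7)`; note that some backends also flip convolution kernels, so treat this as a starting point to verify against the model outputs, not a guarantee.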

by pkropf at May 01, 2016 12:47 PM


Consequence of negative mean reversion of hull white one factor model

I tried to calibrate the data for the Hull-White one-factor model. Sometimes I get a negative estimate of the mean-reversion factor after the calibration process. When I plug the negative mean-reversion factor into the Hull-White one-factor model, the interest rate tree cannot be generated.

I wonder about the theoretical consequences for the Hull-White one-factor model. Can anyone explain the meaning of negative mean reversion in the Hull-White one-factor model? If the mean-reversion factor is negative, can the model be implemented properly?
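A quick way to see the consequence is to simulate the drift alone in the time-homogeneous form dr = a(θ − r)dt (a toy sketch with σ = 0; the parameter names and values are assumptions): for a > 0 the rate decays toward θ, while for a < 0 the drift pushes it away from θ exponentially, which is why a tree with geometrically growing rate ranges cannot be built.

```python
def drift_only_path(a, theta, r0, T=5.0, steps=1000):
    """Euler scheme for dr = a*(theta - r)*dt with sigma = 0."""
    dt = T / steps
    r = r0
    for _ in range(steps):
        r += a * (theta - r) * dt
    return r

theta, r0 = 0.03, 0.05

r_pos = drift_only_path(a=+0.8, theta=theta, r0=r0)   # mean-reverting
r_neg = drift_only_path(a=-0.8, theta=theta, r0=r0)   # mean-fleeing

# Positive a: the rate has almost reached theta.
assert abs(r_pos - theta) < abs(r0 - theta)
# Negative a: the deviation from theta has grown by roughly exp(0.8 * 5).
assert abs(r_neg - theta) > 10 * abs(r0 - theta)
```

The closed form r(T) = θ + (r0 − θ)e^(−aT) makes the same point: a negative a turns the exponential decay into exponential divergence.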


by Dennis at May 01, 2016 12:35 PM


What's the time complexity of Monte Carlo Tree Search?

I'm trying to find the time complexity of Monte Carlo Tree Search (MCTS). Googling doesn't help, so I'm trying to see how far I get calculating it myself.

It does four steps for n iterations, or before the time runs out. So we'll have

O(n * (selection + expansion + simulation + backpropagation))

Expansion just adds a child to the currently selected node. Assuming you're not using a singly linked list or something like that to store tree children, this can happen in constant time, so we can exclude it:

O(n * (selection + simulation + backpropagation))

Given the branching factor b, and d as the depth of our tree, I'm assuming the selection phase runs in O(b*d), because at each level, the selection phase goes to all the children of the previous node.

So our time complexity becomes

O(n * (b*d + simulation + backpropagation))

Backpropagation takes time proportional to the depth of the tree as well, so that becomes:

O(n * (b*d + simulation + d))
But now I'm not sure how to add the simulation phase to this. It'll be proportional to the depth of the tree, but for each iteration, the running time depends heavily on the implementation as far as I know.

What's Monte Carlo Tree Search?

Monte Carlo Tree Search (MCTS) is a tree search algorithm that tries to find the best path down a decision tree, mostly used for game playing. In games with a high branching factor, it can often go deeper than algorithms like Minimax, even with Alpha-Beta pruning, because it only looks into nodes that look promising. For a certain number of iterations (or a certain amount of time), the algorithm goes through these four phases:

Selection Starting at the root node, go down the tree until you find a node that is not fully expanded, or a leaf node. At each level, select the child that looks most promising. There are many ways of making this selection, but most implementations (including mine) use UCB1.

Expansion Expand the selected child by looking at one of its unexplored children.

Simulation Simulate a "rollout" from the newly discovered child. Again, there are many ways to do this, but many implementations just take a random child until it reaches an end node. Take note of whether the simulation resulted in a success or a failure (win or loss in the case of a game).

Backpropagation Update the information about this node, and all nodes along the path back to the root. Most implementations store two values at each node: the number of times it was part of a selection path that led to a success, and the total number of times it was part of a selection path. These values are used during the selection phase.

After you've run this for as long as you can, the child node of the root that looks most promising is the action to take next.
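The selection rule mentioned above (UCB1) is compact enough to sketch; the exploration constant c = √2 is the textbook default, not something MCTS mandates:

```python
import math

def ucb1(wins, visits, parent_visits, c=math.sqrt(2)):
    """UCB1 score of a child: exploitation term + exploration bonus.

    Unvisited children get +infinity so they are always tried first.
    """
    if visits == 0:
        return math.inf
    return wins / visits + c * math.sqrt(math.log(parent_visits) / visits)

# During selection, pick the child maximizing ucb1(...).
children = [(7, 10), (0, 0), (3, 4)]          # (wins, visits) per child
parent_visits = sum(v for _, v in children)
best = max(range(len(children)),
           key=lambda i: ucb1(*children[i], parent_visits))
assert best == 1   # the unvisited child is selected first
```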

by bigblind at May 01, 2016 12:17 PM

Are there any implementations of the spineless tagless G-machine other than in GHC?

From Simon Peyton Jones (recent Royal Society member) we read the paper: Implementing lazy functional languages on stock hardware: the Spineless Tagless G-Machine.

Now this paper is part of how they made Haskell a lazy language when they were implementing it, and solved some problems they had at the time.

The only comparable paper seems to be: Compiling Lazy Functional Programs Based on the Spineless Tagless G-machine for the Java Virtual Machine but there doesn't appear to be an implementation available.

One tangentially related is: Compiling Haskell to Java. However in this approach they leave the implementation of the Spineless Tagless G-Machine in GHC and just read the output.

My question is: Are there any implementations of the spineless tagless G-machine other than in GHC?

by hawkeye at May 01, 2016 12:11 PM


Machine Learning encoding the name of people

I am going to predict the box office of a movie using sklearn. The question is how to encode the names of actors, screenwriters, and directors to fit the model?
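One common option (a sketch, not the only way) is a multi-hot encoding: build an index over every name seen in training, then represent each movie's cast and crew as a binary vector any sklearn estimator can consume. sklearn's CountVectorizer or one-hot encoders can do the same thing; the pure-Python version below just shows the idea.

```python
def build_index(movies):
    """Map every name appearing in the training data to a column index."""
    names = sorted({name for cast in movies for name in cast})
    return {name: i for i, name in enumerate(names)}

def multi_hot(cast, index):
    """Binary feature vector: 1 where the movie involves that person."""
    vec = [0] * len(index)
    for name in cast:
        if name in index:          # names unseen in training are dropped
            vec[index[name]] = 1
    return vec

movies = [["Actor A", "Director X"], ["Actor B", "Director X"]]
idx = build_index(movies)
assert multi_hot(["Actor A", "Director X"], idx) == [1, 0, 1]
```

With tens of thousands of names this gets wide and sparse; the hashing trick (sklearn's FeatureHasher) is a standard way to cap the dimensionality.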

by KengoTokukawa at May 01, 2016 12:04 PM

Oleg Kiselyov

Planet Emacsen

Grant Rettke: Emacs Keyboard Design 31 and 32: Give Emacs More Logical Modifiers

  • Impossible to design and fabricate a custom keyboard for a reasonable price in time and money
  • Use a XKE-128 instead

  • Rubber dome switches and caps
    • Disassembled a Dell keyboard
    • Found it had rubber dome switches (obviously, spongy)
    • Good to see and know
    • Been using them for years, and they were fine
    • Mechanical switch probably isn’t required by me
  • N-Key rollover
    • You could quickly hit
      • Control, Meta, Super, Hyper, Shift, j
    • If you designed the keyboard out to make it easy
    • 6 NKRO is probably fine
  • You must choose a keycap style
    • DSA makes it easy to try different layouts so use that
    • Cost is a big topic
    • Maybe a grab bag is a good option?
  • If you want a lot of rows and columns then you need a microcontroller with a lot of connections like the Teensy++
  • You must choose available key sizes
    • The design tools let you make a keycap any size which helps exploration
    • At build time you need fabricated keycaps in that size
    • Easier to use pre-made caps
    • 3d printing caps is another option, but I don’t want to do that
  • Reality is that doing a custom build
    • Will require 3x iterations
    • Will cost 3x as much
    • Reviewed the Ergodox EZ and it’s not for me
      • Thoughtful ideas about OS-Hyper key
  • Might be best to use the XKE-128 instead
    • Zero fabrication costs
    • Well-built body
    • Rubber-dome is OK
    • Cherry MX compatible stems
    • No way I could build it for less
      • Hobby-ish
  • Converting to XKE-128 follows
  • Make power keys 2 wide because they are available from PI or SP
  • Left align QAZ
  • Make PgUp PgDn 1w
  • Moved arrows to bottom right
  • Added CapsLock back under right super
  • Added Ultra* so had to move PgUp to each side of Enter row by shift
  • Added a space down the middle to occupy 8×16


  • Decided that it would be nice to have a space and return that went from C to M so expanded that
  • Move Alt and Gui up to middle because
    • They are important modifiers
      • Alt-Tab is always two-handed, that is OK
    • Their importance doesn’t overlap with Emacs modifiers so you use them in a cognitively different place
  • PgUp PgDn go all the way left
  • Didn’t add back ScrollLk and Break, can add later if needed
  • Swap Super and Shift
    • Muscle memory makes Shift happier as expected location
    • Makes super-shift easy negating opportunity for Super*
  • Every Emacs modifier with * appended includes shift
    • Wherever it isn’t easy to do by hand, and free keys
  • Add Hyper* to left of hyper making it one key
    • This placement of hyper makes sense if you recall the feel of the layout of a typical laptop keyboard after you made CapsLock super. Using your thumb to go to C, M, super with your pinky, and H with your thumb again are natural
    • C-s and M-s are natural
    • H-s is even natural and H*-s is doable
  • Ultra shift is easy now, so U* can go away
  • Added Xtrm key for Emacs
    • C-M-s
    • Ultra below it
  • You can go “all out” with Emacs modifiers if you like
  • H* still makes sense


by Grant at May 01, 2016 11:52 AM



How to fit model implied forward curve with market forward curve for Ornstein-Uhlenbeck?

I have a spread option model of 2 correlated Ornstein-Uhlenbeck commodity prices that I estimate the parameters of with Maximum Likelihood. What is the formula for introducing the additional requirement that the (spot market) model implied forward curves are equal to the observed forward curves?
Thank you!

by LenaH at May 01, 2016 11:40 AM


How is MSE defined for Image comparison?

I am building a convolution autoencoder that uses MSE as its error function. How is MSE defined for images? If the image is represented in simple matrix form, is MSE simply the mean of the squared differences of individual elements? Or is it the square of the determinant of the difference of the matrices?
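For images, MSE is conventionally the former in spirit: the mean over pixels of the squared elementwise difference, with no determinants involved. A minimal sketch on plain nested lists:

```python
def mse(a, b):
    """Mean squared error between two equal-sized 2-D matrices (images)."""
    diffs = [(x - y) ** 2
             for row_a, row_b in zip(a, b)
             for x, y in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)

img1 = [[0, 2], [4, 6]]
img2 = [[1, 2], [4, 8]]
assert mse(img1, img2) == (1 + 0 + 0 + 4) / 4   # 1.25
```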

by goluhaque at May 01, 2016 11:33 AM

Maximum Depth for a Random Tree

I'm trying to find the best classifier for a data set in Weka and I'm studying different maximum depths for the Random Tree algorithm. But I don't understand the results I get: with a maximumDepth between 3 and 10 I get a far better accuracy rate than with maximumDepth>10. Can anyone help me figure out why? Shouldn't deeper trees give better accuracy?

by vandermies at May 01, 2016 11:29 AM

Haskell Currying

For the past two hours I have been reading about currying in Haskell, and all the resources present how functions with multiple parameters actually return other functions, but not what their definitions look like, which is what this question is about.

Let us define the function:

myFunc :: (Num a) => a -> a -> a
myFunc x y = x * 2 + x * y

:t (myFunc 2) prints Num a => a -> a, i.e. a function that takes a number and also outputs a number. However, what does the definition of the function returned by (myFunc 2) look like? Does the compiler substitute x in the definition and the new function becomes something like myFunc' y = 2 * 2 + 2 * y?

How does recursion handle currying? If I define the function

replicate' :: (Integral i, Ord i) => i -> a -> [a]
replicate' n x
    | n <= 0    = []
    | otherwise = x : replicate' (n - 1) x

, what is the function returned by (replicate' 3) in the context (replicate' 3) 'a'?
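As a rough analogy in Python (this is an illustration of partial application, not how GHC actually evaluates Haskell): the returned function is a closure that captures the supplied argument; no new definition with substituted text is generated.

```python
def my_func(x):
    # my_func(2) returns the inner function with x = 2 captured,
    # analogous to (myFunc 2) :: Num a => a -> a
    def inner(y):
        return x * 2 + x * y
    return inner

g = my_func(2)
assert g(3) == 2 * 2 + 2 * 3   # 10

def replicate_(n):
    # (replicate' 3) corresponds to a closure still waiting for x
    def inner(x):
        return [] if n <= 0 else [x] + replicate_(n - 1)(x)
    return inner

assert replicate_(3)("a") == ["a", "a", "a"]
```

The recursion poses no special problem: each call replicate_(n - 1) simply produces a fresh closure that is immediately applied to x.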

by Razvan Meriniuc at May 01, 2016 11:28 AM



Intermediate problems between L and NL

It is well-known that directed st-connectivity is $NL$-complete. Reingold's breakthrough result showed that undirected st-connectivity is in $L$. Planar directed st-connectivity is known to be in $UL \cap coUL$. Cho and Huynh defined a parametrized knapsack problem and exhibited a hierarchy of problems between $L$ and $NL$.

I am looking for more problems that are intermediate between $L$ and $NL$ i.e., problems that are :

  • known to be in $NL$ but not known (or unlikely) to be $NL$-complete and
  • known to be $L$-hard but not known to be in $L$.

by Shiva Kintali at May 01, 2016 10:57 AM


What is the average-case running time of Fun-sort?

I read this paper (you can check the PDF online for free) and translated section 4's Fun-sort algorithm (correct me if I'm wrong):

    A[n+1] = ∞                  // |A| = n; sentinel
    // binary search for x in A[l..h] (A not necessarily sorted):
    while (h != l+1)
        if (x <= A[m]) ...      // move h (or l) to the midpoint m
    return h                    // success if A[h] = x

    // Fun-sort: repeat the search for every element
    for i = 1 to n:
        while (!success):       // binary-search for A[i]
            if (A[h] != A[i]):
                if (i < (h-1)): ...   // swap steps elided in this excerpt
                elif (i > h): ...

As you can see, this sorting algorithm uses a binary search on a not necessarily sorted array. I realized that to get the average-case running time for Fun-sort, I needed the average number of iterations for its while cycle which I defined to be a random variable X. X would be geometrically distributed, so E[X]=1/p, where p is the probability of success of the binary-search procedure. I am stuck trying to get this probability of success.

So... any help would be very much appreciated =)

EDIT: I've been considering that binary search will definitely fail when i < m and A[i] <= A[m]. In fact, every time the elements being compared in the binary search form an inversion, it will fail (thanks to Jorge M.). In other words, binary search succeeds if (x, A[m]) is not an inversion in each of the log n iterations. Now: what is the probability of (x, A[m]) being an inversion, in general?

EDIT2: I posted this on stackoverflow before I realized cs.stackexchange was a better option.

by Tom at May 01, 2016 10:49 AM

Is this decidable language a subset of this undecidable language?

I think I understand the theoretical definition of decidable and undecidable languages but I am struggling with their examples.

A(DFA) = {(M, w): M is a deterministic finite automaton that accepts the string w}

A(TM) = {(M, w): M is a turing machine that accepts the string w}

I know that A(DFA) is decidable and A(TM) is not. But, is A(DFA) a subset of A(TM)?

by user6268553 at May 01, 2016 10:45 AM



Reduce set partition search to decision?

I'm a little lost and don't know how to approach this problem.

Show that the partition search problem can be poly-time reduced to the partition decision problem. The partition decision problem takes an input set of numbers and returns true if there is a subset of the initial set that sums to half the total sum of the initial set.

With problems like ham-path search, clique search and SAT search, the key was to build the solution one piece at a time using the results from the decision "oracle", but I don't know how to apply that approach to this problem.
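The oracle pattern mentioned for SAT can be sketched explicitly (an illustration of the general technique, not the partition reduction being asked for; sat_decide stands in for the black-box decision oracle): fix variables one at a time, keeping each assignment only if the oracle still says "satisfiable".

```python
from itertools import product

def sat_decide(clauses, fixed):
    """Toy decision oracle: can `fixed` be extended to satisfy all clauses?

    Brute force here; in the reduction it would be a black box.
    Clauses are lists of nonzero ints; -v means "variable v negated".
    """
    free = sorted({abs(l) for c in clauses for l in c} - set(fixed))
    for bits in product([False, True], repeat=len(free)):
        assign = {**fixed, **dict(zip(free, bits))}
        if all(any(assign[abs(l)] == (l > 0) for l in c) for c in clauses):
            return True
    return False

def sat_search(clauses):
    """Build a satisfying assignment one variable at a time via the oracle."""
    if not sat_decide(clauses, {}):
        return None
    fixed = {}
    for v in sorted({abs(l) for c in clauses for l in c}):
        fixed[v] = True
        if not sat_decide(clauses, fixed):
            fixed[v] = False   # fixing True failed, so False must still work
    return fixed

# (x1 or not x2) and (x2 or x3)
assert sat_search([[1, -2], [2, 3]]) == {1: True, 2: True, 3: True}
```

The difficulty with partition is exactly that there is no obvious "fix one piece and re-ask" move that keeps the instance a partition instance, which is what the merging ideas in the question are trying to find.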

Initially, I thought about removing elements from the set while verifying if there is a partition in the remaining set, which led me nowhere. Now I'm wondering if adding elements to the initial set would have any results. I noticed if the initial set has a partition, adding elements to the set would then only have a partition if the added element is even, but I don't see how this can generate a subset of the original set that satisfies partition search. Am I going off the wrong track? Any pointers would be appreciated.

by Tl93 at May 01, 2016 10:37 AM

What Measure of Disorder to use when Analysing Quicksort

I'm trying to understand why quicksort using Lomuto partition and a fixed pivot is performing erratically, but overall poorly, on randomly generated inputs. I'm thinking that even though the inputs are randomly generated, there may be a lot of order to the sequences, but I'm not sure how to measure the level of disorder in the sequences. I thought about using the number of inversions, but I saw from this other question I asked that that's not really a good measure in this case.

The reason I suspect that my random sequences have a lot of "order" to them is that randomizing the pivot fixes the performance problem. But theoretically there shouldn't be any performance problem on these supposedly "random" input sequences.
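One way to quantify the erratic behaviour directly is to instrument the comparisons rather than measure disorder (a sketch: Lomuto partition with a fixed last-element pivot): on an already sorted sequence the count degenerates to n(n−1)/2, while typical random inputs stay near n log n.

```python
def quicksort_comparisons(a):
    """Lomuto-partition quicksort with a fixed (last element) pivot.

    Returns the number of element comparisons performed.
    """
    a = list(a)
    count = 0

    def sort(lo, hi):
        nonlocal count
        if lo >= hi:
            return
        pivot = a[hi]
        i = lo - 1
        for j in range(lo, hi):
            count += 1
            if a[j] <= pivot:
                i += 1
                a[i], a[j] = a[j], a[i]
        a[i + 1], a[hi] = a[hi], a[i + 1]
        sort(lo, i)
        sort(i + 2, hi)

    sort(0, len(a) - 1)
    return count

n = 100
assert quicksort_comparisons(range(n)) == n * (n - 1) // 2  # sorted = worst case
assert quicksort_comparisons([3, 1, 4, 1, 5, 9, 2, 6]) < 28
```

Plotting this count against input size for your "random" sequences would show directly whether they behave like the worst case, independent of any particular disorder measure.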

by Robert S. Barnes at May 01, 2016 10:37 AM


RAID: ZFS or Btrfs?

I've built my own NAS with ArchLinux on an old HDD. I want to add 3×4 TB drives to have real storage capabilities, and I would like to use a RAID 5 setup with these 3 disks.

I've read a lot about ZFS RAID-Z and it's exactly what I want to do. But I've heard about Btrfs, and it seems Btrfs is also able to handle software RAID 5 like ZFS. But I wonder whether Btrfs RAID works as well as ZFS. I also couldn't find complete information on how to create and manage the RAID. So my questions are:

  • Is Btrfs able to handle a software RAID with the same protections as ZFS (no "write hole" error, self-healing, etc.)?
  • Is Btrfs as reliable as ZFS RAID-Z, or is it still an experimental feature?
  • If the answers to my first 2 questions are "yes", where can I find full information about how to set up, repair and clean a Btrfs RAID?

Thanks for your help :)

by user3194042 at May 01, 2016 10:24 AM


How to predict more than one class with random forest in python?

How can I predict more than one class with random forest in python?

I am familiar with getting the probabilities, but how can I know which class each probability belongs to?

by avicohen at May 01, 2016 10:09 AM


Merkel and her henchmen question for the first time the unconditional ...

Merkel and her henchmen are, for the first time, calling the unconditional friendship with Israel into question. Not with Israel per se, that is, but with the Netanyahu junta. Netanyahu seems to have pushed it a bit too far; Merkel feels instrumentalized.

May 01, 2016 10:00 AM


Imagine a red-black tree. Is there always a sequence of insertions and deletions that creates it?

Let's assume the following definition of a red-black tree:

  1. It is a binary search tree.
  2. Each node is colored either red or black. The root is black.
  3. Two nodes connected by an edge cannot be red at the same time.
  4. Here should be a good definition of a NIL leaf, like on wiki. The NIL leaf is colored black.
  5. A path from the root to any NIL leaf contains the same number of black nodes.


Suppose that you have implemented the insert and delete operations for the red-black tree. Now, if you are given a valid red-black tree, is there always a sequence of insert and delete operations that constructs it?


This question is motivated by this question and by the discussion from this question.

Personally, I do believe that if you imagine a valid red-black tree consisting only of black nodes (which implies that you are imagining a perfectly balanced tree), there is a sequence of insert and delete operations that constructs it. However,

  1. I do not know how to accurately prove that
  2. I am also interested in the more general case

by all3fox at May 01, 2016 09:46 AM

Is this special case of a scheduling problem solvable in linear time?

Alice, a student, has a lot of homework over the next weeks. Each item of homework takes her exactly one day. Each item also has a deadline, and a negative impact on her grades (assume a real number, bonus points for only assuming comparability), if she misses the deadline.

Write a function that, given a list of (deadline, grade impact) pairs, figures out a schedule for which homework to do on which day that minimizes the sum of the bad impact on her grades.

All homework has to be done eventually, but if she misses a deadline for an item, it doesn't matter how late she turns it in.

In an alternative formulation:

ACME corp wants to supply water to customers. They all live along one uphill street. ACME has several wells distributed along the street. Each well bears enough water for one customer. Customers bid different amounts of money to be supplied. The water only flows downhill. Maximize the revenue by choosing which customers to supply.

We can sort the deadlines using bucket sort (or just assume we have already sorted by deadline).

We can solve the problem easily with a greedy algorithm, if we sort by descending grade impact first. That solution will be no better than O(n log n).
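The O(n log n) greedy baseline mentioned above can be sketched with the classic heap formulation of unit-time scheduling (numeric impacts are used here for brevity, though only comparisons are actually needed): process items in deadline order, keep the on-time items in a min-heap by impact, and evict the least damaging item whenever a deadline's capacity overflows.

```python
import heapq

def min_missed_impact(items):
    """items: list of (deadline, impact); one item per day, days start at 1.

    Returns the minimum total impact of items that miss their deadline.
    """
    kept = []                      # min-heap of impacts of on-time items
    missed = 0
    for deadline, impact in sorted(items):
        heapq.heappush(kept, impact)
        if len(kept) > deadline:   # only `deadline` slots exist by this point
            missed += heapq.heappop(kept)   # drop the least damaging item
    return missed

# Two items due day 1, one due day 2: the impact-3 item must be late.
assert min_missed_impact([(1, 5), (1, 3), (2, 4)]) == 3
```

Both the sort and the heap operations are the sources of the log factor that the question asks whether one can avoid.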

Inspired by the Median of Medians and randomized linear minimum spanning tree algorithms, I suspect that we can solve my simple scheduling / flow problem in (randomized?) linear time as well.

I am looking for:

  • a (potentially randomized) linear time algorithm
  • or alternatively an argument that linear time is not possible

As a stepping stone:

  • I have already proven that just knowing which items can be done before their deadline, is enough to reconstruct the complete schedule in linear time. (That insight is underlying the second formulation where I am only asking about certificate.)
  • A simple (integral!) linear program can model this problem.
  • Using duality of this program, one can check a candidate proposed solution in linear time for optimality, if one is also given the solution to the dual program. (Both solutions can be represented in a linear number of bits.)

Ideally, I want to solve this problem in a model that only uses comparison between grade impacts, and does not assume numbers there.

I have two approaches to this problem---one based on treaps using deadline and impact, the other QuickSelect-like based on choosing random pivot elements and partitioning the items by impact. Both have worst cases that force O(n log n) or worse performance, but I haven't been able to construct a simple special case that degrades the performance of both.

by Matthias at May 01, 2016 09:32 AM


Implementing Mean Squared Error for matrices in TensorFlow

I am making a convolution autoencoder for images, and want to use MSE as the loss. Does a built-in implementation of MSE for matrices exist?

by goluhaque at May 01, 2016 09:24 AM


Deletion in B+ Tree

The B+ tree deletion algorithm given in CLRS is as follows:

  1. If the key $k$ is in node $x$ and $x$ is a leaf, delete the key $k$ from $x$.
  2. If the key $k$ is in node $x$ and $x$ is an internal node, do the following:

    a. If the child $y$ that precedes $k$ in node $x$ has at least $t$ keys, then find the predecessor $k$’ of $k$ in the subtree rooted at $y$. Recursively delete $k$’, and replace $k$ by $k$’ in $x$. (We can find $k$’ and delete it in a single downward pass.)

    b. If $y$ has fewer than $t$ keys, then, symmetrically, examine the child $z$ that follows $k$ in node $x$. If $z$ has at least $t$ keys, then find the successor $k$’ of $k$ in the subtree rooted at $z$. Recursively delete $k$’, and replace $k$ by $k$’ in $x$. (We can find $k$’ and delete it in a single downward pass.)

    c. Otherwise, if both $y$ and $z$ have only $t – 1$ keys, merge $k$ and all of $z$ into $y$, so that $x$ loses both $k$ and the pointer to $z$, and y now contains $2t – 1$ keys. Then free $z$ and recursively delete $k$ from $y$.

  3. If the key $k$ is not present in internal node $x$, determine the root $x.c_i$ of the appropriate subtree that must contain $k$, if $k$ is in the tree at all. If $x.c_i$ has only $t – 1$ keys, execute step 3a or 3b as necessary to guarantee that we descend to a node containing at least $t$ keys. Then finish by recursing on the appropriate child of $x$.

    a. If $x.c_i$ has only $t – 1$ keys but has an immediate sibling with at least $t$ keys, give $x.c_i$ an extra key by moving a key from $x$ down into $x.c_i$, moving a key from $x.c_i$’s immediate left or right sibling up into $x$, and moving the appropriate child pointer from the sibling into $x.c_i$.

    b. If $x.c_i$ and both of $x.c_i$’s immediate siblings have $t – 1$ keys, merge $x.c_i$ with one sibling, which involves moving a key from $x$ down into the new merged node to become the median key for that node.

I have a doubt about step 2.a (or 2.b). It says "recursively delete $k'$", where $k'$ is a predecessor of $k$. As per my understanding, $k'$ should always be in a leaf node, in which case we apply step 1 and simply delete it from the leaf. What does the author mean by the word "recursively", since there will always be only one call to delete the predecessor (or the successor, if you consider step 2.b)?

The example given in CLRS also shows predecessor in the leaf as can be seen below in deletion of $M$:

enter image description here

by Mahesha999 at May 01, 2016 09:08 AM


trading strategy problem - initial capital x buys S over time [0,T] at the constant rate of x/T euros per unit of time

I am looking for clarification to the trading strategy problem where the number of stocks is depending on time.

In the Market with zero safe rate and stock dynamics defined as $$\frac{dS_t}{S_t}=\mu_t dt + \sigma_t dW_t \quad \quad (1)$$ investor with initial capital x buys stock for an interval of time [0,T] at the constant rate of x/T euros per unit of time.

I am calculating the number of shares at time t, first by defining the change rate

$$d \theta_t=\frac{x}{T} \frac{1}{S_t} dt \quad \quad (2)$$

and then getting the function for $\theta$ by integrating (2)

$$\theta_t=\frac{x}{T} \int_0^t \frac{1}{S_s} ds \quad \quad (3)$$

is this approach correct?

Further I want to show that the payoff $V_T$ at T equals $\frac{x}{T} \int_0^T R_{t,T} dt$ where $R_{t,T}$ is the simple return rate between t and T.

Solution manual says that this should be calculated as $$V_T= \int_0^T \theta_t dS_t \quad \quad (4)$$ and integration by parts yields $$V_T= \theta_T S_T - \int_0^T S_t d\theta_t = \int_0^T (S_T-S_t) d\theta_t = \int_0^T (\frac{S_T}{S_t}-1)S_t d\theta_t = \frac{x}{T} \int_0^T R_{t,T} dt \quad (5)$$

The way the payoffs are derived here is unclear to me. My understanding is that the number of shares is different at each time point but only two prices were used here $S_T$ and $S_t$ while integration is with respect to $\theta$. However the price changes over the time interval as well.

Can anybody explain the reasoning used for the payoffs?

by Michal at May 01, 2016 09:07 AM


Good news from the surveillance state: the FBI has ...

Good news from the surveillance state: in 2015 the FBI fired off 48,642 National Security Letters. And the secret FISA Court waved through every single government request for surveillance.

May 01, 2016 09:00 AM


Intent from a sentence

I need to build a system, or use an online service, that can help me find the intent of a sentence, like:

"Where would you like to have coffee?" Here the intent is location.

"When would you like to have a coffee?" Here the intent is time.

"What coffee would you like to have?" Here the intent is the type of coffee.

How can I build such a system? Can someone guide me on this? Is an RNN the right approach, and if so, how would an RNN do it?

by Nipun at May 01, 2016 08:54 AM


EBNF Grammar and Phrase

Grammar G2 is defined by the following productions in EBNF:
P = 1R | 0QR
Q = 1 | Q0
R = 0 | 0Q | R1

(i) Which of these is a sentence in G2: 10, 11, 00, 011?
(ii) Which of these is a sentential form in G2: 0Q1, 1R1, 110, 0QQ0?
(iii) Which of these is a phrase of the non-terminal Q in G2: R1, 1000, 0R1, R00?

by faisal abdulai at May 01, 2016 08:42 AM


Installing Avahi on FreeBSD - Daemon Doesn't Start

I've just installed package avahi-app-0.6.31_5 on a fairly-clean FreeBSD 10.3, but the service isn't starting on its own. I've consulted the documentation and discovered that there isn't any.

Can anyone fill me in on what I've overlooked?

by Darwin von Corax at May 01, 2016 08:42 AM


How to reorder my list of lists?

I have a list like:


and I want to have:


using only functional programming.

by vjmoreno at May 01, 2016 08:40 AM



Using my own dataset for classification

I am building an ANN module to conduct classification in Python. The demo I got imports the ClassificationDataSet module:

    from pybrain.datasets import ClassificationDataSet
    alldata = ClassificationDataSet(2, 1, nb_classes=3)

and I am wondering how I can use my own data. My data is of list type. Is there any processing I need to do?

by jz.Wang at May 01, 2016 07:52 AM


Recursion Type in Grammar Productions

The grammar G0 is defined by the production P = xP | y. Which type of recursion is it:
left, central, right, or indirect recursion?

by faisal abdulai at May 01, 2016 07:51 AM

Planet Theory


TSP problem with a benchmark data

I've got test Travelling Salesman Problem data with known optimal solutions. It's in the form of a set of 2D points; specifically, this is the TSPLIB format. Sources are here and here.

I started a simple test with the "Western Sahara - 29 Cities" instance (wi29) and found that my algorithm rapidly produced a few particular solutions better than the proposed optimum.

I checked one of them manually and didn't find an error. So, I guess, there are three options:

  1. I did a mistake.
  2. Wrong optimum.
  3. Different problems were solved.

1 and 2. My solution tour is:

17>18>19>15>22>23>21>29>28>26>20>25>27> 24>16>14>13>9>7>3>4>8>12>10>11>6>2>1>5

(will list my checking calculations if requested)

Rounded length: 26040.76. Optimal reference value: 27603.

  1. I can't find the exact task descriptions, and especially the rounding policy, for the TSPLIB examples' optima. This is important, because they look rounded or discretized in some manner, but simple rounding of the result doesn't seem to match.
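On the rounding policy: TSPLIB's EUC_2D metric rounds each pairwise distance to the nearest integer before summing, which changes tour lengths relative to exact Euclidean sums (a sketch; whether wi29 is scored with EUC_2D is an assumption worth checking against the instance header):

```python
import math

def tour_length(points, tour, tsplib_rounding=True):
    """Length of a closed tour; `tour` holds 1-based city indices."""
    total = 0
    for a, b in zip(tour, tour[1:] + tour[:1]):
        (x1, y1), (x2, y2) = points[a - 1], points[b - 1]
        d = math.hypot(x1 - x2, y1 - y2)
        total += int(d + 0.5) if tsplib_rounding else d  # TSPLIB nint()
    return total

square = [(0, 0), (0, 1.4), (1.4, 1.4), (1.4, 0)]
assert tour_length(square, [1, 2, 3, 4]) == 4            # each edge rounds to 1
assert abs(tour_length(square, [1, 2, 3, 4], False) - 5.6) < 1e-9
```

Comparing your tour under both policies against the reference value is a quick way to rule option 3 (different problems) in or out.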

by Les at May 01, 2016 07:10 AM




Why CPU needs to be cooled? [on hold]

CPUs are made of silicon, and the conductivity of silicon increases with temperature, so why do CPUs need to be cooled?

by user112349 at May 01, 2016 05:54 AM

Is there a formal CS definition of VCS and file versions?

I don't know whether it was a joke, but I once read what was referred to as a formal definition of a file in a versioning system such as git, hg or svn. It was something like a mathematical object, like a homeomorphism. Was that a joke, or is there really computer science theory about versioning systems and the mathematics of VCSs?

by Programmer 400 at May 01, 2016 05:53 AM

How to improve the accuracy of automatic conflict resolution?

A little background information from my previous question:

When a court reporter strokes 2 different words with the same keys, this creates a conflict. Normally, the reporter will fix the error later, but sometimes there is a way for the court reporting software to fix the error for you. This can be referred to as automatic conflict resolution.

My court reporting software's system for accomplishing this is by recording the parts-of-speech before and after the conflicting words.

So for example, if my two conflicting words are Tallahassee and shake and I type the following sentences it will look something like this:

I eat at Tallahassee / shake all the time. (Prep - Determiner)

I eat a Tallahassee / shake all the time. (Article - Determiner)

At first it will make me choose between the two, but after I choose, it will automatically add the defined part of speech to a database, so that if I type something like this...

I eat in Tallahassee every afternoon. (Prep - Determiner)

my computer should correctly pick "Tallahassee" since I already told it that "Tallahassee" occurs after a Prep and before a Determiner. The rule for this is simply pos word pos
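The pos word pos rule amounts to a lookup table keyed by the surrounding tags (a sketch with the tags from the examples above; disambiguate only when exactly one candidate has been seen in that context):

```python
from collections import defaultdict

# context (prev_pos, next_pos) -> set of conflict words seen there
contexts = defaultdict(set)

def learn(prev_pos, word, next_pos):
    contexts[(prev_pos, next_pos)].add(word)

def resolve(prev_pos, candidates, next_pos):
    """Pick the unique candidate recorded for this context, else None."""
    seen = contexts[(prev_pos, next_pos)] & set(candidates)
    return seen.pop() if len(seen) == 1 else None

learn("Prep", "Tallahassee", "Determiner")      # "at Tallahassee all ..."
learn("Article", "shake", "Determiner")         # "a shake all ..."

assert resolve("Prep", {"Tallahassee", "shake"}, "Determiner") == "Tallahassee"
assert resolve("Verb", {"Tallahassee", "shake"}, "Noun") is None  # unseen context
```

The collisions measured below are exactly the contexts where `seen` ends up containing both candidates.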

I tested the practicality of this conflict-resolution system with 79 random conflicts using parts of ANC's pos-tagged corpus and Excel VBA.

  • As the data shows, only 10 out of the 79 conflicts showed up with 0 collisions total. This means that in the entirety of the corpus, none of these conflicts had conflicting parts-of-speech which had caused an error.

  • 36/79 conflicts showed up with 5 or less collisions.

  • 32/79 had 10 or more collisions

  • Each collision represents 1 guaranteed error of real-time translation given the pos word pos rule (per the parsed text from ANC, which was about 4.5 million words long)

These results aren't very good for the kind of real-time accuracy I hope to achieve. It would be much better if I could get at least 30/79 (as opposed to only 10) to have 0 "guaranteed errors."

How can I improve this system so that I will have fewer real-time translation errors?

My best thought is to change the rule from pos word pos to pos pos word pos in the case of a collision, but that's all I've got. I'm not very experienced on this subject, so I'm not necessarily opposed to the idea of starting over with something fresh.

by bmende at May 01, 2016 05:17 AM


Value train not a member of object NaiveBayes

I am new to Spark and trying to use the MLlib NaiveBayes example from the documentation. I tried to import NaiveBayes, but I get the error below saying it doesn't have a train method. I am not sure how to proceed; if you have any input, it would be helpful.

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.mllib.classification.NaiveBayes

object NaiveBayes {

def main(args: Array[String]){

val conf = new SparkConf().setMaster("local[1]").setAppName("NaiveBayesExample")
val sc = new SparkContext(conf)

val data = sc.textFile("/Users/Desktop/Studies/sample_naive_bayes_data.txt")
val parsedData = { line =>
  val parts = line.split(',')
  LabeledPoint(parts(0).toDouble, Vectors.dense(parts(1).split(' ').map(_.toDouble)))
}

// Split data into training (60%) and test (40%).
val splits = parsedData.randomSplit(Array(0.6, 0.4), seed = 11L)
val training = splits(0)
val test = splits(1)

val model = NaiveBayes.train(training, lambda = 1.0)

val predictionAndLabel = { p => (model.predict(p.features), p.label) }
val accuracy = 1.0 * predictionAndLabel.filter(x => x._1 == x._2).count() / test.count()

println("Accuracy = " + accuracy * 100 + "%")
  }
}



 Error:(26, 28) value train is not a member of object NaiveBayes
    val model = NaiveBayes.train(training, lambda = 1.0)
 Error:(29, 59) value _1 is not a member of Nothing
   val accuracy = 1.0 * predictionAndLabel.filter(x => x._1 == x._2).count() / test.count()

by Astros at May 01, 2016 05:05 AM

What do Train loss, Valid loss, and Train/Val mean in NNs?

I'm currently learning about Convolutional Neural Networks by studying examples like MNIST. During the training of a neural network, I often see output like:

 Epoch  |  Train loss  |  Valid loss  |  Train / Val
    50  |    0.004756  |    0.007043  |     0.675330
   100  |    0.004440  |    0.005321  |     0.834432
   250  |    0.003974  |    0.003928  |     1.011598
   500  |    0.002574  |    0.002347  |     1.096366
  1000  |    0.001861  |    0.001613  |     1.153796
  1500  |    0.001558  |    0.001372  |     1.135849
  2000  |    0.001409  |    0.001230  |     1.144821
  2500  |    0.001295  |    0.001146  |     1.130188
  3000  |    0.001195  |    0.001087  |     1.099271

Besides the epochs, can someone give me an explanation of what exactly each column represents and what the values mean? I see a lot of tutorials on basic CNNs, but I haven't run into one that explains this in detail.

by sebaqu at May 01, 2016 04:56 AM

Wrong Output for prediction using SVR

What could be the possible reason for SVR predicting the wrong output on an untrained (unseen) dataset even though it predicts the right answers on the trained dataset?

I have implemented grid search for best C and Gamma

svr_rbf = GridSearchCV(SVR(kernel='rbf', gamma=0.1), cv=5,
                   param_grid={"C": [1e0, 1e1, 1e2, 1e3],
                               "gamma": np.logspace(-2, 2, 5)})
y_rbf =, train_trans_time).predict(test_data)

Do I have to train on more data, or is there some other issue?

by etherjain at May 01, 2016 04:43 AM


Should Microsoft have banned installing Windows on Macs back when Macs first used Intel processors?

It seems that Macs had very little adoption until they started using Intel processors and could therefore run Windows. Many consumers were initially scared of getting something with an incompatible OS, but now have the security of knowing they can always install their favorite OS. But after trying out OS X, they realized that it wasn't so bad and never bothered to install Windows, thereby permanently depriving Microsoft of a loyal customer. So what claiming "it runs Windows!" really does is let the purchaser try out OS X.

So my logic is: by banning Windows from Macs, although Microsoft would lose a bit of revenue, they would have prevented being phased out by Apple in the high end computer market.

Would the plan have worked? Or did most prospective Mac buyers not care about running Windows anyway?

(Of course, this is ignoring the fact that this would be a very underhanded and anti-competitive move lol)

by genealogyxie at May 01, 2016 04:11 AM


testing for the equality of proportions between groups and categories

I have a large sample of males and females grouped into 6 categories of different sizes. I would like to know whether the proportion of females is the same across all 6 categories. Could you please tell me what statistical test I should use?

Thank you!

by Dan at May 01, 2016 03:24 AM


What is an addressable cell size?

This question started with a quiz question from my university: Consider a big-endian computer system with an addressable cell size of one byte. The values in memory cells 372 to 375 are shown in the table below. What 16-bit two's complement value (expressed as a decimal number) is stored at address 374?

Address    Value
372        0xC5
373        0x5E
374        0x7F
375        0x23

One thing I'm not clear about is what exactly does "addressable cell size of one byte" mean?
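"Addressable cell size of one byte" means that every address names exactly one byte, so a 16-bit value occupies two consecutive addresses, with the most significant byte at the lower address on a big-endian machine. A sketch of the decoding (illustrative code, not part of the question):

```python
import struct

# Memory contents from the question's table (one byte per address).
memory = {372: 0xC5, 373: 0x5E, 374: 0x7F, 375: 0x23}

def read_int16_be(mem, addr):
    raw = bytes([mem[addr], mem[addr + 1]])   # MSB at the lower address
    return struct.unpack('>h', raw)[0]        # '>h' = big-endian signed 16-bit

value_374 = read_int16_be(memory, 374)  # bytes 0x7F, 0x23 -> 0x7F23
value_372 = read_int16_be(memory, 372)  # bytes 0xC5, 0x5E -> negative value
```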

by Jeevan at May 01, 2016 03:17 AM

Lambda the Ultimate Forum

PL's hotness challenge

Blog post on HN. Only the intro is related to PL:

I’m trying to get into neural networks. There have been a couple big breakthroughs in the field in recent years and suddenly my side project of messing around with programming languages seemed short sighted. It almost seems like we’ll have real AI soon and I want to be working on that. While making my first couple steps into the field it’s hard to keep that enthusiasm. A lot of the field is still kinda handwavy where when you want to find out why something is used the way it’s used, the only answer you can get is “because it works like this and it doesn’t work if we change it.”

Putting my ear to the ground, competition from ML has become more and more common not just in PL, but also in systems, I know many researchers who are re-purposing their skills right now while it has become more difficult to find grad students/interns.

So, is this something we should be worried about? What could happen to make PL more competitive in terms of exciting results/mindshare?

May 01, 2016 03:08 AM


Planet Theory

The shape of the Kresge Auditorium

The image below is a study of the geometry of MIT's Kresge Auditorium.

I found an article by Ivars Petersen claiming that this building's floor plan is "close to the geometry of a Reuleaux triangle" and I wanted to determine whether that was true. Other sources such as a 50-year retrospective published by MIT state that the roof of the building has the shape of an eighth of a sphere (a spherical right equilateral triangle); see this link on making a 3d model of the shape for an amusingly-captioned visualization of its construction.

So, the floor plan is the projection of an eighth-sphere; what is this shape? The edges of the roof are great circle arcs in 3d, so they project to ellipses in 2d. By my calculation, the aspect ratio of these ellipses is √3:1. To see this, let the sphere be the unit sphere in 3d, with the three corners of the roof at (1,0,0), (0,1,0), and (0,0,1), and project it onto the plane x+y+z=0. Then the semimajor axis of the ellipse is the radius of the sphere, 1, while the semiminor axis is the distance from the origin of the projected midpoint of an arc. The midpoint is √2(1/2,1/2,0), its projection is √2(1/6,1/6,-1/3), and the distance is 1/√3. So I drew three ellipses with that aspect ratio, rotated by a third of a circle around their common center, and to complete the illusion of being three-dimensional (though really it's just a 2d drawing) I added another circle, with radius equal to the semimajor axis of the ellipses. Those are the grey and black parts of the figure. The shape of the auditorium floor plan is the central triangle outlined by black arcs.
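The aspect-ratio computation above can be checked numerically (a small sketch, following the same setup: unit sphere, corners on the axes, projection onto x+y+z=0):

```python
import numpy as np

# Project the midpoint of the great-circle arc between two roof corners onto
# the plane x + y + z = 0 and confirm its distance from the origin is 1/sqrt(3).
corner1 = np.array([1.0, 0.0, 0.0])
corner2 = np.array([0.0, 1.0, 0.0])
mid = corner1 + corner2
mid /= np.linalg.norm(mid)                 # arc midpoint, on the unit sphere
normal = np.ones(3) / np.sqrt(3.0)         # unit normal of the projection plane
proj = mid - (mid @ normal) * normal       # orthogonal projection onto the plane
semiminor = np.linalg.norm(proj)           # distance of the projected midpoint
semimajor = 1.0                            # the sphere radius projects to 1
aspect_ratio = semimajor / semiminor       # should come out as sqrt(3)
```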

The red circles in the drawing are centered at the corners of this triangle, and pass through the other two corners. Their intersection forms a Reuleaux triangle, overlaid on the other curved triangle formed by the projected roof. As you can see, the floor plan is not actually a Reuleaux triangle. It differs from Reuleaux in two significant ways: It has slightly less area, and it has elliptical arcs for sides (with variable curvature, bendier near the corners and flatter near the centers of each side) rather than circular arcs. On the other hand, as Petersen stated, it is very close.

So, to state the obvious, not all curvy triangles are alike! Another example of this same phenomenon is given by the rotor of the Wankel rotary engine: also a curved triangle with sharper angles than the Reuleaux, but with another kind of curve for its sides (the envelope of an epitrochoid). I'm pretty sure this envelope is not an ellipse, even though I don't know how to draw it. And the angles are definitely different. So the Wankel would be yet another kind of curved equilateral triangle that differs from the first two.

May 01, 2016 02:58 AM


Greedy algorithm for submodular optimzation

In these notes, 4.2.1 exercise 1, the following argument works if $f$ takes values in the integers, but I don't know how to deal with it if $f$ can take values in the reals.

Problem: Given a monotone submodular function $f$ (whose value would be computed by an oracle) on N = {1, 2, . . . , m}, find the smallest set S ⊆ N such that f(S) = f(N).

A greedy algorithm for this problem is as follows:

  1. $S \leftarrow\emptyset$
  2. while $f(S) \neq f(N)$ {
  3. ____find $i$ to maximize $f(S+i)-f(S)$
  4. ____set $S \rightarrow S\cup \{i\}$
  5. }
  6. return $S$

Question: Show that this is a $1+\ln f(N)$ approximation algorithm.

An argument is as follows: If $O$ is an optimal set, we can show that for the $i$ chosen in line 3 of the algorithm,

$$ f(S+i) - f(S) \geq \frac{f(N)-f(S)}{|O\setminus S|} \geq \frac{f(N)-f(S)}{|O|} $$
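One way to see the first inequality (a sketch using monotonicity and submodularity of $f$): since $f$ is monotone and $f(O)=f(N)$, we have $f(N) \leq f(S\cup O)$, and submodularity bounds the joint gain of adding $O\setminus S$ by the sum of the individual gains:

$$ f(N)-f(S) \leq f(S\cup O)-f(S) \leq \sum_{j\in O\setminus S}\big(f(S+j)-f(S)\big) \leq |O\setminus S|\,\max_{j\in O\setminus S}\big(f(S+j)-f(S)\big), $$

and dividing by $|O\setminus S|$ gives the bound for the maximizing $i$ of line 3.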

Now letting $S_k$ denote the set $S$ after the $k$'th iteration (so $S_0=\emptyset$), and $z_k = f(N)-f(S_k)$ i.e., $z_k$ the "amount left" after the k'th iteration (so $z_0=f(N)$), the above inequality implies

$$ z_k \leq z_{k-1} - \frac{z_{k-1}}{|O|} = \left(1-\frac{1}{|O|}\right)z_{k-1} $$

And therefore $$ z_k \leq \left(1-\frac{1}{|O|}\right)^{k}f(N) \leq f(N)\exp(-k/|O|). $$ Setting $k=|O|(1+\ln f(N))$, we have $z_k<1$, and because $f$ takes only integral values, it must be that $z_k=0$.

But the question does not stipulate $f$ take only integral values. How can you deal with an $f$ that takes non-negative reals?

by John Harrison at May 01, 2016 02:40 AM


How can I improve numpy's broadcast

I'm trying to implement k-NN with the Mahalanobis distance in Python with numpy. However, the code below works very slowly when I use broadcasting. Please tell me how I can improve the numpy speed or implement this better.

from __future__ import division
from sklearn.utils import shuffle
from sklearn.metrics import f1_score
from sklearn.datasets import fetch_mldata
from sklearn.cross_validation import train_test_split

import numpy as np
import matplotlib.pyplot as plt

mnist = fetch_mldata('MNIST original')
mnist_X, mnist_y = shuffle(,'int32'))

mnist_X = mnist_X/255.0

train_X, test_X, train_y, test_y = train_test_split(mnist_X, mnist_y, test_size=0.2)

k = 2
def data_gen(n):
    return train_X[train_y == n]
train_X_num = [data_gen(i) for i in range(10)]
inv_cov = [np.linalg.inv(np.cov(train_X_num[i], rowvar=0)+np.eye(784)*0.00001) for i in range(10)]  # Making inverse covariance matrices
d = {}
for i in range(10):
    ivec = train_X_num[i]  # ivec size is (number of 'i' data, 784)
    ivec = ivec - test_X[:, np.newaxis, :]  # This line is very slow and uses huge memory
    iinv_cov = inv_cov[i]
    # Calculate x.T inverse(sigma) x, and keep the k+1 smallest distances
    d[i] = np.sort(np.add.reduce(, iinv_cov)*ivec, axis=2), axis=1)[:, :k+1]
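For reference, a memory-friendlier formulation (a sketch with small stand-in sizes, not the asker's code): expanding $(a-b)^\top M(a-b) = a^\top M a - 2a^\top M b + b^\top M b$ turns the pairwise Mahalanobis computation into plain matrix products, avoiding the large (n_test, n_train, 784) intermediate that the broadcasting above materializes.

```python
import numpy as np

rng = np.random.default_rng(0)
test = rng.normal(size=(5, 4))    # stand-in for test_X (5 points, 4 features)
train = rng.normal(size=(7, 4))   # stand-in for one class's train_X_num[i]
A = rng.normal(size=(4, 4))
M = A @ A.T + 4 * np.eye(4)       # stand-in symmetric positive-definite inv_cov

# (a-b)^T M (a-b) = a^T M a - 2 a^T M b + b^T M b, using only matmuls:
aMa = np.einsum('ij,jk,ik->i', test, M, test)      # shape (5,)
bMb = np.einsum('ij,jk,ik->i', train, M, train)    # shape (7,)
cross = test @ M @ train.T                          # shape (5, 7)
dist2 = aMa[:, None] - 2.0 * cross + bMb[None, :]   # squared Mahalanobis, (5, 7)

# sanity check against the direct broadcast formula on this small case
diff = test[:, None, :] - train[None, :, :]
direct = np.einsum('abj,jk,abk->ab', diff, M, diff)
```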

by Reiji at May 01, 2016 02:25 AM


Construct a grammar for $a^mb^n$ s.t $0<=n<=m<=3n$

So this is a homework question. It goes as follows

Construct a grammar over $\{a, b\}$ whose language $$ L = \{a^mb^n | 0 ≤ n ≤ m ≤ 3n\}$$

The work I have done is:

$$ S \rightarrow aaaSb|aaSb|aSb|\epsilon $$

The intuition behind the solution is that the number of a's is at least the number of b's but at most 3 times the number of b's, so each b can be generated together with 1, 2, or 3 a's: $$ aaaSb \text{ (3 a's to 1 b) or } aaSb \text{ (2 a's to 1 b) or } aSb \text{ (1 a to 1 b)}$$

Is this the solution, or am I missing something?
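One cheap way to gain confidence in the guess (a brute-force sketch, not a proof): every derivation applies $S \rightarrow a^cSb$ with $c \in \{1,2,3\}$ some number of times and then $S \rightarrow \epsilon$, so the derivable words up to a few steps can be enumerated and compared with the membership condition.

```python
from itertools import product

def grammar_words(max_steps):
    # Each derivation step contributes one b and c in {1,2,3} a's, so a word
    # after n steps is a^(sum of choices) b^n.
    words = set()
    for n in range(max_steps + 1):
        for choices in product((1, 2, 3), repeat=n):
            words.add('a' * sum(choices) + 'b' * n)
    return words

# All words with n <= 3 that satisfy 0 <= n <= m <= 3n.
language = {'a' * m + 'b' * n
            for n in range(4) for m in range(10)
            if 0 <= n <= m <= 3 * n}
```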

by Perseus14 at May 01, 2016 02:08 AM




Length of the longest subsequence in a list of integers

I want to get good at functional programming so I set myself some tasks.

I want to determine the length of the longest subsequence in a list of integers, where the next element is incremented by 1.

So the result should be

incsubseq [] ~?= 0,
incsubseq [5] ~?= 1,
incsubseq [1,2,3,5,6] ~?= 3,
incsubseq [5,6,1,2,3] ~?= 3,
incsubseq [5,6,1,4,3] ~?= 2,
incsubseq [6,5,4,3,2,1] ~?= 1

My try was this:

incsubseq :: [Int] -> Int
incsubseq [] = 0
incsubseq [_] = 1
incsubseq (a:b)
          | a == ((head b)-1) = 1 + (incsubseq b)
          | a /= ((head b)-1) = (incsubseq b)

But of course this only works for lists whose longest run comes first, e.g. [1,2,3,42] = 3, but not for lists like [1,2,100,101,102], which should be 3 but is NOT (it's 2)!

I would really, really appreciate your help, since this problem drives me crazy coming from OO programming.

by HaskellDevNoob at May 01, 2016 01:07 AM


Proof with closure properties for regular languages [duplicate]

This question already has an answer here:

Hi, how can I prove, using closure properties of regular languages, that $L = \{w \mid w \in \{a,b\}^* \wedge |w|_b = 2|w|_a \}$ is not regular? Thanks.

by lzamora at May 01, 2016 01:00 AM


Why do we assume quadratic utility in portfolio theory?

In my text (Investments by BKM), the investor's mean-variance utility (given as $U = E[R] - \frac12A\sigma^2$) is stated to be the objective function we wish to maximize. Upon further digging, it seems that this stems from the assumption of quadratic utility functions ($U = aW - bW^2$). This kind of bothers me since I see two unrealistic properties for quadratic utility functions. (1) They exhibit increasing absolute risk aversion, and (2) they achieve a satiation point, beyond which money/return begins to have negative value.

So why do we assume quadratic utility? Are there no other simple, more realistic functional forms for utility that would still lead to a reasonably clean portfolio optimization theory? Or are the issues I cited about the quadratic just negligible in practice?

by Varun P at May 01, 2016 12:41 AM


Appropriate Deep Learning Structure for multi-class classification

I have the following data

         feat_1    feat_2 ... feat_n   label
gene_1   100.33     10.2  ... 90.23    great
gene_2   13.32      87.9  ... 77.18    soso
gene_m   213.32     63.2  ... 12.23    quitegood

The size of M is large (~30K rows), and N is much smaller (~10 columns). My question is: what is an appropriate deep learning structure for learning and testing on data like the above?

At the end of the day, the user will give a vector of genes with expression.

gene_1   989.00
gene_2   77.10
gene_N   100.10

And the system will determine which label applies to each gene, e.g. great, soso, etc.

By structure I mean one of these:

  • Convolutional Neural Network (CNN)
  • Autoencoder
  • Deep Belief Network (DBN)
  • Restricted Boltzman Machine

by neversaint at May 01, 2016 12:32 AM

HN Daily

April 30, 2016


Extrapolating SVI

In his paper Gatheral presents the following parametrization of the implied total variance $w(k,T) = \sigma_{BS}(k,T)^2T$

$$ w(k) = a + b\{\rho (k-m) + \sqrt{(k-m)^2 + \sigma^2} \}.$$

Assume that we only have a few market prices, e.g. 6 or 7, which are close to at-the-money. I wanted to know whether there are any common techniques to extrapolate the implied volatility for strikes that are far out-of-the-money.

by Jonkie at April 30, 2016 11:30 PM


Rotating a set of points around two fixed points

I have a set of points that are used to draw a shape. I want to rotate this shape without moving its start and end points. I tried to illustrate what I want in the image below (original shape on the left, interpolated on the right). Are there any algorithms to do that? I researched Hermite splines and Bézier curves, but I do not think they are applicable to my problem.

[image: original shape (left) and desired interpolated shape (right)]

My goal is to achieve this with only 2D transformations.


by Bünyamin Sarıgül at April 30, 2016 11:19 PM


p2k16 Hackathon Report: espie@ on proot

Our very first p2k16 hackathon report comes from none other than Marc Espie, who writes:

Lots of thanks to Gilles Chehade, Epitech Nantes, and Aymeric Fouchault for the organization. It was top-notch. The only complaint I might have is that the food was so good that I might have eaten too much.

April 30, 2016 11:06 PM


Maximum set of equalities, subject to some inequalities

I have $n$ variables $x_1,\dots,x_n$. I'm given a set $E$ of equalities (each of the form $x_i=x_j$ for some $i,j$) and a set $I$ of inequalities (each of the form $x_i \ne x_j$ for some $i,j$). I want to find a maximum-size subset $E' \subseteq E$ such that $E'$ is compatible with $I$, i.e., such that there is an assignment to the $n$ variables that satisfies every inequality in $I$ and every equality in $E'$.

Is there an efficient algorithm for this?

I can see that the greedy algorithm (try adding equalities to $E'$ as long as doesn't imply the negation of some inequality) doesn't yield an optimal solution. I have no idea what other approaches to try.

Equivalently, the problem can be formulated in graph-theoretic terms. I'm given an undirected graph $G=(V,E)$ and a set $I \subseteq V \times V$. I want to find a maximum-cardinality subset $E' \subseteq E$ of edges, such that when I decompose the graph $G'=(V,E')$ into connected components, $v,w$ are in different connected components for all $(v,w) \in I$.
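The greedy described above can be made concrete with a union-find structure (a sketch; as noted, this greedy is not optimal in general): scan the equalities and add one unless merging its endpoints' components would identify the two sides of some forbidden pair in $I$.

```python
class DSU:
    """Minimal union-find with path halving."""
    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)

def greedy_equalities(n, E, I):
    dsu = DSU(n)
    kept = []
    for (u, v) in E:
        ru, rv = dsu.find(u), dsu.find(v)
        if ru == rv:                 # already implied: free to keep
            kept.append((u, v))
            continue
        # Merging ru and rv would make some (a, b) in I equal exactly when
        # their current components are {ru, rv}.
        unsafe = any({dsu.find(a), dsu.find(b)} == {ru, rv} for (a, b) in I)
        if not unsafe:
            dsu.union(u, v)
            kept.append((u, v))
    return kept

# x0 = x1 and x1 = x2 cannot both hold with x0 != x2: the greedy keeps only one.
kept = greedy_equalities(3, [(0, 1), (1, 2)], [(0, 2)])
```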

by D.W. at April 30, 2016 11:06 PM


Subtyping rules for extension of System $F_\omega$ with subtyping and kind-level variance tracking

I need an extension of System $F_\omega$ with subtyping, and where the variance of type constructors is reflected in their kind. Unfortunately, System $F^\omega_{<:}$, as defined in chapter 31 of Pierce's Types and Programming Languages, doesn't address the latter requirement, so I decided to roll my own.

Here is the list of additions to $F_\omega$ I've made so far:


  • Ground forms: $+$, $-$.

  • Inversion: $+^\dagger = -$ and $-^\dagger = +$.


  • Ground forms: sets of polarities.

  • Inclusion: set inclusion.

  • Inversion: memberwise.


  • Ground forms: $\Omega/V$ and $K \rightarrow K'$.

  • Inclusion:

    • Given $V_1 \subseteq V_2$, we can derive $\Omega/V_1 \subseteq \Omega/V_2$.

    • Given $K_2 \subseteq K_1$ and $K_1' \subseteq K_2'$, we can derive $K_1 \rightarrow K_1' \subseteq K_2 \rightarrow K_2'$.


  • Kinding:

    • $\Gamma \vdash \top, \bot : \Omega/V$.

    • $\Gamma \vdash (\rightarrow) : \Omega/V^\dagger \rightarrow \Omega/V \rightarrow \Omega/V$.

    • The remaining rules are as one would expect.

  • Inclusion:

    • Given $\Gamma \vdash T : \Omega/V$ and $\{+\} \subseteq V$, we can derive $\Gamma \vdash \bot \subseteq T \subseteq \top : \Omega/V$.

    • Given $\Gamma \vdash T : \Omega/V$ and $\{-\} \subseteq V$, we can derive $\Gamma \vdash \top \subseteq T \subseteq \bot : \Omega/V$.

    • Given $\Gamma \vdash T_1, T_2 : \Omega/\{+,-\}$, we can derive $\Gamma \vdash T_1 \subseteq T_2 : \Omega/\{+,-\}$.

    • The remaining rules are as one would expect.

And here I ran out of imagination. Now I have the following questions:

  • Is what I've sketched so far sound? What sanity checks can I use to make sure I'm not doing something wrong? Perhaps something akin to automated testing in computer programming?

  • A type system with polymorphism and subtyping must have bounded quantification. How hard should it be to add?

  • A very important desideratum in type systems with subtyping is that the inhabitants of each kind form a lattice. How hard should it be to make sure that each kind is a lattice?

  • What's the most convenient tool for mechanizing formal systems like the one I sketched? Preferably, I'd like a library or framework that already does the “boring stuff”, like implementing variable substitution and handling contexts.

by Eduardo León at April 30, 2016 11:00 PM


Iterate over parameters without a loop

I need to iterate over two lists of numbers which form the inputs to a function. I'd like to do this in a functional way. Currently I'm doing:

results = []
for i in params_list1:
    for j in params_list2:
        results.append(myfunction(i, j))

where myfunction() returns a number. I'm pretty sure there is a way to multiply params_list1 and params_list2 (maybe using numpy broadcasting?) and map them to myfunction(), but I'm not able to figure it out. Any tips?
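One functional formulation (a sketch; `myfunction` here is a stand-in for the asker's function): enumerate all input pairs with `itertools.product` and map over them.

```python
from itertools import product

def myfunction(i, j):          # stand-in for the real function
    return i * 10 + j

params_list1 = [1, 2]
params_list2 = [3, 4]

# list comprehension over the cartesian product of the two parameter lists
results = [myfunction(i, j) for i, j in product(params_list1, params_list2)]

# the same thing with map instead of a comprehension
results_map = list(map(lambda p: myfunction(*p),
                       product(params_list1, params_list2)))
```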

by ilan man at April 30, 2016 10:46 PM


Circuit Classics

I really appreciate seeing a schematic printed on a circuit board next to its circuit. It reminds me that before Open Hardware, hardware was open. Andrew (bunnie) Huang

I think this is a really important point about expecting a schematic for a circuit you are building or using.


by fcbsd at April 30, 2016 10:45 PM


"Hedging" a put option, question on exercise

I have a question on the following exercise from S. Shreve: Stochastic Calculus for Finance, I:

Exercise 4.2. In Example 4.2.1, we computed the time-zero value of the American put with strike price $5$ to be $1.36$. Consider an agent who borrows $1.36$ at time zero and buys the put. Explain how this agent can generate sufficient funds to pay off his loan (which grows by $25 \%$ each period) by trading in the stock and money markets and optimally exercising the put.

The model from Example 4.2.1 he refers to is the following: \begin{align*} S_0 & = 4 \\ S_1(H) & = 8, S_1(T) = 2 \\ S_2(HH) & = 16, S_2(HT) = S_2(TH) = 4, S_2(TT) = 1 \end{align*} with $r = 1/4$ and risk-neutral probabilities $\hat p = \hat q = 1/2$.

So now my question: I am not sure how to solve this. I guess the agent wants to hedge in some way so that he can always pay off the loan by exercising the option. First, how should I think about it: should he pay back his loan at time step $2$, or earlier if possible by exercising the option, or should he stay liquid until the end? And what does optimal mean, hedging with minimal investment?

Okay, I solved it by considering two scenarios: exercising the option at time step $1$, and at time step $2$. By exercising it at time step one, I found that he has to invest an additional $0.16$, buy $1/2$ of a share, and accordingly borrow $0.16 - 1/2 \cdot 4$ from the bank/money market. Then at the first time step, as the option was exercised and the value of the portfolio equals $(1 + r)1.36$, he could just invest everything risklessly, i.e. readjust his portfolio by not buying any shares of stock and investing $(1+r)1.36$ in the riskless asset; in this way at time step $2$ he could pay $(1+r)^2 1.36$.

In the second scenario, i.e. exercising the option at time step $2$, I found that he has to invest an additional $1.36$ and buy no shares at the initial step, and then readjust his portfolio in the next step so as to buy $1/12$ of a share if the stock goes up and $1.06$ if it goes down. By exercising the option, if it goes up, after paying his debt $(1+r)^2 1.36$ his portfolio has the value $1.3$, meaning he still has money; or $-0.3$ if it goes down, meaning he still has some debt (this point I do not fully understand...).

So can someone help me understand and solve this exercise (if my approach is wrong...)?

by Stefan at April 30, 2016 10:30 PM


How to make a prediction on an image taken from my phone

I have a small neural network trained to predict handwritten digits; it uses 20*20 images as input, so the input layer size is 400. If I take a photo of a number on my mobile, convert it to grayscale, resize it to 20*20, convert it to a matrix, and then reshape it to 1*400, can I make a prediction on that?

by Abhilash V J at April 30, 2016 10:27 PM


Problems with a Black-Scholes modified equation

I haven't really studied much financial mathematics until about 2 months ago so I'm quite new to this stuff, so I'm sorry if this is a trivial question. At the moment I'm trying to work out what the terms of a modified Black-Scholes equation are equal to, so if someone could help me out I'd appreciate it. The equation is as follows:

$u(0,S_{0}) = \mathbb{E}^{\mathbb{Q}_{BS}}(\Phi) + \frac{\lambda}{2}(\mathbb{E}^{\mathbb{Q}_{BS}}((\Phi*)^2) - (\mathbb{E}^{\mathbb{Q}_{BS}}(\Phi*))^2)$

where $\Phi* = s\partial_s\Phi - \Phi$, $\Phi$ is the payoff, and $\mathbb{Q}_{BS}$ is the risk-neutral probability for the BS equation.

My question is, how do I find out what $\Phi*$ is equal to? In particular, what is $\mathbb{E}^{\mathbb{Q}_{BS}}(\Phi*)$ equal to? There is a lot of material on calculating $\mathbb{E}^{\mathbb{Q}_{BS}}(\Phi)$ so I know what that is equal to, but I have no idea how to find what the other term is equal to ($\lambda$ is just an arbitrary value so I don't need to worry about that).

Thanks in advance.

by ThePlowKing at April 30, 2016 10:25 PM



Random Number generator fail

How do I get the program to loop back around from the beginning if the incorrect number is picked? I'm not sure what I'm doing wrong. I tried ifs, do whiles, whiles, and if elses:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace ArrayProblms
{
    class Program
    {
        public static void Main(string[] args)
        {
            Console.WriteLine("Guess a number between 1 and 10: ");
        }

        public static void RandomNumberGenerator()
        {
            Random rand = new Random();
            int userValue = int.Parse(Console.ReadLine());
            int randValue = rand.Next(1, 11);
            int attempts = 0;

            if (userValue == randValue)
            {
                Console.WriteLine("You have guessed correctly!");
            }
            while (userValue != randValue)
            {
                Console.WriteLine("You have guessed incorrectly");
                Console.WriteLine("You have made {0} incorrect guesses", attempts);
            }
        }
    }
}

by D. Kuso at April 30, 2016 10:10 PM



Best way to get intersection of keys of two objects?

I have two object literals like so:

var firstObject = {
    x: 0,
    y: 1,
    z: 2,

    a: 10,
    b: 20,
    e: 30
};

var secondObject = {
    x: 0,
    y: 1,
    z: 2,

    a: 10,
    c: 20,
    d: 30
};

I want to get the intersection of the keys these two object literals have like so:

var intersectionKeys  = ['x', 'y', 'z', 'a']

I can obviously do a loop and see if a key with the same name exists in the other object, but I am wondering if this would be a good case for some functional programming and map/filter/reduce usage? I myself have not done much functional programming, but I have a feeling that there could be a clean and clever solution to this problem.

by Piwwoli at April 30, 2016 09:42 PM

Weka does not have NominalToNumeric Filter [on hold]

In my dataset, there are 3 nominal attributes that I want to convert to numeric for the purpose of the k-means clustering algorithm. In Weka, the only filter I found is NominalToBinary, and when I use it, it creates new attributes corresponding to the number of nominal values. Is that normal? Why is there no NominalToNumeric in Weka?

Thank you.

by Jacki at April 30, 2016 09:38 PM

Sentiment Analysis classifier using Machine Learning

How can we make a working classifier for sentiment analysis, given that we need to train the classifier on huge data sets?

I have a huge data set to train on, but the classifier object (here using Python) gives a memory error when using 3,000 words, and I need to train on more than 100K words.

What I thought of was dividing the huge data set into smaller parts, making a classifier object for each, storing them in pickle files, and using all of them. But it seems using all the classifier objects for testing is not possible, as only one of the objects can be used during testing.

The solution coming to my mind is either to combine all the saved classifier objects stored in the pickle files (which is just not happening) or to keep appending to the same object with each new training set (but again, it is being overwritten and not appended).

I don't know why, but I could not find any solution for this problem, even though it is fundamental to machine learning. Every machine learning project needs to be trained on huge data sets, and the object size for training those data sets will always give a memory error.

So, how to solve this problem? I am open to any solution, but would like to hear what is followed by people who do real time machine learning projects.

Code Snippet :

import nltk
from nltk.corpus import movie_reviews

documents = [(list(movie_reviews.words(fileid)), category)
             for category in movie_reviews.categories()
             for fileid in movie_reviews.fileids(category)]

all_words = []
for w in movie_reviews.words():
    all_words.append(w.lower())
all_words = nltk.FreqDist(all_words)
word_features = list(all_words.keys())[:3000]

def find_features(document):
    words = set(document)
    features = {}
    for w in word_features:
        features[w] = (w in words)
    return features

featuresets = [(find_features(rev), category) for (rev, category) in documents]
numtrain = int(len(documents) * 90 / 100)
training_set = featuresets[:numtrain]
testing_set = featuresets[numtrain:]

classifier = nltk.NaiveBayesClassifier.train(training_set)

PS : I am using the NLTK toolkit using NaiveBayes. My training dataset is being opened and stored in the documents.

by Arqam at April 30, 2016 09:33 PM



Smallest real root of a polynomial in a range

I'm trying to numerically find the smallest real root of a polynomial in a given range. My initial plan was to shift the polynomial so that the bottom of the range is 0, expand the resulting expressions to find the new coefficients, then use the Jenkins-Traub algorithm until it finds the first (smallest) root or increases out of the range. However, the Wiki page says that it only generally finds the roots in increasing order, so it sounds like that's not guaranteed. Is there a way to guarantee finding the first root? If not, what is the most efficient way to solve the problem?

Finding all the roots then sorting is possible, but becomes inefficient as the degree of the polynomial gets larger, and, I hope, unnecessary. The bisection method is another common algorithm, but consider the polynomial $-5x^2+5x-1$ with a range of (0,2), which would cause bisection to fail (the polynomial is negative at both endpoints, so there is no sign change to bracket). The best answer is one that guarantees success with the fastest time for a reasonable-degree polynomial (less than, say, 10).
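For comparison, the find-all-then-filter baseline mentioned above can be sketched as follows (this is not the Jenkins-Traub plan; `numpy.roots` uses companion-matrix eigenvalues, which is fine for degrees below 10):

```python
import numpy as np

def smallest_real_root_in_range(coeffs, lo, hi, imag_tol=1e-9):
    """Smallest (numerically) real root of the polynomial in [lo, hi], or None."""
    roots = np.roots(coeffs)                          # all complex roots
    real = roots[np.abs(roots.imag) < imag_tol].real  # keep the real ones
    in_range = real[(real >= lo) & (real <= hi)]
    return in_range.min() if in_range.size else None

# The post's example: -5x^2 + 5x - 1 on (0, 2). Both roots lie inside the range
# even though the endpoint values share a sign (which is why bisection fails).
r = smallest_real_root_in_range([-5, 5, -1], 0.0, 2.0)
```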

by user3476782 at April 30, 2016 09:04 PM


Interpolation of forward zeros-coupons bonds simulations for missing maturities (ESG data)

I have a set of economic scenarios simulated with Barrie and Hibbert ESG. The stochastic model for interest rates used is Libor Market Model Shifted. I am facing a problem with zeros-coupons prices.

Indeed, for each maturity I have 2,000 simulated trajectories of forward zero-coupon prices (forward in 1 year up to 30 years). I have the maturities (1; 2; 3; 5; 8; 10; 15; 20; 30; 40; 50; 60) for each forward price, but I want maturities 1 to 30 with an annual step.

I cannot regenerate the scenarios: I have to work with these simulations, and I don't have the swaption prices used to calibrate the model. So I have to interpolate the missing maturities across the 2,000 trajectories. Since I have to project in the risk-neutral world, zero-coupon prices are martingales seen from $t = 0$: $E[B(t,T)D_t|\mathcal{F}_0]=B(0,t;T)$ with:

  • $B(t,T)$: the price seen in t of zero-coupon bond with maturity T.
  • $D_t$: the deflator used to calculate the present value of a cash flow at $t$.
  • $\mathcal{F}_0$: the filtration(the information in $t=0$)

So I cannot simply apply cubic interpolation without accounting for the fact that the interpolated price must remain a martingale in expectation (the mean) across trajectories.

What can you propose for interpolating zero-coupon prices at the missing maturities so that the interpolated prices are still martingales?
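For what it's worth, one pragmatic approach (not the only one) is to interpolate log-linearly per trajectory and then apply a single multiplicative factor per interpolated maturity so that the Monte-Carlo mean of the deflated prices matches the initial curve $B(0,T)$, restoring the martingale property in the empirical mean. A sketch in Python (all names and the data layout are hypothetical):

```python
import math

def interpolate_paths(prices, mats, T, target_mean):
    """prices: one dict {maturity: deflated ZC price} per scenario.
    Log-linear interpolation at maturity T, then a single multiplicative
    rescaling so the scenario mean matches target_mean = B(0, T)."""
    lo = max(m for m in mats if m <= T)
    hi = min(m for m in mats if m >= T)
    if lo == hi:
        return [p[T] for p in prices]
    w = (T - lo) / (hi - lo)
    raw = [math.exp((1 - w) * math.log(p[lo]) + w * math.log(p[hi]))
           for p in prices]
    mean = sum(raw) / len(raw)
    return [x * (target_mean / mean) for x in raw]  # martingale correction
```

The same idea works with a cubic spline in place of the log-linear step; the final rescaling is what enforces $E[\cdot] = B(0,T)$ regardless of the interpolator. It does perturb each path slightly, which is the price of the correction.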

by user20554 at April 30, 2016 09:03 PM


If graph isomorphism admits a polynomial-time algorithm

Greetings. I'm studying computing theory and am trying to grasp the concept of complexity classes.

If graph isomorphism (suspected to be NP-intermediate) turns out to have a polynomial-time solution, what possible implications would that have, and what could we deduce?

Thanks for any explanation

by Someguy at April 30, 2016 08:49 PM

Find all shortest paths in a graph where each path has an even number of edges, at least 6

Let $G=(V,E)$ be a directed graph with non-negative weights ($w:E\to\mathbb{R}^+$). Describe an algorithm that finds all shortest paths in the graph from a source vertex $s\in V$ such that each path has an even number of edges and the number of edges is greater than or equal to $6$.

So I know I need to use Dijkstra algorithm on a modified graph. Somehow I need to "count" the number of edges. I think I need to add some vertices for each vertex which will make the "count" but I can't figure it out completely.

I'd be glad for help.
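The "counting" idea can be made concrete as Dijkstra on a product graph: states are $(v, c)$ where $c$ is the edge count, capped so parity survives — $c = 6$ stands for "even and $\ge 6$", $c = 7$ for "odd and $\ge 7$". A sketch in Python (adjacency-dict representation; names are illustrative):

```python
import heapq

def even_ge6_shortest_paths(graph, s):
    """graph: {u: [(v, w), ...]} with non-negative w.
    State (v, c): c is the edge count, capped so that c == 6 means
    'even and >= 6' and c == 7 means 'odd and >= 7'."""
    dist = {}
    pq = [(0, s, 0)]
    while pq:
        d, u, c = heapq.heappop(pq)
        if (u, c) in dist:
            continue
        dist[(u, c)] = d
        nc = c + 1 if c < 7 else 6  # parity-preserving cap
        for v, w in graph.get(u, []):
            if (v, nc) not in dist:
                heapq.heappush(pq, (d + w, v, nc))
    return {u: dist.get((u, 6), float("inf")) for u in graph}
```

This is the standard "add some vertices for each vertex" trick made explicit: each original vertex is split into 8 copies, one per capped count, so the run time stays $O((|V|+|E|)\log|V|)$ up to the constant factor 8.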


by LiorGolan at April 30, 2016 08:44 PM

Show that language generated by grammar is regular

We have a grammar with nonterminals $X_1,\dots,X_n$, terminals $V_t$, and rewriting rules of the form:

$X_i \rightarrow a \in V_t $

$X_i \rightarrow X_jX_k, \ i \ge j , \ i > k $

How can I show that the language generated by this grammar is regular?

I don't think this is a duplicate question, because:

  • I don't have concrete language, but set of nonterminals and set of terminals
  • I have to show that, for every possible combination of terminals and nonterminals, the generated language is regular

If I had been given a particular language, I could prove it by giving a DFA, showing that the rules are only of linear type, etc.

by Rupert Ehringer at April 30, 2016 08:41 PM

How to Modify SAT Solvers to Produce Resolution Refutations for Unsatisfiable Instances?

In recent SAT competitions, there is a Certified UNSAT track. The problem instances are all unsatisfiable, and the solvers are asked to produce certificates of unsatisfiability. One way is to produce a resolution refutation of the set of clauses in a problem instance. How can a SAT solver, say MiniSAT, be modified to do this? Can this be done for every solver? I searched the Internet and found little information. A reference would be fine.

by Zirui Wang at April 30, 2016 08:34 PM

Wes Felter

"I have news for you: this is not the middle of the PS4’s lifespan, and there is no PS5."

- David Galindo

April 30, 2016 08:29 PM


Origin of the concept of types

While studying the state of the art in Type Theory, I have the following questions, all related to each other.

  1. Where did the idea of a type come from? (It seems that it all started when Russell and Whitehead proposed a way to avoid the contradiction that we know today as Russell's Paradox; am I right?)
  2. Before the type concept was considered, was there something similar? (Maybe a refinement of a set, but I can't find a reference other than Russell.)
  3. Who was the first person to put it in formal terms? (Was it Russell with his paper of 1908, or ?

by jonaprieto at April 30, 2016 08:28 PM

Toads and frogs game algorithm

I am looking for an algorithm (or a hint where to start) for the Toads and Frogs game. What I am interested in is not how to solve the problem (it's NP-hard), but how to plan one player's moves, i.e. how to design a computer player (AI) which could win against another player (another program or a human). I have been looking for clues, but with no luck so far; there's not much about it on the Web.

Link to game description on Wikipedia.

And here you can play the game. Please bear in mind, that starting positions may not be that straightforward. They may be mixed up from the very start.
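A standard starting point for a computer player is game-tree search. Under the usual combinatorial rules (a toad steps one square right into an empty square or jumps a single frog into an empty square; frogs mirror leftward; under normal play, a player with no legal move loses), a memoized minimax sketch in Python might look like this — the rules and board encoding here are my assumptions, so adapt them to your variant:

```python
from functools import lru_cache

def moves(board, player):
    """Legal successor boards: 'T' moves right, 'F' moves left; a piece steps
    into an adjacent empty square or jumps one opposing piece into an empty one."""
    out = []
    d = 1 if player == "T" else -1
    b = list(board)
    n = len(b)
    for i, c in enumerate(b):
        if c != player:
            continue
        step, jump = i + d, i + 2 * d
        if 0 <= step < n and b[step] == "-":
            nb = b[:]
            nb[i], nb[step] = "-", c
            out.append("".join(nb))
        elif 0 <= jump < n and b[step] not in ("-", player) and b[jump] == "-":
            nb = b[:]
            nb[i], nb[jump] = "-", c
            out.append("".join(nb))
    return out

@lru_cache(maxsize=None)
def wins(board, player):
    """Normal play: the player with no legal move loses (any([]) is False)."""
    opp = "F" if player == "T" else "T"
    return any(not wins(m, opp) for m in moves(board, player))
```

Exhaustive search is only feasible for small boards, of course; for larger positions the game's combinatorial-game-theory structure (positions decompose into sums of independent components with known values) is the lever that makes a practical AI possible.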

by alex at April 30, 2016 08:26 PM


Volatility of investment (/w currency hedging) [on hold]

I've been trying to compute the volatility of an investment with currency hedging, and I have a question. Let's take this example. We have our money in a fund tracking the S&P 500 index, which has 16% volatility; we also know that the current volatility of the dollar against our currency is 5%. We want to know the volatility of the whole investment.

Can I compute it with the following formula? If so, what is the reason for adding the two deviations instead of multiplying them, considering that the volatility of an index and of a currency are mutually independent (if any correlation exists, it is unknown, at least to my knowledge)?
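For what it's worth, if the log-returns of the index and the FX rate are independent, it is their variances, not their standard deviations, that add; plugging in the numbers above:

$$\sigma_{\text{total}} = \sqrt{\sigma_{S\&P}^2 + \sigma_{FX}^2} = \sqrt{0.16^2 + 0.05^2} \approx 16.8\%$$

which is well below the $16\% + 5\% = 21\%$ obtained by adding deviations directly. (This assumes zero correlation; a nonzero correlation $\rho$ adds a cross term $2\rho\,\sigma_{S\&P}\,\sigma_{FX}$ under the square root.)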


by egikm at April 30, 2016 08:26 PM


Dirty Hands - Cheating in Professional Bridge

There are some interesting examples here of opsec and ad-hoc ciphers.


by tedu at April 30, 2016 08:25 PM


Tensorflow conv2d_transpose output_shape parameter format

I have recently started with TensorFlow and am building a convolutional autoencoder. I wanted to know about the format in which the output_shape parameter (a 1-D Tensor) is required by the tf.nn.conv2d_transpose function. What are the indices of the output image's height, width, and number of channels?

Documentation for this function only states that the output_shape is a 1D tensor.
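For the default NHWC data format, output_shape is ordered [batch, height, width, channels]: height at index 1, width at index 2, channels at index 3. A small dependency-free helper to compute it from the input dimensions (the shape arithmetic reflects the standard SAME/VALID conventions for a transposed convolution; verify against your TensorFlow version):

```python
def deconv_output_shape(batch, in_h, in_w, out_channels, kernel, stride,
                        padding="SAME"):
    """output_shape for tf.nn.conv2d_transpose with the default NHWC layout:
    index 0 = batch, 1 = height, 2 = width, 3 = channels."""
    if padding == "SAME":
        out_h, out_w = in_h * stride, in_w * stride
    else:  # "VALID"
        out_h = (in_h - 1) * stride + kernel
        out_w = (in_w - 1) * stride + kernel
    return [batch, out_h, out_w, out_channels]
```

So upsampling an 8x8x16 feature map with stride 2 and SAME padding needs output_shape = [batch, 16, 16, out_channels]; a mismatch between this list and what the strides/padding imply is a common source of shape errors.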

by goluhaque at April 30, 2016 08:24 PM


How To Correct Someone

Not exactly programming, but there are some universal truths to training.


by tedu at April 30, 2016 08:11 PM


Which languages (apart from the list below) can do Curry-Howard type checking at compile-time?

The Curry-Howard correspondence is enormously powerful. I'd like to use it, but I'd like a choice other than Haskell and Scala.

The languages that I can see that support the Curry-Howard correspondence (with type checking at compile time) are:

  • Agda
  • Coq
  • Haskell
  • Scala
  • Shen
  • Qi II

My question is: which languages (apart from the list above) can do Curry-Howard type checking at compile-time?

by hawkeye at April 30, 2016 08:07 PM

Is function application actually a memory manipulation algorithm?

I thought about how in lambda calculus (and many implementations of functional programming languages) function (lambda) application, and the lambda itself as a construct, are "primitive things", usually implemented somehow by an interpreter. Then I thought: can you "boil down" these two things to more primitive stuff? For instance, we have the following expression (apply is usually implicit in the syntax conventions, but whatever):

(apply (\x.\y.x) (a b))

The interpreter:
1. Constructs a new environment where the arguments are bound to the lambda's parameters, i.e. new_env = this_env.append({"x": a, "y": b, "body": "x"})
2. Rewrites the whole application term with the lambda's body, i.e. new_env["body"]

Given only the environment manipulation "primitives", like "construct", "append", and "get", doesn't that make the whole lambda calculus just a clever trick to hide memory ("tape") mutations? Now, I know that Turing machines and the lambda calculus are equivalent, but is there something more to LC than what I've described? What have I missed?
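The environment-based evaluation described above can indeed be boiled down to just construct/append/get on dictionaries. Here is a minimal closure-based evaluator in Python (the tuple term representation is my own choice):

```python
def evaluate(term, env):
    """Terms: ("var", name) | ("lam", param, body) | ("app", fn, arg)."""
    kind = term[0]
    if kind == "var":
        return env[term[1]]                        # "get" from the environment
    if kind == "lam":
        return ("closure", term[1], term[2], env)  # capture the environment
    if kind == "app":
        fn = evaluate(term[1], env)
        arg = evaluate(term[2], env)
        _, param, body, closed_env = fn
        return evaluate(body, {**closed_env, param: arg})  # "append" a binding
    raise ValueError(kind)
```

Note one subtlety this makes visible: the lambda evaluates to a closure that remembers the environment where it was built, which is what keeps evaluation purely functional — each application builds a new extended environment rather than mutating the old one.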

by artemonster at April 30, 2016 08:02 PM


Weka API giving ArrayIndexOutOfBoundsException?

I am trying to do prediction using the Weka API. I have saved a model from the Weka Explorer; now I want to predict a value for a single instance.

Here is the Java code:

import weka.classifiers.Classifier;
import weka.classifiers.functions.LinearRegression;
import weka.classifiers.Evaluation;
import weka.classifiers.bayes.NaiveBayes;
import weka.core.Attribute;
import weka.core.FastVector;
import weka.core.Instance;
import weka.core.Instances;
import weka.classifiers.*;
import java.util.*;
import weka.classifiers.trees.RandomTree;
import weka.core.DenseInstance;
import weka.core.SerializationHelper;
import weka.core.Utils;

public class topcoder {

    public static void main(String[] args) throws Exception {

        Classifier classifier = (Classifier) SerializationHelper.read("C:\\Users\\abc\\Desktop\\one.model");

        Instance inst_co;

    Attribute attr1 = new Attribute("customer_id");
    Attribute attr2 = new Attribute("month");
    Attribute attr3 = new Attribute("call_exp");
    Attribute attr4 = new Attribute("data_exp");
    Attribute attr5 = new Attribute("sms_exp");
    Attribute attr6 = new Attribute("total_exp");
    Attribute attr7 = new Attribute("real_exp");

        ArrayList<Attribute> attributes = new ArrayList<Attribute>();
        attributes.add(attr1);
        attributes.add(attr2);
        attributes.add(attr3);
        attributes.add(attr4);
        attributes.add(attr5);
        attributes.add(attr6);
        attributes.add(attr7);

        Instances testing = new Instances("test", attributes, 0);
        testing.setClassIndex(testing.numAttributes() - 1); // predict real_exp

        inst_co = new DenseInstance(testing.numAttributes());
        inst_co.setDataset(testing); // the instance needs the dataset header
        inst_co.setValue(attr2, 12);
        inst_co.setValue(attr3, 300);
        inst_co.setValue(attr4, 50);
        inst_co.setValue(attr5, 10);
        inst_co.setValue(attr6, 360);

        double res = classifier.classifyInstance(inst_co);
        System.out.println(res + "  io");
    }
}

one.model is the model I saved from the Weka GUI, and attr7, that is real_exp, is what I am trying to predict.

Here is the stack trace:

java.lang.ArrayIndexOutOfBoundsException: 8
    at weka.core.DenseInstance.value(
    at weka.filters.supervised.attribute.NominalToBinary.convertInstanceNumeric(
    at weka.filters.supervised.attribute.NominalToBinary.convertInstance(
    at weka.filters.supervised.attribute.NominalToBinary.input(
    at weka.classifiers.functions.LinearRegression.classifyInstance(
    at topcoder.main(

I am not sure what it is that I am doing wrong. Please help.

by doctorsherlock at April 30, 2016 08:01 PM

Planet Emacsen

Grant Rettke: Emacs Keyboard Design 30: Smaller Modifiers

  • After 3 years using single-key modifiers on a MBP and HP laptop I can do the same thing here.
  • USB spec defines all F keys so use them and see what happens
  • Get rid of right side of board in the process


  • Start with version 29
  • Add F13 to F24
  • Make bottom 1 wide:
    • Because
      • They are flow interrupting actions
      • It is OK to find home with 1 key
    • Keys
      • Alt
      • Meta
      • Ctrl
      • Gui
      • Ultra
      • Shift
  • Leave middle modifiers alone because they are pinky used; need to be large
  • Make spacebar and enter 4 wide
    • 3 is too small
    • 4 is perfect
  • Move arrows below right hyper
  • Move pgup pgdn below left hyper
    • Make them 1.5w
  • Move home above left arrow and end above right arrow
  • Move caps lock above escape making it 1w
  • Move insert and delete above backspace left
  • This leaves PrSc, ScrollLock, and Pause hanging
    • Replace CapsLock with PrSc and del caps lock
    • Delete Scroll Lock and Pause
    • Make delete and insert wide
  • Made F4 and F7 homing keys if it isn’t obvious

by Grant at April 30, 2016 08:01 PM

Lambda the Ultimate

Simon Peyton Jones elected into the Royal Society Fellowship

Simon Peyton Jones has been elected as a Fellow of the Royal Society. The Royal Society biography reads:

Simon's main research interest is in functional programming languages, their implementation, and their application. He was a key contributor to the design of the now-standard functional language Haskell, and is the lead designer of the widely-used Glasgow Haskell Compiler (GHC). He has written two textbooks about the implementation of functional languages.

More generally, Simon is interested in language design, rich type systems, compiler technology, code generation, runtime systems, virtual machines, and garbage collection. He is particularly motivated by direct use of principled theory to practical language design and implementation -- that is one reason he loves functional programming so much.

Simon is also chair of Computing at School, the grass-roots organisation that was at the epicentre of the 2014 reform of the English computing curriculum.

Congratulations SPJ!

April 30, 2016 07:44 PM


Neural Network for function approximation

I have created a feedforward neural network in MATLAB which is supposed to approximate sine-like functions. I have tested the network for:

  1. Various numbers of neurons in the hidden layer (around 10 produces the best result)
  2. Best transfer function in the hidden layer (logsig gives the best results)
  3. Best training function (trainlm gives the best results)

This network, with the settings specified above, produces really good approximations, but I don't know why:

  1. logsig is better than tansig or hardlim
  2. trainlm is better than the other training functions available in MATLAB.

In general, why are sigmoidal transfer functions better for approximation tasks, and why does the Levenberg-Marquardt backpropagation training function suit the given problem best?
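Not an answer to the "why", but the setup is easy to reproduce outside MATLAB for experimentation. Below is a deliberately tiny network with a logistic ("logsig") hidden layer and a linear output fitting a sampled sine, trained with plain per-sample gradient descent (so far slower than trainlm's Levenberg-Marquardt steps; all hyperparameters are arbitrary):

```python
import math
import random

def train_sine_mlp(hidden=8, epochs=2000, lr=0.01):
    """Fit sin(x) on [0, 2*pi] with one logsig hidden layer and a linear
    output; returns (initial MSE, final MSE)."""
    random.seed(0)
    xs = [i * 2 * math.pi / 19 for i in range(20)]
    ys = [math.sin(x) for x in xs]
    w1 = [random.uniform(-1, 1) for _ in range(hidden)]
    b1 = [random.uniform(-1, 1) for _ in range(hidden)]
    w2 = [random.uniform(-1, 1) for _ in range(hidden)]
    b2 = 0.0

    def forward(x):
        h = [1.0 / (1.0 + math.exp(-(w1[j] * x + b1[j]))) for j in range(hidden)]
        return h, sum(w2[j] * h[j] for j in range(hidden)) + b2

    def mse():
        return sum((forward(x)[1] - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

    initial = mse()
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            h, out = forward(x)
            err = out - y
            for j in range(hidden):
                grad_h = err * w2[j] * h[j] * (1.0 - h[j])  # chain rule via logsig
                w2[j] -= lr * err * h[j]
                w1[j] -= lr * grad_h * x
                b1[j] -= lr * grad_h
            b2 -= lr * err
    return initial, mse()
```

Swapping the activation for tanh or a step function, or comparing first-order steps like these against a Gauss-Newton-style update, is a direct way to probe the two questions above empirically.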

by jjdev at April 30, 2016 07:44 PM



Total number of calls during insertion into binary tree

The problem:

Find a formula for the total number of calls occurring during the insertion of n elements into an initially empty set. Assume that the insertion process fills up the binary search tree level-by-level. Leave your answer in the form of a sum.

code for INSERT function:

procedure INSERT(x: elementtype; var A: SET);
begin
    if A = nil then begin
        new(A);                      { allocate the node }
        A^.element := x;
        A^.leftchild := nil;
        A^.rightchild := nil
    end
    else if x < A^.element then
        INSERT(x, A^.leftchild)
    else if x > A^.element then
        INSERT(x, A^.rightchild)
end;

The main confusion for me here is with leaving my answer in the form of a sum. I'm not all that great at sums (haven't taken Calc 2 yet), so I don't really know how to set them up or extract information from them all that well. Any help here would be greatly appreciated.

For clarity: This is a review problem where the answer is:

Let $2^k \leq n < 2^{k+1}$. Then $k = \lfloor \log_2 n \rfloor$ and the number of calls equals,

$$ \sum_{i = 0}^{k - 1} (i + 1)2^i + (k + 1)(n - 2^k + 1) $$

I'd like to know the process behind getting this answer. Thank you.
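The reasoning behind the sum: inserting a node that ends up at depth $d$ triggers $d+1$ calls to INSERT (one per node on the path, plus the final nil). Level-by-level filling puts $2^i$ nodes at each full depth $i = 0, \dots, k-1$ (that accounts for $2^k - 1$ elements), and the remaining $n - 2^k + 1$ elements land at depth $k$, each costing $k+1$ calls; adding the two pieces gives exactly the stated sum. A quick cross-check in Python against direct counting:

```python
def calls_closed_form(n):
    """Sum over the full levels 0..k-1 plus the partially filled level k."""
    k = n.bit_length() - 1          # k = floor(log2 n)
    full = sum((i + 1) * 2 ** i for i in range(k))
    return full + (k + 1) * (n - 2 ** k + 1)

def calls_direct(n):
    """Node j (1-indexed, level order) sits at depth floor(log2 j), and
    inserting it costs depth + 1 calls, which equals j.bit_length()."""
    return sum(j.bit_length() for j in range(1, n + 1))
```

For example, n = 10 gives 1 + 2*2 + 4*3 for the full levels plus 3*4 for the three depth-3 nodes, i.e. 29 calls, and both functions agree on it.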

by Matthew Freihofer at April 30, 2016 07:26 PM

complexity of incrementally creating a graph [duplicate]

This question already has an answer here:

What is the complexity of the two functions f() and g()? They both build a graph G(V, E) incrementally. I think that they are both O(|V|)?

f(n):

  1. Start with an initial graph K3 (triangle);
  2. for (i = 1, n-3) { add a vertex v to V and connect it to two randomly chosen vertices from V; }.


g(n):

  1. Start with an initial graph G = K3 (triangle);
  2. While the number of nodes of the graph G is < n: {Replace an edge (e1, e2) randomly selected from G by a graph f(m); m <= n - |V|}.

        /* The replacement is performed as follows: choose a random edge (e1, e2) from G; choose two nodes {e'1, e'2} from a graph H(V', E') created by f(m); V = V union V' minus {e1, e2}, E = E union E' minus {(e1, e2)}. */


by dzakmor at April 30, 2016 07:11 PM

Planet Emacsen

Irreal: What to do When Emacs Hangs or Crashes

Jisang Yoo has a very nice post on recovering from Emacs hangs or crashes. He considers three topics

  1. What to do when Emacs hangs
  2. How to enable debugging
  3. What to do when Emacs crashes

His advice on Emacs hanging seems mostly aimed at Windows users and doesn't mention my preferred method of

pkill -SIGUSR2 Emacs

That will usually unstick Emacs enough that you can save your files and quit. Sometimes you can even keep going but I've found it's generally better to save your files and restart Emacs. If you aren't on a Mac, you will want to use

pkill -SIGUSR2 emacs


The problem with a crash is that you can lose unsaved work. What I've always done in that case is to use recover-file to get the file from disk and fold in the information in the autosave file. Yoo suggests using recover-session instead. This has the advantage of recovering all files from the session that crashed. That's something I didn't know but that I'll use from now on.

Yoo's post is fairly short and one well worth reading. Knowing the things he talks about doesn't make crashes/hangs painless but it does take a lot of the sting out of them.

by jcs at April 30, 2016 06:52 PM



Numerical Stability of Halley's Recurrence for Integer $n^{\mathrm{th}}$-Root

tl;dr? See last paragraph.

If I use the initial value $2^{\left(\big\lfloor\lfloor\log_2 x \rfloor/n\big\rfloor + 1\right)}$ with Halley's recurrence in the compact form

$ x_{k+1} = \frac{x_k\Big[A\left(n+1\right) + x_k^n\left(n-1\right) \Big] }{A\left(n-1\right) + x_k^n\left(n+1\right)} $

to evaluate $\lfloor x^{1/n}\rfloor$ with $x,n \in \mathbb{N}$, $x \gg 1$ and $n > 2$ it seems (empirical tests only!) to work. Slowly.

As is the case with all of these methods: the closer the initial value $x_0$ is to the actual root, the fewer iterations are needed. Many papers have been written about this, although not many for the integer versions, but a simple refinement can be implemented by observing that the root lies between $2^{\lfloor \lfloor\log_2 x \rfloor / n \rfloor + 1}$ and $2^{\lfloor \lfloor\log_2 x \rfloor / n \rfloor}$, so a simple arithmetic average of these limits should give a significant decrease in the number of iterations needed and, lo and behold, it does. Small problem: the algorithm is unstable with that seed. This is visible in the following picture with a highly abused y-axis (#iterations, bit size of the root, and absolute error) and the index along the x-axis. The radicand used was $5987^{797}$.

[Figure: Number of iterations and error]

The range of indices where the error occurs is outside the range where I would use Halley's recurrence and switch to bisection. The cut-off point I have chosen is the intersection between the bisection cost, which is linear $ax^{-1}$, and the approximately linear part of the Halley iterations at the beginning, $bx$, which puts the intersection at $\sqrt{a/b}$. Some runs with up to $3\,321\,960$ bits (ca. one million decimal digits) showed $\Big\lfloor\sqrt{A_b\left(\left\lfloor\log_2\left(\left\lfloor\log_2 \left(A_b\right) \right\rfloor\right) \right\rfloor + 1\right)}\Big\rfloor$ with $A_b = \left\lfloor\log_2 \left(A\right) \right\rfloor$ to be a good estimate for the big-integer library in use.

Hence my question: is Halley's recurrence, implemented as described above, numerically stable in the range $(3,\Big\lfloor\sqrt{A_b\left(\left\lfloor\log_2\left(\left\lfloor\log_2 \left(A_b\right) \right\rfloor\right) \right\rfloor + 1\right)}\Big\rfloor)$ with the initial value the arithmetic average of $2^{\lfloor \lfloor\log_2 x \rfloor / n \rfloor + 1}$ and $2^{\lfloor \lfloor\log_2 x \rfloor / n \rfloor}$ or not and, much more interesting, why?
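For experimenting with the stability question empirically, here is an all-integer version of the iteration with the $2^{\lfloor\lfloor\log_2 x\rfloor/n\rfloor + 1}$ seed (not the averaged seed under discussion); the final local correction step makes the answer exact regardless of where the truncated iteration stalls, so seeds can be swapped and compared safely:

```python
def int_nth_root(A, n):
    """Floor of the integer n-th root of A via the all-integer Halley step,
    seeded just above the root, with a final local correction."""
    if A < 2:
        return A
    bits = A.bit_length() - 1            # floor(log2 A)
    x = 1 << (bits // n + 1)             # 2^(floor(log2(A)/n) + 1) > A^(1/n)
    while True:
        xn = x ** n
        x_new = (x * (A * (n + 1) + xn * (n - 1))) // (A * (n - 1) + xn * (n + 1))
        if x_new >= x:                   # stalled or oscillating: stop iterating
            break
        x = x_new
    while x ** n > A:                    # local correction makes the result exact
        x -= 1
    while (x + 1) ** n <= A:
        x += 1
    return x
```

Starting strictly above the root, the truncated iteration decreases monotonically until it stalls, so the correction loops only walk a few steps; with the averaged seed the stall point can land further from the root, which is one way to observe the instability the question describes.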

by deamentiaemundi at April 30, 2016 06:37 PM

Distributed systems according to Lamport?

I need some clarification over the following quotation of Leslie Lamport:

A distributed system can be described as a particular sequential state machine that is implemented with a network of processors. The ability to totally order the input requests leads immediately to an algorithm to implement an arbitrary state machine by a network of processors, and hence to implement any distributed system.

I see how a distributed system can be modeled as the product of several particular sequential state machines.

I understand that by totally ordering input requests, one can duplicate a given sequential state machine over several computers, and make sure that all input requests are processed in the same order at every computer, and thus ensuring that the individual local states are coherent.

What I don't understand is how this mechanism is sufficient to implement ANY distributed system. At least it's sufficient to duplicate a sequential state machine, but isn't that a very particular case of a distributed system?

by Nemo at April 30, 2016 06:26 PM


Tradable information from BS Implied volatility

These are two follow up questions to:

Implied volatility as price transform

  1. I understand that the BS model is used as a 'black box' that takes a market price and maps it in a 1-to-1 fashion to a 'BS implied volatility'. What I don't understand is how there is any actionable information in that IV number, given that everybody knows that most assumptions of BS do not hold. Yes, I know it is well understood and a bad model is better than no model. But let's say in a different universe somebody might have come up with a different model $\phi$ that shares the characteristics that make BS so popular. Now $\phi$ gives different IVs and hence also different actionable information. So how come traders actually use that information in trading if it stems from an 'arbitrary' black box?
  2. The second question is based on the following slide of a talk given by P. Staneski, an MD quant at Credit Suisse:

    In the world of Black-Scholes, implied volatility is the expected value of future realized volatility because they are both constants!

    • Even if we allow for stochastic volatility, given the other assumptions implied volatility is the expected value of future realized volatility.

    In the real world, none of the assumptions are true (some being more false than others, particularly the constancy of vol and GBM).

    • The market gives us the price of an option, which “embeds” all the imperfections traders must deal with.

    • There is only one degree of freedom in the B-S model, namely, the volatility input, which must be “forced” to match the model to the market.

    Implied volatility is thus a lot more than expected volatility!

If, as he claims, IV is a lot more than expected volatility, how can a trader sensibly use that information in trading? Very often I have seen things like (in summary) "If the implied vol is 20 and you think volatility is going to realise at 18, you just go short vol by selling a call/put and delta hedging it, thereby making money if you are right". Well, what if 5 points of my 20 implied vol actually stem from the inaccuracy of the BS model and are not 'priced-in expected realised variance'?
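On the mechanics of the 1-to-1 mapping itself: the BS call price is strictly increasing in $\sigma$, so inverting price to implied volatility is a plain root-finding exercise, which is all the "black box" actually does. A self-contained Python sketch using bisection:

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S, K, T, r, sigma):
    """Black-Scholes price of a European call."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)

def implied_vol(price, S, K, T, r, lo=1e-6, hi=5.0, tol=1e-10):
    # the BS call price is strictly increasing in sigma, so bisection inverts it
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bs_call(S, K, T, r, mid) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

This also illustrates the question's point: the number that comes out is whatever value of the single free parameter forces this particular model to match the market price — everything the model misses gets absorbed into it.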

by sbm at April 30, 2016 06:17 PM

Planet Emacsen

emacspeak: Emacspeak 44.0 (SteadyDog) Unleashed

Emacspeak 44.0—SteadyDog—Unleashed!

For Immediate Release:

San Jose, Calif., (May 1, 2016)
Emacspeak: Redefining Accessibility In The Era Of (Real)Intelligent Computing
–Zero cost of Ownership makes priceless software Universally affordable!

Emacspeak Inc (NASDOG: ESPK) --– announces the
immediate world-wide availability of Emacspeak 44.0 (SteadyDog) –a
powerful audio desktop for leveraging today's evolving data, social
and service-oriented Internet cloud.

1 Investors Note:

With several prominent tweeters expanding coverage of
#emacspeak, NASDOG: ESPK has now been consistently trading over
the social net at levels close to that once attained by DogCom
high-fliers—and as of May 2016 is trading at levels close to
that achieved by once better known stocks in the tech sector.

2 What Is It?

Emacspeak is a fully functional audio desktop that provides complete
eyes-free access to all major 32 and 64 bit operating environments. By
seamlessly blending live access to all aspects of the Internet such as
Web-surfing, blogging, social computing and electronic messaging into
the audio desktop, Emacspeak enables speech access to local and remote
information with a consistent and well-integrated user interface. A
rich suite of task-oriented tools provides efficient speech-enabled
access to the evolving service-oriented social Internet cloud.

3 Major Enhancements:

  • Enable playing multiple media streams using mplayer. 🔊
  • Smart Ladspa effects in mplayer, including panning. 🕪
  • Sound theme chimes has been spatialized to create theme pan-chimes. 🕭-
  • Package elpy has been speech-enabled. 🐍
  • Emacspeak now implements automatic soundscapes. 🏙
  • Speech-enables package helm.𝍎
  • Emacs EWW: Consume Web content efficiently. 🕷
  • Updated Info manual 🕮
  • emacspeak-url-templates: Smart Web access. ♅
  • emacspeak-websearch.el Find things fast. ♁
  • And a lot more than will fit in this margin. …

4 Establishing Liberty, Equality And Freedom:

Never a toy system, Emacspeak is voluntarily bundled with all
major Linux distributions. Though designed to be modular,
distributors have freely chosen to bundle the fully integrated
system without any undue pressure—a documented success for
the integrated innovation embodied by Emacspeak. As the system
evolves, both upgrades and downgrades continue to be available at
the same zero-cost to all users. The integrity of the Emacspeak
codebase is ensured by the reliable and secure Linux platform
used to develop and distribute the software.

Extensive studies have shown that thanks to these features, users
consider Emacspeak to be absolutely priceless. Thanks to this
wide-spread user demand, the present version remains priceless
as ever—it is being made available at the same zero-cost as
previous releases.

At the same time, Emacspeak continues to innovate in the area of
eyes-free social interaction and carries forward the
well-established Open Source tradition of introducing user
interface features that eventually show up in luser environments.

On this theme, when once challenged by a proponent of a crash-prone
but well-marketed mousetrap with the assertion "Emacs is a system from
the 70's", the creator of Emacspeak evinced surprise at the unusual
candor manifest in the assertion that it would take popular
idiot-proven interfaces until the year 2070 to catch up to where the
Emacspeak audio desktop is today. Industry experts welcomed this
refreshing breath of Courage Certainty and Clarity (CCC) at a time
when users are reeling from the Fear Uncertainty and Doubt (FUD)
unleashed by complex software systems backed by even more convoluted
press releases.

5 Independent Test Results:

Independent test results have proven that unlike some modern (and
not so modern) software, Emacspeak can be safely uninstalled without
adversely affecting the continued performance of the computer. These
same tests also revealed that once uninstalled, the user stopped
functioning altogether. Speaking with Aster Labrador, the creator of
Emacspeak once pointed out that these results re-emphasize the
user-centric design of Emacspeak; "It is the user –and not the
computer– that stops functioning when Emacspeak is uninstalled!".

5.1 Note from Aster,Bubbles and Tilden:

UnDoctored Videos Inc. is looking for volunteers to star in a
video demonstrating such complete user failure.

6 Obtaining Emacspeak:

Emacspeak can be downloaded from GitHub –see you can visit Emacspeak on the
WWW at You can subscribe to the emacspeak
mailing list — — by sending mail to the
list request address The Emacspeak
is a good source for news about recent enhancements and how to
use them.

The latest development snapshot of Emacspeak is always available via
Git from GitHub at
Emacspeak GitHub .

7 History:

  • Emacspeak 44.0 continues the steady pace of innovation on the
    audio desktop.
  • Emacspeak 43.0 brings even more end-user efficiency by leveraging the
    ability to spatially place multiple audio streams to provide timely
    auditory feedback.
  • Emacspeak 42.0 while moving to GitHub from Google Code continues to
    innovate in the areas of auditory user interfaces and efficient,
    light-weight Internet access.
  • Emacspeak 41.0 continues to improve
    on the desire to provide not just equal, but superior access —
    technology when correctly implemented can significantly enhance the
    human ability.
  • Emacspeak 40.0 goes back to Web basics by enabling
    efficient access to large amounts of readable Web content.
  • Emacspeak 39.0 continues the Emacspeak tradition of increasing the breadth of
    user tasks that are covered without introducing unnecessary
  • Emacspeak 38.0 is the latest in a series of award-winning
    releases from Emacspeak Inc.
  • Emacspeak 37.0 continues the tradition of
    delivering robust software as reflected by its code-name.
  • Emacspeak 36.0 enhances the audio desktop with many new tools including full
    EPub support — hence the name EPubDog.
  • Emacspeak 35.0 is all about
    teaching a new dog old tricks — and is aptly code-named HeadDog in
    honor of our new Press/Analyst contact. emacspeak-34.0 (AKA Bubbles)
    established a new beach-head with respect to rapid task completion in
    an eyes-free environment.
  • Emacspeak-33.0 AKA StarDog brings
    unparalleled cloud access to the audio desktop.
  • Emacspeak 32.0 AKA
    LuckyDog continues to innovate via open technologies for better
  • Emacspeak 31.0 AKA TweetDog — adds tweeting to the Emacspeak
  • Emacspeak 30.0 AKA SocialDog brings the Social Web to the
    audio desktop—you can't but be social if you speak!
  • Emacspeak 29.0—AKA AbleDog—is a testament to the resilience and innovation
    embodied by Open Source software—it would not exist without the
    thriving Emacs community that continues to ensure that Emacs remains
    one of the premier user environments despite perhaps also being one of
    the oldest.
  • Emacspeak 28.0—AKA PuppyDog—exemplifies the rapid pace of
    development evinced by Open Source software.
  • Emacspeak 27.0—AKA
    FastDog—is the latest in a sequence of upgrades that make previous
    releases obsolete and downgrades unnecessary.
  • Emacspeak 26—AKA
    LeadDog—continues the tradition of introducing innovative access
    solutions that are unfettered by the constraints inherent in
    traditional adaptive technologies.
  • Emacspeak 25 —AKA ActiveDog
    —re-activates open, unfettered access to online
  • Emacspeak-Alive —AKA LiveDog —enlivens open, unfettered
    information access with a series of live updates that once again
    demonstrate the power and agility of open source software
  • Emacspeak 23.0 – AKA Retriever—went the extra mile in
    fetching full access.
  • Emacspeak 22.0 —AKA GuideDog —helps users
    navigate the Web more effectively than ever before.
  • Emacspeak 21.0
    —AKA PlayDog —continued the
    Emacspeak tradition of relying on enhanced
    productivity to liberate users.
  • Emacspeak-20.0 —AKA LeapDog —continues
    the long established GNU/Emacs tradition of integrated innovation to
    create a pleasurable computing environment for eyes-free
  • emacspeak-19.0 –AKA WorkDog– is designed to enhance
    user productivity at work and leisure.
  • Emacspeak-18.0 –code named
    GoodDog– continued the Emacspeak tradition of enhancing user
    productivity and thereby reducing total cost of
  • Emacspeak-17.0 –code named HappyDog– enhances user
    productivity by exploiting today's evolving WWW
  • Emacspeak-16.0 –code named CleverDog– the follow-up to
    SmartDog– continued the tradition of working better, faster,
  • Emacspeak-15.0 –code named SmartDog–followed up on TopDog
    as the next in a continuing series of award-winning audio desktop
    releases from Emacspeak Inc.
  • Emacspeak-14.0 –code named TopDog–was
    the first release of this millennium.

  • Emacspeak-13.0 –codenamed
    YellowLab– was the closing release of the
    20th. century.
  • Emacspeak-12.0 –code named GoldenDog– began
    leveraging the evolving semantic WWW to provide task-oriented speech
    access to Webformation.
  • Emacspeak-11.0 –code named Aster– went the
    final step in making Linux a zero-cost Internet access solution for
    blind and visually impaired users.
  • Emacspeak-10.0 –(AKA
    Emacspeak-2000) code named WonderDog– continued the tradition of
    award-winning software releases designed to make eyes-free computing a
    productive and pleasurable experience.
  • Emacspeak-9.0 –(AKA
    Emacspeak 99) code named BlackLab– continued to innovate in the areas
    of speech interaction and interactive accessibility.
  • Emacspeak-8.0 –(AKA Emacspeak-98++) code named BlackDog– was a major upgrade to
    the speech output extension to Emacs.
  • Emacspeak-95 (code named Illinois) was released as OpenSource on
    the Internet in May 1995 as the first complete speech interface
    to UNIX workstations. The subsequent release, Emacspeak-96 (code
    named Egypt) made available in May 1996 provided significant
    enhancements to the interface. Emacspeak-97 (Tennessee) went
    further in providing a true audio desktop. Emacspeak-98
    integrated Internetworking into all aspects of the audio desktop
    to provide the first fully interactive speech-enabled WebTop.

8 About Emacspeak:

Originally based at Cornell (NY), home to Auditory User
Interfaces (AUI) on the WWW, Emacspeak is now maintained on GitHub. The
system is mirrored world-wide by an international network of software
archives and bundled voluntarily with all major Linux distributions. On
Monday, April 12, 1999, Emacspeak became part of the Smithsonian's
Permanent Research Collection on Information Technology at the
Smithsonian's National Museum of American History.

The Emacspeak mailing list is archived at Vassar –the home of the
Emacspeak mailing list– thanks to Greg Priest-Dorman, and provides a
valuable knowledge base for new users.

9 Press/Analyst Contact: Tilden Labrador

Going forward, Tilden acknowledges his exclusive monopoly on
setting the direction of the Emacspeak Audio Desktop, and
promises to exercise this freedom to innovate and his resulting
power responsibly (as before) in the interest of all dogs.

About This Release:

Windows-Free (WF) is a favorite battle-cry of The League Against
Forced Fenestration (LAFF). –see for details on
the ill-effects of Forced Fenestration.

CopyWrite )C( Aster, Hubbell and Tilden Labrador. All Writes Reserved.
HeadDog (DM), LiveDog (DM), GoldenDog (DM), BlackDog (DM) etc., are Registered
Dogmarks of Aster, Hubbell and Tilden Labrador. All other dogs belong to
their respective owners.

by T. V. Raman ( at April 30, 2016 05:51 PM


A coding question

We are given $n, m$ with $n - m > 1$. Let $S$ be the set of all $n$-bit words. Form $2^{n-m}$ disjoint subsets of $S$ of size $2^m$, denote a typical one of them by $A$, and let $B = S \setminus A$. With $H(a,b)$ denoting the Hamming distance of elements $a \in A$ and $b \in B$, let $G(a) = \max_{b \in B} H(a,b)$ and $F(A) = \min_{a \in A} G(a)$. How could one construct the $A$s such that the $F(A)$ values are as small as possible?
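For experimenting with small cases, here is a brute-force sketch (the prefix-based partition is only an illustrative baseline, not a proposed optimum):

```python
from itertools import product

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def F(A, S):
    """F(A) = min over a in A of max over b in B = S \\ A of Hamming(a, b)."""
    B = [w for w in S if w not in A]
    return min(max(hamming(a, b) for b in B) for a in A)

n, m = 4, 2                        # any n, m with n - m > 1
S = set(product((0, 1), repeat=n))
# one illustrative partition: group words by their first n - m bits
blocks = {}
for w in sorted(S):
    blocks.setdefault(w[:n - m], []).append(w)
scores = [F(A, S) for A in blocks.values()]
```

For this prefix partition every word's bitwise complement lies in B (complementing flips the prefix), so every F(A) equals n, the worst possible value; a good construction would have to do better than grouping by prefix.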

by Mok-Kong Shen at April 30, 2016 05:22 PM

number of balanced binary trees [duplicate]

How can you find the number of balanced binary trees knowing only the number of nodes? Is there a method better than generating all possible balanced trees, and if not, how can I generate those trees based only on the number of nodes?
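A counting sketch, under the assumption that "balanced" means the AVL condition (at every node the subtree heights differ by at most one; other balance notions give different counts). Counting shapes by both node count and height avoids generating the trees:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def shapes(n, h):
    """Number of height-balanced tree shapes with n nodes and height h
    (the empty tree has height -1)."""
    if n == 0:
        return 1 if h == -1 else 0
    if h < 0:
        return 0
    total = 0
    for nl in range(n):                 # nodes in the left subtree
        nr = n - 1 - nl
        # subtree heights may be (h-1, h-1), (h-1, h-2) or (h-2, h-1)
        for hl, hr in ((h - 1, h - 1), (h - 1, h - 2), (h - 2, h - 1)):
            total += shapes(nl, hl) * shapes(nr, hr)
    return total

def balanced_count(n):
    return sum(shapes(n, h) for h in range(-1, n))

counts = [balanced_count(n) for n in range(5)]
```

For n = 0..4 this gives 1, 1, 2, 1, 4 shapes (e.g. a 3-node chain is excluded because its subtree heights differ by 2).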

by Otniel Mercea at April 30, 2016 05:20 PM


Meaning of Error Term e

I was reading the book "Introduction to Statistical Learning". The book says that:

More generally, suppose that we observe a quantitative response Y and a set of predictor variables X1, X2, ..., Xn.

We assume that there is some relationship between Y and X (X1, X2, ... Xn) which can be written in the very general form as:

Y = f(X) + e

Here, f is some fixed but unknown function of X and e is a random error term which is independent of X and has mean zero.

I want to know: what does it mean for the error term to have mean zero?

by jaig at April 30, 2016 05:18 PM


How to check if the language represented by a DFA is finite [on hold]

I am studying regular languages and DFAs. I have implemented a DFA in Java. I have to write a function which tells whether the language represented by a DFA is finite. I need a method or algorithm to do so. What I have already figured out is that if the DFA has loops in it, then it can possibly recognize infinitely many words.
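One standard approach, sketched here in Python (a Java implementation would follow the same shape): the language is infinite if and only if some cycle is both reachable from the start state and able to reach an accepting state.

```python
def dfa_language_is_finite(delta, start, accepting):
    """delta maps (state, symbol) -> state. The language is infinite iff
    some cycle lies on a path from the start to an accepting state."""
    # states reachable from the start
    reach, stack = {start}, [start]
    while stack:
        q = stack.pop()
        for (p, _), r in delta.items():
            if p == q and r not in reach:
                reach.add(r)
                stack.append(r)
    # states that can reach an accepting state
    useful, changed = set(accepting), True
    while changed:
        changed = False
        for (p, _), r in delta.items():
            if r in useful and p not in useful:
                useful.add(p)
                changed = True
    live = reach & useful
    # DFS cycle detection restricted to the "live" states
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {q: WHITE for q in live}

    def has_cycle(q):
        color[q] = GRAY
        for (p, _), r in delta.items():
            if p == q and r in live:
                if color[r] == GRAY:
                    return True
                if color[r] == WHITE and has_cycle(r):
                    return True
        color[q] = BLACK
        return False

    return not any(color[q] == WHITE and has_cycle(q) for q in live)
```

Loops that cannot lead to an accepting state (e.g. a dead-state self-loop) are correctly ignored, which is why restricting to "live" states matters.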

by Sayam Qazi at April 30, 2016 04:57 PM


Thomson Reuters TRBC and GICS

I can retrieve the components, but I would like to retrieve the index RIC by the sector scheme code for both GICS and TRBC. I would really like to get my hands on the components for the Dow Jones sectors / ICB. I have them for NYSE but would like the other exchanges. :) This finds the components:

RCSIssuerCountryGenealogy   'M:A9/G:6J' 
PriceClose  >5.00
GicsName    'Financials'
IsPrimaryRIC    Yes
Row 1262

This Finds the Sector Scheme and code for groupings

=TR(OFFSET($C2,0,0,B6,1),"TR.GICSSector; TR.GICSSectorCode;
    TR.GICSIndustryGroup; TR.GICSIndustryGroupCode; TR.GICSIndustry; 
    TR.GICSIndustryCode; TR.GICSSubIndustry; TR.GICSSubIndustryCode
    ","CH=Fd RH=IN",E1)

The problem is how do you retrieve the index RIC for GICSIndustryGroupCode or GICSIndustryGroup?

I found a list of the GICS symbols broken down by Sectors, Industry Groups, Industries and Sub-Industries, but they are EOD symbols :(

by Jeff Crystal at April 30, 2016 04:54 PM



proot: dpb meets chroot

With the p2k16 hackathon just coming to a close, Marc Espie has revealed one of the new things he worked on.

I've been using dpb(1) chroot'd for a long time, using my own methods. This is a first try at making things "simple." Basically,

proot -B /build

should more or less do something sane, and then you can build ports in that chroot.


April 30, 2016 04:32 PM


What concepts are required to be conceptually solid in computer science

What concepts are required to be conceptually solid in computer science, and to work on the higher end of projects?

by Joseph H at April 30, 2016 04:30 PM

How can we avoid mistakes in an LL(1) parse table?

I'm learning about LL(1) parsing. We need to find FIRST and FOLLOW in order to construct an LL(1) parse table. Each time I compute FIRST and FOLLOW, I miss one terminal or another as we backtrack through each production, which makes me end up with a wrong LL(1) table. Is there a standardized way to check the FIRST and FOLLOW table?
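One way to check hand-computed FIRST sets mechanically is the usual fixpoint computation, sketched below (the grammar encoding and symbol names here are illustrative assumptions: nonterminals are dict keys, terminals anything else, and epsilon is the empty string):

```python
def first_sets(productions):
    """productions: dict nonterminal -> list of right-hand sides (tuples).
    An empty tuple is an epsilon production; '' marks epsilon in FIRST."""
    first = {A: set() for A in productions}

    def first_of(sym):
        return first[sym] if sym in productions else {sym}

    changed = True
    while changed:
        changed = False
        for A, rhss in productions.items():
            for rhs in rhss:
                before = len(first[A])
                nullable_prefix = True
                for sym in rhs:
                    first[A] |= first_of(sym) - {''}
                    if '' not in first_of(sym):
                        nullable_prefix = False
                        break
                if nullable_prefix:      # every symbol could derive epsilon
                    first[A].add('')
                if len(first[A]) != before:
                    changed = True
    return first
```

For the toy grammar S → A b | c, A → a | ε this yields FIRST(A) = {a, ε} and FIRST(S) = {a, b, c}, which is a convenient cross-check against a manual table.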

by Pavan at April 30, 2016 04:11 PM



Raspberry-pi2 connection problem [on hold]

When I try to connect my Raspberry Pi to my laptop via Ethernet, it does not assign any dynamic IP to my laptop (Windows 10), even though my laptop is set to obtain a dynamic IP. I didn't even make the IP static for the Pi when it was working.

What to do?

by Yash at April 30, 2016 03:49 PM


Gradient Temporal Difference Lambda without Function Approximation

Every formalism of GTD(λ) seems to define it in terms of function approximation, using θ and some weight vector w.

I understand that the need for gradient methods largely came from their convergence properties for linear function approximators, but I would like to make use of GTD for its importance sampling.

Is it possible to take advantage of GTD without function approximation? If so, how are the update equations formalized?

by Andnp at April 30, 2016 03:43 PM


About randomness and minmax algorithm with alpha beta pruning

Will choosing the children of a node randomly in the alpha-beta algorithm have a better chance of producing a cutoff than visiting them in order?

Here's the pseudocode with my addition marked with ***.

function alphabeta(node, depth, α, β, maximizingPlayer)
     if depth = 0 or node is a terminal node
         return the heuristic value of node
     arrange children of node randomly ***
     if maximizingPlayer
         v := -∞
         for each child of node
             v := max(v, alphabeta(child, depth - 1, α, β, FALSE))
             α := max(α, v)
             if β ≤ α
                 break (* β cut-off*)
         return v
     else
         v := ∞
         for each child of node
             v := min(v, alphabeta(child, depth - 1, α, β, TRUE))
             β := min(β, v)
             if β ≤ α
                 break (* α cut-off*)
         return v

I ran a small experiment with this on a Connect Four game and it does seem to run a little faster, but when I actually count the cutoffs with and without randomness, there are more cutoffs without randomness. That's a little odd.

Is it possible to prove that it's faster (or slower)?

by kuhaku at April 30, 2016 03:29 PM



If the strings of a language can be enumerated in lexicographic order, is it recursive?

If the strings of a language L can be effectively enumerated in lexicographic order, then is the statement "L is recursive but not necessarily context-free" true?

by user50339 at April 30, 2016 02:54 PM



FPÖ leader Strache pays 9000 euros to refugee aid. ...

FPÖ leader Strache pays 9000 euros to refugee aid. How about that?
However, this evidently did not happen voluntarily. The payment was made as part of an out-of-court settlement with a photographer, an FPÖ spokesman announced.
MWAHAHAHAHA, very nice!

April 30, 2016 02:00 PM

Have you ever heard of Alfredo Stroessner? ...

Have you ever heard of Alfredo Stroessner? He was a dictator of German descent who seized power in Paraguay in a 1954 coup.

Gaby Weber has produced a feature about him for Deutschlandfunk, 45 minutes of audio. Worth it!

Back then, Stroessner started out by organizing manhunts against the Indians who stood in the way of cattle ranching. The ranchers had been clearing rainforest in order to raise cattle on the land, thereby decimating the hunting grounds of the Indians living there. As a result, the Indians occasionally found too little to eat and would then take a cow from a pasture. Thereupon the ranchers hunted the Indians down and carted them off to reservations, where they died of malnutrition and of diseases such as influenza.

The German Foreign Office stated at the time that there could be no talk of any persecution of Indians.

April 30, 2016 02:00 PM

Do you know which ministry in Germany is responsible for ...

Do you know which ministry in Germany is responsible for digital infrastructure?

Dobrindt's Ministry of Transport.

That is a punchline in itself, but when on top of that they turn out to be vulnerable to Heartbleed, in April 2016, words rather fail me.


April 30, 2016 02:00 PM

One of the main problems of our time is that we ...

One of the main problems of our time is that we have no efficient energy storage. When electricity is needed, we have to generate it. We have smaller buffering technologies, but no large stores into which you could deposit, say, a week's worth of wind power and then withdraw it when the wind dies down.

Here is an interesting buffering concept in which a heavy train is driven up a mountain by electric motors. Then, when electricity is needed, it is driven back down and the motors are run as generators.

It doesn't sound like you could store all that much electricity this way, but the prototype they are about to build has 50 MW of power and 12.5 MWh of capacity. And they want to scale up to 1 GW.

Update: For comparison:

The maximum storage capacity of all Austrian (pumped-)storage power plants is currently about 3 TWh; for pumped-storage plants alone, no data are available.
So this train scheme, too, is really just a buffer. And the question of what to do when there is no mountain nearby is still open. There are approaches for that using giant flywheels.

Update: Here is another cool energy-storage concept using concrete spheres in Lake Constance, and people are also thinking about salt as a heat store. According to Technology Review, the concrete-sphere scheme has an efficiency of 85%, which would be 10 percentage points more than pumped storage.

Update: Here is a concept for flat country.

April 30, 2016 02:00 PM


combining matching and classification

I am working on a matching problem between an exposed and a control group. I had an idea about using binary classification to solve it: I would assign all observations in the exposed group to one class and create the other class artificially. For example:

Suppose I have an exposed group of 1,000 people and a control group of 4,000. There are 25 binary profile features:

Gender, Age (18 to 34), Age (34 to 55), Age (over 55), income (less than 50,000), income (50,000 to 100,000), income (greater than 100,000), outdoor enthusiast, video game player, fast food diner, cost conscious shopper, …, pet enthusiast

Suppose all of the exposed are male, age 18 to 34, video game players, and fast food diners, and vary in the remaining 21 categories.

Let a1, a2, …, a1,000 be my exposed group. I would put all of my exposed group a1, a2, …, a1,000 in one class, and for the remaining class I would take the opposite binary choice for each exposed observation. So if

a1 has the following profile

Gender - Male
Age (18 to 34) - Yes
Age (34 to 55) - No
Age (over 55) - No
income (less than 50,000) - Yes 
income (50,000 to 100,000) – No
income (greater than 100,000) - No 
outdoor enthusiast - No
video game player - Yes
fast food diner - Yes
cost conscious shopper - Yes 
pet enthusiast – No

I would then create a new observation for the other class by taking the opposite choice for each category, with random selection among the remaining choices for age and income.

Gender - Female
Age (18 to 34) - No
Age (34 to 55) - Yes
Age (over 55) - No
income (less than 50,000) - No 
income (50,000 to 100,000) – Yes
income (greater than 100,000) - No 
outdoor enthusiast - Yes
video game player - No
fast food diner - No
cost conscious shopper - No 
pet enthusiast – Yes

I do this for all 1,000 exposed observations. Then I train a binary classifier on a portion of this data.
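The construction described above can be sketched as follows (the feature names and the boolean encoding are illustrative assumptions, not the actual data): standalone binaries are negated, and for each one-hot group the single "Yes" moves to a randomly chosen other category.

```python
import random

def opposite(profile, groups, rng):
    """Build the artificial observation: negate each standalone binary
    feature; for a one-hot group (age, income), move the single "Yes"
    to a randomly chosen other category."""
    out = {}
    grouped = {f for g in groups for f in g}
    for feat, val in profile.items():
        if feat not in grouped:
            out[feat] = not val                    # plain binary: flip it
    for g in groups:
        current = next(f for f in g if profile[f])
        chosen = rng.choice([f for f in g if f != current])
        for f in g:
            out[f] = (f == chosen)
    return out

rng = random.Random(0)
groups = [("age_18_34", "age_34_55", "age_over_55"),
          ("inc_lt_50k", "inc_50_100k", "inc_gt_100k")]
a1 = {"male": True,
      "age_18_34": True, "age_34_55": False, "age_over_55": False,
      "inc_lt_50k": True, "inc_50_100k": False, "inc_gt_100k": False,
      "outdoor": False, "gamer": True, "fast_food": True,
      "cost_conscious": True, "pet": False}
b1 = opposite(a1, groups, rng)
```

Note that by design every artificial observation is the exact mirror of an exposed one, which may be part of why the classifier separates the training classes perfectly yet generalizes poorly.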

I applied this to my data (much larger than 1,000, close to 1,000,000 exposed) and it resulted in a 100% classification rate on my constructed data set, and when I used the predict method, it classified all members of the control group into the class of exposed observations (highly unlikely in real life).

My questions are:

1- Does this approach make sense?

2- If it makes sense, why did all the control members get assigned to the exposed class? What is a good method to construct the other class of observations in the training set?

3- If it doesn't make sense, what is a good way to match categorical features? I don't like the idea of converting them to numeric and using a clustering method.

by mikeL at April 30, 2016 01:56 PM



How to Convert a Directed Graph to an Undirected Graph (Adjacency Matrix) [on hold]

Given an adjacency matrix, what is an algorithm/pseudo-code to convert a directed graph to an undirected graph without adding additional vertices (it does not have to be reversible)?

similar question here
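A minimal sketch of the usual construction: create an undirected edge {i, j} whenever either directed edge i→j or j→i exists, i.e. OR the adjacency matrix with its transpose.

```python
def to_undirected(adj):
    """Undirected edge {i, j} exists iff i->j or j->i existed;
    equivalently, OR the 0/1 adjacency matrix with its transpose."""
    n = len(adj)
    return [[1 if (adj[i][j] or adj[j][i]) else 0 for j in range(n)]
            for i in range(n)]

directed = [[0, 1, 0],
            [0, 0, 0],
            [1, 0, 0]]
undirected = to_undirected(directed)
```

The result is symmetric by construction; the transformation is lossy (not reversible), matching the question's relaxation.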

by Ibrahim at April 30, 2016 01:09 PM

DragonFly BSD Digest


Election Algorithms - A ring algorithm

I have been reading about election algorithms in distributed systems. I read about the Bully algorithm and understood it. I came across the Ring algorithm and understood how it conducts the election, but I could not understand how it handles a situation in which two processes, 2 and 5, simultaneously discover that the coordinator 7 is not functioning.

Each of these builds an ELECTION message and starts circulating it. Eventually, both messages will go all the way around, and both 2 and 5 will convert them into COORDINATOR messages, with exactly the same number and in the same order.

Who is going to be the coordinator (2 or 5), and why, according to this algorithm?
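A toy simulation may make the resolution concrete (message passing is modelled as plain function calls, an obvious simplification): since each circulating ELECTION message collects the same set of alive process ids, both COORDINATOR messages name the same process, the one with the highest id, so the duplicate election is harmless.

```python
def ring_election(ring, alive, initiator):
    """One ELECTION message circulating from `initiator` around the ring,
    collecting the ids of alive processes; when it comes back, the
    initiator turns it into a COORDINATOR message naming the highest id."""
    ids = [initiator]
    n = len(ring)
    pos = (ring.index(initiator) + 1) % n
    while ring[pos] != initiator:
        if alive[ring[pos]]:
            ids.append(ring[pos])      # dead processes are skipped over
        pos = (pos + 1) % n
    return max(ids)

ring = [1, 2, 3, 4, 5, 6, 7]
alive = {p: p != 7 for p in ring}      # coordinator 7 has crashed
winner_from_2 = ring_election(ring, alive, 2)
winner_from_5 = ring_election(ring, alive, 5)
```

Both initiators elect process 6 here; neither 2 nor 5 becomes coordinator by virtue of having started the election.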

by Giovanrich at April 30, 2016 12:57 PM


TSP Edge Removal

Are there any papers/algorithms for finding edges in a graph that can be removed without affecting the graph's optimal TSP tour length?

For instance, in a Euclidean TSP instance, many edges could be ruled out for being too long (i.e., jumping between very far apart nodes) given a decent upper bound on the optimal tour length. Is there some efficient heuristic algorithm for eliminating edges like this?

by b2coutts at April 30, 2016 12:54 PM

Proving a certain superset the halting language is not recursive

Let $\Sigma =\{ 0, 1\}$. Let $val:\Sigma^* \rightarrow \mathbb{N}$ be a function that given a string returns its decimal value, and $L_{halt} = \{\langle M\rangle \langle w\rangle \mid M $ halts on $w \}$.

I define the following language:

$L=\{ x \mid x\neq x^R \wedge (x\in L_1 \vee x\in L_2 \vee x\in L_3)\}$


  • $L_1=\{ x \mid x\in L_{halt} \wedge x^R \in L_{halt} \wedge val(x)<val(x^R) \}$
  • $L_2=\{ x \mid x\notin L_{halt} \wedge x^R \notin L_{halt} \wedge val(x)<val(x^R) \}$
  • $L_3=\{ x \mid x\in L_{halt} \wedge x^R \notin L_{halt} \}$ (in this language there is no requirement on $val$ function)

That is, a string in $L$ can have one of 3 forms.

Intuitively I can feel that $L\notin R$, but I do not know how to approach a formal proof for this.

I tried using a reduction $L_{halt} \leq L$, and then since $L_{halt}\notin R$ I can deduce that $L\notin R$, but I could not find a reduction that will work.

The closest thing I could come up with is using an intermediate reduction: first define $L_{halt}^{'} =\{ 1 x 0 \mid x \in L_{halt} \}$; then easily $L_{halt} \leq L_{halt}^{'}$, and all I have to do is show $L_{halt}^{'} \leq L$. My idea for the reduction was: given $x=1y0$, let $f(x)=y$ (the reduction returns $y$). But this reduction has a problem: if $y\notin L_{halt}$ and $y^R\notin L_{halt}$, it might still be possible that $val(y) < val(y^R)$, in which case $f(x)=y\in L$ but $x\notin L_{halt}^{'}$...

I'm kind of stuck on how my reduction should work.

I'm also assuming that any string from $\Sigma^*$ is a possible encoding $\langle M\rangle \langle w\rangle$; that is, I cannot assume anything about the encoding itself, and every word is a possible encoding of some $\langle M\rangle \langle w\rangle$...

by Dan D-man at April 30, 2016 12:51 PM

Winning strategy of Nim game when picking from multiple piles is allowed

I am studying game theory right now. In the Nim game, in any turn a player can remove any number of stones from any one pile. I am wondering what the winning strategy of the first player might be if, in any move, a player can pick any number of stones from one or more piles. In this case, how does the xor = 0 rule work? Both players play optimally.

by Kaidul Islam at April 30, 2016 12:48 PM

How to determine agreement between two sentences?

A common Natural Language Processing (NLP) task is to determine semantic similarity between two sentences. Has the question of agreement/disagreement between two sentences been covered in NLP or other literature? I tried searching on Google Scholar but didn't get any relevant results.

by Hamman Samuel at April 30, 2016 12:27 PM

CFG for the language "number of a's = number of b's + 2"

How can I construct a context-free grammar for the following language?

$$ L = \{ w \in \{a,b\}^* : \#_a(w) = \#_b(w) + 2 \}. $$

Please help me out with this. I am not sure how to approach this question. I would also appreciate general tips on constructing context-free grammars; I find it very difficult.
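One standard construction, sketched here (worth verifying by induction before relying on it): let $E$ generate exactly the words with $\#_a(w) = \#_b(w)$, and let $S$ insert two extra $a$'s:

$$\begin{aligned} S &\to E\,a\,E\,a\,E \\ E &\to \varepsilon \mid a\,E\,b\,E \mid b\,E\,a\,E \end{aligned}$$

Every word derived from $E$ has equally many $a$'s and $b$'s (induction on the derivation length), and conversely every balanced word is derivable from $E$; $S$ then contributes exactly two additional $a$'s.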

by Zoey at April 30, 2016 12:25 PM


Why is the value of an adapted stochastic process known at time t?

I am having a hard time understanding the concept of an adapted stochastic process. Using an analogy to finance, I have been told we can think of the adaptedness of a stock price process as having access to a Bloomberg terminal and being able to look up the price of the stock at time $ t$, i.e. at each point in time the price of the stock is known. I have also learned that a stochastic process is nothing but a collection of random variables and can thus be interpreted as a function-valued random variable. Stochastic processes in general need not be adapted, but as e.g. Shreve (Stochastic Calculus for Finance vol. 2, page 53, 2004) notes, it is often safe to assume that finance-related stochastic processes are adapted.

Now let us assume that we are dealing with an adapted stochastic process X and fix $ t$. To me it seems that by doing this we will (at this arbitrary point in time) obtain a random variable $ X(\omega; \text{t fixed})$ by the definition of a stochastic process. But wait a minute, the value of a random variable should not be known, right? On the contrary, it should be random!

How is this seeming puzzle reconciled? To me it is not clear how the definition of an adapted process implies that the value of $ X(\omega; \text{t fixed})$ is known at time $ t$. Rather, it just states that at the fixed $ t$ $ X(\omega; \text{t fixed})$ is $ \mathcal{F}_{t} $-measurable, which is not enough. Just imagine a case of a single random variable (just one point in time) Y on $ (\Omega, \mathcal{F})$ (i.e. Y is a $ \mathcal{F} $-measurable function). Obviously the value of Y is not known but random.

I have found some earlier related questions (e.g. this) but these have not clarified the matter to me. Thank you in advance for the help!

by vvv at April 30, 2016 12:18 PM


Why is IS not in NL, only in NP?

All we need to do is guess $k$ vertices. We look at vertex $v_1$ and make sure $v_1$ is not connected to $v_2,\ldots,v_k$. Then we "throw away" $v_1$ and look at $v_2$. We do this for all the vertices.

Meaning that we only need to guess $k$ vertices, and in our working memory (which determines NL or NP) we only need to keep 2 vertices.

Where is my mistake?

by Ran at April 30, 2016 12:15 PM


Error in `row.names<-`(`*tmp*`, value = c(NA_real_, NA_real_

I am trying to build a model using the tweets and their polarity. But in the middle I get this weird error at this line:

analytics <- create_analytics(container, MAXENT_CLASSIFY)

I get this

Error in `row.names<-`(`*tmp*`, value = c(NA_real_, NA_real_,  : 
  duplicate 'row.names' are not allowed
In addition: Warning messages:
1: In cbind(labels, BEST_LABEL = as.numeric(best_labels), BEST_PROB = best_probs,  :
  NAs introduced by coercion
2: In create_documentSummary(container, score_summary) :
  NAs introduced by coercion
3: In cbind(MANUAL_CODE = testing_codes, CONSENSUS_CODE = scores$BEST_LABEL,  :
  NAs introduced by coercion
4: In create_topicSummary(container, score_summary) :
  NAs introduced by coercion
5: In cbind(TOPIC_CODE = as.numeric(as.vector(topic_codes)), NUM_MANUALLY_CODED = manually_coded,  :
  NAs introduced by coercion
6: In cbind(labels, BEST_LABEL = as.numeric(best_labels), BEST_PROB = best_probs,  :
  NAs introduced by coercion
7: non-unique values when setting 'row.names':

My CSV file looks like:

text, polarity
Hello I forget the password of my credit card need to know how I can make my statement, neutral
can provide the swift code thanks, neutral
thanks just one more doubt has this card commissions with these characteristics, neutral
Thanks, neutral
are arriving mail scam, negative
can you help me I need to pay an online purchase and ask me for a terminal my debit which is, neutral
if I do not win anything this time I change banks, negative
you can be the next winner of the million that circumvents account award date January, neutral
account and see my accounts so I can have the, negative
thanks i just send the greetings consultation, neutral
may someday enable office not sick people, negative
hello is running payments through the online banking no, negative
thanks hope they do, neutral
should pay attention to many happened to us that your system flushed insufficient balance or had no money in the accounts, negative
yesterday someone had the dignity to answer the telephone banking and verify that the system is crap, negative
and tried but apparently the problem is just to pay movistar services, neutral
good morning was trying to pay for services through the website but get error retry in minutes, negative
if no system agent is non clients or customers also, positive

The code I am using is:


pg <- read.csv("cleened_tweets.csv", header=TRUE, row.names=NULL)


pgT <- as.factor(pg$text)

pgP <- as.factor(pg$polarity)

doc_matrix <- create_matrix(pgT, language="spanish", removeNumbers=TRUE, stemWords=TRUE, removeSparseTerms=.998)


container <- create_container(doc_matrix, pgP, trainSize=1:275, testSize=276:375, virgin=FALSE)

MAXENT <- train_model(container,"MAXENT")

MAXENT_CLASSIFY <- classify_model(container, MAXENT)

analytics <- create_analytics(container, MAXENT_CLASSIFY)


by user3827298 at April 30, 2016 12:14 PM

Classify a data set with the StringToWordVector filter in Weka

I'm new to Weka.

I have a data set (Twitter data) about a specific company. The filter I used is StringToWordVector, and I changed the option wordsToKeep = 100 to improve the accuracy. Then I applied classifiers: KStar 55%, RandomForest 57%, SMO 58%. These are not very good results.


Is there any idea that would help me improve this?

by user2199395 at April 30, 2016 11:55 AM


Using consensus for atomic commits

I read that consensus algorithms allow you to ensure an atomic commit. That is, in order to save a large amount of data on the disk as if it were a single object, the article advises a consensus protocol.

I could not understand what consensus has to do with writing to disk, where you seem to have only a single node writing a contiguous piece of information (I asked this question at the DBA site but they blocked my account, saying that it is a violation). But anyway, let's take a distributed DB. I read that consensus allows me to agree upon a single value (without resorting to any locks). But atomicity means that you should agree upon many values at the same time. How do you exploit consensus algorithms to update multiple values atomically?

I see in the Computerphile video about Paxos that you can use consensus to take locks. Do you use consensus just to take them and then commit a transaction under the locks?

by user6267925 at April 30, 2016 11:55 AM


Supreme Court moves to expand FBI’s hacking authority

“Simply, it will allow an FBI agent sitting in Virginia to hack into a computer or network in Nevada — or anywhere in the world.”


by Zuu at April 30, 2016 11:48 AM


How to update element inside List with ImmutableJS?

Here is what official docs said

updateIn(keyPath: Array<any>, updater: (value: any) => any): List<T>
updateIn(keyPath: Array<any>, notSetValue: any, updater: (value: any) => any): List<T>
updateIn(keyPath: Iterable<any, any>, updater: (value: any) => any): List<T>
updateIn(keyPath: Iterable<any, any>, notSetValue: any, updater: (value: any) => any): List<T>

There is no way a normal web developer (not a functional programmer) would understand that!

I have a pretty simple (for a non-functional approach) case.

var arr = [];
arr.push({id: 1, name: "first", count: 2});
arr.push({id: 2, name: "second", count: 1});
arr.push({id: 3, name: "third", count: 2});
arr.push({id: 4, name: "fourth", count: 1});
var list = Immutable.List(arr);

How can I update the list so that the element with name "third" has its count set to 4?

by Vitalii Korsakov at April 30, 2016 11:12 AM


How to calculate virtual address space from page size, virtual address size and page table entry size?

I am trying to solve an exercise, unfortunately without any success yet.

From the following given information, the virtual address space should be calculated:

  • Page size is 16 KB
  • Logical address size is 47 bit
  • 3 levels of page tables; all have the same size
  • Page table entry size is 8 byte

My idea is to find out how many pages can be addressed. The page count should then lead to the size of the virtual address space, right?

Virtual address space = page size * page count

As far as I understand, the page count is defined by the logical address size. The logical address is split up into 3 levels of page tables plus the offset. Since the page table entry size is 8 bytes (2^6 = 64 bits), 6 bits of the logical address are used for each stage to address it. The offset will then have a size of 30 bits.

Each page stage can address 64 bits, plus the 30-bit offset. So does this result in the page count?

Page count = 64 * 3 + 30 = 222

With the page size and the page count I get the virtual address space of 3552 KB.

I think this is wrong. It should be much larger. What is incorrect? Any help is appreciated!
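One consistent way to combine the given numbers (a sketch, not necessarily the intended solution): if each level of the page table occupies exactly one page, then the number of entries per table, not the entry size in bits, determines the index bits per level.

```python
page_size = 16 * 1024                  # 16 KB = 2^14 bytes
entry_size = 8                         # bytes per page-table entry
levels = 3

offset_bits = (page_size - 1).bit_length()            # 14 offset bits
entries_per_table = page_size // entry_size           # 2^11 entries
index_bits = (entries_per_table - 1).bit_length()     # 11 bits per level
virtual_bits = levels * index_bits + offset_bits      # 3 * 11 + 14 = 47
virtual_space = 2 ** virtual_bits                     # 2^47 bytes
```

This reading reproduces the given 47-bit logical address exactly, giving a virtual address space of 2^47 bytes (128 TB); the step that goes wrong above is converting the 8-byte entry size into 6 address bits.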

by Robin at April 30, 2016 11:07 AM


Python lambda function to sort list of tuples

Initially, a list of tuples is sorted by their second value and their third value.

tlist = [('a', 1, 14), ('a', 1, 16), ('b', 1, 22), 
        ('a', 2, 1), ('c', 2, 9), ('d', 2, 11), ('d', 2, 12)]

I am trying to sort this list of tuples by their second value ascending and their third value descending; that is, I want the output to be sorted like so:

tlist= [('b', 1, 22), ('a', 1, 16), ('a', 1, 14), 
        ('d', 2, 12), ('d', 2, 11), ('c', 2, 9), ('a', 2, 1)]

This is what I have tried so far as seen in this answer:

tlist = sorted(tlist, key=lambda t: return (t[0], t[1], -t[2]))

but it does not work; it gives a "return outside of function" error.

Any ideas of how to get the desired output?
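For what it's worth, a lambda body is an expression and returns its value implicitly, so writing `return` inside it is a syntax error; dropping it (and the unused first element from the key) gives the desired order:

```python
tlist = [('a', 1, 14), ('a', 1, 16), ('b', 1, 22),
         ('a', 2, 1), ('c', 2, 9), ('d', 2, 11), ('d', 2, 12)]

# sort by the second element ascending, then by the third descending
result = sorted(tlist, key=lambda t: (t[1], -t[2]))
```

Negating `t[2]` works here because the third values are numeric; for non-numeric keys a two-pass sort (stable) would be needed instead.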


This question provides very good resources, and perhaps everything someone would need to go ahead and tackle my specific question. However, given my low level of experience and my specific sorting criteria (as described above), I think this is not a duplicate.

by nk-fford at April 30, 2016 10:40 AM

How to decide the size of layers in Keras' Dense method?

Below is the simple example of multi-class classification task with IRIS data.

import seaborn as sns
import numpy as np
from sklearn.cross_validation import train_test_split
from keras.models import Sequential
from keras.layers.core import Dense, Activation, Dropout
from keras.regularizers import l2
from keras.utils import np_utils


# Prepare data
iris = sns.load_dataset("iris")
X = iris.values[:, 0:4]
y = iris.values[:, 4]

# Make test and train set
train_X, test_X, train_y, test_y = train_test_split(X, y, train_size=0.5, random_state=0)

# Evaluate Keras Neural Network

# Make ONE-HOT
def one_hot_encode_object_array(arr):
    '''One hot encode a numpy array of objects (e.g. strings)'''
    uniques, ids = np.unique(arr, return_inverse=True)
    return np_utils.to_categorical(ids, len(uniques))

train_y_ohe = one_hot_encode_object_array(train_y)
test_y_ohe = one_hot_encode_object_array(test_y)

model = Sequential()
model.add(Dense(16, input_shape=(4,), activation='sigmoid'))
model.add(Dense(3, activation='sigmoid'))
model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer='adam')

# Actual modelling
# If you increase the epoch the accuracy will increase until it drop at
# certain point. Epoch 50 accuracy 0.99, and after that drop to 0.977, with
# epoch 70 
hist =, train_y_ohe, verbose=0, nb_epoch=100, batch_size=1)

score, accuracy = model.evaluate(test_X, test_y_ohe, batch_size=16, verbose=0)
print("Test fraction correct (NN-Score) = {:.2f}".format(score))
print("Test fraction correct (NN-Accuracy) = {:.2f}".format(accuracy))

My question is: how do people usually decide the size of the layers? For example, based on the code above we have:

model.add(Dense(16, input_shape=(4,), activation='sigmoid'))
model.add(Dense(3, activation='sigmoid'))

Where the first parameter of Dense is 16 and the second is 3.

  • Why do the two layers use two different values for Dense?
  • How do we choose the best value for Dense?
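As general guidance (not a rule stated by this code's author): the last Dense layer's size is forced by the task, one unit per class, so 3 for IRIS, while the hidden width (16 here) is a free hyperparameter usually tuned by validation. One concrete thing it controls is the parameter count of each layer:

```python
def dense_params(n_in, n_units):
    """Trainable parameters of a fully connected layer:
    one weight per (input, unit) pair plus one bias per unit."""
    return (n_in + 1) * n_units

hidden = dense_params(4, 16)   # 4 input features -> 16 hidden units
output = dense_params(16, 3)   # 16 hidden units -> 3 classes
```

With only 80 + 51 parameters against 75 training rows, even this small hidden layer can overfit, which is one reason people start small and grow the width only while validation accuracy improves.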

by neversaint at April 30, 2016 10:38 AM

Fred Wilson

Video Of The Week: The Nitty Gritty Podcast

Bond Street, a startup company that makes small business loans, has started a podcast to tell stories about small business entrepreneurs and the companies they create and run. They call it the Nitty Gritty Podcast.

The first episode features an entrepreneur who is also a friend of ours, Gabe Stulman.

Gabe is a restaurant operator in the West Village of Manhattan, where we live. We started our relationship with Gabe as regulars at his first restaurant, and we have gone on to be investors in all of his current restaurants, as well as good friends with him.

Here is Gabe’s story. It’s a good one.

by Fred Wilson at April 30, 2016 10:37 AM


Why can't I enable Transmission on FreeNAS 9.2? I keep getting an error

I have installed the plugin correctly, but every time I try to start it, either using service transmission start or trying to flip the on switch from the plugins page, I get this error:


I also have tried creating a new standard jail and using portsnap to get transmission, but same error.

by rakeshdas at April 30, 2016 10:10 AM


How to do performance attribution for a few characteristics?

Let's say the characteristics that I am interested in are

  1. FX
  2. Country
  3. Security selection

I have the benchmark weights and returns, the FX returns, and the portfolio weights and returns. Can someone give a few pointers on how I would be able to get the performance attributions for 1,2,3?

by wwokkie at April 30, 2016 10:08 AM


Which languages (apart from the list below) can do Curry-Howard type checking at compile-time?

The Curry-Howard correspondence is enormously powerful. I'd like to use it, but I'd like a choice other than Haskell and Scala.

The languages that I can see that support the Curry-Howard correspondence (with type-checking at compile-time) are: Agda, Coq, Haskell, Scala, Shen, and Qi II.

My question is: which languages, apart from the list above, can do Curry-Howard type checking at compile-time?

by hawkeye at April 30, 2016 09:54 AM

Scala recursive API calls to get all the results

I'm pretty new to Scala, and what I'm trying to achieve is to make enough calls to an API with different offset until I get all the results.

Here's a simplified version of what I have, and I was wondering if there is a more idiomatic Scala way to do it. (The code sample might not be 100% accurate, it's just something I put up as an example)

def getProducts(
                 limit: Int = 50,
                 offset: Int = 0,
                 otherProducts: Seq[Product] = Seq()): Future[Seq[Product]] = {

  val eventualResponse: Future[ApiResponse] = apiService.getProducts(limit, offset)

  val results: Future[Seq[Product]] = eventualResponse.flatMap { response =>
    if (response.isComplete) {"Got every product!")
      Future.successful(response.products ++ otherProducts)
    } else {"Need to fetch more data...")
      val newOffset = offset + limit
      getProducts(limit, newOffset, otherProducts ++ response.products)
    }
  }

  results
}

Passing the otherProducts parameter just doesn't feel right :P

Thanks in advance for any suggestions :)

by Soroush Mirzaei at April 30, 2016 09:37 AM

MATLAB: The determinant of a covariance matrix is either 0 or inf

I have a 1500x1500 covariance matrix of which I am trying to calculate the determinant for the EM-ML method. The covariance matrix is obtained by finding the SIGMA matrix and then passing it into the nearestSPD library (Link) to make the matrix positive definite. In this case the matrix is always singular. Another method I tried was manually generating a positive definite matrix using the A'*A technique (A was taken as a 1600x1500 matrix). This always gives me the determinant as infinite. Any idea on how I can get a positive definite matrix with a finite determinant?
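
A common workaround for exactly this over/underflow is to avoid forming the determinant at all and work with the log-determinant, which is what the EM-ML log-likelihood needs anyway. A sketch of the idea in Python/NumPy (in MATLAB the equivalent via the Cholesky factor is `2*sum(log(diag(chol(S))))`):

```python
import numpy as np

# A smaller analogue of the 1600x1500 construction from the question:
rng = np.random.default_rng(0)
A = rng.standard_normal((400, 300))
S = A.T @ A + 1e-6 * np.eye(300)    # positive definite by construction

# det(S) itself can overflow to inf (or underflow to 0), but the
# log-determinant stays representable:
sign, logdet = np.linalg.slogdet(S)
print(sign)     # 1.0 for a positive definite matrix
print(logdet)   # finite, even when np.linalg.det(S) prints inf or 0
```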

by Timelapse at April 30, 2016 09:28 AM


Training accuracy in Tensorflow suddenly drops

I was training a ConvNet on CIFAR10 based on the code presented in the "Deep MNIST for Experts" tutorial by TensorFlow. After around a few thousand epochs, the training accuracy suddenly drops from 90% to 4%. Further inspection shows that, together with the drop in accuracy, the cross entropy and predicted scores also turn into NaN. Any ideas why this happens?

by amethystdnd at April 30, 2016 09:06 AM


Calibrating and simulating returns from a t-distribution

A slight twist (I hope) on the familiar problem of simulating log returns from a t distribution. My two questions concern calibration to sample data. First, one can infer the degrees of freedom in the t distribution, v, by equating the kurtosis of a sample of log returns with the kurtosis of the t distribution, which is 3(v-2)/(v-4). Alternatively, one could do the same thing with the variance, which is given by v/(v-2). In general, the two procedures will not yield the same value for v. Which is better? Or should one take a GMM approach?

My second concern involves scaling. It is often suggested that when simulating stock prices using a t-distribution, one should scale the sample volatility by the square root of (v-2)/v. I have found that this scaling can produce a density which (athough fatter tailed) is more peaked than the normally distributed returns for the same sample. This seems wrong.
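
For concreteness, both moment conditions invert in closed form (a sketch; here `k` is the full, not excess, sample kurtosis, the variance condition applies to standardized returns, and both formulas are only valid for v > 4):

```python
import math

def v_from_kurtosis(k):
    # Solve 3(v - 2)/(v - 4) = k for v (needs k > 3, i.e. fat tails).
    return (4 * k - 6) / (k - 3)

def v_from_variance(var):
    # Solve v/(v - 2) = var for v (needs var > 1).
    return 2 * var / (var - 1)

def t_vol_scale(sigma, v):
    # The sqrt((v-2)/v) scaling mentioned in the question: a standard t(v)
    # has variance v/(v-2), so shrink the sample vol to match the target sd.
    return sigma * math.sqrt((v - 2) / v)

print(v_from_kurtosis(6.0))    # 6.0: kurtosis 6 implies 6 degrees of freedom
print(v_from_variance(1.5))    # 6.0: the two estimates agree only by luck here
print(t_vol_scale(0.02, 6.0))  # scaled volatility to feed the t simulator
```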

by LukeG at April 30, 2016 08:46 AM


From text to K-Means Vectors input

I've just started diving into Machine Learning, specifically into Clustering. (I'm using Python, but this is irrelevant.) My goal is, starting from a collection of tweets (100K) about the fashion world, to perform KMeans over their text.

Till now I've filtered the texts, truncating stopwords, useless terms and punctuation, and done lemmatization (exploiting Part Of Speech tagging for better results).

I show the user the most frequent terms, hashtags, bigrams, trigrams, ..., 9-grams so that he can refine the preprocessing by adding words to the useless terms.

My initial idea was to use the top n (1K) terms as features, creating for each tweet a vector of fixed size n (1K) with a cell set to a value if the top term (of this cell) appears in the tweet (maybe calculating the cell's value with TF-IDF).

Am I missing something (the 0 values will be considered)? Can I exploit n-grams in some way?

This scikit article is pretty general and I'm not understanding the whole thing.

(Is LSA dimensionality reduction useful or is it better reducing the number of features (so vectors dimension) manually? )
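
The fixed-size TF-IDF vector idea can be sketched directly (made-up tweet texts and a tiny vocabulary size; in practice scikit-learn's TfidfVectorizer and KMeans do this at scale). Cells default to 0 and get a TF-IDF weight when the top term occurs in the tweet, which is exactly the kind of dense fixed-length input k-means expects:

```python
import math
from collections import Counter

tweets = [
    "new fashion week looks",
    "fashion brands love this new trend",
    "street style at fashion week",
]

# The top-n most frequent terms become the feature axes (n = 5 here, 1K in the question).
n = 5
freq = Counter(word for t in tweets for word in t.split())
vocab = [w for w, _ in freq.most_common(n)]

def tfidf_vector(text):
    words = text.split()
    tf = Counter(words)
    vec = []
    for term in vocab:
        # idf: log of (number of tweets / number of tweets containing the term)
        df = sum(1 for t in tweets if term in t.split())
        idf = math.log(len(tweets) / df)
        vec.append((tf[term] / len(words)) * idf)   # 0 when the term is absent
    return vec

vectors = [tfidf_vector(t) for t in tweets]
print(len(vectors), len(vectors[0]))   # 3 tweets, each a fixed-size 5-dim vector
```

Note that "fashion" occurs in every tweet, so its idf (and its column) is 0 everywhere: TF-IDF automatically downweights terms that cannot discriminate between clusters.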

by Jacopo at April 30, 2016 08:36 AM


2-thread consensus impl. with single FIFO queue and atomic read/write registers

I need to implement a 2-thread consensus with a single, initially empty FIFO queue and atomic read/write registers.

If deq() is called when the queue is empty, then a special EMPTY token is returned.

Any ideas anyone?


Note: the queue can't be initialized with elements inside it; the algorithm starts when the queue is empty, so the standard algorithm for 2-thread consensus using a queue cannot be applied here.

by Maor at April 30, 2016 08:03 AM

Hough transform: difference in cartesian to polar equation

According to Wikipedia and most other resources on the internet, the relation between cartesian coordinate and polar coordinate parameters are described by the equation $x\cos{\theta}+y\sin{\theta}=d$.

However, the computer vision course in Udacity used $x\cos{\theta}-y\sin{\theta}=d$ instead, as shown in this short video

What is the difference?

by Quevun at April 30, 2016 07:09 AM


How to write java program that simulates a "pseudo" MACHINE LEARNING program? [on hold]

I have to do this project where I have to solve a problem using my own "simple pseudo" machine learning algorithm. I will not be using any complex algorithms; I only have to use the technique in which the program learns a task from prior experience.

I really have no idea how to start this, since we have never even learned about machine learning, and I don't get how I can write code that will learn from prior experience by itself.

My directions state:

You will be training a self-driving Uber car (the machine) how to drive to a specific location. Using the attached Uber street map with various numbers that represent streets and houses which represent the target location, the Uber must begin at the start and learn how to drive to the target house. Build a user interface that asks the user to type in a location (house) for which they want to be driven to. I created the map with the houses/locations being represented by EVEN numbers. So the user must type in an EVEN number. Then let the Uber randomly select a ROUTE with a series of TURNs. The ROUTE would be similar to the GAME class that I gave you in the Machine Learning PowerPoint example and the TURN would be similar to the MOVE class that I gave you in the Machine Learning PowerPoint example. After the Uber arrives at a destination randomly the first time around, if it is NOT the correct destination, then the Uber has to go back to START and try it again. During the next ROUTE attempt and in the successive ROUTE attempts, the Uber will be referring back to its’ previous ROUTE TURN attempts and see what it learned as to whether the TURN was a good one or not (if it wasn’t a good turn, then the Uber is not going to be stupid enough to do it again). Ideally, after a few attempts, the Uber will ultimately learn how to get to the correct destination. The ROUTE attempts are done when the Uber finally arrives at the correct destination. MAKE SURE TO ADD COMMENTS IN YOUR CODE!!


by pyuntae at April 30, 2016 06:47 AM


Industry factors without GICS

I'm working through the Quantitative Equity Portfolio Management book by Chincarini and Kim.

I'd like to build a basic industry-based fundamental factor model. As this is a pet project for pedagogical purposes, I don't have the money to spend on Barra's GICS classifications. I also understand that other industry classifications (SIC and NAICS) are fairly useless for factor models.

Is there a reasonable open-source or homemade alternative (using, say, k-means clustering or non-negative matrix factorization) to create my own industry factors for US equities?

by MikeRand at April 30, 2016 06:08 AM

Planet Theory

They are perfect for each other

Now that Ted Cruz has chosen his running mate, everybody is wondering: who is going to be Trump’s pick for vice-president?

It would make sense if, to mitigate his negatives, Trump chose a person of color and someone who has a history of speaking out against income inequality.

He or she would have to be someone who is media-savvy and with some experience running a campaign, but definitely not a career politician. And of course he or she should be someone who endorsed Trump early on, like, say, in January.

I can think of only one person: Jimmy McMillan!


by luca at April 30, 2016 05:35 AM


What is the job of the Network Layer under the OSI reference model? [on hold]

What is the job of the Network Layer under the OSI reference model?

How does a network topology affect your decision in setting up a network?

by Joseph H at April 30, 2016 04:38 AM


How does Theory Of Computation actually been used in our Computer Systems and how does it interact with it directly or indirectly? [on hold]

I am doing engineering in INFORMATION SCIENCE and I am studying THEORY OF COMPUTATION and MACHINE INSTRUCTION in parallel.

Thank you

by Nilesh.Ghanti at April 30, 2016 04:36 AM

How to efficiently create balanced KD-Trees from a static set of points

From Wikipedia, KD-Trees:

Alternative algorithms for building a balanced k-d tree presort the data prior to building the tree. They then maintain the order of the presort during tree construction and hence eliminate the costly step of finding the median at each level of subdivision. [..] This algorithm presorts n points in each of k dimensions

I fail to understand how that is actually done. Consider this example:

Let's say I have 5 points in an array.

0 = (4, 7)
1 = (2, 9)
2 = (5, 4)
3 = (3, 6)
4 = (2, 1) 

I suppose we now want to create 2 sorted arrays, one by X and one by Y?

Sort them by X     Sort them by Y
1 = (2, 9)         4 = (2, 1) 
4 = (2, 1)         2 = (5, 4)
3 = (3, 6)         3 = (3, 6)
0 = (4, 7)         0 = (4, 7)
2 = (5, 4)         1 = (2, 9)

We start creating the KD-Tree. Let's split by the X axis at the first step. We select the median of the points, 3 = (3, 6), so the Left and Right trees will look like this:

Left  (first half)         
1 = (2, 9)   
4 = (2, 1)     

Right  (second half)
0 = (4, 7)      
2 = (5, 4)  

Now we want to sort the Left and the Right trees by the Y axis. How are we supposed to make use of the points that we sorted previously by the Y axis?

The only solution in my eyes is to re-sort the Left and Right trees respectively by the Y axis. What is Wikipedia talking about?
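
What the presorting algorithms do at this step is not a re-sort but a stable partition: sweep the Y-sorted array once and send each point left or right depending on its X-coordinate relative to the median, which preserves the Y order inside both halves in a single O(n) pass. A sketch with the five points from the example (distinct-X assumed for the comparison; real implementations break ties, e.g. by also comparing y):

```python
points = {0: (4, 7), 1: (2, 9), 2: (5, 4), 3: (3, 6), 4: (2, 1)}

by_y = [4, 2, 3, 0, 1]      # indices sorted by Y (from the question)
median = 3                  # chosen median point (3, 6), splitting on X

left_y, right_y = [], []
for i in by_y:              # one linear sweep, no sorting
    if i == median:
        continue
    if points[i][0] < points[median][0]:
        left_y.append(i)    # stays in Y order
    else:
        right_y.append(i)   # stays in Y order

print(left_y)    # [4, 1] -> (2,1), (2,9): already sorted by Y
print(right_y)   # [2, 0] -> (5,4), (4,7): already sorted by Y
```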

by Shiro at April 30, 2016 04:21 AM

Where Can I Find DFA Practice?

Where can I find a web site or book where I can practice drawing DFAs from a language like " {w| w has at least three a’s and at least two b’s} ". It will be important to have access to the answer so as to check myself. I need the practice.

by PenguinsAndApples at April 30, 2016 04:14 AM


best datamine/classification techniques [on hold]

Do there exist some usual, powerful techniques for data analysis which are common and suitable for various kinds of data in various types of situations?

For example, I need to classify new data, for part of which I already have the classification. I intend to try (my thought examples are below):

  1. apply PCA, then RandomForestDecisions;
  2. find the most significant columns via method X (using lib A in Python), then apply a Kohonen network to all the data that way;
  3. try SVM with Markov chains (see this example in R, and this one in Mathematica), and improve the result with the K-nearest method;
  4. use this toolkit to find data anomalies, and try a usual backpropagation NN (like here) or recurrent neural networks (like here);
  5. combine genetic algorithms (like this) with linear classifiers (I mean this).

I feel like an astronaut/cosmonaut diving into the infinity of the current range of data-mining tools and algorithms.

by Jo Ja at April 30, 2016 04:00 AM


Vertex Disjoint Path Covers of Hypercube-Like Graphs

This is a followup question relating to an older question I posted, namely: Decomposing the n-cube into vertex-disjoint paths.

Given a graph $G = (V, E)$ and sets of distinct vertices $S = \{s_1, \ldots s_k \}$ and $T = \{t_1, \ldots t_k\}$, I am interested in finding a vertex disjoint path cover $P = \{P_1, \ldots P_k\}$ of $G$ such that each $P_i$ begins with $s_i$ and ends with $t_i$. Moreover, for $P$ to be a path cover of $G$, every vertex $v \in V$ must be part of a unique path $P_i \in P$.

In my previous question, I was interested in the case when $G = \mathcal{Q}_n$ where $\mathcal{Q}_n$ denotes the $n$ dimensional hypercube graph. It was shown by Gregor and Dvorak that, when $P$ is balanced (in the sense that the paths contain the same number of vertices from both bipartition classes of the $n$-cube), such a cover exists whenever $2k-e < n$, where $e$ is the number of pairs $(s_i, t_i)$ that form edges in $\mathcal{Q}_n$.

Now I am interested in the same problem for a graph $G = \mathcal{Q}_n - \mathcal{Q}_d$ (i.e. a single copy of $\mathcal{Q}_d$ is deleted from $\mathcal{Q}_n$), for $1 \leq d \leq n$. Results were shown for the existence of Hamiltonian cycles for graphs $\mathcal{Q}_n - G$ when $G$ is an isometric tree or cycle, but nothing for the problem of path covers in the desired graph class $G = \mathcal{Q}_n - \mathcal{Q}_d$. Is anyone familiar with such results?

Edit: I should add I am primarily interested in results for $1 \leq k \leq 2$, though I would appreciate results for any $k$.

by user340082710 at April 30, 2016 03:11 AM


Feature weightage from Azure Machine Learning Deployed Web Service

I am trying to predict from my past data, which has around 20 attribute columns and a label. Out of those 20, only 4 are significant for prediction. But I also want to know, when a row falls into one of the classified categories, which other important correlated columns there are apart from those 4, and what their weights are. I want to get that result from my deployed web service on Azure.

by Stane Walsh at April 30, 2016 02:59 AM


get time slots available in any of workshop

I have a list of Workshops, each having open_days (Monday, Tuesday, etc.) and open_time and close_time (which will be the same for each day). Now, based on the current time, I need to find the next 7 days' available slots. A slot is 2 hours long and is available if any of the workshops is open during that 2-hour window. The first slot of each day will start at the open_time, rounded up to the next full hour (e.g. if open_time is 09:23:54 then the first slot's start time will be 10:00:00), of the workshop that opens first on that day.

How to find all the available slots?


workshops = [{"id":1,"open_days":[1,2,3,4,5,6],"open_time":"09:30:00","close_time":"19:30:00"},{"id":2,"open_days":[2,3,4,5,6,7],"open_time":"08:00:00","close_time":"16:30:00"}]

current_time = "2016-04-29 14:00:00"

Note: here open_days will be a list of days where Monday is 1 and Sunday is 7.


slots = {"2016-04-29":[{"start_time":"14:00:00","end_time":"16:00:00"},{"start_time":"16:00:00","end_time":"18:00:00"}],"2016-04-30":[{"start_time":"08:00:00","end_time":"10:00:00"},{"start_time":"10:00:00","end_time":"12:00:00"},{"start_time":"12:00:00","end_time":"14:00:00"},

by Anuj at April 30, 2016 02:53 AM



Can any finite problem be in NP-Complete?

My lecturer made the statement

Any finite problem cannot be NP-Complete

He was talking about Sudokus at the time, saying something along the lines that for an 8x8 Sudoku there is a finite set of solutions, but I can't remember exactly what he said. I wrote down the note that I've quoted but still don't really understand it.

Sudokus are NP-Complete if I'm not mistaken. The clique problem is also NP-Complete, and if I had a 4-Clique problem, is this not a finite problem that is NP-Complete?

by Aceboy1993 at April 30, 2016 02:26 AM



Computing the approximate population of a bloom filter

Given a Bloom filter of size N bits with K hash functions, where M bits (M <= N) of the filter are set.

Is it possible to approximate the number of elements inserted into the bloom filter?

Simple Example

I've been mulling over the following example, assuming a BF of 100 bits and 5 hash functions, where 10 bits are set...

Best case scenario: assuming the hash functions are really perfect and uniquely map a bit for some X number of values, then given that 10 bits have been set we can say that only 2 elements have been inserted into the BF.

Worst case scenario: assuming the hash functions are bad and each element's hashes consistently map to the same single bit (yet unique amongst elements), then we can say 10 elements have been inserted into the BF.

The range seems to be [2,10]; whereabouts in this range is probably determined by the false-positive probability of the filter. I'm stuck at this point.
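
There is a standard estimator for exactly this (often attributed to Swamidass and Baldi): each insertion sets K positions approximately uniformly, so the expected number of set bits after n insertions is N(1 - (1 - 1/N)^{Kn}); inverting with the usual exponential approximation gives n ≈ -(N/K)·ln(1 - M/N). A sketch on the numbers above:

```python
import math

def estimate_count(N, K, M):
    # Invert E[M] = N * (1 - (1 - 1/N)**(K * n)) for n,
    # using (1 - 1/N)**(K*n) ~ exp(-K*n/N).
    return -(N / K) * math.log(1.0 - M / N)

# 100-bit filter, 5 hash functions, 10 bits set:
print(estimate_count(100, 5, 10))   # ~2.1: close to the best case of 2,
                                    # since few collisions are expected yet
```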

by Tander Kulip at April 30, 2016 02:16 AM


Algorithm to convert rendered number back into symbolic form

If you have a number such as $3.14626437$ and you need to know what symbols create it, as far as I know, there are two tools:

1- ISC

2- wolframalpha

and the answer is $\sqrt2+\sqrt3$

I am wondering what algorithm these websites are using and what its complexity is.
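
As far as I know, such tools combine large lookup tables of constants with integer-relation algorithms such as PSLQ or LLL, which, given constants α_1…α_m known to d digits, find small integer coefficients c_i with Σ c_i·α_i ≈ 0 in time polynomial in m and d. A brute-force toy version of the lookup idea (tiny, made-up constant table and coefficient range):

```python
import math
from itertools import product

target = 3.14626437
constants = {"sqrt(2)": math.sqrt(2), "sqrt(3)": math.sqrt(3),
             "pi": math.pi, "e": math.e}

# Try all small integer combinations a*c1 + b*c2 and keep the closest match.
best = None
for (n1, v1), (n2, v2) in product(constants.items(), repeat=2):
    for a, b in product(range(-3, 4), repeat=2):
        err = abs(a * v1 + b * v2 - target)
        if best is None or err < best[0]:
            best = (err, f"{a}*{n1} + {b}*{n2}")

print(best[1])   # 1*sqrt(2) + 1*sqrt(3)
print(best[0])   # tiny residual, just the rounding of the 8-digit input
```

The real systems replace this exponential search with PSLQ, which scales to many constants and high precision.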

by ar2015 at April 30, 2016 02:07 AM

hubertf's NetBSD blog

OpenHUB's NetBSD Project Statistics

This flew by on Twitter (thanks ajcc @6LR61!), and I think it's neat, so I point to it here: BlackDuck's OpenHUB has a number of NetBSD project statistics, generated automatically. Stats include activity and vulnerability reports, languages, lines-of-code statistics (with comment and blank lines), 30-day and 12-month activity reports with commit and contributor numbers, the number of contributors per month since 1993, and more. In a nutshell, NetBSD consists of 5902 years of effort. Have a look!

April 30, 2016 01:58 AM


Using Headpose Vector and 2D Points to Compute Distances

I have a frame taken from a video. The frame contains a face and I have the (x, y) locations of the features (corners of lips, edge of eyebrows, etc.) and the headpose vector (pitch, yaw, roll), which shows the direction that the face is looking in degrees ((0, 0, 0) would be at the camera).

I need to calculate distances between specific points in real (3D) space. How can I map the feature locations to 3D space?
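
The 2D feature locations alone do not determine 3D distances; the usual approach is to assume a 3D model of the face (average 3D landmark positions), rotate it by the head pose, and then fit translation and scale to the 2D detections (a PnP problem, e.g. OpenCV's solvePnP). A minimal sketch of the rotation step only, with a made-up model point and one assumed angle convention (conventions differ between libraries):

```python
import math

def rotation_matrix(pitch, yaw, roll):
    # Angles in degrees; composition R = Rx(pitch) @ Ry(yaw) @ Rz(roll)
    # is one common convention, not the only one.
    p, yw, r = (math.radians(a) for a in (pitch, yaw, roll))
    Rx = [[1, 0, 0], [0, math.cos(p), -math.sin(p)], [0, math.sin(p), math.cos(p)]]
    Ry = [[math.cos(yw), 0, math.sin(yw)], [0, 1, 0], [-math.sin(yw), 0, math.cos(yw)]]
    Rz = [[math.cos(r), -math.sin(r), 0], [math.sin(r), math.cos(r), 0], [0, 0, 1]]
    def matmul(A, B):
        return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
                for i in range(3)]
    return matmul(Rx, matmul(Ry, Rz))

def rotate(R, v):
    return [sum(R[i][k] * v[k] for k in range(3)) for i in range(3)]

# Hypothetical 3D model coordinates for one landmark (model units, not real data):
lip_corner = [-22.0, -30.0, 10.0]
R = rotation_matrix(0.0, 90.0, 0.0)   # head turned 90 degrees (pure yaw)
print(rotate(R, lip_corner))          # x and z swap (up to sign) under pure yaw
```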

by ifyadig at April 30, 2016 01:41 AM

How does a recurrent connection in a neural network work?

I am reading a very interesting paper on genetic algorithms which define neural networks. I am familiar with how a feedforward neural network operates, but then I came across this:

Recurrent connection.

Where node #4 goes back to connect to #5. I was wondering how this is handled? Does the state of node 4 get kept from the last timestep and applied to node 5 when it is time to calculate its activation?
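
Yes, that is the usual discrete-time treatment: a recurrent input reads the source node's activation from the previous timestep, so the first step sees an initial state of 0. A toy sketch (made-up weights, two nodes, identity activation for clarity):

```python
# Toy recurrent wiring: node 5's net input includes node 4's activation
# from the PREVIOUS timestep (made-up weights, identity activation).
w_in, w_rec = 0.5, 0.8

prev_a4 = 0.0                 # recurrent state, initialised to 0 at t = 0
a5_history = []
for x in [1.0, 2.0, 3.0]:     # external input at t = 1, 2, 3
    a5 = w_in * x + w_rec * prev_a4   # reads LAST step's node-4 activation
    a4 = a5                           # node 4 fires this step (wiring 5 -> 4 -> 5)
    prev_a4 = a4                      # stored for the next timestep
    a5_history.append(a5)

print(a5_history)   # node 5's activation grows as past activity feeds back
```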

by Zach at April 30, 2016 01:20 AM



What does a negative stock amount mean in a single-period, binomial market model?

Consider a single-period, binomial market model with a $r > 0$ interest rate (in USD per period) and a portfolio $(x, y)$ consisting of two assets: a savings/lendings account and a stock, both measured in USD.

Now, both $x$ and $y$ may be positive or negative. If $x$ is positive, the savings account holds $x$ USD; if $x$ is negative, the account holder owes the bank $|x|$ USD. If $y$ is positive, the account holder has $y$ stocks in his/her possession.

What does it mean for $y$ to be negative?

The only idea I have is that a negative $y$ corresponds to short selling $y$ stocks. However, in the real world, short selling a stock is accompanied by setting up an interest-accruing margin account with the broker and possibly depositing an additional collateral, and this is not reflected in the model.

I understand that the model is a simplification of the real world, but I don't think ignoring an interest accruing debt is an acceptable simplification, and, in support of this I bring the following quote from Investopedia:

Most of the time, you can hold a short for as long as you want, although interest is charged on margin accounts, so keeping a short sale open for a long time will cost more.

It also simply doesn't make any economical sense, of the sort that exists even in the most simplified models of economic interactions, that one can borrow something of value without having to pay for it.

So what does it mean for $y$ to be negative? How can I wrap my mind around it?

EDIT: Here's my proposal for an answer, let me know what you think.

The difficulty arises from the fact that the word "stock" is a misnomer: the security referred to as a "stock" does not model a real-life stock, not even in simplified form. Rather, it models some other financial instrument that has no counterpart in real life, which, together with additional financial instruments, can be used to synthesize a simplified model of a real life stock.

A much better conceptualization of what the model "stock" means is it is a variation on a savings/lending account: whereas the value of a regular savings/lending account increments deterministically with time, the value of the so-called "stock" increments randomly.

From this follows the following conclusion: instead of referring to the model "stock" as such, it would be better to call it "a random savings/lending account", as opposed to "a deterministic savings/lending account", and save the term "stock" to a different financial instrument that actually models a real-life stock, and that can be synthesized from a combination of a deterministic and a random savings/lending accounts and possibly some other financial instruments.

by Evan Aad at April 30, 2016 01:00 AM


Observational Equivalence of open terms in PCF

The notion of observational equivalence is rather intuitive, but formally I'm having some doubts in the particular case of open terms.

Let's consider the simple case where the terms M and N are free variables, M = x and N = y. Choosing the program context C[-] = (λxy.[-]) v w we have,
C[M] = (λxy.x) v w →* v
C[N] = (λxy.y) v w →* w
So by definition M and N wouldn't be observationally equivalent.

However, consider any adequate denotational semantics of PCF. Then by definition of adequacy, ⟦M⟧ = ⟦N⟧ implies M and N observationally equivalent. The denotations of both (open) terms are,
⟦M⟧ = ⟦x ⊢ x⟧ = id
⟦N⟧ = ⟦y ⊢ y⟧ = id
From what we conclude M and N to be observationally equivalent.

So, surely I must be going wrong somewhere, and I suspect it to be some trivial mistake. But where?

by Adribar at April 30, 2016 12:56 AM


Sort a generator using generators?

I have a lot of data stored in generators, and I would like to sort them without using lists, so as to not run out of memory in the process. Is it possible to sort the generators this way? I have spent some hours thinking about this and I can't find a way to do it without saving the seen values somewhere (or is there a way of saving them "partially"?). I have read on Google about lazy sorting; is that a nice approach? Thanks for the answers!!

EDIT: My final objective is to write all the sorted data to a file.

PS: sorry about my bad english ><

by Zealot at April 30, 2016 12:32 AM


Global relabeling heuristic: Push-relabel maxflow

I have a correct, working implementation of the preflow-push-relabel maxflow algorithm [2]. I am trying to implement the global relabeling update heuristic [3], but have run into some issues.

I have a specific instance of the problem here to illustrate my questions (figure omitted):

a) For this problem instance, the "current preflow" is a state that is reached by my implemented max flow algorithm [[ even without global relabeling heuristic we reach this state ]]. Without applying any heuristics the algorithm proceeds to completion from this state giving the correct result.

b) The distance labels (in green) at this state represent a valid labeling, as for every $(u, v) \in E_{residual}$ we have $d_u \leq d_v + 1$.

Q1: The distance label (or height) is supposed to be a lower bound on the distance from the sink. However for several nodes in the residual graph (eg: 6, 7) this is not true. Eg. Node 7 has a distance label of 14...but clearly has a distance of 1 from the sink in the residual graph.

Q2: On running the global relabel at this stage, we get a labeling as seen on the extreme right. From this point on (depending on how frequently you do the global update), the algorithm can get stuck [[ deadlock ]] -- e.g. nodes 6 and 3 keep circulating flow between them as they each get relabeled to a higher height. If you run the global update before they reach the final height, you will reset the heights and the process repeats.

I am reasonably sure I am making a very trivial error but am unable to put my finger on it. Can someone help me with this issue? I am happy to provide a code snippet if that would help.

A couple more points that may be relevant:
-> I am implementing a single phase version of the push-relabel algorithm (ie. do not stop when you have a min. cut, but continue till you obtain a valid flow).
-> I am processing active vertices in FIFO order in the current problem instance. But I have a highest label first [3] implementation as well. I do not think this affects the two questions I have.

Thank You!

[[ This question has also been posted on programmers stackexchange -- I was unsure which forum is better for this question ]]

[2] A new approach to the maximum flow problem; A.Goldberg, R.Tarjan; JACM, Vol 35. Iss 4, 1988

[3] On implementing the push-relabel method for the maximum flow problem; B.V.Cherkassky, A.Goldberg

by user26203 at April 30, 2016 12:13 AM


true negative is 0% whereas true positive is 100% correctly classified

I used Naive Bayes from Spark's MlLib to train a model and test it on the data (in the form of an RDD). The results were confusing.

the data and results are as follows:

The problem is a binary classification one. The outcome should be either a label with '0' or '1'.

total number of labels with '0' in the testing dataset - 11774

total number of labels with '1' in the testing dataset - 246

Code for reference:

from pyspark.mllib.classification import LogisticRegressionWithLBFGS, LogisticRegressionModel
from pyspark.mllib.regression import LabeledPoint
from pyspark.mllib.util import MLUtils
from pyspark.mllib.evaluation import MulticlassMetrics

def parsePoint(line):
    values = [float(x) for x in line]
    return LabeledPoint(values[-1], values[0:-1])

data =

# Split data aproximately into training (60%) and test (40%)
training, test = data.randomSplit([0.6, 0.4], seed=0)

# Train a logistic regression model.
model = LogisticRegressionWithLBFGS.train(training, 1.0)

# labelsAndPreds = p: (p.label, model.predict(p.features)))
predictionAndLabels = lp: (float(model.predict(lp.features)), lp.label))
accuracy = 1.0 * predictionAndLabels.filter(lambda (v, p): v == p).count() / test.count()

after applying the model and obtaining the predictions :

True Positives - 11774

False Positives - 0

False Negatives - 246

True Negatives - 0

All my '0' labels are correctly classified, whereas all the '1' labels are incorrectly classified!

Now, this is a part of my project and I'm not sure if the results are fine to be submitted.

The code I wrote using Spark's Python API does this: it gets the data from a file and builds the RDD. I just fed this RDD into the example from Spark MLlib's Naive Bayes documentation provided on the website, and the result is as above.

Can someone please tell me if this result is normal?

by prog-life at April 30, 2016 12:07 AM

HN Daily

April 29, 2016


DragonFly BSD Digest

garbage[24]: It’s all fun and games until someone’s domain gets hurt

The garbage podcast for this week is up, with discussion of OpenBSD and TRIM, and, well, a very wide range of topics, going by the summary.

by Justin Sherrill at April 29, 2016 11:25 PM



FreeBSD ports collection under PC-BSD?

I'm toying with the idea of installing FreeBSD or PC-BSD... as it looks now, I'll probably go for PC-BSD.

However, one thing I really would like, is the FreeBSD ports collection. So I was wondering if it can be installed on PC-BSD? If it can be used with PC-BSD - without too much conflict with PC-BSD's own package-installation and package-repository (ie. AppCafe)? (E.g. building and installing some packages from the port-collection, then removing it with PC-BSD's GUI package-manager... Or building and installing a package from the port-collection, and then adding a package that depend on the first one with PC-BSD's GUI package-manager...)

And finally how I can install it? Can I install the port-collection directly by installing some package (which?) from the PC-BSD repository? Or must I download it separately from FreeBSD (where?) and install it "manually" (how?) in PC-BSD?

by Baard Kopperud at April 29, 2016 11:09 PM


Regression model when samples are small and not correlated

I received this question during an onsite interview for a quant job and I'm still scratching my head on how to solve this problem. Any help would be appreciated.

Mr Quant thinks that there is a linear relationship between past and future intraday returns, so he would like to test this idea. For convenience, he decided to parameterize returns in his data set using a regular time grid $(d, t)$, where $d = 0, \ldots, D-1$ labels the date and $t = 0, \ldots, T-1$ the intraday time period. For example, if we split the day into 10-minute intervals then $T = 1440 / 10$. His model written on this time grid has the following form:

$y_{d,t} = \beta_t \, x_{d,t} + \epsilon_{d,t}$

where $y_{d,t}$ is the return over the time interval $(t, t+1)$ and $x_{d,t}$ is the return over the previous time interval, $(t-1, t)$, on a given day $d$. In other words, he thinks that the previous 10-minute return predicts the future 10-minute return, but the coefficient between them might change intraday.

Of course, to fit $\beta_t$ he can use $T$ ordinary least square regressions, one for each “$t$”, but:

(a) his data set is fairly small $D$=300, $T$=100;

(b) he thinks that signal is very small, at best it has correlation with the target of 5%.

He hopes that some machine learning method that can combine regressions from nearby intraday times can help.

How would you solve this problem? Data provided is an $x$ matrix of predictors of size $300\times100$ and a $y$ matrix of targets of size $300\times100$.
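One common reading of "combine regressions from nearby intraday times" is kernel-weighted least squares: for each slot $t$, fit $\beta_t$ on observations from every slot $s$, down-weighted by the distance $|s-t|$. A minimal sketch of that idea (the Gaussian kernel and the `bandwidth` parameter are illustrative assumptions, not part of the problem statement):

```python
import math

def pooled_betas(x, y, bandwidth=5.0):
    """Kernel-weighted per-slot OLS: beta_t = sum(w*x*y) / sum(w*x*x),
    pooling observations from nearby intraday slots s via a Gaussian
    kernel in |s - t|. x and y are D x T lists of lists."""
    D, T = len(x), len(x[0])
    betas = []
    for t in range(T):
        num = den = 0.0
        for s in range(T):
            w = math.exp(-0.5 * ((s - t) / bandwidth) ** 2)
            for d in range(D):
                num += w * x[d][s] * y[d][s]
                den += w * x[d][s] ** 2
        betas.append(num / den if den else 0.0)
    return betas
```

Shrinking `bandwidth` recovers the $T$ separate OLS fits; growing it pools all slots into a single regression, which is the usual bias/variance trade-off for a small, noisy data set like this one.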

by cogolesgas at April 29, 2016 11:00 PM


Does IBM Watson API learn from my data?

I'm testing a couple of IBM Watson APIs: [image: screenshot of the API list]

Does Watson get smarter and learn more about my data the more I use it?

I read that Watson gets smarter the more data it learns from and processes. I'm not sure if this is only done behind the scenes by the IBM Watson team, or whether these APIs also allow an instance of Watson to get smarter with the specific application I'm developing.

by Marko at April 29, 2016 10:24 PM



Lists of Functions and their execution Erlang

Is it possible to create and send a list of functions as an argument to another function, and then have some functions within this list call other functions in the same list?
For example, I want a function that works on a list passed as an argument and performs the first function in the list of functions on that list of numbers; if that function makes calls to other functions within the list, they can be retrieved and used.

e.g.: deviation(Numbers, Functions) -> % Functions = [fun master/1, fun avg/1, fun meanroots/1]
master calls avg, then passes that result into meanroots, etc., but at the end of the call chain master is the one that returns a value.

I'd like to know if this is functionally possible, whether within OTP alone or using NIFs, and whether there are sample implementations to look at.

by Scy at April 29, 2016 10:07 PM


Minimal Steiner Tree in unweighted directed graph

I have an unweighted directed graph $(V, E)$ and a subset $T \subseteq V$ of these vertices. I want to find the minimum tree $(V',E')$ that contains all these $T$ vertices (minimize in number of nodes $|V'|$).

In my problem all vertices $t \in T$ have no leaving edges: $$\left\{(t,w) \in E \mid t \in T\right\}=\emptyset$$

This is a specific case of the Directed Steiner Tree problem.

What is the best algorithm, and its complexity, to find the exact solution to the Directed Steiner Tree problem? (Or, if there is one, a better solution to this specific case of the problem.)

What are the most used approximations for this problem?
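As a practical baseline, one widely used heuristic for the rooted variant (not an algorithm with a good approximation guarantee) is to take the union of shortest paths from the root to each terminal. A sketch, assuming a designated root vertex `root` (the question does not fix one) and the graph given as an adjacency dict:

```python
from collections import deque

def steiner_by_shortest_paths(adj, root, terminals):
    """Heuristic for the rooted directed Steiner tree: union of BFS
    shortest paths from `root` to each terminal (edges are unweighted).
    The result is a subtree of the BFS tree, hence a tree, but it is
    not guaranteed to be minimum (the exact problem is NP-hard).
    Returns the vertex set, or None if a terminal is unreachable."""
    parent = {root: None}
    q = deque([root])
    while q:                       # plain BFS recording parents
        u = q.popleft()
        for v in adj.get(u, []):
            if v not in parent:
                parent[v] = u
                q.append(v)
    used = {root}
    for t in terminals:            # walk each terminal back to the root
        if t not in parent:
            return None
        v = t
        while v not in used:
            used.add(v)
            v = parent[v]
    return used
```

Exact solutions typically use dynamic programming over subsets of the terminals (running time exponential in $|T|$), so a heuristic like this is often the practical starting point.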

by t.pimentel at April 29, 2016 10:02 PM


Implementing custom kernel in Matlab

I would like to use the PUK kernel for SVM in Matlab. The PUK kernel looks as follows:

[image: the PUK kernel formula]

In Matlab, I have to create a function which implements the kernel.

function G = myPUKKernel(U,V)


What would this function look like?

Second, what parameter ranges are reasonable to search for omega and sigma?

by machinery at April 29, 2016 09:48 PM

Jeff Atwood

They Have To Be Monsters

Since I started working on Discourse, I spend a lot of time thinking about how software can encourage and nudge people to be more empathetic online. That's why it's troubling to read articles like this one:

My brother’s 32nd birthday is today. It’s an especially emotional day for his family because he’s not alive for it.

He died of a heroin overdose last February. This year is even harder than the last. I started weeping at midnight and eventually cried myself to sleep. Today’s symptoms include explosions of sporadic sobbing and an insurmountable feeling of emptiness. My mom posted a gut-wrenching comment on my brother’s Facebook page about the unfairness of it all. Her baby should be here, not gone. “Where is the God that is making us all so sad?” she asked.

In response, someone — a stranger/(I assume) another human being — commented with one word: “Junkie.”

The interaction may seem a bit strange and out of context until you realize that this is the Facebook page of a person who was somewhat famous, who produced the excellent show Parks and Recreation. Not that this forgives the behavior in any way, of course, but it does explain why strangers would wander by and make observations.

There is deep truth in the old idea that people are able to say these things because they are looking at a screen full of words, not directly at the face of the person they're about to say a terrible thing to. That one level of abstraction the Internet allows, typing, which is so immensely powerful in so many other contexts …

… has some crippling emotional consequences.

As an exercise in empathy, try to imagine saying some of the terrible things people typed to each other online to a real person sitting directly in front of you. Or don't imagine, and just watch this video.

I challenge you to watch the entirety of that video. I couldn't do it. This is the second time I've tried, and I had to turn it off not even 2 minutes in because I couldn't take it any more.

It's no coincidence that these are comments directed at women. Over the last few years I have come to understand how, as a straight white man, I have the privilege of being immune from most of this kind of treatment. But others are not so fortunate. The Guardian analyzed 70 million comments and found that online abuse is heaped disproportionately on women, people of color, and people of different sexual orientation.

And avalanches happen easily online. Anonymity disinhibits people, making some of them more likely to be abusive. Mobs can form quickly: once one abusive comment is posted, others will often pile in, competing to see who can be the most cruel. This abuse can move across platforms at great speed – from Twitter, to Facebook, to blogposts – and it can be viewed on multiple devices – the desktop at work, the mobile phone at home. To the person targeted, it can feel like the perpetrator is everywhere: at home, in the office, on the bus, in the street.

I've only had a little taste of this treatment, once. The sense of being "under siege" – a constant barrage of vitriol and judgment pouring your way every day, every hour – was palpable. It was not pleasant. It absolutely affected my state of mind. Someone remarked in the comments that ultimately it did not matter, because as a white man I could walk away from the whole situation any time. And they were right. I began to appreciate what it would feel like when you can't walk away, when this harassment follows you around everywhere you go online, and you never really know when the next incident will occur, or exactly what shape it will take.

Imagine the feeling of being constantly on edge like that, every day. What happens to your state of mind when walking away isn't an option? It gave me great pause.

The Scream by Nathan Sawaya

I admired the way Stephanie Wittels Wachs actually engaged with the person who left that awful comment. This is a man who has two children of his own, and should be no stranger to the kind of pain involved in a child's death. And yet he felt the need to post the word "Junkie" in reply to a mother's anguish over losing her child to drug addiction.

Isn’t this what empathy is? Putting myself in someone else’s shoes with the knowledge and awareness that I, too, am human and, therefore, susceptible to this tragedy or any number of tragedies along the way?

Most would simply delete the comment, block the user, and walk away. Totally defensible. But she didn't. She takes the time and effort to attempt to understand this person who is abusing her mother, to reach them, to connect, to demonstrate the very empathy this man appears incapable of.

Consider the related story of Lenny Pozner, who lost a child at Sandy Hook, and became the target of groups who believe the event was a hoax, and similarly selflessly devotes much of his time to refuting and countering these bizarre claims.

Tracy’s alleged harassment was hardly the first, Pozner said. There’s a whole network of people who believe the media reported a mass shooting that never happened, he said, that the tragedy was an elaborate hoax designed to increase support for gun control. Pozner said he gets ugly comments often on social media, such as, “Eventually you’ll be tried for your crimes of treason against the people,” “… I won’t be satisfied until the caksets are opened…” and “How much money did you get for faking all of this?”

It's easy to practice empathy when you limit it to people that are easy to empathize with – the downtrodden, the undeserving victims. But it is another matter entirely to empathize with those that hate, harangue, and intentionally make other people's lives miserable. If you can do this, you are a far better person than me. I struggle with it. But my hat is off to you. There's no better way to teach empathy than to practice it, in the most difficult situations.

In individual cases, reaching out and really trying to empathize with people you disagree with or dislike can work, even people who happen to be lifelong members of hate organizations, as in the remarkable story of Megan Phelps-Roper:

As a member of the Westboro Baptist Church, in Topeka, Kansas, Phelps-Roper believed that AIDS was a curse sent by God. She believed that all manner of other tragedies—war, natural disaster, mass shootings—were warnings from God to a doomed nation, and that it was her duty to spread the news of His righteous judgments. To protest the increasing acceptance of homosexuality in America, the Westboro Baptist Church picketed the funerals of gay men who died of AIDS and of soldiers killed in Iraq and Afghanistan. Members held signs with slogans like “GOD HATES FAGS” and “THANK GOD FOR DEAD SOLDIERS,” and the outrage that their efforts attracted had turned the small church, which had fewer than a hundred members, into a global symbol of hatred.

Perhaps one of the greatest failings of the Internet is the breakdown in cost of emotional labor.

First we’ll reframe the problem: the real issue is not Problem Child’s opinions – he can have whatever opinions he wants. The issue is that he’s doing zero emotional labor – he’s not thinking about his audience or his effect on people at all. (Possibly, he’s just really bad at modeling other people’s responses – the outcome is the same whether he lacks the will or lacks the skill.) But to be a good community member, he needs to consider his audience.

True empathy means reaching out and engaging in a loving way with everyone, even those that are hurtful, hateful, or spiteful. But on the Internet, can you do it every day, multiple times a day, across hundreds of people? Is this a reasonable thing to ask of someone? Is it even possible, short of sainthood?

The question remains: why would people post such hateful things in the first place? Why reply "Junkie" to a mother's anguish? Why ask the father of a murdered child to publicly prove his child's death was not a hoax? Why tweet "Thank God for AIDS!"

Unfortunately, I think I know the answer to this question, and you're not going to like it.

Busy-Work by Shen

I don't like it. I don't want it. But I know.

I have laid some heavy stuff on you in this post, and for that, I apologize. I think the weight of what I'm trying to communicate here requires it. I have to warn you that the next article I'm about to link is far heavier than anything I have posted above, maybe the heaviest thing I've ever posted. It's about the legal quandary presented in the tragic cases of children who died because their parents accidentally left them strapped into car seats, and it won a much deserved Pulitzer. It is also one of the most harrowing things I have ever read.

Ed Hickling believes he knows why. Hickling is a clinical psychologist from Albany, N.Y., who has studied the effects of fatal auto accidents on the drivers who survive them. He says these people are often judged with disproportionate harshness by the public, even when it was clearly an accident, and even when it was indisputably not their fault.

Humans, Hickling said, have a fundamental need to create and maintain a narrative for their lives in which the universe is not implacable and heartless, that terrible things do not happen at random, and that catastrophe can be avoided if you are vigilant and responsible.

In hyperthermia cases, he believes, the parents are demonized for much the same reasons. “We are vulnerable, but we don’t want to be reminded of that. We want to believe that the world is understandable and controllable and unthreatening, that if we follow the rules, we’ll be okay. So, when this kind of thing happens to other people, we need to put them in a different category from us. We don’t want to resemble them, and the fact that we might is too terrifying to deal with. So, they have to be monsters.

This man left the junkie comment because he is afraid. He is afraid his own children could become drug addicts. He is afraid his children, through no fault of his, through no fault of anyone at all, could die at 30. When presented with real, tangible evidence of the pain and grief a mother feels at the drug related death of her own child, and the reality that it could happen to anyone, it became so overwhelming that it was too much for him to bear.

Those "Sandy Hook Truthers" harass the father of a victim because they are afraid. They are afraid their own children could be viciously gunned down in cold blood any day of the week, bullets tearing their way through the bodies of the teachers standing in front of them, desperately trying to protect them from being murdered. They can't do anything to protect their children from this, and in fact there's nothing any of us can do to protect our children from being murdered at random, at school any day of the week, at the whim of any mentally unstable individual with access to an assault rifle. That's the harsh reality.

When faced with the abyss of pain and grief that parents feel over the loss of their children, due to utter random chance in a world they can't control, they could never control, maybe none of us can ever control, the overwhelming sense of existential dread is simply too much to bear. So they have to be monsters. They must be.

And we will fight these monsters, tooth and nail, raging in our hatred, so we can forget our pain, at least for a while.

After Lyn Balfour’s acquittal, this comment appeared on the Charlottesville News Web site:

“If she had too many things on her mind then she should have kept her legs closed and not had any kids. They should lock her in a car during a hot day and see what happens.”

I imagine the suffering that these parents are already going through, reading these words that another human being typed to them, just typed, and something breaks inside me. I can't process it. But rather than pitting ourselves against each other out of fear, recognize that the monster who posted this terrible thing is me. It's you. It's all of us.

The weight of seeing through the fear and beyond the monster to simply discover yourself is often too terrible for many people to bear. In a world of hard things, it's the hardest there is.


by Jeff Atwood at April 29, 2016 09:47 PM


How to plot sigmoid probability curve in Scikitlearn?

I'm trying to recreate this image using Python, given 2 classes and their associated predicted probabilities from a classifier.

I want to see something like this: [image: a sigmoid curve]

It's not working though, as I get a mostly linear line. NOTE: I know the data shown is currently suspect and/or bad. I need to tune the input and model, but wanted to look at the plot first.

Basically, I thought I'd "correct" the predict_proba() output so the values are all with respect to the "0" class: if the classifier predicted class "1", the probability that it's class "0" is 1 − (class-"1" probability), so a 95% prediction of class "1" becomes a 5% chance of class "0". Then I sort in order of my corrected prediction values and should end up with something sigmoid-ish.

Unfortunately, I end up with this: [image: a mostly linear plot]

Here's a chunk of my python where I'm trying (unsuccessfully) to plot the probability sigmoid:

import matplotlib.pyplot as plt
testPred = clf.predict(X_test)            # predicted classes (used below)
testPredProbas = clf.predict_proba(X_test)
testPredProbas = testPredProbas[:, 1]     # probability of class "1"
#Sort in order to produce sigmoid curve
# from operator import itemgetter
# cosorted = [list(x) for x in zip(*sorted(zip(testPredProbas, y_test), key=itemgetter(0)))]
#cosorted = zip(testPredProbas, y_test)
from numpy import array, rec
c = rec.fromarrays([array(testPredProbas), array(y_test)])
cosorted = [c.f0, c.f1]   # fromarrays adds field names beginning with f0 automatically

sigmoid = cosorted[0]
classValue = cosorted[1]
xAxis = range(0,len(cosorted[0]))
numInstances = len(cosorted[0])

#Fix probabilities so they are all with respect to the 0 class
for idx, val in enumerate(testPredProbas):
    if testPred[idx] == 0:
        sigmoid[idx] = val          # already relative to class "0"
    elif testPred[idx] == 1:
        sigmoid[idx] = 1 - val      # flip to the class "0" probability

plt.plot(xAxis, sigmoid, lw=1, label='sigmoid plot')
plt.scatter(xAxis, classValue, marker='o', color='r', label='actual outcome')
plt.xlim([-0.05, numInstances + 0.05])
plt.ylim([-0.05, 1.05])
plt.ylabel('Prediction Probability')
plt.title('Sigmoid Curve')
plt.legend(bbox_to_anchor=(0., 1.02, 1., .102), loc=3,
           ncol=2, mode="expand", borderaxespad=0.)

For reference, below is the plot in Matlab that I'm trying to replicate in my Python model.

%Build the model
mdl = fitglm(X, Y, 'distr', 'binomial', 'link', 'logit')
%Build the sigmoid model
B = mdl.Coefficients{:, 1};
Z = mdl.Fitted.LinearPredictor
yhat = glmval(B, X, 'logit'); 
figure, scatter(Z, yhat), hold on,
gscatter(Z, zeros(length(X),1)-0.1, Y) % plot original classes
hold off, xlabel('\bf Z'),  grid on,  ylim([-0.2 1.05])
title('\bf Predicted Probability of each record')

by NumericOverflow at April 29, 2016 09:46 PM

Parameter Optimization and k-fold Cross-Validation

I was wondering about the correct data splitting for parameter optimization and later cross-validation.

In general, the data are split into a training and test set. Additionally, I split the training set into a smaller training and validation set to tune the SVM parameters (also CV) and are merged again to the final training set.

Everything is fine until now; I run into theoretical problems when I want to carry out 10-fold cross-validation at the end. The test set was left out to stay unbiased with respect to the tuned parameters. In my case just 1 out of the 10 folds is unbiased, as the other test folds were part of the training set during the tuning.

Are my assumptions correct and can you maybe help me out with a better splitting strategy?
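For what it's worth, the usual way out of this bias is nested cross-validation: the parameters are re-tuned inside each outer training fold, so every outer test fold stays unseen during tuning. A minimal index-level sketch (the fold counts and the interleaved splitting are arbitrary choices for illustration):

```python
def nested_cv_splits(n, outer_k=10, inner_k=5):
    """Yield (outer_train, outer_test, inner_folds) index lists.
    Parameters are tuned only on the inner folds, which are drawn from
    the outer training part, so every outer test fold is unseen
    during tuning."""
    idx = list(range(n))
    outer = [idx[i::outer_k] for i in range(outer_k)]   # interleaved folds
    for i in range(outer_k):
        test = outer[i]
        train = [j for fold in outer[:i] + outer[i + 1:] for j in fold]
        inner = [train[j::inner_k] for j in range(inner_k)]
        yield train, test, inner
```

With this structure, the averaged outer score is an unbiased estimate of the whole tuned procedure, at the price of running the tuning once per outer fold.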

by jobooo at April 29, 2016 09:40 PM

Development of the real-time recommendation system using Spark

I've just started learning spark and machine learning and I'm going to develop real-time recommendation system.

Assume there is an HBase with the following tables: User(id, age, gender, occupation, ZIP code), Movie(id, title, release date, IMDB_link, category), Data(user_id, movie_id, rating (1–5 scale), timestamp). The system should make recommendations to a user based on the user's previous grades of films.

Obviously, one should train a model and then show the user recommendations. But I do not understand how to organize retraining of the model after the user has watched one or more films. Should I keep my trained model in memory (like a static member)? Should the model be retrained after every new grade given to a film?


Here is how I understand things should work:

1) There is a history of grades which the user has assigned to different films.

2) When the user clicks "See recommendation", a request is sent to the server.

3) The server trains a model based on the user's previous grades of the films.

4) The user watches a new movie and assigns a grade to it.

5) After step 4, our model should be retrained to be more accurate.

Step 5 could take a lot of time, so the user would have to wait all this time to get a new recommendation. So the questions are: 1) Do I understand correctly how all these things should work? 2) Is there any way to avoid making the user wait after each new grade before obtaining a new recommendation?

by Oleksandr at April 29, 2016 09:39 PM

Loading other images into tensorflow, apart from MNIST

I am interested in applying convolutional neural networks using TensorFlow. However, the only tutorial I have seen loads the MNIST dataset. I have tried replicating the procedure done there and reading chunks of tutorials around the interwebs, but it's not working. Here is my code so far:

import tensorflow as tf
import os
import numpy as np

filename = os.getcwd() + '/sample_images/*.png'
filename_queue = tf.train.string_input_producer(tf.train.match_filenames_once(filename))

image_reader = tf.WholeFileReader()

_, image_file =

image = tf.image.decode_png(image_file, 3)
image = tf.image.rgb_to_grayscale(image, name=None)
image = tf.image.resize_images(image, 28, 28, method=0, align_corners=False)

data = []
with tf.Session() as sess:

    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)

    image_tensor =[image])
    data.append(image_tensor)

xx = np.asarray(data)
print xx[0].shape

Basically, I want to do the following:

  • load images in from the folder with their names

  • resize each image to 28*28

  • change it to grey scale

  • turn it into a tensor and add it to a training set

  • create its target (from its label) and add it to a numpy array

  • repeat for all images in the folder

When I am done, pass the dataset and targets to a TensorFlow RNN.

All help will be highly appreciated

by Agg-rey Muhebwa at April 29, 2016 09:28 PM


Best way to store hourly/daily options data for research purposes

There are quite a few discussions here about storage, but I can't find quite what I'm looking for.

I need to design a database to store (mostly) options data (strikes, bid/ask premiums, etc.). The problem I see with an RDBMS is that, given the large number of strikes, tables will be enormously long and hence slow to process. While I'm reluctant to use MongoDB or a similar NoSQL solution, for now it seems a very good alternative (quick, flexible, scalable).

  1. There is no need for tick data; it will be hourly and daily closing prices and whatever other parameters I'd want to add. So there is no need for frequent updates, and writing speed is not that important.

  2. The main performance requirement is in using it for data mining, stats and research, so it should be as quick as possible (and preferably easy) to pull and aggregate data from it. I.e., think of a 10-year backtest which performs ~100 transactions weekly over various types of options, or calculating a volatility swap over some extended period of time. So the quicker the better.

  3. There is a lot of existing historical data which will be transferred into the database, and it will be updated on a daily basis. I'm not sure exactly how much space it will take, but AFAIK storage should not be a constraint at all.

  4. Support by popular programming languages & packages (C++, Java, Python, R) is very preferable, but would not be a deal breaker.

Any suggestions?

by sashkello at April 29, 2016 09:26 PM


Generic conversational dataset? [on hold]

I'm looking for a strong conversational dataset. I've considered the public streaming Twitter API, but that didn't work (retweets, slang, poor grammar, etc.).

I'm looking for something generic, more along the lines of:

Text: Hello

Response: Hi

Text: How are you?

Response: Doing well (etc..)

by Ajax jQuery at April 29, 2016 09:26 PM


Merge k sorted arrays, each one's length is double than its' previous

I've seen many answers for merging identically sized arrays, but haven't seen an answer to this question yet.

Given $A_1, A_2, \ldots, A_k$ sorted arrays where $|A_i| = 2^i$, what is the best comparison-based algorithm to merge them all?
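For reference, one natural strategy for this size profile is to merge the arrays in increasing order of size: after merging $A_1,\dots,A_i$ the accumulator has fewer than $2^{i+1}$ elements, so the next two-way merge costs $O(2^{i+1})$, and the total is $O(n)$ comparisons for $n=\sum_i |A_i|$, versus $O(n \log k)$ for a plain $k$-way heap merge. A sketch (not claiming optimality in the exact constant):

```python
def merge_two(a, b):
    """Standard linear two-way merge of two sorted lists."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i]); i += 1
        else:
            out.append(b[j]); j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return out

def merge_doubling(arrays):
    """Merge sorted arrays in increasing size order. With |A_i| = 2^i
    the running total after step i stays below 2^(i+1), so each merge
    costs O(2^i) and the whole procedure uses O(n) comparisons."""
    acc = []
    for arr in sorted(arrays, key=len):
        acc = merge_two(acc, arr)
    return acc
```

The `sorted(arrays, key=len)` step costs only $O(k \log k)$, which is negligible next to the merging itself.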

by Daykvar at April 29, 2016 09:13 PM



Predict function of neuralnet gives odd results

I am relatively new to both machine learning techniques and programming in R, and at the moment I am trying to fit a neural network to some data that I have. However, the resulting predictions of the neural network do not make sense to me. I have looked through Stack Overflow but could not find a solution to this problem.

My data (this is a part of the test set, the training set is of the same format)

    target monday tuesday wednesday thursday friday saturday  independent
428    277      1       0         0        0      0        0        3317
429    204      0       1         0        0      0        0        1942
430    309      0       0         1        0      0        0        2346
431    487      0       0         0        1      0        0        2394
432    289      0       0         0        0      1        0        2023
433    411      0       0         0        0      0        1        1886
434    182      0       0         0        0      0        0        1750
435    296      1       0         0        0      0        0        1749
436    212      0       1         0        0      0        0        1810
437    308      0       0         1        0      0        0        2021
438    378      0       0         0        1      0        0        2494
439    329      0       0         0        0      1        0        2110
440    349      0       0         0        0      0        1        1933

My code

resultsnn <- neuralnet(target~monday+tuesday+wednesday+thursday+friday+saturday+independent,data=training,hidden=3,threshold=0.01,linear.output = TRUE)

My results (the predicted value is the same for ALL test set cases)

428 508.4962231
429 508.4962231
430 508.4962231
431 508.4962231
432 508.4962231
433 508.4962231
434 508.4962231
435 508.4962231
436 508.4962231
437 508.4962231
438 508.4962231
439 508.4962231
440 508.4962231

What else have I tried?

I have tried versions without the dummies (only including the independent variable, this does not change the type of results)

I have created some synthetic data and used this as an input, for the same code, this does work properly:

#building training set
input_train <-
output_train <-

train <-, input_train)
colnames(train) <- c("output","input")

#building test set
input_test <-
output_test <-

test <-, input_test)
colnames(test) <- c("output","input")

#neural network 3 neurons
res.train <- neuralnet(output~input,data=train,hidden=3,threshold=0.01) #train nn
compute(res.train,test[,2])$net.result #predict using nn on test set

I have also tried other packages (e.g., nnet and RSNNS), but these packages already fail to provide correct predictions when using the synthetic data.

Some additional information

Some additional information on the data types:

 'data.frame':  82 obs. of  8 variables:
  $ target     : int  277 204 309 487 289 411 182 296 212 308 ...
  $ monday     : int  1 0 0 0 0 0 0 1 0 0 ...
  $ tuesday    : int  0 1 0 0 0 0 0 0 1 0 ...
  $ wednesday  : int  0 0 1 0 0 0 0 0 0 1 ...
  $ thursday   : int  0 0 0 1 0 0 0 0 0 0 ...
  $ friday     : int  0 0 0 0 1 0 0 0 0 0 ...
  $ saturday   : int  0 0 0 0 0 1 0 0 0 0 ...
  $ independent: int  3317 1942 2346 2394 2023 1886 1750 1749 1810 2021 ...

 'data.frame':  397 obs. of  8 variables:
  $ target     : int  1079 1164 1069 1038 629 412 873 790 904 898 ...
  $ monday     : int  0 0 0 0 0 0 1 0 0 0 ...
  $ tuesday    : int  1 0 0 0 0 0 0 1 0 0 ...
  $ wednesday  : int  0 1 0 0 0 0 0 0 1 0 ...
  $ thursday   : int  0 0 1 0 0 0 0 0 0 1 ...
  $ friday     : int  0 0 0 1 0 0 0 0 0 0 ...
  $ saturday   : int  0 0 0 0 1 0 0 0 0 0 ...
  $ independent: int  2249 2381 4185 2899 2387 2145 2933 2617 2378 3569 ...

Please let me know if you need any additional information! Thanks for the help guys (:

by Tomas at April 29, 2016 09:09 PM


How to implement the regret matching algorithm?

My question is the following: How to calculate the regret in practice?

I am trying to implement the regret matching algorithm but I do not understand how to do it.

  • First, I have $n$ players with the joint action space $\mathcal{A}=\{a_0, a_1,\cdots,a_m\}^n.$
  • Then, I fix some period $T$. The action profile $A^t\in\mathcal{A}$ is the one chosen by the players at time $t$. After the period $T$, every player has chosen an action at each step, so I can compute $u_i(A^t)$.
  • Now the regret of player $i$ for not playing action $a_i$ in the past is (here $A^t\oplus a_i$ denotes the strategy set obtained if player $i$ changed its strategy from $a'_i$ to $a_i$): $$\max\limits_{a_i\in A_i}\left\{\dfrac{1}{T}\sum_{t\leqslant T}\left(u_i(A^t\oplus a_i )-u_i(A^t)\right)\right\}.$$ I do not understand how to calculate this summation. Why is there a max over the action $a_i\in A_i$? Should I calculate the regret of all actions in $A_i$ and take the maximum? Also, in Hart's paper, the maximum is $\max\{R, 0\}$. Why is there such a difference?

    I mean if the regret was: $\dfrac{1}{T}\sum_{t\leqslant T}\left(u_i(A^t\oplus a_i )-u_i(A^t)\right),$ the calculation would be easy for me.

The regret is defined in the following two papers [1] (see page 4, equation (2.1c)) and [2] (see page 3, section I, subsection B).

  1. A simple adaptive procedure leading to correlated equilibrium by S. Hart et al (2000)
  2. Distributed algorithms for approximating wireless network capacity by Michael Dinitz (2010)

I would like to get some help from you. Any suggestions, step by step, on how to implement such an algorithm?
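As an illustration of how the pieces fit together (this is my reading of Hart and Mas-Colell's procedure, not an authoritative implementation): you compute the average regret separately for every action $a_i \in A_i$, and the $\max\{R,0\}$ clipping is what turns those per-action regrets into nonnegative mixing weights for the next period. A sketch for a single player against a fixed opponent action sequence, with a hypothetical `payoff(a, b)` utility function:

```python
import random

def regret_matching(payoff, opponent_seq, n_actions, seed=0):
    """One player's regret-matching play. payoff(a, b) is the player's
    utility against opponent action b. Cumulative regret for action a is
    sum_t [u(a, b_t) - u(a_t, b_t)]; the next mixed strategy is
    proportional to max(regret, 0), falling back to uniform when every
    clipped regret is zero."""
    rng = random.Random(seed)
    regret = [0.0] * n_actions
    plays = []
    for b in opponent_seq:
        pos = [max(r, 0.0) for r in regret]       # the max{R, 0} clipping
        s = sum(pos)
        probs = [p / s for p in pos] if s > 0 else [1.0 / n_actions] * n_actions
        a = rng.choices(range(n_actions), weights=probs)[0]
        plays.append(a)
        u_played = payoff(a, b)
        for ai in range(n_actions):               # update every action's regret
            regret[ai] += payoff(ai, b) - u_played
    return plays, regret
```

In this toy run the opponent always plays one fixed action, so the player's positive regret concentrates on the best reply and play quickly locks onto it.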

by Blid at April 29, 2016 09:05 PM


Recur not at tail position

How can I use something similar to recur when it is not at a tail position?

Take a look at my code:

(defn -main [& args]

  (println "Hi! Type a file name...")

  (defn readFile[])
    (let [fileName(read-line)]
    (let [rdr (reader fileName)]
      (if-not (.exists rdr) 
        ((println "Sorry, this file doesn't exists. Type a valid file name...")
         (defn list '())
         (doseq [line (line-seq rdr)]
           (if-not (= "" line)
             (concat list '(line)))

(defn fileLinesList (readFile))

I know I can't use recur here... but I don't know how to do it in Clojure either.

I'm a newbie in Clojure and I'm coming from an OOP context. So...

Is there a way to use recursion in this case? What would be an alternative?

by Tiago Dall'Oca at April 29, 2016 09:02 PM

Loading Mllib models outside Spark

I'm training a model in spark with mllib and saving it:

val model = SVMWithSGD.train(training, numIterations), "~/model")

but I'm having trouble loading it from a java app without spark to make real time predictions.

SparkConf sconf = new SparkConf().setAppName("Application").setMaster("local");
SparkContext sc = new SparkContext(sconf);
SVMModel model = SVMModel.load(sc, "/model");

I'm getting:

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/SparkConf
    at ModelUser$.main(ModelUser.scala:11)
    at ModelUser.main(ModelUser.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(
    at java.lang.reflect.Method.invoke(
    at com.intellij.rt.execution.application.AppMain.main(
Caused by: java.lang.ClassNotFoundException: org.apache.spark.SparkConf

Is there a way to load the model in a normal Java app?

by fyllera at April 29, 2016 09:02 PM


Bad: Verfassungsschutz catches a ransomware trojan ...

Bad: the Verfassungsschutz (Germany's domestic intelligence agency) caught a ransomware trojan.

Worse: while cleaning up, they found a backdoor on their server.

Working with professionals for once!

April 29, 2016 09:00 PM



Efficient algorithms for supersingular isogeny Diffie-Hellman

Supersingular isogeny Diffie-Hellman is a "post-quantum" algorithm notable for having manageable key sizes. The implementation here is fast and constant-time. It also pairs with classic ECDH to provide hybrid security against classical and quantum attacks.


by tedu at April 29, 2016 08:59 PM


How many more edges can be added to a graph while keeping it acyclic? [migrated]

If I have a connected, directed graph with n vertices and m edges, is there some sort of formula that describes how many more edges can be added to the graph while keeping it acyclic?
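For what it's worth, if the graph is already acyclic it admits a topological order, and every edge of a DAG must point forward in that order, so a DAG on n vertices has at most n(n-1)/2 edges; hence at most n(n-1)/2 - m more can be added. A minimal sketch:

```python
def max_extra_edges(n, m):
    """Edges that can still be added to a DAG with n vertices and m edges.

    Fix a topological order of the DAG; every edge points forward in it,
    so the edge count is at most C(n, 2) = n*(n-1)//2, and the bound is
    achieved by adding all missing forward pairs.
    """
    return n * (n - 1) // 2 - m

# e.g. a directed path on 4 vertices (3 edges) can take 3 more edges
```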

by user50420 at April 29, 2016 08:52 PM



Multi-class cross validation LIBSVM in MATLAB

I am doing a multi-class SVM to solve a problem. I have 4 classes (1,2,3,4). I am also trying to implement the grid-search method to find the best values of C and gamma. The code I adapted is from the official LIBSVM FAQ page.

The code there is for a single-class problem. I tried to adapt it to fit my case, following the suggestion they give there.

I tried to find the best possible parameters (C and gamma) for each pair of two classes (one-against-all method). My code is as follows:

%# Find the best values of C and gamma using 10-fold cross-validation
for k = 1:numLabels
    bestcv(k) = 0;
    for log2c = -3:3
        for log2g = -5:4
            cmd = ['-v 10 -c ', num2str(2^log2c), ' -g ', num2str(2^log2g)];
            cv  = svmtrain(double(gridLabel==k), gridData, cmd);
            if cv >= bestcv(k)
                bestcv(k) = cv;
                bestc(k)  = 2^log2c;
                bestg(k)  = 2^log2g;
            end
            fprintf('%g %g %g (best c=%g, g=%g, rate=%g)\n', ...
                log2c, log2g, cv, bestc(k), bestg(k), bestcv(k));
        end
    end
end

Then, after finding the optimal parameters, I train the data with different parameters for every pair of classes (one against all) as in the following code:

model = cell(numLabels,1);
for k = 1:numLabels
    model{k} = svmtrain(double(trainLabel==k), trainData, ...
        ['-c ', num2str(bestc(k)), ' -g ', num2str(bestg(k)), ' -b 1']);
end

My questions are:

  • Is this the right way to do cross validation with multiple classes?
  • How can I measure the overall performance of my model given the different cross-validation values? If it's by plotting the cost function, how would I do that in my case?
  • How can I optimize the grid search method to fit it for my problem?

I am looking forward to reading your comments.
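Independent of MATLAB, the per-class search in the code above boils down to the following looping structure; `cv_accuracy` here is a hypothetical stand-in for the LIBSVM call with `-v 10`, not a real API:

```python
import itertools

def grid_search_per_class(labels, cv_accuracy):
    """For each one-vs-rest subproblem, pick (C, gamma) maximizing a
    cross-validation accuracy estimate.

    cv_accuracy(k, C, gamma) is a placeholder for training with
    '-v 10' on the binary labels (y == k) and returning the CV rate.
    Returns {class: (best_cv, best_C, best_gamma)}.
    """
    best = {}
    for k in labels:
        best[k] = max(
            ((cv_accuracy(k, 2.0**lc, 2.0**lg), 2.0**lc, 2.0**lg)
             for lc, lg in itertools.product(range(-3, 4), range(-5, 5))),
            key=lambda t: t[0],
        )
    return best
```

The ranges mirror the `log2c = -3:3` and `log2g = -5:4` loops in the MATLAB code; each class keeps its own best pair, as in the one-against-all setup above.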

by Kevin.hammet at April 29, 2016 08:32 PM


Why should there be an equity risk premium?

After years of mathematical finance I am still not satisfied with the idea of a risk premium in the case of stocks.

I agree that (often) there is a premium for long dated bonds, illiquid bonds or bonds with credit risk (which in fact they all have). I can explain this to my grandmother. My question is very much in the vein of this one. Yes, investors want to earn more than risk free but do they always get it? Or does the risk premium just fill the gap - sometimes positive sometimes negative? Finally: Do you know any really good publications where equity risk premium is explained and made plausible in the case of stocks? Does it make any sense to say "with $x\%$ volatility I expect a premium of $y\%$"? Sometimes stocks are just falling and there is risk and no premium. What do you think?

by Richard at April 29, 2016 08:08 PM


Writing a neural network using python + numpy

I am trying to program a neural network without an ML library for a class. The graph of my cost function decreases smoothly when training the network with a small sample (~50). However, with a larger sample (100+), instead of decreasing, the graph looks more like a cardiogram. Can someone point out what I am doing wrong? I have tried different learning rates, but nothing works. Here is my Python code:

import numpy as np
import matplotlib.pyplot as plt

n_samples = 500
n_features = 10
n_outputs = 4
n_hidden0 = 10
n_hidden1 = 11

# Generating training data
x = np.random.normal(size=(n_samples, n_features))
y = np.random.multinomial(1, [1.0/n_outputs]*n_outputs, size=n_samples).astype(np.float64)

# Generating the weights
w0 = np.random.uniform(-0.1, +0.1, size=(n_hidden0, x.shape[1]))
b0 = np.zeros((1, n_hidden0))

w1 = np.random.uniform(-0.1, +0.1, size=(n_hidden1, n_hidden0))
b1 = np.zeros((1, n_hidden1))

w2 = np.random.uniform(-0.1, +0.1, size=(y.shape[1], n_hidden1))
b2 = np.zeros((1, y.shape[1]))

def softmax(z):
    m = z.max(1)[:, None]  # improves numerical stability of softmax
    e = np.exp(z - m)
    return e / e.sum(1)[:, None]

def softmax_loss(w0, b0, w1, b1, w2, b2, x, y):
    _, _, _, out = forward_pass(w0, b0, w1, b1,w2, b2, x)
    err = -np.sum(y * np.log(out + 1e-16)) #/ x.shape[0]
    return err

def forward_pass(w0, b0, w1, b1, w2, b2, x):
    z =, w0.T) + b0
    h0 = np.tanh(z)

    z =, w1.T) + b1
    h1 = np.tanh(z)

    z =, w2.T) + b2
    o = softmax(z)
    return [x, h0, h1, o]

def backward_pass(activations, w0, w1, w2, y):
    x, h0, h1, o = activations
    delta = o - y
    dw2 =, h1)
    db2 = delta.sum(0)

    delta =, w2)
    delta *= (1 - h1*h1)
    dw1 =, h0)
    db1 = delta.sum(0)

    delta =, w1)
    delta *= (1 - h0*h0)
    dw0 =, x)
    db0 = delta.sum(0)
    return [dw0, db0, dw1, db1, dw2, db2]

#updating the weights
eta = 0.03
epochs = 200
softmax_cost = []
for i in range(epochs):
    activations = forward_pass(w0, b0, w1, b1, w2, b2, x)
    dw0, db0, dw1, db1, dw2, db2 = backward_pass(activations, w0, w1, w2, y)

    w0 -= eta * dw0
    b0 -= eta * db0

    w1 -= eta * dw1
    b1 -= eta * db1

    w2 -= eta * dw2
    b2 -= eta * db2

    cost = softmax_loss(w0, b0, w1, b1, w2, b2, x, y)
    softmax_cost.append(cost)
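One thing worth checking, offered as an observation rather than a definitive fix: the loss only divides by `x.shape[0]` in its commented-out form, and the gradients are sums over all samples, so the effective step size grows with the sample size; averaging the gradient keeps `eta` comparable across sample sizes. A minimal sketch:

```python
import numpy as np

def scaled_update(w, grad, eta, n_samples):
    """One gradient step with the summed gradient averaged over the batch,
    so the same eta behaves similarly for 50 or 500 samples."""
    return w - eta * grad / n_samples
```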


by Potato-power at April 29, 2016 08:05 PM