Planet Primates

October 07, 2016

Planet Theory

Linear algebraic structure of word meanings

Word embeddings capture the meaning of a word using a low-dimensional vector and are ubiquitous in natural language processing (NLP). (See my earlier post 1 and post2.) It has always been unclear how to interpret the embedding when the word in question is polysemous, that is, has multiple senses. For example, tie can mean an article of clothing, a drawn sports match, and a physical action.

Polysemy is an important issue in NLP and much work relies upon WordNet, a hand-constructed repository of word senses and their interrelationships. Unfortunately, good WordNets do not exist for most languages, and even the one in English is believed to be rather incomplete. Thus some effort has been spent on methods to find different senses of words.

In this post I will talk about my joint work with Li, Liang, Ma, and Risteski, which shows that word senses are in fact easily accessible in many current word embeddings. This goes against the conventional wisdom in NLP that word embeddings cannot capture polysemy, since they use a single vector to represent the word regardless of whether the word has one sense or a dozen. Our work shows that major senses of the word lie in linear superposition within the embedding, and are extractable using sparse coding.

This post uses embeddings constructed using our method and the Wikipedia corpus, but similar techniques also apply (with some loss in precision) to other embeddings described in post 1 such as word2vec, GloVe, or even the decades-old PMI embedding.

A surprising experiment

Take the viewpoint (simplistic yet instructive) that a polysemous word like tie is a single lexical token that represents unrelated words tie1, tie2, … Here is a surprising experiment that suggests that the embedding for tie should be approximately a weighted sum of the (hypothetical) embeddings of tie1, tie2, …

Take two random words $w_1, w_2$. Combine them into an artificial polysemous word $w_{new}$ by replacing every occurrence of $w_1$ or $w_2$ in the corpus by $w_{new}.$ Next, compute an embedding for $w_{new}$ using the same embedding method while deleting embeddings for $w_1, w_2$ but preserving the embeddings for all other words. Compare the embedding $v_{w_{new}}$ to linear combinations of $v_{w_1}$ and $v_{w_2}$.

Repeating this experiment with a wide range of values for the ratio $r$ between the frequencies of $w_1$ and $w_2$, we find that $v_{w_{new}}$ lies close to the subspace spanned by $v_{w_1}$ and $v_{w_2}$: the cosine of its angle with the subspace is on average $0.97$ with standard deviation $0.02$. Thus $v_{w_{new}} \approx \alpha v_{w_1} + \beta v_{w_2}$. We find that $\alpha \approx 1$ whereas $\beta \approx 1- c\lg r$ for some constant $c\approx 0.5$. (Note this formula is meaningful when the frequency ratio $r$ is not too large, i.e. when $ r < 10^{1/c} \approx 100$.) Thanks to this logarithm, the infrequent sense is not swamped out in the embedding, even if it is 50 times less frequent than the dominant sense. This is an important reason behind the success of our method for extracting word senses.
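To make the "cosine of its angle with the subspace" measurement concrete, here is a minimal sketch in plain Python. It is not the code used in the paper; the function names are made up for illustration, and it assumes the two spanning vectors are linearly independent.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cosine_with_span(v, u1, u2):
    """Cosine of the angle between v and the plane spanned by u1 and u2.

    Assumes u1 and u2 are linearly independent (hypothetical helper).
    """
    # Gram-Schmidt: build an orthonormal basis (e1, e2) of span{u1, u2}.
    n1 = math.sqrt(dot(u1, u1))
    e1 = [a / n1 for a in u1]
    coeff = dot(u2, e1)
    w = [b - coeff * a for a, b in zip(e1, u2)]
    n2 = math.sqrt(dot(w, w))
    e2 = [a / n2 for a in w]
    # Length of the projection of v onto the plane, divided by the length of v.
    proj_len = math.sqrt(dot(v, e1) ** 2 + dot(v, e2) ** 2)
    return proj_len / math.sqrt(dot(v, v))
```

A value near 1 means the vector lies almost inside the plane, which is what the experiment reports for $v_{w_{new}}$ relative to $v_{w_1}$ and $v_{w_2}$.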

This experiment –to which we were led by our theoretical investigations– is very surprising because the embedding is the solution to a complicated, nonconvex optimization, yet it behaves in such a striking linear way. You can read our paper for an intuitive explanation using our theoretical model from post2.

Extracting word senses from embeddings

The above experiment suggests that

$$v_{tie} \approx \alpha_1 v_{tie1} + \alpha_2 v_{tie2} + \alpha_3 v_{tie3} + \cdots \qquad (1)$$

but this alone is insufficient to mathematically pin down the senses, since $v_{tie}$ can be expressed in infinitely many ways as such a combination. To pin down the senses we will interrelate the senses of different words —for example, relate the “article of clothing” sense tie1 with shoe, jacket etc.

The word senses tie1, tie2, … correspond to “different things being talked about”, in other words, different word distributions occurring around tie. Now remember that our earlier paper described in post2 gives an interpretation of “what’s being talked about”: it is called discourse and it is represented by a unit vector in the embedding space. In particular, the theoretical model of post2 imagines a text corpus as being generated by a random walk on discourse vectors. When the walk is at a discourse $c_t$ at time $t$, it outputs a few words using a loglinear distribution:

$$\Pr[w \text{ emitted at time } t \mid c_t] \propto \exp(\langle c_t, v_w \rangle) \qquad (2)$$

One imagines there exists a “clothing” discourse that has high probability of outputting the tie1 sense, and also of outputting related words such as shoe, jacket, etc. Similarly there may be a “games/matches” discourse that has high probability of outputting tie2 as well as team, score etc.
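As a rough illustration of such a log-linear emission rule, the sketch below turns inner products between a discourse vector and word vectors into probabilities proportional to $\exp(\langle c_t, v_w\rangle)$. The function name and any vectors passed to it are hypothetical; this is not code from the paper.

```python
import math

def emission_probabilities(discourse, word_vectors):
    """Probabilities proportional to exp(<discourse, v_w>) for each word w.

    `word_vectors` maps a word to its vector (toy inputs, for illustration).
    """
    scores = {w: sum(a * b for a, b in zip(discourse, v))
              for w, v in word_vectors.items()}
    top = max(scores.values())  # subtract the max for numerical stability
    weights = {w: math.exp(s - top) for w, s in scores.items()}
    total = sum(weights.values())
    return {w: x / total for w, x in weights.items()}
```

Words whose vectors have a larger inner product with the discourse vector receive exponentially more probability mass, which is why a “clothing” discourse would favor shoe, jacket and tie1.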

By equation (2) the probability of being output by a discourse is determined by the inner product, so one expects that the vector for “clothing” discourse has high inner product with all of shoe, jacket, tie1 etc., and thus can stand as surrogate for $v_{tie1}$ in expression (1)! This motivates the following global optimization:

Given word vectors in $\Re^d$, totaling about $60,000$ in this case, a sparsity parameter $k$, and an upper bound $m$, find a set of unit vectors $A_1, A_2, \ldots, A_m$ such that

$$v_w = \sum_{j=1}^m \alpha_{w,j} A_j + \eta_w \qquad (3)$$

where at most $k$ of the coefficients $\alpha_{w,1},\dots,\alpha_{w,m}$ are nonzero (so-called hard sparsity constraint), and $\eta_w$ is a noise vector.

Here $A_1, \ldots A_m$ represent important discourses in the corpus, which we refer to as atoms of discourse.

Optimization (3) is a surrogate for the desired expansion of $v_{tie}$ in (1) because one can hope that the atoms of discourse will contain atoms corresponding to clothing, sports matches etc. that will have high inner product (close to $1$) with tie1, tie2 respectively. Furthermore, restricting $m$ to be much smaller than the number of words ensures that each atom needs to be used for multiple words, e.g., reuse the “clothing” atom for shoes, jacket etc. as well as for tie.

Both $A_j$’s and $\alpha_{w,j}$’s are unknowns in this optimization. This is nothing but sparse coding, useful in neuroscience, image processing, computer vision, etc. It is nonconvex and computationally NP-hard in the worst case, but can be solved quite efficiently in practice using something called the k-SVD algorithm described in Elad’s survey, lecture 4. We solved this problem with sparsity $k=5$ and using $m$ about $2000$. (Experimental details are in the paper. Also, some theoretical analysis of such an algorithm is possible; see this earlier post.)
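For intuition only, here is a sketch of the coefficient-finding half of sparse coding with the atoms held fixed. It uses plain greedy matching pursuit, which is a simpler stand-in and not the k-SVD algorithm mentioned above (k-SVD also updates the atoms). The function name is an assumption, and the atoms are assumed to be unit vectors.

```python
def matching_pursuit(v, atoms, k):
    """Greedy sparse code of v over fixed unit-norm atoms, with at most k steps.

    Returns (coefficients by atom index, residual). Illustrative stand-in only.
    """
    residual = list(v)
    coeffs = {}
    for _ in range(k):
        # Inner product of the current residual with every atom.
        scores = [sum(r * a for r, a in zip(residual, atom)) for atom in atoms]
        j = max(range(len(atoms)), key=lambda i: abs(scores[i]))
        if abs(scores[j]) < 1e-12:
            break  # nothing left to explain
        coeffs[j] = coeffs.get(j, 0.0) + scores[j]
        residual = [r - scores[j] * a for r, a in zip(residual, atoms[j])]
    return coeffs, residual
```

The residual plays the role of the noise vector $\eta_w$, and since the loop runs at most $k$ times, at most $k$ coefficients are nonzero.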

Experimental Results

Each discourse atom defines via (2) a distribution on words, which due to the exponential appearing in (2) strongly favors words whose embeddings have a larger inner product with it. In practice, this distribution is quite concentrated on as few as 50-100 words, and the “meaning” of a discourse atom can be roughly determined by looking at a few nearby words. This is how we visualize atoms in the figures below. The first figure gives a few representative atoms of discourse.

A few of the 2000 atoms of discourse found

And here are the discourse atoms used to represent two polysemous words, tie and spring

Discourse atoms expressing the words tie and spring.

You can see that the discourse atoms do correspond to senses of these words.

Finally, we also have a technique that, given a target word, generates representative sentences according to its various senses as detected by the algorithm. Below are the sentences returned for ring. (N.B. The mathematical meaning was missing in WordNet but was picked up by our method.)

Representative sentences for different senses of the word ring.

A new testbed for testing comprehension of word senses

Many tests have been proposed to test an algorithm’s grasp of word senses. They often involve hard-to-understand metrics such as distance in WordNet, or are sometimes tied to performance on specific applications like web search.

We propose a new simple test, inspired by word-intrusion tests for topic coherence due to Chang et al 2009, which has the advantages of being easy to understand and of being administrable to humans as well.

We created a testbed using 200 polysemous words and their 704 senses according to WordNet. Each “sense” is represented by a set of 8 related words; these were collected from WordNet and online dictionaries by college students who were told to identify most relevant other words occurring in the online definitions of this word sense as well as in the accompanying illustrative sentences. These 8 words are considered as ground truth representation of the word sense: e.g., for the “tool/weapon” sense of axe they were: handle, harvest, cutting, split, tool, wood, battle, chop.

Police line-up test for word senses: the algorithm is given a random one of these 200 polysemous words and a set of $m$ senses which contain the true senses for the word as well as some distractors, which are randomly picked senses from other words in the testbed. The test taker has to identify the word’s true senses among these $m$ senses.

As usual, accuracy is measured using precision (what fraction of the algorithm/human’s guesses were correct) and recall (how many correct senses were among the guesses).
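The two scores can be computed as in the following sketch. The function name and the convention of returning 0 for an empty set are assumptions made for illustration.

```python
def precision_recall(guessed, true_senses):
    """Precision and recall of a set of guessed senses against the true senses."""
    guessed, true_senses = set(guessed), set(true_senses)
    correct = len(guessed & true_senses)
    precision = correct / len(guessed) if guessed else 0.0
    recall = correct / len(true_senses) if true_senses else 0.0
    return precision, recall
```

For instance, a test taker who names four senses, two of which are among the word's three true senses, gets precision 2/4 and recall 2/3.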

For $m=20$ and $k=4$, our algorithm succeeds with precision $63\%$ and recall $70\%$, and performance remains reasonable for $m=50$. We also administered the test to a group of grad students. Native English speakers had precision/recall scores in the $75$ to $90$ percent range. Non-native speakers had scores roughly similar to our algorithm.

Our algorithm works something like this: If $w$ is the target word, then take all discourse atoms computed for that word, and compute a certain similarity score between each atom and each of the $m$ senses, where the words in the senses are represented by their word vectors. (Details are in the paper.)


Word embeddings have been useful in a host of other settings, and now it appears that they also can easily yield different senses of a polysemous word. We have some subsequent applications of these ideas to other previously studied settings, including topic models, creating WordNets for other languages, and understanding the semantic content of fMRI brain measurements. I’ll describe some of them in future posts.

October 07, 2016 01:00 PM

September 30, 2016


Why can't HashingTF in Spark MLlib produce sparse vectors?

Why does the number of features produced by HashingTF have to be limited? I've tried to set it to Int.MaxValue, but it runs out of memory while trying to allocate a huge array. Why can't it produce a sparse vector?
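For reference, the hashing trick itself does not need a dense array. The sketch below is plain Python, not Spark's implementation; the function name and the use of `zlib.crc32` as the hash are assumptions. It stores only the indices that actually occur.

```python
import zlib
from collections import Counter

def hashing_tf(tokens, num_features):
    """Sparse term-frequency vector as {index: count}, via the hashing trick."""
    counts = Counter()
    for tok in tokens:
        # crc32 is a stand-in hash; any deterministic hash would do here.
        idx = zlib.crc32(tok.encode("utf-8")) % num_features
        counts[idx] += 1
    return dict(counts)
```

With this representation the memory used grows with the number of distinct tokens in a document, not with `num_features`.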

by lizarisk at September 30, 2016 10:53 AM

Detecting time series patterns R or Matlab

I have sales data of about 700K rows and three columns: Date (in dates), Store (represented by numbers; there are more than 4800 different stores) and Sales amount (integer). It looks like this:

Date      Sales  Store
6/9/2012   392   7184
6/9/2012   507   584
6/9/2012   1279  3060
6/9/2012   503   5572

My objective is to detect similar patterns in the Sales data and to find which store is assigned to which pattern. There are a couple of pattern detection questions already asked, but this time the patterns need to be linked to stores.

I know how to use Minitab, but it is not useful for this question. I guess R or Matlab would work, but I'm really a novice in these languages. Probably R would help more with the problems I'll face later, like forecasting, goodness of fit, etc. I need general help on, for instance, which package to use and how. I know it sounds open ended, but heck, that's all I can do right now :/


by volkang at September 30, 2016 10:49 AM

Parenthesis Balancing Algorithm

I'm working on a little program in Scala that checks whether an expression is correctly formed with respect to the opening and closing of parentheses. It is the same problem as here, but with my own program.

def balance(chars: List[Char]): Boolean = {
  def balanced(chars: List[Char], opened: Int, closed: Int): Boolean = {
    if (chars.isEmpty && opened == closed) true
    else if (opened < closed) false
    else {
      if (chars.head == '(') balanced(chars.tail, opened + 1, closed)
      if (chars.head == ')') balanced(chars.tail, opened, closed + 1)
      else balanced(chars.tail, opened, closed)
    }
  }
  balanced(chars, 0, 0)
}


println(balance("I told him (that it's not (yet) done).\n(But he wasn't listening)".toList))

The problem is that it does not work, for example, on this input. I traced the program, and apparently the problem arises when we return from the recursive calls, but I cannot figure out what the error is.
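One likely culprit, offered as a hedged reading of the Scala code: the first inner `if` has no `else`, so the value of its recursive call is discarded and the opening parenthesis is never counted; chaining the three branches with `else if` would avoid that. The same counting idea is sketched below in Python (used here only for illustration, with a single depth counter instead of the two counters in the question).

```python
def balance(chars):
    """True if every ')' closes an earlier '(' and nothing is left open."""
    depth = 0
    for ch in chars:
        if ch == "(":
            depth += 1
        elif ch == ")":  # 'elif' mirrors the 'else if' the Scala code lacks
            depth -= 1
            if depth < 0:
                return False  # a ')' appeared before its matching '('
    return depth == 0
```

Note also that the Scala version would call `chars.head` on an empty list when the input ends with unequal counts, so that case may need its own branch.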

by Rodrigo at September 30, 2016 10:42 AM


how type checking fails?

I was working through a type checking example in System F sub on paper to understand how it works.

According to Pierce's book Types and Programming Languages, numbers and their types in System F sub are as follows (see chapter 26.3, page 399):

   1)top is universal type, 
   2)capital letters are type variables, small letters are term variables  

    church number 1
    sone = λA<:top.λB<:A.λC<:A.λx:(A-->B).λy:C.x y
    stwo = λA<:top.λB<:A.λC<:A.λx:(A-->B).λy:C.x (x y)

    type for church number 0
    SZero = ∀A<:top.∀B<:A.∀C<:A.(A-->B)-->C-->C

    type for numbers except 0
    SPos = ∀A<:top.∀B<:A.∀C<:A.(A-->B)-->C-->B
    SNat = ∀A<:top.∀B<:A.∀C<:A.(A-->B)-->C-->A

for the type checking

   Γ |-  stwo :  SZero  

should fail.

The book says "SPos is inhabited by all the elements of SNat except SZero".
Also, I saw there is an input test file that shows the above example should fail.

Therefore, I assume

 Γ |-  sone :  SZero  

should fail too.

I want to see how it is going to fail and did pen-paper type checking as following (see type checking rules in the same book, previous page)

  1)for convenience, I wrote from top to down fashion
  2)variables are given distinct

   Γ |-  (λA<:top.λB<:A.λC<:A.λx:(A-->B).λy:C.x y)  
       : (∀A1<:top.∀B1<:A1.∀C1<:A1.(A1-->B1)-->C1-->C1)
   ---------------------------------------------(T-TABS) =>assume A=A1,top=top
                                                             ,and renamed A1 to A 
  A<:top |- (λB<:A.λC<:A.λx:(A-->B).λy:C.x y) 
  ----------------------------------------------(T-TABS) =>assume B=B1, rename B1 to B

A<:top,B<:A |- (λC<:A.λx:(A-->B).λy:C.x y)
 ---------------------------------------------------(T-TABS) =>assume C=C1, rename C1 to C
 A<:top,B<:A,C<:A |- (λx:(A-->B).λy:C.x y)
                   : (A-->B)-->C-->C
 -----------------------------------------------------T_ABS =>have A-->B=A-->B,   get A=A,B=B,remove them

 A<:top,B<:A,C<:A,x:A-->B |- (λy:C.x y) : C-->C 
 ------------------------------------------------T-ABS ==>have  C=C
 A<:top,B<:A,C<:A,x:A-->B,y:C | ( x y ) : C         
 ----------------------------------------------- T-APP  introduce type variable T1
 A<:top,B<:A,C<:A,x:A-->B,y:C | x : T1-->C  A:top,B<:A,C<:A,x:A-->B,y:C | y:T1     
 ----------------------------------------  ----------------------------------
   have X:T1-->C                                   have y:T1
        X:A-->B                                         y:C
    so, T1=/=A, B=/=C                                   T1=/=C

so fails I assume.

I thought the type checking

  Γ |-  sone :  SPos 

should be successful, but it ..

 1) type is different a bit here

   Γ |-  (λA<:top.λB<:A.λC<:A.λx:(A-->B).λy:C.x y)  
       : (∀A1<:top.∀B1<:A1.∀C1<:A1.(A1-->B1)-->C1-->B1)

        intermediate steps are all same 
  A<:top,B<:A,C<:A,x:A-->B |- (λy:C.x y) : C-->B 
 ------------------------------------------------T-ABS ==>have  C=C
 A<:top,B<:A,C<:A,x:A-->B,y:C | ( x y ) : B         
 ----------------------------------------------- T-APP  introduce type variable T1
 A<:top,B<:A,C<:A,x:A-->B,y:C | x : T1-->B  A:top,B<:A,C<:A,x:A-->B,y:C | y:T1     
 ----------------------------------------  ----------------------------------
   have X:T1-->B                                   have y:T1
        X:A-->B                                         y:C
    so, T1=/=A, B=B                                   T1=/=C

See, these two type checks end up pretty much the same. I do not understand why the first one should fail while the second one should succeed.

How does one do type checking in System F sub? If you know, please correct me. Thank you.

by alim at September 30, 2016 10:41 AM

Fred Wilson

Funding Friday: A Computer Anyone Can Make

If you haven’t heard of Kano, you are missing out. This is so cool. I funded it the day the project launched. They are a little bit more than half way to their goal of $500k.

Join the fun and back this today.

by Fred Wilson at September 30, 2016 10:34 AM


How to get this upper bound on worst-case heaps?

I've seen answers on this subject, but I still don't understand them. In the Cormen book (Introduction to Algorithms) it is explained that the worst case for a $\text{Max-Heapify}$ call happens when the last level is "half full"; in this case the worst-case cost is modeled by the inequality:

$$ T\left(n\right) \leq T\left( \frac{2n}{3} \right) + \Theta(1) $$

I understand the inequality (i.e. what it is modeling), but I don't understand how the worst case is built. I've tried to imagine a heap where the left subtree of the root is full up to level $h$ while the right one has (for example) just one node at level $h$, and with this example I can't see why the worst-case modeling doesn't work. Why would such a case not represent an instance of the worst case?
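A small numerical check may help with the $\frac{2n}{3}$ bound. The sketch below (the function name is made up) computes the size of the root's left subtree in a heap-shaped tree with $n$ nodes, assuming levels are filled left to right.

```python
def left_subtree_size(n):
    """Number of nodes in the root's left subtree of an n-node heap."""
    if n <= 1:
        return 0
    h = n.bit_length() - 1      # height of the heap
    last = n - (2 ** h - 1)     # nodes on the bottom level
    # Full left subtree above the bottom level, plus its share of the bottom
    # level (the left half of the bottom level fills first).
    return (2 ** (h - 1) - 1) + min(last, 2 ** (h - 1))
```

The ratio `left_subtree_size(n) / n` is largest exactly when the bottom level is half full, i.e. $n = 3\cdot 2^{h-1} - 1$, where the left subtree has $2^h - 1$ nodes; for $n = 11$ it is $7/11$ and for $n = 23$ it is $15/23$, approaching $2/3$ from below. Adding nodes to the right half of the bottom level increases $n$ without enlarging the left subtree, so the fraction only drops, which is why the example with a full left subtree and one extra node on the right is not worse than the half-full case.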

The second question is that I don't understand why the number of nodes at height $h$ are at most $\lceil \frac{n}{2^{h+1}} \rceil$. There are proofs but I can't understand the motivation.
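The bound on the number of nodes at a given height can also be checked by brute force. This sketch assumes the usual array layout (node $i$ has children $2i$ and $2i+1$, indices starting at 1) and a made-up function name; it counts nodes by height, where height is the number of edges on the longest path down to a leaf.

```python
from collections import Counter

def height_counts(n):
    """Map each height to the number of nodes of that height in an n-node heap."""
    counts = Counter()
    for i in range(1, n + 1):
        h, j = 0, i
        # In a heap shape the leftmost downward path is the longest one.
        while 2 * j <= n:
            j *= 2
            h += 1
        counts[h] += 1
    return counts
```

For $n = 6$, for example, this gives three nodes of height 0, two of height 1 and one of height 2, matching $\lceil 6/2 \rceil$, $\lceil 6/4 \rceil$ and $\lceil 6/8 \rceil$.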


Trying to explain in more detail what I don't understand. As far as I know/remember for given $n$ that somehow measures how large is the input of an algorithm the worst case analysis is "choose among all the possible instances of size $n$ the one that would give you the worst case execution cost"

Following this direction, it is not particularly clear to me how, for a given $n$, we can be sure that the left subtree has $\frac{2n}{3}$ nodes. It is fine when $n$ is such that half of the last level is full while the other half is empty. In the proof of the cost bound I would proceed differently.

I would distinguish two cases :

  1. The tree has height $h$ and is complete, i.e. the total number of nodes is $2^{h+1} - 1$. In this case it is possible to build a heap in which the root key is pushed down until it reaches a leaf. How much is the cost in this case? It is exactly $h = \log_2(n+1) - 1 = O(\log n)$.

  2. The tree has height $h$ but is not complete; given the heap structure, the left-most leaf is at level $h$. As in point 1, it is possible to build a tree in which the root is always pushed down along the left branch. How much is the cost in this case? Given that $n > 2^{h} - 1$, we have $h < \log_2(n+1) = O(\log n)$.

I don't think it is difficult to build a particular example for both 1. and 2. I also don't really see the point of "completing a tree" for computing the worst-case cost. I can understand that this operation makes the analysis easier, but from what I know about worst-case analysis (assuming it is correct) it doesn't sound correct to me.

by user8469759 at September 30, 2016 10:32 AM

Prove if $L = \{0^m1^n | m \neq n\}$ is regular or not

I am trying to show whether this language is regular or not:

$$L = \{0^m1^n | m \neq n\}$$

Since I cannot create or think of an automaton that recognizes L, I suspect that L is not regular. From the book I am using, it seems I can use the pumping lemma. I have done this:

Let $p$ be the pumping lemma constant, and let $w=0^p1^{p+1}$

In this case $$|w|=2p + 1\geq p$$

We can divide $w$ into $xyz$, where $x=0^{p-1}$, $y=0$, $z=1^{p+1}$, then $|y|\geq 1$

If L were regular, then $\forall k \geq 0, xy^kz \in L$. Let us choose $k=2$, then:

$$xy^2z = 0^{p-1}\,0^2\,1^{p+1} = 0^{p+1}1^{p+1}$$


Here is where I am stuck, how do I prove that L is not regular?

Another question: does the pumping lemma apply only to infinite languages?
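A hedged note on the attempt above: the pumping lemma does not let the prover pick $x$, $y$, $z$; the argument has to work for every split with $|xy| \le p$ and $|y| \ge 1$. One standard choice of word that handles all splits at once (an assumption here, not something taken from the book the question cites) is $w = 0^p1^{p+p!}$:

```latex
\begin{align*}
w &= 0^p 1^{p+p!}, \qquad y = 0^j \ \text{with } 1 \le j \le p
   \quad (\text{forced by } |xy| \le p,\ |y| \ge 1),\\
k &= 1 + \frac{p!}{j} \in \mathbb{N} \quad (\text{since } j \mid p!),\\
xy^kz &= 0^{\,p + (k-1)j}\, 1^{p+p!} = 0^{\,p+p!}\, 1^{p+p!} \notin L .
\end{align*}
```

On the second question: every finite language is regular, and it satisfies the lemma vacuously, because $p$ can be taken larger than the longest word, so there is no word of length at least $p$ to pump.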

by dpalma at September 30, 2016 10:18 AM


Distribution of Black-Scholes option price

Using sample variance estimator $\text{s}$ of volatility in Black-Scholes formula we obtain $$D\sim BSCall \left( \frac{(n-1) \text{s}^2}{\chi_{n-1} ^2} \right)$$ where $D$ is the option price and $BSCall(\sigma)$ is the well-known expression of option price considered as a function of volatility.

What is confusing — the $BSCall(\cdot)$ formula was obtained as $\mathbb{E} (S_T - K)^+$ under known 'true volatility' $\sigma$ that is determined, not random. But when using sample variance estimator it's considered like $\mathbb{E} \left[(S_T - K)^+ | \sigma \right]$ with random $\sigma$ distributed as $\dfrac{(n-1)\text{s}^2}{\chi_{n-1} ^2}$.

Is it then correct to calculate the theoretical option price as $\mathbb{E}D$?

NOTE. My question isn't about the tower property of expectation. What I mean is: we know $\dfrac{(n-1)\text{s}^2}{\sigma^2} \sim \chi_{n-1} ^2$, where $\sigma$ isn't random and $\text{s}$ is random. Is it correct to consider $\sigma^2$ as a random variable distributed as $\dfrac{(n-1)\text{s}^2}{\chi_{n-1} ^2}$ and compute the expectation with respect to this distribution? To my mind, additional uncertainty is brought in here by the exogenous random $\chi_{n-1} ^2$.

And what's, after all, the correct way to price an option using sample variance estimator of historical asset volatility?

by Denis Korzhenkov at September 30, 2016 10:14 AM


Parameterized vertex cover on $r$-regular graphs

I am trying to solve the following exercise from this book:

Show that CLIQUE PROBLEM, parameterized by the solution size $k$, is fixed-parameter tractable (FPT) on $r$-regular graphs for every fixed integer $r$.

Here, the CLIQUE PROBLEM is: given an instance $(G, k)$, decide whether $G$ has a clique of size $k$ or not.

First of all, for an instance $(G, k)$, if $k > r+1$, then the answer is NO, because each vertex is connected with exactly $r$ elements, the maximum size of a clique is $r + 1$ (vertex plus $r$ neighbours). So, we can assume that $k \le r+1$.

Let $N(v)$ be the set of neighbours of $v$.

I thought of that simple algorithm

.... for each vertex $v \in V(G)$

........ check if for any subset $X \subset N(v)$, such that $|X| = k - 1$, $X \cup \{v\}$ is a clique.

Since there are only $\binom{r}{k-1}$ such subsets $X$ for each vertex and we take time polynomial in $k$ to check whether $X \cup \{v\}$ is a clique, this algorithm is already FPT, with running time of the form $\left( k^{O(1)}\binom{r}{k-1} \right)n$.
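The algorithm described above can be sketched as follows. The adjacency representation (a dict from vertex to a set of neighbours) and the function name are assumptions made for illustration, and $k \ge 1$ is assumed.

```python
from itertools import combinations

def has_clique(adj, k):
    """True if the graph has a clique on k vertices (k >= 1 assumed).

    `adj` maps each vertex to the set of its neighbours.
    """
    for v, nbrs in adj.items():
        # Every clique containing v lies inside {v} plus the neighbours of v.
        for subset in combinations(nbrs, k - 1):
            if all(b in adj[a] for a, b in combinations(subset, 2)):
                return True
    return False
```

On an $r$-regular graph each vertex contributes $\binom{r}{k-1}$ candidate subsets, and each subset needs $O(k^2)$ adjacency checks, so the total work is linear in $n$ for fixed $r$ and $k$.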

If everything is right, then I have solved the exercise. However, the next thing I have to do in the exercise is to show that this problem is also FPT with respect to the parameter $k + r$ (so, $r$ is no longer seen as a constant), and the same algorithm works in this case. Since I was expecting to face a harder exercise in this case of $k + r$, I started to think my solution is not right.

So, what is wrong?

by Vitor at September 30, 2016 10:11 AM

String matching algorithm - check if a string matches a pattern

This looks like quite the challenge; given a pattern $P$ (of length $n$) and a string $S$ (of length $m$), how would you check whether the string matches the pattern? For instance:

  • If $P$ = "xyx" and $S$ = "foobarfoo" then $S$ matches $P.$
  • If $P$ = "acca" and $S$ = "carbuscarbus" then $S$ does not match $P.$

My thoughts so far: This looks like a dynamic programming problem. We can define a boolean

$M(i, j)$ = True iff pattern $P_{i}P_{i+1}...P_{n}$ matches $S_{j}S_{j+1}...S_{m}.$

We then need $M(0, 0).$

I couldn't think of an efficient recurrence beyond this. It seems like I would have to introduce an extra mapping (from a substring of $P$ to a substring of $S$, indicating that the earlier "matches" the latter; for example in the first example, ("x", "foo") would be part of the dictionary. You then replace "x" by "foo" in the substring of P). This would make it $O(2^n)$ or worse though.
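The "extra mapping" idea can be written as a plain backtracking search, sketched below. It is exponential in the worst case, as suspected above, and it makes two assumptions that the question does not state: every pattern letter stands for a non-empty substring, and two different letters are allowed to stand for the same substring.

```python
def matches(pattern, s, binding=None):
    """True if s can be split so that equal pattern letters get equal substrings."""
    binding = {} if binding is None else binding
    if not pattern:
        return not s
    head, rest = pattern[0], pattern[1:]
    if head in binding:
        word = binding[head]
        return s.startswith(word) and matches(rest, s[len(word):], binding)
    # Try every non-empty prefix, leaving at least one character
    # for each remaining pattern letter.
    for end in range(1, len(s) - len(rest) + 1):
        binding[head] = s[:end]
        if matches(rest, s[end:], binding):
            return True
        del binding[head]  # undo the tentative binding and backtrack
    return False
```

On the examples above, `matches("xyx", "foobarfoo")` succeeds with x bound to "foo" and y to "bar", while `matches("acca", "carbuscarbus")` finds no consistent binding.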

Thoughts on this? Would appreciate it!

by Mathguy at September 30, 2016 10:09 AM


How feature importance and forest structures are related in scikit-learn RandomForestClassifier?

Here is a simple example of my problem, using the Iris dataset. I am puzzled when trying to understand how feature importances are computed and how this is visible when visualizing the forest of estimators using export_graphviz. Here is my code:

import pandas as pd
import numpy as np
from sklearn.datasets import load_iris
import matplotlib.pyplot as plt

data = load_iris()
X = pd.DataFrame(data.data, columns=['sepallength', 'sepalwidth', 'petallength','petalwidth'])
y = pd.DataFrame(data.target)

from sklearn.cross_validation import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)

from sklearn.ensemble import RandomForestClassifier
rf = RandomForestClassifier(n_estimators=2,max_depth=1)
rf.fit(X_train,y_train.iloc[:,0])

The classifier performs poorly (the score is 0.68) since the forest contains 2 trees with depth of 1. Anyway this doesn't matter here.

The feature importances are retrieved as follows:

importances = rf.feature_importances_
std = np.std([tree.feature_importances_ for tree in rf.estimators_],axis=0)
indices = np.argsort(importances)[::-1]

print("Feature ranking:")
for f in range(X.shape[1]):
    print("%d. feature %s (%f)" % (f + 1, X.columns.tolist()[f], importances[indices[f]]))

and the output is :

Feature ranking:
1. feature sepallength (1.000000)
2. feature sepalwidth (0.000000)
3. feature petallength (0.000000)
4. feature petalwidth (0.000000)

Now when showing the structure of trees that are built using the following code:

from sklearn.tree import export_graphviz
!dot -Tpng -o tree0.png
from IPython.display import Image

I obtain these 2 figures

export of tree #0

export of tree #1

I cannot understand how sepallength can have importance=1 but not be used for node splitting in both trees (only petallength is used) as shown in the figure.
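One possible explanation, offered as a guess without having run the original: the printing loop sorts the importances but labels row `f` with column `f`, so the top value is always printed next to the first column name. The plain-Python sketch below reproduces that effect; the importance values are hypothetical and chosen only to match a forest that splits on petallength alone.

```python
names = ["sepallength", "sepalwidth", "petallength", "petalwidth"]
importances = [0.0, 0.0, 1.0, 0.0]  # hypothetical values for illustration

def mislabeled(names, importances):
    """Sorted importances paired with names in original column order."""
    order = sorted(range(len(importances)), key=lambda i: importances[i], reverse=True)
    return [(names[f], importances[order[f]]) for f in range(len(order))]

def ranking(names, importances):
    """Sorted importances paired with the names they belong to."""
    order = sorted(range(len(importances)), key=lambda i: importances[i], reverse=True)
    return [(names[i], importances[i]) for i in order]
```

Here `mislabeled` puts sepallength next to 1.0, as in the printed output, while `ranking` puts petallength there. In the original loop the analogous change would be to index the column list with `indices[f]` instead of `f`.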

Sorry if this may be obvious but could somebody please clarify this?

Many thanks!

by user6903745 at September 30, 2016 10:08 AM

I get a linear regression using the SVR by python scikit-learn when the data is not linear

train.sort_values(by=['mass'], ascending=True, inplace=True)
x = train['mass']
y = train['pa']

# Fit regression model
svr_rbf = SVR(kernel='rbf', C=1e3, gamma=0.1)
svr_lin = SVR(kernel='linear', C=1e3)
svr_poly = SVR(kernel='poly', C=1e3, degree=2)
x_train = x.reshape(x.shape[0], 1)
x = x_train
y_rbf = svr_rbf.fit(x, y).predict(x)
y_lin = svr_lin.fit(x, y).predict(x)
y_poly = svr_poly.fit(x, y).predict(x)

# look at the results
plt.scatter(x, y, c='k', label='data')
plt.plot(x, y_rbf, c='g', label='RBF model')
plt.plot(x, y_lin, c='r', label='Linear model')
plt.plot(x, y_poly, c='b', label='Polynomial model')
plt.title('Support Vector Regression')

The code is copied from an example, and the only thing I changed is the dataset. I do not know what the problem is.

by gary yong at September 30, 2016 10:05 AM

How to limit code changes when introducing state?

I am a senior C/C++/Java/Assembler programmer and I have always been fascinated by the pure functional programming paradigm. From time to time, I try to implement something useful with it, e.g., a small tool, but often I quickly reach a point where I realize that I (and my tool, too) would be much faster in a non-pure language. It's probably because I have much more experience with imperative programming languages, with thousands of idioms, patterns and typical solution approaches in my head.

Here is one of those situations. I have encountered it several times and I hope you guys can help me.

Let's assume I write a tool to simulate communication networks. One important task is the generation of network packets. The generation is quite complex, consisting of dozens of functions and configuration parameters, but at the end there is one master function and because I find it useful I always write down the signature:

generatePackets :: Configuration -> [Packet]

However, after a while I notice that it would be great if the packet generation would have some kind of random behavior deep down in one of the many sub-functions of the generation process. Since I need a random number generator for that (and I also need it at some other places in the code), this means to manually change dozens of signatures to something like

f :: Configuration -> RNGState [Packet]


type RNGState = State StdGen

I understand the "mathematical" necessity (no states) behind this. My question is on a higher (?) level: How would an experienced Haskell programmer have approached this situation? What kind of design pattern or work flow would have avoided the extra work later?

I have never worked with an experienced Haskell programmer. Maybe you will tell me that you never write signatures because you have to change them too often afterwards, or that you give all your functions a state monad, "just in case" :)

by RmS at September 30, 2016 10:01 AM



Replicating estimates of intraday volatility from high frequency data (Bollerslev et al)

I am trying to replicate figure A.2 from [0]. I am reasonably sure I use the same dataset as the authors since I could replicate figure A.1.

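This figure depicts the yearly averaged values of the realized volatility $\text{RV}_t$:

$$\text{RV}_t = \sum_{i=1}^{1/\Delta} \left( \log P_{t-1+i\Delta} - \log P_{t-1+(i-1)\Delta} \right)^2$$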


as a function of the frequency ($\Delta$) for 4 assets and 3 years.

Since the formula for $\text{RV}_t$ itself is reasonably straightforward to compute, I suspect my failure to reproduce figure A.2 stems from a misunderstanding of what values of $P_{t-1+i\Delta}$ are used in the computation.
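For reference, here is a minimal sketch of the computation, assuming the usual definition of realized volatility as the sum of squared log returns over one day. The function name, the `step` parameter (how many raw observations make up one sampling interval $\Delta$) and the input format are assumptions, not the authors' code.

```python
import math

def realized_volatility(prices, step=1):
    """Sum of squared log returns of `prices`, sampled every `step` observations."""
    sampled = prices[::step]
    return sum((math.log(b) - math.log(a)) ** 2
               for a, b in zip(sampled, sampled[1:]))
```

Which prices go into `prices` (active hours only, or including the overnight return) is exactly the choice the questions below are about; the sketch takes no position on it.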

More specifically, the authors divide the day into its 'active' and 'overnight' components. Quoting from page 9:

To begin, we use the Financial Calendars (FinCal) to provide market open and close times for some of the market sessions dating back to the year 2000. For assets with data prior to 2000, or assets outside the FinCal database, we rely on so-called "liquidity plots," in which for each minute of the day, we plot the proportion with at least one trade. As the resulting liquidity plots for the four representative assets and three years given in Figure A.1 in the Appendix show, this effectively delineate the periods of the day when the markets are actively operating. Having defined the "active" market hours on a rolling annual basis, we subsequently split the days into intraday and overnight sessions.

My questions are:

  • Are overnight squared returns (typically) included in the signature plots shown in figure A.2? The authors are not clear on this point.
  • How are the overnight and intra-day periods defined for the first three of the futures displayed in plot A.2 (ES, CL and TY)? According to FinCal, the official trading sessions for all three span about 23 hours a day. But these 23 hours contain mostly periods of 'ghost' markets (as the authors call them), as is visible from Figure A.1. How do they define the active part of the day?

P.S. I have very little experience in finance, so this may be a case of not understanding conventions.

  • [0] Risk everywhere. Bollerslev , Hood, Huss and Pedersen. 28/01/2016. Ungated copy

by user189035 at September 30, 2016 09:58 AM


Algorithm to check if a string matches a pattern

Given a pattern $P$ (of length $n$) and a string $S$ (of length $m$), how would you check whether the string matches the pattern? For instance:

  • If $P$ = "xyx" and $S$ = "foobarfoo" then $S$ matches $P.$
  • If $P$ = "abab" and $S$ = "carbuscarbus" then $S$ matches $P.$
  • If $P$ = "acca" and $S$ = "carbuscarbus" then $S$ does not match $P.$

My thoughts so far: This looks like a dynamic programming problem. We can define

$m(i, j)$ = True iff pattern $P_{i}P_{i+1}...P_{n}$ matches $S_{j}S_{j+1}...S_{m}.$

But I couldn't think of an efficient recurrence relation beyond this. It sounds like I would have to introduce an extra mapping (from a substring of $P$ to a substring of $S$, indicating that the former "matches" the latter; for example, in the first example, ("x", "foo") would be part of the dictionary). This would make it $O(2^n)$ though.

Any ideas on this? Thank you!
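For reference, the "extra mapping" idea can be sketched as a plain backtracking search. This is only a sketch under assumptions that the question does not state: every pattern letter must map to a non-empty substring, and two different letters are allowed to map to the same substring. The function name `matches` is made up for illustration. The worst case is still exponential, because the mapping is part of the search state and a table over $(i, j)$ alone cannot capture it.

```python
def matches(pattern, s):
    """Return True if s splits into pieces so that each pattern letter
    always stands for the same non-empty substring."""

    def backtrack(i, j, mapping):
        # i: position in pattern, j: position in s
        if i == len(pattern):
            return j == len(s)
        letter = pattern[i]
        if letter in mapping:
            # Letter already bound: the bound substring must occur at position j.
            word = mapping[letter]
            if s.startswith(word, j):
                return backtrack(i + 1, j + len(word), mapping)
            return False
        # Letter not bound yet: try every non-empty candidate, leaving at
        # least one character for each remaining pattern letter.
        remaining = len(pattern) - i - 1
        for end in range(j + 1, len(s) - remaining + 1):
            mapping[letter] = s[j:end]
            if backtrack(i + 1, end, mapping):
                return True
            del mapping[letter]
        return False

    return backtrack(0, 0, {})
```

On the three examples above this gives a match for "xyx" and "abab" and no match for "acca".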

by user10532 at September 30, 2016 09:53 AM


ValueError: Filter must not be larger than the input

I am pretty new to machine learning, so I am playing around with examples. The image size specified in the code is (28, 28), but for some reason I keep getting the same ValueError and I can't figure out why this is happening.

Here's the code:

import pandas as pd
import numpy as np
np.random.seed(1337) # for reproducibility

from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation, Flatten
from keras.layers.convolutional import Convolution2D, MaxPooling2D
from keras.utils import np_utils

# input image dimensions
img_rows, img_cols = 28, 28

batch_size = 128 # Number of images used in each optimization step
nb_classes = 10 # One class per digit
nb_epoch = 35 # Number of times the whole data is used to learn

# Read the train and test datasets
train = pd.read_csv("../input/train.csv").values
test  = pd.read_csv("../input/test.csv").values

# Reshape the data to be used by a Theano CNN. Shape is
# (nb_of_samples, nb_of_color_channels, img_width, img_height)
X_train = train[:, 1:].reshape(train.shape[0], 1, img_rows, img_cols)
X_test = test.reshape(test.shape[0], 1, img_rows, img_cols)
y_train = train[:, 0] # First data is label (already removed from X_train)

# Make the value floats in [0;1] instead of int in [0;255]
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255

# convert class vectors to binary class matrices (ie one-hot vectors)
Y_train = np_utils.to_categorical(y_train, nb_classes)

#Display the shapes to check if everything's ok
print('X_train shape:', X_train.shape)
print('Y_train shape:', Y_train.shape)
print('X_test shape:', X_test.shape)

model = Sequential()
# For an explanation on conv layers see
# By default the stride/subsample is 1
# border_mode "valid" means no zero-padding.
# If you want zero-padding add a ZeroPadding layer or, if stride is 1 use border_mode="same"
model.add(Convolution2D(12, 5, 5, border_mode='valid',input_shape=(1,img_rows, img_cols)))

# For an explanation on pooling layers see
model.add(MaxPooling2D(pool_size=(2, 2)))


model.add(Convolution2D(24, 5, 5))

model.add(MaxPooling2D(pool_size=(2, 2)))


# Flatten the 3D output to 1D tensor for a fully connected layer to accept the input
model.add(Dense(nb_classes)) #Last layer with one output per class
model.add(Activation('softmax')) #We want a score similar to a probability for each class

# The function to optimize is the cross entropy between the true label and the output (softmax) of the model
# We will use adadelta to do the gradient descent see
model.compile(loss='categorical_crossentropy', optimizer='adadelta', metrics=["accuracy"])

# Make the model learn
model.fit(X_train, Y_train, batch_size=batch_size, nb_epoch=nb_epoch, verbose=1)

# Predict the label for X_test
yPred = model.predict_classes(X_test)

# Save prediction in file for Kaggle submission
np.savetxt('mnist-pred.csv', np.c_[range(1,len(yPred)+1),yPred], delimiter=',', header = 'ImageId,Label', comments = '', fmt='%d')
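
A quick way to sanity-check the layer sizes is to trace the spatial shape by hand. The sketch below is plain Python, not Keras; `conv_out` and `trace` are hypothetical helper names. It shows that a 28x28 input survives both 5x5 convolutions, so the layer stack itself is fine. One plausible cause of the error, which I have not verified against this setup, is that the backend reads `input_shape=(1, 28, 28)` as (rows, cols, channels), which is what the `image_dim_ordering` setting `"tf"` does in Keras 1.x. In that case the image is seen as having a single row, and a 5x5 filter no longer fits.

```python
def conv_out(size, kernel, stride=1):
    """Output length along one axis of a 'valid' (unpadded) convolution or pooling."""
    if kernel > size:
        raise ValueError("Filter must not be larger than the input")
    return (size - kernel) // stride + 1


def trace(rows, cols):
    """Follow the spatial shape through conv5 -> pool2 -> conv5 -> pool2."""
    shapes = [(rows, cols)]
    for kernel, stride in [(5, 1), (2, 2), (5, 1), (2, 2)]:
        rows = conv_out(rows, kernel, stride)
        cols = conv_out(cols, kernel, stride)
        shapes.append((rows, cols))
    return shapes
```

`trace(28, 28)` runs through without complaint, while `trace(1, 28)` raises the same kind of error as in the title.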

by user3430238 at September 30, 2016 09:48 AM

How to update element inside List with ImmutableJS?

Here is what the official docs say:

updateIn(keyPath: Array<any>, updater: (value: any) => any): List<T>
updateIn(keyPath: Array<any>, notSetValue: any, updater: (value: any) => any): List<T>
updateIn(keyPath: Iterable<any, any>, updater: (value: any) => any): List<T>
updateIn(keyPath: Iterable<any, any>, notSetValue: any, updater: (value: any) => any): List<T>

There is no way a normal web developer (not a functional programmer) would understand that!

I have a pretty simple (for a non-functional approach) case.

var arr = [];
arr.push({id: 1, name: "first", count: 2});
arr.push({id: 2, name: "second", count: 1});
arr.push({id: 3, name: "third", count: 2});
arr.push({id: 4, name: "fourth", count: 1});
var list = Immutable.List.of(arr);

How can I update the list so that the element with name third has its count set to 4?

by Vitalii Korsakov at September 30, 2016 09:46 AM

Get min and max possible value for a forecast using scikit-learn machine learning

I am trying to use linear regression to predict a future value in a time series. When forecasting for a given date I get a single number as the expected value. Is it possible to get a range of values instead, so that I can say the maximum possible value is, say, x and the minimum possible value is, say, y?


regr = linear_model.LinearRegression()
regr.fit(X_train, Y_train)

pred = regr.predict([[a, b]])

The value of pred comes out to be, say, 10, but I would rather have something like max = 12 and min = 8.

Put simply: how do I calculate a confidence interval for linear regression?


I tried looking into GMM, but I am not sure whether that would work for this.

I tried Gaussian processes, but they again give a single value, something like 11.137631, which doesn't help, as I am looking for a range of values rather than a single value.
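
To make the kind of output I am after concrete, here is a minimal sketch of a prediction interval for ordinary least squares. It makes several assumptions that differ from the code above: a single feature instead of two, normally distributed errors, and a normal quantile in place of the Student t quantile (reasonable only for larger samples). It uses no scikit-learn at all, and `prediction_interval` is a made-up name.

```python
import math
from statistics import NormalDist, mean


def prediction_interval(xs, ys, x0, confidence=0.95):
    """Return (low, point forecast, high) for a new observation at x0."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    intercept = y_bar - slope * x_bar

    # Residual standard error with n - 2 degrees of freedom.
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    s = math.sqrt(sum(r * r for r in residuals) / (n - 2))

    # Normal approximation to the t quantile (an assumption, see above).
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * s * math.sqrt(1 + 1 / n + (x0 - x_bar) ** 2 / sxx)

    pred = intercept + slope * x0
    return pred - half_width, pred, pred + half_width
```

The interval widens as `x0` moves away from the mean of the training inputs, which is the behaviour one would expect when extrapolating a time series.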

by Jibin Mathew at September 30, 2016 09:44 AM

sklearn DictVectorizer(sparse=False) with a different default value, Impute a constant

I'm building a pipeline that starts with a DictVectorizer, which by default produces a sparse matrix. Specifying sparse=False changes the output from a scipy sparse matrix to a dense numpy array, which is good, but the next stages in the pipeline complain about NaN values, which are an obvious outcome of using the DictVectorizer in my case. I'd like the pipeline to treat missing dictionary values as zero rather than as not available.

Imputer doesn't help me as far as I can see, because I want to "impute" with a constant value and not a statistical value dependent on other values of the column.

Following is the code I've been using:

vectorize = skl.feature_extraction.DictVectorizer(sparse=False)
variance = skl.feature_selection.VarianceThreshold()
knn = skl.neighbors.KNeighborsClassifier(4, weights='distance', p=1)

pipe = skl.pipeline.Pipeline([('vectorize', vectorize),
                            # here be dragons ('fillna', ),
                            ('variance', variance),
                            ('knn', knn)])
pipe.fit(dict_data, labels)

And some mocked dictionaries:

dict_data = [{'city': 'Dubai', 'temperature': 33., 'assume_zero_when_missing': 7},
             {'city': 'London', 'temperature': 12.},
             {'city': 'San Fransisco', 'temperature': 18.}]

Notice that in this example, assume_zero_when_missing is missing from most dictionaries, which will lead later estimators to complain about NaN values:

ValueError: Input contains NaN, infinity or a value too large for dtype('float64').

While the result I'm hoping for is that NaN values will be replaced with 0.

by NirIzr at September 30, 2016 09:43 AM

TfidfVectorizer in scikit-learn : ValueError: np.nan is an invalid document

I'm using TfidfVectorizer from scikit-learn to do some feature extraction from text data. I have a CSV file with a Score (can be +1 or -1) and a Review (text). I pulled this data into a DataFrame so I can run the Vectorizer.

This is my code:

import pandas as pd
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

df = pd.read_csv("train_new.csv",
             names = ['Score', 'Review'], sep=',')

# x = df['Review'] == np.nan
# print x.to_csv(path='FindNaN.csv', sep=',', na_rep = 'string', index=True)
# print df.isnull().values.any()

v = TfidfVectorizer(decode_error='replace', encoding='utf-8')
x = v.fit_transform(df['Review'])

This is the traceback for the error I get:

Traceback (most recent call last):
  File "/home/PycharmProjects/Review/src/", line 16, in <module>
x = v.fit_transform(df['Review'])
 File "/home/b/hw1/local/lib/python2.7/site-packages/sklearn/feature_extraction/", line 1305, in fit_transform
   X = super(TfidfVectorizer, self).fit_transform(raw_documents)
 File "/home/b/work/local/lib/python2.7/site-packages/sklearn/feature_extraction/", line 817, in fit_transform
 File "/home/b/work/local/lib/python2.7/site-packages/sklearn/feature_extraction/", line 752, in _count_vocab
   for feature in analyze(doc):
 File "/home/b/work/local/lib/python2.7/site-packages/sklearn/feature_extraction/", line 238, in <lambda>
tokenize(preprocess(self.decode(doc))), stop_words)
 File "/home/b/work/local/lib/python2.7/site-packages/sklearn/feature_extraction/", line 118, in decode
 raise ValueError("np.nan is an invalid document, expected byte or "
 ValueError: np.nan is an invalid document, expected byte or unicode string.

I checked the CSV file and DataFrame for anything that's being read as NaN but I can't find anything. There are 18000 rows, none of which return isnan as True.

This is what df['Review'].head() looks like:

  0    This book is such a life saver.  It has been s...
  1    I bought this a few times for my older son and...
  2    This is great for basics, but I wish the space...
  3    This book is perfect!  I'm a first time new mo...
  4    During your postpartum stay at the hospital th...
  Name: Review, dtype: object

by boltthrower at September 30, 2016 09:43 AM

Using LabelEncoder for a series in scikitlearn

I have a column in a dataset which has categorical values, and I want to convert them to numerical values. I am trying to use LabelEncoder but get errors doing so.

from sklearn.preprocessing import LabelEncoder
m = hsp_train["Alley"]
m_enc = LabelEncoder()
j = m_enc.fit_transform(m)

I am getting an error:

unorderable types: float() > str()

The series in the column has 3 values. I want them to be 0, 1, 2 respectively, but I am getting that error.

I also tried this:

l = hsp_train["Alley"]
l_enc = pd.factorize(l)
hsp_train["Alley"] = l_enc[0]

But this gives me the values -1, 1, 2, which I don't want; I want it to start from 1.

by Sahil at September 30, 2016 09:41 AM

Why do I get random results from seemingly non-random code with Python sklearn?

I updated the question based on the responses.

I have a list of strings named "str_tuple". I want to compute some similarity measures between the first element in the list and the rest of the elements. I run the following six-line code snippet.

What completely baffles me is that the outcome seems to be completely random every time I run the code. However, I cannot see any randomness introduced in my six-liner.


It was pointed out that TruncatedSVD() has a "random_state" argument. Specifying "random_state" gives a fixed result (which is true). However, if you change "random_state", the result changes. With other strings (e.g. str2), the result is the same no matter how you change "random_state". These strings are from the HOME_DEPOT Kaggle competition. I have a pd.Series containing thousands of such strings, and most of them give non-random results, behaving like str2 (no matter what "random_state" is set to). For some unknown reason, str1 is one of the examples that give a different result every time you change "random_state". I am starting to think that some intrinsic characteristic of str1 makes the difference.

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.preprocessing import Normalizer
import numpy as np

# str1 yields random results
str1 = [u'l bracket', u'simpson strong tie 12 gaug angl', u'angl make joint stronger provid consist straight corner simpson strong tie offer wide varieti angl various size thick handl light duti job project structur connect need bent skew match project outdoor project moistur present use zmax zinc coat connector provid extra resist corros look "z" end model number .versatil connector various 90 connect home repair projectsstrong angl nail screw fasten alonehelp ensur joint consist straight strongdimensions: 3 in. xbi 3 in. xbi 1 0.5 in. made 12 gaug steelgalvan extra corros resistanceinstal 10 d common nail 9 xbi 1 0.5 in. strong drive sd screw', u'simpson strong-tie', u'', u'versatile connector for various 90\xe2\xb0 connections and home repair projects stronger than angled nailing or screw fastening alone help ensure joints are consistently straight and strong dimensions: 3 in. x 3 in. x 1-1/2 in. made from 12-gauge steel galvanized for extra corrosion resistance install with 10d common nails or #9 x 1-1/2 in. strong-drive sd screws']
# str2 yields non-random result     
str2 = [u'angl bracket', u'simpson strong tie 12 gaug angl', u'angl make joint stronger provid consist straight corner simpson strong tie offer wide varieti angl various size thick handl light duti job project structur connect need bent skew match project outdoor project moistur present use zmax zinc coat connector provid extra resist corros look "z" end model number .versatil connector various 90 connect home repair projectsstrong angl nail screw fasten alonehelp ensur joint consist straight strongdimensions: 3 in. xbi 3 in. xbi 1 0.5 in. made 12 gaug steelgalvan extra corros resistanceinstal 10 d common nail 9 xbi 1 0.5 in. strong drive sd screw', u'simpson strong-tie', u'', u'versatile connector for various 90\xe2\xb0 connections and home repair projects stronger than angled nailing or screw fastening alone help ensure joints are consistently straight and strong dimensions: 3 in. x 3 in. x 1-1/2 in. made from 12-gauge steel galvanized for extra corrosion resistance install with 10d common nails or #9 x 1-1/2 in. strong-drive sd screws']   

vectorizer = CountVectorizer(token_pattern=r"\d+\.\d+|\d+\/\d+|\b\w+\b")
# replacing str1 with str2 gives non-random result regardless of random_state
cmat = vectorizer.fit_transform(str1).astype(float)    # sparse matrix
cmat = TruncatedSVD(2).fit_transform(cmat)    # dense numpy array
cmat = Normalizer().fit_transform(cmat)    # dense numpy array
sim = np.dot(cmat, cmat.T)

by wen at September 30, 2016 09:40 AM

LabelEncoder specify classes in DataFrame

I’m applying a LabelEncoder to a pandas DataFrame, df

Feat1  Feat2  Feat3  Feat4  Feat5
  A      A      A      A      E
  B      B      C      C      E
  C      D      C      C      E
  D      A      C      D      E

I apply the label encoder like this:

from sklearn import preprocessing
le = preprocessing.LabelEncoder()
intIndexed = df.apply(le.fit_transform)

This is how the labels are mapped

A = 0
B = 1
C = 2
D = 3
E = 0

I'm guessing that E isn't given the value of 4 because it doesn't appear in any column other than Feat5.

I want E to be given the value of 4 - but don't know how to do this in a DataFrame.
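
To spell out the behaviour I am after: one mapping built from the values of all columns together, then applied to each column. The sketch below does this in plain Python on a dict of lists rather than on a DataFrame; `fit_shared_mapping` and `encode` are made-up names, not part of scikit-learn.

```python
def fit_shared_mapping(columns):
    """Build one label -> code mapping from the values of every column.

    columns: dict mapping a column name to its list of values.
    """
    classes = sorted({value for values in columns.values() for value in values})
    return {label: code for code, label in enumerate(classes)}


def encode(columns, mapping):
    """Apply the shared mapping to every column."""
    return {name: [mapping[value] for value in values]
            for name, values in columns.items()}
```

With the table above, the shared mapping is A=0, B=1, C=2, D=3, E=4, so Feat5 becomes a column of 4s. I would expect fitting a single LabelEncoder on all values at once (for instance on `df.values.ravel()`) and then calling only `transform` per column to behave the same way, but I have not confirmed that.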

by gbhrea at September 30, 2016 09:39 AM

Obtaining the most informative features after dimensionality reduction

I basically have a python script that tries a variety of dimensionality reduction techniques combined with a variety of classifiers. I attempted to collect the most informative features for each classifier:

if 'forest' in type(classifier).__name__.lower():
            importances = classifier.feature_importances_
            coefs_with_fns = numpy.argsort(importances)[::-1]
            coefs_with_fns = sorted(zip(classifier.coef_, reduced_training.columns))

While this works in principle, the output is just a series of integers, which (I assume) correspond to the column numbers in the feature array passed to the classifier. Which brings me to the problem: this array is the direct result of a dimensionality reduction method, which throws away all the previously attached column labels.

So my question is: is there a way to trace back the result of the dimensionality reduction to the actual columns/labels in the original dataset?

by Max Uppenkamp at September 30, 2016 09:33 AM

Identifying long irregular patterns in image

I have to identify the patterns that you can see in the example image. I highlighted one of the patterns to detect in yellow (of course it should go from one side to the other of the image). Each pattern is formed by three lines.

As you can see there are many more with lots of irregularities (that I would like to follow). I highlighted with red circles some examples of problematic areas.

There is a lot of noise and possible false positives.

I am trying to remove the noise through some threshold processing, but it seems a little difficult. Edge detection does not work in this kind of application.

What do you think are the best techniques to do so?

Sample image

Thanks in advance for any answers.

EDIT: Unedited image: Unedited

Using erosion with a 1x10 kernel after rotating the image a bit seems a good way to reach what I want to obtain: Eroded

by FMarazzi at September 30, 2016 09:12 AM

How to use cross_val_score with random_state

I'm getting different values for different runs ... what am I doing wrong here:

cv = StratifiedKFold(y, random_state=1)
s = cross_val_score(clf, X,y,scoring='roc_auc', cv=cv)
# [ 0.42321429  0.44360902  0.34398496]
s = cross_val_score(clf, X,y,scoring='roc_auc', cv=cv)
# [ 0.42678571  0.46804511  0.36090226]

by maxymoo at September 30, 2016 09:07 AM

Expectation Maximization algorithm(Gaussian Mixture Model) : ValueError: the input matrix must be positive semidefinite

I am trying to implement the Expectation Maximization algorithm (Gaussian Mixture Model) on a data set data=[[x,y],...]. I am using the mv_norm.pdf(data, mean, cov) function to calculate cluster responsibilities. But after recalculating the covariance (cov matrix) for 6-7 iterations, the cov matrix becomes singular, i.e. the determinant of cov is 0 (a very small value), and hence it gives the errors

ValueError: the input matrix must be positive semidefinite


raise np.linalg.LinAlgError('singular matrix')

Can someone suggest any solution for this?

#E-step: Compute cluster responsibilities, given cluster parameters
def calculate_cluster_responsibility(data,centroids,cov_m):
    pdfmain=[[] for i in range(0,len(data))]
    for i in range(0,len(data)):
        pdfeach=[[] for m in range(0,len(centroids))]
        pdfeach[0]=1/3.*mv_norm.pdf(data[i], mean=centroids[0],cov=[[cov_m[0][0][0],cov_m[0][0][1]],[cov_m[0][1][0],cov_m[0][1][1]]])
        pdfeach[1]=1/3.*mv_norm.pdf(data[i], mean=centroids[1],cov=[[cov_m[1][0][0],cov_m[1][0][1]],[cov_m[1][1][0],cov_m[1][1][1]]])
        pdfeach[2]=1/3.*mv_norm.pdf(data[i], mean=centroids[2],cov=[[cov_m[2][0][0],cov_m[2][0][1]],[cov_m[2][1][0],cov_m[2][1][1]]])
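        sum1 = sum(pdfeach)  # normalizing constant for this data point
        pdfeach[:] = [x / sum1 for x in pdfeach]
        pdfmain[i] = pdfeach

    global old_pdfmain
    if old_pdfmain==pdfmain:
        return  # responsibilities unchanged, so stop iterating
    softcounts=[sum(i) for i in zip(*pdfmain)]
    calculate_cluster_weights(data,centroids,pdfmain,softcounts)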

Initially, I've passed [[3,0],[0,3]] for each cluster covariance since expected number of clusters is 3.
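
One workaround I am aware of, shown here only as a sketch, is to add a small constant to the diagonal of each covariance matrix after the M-step so that it cannot collapse to a singular matrix. The helper names and the value of `eps` are my own choices, not something taken from the code above; scikit-learn's `GaussianMixture` exposes a similar idea through its `reg_covar` parameter.

```python
def regularize_covariance(cov, eps=1e-6):
    """Return a copy of a square covariance matrix with eps added to the diagonal."""
    return [[value + (eps if r == c else 0.0) for c, value in enumerate(row)]
            for r, row in enumerate(cov)]


def det2(cov):
    """Determinant of a 2x2 matrix, to check for (near) singularity."""
    return cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]
```

For example, `[[1.0, 1.0], [1.0, 1.0]]` has determinant 0, while its regularized copy has a strictly positive determinant and can be passed to the density function again.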

by Madhura Raut at September 30, 2016 08:47 AM


MSGARCH package on R

I am using the MSGARCH package in R to fit a Markov-switching GARCH model. I fit the model using fit.mle (so standard maximum likelihood) with three regimes. The parameters are estimated and given by the vector:

$\theta = (\alpha_{11}, \alpha_{12}, \alpha_{13}, \alpha_{21}, \alpha_{22}, \alpha_{23}, \beta_{1}, \beta_{2}, \beta_{3}, P_{1}, P_{2}, P_{3}, P_{4}, P_5, P_6)$.

where the j in $\alpha_{ij}$, $\beta_{j}$ denotes the state. The output $P_i$ has only six elements and contains negative values. It's not the expected nine elements of the transition probability matrix. Does anyone know what this is? Or how to look up the correct matrix $P$?

Here is the code that outputs the vector above (importsnp is a series of log-returns):



snp <- as.matrix(importsnp)*100

spec <- MSGARCH::create.spec(model = c("sGARCH","sGARCH","sGARCH"), 
                             distribution = c("norm","norm","norm"), 
                             do.skew = c(FALSE,FALSE,FALSE), 
                             do.mix = FALSE, 
                             do.shape.ind = FALSE)

fit <- MSGARCH::fit.mle(spec = spec, y = snp)

theta <- fit$theta

by Melly Donald at September 30, 2016 08:28 AM

Combine EWMA or ARCH model with estimator other than squared returns

Currently I use the EWMA model with squared logarithmic returns as a proxy estimator for the volatility, in order to forecast the volatility one step ahead in an intraday scenario (the time frame is a couple of minutes).

However, I have read and observed that squared returns have their limitations as a volatility estimator. So now I want to use a more sophisticated estimator such as Garman-Klass.

My question is:

  • Is it possible, and moreover sensible, to combine an estimator such as Garman-Klass with a volatility model like EWMA or any of the ARCH family?
  • Or do I even need a volatility model like EWMA or *ARCH when I use such an estimator (i.e. Garman-Klass) to forecast the volatility?
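
To make the first option concrete, here is a sketch of what I mean by combining the two: compute the Garman-Klass variance for each bar from its open, high, low and close, then feed that into the usual EWMA recursion in place of the squared return. The function names are hypothetical, and the smoothing factor 0.94 is just the familiar daily RiskMetrics value, not something calibrated for intraday bars.

```python
import math


def garman_klass_variance(o, h, l, c):
    """Garman-Klass variance estimate for one bar from open, high, low, close."""
    return 0.5 * math.log(h / l) ** 2 - (2 * math.log(2) - 1) * math.log(c / o) ** 2


def ewma_forecast(variance_proxies, lam=0.94):
    """One-step-ahead variance forecast: f = lam * f + (1 - lam) * proxy.

    The recursion is seeded with the first proxy (an arbitrary choice).
    """
    forecast = variance_proxies[0]
    for proxy in variance_proxies:
        forecast = lam * forecast + (1 - lam) * proxy
    return forecast
```

The volatility forecast would then be the square root of `ewma_forecast([garman_klass_variance(o, h, l, c) for each bar])`.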

by flxh at September 30, 2016 08:26 AM



Suppose I want to track the S&P500 index using 15 stocks; how do I adjust their weights?

I am given 15 stocks (all listed on the NYSE) and want to track/replicate the S&P500 index. I currently have the stock prices and some capital to invest (all of it must go into stocks); how do I determine the weights? I know that these 15 weights must sum to one, but if I take the index weights directly from the current market, they won't sum to 1 (because there are 500 stocks in the S&P, while I only have 15). Do I re-adjust, and if yes, how?
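
The simplest re-adjustment I can think of is to divide each of the 15 index weights by their sum, so that the relative proportions are kept and the total is 1. This is a sketch only; the tickers in the usage note are placeholders, and I do not know whether plain rescaling tracks the index well compared with, say, choosing weights that minimise tracking error.

```python
def renormalize_weights(index_weights):
    """Rescale a subset of index weights so that they sum to one.

    index_weights: dict mapping a ticker to its weight in the full index.
    """
    total = sum(index_weights.values())
    return {ticker: weight / total for ticker, weight in index_weights.items()}
```

For instance, index weights of 3%, 2% and 1% for three placeholder tickers would become 50%, 33.3% and 16.7% of the invested capital.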

by user24645 at September 30, 2016 08:12 AM


Calculating the entropy of an attribute in the ID3 algorithm when a split is perfectly classified

I have been reading about the ID3 algorithm recently and it says that the best attribute to be selected for splitting should result in the maximum information gain which can be computed with the help of the entropy.

I have written a simple python program to compute the entropy. It is shown below:

def _E(p, n):
    x = (p/(p+n))
    y = (n/(p+n))
    return(-1* (x*math.log2(x)) -1* (y*math.log2(y)))

However suppose we have a table consisting of 10 elements as follows:

x = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]

y = [1, 1, 1, 0, 1, 0, 1, 0, 1, 0]

Where x is the attribute and y is the class. Here P(0) = 0.8 and P(1) = 0.2. The entropy will be as follows:

Entropy(x) = 0.8*_E(4, 4) + 0.2*_E(2, 0)

However the second split P(1) is perfectly classified and this results in a math error since log2(0) is negative infinity. How is the entropy calculated in such cases?
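
For comparison, the convention I have seen in textbooks is to treat $0 \cdot \log_2 0$ as 0 (its limit as the probability goes to 0), which makes a perfectly classified split contribute zero entropy. A sketch of my function with that convention, assuming it is the right one for ID3:

```python
import math


def entropy(p, n):
    """Binary entropy of p positive and n negative examples, with 0*log2(0) taken as 0."""
    total = p + n
    if total == 0:
        return 0.0
    result = 0.0
    for count in (p, n):
        if count > 0:  # skip empty classes instead of calling log2(0)
            fraction = count / total
            result -= fraction * math.log2(fraction)
    return result
```

With this version a pure split such as 2 positive and 0 negative examples evaluates to 0 instead of raising a math error.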

by Abc254 at September 30, 2016 08:11 AM

Changing the input test data into the dimensions of the feature matrix

I trained a model with a feature matrix of dimension (200, 716), where 200 is the number of documents and 716 is the total number of features. Now I want to test the model with input test data that has only 7 feature words. How can I map these features to the exact same number of features on which the model was trained, so that I can use model.predict(test_data) to check the model's prediction on new data?
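
What I think I need is sketched below in plain Python: keep the vocabulary learned from the training documents and count the test words against it, so that the test row has one entry per training feature and unseen words are dropped. The function names are made up for illustration. With a scikit-learn vectorizer the equivalent would presumably be to call `transform` on the already fitted vectorizer instead of `fit_transform`.

```python
def build_vocabulary(train_docs):
    """Assign a column index to every distinct word in the training documents."""
    vocab = {}
    for doc in train_docs:
        for word in doc.split():
            vocab.setdefault(word, len(vocab))
    return vocab


def vectorize(doc, vocab):
    """Count the words of doc in the training feature space; unknown words are ignored."""
    row = [0] * len(vocab)
    for word in doc.split():
        index = vocab.get(word)
        if index is not None:
            row[index] += 1
    return row
```

A test document with 7 words would then still produce a row of length 716, mostly zeros.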

by Saurabh at September 30, 2016 08:06 AM

Library for classifying/predicting ethnicity based on first/last name

Is there any library (Python, Java) for predicting/classifying ethnicity based on first/last name?

Similar to this:

by samol at September 30, 2016 07:37 AM


patter: `go test -v` as TAP output

After having recently implemented a very, very, very simple shell script based functional testing setup for a CLI tool, that output TAP to use with prove(1), I’ve become super obsessed with the simplicity of TAP, so I wrote a simple output translator for use with Go’s builtin testing framework. Not sure exactly how useful this actually is for most people, but you never know.


by apg at September 30, 2016 07:35 AM


Risk neutral measure for jump processes

Assume we model the dynamics of a tradable asset as follows $$ S_t = S_0 \exp\left[\sigma W_t +(\alpha-\beta\lambda-\frac{1}{2}\sigma^2)t+J_t \right] $$ where $W_t$ is a standard Brownian motion independent from $J_t = \sum_{i=1}^{N_t} Y_i$ a compound Poisson process.

What conditions should $\alpha$ and $\beta$ verify for this dynamics to be a valid risk-neutral dynamics?
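
My own attempt so far, under assumptions not stated above: $\lambda$ is the intensity of $N_t$, the jumps $Y_i$ are i.i.d. and independent of $N_t$ and $W_t$, the interest rate $r$ is constant, and the asset pays no dividends. Using $E[e^{\sigma W_t - \frac{1}{2}\sigma^2 t}] = 1$ and $E[e^{J_t}] = \exp\left(\lambda t\,(E[e^{Y_1}] - 1)\right)$, $$ E[S_t] = S_0 \exp\left[\left(\alpha - \beta\lambda + \lambda\,(E[e^{Y_1}] - 1)\right)t\right], $$ so requiring the discounted price $e^{-rt}S_t$ to be a martingale would give $$ \alpha - \beta\lambda + \lambda\,(E[e^{Y_1}] - 1) = r. $$ One choice satisfying this would be $\alpha = r$ and $\beta = E[e^{Y_1}] - 1$, but I am not sure whether this is the only admissible one.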

by user7843 at September 30, 2016 07:08 AM

What is a mathematically rigorous way to estimate a floating swap cash flow in the future?

In a vanilla swap, each floating-leg (FL) payment is fixed on one reset date and paid on the next. So the next payment is known. However, the payment after that is not known. What would be the best estimate of it, mathematically?

Applying the Markov property to the bond price, the expected price will not change. So one could calculate the payment after next from the next known payment along with the forward rate for the next reset period. It seems to be simple.

I am looking for a more mathematically correct way to approach this, if any.
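
For context, the textbook relation I am starting from, assuming a single curve of zero-coupon bond prices $P(t,T)$ and an accrual fraction $\tau$ between the reset date $T_1$ and the payment date $T_2$: $$ F(t; T_1, T_2) = \frac{1}{\tau}\left(\frac{P(t,T_1)}{P(t,T_2)} - 1\right), \qquad E^{T_2}_t\left[L(T_1,T_2)\right] = F(t; T_1, T_2), $$ where the expectation is taken under the $T_2$-forward measure. The value at time $t$ of the unknown payment $\tau L(T_1,T_2)$ made at $T_2$ is then $P(t,T_2)\,\tau\,F(t;T_1,T_2) = P(t,T_1) - P(t,T_2)$.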

by user12348 at September 30, 2016 07:04 AM


Using unsupervised clustering algorithms to create more representative training data

I am exploring a unique machine learning project. For this project, I need to classify studies as Included or Excluded. I plan to use a supervised machine learning classification algorithm for this and so will need to create some training data.

Normally when choosing training data, I randomly choose a subset of the database in the hope that it truly is representative of the data. However, every time the algorithm is run, training data will need to be generated. So, in an attempt to generate more representative training data, I plan to use unsupervised clustering algorithms to cluster my data into multiple clusters. Then a random subset of each cluster is chosen, which then makes up the training data set, in the hope that it is more representative of the entire dataset.
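
As a sketch of the sampling step I have in mind, assuming the cluster labels have already been produced by some clustering algorithm: group the row indices by label and draw the same fraction from every cluster, taking at least one row from each. The function name and the fixed seed are my own choices for illustration.

```python
import random
from collections import defaultdict


def sample_per_cluster(labels, fraction, seed=0):
    """Pick a random subset of row indices from every cluster.

    labels: cluster label of each row, in row order.
    fraction: share of each cluster to select (at least one row per cluster).
    """
    rng = random.Random(seed)
    by_cluster = defaultdict(list)
    for index, label in enumerate(labels):
        by_cluster[label].append(index)

    chosen = []
    for label in sorted(by_cluster):
        members = by_cluster[label]
        k = max(1, round(fraction * len(members)))
        chosen.extend(rng.sample(members, k))
    return sorted(chosen)
```

The selected rows would then be the ones handed over for manual labelling as Included or Excluded.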

Has this been done previously, and if so can you cite the paper that mentions this?

by Wonderer at September 30, 2016 07:03 AM

Planet Theory

CS@Aalborg University: Research evaluation 2011-2015

Every five years, the Department of Computer Science at Aalborg University undergoes a research evaluation. The purpose of this exercise is to provide the department with qualified and independent opinions on its "actual research topics, results, and performance, but also on strategic issues like funding, internal organization and synergies, possible new directions, collaboration with industry, internationalization, positioning IT as a key enabler in society, etc." So the overall aim is to improve the quality and impact of the research carried out within the department.

The evaluation committee for the period 2011-2015 consisted of Peter Apers (University of Twente, the Netherlands), Jan Gulliksen (KTH Royal Institute of Technology, Stockholm, Sweden), Chris Hankin (Institute for Security Science and Technology and Imperial College, UK), Heikki Mannila (Aalto University, President of the Academy of Finland, Finland) and Torben Bach Pedersen (Aalborg University, Denmark), who was the internal member and chair of the committee.

The report resulting from the latest such evaluation has recently been released and can be found here. The editors of the report were Manfred Jaeger, Jesper Kjeldskov, Hua Lu and Brian Nielsen. As a former editor of such a report in days long gone, I know that their job required a considerable use of time and effort.

So, what did the evaluation committee have to say? Quoting from its evaluation of the department as a whole,
"The Computer Science Department has two world-class groups and excellent staff in all groups. The Danish IT benchmarking exercise of 2014 shows that the Department is the best department in Denmark for number of refereed publications and BFI points per full-time faculty member. The Department is also top in a number of other metrics. During the review it was also reported that Aalborg Computer Science graduates are highly prized by industry. The Department thus deserves to be ranked even higher in the QS World University Rankings by Subject or the Academic Ranking of World University (ARWU "Shanghai") Subject ranking. The current rankings are to a large degree caused by the poor coverage of computer science publications in the commercial bibliometric indices used in these rankings (WoS, Scopus). Here, Google Scholar provides a much better coverage. However, the Department clearly has the potential to rise considerably in these rankings but will require support from the Faculty and University to achieve this."
The two world-class groups mentioned in the above quotation are the Database and Programming Technologies and the Distributed and Embedded Systems units. (The latter is now called the Distributed, Embedded and Intelligent Systems unit, as it now also includes researchers from what used to be the Machine Intelligence group.) Those two groups are led by the Danish computer scientists with the highest h-index, and have a truly impressive publication and grant-winning record.

You can find the committee's evaluations for each of the research groups in the report. Here I'll limit myself to mentioning an excerpt of what the committee wrote about the Distributed and Embedded Systems unit, where I had the pleasure to work for ten years.

"The Distributed and Embedded Systems group is a world-class group. It is involved in a broad range of activities from semantic foundations through tool development for verification and validation to real-world applications. The group is making excellent contributions across the whole spectrum of activity; this is internationally recognized by prestigious awards such as:
  • The ERC Advanced Grant LASSO
  • The 2013 CAV Award for Uppaal - the first time that this award has been granted to a non-US team
  • The ranking of "Uppaal in a Nutshell" as the 9th most influential paper in Software Engineering since 1972
  • Best paper awards, medals and other awards to Associate Professors
The h-index of Kim Guldstrand Larsen is outstanding and places him among the top echelon of researchers in this area; his h-index is higher than some Turing Award winners in cognate areas. It is also pleasing to note that some of the Associate Professors also have high h-indices for their career point. .....

The group has published well during the review period with 175 conference papers - 75% of which are in A and B venues - and 63 journals - 92% of which are in A and B venues. .....
The group has secured 37 new grants to a total value of DKK103.8M. ....

The major strength of the group is the people; not only the group leader but the strong group of more junior academic staff and an excellent group of support staff. The broad span from foundational work to applications is also unusual in such groups in other universities and is a considerable strength of DES. The profile and reach of the group is enhanced by its dissemination activities but also the engagement of senior staff in policy-related activities at national and European levels."

Of course there is still a lot of room for improvement, but this will require support from the university as a whole, high-profile new hires in the future and the development of the talent the department already boasts. However, the opinion of the evaluation committee clearly highlights the current strength of a CS department that, in my admittedly biased opinion, deserves to be better known worldwide.

by Luca Aceto at September 30, 2016 07:01 AM


Python loop speed comparison

I was doing an extremely simple coding exercise to compare the run-times with different approaches in Python.

Task: Given a list of numbers, square and return each odd number in the list.

import timeit
# 1. functional sol
s161 = '''seq = range(1,10)
result = map(lambda x: x**2, filter(lambda x: x%2 !=0, seq))'''
u161= timeit.Timer(stmt=s161, setup='import numpy as np')
print u161.timeit(number = 10000)
>>>> 0.0283808019885

# 2. list comp sol!!! twice faster than map and filter
s162='''seq = range(1,10)
result2 = [j**2 for j in seq if j %2 != 0]'''
u162= timeit.Timer(stmt=s162, setup='import numpy as np')
print u162.timeit(number = 10000)
>>>> 0.0124025493424

# 3. numpy sol (surprisingly the slowest)
s163='''seq = np.array(range(1,10))
result3 =np.square(seq[seq%2 !=0])'''
u163= timeit.Timer(stmt=s163, setup='import numpy as np')
print u163.timeit(number = 10000)
>>>> 0.0656246913131

So in terms of speed: List Comprehension > Functional > Numpy Vectorized. I would have expected the numpy solution to be the fastest and the functional solution and the list comprehension to come head to head. I am not really calling np.append(), but applying a vectorized operation.

Is the analysis above valid? If so, could you provide some insight as to why the list comprehension is the fastest, and why the numpy solution is comparatively very slow?


Per the comments, for seq=range(10000), the comparison changes to:

1. Functional = 1.87310951806
2. List Comprehension = 0.941395220259
3. Numpy = 0.670276152436

Numpy fastest as expected. I am not sure why list comprehension is much faster than map/filter though.
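
One plausible factor is that the map/filter version pays for a Python-level lambda call per element, twice, while the comprehension evaluates its expression inline. A minimal sketch for re-running the comparison at several input sizes follows; it assumes Python 3 (so `list()` is needed to force `map`), and the function names are illustrative, not from the question.

```python
import timeit


def squares_functional(seq):
    # list() forces evaluation; in Python 3 map and filter are lazy.
    return list(map(lambda x: x ** 2, filter(lambda x: x % 2 != 0, seq)))


def squares_comprehension(seq):
    # Same result, but without a lambda call per element.
    return [x ** 2 for x in seq if x % 2 != 0]


def time_both(n, repeats=100):
    """Return (functional_seconds, comprehension_seconds) for range(1, n)."""
    seq = list(range(1, n))
    t_func = timeit.timeit(lambda: squares_functional(seq), number=repeats)
    t_comp = timeit.timeit(lambda: squares_comprehension(seq), number=repeats)
    return t_func, t_comp


if __name__ == "__main__":
    for n in (10, 1000, 10000):
        t_func, t_comp = time_both(n)
        print(n, "functional:", t_func, "comprehension:", t_comp)
```

Building the input once outside the timed statement keeps the cost of creating `seq` out of the measurement, which matters most for the tiny 9-element case.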

by Zhubarb at September 30, 2016 06:52 AM



How should we design the architecture for machine learning engines? Monolithic or service-based?

I am an engineer, not a data scientist. I recently moved from a monolithic approach to a microservice approach, and with each step I take I think about decoupling. I followed the same approach when designing the architecture for my current ML project. ML has different components such as pre-processing, clustering, regression, feature selection, evaluation, cross-validation, etc., and we have microservices ready for all of these steps. Now I want to address two important things with respect to model building.

  1. Keep notes on the experiments carried out for model building and generate a report.
  2. At any given time, if I want to generate the same model again, I should have all my data and code tagged so that I can pick up from there.

The first problem is pretty much solved: I found different resources that recommend using S3 and other persistent storage. To address the second problem I have to tag my code at every instance. If I manage multiple repos (multiple services), tagging code across all of them increases the maintenance burden.

Engineers use a service-based approach so that any service can be scaled on demand, thanks to AWS auto scaling. But the engine I am designing for ML is not something customers are going to use; it is something data scientists are going to use for experiments and model building. In that scenario I thought I could design a monolithic engine, launch a new EC2 instance at the start of every experiment, and terminate it once the experiment is done.

Please advise.

by SangamAngre at September 30, 2016 06:32 AM


New P2 Instance Type for Amazon EC2 – Up to 16 GPUs

I like to watch long-term technology and business trends and watch as they shape the products and services that I get to use and to write about. As I was preparing to write today’s post, three such trends came to mind:

  • Moore’s Law – Coined in 1965, Moore’s Law postulates that the number of transistors on a chip doubles every year.
  • Mass Market / Mass Production – Because all of the technologies that we produce, use, and enjoy every day consume vast numbers of chips, there’s a huge market for them.
  • Specialization  – Due to the previous trend, even niche markets can be large enough to be addressed by purpose-built products.

As the industry pushes forward in accord with these trends, a couple of interesting challenges have surfaced over the past decade or so. Again, here’s a quick list (yes, I do think in bullet points):

  • Speed of Light – Even as transistor density increases, the speed of light imposes scaling limits (as computer pioneer Grace Hopper liked to point out, electricity can travel slightly less than 1 foot in a nanosecond).
  • Semiconductor Physics – Fundamental limits in the switching time (on/off) of a transistor ultimately determine the minimum achievable cycle time for a CPU.
  • Memory Bottlenecks – The well-known von Neumann Bottleneck imposes limits on the value of additional CPU power.

The GPU (Graphics Processing Unit) was born of these trends, and addresses many of the challenges! Processors have reached the upper bound on clock rates, but Moore’s Law gives designers more and more transistors to work with. Those transistors can be used to add more cache and more memory to a traditional architecture, but the von Neumann Bottleneck limits the value of doing so. On the other hand, we now have large markets for specialized hardware (gaming comes to mind as one of the early drivers for GPU consumption). Putting all of this together, the GPU scales out (more processors and parallel banks of memory) instead of up (faster processors and bottlenecked memory). Net-net: the GPU is an effective way to use lots of transistors to provide massive amounts of compute power!

With all of this as background, I would like to tell you about the newest EC2 instance type, the P2. These instances were designed to chew through tough, large-scale machine learning, deep learning, computational fluid dynamics (CFD), seismic analysis, molecular modeling, genomics, and computational finance workloads.

New P2 Instance Type
This new instance type incorporates up to 8 NVIDIA Tesla K80 Accelerators, each running a pair of NVIDIA GK210 GPUs. Each GPU provides 12 GB of memory (accessible via 240 GB/second of memory bandwidth), and 2,496 parallel processing cores. They also include ECC memory protection, allowing them to fix single-bit errors and to detect double-bit errors. The combination of ECC memory protection and double precision floating point operations makes these instances a great fit for all of the workloads that I mentioned above.

Here are the instance specs:

Instance Name   GPU Count   vCPU Count   Memory    Parallel Processing Cores   GPU Memory   Network Performance
p2.xlarge       1           4            61 GiB    2,496                       12 GB        High
p2.8xlarge      8           32           488 GiB   19,968                      96 GB        10 Gigabit
p2.16xlarge     16          64           732 GiB   39,936                      192 GB       20 Gigabit
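
The core and GPU memory totals in the table follow directly from the per-GPU figures quoted above (2,496 cores and 12 GB per GPU). A small sketch of that arithmetic, with constant and function names chosen here for illustration:

```python
# Per-GPU figures as quoted in the post.
CORES_PER_GPU = 2496
MEMORY_PER_GPU_GB = 12


def gpu_totals(gpu_count):
    """Return (total parallel cores, total GPU memory in GB) for an instance."""
    return gpu_count * CORES_PER_GPU, gpu_count * MEMORY_PER_GPU_GB


if __name__ == "__main__":
    for gpus in (1, 8, 16):
        cores, memory_gb = gpu_totals(gpus)
        print(gpus, "GPU(s):", cores, "cores,", memory_gb, "GB GPU memory")
```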

All of the instances are powered by an AWS-Specific version of Intel’s Broadwell processor, running at 2.7 GHz. The p2.16xlarge gives you control over C-states and P-states, and can turbo boost up to 3.0 GHz when running on 1 or 2 cores.

The GPUs support CUDA 7.5 and above, OpenCL 1.2, and the GPU Compute APIs. The GPUs on the p2.8xlarge and the p2.16xlarge are connected via a common PCI fabric. This allows for low-latency, peer to peer GPU to GPU transfers.

All of the instances make use of our new Elastic Network Adapter (ENA; read "Elastic Network Adapter – High Performance Network Interface for Amazon EC2" to learn more) and can, per the table above, support up to 20 Gbps of low-latency networking when used within a Placement Group.

Having a powerful multi-vCPU processor and multiple, well-connected GPUs on a single instance, along with low-latency access to other instances with the same features creates a very impressive hierarchy for scale-out processing:

  • One vCPU
  • Multiple vCPUs
  • One GPU
  • Multiple GPUs in an instance
  • Multiple GPUs in multiple instances within a Placement Group

P2 instances are VPC only, require the use of 64-bit, HVM-style, EBS-backed AMIs, and you can launch them today in the US East (Northern Virginia), US West (Oregon), and Europe (Ireland) regions as On-Demand Instances, Spot Instances, Reserved Instances, or Dedicated Hosts.

Here’s how I installed the NVIDIA drivers and the CUDA toolkit on my P2 instance, after first creating, formatting, attaching, and mounting (to /ebs) an EBS volume that had enough room for the CUDA toolkit and the associated samples (10 GiB is more than enough):

$ cd /ebs
$ sudo yum update -y
$ sudo yum groupinstall -y "Development tools"
$ sudo yum install -y kernel-devel-`uname -r`
$ wget
$ wget
$ chmod +x
$ sudo ./
$ chmod +x
$ sudo ./   # Don't install driver, just install CUDA and sample
$ sudo nvidia-smi -pm 1
$ sudo nvidia-smi -acp 0
$ sudo nvidia-smi --auto-boost-permission=0
$ sudo nvidia-smi -ac 2505,875

Note that and are interactive programs; you need to accept the license agreements, choose some options, and enter some paths. Here’s how I set up the CUDA toolkit and the samples when I ran

P2 and OpenCL in Action
With everything set up, I took this Gist and compiled it on a p2.8xlarge instance:

[ec2-user@ip-10-0-0-242 ~]$ gcc test.c -I /usr/local/cuda/include/ -L /usr/local/cuda-7.5/lib64/ -lOpenCL -o test

Here’s what it reported:

[ec2-user@ip-10-0-0-242 ~]$ ./test
1. Device: Tesla K80
 1.1 Hardware version: OpenCL 1.2 CUDA
 1.2 Software version: 352.99
 1.3 OpenCL C version: OpenCL C 1.2
 1.4 Parallel compute units: 13
2. Device: Tesla K80
 2.1 Hardware version: OpenCL 1.2 CUDA
 2.2 Software version: 352.99
 2.3 OpenCL C version: OpenCL C 1.2
 2.4 Parallel compute units: 13
3. Device: Tesla K80
 3.1 Hardware version: OpenCL 1.2 CUDA
 3.2 Software version: 352.99
 3.3 OpenCL C version: OpenCL C 1.2
 3.4 Parallel compute units: 13
4. Device: Tesla K80
 4.1 Hardware version: OpenCL 1.2 CUDA
 4.2 Software version: 352.99
 4.3 OpenCL C version: OpenCL C 1.2
 4.4 Parallel compute units: 13
5. Device: Tesla K80
 5.1 Hardware version: OpenCL 1.2 CUDA
 5.2 Software version: 352.99
 5.3 OpenCL C version: OpenCL C 1.2
 5.4 Parallel compute units: 13
6. Device: Tesla K80
 6.1 Hardware version: OpenCL 1.2 CUDA
 6.2 Software version: 352.99
 6.3 OpenCL C version: OpenCL C 1.2
 6.4 Parallel compute units: 13
7. Device: Tesla K80
 7.1 Hardware version: OpenCL 1.2 CUDA
 7.2 Software version: 352.99
 7.3 OpenCL C version: OpenCL C 1.2
 7.4 Parallel compute units: 13
8. Device: Tesla K80
 8.1 Hardware version: OpenCL 1.2 CUDA
 8.2 Software version: 352.99
 8.3 OpenCL C version: OpenCL C 1.2
 8.4 Parallel compute units: 13

As you can see, I have a ridiculous amount of compute power available at my fingertips!

New Deep Learning AMI
As I said at the beginning, these instances are a great fit for machine learning, deep learning, computational fluid dynamics (CFD), seismic analysis, molecular modeling, genomics, and computational finance workloads.

In order to help you to make great use of one or more P2 instances, we are launching a Deep Learning AMI today. Deep learning has the potential to generate predictions (also known as scores or inferences) that are more reliable than those produced by less sophisticated machine learning, at the cost of a more complex and more computationally intensive training process. Fortunately, the newest generations of deep learning tools are able to distribute the training work across multiple GPUs on a single instance as well as across multiple instances each containing multiple GPUs.

The new AMI contains the following frameworks, each installed, configured, and tested against the popular MNIST database:

MXNet – This is a flexible, portable, and efficient library for deep learning. It supports declarative and imperative programming models across a wide variety of programming languages including C++, Python, R, Scala, Julia, Matlab, and JavaScript.

Caffe – This deep learning framework was designed with expression, speed, and modularity in mind. It was developed at the Berkeley Vision and Learning Center (BVLC) with assistance from many community contributors.

Theano – This Python library allows you to define, optimize, and evaluate mathematical expressions that involve multi-dimensional arrays.

TensorFlow – This is an open source library for numerical calculation using data flow graphs (each node in the graph represents a mathematical operation; each edge represents multidimensional data communicated between them).

Torch – This is a GPU-oriented scientific computing framework with support for machine learning algorithms, all accessible via LuaJIT.

Consult the README file in ~ec2-user/src to learn more about these frameworks.

You may also find the following AMIs to be of interest:


by Jeff Barr at September 30, 2016 06:26 AM


How do I simplify this SOP expression?

Hi, I have derived the following SoP (Sum of Products) expression by analyzing the truth table of a 3-bit binary to Gray code converter. I am asking for verification because I feel that this answer may not be correct or complete.

X = a'bc' + a'bc + ab'c' + ab'c

which, using k-maps, was simplified to

X = ab' + a'b

Is this correct?
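
One way to check a simplification like this is to compare both expressions on all eight input combinations. The sketch below does that by brute force; the function names are illustrative. Note that ab' + a'b is the XOR of a and b, so the comparison against `a != b` is included as well.

```python
from itertools import product


def sop_original(a, b, c):
    # X = a'bc' + a'bc + ab'c' + ab'c
    return ((not a and b and not c) or (not a and b and c)
            or (a and not b and not c) or (a and not b and c))


def sop_simplified(a, b, c):
    # X = ab' + a'b  (c no longer appears)
    return (a and not b) or (not a and b)


def equivalent():
    """True if both expressions agree on every row of the truth table."""
    return all(
        bool(sop_original(a, b, c)) == bool(sop_simplified(a, b, c)) == (a != b)
        for a, b, c in product([False, True], repeat=3)
    )


if __name__ == "__main__":
    print("expressions agree on all 8 rows:", equivalent())
```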

by FutureSci at September 30, 2016 05:59 AM

I have a problem understanding a solution to a number theory problem from SPOJ

Problem statement can be found here or down below.

The solution which I'm trying to understand can be found here or down below.

Problem Statement.

Peter wants to generate some prime numbers for his cryptosystem. Help him! Your task is to generate all prime numbers between two given numbers!

The input begins with the number t of test cases in a single line (t<=10). In each of the next t lines there are two numbers m and n (1 <= m <= n <= 1000000000, n-m<=100000) separated by a space.

For every test case print all prime numbers p such that m <= p <= n, one number per line, test cases separated by an empty line.

1 10
3 5




The idea behind the solution is to first generate all the primes that could be factors of any number up to the maximum endpoint, 1 billion, that is, all primes up to its square root.
That square root happens to be around 32000.
Using this array, do a bounded Sieve of Eratosthenes only in the range requested.

#include <stdio.h>
#include <string.h>
#include <stdbool.h>
#include <math.h>

int main() {
    int primes[4000];
    int numprimes = 0;

    /* Collect all primes up to 32000 by trial division. */
    primes[numprimes++] = 2;
    for (int i = 3; i <= 32000; i+=2) {
        bool isprime = true;
        int cap = sqrt(i)+1;
        for (int j = 0; j < numprimes; j++) {
            if (primes[j] >= cap) break;
            if (i % primes[j] == 0) {
                isprime = false;
                break;
            }
        }
        if (isprime) primes[numprimes++] = i;
    }

    int T,N,M;
    scanf("%d",&T);   /* number of test cases */

    for (int t = 0; t < T; t++) {
        if (t) printf("\n");
        scanf("%d %d",&M,&N);
        if (M < 2) M = 2;

        int cap = sqrt(N) + 1;

        /* isprime[k] describes the number M + k; start with everything marked prime. */
        bool isprime[100001];
        memset(isprime, true, sizeof(isprime));

        for (int i = 0; i < numprimes; i++) {
            int p = primes[i];

            if(p >= cap) break;

            /* First multiple of p inside [M, N], skipping p itself. */
            int start;

            if (p >= M) start = p*2;
            else start = M + ((p - M % p) % p);

            for (int j = start; j <= N; j += p) {
                isprime[j - M] = false;
            }
        }

        /* 2 is the only even prime; every other candidate is odd. */
        int start = (M % 2)?M:M+1;

        if (M == 2) {
            printf("2\n");
        }
        for (int i = start; i <= N; i+=2) {
            if (isprime[i-M]) printf("%d\n",i);
        }
    }
    return 0;
}

I know how the Sieve of Eratosthenes works, and I have also traced this program with pen and paper.
It works fine, but I am not able to understand why it works. How do I prove that this program and the algorithm it uses are correct?
I spent hours on it but could not find a proof.

Any help would be appreciated.
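
The correctness argument rests on two facts: every composite k <= n has a prime factor p <= sqrt(k) <= sqrt(n), so it is crossed out as a multiple of some small prime; and a prime is never crossed out, because only multiples starting from 2p are marked. The following is a minimal Python sketch of the same segmented sieve, not a translation of the C program line by line; the function names are illustrative, and a trial-division checker is included so the two can be compared on small ranges.

```python
from math import isqrt


def small_primes(limit):
    """All primes <= limit by a plain Sieve of Eratosthenes."""
    if limit < 2:
        return []
    sieve = [True] * (limit + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, isqrt(limit) + 1):
        if sieve[p]:
            for multiple in range(p * p, limit + 1, p):
                sieve[multiple] = False
    return [i for i, flag in enumerate(sieve) if flag]


def primes_in_range(m, n):
    """All primes p with m <= p <= n, sieving only the window [m, n]."""
    m = max(m, 2)
    if n < m:
        return []
    is_prime = [True] * (n - m + 1)  # is_prime[k] describes the number m + k
    for p in small_primes(isqrt(n)):
        # First multiple of p that is >= m, but never p itself.
        start = max(2 * p, m + (-m) % p)
        for multiple in range(start, n + 1, p):
            is_prime[multiple - m] = False
    return [m + k for k, flag in enumerate(is_prime) if flag]


def is_prime_trial(k):
    """Reference check by trial division, used only for comparison."""
    return k >= 2 and all(k % d for d in range(2, isqrt(k) + 1))


if __name__ == "__main__":
    print(primes_in_range(1, 10))
    print(primes_in_range(3, 5))
```

Comparing `primes_in_range` against `is_prime_trial` on many small windows is not a proof, but it is a quick way to gain confidence in the index arithmetic for `start`.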

by mac07 at September 30, 2016 05:56 AM


What desktop environment and window manager do you use?

I’m curious what desktop environments and window managers folks in the Lobsters are using. I’m gonna be converting a desktop machine to use FreeBSD soon, so I’m curious what others are using.

by zg at September 30, 2016 05:54 AM


Finding Regular expression for FSM with more than one final state

I am trying to find a regular expression for an FSM with more than one final state (two, in my case). problem statement FSM

Here I am getting two different REs, one for state q4 and one for state q5. So my question is: if we get more than one RE for an FSM with more than one final state, is that correct?

by Tetragrammaton at September 30, 2016 05:49 AM


Trying to solve information asymmetry in elections. Public good project. webcrawlers [on hold]

Creating a Yelp for politicians would greatly reduce the information asymmetry in elections. The level of information that voters bring to democratic elections is often low (except in presidential elections) because most people don't know when the election is, and don't know (or don't care) much about the politicians. The same goes for information on bills, past support, and funding. All of that information is publicly available, but you have to spend some time, probably hours, to gather it for a single election. The idea is to create a profile for each politician running, and to let people subscribe to a newsletter based on location. More details here. Suggestions and criticism are welcome; webcrawler help required.

by Public Servant at September 30, 2016 05:36 AM


Efficiently check if any vertex has a path to its partner vertex

So I have a directed graph that looks something like:


I'm trying to make an algorithm that can go through all the vertices and tell me whether we have any path from an upper case letter to its lowercase version, or a path from a lowercase letter to its upper case version. In the above example, we have a path from C to c. We also have a path from B to b, from a to A, from E to e, and from G to g. I'd like the algorithm to find at least one of these paths.

The algorithm can stop looking if it just finds one such case where this happens. Each vertex points to at most one different vertex, so the out-degree of every vertex is either zero or one. I'm trying to get this done efficiently (linear time) and can put whatever extra information in the vertices to do it. I've been attempting to use the DFS algorithm to do it because you can work with ancestry but I'm not sure if it works for specific vertex relationships efficiently.

I'm aware that DFS can use discovery time and finish time to find if a vertex is the ancestor of another, and that this is used for cycle detection. I'm basically wondering if there is an efficient way to check if a specific node (that isn't necessarily part of a cycle) is the ancestor of another specific node. Unless, of course, this is unnecessary and there is a better way to go about it.
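
As a baseline to test a faster algorithm against, here is a minimal sketch that simply follows successor pointers from each vertex. It is quadratic in the worst case, so it is not the linear-time method being asked for. It assumes vertices are single-letter strings, that the partner of a vertex is its case-swapped letter, and that the graph is given as a hypothetical `successor` dictionary mapping each vertex to the one vertex it points to.

```python
def find_partner_path(vertices, successor):
    """Return (start, partner) for the first vertex that reaches its partner, else None."""
    for start in vertices:
        partner = start.swapcase()
        seen = {start}
        current = successor.get(start)
        # Out-degree is at most one, so the "path" is a single chain; stop at a
        # dead end or as soon as the chain loops back on itself.
        while current is not None and current not in seen:
            if current == partner:
                return start, current
            seen.add(current)
            current = successor.get(current)
    return None


if __name__ == "__main__":
    example = {"C": "d", "d": "c", "c": "x"}
    print(find_partner_path(["C", "d", "c", "x"], example))
```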

by barb at September 30, 2016 05:21 AM


How to generate the views in Black-Litterman model?

I want to apply a Black-Litterman approach for portfolio optimization. My question is how to select investor views? I need to base the choice on a model. I would be thankful if you could give me some references or suggestions.

by Nourhaine Nefzi at September 30, 2016 05:14 AM



Need help with comparing of elements one by one within a string [on hold]

What I was trying to do is to determine if the string has a form of "ABABAB...." (The number of A and B are the same). But when I tested "ABBBAB", it showed "the right form", "the wrong" and "the right" respectively. I guess there is something wrong in those conditional statements. Any guidance would be appreciated!


String a = sc.nextLine();
    if(a.length() % 2 == 0){
       for (int i=0;i<result.length(); i+=1){
            if(a.charAt(0) == 'A' && a.charAt(i) != a.charAt(i+1)){
                System.out.println("It is the right form");
            } else {
                System.out.println("It is the wrong form");
            }
       }
    }
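
For comparison, here is a minimal Python sketch (not the Java above) of a check that produces one verdict for the whole string instead of one per loop iteration. It assumes the intended form is a non-empty string with 'A' at every even index and 'B' at every odd index; the function name is illustrative.

```python
def is_abab(s):
    """True if s is a non-empty repetition of "AB" (ABAB...AB)."""
    if len(s) == 0 or len(s) % 2 != 0:
        return False
    # Every position is checked before a single answer is returned.
    return all(ch == ("A" if i % 2 == 0 else "B") for i, ch in enumerate(s))


if __name__ == "__main__":
    for text in ("ABABAB", "ABBBAB"):
        print(text, "right form" if is_abab(text) else "wrong form")
```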


by Johnny at September 30, 2016 04:58 AM


MATLAB's fitclinear function: significance of the Beta weights?

I'm looking to run a logistic regression classification model in MATLAB. "fitclinear" is a new feature that was introduced in MATLAB 2016a, and it has some options I find advantageous over previous functions like "glmfit". For instance, you can specify a prior over class probabilities.

I want to interpret the significance of the Beta weights for each feature when running fitclinear using the "logistic" learner, but for some reason, this isn't included in the model output when using this function. Is there any way to evaluate significance of the Beta values if the function doesn't output this automatically? That is, if you have the Beta weights, does MATLAB include any function that can tell me whether they're significant?

by statdat at September 30, 2016 04:46 AM


How is determinizing NFAs feasible if the statespace is exponential?

I have recently learned about using the powerset construction to convert an NFA to a DFA. However, this only seems to be feasible when the number of states we are working with is 3 or fewer, as $2^k$ grows unmanageably large very quickly.

I have a problem with 6 states, meaning that if I were to use powerset construction, I would have 64 states. What is a more efficient way?
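
In practice the construction is usually run lazily, creating only the subsets that are reachable from the start state, which is often far fewer than $2^k$. A minimal sketch under some assumptions: the NFA has no epsilon transitions, and it is given as a hypothetical dictionary `nfa_delta` mapping `(state, symbol)` to a set of successor states.

```python
from collections import deque


def determinize(nfa_delta, start, accepting, alphabet):
    """Subset construction that only builds subsets reachable from {start}."""
    start_set = frozenset([start])
    dfa_delta = {}
    seen = {start_set}
    queue = deque([start_set])
    while queue:
        current = queue.popleft()
        for symbol in alphabet:
            # Union of the NFA moves of every state in the current subset.
            target = frozenset(
                t for s in current for t in nfa_delta.get((s, symbol), ())
            )
            dfa_delta[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    dfa_accepting = {subset for subset in seen if subset & set(accepting)}
    return seen, dfa_delta, start_set, dfa_accepting


if __name__ == "__main__":
    # NFA for strings over {a, b} that end in "ab"; 3 states, so 8 possible subsets.
    delta = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}}
    states, _, _, finals = determinize(delta, 0, {2}, "ab")
    print(len(states), "reachable DFA states;", len(finals), "accepting")
```

The worst case is still exponential for some NFAs, so this is a way to avoid unnecessary states rather than a guarantee of a small DFA.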

by Jordan Andy at September 30, 2016 04:36 AM


tmux 2.3 released


  • New option ‘pane-border-status’ to add text in the pane borders.
  • Support for hooks on commands: ‘after’ and ‘before’ hooks.
  • ‘source-file’ understands ‘-q’ to suppress errors for nonexistent files.
  • Lots of UTF8 improvements, especially on MacOS.
  • ‘window-status-separator’ understands #[] expansions.
  • ‘split-window’ understands ‘-f’ for performing a full-width split.
  • Allow report count to be specified when using ‘bind-key -R’.
  • ‘set -a’ for appending to user options (@foo) is now supported.
  • ‘display-panes’ can now accept a command to run, rather than always selecting the pane.


by romanzolotarev at September 30, 2016 03:59 AM


Program which provably halts on all inputs with undecidable behavior [on hold]

I'm trying to come up with an output $y$ and a program $f$ such that

  1. $f$ provably halts on all inputs
  2. The statement "$\exists~x,~ f(x)=y$" is undecidable

Can you think of an example?

by Arthur B at September 30, 2016 03:54 AM


Starting short-end OIS zero curve building

I understand the concept of bootstrapping and building the curve once we have the values for the first few maturities. However, I can't quite see how the initial zero curve rates are derived from tradable instruments. As I understand it, these values are directly implied by OIS par rates.

Can someone please clarify how, given, say, a 1M OIS swap bid and ask, I can get the zero curve point at the 1M maturity?
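
A rough sketch of one common convention, with every choice below being an assumption rather than something stated in the question: the mid of bid and ask is used as the par rate, the swap is treated as having a single payment at maturity (typical for tenors under one year), the fixed leg accrues ACT/360, and the zero rate is quoted continuously compounded on ACT/365. Spot and payment lags are ignored.

```python
import math


def zero_rate_from_short_ois(bid, ask, days, basis=360):
    """Return (discount factor, continuously compounded zero rate) at maturity."""
    mid = (bid + ask) / 2.0           # assumed: use the mid as the par rate
    accrual = days / basis            # assumed: ACT/360 fixed-leg accrual
    # With one payment at maturity, par means 1 = DF * (1 + rate * accrual).
    discount_factor = 1.0 / (1.0 + mid * accrual)
    years = days / 365.0              # assumed: ACT/365 time for the zero rate
    zero_rate = -math.log(discount_factor) / years
    return discount_factor, zero_rate


if __name__ == "__main__":
    df, zero = zero_rate_from_short_ois(bid=0.0040, ask=0.0044, days=30)
    print("discount factor:", df, "zero rate:", zero)
```

The rates in the usage lines are made-up illustrative numbers, not market data.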

by sashkello at September 30, 2016 03:54 AM


Is there an infinite loop in my code? (OCaml)

I want to get the sum of the values of a function f(i) as i runs from a to b, that is, f(a)+f(a+1)+...+f(b-1)+f(b). So I wrote code like this.

let rec sigma : (int -> int) -> int -> int -> int
= fun f a b ->
if a=b then f a
else f b  + sigma f a b-1 ;;

But the result is a stack overflow during evaluation. Is there an infinite loop, and if so, why?

by Volnyar at September 30, 2016 03:48 AM


Mergesort recursion tree depth...logs

I think some of the log properties are flying over my head but I'm trying to understand how the depth of mergesort is...

$1 + \log_2 n$

I understand that to get the depth, you would have to divide $n$ by $2^x$, but I don't see how this leads to the above.

It's probably some simple log principle, but I'm not sure.
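
A small sketch that counts the levels directly may help. It assumes $n$ is a power of two: the problem size is halved until it reaches 1, which takes $\log_2 n$ halvings, and the extra 1 counts the top level of size $n$ itself. The function name is illustrative.

```python
import math


def recursion_levels(n):
    """Number of levels in the mergesort recursion tree for n a power of two."""
    levels = 1            # the root level, with one subproblem of size n
    while n > 1:
        n //= 2           # each level halves the subproblem size
        levels += 1
    return levels


if __name__ == "__main__":
    for n in (1, 2, 8, 1024):
        print(n, recursion_levels(n), 1 + int(math.log2(n)))
```

Solving $n / 2^x = 1$ gives $x = \log_2 n$ halvings, so the sizes $n, n/2, \dots, 1$ occupy $\log_2 n + 1$ levels.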

by pad11 at September 30, 2016 03:36 AM


What do you think is the most pressing issue in your favorite language?

Is there anything that …

  • creators should have done from day 1, but is still not there yet?
  • got gradually worse over the years?
  • frustrates you, because a fix would be easy, but nobody seems interested in fixing it?

Please, share your stories and experiences!

by soc at September 30, 2016 03:21 AM


How can I optimize (minimize) the output of a machine learning model?

The situation is the following: I want to optimize a process that turns wax into wax products. For each product to be made, there is a different composition of wax and a set of parameters for the process (machine configurations). During the process, some objects are discarded due to manufacturing defects. My first idea is to create a supervised machine learning model that takes the wax composition and machine configurations as inputs, and the number of wasted objects as output (I have the data to do this). My problem is: after creating that model, how do I find the optimal process parameters (machine configuration) to minimize the number of wasted objects for a given wax composition (which can be a new composition, never seen before)?
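
One simple way to frame the second step is as a search over the controllable inputs with the composition held fixed. The sketch below uses an exhaustive grid search; `predict_waste` stands in for the trained model, and the parameter names (`temperature`, `speed`, `paraffin`) and the toy model are entirely hypothetical placeholders.

```python
from itertools import product


def best_configuration(predict_waste, composition, candidate_values):
    """Try every combination of candidate settings; return (config, predicted waste)."""
    names = list(candidate_values)
    best = None
    for combo in product(*(candidate_values[name] for name in names)):
        config = dict(zip(names, combo))
        predicted = predict_waste(composition, config)
        if best is None or predicted < best[1]:
            best = (config, predicted)
    return best


def toy_model(composition, config):
    # Hypothetical stand-in for a fitted regressor: waste is lowest when the
    # temperature matches a composition-dependent target and speed is 3.
    target_temperature = 60 + 10 * composition["paraffin"]
    return (config["temperature"] - target_temperature) ** 2 + (config["speed"] - 3) ** 2


if __name__ == "__main__":
    grid = {"temperature": [60, 65, 70], "speed": [1, 2, 3, 4]}
    print(best_configuration(toy_model, {"paraffin": 0.5}, grid))
```

A grid is only practical for a handful of parameters; with more, the same loop could be replaced by random search or a numerical optimizer, and predictions far from the training data should be treated with caution.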

by pedroszattoni at September 30, 2016 03:07 AM


Pricing interest rate options in emerging markets

I've been thinking about how to price the early repayment of mortgages at banks in emerging markets, where swaptions/caps/floors aren't available, and how to hedge this kind of option. At first I thought about implementing simple methodologies, like one-factor short-rate models (I have the BDT model in mind), for discounting the future cash flows that I would receive if the options are exercised, but I wonder whether using this kind of model would lead to wrong prices.

So I would like to know whether it's a good idea to start pricing with this kind of model. How good or bad could this end up being compared to more sophisticated models? Should I spend time trying to calibrate more complete models like the LMM? If so, which instruments should I use to get the right parameters?

Many models start by assuming that there is an observable swaption/cap/floor market, but if there isn't such a market, how should the parameters be estimated? For example, I could get the zero yields from the swap market, but what about the vol? How do I benchmark my models?

by Jose Pedro Melo at September 30, 2016 02:54 AM

Issue with OLS Regression for Nelson Siegel Svensson parameters

I have been working on getting input parameters to the Non-Linear Optimization which gives the Nelson Siegel Svensson model parameters and am carrying out the OLS regression as described in this answer. However, the input parameters obtained from the OLS are too far off the actual parameters, which I checked against some parameters I actually do have. I am using the equations shown in 'Figure 5' on Page 12 of this paper, and obtain the yield data, by choosing Par Bonds and using their coupons as Par Yields to bootstrap from to get the Spot Rates, which appears to be an okay method based on Page 3 of this paper. The code that I use is below, where I've just implemented the formula in the previous link and have carried out the regression in Python. My query is if there is an issue with the way I set matrix_of_params or if it could be to do with the data in df itself.

I run the function below for different values of tau_1 and tau_2. I then have a function to get the params associated with the lowest residuals, which I am positive is correct.

#df is a Dataframe containing all the data about the Bonds
def obtainingparams(self, df, tau_1, tau_2, residuals):
    values = []
    face_values = df['FACE_VALUE'].values #Writing face values to an array
    yields = (df['coupon'].values) #COUPON = YTM for Par Bonds         
    spot_rate = np.zeros((yields.shape[0]))   

    #Calculating Spot Rates
    for x, value in np.ndenumerate(yields):
        index = x[0]
        if index == 0:
            spot_rate[index] = (yields[index]/face_values[index]) * 100

            adding_negatives = 0
            if index < spot_rate.shape[0]:
                for i in range (0, index, 1):
                    adding_negatives = adding_negatives + (value*face_values[index]/200)/np.power((1+(spot_rate[i]/200)),i+1)
                    term_1 = face_values[index] - adding_negatives 
                    spot_rate[index] = (2 * ((np.power(((((face_values[index] + ((value*face_values[index]/200)))/term_1))),1/(index+1)))-1))*100

    matrix_of_params = np.empty(shape=[1, 4])
    months_to_maturity_matrix = df.months_to_maturity.values #Writing months to maturity to an array

    #Populating the Matrix of Parameter Coefficients
    count = 0
    for x, value in np.ndenumerate(months_to_maturity_matrix):
        if count < months_to_maturity_matrix.shape[0]:
            months_to_maturity = months_to_maturity_matrix[count]
            years_to_maturity = months_to_maturity/12.0  

            #Applying the equation in the link
            newrow = [1, ((1-np.exp(-years_to_maturity/tau_1))/(years_to_maturity/tau_1)), ((1-np.exp(-years_to_maturity/tau_1))/(years_to_maturity/tau_1))-(np.exp(-years_to_maturity/tau_1)), ((((1-np.exp(-years_to_maturity/tau_2))/(years_to_maturity/tau_2))))-(np.exp(-years_to_maturity/tau_2))] 
            count = count + 1

            #Just adding the new row to the matrix of parameter coefficients
            matrix_of_param_coefficients = np.vstack([matrix_of_params, newrow]) 

    #Carrying out OLS Regression                
    params = np.linalg.lstsq(matrix_of_params,spot_rate)[0] 
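    #Root of the summed squared residuals; the fitted values are assumed to be matrix_of_params.dot(params)
    residuals = np.sqrt(((spot_rate - matrix_of_params.dot(params))**2).sum())

    #To keep track of which params are associated with which residuals
    values.append((tau_1, tau_2, residuals, params))
    return values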

Thank You
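
As a cross-check on the regressor matrix, here is a minimal standalone sketch of the four Svensson loadings used in the `newrow` expression above, written without numpy. The function names are illustrative. The point it illustrates is that the matrix passed to the least-squares solver should have exactly one row per bond, in the same order as the spot rates, so a quick test is whether its row count equals the number of spot rates.

```python
import math


def nss_loadings(years, tau_1, tau_2):
    """Coefficients of (beta_0, beta_1, beta_2, beta_3) at one maturity."""
    x1 = years / tau_1
    x2 = years / tau_2
    slope = (1 - math.exp(-x1)) / x1
    hump_1 = slope - math.exp(-x1)
    hump_2 = (1 - math.exp(-x2)) / x2 - math.exp(-x2)
    return [1.0, slope, hump_1, hump_2]


def design_matrix(months_to_maturity, tau_1, tau_2):
    """One row of loadings per bond; maturities are given in months."""
    return [nss_loadings(m / 12.0, tau_1, tau_2) for m in months_to_maturity]


if __name__ == "__main__":
    for row in design_matrix([12, 24, 60], tau_1=2.0, tau_2=5.0):
        print(row)
```

The tau values and maturities in the usage lines are arbitrary illustrative inputs.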

by Jojo at September 30, 2016 02:54 AM


What is an intuitive explanation of Expectation Maximization technique?

Expectation maximization is a kind of probabilistic method to classify data. Please correct me if I am wrong and it is not a classifier.

What is an intuitive explanation of this EM technique? What is expectation here and what is being minimized?

by Abhishek Shivkumar at September 30, 2016 02:29 AM



How to determine which of 10^logn and 3n^2 grows faster asymptotically?

My thinking is simple: 10^logn = n, which grows more slowly than 3n^2.

However, many tutorials show that 3n^2 ranks before 10^logn.

I'm really confused.
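
The answer depends on the base of the logarithm, which the question leaves implicit. With base 10, 10^logn is exactly n; with base 2 (a common default in algorithms courses), 10^logn = n^(log2 10), roughly n^3.32, which outgrows 3n^2. A small numeric sketch, with an illustrative function name:

```python
import math


def ten_pow_log(n, base):
    """Evaluate 10 ** log_base(n)."""
    return 10 ** math.log(n, base)


if __name__ == "__main__":
    for n in (16, 1024, 10 ** 6):
        print(
            n,
            "base 10:", ten_pow_log(n, 10),
            "base 2:", ten_pow_log(n, 2),
            "3n^2:", 3 * n ** 2,
        )
```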

by user59001 at September 30, 2016 02:14 AM


How do I get a list of all ETFs / REITs?

I am currently doing research that needs historical data for ETFs and REITs going back to the early 2000s.

The problem is that I do not know which symbols existed back then, unlike indices such as the S&P and the Dow, whose historical components I can easily look up.

Is there any (free) way of downloading a list of symbols as of each day or month, going back to the early 2000s?

Many thanks.

by cxwf at September 30, 2016 01:48 AM



Loss Not Converging Caffe Regression

I'm doing regression in Caffe. The dataset is 400 RGB images of size 128x128, and the labels are floats in the range (-1, 1). The only transformation I applied to the dataset was normalization (I divided every RGB pixel value by 255). But the loss doesn't seem to converge at all.

What might be the possible reason for this? Can anyone please suggest something?

Here is my training log:

Using solver: solver_hdf5.prototxt
I0929 21:50:21.657784 13779 caffe.cpp:112] Use CPU.
I0929 21:50:21.658033 13779 caffe.cpp:174] Starting Optimization
I0929 21:50:21.658107 13779 solver.cpp:34] Initializing solver from parameters: 
test_iter: 100
test_interval: 500
base_lr: 0.0001
display: 25
max_iter: 10000
lr_policy: "inv"
gamma: 0.0001
power: 0.75
momentum: 0.9
weight_decay: 0.0005
snapshot: 5000
snapshot_prefix: "lenet_hdf5"
solver_mode: CPU
net: "train_test_hdf5.prototxt"
I0929 21:50:21.658143 13779 solver.cpp:75] Creating training net from net file: train_test_hdf5.prototxt
I0929 21:50:21.658567 13779 net.cpp:334] The NetState phase (0) differed from the phase (1) specified by a rule in layer data
I0929 21:50:21.658709 13779 net.cpp:46] Initializing net from parameters: 
name: "MSE regression"
state {
  phase: TRAIN
}
layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  hdf5_data_param {
    source: "train_hdf5file.txt"
    batch_size: 64
    shuffle: true
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "dropout1"
  type: "Dropout"
  bottom: "pool1"
  top: "pool1"
  dropout_param {
    dropout_ratio: 0.1
  }
}
layer {
  name: "fc1"
  type: "InnerProduct"
  bottom: "pool1"
  top: "fc1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "dropout2"
  type: "Dropout"
  bottom: "fc1"
  top: "fc1"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "fc2"
  type: "InnerProduct"
  bottom: "fc1"
  top: "fc2"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "loss"
  type: "EuclideanLoss"
  bottom: "fc2"
  bottom: "label"
  top: "loss"
}
I0929 21:50:21.658833 13779 layer_factory.hpp:74] Creating layer data
I0929 21:50:21.658859 13779 net.cpp:96] Creating Layer data
I0929 21:50:21.658871 13779 net.cpp:415] data -> data
I0929 21:50:21.658902 13779 net.cpp:415] data -> label
I0929 21:50:21.658926 13779 net.cpp:160] Setting up data
I0929 21:50:21.658936 13779 hdf5_data_layer.cpp:80] Loading list of HDF5 filenames from: train_hdf5file.txt
I0929 21:50:21.659220 13779 hdf5_data_layer.cpp:94] Number of HDF5 files: 1
I0929 21:50:21.920578 13779 net.cpp:167] Top shape: 64 3 128 128 (3145728)
I0929 21:50:21.920656 13779 net.cpp:167] Top shape: 64 1 (64)
I0929 21:50:21.920686 13779 layer_factory.hpp:74] Creating layer conv1
I0929 21:50:21.920740 13779 net.cpp:96] Creating Layer conv1
I0929 21:50:21.920774 13779 net.cpp:459] conv1 <- data
I0929 21:50:21.920825 13779 net.cpp:415] conv1 -> conv1
I0929 21:50:21.920877 13779 net.cpp:160] Setting up conv1
I0929 21:50:21.921985 13779 net.cpp:167] Top shape: 64 20 124 124 (19681280)
I0929 21:50:21.922050 13779 layer_factory.hpp:74] Creating layer relu1
I0929 21:50:21.922085 13779 net.cpp:96] Creating Layer relu1
I0929 21:50:21.922108 13779 net.cpp:459] relu1 <- conv1
I0929 21:50:21.922137 13779 net.cpp:404] relu1 -> conv1 (in-place)
I0929 21:50:21.922185 13779 net.cpp:160] Setting up relu1
I0929 21:50:21.922227 13779 net.cpp:167] Top shape: 64 20 124 124 (19681280)
I0929 21:50:21.922250 13779 layer_factory.hpp:74] Creating layer pool1
I0929 21:50:21.922277 13779 net.cpp:96] Creating Layer pool1
I0929 21:50:21.922298 13779 net.cpp:459] pool1 <- conv1
I0929 21:50:21.922323 13779 net.cpp:415] pool1 -> pool1
I0929 21:50:21.922418 13779 net.cpp:160] Setting up pool1
I0929 21:50:21.922472 13779 net.cpp:167] Top shape: 64 20 62 62 (4920320)
I0929 21:50:21.922495 13779 layer_factory.hpp:74] Creating layer dropout1
I0929 21:50:21.922534 13779 net.cpp:96] Creating Layer dropout1
I0929 21:50:21.922555 13779 net.cpp:459] dropout1 <- pool1
I0929 21:50:21.922582 13779 net.cpp:404] dropout1 -> pool1 (in-place)
I0929 21:50:21.922613 13779 net.cpp:160] Setting up dropout1
I0929 21:50:21.922652 13779 net.cpp:167] Top shape: 64 20 62 62 (4920320)
I0929 21:50:21.922672 13779 layer_factory.hpp:74] Creating layer fc1
I0929 21:50:21.922709 13779 net.cpp:96] Creating Layer fc1
I0929 21:50:21.922729 13779 net.cpp:459] fc1 <- pool1
I0929 21:50:21.922757 13779 net.cpp:415] fc1 -> fc1
I0929 21:50:21.922801 13779 net.cpp:160] Setting up fc1
I0929 21:50:22.301134 13779 net.cpp:167] Top shape: 64 500 (32000)
I0929 21:50:22.301193 13779 layer_factory.hpp:74] Creating layer dropout2
I0929 21:50:22.301210 13779 net.cpp:96] Creating Layer dropout2
I0929 21:50:22.301218 13779 net.cpp:459] dropout2 <- fc1
I0929 21:50:22.301232 13779 net.cpp:404] dropout2 -> fc1 (in-place)
I0929 21:50:22.301244 13779 net.cpp:160] Setting up dropout2
I0929 21:50:22.301254 13779 net.cpp:167] Top shape: 64 500 (32000)
I0929 21:50:22.301259 13779 layer_factory.hpp:74] Creating layer fc2
I0929 21:50:22.301270 13779 net.cpp:96] Creating Layer fc2
I0929 21:50:22.301275 13779 net.cpp:459] fc2 <- fc1
I0929 21:50:22.301285 13779 net.cpp:415] fc2 -> fc2
I0929 21:50:22.301295 13779 net.cpp:160] Setting up fc2
I0929 21:50:22.301317 13779 net.cpp:167] Top shape: 64 1 (64)
I0929 21:50:22.301328 13779 layer_factory.hpp:74] Creating layer loss
I0929 21:50:22.301338 13779 net.cpp:96] Creating Layer loss
I0929 21:50:22.301343 13779 net.cpp:459] loss <- fc2
I0929 21:50:22.301350 13779 net.cpp:459] loss <- label
I0929 21:50:22.301360 13779 net.cpp:415] loss -> loss
I0929 21:50:22.301374 13779 net.cpp:160] Setting up loss
I0929 21:50:22.301385 13779 net.cpp:167] Top shape: (1)
I0929 21:50:22.301391 13779 net.cpp:169]     with loss weight 1
I0929 21:50:22.301419 13779 net.cpp:239] loss needs backward computation.
I0929 21:50:22.301425 13779 net.cpp:239] fc2 needs backward computation.
I0929 21:50:22.301430 13779 net.cpp:239] dropout2 needs backward computation.
I0929 21:50:22.301436 13779 net.cpp:239] fc1 needs backward computation.
I0929 21:50:22.301441 13779 net.cpp:239] dropout1 needs backward computation.
I0929 21:50:22.301446 13779 net.cpp:239] pool1 needs backward computation.
I0929 21:50:22.301452 13779 net.cpp:239] relu1 needs backward computation.
I0929 21:50:22.301457 13779 net.cpp:239] conv1 needs backward computation.
I0929 21:50:22.301463 13779 net.cpp:241] data does not need backward computation.
I0929 21:50:22.301468 13779 net.cpp:282] This network produces output loss
I0929 21:50:22.301482 13779 net.cpp:531] Collecting Learning Rate and Weight Decay.
I0929 21:50:22.301491 13779 net.cpp:294] Network initialization done.
I0929 21:50:22.301496 13779 net.cpp:295] Memory required for data: 209652228
I0929 21:50:22.301908 13779 solver.cpp:159] Creating test net (#0) specified by net file: train_test_hdf5.prototxt
I0929 21:50:22.301935 13779 net.cpp:334] The NetState phase (1) differed from the phase (0) specified by a rule in layer data
I0929 21:50:22.302028 13779 net.cpp:46] Initializing net from parameters: 
name: "MSE regression"
state {
  phase: TEST
}
layer {
  name: "data"
  type: "HDF5Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  hdf5_data_param {
    source: "test_hdf5file.txt"
    batch_size: 30
  }
}
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "conv1"
  top: "conv1"
}
layer {
  name: "pool1"
  type: "Pooling"
  bottom: "conv1"
  top: "pool1"
  pooling_param {
    pool: MAX
    kernel_size: 2
    stride: 2
  }
}
layer {
  name: "dropout1"
  type: "Dropout"
  bottom: "pool1"
  top: "pool1"
  dropout_param {
    dropout_ratio: 0.1
  }
}
layer {
  name: "fc1"
  type: "InnerProduct"
  bottom: "pool1"
  top: "fc1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 500
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "dropout2"
  type: "Dropout"
  bottom: "fc1"
  top: "fc1"
  dropout_param {
    dropout_ratio: 0.5
  }
}
layer {
  name: "fc2"
  type: "InnerProduct"
  bottom: "fc1"
  top: "fc2"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  inner_product_param {
    num_output: 1
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "loss"
  type: "EuclideanLoss"
  bottom: "fc2"
  bottom: "label"
  top: "loss"
}
I0929 21:50:22.302146 13779 layer_factory.hpp:74] Creating layer data
I0929 21:50:22.302158 13779 net.cpp:96] Creating Layer data
I0929 21:50:22.302165 13779 net.cpp:415] data -> data
I0929 21:50:22.302176 13779 net.cpp:415] data -> label
I0929 21:50:22.302186 13779 net.cpp:160] Setting up data
I0929 21:50:22.302191 13779 hdf5_data_layer.cpp:80] Loading list of HDF5 filenames from: test_hdf5file.txt
I0929 21:50:22.302305 13779 hdf5_data_layer.cpp:94] Number of HDF5 files: 1
I0929 21:50:22.434798 13779 net.cpp:167] Top shape: 30 3 128 128 (1474560)
I0929 21:50:22.434849 13779 net.cpp:167] Top shape: 30 1 (30)
I0929 21:50:22.434864 13779 layer_factory.hpp:74] Creating layer conv1
I0929 21:50:22.434895 13779 net.cpp:96] Creating Layer conv1
I0929 21:50:22.434914 13779 net.cpp:459] conv1 <- data
I0929 21:50:22.434944 13779 net.cpp:415] conv1 -> conv1
I0929 21:50:22.434996 13779 net.cpp:160] Setting up conv1
I0929 21:50:22.435084 13779 net.cpp:167] Top shape: 30 20 124 124 (9225600)
I0929 21:50:22.435119 13779 layer_factory.hpp:74] Creating layer relu1
I0929 21:50:22.435205 13779 net.cpp:96] Creating Layer relu1
I0929 21:50:22.435237 13779 net.cpp:459] relu1 <- conv1
I0929 21:50:22.435292 13779 net.cpp:404] relu1 -> conv1 (in-place)
I0929 21:50:22.435328 13779 net.cpp:160] Setting up relu1
I0929 21:50:22.435371 13779 net.cpp:167] Top shape: 30 20 124 124 (9225600)
I0929 21:50:22.435400 13779 layer_factory.hpp:74] Creating layer pool1
I0929 21:50:22.435443 13779 net.cpp:96] Creating Layer pool1
I0929 21:50:22.435470 13779 net.cpp:459] pool1 <- conv1
I0929 21:50:22.435511 13779 net.cpp:415] pool1 -> pool1
I0929 21:50:22.435550 13779 net.cpp:160] Setting up pool1
I0929 21:50:22.435597 13779 net.cpp:167] Top shape: 30 20 62 62 (2306400)
I0929 21:50:22.435626 13779 layer_factory.hpp:74] Creating layer dropout1
I0929 21:50:22.435669 13779 net.cpp:96] Creating Layer dropout1
I0929 21:50:22.435698 13779 net.cpp:459] dropout1 <- pool1
I0929 21:50:22.435739 13779 net.cpp:404] dropout1 -> pool1 (in-place)
I0929 21:50:22.435780 13779 net.cpp:160] Setting up dropout1
I0929 21:50:22.435823 13779 net.cpp:167] Top shape: 30 20 62 62 (2306400)
I0929 21:50:22.435853 13779 layer_factory.hpp:74] Creating layer fc1
I0929 21:50:22.435899 13779 net.cpp:96] Creating Layer fc1
I0929 21:50:22.435926 13779 net.cpp:459] fc1 <- pool1
I0929 21:50:22.435971 13779 net.cpp:415] fc1 -> fc1
I0929 21:50:22.436018 13779 net.cpp:160] Setting up fc1
I0929 21:50:22.816076 13779 net.cpp:167] Top shape: 30 500 (15000)
I0929 21:50:22.816138 13779 layer_factory.hpp:74] Creating layer dropout2
I0929 21:50:22.816154 13779 net.cpp:96] Creating Layer dropout2
I0929 21:50:22.816160 13779 net.cpp:459] dropout2 <- fc1
I0929 21:50:22.816170 13779 net.cpp:404] dropout2 -> fc1 (in-place)
I0929 21:50:22.816182 13779 net.cpp:160] Setting up dropout2
I0929 21:50:22.816192 13779 net.cpp:167] Top shape: 30 500 (15000)
I0929 21:50:22.816197 13779 layer_factory.hpp:74] Creating layer fc2
I0929 21:50:22.816208 13779 net.cpp:96] Creating Layer fc2
I0929 21:50:22.816249 13779 net.cpp:459] fc2 <- fc1
I0929 21:50:22.816262 13779 net.cpp:415] fc2 -> fc2
I0929 21:50:22.816277 13779 net.cpp:160] Setting up fc2
I0929 21:50:22.816301 13779 net.cpp:167] Top shape: 30 1 (30)
I0929 21:50:22.816316 13779 layer_factory.hpp:74] Creating layer loss
I0929 21:50:22.816329 13779 net.cpp:96] Creating Layer loss
I0929 21:50:22.816337 13779 net.cpp:459] loss <- fc2
I0929 21:50:22.816347 13779 net.cpp:459] loss <- label
I0929 21:50:22.816359 13779 net.cpp:415] loss -> loss
I0929 21:50:22.816370 13779 net.cpp:160] Setting up loss
I0929 21:50:22.816381 13779 net.cpp:167] Top shape: (1)
I0929 21:50:22.816388 13779 net.cpp:169]     with loss weight 1
I0929 21:50:22.816407 13779 net.cpp:239] loss needs backward computation.
I0929 21:50:22.816416 13779 net.cpp:239] fc2 needs backward computation.
I0929 21:50:22.816426 13779 net.cpp:239] dropout2 needs backward computation.
I0929 21:50:22.816433 13779 net.cpp:239] fc1 needs backward computation.
I0929 21:50:22.816442 13779 net.cpp:239] dropout1 needs backward computation.
I0929 21:50:22.816452 13779 net.cpp:239] pool1 needs backward computation.
I0929 21:50:22.816460 13779 net.cpp:239] relu1 needs backward computation.
I0929 21:50:22.816468 13779 net.cpp:239] conv1 needs backward computation.
I0929 21:50:22.816478 13779 net.cpp:241] data does not need backward computation.
I0929 21:50:22.816486 13779 net.cpp:282] This network produces output loss
I0929 21:50:22.816500 13779 net.cpp:531] Collecting Learning Rate and Weight Decay.
I0929 21:50:22.816510 13779 net.cpp:294] Network initialization done.
I0929 21:50:22.816517 13779 net.cpp:295] Memory required for data: 98274484
I0929 21:50:22.816565 13779 solver.cpp:47] Solver scaffolding done.
I0929 21:50:22.816587 13779 solver.cpp:363] Solving MSE regression
I0929 21:50:22.816596 13779 solver.cpp:364] Learning Rate Policy: inv
I0929 21:50:22.870337 13779 solver.cpp:424] Iteration 0, Testing net (#0)
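
The top shapes and the two "Memory required for data" totals in the log above can be cross-checked by hand. The following is a minimal sketch, assuming 4-byte values, assuming pooling output sizes round up, and assuming every top blob is counted (including the in-place ReLU and Dropout tops). The helper names are made up for illustration and are not part of Caffe.

```python
import math


def conv_out(size, kernel, stride=1, pad=0):
    """Spatial output size of a convolution (rounds down)."""
    return (size + 2 * pad - kernel) // stride + 1


def pool_out(size, kernel, stride, pad=0):
    """Spatial output size of pooling (assumed to round up)."""
    return math.ceil((size + 2 * pad - kernel) / stride) + 1


def data_memory_bytes(batch, channels=3, side=128, bytes_per_value=4):
    """Sum the element counts of all top blobs and convert to bytes."""
    conv = conv_out(side, kernel=5, stride=1)   # 128 -> 124
    pool = pool_out(conv, kernel=2, stride=2)   # 124 -> 62
    top_counts = [
        batch * channels * side * side,  # data
        batch,                           # label
        batch * 20 * conv * conv,        # conv1
        batch * 20 * conv * conv,        # relu1 (in-place)
        batch * 20 * pool * pool,        # pool1
        batch * 20 * pool * pool,        # dropout1 (in-place)
        batch * 500,                     # fc1
        batch * 500,                     # dropout2 (in-place)
        batch,                           # fc2
        1,                               # loss
    ]
    return bytes_per_value * sum(top_counts)


print(data_memory_bytes(64))  # train net, batch_size 64
print(data_memory_bytes(30))  # test net, batch_size 30
```

Under these assumptions the two printed values are 209652228 and 98274484, which agree with the totals reported for the train and test nets.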

[Images: BeginTrain, AfterSomeTime]

by magneto at September 30, 2016 01:43 AM

arXiv Data Structures and Algorithms

Optimal Prefix Codes with Fewer Distinct Codeword Lengths are Faster to Construct. (arXiv:cs/0509015v4 [cs.DS] UPDATED)

A new method for constructing minimum-redundancy binary prefix codes is described. Our method does not explicitly build a Huffman tree; instead it uses a property of optimal prefix codes to compute the codeword lengths corresponding to the input weights. Let $n$ be the number of weights and $k$ be the number of distinct codeword lengths as produced by the algorithm for the optimum codes. The running time of our algorithm is $O(k \cdot n)$. Following our previous work in \cite{be}, no algorithm can possibly construct optimal prefix codes in $o(k \cdot n)$ time. When the given weights are presorted our algorithm performs $O(9^k \cdot \log^{2k}{n})$ comparisons.

by Ahmed Belal, Amr Elmasry at September 30, 2016 01:30 AM
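
To make the quantities $n$ and $k$ in the abstract above concrete, here is a minimal sketch that computes optimal codeword lengths with the textbook heap-based Huffman procedure and then counts the distinct lengths. This is not the authors' $O(k \cdot n)$ method, which avoids building the tree; the function name and the sample weights are made up for illustration, and the sketch favours clarity over speed.

```python
import heapq
from itertools import count


def huffman_code_lengths(weights):
    """Return the codeword length of each weight in an optimal prefix code."""
    n = len(weights)
    if n == 0:
        return []
    if n == 1:
        return [1]
    tie = count()  # unique tiebreaker so the leaf lists are never compared
    heap = [(w, next(tie), [i]) for i, w in enumerate(weights)]
    heapq.heapify(heap)
    lengths = [0] * n
    while len(heap) > 1:
        w1, _, leaves1 = heapq.heappop(heap)
        w2, _, leaves2 = heapq.heappop(heap)
        merged = leaves1 + leaves2
        for i in merged:
            lengths[i] += 1  # every leaf under the new node gets one level deeper
        heapq.heappush(heap, (w1 + w2, next(tie), merged))
    return lengths


weights = [1, 1, 2, 3, 5, 8, 13]   # n = 7 weights
lengths = huffman_code_lengths(weights)
k = len(set(lengths))              # number of distinct codeword lengths
print(lengths, k)
```

For weights that grow this quickly the code is very unbalanced and $k$ is close to $n$; for eight equal weights every codeword has length 3 and $k = 1$.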

Hierarchical Multi-stage Gaussian Signaling Games. (arXiv:1609.09448v1 [cs.GT])

We analyze in this paper finite horizon hierarchical signaling games between informed senders and decision maker receivers in a dynamic environment. The underlying information evolves in time while sender and receiver interact repeatedly. Different from the classical communication models, however, the sender and the receiver have different objectives and there is a hierarchy between the players such that the sender leads the game by announcing his policies beforehand. He needs to anticipate the reaction of the receiver and the impact of the actions on the horizon while controlling the transparency of the disclosed information at each interaction. With quadratic objective functions and stationary multi-variate Gaussian processes, evolving according to first order auto-regressive models, we show that memoryless linear sender policies are optimal (in the sense of game-theoretic hierarchical equilibrium) within the general class of policies.

by Muhammed O. Sayin, Emrah Akyol, Tamer Basar at September 30, 2016 01:30 AM

Design and Analysis of Deadline and Budget Constrained Autoscaling (DBCA) Algorithm for 5G Mobile Networks. (arXiv:1609.09368v1 [cs.NI])

In the cloud computing paradigm, virtual resource autoscaling approaches have been intensively studied in recent years. Those approaches dynamically scale virtual resources in and out to adjust system performance and save operating cost. However, designing an autoscaling algorithm that achieves the desired performance within a limited budget, while considering the existing capacity of legacy network equipment, is not a trivial task. In this paper, we propose a Deadline and Budget Constrained Autoscaling (DBCA) algorithm for addressing the budget-performance tradeoff. We develop an analytical model to quantify the tradeoff and cross-validate the model by extensive simulations. The results show that the DBCA can significantly improve system performance given the budget upper-bound. In addition, the model provides a quick way to evaluate the budget-performance tradeoff and system design without wide deployment, saving on cost and time.

by Tuan Phung-Duc, Yi Ren, Jyh-Cheng Chen, Zheng-Wei Yu at September 30, 2016 01:30 AM

Don't Skype & Type! Acoustic Eavesdropping in Voice-Over-IP. (arXiv:1609.09359v1 [cs.CR])

Acoustic emanations of computer keyboards represent a serious privacy issue. As demonstrated in prior work, spectral and temporal properties of keystroke sounds might reveal what a user is typing. However, previous attacks assumed relatively strong adversary models that are not very practical in many real-world settings. Such strong models assume: (i) adversary's physical proximity to the victim, (ii) precise profiling of the victim's typing style and keyboard, and/or (iii) significant amount of victim's typed information (and its corresponding sounds) available to the adversary.

In this paper, we investigate a new and practical keyboard acoustic eavesdropping attack, called Skype & Type (S&T), which is based on Voice-over-IP (VoIP). S&T relaxes prior strong adversary assumptions. Our work is motivated by the simple observation that people often engage in secondary activities (including typing) while participating in VoIP calls. VoIP software can acquire acoustic emanations of pressed keystrokes (which might include passwords and other sensitive information) and transmit them to others involved in the call. In fact, we show that very popular VoIP software (Skype) conveys enough audio information to reconstruct the victim's input -- keystrokes typed on the remote keyboard. In particular, our results demonstrate that, given some knowledge of the victim's typing style and the keyboard, the attacker attains top-5 accuracy of 91.7% in guessing a random key pressed by the victim. (The accuracy drops to a still alarming 41.89% if the attacker is oblivious to both the typing style and the keyboard.) Finally, we provide evidence that the Skype & Type attack is robust to various VoIP issues (e.g., Internet bandwidth fluctuations and presence of voice over keystrokes), thus confirming the feasibility of this attack.

by Alberto Compagno, Mauro Conti, Daniele Lain, Gene Tsudik at September 30, 2016 01:30 AM

Machine Learning Techniques for Stackelberg Security Games: a Survey. (arXiv:1609.09341v1 [cs.GT])

This survey presents the machine learning techniques currently employed in security game domains. Specifically, we focus on work by Teamcore at the University of Southern California, which has explored several directions in this field. After a brief introduction to Stackelberg Security Games (SSGs) and the poaching setting, the survey shows how to model a boundedly rational attacker while taking her human behavior into account, then describes how to handle and estimate attacker payoffs that are not defined in advance, and finally presents how online learning techniques have been used to learn a model of the attacker.

by Giuseppe De Nittis, Francesco Trovò at September 30, 2016 01:30 AM

Measuring Economic Resilience to Natural Disasters with Big Economic Transaction Data. (arXiv:1609.09340v1 [cs.DB])

This research explores the potential to analyze bank card payments and ATM cash withdrawals in order to map and quantify how people are impacted by and recover from natural disasters. Our approach defines a disaster-affected community's economic recovery time as the time needed to return to baseline activity levels in terms of number of bank card payments and ATM cash withdrawals. For Hurricane Odile, which hit the state of Baja California Sur (BCS) in Mexico between 15 and 17 September 2014, we measured and mapped communities' economic recovery time, which ranged from 2 to 40 days in different locations. We found that -- among individuals with a bank account -- the lower the income level, the shorter the time needed for economic activity to return to normal levels. Gender differences in recovery times were also detected and quantified. In addition, our approach evaluated how communities prepared for the disaster by quantifying expenditure growth in food or gasoline before the hurricane struck. We believe this approach opens a new frontier in measuring the economic impact of disasters with high temporal and spatial resolution, and in understanding how populations bounce back and adapt.

by Elena Alfaro Martinez (BBVA Data & Analytics), Maria Hernandez Rubio (BBVA Data & Analytics), Roberto Maestre Martinez (BBVA Data & Analytics), Juan Murillo Arias (BBVA Data & Analytics), Dario Patane (BBVA Data & Analytics), Amanda Zerbe (United Nations Global Pulse), Robert Kirkpatrick (United Nations Global Pulse), Miguel Luengo-Oroz (United Nations Global Pulse) at September 30, 2016 01:30 AM
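
The recovery-time definition in the abstract above (time needed to return to baseline activity) can be illustrated with a small sketch. How the baseline is actually computed is not stated in the abstract, so the function name, the fixed baseline value and the daily counts below are all hypothetical.

```python
def recovery_time_days(daily_counts, baseline, event_day):
    """Days after `event_day` until activity first reaches `baseline` again.

    `daily_counts` is a list of daily transaction counts, `event_day` is the
    index of the day the disaster hit. Returns None if the series ends before
    activity returns to the baseline.
    """
    for day in range(event_day + 1, len(daily_counts)):
        if daily_counts[day] >= baseline:
            return day - event_day
    return None


# Made-up daily card-payment counts; the disaster hits on day index 3.
counts = [100, 102, 98, 20, 35, 60, 85, 101, 99]
print(recovery_time_days(counts, baseline=100, event_day=3))
```

With this made-up series the activity first reaches the baseline again four days after the event.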

Towards performance portability through locality-awareness for applications using one-sided communication primitives. (arXiv:1609.09333v1 [cs.DC])

MPI is the most widely used data transfer and communication model in High Performance Computing. The latest version of the standard, MPI-3, allows skilled programmers to exploit all hardware capabilities of the latest and future supercomputing systems. In particular, the revised asynchronous remote-memory-access model, in combination with the shared-memory window extension, allows writing code that hides communication latencies and optimizes communication paths according to the locality of data origin and destination. The latter is particularly important for today's multi- and many-core systems. However, writing such efficient code is highly complex and error-prone. In this paper we evaluate a recent remote-memory-access model, namely DART-MPI. This model claims to hide the aforementioned complexities from the programmer while delivering locality-aware remote-memory-access semantics that outperform MPI-3 one-sided communication primitives on multi-core systems. Conceptually, the DART-MPI interface is simple; at the same time it takes care of the complexities of the underlying MPI-3 and system topology. This makes DART-MPI an interesting candidate for porting legacy applications. We evaluate these claims using a realistic scientific application, specifically a finite-difference stencil code which solves the heat diffusion equation, on a large-scale Cray XC40 installation.

by Huan Zhou, Jose Gracia at September 30, 2016 01:30 AM

k-rAC - a Fine-Grained k-Resilient Access Control Scheme for Distributed Hash Tables. (arXiv:1609.09329v1 [cs.CR])

Distributed Hash Tables (DHT) are a common architecture for decentralized applications and, therefore, would be suited for privacy-aware applications. However, currently existing DHTs allow every peer to access any index. To build privacy-aware applications, we need to control this access. In this paper, we present k-rAC, a privacy-aware fine-grained AC for DHTs. For authentication, we present three different mechanisms based on public-key cryptography, zero-knowledge-proofs, and cryptographic hashes. For authorization, we use distributed AC lists. The security of our approach is based on k-resilience. We show that our approach introduces an acceptable overhead and discuss its suitability for different scenarios.

by Olga Kieselmann, Arno Wacker at September 30, 2016 01:30 AM

A New Queue Discipline for Reducing Bufferbloat Effects in HetNet Concurrent Multipath Transfer. (arXiv:1609.09314v1 [cs.NI])

Heterogeneous wireless networks have evolved to meet application requirements for low latency and high throughput on Internet access. Recent studies have improved network performance by employing Multipath TCP, which aggregates flows from heterogeneous wireless interfaces in a single connection. Although existing proposals are powerful, coupled congestion control algorithms are currently limited by the high variation in path delay, bandwidth and loss rate that is typical of heterogeneous wireless networks, and even more so over concurrent multipath transmissions. These transmissions experience bufferbloat, i.e., high delays caused by long queues. Hence, to cope with the current limitations, this work presents CoDel-LIFO, a new active queue management (AQM) discipline to reduce the dropped packet ratio in the Multipath TCP congestion control mechanism. Unlike other approaches, CoDel-LIFO gives priority to the most recent packets. This paper provides a detailed simulation analysis of congestion control algorithms by comparing CoDel-LIFO to the CoDel and DropTail disciplines. Results indicate that CoDel-LIFO reduces queue drops, which diminishes the impact on congestion control, substantially improves goodput, and keeps RTT low.

by Benevid Felix, Aldri Santos, Michele Nogueira at September 30, 2016 01:30 AM

One-loop diagrams in the Random Euclidean Matching Problem. (arXiv:1609.09310v1 [cond-mat.dis-nn])

The matching problem is a notorious combinatorial optimization problem that has attracted the attention of the statistical physics community for many years. Here we analyze the Euclidean version of the problem, i.e. the optimal matching problem between points randomly distributed on a $d$-dimensional Euclidean space, where the cost to minimize depends on the points' pairwise distances. Using Mayer's cluster expansion we write a formal expression for the replicated action that is suitable for a saddle point computation. We give the diagrammatic rules for each term of the expansion, and we analyze in detail the one-loop diagrams. A characteristic feature of the theory, when diagrams are perturbatively computed around the mean field part of the action, is the vanishing of the mass at zero momentum. In the non-Euclidean case of uncorrelated costs, by contrast, we predict and numerically verify an anomalous scaling for the sub-sub-leading correction to the asymptotic average cost.

by Carlo Lucibello, Giorgio Parisi, Gabriele Sicuro at September 30, 2016 01:30 AM

Comprehensive Evaluation of OpenCL-based Convolutional Neural Network Accelerators in Xilinx and Altera FPGAs. (arXiv:1609.09296v1 [cs.CV])

Deep learning has significantly advanced the state of the art in artificial intelligence, gaining wide popularity from both industry and academia. Of special interest are Convolutional Neural Networks (CNN), which take inspiration from the hierarchical structure of the visual cortex to form deep layers of convolutional operations, along with fully connected classifiers. Hardware implementations of these deep CNN architectures are challenged by memory bottlenecks, since the many convolution and fully-connected layers demand a large amount of communication for parallel computation. Multi-core CPU based solutions have demonstrated their inadequacy for this problem due to the memory wall and low parallelism. Many-core GPU architectures show superior performance but they consume high power and also have memory constraints due to inconsistencies between cache and main memory. FPGA design solutions are also actively being explored, which allow implementing the memory hierarchy using embedded BlockRAM. This boosts the parallel use of shared memory elements between multiple processing units, avoiding data replicability and inconsistencies. This makes FPGAs potentially powerful solutions for real-time classification with CNNs. Both Altera and Xilinx have adopted the OpenCL co-design framework from GPU for FPGA designs as a pseudo-automatic development solution. In this paper, a comprehensive evaluation and comparison of Altera and Xilinx OpenCL frameworks for a 5-layer deep CNN is presented. Hardware resources, temporal performance and the OpenCL architecture for CNNs are discussed. Xilinx demonstrates faster synthesis, better FPGA resource utilization and more compact boards. Altera provides multi-platform tools, a mature design community and better execution times.

by R. Tapiador, A. Rios-Navarro, A. Linares-Barranco, Minkyu Kim, Deepak Kadetotad, Jae-sun Seo at September 30, 2016 01:30 AM

DynIMS: A Dynamic Memory Controller for In-memory Storage on HPC Systems. (arXiv:1609.09294v1 [cs.PF])

In order to boost the performance of data-intensive computing on HPC systems, in-memory computing frameworks, such as Apache Spark and Flink, use local DRAM for data storage. Optimizing the memory allocation to data storage is critical to delivering performance to traditional HPC compute jobs and throughput to data-intensive applications sharing the HPC resources. Current practices that statically configure in-memory storage may leave inadequate space for compute jobs or lose the opportunity to utilize more available space for data-intensive applications. In this paper, we explore techniques to dynamically adjust in-memory storage and make the right amount of space available for compute jobs. We have developed a dynamic memory controller, DynIMS, which infers memory demands of compute tasks online and employs a feedback-based control model to adapt the capacity of in-memory storage. We test DynIMS using mixed HPCC and Spark workloads on an HPC cluster. Experimental results show that DynIMS can achieve up to 5X performance improvement compared to systems with static memory allocations.

by Pengfei Xuan, Feng Luo, Rong Ge, Pradip K Srimani at September 30, 2016 01:30 AM

Formula Slicing: Inductive Invariants from Preconditions. (arXiv:1609.09288v1 [cs.LO])

We propose a "formula slicing" method for finding inductive invariants. It is based on the observation that many loops in the program affect only a small part of the memory, and many invariants which were valid before a loop are still valid after.

Given a precondition of the loop, obtained from the preceding program fragment, we weaken it until it becomes inductive. The weakening procedure is guided by counterexamples-to-induction given by an SMT solver. Our algorithm applies to programs with arbitrary loop structure, and it computes the strongest invariant in an abstract domain of weakenings of preconditions. We call this algorithm "formula slicing", as it effectively performs "slicing" on formulas derived from symbolic execution.

We evaluate our algorithm on the device driver benchmarks from the International Competition on Software Verification (SV-COMP), and we show that it is competitive with the state-of-the-art verification techniques.

by Egor George Karpenkov, David Monniaux at September 30, 2016 01:30 AM

Self-stabilizing Byzantine Clock Synchronization with Optimal Precision. (arXiv:1609.09281v1 [cs.DC])

We revisit the approach to Byzantine fault-tolerant clock synchronization based on approximate agreement introduced by Lynch and Welch. Our contribution is threefold:

(1) We provide a slightly refined variant of the algorithm yielding improved bounds on the skew that can be achieved and the sustainable frequency offsets.

(2) We show how to extend the technique to also synchronize clock rates. This permits less frequent communication without significant loss of precision, provided that clock rates change sufficiently slowly.

(3) We present a coupling scheme that makes these algorithms self-stabilizing while preserving their high precision. The scheme utilizes a low-precision but self-stabilizing algorithm for the purpose of recovery.

by Pankaj Khanchandani, Christoph Lenzen at September 30, 2016 01:30 AM
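
The approximate-agreement idea behind the Lynch and Welch approach mentioned in the abstract above can be sketched in a few lines: each node collects clock-offset estimates from all nodes, discards the $f$ lowest and $f$ highest, and moves to the midpoint of the remaining range. This is only the basic trimming step, assuming at most $f$ faulty nodes out of more than $3f$; the refined variant, rate synchronization and self-stabilizing coupling from the paper are not shown, and the function name and sample values are hypothetical.

```python
def approximate_agreement_step(estimates, f):
    """Drop the f lowest and f highest estimates, return the midpoint of the rest.

    Byzantine tolerance needs more than 3f nodes in total; this helper only
    checks that something is left after trimming.
    """
    if len(estimates) <= 2 * f:
        raise ValueError("need more than 2f estimates")
    kept = sorted(estimates)[f:len(estimates) - f]
    return (kept[0] + kept[-1]) / 2


# Seven nodes, at most f = 2 faulty; two estimates are wildly off.
offsets = [0.0, 0.2, -0.1, 0.1, 50.0, -40.0, 0.05]
correction = approximate_agreement_step(offsets, f=2)
print(correction)
```

Because the $f$ extreme values on each side are discarded, the two outliers cannot drag the result outside the range spanned by the correct nodes' estimates.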

Knapsack problem for automaton groups. (arXiv:1609.09274v1 [math.GR])

The knapsack problem is a classic optimisation problem that has recently been extended to the setting of groups. Its study is interesting because it exhibits many different behaviours, depending on the class of groups considered. In this paper we deal with groups generated by Mealy automata (a class that is often used to study group-theoretical conjectures) and prove that the knapsack problem is undecidable for this class. We then construct a graph that, if finite, provides a solution to the knapsack problem. We deduce that the knapsack problem is decidable for the so-called bounded automaton groups, a class where the order and conjugacy problems are already known to be decidable.

by Thibault Godin (IRIF) at September 30, 2016 01:30 AM

Experience with Heuristics, Benchmarks & Standards for Cylindrical Algebraic Decomposition. (arXiv:1609.09269v1 [cs.SC])

In the paper which inspired the SC-Square project, [E. Abraham, Building Bridges between Symbolic Computation and Satisfiability Checking, Proc. ISSAC '15, pp. 1-6, ACM, 2015] the author identified the use of sophisticated heuristics as a technique that the Satisfiability Checking community excels in and from which it is likely the Symbolic Computation community could learn and prosper. To start this learning process we summarise our experience with heuristic development for the computer algebra algorithm Cylindrical Algebraic Decomposition. We also propose and discuss standards and benchmarks as another area where Symbolic Computation could prosper from Satisfiability Checking expertise, noting that these have been identified as initial actions for the new SC-Square community in the CSA project, as described in [E. Abraham et al., SC$^2$: Satisfiability Checking meets Symbolic Computation (Project Paper), Intelligent Computer Mathematics (LNCS 9761), pp. 28-43, Springer, 2015].

by <a href="">Matthew England</a>, <a href="">James H. Davenport</a> at September 30, 2016 01:30 AM

Auto-scaling Web Applications in Clouds: A Taxonomy and Survey. (arXiv:1609.09224v1 [cs.DC])

Web application providers have been migrating their applications to cloud data centers, attracted by the emerging cloud computing paradigm. One of the appealing features of the cloud is elasticity. It allows cloud users to acquire or release computing resources on demand, which enables web application providers to auto-scale the resources provisioned to their applications under dynamic workload in order to minimize resource cost while satisfying Quality of Service (QoS) requirements. In this paper, we comprehensively analyze the challenges that remain in auto-scaling web applications in clouds and review the developments in this field. We present a taxonomy of auto-scaling systems according to the identified challenges and key properties. We analyze the surveyed works and map them to the taxonomy to identify the weaknesses in this field. Moreover, based on the analysis, we propose new future directions.

by <a href="">Chenhao Qu</a>, <a href="">Rodrigo N. Calheiros</a>, <a href="">Rajkumar Buyya</a> at September 30, 2016 01:30 AM

Time/memory/data trade-off attack to a psuedo-random generator. (arXiv:1609.09219v1 [cs.CR])

The time/memory/data trade-off (TMDTO) attack is one of the most important threats against pseudo-random generators, and resistance to it is considered a main criterion in designing such generators. In this research, the pseudo-random GMGK generator is analyzed in detail. Having identified various weaknesses of this generator, we performed three different versions of a structural attack on it and showed that the proposed TMDTO attacks can discover blocks of plaintext with lower complexity than an exhaustive search of the generator's key space. The results indicate that the generator lacks the security claimed by its authors.

by <a href="">Behrooz Khadem</a>, <a href="">Ali Madadi</a> at September 30, 2016 01:30 AM

A Dynamic Web Service Registry Framework for Mobile Environments. (arXiv:1609.09211v1 [cs.DC])

Advancements in technology have transformed mobile devices from being mere communication widgets to versatile computing devices. Proliferation of these hand held devices has made them a common means to access and process digital information. Most web based applications are today available in a form that can conveniently be accessed over mobile devices. However, webservices (applications meant for consumption by other applications rather than humans) are not as commonly provided and consumed over mobile devices. Facilitating this and in effect realizing a service-oriented system over mobile devices has the potential to further enhance the potential of mobile devices. One of the major challenges in this integration is the lack of an efficient service registry system that caters to issues associated with the dynamic and volatile mobile environments. Existing service registry technologies designed for traditional systems fall short of accommodating such issues. In this paper, we propose a novel approach to manage service registry systems provided 'solely' over mobile devices, and thus realising an SOA without the need for high-end computing systems. The approach manages a dynamic service registry system in the form of light weight and distributed registries. We assess the feasibility of our approach by engineering and deploying a working prototype of the proposed registry system over actual mobile devices. A comparative study of the proposed approach and the traditional UDDI (Universal Description, Discovery, and Integration) registry is also included. The evaluation of our framework has shown propitious results in terms of battery cost, scalability, hindrance with native applications.

by <a href="">Rohit Verma</a>, <a href="">Abhishek Srivastava</a> at September 30, 2016 01:30 AM

Data Rate for Distributed Consensus of Multi-agent Systems with High Order Oscillator Dynamics. (arXiv:1609.09206v1 [cs.SY])

Distributed consensus with data rate constraint is an important research topic of multi-agent systems. Some results have been obtained for consensus of multi-agent systems with integrator dynamics, but it remains challenging for general high-order systems, especially in the presence of unmeasurable states. In this paper, we study the quantized consensus problem for a special kind of high-order systems and investigate the corresponding data rate required for achieving consensus. The state matrix of each agent is a 2m-th order real Jordan block admitting m identical pairs of conjugate poles on the unit circle; each agent has a single input, and only the first state variable can be measured. The case of harmonic oscillators corresponding to m=1 is first investigated under a directed communication topology which contains a spanning tree, while the general case of m >= 2 is considered for a connected and undirected network. In both cases it is concluded that the sufficient number of communication bits to guarantee the consensus at an exponential convergence rate is an integer between $m$ and $2m$, depending on the location of the poles.

by <a href="">Zhirong Qiu</a>, <a href="">Lihua Xie</a>, <a href="">Yiguang Hong</a> at September 30, 2016 01:30 AM

EXTRACT: Strong Examples from Weakly-Labeled Sensor Data. (arXiv:1609.09196v1 [stat.ML])

Thanks to the rise of wearable and connected devices, sensor-generated time series comprise a large and growing fraction of the world's data. Unfortunately, extracting value from this data can be challenging, since sensors report low-level signals (e.g., acceleration), not the high-level events that are typically of interest (e.g., gestures). We introduce a technique to bridge this gap by automatically extracting examples of real-world events in low-level data, given only a rough estimate of when these events have taken place.

By identifying sets of features that repeat in the same temporal arrangement, we isolate examples of such diverse events as human actions, power consumption patterns, and spoken words with up to 96% precision and recall. Our method is fast enough to run in real time and assumes only minimal knowledge of which variables are relevant or the lengths of events. Our evaluation uses numerous publicly available datasets and over 1 million samples of manually labeled sensor data.

by <a href="">Davis W. Blalock</a>, <a href="">John V. Guttag</a> at September 30, 2016 01:30 AM

DPHMM: Customizable Data Release with Differential Privacy via Hidden Markov Model. (arXiv:1609.09172v1 [cs.DB])

Hidden Markov model (HMM) has been well studied and extensively used. In this paper, we present DPHMM (Differentially Private Hidden Markov Model), an HMM embedded with a private data release mechanism, in which the privacy of the data is protected through a graph. Specifically, we treat every state in Markov model as a node, and use a graph to represent the privacy policy, in which "indistinguishability" between states is denoted by edges between nodes. Due to the temporal correlations in Markov model, we show that the graph may be reduced to a subgraph with disconnected nodes, which become unprotected and might be exposed. To detect such privacy risk, we define sensitivity hull and degree of protection based on the graph to capture the condition of information exposure. Then to tackle the detected exposure, we study how to build an optimal graph based on the existing graph. We also implement and evaluate the DPHMM on real-world datasets, showing that privacy and utility can be better tuned with customized policy graph.

by <a href="">Yonghui Xiao</a>, <a href="">Yilin Shen</a>, <a href="">Jinfei Liu</a>, <a href="">Li Xiong</a>, <a href="">Hongxia Jin</a>, <a href="">Xiaofeng Xu</a> at September 30, 2016 01:30 AM

Your Computer is Leaking. (arXiv:1609.09157v1 [cs.CR])

This presentation focuses on differences between quantum computing and quantum cryptography. Both are discussed related to classical computer systems in terms of vulnerability. Research concerning quantum cryptography is analyzed in terms of work done by the University of Cambridge in partnership with a division of Toshiba, and also attacks demonstrated by Swedish researchers against QKD of energy-time entangled systems. Quantum computing is covered in terms of classical cryptography related to weaknesses presented by Shor's algorithm. Previous classical vulnerabilities also discussed were conducted by Israeli researchers as a side-channel attack using parabolic curve microphones, which has since been patched.

by <a href="">Dennis Hollenbeck</a>, <a href="">Ian Malloy</a> at September 30, 2016 01:30 AM

MPI-FAUN: An MPI-Based Framework for Alternating-Updating Nonnegative Matrix Factorization. (arXiv:1609.09154v1 [cs.DC])

Non-negative matrix factorization (NMF) is the problem of determining two non-negative low rank factors $W$ and $H$, for the given input matrix $A$, such that $A \approx W H$. NMF is a useful tool for many applications in different domains such as topic modeling in text mining, background separation in video analysis, and community detection in social networks. Despite its popularity in the data mining community, there is a lack of efficient parallel algorithms to solve the problem for big data sets.

The main contribution of this work is a new, high-performance parallel computational framework for a broad class of NMF algorithms that iteratively solves alternating non-negative least squares (NLS) subproblems for $W$ and $H$. It maintains the data and factor matrices in memory (distributed across processors), uses MPI for interprocessor communication, and, in the dense case, provably minimizes communication costs (under mild assumptions). The framework is flexible and able to leverage a variety of NMF and NLS algorithms, including Multiplicative Update, Hierarchical Alternating Least Squares, and Block Principal Pivoting. Our implementation allows us to benchmark and compare different algorithms on massive dense and sparse data matrices of size that spans for few hundreds of millions to billions. We demonstrate the scalability of our algorithm and compare it with baseline implementations, showing significant performance improvements. The code and the datasets used for conducting the experiments are available online.

by <a href="">Ramakrishnan Kannan</a>, <a href="">Grey Ballard</a>, <a href="">Haesun Park</a> at September 30, 2016 01:30 AM

Independent sets near the lower bound in bounded degree graphs. (arXiv:1609.09134v1 [cs.DM])

By Brooks' Theorem, every n-vertex graph of maximum degree at most Delta >= 3 and clique number at most Delta is Delta-colorable, and thus it has an independent set of size at least n/Delta. We give an approximate characterization of graphs with independence number close to this bound, and use it to show that the problem of deciding whether such a graph has an independent set of size at least n/Delta+k has a kernel of size O(k).

by <a href="">Zdenek Dvorak</a>, <a href="">Bernard Lidicky</a> at September 30, 2016 01:30 AM

A Study on Altering PostgreSQL from Multi-Processes Structure to Multi-Threads Structure. (arXiv:1609.09062v1 [cs.DB])

Altering the PostgreSQL database from a multi-process structure to a multi-thread structure is a difficult problem. In this paper, we put forward a comprehensive alteration scheme. In particular, we propose sound methods to handle three difficult points: semaphores, signal processing and global variables. Finally, we applied the scheme successfully to modify a famous open source DBMS.

by <a href="">Zhiyong Shan</a> at September 30, 2016 01:30 AM

Breaking a chaotic image encryption algorithm based on modulo addition and XOR operation. (arXiv:1207.6536v2 [cs.CR] UPDATED)

This paper re-evaluates the security of a chaotic image encryption algorithm called MCKBA/HCKBA and finds that it can be broken efficiently with two known plain-images and the corresponding cipher-images. In addition, it is reported that a previously proposed breaking on MCKBA/HCKBA can be further improved by reducing the number of chosen plain-images from four to two. The two attacks are both based on the properties of solving a composite function involving the carry bit, which is composed of the modulo addition and the bitwise OR operations. Both rigorous theoretical analysis and detailed experimental results are provided.

by <a href="">Chengqing Li</a>, <a href="">Yuansheng Liu</a>, <a href="">Leo Yu Zhang</a>, <a href="">Michael Z. Q. Chen</a> at September 30, 2016 01:30 AM

Overcoming Bias

Liu Cixin’s Trilogy

I just finished Liu Cixin’s trilogy of books, Three Body Problem, Dark Forest, and Death’s End. They’ve gotten a lot of praise as perhaps the best classic-style science fiction in the past decade. This praise usually makes sure to mention that Cixin is Chinese, and thus adds to diversity in science fiction. Which I think has shielded him from criticism he’d get if he were white. To explain, I have to give some spoilers, below the fold. You are warned.

These books are mainly about conflicts between humanity and aliens, over a period that lasts for many centuries. Cixin assumes that even though tech and economic progress continue, we never develop artificial intelligence that threatens the central role of humans in running things, we never extend human lifespans beyond two centuries, and rates of progress never speed up substantially. Even so, a few centuries of progress is sufficient to achieve vast physical powers, including the ability to change basic physical parameters like the speed of light and the dimensionality of space. And hibernation is achieved early on, allowing some few characters to span the entire story.

What ends up mattering the most in human conflict with aliens is the mood and personality of a few key characters, and the typical mood of humanity, which is treated as if it were a character with moods that drift over centuries. (Impersonal economic forces aren’t given much of a role.) Specifically what matters is how tough characters are – how willing they are to assume the worst of aliens and set aside the usual human morals. When characters follow the usual human inclination to be nice and trusting, things go badly, and when characters are tough and ruthless, things go well. Key characters are often too nice, and near the end humanity is almost entirely exterminated.

This tough vs. soft split within humanity is mapped explicitly onto gender. Key characters who cause problems by being soft and nice are consistently female, while the ones who most help humanity to survive are consistently male. This isn’t at all accidental. Key long-lived soft females who overlap eras when typical humans are soft lament the lack of tough men around, preferring men from prior eras. The story is told mostly from the female characters' point of view, and sympathetically to that view. Even so, men choose survival while females choose extinction. This is the sort of thing that I suspect Cixin only gets away with, so far, because he is Chinese. (A movie of the first book is out next year.)

Cixin describes a universe where most resources seem to go unused, and where great powers hide and destroy any new civilizations that they detect. And even though he has two new civilizations appearing within four light years of each other and within a few centuries of being at the same tech level, he has this basic situation continuing for billions of years. In such a universe it makes sense for young civilizations to hide and also spread out. But whether this scenario can make sense depends on whether it makes sense for the biggest old powers to also hide, as opposed to visibly grabbing a large volume of resources and using it intensely. If you are competing with other old powers hiding near you, why stick your neck out to destroy newcomers, as opposed to leaving that task to the other old powers?

Cixin just doesn’t address these issues – his only brief passage from the point of view of old aliens doesn’t consider such things.

by Robin Hanson at September 30, 2016 01:15 AM


Explain Quantum algorithm for solving a system of linear equations by taking an example [on hold]

I am reading a paper on the quantum algorithm for solving a system of linear equations. I have read 2-3 books on the subject, but I could not understand the paper at all. Please explain it to me with an example.

by Vaibhav Patel at September 30, 2016 01:03 AM

Planet Theory

Real Rank Two Geometry

Authors: Anna Seigal, Bernd Sturmfels
Download: PDF
Abstract: The real rank two locus of an algebraic variety is the closure of the union of all secant lines spanned by real points. We seek a semi-algebraic description of this set. Its algebraic boundary consists of the tangential variety and the edge variety. Our study of Segre and Veronese varieties yields a characterization of tensors of real rank two.

September 30, 2016 01:02 AM


Meaning of cross sectional rank

This paper mentions the concept of rank, which is defined as cross-sectional rank. For example, one of the alphas (#3) is

(-1 * correlation(rank(open), rank(volume), 10))

10 is just the number of days over which to take the correlation. I think we can rank the securities according to Open and Volume each day, so we get a different ranking of the securities each day. I don't understand how this daily varying ranking can be used to get a correlation value.

I thus need guidance on how to calculate this alpha. Any help will be appreciated. Thanks

Update I understand what rank is. What I don't get is how do you calculate correlation between changing values.

Let's say the universe is 3 stocks. On Day 1, Rank Open is 1,2,3 and Rank Volume is 3,2,1. On Day 2, Rank Open is 1,3,2 and Rank Volume is 2,3,1. On Day 3, Rank Open is 3,2,1 and Rank Volume is 1,2,3. This happens for n days (in this case 10).

My primary question is how do you calculate correlation between such vectors to arrive at a single value. Because normal correlation is between two same type of vectors.
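
One common reading of the formula (an assumption here, since the excerpt does not spell it out) is that the rank is taken across stocks on each day, while the correlation is then taken through time for each stock separately, so every stock gets its own alpha value. A minimal sketch in plain Python using the three-day example above; the stock labels A, B, C and the helper `pearson` are hypothetical names, and the window is 3 days instead of 10:

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equally long sequences (fails if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

# Rows are days, columns are stocks A, B, C (cross-sectional ranks per day).
rank_open_by_day = [[1, 2, 3], [1, 3, 2], [3, 2, 1]]
rank_volume_by_day = [[3, 2, 1], [2, 3, 1], [1, 2, 3]]

# Transpose so that each stock has its own time series of ranks.
open_by_stock = list(zip(*rank_open_by_day))
volume_by_stock = list(zip(*rank_volume_by_day))

# One alpha value per stock: minus the time-series correlation of its two rank series.
alpha = {
    name: -pearson(o, v)
    for name, o, v in zip("ABC", open_by_stock, volume_by_stock)
}
```

Under this reading the two vectors being correlated are of the same type: the last few daily ranks of one stock by open, and the last few daily ranks of the same stock by volume.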

by user1434997 at September 30, 2016 01:02 AM

Planet Theory

Local and Union Boxicity

Authors: Thomas Bläsius, Peter Stumpf, Torsten Ueckerdt
Download: PDF
Abstract: The boxicity $\operatorname{box}(H)$ of a graph $H$ is the smallest integer $d$ such that $H$ is the intersection of $d$ interval graphs, or equivalently, that $H$ is the intersection graph of axis-aligned boxes in $\mathbb{R}^d$. These intersection representations can be interpreted as covering representations of the complement $H^c$ of $H$ with co-interval graphs, that is, complements of interval graphs. We follow the recent framework of global, local and folded covering numbers (Knauer and Ueckerdt, Discrete Mathematics 339 (2016)) to define two new parameters: the local boxicity $\operatorname{box}_\ell(H)$ and the union boxicity $\overline{\operatorname{box}}(H)$ of $H$. The union boxicity of $H$ is the smallest $d$ such that $H^c$ can be covered with $d$ vertex-disjoint unions of co-interval graphs, while the local boxicity of $H$ is the smallest $d$ such that $H^c$ can be covered with co-interval graphs, at most $d$ at every vertex.

We show that for every graph $H$ we have $\operatorname{box}_\ell(H) \leq \overline{\operatorname{box}}(H) \leq \operatorname{box}(H)$ and that each of these inequalities can be arbitrarily far apart. Moreover, we show that local and union boxicity are also characterized by intersection representations of appropriate axis-aligned boxes in $\mathbb{R}^d$. We demonstrate with a few striking examples, that in a sense, the local boxicity is a better indication for the complexity of a graph, than the classical boxicity.

September 30, 2016 01:02 AM

Lower Bounds for Protrusion Replacement by Counting Equivalence Classes

Authors: Bart M. P. Jansen, Jules J. H. M. Wulms
Download: PDF
Abstract: Garnero et al. [SIAM J. Discrete Math. 2015, 29(4):1864--1894] recently introduced a framework based on dynamic programming to make applications of the protrusion replacement technique constructive and to obtain explicit upper bounds on the involved constants. They show that for several graph problems, for every boundary size $t$ one can find an explicit set $\mathcal{R}_t$ of representatives. Any subgraph $H$ with a boundary of size $t$ can be replaced with a representative $H' \in \mathcal{R}_t$ such that the effect of this replacement on the optimum can be deduced from $H$ and $H'$ alone. Their upper bounds on the size of the graphs in $\mathcal{R}_t$ grow triple-exponentially with $t$. In this paper we complement their results by lower bounds on the sizes of representatives, in terms of the boundary size $t$. For example, we show that each set of planar representatives $\mathcal{R}_t$ for Independent Set or Dominating Set contains a graph with $\Omega(2^t / \sqrt{4t})$ vertices. This lower bound even holds for sets that only represent the planar subgraphs of bounded pathwidth. To obtain our results we provide a lower bound on the number of equivalence classes of the canonical equivalence relation for Independent Set on $t$-boundaried graphs. We also find an elegant characterization of the number of equivalence classes in general graphs, in terms of the number of monotone functions of a certain kind. Our results show that the number of equivalence classes is at most $2^{2^t}$, improving on earlier bounds of the form $(t+1)^{2^t}$.

September 30, 2016 01:01 AM

Graph partitioning and a componentwise PageRank algorithm

Authors: Christopher Engström, Sergei Silvestrov
Download: PDF
Abstract: In this article we will present a graph partitioning algorithm which partitions a graph into two different types of components: the well-known `strongly connected components' as well as another type of component we call `connected acyclic components'. We will give an algorithm, based on Tarjan's algorithm for finding strongly connected components, to find such a partitioning. We will also show that the partitioning given by the algorithm is unique and that the underlying graph can be represented as a directed acyclic graph (similar to a pure strongly connected component partitioning).

In the second part we will show how such a partitioning of a graph can be used to calculate PageRank of a graph effectively by calculating PageRank for different components on the same `level' in parallel as well as allowing for the use of different types of PageRank algorithms for different types of components.

To evaluate the method we have calculated PageRank on four large example graphs and compared it with a basic approach, as well as our algorithm in a serial as well as parallel implementation.

September 30, 2016 01:01 AM

A linear programming based heuristic framework for min-max regret combinatorial optimization problems with interval costs

Authors: Lucas Assunção, Thiago F. Noronha, Andréa Cynthia Santos, Rafael Andrade
Download: PDF
Abstract: This work deals with a class of problems under interval data uncertainty, namely interval robust-hard problems, composed of interval data min-max regret generalizations of classical NP-hard combinatorial problems modeled as 0-1 integer linear programming problems. These problems are more challenging than other interval data min-max regret problems, as solely computing the cost of any feasible solution requires solving an instance of an NP-hard problem. The state-of-the-art exact algorithms in the literature are based on the generation of a possibly exponential number of cuts. As each cut separation involves the resolution of an NP-hard classical optimization problem, the size of the instances that can be solved efficiently is relatively small. To mitigate this issue, we present a modeling technique for interval robust-hard problems in the context of a heuristic framework. The heuristic obtains feasible solutions by exploring dual information of a linearly relaxed model associated with the classical optimization problem counterpart. Computational experiments for interval data min-max regret versions of the restricted shortest path problem and the set covering problem show that our heuristic is able to find optimal or near-optimal solutions and also improves the primal bounds obtained by a state-of-the-art exact algorithm and a 2-approximation procedure for interval data min-max regret problems.

September 30, 2016 01:01 AM


is there any way to prevent side effects in python?

Is there any way to prevent side effects in python? For example, the following function has a side effect, is there any keyword or any other way to have the python complain about it?

def func_with_side_affect(a):
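
Python has no keyword that forbids side effects. As a rough sketch of the usual workarounds (the function bodies below are hypothetical, since the original body is not shown): passing an immutable object makes an in-place mutation fail loudly, and writing the function to build a new value avoids the mutation altogether.

```python
def append_marker(items):
    # Hypothetical body: mutates the caller's list in place (the side effect).
    items.append("marker")
    return len(items)

data = [1, 2]
append_marker(data)  # data itself has been changed by the call

frozen = (1, 2)
try:
    append_marker(frozen)  # tuples have no append, so the mutation is rejected
except AttributeError as exc:
    error = exc

def append_marker_pure(items):
    # Side-effect-free variant: builds and returns a new list instead.
    return list(items) + ["marker"]

result = append_marker_pure(frozen)
```

This only guards against mutation of the argument; it does not stop a function from printing, writing files, or changing globals.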

by yigal at September 30, 2016 12:49 AM


BST Help with Words [on hold]

I am using the word SUNBEAM and I need to use it in a BST. I need to write the letters in a post-order traversal and I am confused about how to do it.
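
A minimal sketch of one way to read the exercise, assuming the letters are inserted one at a time in the order they appear in the word and compared alphabetically (both are assumptions, as are the names `Node`, `insert` and `post_order`):

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, key):
    """Standard BST insert: smaller keys go left, others go right."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root

def post_order(node):
    """Visit left subtree, then right subtree, then the node itself."""
    if node is None:
        return []
    return post_order(node.left) + post_order(node.right) + [node.key]

root = None
for letter in "SUNBEAM":
    root = insert(root, letter)

traversal = "".join(post_order(root))  # "AMEBNUS"
```

With this insertion order S is the root, N and U are its children, and the remaining letters hang below N, which is why S comes last in the post-order output.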

by Dennis Gill at September 30, 2016 12:20 AM


DOM Tree reduce using RAMDA

I want to walk a DOM tree in the browser collecting DOM nodes that are "leaves", containing no DOM children, only text nodes.

I'm imagining there's a way to do this with reduce, but it's not obvious to me how to ... Recursively reduce on a tree like structure.

I've built a bunch of usable components...

let nodeFromJQuery = R.invoker(1,'get')(0);
let nodeFromAny = R.ifElse(R.isArrayLike,nodeFromJQuery,R.identity);
let nodeType = R.pipe(nodeFromAny,R.prop('nodeType'));
let children = R.pipe(nodeFromAny,R.prop('childNodes'));
let textNodeType = R.equals(3);
let domNodeType = R.equals(1);

let domNodes =;
let textNodes =;

let isTextNode = R.pipe(nodeType, textNodeType);
let isDomNode = R.pipe(nodeType,domNodeType);
let domChildren = R.pipe(children,R.filter(isDomNode));

isLeaf = R.pipe(domChildren, R.isEmpty);

getNodes = R.filter(R.complement(isLeaf));
getLeaves = R.filter(isLeaf)

but I don't see the simple reduction... Any thoughts?


by mangr3n at September 30, 2016 12:18 AM

F# remove trailing space

I have this method that takes in a list and turns it into a bytecode string. It works the way I expect; however, I get one trailing space that I do not want. Question: how do I get rid of this last trailing space?

Input: byteCode [SC 10; SC 2; SAdd; SC 32; SC 4; SC 5; SAdd; SMul; SAdd]

let rec byteCode (l : sInstr list) : string = 
  match l with 
  | [] -> "" 
  | (SC    n :: l)     -> "0 " + string n + " " + byteCode l 
  | (SAdd    :: l)     -> "1 " + byteCode l 
  | (SSub    :: l)     -> "2 " + byteCode l 
  | (SMul    :: l)     -> "3 " + byteCode l 
  | (SNeg    :: l)     -> "4 " + byteCode l 
  | (SLess   :: l)     -> "5 " + byteCode l 
  | (SIfze n :: l)     -> "6 " + string n + " " + byteCode l 
  | (SJump n :: l)     -> "7 " + string n + " " + byteCode l

This probably won't compile because I didn't give my entire program.

This returns: "0 10 0 2 1 0 32 0 4 0 5 1 3 1 "
I expect:     "0 10 0 2 1 0 32 0 4 0 5 1 3 1"
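
The trailing space appears because every branch appends " " before the recursive call, and the final call contributes the empty string. A common remedy is to collect the tokens in a list first and join them with a single separator, which in F# would correspond to `String.concat " "`. The idea is sketched here in Python rather than F#; the tuple encoding of instructions and the names `OPCODES` and `byte_code` are assumptions made for illustration:

```python
# Opcode numbers taken from the match cases above.
OPCODES = {"SC": 0, "SAdd": 1, "SSub": 2, "SMul": 3,
           "SNeg": 4, "SLess": 5, "SIfze": 6, "SJump": 7}

def byte_code(instructions):
    """Emit one token per opcode/operand, then join with single spaces."""
    tokens = []
    for instr in instructions:
        if isinstance(instr, tuple):      # instruction with an operand, e.g. ("SC", 10)
            name, operand = instr
            tokens.append(str(OPCODES[name]))
            tokens.append(str(operand))
        else:                             # instruction without an operand, e.g. "SAdd"
            tokens.append(str(OPCODES[instr]))
    return " ".join(tokens)               # a separator only between tokens

program = [("SC", 10), ("SC", 2), "SAdd", ("SC", 32), ("SC", 4),
           ("SC", 5), "SAdd", "SMul", "SAdd"]
encoded = byte_code(program)
```

Because the separator is inserted only between tokens, no trimming step is needed afterwards.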

by Bob Long at September 30, 2016 12:06 AM

HN Daily

Planet Theory

Maximizing the Strong Triadic Closure in Split Graphs and Proper Interval Graphs

Authors: Athanasios Konstantinidis, Charis Papadopoulos
Download: PDF
Abstract: In social networks the Strong Triadic Closure is an assignment of the edges with strong or weak labels such that any two vertices that have a common neighbor with a strong edge are adjacent. The problem of maximizing the number of strong edges that satisfy the strong triadic closure was recently shown to be NP-complete for general graphs. Here we initiate the study of graph classes for which the problem is solvable. We show that the problem admits a polynomial-time algorithm for two unrelated classes of graphs: proper interval graphs and trivially-perfect graphs. To complement our result, we show that the problem remains NP-complete on split graphs, and consequently also on chordal graphs. Thus we contribute to define the first border between graph classes on which the problem is polynomially solvable and on which it remains NP-complete.

September 30, 2016 12:00 AM

September 29, 2016


How to get RUSER and EUSER of the process (FreeBSD)

I have tried this, but it doesn't work: ps -eo euser,ruser,suser,fuser,f,comm,label | grep processname

can anyone show me the right way to do this?

by FallingFromBed at September 29, 2016 11:34 PM


OpenBSD 6.0 Limited Edition CD set (signed by developers)

Five OpenBSD 6.0 CD-ROM copies were signed by 40 developers during the g2k16 Hackathon in Cambridge, UK.

Those copies are being auctioned sequentially on ebay.

All proceeds will be donated to the OpenBSD Foundation to support and further the development of free software based on the OpenBSD operating system. Read more...

September 29, 2016 11:16 PM


OpenBSD 6.0 CD Set - Limited Edition signed by 40 developers

“All proceeds from this auction will be donated to the OpenBSD Foundation to support and further the development of free software based on the OpenBSD operating system.”


by jcs at September 29, 2016 11:12 PM


What does (n,) mean in the context of numpy and vectors?

I've tried searching StackOverflow, googling, and even using symbolhound to do character searches, but was unable to find an answer. Specifically, I'm confused about Ch. 1 of Nielsen's Neural Networks and Deep Learning, where he says "It is assumed that the input a is an (n, 1) Numpy ndarray, not a (n,) vector."

At first I thought (n,) referred to the orientation of the array - so it might refer to a one-column vector as opposed to a vector with only one row. But then I don't see why we need (n,) and (n, 1) both - they seem to say the same thing. I know I'm misunderstanding something but am unsure.

For reference a refers to a vector of activations that will be input to a given layer of a neural network, before being transformed by the weights and biases to produce the output vector of activations for the next layer.

EDIT: This question equivocates between a "one-column vector" (there's no such thing) and a "one-column matrix" (does actually exist). Same for "one-row vector" and "one-row matrix".

A vector is only a list of numbers, or (equivalently) a list of scalar transformations on the basis vectors of a vector space. A vector might look like a matrix when we write it out, if it only has one row (or one column). Confusingly, we will sometimes refer to a "vector of activations" but actually mean "a single-row matrix of activation values transposed so that it is a single-column."

Be aware that in neither case are we discussing a one-dimensional vector, which would be a vector defined by only one number (unless, trivially, n==1, in which case the concept of a "column" or "row" distinction would be meaningless).
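
A short NumPy sketch of the practical difference: `(n,)` is the shape of an array with a single axis, while `(n, 1)` is the shape of a two-axis array with one column. The weight and bias arrays below are hypothetical stand-ins with the shapes a layer might use, not Nielsen's code.

```python
import numpy as np

n = 3
flat = np.zeros(n)        # shape (3,): one axis, neither a row nor a column
col = np.zeros((n, 1))    # shape (3, 1): two axes, a single-column matrix

w = np.ones((2, n))       # hypothetical weight matrix for a layer of 2 neurons
b = np.ones((2, 1))       # hypothetical bias stored as a column

out_col = np.dot(w, col) + b    # (2, 1) + (2, 1) stays (2, 1)
out_flat = np.dot(w, flat) + b  # (2,) + (2, 1) broadcasts to (2, 2)
```

The second result silently becomes a 2x2 array through broadcasting, which is a plausible reason for insisting on an `(n, 1)` input.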

by bobo at September 29, 2016 11:06 PM



What is the number of filter in CNN?

I am currently seeing the API of theano,

theano.tensor.nnet.conv2d(input, filters, input_shape=None, filter_shape=None, border_mode='valid', subsample=(1, 1), filter_flip=True, image_shape=None, **kwargs)

where filter_shape is a tuple of (num_filter, num_channel, height, width). I am confused about this: isn't the number of filters decided by the stride used while sliding the filter window over the image? How can I specify the number of filters directly like this? It would seem reasonable to me if it were calculated from the stride parameter (if there is one).

Also, I am confused with the term feature map as well, is it the neurons at each layer? How about the batch size? How are they correlated?
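
As a rough sketch of how these quantities relate in the standard 'valid' convolution: the number of filters is chosen freely and becomes the number of output feature maps, while the stride (`subsample`) only affects the height and width of each map. The helper below is hypothetical and only does the shape arithmetic; it assumes the (batch, channels, rows, columns) layout and does not call Theano.

```python
def conv2d_valid_output_shape(input_shape, filter_shape, subsample=(1, 1)):
    """Output shape of a 'valid' 2D convolution (shape arithmetic only)."""
    batch, in_channels, in_h, in_w = input_shape
    num_filters, filt_channels, f_h, f_w = filter_shape
    if in_channels != filt_channels:
        raise ValueError("filter channels must match input channels")
    out_h = (in_h - f_h) // subsample[0] + 1   # the stride shrinks the map
    out_w = (in_w - f_w) // subsample[1] + 1
    return (batch, num_filters, out_h, out_w)  # one feature map per filter

# 64 RGB images of 32x32 pixels, 16 filters of size 5x5
shape_stride1 = conv2d_valid_output_shape((64, 3, 32, 32), (16, 3, 5, 5))
shape_stride2 = conv2d_valid_output_shape((64, 3, 32, 32), (16, 3, 5, 5), (2, 2))
```

In both calls the second entry stays 16 (the chosen filter count) and the first stays 64 (the batch size); only the last two entries change with the stride.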

by xxx222 at September 29, 2016 10:38 PM

Theano multivariable regression, gradient descent

I am new to Theano and Machine learning. I try to do Multivariate Regression with Theano. The ideal model is Z=2x-2y+1, simple. I made a simple training data set and used gradient descent algorithm. But I cannot get theta1=2 , theta2=-2. Could you give me any help? Or advice? The below is the code I used. Thanks in advance.

import theano
from theano import tensor as T
import numpy as np

trX = np.linspace(-1,1,101)
trY = np.linspace(-1,1,101)
trZ = 2*trX - 2*trY + 1  #make Z value without noise

X = T.scalar()
Y = T.scalar()
Z = T.scalar()

def model(X, Y, theta1, theta2):
    return theta1*X + theta2*Y + 1  #ideal values I want are 2, -2

#give initial value 0 and 0
theta1 = theano.shared(np.asarray(0., dtype=theano.config.floatX))
theta2 = theano.shared(np.asarray(0., dtype=theano.config.floatX))

Z = model(X,Y,theta1,theta2)

cost = T.mean(T.sqr(Z-trZ))  #cost function

gx= T.grad(cost,theta1)
gy= T.grad(cost,theta2)

learning_rate = 0.01

updates=[(theta1, theta1 - learning_rate * gx),(theta2, theta2 - learning_rate * gy)]   

train = theano.function(inputs=[X,Y,Z], outputs=cost, updates = updates)

# I got 0
# I got 0
# I got 0
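
One observation that may help, sketched in plain Python rather than Theano: in the data above, trX and trY are the same grid, so trZ = 2*trX - 2*trY + 1 equals 1 everywhere, and a model with both thetas at 0 already fits it exactly. The function fit below is a hypothetical stand-in for the training loop; it assumes the intercept is fixed at 1, as in the model above.

```python
import random


def fit(xs, ys, zs, lr=0.1, steps=2000):
    """Batch gradient descent for z = t1*x + t2*y + 1 (intercept fixed at 1)."""
    t1 = t2 = 0.0
    n = len(xs)
    for _ in range(steps):
        errs = [t1 * x + t2 * y + 1 - z for x, y, z in zip(xs, ys, zs)]
        # Gradients of the mean squared error with respect to t1 and t2.
        g1 = sum(2 * e * x for e, x in zip(errs, xs)) / n
        g2 = sum(2 * e * y for e, y in zip(errs, ys)) / n
        t1 -= lr * g1
        t2 -= lr * g2
    return t1, t2


# Same grid for x and y, as in the question: z is identically 1.
grid = [-1 + 2 * i / 100 for i in range(101)]
same = fit(grid, grid, [2 * x - 2 * y + 1 for x, y in zip(grid, grid)])

# Independent samples for x and y make both coefficients identifiable.
random.seed(0)
xs = [random.uniform(-1, 1) for _ in range(101)]
ys = [random.uniform(-1, 1) for _ in range(101)]
indep = fit(xs, ys, [2 * x - 2 * y + 1 for x, y in zip(xs, ys)])
```

With the shared grid the gradient is zero from the first step, so both parameters stay at 0. With independently sampled x and y, the same loop should approach theta1 = 2 and theta2 = -2.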

by nathanlim45 at September 29, 2016 10:34 PM


Find all bonds associated with an equity

I would like to use Python to programmatically find the CUSIPs of all bonds currently outstanding from the issuer of a given equity. Assume I can use any free API and Bloomberg. Thank you!

by Steven Setteducati Jr. at September 29, 2016 10:32 PM


EC2 Reserved Instance Update – Convertible RIs and Regional Benefit

We launched EC2 Reserved Instances almost eight years ago. The model that we originated in 2009 provides you with two separate benefits: capacity reservations and a significant discount on the use of specific instances in an Availability Zone. Over time, based on customer feedback, we have refined the model and made additional options available including Scheduled Reserved Instances, the ability to Modify Reserved Instances Reservations, and the ability to buy and sell Reserved Instances (RIs) on the Reserved Instance Marketplace.

Today we are enhancing the Reserved Instance model once again. Here’s what we are launching:

Regional Benefit - Many customers have told us that the discount is more important than the capacity reservation, and that they would be willing to trade it for increased flexibility. Starting today, you can choose to waive the capacity reservation associated with a Standard RI, run your instance in any AZ in the Region, and have your RI discount automatically applied.

Convertible Reserved Instances - Convertible RIs give you even more flexibility and offer a significant discount (typically 45% compared to On-Demand). They allow you to change the instance family and other parameters associated with a Reserved Instance at any time. For example, you can convert C3 RIs to C4 RIs to take advantage of a newer instance type, or convert C4 RIs to M4 RIs if your application turns out to need more memory. You can also use Convertible RIs to take advantage of EC2 price reductions over time.

Let’s take a closer look…

Regional Benefit
Reserved Instances (either Standard or Convertible) can now be set to automatically apply across all Availability Zones in a region. The regional benefit automatically applies your RIs to instances across all Availability Zones in a region, broadening the application of your RI discounts. When this benefit is used, capacity is not reserved since the selection of an Availability Zone is required to provide a capacity reservation. In dynamic environments where you frequently launch, use, and then terminate instances this new benefit will expand your options and reduce the amount of time you spend seeking optimal alignment between your RIs and your instances. In horizontally scaled architectures using instances launched via Auto Scaling and connected via Elastic Load Balancing, this new benefit can be of considerable value.

After you click on Purchase Reserved Instances in the AWS Management Console, clicking on Search will display RIs that have this new benefit:

You can check Only show offerings that reserve capacity if you want to shop for RIs that apply to a single Availability Zone and also reserve capacity:

Convertible RIs
Perhaps you, like many of our customers, purchase RIs to benefit from the best pricing for your workloads. However, if you don’t have a good understanding of your long-term requirements, you may be able to make use of our new Convertible RI. If your needs change, you simply exchange your Convertible Reserved Instances for other ones. You can change into Convertible RIs that have a new instance type, operating system, or tenancy without resetting the term. Also, there’s no fee for making an exchange and you can do so as often as you like.

When you make the exchange, you must acquire new RIs that are of equal or greater value than those you started with; in some cases you’ll need to make a true-up payment in order to balance the books. The exchange process is based on the list value of each Convertible RI; this value is simply the sum of all payments you’ll make over the remaining term of the original RI.

You can shop for a Convertible RI by making sure that the Offering Class is set to Convertible before clicking on Search:

The Convertible RIs offer capacity assurance, are typically priced at a 45% discount when compared to On-Demand, and are available for all current EC2 instance types on a three year term. All three payment options (No Upfront, Partial Upfront, and All Upfront) are available.

Available Now
All of the purchasing and exchange options that I described above can be accessed from the AWS Management Console, AWS Command Line Interface (CLI), AWS Tools for Windows PowerShell, or the Reserved Instance APIs (DescribeReservedInstances, PurchaseReservedInstances, ModifyReservedInstances, and so forth).

Convertible RIs and the regional benefit are available in all public AWS Regions, excluding AWS GovCloud (US) and China (Beijing), which are coming soon.



by Jeff Barr at September 29, 2016 10:23 PM



ZFS read-only mount on Linux + simultaneous read-write mount on Solaris

We regularly have to copy very large files from Solaris to Linux over the network. It currently takes almost half a day for one file. The files on Solaris are on a ZFS filesystem.

So I thought, what the heck, we could probably mount that ZFS filesystem on Linux.

But ZFS is not a clustered (or clusterable) filesystem.

Hypothesis: since we are only copying from Solaris, we could mount that same ZFS filesystem read-only on Linux, so it would not have to be clustered in this case. Writes would happen only on the Solaris side (we can't unmount it there).

The Solaris box is very busy, and its NICs are almost always very busy too, so moving the file copy to FC should make it much faster.

The Linux box is a virtual guest on a VMware host, so yes, it's possible to present the same FC fabric to that Linux guest.

Thoughts? The hypothesis is the part I am most looking for feedback on. I am not sure whether a read-only ZFS mount on Linux is possible alongside a simultaneous read-write mount on Solaris.

by Ruslan at September 29, 2016 10:09 PM


Asymptotic bounds on geometric sums

So I'm doing exercises from Dasgupta's Algorithms. The exercise I'm having trouble with is:

Show that, if $c$ is a positive real number, then $g(n) = 1 + c + c^2 +...+c^n$ is:

  1. $\Theta(1)$ if $c<1$

  2. $\Theta(n)$ if $c=1$

  3. $\Theta(c^n)$ if $c>1$

(I don't know if this is a hint, but it is included in the text: "The moral: in big-$\Theta$ terms, the sum of a geometric series is simply the first term if the series is strictly decreasing, the last term if the series is strictly increasing, or the number of terms if the series is unchanging.")

The only one that makes sense to me is 2), where $1+1+1^2+\dots+1^n$ is the same as $n+1$, and removing the 1 gives $O(n)$. I don't know if my reasoning makes sense, but that's all I've got. I have no idea where to start or how to think about the other two. Any suggestions?
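
One possible starting point for cases 1 and 3, assuming the standard closed form of a geometric sum for $c \neq 1$, is to bound $g(n)$ from both sides:

$$g(n) = \frac{c^{n+1}-1}{c-1} \qquad (c \neq 1)$$

$$c<1:\quad 1 \le g(n) = \frac{1-c^{n+1}}{1-c} < \frac{1}{1-c}, \qquad\qquad c>1:\quad c^n \le g(n) = \frac{c^{n+1}-1}{c-1} < \frac{c}{c-1}\,c^n$$

For $c<1$ both bounds are constants that do not depend on $n$, which suggests $\Theta(1)$. For $c>1$ both bounds are constant multiples of $c^n$, which suggests $\Theta(c^n)$.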

by student201 at September 29, 2016 10:06 PM


Cost function erratically varying


I am designing a neural network solution for a multiclass classification problem using TensorFlow. The input data consists of 16 features and 6000 training examples, read from a CSV file with 17 columns (16 features + 1 label) and 6000 rows (training examples). I have decided to use 16 neurons in the input layer, 16 neurons in the hidden layer, and 16 neurons in the output layer (as it is a 16-class classification). Here is my code:

import tensorflow as tf
def weight_variable(shape):
    return tf.Variable(initial)
def bias_variable(shape):
    return tf.Variable(initial)
def read_from_csv(filename_queue):
    record_defaults=[[1.], [1.], [1.], [1.], [1.],[1.], [1.], [1.], [1.], [1.],[1.], [1.], [1.], [1.], [1.],[1.],[1.]]
    features = tf.pack([col1, col2, col3, col4,col5,col6,col7,col8,col9,col10,col11,col12,col13,col14,col15,col16])
    return features,labels
def input_pipeline(filenames,batch_size,num_epochs=None):
    return feature_batch,label_batch

#input layer

#hidden layer

summaries = tf.merge_all_summaries()

init_op = tf.initialize_all_variables()

# Create a session for running operations in the Graph.
sess = tf.Session()
summary_writer = tf.train.SummaryWriter('stats', sess.graph)

# Initialize the variables (like the epoch counter).
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(sess=sess, coord=coord)

    while not coord.should_stop():

    #summary_writer.add_summary(, count)
    if count in range(300,90000,300):
except tf.errors.OutOfRangeError:
    print('Done training -- epoch limit reached')
    # When done, ask the threads to stop.

# Wait for threads to finish.


The problem is that when I print the cost during training, instead of showing a generally decreasing trend it increases and decreases erratically. I am pasting the full code because this looks like an implementation problem that I am unable to find (varying the learning rate did not help).

Edit: Decreasing the learning rate to 10^-12 gives the following costs (still erratic):

201.928, 173.078, 144.212, 97.6255, 133.125, 164.19, 208.571, 208.599, 188.594, 244.078, 237.414, 224.085, 224.1, 206.36, 217.457, 244.083, 246.309, 268.496, 248.517, 272.924, 228.551, 239.637, 301.759,....

I am printing the cost after every 300 counts because 1 batch = 20 examples, so 6000/20 = 300 counts make up 1 epoch, after which the weights are updated.

by Jasdeep Singh Chhabra at September 29, 2016 09:52 PM

Adding complex objects to an array using foreach or map with javascript

I've reviewed similar questions here on Stack Overflow and read the JavaScript documentation on MDN without finding comparable examples, so I am asking for advice on how to create the desired array from the given array.

The desired array is in this format. Note that each array element is an object consisting of one key followed by a value that is itself an object of key:value pairs. (If you are curious why I'm coding the array like this, I need this format in order to use the bind.js library.)

Desired format:

var userSummary = { 
ryanc: {
"New Contacts":7,
"New Invites (Challenge Groups)":8,
"New Invites (Coaching Opportunity)":9,
"New Follow Ups":12,
"New Challenger Check-Ins":11,
"Team Call Participation (National, upline team, or our team)":3,
Pupperpie: {
"New Contacts":5,
"New Invites (Challenge Groups)":4,
"New Invites (Coaching Opportunity)":3,
"New Follow Ups":3,
"New Challenger Check-Ins":5,
"Team Call Participation (National, upline team, or our team)":1,
bowdenke: {
"New Contacts":14,
"New Invites (Challenge Groups)":3,
"New Invites (Coaching Opportunity)":3,
"New Follow Ups":1,
"New Challenger Check-Ins":0,
"Team Call Participation (National, upline team, or our team)":2,

The current format is this:

var userSummary = { 
"Created By":"ryanc",
"New Contacts":7,
"New Invites (Challenge Groups)":8,
"New Invites (Coaching Opportunity)":9,
"New Follow Ups":12,
"New Challenger Check-Ins":11,
"Team Call Participation (National, upline team, or our team)":3,
"Created By":"Pupperpie",
"New Contacts":5,
"New Invites (Challenge Groups)":4,
"New Invites (Coaching Opportunity)":3,
"New Follow Ups":3,
"New Challenger Check-Ins":5,
"Team Call Participation (National, upline team, or our team)":1,
"Created By":"bowdenke",
"New Contacts":14,
"New Invites (Challenge Groups)":3,
"New Invites (Coaching Opportunity)":3,
"New Follow Ups":1,
"New Challenger Check-Ins":0,
"Team Call Participation (National, upline team, or our team)":2,

Currently I have tried this code, which inserts a literal 'user' instead of the user's name, like this:

{"user":{"Created By":"ryanc","New Contacts":0,"New Invites (Challenge Groups)":0,"New Invites (Coaching Opportunity)":0,"New Follow Ups":0,"New Challenger Check-Ins":0,"Team Call Participation (National, upline team, or our team)":0,"SC Points":0,"Volume":0}}

The array uniqueUsers contains the usernames; userSummary should contain the newly formatted object.

I have tried a few other variations of this as well, but cannot seem to get the syntax correct.

uniqueUsers.forEach(function(user) {
var obj = { user:  {
  'Created By': user,
  'New Contacts': 0,
  'New Invites (Challenge Groups)': 0,
  'New Invites (Coaching Opportunity)': 0,
  'New Follow Ups': 0,
  'New Challenger Check-Ins': 0, 
  'Team Call Participation (National, upline team, or our team)': 0,
  'SC Points': 0,
  'Volume': 0

I would appreciate any additional ideas. Please let me know if additional information is needed.

by Shazam at September 29, 2016 09:42 PM

scikit-learn - how to force selection of at least a single label in LinearSVC

I'm doing multi-label classification. I've trained on a dataset and am getting back suggested labels. However, not all samples get at least a single label. I'm running into this exact issue that was discussed on the mailing list. It looks like there was discussion around potentially adding a parameter to force selection of a minimum number of labels; however, looking at the documentation, I don't see that it was ever implemented. I don't quite understand the suggested hack. Is there no way to do this after all the learning has completed?

The learning portion of my code:

lb = preprocessing.MultiLabelBinarizer()

Y = lb.fit_transform(y_train_text)

classifier = Pipeline([
    ('vectorizer', CountVectorizer(stop_words="english")),
    ('tfidf', TfidfTransformer()),
    ('clf', OneVsRestClassifier(LinearSVC()))])
classifier.fit(X_train, Y)  # X_train is an assumed name for the training texts
predicted = classifier.predict(X_test)
all_labels = lb.inverse_transform(predicted)
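
One way this could be done after training, sketched here in plain Python: take the per-label decision scores (for instance from classifier.decision_function(X_test), assuming the pipeline exposes it), keep every label with a positive score, and fall back to the single highest-scoring label when none is positive. The function name force_at_least_one and the example scores are made up for illustration.

```python
def force_at_least_one(scores):
    """Pick label indices per sample, guaranteeing at least one label.

    scores: list of rows, one row of per-label decision values per sample.
    """
    result = []
    for row in scores:
        # Labels the classifier would normally predict (positive margin).
        chosen = [i for i, s in enumerate(row) if s > 0]
        if not chosen:
            # No label cleared the threshold: fall back to the best-scoring one.
            chosen = [max(range(len(row)), key=row.__getitem__)]
        result.append(chosen)
    return result


# Hypothetical decision values for two samples and three labels.
labels = force_at_least_one([[0.5, -1.0, 0.2], [-0.3, -0.1, -2.0]])
```

The resulting index lists would still need to be mapped back to label names, for example through lb.classes_.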

by firefly2442 at September 29, 2016 09:40 PM

Counting with Regression in Caffe, training Loss reducing but predicted values remain the same

I am trying to count objects in an image using AlexNet.

I currently have images containing 1, 2, 3 and 4 objects per image. For an initial check, I have 10 images per class, so that the training set looks like this:

image  label
image1  1
image2  1
image3  1
image39 4
image40 4

I used the ImageNet create script to build an LMDB file for this dataset, and it converted my set of images successfully. As suggested by many, I converted AlexNet into a regression model for learning the number of objects in an image by using a EuclideanLoss layer instead of the Softmax layer. The rest of the network is the same.

However, despite doing all of the above, when I run the model I receive only zeros as output (shown below). It does not learn anything, even though the training loss decreases continuously with each iteration. I don't understand whether I am making a mistake in displaying the predicted data or somewhere else.

The predicted and actual labels for the test dataset are:

I0928 17:52:45.585160 18302 solver.cpp:243] Iteration 1880, loss = 0.60498
I0928 17:52:45.585212 18302 solver.cpp:259]     Train net output #0: loss = 0.60498 (* 1 = 0.60498 loss)
I0928 17:52:45.585225 18302 solver.cpp:592] Iteration 1880, lr = 1e-06
I0928 17:52:48.397922 18302 solver.cpp:347] Iteration 1900, Testing net (#0)
I0928 17:52:48.499543 18302 accuracy_layer.cpp:88] Predicted_Value: 0 Actual Label: 1
I0928 17:52:48.499630 18302 accuracy_layer.cpp:88] Predicted_Value: 0 Actual Label: 1
I0928 17:52:48.499641 18302 accuracy_layer.cpp:88] Predicted_Value: 0 Actual Label: 2
I0928 17:52:48.499645 18302 accuracy_layer.cpp:88] Predicted_Value: 0 Actual Label: 2
I0928 17:52:48.499650 18302 accuracy_layer.cpp:88] Predicted_Value: 0 Actual Label: 2
I0928 17:52:48.499660 18302 accuracy_layer.cpp:88] Predicted_Value: 0 Actual Label: 3
I0928 17:52:48.499663 18302 accuracy_layer.cpp:88] Predicted_Value: 0 Actual Label: 3
I0928 17:52:48.499677 18302 accuracy_layer.cpp:88] Predicted_Value: 0 Actual Label: 4
I0928 17:52:48.499681 18302 accuracy_layer.cpp:88] Predicted_Value: 0 Actual Label: 4

Can anybody tell me what could be causing the predicted values to always be 0?

Note: I also created HDF5 files in order to have floating-point labels, i.e. 1.0, 2.0, 3.0 and 4.0. However, when I changed the data layer to the HDF5 type, I could no longer crop the images for data augmentation or normalize them, as is done in AlexNet with the LMDB layer. I used the script given on "" for the HDF5 data and followed its steps for using it in my model.

by khan at September 29, 2016 09:27 PM


Question about the following function [migrated]

I am trying to understand what the following code does:

void chomp (char* string, char delim) {
    size_t len = strlen (string);
    if (len == 0) return;
    char* nlpos = string + len - 1;
    if (*nlpos == delim) *nlpos = '\0';
}

What is a delimiter? Does the fourth line basically save the last character in the string?

by TheMathNoob at September 29, 2016 09:15 PM

Distance vector in a weighted graph

I have a weighted, connected and directed graph $G$. There is a vector called the distance vector $Dv \in \mathbb{N}^n$ in which $Dv_i$ is the shortest distance from $1$ to $i$. All edge weights are positive integers. I have to show that every distance vector $Dv$ satisfies:

  1. $Dv_1 = 0$.

  2. For all $j \neq 1$ there exists $i$ such that $Dv_j = Dv_i + w(i,j)$.

  3. For all $i,j$ it holds that $Dv_j \leq Dv_i + w(i,j)$.

I think that 1 is trivial: from 1 to 1 you have no distance. But the rest? Can you give me an idea how to prove 2 and 3?
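
To get a feel for the three properties before proving them, it may help to check them on a small example. The sketch below computes the distances with Dijkstra's algorithm and tests each property; the graph is a made-up example with vertex 1 as the source, and property 3 is only checked for pairs $(i,j)$ that are actually edges, which is an assumption about how $w(i,j)$ is meant.

```python
import heapq


def shortest_distances(graph, source):
    """Dijkstra's algorithm; graph maps node -> {neighbor: positive weight}."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u].items():
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


# Hypothetical example graph; every vertex is reachable from vertex 1.
graph = {1: {2: 4, 3: 1}, 2: {4: 1}, 3: {2: 2, 4: 5}, 4: {1: 7}}
dv = shortest_distances(graph, 1)
edges = [(i, j, w) for i, nbrs in graph.items() for j, w in nbrs.items()]

prop1 = dv[1] == 0
# Property 2: every j != 1 has a predecessor i on a shortest path.
prop2 = all(
    any(dv[j] == dv[i] + w for i, k, w in edges if k == j)
    for j in graph if j != 1
)
# Property 3: no edge offers a shortcut.
prop3 = all(dv[j] <= dv[i] + w for i, j, w in edges)
```

The checks mirror the proof ideas: the last edge of a shortest path to $j$ gives the $i$ in property 2, and a violated inequality in property 3 would yield a shorter path to $j$.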

by Asker at September 29, 2016 09:05 PM



Is there a relation between K-Framework and structural operational semantics?

K-framework strives to give one semantics (instead of two, operational and denotational) for industrial programming languages. The same unification is done by structural operational semantics (SOS) as well. Are those two frameworks related somehow? Such a relation could be very fruitful, because K-framework has excellent tools and fully completed semantics, while SOS has deep theory and a connection with category theory.

by TomR at September 29, 2016 09:03 PM


Scala map function to remove fields

I have a list of Person objects with many fields, and I can easily map over it with person => person.getName in order to generate another collection with all the people's names.

How can you use the map function to create a new collection with all the fields of the Person class except their name?

In other words, how can you create a new collection out of a given collection which will contain all the elements of your initial collection with some of their fields removed?

by anton4o at September 29, 2016 08:57 PM


DragonFly BSD Digest

BSDNow 161: The BSD Bromance

BSDNow 161 has sort of 2 interviews this week. It has Allan Jude talking about his EuroBSDCon trip, plus Michael Shirk talking about Bro on FreeBSD. Also, lots of news items right on the BSDNow page.

by Justin Sherrill at September 29, 2016 08:42 PM


Regular Expression for declaration statement in c++ [on hold]

What would the regular expression be for the various types of declaration statements?
Also give the minimized DFA for the same (e.g. int a , b ;).

Please draw the deterministic finite automata for integers, constants, punctuators and operators.
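
As a rough sketch of the simplest case only, the pattern below recognizes declarations like the example int a , b ; using Python's re module. It assumes a handful of built-in type names and plain identifiers, with no initializers, pointers, arrays or qualifiers, and it does not exclude keywords as identifiers. The names IDENT, DECLARATION and is_declaration are made up for this example.

```python
import re

# An identifier: a letter or underscore followed by letters, digits, underscores.
IDENT = r'[A-Za-z_][A-Za-z0-9_]*'

# type, whitespace, identifier, then zero or more ", identifier", then ";".
DECLARATION = re.compile(
    r'\s*(?:int|char|float|double|long|short)\s+'
    + IDENT + r'(?:\s*,\s*' + IDENT + r')*\s*;\s*'
)


def is_declaration(text):
    """True if the whole string is a simple declaration statement."""
    return DECLARATION.fullmatch(text) is not None


ok = is_declaration("int a , b ;")
missing_comma = is_declaration("int a b;")
```

Since the pattern uses only concatenation, alternation and repetition, it describes a regular language, so a DFA for it can be derived by the usual constructions.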

by ChÃrming ßoy at September 29, 2016 08:38 PM


Python gradient descent - cost keeps increasing

I'm trying to implement gradient descent in python and my loss/cost keeps increasing with every iteration.

I've seen a few people post about this, and saw an answer here: gradient descent using python and numpy

I believe my implementation is similar, but I can't see what I'm doing wrong to get an exploding cost value:

Iteration: 1 | Cost: 697361.660000
Iteration: 2 | Cost: 42325117406694536.000000
Iteration: 3 | Cost: 2582619233752172973298548736.000000
Iteration: 4 | Cost: 157587870187822131053636619678439702528.000000
Iteration: 5 | Cost: 9615794890267613993157742129590663647488278265856.000000

I'm testing this on a dataset I found online (LA Heart Data):

Import code:

dataset = np.genfromtxt('heart.csv', delimiter=",")

x = dataset[:]
x = np.insert(x,0,1,axis=1)  # Add 1's for bias
y = dataset[:,6]
y = np.reshape(y, (y.shape[0],1))

Gradient descent:

def gradientDescent(weights, X, Y, iterations = 1000, alpha = 0.01):
    theta = weights
    m = Y.shape[0]
    cost_history = []

    for i in xrange(iterations):
        residuals, cost = calculateCost(theta, X, Y)
        gradient = (float(1)/m) * np.dot(residuals.T, X).T
        theta = theta - (alpha * gradient)

        # Store the cost for this iteration
        cost_history.append(cost)
        print "Iteration: %d | Cost: %f" % (i+1, cost)

Calculate cost:

def calculateCost(weights, X, Y):
    m = Y.shape[0]
    residuals = h(weights, X) - Y
    squared_error = np.dot(residuals.T, residuals)

    return residuals, float(1)/(2*m) * squared_error

Calculate hypothesis:

def h(weights, X):   
    return np.dot(X, weights)

To actually run it:

gradientDescent(np.ones((x.shape[1],1)), x, y, 5)
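
A cost that grows by many orders of magnitude per iteration is typical of a step size that is too large for the scale of the features. The sketch below, in plain Python with synthetic one-feature data (not the heart dataset), runs the same kind of batch update on raw and on standardized inputs with alpha = 0.01; all names and numbers are assumptions for illustration. Separately, note that x = dataset[:] in the import code keeps column 6, the target, among the features.

```python
import random
import statistics


def gradient_descent(xs, ys, alpha=0.01, iterations=5):
    """Fit y = b + w * x by batch gradient descent; return the cost per iteration."""
    b, w = 0.0, 0.0
    m = len(xs)
    costs = []
    for _ in range(iterations):
        residuals = [b + w * x - y for x, y in zip(xs, ys)]
        costs.append(sum(r * r for r in residuals) / (2 * m))
        b -= alpha * sum(residuals) / m
        w -= alpha * sum(r * x for r, x in zip(residuals, xs)) / m
    return costs


# Synthetic feature on a large raw scale (assumed, for illustration only).
random.seed(0)
xs = [random.uniform(20, 200) for _ in range(200)]
ys = [0.5 * x + 3.0 for x in xs]

costs_raw = gradient_descent(xs, ys)

# Standardize the feature to zero mean and unit variance, then rerun.
mean, std = statistics.mean(xs), statistics.pstdev(xs)
xs_scaled = [(x - mean) / std for x in xs]
costs_scaled = gradient_descent(xs_scaled, ys)
```

On the raw feature the cost grows at every step, while on the standardized feature it shrinks at every step with the same alpha. A much smaller alpha on the raw data would be the other way to restore convergence.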

by Simon at September 29, 2016 08:28 PM



Difference between time-bounded and memory-bounded Kolmogorov complexity

Let $x$ be a finite string of length $n$.

Denote by $C^t(x)$ the Kolmogorov complexity of $x$ bounded by time $t$ (i.e. the length of a minimal program that outputs $x$ and runs for at most $t$ steps).

Denote by $C_m(x)$ the Kolmogorov complexity of $x$ bounded by memory $m$.

Can $C^{\mathsf{poly}(n)}(x)$ be much greater than $C_{\mathsf{poly}(n)}(x)$? It seems that the answer is "yes" but how to prove it under natural assumptions?

More precisely: let $(x_i)$ be a sequence of finite strings. Is it true that for every polynomial $p$ there exist a polynomial $q$ and a constant $c$ such that for every $x_i$ of length $n$ the following inequality holds: $$C^{q(n)}(x_i) < C_{p(n)}(x_i) + c\log n ?$$ Does this contradict any natural assumptions of computational complexity theory?

by Alexey at September 29, 2016 08:25 PM


Job Scheduling: Two Resources, Defined Job Length, Defined Earliest Start


I stumbled over the following job scheduling problem.

  • There are two resources, for simplicity I call them ...
    • CPU_RAM (MAX_CPU_RAM specifies what is available in total)
    • GPU_RAM (MAX_GPU_RAM specifies what is available in total)
  • Each job has ...
    • CPU_RAM requirements
    • GPU_RAM requirements
    • a duration, that is the time needed for execution
    • an earliest start time, i.e. it cannot be executed before that
    • is known "offline". There are changes (addition/removal, resource usage change) but they are infrequent.

For example, Job A needs 20MB GPU_RAM, 100MB CPU_RAM and 5 minutes to execute. It cannot be started before 9:00. There are other jobs that have different requirements.


Given the constraints above I want to find a schedule with the earliest completion time.

I don't need a perfect solution, as I doubt finding one would be feasible: there will be roughly 50 jobs, and up to 200 jobs should be possible. Instead I am interested in a solution that works in practice.

The execution time of the scheduling algorithm should be less than 10 seconds to be feasible. The domain I am working in is not actually scheduling jobs on a PC, but scheduling jobs on a PC is easier to describe than what I am doing.

First Heuristic

The heuristic I am using at the moment is quite primitive.

  1. I sort the jobs by their earliest start time.
  2. Then I pick the one that can start earliest and add it to the schedule.
  3. Taking the reduced resources into consideration the possible start times of the remaining jobs are updated.
  4. The job with the earliest possible start time is added. On a tie I take the job that uses most of the RAM for both CPU/GPU.

The problem with that approach is that jobs using less RAM are preferred, since they fit more easily and thus tend to have earlier possible start times. So I am thinking of making the heuristic more like bin packing, by doing first-fit decreasing.
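
The heuristic described above could be sketched roughly as follows. This is one reading of the four steps, not a definitive implementation: the Job class, the function names and the example jobs are made up, "uses most of the RAM" is interpreted as the largest CPU_RAM + GPU_RAM sum, and every job is assumed to fit within the limits on its own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    cpu: int        # CPU_RAM required
    gpu: int        # GPU_RAM required
    duration: int   # execution time
    release: int    # earliest allowed start time


def fits(job, start, scheduled, max_cpu, max_gpu):
    """True if job can run in [start, start + duration) within both limits."""
    end = start + job.duration
    # Usage only rises when a job starts, so it suffices to check the window's
    # start and every scheduled start that falls inside the window.
    checkpoints = [start] + [s for s, _ in scheduled if start < s < end]
    for t in checkpoints:
        running = [j for s, j in scheduled if s <= t < s + j.duration]
        if sum(j.cpu for j in running) + job.cpu > max_cpu:
            return False
        if sum(j.gpu for j in running) + job.gpu > max_gpu:
            return False
    return True


def earliest_start(job, scheduled, max_cpu, max_gpu):
    """Earliest feasible start: the release time or the end of a scheduled job."""
    ends = {s + j.duration for s, j in scheduled}
    candidates = sorted({job.release} | {e for e in ends if e > job.release})
    return next(t for t in candidates if fits(job, t, scheduled, max_cpu, max_gpu))


def greedy_schedule(jobs, max_cpu, max_gpu):
    """Repeatedly add the job that can start earliest; ties go to the larger job."""
    scheduled, remaining = [], list(jobs)
    while remaining:
        def key(job):
            start = earliest_start(job, scheduled, max_cpu, max_gpu)
            return (start, -(job.cpu + job.gpu))
        best = min(remaining, key=key)
        scheduled.append((earliest_start(best, scheduled, max_cpu, max_gpu), best))
        remaining.remove(best)
    return scheduled


jobs = [
    Job("A", cpu=60, gpu=10, duration=5, release=0),
    Job("B", cpu=60, gpu=5, duration=3, release=0),
    Job("C", cpu=30, gpu=5, duration=2, release=1),
]
schedule = greedy_schedule(jobs, max_cpu=100, max_gpu=20)
```

In this small example the bias mentioned above is visible: the small job C is placed next to A at time 1, while B has to wait until A finishes at time 5. Changing the sort key, for example towards first-fit decreasing, only requires replacing the key function.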


After the heuristic I looked into The Algorithm Design Manual to see whether anything there might help.

I think bin packing algorithms with 3D boxes might apply. The constraint, though, would be that each box needs to have a specific alignment. Then one dimension could represent CPU_RAM, another GPU_RAM, and the last one time. What would be missing in that case is the earliest start time, which I would have to consider myself when deciding which box can be added next. Moreover, the bin would have to be "open" in one dimension (time).

My Google search for an algorithm that handles a specific alignment of the boxes was not that fruitful, though.


I would be very grateful if you could give me advice on which algorithms, papers, or topics to look into that could help me fulfil the goal outlined.

by mat69 at September 29, 2016 08:25 PM


Monte-Carlo simulations and Asian Option

If I wish to price a fixed-strike Asian call option via Monte-Carlo (this has no early exercise), are the following steps correct?

1) Simulate random asset prices. (Milstein)

$dS(t) = rS(t)\,dt + \sigma S(t)\,dB(t)$

$S_{t+dt} = S_t + r S_t\,dt + \sigma S_t \sqrt{dt}\,Z + \frac{1}{2}\sigma^2 S_t\,dt\,(Z^2-1)$

2) Average the asset prices for each simulation.

$A[i]$ is the average for each simulation.

I'll be using both geometric and arithmetic averages.

3) Calculate each payoff and discount it. Find the average of these payoffs.

$\text{Payoff}[i] = e^{-r(T-t)} \max(A[i]-K, 0)$

$\text{Average} = \frac{1}{N}\sum_{i=1}^N \text{Payoff}[i]$

I'm aware that there are some approximation formulae, Finite-Difference methods and closed-form solutions but I'm trying to focus on Monte-Carlo simulations for now.
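
The three steps could be sketched as follows for the arithmetic average, using only the Python standard library. This is a minimal illustration under stated assumptions, not a tuned pricer: the function name asian_call_mc and the parameter values are made up, the option is priced at time 0, and the average is taken over the simulated prices at the end of each step.

```python
import math
import random


def asian_call_mc(s0, k, r, sigma, t, steps, paths, seed=0):
    """Monte Carlo price of a fixed-strike arithmetic-average Asian call."""
    rng = random.Random(seed)
    dt = t / steps
    total = 0.0
    for _ in range(paths):
        s, running_sum = s0, 0.0
        for _ in range(steps):
            z = rng.gauss(0.0, 1.0)
            # Step 1: Milstein update for geometric Brownian motion.
            s += (r * s * dt
                  + sigma * s * math.sqrt(dt) * z
                  + 0.5 * sigma ** 2 * s * dt * (z * z - 1))
            running_sum += s
        # Step 2: arithmetic average of the path.
        average = running_sum / steps
        # Step 3: accumulate the payoff; discounting is applied once at the end.
        total += max(average - k, 0.0)
    return math.exp(-r * t) * total / paths


price = asian_call_mc(s0=100, k=100, r=0.05, sigma=0.2, t=1.0, steps=50, paths=2000)
```

For the geometric average, one would accumulate math.log(s) instead and exponentiate the mean. Increasing paths reduces the Monte Carlo error, which shrinks roughly with the square root of the number of paths.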

by mathnoob at September 29, 2016 08:22 PM



Data set of American put prices [duplicate]

This question already has an answer here:

As part of my research, I need to find a data set of American put prices for different maturities and strikes, along with the corresponding index prices. I tried Yahoo and Google Finance but have had no luck so far. I would really appreciate it if someone could point out where I can download such data for free.

Thanks in advance.

by Kushan at September 29, 2016 07:50 PM

How to choose a price adjustment, a roll date and a data center for my trading strategy?

I have many doubts about which roll date and price adjustment I should use. I need to backtest about 50 different futures: 6 indices (mini S&P 500, Nikkei 225…), 10 agricultural (soybean, oat, corn…), 3 meats (live cattle, lean hog, feeder cattle), 8 currencies (yen, Australian dollar, pound, Swiss franc…), 5 metals (silver, gold, palladium…), Treasury notes (10-year, 5-year…), the 30-year US bond and some more…

My backtest covers 15 years, from 2000 to 2015. I have chosen the backward Panama canal method, rolling on the open interest switch, with depth #1 for all of them.

My question is: is that correct? Or should I use different methods for the different kinds of futures (agricultural, metals, currencies…)?

Another question: the SCF futures data for some contracts has gaps in the charts. Several show gaps between 2009 and 2012 (the majority of the currency and agricultural futures). The example below is the yen future.

[image: chart of the yen future]

I don't know what produces these gaps or discontinuous bars.

Thank you very much for your time.

by Manuel Botias at September 29, 2016 07:41 PM


BNF grammar for infix arithmetic expression over integers and identifiers

Write a BNF grammar for infix arithmetic expressions over integers and identifiers.

So I have that a BNF grammar has $4$ parts: terminals, non-terminals, a start symbol, and productions. For the terminals, I can pick two identifiers $x$ and $y$, and for the productions, I can put the expression ($x +$ operand $+ y$). But what about the non-terminals and the start symbol, and how do you put all $4$ parts together to form a BNF grammar?

by user59083 at September 29, 2016 07:26 PM



Make derivative zero in Theano

I am trying to implement the LSTM optimizer from this paper:

They assume that the derivative of the gradient w.r.t. the LSTM parameters is zero:


Looking at my code, I think that this assumption is not applied when I optimize the loss function, because Theano is able to compute this gradient and does so. How can I prevent it from doing this?

Here's the code:

def step_opt(cell_previous, hid_previous, theta_previous, *args):
    func = self.func(theta_previous)

    grad = theano.grad(func, theta_previous)
    input_n = grad.dimshuffle(0, 'x')

    cell, hid = step(input_n, cell_previous, hid_previous, *args) # function that recomputes LSTM hidden state and cell 

    theta = theta_previous +
    return cell, hid, theta, func

cell_out, hid_out, theta_out, loss_out = theano.scan(
         outputs_info=[cell_init, hid_init, theta_init, None],

loss = loss_out.sum()

by justanothercoder at September 29, 2016 07:17 PM



How to programmatically learn regexes?

My question is a continuation of this one. Basically, I have a table of words like so:


For my purposes, I do not need the terminal .1 or .2 for this set of names. I can manually write the following regex (using Python syntax):

r = re.compile(r'(.*\.\d+)\.\d+')

However, I cannot guarantee that my next set of names will have a similar structure where the final 2 characters are discardable: it could be 3 characters (e.g. .12), and the separator could change as well (e.g. from . to _).

What is the appropriate way to either explicitly learn a regex or to determine which characters are unnecessary?
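
Short of learning a regex, one simple option is a pattern that is agnostic about the separator and the number of digits. The sketch below assumes the discardable part is always a single non-alphanumeric separator followed by digits at the very end of the name; the names SUFFIX and strip_suffix and the example strings are hypothetical, since the original table is not shown.

```python
import re

# One separator character (anything non-alphanumeric) plus trailing digits.
SUFFIX = re.compile(r'[^0-9A-Za-z]\d+$')


def strip_suffix(name):
    """Remove one trailing separator-plus-digits group, if present."""
    return SUFFIX.sub('', name, count=1)


dotted = strip_suffix("ABC.1.2")
underscored = strip_suffix("ABC_1_12")
unchanged = strip_suffix("ABC")
```

If the separator can also vary within one data set, a further step could be to count which trailing groups vary across names that otherwise share a prefix, and discard only those.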

by learner at September 29, 2016 07:05 PM


Initiating new orders with active "order-session" only?

Is it a must to establish a "quote-session" and subscribe to quotes/market data before initiating a "New Order - Single" (Market, GTC)? I can't see any use for the quote-session in trading activities, and my FIX bridge is opening new single orders using the "order-session" only.

The reasons I want to avoid the quote-session: 1. It can add to the overall processing time/latency. 2. I can log on to the quote-session and use the quote flow from a different application. Any thoughts on these?

by Reza Str at September 29, 2016 06:37 PM

Unexplained, empty candlestick spikes appear after large movements

See the following picture and highlight:

[image: candlestick chart with the spikes highlighted]

This often happens on large gap downs/ups. What is the reason for this phenomenon?

by Robert Tan at September 29, 2016 06:31 PM

DragonFly BSD Digest

Synaptics improvements

If you had trouble getting your laptop’s touchpad to work under DragonFly, try again.  (If you are running DragonFly-current)

by Justin Sherrill at September 29, 2016 06:23 PM


Converting a polynomial regression to a PMML file

I need to create a PMML model using polynomial regression.

Data set :

Temperature <-c(2.6 ,5.7 ,5.8 ,6.6 ,8.8 ,9.3 ,10.2 ,11.5 ,12.3 ,12.8 ,13.7 ,14.9 ,17.2 ,18.1 ,19.1 ,22.7 ,23 ,29.6 ,29.9 ,30.1 ,30.3 ,33.2 ,36.7 ,36.8 ,38.7 ,41.3 ,43.5 ,44.2 ,48.2 ,49.9)

Index <-c(0.89 ,0.78 ,0.79 ,0.74 ,0.66 ,0.62 ,0.58 ,0.54 ,0.51 ,0.5 ,0.45 ,0.4 ,0.31 ,0.28 ,0.24 ,0.1 ,0.04 ,0.17 ,0.24 ,0.26 ,0.27 ,0.33 ,0.46 ,0.47 ,0.54 ,0.65 ,0.74 ,0.77 ,0.93 ,1)

I used the code below to do polynomial regression (as my curve is U-shaped) and create a PMML file out of it.

Code :



my.Pmodel<-lm(Index~poly(Temperature,6),data = discomfort_index_trainint_set)



But I was not able to generate the PMML when I used polynomial regression. Please assist.

Error:

pmml::pmml(my.Pmodel)
Error in datypelist[[namelist[ndf2][1]]] : subscript out of bounds

by Arul at September 29, 2016 06:08 PM


Wild story of the week: female volunteers at the refugee camp ...

Wild story of the week: female volunteers at the refugee camp in Calais are said to have sought out and had sex with underage refugees. Are we watching the Nazis' PR department get better here? Or are there simply more nasty monsters among those supposedly fighting for human rights and against racism? Or do we just hear about this sort of thing more often? What do you think?

September 29, 2016 06:00 PM

Great, after Brexit the xenophobia in England is swinging ...

Great, after Brexit the xenophobia in England is swinging seamlessly over to the EU member states. Take a look here.

The parasite children of rich EU oligarchs are quickly enrolling at England's internationally admired super-universities before Brexit, and stealing the beds from our poor children who worked so hard to earn them!1!!

September 29, 2016 06:00 PM


Do undecidable languages exist in constructivist logic?

Constructivist logic is a system which removes the Law of the Excluded Middle, as well as Double Negation, as axioms. It's described on Wikipedia here and here. In particular, the system doesn't allow for proof by contradiction.

I'm wondering, is anyone familiar with how this affects results regarding Turing Machines and formal languages? I notice that almost every proof that a language is undecidable relies on proof by contradiction. Both the Diagonalization argument and the concept of a reduction work this way. Can there ever be a "constructive" proof of the existence of an undecidable language, and if so, what would it look like?

EDIT: To be clear, my understanding of proof by contradiction in constructivist logic was wrong, and the answers have clarified this.

by jmite at September 29, 2016 06:00 PM

Master Theorem: How to find the value of b in this recurrence relation

The master theorem is used with recurrences of the form T(n) = aT(n/b) + f(n), where a >= 1 and b > 1. In that case the value of b can easily be read off the recurrence. However, I have a recurrence of the form

T(n) = T((n/4)+3) + f(n)

How do I get the value of b in this case?

This question, "Particularly Tricky Recurrence Relation (Master's Theorem)", is the only thing I found that has a similar case, with T(n/4 + 1), but it gives no detail about how b was calculated.
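One way to see what b should be, offered as a sketch rather than a full proof: the map n -> n/4 + 3 has the fixed point n = 4, and substituting n = m + 4 gives n/4 + 3 = m/4 + 4. With S(m) = T(m + 4) (a helper name introduced here, not in the question) the recurrence becomes S(m) = S(m/4) + f(m + 4), which suggests b = 4. The Python snippet below checks this numerically by counting how often the map is applied before n gets close to the fixed point. The function name `depth` and the threshold of 5 are assumptions made for the illustration.

```python
import math

def depth(n, threshold=5.0):
    """Count applications of n -> n/4 + 3 until n drops to the threshold.

    The threshold must be above the fixed point 4, otherwise the loop never ends.
    """
    steps = 0
    while n > threshold:
        n = n / 4 + 3
        steps += 1
    return steps

for n in (10**3, 10**6, 10**9):
    # The step count tracks log base 4 of n, as it would for T(n/4).
    print(n, depth(n), round(math.log(n, 4), 2))
```

The additive constant only shifts where the recursion bottoms out; it does not change the logarithmic depth.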

by EvaD at September 29, 2016 05:52 PM

Planet Theory

Give a second order statement true in (R,+) but false in (Q,+) or show there isn't one

Here is a logic question I will ask today and answer next week. Feel free to leave comments with the answer. You may come up with a different proof than mine, and that would be great!

Our language will have the usual logic symbols, quantification over the domain, quantification over subsets of the domain (so second order), the = sign, and the symbol +

Examples of sentences:

(forall x)(forall y)[ x+y=y+x]

true over Q,R,N.  False in S_n for n\ge 4 (group of perms of n elements)

(exists x)(forall y)[ x+y=y]

true in Q, R by taking 0. not true in {1,2,3,...}

Let's assume it is true and call that x 0

(forall x)(exists y)[x+y=0]

True in Q, R, Z, not true in N.

QUESTION ONE: Is there any sentence in the first order theory that is TRUE over (Q,+) but
FALSE over (R,+)?

QUESTION TWO: Is there any sentence in the second order theory that is TRUE over (Q,+)
but false over (R,+)?

by GASARCH at September 29, 2016 05:47 PM


Best Machine Learning algorithm for Product-Event mapping for impact analysis

I am new to Machine Learning. I would appreciate expert help with my business problem.

The business scenario is to map Events to Products in order to calculate the impact of Event(s) on the sales of a Product.

| Event Name             | Feature1  | Feature2 | Feature3 | Feature4 |
| Longhorns football     | Sports    | Football | Austin   | 78712    |
| Sahaja Yoga Meditation | Spiritual | Yoga     | Austin   | 78757    |
| Texas Robot RoundUp    | Education | Science  | Austin   | 78701    |

| Product Name | Store-Zip | Feature1  |
| Diet Coke    | 78712     | Beverages |
| Yoga Mat     | 78712     | Yoga      |
| Zero Coke    | 78757     | Beverages |

From the above list of Events and list of Products, assume below is the impact matrix based on the Product type and Location of the store.

| Product/Event   | Longhorns football | Sahaja Yoga Meditation | Texas Robot RoundUp |
| Diet Coke-78712 | Yes                | No                     | No                  |
| Yoga Mat-78712  | No                 | Yes                    | No                  |
| Zero Coke-78757 | No                 | No                     | No                  |

What is the best Machine Learning algorithm to solve this problem?

Please help me design the algorithm, and point out anything else I may have missed.

by Kishore at September 29, 2016 05:46 PM


Looking for open swaption implied vol data

Anyone have a good place to find interest rate swaption implied volatility data? Does Bloomberg's python API allow access?

by VolSurfing at September 29, 2016 05:36 PM


Why does this Pumping Lemma example show irregularity?

The first 4 steps below are my own work. The step after them is from my book, and I don't understand what it is saying; I have not found any resources that explain it. Could someone clarify this?

Show C = {w|w has an equal number of 0s and 1s} is not regular. Show by contradiction.

1) Assume C is regular, and thus the Pumping Lemma conditions hold.

2) Assume a pumping length of p.

3) Let the string s = $0^p$$1^p$.

4) s = xyz and let x and z be the empty string. Therefore, for i > 0, x$y^i$z always has an equal number of 0s and 1s, so it seems like it can be pumped.

5) But since condition 3 of the pumping lemma says that |xy| <= p, then y must consist of only 0s.

How does step 5 justify y consisting of only zeros? If we said earlier that x and z are the empty strings, then doesn't that imply that s = y = $0^p$ $1^p$?

How does this make y only zeros?
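A sketch that may help with step 5, under the usual statement of the pumping lemma: the split s = xyz is not chosen by the person writing the proof. The lemma only promises that some split with |y| >= 1 and |xy| <= p can be pumped, so the proof has to rule out every such split. The split with x and z empty has |xy| = 2p > p, so it is not among the candidates. The Python snippet below enumerates all candidate splits for a small p (the value 4 and the helper names are assumptions for illustration) and checks that y is always all zeros and that pumping once always breaks the balance.

```python
def splits(s, p):
    """Yield every (x, y, z) with s == x + y + z, |y| >= 1 and |xy| <= p."""
    for i in range(0, p):                 # i = |x|
        for j in range(i + 1, p + 1):     # j = |xy|
            yield s[:i], s[i:j], s[j:]

def balanced(w):
    """True if w has as many 0s as 1s."""
    return w.count("0") == w.count("1")

p = 4
s = "0" * p + "1" * p

# Because |xy| <= p, y lies inside the leading block of p zeros.
all_y_zero = all(set(y) == {"0"} for _, y, _ in splits(s, p))

# Pumping y once more (i = 2) adds zeros only, so no split stays balanced.
some_split_survives = any(balanced(x + y * 2 + z) for x, y, z in splits(s, p))

print(all_y_zero, some_split_survives)
```

Since no allowed split survives pumping, the assumption that C is regular fails.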


by Jordan Andy at September 29, 2016 05:30 PM


Welcome to the Newest AWS Community Heroes (Fall 2016)

I would like to extend a very warm welcome to the newest AWS Community Heroes:

  • Cyrus Wong
  • Paul Duvall
  • Vit Niennattrakul
  • Habeeb Rahman
  • Francisco Edilton
  • Jeevan Dongre

The Heroes share their knowledge and demonstrate their enthusiasm for AWS via social media, blog posts, user groups, and workshops. Let’s take a look at their bios to learn more.

Cyrus Wong
Based in Hong Kong, Cyrus is a Data Scientist in the IT Department of the Hong Kong Institute of Vocational Education. He actively promotes the use of AWS at live events and via social media, and has received multiple awards for his AWS-powered Data Science and Machine Learning Projects.

Cyrus provides professional AWS training to students in Hong Kong, with an eye toward certification. One of his most popular blog posts is How to get all AWS Certifications in Asia, where he recommends watching the entire set of re:Invent videos at a 2.0 to 2.5x speedup!

You can connect with Cyrus on LinkedIn or at a meeting of the AWS Hong Kong User Group.

Paul Duvall
As co-founder and CTO of Stelligent (an AWS Advanced Consulting Partner), Paul has been using AWS to implement Continuous Delivery Systems since 2009.

Based in Northern Virginia, he’s an AWS Certified SysOps Administrator and an AWS Certified Solutions Architect, and has been designing, implementing, and managing software and systems for over 20 years. Paul has written over 30 articles on AWS, automation, and DevOps and is currently writing a book on Enterprise DevOps in AWS.

You can connect with Paul on LinkedIn, follow him on Twitter, or read his posts on the Stelligent Blog.

Vit Niennattrakul
Armed with a Ph.D. in time series data mining and passionate about machine learning, artificial intelligence, and natural language processing, Vit is a consummate entrepreneur who has already founded four companies including Dailitech, an AWS Consulting Partner. They focus on cloud migration and cloud-native applications, and have also created cloud-native solutions for their customers.

Shortly after starting to use AWS in 2013, Vit decided that it could help to drive innovation in Thailand. In order to make this happen, he founded the AWS User Group Thailand and has built it up to over 2,000 members.


Habeeb Rahman
Based in India, Habeeb is interested in cognitive science and leadership, and works on application delivery automation at Citrix. Before that, he helped to build AWS-powered SaaS infrastructure at Apigee, and held several engineering roles at Cable & Wireless.

After presenting at AWS community meetups and conferences, Habeeb helped to organize the AWS User Group in Bangalore and is actively pursuing his goal of making it the best user group in India for peer learning.

You can connect with Habeeb on LinkedIn or follow him on Twitter.

Francisco Edilton
As a self-described “full-time geek,” Francisco likes to study topics related to cloud computing, and is also interested in the stock market, travel, and food. He brings over 15 years of network security and Linux server experience to the table, and is currently deepening his knowledge of AWS by learning about serverless computing and data science.

Francisco works for TDSIS, a Brazilian company that specializes in cloud architecture, software development, and network security, and helps customers of all sizes to make the move to the cloud. On the AWS side, Francisco organizes regular AWS Meetups in São Paulo, Brazil, writes blog posts, and posts code to his GitHub repo.

Jeevan Dongre
As a DevOps Engineer based in India, Jeevan has built his career around application development, e-commerce, and product development. His passions include automation, cloud computing, and the management of large-scale web applications.

Back in 2011, Jeevan and several other like-minded people formed the Bengaluru AWS User Group in order to share and develop AWS knowledge and skills. The group is still going strong and Jeevan expects it to become the premier group for peer-to-peer learning.

You can connect with Jeevan on LinkedIn or follow him on Twitter.

Please join me in offering a warm welcome to our newest AWS Community Heroes!


by Jeff Barr at September 29, 2016 05:27 PM

First Annual Alexa Prize – $2.5 Million to Advance Conversational AI

Every evening we ask Alexa for the time of sunset, subtract 10 minutes to account for the Olympic Mountains on the horizon, and plan our walk accordingly!

My family and my friends love the Amazon Echo in our kitchen! In the past week we have asked for jokes, inquired about the time of the impending sunset, played music, and checked on the time for the next Seattle Seahawks game. Many of our guests already know how to make requests of Alexa. The others learn after hearing an example or two, and quickly take charge.

While Alexa is pretty cool as-is, we are highly confident that it can be a lot cooler. We want our customers to be able to hold lengthy, meaningful conversations with their Alexa-powered devices. Imagine the day when Alexa is as fluent as LCARS, the computer in Star Trek!

Alexa Prize
In order to advance conversational Artificial Intelligence (AI), I am happy to announce the first annual Alexa Prize. This is an annual university competition aimed at advancing the field of conversational AI, with Amazon investing up to 2.5 million dollars in the first year.

Teams of university students (each led by a faculty sponsor) can use the Alexa Skills Kit (ASK) to build a “socialbot” that is able to converse with people about popular topics and news events. Participants will have access to a corpus of digital content from multiple sources including the Washington Post, which has agreed to make their corpus available to the students for non-commercial use.

Millions of Alexa customers will initiate conversations with the socialbots on topics ranging from celebrity gossip and scientific breakthroughs to sports and technology (to name a few). After each conversation concludes, Alexa users will provide feedback that will help the students to improve their socialbot. This feedback will also help Amazon to select the socialbots that will advance to the final phase.

Apply Now
Teams have until October 28, 2016 to apply. Up to 10 teams will be sponsored by Amazon and will receive a $100,000 stipend, Alexa-enabled devices, free AWS services, and support from the Alexa team; other teams may also be invited to participate.

On November 14, we’ll announce the selected teams and the competition will begin.

In November 2017, the competition will conclude at AWS re:Invent. At that time, the team behind the best-performing socialbot will be awarded a $500,000 prize, with an additional $1,000,000 awarded to their university if their socialbot achieves the grand challenge of conversing coherently and engagingly with humans for 20 minutes.

To learn more, read the Alexa Prize Rules, read the Alexa Prize FAQ, and visit the Alexa Prize page. This contest is governed by the Alexa Prize Rules.


by Jeff Barr at September 29, 2016 05:22 PM


Old and busted: neo-Nazis. New hotness: Identitarians. ...

Old and busted: neo-Nazis.

New hotness: Identitarians. This is a great media literacy exercise. Even better, though, is this wild association here. Just read through their "About us". The hipster graphics present themselves as nicely foreigner-friendly and diverse, and then you click around a bit and find statements like:

that certain values and opinions are no longer tolerated across the board, but only stigmatized, for example as "reactionary", "Christian fundamentalist" or "homophobic".
Wow. White people who see themselves as a persecuted minority have finally arrived in Germany.

September 29, 2016 05:00 PM

Kachelmann's ex has clearly understood the situation and ...

Kachelmann's ex has clearly understood the situation and comments as follows:
She is said to have attacked the decision as a "catastrophic miscarriage of justice" and the judges as "pathetic, cowardly misogynists". She sees the cause of the "judicial scandal" in an "all-male panel of judges" that wants to "silence us [sic] women".
Congratulations, dear feminism! You've done a fine job there: even proven false accusers still feel like victims, because on the other side stood a white man. Victims of a "judicial scandal" and a conspiracy of the patriarchy. Not believing liars at face value is a sign of misogyny. That's how far we've come.

September 29, 2016 05:00 PM



How to get the rate of the regression?

I started learning machine learning recently, and I ran into a problem while reading the PRML book. It discusses the LMS algorithm and uses it to solve the regression problem: w_{i+1} = w_i + alpha * gradient. I don't know how to determine 'alpha'. So, how is it chosen?
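The role of alpha can be seen in a small experiment. The Python sketch below runs LMS with a fixed step size on synthetic, noiseless data; the data, the function name `lms`, and the three candidate values of alpha are all assumptions made for illustration, not something taken from the book. The update adds alpha times the per-sample error times the input, which is a step along the negative gradient of the squared error.

```python
import random

def lms(xs, ys, alpha, epochs=50):
    """Least-mean-squares (stochastic gradient) fit of y ~ w0 + w1 * x."""
    w0, w1 = 0.0, 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            err = y - (w0 + w1 * x)   # prediction error on one sample
            w0 += alpha * err         # step along the negative gradient of err^2 / 2
            w1 += alpha * err * x
    return w0, w1

random.seed(0)
xs = [random.uniform(-1, 1) for _ in range(200)]
ys = [2.0 + 3.0 * x for x in xs]      # true weights are (2, 3), no noise

for alpha in (0.001, 0.01, 0.1):
    print(alpha, lms(xs, ys, alpha))
```

In practice alpha is usually picked by trying a few values on a log scale like this and keeping the largest one for which the error still decreases steadily. A value that is too small converges slowly, and one that is too large makes the weights oscillate or diverge.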

by Innocent at September 29, 2016 04:55 PM


Translating matrix expression of Lagrangian into solve.QP() parameters (R)

I have no idea how to do this. I can set up the Lagrangian, but I don't know how to translate it into solve.QP() inputs.

The inputs are Dmat, dvec, Amat, bvec, meq.

I don't know LaTeX so excuse my notation here. My problem is a global minimum variance portfolio optimization subject to the constraint x1+x2+x3 = 1, as well as 0 < x1 < w1, etc. Essentially, the weights must sum to one, and each weight has a min/max condition.

So the matrix notation is (where Sigma is the covariance matrix of 3 assets)

[2*Sigma  [x1  = [0
 2*Sigma   x2  =  0
 2*Sigma   x3  =  0
 1  1  1   L1  =  1
-1  0  0   L2  =  0
 0 -1  0   L3  =  0
 0  0 -1   L4  =  0
 1  0  0   L5  =  w1
 0  1  0   L6  =  w2
 0  0  1]  L7] =  w3]

I know meq=1, as there is only one equality constraint. How do I translate the rest of the matrices into inputs for solve.QP()?
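A sketch of how the pieces might line up, assuming the quadprog convention that solve.QP minimizes -dvec'x + (1/2) x' Dmat x subject to t(Amat) x >= bvec, with the first meq constraints treated as equalities. The snippet is in Python only to show the shapes; the covariance values, the bounds w, and the helper `feasible` are hypothetical. Note that the solver handles non-strict bounds (>=), so the strict inequalities in the question become 0 <= x_i <= w_i.

```python
# Hypothetical inputs: a 3x3 covariance matrix and upper bounds w1..w3.
Sigma = [[0.04, 0.01, 0.00],
         [0.01, 0.09, 0.02],
         [0.00, 0.02, 0.16]]
w = [0.6, 0.5, 0.4]
n = 3

Dmat = [[2 * Sigma[i][j] for j in range(n)] for i in range(n)]   # objective: x' Sigma x
dvec = [0.0] * n                                                 # no linear term

# Each row is one constraint a . x >= b; the equality comes first.
rows = [[1.0] * n]                                                        # sum(x) = 1
rows += [[1.0 if j == i else 0.0 for j in range(n)] for i in range(n)]    # x_i >= 0
rows += [[-1.0 if j == i else 0.0 for j in range(n)] for i in range(n)]   # -x_i >= -w_i
bvec = [1.0] + [0.0] * n + [-wi for wi in w]
meq = 1

# solve.QP expects the constraints as COLUMNS of Amat, so transpose: 3 x 7.
Amat = [[rows[k][i] for k in range(len(rows))] for i in range(n)]

def feasible(x, tol=1e-9):
    """Check t(Amat) x against bvec, treating the first meq rows as equalities."""
    vals = [sum(Amat[i][k] * x[i] for i in range(n)) for k in range(len(bvec))]
    eq_ok = all(abs(vals[k] - bvec[k]) <= tol for k in range(meq))
    ineq_ok = all(vals[k] >= bvec[k] - tol for k in range(meq, len(bvec)))
    return eq_ok and ineq_ok

print(feasible([0.5, 0.3, 0.2]), feasible([0.7, 0.2, 0.1]))
```

With matrices of this shape built in R, the call would be solve.QP(Dmat, dvec, Amat, bvec, meq = 1). The Lagrange multipliers L1..L7 from the question are not passed in; the solver returns them.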

by milkmotel at September 29, 2016 04:46 PM


Time complexity of a compiler

I am interested in the time complexity of a compiler. Clearly this is a very complicated question, as there are many compilers, compiler options and variables to consider. Specifically, I am interested in LLVM, but I would be interested in any thoughts people have or places to start research. A quick Google search brings little to light.

My guess would be that there are some optimisation steps which are exponential, but which have little impact on the actual time. E.g., exponential in the number of arguments of a function.

Off the top of my head, I would say that generating the AST would be linear. IR generation would require stepping through the tree while looking up values in ever-growing tables, so $O(n^2)$ or $O(n\log n)$. Code generation and linking would be a similar type of operation. Therefore, my guess would be $O(n^2)$, if we removed exponentials of variables which do not realistically grow.

I could be completely wrong though. Does anyone have any thoughts on it?

by superbriggs at September 29, 2016 04:23 PM



Solve Travelling Salesman once you know the distance of the shortest possible route

I am trying to solve the TSP (Travelling Salesman Problem), but not in a traditional way. I am following these steps.

1) First I change the TSP to a true / false problem.

The definition of this problem is now: "Is there a route through all the cities, visiting each one only once, with a total distance less than or equal to k?" Let's assume I have an algorithm TSP_tf(k) to solve it.

2) Then I search for the minimum k.

That is, I look for the distance of the shortest possible route.

An efficient way to do this is a dichotomic (binary) search. I begin with k=1 and call TSP_tf(k). If it returns false, I multiply k by 2, and keep calling TSP_tf until true is returned. When this happens, I search for the minimum k that returns true in the interval (k/2, k], also with a dichotomic search.

I then get the minimum distance min_k.

3) Return the shortest route of the TSP knowing its distance is min_k.

And here is where my question comes in. What would be an efficient algorithm for this step? By efficient I mean a good approach :) Obviously TSP will remain NP-hard.
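One standard approach for step 3 is self-reduction: delete one edge at a time and ask the decision oracle whether a tour of length min_k still exists; if not, the edge is needed and is put back. The Python sketch below illustrates this under several assumptions: distances are symmetric positive integers, the oracle `tsp_tf` is a brute-force stand-in (any real decision procedure could replace it), and the 4-city distance matrix is made up for the example.

```python
from itertools import permutations

INF = float("inf")

def tsp_tf(dist, k):
    """Decision oracle (brute force here): is there a tour of total length <= k?"""
    n = len(dist)
    for perm in permutations(range(1, n)):
        tour = (0,) + perm + (0,)
        if sum(dist[a][b] for a, b in zip(tour, tour[1:])) <= k:
            return True
    return False

def min_tour_length(dist):
    """Doubling followed by binary search on k, as in step 2."""
    hi = 1
    while not tsp_tf(dist, hi):
        hi *= 2
    lo = hi // 2                     # no tour of length <= lo, but one of length <= hi
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if tsp_tf(dist, mid):
            hi = mid
        else:
            lo = mid
    return hi

def recover_tour_edges(dist, min_k):
    """Delete each edge in turn; restore it only if no tour of length min_k survives."""
    n = len(dist)
    d = [row[:] for row in dist]
    for i in range(n):
        for j in range(i + 1, n):
            saved = d[i][j]
            d[i][j] = d[j][i] = INF
            if not tsp_tf(d, min_k):
                d[i][j] = d[j][i] = saved     # edge is needed, put it back
    return [(i, j) for i in range(n) for j in range(i + 1, n) if d[i][j] < INF]

dist = [[0, 2, 9, 10],
        [2, 0, 6, 4],
        [9, 6, 0, 3],
        [10, 4, 3, 0]]
min_k = min_tour_length(dist)
print(min_k, recover_tour_edges(dist, min_k))
```

The edges that survive form one optimal tour. This costs one oracle call per edge, so O(n^2) calls on top of the logarithmic number used by the search for min_k.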

by Santi Gil at September 29, 2016 04:02 PM


You will surely be just as surprised and shocked ...

You will surely be just as surprised and shocked as I am: AP has investigated how often police officers actually misuse police databases:
Police officers across the country misuse confidential law enforcement databases to get information on romantic partners, business associates, neighbors, journalists and others for reasons that have nothing to do with daily police work, an Associated Press investigation has found.
Well, NOBODY could have seen THAT coming! And those are just the cases in which the officers got caught.

September 29, 2016 04:01 PM

The Springer press thinks the Todenhöfer interview ...

The Springer press thinks the Todenhöfer interview is fake. It's about this one here.

Ah yes, the Springer press, the last bastion of investigative journalism in Germany. Known for reports such as, … uh … *flips pages* … um … *rustle* Oh look, is that the time *runs away*

I suppose they didn't want the "Focus" to outdo them when it comes to rectal feng shui for the intelligence services.

On the other hand, they make claims there that, to me as a layman, sound as though they ought to be checked on their merits, for example:

The terrorist group concerned, Fatah al-Scham, as the Nusra Front has long since called itself, described the interview as a "lie". They say they do not know the commander the German journalist claims to have spoken with, nor anyone else who would have done so.
That raises the question for me: if the Springer press has contacts with terrorist groups in Syria, why don't they publish an interview with them?
It is also striking that Abu al-Ezz neither has the religiously tinged diction of the Islamists nor uses the stock phrases otherwise common among the rebels. He also has trouble telling the Islamist factions apart. "That is never a man from Fatah al-Scham," say Syrian revolutionary activists. The video is a forgery, they say. For them there is no doubt about it.
Syrian … what? Syrian revolutionary activists? The Springer press utters THAT and doesn't even put scare quotes around it?! Oh dear.

Todenhöfer has responded to the accusations:

Todenhöfer denies the forgery accusations. He says he researched the identity of Abu al-Ezz and knows "practically everything about this man." He is a "simple, not high-ranking commander" and also "no Salafist", but a "mercenary" who serves with the Nusra Front only because of the better pay.
So far 1:0 for Todenhöfer, I'd say?

Update: However, just because Springer enjoys no credibility doesn't mean they couldn't happen to be right anyway. Here, for example, is a blogger who raises similar accusations.

September 29, 2016 04:01 PM


Daniel Lemire

Can Swift code call C code without overhead?

Swift is the latest hot new language from Apple. It is becoming the standard programming language on Apple systems.

I complained in a previous post that Swift 3.0 has only about half of Java’s speed in tests that I care about. That’s not great for high-performance programming.

But we do have a language that produces very fast code: the C language.

Many languages like Objective-C, C++, Python and Go allow you to call C code with relative ease. C++ and Objective-C can call C code with no overhead. Go makes it very easy, but the performance overhead is huge. So it is almost never a good idea to call C from Go for performance. Python also suffers from a significant overhead when calling C code, but since native Python is not so fast, it is often a practical idea to rewrite performance-sensitive code in C and call it from Python. Java makes it hard to call C code, so it is usually not even considered.

What about Swift? We know, as per Apple’s requirements, that Swift must interact constantly with legacy Objective-C code. So we know that it must be good. How good is it?

To put it to the test, I decided to call from Swift a simple Fibonacci recurrence function:

void fibo(int * x, int * y) {
  int c = * y;
  *y = *x + *y;
  *x = c;
}

(Note: this function can overflow and that is undefined behavior in C.)

How does it fare against pure Swift code?

let c = j;
j = i &+ j;
i = c;

To be clear, this is a really extreme case. You should never rewrite such a tiny piece of code in C for performance. I am intentionally pushing the limits.

I wrote a test that calls these functions 3.2 billion times. The pure Swift takes 9.6 seconds on a Haswell processor… or about 3 nanoseconds per call. The C function takes a bit over 13 seconds or about 4 nanoseconds per iteration. Ok. But what if I rewrote the whole thing into one C function, called only once? Then it runs in 11 seconds (it is slower than pure Swift code).

The numbers I have suggest that calling C from Swift is effectively free.

In these tests, I do not pass any optimization flag to Swift. The way you build a Swift program is by typing “swift build”, which is nice and elegant. To optimize the binary, you can type “swift build --configuration release”. Nice! But benchmark code is part of your tests. Sadly, Swift seems to insist on only testing “debug” code for some reason. Typing “swift test --configuration release” fails since the test option does not have a configuration flag. (Calling swift test -Xswiftc -O gives me linking errors.)

I rewrote the code using a pure C program, without any Swift. Sure enough, the program runs in about 11 seconds without any optimization flag. This confirms my theory that Swift is testing the code with all optimizations turned off. What if I turn on all C optimizations? Then I go down to 1.7 seconds (or about half a nanosecond per iteration).

So while calling C from Swift is very cheap, ensuring that Swift properly optimizes the code might be trickier.

It seems odd that, by default, Swift runs benchmarks in debug mode. It is not helping programmers who care about performance.

Anyhow, a good way around this problem is to simply build binaries in release mode and measure how long it takes them to run. It is crude, but it gets the job done in this case:

$ swift build --configuration release
$ time ./.build/release/LittleSwiftTest

real       0m2.030s
user       0m2.028s
sys        0m0.000s
$ time ./.build/release/LittleCOverheadTest

real       0m1.778s
user       0m1.776s
sys        0m0.000s

$ clang -Ofast -o purec  code/purec.c
$ time ./purec

real       0m1.747s
user       0m1.744s
sys        0m0.000s

So there is no difference between a straight C program and a Swift program that calls a C function billions of times. They are both just as fast.

The pure Swift program is slightly slower in this case, however. It suggests that using C for performance-sensitive code could be beneficial in a Swift project.

So I have solid evidence that calling C functions from Swift is very cheap. That is very good news. It means that if for whatever reason, Swift is not fast enough for your needs, you stand a very good chance of being able to rely on C instead.

My Swift source code is available (works under Linux and Mac).

Credit: Thanks to Stephen Canon for helping me realize that I could lower the call overhead by calling the C function directly instead of wrapping it first in a tiny Swift function.

by Daniel Lemire at September 29, 2016 03:56 PM


How to create a string interpreter in python that takes propositions and return an answer based on a given dictionary keys?

So after 2 days of struggling with this problem I gave up. You are given two inputs: the first one is a list that contains propositions and the second one is a dictionary.


arg= [<prop1>, "OPERATOR", <prop2>]

dicti= {<prop1>: key1, <prop2394>: key2394,<prop2>:key2}

the following is a possible input:

    arg= [<prop1>, "OPERATOR (AND OR )",
[ "NOT" ,["NOT",<prop2>,"OPERATOR"[<prop2>, "OPERATOR", <prop3>]]]

I am betting that the problem won't be solved without using double recursion. This is my attempt to solve the problem. I started with the base case that the input is a "flat" list, meaning a 1D list that has no lists as elements. The program should not return boolean values but the strings true or false, as given in the dictionary.

def interpret(arg, keys ):
    if not arg :
        return "false"
    elif not arg and not keys:
        return "false"
    elif not keys and isinstance(arg,list):
        return "true"
    elif isinstance(arg[0],list):
        return interperter(arg[0:],keys)
        for i in range(len(arg)):
            if arg[i] in keys and keys[arg[i]]=="true":
                if isinstance (arg[i], list):
                    if("NOT" in arg):
                        indx= arg.index("NOT")
                        keys[arg[indx+1]]= "true" if keys[arg[indx+1]]=="true" else "false"
        if trueCnr==len(arg)-1 and "AND" in arg: return "true"
        elif trueCnr!= 0 and "OR" in arg: return "true"
        else: return "false"

print (interpret(["door_open", "AND", ["NOT","cat_gone"]], {"door_open" : "false", "cat_gone" : "true", "cat_asleep" : "true"} ))

My question is how do i proceed from here.
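One possible way forward, sketched under assumptions the question does not spell out: a bare string is a proposition name looked up in the dictionary, a list starting with "NOT" negates its single operand, and any other list alternates operands and "AND"/"OR" operators, evaluated left to right with no precedence. The helper name `_eval` is introduced here. A single recursive function is enough, because every nested list is just another expression.

```python
def interpret(expr, values):
    """Evaluate a nested proposition list; returns the string "true" or "false"."""
    return "true" if _eval(expr, values) else "false"

def _eval(expr, values):
    if isinstance(expr, str):                 # a bare proposition name
        return values[expr] == "true"
    if expr[0] == "NOT":                      # ["NOT", operand]
        return not _eval(expr[1], values)
    result = _eval(expr[0], values)           # [operand, op, operand, op, ...]
    for i in range(1, len(expr) - 1, 2):
        rhs = _eval(expr[i + 1], values)
        if expr[i] == "AND":
            result = result and rhs
        elif expr[i] == "OR":
            result = result or rhs
        else:
            raise ValueError("unknown operator: %r" % (expr[i],))
    return result

facts = {"door_open": "false", "cat_gone": "true", "cat_asleep": "true"}
print(interpret(["door_open", "AND", ["NOT", "cat_gone"]], facts))
print(interpret(["cat_asleep", "OR", ["NOT", "cat_gone"]], facts))
```

Keeping the boolean logic in `_eval` and converting to strings only at the outer call avoids comparing against "true" and "false" throughout the recursion.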

by Leo wahyd at September 29, 2016 03:45 PM

Writing a functor in C

I want to make something like this in C:

typedef int (*func)(int);

func make_adder(int a) {
  int add(int x) {
    return a + x;
  }
  return &add;
}

int main() {
  func add_42 = make_adder(42);
  // add_42(10) == 52
}

But this doesn't work. Is it doable? Where is my mistake?

by andrepd at September 29, 2016 03:45 PM


What are system clock and CPU clock; and what are their functions?

While reading a book, I came across a paragraph given below:

In order to synchronize all of a computer’s operations, a system clock—a small quartz crystal located on the motherboard—is used. The system clock sends out a signal on a regular basis to all other computer components.

And another paragraph:

Many personal computers today have system clocks that run at 200 MHz, and all devices (such as CPUs) that are synchronized with these system clocks run at either the system clock speed or at a multiple of or a fraction of the system clock speed.

Can anyone kindly tell:

  • What is the function of the system clock? And what is meant by “synchronize” in the first paragraph?
  • Is there any difference between “system clock” and “CPU clock”? If yes, then what is the function of the CPU clock?

by swdeveloper at September 29, 2016 03:43 PM

Planet Emacsen

Irreal: Reevaluating Local Variables

An odd but very useful feature of Emacs is (file) local variables. This allows you to specify certain Emacs variables either on the first line of a file or at the end in a special Local Variables block. Typical uses include specifying line lengths, indentation amounts, and other formatting features and specifying how to compile the file so that the compile command will work correctly.

I described it as odd because I first encountered the Local Variables block before I became an Emacs user and I thought it was very odd to see that sort of thing in a C source file. Regardless, local variables can be very useful.

One problem with them is what happens if you change a value or when you first add them to a file. How do you get Emacs to recognize the new value? I always solved this problem by reloading the file with Ctrl+x Ctrl+v (find-alternate-file) but there are other, better ways.

Grant Rettke over at Wisdom and Wonder points to two methods for reevaluating local variables. One method re-runs the hooks associated with the file and the other method doesn't. Hop on over and take a look; it will take you less than a minute.

by jcs at September 29, 2016 03:37 PM


Augmented Dickey-Fuller Questions

I've been searching the literature about this test as applied to an AR(p) model: $$Q(L)(Y_{t})=c+\epsilon_{t}$$

where $L$ represents the lag operator and $Q(x)=1-\phi_{1}x-\dots-\phi_{p}x^{p}$ is the polynomial associated with the model.

I know that if $Q(r)=0$ implies $|r|>1$, then the process is stationary (at least in the weak sense).

My question is: why is the null hypothesis of the Augmented Dickey-Fuller test stated as "$r=1$ is a root of the polynomial"? Does rejecting that hypothesis imply that every single root of $Q$ lies outside the unit circle?

I'm new to this area, so every recommendation or suggestion will be useful. Thanks.
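A short derivation may help here. It assumes the usual ADF reparametrization; the symbols $\gamma$ and $\delta_j$ are introduced for this sketch and do not appear in the question. Subtracting $Y_{t-1}$ from both sides of the AR(p) equation and regrouping the lags into differences gives

$$\Delta Y_{t} = c + \gamma Y_{t-1} + \sum_{j=1}^{p-1}\delta_{j}\,\Delta Y_{t-j} + \epsilon_{t}, \qquad \gamma = -(1-\phi_{1}-\dots-\phi_{p}) = -Q(1), \qquad \delta_{j} = -\sum_{i=j+1}^{p}\phi_{i}.$$

For $p=2$ this can be checked by hand: $\Delta Y_{t} = c + (\phi_{1}+\phi_{2}-1)Y_{t-1} - \phi_{2}\,\Delta Y_{t-1} + \epsilon_{t}$.

So $\gamma = 0$ exactly when $Q(1)=0$, which is why the null is phrased as "1 is a root", and the test is a one-sided test of $\gamma = 0$ against $\gamma < 0$. Rejecting the null on its own only says that 1 is not a root. That all remaining roots lie outside the unit circle is, as far as I understand the standard setup, a maintained assumption of the test (at most one unit root) rather than something the rejection proves.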

by Ivan Rey at September 29, 2016 03:28 PM


Rust as a language for high performance GC implementation

An Immix implementation in Rust that performs just as well as the C implementation, and is still able to take advantage of Rust’s safety guarantees in much of the code despite GCs generally being very low level and chasing random pointers.

Edit: If you’re having trouble with the PDF from the main link, this one might treat you better:


by fitzgen at September 29, 2016 03:16 PM


Is it feasible to implement a Clean backend with LLVM

Would it be feasible to implement a backend for Clean using the LLVM toolkit? If not, what are the stumbling blocks?

Also, if you happen to know of a good reference for the "ABC assembler" used as an IR by the Clean compiler, please include it in your answer. Thanks.

by brooks94 at September 29, 2016 02:55 PM

Call-by-value and by-name equivalence

I'm working through a Coursera course on functional programming, and at some point they discuss the difference between the call-by-value and call-by-name evaluation techniques. There's a point that confuses me. They say:

Both techniques reduce to the same final values as long as:

  1. the reduced expressions consist of pure functions and
  2. both evaluations terminate

which seems to be a lambda calculus theorem.

Could you explain me what they mean by "the reduced expressions conssist of pure functions"?
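
To make the "pure functions" condition concrete, here is a minimal sketch in Python (not the course's language). Call-by-name is simulated by passing zero-argument functions (thunks); the names `cbv_first`, `cbn_first` and `noisy` are made up for illustration.

```python
def cbv_first(x, y):
    """Call-by-value: both arguments are already evaluated on entry."""
    return x


def cbn_first(x, y):
    """Call-by-name: arguments are thunks, evaluated only when used."""
    return x()


log = []


def noisy(value):
    """An impure expression: evaluating it appends to `log`."""
    log.append(value)
    return value


# Pure arguments: both strategies give the same result.
pure_cbv = cbv_first(1 + 2, 3 * 4)
pure_cbn = cbn_first(lambda: 1 + 2, lambda: 3 * 4)

# Impure arguments: the returned value is the same, but the observable
# side effects differ, because call-by-name never evaluates the unused y.
cbv_result = cbv_first(noisy("a"), noisy("b"))
cbv_log = list(log)
log.clear()
cbn_result = cbn_first(lambda: noisy("a"), lambda: noisy("b"))
cbn_log = list(log)
```

With pure expressions the only thing that can be observed is the final value, so the evaluation order does not matter. Once an expression has side effects, how often and whether it is evaluated becomes visible, as `cbv_log` and `cbn_log` show.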

by Rodrigo at September 29, 2016 02:50 PM


Linguistic relativity (Sapir–Whorf hypothesis)

Linguistic relativity, also known as the Sapir–Whorf hypothesis or Whorfianism, is a concept-paradigm in linguistics and cognitive science that holds that the structure of a language affects its speakers' cognition or world view. It used to have a strong version that claims that language determines thought and that linguistic categories limit and determine cognitive categories. The more accepted weak version claims that linguistic categories and usage only influence thoughts and decisions.

This was mentioned in this talk


by kghose at September 29, 2016 02:40 PM


How to use a previously saved model in R on new data [on hold]

My data is in .txt format. I have built the model using k-NN. When I apply the previously saved model to the new dataset, I get an "error in Dim" message.

 knn.pred_n <-knn([train.idx, ], tdm.stack.nl_n[test.idx_n, ], tdm.cand[train.idx])

by rohit kumar at September 29, 2016 02:25 PM


Volar Higher Order Parametrizations

I came across this presentation from The authors show fitting examples for a flexible volatility smile parametrization in 5 to 8 parameters which is also able to fit the locally concave market implied volatility smiles around special events.

Does anybody know the details of their parametrization and can you provide a reference? In particular, is it a simple extension of their C3 parametrization where the Cn curve is given by

\begin{equation} \sigma^2(z) = \sigma_0^2 \left( 1 + \sum_{i = 1}^{n - 1} \frac{1}{n!} \xi_i z^n \right) \end{equation}


\begin{equation} z = \frac{\ln(K / F)}{\sigma_0 \sqrt{T}}. \end{equation}

I suppose this is not the case and there is more to it as their examples look very stable on the wings which I would not expect from higher order polynomials.

by LocalVolatility at September 29, 2016 02:12 PM

How can I go about applying machine learning algorithms to stock markets?

I am not sure whether this question fits here.

I have recently begun reading and learning about machine learning. Can someone shed some light on how to go about it, or share their experience and a few basic pointers on how to at least start applying it to data sets and see some results? How ambitious does this sound?

Also, please mention standard algorithms that should be tried or looked at while doing this.

by zubinmehta at September 29, 2016 02:10 PM


Logistic Regression and Naive Bayes for this dataset

Can both Naive Bayes and logistic regression classify both of these datasets perfectly? My understanding is that Naive Bayes can, and that logistic regression with complex terms can classify these datasets. Please correct me if I am wrong.

[Image: the two datasets]

by pa1geek at September 29, 2016 01:56 PM

how to extract the decision rules from scikit-learn decision-tree?

Can I extract the underlying decision rules (or "decision paths") from a trained decision tree as a textual list?
Something like: "if A>0.4 then if B<0.2 then if C>0.8 then class='X'", etc.
If anyone knows of a simple way to do so, it would be very helpful.
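
One possible approach is to walk the tree's node arrays recursively and collect the conditions on the way to each leaf. The sketch below is plain Python on a hand-built toy tree, so it runs without scikit-learn; `tree_to_rules` and the toy data are hypothetical. The array layout mirrors what scikit-learn exposes on a fitted tree's `tree_` attribute (`children_left`, `children_right`, `feature`, `threshold`), where -1 marks a missing child and samples with feature <= threshold go left.

```python
TREE_LEAF = -1  # scikit-learn's marker for "no child"


def tree_to_rules(children_left, children_right, feature, threshold,
                  leaf_labels, feature_names):
    """Return one textual rule per leaf of a binary decision tree.

    All arrays are indexed by node id, with node 0 as the root.
    """
    rules = []

    def walk(node, conditions):
        if children_left[node] == TREE_LEAF:  # reached a leaf
            premise = " and ".join(conditions) or "always"
            rules.append("if %s then class='%s'" % (premise, leaf_labels[node]))
            return
        name = feature_names[feature[node]]
        t = threshold[node]
        walk(children_left[node], conditions + ["%s <= %.2f" % (name, t)])
        walk(children_right[node], conditions + ["%s > %.2f" % (name, t)])

    walk(0, [])
    return rules


# Hand-built toy tree (made-up data): node 0 splits on A, node 2 on B.
rules = tree_to_rules(
    children_left=[1, -1, 3, -1, -1],
    children_right=[2, -1, 4, -1, -1],
    feature=[0, -2, 1, -2, -2],
    threshold=[0.4, -2.0, 0.2, -2.0, -2.0],
    leaf_labels={1: "Y", 3: "X", 4: "Z"},
    feature_names=["A", "B"],
)
for rule in rules:
    print(rule)
```

With a fitted single-output `DecisionTreeClassifier` named `clf` (an assumption, not shown here), the arrays would come from `clf.tree_`, and a leaf's label could be taken as `clf.classes_[clf.tree_.value[node].argmax()]`. Newer scikit-learn releases also ship `sklearn.tree.export_text`, which prints a similar textual view directly.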

by Dror Hilman at September 29, 2016 01:49 PM

I don't understand how a parameter sweep is done in Azure Machine Learning

In Azure ML, if we select a training algorithm (for example "Two Class Logistic Regression"), we can then specify a set of parameters for a parameter sweep during training. But how can I find out how the parameter values are changed during training?

by Gayal Shamane at September 29, 2016 01:43 PM

Need help in preprocessing imdb dataset

We are doing our final-year project on movie success prediction.

The dataset is taken from IMDb.

All the files (.list) are larger than 700MB, but I don't know how to clean and preprocess the data or where to start.

Please advise:

  1. Which file format is best for merging the data after conversion, for analyzing the data?

  2. How should I clean and preprocess the datasets?

by Kasthuri Shravankumar at September 29, 2016 01:37 PM

Why do C++ STL function calls need to be so verbose?

Why can't calls to STL functions be more brief? I was looking at the following code snippet on

#include <string>
#include <cctype>
#include <algorithm>
#include <iostream>

int main()
{
    std::string s("hello");
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return std::toupper(c); });
    std::cout << s;
}

It seems to me that it should be possible to make this call more brief. The first obvious thing would be to do away with the lambda:

std::string s("hello");
std::transform(s.begin(), s.end(), s.begin(), std::toupper);
std::cout << s;

But this doesn't work. Since you usually want to convert a whole container it should be possible to just use that as a parameter:

std::string s("hello");
std::transform(s, s.begin(), std::toupper);
std::cout << s;

You could also omit the output iterator to return the result by value:

std::string s("hello");
std::cout << std::transform(s, std::toupper);

At this point the temporary variable is unnecessary:

std::cout << std::transform("hello"s, std::toupper);

With added readability:

using namespace std;
cout << transform("hello"s, toupper);

Wouldn't this be more readable and just nicer? Why weren't STL functions designed to allow writing brief code like this? Will it be possible to shorten these calls in a future version of the C++ standard?

by Xoralunga at September 29, 2016 01:17 PM

Lobsters has some serious dynamic UI going on when you adjust the width and font size

I usually use Command+ (or Ctrl+ on a PC) to make the font size larger, so when I resized my browser window to be very narrow, I was pleasantly met with a changed UI: I could clearly see the number of comments on a story! Nice work by the UI creator.

by Usermac at September 29, 2016 01:17 PM

CSS Tag Proposal

I’d like to propose adding a tag for CSS. We already have a javascript tag, it only makes sense to have a css tag as well.

I know this has been brought up before, but I think there are articles where the combination of web and design tags doesn’t quite cut it, for example:

by flyingfisch at September 29, 2016 01:15 PM


Is there an operator form of map in python?

In Mathematica it is possible to write Map[f,list] as f/@list where /@ is the operator form of Map. In python there is map(f, list) but is there a similar operator form or a package providing this?

The application is that deeply nested transformations using many maps end up with a lot of brackets whereas operator chaining can be simpler to read (and type).
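
As far as I know, Python has no built-in operator form of map, but operator overloading can approximate one. A minimal sketch, assuming Python 3.5+ for the `@` operator; the `Mapper` wrapper class is hypothetical, not an existing package:

```python
class Mapper:
    """Wrap a function so that `f @ iterable` maps it over the iterable."""

    def __init__(self, func):
        self.func = func

    def __call__(self, *args, **kwargs):
        # The wrapped function can still be called normally.
        return self.func(*args, **kwargs)

    def __matmul__(self, iterable):
        # `mapper @ xs` behaves like list(map(func, xs)).
        return [self.func(x) for x in iterable]


square = Mapper(lambda x: x * x)
increment = Mapper(lambda x: x + 1)

squares = square @ [1, 2, 3]
# `@` is left-associative, so nested maps need explicit parentheses.
chained = increment @ (square @ [1, 2, 3])
```

Because `@` associates to the left, `increment @ square @ xs` would try to map over `square` itself, so this only partly removes the brackets that motivated the question.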

by Ymareth at September 29, 2016 01:03 PM


Compared to the 22 million emails that the Bush junta ...

Compared to the 22 million emails that the Bush junta "lost" over the years, Hillary and her mail server look downright exemplary in their transparency.
Clinton’s email habits look positively transparent when compared with the subpoena-dodging, email-hiding, private-server-using George W. Bush administration. Between 2003 and 2009, the Bush White House “lost” 22 million emails. This correspondence included millions of emails written during the darkest period in America’s recent history, when the Bush administration was ginning up support for what turned out to be a disastrous war in Iraq with false claims that the country possessed weapons of mass destruction (WMD), and, later, when it was firing U.S. attorneys for political reasons.

September 29, 2016 01:00 PM


Scikit classification report - change the format of displayed results

Scikit classification report would show precision and recall scores with two digits only. Is it possible to make it display 4 digits after the dot, I mean instead of 0.67 to show 0.6783?

 from sklearn.metrics import classification_report
 print classification_report(testLabels, p, labels=list(set(testLabels)), target_names=['POSITIVE', 'NEGATIVE', 'NEUTRAL'])
                     precision    recall  f1-score   support

         POSITIVE       1.00      0.82      0.90     41887
         NEGATIVE       0.65      0.86      0.74     19989
         NEUTRAL        0.62      0.67      0.64     10578

Also, should I worry about a precision score of 1.00? Thanks!

by Crista23 at September 29, 2016 01:00 PM

Scikit-learn, get accuracy scores for each class

Is there a built-in way of getting accuracy scores for each class separately? I know in sklearn we can get overall accuracy by using metrics.accuracy_score. Is there a way to get the breakdown of accuracy scores for individual classes? Something similar to metrics.classification_report.

from sklearn.metrics import classification_report
from sklearn.metrics import accuracy_score

y_true = [0, 1, 2, 2, 2]
y_pred = [0, 0, 2, 2, 1]
target_names = ['class 0', 'class 1', 'class 2']

classification_report does not give accuracy scores:

print(classification_report(y_true, y_pred, target_names=target_names, digits=4))

Out[9]:         precision    recall  f1-score   support

class 0     0.5000    1.0000    0.6667         1
class 1     0.0000    0.0000    0.0000         1
class 2     1.0000    0.6667    0.8000         3

avg / total     0.7000    0.6000    0.6133         5

Accuracy score gives only the overall accuracy:

accuracy_score(y_true, y_pred)
Out[10]: 0.59999999999999998
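
One possible reading of "accuracy per class" is the fraction of samples of each true class that were predicted correctly, which coincides with the recall column of the report above. A minimal sketch in plain Python, using the question's own `y_true` and `y_pred`; `per_class_accuracy` is a hypothetical helper, not a scikit-learn function:

```python
from collections import Counter


def per_class_accuracy(y_true, y_pred):
    """Fraction of samples of each true class that were predicted correctly."""
    totals = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    # float() keeps the division exact under Python 2 as well.
    return {label: float(correct[label]) / totals[label]
            for label in sorted(totals)}


y_true = [0, 1, 2, 2, 2]
y_pred = [0, 0, 2, 2, 1]
scores = per_class_accuracy(y_true, y_pred)
print(scores)
```

Other definitions are possible, for example a one-vs-rest accuracy that also counts true negatives, so which number is wanted depends on the use case.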

by CentAu at September 29, 2016 12:49 PM



How can I use R to get confidence intervals in Azure ML?

I came across this question which asks if Azure ML can calculate confidence - or probabilities - for row data prediction. However, given that the answer to that question is No, and suggests to use R, I am trying to figure out how to use R to do exactly this for a regression model.

Does anyone have any suggestions for references on where to look for this?

My scenario is that I have used Azure ML to build a boosted decision tree regression model, which outputs a Scored Label column. But I don't know regression analysis well enough to write R code to use the outputted model to get confidence intervals.

I am looking for any references that can help me understand how to do this in R (in conjunction with Azure ML).

by Brett at September 29, 2016 12:31 PM


How do I construct a palindrome from a given string using only replacement operations?

I have an input string s.

Operation allowed:- replace any character any number of times.

Constraints:- The final string must be a palindrome and contain “hello” as its substring.

Output:- The minimum number of modifications required to make such a string.


Input:- thilloaoyreot

Output:- thelloaolleht and minimum number of replacements are 4.

Note:- The input string may or may not already contain "hello", or part of it, as a substring.

Can anyone help me out with this one, how should I go about solving this?
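
One way to approach this is to try every position where "hello" could be placed, mirror the forced characters to the other half, and count the replacements each placement needs. This is a brute-force sketch under the assumption that the length of the string stays fixed; `min_replacements` is a made-up name.

```python
def min_replacements(s, word="hello"):
    """Cheapest way to turn `s` into a palindrome containing `word`.

    Only replacements are allowed. Returns (cost, result), or None if no
    placement of `word` is compatible with the palindrome constraint.
    """
    n, m = len(s), len(word)
    best = None
    for start in range(n - m + 1):
        # Characters forced by placing `word` at `start`, plus their mirrors.
        forced = {}
        ok = True
        for k, ch in enumerate(word):
            for pos in (start + k, n - 1 - (start + k)):
                if forced.setdefault(pos, ch) != ch:
                    ok = False  # the word clashes with its own mirror image
        if not ok:
            continue
        out = list(s)
        for i in range((n + 1) // 2):
            j = n - 1 - i
            # A forced character wins; otherwise keep the left character.
            out[i] = out[j] = forced.get(i, s[i])
        cost = sum(a != b for a, b in zip(s, out))
        if best is None or cost < best[0]:
            best = (cost, "".join(out))
    return best


print(min_replacements("thilloaoyreot"))
```

For each placement the work is linear in the length of the string, so the whole search is quadratic, which should be fine for short inputs.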

by Sasa at September 29, 2016 12:28 PM

Why are some programming languages Turing complete but lack some abilities of other languages?

I came across an odd problem when writing an interpreter that should hook into external programs/functions: functions in C and C++ can't hook variadic functions. For example, I can't write a function that calls printf with exactly the same arguments it received; instead it has to call an alternate version that takes a variadic argument object. This is very problematic, since I want to be able to make an object that holds an anonymous hook.

I thought this was weird, since Forth, JavaScript, and perhaps a plethora of other languages can do this very easily without having to resort to assembly language or machine code. Since other languages can do this so easily, does that mean that the class of problems each programming language can solve actually varies by language, even though these languages are all Turing complete?

by Mr. Minty Fresh at September 29, 2016 12:28 PM


An oracle in $\mathsf{NEXP}$ that separates ZPP from BPP

Does there exist an oracle $A \in \mathsf{NEXP}$ such that $ \mathsf{ZPP}^A \neq \mathsf{BPP}^A$?

by Ilya Volkovich at September 29, 2016 12:21 PM


How can one run XGBoost on a Hadoop cluster for distributed model training?

I am trying to build a CTR prediction model using XGBoost on 100 million impressions for contextual ads. To do so, I want to try XGBoost on Hadoop, as all of the impression data is available in HDFS.

Can someone point to a working tutorial for this in Python?

by user2237536 at September 29, 2016 12:11 PM


Old and busted: sleeper cell. New hotness: propaganda cell. Police units ...

Old and busted: sleeper cell.

New hotness: propaganda cell.

In raids in Spain, Belgium and Germany, police units have broken up a propaganda cell of the terrorist organization "Islamic State" (IS) and arrested five suspected supporters of the extremist militia.

September 29, 2016 12:01 PM

HP puts out a remarkably weaselly (even for ...

HP puts out a remarkably weaselly (even by the standards of HP weasel statements) statement about its recent bricking of printer cartridges. Let me quote:
HP stated that it made changes to the firmware in a number of its printers in order to “protect the printers and to protect the communication between the cartridge and the printer.”

HP went on to state that its printers “reject non-HP cartridges in several cases,” and that these modifications are crucial to not only “protect innovation and intellectual property, but also to improve the safety of products for customers”

“When ink cartridges are cloned or counterfeited, the customer is exposed to quality and potential security risks, compromising the printing experience.”

See? It is for your own protection that HP is sabotaging your HP printers!1!!

I wonder what they mean by security risks? Maybe, as with Samsung, that you have to expect explosions?

And, with all due respect, I would rather have a "compromised printing experience" than none at all because the manufacturer of my printer is sabotaging me. (I don't own any HP printers.)

September 29, 2016 12:01 PM

The British government is putting its drone-killing logs ...

The British government is putting its drone-killing logs online. Slightly cleaned up, of course, so that they don't say "complete strangers whom we assume, on the basis of nonexistent or ultra-thin evidence, might possibly be terrorists". No, no! All combatants! (Soundtrack to go with it)

September 29, 2016 12:01 PM

Amnesty International says Sudan is alleged to have used ...

Amnesty International says Sudan is alleged to have used chemical weapons in Darfur. As soon as the Americans notice that there is oil in Sudan, they will come and restore order, bringing democracy and freedom and human rights!

September 29, 2016 12:01 PM


[VIDEO] Get the Most Out of ArtPrize Eight Using Our 5 Favorite App Features

ArtPrize, Grand Rapids’ growing international art competition that takes over the city every autumn, is back for its eighth year. Also returning is the new and improved ArtPrize app, developed by Atomic Object in partnership with the ArtPrize team (for both Android and iOS).

With so many ways to use the app to interact with the city and the artwork, I asked five insiders from the Atomic and ArtPrize teams to share their favorite features of this year’s app—from registering to voting to finding a bathroom when you need one. Here’s what they said, along with directions for how to use each feature.

Job Vranish, Atomic Object Software Developer

Favorite Feature: Downtown In-App Registration

“I think one of the most convenient features of the ArtPrize app is the ability to register for voting just on your phone, if you’re downtown.”

How to Use It

To register, make sure you’re in the vicinity of downtown Grand Rapids. After installing the app, go to the “Voting” screen, and press the register button. Follow the instructions on the screen, and the app will send you a text message with a code. You can enter that code in the app to finish registering to vote.

Janenell Woods, ArtPrize Communications Manager

Favorite Feature: Discover View

“My favorite part of the ArtPrize mobile app this year is the Discover View. It’s a really great resource for visitors whether they’re already downtown, or whether they’re planning their visit ahead of time to check in, see what’s the latest news, see what’s happening near them, and to have the greatest ArtPrize experience that they can.”

How to Use It

Once you sign into the app, the first page you’ll open up is the Discover View. A bunch of icons will appear at the top or bottom of the mobile app, depending on whether you’re using iOS or Android. You can scroll through the long list of events, happenings, and news tidbits about ArtPrize in these tiles. Once you see something you want to explore further, tap on it, and it’ll take you out to more information.

Chris Farber, Atomic Object Software Developer

Favorite Feature: Map of Venues

“My favorite way to use the app is to look at the map of venues so that I can see which entries are near me, and when I can see them.”

How to Use It

Once you sign into the app, click on the Venues icon. From there, you can see where you are. You can scroll through the list of venues to see their open hours and how many entries they have. You can even filter venues by tapping the word “Filter” to see which venues are currently open, which have public restrooms, or which have free public parking.

Jeff Wheeler, ArtPrize Technology Viceroy

Favorite Feature: Events

“I think my favorite feature is the Events portion of the app. Events are a critical part of the ArtPrize experience, and you can use it to see what’s going on today, tomorrow, or when you’re planning on attending.”

How to Use It

After downloading the app, click on the Events tile. Then select the day you’re interested in. The map will populate with events that are going on around you, and it will let you know when they’re scheduled. If you tap on an event, you can add it to your calendar and share the event with friends.

Todd Herring, ArtPrize Director of Creative & Communications

Favorite Feature: Voting Made Easy

“My favorite thing about the ArtPrize app is just how easy it is to vote for your favorite art. There are a couple ways to do it. We try to put your ability to vote really close to whatever action you might be in.”

How to Use It

Say you went to ArtPrize and saw some art you liked, and you remember the venue where you saw it, but not the artist’s name. To vote for that entry, you can use the Venue view. Just click on the venue to pull up its profile and find the entry you liked. Then, tap on the vote icon to place your vote. Alternately, you can tap the “thumbs up” icon when you’re on any page in the app. From there, you can enter the five-digit code for the entry (posted at each entry’s display). Then press submit to vote for that entry.

Find Your Favorite Feature

Download the free ArtPrize app for Android and iOS to find your favorite feature. Let us know what it is in the comments!

The post [VIDEO] Get the Most Out of ArtPrize Eight Using Our 5 Favorite App Features appeared first on Atomic Spin.

by Elaine Ezekiel at September 29, 2016 12:00 PM


What is the optimal cost of maximum independent set for LP and ILP? [on hold]

For the graph C5, the optimal cost of MIS is 2, because we can choose at most 2 non-adjacent vertices. When relaxing the problem and assigning 0.5 to every vertex, we get an optimal cost of 1. I have checked that the optimal cost of the LP is 2.5. Where is the mistake, then?
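
A small check in Python may help locate the discrepancy. It assumes the usual edge formulation (maximize the sum of $x_v$ subject to $x_u + x_v \le 1$ on every edge); the variable names are made up for illustration.

```python
from itertools import combinations

n = 5
edges = [(i, (i + 1) % n) for i in range(n)]  # the 5-cycle C5


def is_independent(vertices):
    """True if no edge has both endpoints among the chosen vertices."""
    chosen = set(vertices)
    return not any(u in chosen and v in chosen for u, v in edges)


# Integer program: brute-force the largest independent set.
ilp_opt = max(len(c) for r in range(n + 1)
              for c in combinations(range(n), r) if is_independent(c))

# LP relaxation: x_v = 1/2 for every vertex satisfies x_u + x_v <= 1 on
# every edge, and the objective value is the SUM of the x_v.
x = [0.5] * n
feasible = all(x[u] + x[v] <= 1 for u, v in edges)
lp_value = sum(x)

# Upper bound: adding up the 5 edge constraints gives 2 * sum(x) <= 5.
lp_upper_bound = len(edges) / 2.0

print(ilp_opt, feasible, lp_value, lp_upper_bound)
```

Under this formulation the all-halves solution has value 5 × 0.5 = 2.5 rather than 1, and the summed constraints show that no fractional solution can do better, which is consistent with the LP optimum of 2.5 mentioned in the question.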

by Mo Farouk at September 29, 2016 11:48 AM



Python 2 or Python 3 for computer vision and machine learning? [on hold]

Someone said Python 3 is a much better language than Python 2, but that not many packages support Python 3. Others said that nowadays most packages support Python 3 perfectly. Is it a good choice to learn and use Python 3 directly, skipping Python 2, in the fields of computer vision and machine learning? Which popular CV and ML packages still do not support Python 3?

by user145055 at September 29, 2016 11:18 AM


Why does an unsafe state not always cause deadlock?

I was reading Operating Systems by Galvin and came across the below line,

Not all unsafe states are deadlock, however. An unsafe state may lead to deadlock

Can someone please explain why an unsafe state is not the same as a deadlock? I also found the same statement here:

If a safe sequence does not exist, then the system is in an unsafe state, which MAY lead to deadlock. ( All safe states are deadlock free, but not all unsafe states lead to deadlocks. )
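
The distinction can be illustrated with a small Banker's-algorithm safety check. This is a sketch for a single resource type; `is_safe` is a made-up name, and the numbers follow the familiar textbook setup of 12 tape drives shared by three processes.

```python
def is_safe(available, allocation, maximum):
    """Banker's-algorithm safety check for a single resource type.

    Returns True if some order exists in which every process can obtain
    its declared maximum and then finish.
    """
    free = available
    unfinished = set(range(len(allocation)))
    progress = True
    while unfinished and progress:
        progress = False
        for p in sorted(unfinished):
            if maximum[p] - allocation[p] <= free:
                free += allocation[p]  # p finishes and releases its holdings
                unfinished.remove(p)
                progress = True
                break
    return not unfinished


# 12 drives in total; declared maximums are 10, 4 and 9.
safe_state = is_safe(3, [5, 2, 2], [10, 4, 9])
unsafe_state = is_safe(2, [5, 2, 3], [10, 4, 9])

# From the unsafe state, suppose P0 releases its 5 drives without ever
# asking for more. Everyone can then finish, so no deadlock occurred.
after_release = is_safe(7, [0, 2, 3], [0, 4, 9])
```

The check is pessimistic: it assumes every process will demand its full declared maximum before releasing anything. An unsafe state only means that deadlock cannot be ruled out under that assumption. If processes happen to release resources early or never request their maximum, the system can still run to completion.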

by vikkyhacks at September 29, 2016 11:08 AM


Get a value for Y, given fitted spd for X and a fitted copula (R)

I have a dataframe D with 2 variables, X and Y. I am doing the following:

  1. I fit a spd to the data using spd package in R
  2. I get the pseudo-uniform numbers using pspd
  3. I put the element in a new matrix U
  4. I fit a copula to U
  5. I simulate N realizations
  6. I convert back to get quantiles using qspd

At this point I have a set of N correlated realizations for X and Y. What I need now is: given a value X* for my variable X, and given the fitted copula, what is:

  • the expected value of Y
  • the 99th percentile of the conditional distribution for Y

Given that the value X* might not be in the simulated dataset, I guess I need to use the fitted spd and the copula, but I am not sure how to do it. Any help please? Thank you.

by opt at September 29, 2016 11:06 AM


Stuck on fully deleting ESET Smart Security [on hold]

I am not sure if I should ask this here, but I cannot find any other site to ask.

I installed ESET Smart Security on my computer, but I didn't like the trial and deleted it. However, it isn't fully deleted and is causing a lot of problems: it uses 90% of my CPU. How can I delete it completely?

by Taha Akbari at September 29, 2016 10:57 AM

Decidability of the TMs computing a non-empty subset of total functions

I have this HW problem: Let $F$ be the set of computable total functions, and let $\emptyset\subsetneq S\subseteq F$. Denote $$L_S=\{ \langle M \rangle | M \text{ is a TM that computes a function and } f_M\in S \}$$

where $f_M$ is the function $M$ computes.

Prove that for every such non-trivial $S$, $L_S \notin \mathcal{R}$.

I tried to construct

  • $L_{f}=\left\{ x\#y\in{\Sigma^*}|y=f\left(x\right)\right\} $
  • $\tilde{S}=\left\{ L_{f}|f\in S\right\} $
  • $L_{\tilde{S}}=\left\{ \left\langle M\right\rangle |L\left\langle M\right\rangle \in\tilde{S}\right\} $

and then show with Rice that $L_{\tilde{S}} \notin \mathcal{R}$, when the idea behind it was to eventually show that $L_{\tilde{S}} \leq_m L_S $. But the problem here is that I couldn't show a mapping reduction from $L_{\tilde{S}}$ to $L_S$ without assuming $L_{\tilde{S}} \in \mathcal{RE}$ (which I'm quite sure is not true).
So any other directions will be warmly welcomed!

by Uria Mor at September 29, 2016 10:50 AM


Jupyter Notebook - EOF Error when loading CIFAR-10 dataset

I'm doing the assignments for the CS231n course. The first part requires loading the CIFAR-10 dataset. This is the code that loads it:

import cPickle as pickle
import numpy as np
import os
from scipy.misc import imread

def load_CIFAR_batch(filename):
  """ load single batch of cifar """
  with open(filename, 'rb') as f:
    datadict = pickle.load(f)
    X = datadict['data']
    Y = datadict['labels']
    X = X.reshape(10000, 3, 32, 32).transpose(0,2,3,1).astype("float")
    Y = np.array(Y)
    return X, Y

def load_CIFAR10(ROOT):
  """ load all of cifar """
  xs = []
  ys = []
  for b in range(1,6):
    f = os.path.join(ROOT, 'data_batch_%d' % (b, ))
    X, Y = load_CIFAR_batch(f)
    xs.append(X)  # collect each batch so they can be concatenated below
    ys.append(Y)
  Xtr = np.concatenate(xs)
  Ytr = np.concatenate(ys)
  del X, Y
  Xte, Yte = load_CIFAR_batch(os.path.join(ROOT, 'test_batch'))
  return Xtr, Ytr, Xte, Yte

Here is the part that calls the functions:

# Load the raw CIFAR-10 data.
cifar10_dir = 'cs231n/datasets/cifar-10-batches-py'
X_train, y_train, X_test, y_test = load_CIFAR10(cifar10_dir)

# As a sanity check, we print out the size of the training and test data.
print 'Training data shape: ', X_train.shape
print 'Training labels shape: ', y_train.shape
print 'Test data shape: ', X_test.shape
print 'Test labels shape: ', y_test.shape

This is the error I'm getting:

EOFError                                  Traceback (most recent call last)
<ipython-input-2-76ab1121c87e> in <module>()
      1 # Load the raw CIFAR-10 data.
      2 cifar10_dir = 'cs231n/datasets/cifar-10-batches-py'
----> 3 X_train, y_train, X_test, y_test = load_CIFAR10(cifar10_dir)
      5 # As a sanity check, we print out the size of the training and test data.

/home/rishabh/CS231-master/assignment1/cs231n/data_utils.pyc in load_CIFAR10(ROOT)
     19   for b in range(1,6):
     20     f = os.path.join(ROOT, 'data_batch_%d' % (b, ))
---> 21     X, Y = load_CIFAR_batch(f)
     22     xs.append(X)
     23     ys.append(Y)

/home/rishabh/CS231-master/assignment1/cs231n/data_utils.pyc in load_CIFAR_batch(filename)
      6   """ load single batch of cifar """
      7   with open(filename, 'rb') as f:
----> 8     datadict = pickle.load(f)
      9     X = datadict['data']
     10     Y = datadict['labels']


How do I resolve this error?

by Rishabh Madan at September 29, 2016 10:33 AM

User path analysis in web log analytics

I'm trying to find the time taken by a user to view a specific web page. For example in the following link, the user's entry point was the BBC home page before proceeding to the team of interest, which is Arsenal. Now I'm trying to find out the time spent on each of the pages visited (sport, football, teams, etc.) rather than the total duration. Please suggest ways to achieve this. I hope this makes sense.
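
A common approximation is to take the gap between consecutive requests within one session as the time spent on the earlier page. A minimal sketch in Python, assuming the log has already been split into per-user sessions of (timestamp, url) pairs; `time_per_page`, the timestamps and the paths are all made up for illustration.

```python
from datetime import datetime


def time_per_page(hits):
    """Estimate time on each page from one session's ordered hits.

    `hits` is a chronological list of (timestamp, url) pairs. Returns a
    list of (url, seconds until the next hit). The last page gets None,
    because the log has no later request to measure against.
    """
    durations = []
    for (t0, url), (t1, _next_url) in zip(hits, hits[1:]):
        durations.append((url, (t1 - t0).total_seconds()))
    if hits:
        durations.append((hits[-1][1], None))
    return durations


# Hypothetical session from home page down to a team page.
session = [
    (datetime(2016, 9, 29, 10, 0, 0), "/"),
    (datetime(2016, 9, 29, 10, 0, 40), "/sport"),
    (datetime(2016, 9, 29, 10, 2, 10), "/sport/football"),
    (datetime(2016, 9, 29, 10, 2, 25), "/sport/football/teams/arsenal"),
]
result = time_per_page(session)
```

The time on the final page of a session cannot be recovered from request logs alone, which is why the sketch reports None there instead of guessing.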

by Julius at September 29, 2016 10:10 AM

Machine learning to predict Equipment Failure

I am unsure whether my use case qualifies as something to be solved with machine learning algorithms.

I have a temperature sensor connected to a piece of equipment. The sensor captures two attributes: the temperature itself and the time of measurement. My aim is to send out an alarm signal when the temperature remains high beyond a certain time frame. Say the temperature soars to 70 degrees and stays constant or keeps increasing for about a minute (the sensor takes a measurement every 5 seconds), so that 10 successive readings show the temperature not going down, staying at 70 degrees or higher. This is an alarm situation, and the system must predict that the equipment can burn out at such a high temperature.

In this case a previous dataset is of no use, as the outcome is predefined (temperature >= 70 degrees and stable for 10 readings). Can this be treated as a machine learning use case? I only have live data and no prior data for this use case.
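
As described, the condition is a fixed rule, so it can be checked directly on the live stream without any training. A minimal sketch in Python; the class name `OverheatAlarm` is made up, and the defaults (70 degrees, 10 readings) are taken from the description above.

```python
from collections import deque


class OverheatAlarm:
    """Signal an alarm once `window` consecutive readings are >= `limit`."""

    def __init__(self, limit=70.0, window=10):
        self.limit = limit
        # The deque keeps only the most recent `window` readings.
        self.recent = deque(maxlen=window)

    def update(self, temperature):
        """Feed one reading; return True if the alarm condition holds."""
        self.recent.append(temperature)
        return (len(self.recent) == self.recent.maxlen
                and all(t >= self.limit for t in self.recent))


alarm = OverheatAlarm()
readings = [65, 68] + [70, 71, 70, 72, 73, 73, 74, 75, 75, 76]
states = [alarm.update(t) for t in readings]
```

Machine learning would only come into play if the threshold or window had to be learned from historical failures, or if the goal were to predict overheating before the rule fires.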

by Ashesh Nair at September 29, 2016 09:56 AM

Fred Wilson

Relaunching At Age Fourteen

Our portfolio company Meetup was launched in June of 2002. We invested five years later, in 2007, when the company was already profitable and was approaching double digit revenues. Nine years later they are 3x the size when we invested in revenues, meetups, and more. But the founder and leaders of Meetup feel like they are just beginning to scratch the surface of their mission which is to get people out of the house and into the real world doing things they enjoy with other like minded people.

This video, which they created as part of their re-launch this week, shows the range of things people use Meetup to do with others:

Many people associate Meetup with night time events where people with name tags on them walk around and introduce themselves to others. Those sorts of things happen on Meetup for sure. But the more common uses are runners using Meetup to schedule group runs, moms using Meetup to hang out with other moms, and, apparently, jugglers using Meetup to juggle together.

So Meetup is relaunching the Company this week, fourteen and a quarter years after its initial launch. This means a new logo (the name badge is gone), new mobile apps that use deep learning to understand what you want to do and encourage you to do more of it, a new team (with women leaders in both product and engineering) with lots of new engineers and data scientists, and a sharper focus on marketing.

If you want to see what the new Meetup is all about, download the new app (iOS and Android) and check it out. Maybe you will find yourself juggling in the park this weekend. I sure hope so.

by Fred Wilson at September 29, 2016 09:49 AM


Python: classify text into the categories

I have a part of training set

url  categoryДжинсы&_dcat=11483&Inseam=33&rt=nc&_trksid=p2045573.m1684 Онлайн-магазин  Search    Searchавито&oq=авито&aqs=chrome..69i57j0l5.1608j0j7&sourceid=chrome&es_sm=122&ie=UTF-8 Search    Форумы и отзывыДжинсы&_sop=15  Онлайн-магазинДжинсы&_sop=15  Онлайн-магазин   Форумы и отзывыяндекс&oq=яндекс&aqs=chrome..69i57j69i61l3j69i59l2.1383j0j1&sourceid=chrome&es_sm=93&ie=UTF-8    Searchавито&oq=авито&aqs=chrome..69i57j69i59j69i60.1095j0j1&sourceid=chrome&es_sm=93&ie=UTF-8  Search   Форумы и отзывы Онлайн-магазин  Онлайн-магазин    Онлайн-магазин Онлайн-магазин    Search   Social network

It is a mapping between URL and category. I also have a test set, and I need to assign a category to every URL in it.


I don't know which algorithm I should use to solve this task. I need the approach that gives the highest accuracy. I think it may be a problem that I have multiple categories.

I am first trying to parse the HTML title tag, because I think I can't determine the category from the URL alone.

by Petr Petrov at September 29, 2016 09:15 AM

Monad interface in C++

I am currently learning a little bit of Haskell and have started to figure out how monads work. Since I normally code in C++, and I think the monad pattern (as far as I understand it right now) would be really awesome to use in C++ too, for example for futures,

I wonder if there is a way to implement an interface, or a base class, to enforce a correct overload of the functions bind and return (of course with another name than return in C++) for derived types.

To make clearer what I am thinking of:

consider that we have the following non-member function:

auto foo(const int x) -> std::string;

And a member function bar which has different overloads for different classes:

auto bar() const -> const Monad<int>*;

If we now want to do something like this: foo(, this simply doesn't work. So we have to know what bar returns; for example, if it returns a future<int>, we have to call bar().get(), which blocks even if we don't need to block here.

In Haskell we could do something like bar >>= foo

So I am asking myself whether we could achieve such behaviour in C++, because when calling foo(x) we don't care whether x is an object that boxes an int, or in what kind of class the int is boxed; we just want to apply the function foo to the boxed value.

I am sorry, I have some trouble formulating my thoughts in English, since I am not a native speaker.

by Exagon at September 29, 2016 09:12 AM


Shortest paths in isomorphic graphs with different edge weights

I'm looking for a way to find the shortest paths from a source to all destinations in isomorphic undirected graphs with different edge weights.

The only thing I can think of is using Dijkstra on each graph separately, but I was wondering if there is a way to take advantage of the isomorphism for a faster algorithm.
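
For reference, the baseline described above can at least share the graph structure between runs: store the adjacency once with edge ids, and keep one weight array per graph. A minimal sketch in Python, with made-up names and a toy triangle graph; it does not exploit the isomorphism beyond sharing the topology, and it makes no claim about a faster algorithm existing.

```python
import heapq


def dijkstra(adjacency, weights, source):
    """Shortest distances from `source` for one weighting of the edges.

    adjacency[u] is a list of (v, edge_id) pairs shared by all graphs;
    weights[edge_id] is that edge's weight in one particular graph.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v, edge_id in adjacency[u]:
            nd = d + weights[edge_id]
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


# One undirected topology (a triangle), two weightings of the same edges.
# Edge ids: 0 = a-b, 1 = b-c, 2 = a-c.
adjacency = {
    "a": [("b", 0), ("c", 2)],
    "b": [("a", 0), ("c", 1)],
    "c": [("b", 1), ("a", 2)],
}
weightings = [[1, 1, 5], [4, 4, 5]]
results = [dijkstra(adjacency, w, "a") for w in weightings]
```

Each run still costs a full Dijkstra pass, so the total work grows linearly with the number of weightings.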

by devil0150 at September 29, 2016 09:06 AM



Using static_cast in character array to use string functions [migrated]

I want to use static_cast to use string functions like erase() and find() in a character array, say char str[50]. For instance, please consider following segment:

char str[]="abcdefghi";
char val[]="bc";
static_cast<string>(str).erase(static_cast<string>(str).find(val[0]),1);

Please tell me if it is correct, and if it is, why str is not retaining the value it should have, i.e. "acdefghi". Please bear with me if this sounds naive.

by Komal Joshi at September 29, 2016 08:28 AM


How can quotient types help safely expose module internals?

Reading up on quotient types and their use in functional programming, I came across this post. The author mentions Data.Set as an example of a module which provides a ton of functions which need access to module's internals:

Data.Set has 36 functions, when all that are really needed to ensure the meaning of a set ("These elements are distinct") are toList and fromList.

The author's point seems to be that we need to "open up the module and break the abstraction" if we forgot some function which can be implemented efficiently only using module's internals.

He then says

We could alleviate all of this mess with quotient types.

but gives no explanation to that claim.

So my question is: how are quotient types helping here?


I've done a bit more research and found a paper "Constructing Polymorphic Programs with Quotient Types". It elaborates on declaring quotient containers and mentions the word "efficient" in abstract and introduction. But if I haven't misread, it does not give any example of an efficient representation "hiding behind" a quotient container.


A bit more is revealed in Chapter 3 of the paper "Programming in Homotopy Type Theory", which uses the fact that a quotient type can be implemented as a dependent sum. Views on abstract types are introduced (which look very similar to type classes to me) and some relevant Agda code is provided. Yet the chapter focuses on reasoning about abstract types, so I'm not sure how this relates to my question.

by fizruk at September 29, 2016 08:27 AM


What does $\mathbf{Q}^+$ mean in approximation texts?

Often, in a lot of texts concerning approximation algorithms I see the following notation (for example, here in page 19 of the PDF or the first page of the introduction):

... and a cost function on vertices $c:V\to \mathbf{Q}^+$ ...

Is this just the positive rationals? Or is this something else?

by Guillermo Angeris at September 29, 2016 08:25 AM


Justification for Binary Option's Infinite Delta?

First time poster here. Glad to be here. I just graduated with an MSc in computational finance.

I recently read a question by another user about the delta of an at-the-money binary option as it approaches expiration. After writing a quick Matlab script I have confirmed that the delta in this situation explodes to infinity as the option approaches expiration.

How can this be valid? If we understand delta as the change in derivative price for every 1 dollar change in the price of the underlying asset, how could the delta ever exceed the payoff of the binary option? If the payoff on the derivative is 1 dollar when the price of the underlying asset exceeds the strike price, then a rational investor would be willing to pay 1 dollar AT MOST for that derivative, and would be willing to pay 0 dollars AT LEAST for that derivative. Therefore, the maximum range of fair values for the option would be 1 dollar right? How could the value of the derivative change by anything greater than 1 dollar then?

It's almost as if the pricing model broke in this case. Considering I found values for d2 that were negative, when d2 is supposed to be a probability measure that the asset price expires above the strike, I would say that the model somehow broke. Can anyone explain why it breaks in this case?
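The behaviour can be reproduced with the standard Black-Scholes formulas for a cash-or-nothing call paying 1, under assumed parameters (spot and strike 100, rate 5%, volatility 20%; none of these come from the question). In that model the price is $e^{-r\tau}N(d_2)$, so it is $N(d_2)$ that plays the role of a probability, while $d_2$ itself may be negative. Delta is a slope, not a price change for a full 1 dollar move, so it can grow without bound while the price stays between 0 and 1.

```python
import math


def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def d2(spot, strike, rate, vol, tau):
    return (math.log(spot / strike) + (rate - 0.5 * vol * vol) * tau) / (vol * math.sqrt(tau))


def binary_call_price(spot, strike, rate, vol, tau):
    """Black-Scholes value of a call paying 1 if spot > strike at expiry."""
    return math.exp(-rate * tau) * norm_cdf(d2(spot, strike, rate, vol, tau))


def binary_call_delta(spot, strike, rate, vol, tau):
    """Derivative of the price above with respect to spot."""
    density = norm_pdf(d2(spot, strike, rate, vol, tau))
    return math.exp(-rate * tau) * density / (spot * vol * math.sqrt(tau))


# Assumed at-the-money example, shrinking time to expiry (in years)
maturities = [1.0, 0.1, 0.01, 0.0001]
prices = [binary_call_price(100.0, 100.0, 0.05, 0.2, tau) for tau in maturities]
deltas = [binary_call_delta(100.0, 100.0, 0.05, 0.2, tau) for tau in maturities]
```

As the time to expiry shrinks, the price curve approaches a step at the strike: the slope at the strike grows like $1/\sqrt{\tau}$, but only over an ever narrower range of spot prices, so the total change in value never exceeds the payoff.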

by Alex Ockenden at September 29, 2016 08:24 AM


jQuery notification function after posting data with Laravel 5.3

This function needs to fire after posting the data and redirecting AND after deleting data from the database.

I bound this function to the same button used to delete the data, but I can't get it to work.

Please help!

  jQuery.fn.pushNotification = function(message, type, speed){
    var pushBox          = "push-notification",
        pushNotification = "<div class='"+ pushBox +"'></div>";

    // Add the notification box to the page header
    $(pushNotification).appendTo( $( 'header' ) );

    // Pick the background colour from the notification type
    if (type == "alert"){
      $("."+pushBox).css("background-color", "#d15e3e");
    } else if (type == "succes"){
      $("."+pushBox).css("background-color", "#9ad362");
    } else if (type == "info"){
      $("."+pushBox).css("background-color", "#195a89");
    }

    // Slide the box in, then out again after two seconds
    // (setTimeout wrapper assumed; the posted snippet is truncated here)
    $("."+pushBox).animate({ right: "-10px" }, speed );
    setTimeout(function(){
      $("."+pushBox).animate({ right: "-300px" }, speed );
    }, 2000);
  };


Call to function with the button used for deleting from DB

$(".sI5").pushNotification("Presentatie verwijderd", "alert", 1000);

by Dirk at September 29, 2016 08:24 AM


Analysing an algorithm with a random recursive parameter

I want to start off by mentioning this is not a homework problem, but it is a suggested practice problem that I cannot seem to figure out.

The algorithm uses a random function that generates a random number between 1 and n with uniform distribution.

 Func1(A, n)
 /* A is an array of integers */

 if (n <= 1) then return (0);
 k = Random(n - 1)
 s = 0
 for i = 1 to n/2 do
    for j = 1 to n/2 do
        A[i] = A[i] + A[2i]*A[2j];
        s = s + A[i];
    end
 end
 s = s + Func1(A, k);
 return (s);

(a) What is the asymptotic worst case running time of Func1?

(b) What is the asymptotic expected running time of Func1? Justify your solution.

I am having a hard time figuring this one out. So far I only have that the two for loops take $cn^2$ time. I understand how probabilistic analysis works, but I am still stuck.
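One way to build intuition is to evaluate the cost recurrence numerically. The sketch below assumes the loops cost exactly $n^2$ units (the constant is dropped) and that Random(n - 1) is uniform on 1..n-1, so the expected cost satisfies $E(n) = n^2 + \frac{1}{n-1}\sum_{k=1}^{n-1} E(k)$, while the worst case always draws k = n - 1. The function names are illustrative.

```python
def expected_cost(limit):
    """E(n) = n^2 + average of E(1..n-1), with E(1) = 0."""
    expected = [0.0] * (limit + 1)
    prefix_sum = 0.0  # running sum of E(1) .. E(n-1)
    for n in range(2, limit + 1):
        prefix_sum += expected[n - 1]
        expected[n] = n * n + prefix_sum / (n - 1)
    return expected


def worst_cost(limit):
    """W(n) = n^2 + W(n-1): the recursion always shrinks by only one."""
    worst = [0] * (limit + 1)
    for n in range(2, limit + 1):
        worst[n] = n * n + worst[n - 1]
    return worst


size = 2000
expected_ratio = expected_cost(size)[size] / size ** 2  # compare with n^2
worst_ratio = worst_cost(size)[size] / size ** 3        # compare with n^3
```

Both ratios settle near constants (about 3/2 and 1/3 respectively), which is consistent with a guess of $\Theta(n^2)$ expected time and $\Theta(n^3)$ worst-case time; the guess for the expected case can then be checked by substituting $E(n) \le an^2$ into the recurrence.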

by Mike at September 29, 2016 08:21 AM


invoking curried function without passing another argument

I have a simple function that takes one argument

fn = function(argument) {console.log(argument)}

In setInterval, I want to call the function and pass an external variable:

argument = 1

I realize that I could do it with a higher-order function, i.e.

fn = function(argument) {
  return function () {
    console.log(argument)
  }
}
argument = 1
setInterval(fn(argument), 1000)

And this does work, but I want to know if it can be done with curry.

I've tried:

fn = _.curry(fn)("foo")
// since the function takes only one argument,
// here it is invoked and no longer can be
// passed as a function to setInterval

fn = _.curry(fn, 2)("foo")
// setting the arity to 2 makes it so the function
// isn't invoked. But setInterval does not pass
// the additional argument and so it never ends
// up getting invoked.

I feel like there is something I'm missing with these curry examples. Am I, or will curry not help here?
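What the two attempts run into is the difference between currying and partial application: a curried one-argument function runs as soon as its only argument arrives, whereas partial application fixes the argument now and defers the call. The sketch below shows that distinction in Python with functools.partial, purely as an illustration; the question itself concerns JavaScript, where Function.prototype.bind or lodash's _.partial play a comparable role.

```python
from functools import partial

calls = []


def fn(argument):
    """Stand-in for the logging function; records each invocation."""
    calls.append(argument)
    return argument


# Fix the argument now, call later: nothing has run at this point
callback = partial(fn, 1)
calls_before = list(calls)

# A timer would invoke the callback with no arguments, like this
callback()
callback()
```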

by max pleaner at September 29, 2016 07:37 AM

Tensorflow Grid LSTM RNN TypeError

I'm trying to build an LSTM RNN that handles 3D data in TensorFlow. According to this paper, Grid LSTM RNNs can be n-dimensional. The idea for my network is to have a 3D volume [depth, x, y], and the network should be [depth, x, y, n_hidden], where n_hidden is the number of LSTM cell recursive calls. The idea is that each pixel gets its own "string" of LSTM recursive calls.

The output should be [depth, x, y, n_classes]. I'm doing a binary segmentation -- think foreground and background, so the number of classes is just 2.

# Network Parameters
n_depth = 5
n_input_x = 200 # input width (each slice of the volume is 200 x 200)
n_input_y = 200
n_hidden = 128 # hidden layer num of features
n_classes = 2

# tf Graph input
x = tf.placeholder("float", [None, n_depth, n_input_x, n_input_y])
y = tf.placeholder("float", [None, n_depth, n_input_x, n_input_y, n_classes])

# Define weights
weights = {}
biases = {}

# Initialize weights
for i in xrange(n_depth * n_input_x * n_input_y):
    weights[i] = tf.Variable(tf.random_normal([n_hidden, n_classes]))
    biases[i] = tf.Variable(tf.random_normal([n_classes]))

def RNN(x, weights, biases):

    # Prepare data shape to match `rnn` function requirements
    # Current data input shape: (batch_size, n_input_y, n_input_x)
    # Permuting batch_size and n_input_y
    x = tf.reshape(x, [-1, n_input_y, n_depth * n_input_x])
    x = tf.transpose(x, [1, 0, 2])
    # Reshaping to (n_input_y*batch_size, n_input_x)

    x =  tf.reshape(x, [-1, n_input_x * n_depth])

    # Split to get a list of 'n_input_y' tensors of shape (batch_size, n_hidden)
    # This input shape is required by `rnn` function
    x = tf.split(0, n_depth * n_input_x * n_input_y, x)

    # Define a lstm cell with tensorflow
    lstm_cell = grid_rnn_cell.GridRNNCell(n_hidden, input_dims=[n_depth, n_input_x, n_input_y])
    # lstm_cell = rnn_cell.MultiRNNCell([lstm_cell] * 12, state_is_tuple=True)
    # lstm_cell = rnn_cell.DropoutWrapper(lstm_cell, output_keep_prob=0.8)
    outputs, states = rnn.rnn(lstm_cell, x, dtype=tf.float32)
    # Linear activation, using rnn inner loop last output
    # pdb.set_trace()

    output = []
    for i in xrange(n_depth * n_input_x * n_input_y):
        #I'll need to do some sort of reshape here on outputs[i]
        output.append(tf.matmul(outputs[i], weights[i]) + biases[i])

    return output

pred = RNN(x, weights, biases)
pred = tf.transpose(tf.pack(pred),[1,0,2])
pred = tf.reshape(pred, [-1, n_depth, n_input_x, n_input_y, n_classes])
# pdb.set_trace()
temp_pred = tf.reshape(pred, [-1, n_classes])
n_input_y = tf.reshape(y, [-1, n_classes])

cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(temp_pred, n_input_y))

Currently I'm getting the error: TypeError: unsupported operand type(s) for +: 'int' and 'NoneType'

It occurs after the RNN initialization: outputs, states = rnn.rnn(lstm_cell, x, dtype=tf.float32)

x of course is of type float32

I am unable to tell what type GridRNNCell returns. Any help here? This could be the issue. Should I be defining more arguments to this? input_dims makes sense, but what should output_dims be?

Is this a bug in the contrib code?

GridRNNCell is located in contrib/grid_rnn/python/ops/

by Kendall Weihe at September 29, 2016 07:35 AM

How to find the base case for a recursive function that changes a string of characters in a certain way

Lately I have been working with recursion because, according to my professor, it represents a purely functional programming approach: neither changes to variables nor side effects take place. Through my previous two questions HERE and HERE I have come to realise that the recursive definition per se is not my problem. I understand how a recursive definition works, and I have solved many mathematics-related questions recursively on the first try, because in mathematics you always have a crystal-clear base case, such as 0! = 1. However, when it comes to working with strings, I never seem to know how to constitute my base case in the form:

if (something):
     return something
else:
     invoke the recursive function

For example: given a list of strings or chars, use a recursive definition to remove the vowels, or the alphanumeric characters, etc. As mentioned earlier, this is functional programming, so no side effects and no variable changes are permitted. This raises the question: since such problems are not mathematical, how can one come up with a base case?
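For strings and lists, a common choice of base case is the smallest possible input, the empty string, which plays the role that 0 plays for factorial. A minimal sketch for the vowel example, with illustrative names and no mutation of any variable:

```python
VOWELS = frozenset("aeiouAEIOU")


def remove_vowels(text):
    """Return text without vowels, defined recursively on the first character."""
    if text == "":
        return ""  # base case: nothing left to process
    head, rest = text[0], text[1:]
    kept = "" if head in VOWELS else head
    return kept + remove_vowels(rest)  # recursive case: one character shorter


example = remove_vowels("recursion")
```

Each call handles one character and recurses on a strictly shorter string, so the empty string is always reached.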

Thanks everybody in advance for helping me to figure out my misery

by Leo wahyd at September 29, 2016 07:33 AM


How to find the position of camera given the known world coordinate of objects?

Assume that I have a picture of multiple objects (let's say more than 6). The world coordinates of these objects are known. The intrinsic parameters of the camera are also known. How can I find the position and the pose of the camera?

I know some of you will give me an answer like "For each object with image coordinate x and world coordinate X, we can form an equation. Then with multiple objects, we have multiple equations. Solve these equations and get the position of the camera. Done." But this is not the answer I want. I want it to be more detailed.

I know that for each object we can form an equation:

lambda * x = P * X = K * [R t] * X

where K is the camera intrinsic matrix, R is the rotation matrix, and t is the translation vector, t = -R*C, where C is the camera position.

The equation above can be transformed into a homogeneous linear system of equations using the Direct Linear Transformation. Depending on the specific variant of the Direct Linear Transformation, this homogeneous linear system will or will not contain lambda. With multiple objects, we can then form a system with enough equations to find a non-trivial solution using the Singular Value Decomposition.

The problem is that the solution (i.e. the camera matrix K*[R t]) can only be found up to scale (since if v is a solution, then s*v is also a solution for any scalar s). Thus we can only find a family of camera matrices that have the same projection, but the camera pose and position corresponding to these camera matrices are different (i.e. the exact position and pose of the camera cannot be found by this method).
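One observation that may resolve the scale ambiguity, assuming K is known exactly: R must be a rotation with determinant 1, and that constraint pins down s. Writing $\hat{P}$ for the matrix returned by the linear solve and $M_{1:3}$, $m_4$ for the first three columns and the last column of $M$ (notation introduced here, not taken from the question):

$$\hat{P} = s\,K\,[R \mid t] \quad\Longrightarrow\quad M := K^{-1}\hat{P} = [\,sR \mid s\,t\,]$$

$$\det(M_{1:3}) = s^{3}\det R = s^{3} \quad\Longrightarrow\quad s = \sqrt[3]{\det(M_{1:3})}$$

$$R = \tfrac{1}{s}\,M_{1:3}, \qquad t = \tfrac{1}{s}\,m_{4}, \qquad C = -R^{\top} t$$

The real cube root also fixes the sign of s. With noisy measurements $M_{1:3}/s$ is only approximately orthogonal, so in practice it would typically be replaced by the nearest rotation matrix before computing C.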

by James Do at September 29, 2016 07:21 AM

Planet Theory

Zoltán Ésik (1951-2016): In Memoriam

The following obituary for Zoltán Ésik will appear in the October issue of the Bulletin of the EATCS and on the web page of Academia Europaea.

Zoltán Ésik (1951-2016)
In Memoriam 

Luca Aceto and Anna Ingólfsdóttir
ICE-TCS, School of Computer Science, Reykjavik University

Our friend and colleague Zoltán Ésik passed away in Reykjavik, Iceland, on Wednesday, 25 May 2016. He was visiting us as he did with some  regularity, compatibly with his many engagements throughout the world. 

The day before his untimely death, Zoltán had delivered an ICE-TCS seminar entitled Equational Logic of Fixed Point Operations at Reykjavik University. At the start of his talk, he looked somewhat tired and out of breath. However, the more he was presenting a research topic that he loved and that has kept him busy for most of his research career, the more he seemed to be feeling at ease. After the talk, we spent some time making plans for mutual visits in the autumn of 2016 and we discussed some EATCS-related matters. His wife Zsuzsa and he were due to spend a few days travelling in the north of Iceland before their return to Szeged, but life had other ideas. 

Zoltán was a scientist of the highest calibre and has left behind a large body of deep and seminal work that will keep researchers in theoretical computer science busy for a long time to come. The list of refereed publications available from his web site includes two books, 32 edited volumes, 135 journal papers, four book chapters, 86 conference papers and seven papers in other edited volumes. However, impressive as they undoubtedly are, these numbers give only a very partial picture of Zoltán's scientific stature. Together with the late Stephen Bloom, Zoltán was the prime mover in the monumental development of Iteration Theories. As Stephen and Zoltán wrote in the preface of their massive book on the topic, which was published in 1993 by Springer:

Iteration plays a fundamental role in the theory of computation: for
example, in the theory of automata, in formal language theory, in the
study of formal power series, in the semantics of flowchart algorithms
and programming languages, and in circular data type definitions.  It
is shown that in all structures that have been used as semantical
models, the equational properties of the fixed point operation are
captured by the axioms describing iteration theories. These structures
include ordered algebras, partial functions, relations, finitary and
infinitary regular languages, trees, synchronization trees, 2-categories,
and others.

It is truly remarkable that the equational laws satisfied by fixed point operations are essentially the same in a large number of structures used in computer science. Isolating those laws, and showing their applicability, has been one of the goals of Zoltán's scientific life and we trust that the members of our community will keep reading his work on iteration theories, which continued and went from strength to strength after Stephen and he published their 600-page research monograph in 1993. During his last talk in Reykjavik, we asked Zoltán whether he was planning to write a new edition of that book, and half-jokingly told him that it would probably be about 1,200 pages.

Zoltán's research output includes contributions to automata theory, category theory, concurrency theory, formal languages, fuzzy sets and fuzzy logic, graph theory, logic in computer science, logic programming, order theory, semiring theory and universal algebra, amongst others. The breadth of research areas to which he has contributed bears witness to his amazing mathematical powers and to his curiosity. Wherever he went and no matter how long he had travelled to get there, Zoltán's brain was always open. 

Zoltán also contributed to the research community with his service work and received several awards. Here we will limit ourselves to mentioning that he was elected member of the Academy of Europe in 2010, was named Fellow of the EATCS in 2016, was a member of the council of the EATCS from 2003 to 2015, and of the Presburger Award Committee in 2015--2016. He represented the Hungarian theoretical computer science community in the International Federation for Information Processing (IFIP) as member of TC1 since 2000 and was one of the prime movers in the establishment of the IFIP WG 1.8, Working Group on Concurrency. He also received the Gy. Farkas Research Award and the K. Rényi Research Award of the János Bolyai Mathematical Society.

Zoltán's appetite for work was phenomenal, but he also liked to have fun, to spend time with friends eating good food and drinking excellent wine, and to travel. Indeed, Zoltán's lust for travel was amazing. We lost track of his visits to myriads of research institutions and universities all over the world. He attended conferences in the most remote locations and always made sure that he would reserve some time for enjoying the most beautiful and known sites. At times, we had the feeling that he had been everywhere in the world.  

Despite being often on the move, Zoltán was very much a family man. He was very proud of his wife Zsuzsanna, their daughter Eszter and their son Robert. He always told us about the latest developments in their lives and was happy about his four grandchildren. We had the pleasure of enjoying Zsuzsanna and Zoltán's exquisite hospitality both in Szeged and in their summer home on Lake Balaton.

Zoltán was very loyal to his friends and would make trips to see them wherever they were living. We were lucky to be amongst them and had the pleasure of hosting him in Aalborg, Florence and Reykjavik, where he visited us a few times and where the thread of his life was cut. We will miss the time we spent doing research or relaxing together, his sense of humour, his conviviality and his hospitality. 

by Luca Aceto at September 29, 2016 07:13 AM


Understanding Type Classes, Scala implicit and C#

I read this blog post by Joe Duffy about Haskell type classes and C# interfaces.

I'm trying to understand what could have enabled C# to have type classes, and I wonder whether a feature like Scala's implicits could solve it.

Having this kind of feature would enable writing something like this:

public interface IReducerOf<T>
{
   T Append(T a, T b);
   T Empty();
}

public static T Reduce<T>(this IEnumerable<T> vals, implicit IReducerOf<T> reducer)
{
  return Enumerable.Aggregate(vals, reducer.Append);
}

Provided that we have an implementation of IReducerOf<T> in our context, the compiler could "just" pick the reducer and use it to execute the code.

Of course, this code cannot compile.

But my questions are:

  1. Can this enable implementing type classes?

  2. Is this similar to what happens in Scala?

I'm asking this for general understanding and not for a particular problem.


I've encountered this GitHub repo on a possible implementation of type classes in C#

by barakcaf at September 29, 2016 07:07 AM


Coming in 2017 – New AWS Region in France

As cloud computing becomes the new normal for organizations all over the world and as our customer base becomes larger and more diverse, we will continue to build and launch additional AWS Regions.

Bonjour la France
I am happy to announce that we will be opening an AWS Region in Paris, France in 2017. The new Region will give AWS partners and customers the ability to run their workloads and store their data in France.

This will be the fourth AWS Region in Europe. We currently have two other Regions in Europe — EU (Ireland) and EU (Frankfurt) and an additional Region in the UK expected to launch in the coming months. Together, these Regions will provide our customers with a total of 10 Availability Zones (AZs) and allow them to architect highly fault tolerant applications while storing their data in the EU.

Today’s announcement means that our global infrastructure now comprises 35 Availability Zones across 13 geographic regions worldwide, with another five AWS Regions (and 12 Availability Zones) in France, Canada, China, Ohio, and the United Kingdom coming online throughout the next year (see the AWS Global Infrastructure page for more info).

As always, we are looking forward to serving new and existing French customers and working with partners across Europe. Of course, the new Region will also be open to existing AWS customers who would like to process and store data in France.

To learn more about the AWS France Region feel free to contact our team in Paris at

Coming in 2017 – A New AWS Region in France

I am happy to announce that we will be opening a new AWS Region in Paris, France, in 2017. This new Region will give AWS partners and customers the ability to run their workloads and store their data in France.

This Region will be the fourth in Europe. We currently have two other Regions in Europe, EU (Ireland) and EU (Frankfurt), and an additional Region will open in the United Kingdom in the coming months. This will bring the total number of Availability Zones (AZs) in Europe to ten, allowing customers to design fault-tolerant applications and store their data within the European Union.

This announcement means that our global infrastructure now comprises 35 Availability Zones across 13 regions worldwide, to which will be added, over the next year, five AWS Regions (and 12 Availability Zones) in France, Canada, China, Ohio, and the United Kingdom (for more information, visit the AWS Global Infrastructure page).

As always, we look forward to meeting the needs of our current and future French customers and to working with our partners in Europe. Of course, this new Region will also be available to all AWS customers who wish to process and store their data in France.

To learn more about the AWS Region in France, you can contact our teams in Paris:


by Jeff Barr at September 29, 2016 06:59 AM


Comparison of Decision Trees: RTED vs MLE

I read two papers discussing how to compare decision trees. One of them is RTED, which is for trees in general (decision trees being a special case): RTED: A Robust Algorithm for the Tree Edit Distance. The other approach is Combining Classification Trees Using MLE by Shannon and Banks (Shannon, W. D., & Banks, D. (1999). Combining classification trees using MLE. Statistics in Medicine, 18(6), 727-740.).

I would like to know whether the MLE method is a special case of the RTED algorithm for comparing decision trees. By "comparing decision trees" I mean that, given two decision trees, we would like to compare their structure, that is, the location and the name of each node. RTED converts the trees into a bracket tree format that contains the names of the tree nodes and compares the two strings. The output is a number, the tree distance (also known in the literature as the edit distance).

The question is whether the algorithm described by Shannon and Banks (called MLE) is a specific case covered by the RTED algorithm, i.e. whether the comparison of two decision trees yields the same result under both. If the answer is no, in which cases will the two algorithms give different results?

by Avi at September 29, 2016 06:50 AM


How can I check a trained model's predictions on new data?

I built a model with the scikit-learn library in Python and trained and tested it using cross-validation. Now I want to test the model's accuracy on more new data. How can I test it with new data after building it?

by Saurabh at September 29, 2016 06:39 AM


What can I do personally to increase my chances of landing my first computer science internship? [on hold]

I am currently in my fall semester of my sophomore year double majoring in computer science and mathematics. I want to start applying and looking for internships with companies in the field of computer science. What can I do personally that can make myself stand out to employers hiring for internships? I have plenty of good work experience but none of it is related to computer science. I want to stick out from other applicants but am unsure what I could do personally to give me an advantage over other applicants.

by B. Nott at September 29, 2016 06:28 AM



My Learning Program on Human Pose Estimation using MatConvNet

I am writing my first MatConvNet program on human pose estimation, and when I ran it, I got the error message below (I am running on my MacBook Pro, which does not have CUDA).

>> imdb = load('imdb_lsp.mat');

>> [net, info] = maxnet(imdb, [], '../Library/matconvnet/');

     layer|    0|    1|    2|    3|    4|    5|    6|    7|    8|    9|   10|   11|   12|   13|   14|   15|   16|   17|

      type|input| conv| relu| norm|mpool| conv| relu| norm|mpool| conv| relu| conv| relu| conv| relu|mpool| conv| relu|

      name|  n/a|conv1|relu1|norm1|pool1|conv2|relu2|norm2|pool2|conv3|relu3|conv4|relu4|conv5|relu5|pool5|  fc6|relu6|


   support|  n/a|   11|    1|    1|    3|    5|    1|    1|    3|    3|    1|    3|    1|    3|    1|    3|    6|    1|

  filt dim|  n/a|    3|  n/a|  n/a|  n/a|   48|  n/a|  n/a|  n/a|  256|  n/a|  192|  n/a|  192|  n/a|  n/a|  256|  n/a|

filt dilat|  n/a|    1|  n/a|  n/a|  n/a|    1|  n/a|  n/a|  n/a|    1|  n/a|    1|  n/a|    1|  n/a|  n/a|    1|  n/a|

 num filts|  n/a|   96|  n/a|  n/a|  n/a|  256|  n/a|  n/a|  n/a|  384|  n/a|  384|  n/a|  256|  n/a|  n/a| 4096|  n/a|

    stride|  n/a|    4|    1|    1|    2|    1|    1|    1|    2|    1|    1|    1|    1|    1|    1|    2|    1|    1|

       pad|  n/a|    0|    0|    0|    0|    2|    0|    0|    0|    1|    0|    1|    0|    1|    0|    0|    0|    0|


   rf size|  n/a|   11|   11|   11|   19|   51|   51|   51|   67|   99|   99|  131|  131|  163|  163|  195|  355|  355|

 rf offset|  n/a|    6|    6|    6|   10|   10|   10|   10|   18|   18|   18|   18|   18|   18|   18|   34|  114|  114|

 rf stride|  n/a|    4|    4|    4|    8|    8|    8|    8|   16|   16|   16|   16|   16|   16|   16|   32|   32|   32|


 data size|  227|   55|   55|   55|   27|   27|   27|   27|   13|   13|   13|   13|   13|   13|   13|    6|    1|    1|

data depth|    3|   96|   96|   96|   96|  256|  256|  256|  256|  384|  384|  384|  384|  256|  256|  256| 4096| 4096|

  data num|  128|  128|  128|  128|  128|  128|  128|  128|  128|  128|  128|  128|  128|  128|  128|  128|  128|  128|


  data mem| 75MB|142MB|142MB|142MB| 34MB| 91MB| 91MB| 91MB| 21MB| 32MB| 32MB| 32MB| 32MB| 21MB| 21MB|  4MB|  2MB|  2MB|

 param mem|  n/a|136KB|   0B|   0B|   0B|  1MB|   0B|   0B|   0B|  3MB|   0B|  3MB|   0B|  2MB|   0B|   0B|144MB|   0B|

     layer|      18|  19|   20|      21|   22|    23|

      type| dropout|conv| relu| dropout| conv|custom|

      name|dropout6| fc7|relu7|dropout7|  fc8|  loss|


   support|       1|   1|    1|       1|    1|     1|

  filt dim|     n/a|4096|  n/a|     n/a| 4096|   n/a|

filt dilat|     n/a|   1|  n/a|     n/a|    1|   n/a|

 num filts|     n/a|4096|  n/a|     n/a|   28|   n/a|

    stride|       1|   1|    1|       1|    1|     1|

       pad|       0|   0|    0|       0|    0|     0|


   rf size|     355| 355|  355|     355|  355|   355|

 rf offset|     114| 114|  114|     114|  114|   114|

 rf stride|      32|  32|   32|      32|   32|    32|


 data size|       1|   1|    1|       1|    1|     1|

data depth|    4096|4096| 4096|    4096|   28|   NaN|

  data num|     128| 128|  128|     128|  128|   128|


  data mem|     2MB| 2MB|  2MB|     2MB| 14KB|   NaN|

 param mem|      0B|64MB|   0B|      0B|448KB|    0B|

parameter memory|217MB (5.7e+07 parameters)|

     data memory|  NaN (for batch size 128)|

train: epoch 01:   1/ 79:Error using vl_nnconv

An input is not a numeric array (or GPU support not compiled).

Error in vl_simplenn (line 300)

      res(i+1).x = vl_nnconv(res(i).x, l.weights{1}, l.weights{2}, ...

Error in cnn_train>processEpoch (line 316)

    res = vl_simplenn(net, im, dzdy, res, ...

Error in cnn_train (line 132)

    [net, state] = processEpoch(net, state, params, 'train') ;

Error in maxnet (line 63)

[net info] = cnn_train(net, imdb, @getBatch, net.meta.trainOpts);

Some details of my program.

  • I have generated a structure called “imdb_lsp.mat”, which is an image database that just points to the locations of the training, validation and testing images in order to save memory during training. The “imdb_lsp.mat” also contains all the joint coordinates of the images.
  • I am using AlexNet in ImageNet, but using the MSE as the loss function.
  • The code is on my google drive:

by Max Shek-wai Chu at September 29, 2016 06:16 AM

How to read binary files in Python using NumPy?

I know how to read binary files in Python using NumPy's np.fromfile() function. The issue I'm faced with is that when I do so, the array has exceedingly large numbers of the order of 10^100 or so, with random nan and inf values.

I need to apply machine learning algorithms to this dataset and I cannot work with this data. I cannot normalise the dataset because of the nan values.

I've tried np.nan_to_num() but that doesn't seem to work. After doing so, my min and max values range from 3e-38 and 3e+38 respectively, so I could not normalize it.

Is there any way to scale this data down? If not, how should I deal with this?

Thank you.


Some context. I'm working on a malware classification problem. My dataset consists of live malware binaries. They are files of the type .exe, .apk etc. My idea is to store these binaries as a NumPy array, convert them to a grayscale image and then perform pattern analysis on it.
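Huge magnitudes mixed with nan and inf are what one would expect when raw bytes are reinterpreted as floating-point numbers, and np.fromfile reads floats unless a dtype is given. For a byte-per-pixel grayscale image, reading the file as unsigned 8-bit integers is probably what is wanted (in NumPy, `np.fromfile(path, dtype=np.uint8)`). The sketch below shows the difference using only the standard library, on a throwaway sample file; the helper names are illustrative.

```python
import os
import struct
import tempfile


def read_as_bytes(path):
    """Return the file contents as integers in 0..255 (one per byte)."""
    with open(path, "rb") as handle:
        return list(handle.read())


def read_as_doubles(path):
    """Reinterpret the same bytes as 64-bit floats, as a float dtype would."""
    with open(path, "rb") as handle:
        raw = handle.read()
    usable = len(raw) - len(raw) % 8  # ignore a trailing partial value
    return list(struct.unpack("<%dd" % (usable // 8), raw[:usable]))


def make_sample_file():
    """Write every byte value 0..255 four times to a temporary file."""
    descriptor, path = tempfile.mkstemp(suffix=".bin")
    with os.fdopen(descriptor, "wb") as handle:
        handle.write(bytes(range(256)) * 4)
    return path


sample_path = make_sample_file()
byte_values = read_as_bytes(sample_path)      # already in a fixed 0..255 range
double_values = read_as_doubles(sample_path)  # arbitrary magnitudes and NaNs
os.remove(sample_path)
```

Read as bytes, the data already lies in a fixed range and can be scaled by dividing by 255; no nan handling is needed.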

by Suyash Shetty at September 29, 2016 06:15 AM


Does XML cover the representation of any structured data and knowledge?

I see nowadays XML is used to structure any file or data. It can represent both flat and hierarchical relationships. For example, it can be used to show the parse tree of a sentence in a natural language.

I am not familiar with different types of data and knowledge structures, but I would like to know if XML is theoretically the ultimate representation scheme which can model and structure any data and knowledge?

What are the reasons behind the popularity of XML?

by Ahmad at September 29, 2016 05:57 AM


Conversion of NFA to DFA, and how to build some DFA that does not contain a substring? [on hold]

1) I have recently learned about using the powerset construction to convert an NFA to a DFA. However, this only seems to be feasible when the number of states we are working with is 3 or less, as $2^k$ seems to grow unmanageable very quickly.

I have a problem with 6 states, meaning that if I were to use powerset construction, I would have 64 states. What is a more efficient way?

2) How can I efficiently build a DFA to recognize binary strings that do not contain some substring? (for example the substring 0010)

Would it be correct to build a DFA that accepts 0010, and then swap the accepting/final states?
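A sketch of the complement idea, under one caveat: the DFA being complemented has to accept every string that contains 0010 somewhere, not just the single string 0010, and it has to be complete (a transition for every state and symbol). The construction below is one hypothetical way to get such a DFA: each state records how many leading characters of the pattern currently match, and the last state is a trap. The names `build_avoid_dfa` and `accepts` are illustrative.

```python
def build_avoid_dfa(pattern, alphabet="01"):
    """DFA for strings that do NOT contain `pattern`.

    State k means: the longest suffix of the input read so far that is
    also a prefix of `pattern` has length k. State len(pattern) is a trap.
    """
    m = len(pattern)
    delta = {}
    for state in range(m):
        for symbol in alphabet:
            candidate = pattern[:state] + symbol
            # Longest suffix of candidate that is also a prefix of pattern.
            k = len(candidate)
            while k > 0 and not candidate.endswith(pattern[:k]):
                k -= 1
            delta[(state, symbol)] = k
    for symbol in alphabet:
        delta[(m, symbol)] = m  # once the substring has appeared, stay here
    accepting = set(range(m))   # complement: every state except the trap
    return delta, accepting


def accepts(delta, accepting, word):
    """Run the DFA from state 0 and report whether it ends in an accepting state."""
    state = 0
    for symbol in word:
        state = delta[(state, symbol)]
    return state in accepting


delta, accepting = build_avoid_dfa("0010")
for word in ["", "0011", "10010", "00010"]:
    print(repr(word), accepts(delta, accepting, word))
```

For a pattern of length 4 this gives 5 states, so no powerset construction is needed for this particular problem.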

by Hazim at September 29, 2016 04:22 AM


Does TensorFlow support binary/boolean tensors with type: 1 bit?

Does TensorFlow support boolean tensors where every element occupies just 1 bit (not 8)? Can I define large tensors of about 100 MB?

Where can I read about it?

I couldn't find anything by searching the Internet, which is why I'm asking.

by user1019129 at September 29, 2016 04:08 AM

Assigning a function (with arguments) to a variable name? [JavaScript]

I've decided to teach myself functional programming.

I've submitted all of my assignments whilst maintaining the rules of functional programming, and my instructor is totally fine with it. To grab inputs, I've been doing something like this:

// Wrapping prompt() in a function defers the call until getWidth() is invoked.
var getWidth = function(){
  return prompt("What is the width?");
};

This works fine but it could be something simpler, like:

var getWidth = prompt("What is the Width?");

The problem with this is that prompt("What is the Width?") runs only once, when the assignment is evaluated, and its result is what gets stored in the getWidth variable. But that's not what I want: I want it to run prompt("What is the Width?") every time getWidth() is called. I've tried to search for it, but I'm not entirely sure how to phrase it. Google's useful, if you know how to use it.

by Ucenna at September 29, 2016 04:05 AM


What would you call a (real) number within the 0‥1 interval?

Computer graphics and statistics often use parameters with values defined in the [0‥1] interval. How would you name a type for such a value (as in typedef float NNN)?

It seems that the only mathematical term that fits these requirements is “mantissa”, but it’s an overloaded term in computer science.

I settled for “norm” as in practice, the normalization of a value will bring it back to the [0‥1] interval. Thoughts? Recommendations? Alternatives?

Here’s another thread on the topic, btw.


by sebastien at September 29, 2016 03:38 AM


How to compute the spot rates as a function of the forward rates? [on hold]

Given information:

  • Spot rate at time 0.5 $r= 1.25\%$
  • Forward rate at 0.5 $f_{0.5} = 1.5\%$
  • Forward rate at 1 $f_1 = 1.75\%$

How do I find a formula that gives me the spot rates as a function of forward rates?
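One way to chain the rates, as a sketch under assumptions the question does not state: all rates are annualized and compounded once per half-year, and each forward rate applies to the half-year period starting at its subscript. Under that convention, the growth factor to maturity $T$ is the product of the per-period growth factors, and the spot rate is recovered by taking the appropriate root. The function name `spot_rates` and the `step` parameter are hypothetical.

```python
def spot_rates(first_spot, forwards, step=0.5):
    """Chain forward rates into spot rates.

    Assumes annualized rates compounded once per `step` years, with
    forwards[i] applying to the period that starts at (i + 1) * step.
    Returns a dict mapping maturity -> spot rate.
    """
    growth = 1.0 + first_spot * step       # growth of 1 unit over [0, step]
    spots = {step: first_spot}
    for periods, fwd in enumerate(forwards, start=2):
        growth *= 1.0 + fwd * step         # extend the horizon by one period
        maturity = periods * step
        # Solve (1 + r * step) ** periods == growth for r.
        spots[maturity] = (growth ** (1.0 / periods) - 1.0) / step
    return spots


rates = spot_rates(0.0125, [0.015, 0.0175])
for maturity in sorted(rates):
    print(maturity, round(rates[maturity] * 100, 4), "%")
```

With a different compounding convention (for example continuous compounding) the product becomes a sum of rate-times-period terms, so the convention should be checked against the course or text the problem comes from.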

by Dan2729 at September 29, 2016 02:49 AM


How to determine when to increase the parallelism of a single worker or increase the number of workers in Storm?

The Storm website says:

The "capacity" metric is very useful and tells you what % of the time in the last 10 minutes the bolt spent executing tuples. If this value is close to 1, then the bolt is "at capacity" and is a bottleneck in your topology. The solution to at-capacity bolts is to increase the parallelism of that bolt.

What does "increase the parallelism of that bolt" mean? Add tasks? executors? workers?

How do I determine when to increase the parallelism of a single worker and when to increase the number of workers in Storm?

by David at September 29, 2016 02:18 AM


$\mathsf{NP^{PP}}$ vs $\mathsf{P^{PP}}$

Is $\mathsf{NP^{PP}} = \mathsf{P^{PP}}$? Or, more generally, is $\mathsf{NP^{PP}} \subseteq \mathsf{P^{PP}/poly}$?

by Ilya Volkovich at September 29, 2016 02:09 AM



arXiv Cryptography and Security

Quantum Tokens for Digital Signatures. (arXiv:1609.09047v2 [quant-ph] UPDATED)

The fisherman caught a quantum fish. "Fisherman, please let me go", begged the fish, "and I will grant you three wishes". The fisherman agreed. The fish gave the fisherman a quantum computer, three quantum signing tokens and his classical public key. The fish explained: "to sign your three wishes, use the tokenized signature scheme on this quantum computer, then show your valid signature to the king, who owes me a favor".

The fisherman used one of the signing tokens to sign the document "give me a castle!" and rushed to the palace. The king executed the classical verification algorithm using the fish's public key, and since it was valid, the king complied.

The fisherman's wife wanted to sign ten wishes using their two remaining signing tokens. The fisherman did not want to cheat, and secretly sailed to meet the fish. "Fish, my wife wants to sign ten more wishes". But the fish was not worried: "I have learned quantum cryptography following the previous story (The Fisherman and His Wife by the brothers Grimm). The quantum tokens are consumed during the signing. Your polynomial wife cannot even sign four wishes using the three signing tokens I gave you".

"How does it work?" wondered the fisherman. "Have you heard of quantum money? These are quantum states which can be easily verified but are hard to copy. This tokenized quantum signature scheme extends Aaronson and Christiano's quantum money scheme, which is why the signing tokens cannot be copied".

"Does your scheme have additional fancy properties?" the fisherman asked. "Yes, the scheme has other security guarantees: revocability, testability and everlasting security. Furthermore, if you're at the sea and your quantum phone has only classical reception, you can use this scheme to transfer the value of the quantum money to shore", said the fish, and swam his way.

by Shalev Ben David, Or Sattath at September 29, 2016 01:30 AM

Control of Charging of Electric Vehicles through Menu-Based Pricing. (arXiv:1609.09037v1 [math.OC])

We propose a novel online pricing mechanism for electric vehicle (EV) charging. The charging station sets a price for each arriving EV depending on the energy requested and the time within which the EV will be served. The user either selects one of the contracts by paying the prescribed price or rejects all of them, depending on their surpluses. The charging station can serve users using renewable energy or conventional energy. Users may select longer deadlines because they may pay less when less conventional energy is needed; however, they then have to wait longer. We consider a myopic charging station and show that there exists a pricing mechanism which jointly maximizes the social welfare and the profit of the charging station when the charging station knows the utilities of the users. However, when the charging station does not know the utilities of the users, the social welfare pricing strategy may not maximize the expected profit of the charging station, and the profit may even be $0$. We propose a fixed profit pricing strategy which provides a guaranteed fixed profit to the charging station when it is unaware of the utilities of the users. Numerically, we show how the charging station can select a profit margin to trade off between its profit and the users' surpluses. We also show empirically that, because our proposed mechanism also controls the deadlines of the vehicles, unlike existing pricing mechanisms, fewer charging spots are required and greater efficiency can be achieved.

by Arnob Ghosh, Vaneet Aggarwal at September 29, 2016 01:30 AM

Dynamic control of agents playing aggregative games. (arXiv:1609.08962v1 [math.OC])

We address the problem of controlling a population of noncooperative heterogeneous agents, each with a strongly convex cost function depending on the average population state, and all sharing a convex constraint, towards a competitive aggregative equilibrium. We assume an information structure through which a central controller has access to the average population state and can broadcast control signals for steering the decentralized optimal responses of the agents. We propose a dynamic control law that, based on monotone operator theory arguments, ensures global convergence to an equilibrium independently of the problem data, that is, the cost functions and the local and global constraints of the agents. We illustrate the proposed method in two application domains: demand side management and network congestion control.

by Sergio Grammatico at September 29, 2016 01:30 AM

Independent lazy better-response dynamics on network games. (arXiv:1609.08953v1 [cs.GT])

We study an independent best-response dynamics on network games in which the nodes (players) decide to revise their strategies independently with some probability. We are interested in the convergence time to the equilibrium as a function of this probability, the degree of the network, and the potential of the underlying games.

by Paolo Penna (ETH), Laurent Viennot (IRIF, LINCS, GANG) at September 29, 2016 01:30 AM

Models of Level-0 Behavior for Predicting Human Behavior in Games. (arXiv:1609.08923v1 [cs.GT])

Behavioral game theory seeks to describe the way actual people (as compared to idealized, "rational" agents) act in strategic situations. Our own recent work has identified iterative models (such as quantal cognitive hierarchy) as the state of the art for predicting human play in unrepeated, simultaneous-move games (Wright & Leyton-Brown 2012, 2016). Iterative models predict that agents reason iteratively about their opponents, building up from a specification of nonstrategic behavior called level-0. The modeler is in principle free to choose any description of level-0 behavior that makes sense for the setting. However, almost all existing work specifies this behavior as a uniform distribution over actions. In most games it is not plausible that even nonstrategic agents would choose an action uniformly at random, nor that other agents would expect them to do so. A more accurate model for level-0 behavior has the potential to dramatically improve predictions of human behavior, since a substantial fraction of agents may play level-0 strategies directly, and furthermore since iterative models ground all higher-level strategies in responses to the level-0 strategy. Our work considers models of the way in which level-0 agents construct a probability distribution over actions, given an arbitrary game. Using a Bayesian optimization package called SMAC (Hutter, Hoos, & Leyton-Brown, 2010, 2011, 2012), we systematically evaluated a large space of such models, each of which makes its prediction based only on general features that can be computed from any normal form game. In the end, we recommend a model that achieved excellent performance across the board: a linear weighting of features that requires the estimation of four weights. We evaluated the effects of combining this new level-0 model with several iterative models, and observed large improvements in the models' predictive accuracies.

by James R. Wright, Kevin Leyton-Brown at September 29, 2016 01:30 AM

Encoding Monomorphic and Polymorphic Types. (arXiv:1609.08916v1 [cs.LO])

Many automatic theorem provers are restricted to untyped logics, and existing translations from typed logics are bulky or unsound. Recent research proposes monotonicity as a means to remove some clutter when translating monomorphic to untyped first-order logic. Here we pursue this approach systematically, analysing formally a variety of encodings that further improve on efficiency while retaining soundness and completeness. We extend the approach to rank-1 polymorphism and present alternative schemes that lighten the translation of polymorphic symbols based on the novel notion of "cover". The new encodings are implemented in Isabelle/HOL as part of the Sledgehammer tool. We include informal proofs of soundness and correctness, and have formalised the monomorphic part of this work in Isabelle/HOL. Our evaluation finds the new encodings vastly superior to previous schemes.

by Jasmin Christian Blanchette, Sascha Böhme, Andrei Popescu, Nicholas Smallbone at September 29, 2016 01:30 AM

A generic framework for the development of geospatial processing pipelines on clusters. (arXiv:1609.08893v1 [cs.DC])

The amount of remote sensing data available to applications is constantly growing due to the rise of very-high-resolution sensors and short repeat cycle satellites. Consequently, tackling computational complexity in Earth Observation information extraction is becoming a major challenge. Resorting to High Performance Computing (HPC) is becoming a common practice, since it provides environments and programming facilities able to speed up processes. In particular, clusters are flexible, cost-effective systems able to perform data-intensive tasks, ideally fulfilling any computational requirement. However, their use typically implies a significant coding effort to build proper implementations of specific processing pipelines. This paper presents a generic framework for the development of remote sensing (RS) image processing applications targeting cluster computing. It is based on common open-source libraries, and leverages the parallelization of a wide variety of image processing pipelines in a transparent way. Performance on typical RS tasks implemented using the proposed framework demonstrates a great potential for the effective and timely processing of large amounts of data.

by Remi Cresson at September 29, 2016 01:30 AM

Flexible Dual-Connectivity Spectrum Aggregation for Decoupled Uplink and Downlink Access in 5G Heterogeneous Systems. (arXiv:1609.08888v1 [cs.NI])

Maintaining multiple wireless connections is a promising solution to boost capacity in fifth-generation (5G) networks, where user equipment is able to consume radio resources of several serving cells simultaneously and potentially aggregate bandwidth across all of them. The emerging dual connectivity paradigm can be regarded as an attractive access mechanism in dense heterogeneous 5G networks, where bandwidth sharing and cooperative techniques are evolving to meet the increased capacity requirements. Dual connectivity in the uplink remained highly controversial, since the user device has a limited power budget to share between two different access points, especially when located close to the cell edge. On the other hand, in an attempt to enhance the uplink communications performance, the concept of uplink and downlink decoupling has recently been introduced. Leveraging these latest developments, our work significantly advances prior art by proposing and investigating the concept of flexible cell association in dual connectivity scenarios, where users are able to aggregate resources from more than one serving cell. In this setup, the preferred association policies for the uplink may differ from those for the downlink, thereby allowing for a truly decoupled access. With the use of stochastic geometry, the dual connectivity association regions for decoupled access are derived and the resultant performance is evaluated in terms of capacity gains over the conventional downlink received power access policies.

by Maria A. Lema, Enric Pardo, Olga Galinina, Sergey Andreev, Mischa Dohler at September 29, 2016 01:30 AM

Approachability of convex sets in generalized quitting games. (arXiv:1609.08870v1 [cs.GT])

We consider Blackwell approachability, a very powerful and geometric tool in game theory, used for example to design strategies of the uninformed player in repeated games with incomplete information. We extend this theory to "generalized quitting games", a class of repeated stochastic games in which each player may have quitting actions, such as the Big-Match. We provide three simple geometric and strongly related conditions for the weak approachability of a convex target set. The first is sufficient: it guarantees that, for any fixed horizon, a player has a strategy ensuring that the expected time-average payoff vector converges to the target set as the horizon goes to infinity. The third is necessary: if it is not satisfied, the opponent can weakly exclude the target set. In the special case where only the approaching player can quit the game (Big-Match of type I), the three conditions are equivalent and coincide with Blackwell's condition. Consequently, we obtain a full characterization and prove that the game is weakly determined: every convex set is either weakly approachable or weakly excludable. In games where only the opponent can quit (Big-Match of type II), none of our conditions is both sufficient and necessary for weak approachability. We provide a continuous time sufficient condition using techniques coming from differential games, and show its usefulness in practice, in the spirit of Vieille's seminal work for weak approachability. Finally, we study uniform approachability, where the strategy should not depend on the horizon, and demonstrate that, in contrast with classical Blackwell approachability for convex sets, weak approachability does not imply uniform approachability.

by János Flesch, Rida Laraki (LAMSADE, CNRS), Vianney Perchet at September 29, 2016 01:30 AM

Localization bounds for the graph translation. (arXiv:1609.08820v1 [cs.DM])

The graph translation operator has been defined with good spectral properties in mind, and in particular with the end goal of being an isometric operator. Unfortunately, the resulting definitions do not provide good intuitions on a vertex-domain interpretation. In this paper, we show that this operator does have a vertex-domain interpretation as a diffusion operator using a polynomial approximation. We show that its impulse response exhibits an exponential decay of the energy away from the impulse, demonstrating localization preservation. Additionally, we formalize several techniques that can be used to study other graph signal operators.

by Benjamin Girault (USC), Paulo Gonçalves (DANTE), Shrikanth Narayanan (USC), Antonio Ortega (USC) at September 29, 2016 01:30 AM

Ignoring Extreme Opinions in Complex Networks: The Impact of Heterogeneous Thresholds. (arXiv:1609.08768v1 [cs.SI])

We consider a class of opinion dynamics on networks where at each time-step, each node in the network disregards the opinions of a certain number of its most extreme neighbors and updates its own opinion as a weighted average of the remaining opinions. When all nodes disregard the same number of extreme neighbors, previous work has shown that consensus will be reached if and only if the network satisfies certain topological properties. In this paper, we consider the implications of allowing each node to have a personal threshold for the number of extreme neighbors to ignore. We provide graph conditions under which consensus is guaranteed for such dynamics. We then study random networks where each node's threshold is drawn from a certain distribution, and provide conditions on that distribution, together with conditions on the edge formation probability, that guarantee that consensus will be reached asymptotically almost surely.
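The update rule described above can be sketched as follows. This is only an illustration under stated assumptions, not the paper's exact model: each node drops its `f` highest and `f` lowest neighbor opinions (where `f` is that node's personal threshold), then takes an equally weighted average of the remaining neighbor opinions and its own. The function name `step`, the equal weights, and the four-node example are all hypothetical.

```python
def step(opinions, neighbors, thresholds):
    """One synchronous update of the opinion dynamics.

    opinions:   dict node -> current opinion (float)
    neighbors:  dict node -> list of neighboring nodes
    thresholds: dict node -> number of extreme opinions to drop per side
    """
    updated = {}
    for node, own in opinions.items():
        values = sorted(opinions[n] for n in neighbors[node])
        f = thresholds[node]
        # Drop the f lowest and f highest neighbor opinions (assumption).
        kept = values[f:len(values) - f] if len(values) > 2 * f else []
        pool = kept + [own]               # the node always keeps its own opinion
        updated[node] = sum(pool) / len(pool)
    return updated


# Hypothetical complete graph on four nodes with heterogeneous thresholds.
opinions = {"a": 0.0, "b": 1.0, "c": 2.0, "d": 10.0}
neighbors = {n: [m for m in opinions if m != n] for n in opinions}
thresholds = {"a": 0, "b": 1, "c": 1, "d": 1}

updated = step(opinions, neighbors, thresholds)
print(updated)
```

Here node a ignores nobody and is pulled toward the outlier at 10, while nodes b and c each discard it.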

by Shreyas Sundaram at September 29, 2016 01:30 AM

When Big Data Fails! Relative success of adaptive agents using coarse-grained information to compete for limited resources. (arXiv:1609.08746v1 [physics.soc-ph])

The recent trend for acquiring big data assumes that possessing quantitatively more and qualitatively finer data necessarily provides an advantage that may be critical in competitive situations. Using a model complex adaptive system where agents compete for a limited resource using information coarse-grained to different levels, we show that agents having access to more and better data can perform worse than others in certain situations. The relation between information asymmetry and individual payoffs is seen to be complex, depending on the composition of the population of competing agents.

by V. Sasidevan, Appilineni Kushal, Sitabhra Sinha at September 29, 2016 01:30 AM

Adaptive 360 VR Video Streaming: Divide and Conquer!. (arXiv:1609.08729v1 [cs.MM])

While traditional multimedia applications such as games and videos are still popular, there has been significant interest in recent years in new 3D media such as 3D immersion and Virtual Reality (VR) applications, especially 360 VR videos. 360 VR video is an immersive spherical video where the user can look around during playback. Unfortunately, 360 VR videos are extremely bandwidth intensive, and therefore are difficult to stream at acceptable quality levels. In this paper, we propose an adaptive bandwidth-efficient 360 VR video streaming system using a divide and conquer approach. In our approach, we propose a dynamic view-aware adaptation technique to tackle the huge streaming bandwidth demands of 360 VR videos. We spatially divide the videos into multiple tiles while encoding and packaging, use MPEG-DASH SRD to describe the spatial relationship of tiles in the 360-degree space, and prioritize the tiles in the Field of View (FoV). In order to describe such tiled representations, we extend MPEG-DASH SRD to the 3D space of 360 VR videos. We spatially partition the underlying 3D mesh, and construct an efficient 3D geometry mesh called hexaface sphere to optimally represent a tiled 360 VR video in the 3D space. Our initial evaluation results report up to 72% bandwidth savings on 360 VR video streaming with minor negative quality impacts compared to the baseline scenario when no adaptation is applied.

by Mohammad Hosseini, Viswanathan Swaminathan at September 29, 2016 01:30 AM

Some results on counting roots of polynomials and the Sylvester resultant. (arXiv:1609.08712v1 [math.CO])

We present two results, the first on the distribution of the roots of a polynomial over the ring of integers modulo $n$ and the second on the distribution of the roots of the Sylvester resultant of two multivariate polynomials. The second result has application to polynomial GCD computation and solving polynomial diophantine equations.

by Michael Monagan, Baris Tuncer at September 29, 2016 01:30 AM

Training a Probabilistic Graphical Model with Resistive Switching Electronic Synapses. (arXiv:1609.08686v1 [cs.NE])

Current large scale implementations of deep learning and data mining require thousands of processors, massive amounts of off-chip memory, and consume gigajoules of energy. Emerging memory technologies such as nanoscale two-terminal resistive switching memory devices offer a compact, scalable and low power alternative that permits on-chip co-located processing and memory in a fine-grain distributed parallel architecture. Here we report the first use of resistive switching memory devices for implementing and training a Restricted Boltzmann Machine (RBM), a generative probabilistic graphical model that is a key component for unsupervised learning in deep networks. We experimentally demonstrate a 45-synapse RBM realized with 90 resistive switching phase change memory (PCM) elements trained with a bio-inspired variant of the Contrastive Divergence (CD) algorithm, implementing Hebbian and anti-Hebbian weight updates. The resistive PCM devices show a two-fold to ten-fold reduction in error rate in a missing pixel pattern completion task trained over 30 epochs, compared to the untrained case. Measured programming energy consumption is 6.1 nJ per epoch with the resistive switching PCM devices, a factor of ~150 times lower than conventional processor-memory systems. We analyze and discuss the dependence of learning performance on cycle-to-cycle variations as well as the number of gradual levels in the PCM analog memory devices.

by S. Burc Eryilmaz, Emre Neftci, Siddharth Joshi, SangBum Kim, Matthew BrightSky, Hsiang-Lan Lung, Chung Lam, Gert Cauwenberghs, H.-S. Philip Wong at September 29, 2016 01:30 AM

Squared chromatic and stability numbers without claws or large cliques. (arXiv:1609.08646v1 [math.CO])

Let $G$ be a claw-free graph on $n$ vertices with clique number $\omega$. We prove the following for the square $G^2$ of $G$. If $\omega\le 3$, then its chromatic number satisfies $\chi(G^2)\le 10$ while its stability number satisfies $\alpha(G^2)\ge n/9$ unless one of its components is a $10$-vertex clique. If $\omega \le 4$, then $\chi(G^2) \le 22$ and $\alpha(G^2)\ge n/20$. This work is motivated by a conjecture of Erdős and Nešetřil and provides further evidence for a strengthened form of that conjecture.

by Wouter Cames van Batenburg, Ross J. Kang at September 29, 2016 01:30 AM

Colouring squares of claw-free graphs. (arXiv:1609.08645v1 [math.CO])

Is there some absolute $\varepsilon > 0$ such that for any claw-free graph $G$, the chromatic number of the square of $G$ satisfies $\chi(G^2) \le (2-\varepsilon) \omega(G)^2$, where $\omega(G)$ is the clique number of $G$? Erdős and Nešetřil asked this question for the specific case of $G$ the line graph of a simple graph and this was answered in the affirmative by Molloy and Reed. We show that the answer to the more general question is also yes, and moreover that it essentially reduces to the original question of Erdős and Nešetřil.

by Rémi de Joannis de Verclos, Ross J. Kang, Lucas Pastor at September 29, 2016 01:30 AM

Benchmarking the Graphulo Processing Framework. (arXiv:1609.08642v1 [cs.DB])

Graph algorithms have wide applicability to a variety of domains and are often used on massive datasets. Recent standardization efforts such as the GraphBLAS specify a set of key computational kernels that hardware and software developers can adhere to. Graphulo is a processing framework that enables GraphBLAS kernels in the Apache Accumulo database. In our previous work, we have demonstrated a core Graphulo operation called TableMult that performs large-scale multiplication operations of database tables. In this article, we present the results of scaling the Graphulo engine to larger problems and its scalability when a greater number of resources is used. Specifically, we present two experiments that demonstrate that Graphulo scaling performance is linear with the number of available resources. The first experiment demonstrates cluster processing rates through Graphulo's TableMult operator on two large graphs, scaled between $2^{17}$ and $2^{19}$ vertices. The second experiment uses TableMult to extract a random set of rows from a large graph ($2^{19}$ nodes) to simulate a cued graph analytic. These benchmarking results are of relevance to Graphulo users who wish to apply Graphulo to their graph problems.

by Timothy Weale, Vijay Gadepally, Dylan Hutchison, Jeremy Kepner at September 29, 2016 01:30 AM

Cut Tree Construction from Massive Graphs

Authors: Takuya Akiba, Yoichi Iwata, Yosuke Sameshima, Naoto Mizuno, Yosuke Yano
Abstract: The construction of cut trees (also known as Gomory-Hu trees) for a given graph enables the minimum-cut size of the original graph to be obtained for any pair of vertices. Cut trees are a powerful back-end for graph management and mining, as they support various procedures related to the minimum cut, maximum flow, and connectivity. However, the crucial drawback with cut trees is the computational cost of their construction. In theory, a cut tree is built by applying a maximum flow algorithm $n$ times, where $n$ is the number of vertices. Therefore, naive implementations of this approach result in cubic time complexity, which is too slow for today's large-scale graphs. To address this issue, in the present study, we propose a new cut-tree construction algorithm tailored to real-world networks. Using a series of experiments, we demonstrate that the proposed algorithm is several orders of magnitude faster than previous algorithms and that it can construct cut trees for billion-scale graphs.

September 29, 2016 01:10 AM

Any-time Diverse Subgroup Discovery with Monte Carlo Tree Search

Authors: Guillaume Bosc, Chedy Raïssy, Jean-François Boulicaut, Mehdi Kaytoue
Abstract: Discovering descriptions that highly distinguish one class label from another is still a challenging task. Such patterns enable the building of intelligible classifiers and suggest hypotheses that may explain the presence of a label. Subgroup Discovery (SD), a framework that formally defines this pattern mining task, still faces two major issues: (i) defining appropriate quality measures characterizing the singularity of a pattern; (ii) choosing an accurate heuristic search space exploration when a complete enumeration is infeasible. To date, the most efficient SD algorithms are based on beam search. The resulting pattern collection, however, lacks diversity due to the greedy nature of the search. We propose to use a recent exploration technique, Monte Carlo Tree Search (MCTS). To the best of our knowledge, this is the first attempt to apply MCTS to pattern mining. The exploitation/exploration trade-off and the power of random search lead to any-time mining (a solution is available at any time and improves) that generally outperforms beam search. Our empirical study on various benchmark and real-world datasets shows the strength of our approach with several quality measures.

September 29, 2016 01:08 AM

Fine-Grained Algorithm Design for Matching

Authors: George B. Mertzios, André Nichterlein, Rolf Niedermeier
Abstract: Finding maximum-cardinality matchings in undirected graphs is arguably one of the most central graph problems. For $m$-edge and $n$-vertex graphs, it is well-known to be solvable in $O(m\sqrt{n})$ time; however, for several applications this running time is still too slow. Improving this worst-case upper bound resisted decades of research. In this paper we mainly focus on parameterizations of the input with respect to several kinds of distance to triviality, i.e. how far is the input from some linear-time solvable cases. Our contribution is twofold. First we focus on linear-time fixed-parameter algorithms (with low polynomial parameter dependence). To this end we develop the first linear-time algorithm for maximum matching on cocomparability graphs; this algorithm is based on the recently discovered Lexicographic Depth First Search (LDFS) and is of independent interest. Using this algorithm we derive an $O(k(n+m))$-time algorithm for general graphs, where $k$ is the vertex deletion distance to cocomparability graphs. Second we focus on linear-time kernelization. We start a deeper and systematic study of various "distance to triviality"-parameters for the maximum matching problem. We design linear (and almost linear) time computable kernels of size $O(k)$, $O(k^2)$, $O(k^3)$, and $2^{O(k)}$, respectively, where $k$ is the considered parameter in each case. Investigating linear-time kernelization of a polynomial-time solvable problem, such as maximum matching, leads to a rich number of new and combinatorially interesting challenges. Based on our results, we postulate that maximum matching has the clear potential to become the "drosophila" of "FPT in P studies" analogously to the path-breaking role vertex covering played for classical FPT studies for NP-hard problems.

September 29, 2016 01:04 AM

On Notions of Distortion and an Almost Minimum Spanning Tree with Constant Average Distortion

Authors: Yair Bartal, Arnold Filtser, Ofer Neiman
Abstract: Minimum Spanning Trees of weighted graphs are fundamental objects in numerous applications. In particular in distributed networks, the minimum spanning tree of the network is often used to route messages between network nodes. Unfortunately, while being most efficient in the total cost of connecting all nodes, minimum spanning trees fail miserably in the desired property of approximately preserving distances between pairs. While known lower bounds exclude the possibility of the worst case distortion of a tree being small, it was shown in [ABN15] that there exists a spanning tree with constant average distortion. Yet, the weight of such a tree may be significantly larger than that of the MST. In this paper, we show that any weighted undirected graph admits a spanning tree whose weight is at most $(1+\rho)$ times that of the MST, providing constant average distortion $O(1/\rho)$.

The constant average distortion bound is implied by a stronger property of scaling distortion, i.e., improved distortion for smaller fractions of the pairs. The result is achieved by first showing the existence of a low weight spanner with small prioritized distortion, a property allowing to prioritize the nodes whose associated distortions will be improved. We show that prioritized distortion is essentially equivalent to coarse scaling distortion via a general transformation, which has further implications and may be of independent interest. In particular, we obtain an embedding for arbitrary metrics into Euclidean space with optimal prioritized distortion.

September 29, 2016 01:03 AM

StruClus: Structural Clustering of Large-Scale Graph Databases

Authors: Till Schäfer, Petra Mutzel
Download: PDF
Abstract: We present a structural clustering algorithm for large-scale datasets of small labeled graphs, utilizing a frequent subgraph sampling strategy. A set of representatives provides an intuitive description of each cluster, supports the clustering process, and helps to interpret the clustering results. The projection-based nature of the clustering approach allows us to bypass dimensionality and feature extraction problems that arise in the context of graph datasets reduced to pairwise distances or feature vectors. While achieving high quality and (human) interpretable clusterings, the runtime of the algorithm only grows linearly with the number of graphs. Furthermore, the approach is easy to parallelize and therefore suitable for very large datasets. Our extensive experimental evaluation on synthetic and real world datasets demonstrates the superiority of our approach over existing structural and subspace clustering algorithms, both, from a runtime and quality point of view.

September 29, 2016 01:02 AM

The Subset Assignment Problem for Data Placement in Caches

Authors: Shahram Ghandeharizadeh, Sandy Irani, Jenny Lam
Download: PDF
Abstract: We introduce the subset assignment problem in which items of varying sizes are placed in a set of bins with limited capacity. Items can be replicated and placed in any subset of the bins. Each (item, subset) pair has an associated cost. Not assigning an item to any of the bins is not free in general and can potentially be the most expensive option. The goal is to minimize the total cost of assigning items to subsets without exceeding the bin capacities. This problem is motivated by the design of caching systems composed of banks of memory with varying cost/performance specifications. The ability to replicate a data item in more than one memory bank can benefit the overall performance of the system with a faster recovery time in the event of a memory failure. For this setting, the number $n$ of data objects (items) is very large and the number $d$ of memory banks (bins) is a small constant (on the order of $3$ or $4$). Therefore, the goal is to determine an optimal assignment in time that minimizes dependence on $n$. The integral version of this problem is NP-hard since it is a generalization of the knapsack problem. We focus on an efficient solution to the LP relaxation as the number of fractionally assigned items will be at most $d$. If the data objects are small with respect to the size of the memory banks, the effect of excluding the fractionally assigned data items from the cache will be small. We give an algorithm that solves the LP relaxation and runs in time $O({3^d \choose d+1} \text{poly}(d) n \log(n) \log(nC) \log(Z))$, where $Z$ is the maximum item size and $C$ the maximum storage cost.

September 29, 2016 01:01 AM

A tight analysis of Kierstead-Trotter algorithm for online unit interval coloring

Authors: Tetsuya Araki, Koji M. Kobayashi
Download: PDF
Abstract: Kierstead and Trotter (Congressus Numerantium 33, 1981) proved that their algorithm is an optimal online algorithm for the online interval coloring problem. In this paper, for online unit interval coloring, we show that the number of colors used by the Kierstead-Trotter algorithm is at most $3 \omega(G) - 3$, where $\omega(G)$ is the size of the maximum clique in a given graph $G$, and it is the best possible.

September 29, 2016 01:00 AM

Multiplicative weights, equalizers, and P=PPAD

Authors: Ioannis Avramopoulos
Download: PDF
Abstract: We show that, by using multiplicative weights in a game-theoretic thought experiment (and an important convexity result on the composition of multiplicative weights with the relative entropy function), a symmetric bimatrix game (that is, a bimatrix game wherein the payoff matrix of each player is the transpose of the payoff matrix of the other) either has an interior symmetric equilibrium or there is a pure strategy that is weakly dominated by some mixed strategy. Weakly dominated pure strategies can be detected and eliminated in polynomial time by solving a linear program. Furthermore, interior symmetric equilibria are a special case of a more general notion, namely, that of an "equalizer," which can also be computed efficiently in polynomial time by solving a linear program. An elegant "symmetrization method" of bimatrix games [Jurg et al., 1992] and the well-known PPAD-completeness results on equilibrium computation in bimatrix games [Daskalakis et al., 2009, Chen et al., 2009] imply then the compelling P = PPAD.

September 29, 2016 01:00 AM

Planet Emacsen

Grant Rettke: How To Reevaluate Local Variables

Via here:

  • If you want hooks to run call normal-mode
  • If you don’t want hooks to run call hack-local-variables

by Grant at September 29, 2016 12:40 AM


plot hyperplane (separator line) using weights of line and bias? (single layer perceptron)?

I need to draw a separator line to separate males from females based on height and weight, using the output of a single-layer perceptron.

I have a data.txt file that holds two features (height and weight) and the gender, where 0 indicates males and 1 indicates females:


|      150.5          |     5.2          |   1  |
|      142.8          |     4.0          |   0  | 
|      150.5          |     5.2          |   1  |
|      190            |     5.7          |   0  |

import numpy as np
from sklearn import svm
import matplotlib.pyplot as plt
from sklearn.linear_model import perceptron
from pandas import *
import fileinput

f = fileinput.input('data.txt')

#height of females and males 
X_1 = []
#weight of female and males 
X_2 = []
#labels 0 males and 1 females 
Y = []
for line in f:
    temp = line.split(",")
    if str(temp[2]) == '0\n' :
        X_1.append(round(float(temp[0]), 2))
        X_2.append(round(float(temp[1]), 2))

print(len(X_1))
print(len(Y))
inputs = DataFrame({
'Height' : X_1,
'Weight' : X_2,
'Targets' : Y
})
colormap = np.array(['r', 'b'])

net = perceptron.Perceptron(n_iter=1000, verbose=0, random_state=None, fit_intercept=True, eta0=0.002)

# Train the perceptron object (net)
net.fit(inputs[['Height','Weight']],inputs['Targets'])
# Output the values
print("Coefficient 0 " + str(net.coef_[0, 0]))
print("Coefficient 1 " + str(net.coef_[0, 1]))
print("Bias " + str(net.intercept_))

plt.scatter(inputs.Height,inputs.Weight, c=colormap[inputs.Targets],s=20)

# Calc the hyperplane (decision boundary)
ymin, ymax = plt.ylim()
w = net.coef_[0]
a = -w[0] / w[1]
xx = np.linspace(ymin, ymax)
yy = a * xx - (net.intercept_[0]) / w[1]

# Plot the hyperplane
plt.plot(xx, yy, 'k-')

But my graph looks like this: [image: graph with separator line]

whereas my actual graph without the line looks like this: [image]. I do not know what I am doing wrong.
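For reference, here is a minimal sketch of how a separator line can be derived from a weight vector and a bias. It assumes the decision boundary is the set of points where w[0]*x + w[1]*y + b = 0, with x being height and y being weight; the function name `boundary_points` and the numeric values are hypothetical. One thing that may be worth checking in the code above is that the x values of the line are taken from the x-axis range (plt.xlim()) rather than the y-axis range (plt.ylim()).

```python
def boundary_points(w, b, x_min, x_max):
    """Return two (x, y) points on the line w[0]*x + w[1]*y + b = 0."""
    if w[1] == 0:
        raise ValueError("w[1] is zero: the boundary is a vertical line")

    def y_at(x):
        # Solve w[0]*x + w[1]*y + b = 0 for y.
        return -(w[0] * x + b) / w[1]

    return (x_min, y_at(x_min)), (x_max, y_at(x_max))


# Hypothetical weights and bias, standing in for net.coef_[0] and net.intercept_[0].
w = [2.0, -4.0]
b = 1.0
p, q = boundary_points(w, b, 140.0, 190.0)
print(p, q)
```

The two points can then be passed to plt.plot([p[0], q[0]], [p[1], q[1]], 'k-') to draw the line across the height range of the scatter plot.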

by kero at September 29, 2016 12:26 AM



Fastest static associative map

Let's say that I wanted to build a key to value associative map where the only requirement was that lookup times were fast.

Once built, the associative map would not allow inserts, deletes, or modifications, so is a static data structure.

What would the best algorithm be for making and using such a thing?

I'm interested to know if there is a provable "best solution", or if it's provably NP-Hard or similar.

Here are some ideas of my own:

Minimally Perfect Hash

My best idea would be to use minimal perfect hashing. I would find a hashing algorithm that hashes the $N$ known inputs to $[0,N)$, that resulting value being able to be looked up in an array.

I would want to find the computationally cheapest (average time) hashing algorithm for my data set.

However, finding a minimally perfect hash function is a challenging problem already without wanting the computationally cheapest one.
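As a rough illustration of the minimal perfect hashing idea, the sketch below searches for a seed by brute force until a seeded hash maps the $N$ known keys onto $[0,N)$ without collisions. This is only practical for small key sets, and it makes no attempt to find the cheapest hash. The helper names (`slot`, `find_seed`, `build_static_map`, `lookup`) and the example paths are hypothetical, and SHA-256 is used only because it is in the standard library.

```python
import hashlib


def slot(key, seed, n):
    """Map a string key to a slot in [0, n) using a seeded hash."""
    data = seed.to_bytes(8, "little") + key.encode("utf-8")
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest[:8], "little") % n


def find_seed(keys, max_seed=1_000_000):
    """Brute-force search for a seed that gives every key its own slot."""
    n = len(keys)
    for seed in range(max_seed):
        if len({slot(k, seed, n) for k in keys}) == n:
            return seed
    raise RuntimeError("no collision-free seed found")


def build_static_map(items):
    """Build an immutable lookup table from a dict of known items."""
    keys = list(items)
    seed = find_seed(keys)
    table = [None] * len(keys)
    for k in keys:
        table[slot(k, seed, len(keys))] = (k, items[k])
    return seed, table


def lookup(key, seed, table):
    """Return the stored value, or None if the key was not in the build set."""
    if not table:
        return None
    stored_key, value = table[slot(key, seed, len(table))]
    return value if stored_key == key else None


items = {"/usr/lib/a.so": 1, "/usr/lib/b.so": 2, "/usr/lib/c.so": 3,
         "/usr/bin/ls": 4, "/usr/bin/cat": 5}
seed, table = build_static_map(items)
print(lookup("/usr/bin/ls", seed, table))
```

Storing the key next to the value lets a lookup reject keys that were not part of the original set, at the cost of one comparison.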

Feature Based Indexing

Another thought I have would be to look at my input and find bits which are differing between items. For instance, file paths may have a lot of the same characters in them, especially if they are absolute paths to the same deep folder.

I could find where the bits are that matter and make a tree of objects out of them.

The challenge here I think is that I would ideally want a balanced tree, and it might be hard to test all the permutations of bits that actually matter, to make the tree as balanced as possible.

Ideally, my hope is that the entire tree could go away, and I could instead take the bits that mattered and make some equation like "xor bit 2 against bit 3 and add bit 5" to come up with an index into an array.

by Alan Wolfe at September 29, 2016 12:06 AM


HN Daily

Planet Theory

Approximate Sparse Linear Regression

Authors: Sariel Har-Peled, Piotr Indyk, Sepideh Mahabadi
Download: PDF
Abstract: In the Sparse Linear Regression (SLR) problem, given a $d \times n$ matrix $M$ and a $d$-dimensional vector $q$, we want to compute a $k$-sparse vector $\tau$ such that the error $||M \tau-q||$ is minimized. In this paper, we present algorithms and conditional lower bounds for several variants of this problem. In particular, we consider (i) the Affine SLR where we add the constraint that $\sum_i \tau_i=1$ and (ii) the Convex SLR where we further add the constraint that $\tau \geq 0$. Furthermore, we consider (i) the batched (offline) setting, where the matrix $M$ and the vector $q$ are given as inputs in advance, and (ii) the query(online) setting, where an algorithm preprocesses the matrix $M$ to quickly answer such queries. All of the aforementioned variants have been well-studied and have many applications in statistics, machine learning and sparse recovery.

We consider the approximate variants of these problems in the "low sparsity regime" where the value of the sparsity bound $k$ is low. In particular, we show that the online variant of all three problems can be solved with query time $\tilde O(n^{k-1})$. This provides non-trivial improvements over the naive algorithm that exhaustively searches all ${ n \choose k}$ subsets $B$. We also show that solving the offline variant of all three problems, would require an exponential dependence of the form $\tilde \Omega(n^{k/2}/e^{k})$, under a natural complexity-theoretic conjecture. Improving this lower bound for the case of $k=4$ would imply a nontrivial lower bound for the famous Hopcroft's problem. Moreover, solving the offline variant of affine SLR in $o(n^{k-1})$ would imply an upper bound of $o(n^d)$ for the problem of testing whether a given set of $n$ points in a $d$-dimensional space is degenerate. However, this is conjectured to require $\Omega(n^d)$ time.

September 29, 2016 12:00 AM

Understanding and Exploiting Object Interaction Landscapes

Authors: Sören Pirk, Vojtech Krs, Kaimo Hu, Suren Deepak Rajasekaran, Hao Kang, Bedrich Benes, Yusuke Yoshiyasu, Leonidas J. Guibas
Download: PDF
Abstract: Interactions play a key role in understanding objects and scenes, for both virtual and real world agents. We introduce a new general representation for proximal interactions among physical objects that is agnostic to the type of objects or interaction involved. The representation is based on tracking particles on one of the participating objects and then observing them with sensors appropriately placed in the interaction volume or on the interaction surfaces. We show how to factorize these interaction descriptors and project them into a particular participating object so as to obtain a new functional descriptor for that object, its interaction landscape, capturing its observed use in a spatio-temporal framework. Interaction landscapes are independent of the particular interaction and capture subtle dynamic effects in how objects move and behave when in functional use. Our method relates objects based on their function, establishes correspondences between shapes based on functional key points and regions, and retrieves peer and partner objects with respect to an interaction.

September 29, 2016 12:00 AM

September 28, 2016



What is the difference between Basic and Intermediate Pipelining?

I am totally lost regarding this topic. What are Basic and Intermediate Pipelining? What differentiates the two?

by snash at September 28, 2016 11:00 PM


Differences between editions of Security Analysis by Graham and Dodd?

Where can I find a comparison of the contents, a list of everything that changed or the differences among the different editions of the book Security Analysis by Benjamin Graham & David Dodd?

There are six editions of the book: 1934, 1940, 1951, 1962, 1988, and 2008.

Do I have to read all of them and compare myself or did someone already do that?

Edit: I already read The Intelligent Investor and the sixth edition of Security Analysis. This question is about the differences between the different editions. The sixth edition if I understand it correctly, is based on the 1940-edition.

by tomsv at September 28, 2016 10:45 PM


Need Help with Computer Science Lab [on hold]

This section defines a game with a pet gerbil that descends into crevices to acquire precious metal nuggets. The gerbil can only carry 10 ounces. The precious metals and their values (Rhodium is the highest and Ruthenium is the lowest) are as follows.

  1. Rhodium
  2. Platinum
  3. Gold
  4. Ruthenium

You want to train your gerbil such that when sent into a crevice, the gerbil returns with the most valuable haul. For example, if the crevice contains 5 ounces of Rhodium, 6 of Platinum, 4 of Gold, and 7 of Ruthenium, I want her to return with 5 ounces of Rhodium and 5 of Platinum.

Create a Java Gerbil class that has one public static void found method with the following signature.

public static void found(int rhodium, int platinum, int gold, int ruthenium)

The formal parameters are the amounts of Rhodium, Platinum, Gold, and Ruthenium. The method found prints to the console the best retrieval of metal for that trip into the crevice. The following are example calls to the function found.

found(5, 6, 4, 7); // prints the following
5 Rhodium 5 Platinum 0 Gold 0 Ruthenium

found(10, 10, 10, 10); // prints the following
10 Rhodium 0 Platinum 0 Gold 0 Ruthenium

found(3, 0, 0, 1); // prints the following
3 Rhodium 0 Platinum 0 Gold 1 Ruthenium

Work through several specific, concrete examples to discover the algorithm. The algorithm I created is several nested ifs.

You can solve this problem without a loop.

(How do I write the code for this)
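One way to think about the algorithm is as a greedy fill: take as much as possible of the most valuable metal, then move on to the next one with whatever capacity remains. The sketch below shows that idea in Python rather than Java, and it uses a loop for brevity even though the assignment says none is needed. It assumes the value per ounce strictly follows the order Rhodium, Platinum, Gold, Ruthenium; the `capacity` parameter and the `report` helper are hypothetical additions.

```python
def found(rhodium, platinum, gold, ruthenium, capacity=10):
    """Return the ounces taken of each metal, most valuable first."""
    haul = []
    remaining = capacity
    for available in (rhodium, platinum, gold, ruthenium):
        taken = min(available, remaining)  # never exceed what is left to carry
        haul.append(taken)
        remaining -= taken
    return tuple(haul)


def report(haul):
    """Format a haul in the style of the examples in the assignment."""
    names = ("Rhodium", "Platinum", "Gold", "Ruthenium")
    return " ".join(f"{amount} {name}" for amount, name in zip(haul, names))


print(report(found(5, 6, 4, 7)))
```

Unrolling the loop into four consecutive min computations gives a loop-free version, which may be easier than nested ifs.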

by Jenny at September 28, 2016 10:39 PM


Dynamic for loops

I am trying to build a program which uses two for loops and when called should print out:

1 2 3 4 5 6 7 8 9 10
2 4 6 8 10 12 14 16 18 20
3 6 9 12 15 18 21 24 27 30

let mutable s:string = ""

let loopMulTable n = 
  for i in 1..10 do
    s <- sprintf "%i " (n*i)
    printf "%s " s
    for j in n-1 do
      s <- sprintf "%i " (n*i)
loopMulTable 2

Right now I am not too concerned about the spacing, just making the right prints.

I am not sure what's wrong with my for j in n-1 do loop, but it gives me the error: FS0043: The type 'int32' is not compatible with the type seq<'a>.

I probably need to do some type of casting, but not sure how.

Thank you!

Figured out a solution:

let mutable s:string = ""

let loopMulTable n = 
  for i in 1..n do
    printf "\n"
    for j in 1..10 do
      s <- sprintf "%5i " (i*j)
      printf "%s " s
loopMulTable 10

by kthonenice at September 28, 2016 10:34 PM


What happens internally when you remove an object from a treemap

I am wondering what happens if you remove an object in the middle of a treemap. If the treemap would look something like this:

[Figure: tree in which the node 30 has the children 27 and 34]

So what would happen if you remove the number 30? Because 27 and 34 will disconnect from the tree.
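Java's TreeMap is implemented as a red-black tree, and removing a node with two children does not disconnect them: the node's key is replaced by its in-order successor (the smallest key in its right subtree), and that successor node is removed instead. The sketch below shows this on a plain, unbalanced binary search tree; it leaves out the rebalancing and recoloring a red-black tree performs afterwards. The keys 20 and 10 are hypothetical, since the original picture is not available.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(node, key):
    """Insert a key into the subtree rooted at node and return the new root."""
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    elif key > node.key:
        node.right = insert(node.right, key)
    return node


def delete(node, key):
    """Remove a key from the subtree rooted at node and return the new root."""
    if node is None:
        return None
    if key < node.key:
        node.left = delete(node.left, key)
    elif key > node.key:
        node.right = delete(node.right, key)
    else:
        # Zero or one child: splice the node out.
        if node.left is None:
            return node.right
        if node.right is None:
            return node.left
        # Two children: copy the in-order successor here, then remove it below.
        successor = node.right
        while successor.left is not None:
            successor = successor.left
        node.key = successor.key
        node.right = delete(node.right, successor.key)
    return node


def inorder(node):
    """Return the keys of the subtree in sorted order."""
    if node is None:
        return []
    return inorder(node.left) + [node.key] + inorder(node.right)


root = None
for k in [20, 10, 30, 27, 34]:
    root = insert(root, k)
root = delete(root, 30)
print(inorder(root))
```

In this example 34 moves into the position that 30 occupied and 27 stays attached as its left child.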

by Luud van Keulen at September 28, 2016 10:30 PM

Correct bracketing check with rotate operation on position i

Given a sequence (of length $N$) of brackets $($ and $)$, the task is to implement a data structure which supports the following operations:

  • Check whether the sequence is correctly bracketed
  • Rotate bracket at position $i$

I don't want a solution, just some feedback on whether I am on the right track.

There is no time complexity constraint in the instructions, so I suppose it should be better than $O(N)$, because the trivial solution with a stack takes $O(N)$.

The only options are:

  • It can be done in logarithmic time using some smart tree data structure.
  • Or it can be done in amortized constant time, which is what I actually believe.

I have the following ideas:

  • Define function $w(i)$ for each position $i$ by $w(0)=0$ and $w(i)=w(i-1)+d_i$ where $d_i=+1$ if the bracket is $($ and $d_i=-1$ otherwise. The first bracket has index $1$.
  • Fact: The sequence is not correctly bracketed if and only if $w(N) > 0$ or there exists a $j$ such that $w(j) < 0$.
  • So I want to keep track of the minimum $m$ of those $w(i)$. Then I am able to answer the first query in constant time just by checking whether $m < 0$ or $w(N) > 0$.
  • The problem is how to track the value of $m$ in the best possible time.
  • If I rotate the bracket at position $i$, all $w(j)$ for $j$ from $i$ to $N$ change by the constant $\pm2$, which may change $m$. Updating those values directly would again lead to linear time.
  • So I was wondering how to encapsulate this behaviour in another data structure (post is here) in some smart way. But I just can't get past that. Everything I have thought of is linear. So is it possible at all?
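The logarithmic option can be sketched with a segment tree over the values $d_i$. Each node stores the sum of its range and the minimum prefix sum within its range, so rotating one bracket only touches the nodes on one root-to-leaf path. This is one possible approach, not necessarily the intended one; the class name `BracketSequence` is hypothetical and positions are 0-based here, unlike in the post.

```python
class BracketSequence:
    def __init__(self, brackets):
        self.n = len(brackets)
        self.size = 1
        while self.size < self.n:
            self.size *= 2
        self.total = [0] * (2 * self.size)  # sum of d_i over the node's range
        self.low = [0] * (2 * self.size)    # minimum prefix sum (empty prefix included)
        self.delta = [1 if c == "(" else -1 for c in brackets]
        for i, d in enumerate(self.delta):
            self._set_leaf(i, d)
        for node in range(self.size - 1, 0, -1):
            self._pull(node)

    def _set_leaf(self, i, d):
        self.total[self.size + i] = d
        self.low[self.size + i] = min(0, d)

    def _pull(self, node):
        left, right = 2 * node, 2 * node + 1
        self.total[node] = self.total[left] + self.total[right]
        # A prefix either ends in the left child or covers it entirely.
        self.low[node] = min(self.low[left], self.total[left] + self.low[right])

    def flip(self, i):
        """Rotate the bracket at 0-based position i in O(log N)."""
        self.delta[i] = -self.delta[i]
        self._set_leaf(i, self.delta[i])
        node = (self.size + i) // 2
        while node >= 1:
            self._pull(node)
            node //= 2

    def is_correct(self):
        """The sequence is correct iff w(N) = 0 and no prefix sum is negative."""
        return self.total[1] == 0 and self.low[1] >= 0


seq = BracketSequence("(())")
print(seq.is_correct())
seq.flip(0)
print(seq.is_correct())
```

The check itself is constant time because it only reads the root; each rotation costs $O(\log N)$.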

by Ondra Hrubý at September 28, 2016 10:28 PM

Array counter with minimum find

I need to implement a data structure similar to an array, but with the following interface:

  • GetMin() - Returns the minimum from the array
  • IncRight(index) - Increases all values from specified index to the end of the array by 2
  • DecRight(index) - Decreases all values from specified index to the end of the array by 2

Assume that the data structure is already initialized and full of items (integers only), and that the minimum has already been found together with its index. So I "only" need to keep track of the changes.

My goal is for all operations to have amortized constant time complexity. The question is: is it possible at all?

In any case: how can such a data structure be implemented with the best time complexity?

I know one more thing about the initial values: two adjacent values differ only by 1 (plus or minus).
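A possible $O(\log n)$ realization of this interface stores the differences between adjacent values instead of the values themselves. Adding a constant to every value from `index` to the end then changes a single difference, and the minimum of the array is the minimum prefix sum of the differences. The sketch below is an assumption-laden illustration, not a proof that constant amortized time is impossible; the class name `SuffixAddMin` and the method names are hypothetical.

```python
class SuffixAddMin:
    def __init__(self, values):
        self.size = 1
        while self.size < len(values):
            self.size *= 2
        inf = float("inf")
        self.total = [0] * (2 * self.size)   # sum of differences in the range
        self.best = [inf] * (2 * self.size)  # min over non-empty prefixes of the range
        previous = 0
        for i, value in enumerate(values):
            self.total[self.size + i] = value - previous
            self.best[self.size + i] = value - previous
            previous = value
        for node in range(self.size - 1, 0, -1):
            self._pull(node)

    def _pull(self, node):
        left, right = 2 * node, 2 * node + 1
        self.total[node] = self.total[left] + self.total[right]
        self.best[node] = min(self.best[left], self.total[left] + self.best[right])

    def _add_right(self, index, amount):
        # Changing one difference shifts every value from index onwards.
        node = self.size + index
        self.total[node] += amount
        self.best[node] += amount
        node //= 2
        while node >= 1:
            self._pull(node)
            node //= 2

    def inc_right(self, index):
        self._add_right(index, 2)

    def dec_right(self, index):
        self._add_right(index, -2)

    def get_min(self):
        return self.best[1]


counter = SuffixAddMin([3, 4, 3, 2, 3])
print(counter.get_min())
counter.inc_right(2)
print(counter.get_min())
```

GetMin reads only the root, so it is constant time, while IncRight and DecRight each update one leaf and its ancestors.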

by Ondra Hrubý at September 28, 2016 10:20 PM


About the ``recent" paper by Razborov in the Annals of Mathematics

Recently this paper on complexity theory by Razborov was published in the Annals of Mathematics. Curiously, it seems to have been submitted to the journal 12 years ago!

I was wondering if by now this work has already gone into some textbook or has it gotten improved or if there are already pedagogic expositions available for this? It would be helpful to get such references if they exist!

by gradstudent at September 28, 2016 10:10 PM

Lambda the Ultimate Forum

SPLASH'16 Amsterdam CFP: early registration ends Sept 30

ACM Conference on Systems, Programming, Languages, and Applications:
Software for Humanity (SPLASH'16)

Amsterdam, The Netherlands
Sun 30th October - Fri 4th November , 2016


30 September 2016 (Early Deadline)

# What's happening at SPLASH?

## Keynotes

- Benjamin Pierce (SPLASH)
The Science of Deep Specification

- Andy Ko (SPLASH)
A Human View of Programming Languages

- Martin Odersky (SPLASH)

- Guy Steele Jr. (SPLASH-I)

- Robby Findler (SLE)
Redex: Lightweight Semantics Engineering

- Tiark Rompf (GPCE)
Lightweight Modular Staging: Generate all the things!

- Simon Peyton Jones (SPLASH-I/E)
The dream of a lifetime: shaping how our children learn computing

- Laurence Tratt (Scala)
Fine-grained language composition without a common VM

- Jan Vitek (Scala)
This is not a Type: Gradual typing in practice

## Workshop Keynotes

- Andrew Black (NOOL)
The Essence of Inheritance

- Alan Blackwell (PLATEAU)
How to Design a Programming Language

- Felienne Hermans (DSLDI)
Small, simple and smelly: What we can learn from examining end-user artifacts?

- Ivano Malavolta (Mobile!)
Beyond native apps: Web technologies to the rescue!

- Betsy Pepels (ITSLE)
Model Driven Software Engineering (MDSE) in the large

- Markus Voelter (ITSLE)
Lessons Learned about Language Engineering from the Development of mbeddr

- Beverly Sanders (SEPS)
Patterns for Parallel Programming: New and Improved!

** Conference Program **

** SPLASH-I Track **

SPLASH-I is a series of invited and solicited talks that address topics relevant to the SPLASH community. Speakers are world-class experts in their field, selected and invited by the organizers. The SPLASH-I talks series is held in parallel with the rest of SPLASH during the week days. Talks are open to all attendees.

A selection of confirmed talks:

- Edwin Brady
Type-driven Development in Idris

- Jürgen Cito
Using Docker Containers to Improve Reproducibility in PL/SE Research

- Yvonne Coady
Exploratory Analysis in Virtual Reality: The New Frontier

- Adam Chlipala
Rapid Development of Web Applications with Typed Metaprogramming in Ur/Web

- Tudor Girba
Software Environmentalism

- Robert Grimm
Adventures in Software Evolution

- Brian Harvey
Snap! Scheme Disguised as Scratch

- Lennart Kats
Responsive Language Tooling For Cloud-based IDEs

- Ralf Laemmel
The basic skill set of software language engineering

- Crista Lopes
Simulating Cities: The Spacetime Framework

- Heather Miller
Language Support for Distributed Systems

- Mark Miller & Bill Tulloh
The elements of decision alignment: Large programs as complex organizations

- Boaz Rosenan & David Lorenz
Define Your App, Don’t Implement It: Building a Scalable Social Network in 45 minutes

- Emmanuel Schanzer

- Chris Seaton
Truffle and Graal: Fast Programming Languages With Modest Effort

- Emma Söderberg
From Tricorder to Tricium: Useful Static Analysis and the Importance of Workflow Integration

- Emma Tosch
Designing and Debugging Surveys with SurveyMan

- Todd Veldhuizen
Fast Datalog

- Markus Völter
How Domain Requirements Shape Languages

- Jos Warmer
Making Mendix Meta Model Driven

- Andy Zaidman
Fact or fiction? What software analytics can do for us (developers and researchers)

More information here:

** Research tracks


- Onward!

- Onward! Essays

- Software Language Engineering (SLE)

- Generative Programming: Concepts & Experiences (GPCE)

- Dynamic Languages Symposium (DLS)

- Scala Symposium

** Other Events

- Doctoral Symposium

- Programming Language Mentoring Workshop (PLMW)

- Student Research Competition (SRC)

- Posters

** Workshops

SPLASH'16 is hosting a record number of 15 workshops:

- AGERE! Programming based on Actors, Agents, and Decentralized Control

- DSLDI: Domain-Specific Language Design and Implementation

- DSM: Domain-Specific Modeling

- FOSD: Feature-oriented Software Development

- ITSLE: Industry Track Software Language Engineering

- LWC@SLE: Language Workbench Challenge


- Mobile!

- NOOL: New Object-Oriented Languages

- PLATEAU: Evaluation and Usability of Programming Languages and Tools

- Parsing@SLE

- REBLS: Reactive and Event-based Languages & Systems

- SA-MDE: Tutorial on MDD with Model Catalogue and Semantic Booster

- SEPS: Software Engineering for Parallel Systems

- VMIL: Virtual Machines and Intermediate Languages

- WODA: Workshop on Dynamic Analysis

## SPLASH'16 is kindly supported by the following organizations:

- ACM:
- LogicBlox (Gold):
- Universal Robots (PLMW, Gold):
- Oracle (Silver):
- TU Delft (Silver):
- Huawei (Bronze):
- Facebook (Bronze):
- IBM Research (Bronze):
- Google (Bronze):
- Itemis (Bronze):
- ING (Bronze):

Interested in supporting SPLASH'16? See our options here:

September 28, 2016 10:00 PM


Cluster Scenario: Difference between the computedCost of 2 points used as similarity measure between points. Is it applicable?

I want to have a measure of similarity between two points in a cluster. Would the similarity calculated this way be an acceptable measure of similarity between the two data points?

Say I have two vectors, A and B, that are in the same cluster. I have trained a clustering model, denoted by model, and model.computeCost() computes the squared distance between the input point and the corresponding cluster center.

(I am using Apache Spark MLlib)

val costA = model.computeCost(A)
val costB = model.computeCost(B)

val dissimilarity = math.abs(costA - costB)

Dissimilarity, i.e. the higher the value, the more unlike each other they are.
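A small numeric check may help when judging this measure. The sketch below uses plain Python instead of Spark, with a hypothetical cluster center and two hypothetical points that lie on opposite sides of it. Both points have the same squared distance to the center, so the cost difference is zero even though the points are far apart.

```python
import math


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


center = (0.0, 0.0)   # hypothetical cluster center
a = (1.0, 0.0)        # hypothetical point A
b = (-1.0, 0.0)       # hypothetical point B, opposite side of the center

cost_a = squared_distance(a, center)
cost_b = squared_distance(b, center)

cost_difference = abs(cost_a - cost_b)
direct_distance = math.sqrt(squared_distance(a, b))

print(cost_difference, direct_distance)
```

This suggests the cost difference only compares how far each point is from the center, not how far the points are from each other; the direct distance between A and B does not have this blind spot.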

by Mnemosyne at September 28, 2016 09:52 PM


A Dead-lock in an Operating System is

A Dead-lock in an Operating System is

  1. Desirable process
  2. Undesirable process
  3. Definite waiting process
  4. All of the above

My attempt:

By definition: "If a process is unable to change its state indefinitely because the resources requested by it are being used by another waiting process, then the system is said to be in a deadlock."

So none of the options should be true. However, an answer key I found gives option $(3)$.

Can you explain it, please?

by Mithlesh Upadhyay at September 28, 2016 09:41 PM


Mean and standard deviation of price series with Kalman

I would like to calculate the mean and standard deviation of a price series using the Kalman filter. I am stuck on the deviation, or have some problem in understanding that my research could not resolve.

mean(t) =  mean(t-1) + K(t) * ( price(t) - mean(t-1) )

with Kalman gain K(t) = R(t-1) / (R(t-1) + Ve), state variance R(t) = (1 - K(t)) * R(t-1) and measurement error Ve practically as some pre-defined parameter, similarly to the lookback period in a simple mean.

I've read a few times that the variance R should give some kind of variance (and thus standard deviation) of the price series. But with K < 1, R just gets smaller with every iteration and is in no way the deviation of the price series. This would only make sense when measuring a constant value, where every measurement iteration gives us more certainty. Is my concept of the Kalman filter too simplistic? Can anybody give me a hint, please?
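The recursion above can be written out directly to see how R behaves. In the sketch below, the function name `kalman_mean`, the starting variance `r0`, and the price values are hypothetical. The optional process noise `q` is an addition not present in the formulas above: the standard Kalman prediction step adds it to the state variance before each update, and with `q = 0` the code reduces to the recursion as stated.

```python
def kalman_mean(prices, r0, ve, q=0.0):
    """Scalar Kalman filter tracking the mean of a price series.

    r0: initial state variance, ve: measurement error variance,
    q: process noise variance (q = 0 reproduces the recursion in the post).
    """
    mean = prices[0]
    r = r0
    means, variances = [], []
    for price in prices:
        r_pred = r + q                  # prediction step (no-op when q = 0)
        k = r_pred / (r_pred + ve)      # Kalman gain K(t)
        mean = mean + k * (price - mean)
        r = (1.0 - k) * r_pred          # updated state variance R(t)
        means.append(mean)
        variances.append(r)
    return means, variances


prices = [100.0, 102.0, 98.0, 105.0, 95.0]
_, r_without_noise = kalman_mean(prices, r0=1.0, ve=1.0)
_, r_with_noise = kalman_mean(prices, r0=1.0, ve=1.0, q=0.1)
print(r_without_noise)
print(r_with_noise)
```

With `q = 0` the variance shrinks at every step regardless of the prices, which matches the observation in the question: R describes the uncertainty of the mean estimate, not the spread of the prices themselves.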

by Mike at September 28, 2016 09:37 PM


Unique triangulation duals of simple polygons

Given a triangulation (without Steiner points) of a simple polygon $P$, one can consider the dual of this triangulation, which is defined as follows. We create a vertex for every triangle in our triangulation, and we connect two vertices if the corresponding triangles share an edge. The dual graph is known to be a tree with maximum degree three.

For my application, I am interested in the following. Given a tree $T$ with maximum degree three, is there always a simple polygon $P$ such that the dual of every triangulation (without Steiner points) of $P$ is equal to $T$? Here, the triangulation of $P$ may not be unique, but I require that the dual graph be unique.

This is certainly true when $T$ is a path, but becomes unclear when you have vertices of degree three.

by Nizbel99 at September 28, 2016 09:32 PM



WPA_Supplicant disconnects on exit

This is probably an extremely noob question, but I can't exit wpa supplicant and keep connected.

I tried using & and -B (individually). There are no errors for me to post, but if it helps I'm using ARM FreeBSD on a Rasp Pi.

To clarify, Ctrl-c is the only way I can exit, but upon doing so I disconnect from Wi-Fi. So in short, how do I return to the terminal and stay connected?

by Blazing Code at September 28, 2016 09:28 PM


what is the process used in ANN Back propagation

I have been doing a project, for the duration of the semester, and am nearly complete with the research side. All I need is to know how back propagation works. I am familiar with all the other algorithms associated with baseline ANNs, just not those used in the backprop teaching method. For full disclosure my activation function is:

Function SigMod(x):
    return (1/(1+(x^2)))
for n in Layer:
    NodeInLayer = 0.0
    for w in Weights:
        NodeInLayer += SigMod((X[w]*W[w]))
    Y[n] = NodeInLayer

My main question is, again, how exactly do I teach an ANN through the backprop method?
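As a rough outline of what one backpropagation step does, the sketch below trains a tiny network with two inputs, two hidden nodes, and one output on a single example. It assumes the standard logistic sigmoid (not the 1/(1+x^2) function above), a squared-error loss, no bias terms, and plain gradient descent; all names and numbers are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, w_out):
    """Compute hidden activations and the network output."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)))
    return hidden, output


def train_step(x, target, w_hidden, w_out, lr):
    """One backpropagation step; returns the loss before the update."""
    hidden, output = forward(x, w_hidden, w_out)
    # Error signal at the output for E = 0.5 * (output - target)^2.
    delta_out = (output - target) * output * (1.0 - output)
    # Propagate the error back through the output weights to the hidden layer.
    delta_hidden = [delta_out * w_out[j] * hidden[j] * (1.0 - hidden[j])
                    for j in range(len(hidden))]
    # Gradient descent: each weight moves against its own gradient.
    for j in range(len(w_out)):
        w_out[j] -= lr * delta_out * hidden[j]
    for j, row in enumerate(w_hidden):
        for i in range(len(row)):
            row[i] -= lr * delta_hidden[j] * x[i]
    return 0.5 * (output - target) ** 2


w_hidden = [[0.1, -0.2], [0.4, 0.3]]
w_out = [0.2, -0.1]
losses = [train_step([1.0, 0.5], 1.0, w_hidden, w_out, lr=0.5) for _ in range(200)]
print(losses[0], losses[-1])
```

The pattern is the same for deeper networks: compute the error signal at the output, push it backwards layer by layer using the chain rule, and update each weight by the product of its input and the error signal of the node it feeds.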

by Eric Schwarz at September 28, 2016 09:09 PM


Open Source Neural Network Library [closed]

I am looking for an open source neural network library. So far, I have looked at FANN, WEKA, and OpenNN. Are there others that I should look at? The criteria, of course, are documentation, examples, and ease of use.

by Loozie at September 28, 2016 09:09 PM



Asymptotic growth of search algorithms

I have 2 search algorithms and I have derived the following tight bound representations:

$$ n\log(n) + m\log(n) $$

$$ m \cdot n $$

Now I want to find a function $f(n)$ such that when $m \in \Theta(f(n))$, both algorithms have equal asymptotic run time.

I'm not sure if it's as simple as setting the two expressions equal to each other and isolating $m$, or if there is more to it than that.
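As a sanity check, here is what setting the two expressions equal and isolating $m$ gives, assuming $n$ is large enough that $n > \log(n)$:

$$ m \cdot n = n\log(n) + m\log(n) \;\Longrightarrow\; m\,(n - \log(n)) = n\log(n) \;\Longrightarrow\; m = \frac{n\log(n)}{n - \log(n)} $$

Since $n - \log(n) \in \Theta(n)$, this suggests $m \in \Theta(\log(n))$. Substituting $m = c\log(n)$ back in, the first expression becomes $n\log(n) + c\log^2(n) \in \Theta(n\log(n))$ and the second becomes $c \cdot n\log(n) \in \Theta(n\log(n))$, so both agree up to constant factors.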

by Christian at September 28, 2016 08:14 PM



Relation between Type Assignment system (TA) and Hindley-Milner system

Recently I started my studies in type theory/type systems and Lambda Calculus.

I have already read about Simply Typed Lambda Calculus in Church and Curry style. The latter is also known as Type Assignment system (TA).

I'm thinking about the relations between TA and Hindley-Milner (HM), the system in languages like ML and Haskell.

The book Lambda-Calculus and Combinators: An Introduction (Hindley) says that TA is polymorphic (p. 119). Is that the same sense of polymorphism as in systems like HM and System-F?

TA is said to have the strong normalisation property, so it is not Turing complete. Languages that use the HM system are Turing complete, Haskell for example. So it must be the case that the HM system allows terms like the infinite loop $\Omega$ to receive a type. Is that correct, or am I missing something?

Anyway, I would like to know the relation between TA and HM.

by Rafael Castro at September 28, 2016 08:04 PM


Why there is no CSS4

I’m still not sure why we don’t have a CSS tag… sighs


by flyingfisch at September 28, 2016 08:03 PM


caret: performing grouped regression with train()

Hopefully this isn't a completely idiotic question. I have a dataset df, n = 2228, p = 19 that describes characteristics of 5 breeds of horses. I would like to model the continuous variable price as a function of the other 17 predictor variables (even mix of categorical and continuous) for each breed by first splitting the data into training and test.

# pre-processing reveals no undue correlation, linear dependency or
# near-zero variance variables
train <- df %>% group_by(breed) %>% sample_frac(size = 2/3) %>% droplevels()
test <- anti_join(df, train) %>% droplevels()
# I imagine I should be somehow able to do this in the following step but can't
# figure it out
model <- train(price ~ ., data = train, method = "glmnet")
test$pred <- predict(model, newdata = test)

As far as I can tell I have no issue splitting the data by breed (see the above code). However, I can't figure out how to fit the model grouped by breed. What I would like to do is analogous to the following from the package nlme i.e. lmList(price ~ . |breed, data = df)

by user6571411 at September 28, 2016 07:59 PM



Missing value error when using bagImpute preprocessing within caret::train function

I want to train a random forest model with a repeatedcv procedure using caret::train. My data has some missing values, so I want to use the preProcess="bagImpute" option within the train function. I do not want to use the preProcess function outside of train, because I want to bagImpute my data for each iteration of the repeatedcv procedure. However, when I attempt to do this, an error is thrown:

Error in { : task 1 failed - "'n' must be a positive integer >= 'x'"
In addition: There were 50 or more warnings (use warnings() to see the first 50)
> warnings()
Warning messages:
1: In eval(expr, envir, enclos) :
  model fit failed for Fold01.Rep01: mtry=2 Error in = c(5.1, 4.9, 4.7,  : 
  missing values in object

Below is a minimal reproducible example using the iris data. I borrowed the initial code for the dataset prep from Minkoo at his website: Many thanks Minkoo!


inTrain <- createDataPartition(iris$Species, p=0.8, list=FALSE)
training <- iris[inTrain, ]

fillInNa <- function(d) {
      naCount <- NROW(d) * 0.1
      for (i in sample(NROW(d), naCount)) {
            d[i, sample(4, 1)] <- NA
      }
      return(d)
}

training <- fillInNa(training)

tc<-trainControl("repeatedcv", repeats=30, selectionFunction="oneSE",returnData=T, 
classProbs = T,num=10, preProcOptions ="bagImpute", 
summaryFunction=multiClassSummary, savePredictions = T)


rfTri_Bag<- train(training.x,training.y, 
              preProcess= c("bagImpute"),

Edit: Here is my session info:

R version 3.3.1 (2016-06-21)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_UnitedStates.1252 LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C LC_TIME=English_United States.1252    

attached base packages:
[1] stats4    grid      stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] ipred_0.9-5         e1071_1.6-7         latticeExtra_0.6-28 RColorBrewer_1.1-2  randomForest_4.6-12 caret_6.0-71       
 [7] rpart_4.1-10        party_1.0-25        strucchange_1.5-1   sandwich_2.3-4      zoo_1.7-13          modeltools_0.2-21  
[13] mvtnorm_1.0-5       gdata_2.17.0        DMwR_0.4.1          pROC_1.8            Metrics_0.1.1       raster_2.5-8       
[19] sp_1.2-3            gridExtra_2.2.1     readr_1.0.0         tidyr_0.6.0         tibble_1.2          tidyverse_1.0.0    
[25] MuMIn_1.15.6        merTools_0.2.2      devtools_1.12.0     plyr_1.8.4          arm_1.9-1           lattice_0.20-33    
[31] MASS_7.3-45         xtable_1.8-2        lmerTest_2.0-32     lme4_1.1-12         Matrix_1.2-6        xlsx_0.5.7         
[37] xlsxjars_0.6.1      rJava_0.9-8         AICcmodavg_2.0-4    pander_0.6.0        ggplot2_2.1.0       purrr_0.2.2        
[43] dplyr_0.5.0         broom_0.4.1        

loaded via a namespace (and not attached):
 [1] TH.data_1.0-7      VGAM_1.0-2         minqa_1.2.4        colorspace_1.2-6   class_7.3-14       MatrixModels_0.4-1
 [7] DT_0.2             prodlim_1.5.7      coin_1.1-2         codetools_0.2-14   splines_3.3.1      mnormt_1.5-4      
[13] knitr_1.14         Formula_1.2-1      nloptr_1.0.4       pbkrtest_0.4-6     cluster_2.0.4      shiny_0.14        
[19] compiler_3.3.1     httr_1.2.1         assertthat_0.1     lazyeval_0.2.0     acepack_1.3-3.3    htmltools_0.3.5   
[25] quantreg_5.29      tools_3.3.1        coda_0.18-1        gtable_0.2.0       reshape2_1.4.1     Rcpp_0.12.7       
[31] nlme_3.1-128       iterators_1.0.8    psych_1.6.6        stringr_1.1.0      mime_0.5           gtools_3.5.0      
[37] scales_0.4.0       parallel_3.3.1     SparseM_1.7        yaml_2.1.13        quantmod_0.4-6     curl_1.2          
[43] memoise_1.0.0      reshape_0.8.5      stringi_1.1.1      foreach_1.4.3      blme_1.0-4         TTR_0.23-1        
[49] caTools_1.17.1     boot_1.3-18        lava_1.4.4         chron_2.3-47       bitops_1.0-6       evaluate_0.9      
[55] ROCR_1.0-7         htmlwidgets_0.7    labeling_0.3       magrittr_1.5       R6_2.1.3           gplots_3.0.1      
[61] Hmisc_3.17-4       multcomp_1.4-6     DBI_0.5            foreign_0.8-66     withr_1.0.2        mgcv_1.8-12       
[67] xts_0.9-7          survival_2.39-4    abind_1.4-5        nnet_7.3-12        car_2.1-3          KernSmooth_2.23-15
[73] rmarkdown_1.0      data.table_1.9.6   git2r_0.15.0       digest_0.6.10      httpuv_1.3.3       munsell_0.4.3     
[79] unmarked_0.11-0   

Edit 2: An almost identical question has been asked here , but the answer given simply shows how to predict from a preProcess() object outside of the train() function. As @Misconstruction points out in a comment, with this method the imputation is "not included inside the CV loop." - My thoughts exactly.

by jlab at September 28, 2016 07:35 PM

What does the expression ## mean in Chisel?

I've been looking all over the web to find out what the expression ## means in Chisel, but can't find it anywhere.

For example in this code snippet:

val ways = Module(new BRAM(log2Up(conf.lines), conf.ways * line_size))
val din = Vec.fill(conf.ways) { Bits(width=line_size) }
if(conf.ways == 2) { := din(1) ## din(0)

What is the line in the if-statement doing using the ## expression? Thanks!
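For context, `##` in Chisel is the bit-concatenation operator (the infix form of `Cat`): the left operand becomes the upper bits and the right operand the lower bits. A minimal sketch of that arithmetic on plain integers follows; the helper `cat` and the 8-bit `line_size` are assumptions for illustration, not Chisel code.

```python
def cat(hi, lo, lo_width):
    """Concatenate two bit vectors held as non-negative integers.

    `hi` becomes the upper bits; `lo` fills the lower `lo_width` bits.
    """
    if hi < 0 or lo < 0 or lo >= (1 << lo_width):
        raise ValueError("operands must be non-negative and lo must fit in lo_width bits")
    return (hi << lo_width) | lo


line_size = 8          # assumed width of each way, for illustration
din0 = 0xAB
din1 = 0xCD
word = cat(din1, din0, line_size)   # mirrors din(1) ## din(0)
print(hex(word))
```

So with two ways, the expression packs both data lines into one value of twice the width, with `din(1)` in the high half.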

by Mrchacha at September 28, 2016 07:34 PM



can not print out when user keeps typing, rxswift

I'm learning RxSwift by printing out each character when doing a search:

override func viewDidLoad() {
        searchBar.rx_text.distinctUntilChanged().subscribeNext { (query) in
            print("Query is \(query)")
        } .addDisposableTo(DisposeBag())
}

However, nothing is printed to the console. I spent a few hours on it and realized that if I declare the variable disposeBag

let disposeBag = DisposeBag() 

and then do

.addDisposableTo(disposeBag) // instead of doing .addDisposableTo(DisposeBag())

everything works as I expect.

My question is why .addDisposableTo(DisposeBag()) does not work.
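The behaviour described is consistent with object lifetime: a dispose bag disposes everything it holds when it is deallocated, and a bag created inline has no owner, so it goes away as soon as the statement finishes. The Python sketch below is only an analogy under that assumption. `Subscription` and `Bag` are hypothetical stand-ins, and the immediate cleanup relies on CPython's reference counting.

```python
class Subscription:
    """Stand-in for an Rx subscription; once disposed it delivers no events."""

    def __init__(self):
        self.disposed = False

    def dispose(self):
        self.disposed = True

    def add_disposable_to(self, bag):
        bag.items.append(self)


class Bag:
    """Stand-in for a dispose bag: disposes its contents when destroyed."""

    def __init__(self):
        self.items = []

    def __del__(self):
        for item in self.items:
            item.dispose()


# Inline bag: nothing keeps it alive after the call returns.
temporary = Subscription()
temporary.add_disposable_to(Bag())

# Bag held in a variable: it lives as long as the variable does.
kept_bag = Bag()
kept = Subscription()
kept.add_disposable_to(kept_bag)

print("inline bag  -> disposed:", temporary.disposed)
print("stored bag  -> disposed:", kept.disposed)
```

Storing the bag as a property of the view controller ties the subscription's lifetime to the controller, which matches the version that works.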

by tonytran at September 28, 2016 06:59 PM


What are some hot topics in Computer Networks at Master's Level? [on hold]

I have a B.E. in Electrical Engineering (Telecom), hold a CCNA (R&S) certification, and have worked in the computer networks field. Based on this professional background, I plan to apply for a Master's in Computer Science in the USA and would like to choose computer networks as my major. Please suggest some current hot research topics in computer networks.

by Frank Castle at September 28, 2016 06:51 PM



Does CASH like AUD fall under securities? [on hold]

This seems like a pretty basic question, but alongside stocks, bonds and other instruments, is cash (i.e. EUR, USD, AUD) also considered a security, or is it just cash? My doubt is whether to include AUD among the other securities and stocks in the calculation.

by Gowtama Krishna at September 28, 2016 06:49 PM


Why complement of most negative number in octal does not turn out to be itself

I know the range of the $n$ digit numbers in $r$'s complement system is given as

$r^{n-1}-1$ to 0 to $-r^{n-1}$.

So for 3 bit 2's complement, it's:

$(2^2-1)$ to $0$ to $-(2^2)$
that is
$3$ to $0$ to $-4$
that is
$(011)_{\text{2's comp}}$ to $(0)_{\text{2's comp}}$ to $(100)_{\text{2's comp}}$

And for 3 digit 8's complement, it's

$(8^2-1)$ to $0$ to $-(8^2)$
that is
$63$ to $0$ to $-64$
that is
$(077)_{\text{8's comp}}$ to $(0)_{\text{8's comp}}$ to $(700)_{\text{8's comp}}$

Now I came to know that

the $r$'s complement of the smallest / most $-$ve number in the range will be the same number.

For example, taking 2's complement of most $-$ve / smallest number possible with 3 bit 2's complement system, $(100)_{\text{2's comp}}$ is itself:

1 0 0  
0 1 1 (1's complement)
1 0 0 (add 1 to get 2's complement)

But taking 8's complement of most $-$ve / smallest number possible with 3 digit 8's complement system, $(700)_{\text{8's comp}}$ does not seem to be itself:

7 0 0
0 7 7 (7's complement)
1 0 0 (add 1 to get 8's complement)

So the 8's complement of 700 does not seem to be 700 itself but is in fact 100. What am I missing? Or am I doing it all wrong?
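The arithmetic above can be checked mechanically. Assuming the usual definition of the $r$'s complement of an $n$-digit value $x$ as $r^n - x$, a nonzero value is its own complement only when $x = r^n/2$. That is $100$ in 3-bit binary but $400$ in 3-digit octal, which suggests that the range formula quoted above is the base-2 special case. The helper names below are illustrative.

```python
def to_digits(value, radix, width):
    """Fixed-width digit list, most significant digit first."""
    digits = []
    for _ in range(width):
        digits.append(value % radix)
        value //= radix
    return digits[::-1]


def from_digits(digits, radix):
    value = 0
    for digit in digits:
        value = value * radix + digit
    return value


def radix_complement(digits, radix):
    """r's complement of an n-digit number: (r**n - x) mod r**n."""
    modulus = radix ** len(digits)
    value = from_digits(digits, radix)
    return to_digits((modulus - value) % modulus, radix, len(digits))


print(radix_complement([1, 0, 0], 2))   # binary 100
print(radix_complement([7, 0, 0], 8))   # octal 700
print(radix_complement([4, 0, 0], 8))   # octal 400
```

Under the common convention for an even radix, the 3-digit octal range would then run from $-256$ (octal 400) to $+255$ (octal 377), with 700 representing $-64$.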

by Mahesha999 at September 28, 2016 06:45 PM



Relation between Type Assignment system (TA) and Hindley-Milner (HM) system

Recently I started my studies in type theory/type systems and Lambda Calculus.

I have already read about the Simply Typed Lambda Calculus in Church and Curry style. The latter is also known as the Type Assignment system (TA).

I'm thinking about the relations between TA and Hindley-Milner (HM), the system in languages like ML and Haskell.

The book Lambda-Calculus and Combinators: An Introduction (Hindley) says that TA is polymorphic (p. 119). Is that the same sense of polymorphism as in systems like HM and System F?

TA is said to have the strong normalisation property, so it is not Turing complete. Languages that use the HM system, Haskell for example, are Turing complete, so it must be the case that the HM system allows terms like the infinite loop $\Omega = (\lambda x.xx)(\lambda x.xx)$ to receive a type. Is that correct, or am I missing something?

Anyway, I would like to know the relation between TA and HM.

by Rafael Castro at September 28, 2016 06:27 PM


Can anyone give me some pointers for using SVM for user recognition using keystroke timing?

I am trying to perform user identification using keystroke dynamics. The data consists of the timing of individual keystrokes. I am using an SVM for binary classification. How can I train this for multiple users? I have the timings of a typed keyword for several users, for example for “hello”: h->16seg, e->10, l->30, o->20. Therefore I do not have binary class labels (+1 positive, -1 negative).

by rn3w at September 28, 2016 06:18 PM


How can I perform an analysis of risk exposures for an index?

I'm writing my bachelor thesis and the main goal of the paper is to answer the question: "are smart beta indexes efficient?" In order to answer this question, I would like to determine whether those smart beta indexes have desired or undesired risk exposures. The name smart beta basically stands for an ETF or an index which tries to outperform its benchmark.

The data set consists of three indexes: SMI, S&P 500 and DAX. For each of those indexes one smart beta index is chosen with either the factor tilt value or volatility. I have market data at my disposal (monthly returns and volatilities). I plan to work in R.

The image shown below should represent instructions on how to analyse factor exposures. Is there anyone who could help me get a better understanding of this procedure?

[Image] Source: reproduced from Figure 5-2 in the USE3 Barra risk model handbook

by Raphael Galvagno at September 28, 2016 06:07 PM


Planet Emacsen

Irreal: Tramp and cd

I try to use eshell as much as I can when I need to drop into the shell. That way, I stay in Emacs and still have the power of Emacs available. The other day in a post about something else, I saw this powerful use of cd mentioned.

You can cd into a directory on another machine like this

cd /ssh:aineko:org

This logs me into my iMac in the ~/org directory using tramp. Something like

cd /ssh:aineko:

is the same as connecting to aineko with SSH. If you need to log into the remote server as a different user, use something like

cd /ssh:different_user@aineko:

It's not quite completely transparent but it does make remote machines seem like they're mounted locally. Very nice. And powerful.

by jcs at September 28, 2016 05:53 PM



how to build a simple ReLU network in caffe using python

I have a non-image dataset and I want to build a ReLU network (fully connected layers only). I am a beginner; please point me to the right resources for this. Every resource talks about using CNNs on image datasets. Any help is appreciated.

by aditya ramesh at September 28, 2016 05:24 PM




Machine Learning Newbie [on hold]

I've recently begun to get into machine learning. I went through some online materials and picked up scikit-learn. Originally I thought I would use linear regression for the project I had in mind, but after learning more and examining the data more closely, I realized the model would need to be non-linear. I then did some googling and found Support Vector Regression in scikit-learn. From what I understand, though I could be wrong, it "chunks" the data to develop a regression algorithm.

The data I'm looking at is billed purchases over time. The orders seem to have high seasonality (orders tend to be similar based on the time of year, weather, etc.), and I want to look for orders that don't fit that trend (errors, fraud, etc.). My plan is to develop a model on training data, test it to get an R2 value, predict a value for each order, compare the errors, and flag the largest. I'm currently going to fit the model to each customer separately: not all customers are the same, so I thought fitting one model to all order data might skew the results significantly (or maybe there should be a "customer-size" feature?).

There are a few questions I have so far:

  • What do I need to be aware of with SVR? Will this fulfill that use as I am intending? (Or does anyone have any other resources relating to SVR)
  • I haven't found any great documentation on selecting features within SciKit learn, I know how to enter them but was wondering if there are ways to code it so it will look at the data and then decide.

Hopefully this question is not too broad, it's more of a "I'm just learning and need some advice and some experts to bounce ideas off of"

Thanks for any help/ideas!

by BSmith at September 28, 2016 04:30 PM


randomUniformForests log level and visualization

I am working with randomUniformForests. I would like to know how to set the logging level to minimum and suppress visualisation of graphs. Does anyone know?

Thanks. Regards.

by mg64ve at September 28, 2016 04:27 PM


AARs and Alphas [on hold]

Almost all event studies analyze stock performance using AARs (Average Abnormal Returns), which measure how much better or worse the stock has performed due to the occurrence of an event. How can this AAR be converted into alpha? Or how can we say whether the AAR is yielding any alpha or not?

Average abnormal return is the excess of return that stock has because of the event. This is the excess amount as compared to the return the stock would have yielded if the event was not there. For calculating the return without the event (which has occurred already) we estimate the return based on any of the available models using benchmark returns. Which means that the stock has over performed or under performed as compared to itself in absence of an event.

Now, with the help of AAR how can we say whether the stock can be hedged with the benchmark to yield alpha?

by Ankesh Mundra at September 28, 2016 04:23 PM



Scraping from Yahoo all component symbols for given composite index symbol

Goal: I want to retrieve a list of all component symbols for any given composite index symbol from Yahoo.

Previously this was possible by running this chunk of R code against Yahoo's composite page for the given index symbol (e.g. ^GDAXI here)


queryUrl <- "^GDAXI"
htmlpage <- xml2::read_html(queryUrl)
htmlnodes <- rvest::html_nodes(htmlpage, ".yfnc_tabledata1:nth-child(2) , .yfnc_tabledata1:nth-child(1)")

Just recently Yahoo changed the page here, producing two problems:

  1. They changed to a responsive layout, which makes it impossible for me to create a proper CSS selector
  2. They only display the TOP30 components of a composite index, without any chance to see all other components. This clearly is the bigger problem.

Question: Does anyone know a way to request all component symbols for a given composite index symbol using YQL?

Remark: It is important to me, that I get only the symbols which are actually part of the composite index at that time. So I am not looking for solutions like How to extract all the ticker symbols of an exchange with Quantmod in R? or similar.

by user2161065 at September 28, 2016 04:15 PM

High Scalability

How Uber Manages a Million Writes Per Second Using Mesos and Cassandra Across Multiple Datacenters

If you are Uber and you need to store the location data that is sent out every 30 seconds by both driver and rider apps, what do you do? That’s a lot of real-time data that needs to be used in real-time.

Uber’s solution is comprehensive. They built their own system that runs Cassandra on top of Mesos. It’s all explained in a good talk by Abhishek Verma, Software Engineer at Uber: Cassandra on Mesos Across Multiple Datacenters at Uber (slides).

Is this something you should do too? That’s an interesting thought that comes to mind when listening to Abhishek’s talk.

Developers have a lot of difficult choices to make these days. Should we go all in on the cloud? Which one? Isn’t it too expensive? Do we worry about lock-in? Or should we try to have it both ways and craft brew a hybrid architecture? Or should we just do it all ourselves for fear of being cloud shamed by our board for not reaching 50 percent gross margins?

Uber decided to build their own. Or rather they decided to weld together their own system by fusing together two very capable open source components. What was needed was a way to make Cassandra and Mesos work together, and that’s what Uber built.

For Uber the decision is not all that hard. They are very well financed and have access to the top talent and resources needed to create, maintain, and update these kinds of complex systems.

Since Uber’s goal is for transportation to have 99.99% availability for everyone, everywhere, it really makes sense to want to be able to control your costs as you scale to infinity and beyond.

But as you listen to the talk you realize the staggering effort that goes into making these kinds of systems. Is this really something your average shop can do? No, not really. Keep this in mind if you are one of those cloud deniers who want everyone to build all their own code on top of the barest of bare metals.

Trading money for time is often a good deal. Trading money for skill is often absolutely necessary.

Given Uber’s goal of reliability, where out of 10,000 requests only one can fail, they need to run out of multiple datacenters. Since Cassandra is proven to handle huge loads and works across datacenters, it makes sense as the database choice.  

And if you want to make transportation reliable for everyone, everywhere, you need to use your resources efficiently. That’s the idea behind using a datacenter OS like Mesos. By statistically multiplexing services on the same machines you need 30% fewer machines, which saves money. Mesos was chosen because at the time Mesos was the only product proven to work with cluster sizes of 10s of thousands of machines, which was an Uber requirement. Uber does things in the large.

What were some of the more interesting findings?

  • You can run stateful services in containers. Uber found there was hardly any difference, 5-10% overhead, between running Cassandra on bare metal versus running Cassandra in a container managed by Mesos.

  • Performance is good: mean read latency: 13 ms and write latency: 25 ms, and P99s look good.

  • For their largest clusters they are able to support more than a million writes/sec and ~100k reads/sec.

  • Agility is more important than performance. With this kind of architecture what Uber gets is agility. It’s very easy to create and run workloads across clusters.

Here’s my gloss of the talk:

In the Beginning

by Todd Hoff at September 28, 2016 03:59 PM


Machine learning - Linear regression using batch gradient descent

I am trying to implement batch gradient descent on a data set with a single feature and multiple training examples (m).

When I use the normal equation I get the right answer, but the code below, which performs batch gradient descent in MATLAB, gives the wrong one.

 function [theta] = gradientDescent(X, y, theta, alpha, iterations)
      m = length(y);
      for iter =1:1:iterations
          for i=1:1:m
              delta(1,1)= delta(1,1)+( X(i,:)*theta - y(i,1))  ;
              delta(2,1)=delta(2,1)+ (( X(i,:)*theta - y(i,1))*X(i,2)) ;
          end
          theta= theta-( delta*(alpha/m) );
      end

y is the vector with target values, X is a matrix with the first column full of ones and second columns of values (variable).

I have implemented this using vectorization, i.e

theta = theta - (alpha/m)*delta

... where delta is a 2 element column vector initialized to zeroes.

The cost function is $J(\theta) = \frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2$.

Any help would be much appreciated.
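For comparison, here is a minimal sketch of batch gradient descent for a single feature, written in Python rather than MATLAB. One thing it makes explicit is that the gradient accumulators are reset at the start of every iteration; if `delta` is zeroed only once before the outer loop, gradients from earlier iterations keep accumulating, which is one possible cause of a wrong result. The toy data and the learning rate are assumptions chosen so that the loop converges.

```python
def batch_gradient_descent(xs, ys, theta, alpha, iterations):
    """Minimise J(theta) = 1/(2m) * sum((theta0 + theta1*x - y)^2)."""
    m = len(ys)
    theta0, theta1 = theta
    for _ in range(iterations):
        # Fresh accumulators for this iteration's gradient.
        delta0 = 0.0
        delta1 = 0.0
        for x, y in zip(xs, ys):
            error = theta0 + theta1 * x - y
            delta0 += error          # partial derivative w.r.t. theta0
            delta1 += error * x      # partial derivative w.r.t. theta1
        # Simultaneous update of both parameters.
        theta0 -= (alpha / m) * delta0
        theta1 -= (alpha / m) * delta1
    return theta0, theta1


# Toy data lying exactly on y = 1 + 2x.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
fitted = batch_gradient_descent(xs, ys, (0.0, 0.0), alpha=0.1, iterations=5000)
print(fitted)
```

On data like this the result should agree with the normal equation to within numerical tolerance, which gives a quick way to validate an implementation.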

by Sridhar Thiagarajan at September 28, 2016 03:57 PM


Moody's, S&P, Fitch revenues per country

I need a variable which identifies possible conflicts of interest between credit rating agencies and countries, although they do not pay in order to be rated. Such a variable could be the proportion of income coming from each country, namely how many national private companies, banks or other financial institutions pay the CRAs for their services. Do you think I can find such data in the agencies' balance sheets? Maybe they break down revenues by market segment.

Other ideas are welcome! :)

by Elena De Falco at September 28, 2016 03:57 PM


AWS Answers – Architect More Confidently & Effectively on AWS

After an organization decides to move to the AWS Cloud and to start taking advantage of the benefits that it offers, one of the next steps is to figure out how to properly architect their applications. Having talked to many of them, I know that they are looking for best practices and prescriptive design patterns, along with some ready-made solutions and some higher-level strategic guidance.

To this end, I am pleased to share the new AWS Answers page with you:

Designed to provide you with clear answers to your common questions about architecting, building, and running applications on AWS, the page includes categorized guidance on account, configuration & infrastructure management, logging, migration, mobile apps, networking, security, and web applications. The information originates from well-seasoned AWS architects and is presented in Q&A format. Every contributor to the answers presented on this page has spent time working directly with our customers and their answers reflect the hands-on experience that they have accumulated in the process.

Each answer offers prescriptive guidance in the form of a high-level brief or a fully automated solution that you can deploy using AWS CloudFormation, along with a supporting Implementation Guide that you can view online or download in PDF form. Here are a few to whet your appetite:

How can I Deploy Preconfigured Protections Using AWS WAF? – The solution will set up preconfigured AWS WAF rules and custom components, including a honeypot, in the configuration illustrated on the right.

How do I Automatically Start and stop my Amazon EC2 Instances? – The solution will set up the EC2 Scheduler in order to stop EC2 instances that are not in use, and start them again when they are needed.

What Should I Include in an Amazon Machine Image? This brief provides best practices for creating images and introduces three common AMI designs.

How do I Implement VPN Monitoring on AWS? – The solution will deploy a VPN Monitor and automatically record historical data as a custom CloudWatch metric.

How do I Share a Single VPN Connection with Multiple VPCs? This brief helps you minimize the number of remote connections between multiple Amazon VPC networks and your on-premises infrastructure.


by Jeff Barr at September 28, 2016 03:50 PM


generate a graph with fixed min cut

Is there a constructive way to generate a graph with a fixed min cut equal to $k$? One approach is to generate a random graph and then try to make edge alterations (additions, deletions, swaps) to get the desired min cut -- but I am wondering if there is a more systematic procedure?

Ideally I would want to generate a random instance of a graph with min-cut=$k$, and I would want it to be bipartite, but insight on any part of this question would be helpful!

Thanks in advance!
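One constructive family, offered as a sketch rather than a full answer: the complete bipartite graph $K_{k,m}$ with $m \ge k$ has edge connectivity $k$, so it is bipartite with min cut exactly $k$. It is not a random instance beyond the choice of $m$ and a relabeling of vertices. The code below builds such a graph and confirms the min cut by brute force, which is only practical for small graphs; the function names are illustrative.

```python
from itertools import combinations


def complete_bipartite(k, m):
    """Nodes and edges of K_{k,m}; every left node joins every right node."""
    left = [f"L{i}" for i in range(k)]
    right = [f"R{j}" for j in range(m)]
    edges = [(u, v) for u in left for v in right]
    return left + right, edges


def brute_force_min_cut(nodes, edges):
    """Smallest number of edges crossing any split into two non-empty sides."""
    first, rest = nodes[0], nodes[1:]
    best = None
    # Fix `first` on one side so each split is counted once; leave at
    # least one node on the other side.
    for size in range(len(rest)):
        for extra in combinations(rest, size):
            side = {first, *extra}
            crossing = sum((u in side) != (v in side) for u, v in edges)
            if best is None or crossing < best:
                best = crossing
    return best


nodes, edges = complete_bipartite(3, 4)
print(brute_force_min_cut(nodes, edges))
```

A checker like this can also serve as the acceptance test in a generate-and-alter approach, since any candidate graph can be verified against the target $k$ before it is kept.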

by user1798883 at September 28, 2016 03:42 PM


What are some hot topics for research in machine learning algorithms? [on hold]

I am currently looking for popular topics in machine learning algorithms within computer science. Please suggest some directions so that I can focus on one particular topic.

by Jaimin at September 28, 2016 03:38 PM


Does reinforcement learning work with data?

In supervised and unsupervised learning, you have data to work with (whether labeled or unlabeled). My question is: in reinforcement learning, is there data to work with, or is it just about prediction based on some parameters?

by Adam at September 28, 2016 03:35 PM

Learning Neural Networks and Machine Learning from scratch [on hold]

I'd like to start with Machine Learning and Neural Networks from scratch, and like some guidelines on where it's best to start.

I have a decent understanding of functional and object oriented programming. The languages I've been using most are Python, Java and C. I'm a third year in college, computer sciences and electronics engineering, so don't hesitate to point me to a lot of math. I want to build up a good theoretical understanding of this. Also, this year I have a basic AI course, involving mostly search trees and expert systems, if that's of any use here.


by Damian Szkaut at September 28, 2016 03:28 PM


Planet Theory

TR16-153 | Lower Bounds and Identity Testing for Projections of Power Symmetric Polynomials | Christian Engels, Raghavendra Rao B V, Karteek Sreenivasaiah

The power symmetric polynomial on $n$ variables of degree $d$ is defined as $p_d(x_1,\ldots, x_n) = x_{1}^{d}+\dots + x_{n}^{d}$. We study polynomials that are expressible as a sum of powers of homogenous linear projections of power symmetric polynomials. These form a subclass of polynomials computed by depth five circuits with summation and powering gates (i.e., $ \sum\bigwedge\sum\bigwedge\sum$ circuits). We show $2^{\Omega(n)}$ size lower bounds for $x_1\cdots x_n$ against the following models: \begin{itemize} \item Depth five $\sum\bigwedge\sum^{\le n}\bigwedge^{\ge 21}\sum$ arithmetic circuits where the bottom $\sum$ gate is homogeneous; \item Depth four $\sum\bigwedge\sum^{\le n}\bigwedge$ arithmetic circuits. \end{itemize} Together with the ideas in [Forbes, FOCS 2015] our lower bounds imply deterministic $n^{\poly(\log n)}$ black-box identity testing algorithms for the above classes of arithmetic circuits. Our technique uses a measure that involves projecting the partial derivative space of the given polynomial to its multilinear subspace and then setting a subset of variables to $0$.

September 28, 2016 03:25 PM

Daniel Lemire

Sorting already sorted arrays is much faster?

If you are reading a random textbook on computer science, it is probably going to tell you all about how good sorting algorithms take linearithmic time. To arrive at this result, they count the number of operations. That’s a good model to teach computer science, but working programmers need more sophisticated models of software performance.

On modern superscalar processors, we expect in-memory sorting to be limited by how far ahead the processor can predict where the data will go. Though moving the data in memory is not free, it is a small cost if it can be done predictably.

We know that sorting “already sorted data” can be done in an easy-to-predict manner (just do nothing). So it should be fast. But how much faster is it than sorting randomly shuffled data?

I decided to run an experiment.

I use arrays containing one million distinct 32-bit integers, and I report the time in CPU cycles per value on a Haswell processor. I wrote my code in C++.

function    sorted data    shuffled data    sorted in reverse
std::sort   38             200              30

For comparison, it takes roughly n log(n) comparisons to sort an array of size n in the worst case with a good algorithm. In my experiment, log(n) is about 20.

The numbers bear out our analysis. Sorting an already-sorted array takes a fraction of the time needed to sort a shuffled array. One could object that the reason sorting already-sorted arrays is fast is because we do not have to move the data so much. So I also included initial arrays that were sorted in reverse. Interestingly, std::sort is even faster with reversed arrays! This is clear evidence for our thesis.

(The C++ source code is available. My software includes timsort results if you are interested.)
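The shape of the experiment can be sketched in a few lines. This is not the author's C++ benchmark: it uses Python, wall-clock time instead of CPU cycles, and a smaller array, and Python's `list.sort` is Timsort, which detects existing runs, so the figures are not comparable with the `std::sort` numbers above.

```python
import random
import time


def time_sort(values):
    """Sort a copy of `values`; return (elapsed seconds, sorted copy)."""
    data = list(values)
    start = time.perf_counter()
    data.sort()
    elapsed = time.perf_counter() - start
    return elapsed, data


n = 200_000                       # assumed size, smaller than in the post
rng = random.Random(42)           # fixed seed so runs are repeatable
sorted_input = list(range(n))
shuffled_input = sorted_input[:]
rng.shuffle(shuffled_input)
reversed_input = sorted_input[::-1]

timings = {}
inputs = [("sorted", sorted_input), ("shuffled", shuffled_input), ("reversed", reversed_input)]
for name, values in inputs:
    elapsed, result = time_sort(values)
    assert result == sorted_input   # every variant must end up sorted
    timings[name] = elapsed
    print(f"{name:9s} {elapsed * 1e9 / n:8.1f} ns per value")
```

Running it several times and keeping the minimum per input would reduce noise from the operating system.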

by Daniel Lemire at September 28, 2016 03:16 PM



What computational problems can be efficiently resolved by Hyper-heuristics?

I am a student working on my final graduation project. I was assigned to study hyper-heuristics, which is a new subject for me. I was asked to choose a computational problem, apply hyper-heuristics to it, and see the results. However, I am afraid of choosing the wrong problem. Which computational problems, in your opinion, can be solved efficiently with hyper-heuristics?

Thank you.

by user2878542 at September 28, 2016 03:01 PM


Early stopping with tflearn

I'm having a hard time figuring out how to implement early stopping with tflearn. Supposedly it works by using callbacks in the function but I don't quite get how it's done... This is the example on the website but it still needs a Monitor class that I can't get to work:

class MonitorCallback(tflearn.callbacks.Callback):
    def __init__(self, api):
        self.my_monitor_api = api

    def on_epoch_end(self, training_state):
            accuracy: training_state.global_acc,
            loss: training_state.global_loss,

monitorCallback = new MonitorCallback(api) 
model = ..., callbacks=monitorCallback)

Does anyone have an example or an idea of how to do this? Cheers
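Independently of any library, the early-stopping decision itself is small. The sketch below is plain Python and does not use tflearn's API: the class name `EarlyStopping`, the `patience` and `min_delta` parameters, and the list of validation losses are all assumptions for illustration. How the check is wired into a particular framework's callback hook is left open.

```python
class EarlyStopping:
    """Signal a stop after `patience` epochs without improvement."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def should_stop(self, val_loss):
        if val_loss < self.best - self.min_delta:
            # Improvement: remember it and reset the counter.
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Hypothetical per-epoch validation losses.
losses = [0.90, 0.70, 0.60, 0.61, 0.62, 0.63, 0.50]
stopper = EarlyStopping(patience=3)
stopped_at = None
for epoch, loss in enumerate(losses):
    if stopper.should_stop(loss):
        stopped_at = epoch
        break

print("stopped at epoch", stopped_at, "with best loss", stopper.best)
```

In a callback-based trainer, `should_stop` would typically be called from the end-of-epoch hook with the current validation loss, and a positive result would trigger whatever mechanism the framework offers for ending training.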

by Raspel at September 28, 2016 02:58 PM

Angular2 function randomly doesnt work

I am building an Angular2 app and I have this function in my service (all it does is format numbers to the correct decimal place; I don't use pipes because if the number is N/A they throw an error):

formatRiskStats(riskObj) {
    riskObj.quote.price = this.roundNumber(riskObj.quote.price, 2);
    riskObj.quote.ask = this.roundNumber(riskObj.quote.ask, 2); = this.roundNumber(, 2);
    riskObj.quote.bookvalue = this.roundNumber(riskObj.quote.bookvalue, 2);
    riskObj.quote.volume = this.numberWithCommas(riskObj.quote.volume);
    riskObj.quote.avgdailyvolume = this.numberWithCommas(riskObj.quote.avgdailyvolume);
    riskObj.YTDreturn = Number(this.roundNumber(riskObj.YTDreturn, 4)) * 100;
    riskObj.oneYrReturn = Number(this.roundNumber(riskObj.oneYrReturn, 4)) * 100;
    riskObj.annReturn3Year = Number(this.roundNumber(riskObj.annReturn3Year, 4)) * 100;
    riskObj.annReturn5Year = Number(this.roundNumber(riskObj.annReturn5Year, 4)) * 100;
    riskObj.beta = Number(this.roundNumber(riskObj.beta, 2));
    if (riskObj.beta >=0) riskObj.beta = '+' + riskObj.beta;
    else riskObj.beta = '-' + riskObj.beta;
    riskObj.changeFrom50DayMovingAvg = Number(this.roundNumber(riskObj.changeFrom50DayMovingAvg, 4)) * 100;
    riskObj.changeFrom200DayMovingAvg = Number(this.roundNumber(riskObj.changeFrom200DayMovingAvg, 4)) * 100;
    riskObj.downsideRisk = Number(this.roundNumber(riskObj.downsideRisk, 2));
    riskObj.maxDrawdown = Number(this.roundNumber(riskObj.maxDrawdown, 2));
    riskObj.movingAvg50Day = Number(this.roundNumber(riskObj.movingAvg50Day, 2));
    riskObj.movingAvg200Day = Number(this.roundNumber(riskObj.movingAvg200Day, 2));
    riskObj.rSquared = Number(this.roundNumber(riskObj.rSquared, 2));
    riskObj.treynor = Number(this.roundNumber(riskObj.treynor, 4)) * 100;
    riskObj.volatility = Number(this.roundNumber(riskObj.volatility, 2));

    return riskObj;

roundNumber(num, decimalPlaces) {
    if (num === 'N/A') return 'N/A';
    var numStr = num.toString();
    var result = parseFloat(numStr).toFixed(decimalPlaces);
    return result;

It takes in a big obj with a lot of numbers and formats them. I call this function in ngOnInit in my component like this:

ngOnInit() {

initialSearch() {
        .subscribe(data => {

getRiskAttributes() {
    this.securitiesService.getSecurityRiskAttributes(this.ticker, '1960-01-01')
        .subscribe(data => {
            this.riskStats = this.securitiesService.formatRiskStats(data);

Then I interpolate the data in the HTML. The problem is that every time I refresh, usually one or two of the properties that should have been formatted by formatRiskStats are not formatted. Different properties seem to be skipped at random: sometimes they are formatted and sometimes they are not.

Does anyone know why this is happening?

Appreciate any advice.

by georgej at September 28, 2016 02:54 PM




What Strings does this Grammar Generate?

I have this grammar:

S -> aTb
T -> bSa
T -> ba

It is not a regular grammar, since a terminal follows a non-terminal on the right-hand side. Nonetheless, I don't understand what kind of strings this grammar generates.

I tried drawing the FA in an attempt to understand what strings would be generated but I'm sure it is wrong:

enter image description here
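
One way to get a feel for the language is to enumerate its short strings mechanically. The sketch below is only an exploration aid, not a proof: it expands the leftmost non-terminal breadth-first and collects every terminal string up to a length bound. The names RULES and generate are made up for this illustration.

```python
from collections import deque

# Productions of the grammar from the question; uppercase letters are non-terminals.
RULES = {"S": ["aTb"], "T": ["bSa", "ba"]}


def generate(start="S", max_len=12):
    """Return all terminal strings of length <= max_len, shortest first."""
    seen = {start}
    results = set()
    queue = deque([start])
    while queue:
        form = queue.popleft()
        # Locate the leftmost non-terminal, if there is one.
        idx = next((i for i, ch in enumerate(form) if ch in RULES), None)
        if idx is None:
            results.add(form)
            continue
        for rhs in RULES[form[idx]]:
            new = form[:idx] + rhs + form[idx + 1:]
            # Every production lengthens the form, so long forms can be pruned.
            if len(new) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return sorted(results, key=len)


print(generate())
```

With the default bound this lists abab, abababab and abababababab, which suggests (but does not prove) that the strings are an even number of repetitions of ab.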

by user3393266 at September 28, 2016 01:49 PM



Order of Visiting in candidate Elimination algorithm

From the candidate elimination algorithm, it is quite clear that the order of the examples matters for the intermediate S and G, but not for the final S and G.

Can we pre-choose an order of training examples which give relatively smaller sizes for S and G in intermediate steps so that it will be computationally simple to get final boundaries?

Edit: Just as a discussion starter, what I have found is that alternating between positive and negative examples is a good start, because we are bounding from both sides. Since we remove from G those hypotheses that are not more general than some hypothesis in S, and from S those that are not more specific than some hypothesis in G, we might expect the S and G sets to stay smaller.

by MysticForce at September 28, 2016 01:16 PM



Recommended books in geometric modelling?

Referring to question:

Was recommended to ask here instead.

I'm currently doing a course in geometric modelling - an introduction to Bézier and B-spline techniques.

We use the book of the same name, Bézier and B-Spline Techniques by Prautzsch, Boehm and Paluszny. While the exercises in this book are mathematical in nature, I would love to get some recommendations for other resources, specifically related to the implementation of such techniques on a computer.

Any help is greatly appreciated!

by Ivar Stangeby at September 28, 2016 01:00 PM

Reference request: Category theory as it applies to type systems

I keep hearing about how one must learn category theory to truly understand programming language theory. So far, I've learned a good deal of PL without ever stepping foot into the realm of categories. However, I figured it was time to take the leap to see what I had been missing.

Unfortunately, none of the sources I can find seem to make any connections to type systems or programming. They say it's an introduction to category theory for computer scientists, but then veer off into general abstract nonsense (I say this lovingly) without giving any practical examples or applications.

I guess my question is actually two-fold:

  1. Is category theory essential for understanding the "deep concepts" in PL?
  2. What's a source that explains category theory from the viewpoint of practical applications to type systems and programming?

So far, the furthest I've gotten is to a vague conception of functors (which don't seem to be related to functors in ML, as far as I can tell). I'm dreading the abstraction I'll need to keep in my head to understand monads from a category-theoretic view.

by gardenhead at September 28, 2016 12:48 PM


DNN With embedded layer returning sin wave cost/accuracy

I am a total newbie to TensorFlow and machine learning, but I am trying to model a DNN with an embedding layer in front of it. For some reason I keep getting a sine wave for the cost as well as the accuracy. I imagine there is something wrong with my code, so here goes:

This is my model and training routines:

def neural_network_model(x):
    W = tf.Variable(
       tf.truncated_normal([vocab_size, embedding_size], stddev=1 / math.sqrt(vocab_size)))

    embedded = tf.nn.embedding_lookup(W, x)    
    embedding_aggregated = tf.reduce_sum(embedded, [1])

    hidden_1_layer = {
        'weights': tf.Variable(tf.random_normal([embedding_size, n_nodes_hl1])),
        'biases': tf.Variable(tf.random_normal([n_nodes_hl1]))
    }

    hidden_2_layer = {
        'weights': tf.Variable(tf.random_normal([n_nodes_hl1, n_nodes_hl2])),
        'biases': tf.Variable(tf.random_normal([n_nodes_hl2]))
    }

    hidden_3_layer = {
        'weights': tf.Variable(tf.random_normal([n_nodes_hl2, n_nodes_hl3])),
        'biases': tf.Variable(tf.random_normal([n_nodes_hl3]))
    }

    output = {
        'weights': tf.Variable(tf.random_normal([n_nodes_hl3, n_classes])),
        'biases': tf.Variable(tf.random_normal([n_classes]))
    }

    l1 = tf.matmul(embedding_aggregated, hidden_1_layer['weights']) + hidden_1_layer['biases']
    l1 = tf.nn.relu(l1)

    l2 = tf.matmul(l1, hidden_2_layer['weights']) + hidden_2_layer['biases']
    l2 = tf.nn.relu(l2)

    l3 = tf.matmul(l2, hidden_3_layer['weights']) + hidden_3_layer['biases']
    l3 = tf.nn.relu(l3)    

    output = tf.matmul(l3, output['weights']) + output['biases']        
    return output

def train_neural_network(x_batch, y_batch, test_x, test_y):
    global_step = tf.Variable(0, trainable=False, name='global_step')    

    logits = neural_network_model(x_batch)
    cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits, y_batch))
    tf.scalar_summary('cost', cost)
    optimizer = tf.train.AdagradOptimizer(0.01).minimize(cost, global_step = global_step)

    test_logits = neural_network_model(test_x)
    prediction = tf.nn.softmax(test_logits)
    correct = tf.equal(tf.argmax(prediction, 1), tf.argmax(test_y, 1))
    accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))
    tf.scalar_summary('accuracy', accuracy)

    merged = tf.merge_all_summaries()

    saver = tf.train.Saver()
    model_dir = "model_embedding"
    latest_checkpoint = tf.train.latest_checkpoint(model_dir)

    with tf.Session() as sess:        
        train_writer = tf.train.SummaryWriter(model_dir + "/eval", sess.graph)
        if latest_checkpoint is not None:
            print("Restoring: ", latest_checkpoint)
            saver.restore(sess, latest_checkpoint)
        else:
            print("Nothing to restore")

        coord = tf.train.Coordinator()
        threads = tf.train.start_queue_runners(sess=sess, coord=coord)
        try:
            epoch = 1
            while not coord.should_stop():
                epoch_loss = 0

                _, c, summary = sess.run([optimizer, cost, merged])
                # embd =
                # for idx in range(xb.size):
                #     print(xb[idx])
                #     print(yb[idx])

                train_writer.add_summary(summary, global_step = global_step.eval())
                epoch_loss += c             

                print('Epoch', epoch, 'completed out of',hm_epochs,'loss:',epoch_loss)
                print("Global step: ", global_step.eval())
                saver.save(sess, model_dir+'/model.ckpt', global_step=global_step) # default to last 5 checkpoint saves

                epoch += 1
        except tf.errors.OutOfRangeError:
            print('Done training -- epoch limit reached')

My data is a bunch of word integer IDs padded to a size of 2056 uniformly with the padding token being added at the end so a lot of my tensors have a bunch of vocab_size integer value at the end, in order to pad up to 2056.

Is there something glaringly obvious about my code that's wrong?

by schone at September 28, 2016 12:41 PM


Extreme value theory expected value of GPD

We're using extreme value theory to model tail risks on our portfolio. After we choose the threshold, we fit a generalized Pareto distribution (GPD) to our data over the threshold. The expected value of the fitted GPD is considerably larger (about 10%) than the average value of our losses over the threshold. My question is, is this to be expected? Or does it mean that the GPD is a bad fit to the data and that we've chosen the wrong threshold? Also, is there a good way to check whether the GPD is a good fit?
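
For reference, the quantities being compared can be written down in a few lines. The sketch below assumes the usual parameterization for excesses over the threshold (location 0, shape xi, scale sigma), under which the mean is sigma / (1 - xi) for xi < 1. The function names are made up for this illustration, and the quantile pairs are just one common way to eyeball the fit (a QQ comparison), not a formal test.

```python
import math


def gpd_mean(xi, sigma):
    """Mean of a GPD for excesses (location 0); it is infinite for xi >= 1."""
    if xi >= 1:
        return math.inf
    return sigma / (1.0 - xi)


def gpd_quantile(p, xi, sigma):
    """Inverse CDF of the GPD for a probability 0 <= p < 1."""
    if abs(xi) < 1e-12:
        # The xi -> 0 limit is the exponential distribution.
        return -sigma * math.log(1.0 - p)
    return sigma / xi * ((1.0 - p) ** (-xi) - 1.0)


def qq_pairs(excesses, xi, sigma):
    """Pair each fitted quantile with the matching sorted observed excess."""
    data = sorted(excesses)
    n = len(data)
    return [(gpd_quantile((i + 0.5) / n, xi, sigma), x) for i, x in enumerate(data)]
```

If the pairs returned by qq_pairs lie close to the diagonal, the fit is plausible; systematic departures in the upper tail are the usual warning sign. Note that the mean is very sensitive to xi, so a modest overestimate of the shape can already produce a gap of this size.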

by gregorp at September 28, 2016 12:32 PM


Basic addition in Tensorflow?

I want to make a program where I enter a pair x1, x2 and it outputs a y. All of the TensorFlow tutorials I can find start with image recognition. Can someone help me by providing either code or a tutorial on how to do this in Python? Thanks in advance. Edit: the x1, x2 values I was planning to use would be like 1, 1 with y being 2, or 4, 6 with y being 10. I want to provide the program with data to learn from. I have tried to learn from the TensorFlow website, but it seemed way more complex than what I wanted.
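
To make the task concrete, here is a minimal sketch of the underlying idea in plain Python rather than TensorFlow: fit y = w1*x1 + w2*x2 by gradient descent on the mean squared error. The first two samples come from the description above; the remaining samples, the learning rate and the step count are assumptions chosen for illustration.

```python
def train(samples, lr=0.01, steps=2000):
    """Fit y = w1*x1 + w2*x2 by batch gradient descent on the mean squared error."""
    w1 = w2 = 0.0
    n = len(samples)
    for _ in range(steps):
        g1 = g2 = 0.0
        for (x1, x2), y in samples:
            err = (w1 * x1 + w2 * x2) - y
            # Accumulate the gradient of the mean squared error.
            g1 += 2 * err * x1 / n
            g2 += 2 * err * x2 / n
        w1 -= lr * g1
        w2 -= lr * g2
    return w1, w2


# (1, 1) -> 2 and (4, 6) -> 10 are from the question; the rest are made up.
SAMPLES = [((1, 1), 2), ((4, 6), 10), ((2, 3), 5), ((0, 5), 5), ((3, 0), 3)]

w1, w2 = train(SAMPLES)
print(w1, w2, w1 * 10 + w2 * 20)
```

Both weights should end up very close to 1, so the model has learned addition. A TensorFlow version would express the same model and loss and let an optimizer perform the updates.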

by user2609405 at September 28, 2016 12:28 PM



training and testing the machine learning model with different data present in different files.(say file1 and file2 for training and testing)

I want to measure the accuracy of a trained model on data stored in a different file. I am using X1_train, X1_validation, Y1_train, Y1_validation = cross_validation.train_test_split(X1, Y1, test_size=validation_size,random_state=seed,stratify = None), but the problem is that with this code the training and test data both come from one file. Please help me train the model on the data in one file (say file1) and test the trained model on the data in another file (say file2).

Suppose I train the model on sample data like the following, where each line starts with the question type:

DESC:manner How did serfdom develop in and then leave Russia ?
ENTY:cremat What films featured the character Popeye Doyle ?
DESC:manner How can I find a list of celebrities ' real names ?
ENTY:animal What fowl grabs the spotlight after the Chinese Year of the Monkey ?
ABBR:exp What is the full form of .com ?
HUM:ind What contemptible scoundrel stole the cork from my lunch ?
HUM:gr What team did baseball 's St. Louis Browns become ?
HUM:title What is the oldest profession ?
DESC:def What are liver enzymes ?
HUM:ind Name the scar-faced bounty hunter of The Old West .
ENTY:other When was Ozzy Osbourne born ?
DESC:reason Why do heavier objects travel downhill faster ?
HUM:ind Who was The Pride of the Yankees ?

Now let's say the model has been trained on this data. I want to provide test data at run time and find out its type.

The output should look like:

Input ............ .......... Actual Type ........................... Model Prediction

Which two states enclose Chesapeake Bay? ....... LOC:state ........ Classified_type
What does the abbreviation AIDS stand for? ..... ENTY:other ........ Classified_type
What does a spermologer collect? ........ ENTY:other .......... Classified_type

by Saurabh at September 28, 2016 12:13 PM


Mapping graph to another graph's sub-graph

Given two directed graphs G and G' (with no self-edges), we need to find a one-to-one mapping M (if possible) from the nodes of G to the nodes of G' such that there is an edge from v1 to v2 in G if and only if there is an edge from M(v1) to M(v2) in G'.

Example :

Input : G' = { 1 : [2, 3, 4] , 2 : [4] , 3 : [4] , 4 : [ ] } ; G = { 1 : [2] , 2 : [ ] , 3 : [2] } ;

Output : M(1) = 2 , M(2) = 4 , M(3) = 3

I tried some of the methods, but they were all exponential in time complexity. Could you suggest some algorithm to find the mapping?

EDIT: As some of the answers have pointed out, the problem is NP-complete.

In that case, I was wondering whether this mapping problem could be converted into a CNF-SAT formula and then be solved by a SAT solver. What do you think of this approach?
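
As a baseline to compare any SAT encoding against, the condition can be checked directly with a small backtracking search. The sketch below is exponential in the worst case, as expected for this problem, and the function name find_embedding is made up for this illustration. It assigns the nodes of G one at a time and rejects a candidate as soon as an edge or non-edge disagrees with G'.

```python
def find_embedding(g, g_prime):
    """Return an injective map M with (u, v) in G iff (M(u), M(v)) in G', or None."""
    nodes = list(g)
    targets = list(g_prime)
    edges = {(u, v) for u in g for v in g[u]}
    edges_p = {(u, v) for u in g_prime for v in g_prime[u]}

    def consistent(mapping, u, t):
        # Compare edges in both directions against every node mapped so far.
        for w, s in mapping.items():
            if ((u, w) in edges) != ((t, s) in edges_p):
                return False
            if ((w, u) in edges) != ((s, t) in edges_p):
                return False
        return True

    def extend(mapping):
        if len(mapping) == len(nodes):
            return dict(mapping)
        u = nodes[len(mapping)]
        used = set(mapping.values())
        for t in targets:
            if t not in used and consistent(mapping, u, t):
                mapping[u] = t
                found = extend(mapping)
                if found is not None:
                    return found
                del mapping[u]
        return None

    return extend({})


G_PRIME = {1: [2, 3, 4], 2: [4], 3: [4], 4: []}
G = {1: [2], 2: [], 3: [2]}
print(find_embedding(G, G_PRIME))
```

A CNF encoding would use one Boolean variable per (node of G, node of G') pair and turn the same three checks (each node mapped exactly once, injectivity, edge agreement) into clauses.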

by ralphsol at September 28, 2016 12:06 PM


Just Point Your Defects Already!

We have been using agile workflows on our teams at Atomic since the days of Extreme Programming back in the early 90s. User stories have always required points, although there has long been a debate about whether or not a team should point defects. Usually, pointing defects is harshly discouraged, yet the argument has come up time and time again.

What is the Point of Points?

For starters, I think we can all agree that story points are a metric. A metric, by definition, is used to measure something. A user story is a description of a feature that provides value to a customer. So then story points must be a measure of the value delivered to the customer, right? WRONG!

Points are a measure of the cost to implement a given user story. Undeniably, implementing a feature takes time, and this cost is very important to the planning of a project. The cost of a given story is very useful when determining whether or not a given feature is worth implementing. Over a period of time, points allow us to estimate project completion, since again, they are a measure of time. My coworker Micah covers this topic very well in this post.

So, when all points are implemented, the product is ready to ship, right? Wrong again!

Why? Defects!

What are Defects?

Defects are inevitable. No matter how hard we try, defects still happen. What’s worse is that they slow us down. Why? Because they take time. That’s right, time.

No matter how you look at it, spending time on defects decreases the time available for implementing features. Furthermore, the time to fix a given defect can vary.

Luckily, we usually make a decent guess at how hard it will be to track down the cause and develop a fix for a defect. Therefore, we should be able to assign some scalar to a defect to represent the cost, or time, necessary to fix it.

How about…points!

Mike Cohn explains this very well:

My usual recommendation is to assign points to bug fixing the agile defects. This really achieves the best of both worlds. We are able to see how much work the team is really able to accomplish, but also able to look at the historical data and see how much went into the bug-fixing story each sprint.

That’s right. We can’t analyze something if we don’t measure it. Also, not giving points to defects is very much like brushing them under the rug.

Now What?

Well, now we can have a cost associated with each of our defects, just as we do with our user stories. And now that we use the same scale, we can weigh defects and stories against each other. Better yet, we can use the sum of the cost of both to much better estimate when we will be done.

Mike’s example shows how a product owner can assess the effects of not only fixing bugs, but not fixing them in order to get features completed:

Knowing this [the cost of bugs] can be helpful to a team and its product owner. For example, imagine a situation in which the product owner is considering suspending bug fixing for the next six sprints in order to add a valuable different feature into a release. If we know that the team’s full historical average velocity was 25 but that 5 points went to bug fixing each sprint, we know that suspending bug fixing for the next six sprints will result in 30 (6*5) more points of new functionality.

But My Gantt Chart Just Exploded!

Well, you have likely just taken a huge step closer to reality. Better to find this out sooner rather than later. Defects, just like stories, take time! Being armed with how much each story and defect will cost will help you decide how to best spend your time.

As much as we would like to get away from the constraints of cost and time to market, a business needs to deliver solid products that provide value, and time to market is driven by competition and trade show dates. Pointing both stories and defects gives the stakeholders extremely valuable information on what all the pieces are and how much they will cost, so that tough decisions can be made. A product is judged not only by its features, but by its performance, robustness and user experience.

The post Just Point Your Defects Already! appeared first on Atomic Spin.

by Greg Williams at September 28, 2016 12:00 PM



Greedy node in Ethernet LAN

On an Ethernet LAN (running the CSMA/CD protocol) there is a greedy node that wants to transmit at least N% (N is tunable from 0 to 100) of the successfully transmitted frames, i.e., the N% of frames that do not face collisions. How should the malicious node go about achieving this goal?

by Jim Jeffries at September 28, 2016 11:59 AM


Financial Instrument vs Financial Product

From this link: where explained the relationship between asset-classes and financial-instrument types

That's good, but here -Types of financial products: shares, bonds

So is Financial Instrument = Financial Product ?

by ses at September 28, 2016 11:31 AM

Should I use an arithmetic or a geometric calculation for the Sharpe Ratio?

What are the advantages/disadvantages of using the arithmetic Sharpe Ratio vs the geometric Sharpe Ratio? Is one more correct? Or is one better in certain circumstances?
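
To make the comparison concrete, here is a small sketch of both calculations. Definitions of the geometric variant differ between sources; the one assumed here replaces the arithmetic mean return with the geometric (compounded) mean and keeps the same standard deviation in the denominator. The sample returns and function names are made up for illustration.

```python
import statistics


def arithmetic_sharpe(returns, rf=0.0):
    """Mean excess return divided by the sample standard deviation."""
    excess = [r - rf for r in returns]
    return statistics.mean(excess) / statistics.stdev(excess)


def geometric_sharpe(returns, rf=0.0):
    """Geometric mean return minus rf, divided by the same standard deviation."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    geo_mean = growth ** (1.0 / len(returns)) - 1.0
    return (geo_mean - rf) / statistics.stdev(returns)


returns = [0.10, -0.05, 0.07, -0.02, 0.04]  # hypothetical per-period returns
print(arithmetic_sharpe(returns), geometric_sharpe(returns))
```

Because the geometric mean of a varying return series is always below its arithmetic mean, the geometric figure is the lower of the two, and the gap widens with volatility.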

by Kelly at September 28, 2016 11:30 AM


Unable to proceed after using aregImpute() of Hmisc in R

I am using Hmisc package in R for the imputation of missing values and used

impute_arg <- aregImpute(~ Direction_Of_Wind + Average_Breeze_Speed + Max_Breeze_Speed + Min_Breeze_Speed , data = train.mis, n.impute = 5)

Now I want to use the result stored in impute_arg together with the original data. What should I do after computing impute_arg so that the missing values in my data are filled in?

by Lok at September 28, 2016 11:16 AM


Proof that median of an array is the number that minimizes the sum of manhattan distance to all points

Given a sorted array A, the problem is to find a number that minimizes the sum of Manhattan distances to the numbers in the array. I found that the median of A is the solution, but was not able to come up with a proof or explanation for it (i.e. why it is not the mean).
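
The intuition behind the claim is a counting argument: moving the candidate x slightly to the right increases its distance to every point on its left and decreases its distance to every point on its right, so the total only stops improving when the two counts balance, which is at the median. The sketch below merely checks this numerically on a made-up array; it is not a proof.

```python
import statistics


def total_distance(points, x):
    """Sum of absolute (one-dimensional Manhattan) distances from x to all points."""
    return sum(abs(p - x) for p in points)


A = [1, 2, 3, 10, 100]  # hypothetical sorted input
median = statistics.median(A)
mean = statistics.mean(A)

print(median, total_distance(A, median))
print(mean, total_distance(A, mean))
```

The outlier 100 drags the mean far to the right, where it is further from most of the points, while the median stays put.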

Any help is highly appreciated.

by Vamsi Krishna at September 28, 2016 11:12 AM


Bulk remove a large directory on a ZFS without traversing it recursively

I want to remove a directory that has large amounts of data on it. This is my backup array, which is a ZFS filesystem, linear span, single pool. The data is in this pool called san, mounted on /san so I want to bulk remove /san/thispc/certainFolder

$ du -h -d 1 certainFolder/
1.2T    certainFolder/

Rather than waiting for rm -rf certainFolder/, can't I just destroy the handle to that directory so that it is overwritable (even by the same directory name, if I choose to recreate it)?

For example, I don't know much about ZFS's internal management, specifically how it maps directories, but if I found that map and removed the right entries, the directory would no longer be displayed; the space that the directory formerly held would also have to be removed from some kind of audit.

Is there an easy way to do this, even if on an ext3 fs, or is that already what the recursive remove command has to do in the first place, i.e. pilfer through and edit journals?

I'm just hoping to do something of the likes of kill thisDir to where it simply removes some kind of ID, and poof the directory no longer shows up in ls -la and the data is still there on the drive obviously, but the space will now be reused(overwritten), because it's ZFS.


I mean I think zfs is really that cool, how can we do it? rubbing hands together :-)

My specific use case (besides my love for zfs) is management of my backup archive. This backup dir is pushed to via freefilesync (AWESOME PROG) on my Windows box to an smb fileshare, but also has a version directory where old files go. I'm deleting top level directories that reside in the main backup, which were copied to the version directory, e.g. backup/someStuff, version/someStuff. The bi-monthly cleanup is rm -rf version/someStuff/* from a putty terminal, after which I have to open another terminal; I don't want to do that every time, and I'm tired of uselessly having to monitor rm -rf. I mean, maybe I should set the command to just release the handle, then print to std out. That might be nice. More realistically, I can recreate the dataset in a few seconds with zfs destroy san/version; zfs create -p -o compression=on san/version, following the thoughts in the response from @Gilles.

by Brian Thomas at September 28, 2016 11:07 AM

ZFS messed mirrors up

I have a ZFS pool with 6 drives in RAID10 -- well, it used to.

I tried to upgrade a 146GB drive to 1TB drive, and messed up bad.

root@x7550:~# zpool status
  pool: stuffpool
 state: ONLINE
  scan: scrub repaired 0 in 0h6m with 0 errors on Mon May  9 15:26:39 2016

    NAME                                               STATE     READ WRITE CKSUM
    stuffpool                                          ONLINE       0     0     0
      mirror-0                                         ONLINE       0     0     0
        ata-HGST_HTS721010A9E630_JR10004M0LGN6E-part1  ONLINE       0     0     0
        ata-HGST_HTS721010A9E630_JR10004M0M17TE-part1  ONLINE       0     0     0
      mirror-1                                         ONLINE       0     0     0
        ata-HGST_HTS541010A9E680_JA1000102MG9UR-part1  ONLINE       0     0     0
        ata-HGST_HTS541010A9E680_JA1009C03158BP-part1  ONLINE       0     0     0
      scsi-35000c50016ebcdfb-part1                     ONLINE       0     0     0
      ata-HGST_HTS541010A9E680_JA109NDW206MAS-part1    ONLINE       0     0     0

scsi-35000c50016ebcdfb-part1 used to be in mirror-2 and ata-HGST_HTS541010A9E680_JA109NDW206MAS-part1 is the drive I was trying to add to mirror-2.

Is there anything I can do to fix this?

I am running on ubuntu 16.04

root@x7550:~# zpool history stuffpool | tail -n50
History for 'stuffpool':
2016-03-27.01:56:12 zpool create stuffpool mirror ata-HGST_HTS721010A9E630_JR10004M0LGN6E-part1 ata-HGST_HTS721010A9E630_JR10004M0M17TE-part1 -f
2016-03-27.01:57:41 zpool add stuffpool mirror /dev/disk/by-id/ata-HGST_HTS541010A9E680_JA1000102MG9UR-part1 /dev/disk/by-id/ata-HGST_HTS541010A9E680_JA1009C03158BP-part1 -f
2016-03-27.01:59:25 zpool add stuffpool mirror /dev/disk/by-id/scsi-35000c50016ebcdfb-part1 /dev/disk/by-id/scsi-35000c50017675203-part1 -f
2016-03-27.02:12:38 zpool import -c /etc/zfs/zpool.cache -aN
2016-03-27.23:48:32 zfs create stuffpool/stuff
2016-03-27.23:54:47 zpool import -c /etc/zfs/zpool.cache -aN
2016-03-28.00:02:23 zfs create stuffpool/backup
2016-03-30.23:18:04 zpool scrub stuffpool
2016-04-03.01:06:06 zpool import -c /etc/zfs/zpool.cache -aN
2016-04-03.01:15:33 zfs create -p -o mountpoint=/var/lib/lxd/images/f96b6b5d7587150b880e96f872393b7fee53741046b40a76c6db22ed40886bc9.zfs stuffpool/images/f96b6b5d7587150b880e96f872393b7fee53741046b40a76c6db22ed40886bc9
2016-04-03.01:15:53 zfs set readonly=on stuffpool/images/f96b6b5d7587150b880e96f872393b7fee53741046b40a76c6db22ed40886bc9
2016-04-03.01:15:54 zfs snapshot -r stuffpool/images/f96b6b5d7587150b880e96f872393b7fee53741046b40a76c6db22ed40886bc9@readonly
2016-04-03.01:16:00 zfs clone -p -o mountpoint=/var/lib/lxd/containers/ux-1.zfs stuffpool/images/f96b6b5d7587150b880e96f872393b7fee53741046b40a76c6db22ed40886bc9@readonly stuffpool/containers/ux-1
2016-04-08.01:31:47 zpool import -c /etc/zfs/zpool.cache -aN
2016-04-08.01:43:48 zpool import -c /etc/zfs/zpool.cache -aN
2016-04-19.00:00:30 zpool import -c /etc/zfs/zpool.cache -aN
2016-04-21.18:14:15 zfs create -p -o mountpoint=/var/lib/lxd/images/9b03bacc30bcfbe3378e8803daa48ca2d32baa99d111efada484876750e5cc20.zfs stuffpool/images/9b03bacc30bcfbe3378e8803daa48ca2d32baa99d111efada484876750e5cc20
2016-04-21.18:14:35 zfs set readonly=on stuffpool/images/9b03bacc30bcfbe3378e8803daa48ca2d32baa99d111efada484876750e5cc20
2016-04-21.18:14:36 zfs snapshot -r stuffpool/images/9b03bacc30bcfbe3378e8803daa48ca2d32baa99d111efada484876750e5cc20@readonly
2016-04-21.18:14:36 zfs set mountpoint=none stuffpool/images/f96b6b5d7587150b880e96f872393b7fee53741046b40a76c6db22ed40886bc9
2016-04-21.18:14:41 zfs rename -p stuffpool/images/f96b6b5d7587150b880e96f872393b7fee53741046b40a76c6db22ed40886bc9 stuffpool/deleted/images/f96b6b5d7587150b880e96f872393b7fee53741046b40a76c6db22ed40886bc9
2016-04-24.22:54:03 zpool scrub stuffpool
2016-05-07.22:55:42 zpool import -c /etc/zfs/zpool.cache -aN
2016-05-09.15:20:27 zpool scrub stuffpool
2016-05-17.22:56:53 zfs create -p -o mountpoint=/var/lib/lxd/images/4f7a1fe6b71446eba6ee56f49698bd6592f193f731f1c0d9d51b1d199b9b75a5.zfs stuffpool/images/4f7a1fe6b71446eba6ee56f49698bd6592f193f731f1c0d9d51b1d199b9b75a5
2016-05-17.22:57:12 zfs set readonly=on stuffpool/images/4f7a1fe6b71446eba6ee56f49698bd6592f193f731f1c0d9d51b1d199b9b75a5
2016-05-17.22:57:13 zfs snapshot -r stuffpool/images/4f7a1fe6b71446eba6ee56f49698bd6592f193f731f1c0d9d51b1d199b9b75a5@readonly
2016-05-17.22:57:18 zfs destroy -r stuffpool/images/9b03bacc30bcfbe3378e8803daa48ca2d32baa99d111efada484876750e5cc20
2016-05-21.16:47:49 zpool import -c /etc/zfs/zpool.cache -aN
2016-06-09.22:59:47 zpool import -c /etc/zfs/zpool.cache -aN
2016-06-13.20:59:10 zpool import -c /etc/zfs/zpool.cache -aN
2016-06-13.20:59:34 zfs create -p -o mountpoint=/var/lib/lxd/images/49fc7d0d6f01a7639129308b73ad27f5fb7b9d3bb783d905393b6b9e9c4bf1c5.zfs stuffpool/images/49fc7d0d6f01a7639129308b73ad27f5fb7b9d3bb783d905393b6b9e9c4bf1c5
2016-06-13.20:59:54 zfs set readonly=on stuffpool/images/49fc7d0d6f01a7639129308b73ad27f5fb7b9d3bb783d905393b6b9e9c4bf1c5
2016-06-13.20:59:54 zfs snapshot -r stuffpool/images/49fc7d0d6f01a7639129308b73ad27f5fb7b9d3bb783d905393b6b9e9c4bf1c5@readonly
2016-06-13.21:00:00 zfs destroy -r stuffpool/images/4f7a1fe6b71446eba6ee56f49698bd6592f193f731f1c0d9d51b1d199b9b75a5
2016-06-18.02:18:55 zpool import -c /etc/zfs/zpool.cache -aN
2016-06-18.02:27:08 zpool offline stuffpool 1759097636360003165
2016-06-18.02:33:28 zpool detach stuffpool 1759097636360003165
2016-06-18.12:23:26 zpool export stuffpool
2016-06-18.12:24:38 zpool import stuffpool
2016-06-18.12:27:34 zpool add -f stuffpool ata-HGST_HTS541010A9E680_JA109NDW206MAS-part1
2016-06-18.12:31:05 zpool export stuffpool
2016-06-18.13:19:17 zpool import stuffpool

All the ATA drives are 1TB and the SCSI drives are 146GB.

Here is the usage info

root@x7550:~# zpool list
stuffpool  2.85T   162G  2.69T         -     2%     5%  1.00x  ONLINE  -

This is my personal server, so downtime isn't an issue.

by user838 at September 28, 2016 11:02 AM



Old prototxt syntax in caffe

I'm working with some older branch of caffe. Now I need to modify the prototxt file by slicing the input layer.

I know that in the new syntax it looks like this:

layer {
  name: "slice"
  type: "Slice"
  bottom: "labelAndMask"
  ## Example of layer with a shape N x 5 x Height x Width
  top: "label"
  top: "mask"
  slice_param {
    axis: 1
    slice_point: 1
  }
}

What would be the equivalent in the old prototxt format? Also, where in the caffe sources could I look this up by myself?

by mcExchange at September 28, 2016 10:58 AM

Fred Wilson

Machine Learning As A Service

Our portfolio company Clarifai introduced two powerful new features on their machine learning API yesterday:

  • visual search
  • train your own model

Visual search is super cool:


But I am even more excited about the train your own model feature.

Clarifai says it well on their blog post announcing these two new features:

We believe that the same AI technology that gives big tech companies a competitive edge should be available to developers or businesses of any size or budget. That’s why we built our new Custom Training and Visual Search products – to make it easy, quick, and inexpensive for developers and businesses to innovate with AI, go to market faster, and build better user experiences.

Machine learning requires large data sets and skilled engineers to build the technology that can derive “intelligence” from data. Small companies struggle with both. And so without machine learning as a service from companies like Clarifai, the largest tech companies will have a structural advantage over small developers. Using an API like Clarifai allows you to get the benefits of scale collectively without having to have that scale individually.

Being able to customize these machine learning APIs is really the big opening. Clarifai says this about that:

Custom Training allows you to build a Custom Model where you can “teach” AI to understand any concept, whether it’s a logo, product, aesthetic, or Pokemon. Visual Search lets you use these new Custom Models, in conjunction with our existing pre-built models (general, color, food, wedding, travel, NSFW), to browse or search through all your media assets using keyword tags and/or visual similarity.

If you are building or have built a web or mobile service with a lot of image assets and want to get more intelligence out of them, give Clarifai’s API a try. I think you will find it a big help in adding intelligence to your service.

by Fred Wilson at September 28, 2016 10:37 AM


Problems with creating a decision tree and splitting on an attribute?

So I'm trying to split on an attribute "Color" that has possible values (Blue,Green,Red,Orange,Pink).

I'm splitting on entropy values, and the best split can either be Multi-Way 5, Multi-Way 4, Multi-Way 3, or Binary. For example:

5: (Blue, Green,Red,Orange,Pink)

4: (Blue, Green), (Red), (Orange), (Pink)
   (Green,Pink), (Blue),(Red),(Orange)

3: (Red,Orange), (Blue,Green), (Pink)
   (Red,Blue), (Green, Orange), (Pink)

2: (Blue,Green,Red), (Orange,Pink)
   (Pink), (Blue, Green, Red, Orange)

And so on. But how can I make a comprehensive list of all the possible splits? Is there a specific algorithm I could use? Or how would I even know how many max possible combinations there are with this?

Any help would be greatly appreciated, thanks!!!

by ocean800 at September 28, 2016 10:35 AM


Fama-French alpha for a single stock

I want to study the impact of corporate culture on risk-adjusted stock returns. After quantifying corporate culture, I want to use panel methodology (I have a sample of 100 S&P500 companies over 10 years) to study the question. For the risk-adjusted returns I use Jensen's alpha from the CAPM and the Treynor measure, and estimate them for each month using rolling regressions on the previous 36 months of returns. After estimating monthly alphas for each stock, I calculate cumulative quarterly alphas ((1+r_m1)(1+r_m2)(1+r_m3)-1) and regress the quarterly alphas on one-period lagged corporate culture measures. However, I also want to calculate alphas using the Fama-French model. Can I do this for single stocks and proceed in the same fashion? I have read journal articles and found no paper where an author calculates alpha for a single stock; portfolios are constructed everywhere. Can anyone explain why constructing portfolios is favoured, and assess my model? Does it make sense to calculate alphas for single stocks?

Thank you!

by Daria Diachenko at September 28, 2016 10:30 AM


How to simulate systems' unavailability?

I am simulating a system where the clients' unavailability is a major factor in the system's behavior.

I gathered the real system logs to analyze the unavailability of the clients in the deployment environments and I computed the values for Mean Time Between Failure (MTBF) and Mean Time To Repair (MTTR).

Now I want to simulate such a system failure in my simulator. Currently what I am doing is as follows:

1. Every client rolls the dice every 1 minute (Uniform random in [0, 1])
2. If the number is larger than a specified probability (P_unavail),
   the client becomes unavailable for the next (T_unavail) minutes (i.e. (T_unavail) = 10min).

The simulation does not need to capture every detail of the real world. Showing the trends (how often do failures occur? how long does a repair take?) is enough. However, the simulation method and parameters should not be arbitrary.

For anyone who has some knowledge of this:

  • Is the method I am using appropriate?
  • If yes, what values for P_unavail and T_unavail would be appropriate? Can I compute them from the MTBF and MTTR obtained in the log analysis?
  • If no, could you give me any ideas?
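One common alternative to a fixed per-minute roll is to draw the time until the next failure and the repair duration from distributions whose means are the measured MTBF and MTTR. The sketch below assumes exponential distributions and treats MTBF as the mean up-time; both are modelling assumptions, and the parameter values (120 and 10 minutes) are hypothetical rather than taken from any log.

```python
import random

def simulate_availability(mtbf_min, mttr_min, horizon_min, seed=0):
    """Return a list of (start, end) outage intervals within [0, horizon_min).

    Assumption: up-times and repair times are exponentially distributed
    with means mtbf_min and mttr_min (a modelling choice, not a given).
    """
    rng = random.Random(seed)
    outages = []
    t = 0.0
    while True:
        t += rng.expovariate(1.0 / mtbf_min)      # time until the next failure
        if t >= horizon_min:
            break
        repair = rng.expovariate(1.0 / mttr_min)  # how long this outage lasts
        end = min(t + repair, horizon_min)
        outages.append((t, end))
        t = end
    return outages

# Hypothetical parameters: MTBF = 120 min, MTTR = 10 min, one simulated year.
HORIZON = 365 * 24 * 60
outages = simulate_availability(120.0, 10.0, HORIZON)
downtime = sum(end - start for start, end in outages)
fraction_down = downtime / HORIZON
print("outages:", len(outages))
print("fraction unavailable: %.3f" % fraction_down)
```

Under these assumptions the long-run fraction of unavailable time should approach MTTR / (MTBF + MTTR). In the per-minute scheme from the question, the analogous choice would be a failure probability of roughly 1/MTBF per minute (MTBF in minutes) with T_unavail set to the MTTR, and the roll would have to trigger a failure when it is smaller than that probability, not larger.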


by syko at September 28, 2016 10:27 AM

Reference request for coding Knight's Tour

Could someone give an easily accessible reference containing an algorithm for computing a Knight's Tour that can be conveniently implemented in code (preferably one that is also fairly efficient)?
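One widely described approach is Warnsdorff's heuristic: always move to the square that has the fewest onward moves. The sketch below combines that ordering with plain backtracking so that a dead end is undone rather than fatal; the function and variable names are illustrative only.

```python
# The eight knight moves as (row, column) offsets.
MOVES = [(1, 2), (2, 1), (2, -1), (1, -2), (-1, -2), (-2, -1), (-2, 1), (-1, 2)]

def knights_tour(n, start=(0, 0)):
    """Return an n x n board holding the visit order 0 .. n*n-1, or None."""
    board = [[-1] * n for _ in range(n)]

    def free_neighbours(r, c):
        return [(r + dr, c + dc) for dr, dc in MOVES
                if 0 <= r + dr < n and 0 <= c + dc < n
                and board[r + dr][c + dc] == -1]

    def extend(r, c, step):
        board[r][c] = step
        if step == n * n - 1:
            return True
        # Warnsdorff's rule: try the square with the fewest onward moves first.
        candidates = sorted(free_neighbours(r, c),
                            key=lambda sq: len(free_neighbours(*sq)))
        for nr, nc in candidates:
            if extend(nr, nc, step + 1):
                return True
        board[r][c] = -1  # dead end: undo and let the caller try another move
        return False

    return board if extend(start[0], start[1], 0) else None

tour = knights_tour(8)
for row in tour:
    print(" ".join("%2d" % cell for cell in row))
```

The heuristic ordering usually finds a tour with little or no backtracking on an 8 x 8 board; the backtracking is there only as a safety net.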

by Mok-Kong Shen at September 28, 2016 10:15 AM


It is slowly becoming apparent that all the massive ...

It is slowly becoming apparent that all the massive DDoS attacks of late came from IoT devices. Latest case: OVH.

Which coked-up marketing moron was it again who had the idea of connecting refrigerators and televisions to the internet? In this case, the devices appear to be cameras.

September 28, 2016 10:00 AM

Speaking of coked-up marketing bullshit: Microsoft ...

Speaking of coked-up marketing bullshit:
Microsoft and Bank of America Merrill Lynch collaborate to transform trade finance transacting with Azure Blockchain as a Service

September 28, 2016 10:00 AM


How do you get a probability of all classes to predict without building a classifier for each single class?

Given a classification problem, sometimes we do not just want to predict a class but need to return the probability of each class,

i.e. P(y=0|x), P(y=1|x), P(y=2|x), ..., P(y=C|x),

without building a separate classifier to predict each of y=0, y=1, y=2, ..., y=C, since training C classifiers (say C=100) can be quite slow.

How can this be done? Which classifiers can naturally give all the probabilities easily (one I know of is a neural network with 100 output nodes)? If I use traditional random forests, I can't do that, right? I use the Python scikit-learn library.
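For context, scikit-learn classifiers such as `RandomForestClassifier` and `LogisticRegression` expose a `predict_proba` method that returns one probability per class from a single fitted model. The sketch below only illustrates the underlying idea in plain Python: one model produces a score per class, and a softmax turns those scores into probabilities. The scores are hypothetical.

```python
import math

def softmax(scores):
    """Turn one raw score per class into probabilities that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for classes y=0, y=1, y=2, all produced by ONE model.
probs = softmax([2.0, 1.0, 0.1])
print(probs)
```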

by Log0 at September 28, 2016 09:53 AM

Will a NumPy matrix approach perform better than iterating over a large list when the per-item computation is complex?

Suppose I have a complex function:

def func(text_1, text_2):
    # Compute a confidence score for text_1 by comparing it with text_2,
    # using cosine_similarity, tf-idf functions, etc.
    confidence_score = ...  # result of the comparison
    return (text_1, confidence_score)

I also have a list of about 2k items. Iterating over it and passing each item to the function above takes a lot of time, and I want an alternative way to improve performance. The solution can be anything, as long as it is faster.

For more clarity: my list contains the trained texts, so my loop looks like this:

for text_in in list_of_texts:

This approach is killing performance.

I am thinking of using a NumPy matrix, but I suspect that building the matrix from the list will take about the same time.

What is the best solution from here? Please suggest all possible solutions.

A simple example would be appreciated.

by Achyuta nanda sahoo at September 28, 2016 09:40 AM


WebDAV server with PAM auth and system file permissions?

Basically, what I'm looking for is Samba - except that I want it to be WebDAV in the front. The requirements are that the users can log in to the WebDAV dir with their system account, and files will have their user and group set accordingly.

I know there's sambadav, but frustratingly, it seems to not be possible to get write functionality with it under FreeBSD.

by Daniel Ziltener at September 28, 2016 09:31 AM


Normalize data model in RxJava without for loops and temporary arrays

Given the following response from an API request, I would like to normalize the data model into a simpler one using RxJava, without using any loops or temporary arrays to store information.

Given: List<MyItem>

MyItem:
String category_name
List<Switch> switches;

Switch:
String id
boolean active

[{
	"category_name": "Sport",
	"switches": [{
		"id": "sport_01",
		"active": true
	}, {
		"id": "sport_02",
		"active": false
	}, {
		"id": "sport_03",
		"active": true
	}]
}, {
	"category_name": "Economy",
	"switches": [{
		"id": "economy_01",
		"active": true
	}, {
		"id": "economy_02",
		"active": true
	}]
}]

Expected Normalised: List<MyViewModel>

MyViewModel:
String categoryName
String switchId
boolean switchActiveState

[{
	"category_name": "Sport",
	"id": "sport_01",
	"active": true
}, {
	"category_name": "Sport",
	"id": "sport_02",
	"active": false
}, ...]

My first approach was the following

Observable<MyItem> mMyItemObservable = mApi.getReponse();

Observable<Switch> switchObservable = mMyItemObservable.flatMap(new Func1<MyItem, Observable<List<Switch>>>() {
            @Override
            public Observable<List<Switch>> call(final MyItem item) {
                return Observable.defer(new Func0<Observable<List<Switch>>>() {
                    @Override public Observable<List<Switch>> call() {
                        return Observable.just(item.switches);
                    }
                });
            }
        }).flatMapIterable(new Func1<List<Switch>, Iterable<Switch>>() {
            @Override
            public Iterable<Switch> call(List<Switch> switches) {
                return switches;
            }
        });

I have tried to use the following:

Observable.combineLatest(mMyItemObservable, switchObservable, Func...)

in order to produce the end-result List, but with no success, most probably because the lengths of the two observables differ: mMyItemObservable has length 2 (items) and switchObservable has length 5 (items).

Any ideas of alternative ways to group these 2 observables together to achieve the end result?

by gemini at September 28, 2016 09:27 AM


"Hedging" a put option, question on exercise

I have a question on the following exercise from S. Shreve: Stochastic Calculus for Finance, I:

Exercise 4.2. In Example 4.2.1, we computed the time-zero value of the American put with strike price $5$ to be $1.36$. Consider an agent who borrows $1.36$ at time zero and buys the put. Explain how this agent can generate sufficient funds to pay off his loan (which grows by $25 \%$ each period) by trading in the stock and money markets and optimally exercising the put.

The model from Example 4.2.1 he refers to is the following: \begin{align*} S_0 & = 4 \\ S_1(H) & = 8, S_1(T) = 2 \\ S_2(HH) & = 16, S_2(HT) = S_2(TH) = 4, S_2(TT) = 1 \end{align*} with $r = 1/4$ and risk-neutral probabilities $\hat p = \hat q = 1/2$.
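The value $1.36$ quoted from Example 4.2.1 can be checked by backward induction on this tree: at each node the American put is worth the larger of its intrinsic value and the discounted risk-neutral expectation of its value one step later. The sketch below uses exact fractions; the function and variable names are illustrative only.

```python
from fractions import Fraction

r = Fraction(1, 4)       # one-period interest rate
p = q = Fraction(1, 2)   # risk-neutral probabilities of H and T
K = 5                    # strike price
stock = {
    "": 4,
    "H": 8, "T": 2,
    "HH": 16, "HT": 4, "TH": 4, "TT": 1,
}

def american_put(path="", steps=2):
    """Value of the American put at the node reached by the coin tosses in path."""
    intrinsic = Fraction(max(K - stock[path], 0))
    if len(path) == steps:
        return intrinsic
    continuation = (p * american_put(path + "H", steps)
                    + q * american_put(path + "T", steps)) / (1 + r)
    return max(intrinsic, continuation)

print(american_put())     # 34/25, i.e. 1.36
print(american_put("H"))  # 2/5
print(american_put("T"))  # 3
```

At the node $T$ the intrinsic value $3$ exceeds the continuation value $2$, so exercising at time one after a tail is where early exercise pays off in this model.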

So now my question: I am not sure how to solve this. I guess the agent wants to hedge in some way so that he can always repay the loan by utilising the option. First, how should I think about it: should he pay back his loan at time step $2$, or earlier if possible by exercising the option, or should he stay liquid until the end? And what does optimal mean here, hedging with minimal investment?

Okay, I solved it by considering two scenarios: first exercising the option at time step $1$, and then at time step $2$. For exercising at time step one, I found that he has to invest an additional $0.16$, buy $1/2$ of a share, and accordingly borrow $0.16 - 1/2 \cdot 4$ from the bank/money market. Then at the first time step, as the option was exercised and the value of the portfolio equals $(1 + r)1.36$, he could just invest everything risklessly, i.e. readjust his portfolio by not buying any shares of stock and investing $(1+r)1.36$ in the riskless asset; in this way at time step $2$ he could pay $(1+r)^2 1.36$.

In the second scenario, i.e. exercising the option at time step $2$, I found that he has to invest an additional $1.36$ and buy no shares at the initial step, and then readjust his portfolio in the next step so as to buy $1/12$ of a share if the stock goes up and $1.06$ if it goes down. By exercising the option, if the stock goes up, after paying his debt $(1+r)^2 1.36$ his portfolio has the value $1.3$, meaning he still has money, or $-0.3$ if it goes down, meaning he still has some debt (this point I do not fully understand).

So can someone help me understand and solve this exercise (in case my approach is wrong)?

by StefanH at September 28, 2016 09:21 AM



Level of data security in Virtual Data Rooms? [on hold]

Colleagues, has anybody used Virtual Data Rooms? Are they really helpful for data management? I want to use Virtueller Datenraum because I am from Germany. Or would it be better to use an American data room?

by Marie Huschwal at September 28, 2016 08:54 AM



How to train in Matlab a model, save it to disk, and load in C++ program?

I am using libsvm version 3.16. I have done some training in Matlab, and created a model. Now I would like to save this model to disk and load this model in my C++ program. So far I have found the following alternatives:

  1. This answer explains how to save a model from C++, which is based on this website. Not exactly what I need, but could be adapted. (This requires development time).
  2. I could find the best training options (kernel,C) in Matlab and re-train everything in C++. (Will require doing the training in C++ each time I change the option. Not scalable).

Thus, neither of these options is satisfactory.

Does anyone have an idea?

by Andrey Rubshtein at September 28, 2016 07:38 AM

How to correctly differentiate the quadratic cost function

Given the quadratic cost function f(a) = 1/2 (a-y)^2, I know that the derivative of the function with respect to a is a - y. But I have no clue how to get there. Can you provide a link where this is explained simply?
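The result follows from the chain rule. Writing $u = a - y$ (with $y$ treated as a constant), the cost is $f = \tfrac{1}{2}u^2$, and:

\begin{align*}
\frac{df}{da} &= \frac{df}{du} \cdot \frac{du}{da} = \left(\tfrac{1}{2} \cdot 2u\right) \cdot 1 = u = a - y.
\end{align*}

The same answer comes from expanding first: $f(a) = \tfrac{1}{2}(a^2 - 2ay + y^2)$, so $\frac{df}{da} = \tfrac{1}{2}(2a - 2y) = a - y$. The factor $\tfrac{1}{2}$ is there precisely so that it cancels the $2$ produced by the power rule.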

by Peter234 at September 28, 2016 07:34 AM

LightSIDE - Cannot extract Feature of Stretchy Patterns

I have a problem extracting features with LightSIDE. I use three feature extractors: Basic Features, Character N-Grams and Stretchy Patterns. However, when I try to extract Stretchy Patterns, it always returns this error:

java.lang.AbstractMethodError: edu.cmu.side.plugin.FeaturePlugin.extractFeatureHitsForSubclass(Ledu/cmu/side/model/data/DocumentList;)Ljava/util/Collection;
at edu.cmu.side.plugin.FeaturePlugin.extractFeatureHits(
at com.david.bow.service.LightSideService.buildFeatureTable(
at com.david.bow.service.LightSideService.prepareBuildFeatureTable(Unknown Source)
at com.david.bow.lightSide.LAH_segment.predictSectionType(
at com.david.bow.controller.TextSummaryController.textSummaryTest(
at com.david.bow.controller.TextSummaryController$$FastClassBySpringCGLIB$$912df307.invoke(<generated>)
at org.springframework.cglib.proxy.MethodProxy.invoke(
at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(
at org.springframework.transaction.interceptor.TransactionInterceptor$1.proceedWithInvocation(
at org.springframework.transaction.interceptor.TransactionAspectSupport.invokeWithinTransaction(
at org.springframework.transaction.interceptor.TransactionInterceptor.invoke(
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(
at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(
at com.david.bow.controller.TextSummaryController$$EnhancerBySpringCGLIB$$412318db.textSummaryTest(<generated>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(
at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(
at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(
at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(
at org.springframework.web.servlet.DispatcherServlet.doDispatch(
at org.springframework.web.servlet.DispatcherServlet.doService(
at org.springframework.web.servlet.FrameworkServlet.processRequest(
at org.springframework.web.servlet.FrameworkServlet.doGet(
at javax.servlet.http.HttpServlet.service(
at org.springframework.web.servlet.FrameworkServlet.service(
at javax.servlet.http.HttpServlet.service(
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(
at org.apache.catalina.core.ApplicationFilterChain.doFilter(
at org.apache.tomcat.websocket.server.WsFilter.doFilter(
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(
at org.apache.catalina.core.ApplicationFilterChain.doFilter(
at org.springframework.web.filter.OncePerRequestFilter.doFilter(
at org.springframework.web.filter.OncePerRequestFilter.doFilter(
at org.springframework.web.filter.OncePerRequestFilter.doFilter(
at org.springframework.web.filter.DelegatingFilterProxy.invokeDelegate(
at org.springframework.web.filter.DelegatingFilterProxy.doFilter(
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(
at org.apache.catalina.core.ApplicationFilterChain.doFilter(
at org.springframework.web.filter.RequestContextFilter.doFilterInternal(
at org.springframework.web.filter.OncePerRequestFilter.doFilter(
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(
at org.apache.catalina.core.ApplicationFilterChain.doFilter(
at org.springframework.web.filter.CharacterEncodingFilter.doFilterInternal(
at org.springframework.web.filter.OncePerRequestFilter.doFilter(
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(
at org.apache.catalina.core.ApplicationFilterChain.doFilter(
at org.apache.catalina.core.StandardWrapperValve.invoke(
at org.apache.catalina.core.StandardContextValve.invoke(
at org.apache.catalina.authenticator.AuthenticatorBase.invoke(
at org.apache.catalina.core.StandardHostValve.invoke(
at org.apache.catalina.valves.ErrorReportValve.invoke(
at org.apache.catalina.core.StandardEngineValve.invoke(
at org.apache.catalina.connector.CoyoteAdapter.service(
at org.apache.coyote.http11.AbstractHttp11Processor.process(
at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$ Source)
at org.apache.tomcat.util.threads.TaskThread$
at Source)

This is how I extract my features. I have already loaded the feature extractors and their configuration, and it works properly for the other extractors, just not for Stretchy Patterns.

protected Recipe buildFeatureTable(Recipe currentRecipe, String name, int threshold, String annotation, Type type) {
    FeaturePlugin activeExtractor = null;

    try {
        Collection<FeatureHit> hits = new HashSet<FeatureHit>();
        for (SIDEPlugin plug : currentRecipe.getExtractors().keySet()) {
            activeExtractor = (FeaturePlugin) plug;
            hits.addAll(activeExtractor.extractFeatureHits(currentRecipe.getDocumentList(), currentRecipe.getExtractors().get(plug)));
        }

        FeatureTable ft = new FeatureTable(currentRecipe.getDocumentList(), hits, threshold, annotation, type);
    } catch (Exception e) {
        System.err.println("Feature Extraction Failed");
    }

    return currentRecipe;
}

And these are my currentRecipe properties: [screenshot not preserved]

And this is my currentRecipe.extractor property :

{Stretchy Patterns={pMax=4, gMax=3, extractAllWordPatterns=true, countHits=false, rareThreshold=0.0, pMin=2, extractPOSCategories=false, requireCategoryHits=false, hideSurfaceFormsOfCategoryHits=true, gMin=1, extractWordCategories=true, extractAllPOSPatterns=false}, Basic Features={Bigrams=true, Contains Non-Stopwords=false, Count Occurences=true, Ignore All-stopword N-Grams=false, Include Punctuation=false, Line Length=false, Normalize N-Gram Counts=true, POS Bigrams=false, POS Trigrams=false, Skip Stopwords in N-Grams=true, Stem N-Grams=true, Track Feature Hit Location=false, Trigrams=true, Unigrams=true, Word/POS Pairs=false}, Character N-Grams={maxGram=4, Include Punctuation=true, Extract Only Within Words=true, minGram=2}}


When I call extractFeatureHits for Basic Features and Character N-Grams, it always goes to the abstract class ParallelFeaturePlugin, but for Stretchy Patterns it always goes to the class FeaturePlugin.

by David Vincent at September 28, 2016 06:53 AM

How to interpret the output of propensity score matching (MatchIt) in R

Can anybody explain Percent Balance Improvement in propensity score matching to me? I am using the MatchIt library in R. I ran the model successfully, but I am not able to interpret the output.

Percent Balance Improvement:
         Mean Diff. eQQ Med eQQ Mean eQQ Max
distance     7.8916  6.1858   8.5452 44.7717
Gender      10.5178  0.0000  10.9290  0.0000
Age          9.4807 33.3333  12.4580 94.7368
Marital      6.6576  0.0000   7.0984  0.0000

by Ajay Jadhav at September 28, 2016 06:51 AM

Planet Emacsen


Minimum spanning tree problem with special properties

If I have a graph ($n$ vertices and $m$ edges) whose edges have integer weights in the range $1 \dots k$, could I leverage this property to design an $O(k(n+m))$ algorithm for finding its minimum spanning tree?
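One way the small weight range can be exploited is to replace the sorting step of Kruskal's algorithm with $k$ buckets, one per weight, and then process the buckets in increasing order with a union-find structure. The sketch below is an illustration under that assumption; the function name and the example graph are hypothetical.

```python
def mst_small_weights(n, edges, k):
    """Kruskal's algorithm with edges bucketed by weight instead of sorted.

    n: number of vertices (labelled 0 .. n-1)
    edges: list of (u, v, w) with integer weights 1 <= w <= k
    Returns (total_weight, chosen_edges) of a minimum spanning forest.
    """
    buckets = [[] for _ in range(k + 1)]
    for u, v, w in edges:
        buckets[w].append((u, v, w))

    parent = list(range(n))
    size = [1] * n

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    total, chosen = 0, []
    for w in range(1, k + 1):  # visiting buckets in order replaces the sort
        for u, v, _ in buckets[w]:
            ru, rv = find(u), find(v)
            if ru != rv:
                if size[ru] < size[rv]:
                    ru, rv = rv, ru
                parent[rv] = ru        # union by size
                size[ru] += size[rv]
                total += w
                chosen.append((u, v, w))
    return total, chosen

# Hypothetical example: 4 vertices, weights in 1..3.
example_edges = [(0, 1, 1), (1, 2, 2), (2, 3, 1), (0, 3, 3), (0, 2, 2)]
total, chosen = mst_small_weights(4, example_edges, 3)
print(total, chosen)  # 4 [(0, 1, 1), (2, 3, 1), (1, 2, 2)]
```

Bucketing costs $O(m + k)$ and the union-find operations are near-linear in $m$, which stays within the $O(k(n+m))$ budget asked about.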

by MarsPlus at September 28, 2016 06:15 AM

Any algorithm better than O(N*logN) for finding the student with the largest average score in a list of N scores of the form StudentID, score

Question: Suppose you are given a CSV with N lines of the form StudentID, score. Multiple lines can have the same StudentID. Find the student with the maximum average score.

I can't think of any way to make it faster than O(N*logN).

  1. Maintain a hash table with the StudentID as the key and, as the value, an object holding the sum of scores and the number of scores seen for that StudentID --> O(N)

  2. Create an array from the values in the hash table, possibly converting each entry into an average while doing so

  3. Sort it: worst case O(N*logN), when every entry belongs to a different student

Is there any way to do it faster?
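Since only the single largest average is needed, the sort in step 3 can be replaced by one linear scan over the hash table, which keeps the whole procedure at expected O(N). A minimal sketch, with hypothetical sample data and an illustrative function name:

```python
import csv
import io

def best_average(rows):
    """rows: iterable of (student_id, score) pairs. Returns (student_id, average)."""
    totals = {}  # student_id -> (sum of scores, number of scores)
    for student_id, score in rows:
        s, c = totals.get(student_id, (0.0, 0))
        totals[student_id] = (s + float(score), c + 1)
    # One linear pass over the table instead of a sort.
    best = max(totals, key=lambda sid: totals[sid][0] / totals[sid][1])
    s, c = totals[best]
    return best, s / c

# Hypothetical CSV content.
data = "alice,90\nbob,70\nalice,80\nbob,100\ncarol,88\n"
result = best_average(csv.reader(io.StringIO(data)))
print(result)  # ('carol', 88.0)
```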

by Smart Home at September 28, 2016 06:15 AM

Big-O comparison different arrangements of similar code [duplicate]

The following code should have a run time of $O(N)$,


for (int x : array) {
    if (x < min) min = x;
    if (x > max) max = x;
}

but what about the following code?


for (int x : array) {
    if (x < min) min = x;
}
for (int x : array) {
    if (x > max) max = x;
}
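Both arrangements do a constant amount of work per element, so the second one costs N + N = 2N steps, which is still $O(N)$ because constant factors are dropped. The sketch below counts comparisons to make that concrete; it assumes the second comparison is meant to track the maximum, and the function names are illustrative.

```python
def min_max_one_loop(values):
    """Single pass: two comparisons per element."""
    lo = hi = values[0]
    steps = 0
    for x in values:
        steps += 2
        if x < lo:
            lo = x
        if x > hi:
            hi = x
    return lo, hi, steps

def min_max_two_loops(values):
    """Two passes: one comparison per element in each pass."""
    lo = hi = values[0]
    steps = 0
    for x in values:
        steps += 1
        if x < lo:
            lo = x
    for x in values:
        steps += 1
        if x > hi:
            hi = x
    return lo, hi, steps

data = [3, 1, 4, 1, 5, 9, 2, 6]
print(min_max_one_loop(data))   # (1, 9, 16)
print(min_max_two_loops(data))  # (1, 9, 16)
```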

by Anonymous Human at September 28, 2016 06:09 AM


Azure Machine Learning - Text Analytics C# Bad Request body even after verifying JSON body

I am working with the Azure Machine Learning Text Analytics REST API, located here. It requires sending a payload to the server via POST. I am trying to get results similar to those I get with IBM Watson.

Here is what I tried in a console app; this is the core code:

static IRestResponse GetResp(string url, string key, string jsonText) {
    IRestClient client = new RestClient(url);
    IRestRequest request = new RestRequest() { RequestFormat = DataFormat.Json };
    request.AddHeader("Content-Type", "application/json");
    request.AddHeader("Ocp-Apim-Subscription-Key", key);
    IRestResponse response = client.ExecuteAsPost(request, "POST");
    return response;
}

// Here is the code that serializes the object to look precisely like the advertised body, followed by the call:
string json = JsonConvert.SerializeObject(documents);
IRestResponse resp = GetResp("", TaxonomyGlueKey, json);

The message body from serializing "documents" is:

{
 "documents": [
  {
   "language": "en",
   "id": "4",
   "text": "Lateral internal sphincterotomy and fissurectomy"
  },
  {
   "language": "en",
   "id": "5",
   "text": "Fissurectomy and Botox injection"
  }
 ]
}

I get Bad Request errors. I've verified my request is sent and passing authentication (it had failed prior). I have tried many variations on this as well.

I am able to try my request body out and it works properly when copying text from debug variable directly to the body provided by Azure:

If I test using the above I get the response expected, status 200:

Transfer-Encoding: chunked
x-aml-ta-request-id: c4ea9fff-8068-42a3-99c4-68717acddcf5
X-Content-Type-Options: nosniff
apim-request-id: e5eb593b-96a3-4806-9143-1d83424569be
Date: Thu, 21 Jul 2016 14:14:44 GMT
Content-Type: application/json; charset=utf-8

   "documents": [
       "keyPhrases": [
      "id": "4"
      "keyPhrases": [
        "Botox injection"
      "id": "5"
  "errors": []

by Makk at September 28, 2016 06:05 AM

What kinds of functions are considered as "composable"?

The Wikipedia article Function composition (computer science) says:

Like the usual composition of functions in mathematics, the result of each function is passed as the argument of the next, and the result of the last one is the result of the whole.

I have two questions about it:

  1. Must a composable function have both arguments and a return value?

    If so, the following functions are not composable:

    def doNothing(): Unit = ()
    def myName(): String = "My name"
    def eat(food:String): Unit = ()

    Is my understanding correct?

  2. Can a composable function have side effects?

    def hello(name:String):String = {
      println("name: " + name) // side-effect
      name + "!"
    }

    Do we still consider it "composable"?

by Freewind at September 28, 2016 05:54 AM


Trading liquidity risk

I am trying to understand trading liquidity risk: "Trading liquidity risk occurs when an entity is unable to buy or sell a security at the market price due to a temporary inability to find a counterparty to transact on the other side of the trade." I found this definition somewhere on the net.

If this holds true, can I conclude that trading liquidity risk is common in an OTC market, where there are no intermediaries such as a CCP?

by user161976 at September 28, 2016 05:21 AM


What is the name of the word problem for free groups under straight line program encoding?

I believe that the word problem is the problem to decide whether two different expressions denote the same element of a suitably defined algebraic structure. For simplicity, let us focus on free groups here. (Because I'm only interested in free algebras, and for groups one might indeed call this a word problem.) The expressions $(b^{-1}c)^{-1}b^{-1}(ab^{-1})^{-1}$, $(ab^{-1}c)^{-1}$, and $a^{-1}bc^{-1}$ are examples of such expressions. The first and second expression denote the same element of the free group, while the third expression denotes a different element.

The straight line program encoding is basically the same concept as arithmetic circuits, without implicit commutativity. It is one of the natural encodings of elements for a free algebra. One way to define the straight line program encoding is like in definition 1.1 from one of the google results for straight line program: The straight line program encoding of $f$ is an evaluation circuit $\gamma$ for $f$, where the only operations allowed belong to $\{()^{-1},\cdot\}$. More precisely: $\gamma=(\gamma_{1-n},\dots,\gamma_0,\gamma_1,\dots,\gamma_L)$ where $f=\gamma_L$, $\gamma_{1-n}:=x_1,\dots,\gamma_0:=x_n$ and for $k>0$, $\gamma_k$ is one of the following forms: $\gamma_k=(\gamma_j)^{-1}$ or $\gamma_k=\gamma_i\cdot\gamma_j$ where $i,j<k$.

The application of the operation $()^{-1}$ can easily be restricted to $\gamma_k=(\gamma_j)^{-1}$ for $j\leq 0$ without increasing $L$ to more than $2L+n$. This means that we are basically talking about words over the alphabet $\{x_1,\dots,x_n,(x_1)^{-1},\dots,(x_n)^{-1}\}$, hence the name "word problem" makes sense. But it seems a bad name for the general problem to decide whether two elements of a free algebra given by straight line programs are identical. It might be called identity testing.

Does the problem (to decide whether two elements of a free algebra given by straight line programs are identical) already have an established name, or is there a good name for this problem?

Maybe a better idea would be to give a name to the complementary problem, i.e. the problem to distinguish two different elements of a free algebra. So calling it slp distinction problem for free groups (commutative rings, commutative inverse rings, Boolean rings, ...) could work, because straight line program (slp) is a long name (but good and descriptive nevertheless). The advantage of naming the complementary problem is that we get problems in RP and NP, instead of problems in co-RP and co-NP.

The computational complexity of this problem is not worse than that of identity testing of constant polynomials over $\mathbb Z$ in straight line program encoding (no variables, i.e. $n=0$, but the straight line programs allow to compactly encode huge numbers): Using the same approach as in the dlog-space algorithm for the normal word problem, the problem can be reduced to deciding whether the product of integer 2x2 matrices equals the identity matrix. (The word problem over $n$ letters easily embeds into the word problem over $2$ letters, for example you can replace $a$, $b$, $c$, $d$ by $aa$, $ab$, $ba$, and $bb$.) So the problem is in randomized polynomial time (RP) (or rather co-RP). However, I didn't manage to show that it is actually equivalent (in complexity) to identity testing of (constant) polynomials over $\mathbb Z$, as I initially hoped. (This is unrelated to the answer by D.W., which rather shows that the significance of straight line encoding is currently not widely appreciated.)

by Thomas Klimpel at September 28, 2016 05:12 AM

Can we have a poly time reduction from 2-SAT to 2-Coloring problem?

I know that given a 2-Coloring instance we can easily convert it into a 2-SAT instance in polynomial time. Is the reverse possible, i.e. given a 2-SAT instance, can we convert it into a 2-Coloring instance in polynomial time?

by MysticForce at September 28, 2016 05:05 AM



Should cash-flows discounted at WACC be pre- or post-tax?

WACC in my mind is effectively a post-tax measure: $$\text{WACC} = \frac{E}{V} k_e+\frac{D}{V}k_d(1-t)$$ In this case should cash-flows, in particular loan cash-flows be adjusted for tax as well? Imagine a scenario where a company buys a portfolio of loans. The company is trying to estimate whether to buy the portfolio and for how much. Market approach is not feasible as these transactions do not have publicly available prices, the portfolio is very specific. Question is whether the cash-flows, interest cash flows specifically should be adjusted for the corporate tax rate $t$ by adjusting the total interest payment received $I$ by $(1-t)$ reflecting the fact that the company will have to pay taxes on interest received.

by PBD10017 at September 28, 2016 05:02 AM


Is this property of a functor stronger than a monad?

While thinking about how to generalize monads, I came up with the following property of a functor F:

inject :: (a -> F b) -> F(a -> b) 

-- which should be a natural transformation in both a and b.

In absence of a better name, I call the functor F bindable if there exists a natural transformation inject shown above.

The main question is, whether this property is already known and has a name, and how is it related to other well-known properties of functors (such as, being applicative, monadic, pointed, traversable, etc.)

The motivation for the name "bindable" comes from the following consideration: Suppose M is a monad and F is a "bindable" functor. Then one has the following natural morphism:

fbind :: M a -> (a -> F(M b)) -> F(M b)

This is similar to the monadic "bind",

bind :: M a -> (a -> M b) -> M b

except the result is decorated with the functor F.

The idea behind fbind was that a generalized monadic operation can produce not just a single result M b but a "functor-ful" F of such results. I want to express the situation when a monadic operation yields several "strands of computation" rather than just one; each "strand of computation" being again a monadic computation.

Note that every functor F has the morphism

eject :: F(a -> b) -> a -> F b

which is converse to "inject". But not every functor F has "inject".

Examples of functors that have "inject": F t = (t,t,t) or F t = c -> (t,t) where c is a constant type. Functors F t = c (constant functor) or F t = (c,t) are not "bindable" (i.e. do not have "inject"). The continuation functor F t = (t -> r) -> r also does not seem to have inject.

The existence of "inject" can be formulated in a different way. Consider the "reader" functor R t = c -> t where c is a constant type. (This functor is applicative and monadic, but that's beside the point.) The "inject" property then means R (F t) -> F (R t), in other words, that R commutes with F. Note that this is not the same as the requirement that F be traversable; that would have been F (R t) -> R (F t), which is always satisfied for any functor F with respect to R.

So far, I was able to show that "inject" implies "fbind" for any monad M.

In addition, I showed that every functor F that has "inject" will also have these additional properties:

  • it is pointed

point :: t -> F t

  • if F is "bindable" and applicative then F is also a monad

  • if F and G are "bindable" then so is the pair functor F * G (but not F + G)

  • if F is "bindable" and A is any profunctor then the (pro)functor G t = A t -> F t is bindable

  • the identity functor is bindable.

Open questions:

  • is the property of being "bindable" equivalent to some other well-known properties, or is it a new property of a functor that is not usually considered?

  • are there any other properties of the functor "F" that follow from the existence of "inject"?

  • do we need any laws for "inject", would that be useful? For instance, we could require that R (F t) be isomorphic to F (R t) in one or both directions.

by winitzki at September 28, 2016 05:01 AM


How do I allow user permissions to data on a Couchpotato and Sickbeard Plugin using Freenas

Trying to setup permissions for the plugins sickbeard / couchpotato.

I’ve read almost every thread google / the forums have but haven’t really found a solution yet.
I’m assuming it’s the data here: (Jail name)/usr/pbi/xxxxxxxxxx
I tried:

chown -R guest:guest /usr/pbi/sickbeard-amd64/*  
chown -R guest:guest /usr/pbi/couchpotato-amd64/*  

But I couldn’t get it to work.
Or would I have to chmod 777 the folders?

by Supa at September 28, 2016 05:01 AM



Is there a popular curve fitting formula of options skew vs strike price or vs Delta?

I am trying to build an options trading/optimization system. But it gets more inaccurate as it scans through options far from ATM, because of the options skew.

That is because I did not price in the options skew, or the jump premium. I am wondering if there is a popular formula that takes the "degree of options skew" and either the strike price or Delta as inputs, and gives the skew premium in terms of IV as output.

Thank you very much.

by user496 at September 28, 2016 04:59 AM


Why does most modern computer hardware perform sign extension?

Why does most modern computer hardware perform sign extension? Give an example where this is desirable and an example where it does the wrong thing.
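Sign extension copies the sign bit of a narrow two's-complement value into the new upper bits when the value is widened, so the signed number it represents is unchanged. The sketch below contrasts it with zero extension on an 8-bit pattern widened to 16 bits; the helper names are illustrative.

```python
def sign_extend(value, from_bits, to_bits):
    """Widen a from_bits-wide two's-complement bit pattern to to_bits bits."""
    low_mask = (1 << from_bits) - 1
    value &= low_mask
    if value & (1 << (from_bits - 1)):            # sign bit set?
        value |= ((1 << to_bits) - 1) & ~low_mask  # fill the upper bits with 1s
    return value

def zero_extend(value, from_bits, to_bits):
    """Widen by filling the upper bits with 0s."""
    return value & ((1 << from_bits) - 1)

def as_signed(value, bits):
    """Interpret a bits-wide pattern as a two's-complement integer."""
    return value - (1 << bits) if value & (1 << (bits - 1)) else value

byte = 0xFB  # -5 as a signed byte, 251 as an unsigned byte

# Desirable: a signed -5 stays -5 after widening.
print(hex(sign_extend(byte, 8, 16)), as_signed(sign_extend(byte, 8, 16), 16))  # 0xfffb -5

# Wrong thing: an unsigned 251 (say, a pixel value) becomes 65531.
print(sign_extend(byte, 8, 16), zero_extend(byte, 8, 16))  # 65531 251
```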

by user59011 at September 28, 2016 04:50 AM

Context Free Grammar for Try-Catch-Finally Statement and Throw

I am currently working on a problem that asks me to write the grammar productions for the try-catch-finally and throw statements in C#. It states that "you can assume that there are nonterminals "Type", "Expression", and "Statement", as well as a terminal "Ident"". I do not understand how this is possible using so few nonterminals and only one terminal.

by JanVanLeiden at September 28, 2016 04:33 AM

Range of $ 4 $-digit numbers in radix complement

Find the range (in base $ 10 $ number system) of $ 4 $-digit numbers in radix complement with radix equal to $ 2, 8, $ and $ 10, $ respectively. For each radix, find the two most positive and negative values and the values at and near zero.

I have the formula for the base $ 10 $ value of an arbitrary number of the form $ \displaystyle d_{p - 1}d_{p - 2} \dots d_{1}d_{0}.d_{-1}d_{-2} \dots d_{-n} $, namely $ \displaystyle \sum_{i = -n}^{p - 1} \; d_{i} \cdot r^i $
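The quoted sum gives the unsigned value of a digit string. In radix complement, a $p$-digit integer whose unsigned value is $u$ stands for $u - r^p$ when it falls in the upper half of the range, so the representable integers run from $-r^p/2$ to $r^p/2 - 1$. The sketch below assumes an even radix and the usual convention that the upper half of the patterns is negative; the function names are illustrative.

```python
def radix_complement_value(digits, radix):
    """Signed value of an integer digit string (most significant digit first)."""
    unsigned = 0
    for d in digits:
        unsigned = unsigned * radix + d
    modulus = radix ** len(digits)
    # Patterns in the upper half of the range represent negative numbers.
    return unsigned - modulus if unsigned >= modulus // 2 else unsigned

def radix_complement_range(n_digits, radix):
    """Smallest and largest value representable with n_digits digits."""
    modulus = radix ** n_digits
    return -(modulus // 2), modulus // 2 - 1

for radix in (2, 8, 10):
    print(radix, radix_complement_range(4, radix))
# 2 (-8, 7)
# 8 (-2048, 2047)
# 10 (-5000, 4999)

print(radix_complement_value([9, 9, 9, 9], 10))  # -1
print(radix_complement_value([5, 0, 0, 0], 10))  # -5000
print(radix_complement_value([4, 9, 9, 9], 10))  # 4999
```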

by user59008 at September 28, 2016 04:14 AM


Can a neural network have integer inputs?

I built a neural network whose input is a mixture of integers and booleans, but it did not converge. I have seen many examples on the internet, and every one of them has input in boolean form. So is it possible to build a neural network with a mixture of inputs, or with integer inputs?
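Integer inputs are numerically fine for a neural network; a frequent practical issue is that features on very different scales (say, a 0/1 flag next to an integer in the thousands) make training slow or unstable. One common remedy, offered here as a possibility and not as a diagnosis of this particular network, is to rescale each integer feature to [0, 1] before training. The feature values below are hypothetical.

```python
def min_max_scale(column):
    """Rescale a list of numbers linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]  # constant feature: nothing to scale
    return [(x - lo) / (hi - lo) for x in column]

# Hypothetical integer feature alongside a boolean feature.
ages = [18, 35, 52, 70]
is_member = [0, 1, 1, 0]

scaled_ages = min_max_scale(ages)
inputs = list(zip(scaled_ages, is_member))
print(inputs)
```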

by ishpreet at September 28, 2016 04:08 AM

How to plot ROC curve and precision-recall curve from BinaryClassificationMetrics

I was trying to plot the ROC curve and the precision-recall curve in a graph. The points are generated by Spark MLlib's BinaryClassificationMetrics. By following the following Spark

[(1.0,1.0), (0.0,0.4444444444444444)] - Precision
[(1.0,1.0), (0.0,1.0)] - Recall
[(1.0,1.0), (0.0,0.6153846153846153)] - F1Measure
[(0.0,1.0), (1.0,1.0), (1.0,0.4444444444444444)] - Precision-Recall curve
[(0.0,0.0), (0.0,1.0), (1.0,1.0), (1.0,1.0)] - ROC curve

by Desanth pv at September 28, 2016 04:03 AM




Functional Design: What are advantages and disadvantages compared to Object Oriented design?

Are there systems that can be better designed using either of these two approaches?

What are the drawbacks of functional software design?

Objective comparisons will be appreciated.

by mahdix at September 28, 2016 03:22 AM


LSH for all nearest neighbour

Suppose we have a set $S$ of $n$ points in $R^d$, and $0<\epsilon<1$. Then, given a query point $q$ and a threshold $r$, the LSH algorithm due to Indyk-Motwani finds an approximate nearest neighbour (ANN) of $q$ in sublinear time w.h.p., i.e. it returns a point $x\in S$ such that $||q-x||_2\leq(1+\epsilon)r$ if $S$ has a point $x^*$ such that $||q-x^*||_2\leq r$.

My question is whether there is a non-trivial way to generalize the above LSH scheme to find all near neighbours of $q$. (One possible trivial approach is to look at all the buckets into which $q$ gets hashed and check whether those points are within the threshold distance.)
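The trivial approach in the parenthesis can be sketched as follows. This is an illustration under assumptions, not the Indyk-Motwani construction itself: it uses Gaussian random projections with bucket width `bucket_width` as the hash family, and the class name `EuclideanLSH` and all parameter defaults are made up for the example. It needs Python 3.8+ for `math.dist`, and points must be tuples so they can be stored in sets.

```python
import math
import random
from collections import defaultdict


class EuclideanLSH:
    def __init__(self, dim, num_tables=10, hashes_per_table=4, bucket_width=4.0, seed=0):
        rng = random.Random(seed)
        self.w = bucket_width
        self.tables = []
        for _ in range(num_tables):
            # Each hash is h(x) = floor((a . x + b) / w) with Gaussian a and uniform b.
            funcs = [([rng.gauss(0.0, 1.0) for _ in range(dim)],
                      rng.uniform(0.0, bucket_width))
                     for _ in range(hashes_per_table)]
            self.tables.append((funcs, defaultdict(list)))

    def _key(self, funcs, x):
        return tuple(
            math.floor((sum(a_i * x_i for a_i, x_i in zip(a, x)) + b) / self.w)
            for a, b in funcs
        )

    def insert(self, x):
        for funcs, buckets in self.tables:
            buckets[self._key(funcs, x)].append(x)

    def near_neighbours(self, q, r):
        """Collect every point sharing a bucket with q, then keep those within distance r."""
        candidates = set()
        for funcs, buckets in self.tables:
            candidates.update(buckets.get(self._key(funcs, q), ()))
        return [x for x in candidates if math.dist(q, x) <= r]


rng = random.Random(1)
points = [tuple(rng.uniform(0, 100) for _ in range(3)) for _ in range(500)]
index = EuclideanLSH(dim=3)
for p in points:
    index.insert(p)

q = points[0]
found = index.near_neighbours(q, r=5.0)
```

Every returned point is a true near neighbour, but some near neighbours can be missed; how many depends on the number of tables and hashes per table.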

by Ram at September 28, 2016 03:09 AM



Use the pumping lemma to prove that {www} is not context-free

Use the pumping lemma to prove that the following language is not context-free.

$\qquad L = \{ w w w \mid w \in \{a,b\}^*\}$

I am studying for an exam and really trying to understand this question. For some reason the third w is throwing me off.

I first tried using the string $a^p b^p a^p b^p a^p b^p$ but didn't get very far.

The other string I tried to work through it with was $a^p b^p b^p$

Having a hard time figuring out how exactly to split it up.

Any guidance and explanation would be greatly appreciated.
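As a sanity check on the first candidate string, here is a brute-force sketch for small pumping lengths. It enumerates every split $s = uvxyz$ with $|vxy| \le p$ and $|vy| \ge 1$ for $s = a^p b^p a^p b^p a^p b^p$ and tests whether pumping down (dropping $v$ and $y$) leaves the language. The function names are made up for this example, and a finite check like this is only evidence for choosing the string, not a proof.

```python
def in_language(s):
    """True if s = www for some w over {a, b}."""
    if len(s) % 3:
        return False
    n = len(s) // 3
    return s[:n] == s[n:2 * n] == s[2 * n:]


def decompositions(s, p):
    """Yield every split s = u v x y z with |vxy| <= p and |vy| >= 1."""
    n = len(s)
    for i in range(n + 1):                            # u = s[:i]
        end = min(i + p, n)
        for j in range(i, end + 1):                   # v = s[i:j]
            for k in range(j, end + 1):               # x = s[j:k]
                for l in range(k, end + 1):           # y = s[k:l]
                    if (j - i) + (l - k) >= 1:
                        yield s[:i], s[i:j], s[j:k], s[k:l], s[l:]


def pumping_down_always_escapes(p):
    """Check that u x z is outside the language for every allowed split."""
    s = ("a" * p + "b" * p) * 3
    assert in_language(s)
    return all(not in_language(u + x + z)
               for u, v, x, y, z in decompositions(s, p))


for p in (1, 2, 3, 4):
    print(p, pumping_down_always_escapes(p))
```

The intuition it supports: a window of length at most $p$ touches at most two adjacent blocks, so removing part of it unbalances the three copies of $w$.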

by Carazz at September 28, 2016 01:39 AM

arXiv Networking and Internet Architecture

Survey of Inter-satellite Communication for Small Satellite Systems: Physical Layer to Network Layer View. (arXiv:1609.08583v2 [cs.NI] UPDATED)

Small satellite systems enable a whole new class of missions for navigation, communications, remote sensing and scientific research for both civilian and military purposes. As individual spacecraft are limited by size, mass and power constraints, mass-produced small satellites in large constellations or clusters could be useful in many science missions such as gravity mapping, tracking of forest fires, finding water resources, etc. Constellations of satellites provide improved spatial and temporal resolution of the target. Small satellite constellations contribute innovative applications by replacing a single asset with several very capable spacecraft, which opens the door to new applications. With increasing levels of autonomy, there will be a need for remote communication networks to enable communication between spacecraft. These space-based networks will need to configure and maintain dynamic routes, manage intermediate nodes, and reconfigure themselves to achieve mission objectives. Hence, inter-satellite communication is a key aspect when satellites fly in formation. In this paper, we present the various research efforts being conducted in the small satellite community for implementing inter-satellite communications based on the Open System Interconnection (OSI) model. This paper also reviews the various design parameters applicable to the first three layers of the OSI model, i.e., physical, data link and network layer. Based on the survey, we also present a comprehensive list of design parameters useful for achieving inter-satellite communications for multiple small satellite missions. Specific topics include proposed solutions for some of the challenges faced by small satellite systems, enabling operations using a network of small satellites, and some examples of small satellite missions involving formation flying aspects.

by Radhika Radhakrishnan, William Edmonson, Fatemeh Afghah, R. Rodriguez-Osorio, Frank Pinto, Scott Burleigh at September 28, 2016 01:30 AM

Asynchronous progress design for a MPI-based PGAS one-sided communication system. (arXiv:1609.08574v1 [cs.DC])

Remote-memory-access models, also known as one-sided communication models, are becoming an interesting alternative to traditional two-sided communication models in the field of High Performance Computing. In this paper we extend previous work on an MPI-based, locality-aware remote-memory-access model with an asynchronous progress engine for non-blocking communication operations. Most previous related work suggests driving communication progression through an additional thread within the application process. In contrast, our scheme uses an arbitrary number of dedicated processes to drive asynchronous progression. Further, we describe a prototypical library implementation of our concepts, namely DART, which is used to quantitatively evaluate our design against an MPI-3 baseline reference. The evaluation consists of a micro-benchmark to measure overlap of communication and computation and a scientific application kernel to assess total performance impact on realistic use cases. Our benchmarks show that our asynchronous progression scheme can overlap computation and communication efficiently and lead to substantially lower communication cost in real applications.

by Huan Zhou, Jose Gracia at September 28, 2016 01:30 AM

An Empirical Comparison of Formalisms for Modelling and Analysis of Dynamic Reconfiguration of Dependable Systems. (arXiv:1609.08531v1 [cs.SE])

This paper uses a case study to evaluate empirically three formalisms of different kinds for their suitability for the modelling and analysis of dynamic reconfiguration of dependable systems. The requirements on an ideal formalism for dynamic software reconfiguration are defined. The reconfiguration of an office workflow for order processing is described, and the requirements on the reconfiguration of the workflow are defined. The workflow is modelled using the Vienna Development Method ($\mathrm{VDM}$), conditional partial order graphs ($\mathrm{CPOGs}$), and the basic Calculus of Communicating Systems for dynamic process reconfiguration (basic $\mathrm{CCS^{dp}}$), and verification of the reconfiguration requirements is attempted using the models. The formalisms are evaluated according to their ability to model the reconfiguration of the workflow, to verify the requirements on the workflow's reconfiguration, and to meet the requirements on an ideal formalism.

by Anirban Bhattacharyya, Andrey Mokhov, Ken Pierce at September 28, 2016 01:30 AM

Cognitive Random Access for Internet-of-Things Networks. (arXiv:1609.08497v1 [cs.NI])

This paper focuses on cognitive radio (CR) internet-of-things (IoT) networks where spectrum sensors are deployed for IoT CR devices, which do not have enough hardware capability to identify an unoccupied spectrum by themselves. In this sensor-enabled IoT CR network, the CR devices and the sensors are separated. As a result, spectrum occupancies at the locations of CR devices and sensors could be different. To handle this difference, we investigate a conditional interference distribution (CID) at the CR device for a given measured interference at the sensor. We can observe a spatial correlation of the aggregate interference distribution through the CID. Reflecting the CID, we devise a cognitive random access scheme which adaptively adjusts transmission probability with respect to the interference measurement of the sensor. Our scheme improves area spectral efficiency (ASE) compared to conventional ALOHA and an adaptive transmission scheme which attempts to send data when the sensor measurement is lower than an interference threshold.

by Hyesung Kim, Seung-Woo Ko, Seong-Lyun Kim at September 28, 2016 01:30 AM

Joint Cell Muting and User Scheduling in Multi-Cell Networks with Temporal Fairness. (arXiv:1609.08476v1 [cs.NI])

A semi-centralized joint cell muting and user scheduling scheme for interference coordination in a multicell network is proposed under two different temporal fairness criteria. The main principle behind the proposed scheme is that a central entity selects a cell muting pattern out of a pattern set at a decision instant, and subsequently the un-muted base stations opportunistically schedule the users in the associated cells, both decisions made on a temporal-fair basis. Although some pattern sets are easily obtainable from static frequency reuse systems, we propose a more general pattern set construction algorithm in this paper. Under the first fairness criterion, all cells are assigned the same temporal share, with the ratio between the temporal share of a cell center section and that of the cell edge section set to a fixed desired value for all cells. The second fairness criterion is based on max-min temporal fairness, for which the temporal share of the network-wide worst-case user is maximized. Numerical results are provided to validate the effectiveness of the proposed scheme for both criteria. The impact of the choice of the cell muting pattern set is also studied through numerical examples for various cellular topologies.

by Shahram Shahsavari, Nail Akar, Babak Hossein Khalaj at September 28, 2016 01:30 AM

DESQ: Frequent Sequence Mining with Subsequence Constraints. (arXiv:1609.08431v1 [cs.DB])

Frequent sequence mining methods often make use of constraints to control which subsequences should be mined. A variety of such subsequence constraints has been studied in the literature, including length, gap, span, regular-expression, and hierarchy constraints. In this paper, we show that many subsequence constraints---including and beyond those considered in the literature---can be unified in a single framework. A unified treatment allows researchers to study jointly many types of subsequence constraints (instead of each one individually) and helps to improve usability of pattern mining systems for practitioners. In more detail, we propose a set of simple and intuitive "pattern expressions" to describe subsequence constraints and explore algorithms for efficiently mining frequent subsequences under such general constraints. Our algorithms translate pattern expressions to compressed finite state transducers, which we use as computational model, and simulate these transducers in a way suitable for frequent sequence mining. Our experimental study on real-world datasets indicates that our algorithms---although more general---are competitive to existing state-of-the-art algorithms.

by Kaustubh Beedkar, Rainer Gemulla at September 28, 2016 01:30 AM

Decision Making Based on Cohort Scores for Speaker Verification. (arXiv:1609.08419v1 [cs.SD])

Decision making is an important component in a speaker verification system. For the conventional GMM-UBM architecture, the decision is usually conducted based on the log likelihood ratio of the test utterance against the GMM of the claimed speaker and the UBM. This single-score decision is simple but tends to be sensitive to the complex variations in speech signals (e.g. text content, channel, speaking style, etc.). In this paper, we propose a decision making approach based on multiple scores derived from a set of cohort GMMs (cohort scores). Importantly, these cohort scores are not simply averaged as in conventional cohort methods; instead, we employ a powerful discriminative model as the decision maker. Experimental results show that the proposed method delivers substantial performance improvement over the baseline system, especially when a deep neural network (DNN) is used as the decision maker, and the DNN input involves some statistical features derived from the cohort scores.

by Lantian Li, Renyu Wang, Gang Wang, Caixia Wang, Thomas Fang Zheng at September 28, 2016 01:30 AM

Reducing the role of random numbers in matching algorithms for school admission. (arXiv:1609.08394v1 [cs.GT])

New methods for solving the college admissions problem with indifference are presented and characterised with a Monte Carlo simulation in a variety of simple scenarios. Based on a qualifier defined as the average rank, it is found that these methods are more efficient than the Boston and Deferred Acceptance algorithms. The improvement in efficiency is directly related to the reduced role of random tie-breakers. The strategy-proofness of the new methods is assessed as well.

by Wouter Hulsbergen at September 28, 2016 01:30 AM

An Evaluation of Coarse-Grained Locking for Multicore Microkernels. (arXiv:1609.08372v2 [cs.OS] UPDATED)

The trade-off between coarse- and fine-grained locking is a well understood issue in operating systems. Coarse-grained locking provides lower overhead under low contention, fine-grained locking provides higher scalability under contention, though at the expense of implementation complexity and reduced best-case performance.

We revisit this trade-off in the context of microkernels and tightly-coupled cores with shared caches and low inter-core migration latencies. We evaluate performance on two architectures: x86 and ARM MPCore, in the former case also utilising transactional memory (Intel TSX). Our thesis is that on such hardware, a well-designed microkernel, with short system calls, can take advantage of coarse-grained locking on modern hardware, avoid the run-time and complexity cost of multiple locks, enable formal verification, and still achieve scalability comparable to fine-grained locking.

by Kevin Elphinstone, Amirreza Zarrabi, Adrian Danis, Yanyan Shen, Gernot Heiser at September 28, 2016 01:30 AM

Stream Differential Equations: Specification Formats and Solution Methods. (arXiv:1609.08367v1 [cs.LO])

Streams, or infinite sequences, are infinite objects of a very simple type, yet they have a rich theory partly due to their ubiquity in mathematics and computer science. Stream differential equations are a coinductive method for specifying streams and stream operations, and their theory has been developed in many papers over the past two decades. In this paper we present a survey of the many results in this area. Our focus is on the classification of different formats of stream differential equations, their solution methods, and the classes of streams they can define. Moreover, we describe in detail the connection between the so-called syntactic solution method and abstract GSOS.

by Helle Hvid Hansen, Clemens Kupke, Jan Rutten at September 28, 2016 01:30 AM

No fixed point guarantee of Nash equilibrium in quantum games. (arXiv:1609.08360v1 [quant-ph])

Nash equilibrium is not guaranteed in finite quantum games. In this letter, we revisit this fact using John Nash's original approach of countering sets and Kakutani's fixed point theorem. To the best of our knowledge, this mathematically formal approach has not been explored before in the context of quantum games. We use this approach to draw conclusions about Nash equilibrium states in quantum informational processes such as quantum computing and quantum communication protocols.

by Faisal Shah Khan, Travis Humble at September 28, 2016 01:30 AM

Asynchronous Stochastic Gradient Descent with Delay Compensation for Distributed Deep Learning. (arXiv:1609.08326v1 [cs.LG])

With the fast development of deep learning, people have started to train very big neural networks using massive data. Asynchronous Stochastic Gradient Descent (ASGD) is widely used to fulfill this task; it is, however, known to suffer from the problem of delayed gradients. That is, when a local worker adds the gradient it calculates to the global model, the global model may have been updated by other workers and this gradient becomes "delayed". We propose a novel technology to compensate for this delay, so as to make the optimization behavior of ASGD closer to that of sequential SGD. This is done by leveraging Taylor expansion of the gradient function and efficient approximators to the Hessian matrix of the loss function. We call the corresponding new algorithm Delay Compensated ASGD (DC-ASGD). We evaluated the proposed algorithm on CIFAR-10 and ImageNet datasets, and experimental results demonstrate that DC-ASGD can outperform both synchronous SGD and ASGD, and nearly approaches the performance of sequential SGD.
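For orientation, the first-order Taylor expansion that this kind of compensation rests on can be written generically as below. The notation is chosen here and is not taken from the paper: $g$ is the gradient of the loss, $H$ its Hessian, $w_t$ the model a worker read, and $w_{t+\tau}$ the global model at the moment the worker's gradient is applied.

$$ g(w_{t+\tau}) \approx g(w_t) + H(w_t)\,(w_{t+\tau} - w_t) $$

The delayed gradient $g(w_t)$ is thus corrected by the second term, which is only practical if $H$ can be approximated cheaply.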

by Shuxin Zheng, Qi Meng, Taifeng Wang, Wei Chen, Nenghai Yu, Zhi-Ming Ma, Tie-Yan Liu at September 28, 2016 01:30 AM

Structural characterization of Cayley graphs. (arXiv:1609.08272v1 [cs.DM])

We show that the directed labelled Cayley graphs coincide with the rooted deterministic vertex-transitive simple graphs. The Cayley graphs are also the strongly connected deterministic simple graphs of which all vertices have the same cycle language, or just the same elementary cycle language. Under the assumption of the axiom of choice, we characterize the Cayley graphs for all group subsets as the deterministic, co-deterministic, vertex-transitive simple graphs.

by Didier Caucal at September 28, 2016 01:30 AM

Constructing unlabelled lattices. (arXiv:1609.08255v1 [math.CO])

We present an improved orderly algorithm for constructing all unlabelled lattices up to a given size, that is, an algorithm that constructs the minimal element of each isomorphism class relative to some total order.

Our algorithm employs a stabiliser chain approach for cutting branches of the search space that cannot contain a minimal lattice; to make this work, we grow lattices by adding a new layer at a time, as opposed to adding one new element at a time, and we use a total order that is compatible with this modified strategy.

The gain in speed is about two orders of magnitude. As an application, we compute the number of unlabelled lattices on 20 elements.

by Volker Gebhardt, Stephen Tawn at September 28, 2016 01:30 AM

Towards Scalable Network Delay Minimization. (arXiv:1609.08228v1 [cs.DB])

Reduction of end-to-end network delays is an optimization task with applications in multiple domains. Low delays enable improved information flow in social networks, quick spread of ideas in collaboration networks, low travel times for vehicles on road networks and increased rate of packets in the case of communication networks. Delay reduction can be achieved by both improving the propagation capabilities of individual nodes and adding additional edges in the network. One of the main challenges in such design problems is that the effects of local changes are not independent, and as a consequence, there is a combinatorial search-space of possible improvements. Thus, minimizing the cumulative propagation delay requires novel scalable and data-driven approaches.

In this paper, we consider the problem of network delay minimization via node upgrades. Although the problem is NP-hard, we show that probabilistic approximation for a restricted version can be obtained. We design scalable and high-quality techniques for the general setting based on sampling and targeted to different models of delay distribution. Our methods scale almost linearly with the graph size and consistently outperform competitors in quality.

by Sourav Medya, Petko Bogdanov, Ambuj Singh at September 28, 2016 01:30 AM

Robust Time-Series Retrieval Using Probabilistic Adaptive Segmental Alignment. (arXiv:1609.08201v1 [cs.DB])

Traditional pairwise sequence alignment is based on matching individual samples from two sequences, under time monotonicity constraints. However, in many application settings matching subsequences (segments) instead of individual samples may bring in additional robustness to noise or local non-causal perturbations. This paper presents an approach to segmental sequence alignment that jointly segments and aligns two sequences, generalizing the traditional per-sample alignment. To accomplish this task, we introduce a distance metric between segments based on average pairwise distances and then present a modified pair-HMM (PHMM) that incorporates the proposed distance metric to solve the joint segmentation and alignment task. We also propose a relaxation to our model that improves the computational efficiency of the generic segmental PHMM. Our results demonstrate that this new measure of sequence similarity can lead to improved classification performance, while being resilient to noise, on a variety of sequence retrieval problems, from EEG to motion sequence classification.

by Shahriar Shariat, Vladimir Pavlovic at September 28, 2016 01:30 AM

The Effect of DNS on Tor's Anonymity. (arXiv:1609.08187v1 [cs.CR])

Previous attacks that link the sender and receiver of traffic in the Tor network ("correlation attacks") have generally relied on analyzing traffic from TCP connections. The TCP connections of a typical client application, however, are often accompanied by DNS requests and responses. This additional traffic presents more opportunities for correlation attacks. This paper quantifies how DNS traffic can make Tor users more vulnerable to correlation attacks. We investigate how incorporating DNS traffic can make existing correlation attacks more powerful and how DNS lookups can leak information to third parties about anonymous communication. We (i) develop a method to identify the DNS resolvers of Tor exit relays; (ii) develop a new set of correlation attacks (DefecTor attacks) that incorporate DNS traffic to improve precision; (iii) analyze the Internet-scale effects of these new attacks on Tor users; and (iv) develop improved methods to evaluate correlation attacks. First, we find that there exist adversaries who can mount DefecTor attacks: for example, Google's DNS resolver observes almost 40% of all DNS requests exiting the Tor network. We also find that DNS requests often traverse ASes that the corresponding TCP connections do not transit, enabling additional ASes to gain information about Tor users' traffic. We then show that an adversary who can mount a DefecTor attack can often determine the website that a Tor user is visiting with perfect precision, particularly for less popular websites where the set of DNS names associated with that website may be unique to the site. We also use the Tor Path Simulator (TorPS) in combination with traceroute data from vantage points co-located with Tor exit relays to estimate the power of AS-level adversaries who might mount DefecTor attacks in practice.

by Benjamin Greschbach, Tobias Pulls, Laura M. Roberts, Philipp Winter, Nick Feamster at September 28, 2016 01:30 AM

Implementing RBAC model in An Operating System Kernel. (arXiv:1609.08154v1 [cs.OS])

In this paper, the implementation of an operating system oriented RBAC model is discussed. Firstly, on the basis of RBAC96 model, a new RBAC model named OSR is presented. Secondly, the OSR model is enforced in RFSOS kernel by the way of integrating GFAC method and Capability mechanism together. All parts of the OSR implementation are described in detail.

by Zhiyong Shan, Yu-fang Sun at September 28, 2016 01:30 AM


How to run vowpal-wabbit utl scripts?

I have installed vowpal-wabbit on my mac using brew install vowpal-wabbit.

The vw command works fine. However, I want to use some of the scripts in the utl folder of the library.

Specifically I want to run

I tried copying this script to my machine and running it but I get the following error:

reading dataset... ERROR: vw-doc2lda not found in the path

Turns out vw-doc2lda is another script in the utl folder which script cannot run. I tried copying vw-doc2lda also to my local machine and adding its path to $PATH. It still can't find the script.

How do I run the scripts in the utl folder?

by kavini at September 28, 2016 01:15 AM


Compute equality comparison without comparison operators

Is it possible to compute the result of an integer equality comparison using only arithmetic or bitwise operations? Negative values use the two's-complement representation.

I am looking for a generic algorithm that yields one of two values, for equality and inequality respectively, without using comparison operations.
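One possible approach, sketched under the assumption of a fixed 32-bit width: XOR the operands, then use the fact that for any non-zero $d$ either $d$ or $-d$ has its top bit set. Python integers are unbounded, so the sketch masks to 32 bits to emulate fixed-width two's complement; `BITS`, `MASK` and `equal_flag` are names made up for this example.

```python
BITS = 32
MASK = (1 << BITS) - 1


def equal_flag(x, y):
    """Return 1 if x and y have the same 32-bit pattern, else 0.

    Uses only XOR, AND, OR, negation and shifts; no comparison operators.
    """
    d = (x ^ y) & MASK                                  # zero exactly when the patterns match
    nonzero = ((d | (-d & MASK)) >> (BITS - 1)) & 1     # top bit of d | -d is set iff d != 0
    return nonzero ^ 1


print(equal_flag(5, 5), equal_flag(5, 6), equal_flag(-7, -7), equal_flag(-7, 7))
```

Because values are reduced to 32 bits first, inputs that differ only above bit 31 are treated as equal.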

by Max at September 28, 2016 01:09 AM

Planet Theory

Scenic Routes Now: Efficiently Solving the Time-Dependent Arc Orienteering Problem

Authors: Gregor Jossé, Ying Lu, Tobias Emrich, Matthias Renz, Cyrus Shahabi, Ugur Demiryurek, Matthias Schubert
Download: PDF
Abstract: This paper extends the Arc Orienteering Problem (AOP) to large road networks with time-dependent travel times and time-dependent value gain, termed Twofold Time-Dependent AOP or 2TD-AOP for short. In its original definition, the NP-hard Orienteering Problem (OP) asks to find a path from a source to a destination maximizing the accumulated value while not exceeding a cost budget. Variations of the OP and AOP have many practical applications such as mobile crowdsourcing tasks (e.g., repairing and maintenance or dispatching field workers), diverse logistics problems (e.g., crowd control or controlling wildfires) as well as several tourist guidance problems (e.g., generating trip recommendations or navigating through theme parks). In the proposed 2TD-AOP, travel times and value functions are assumed to be time-dependent. The dynamic values model, for instance, varying rewards in crowdsourcing tasks or varying urgency levels in damage control tasks. We discuss this novel problem, prove the benefit of time-dependence empirically and present an efficient approximative solution, optimized for fast response systems. Our approach is the first time-dependent variant of the AOP to be evaluated on a large scale, fine-grained, real-world road network. We show that optimal solutions are infeasible and solutions to the static problem are often invalid. We propose an approximate dynamic programming solution which produces valid paths and is orders of magnitude faster than any optimal solution.

September 28, 2016 01:03 AM

Median-of-k Jumplists

Authors: Markus E. Nebel, Elisabeth Neumann, Sebastian Wild
Download: PDF
Abstract: We extend randomized jumplists introduced by Brönnimann et al. (STACS 2003) to choose jump-pointer targets as median of a small sample, and present randomized algorithms with expected $O(\log n)$ time complexity that maintain the probability distribution of jump pointers upon insertions and deletions. We analyze the expected costs to search, insert and delete a random element. The resulting data structure, randomized median-of-$k$ jumplists, is competitive to other dictionary implementations and supports particularly efficient iteration in sorted order and selection by rank. We further show that omitting jump pointers in small sublists hardly affects search costs, but significantly reduces the memory consumption. If space is tight and rank-select is needed, median-of-$k$ jumplists are a promising option in practice.

September 28, 2016 01:03 AM

Multi-label Methods for Prediction with Sequential Data. (arXiv:1609.08349v2 [cs.LG] UPDATED)

The number of methods available for classification of multi-label data has increased rapidly over recent years, yet relatively few links have been made with the related task of classification of sequential data. If label indices are considered as time indices, the problems can often be seen as equivalent. In this paper we detect and elaborate on connections between multi-label methods and Markovian models, and study the suitability of multi-label methods for prediction in sequential data. From this study we draw upon the most suitable techniques from the area and develop two novel competitive approaches which can be applied to either kind of data. We carry out an empirical evaluation investigating performance on real-world sequential-prediction tasks: electricity demand, and route prediction. As well as showing that several popular multi-label algorithms are in fact easily applicable to sequencing tasks, our novel approaches, which benefit from a unified view of these areas, prove very competitive against established methods.

by Jesse Read, Luca Martino, Jaakko Hollmén at September 28, 2016 01:02 AM

Simple Algorithms for Scheduling Monotonic Moldable Tasks

Authors: Patrick Loiseau and Xiaohu Wu
Download: PDF
Abstract: We study simple yet efficient algorithms for scheduling $n$ independent monotonic moldable tasks on $m$ identical processors; the objective is to: (1) minimize the makespan, or (2) maximize the sum of values of tasks completed by a deadline. The workload of a monotonic task is non-decreasing with the number of assigned processors.

In this paper, we propose a scheduling algorithm that achieves a processor utilization of r when the number of processors assigned to a task j is the minimal number of processors needed to complete j by a time d. Here, r equals (1-k/m)/2 in the general setting where k is the maximum number of processors allocated to the tasks (in large computing clusters, m>>k and k/m approaches 0). More importantly, in many real applications, when a parallel task is executed on a small set of f processors, the speedup is linearly proportional to f. This proves to be powerful in designing algorithms more efficient than the existing ones, which is illustrated in a typical case where f=5; we propose an algorithm that can achieve a utilization of r=3(1-(k+3)/m)/4, and the extension of this algorithm to the case with an arbitrary f is also discussed. Based on the above schedule, we propose an r(1+\epsilon)-approximation algorithm with a complexity of O(nlog(n/\epsilon)) for the first scheduling objective. We also propose a generic greedy algorithm for the second scheduling objective, and, by analyzing it, give an r-approximation algorithm with a complexity of O(n). So far, for the first scheduling objective, the algorithm proposed in the typical setting is simpler and has a better performance than most of the existing algorithms given m>>k; the second objective is considered for the first time.

September 28, 2016 01:02 AM


Simple perceptron doubts on matlab

I am trying to build a simple perceptron that is trained 3 times. I am also writing a function that returns a .csv file with 40 weights. The problem is that I don't know how to implement this for 3 trainings. My code follows:


PointSize = 1;         disper = 5;
xIni = 10;              xEnd = 20;
yIni = 10;              yEnd = 20;
limitX = (xEnd+xIni)/2
limitY = (yEnd+yIni)/2
x = disper*rand(2*PointSize, 1) + xEnd;
x = [x;disper*rand(2*PointSize, 1) + xIni];
y = disper*rand(PointSize, 1) + yEnd;
y = [y;disper*rand(2*PointSize, 1) + yIni];
y = [y;disper*rand(PointSize, 1) + yEnd];
val = ones(4*PointSize, 2);
for con = 1:length(x)
    if x(con)<limitX
        val(con, 1) = -1;
    end
    if y(con)<limitY
        val(con, 2) = -1;
    end
end
matDatos = [x, y, val]
csvwrite('dataExam.csv', matDatos)

Perceptron training:

%% start
clc,                clear all,          close all
ini = -4;           fin = -ini;            size = 15;
gamma = 0.04;
x1 = linspace(ini, fin, 4);
data = load('dataExam.csv');
%% Calc
lengthA = length(data.x);
W = rand(1,3) - 0.5;
matridata = -ones(lengthA, 1);
matridata = [matridata, data.x, data.y, data.val];
figure('name', 'Percetron','NumberTitle','off');
scatter(data.x, data.y, size, 'k' );
 hold on
 grid on
for numEpo = 1:15
    for cont = 1:lengthA
        filaEntr = matridata(cont, 1: end-1);
        valProDot = dot(W, filaEntr);
        yCalc = Sign(valProDot);
        yResult = matridata(cont,end);
        error = yResult - yCalc;
        changeW = error * gamma * yCalc;
        W = W + changeW;
    end
    x2 = W(1)/W(3) - (W(2)/W(3))*x1;
    plot(x1 , x2, 'ro-');
end

My teacher said that I can implement a main function that calls "Filling", "Training", and "Test", plus a final "result" function that determines which training has the least error. I appreciate your help...

by Simon Restrepo at September 28, 2016 01:00 AM

Immutable int in C++

How can I create a simple immutable int in C++? I have a function:

int NOD(const int &a, const int &b) {
    if (b == 0) {return a;}
    const int remainder = a % b;
    return NOD(b, remainder);
}

which is supposed to take two ints and return their greatest common divisor. But I didn't find a way to make a and b immutable, so they can still be altered within the function. For example, if we add this line to the function:

*(int *)&a = 555;

it breaks everything. How can I make variables immutable in C++?

by USER at September 28, 2016 01:00 AM


AWS Week in Review – September 19, 2016

Eighteen (18) external and internal contributors worked together to create this edition of the AWS Week in Review. If you would like to join the party (with the possibility of a free lunch at re:Invent), please visit the AWS Week in Review on GitHub.



New & Notable Open Source

  • ecs-refarch-cloudformation is a reference architecture for deploying Microservices with Amazon ECS, AWS CloudFormation (YAML), and an Application Load Balancer.
  • rclone syncs files and directories to and from S3 and many other cloud storage providers.
  • Syncany is an open source cloud storage and filesharing application.
  • chalice-transmogrify is an AWS Lambda Python Microservice that transforms arbitrary XML/RSS to JSON.
  • amp-validator is a serverless AMP HTML Validator Microservice for AWS Lambda.
  • ecs-pilot is a simple tool for managing AWS ECS.
  • vman is an object version manager for AWS S3 buckets.
  • aws-codedeploy-linux is a demo of how to use CodeDeploy and CodePipeline with AWS.
  • autospotting is a tool for automatically replacing EC2 instances in AWS AutoScaling groups with compatible instances requested on the EC2 Spot Market.
  • shep is a framework for building APIs using AWS API Gateway and Lambda.

New Customer Success Stories

  • NetSeer significantly reduces costs, improves the reliability of its real-time ad-bidding cluster, and delivers 100-millisecond response times using AWS. The company offers online solutions that help advertisers and publishers match search queries and web content to relevant ads. NetSeer runs its bidding cluster on AWS, taking advantage of Amazon EC2 Spot Fleet Instances.
  • New York Public Library revamped its fractured IT environment—which had older technology and legacy computing—to a modernized platform on AWS. The New York Public Library has been a provider of free books, information, ideas, and education for more than 17 million patrons a year. Using Amazon EC2, Elastic Load Balancer, Amazon RDS and Auto Scaling, NYPL is able to build scalable, repeatable systems quickly at a fraction of the cost.
  • MakerBot uses AWS to understand what its customers need, and to go to market faster with new and innovative products. MakerBot is a desktop 3-D printing company with more than 100 thousand customers using its 3-D printers. MakerBot uses Matillion ETL for Amazon Redshift to process data from a variety of sources in a fast and cost-effective way.
  • University of Maryland, College Park uses the AWS cloud to create a stable, secure and modern technical environment for its students and staff while ensuring compliance. The University of Maryland is a public research university located in the city of College Park, Maryland, and is the flagship institution of the University System of Maryland. The university uses AWS to migrate all of their datacenters to the cloud, as well as Amazon WorkSpaces to give students access to software anytime, anywhere and with any device.

Stay tuned for next week! In the meantime, follow me on Twitter and subscribe to the RSS feed.

by Jeff Barr at September 28, 2016 12:45 AM


100% accuracy in classification with LIBSVM- What could be wrong?

I am building a model for classifying malignant breast tumors using LIBSVM. Here is the algorithm I am following:

  1. Use Backward-elimination for feature selection.
  2. Calculate C and gamma for each set of features using grid search.
  3. Derive the most optimal C and gamma using 10-fold cross validation.
  4. Using the above steps, find the best possible subset of features and the maximum accuracy.

The problem is that I am getting 100% accuracy on an 80:20 split of the dataset using LIBSVM. I have not excluded any features, and I am NOT training and testing on the same data. Any hints as to where I could be going wrong? Here is some other relevant information:

cost = [2^-10, 2^-8, 2^-6, 2^-4, 2^-2, 0.5, 1,
        2, 2^2, 2^3, 2^4, 2^5, 2^6, 2^7, 2^8, 2^9, 2^10];
g = [2^-10, 2^-8, 2^-6, 2^-4, 2^-2, 2^-1, 1,
     2, 2^2, 2^3, 2^4, 2^5, 2^6, 2^7, 2^8, 2^9, 2^10];
most optimal C = 1;
most optimal gamma = 9.7656e-04;
Accuracy on 50:50 test:train dataset: 98.5337%
Accuracy on 70:30 test:train dataset: 99.5122%
Dataset used: University of Wisconsin breast cancer dataset (682 entries).

by Prashant Pandey at September 28, 2016 12:34 AM



Need suggestion to get multiclass classifier using some machine learning library for text classification

My Text Data is similar to below:

Class | Text
1 | A big long text with some 1 specific keywords
2 | Some text with features of 2.
3 | 3 specific data in this text with some overlapping features

I have about 20K records like above with about 40 classes. Each text is about 3 or 4 lines.

I have converted them to numeric matrices by collecting all the unique words and applying PorterStemmer.

After applying above: my data looks like
Class | FeatureSet
1 | 1 0 0 0 0 1 0 0 .....
2 | 0 0 0 0 1 0 0 0 1 0 ..

My FeatureSet size is the number of unique words, which is about 15K in my case. So every feature set has length 15K and consists of a bunch of 0s and 1s (the 1s are at the positions where a word matches).

Now, with this data I am trying to find a good Java library that can help me build a multiclass classifier. I tried looking into libsvm and other libraries, but I am not sure how to transform my data so that I can write a Java-based classifier using libsvm or similar.
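For the data transformation step, one possible route is the sparse text format that LIBSVM's tools read, where each line is a label followed by `index:value` pairs for the non-zero features, with indices starting at 1. The sketch below is in Python for brevity rather than Java, and the helper name `to_libsvm_line` is hypothetical; it converts the two example rows shown above.

```python
def to_libsvm_line(label, features):
    """Format one dense 0/1 feature vector as a sparse LIBSVM-style line."""
    # Indices are 1-based and only non-zero features are written out.
    pairs = [f"{i}:{v}" for i, v in enumerate(features, start=1) if v != 0]
    return " ".join([str(label)] + pairs)

# The two example rows from the question (class, feature vector).
rows = [
    (1, [1, 0, 0, 0, 0, 1, 0, 0]),
    (2, [0, 0, 0, 0, 1, 0, 0, 0, 1, 0]),
]
lines = [to_libsvm_line(label, feats) for label, feats in rows]
for line in lines:
    print(line)
```

With 15K features and only a few words per text, the sparse form keeps each line short, because the zeros are never written.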

Some programs that I considered:

I would appreciate any suggestions if you have had a similar experience.

by itwasnoteasy at September 28, 2016 12:14 AM


Why are the conditions for optimality different for A* tree and graph search?

I am unclear as to why the conditions for optimality for A* search are different for graph search and tree search. When discussing conditions for optimality for A* search in Russell and Norvig's Artificial Intelligence: A Modern Approach they say:

The first condition we require for optimality is that $h(n)$ be an admissible heuristic.


A second, slightly stronger condition called consistency (or sometimes monotonicity) is required only for applications of A* to graph search.

Why is consistency only required for A* graph search and not A* tree search? Why are the conditions for optimality different for the two types of search?
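A small experiment can make the difference concrete. The sketch below is an illustration under assumptions, not the textbook's code: the five-node graph and both heuristics are made up, and "graph search" here means the variant that never re-expands a node once it is in the closed set. The heuristic `h_inconsistent` is admissible (it never overestimates the true cost to G) but not consistent, because h(A) = 4 exceeds the step cost 1 plus h(C) = 1.

```python
import heapq

def a_star(graph, h, start, goal, graph_search):
    """A* over a dict graph; with graph_search=True, closed nodes are never re-expanded."""
    frontier = [(h[start], 0, start, [start])]  # entries are (f, g, node, path)
    closed = set()
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path
        if graph_search:
            if node in closed:
                continue
            closed.add(node)
        for nxt, cost in graph[node]:
            if graph_search and nxt in closed:
                continue  # a cheaper path to a closed node is silently dropped
            heapq.heappush(frontier, (g + cost + h[nxt], g + cost, nxt, path + [nxt]))
    return None

# Hypothetical graph: the optimal path is S-A-C-G with cost 5; S-B-C-G costs 6.
graph = {
    "S": [("A", 1), ("B", 1)],
    "A": [("C", 1)],
    "B": [("C", 2)],
    "C": [("G", 3)],
    "G": [],
}
h_inconsistent = {"S": 0, "A": 4, "B": 0, "C": 1, "G": 0}  # admissible, not consistent
h_consistent = {"S": 0, "A": 0, "B": 0, "C": 0, "G": 0}    # trivially consistent

tree_result = a_star(graph, h_inconsistent, "S", "G", graph_search=False)
graph_result = a_star(graph, h_inconsistent, "S", "G", graph_search=True)
consistent_result = a_star(graph, h_consistent, "S", "G", graph_search=True)
```

With the inconsistent heuristic, graph search closes C after reaching it through B at cost 3, so the cheaper route through A is discarded and the returned path costs 6. Tree search keeps both copies of C and returns the cost-5 path. With a consistent heuristic, the first time a node is expanded its path cost is already optimal, so discarding later paths to it loses nothing.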

by Conor Igoe at September 28, 2016 12:05 AM

HN Daily

Planet Theory

How to Elect a Low-energy Leader

Authors: Yi-Jun Chang, Tsvi Kopelowitz, Seth Pettie, Ruosong Wang, Wei Zhan
Download: PDF
Abstract: In many networks of wireless devices the scarcest resource is energy, and the lion's share of energy is often spent on sending and receiving packets. In this paper we present a comprehensive study of the energy complexity of fundamental problems in wireless networks with four different levels of collision detection: Strong-CD (in which transmitters and listeners detect collisions), Sender-CD (in which transmitters detect collisions, indirectly), Receiver-CD (in which listeners detect collisions), and No-CD (in which no one detects collisions).

We show that the randomized energy complexity of Approximate Counting and Leader Election is $\Omega(\log^* n)$ in Sender-CD and No-CD but $\Omega(\log(\log^* n))$ in Strong-CD and Receiver-CD, and also provide matching upper bounds. This establishes an exponential separation between the Sender-CD and Receiver-CD models, and also confirms that the recent $O(\log(\log^* n))$ Contention Resolution protocol of Bender et al. (STOC 2016) is optimal in Strong-CD.

In the deterministic setting, all $n$ devices have unique IDs in the range $[N]$. We establish another exponential separation between the deterministic Sender-CD and Receiver-CD models in the opposite direction. We show that Leader Election can be solved with $O(\log \log N)$ energy in the deterministic Sender-CD model, and give a matching $\Omega(\log \log N)$ energy lower bound in the Strong-CD model. However, in Receiver-CD and No-CD the energy complexity of these problems jumps to $\Theta(\log N)$.

For the special case where $n = \Theta(N)$, we prove that Leader Election can be solved with only $O(\alpha(N))$ energy in No-CD. To our best knowledge, this is the first time the inverse-Ackermann function appears in the field of distributed computing.

September 28, 2016 12:00 AM

Tight Hardness Results for Distance and Centrality Problems in Constant Degree Graphs

Authors: Jacob Evald, Søren Dahlgaard
Download: PDF
Abstract: Finding important nodes in a graph and measuring their importance is a fundamental problem in the analysis of social networks, transportation networks, biological systems, etc. Popular metrics of this kind include graph centrality, betweenness centrality (BC), and reach centrality (RC). These measures are also closely related to classic notions like diameter and radius. Roditty and Vassilevska Williams [STOC'13] showed that no algorithm can compute a (3/2-\delta)-approximation of the diameter in sparse and unweighted graphs faster than n^{2-o(1)} time unless the widely believed strong exponential time hypothesis (SETH) is false. Abboud et al. [SODA'15] and [SODA'16] further analyzed these problems under the recent line of research on hardness in P. They showed that in sparse and unweighted graphs (weighted for BC) none of these problems can be solved faster than n^{2-o(1)} unless some popular conjecture is false. Furthermore they ruled out a (2-\delta)-approximation for RC, a (3/2-\delta)-approximation for Radius and a (5/3-\delta)-approximation for computing all eccentricities of a graph for any \delta > 0. We extend these results to the case of unweighted graphs with constant maximum degree. Through new graph constructions we are able to obtain the same approximation and time bounds as for sparse graphs even in unweighted bounded-degree graphs. We show that no (3/2-\delta)-approximation of Radius or Diameter, (2-\delta)-approximation of RC, (5/3-\delta)-approximation of all eccentricities or exact algorithm for BC exists in time n^{2-o(1)} for such graphs and any \delta > 0. This strengthens the result for BC of Abboud et al. [SODA'16] by showing a hardness result for unweighted graphs, and follows in the footsteps of Abboud et al. [SODA'16] and Abboud and Dahlgaard [FOCS'16] in showing conditional lower bounds for restricted but realistic graph classes.

September 28, 2016 12:00 AM

On the Group and Color Isomorphism Problems

Authors: François Le Gall, David J. Rosenbaum
Download: PDF
Abstract: In this paper, we prove results on the relationship between the complexity of the group and color isomorphism problems. The difficulty of color isomorphism problems is known to be closely linked to the composition factors of the permutation group involved. Previous works are primarily concerned with applying color isomorphism to bounded degree graph isomorphism, and have therefore focused on the alternating composition factors, since those are the bottleneck in the case of graph isomorphism.

We consider the color isomorphism problem with composition factors restricted to those other than the alternating group, show that group isomorphism reduces in n^(O(log log n)) time to this problem, and, conversely, that a special case of this color isomorphism problem reduces to a slight generalization of group isomorphism. We then sharpen our results by identifying the projective special linear group as the main obstacle to faster algorithms for group isomorphism and prove that the aforementioned reduction from group isomorphism to color isomorphism in fact produces only cyclic and projective special linear factors. Our results demonstrate that, just as the alternating group was a barrier to faster algorithms for graph isomorphism for three decades, the projective special linear group is an obstacle to faster algorithms for group isomorphism.

September 28, 2016 12:00 AM

September 27, 2016



List of Intraday stock prices API

I am looking for an API to request intraday data for the London stock exchange. I have seen products like eSignal but this seems to include a lot more than the simple data as XML or JSON and is fairly expensive. The idea is to request data and analyse in an application that I have written so all I need is a real time source. Is there anything available like this?

by user2145312 at September 27, 2016 11:00 PM


Is it possible to update list of objects except the MIN value in this list in one LINQ statement?

I think my question is about functional programming. I wrote some code for updating a list of my objects:

var minDate = userRecords.Min(x => x.FromDate);
return userRecords.Where(x => x.FromDate != minDate).Select(x =>
    {
        x.IsInitialRecord = false;
        return x;
    });

I want to rewrite it without using a local variable. Thanks in advance!

by user3818229 at September 27, 2016 10:50 PM

Can't get SVC Score function to work

I am trying to run this machine learning platform and I get the following error:

ValueError: X.shape[1] = 574 should be equal to 11, the number of features at training time

My Code:

from pylab import *
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier
import numpy as np

X = list ()
Y = list ()
validationX = list ()
validationY = list ()
file = open ('C:\\Users\\User\\Desktop\\csci4113\\project1\\whitewineTraining.txt','r')
for eachline in file:
    strArray = eachline.split(";")
    row = list ()
    for i in range(len(strArray) - 1):
    if (int(strArray[-1]) > 6):
file2 = open ('C:\\Users\\User\\Desktop\\csci4113\\project1\\whitewineValidation.txt', 'r')
for eachline in file2:
    strArray = eachline.split(";")
    row2 = list ()
    for i in range(len(strArray) - 1):
    if (int(strArray[-1]) > 6):

X = np.array(X)
print (X)
Y = np.array(Y)
print (Y)
validationX = np.array(validationX)
validationY = np.array(validationY)

clf = svm.SVC(),Y)
result = clf.predict(validationX)
clf.score(result, validationY)

The goal of the program is to build a model with the fit() command and then compare its predictions against the validation set in validationY to assess the validity of our machine learning model. Here is the rest of the console output; keep in mind that X is, confusingly, an 11x574 array!

[[  7.           0.27         0.36       ...,   3.           0.45         8.8       ]
 [  6.3          0.3          0.34       ...,   3.3          0.49         9.5       ]
 [  8.1          0.28         0.4        ...,   3.26         0.44        10.1       ]
 [  6.3          0.28         0.22       ...,   3.           0.33        10.6       ]
 [  7.4          0.16         0.33       ...,   3.04         0.68        10.5       ]
 [  8.4          0.27         0.3        ...,   2.89         0.3
[0 0 0 ..., 0 1 0]
C:\Users\User\Anaconda3\lib\site-packages\sklearn\utils\ DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and willraise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.
Traceback (most recent call last):

  File "<ipython-input-68-31c649fe24b3>", line 1, in <module>
    runfile('C:/Users/User/Desktop/csci4113/project1/', wdir='C:/Users/User/Desktop/csci4113/project1')

  File "C:\Users\User\Anaconda3\lib\site-packages\spyderlib\widgets\externalshell\", line 714, in runfile
    execfile(filename, namespace)

  File "C:\Users\User\Anaconda3\lib\site-packages\spyderlib\widgets\externalshell\", line 89, in execfile
    exec(compile(, filename, 'exec'), namespace)

  File "C:/Users/User/Desktop/csci4113/project1/", line 43, in <module>
    clf.score(result, validationY)

  File "C:\Users\User\Anaconda3\lib\site-packages\sklearn\", line 310, in score
    return accuracy_score(y, self.predict(X), sample_weight=sample_weight)

  File "C:\Users\User\Anaconda3\lib\site-packages\sklearn\svm\", line 568, in predict
    y = super(BaseSVC, self).predict(X)

  File "C:\Users\User\Anaconda3\lib\site-packages\sklearn\svm\", line 305, in predict
    X = self._validate_for_predict(X)

  File "C:\Users\User\Anaconda3\lib\site-packages\sklearn\svm\", line 474, in _validate_for_predict
    (n_features, self.shape_fit_[1]))

ValueError: X.shape[1] = 574 should be equal to 11, the number of features at training time


by Amir at September 27, 2016 10:43 PM

Keras model creates linear classification for make_moons data

I'm trying to reproduce the model in this WildML - Implementing a Neural Network From Scratch tutorial but using Keras instead. I've tried to use all of the same configurations as the tutorial, but I keep getting a linear classification even after tweaking the number of epochs, batch sizes, activation functions, and number of units in the hidden layer:

classification graph

Here's my code:

from keras.models import Sequential
from keras.layers import Dense, Activation
from keras.utils.visualize_util import plot
from keras.utils.np_utils import to_categorical

import  numpy as np
import  matplotlib.pyplot as plt

import  sklearn
from    sklearn import datasets, linear_model

# Build model
model = Sequential()
model.add(Dense(input_dim=2, output_dim=3, activation="tanh", init="normal"))
model.add(Dense(output_dim=2, activation="softmax"))
model.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])

# Train
X, y = sklearn.datasets.make_moons(200, noise=0.20)
y_binary = to_categorical(y)
model.fit(X, y_binary, nb_epoch=100)

# Helper function to plot a decision boundary.
# If you don't fully understand this function don't worry, it just generates the contour plot below.
def plot_decision_boundary(pred_func):
    # Set min and max values and give it some padding
    x_min, x_max = X[:, 0].min() - .5, X[:, 0].max() + .5
    y_min, y_max = X[:, 1].min() - .5, X[:, 1].max() + .5
    h = 0.01
    # Generate a grid of points with distance h between them
    xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
    # Predict the function value for the whole grid
    Z = pred_func(np.c_[xx.ravel(), yy.ravel()])
    Z = Z.reshape(xx.shape)
    # Plot the contour and training examples
    plt.contourf(xx, yy, Z)
    plt.scatter(X[:, 0], X[:, 1], c=y)

# Predict and plot
plot_decision_boundary(lambda x: model.predict_classes(x, batch_size=200))
plt.title("Decision Boundary for hidden layer size 3")

by choxi at September 27, 2016 10:39 PM


Expanding the M4 Instance Type – New M4.16xlarge

EC2’s M4 instances offer a balance of compute, memory, and networking resources and are a good choice for many different types of applications.

We launched the M4 instances last year (read The New M4 Instance Type to learn more) and gave you a choice of five sizes, from large up to 10xlarge. Today we are expanding the range with the introduction of a new m4.16xlarge with 64 vCPUs and 256 GiB of RAM. Here’s the complete set of specs:

Instance Name | vCPU Count | RAM | Instance Storage | Network Performance | EBS-Optimized
m4.large | 2 | 8 GiB | EBS Only | Moderate | 450 Mbps
m4.xlarge | 4 | 16 GiB | EBS Only | High | 750 Mbps
m4.2xlarge | 8 | 32 GiB | EBS Only | High | 1,000 Mbps
m4.4xlarge | 16 | 64 GiB | EBS Only | High | 2,000 Mbps
m4.10xlarge | 40 | 160 GiB | EBS Only | 10 Gbps | 4,000 Mbps
m4.16xlarge | 64 | 256 GiB | EBS Only | 20 Gbps | 10,000 Mbps

The new instances are based on Intel Xeon E5-2686 v4 (Broadwell) processors that are optimized specifically for EC2. When used with Elastic Network Adapter (ENA) inside of a placement group, the instances can deliver up to 20 Gbps of low-latency network bandwidth. To learn more about the ENA, read my post, Elastic Network Adapter – High Performance Network Interface for Amazon EC2.

Like the m4.10xlarge, the m4.16xlarge allows you to control the C states to enable higher turbo frequencies when you are using just a few cores. You can also control the P states to lower performance variability (read my extended description in New C4 Instances to learn more about both of these features).

You can purchase On-Demand Instances, Spot Instances, and Reserved Instances; visit the EC2 Pricing page for more information.

Available Now
As part of today’s launch we are also making the M4 instances available in the China (Beijing), South America (Brazil), and AWS GovCloud (US) regions.


by Jeff Barr at September 27, 2016 10:23 PM


What is the difference between radix trees and Patricia tries?

I am learning about radix trees (aka compressed tries) and Patricia tries, but I am finding conflicting information on whether or not they are actually the same. A radix tree can be obtained from a normal (uncompressed) trie by merging nodes with their parents when the nodes are the only child. This also holds for Patricia tries. In what ways are the two data structures different?

For example, NIST lists the two as the same:

Patricia tree

(data structure)

Definition: A compact representation of a trie in which any node that is an only child is merged with its parent.

Also known as radix tree.

Many sources on the web claim the same. However, apparently Patricia tries are a special case of radix trees. Wikipedia entry says:

PATRICIA tries are radix tries with radix equals 2, which means that each bit of the key is compared individually and each node is a two-way (i.e., left versus right) branch.

I don't really understand this. Is the difference only in the way comparisons are made when doing look-ups? How can each node be a "two-way branch"? Shouldn't there be at most ALPHABET_SIZE possible branches for a given node?

Can someone clarify this? For practical purposes, are radix tries typically implemented as Patricia tries (and, hence, often considered the same)? Or can no such generalizations be made?
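The merging step described above (a node that is an only child is merged with its parent) can be sketched in a few lines. This is only an illustration under assumptions: tries are represented as nested dictionaries keyed by characters, "$" is a hypothetical end-of-word marker, and the function names are made up. It shows the character-level radix tree, not the bitwise (radix 2) Patricia variant from the Wikipedia quote.

```python
def build_trie(words):
    """Plain (uncompressed) trie as nested dicts, one character per edge."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node["$"] = {}  # hypothetical end-of-word marker
    return root

def compress(node):
    """Merge every chain of only-children into one edge with a multi-character label."""
    out = {}
    for label, child in node.items():
        while len(child) == 1:
            (next_label, next_child), = child.items()
            if next_label == "$":
                break  # keep the end-of-word marker as its own edge
            label += next_label
            child = next_child
        out[label] = compress(child)
    return out

radix = compress(build_trie(["test", "team", "toast"]))
print(radix)
```

In this representation a node can still have up to ALPHABET_SIZE children; the two-way branching in the quoted description arises only when keys are treated as bit strings, so that each branch decision looks at a single bit.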

by w128 at September 27, 2016 10:21 PM


How to use in my project?

I'm new to Java and to sentiment analysis as well. I need to extract positive and negative words from tweets and store them in a HashMap. I searched for a classifier for that and found that Stanford provides a class for it. I added all the .jar files to my project's libraries, but I couldn't find any illustration of how to use it.


import edu.stanford.nlp.classify.Classifier;
import edu.stanford.nlp.classify.ColumnDataClassifier;
import edu.stanford.nlp.classify.LinearClassifier;
import edu.stanford.nlp.ling.Datum;
import edu.stanford.nlp.objectbank.ObjectBank;
import edu.stanford.nlp.util.ErasureUtils;

class ClassifierDemo {

  public static void main(String[] args) throws Exception {
    ColumnDataClassifier cdc = new ColumnDataClassifier("examples/cheese2007.prop");
    Classifier<String,String> cl =
    for (String line : ObjectBank.getLineIterator("examples/cheeseDisease.test", "utf-8")) {
      // instead of the method in the line below, if you have the individual elements
      // already you can use cdc.makeDatumFromStrings(String[])
      Datum<String,String> d = cdc.makeDatumFromLine(line);
      System.out.println(line + "  ==>  " + cl.classOf(d));


  public static void demonstrateSerialization()
    throws IOException, ClassNotFoundException {
    System.out.println("Demonstrating working with a serialized classifier");
    ColumnDataClassifier cdc = new ColumnDataClassifier("examples/cheese2007.prop");
    Classifier<String,String> cl =

    // Exhibit serialization and deserialization working. Serialized to bytes in memory for simplicity
    System.out.println(); System.out.println();
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    ObjectOutputStream oos = new ObjectOutputStream(baos);
    byte[] object = baos.toByteArray();
    ByteArrayInputStream bais = new ByteArrayInputStream(object);
    ObjectInputStream ois = new ObjectInputStream(bais);
    LinearClassifier<String,String> lc = ErasureUtils.uncheckedCast(ois.readObject());
    ColumnDataClassifier cdc2 = new ColumnDataClassifier("examples/cheese2007.prop");

    // We compare the output of the deserialized classifier lc versus the original one cl
    // For both we use a ColumnDataClassifier to convert text lines to examples
    for (String line : ObjectBank.getLineIterator("examples/cheeseDisease.test", "utf-8")) {
      Datum<String,String> d = cdc.makeDatumFromLine(line);
      Datum<String,String> d2 = cdc2.makeDatumFromLine(line);
      System.out.println(line + "  =origi=>  " + cl.classOf(d));
      System.out.println(line + "  =deser=>  " + lc.classOf(d2));


by user1 at September 27, 2016 10:17 PM


What is the industry standard for annualizing returns over non-contiguous time periods?

Suppose I am invested in the same fund for the first 200 days in 2013, some combination of 150 days in 2014, and the last 100 days in 2015. Further suppose that geometrically linking the daily returns over every day invested gives a cumulative return of 10% over the total 450 days invested during this 3 year time frame.

What is the annualized return in this case?

A couple of possibilities seem reasonable:

  1. Annualize only over days invested: $$(1+10\%)^{365/450} - 1 \approx 8.037\%$$

  2. Annualize over the entire time frame: $$(1+10\%)^{1/3} - 1 \approx 3.228\%$$

Is one of these very different answers the standard or is it entirely context dependent?
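The two candidate figures can be reproduced with a short calculation. This is just a sketch of the arithmetic in the question; the helper name `annualize` is hypothetical, and a 365-day year is assumed as in the first formula.

```python
def annualize(cumulative_return, years):
    """Convert a cumulative return earned over `years` into a per-year rate."""
    return (1 + cumulative_return) ** (1 / years) - 1

# Option 1: only the 450 days actually invested count as elapsed time.
over_days_invested = annualize(0.10, 450 / 365)

# Option 2: the full 3-year window counts as elapsed time.
over_whole_window = annualize(0.10, 3)

print(f"{over_days_invested:.3%}  {over_whole_window:.3%}")
```

The only difference between the two options is the length of time passed to `annualize`, so the choice comes down to whether the days out of the fund count as part of the holding period.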

by Alexis Olson at September 27, 2016 10:14 PM


Python: Combining two lists and removing duplicates in a functional programming way

I'm trying to write a function that would combine two lists while removing duplicate items, but in a pure functional way. For example:

a = [1,2,2]
b = [1,3,3,4,5,0]
union(a,b) --> [1,2,3,4,5,0]

The imperative form of the code would be:

def union(a,b):
    c = []
    for i in a + b:
        if i not in c:
            c.append(i)
    return c

I've tried several approaches, but couldn't find a way to do that without using a loop to go over the items - what am I missing?
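One way to express the same idea without an explicit loop is a fold. The sketch below is one possible approach, not the only one: it uses `functools.reduce` from the standard library and builds a new list at each step instead of mutating an accumulator.

```python
from functools import reduce

def union(a, b):
    """Order-preserving union: fold over a + b, keeping each item the first time it appears."""
    return reduce(lambda acc, x: acc if x in acc else acc + [x], a + b, [])

a = [1, 2, 2]
b = [1, 3, 3, 4, 5, 0]
result = union(a, b)
print(result)
```

Like the imperative version, this checks membership in a list at every step, so it takes quadratic time on long inputs. For hashable items, `list(dict.fromkeys(a + b))` gives the same order-preserving result on Python 3.7 and later.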

by DanielR at September 27, 2016 09:59 PM


AWS Hot Startups – September 2016

Tina Barr is back with this month’s hot startups on AWS!


It’s officially fall so warm up that hot cider and check out this month’s great AWS-powered startups:

  • Funding Circle – The leading online marketplace for business loans.
  • Karhoo – A ride comparison app.
  • nearbuy – Connecting customers and local merchants across India.

Funding Circle (UK)
Funding Circle is one of the world’s leading direct lending platforms for business loans, where people and organizations can invest in successful small businesses. The platform was established in 2010 by co-founders Samir Desai, James Meekings, and Andrew Mullinger as a direct response to the noncompetitive lending market that exists in the UK. Funding Circle’s goal was to create the infrastructure – similar to a stock exchange or bond market – where any investor could lend to small businesses. With Funding Circle, individuals, financial institutions, and even governments can lend to creditworthy small businesses using an online direct lending platform. Since its inception, Funding Circle has raised $300M in equity capital from the same investors that backed Facebook, Twitter, and Sky. The platform expanded to the US market in October 2013 and launched across Continental Europe in October 2015.

Funding Circle has given businesses the ability to apply online for loans much faster than they could through traditional routes due in part to the absence of high overhead branch costs and legacy IT issues. Their investors include more than 50,000 individuals, the Government-backed British Business Bank, the European Investment Bank, and many local councils and large financial institutions. To date, more than £1.4 billion has been lent through the platform to nearly 16,000 small businesses in the UK alone. Funding Circle’s growth has led independent experts to predict that it will see strong growth in the UK business lending market within a decade. The platform has also made a huge impact in the UK economy – boosting it by £2.7 billion, creating up to 40,000 new jobs, and helping to build more than 2,000 new homes.

As a regulated business, Funding Circle needs separate infrastructure in multiple geographies. AWS provides similar services across all of Funding Circle’s territories. They use the full AWS stack from the top, with Amazon Route 53 directing traffic across global Amazon EC2 instances, to data analytics with Amazon Redshift.

Check out this short video to learn more about how Funding Circle works!

Karhoo (New York)
Daniel Ishag, founder and CEO of Karhoo, found himself in a situation many of us have probably been in. He was in a hotel in California using an app to call a cab from one of the big on-demand services. The driver cancelled. Daniel tried three or four different companies and again, they all cancelled. The very next day he was booking a flight when he saw all of the ways in which travel companies clearly presented airline choices for travelers. Daniel realized that there was great potential to translate this to ground transportation – specifically with taxis and licensed private hire. Within 48 hours of this realization, he was on his way to Bombay to prototype the product.

Karhoo is the global cab comparison and booking app that provides passengers with more choices each time they book a ride. By connecting directly to the fleet dispatch system of established black cab, minicab, and executive car operators, the app allows passengers to choose the ride they want, at the right price with no surge pricing. The vendor-neutral platform also gives passengers the ability to pre-book their rides days or months in advance. With over 500,000 cars on the platform, Karhoo is changing the landscape of the on-demand transport industry.

In order to build a scalable business, Karhoo uses AWS to implement many independent integration projects, run an operation that is data-driven, and experiment with tools and technologies without committing to heavy costs. They utilize Amazon S3 for storage and Amazon EC2, Amazon Redshift, and Amazon RDS for operation. Karhoo also uses Amazon EMR, Amazon ElastiCache, and Amazon SES and is looking into future products such as a mobile device testing farm.

Check out Karhoo’s blog to keep up with their latest news!

nearbuy (India)
nearbuy is India’s first hyper-local online platform that gives consumers and local merchants a place to discover and interact with each other. They help consumers find some of the best deals in food, beauty, health, hotels, and more in over 30 cities in India. Here’s how to use them:

  • Explore options and deals at restaurants, spas, gyms, movies, hotels and more around you.
  • Buy easily and securely, using credit/debit cards, net-banking, or wallets.
  • Enjoy the service by simply showing your voucher on the nearbuy app (iOS and Android).

After continuously observing the amount of time people were spending on their mobile phones, six passionate individuals decided to build a product that allowed for all goods and services in India to be purchased online. nearbuy has been able to make the time gap between purchase and consumption almost instant, make experiences more relevant by offering them at the user’s current location, and allow services such as appointments and payments to be made from the app itself. The nearbuy team is currently charting a path to define how services can and will be bought online in India.

nearbuy chose AWS in order to reduce their time to market while aggressively scaling their operations. They leverage Amazon EC2 heavily and were one of the few companies in the region running their entire production load on EC2. The container-based approach has not only helped nearbuy significantly reduce their infrastructure cost, but has also enabled them to implement CI+CD (Continuous Integration / Continuous Deployment), which has reduced time to ship exponentially.

Stay connected to nearbuy by following them at

Tina Barr

by Jeff Barr at September 27, 2016 09:51 PM