# Planet Primates

## October 07, 2016

### Planet Theory

#### Linear algebraic structure of word meanings

Word embeddings capture the meaning of a word using a low-dimensional vector and are ubiquitous in natural language processing (NLP). (See my earlier post 1 and post 2.) It has always been unclear how to interpret the embedding when the word in question is polysemous, that is, has multiple senses. For example, tie can mean an article of clothing, a drawn sports match, or a physical action.

Polysemy is an important issue in NLP and much work relies upon WordNet, a hand-constructed repository of word senses and their interrelationships. Unfortunately, good WordNets do not exist for most languages, and even the one in English is believed to be rather incomplete. Thus some effort has been spent on methods to find different senses of words.

In this post I will talk about my joint work with Li, Liang, Ma, and Risteski, which shows that word senses are in fact easily accessible in many current word embeddings. This goes against conventional wisdom in NLP, which holds that word embeddings cannot capture polysemy since they use a single vector to represent the word, regardless of whether the word has one sense or a dozen. Our work shows that the major senses of a word lie in linear superposition within the embedding, and are extractable using sparse coding.

This post uses embeddings constructed using our method and the Wikipedia corpus, but similar techniques also apply (with some loss in precision) to other embeddings described in post 1, such as word2vec, GloVe, or even the decades-old PMI embedding.

## A surprising experiment

Take the viewpoint –simplistic yet instructive– that a polysemous word like tie is a single lexical token that represents unrelated words tie1, tie2, … Here is a surprising experiment that suggests that the embedding for tie should be approximately a weighted sum of the (hypothetical) embeddings of tie1, tie2, …

Take two random words $w_1, w_2$. Combine them into an artificial polysemous word $w_{new}$ by replacing every occurrence of $w_1$ or $w_2$ in the corpus by $w_{new}.$ Next, compute an embedding for $w_{new}$ using the same embedding method while deleting embeddings for $w_1, w_2$ but preserving the embeddings for all other words. Compare the embedding $v_{w_{new}}$ to linear combinations of $v_{w_1}$ and $v_{w_2}$.
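The corpus-rewriting step of this experiment can be sketched as follows. This is only an illustration on a toy token list: the function name `merge_words` and the example words are made up, and retraining the embedding for the merged token is not shown.

```python
def merge_words(tokens, w1, w2, w_new):
    """Replace every occurrence of w1 or w2 with the artificial token w_new."""
    targets = {w1, w2}
    return [w_new if tok in targets else tok for tok in tokens]


# Toy corpus; a real run would stream the whole Wikipedia corpus.
corpus = "the bank raised rates while the river rose".split()
merged = merge_words(corpus, "bank", "river", "bank_river")
print(merged)
```

After this substitution, the embedding method is rerun to obtain a vector for the merged token only, while the vectors of all other words are kept fixed.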

Repeating this experiment with a wide range of values for the ratio $r$ between the frequencies of $w_1$ and $w_2$, we find that $v_{w_{new}}$ lies close to the subspace spanned by $v_{w_1}$ and $v_{w_2}$: the cosine of its angle with the subspace is on average $0.97$ with standard deviation $0.02$. Thus $v_{w_{new}} \approx \alpha v_{w_1} + \beta v_{w_2}$. We find that $\alpha \approx 1$ whereas $\beta \approx 1- c\lg r$ for some constant $c\approx 0.5$. (Note this formula is meaningful when the frequency ratio $r$ is not too large, i.e. when $r < 10^{1/c} \approx 100$.) Thanks to this logarithm, the infrequent sense is not swamped out in the embedding, even if it is 50 times less frequent than the dominant sense. This is an important reason behind the success of our method for extracting word senses.
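To get a feel for the logarithmic weighting, here is a small numeric sketch. It assumes that $\lg$ means the base-10 logarithm (which is what the bound $10^{1/c}$ suggests) and that $r \geq 1$ is the frequency of $w_1$ divided by that of $w_2$; the function name `beta` is just for illustration.

```python
import math

C = 0.5  # the approximate constant c quoted in the text


def beta(r, c=C):
    """Approximate coefficient of the rarer word: 1 - c * log10(r)."""
    return 1.0 - c * math.log10(r)


for r in (1, 10, 50, 100):
    print(f"r = {r:>3}: beta ~ {beta(r):.3f}")
```

Under these assumptions the coefficient is still about $0.15$ at $r = 50$ and only reaches zero at $r = 100$, which is why the rarer sense is not swamped.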

This experiment –to which we were led by our theoretical investigations– is very surprising because the embedding is the solution to a complicated, nonconvex optimization, yet it behaves in such a strikingly linear way. You can read our paper for an intuitive explanation using our theoretical model from post 2.

## Extracting word senses from embeddings

The above experiment suggests that

$$v_{tie} \approx \alpha_1 v_{tie1} + \alpha_2 v_{tie2} + \alpha_3 v_{tie3} + \cdots \qquad (1)$$

but this alone is insufficient to mathematically pin down the senses, since $v_{tie}$ can be expressed in infinitely many ways as such a combination. To pin down the senses we will interrelate the senses of different words: for example, relate the “article of clothing” sense tie1 with shoe, jacket, etc.

The word senses tie1, tie2, … correspond to “different things being talked about”, in other words, different word distributions occurring around tie. Now remember that our earlier paper described in post 2 gives an interpretation of “what’s being talked about”: it is called discourse and it is represented by a unit vector in the embedding space. In particular, the theoretical model of post 2 imagines a text corpus as being generated by a random walk on discourse vectors. When the walk is at a discourse $c_t$ at time $t$, it outputs a few words using a loglinear distribution:

$$\Pr[w \text{ emitted at time } t \mid c_t] \propto \exp(\langle c_t, v_w \rangle) \qquad (2)$$
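As a rough illustration of the log-linear emission rule, in which the probability of a word is proportional to the exponential of its inner product with the discourse vector, here is a toy sketch. The 2-dimensional vectors and the “clothing” discourse are invented for the example and do not come from any trained embedding.

```python
import math


def emission_probs(discourse, word_vectors):
    """Pr[w | c] proportional to exp(<c, v_w>), normalised over the vocabulary."""
    scores = {
        w: math.exp(sum(c * x for c, x in zip(discourse, v)))
        for w, v in word_vectors.items()
    }
    total = sum(scores.values())
    return {w: s / total for w, s in scores.items()}


# Hypothetical toy vectors, for illustration only.
vectors = {"shoe": [2.0, 0.1], "jacket": [1.8, 0.0], "score": [0.1, 2.0]}
clothing = [1.0, 0.0]  # hypothetical unit-norm "clothing" discourse
probs = emission_probs(clothing, vectors)
for word, p in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(f"{word:>7}: {p:.3f}")
```

Words whose vectors have a large inner product with the discourse dominate the distribution, while unrelated words receive little mass.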

One imagines there exists a “clothing” discourse that has high probability of outputting the tie1 sense, and also of outputting related words such as shoe, jacket, etc. Similarly there may be a “games/matches” discourse that has high probability of outputting tie2 as well as team, score etc.

By equation (2) the probability of being output by a discourse is determined by the inner product, so one expects that the vector for the “clothing” discourse has high inner product with all of shoe, jacket, tie1, etc., and thus can stand as a surrogate for $v_{tie1}$ in expression (1)! This motivates the following global optimization:

Given word vectors in $\Re^d$, totaling about $60{,}000$ in this case, a sparsity parameter $k$, and an upper bound $m$, find a set of unit vectors $A_1, A_2, \ldots, A_m$ such that

$$v_w = \sum_{j=1}^m \alpha_{w,j} A_j + \eta_w \qquad (3)$$

where at most $k$ of the coefficients $\alpha_{w,1},\dots,\alpha_{w,m}$ are nonzero (a so-called hard sparsity constraint), and $\eta_w$ is a noise vector.

Here $A_1, \ldots A_m$ represent important discourses in the corpus, which we refer to as atoms of discourse.

Optimization (3) is a surrogate for the desired expansion of $v_{tie}$ in (1) because one can hope that the atoms of discourse will contain atoms corresponding to clothing, sports matches etc. that will have high inner product (close to $1$) with tie1, tie2 respectively. Furthermore, restricting $m$ to be much smaller than the number of words ensures that each atom needs to be used for multiple words, e.g., reuse the “clothing” atom for shoes, jacket etc. as well as for tie.

Both $A_j$’s and $\alpha_{w,j}$’s are unknowns in this optimization. This is nothing but sparse coding, useful in neuroscience, image processing, computer vision, etc. It is nonconvex and computationally NP-hard in the worst case, but can be solved quite efficiently in practice using something called the k-SVD algorithm described in Elad’s survey, lecture 4. We solved this problem with sparsity $k=5$ and using $m$ about $2000$. (Experimental details are in the paper. Also, some theoretical analysis of such an algorithm is possible; see this earlier post.)
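To make the hard sparsity constraint concrete, here is a minimal sketch of the coefficient side of the problem only: given fixed unit-norm atoms, greedily pick at most $k$ of them to approximate a vector (plain matching pursuit). This is not k-SVD, which also learns the atoms, and the 3-dimensional atoms and the vector `v_tie` are toy values invented for illustration.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matching_pursuit(v, atoms, k):
    """Greedily approximate v using at most k unit-norm atoms.

    Returns (coeffs, residual), where coeffs maps atom index -> coefficient
    and residual plays the role of the noise vector.
    """
    residual = list(v)
    coeffs = {}
    for _ in range(k):
        # Pick the atom most correlated with what is still unexplained.
        j = max(range(len(atoms)), key=lambda i: abs(dot(residual, atoms[i])))
        c = dot(residual, atoms[j])
        if abs(c) < 1e-12:
            break
        coeffs[j] = coeffs.get(j, 0.0) + c
        residual = [r - c * a for r, a in zip(residual, atoms[j])]
    return coeffs, residual


# Toy atoms (orthonormal here for simplicity) and a toy word vector.
atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
v_tie = [0.9, 0.4, 0.05]
coeffs, noise = matching_pursuit(v_tie, atoms, k=2)
print(coeffs, noise)
```

With $k=2$ the sketch keeps the two dominant atoms and leaves the small third component in the residual, mirroring the roles of $\alpha_{w,j}$ and $\eta_w$.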

## Experimental Results

Each discourse atom defines via (2) a distribution on words, which due to the exponential appearing in (2) strongly favors words whose embeddings have a larger inner product with it. In practice, this distribution is quite concentrated on as few as 50-100 words, and the “meaning” of a discourse atom can be roughly determined by looking at a few nearby words. This is how we visualize atoms in the figures below. The first figure gives a few representative atoms of discourse.

And here are the discourse atoms used to represent two polysemous words, tie and spring

You can see that the discourse atoms do correspond to senses of these words.

Finally, we also have a technique that, given a target word, generates representative sentences according to its various senses as detected by the algorithm. Below are the sentences returned for ring. (N.B. The mathematical meaning was missing in WordNet but was picked up by our method.)

## A new testbed for testing comprehension of word senses

Many tests have been proposed to test an algorithm’s grasp of word senses. They often involve hard-to-understand metrics such as distance in WordNet, or are tied to performance on specific applications like web search.

We propose a new simple test –inspired by word-intrusion tests for topic coherence due to Chang et al 2009– which has the advantages of being easy to understand and of being administrable to humans as well.

We created a testbed using 200 polysemous words and their 704 senses according to WordNet. Each “sense” is represented by a set of 8 related words; these were collected from WordNet and online dictionaries by college students who were told to identify the most relevant other words occurring in the online definitions of this word sense as well as in the accompanying illustrative sentences. These 8 words are considered the ground-truth representation of the word sense: e.g., for the “tool/weapon” sense of axe they were: handle, harvest, cutting, split, tool, wood, battle, chop.

Police line-up test for word senses: the algorithm is given a random one of these 200 polysemous words and a set of $m$ senses which contain the true sense for the word as well as some distractors, which are randomly picked senses from other words in the testbed. The test taker has to identify the word’s true senses among these $m$ senses.

As usual, accuracy is measured using precision (what fraction of the algorithm/human’s guesses were correct) and recall (what fraction of the correct senses were among the guesses).
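For concreteness, the two scores can be computed as below. The sense labels are placeholders and the numbers are not results from the paper.

```python
def precision_recall(guesses, true_senses):
    """Return (precision, recall) for a set of guessed senses."""
    guesses, true_senses = set(guesses), set(true_senses)
    correct = len(guesses & true_senses)
    precision = correct / len(guesses) if guesses else 0.0
    recall = correct / len(true_senses) if true_senses else 0.0
    return precision, recall


# Hypothetical line-up: 3 true senses, 4 guesses, 2 of them correct.
p, r = precision_recall(
    guesses={"s1", "s2", "s7", "s9"},
    true_senses={"s1", "s2", "s3"},
)
print(f"precision = {p:.2f}, recall = {r:.2f}")
```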

For $m=20$ and $k=4$, our algorithm succeeds with precision $63\%$ and recall $70\%$, and performance remains reasonable for $m=50$. We also administered the test to a group of grad students. Native English speakers had precision/recall scores in the $75$ to $90$ percent range. Non-native speakers had scores roughly similar to our algorithm.

Our algorithm works something like this: If $w$ is the target word, then take all discourse atoms computed for that word, and compute a certain similarity score between each atom and each of the $m$ senses, where the words in the senses are represented by their word vectors. (Details are in the paper.)

## Takeaways

Word embeddings have been useful in a host of other settings, and now it appears that they also can easily yield different senses of a polysemous word. We have some subsequent applications of these ideas to other previously studied settings, including topic models, creating WordNets for other languages, and understanding the semantic content of fMRI brain measurements. I’ll describe some of them in future posts.

## July 28, 2016

### StackOverflow

#### Currying in javascript for function with n parameters

If f :: (a, b) -> c, we can define curry(f) as below:

    curry :: ((a, b) -> c) -> a -> b -> c

    const curry = f => a => b => f(a, b);
    const sum = curry((num1, num2) => num1 + num2);
    console.log(sum(2)(3)); // 5

How do we implement a generic curry function that takes a function with n parameters?

#### LSTM network learning

I have attempted to program my own LSTM (long short-term memory) neural network. I would like to verify that the basic functionality is working. I have implemented a backpropagation through time (BPTT) algorithm to train a single-cell network.

Should a single-cell LSTM network be able to learn a simple sequence, or is more than one cell necessary? The network does not seem to be able to learn a simple sequence such as 1 0 0 0 1 0 0 0 1 0 0 0 1.

I am sending the sequence of 1's and 0's one by one, in order, into the network, and feeding it forward. I record each output for the sequence.

After running the whole sequence through the LSTM cell, I feed the mean error signals back into the cell, saving the weight changes internal to the cell in a separate collection. After running all the errors through one by one and calculating the new weights after each error, I average the new weights together to get the new weight, for each weight in the cell.

Am I doing something wrong? I would very much appreciate any advice.

Thank you so much!

### QuantOverflow

#### Dual discounted forward curve

I was wondering how to calculate the forward rates based on OIS discounting for the half-year terms. I know how to do this for the full-year terms: just make sure that the two legs are equal to each other. The problem is that I don't have fixed payments on the half-year terms. For example, when one wants to calculate the 1.5-year OIS-discounted forward rate, there is only 1 fixed payment versus 3 floating payments. How should I deal with this? One method is to calculate the full-year rates and then interpolate between them, but I wonder whether there is a method that can calculate these rates without interpolation.

Next to that, what exactly does a 1.5-year swap rate mean? Is this just an interpolated rate, given that you have the same problem here when there isn't a payment on the half years?

### StackOverflow

#### UseMethod("predict") : no applicable method for 'predict' applied to an object of class "train"

I have a model (fit), based on historic information until last month. Now I would like to predict using my model for the current month. When I try to invoke the following code:

    predicted <- predict(fit, testData[-$Readmit])

I get the following error:

    Error in UseMethod("predict") : no applicable method for 'predict' applied to an object of class "train"

The fit model was created via the train function from the caret package.

    # Script-1: create a model:
    fit <- train(testData[-$Readmit], testData$Readmit)
    saveRDS(fit, modelFileName) # save the fit object into a file

    # Script-2: predict
    fit <- readRDS(modelFileName) # Load the model (generated previously)
    predicted <- predict(fit, testData[-$Readmit])


The data set from the training model has the following structure:

    > str(fit$trainingData)
    'data.frame':   29955 obs. of  27 variables:
     $ Acuity                : Factor w/ 3 levels "Elective  ","Emergency ",..: 2 2 2 1 1 2 2 2 1 1 ...
     $ AgeGroup              : Factor w/ 10 levels "100-105","65-70",..: 8 6 9 9 5 4 9 2 3 2 ...
     $ IsPriority            : int  0 0 0 0 0 0 0 0 0 0 ...
     $ QNXTReferToId         : int  115 1703712 115 3690 1948 115 109 512 481 1785596 ...
     $ QNXTReferFromId       : int  1740397 1724801 1711465 1704170 1714272 1731911 1535 1712758 1740614 1760252 ...
     $ iscasemanagement      : Factor w/ 2 levels "N","Y": 2 1 1 2 2 1 2 1 2 2 ...
     $ iseligible            : Factor w/ 2 levels "N","Y": 2 2 2 2 2 2 2 2 2 2 ...
     $ referralservicecode   : Factor w/ 11 levels "12345","278",..: 1 1 1 9 9 1 1 6 9 9 ...
     $ IsHighlight           : Factor w/ 2 levels "N","Y": 1 1 1 1 1 1 1 1 1 1 ...
     $ admittingdiagnosiscode: num  439 786 785 786 428 ...
     $ dischargediagnosiscode: num  439 0 296 786 428 ...
     $ RealLengthOfStay      : int  3 1 6 1 2 3 3 7 3 2 ...
     $ QNXTPCPId             : int  1740397 1724801 1711465 1704170 1714272 1731911 1535 1712758 1740614 1760252 ...
     $ QNXTProgramId         : Factor w/ 3 levels "QMXHPQ0839 ",..: 1 1 1 1 1 1 1 1 1 1 ...
     $ physicalzipcode       : int  33054 33712 33010 33809 33010 33013 33142 33030 33161 33055 ...
     $ gender                : Factor w/ 2 levels "F","M": 1 1 1 1 2 1 1 2 2 1 ...
     $ ethnicitycode         : Factor w/ 4 levels "ETHN0001       ",..: 4 4 4 4 4 4 4 4 4 4 ...
     $ dx1                   : num  439 786 296 786 428 ...
     $ dx2                   : num  439 292 785 786 428 ...
     $ dx3                   : num  402 0 250 0 0 ...
     $ svc1                  : int  0 120 120 762 762 120 120 120 762 762 ...
     $ svc2                  : int  120 0 0 0 0 0 0 0 0 0 ...
     $ svc3                  : int  0 0 0 0 0 0 0 0 0 0 ...
     $ Disposition           : Factor w/ 28 levels "0","APPEAL & GRIEVANCE REVIEW ",..: 11 11 16 11 11 11 11 11 11 11 ...
     $ AvgIncome             : Factor w/ 10 levels "-1",">100k","0-25k",..: 3 6 3 8 3 4 3 5 4 4 ...
     $ CaseManagerNameID     : int  124 1 1 19 20 1 16 1 43 20 ...
     $ .outcome              : Factor w/ 2 levels "NO","YES": 1 2 2 1 1 1 2 2 1 1 ...


Now the testData will have the following structure:

    > str(testData[-$Readmit])
    'data.frame':   610 obs. of  26 variables:
     $ Acuity                : Factor w/ 4 levels "0","Elective  ",..: 3 2 4 2 2 2 4 3 3 3 ...
     $ AgeGroup              : Factor w/ 9 levels "100-105","65-70",..: 4 3 5 4 2 9 4 2 4 6 ...
     $ IsPriority            : int  0 0 0 0 0 0 1 1 1 1 ...
     $ QNXTReferToId         : int  2140 482 1703785 1941 114 1714905 1703785 98 109 109 ...
     $ QNXTReferFromId       : int  1791383 1729375 1718532 1746336 1718267 1718267 1718532 98 109 109 ...
     $ iscasemanagement      : Factor w/ 2 levels "N","Y": 2 2 2 2 2 2 1 2 2 1 ...
     $ iseligible            : Factor w/ 2 levels "N","Y": 2 2 2 2 2 2 2 2 2 2 ...
     $ referralservicecode   : Factor w/ 7 levels "12345","IPMAT ",..: 5 1 1 1 1 1 1 5 1 5 ...
     $ IsHighlight           : Factor w/ 2 levels "N","Y": 1 1 1 1 1 1 1 1 1 1 ...
     $ admittingdiagnosiscode: num  11440 11317 11420 11317 1361 ...
     $ dischargediagnosiscode: num  11440 11317 11420 11317 1361 ...
     $ RealLengthOfStay      : int  1 2 4 3 1 1 16 1 1 3 ...
     $ QNXTPCPId             : int  3212 1713678 1738430 1713671 1720569 1791640 1725962 1148 1703290 1705009 ...
     $ QNXTProgramId         : Factor w/ 2 levels "QMXHPQ0839 ",..: 1 1 1 1 1 1 1 1 1 1 ...
     $ physicalzipcode       : int  34744 33175 33844 33178 33010 33010 33897 33126 33127 33125 ...
     $ gender                : Factor w/ 2 levels "F","M": 2 1 2 1 2 2 2 1 1 2 ...
     $ ethnicitycode         : Factor w/ 1 level "No Ethnicity   ": 1 1 1 1 1 1 1 1 1 1 ...
     $ dx1                   : num  11440 11317 11420 11317 1361 ...
     $ dx2                   : num  11440 11317 11420 11317 1361 ...
     $ dx3                   : num  0 1465 0 11326 0 ...
     $ svc1                  : int  52648 27447 50040 27447 55866 55866 51595 0 99221 300616 ...
     $ svc2                  : int  76872 120 50391 120 120 38571 120 762 120 0 ...
     $ svc3                  : int  762 0 120 0 0 51999 0 0 0 762 ...
     $ Disposition           : Factor w/ 14 levels "0","DENIED- Not Medically Necessary ",..: 3 5 3 4 3 3 5 3 3 5 ...
     $ AvgIncome             : Factor w/ 10 levels "-1",">100k","0-25k",..: 6 7 5 9 3 3 6 4 3 4 ...
     $ CaseManagerNameID     : int  1 2 3 4 5 6 7 8 9 7 ...

The variable structure is the same, except that some factor variables have different levels because some variables have new values. For example, Acuity has 3 levels in the model and 4 levels in the testing data. I have no way of knowing all possible levels for all variables up front.

Any advice, please?

Thanks in advance,

David

#### Design pattern for transformation of array data in Python script

I currently have a Python class with many methods, each of which performs a transformation of time series data (primarily arrays). Each method must be called in a specific order, since the inputs of each function specifically rely on the outputs of the previous one. Hence my code structure looks like the following:

    class Algorithm:
        def __init__(self, data1, data2):
            self.data1 = data1
            self.data2 = data2

        def perform_transformation1(self):
            ...  # perform action on self.data1

        def perform_transformation2(self):
            ...  # perform action on self.data1

        # etc.

At the bottom of the script, I instantiate the class and then proceed to call each method on the instance, procedurally. Using object-oriented programming in this case seems wrong to me.

My aim is to rewrite my script so that the inputs of each method do not depend on the outputs of the preceding method, giving me the ability to decide whether or not to perform certain methods.

What design pattern should I be using for this purpose, and does this move more towards functional programming?

### QuantOverflow

#### Asset allocation problem using Hidden Markov Model

I have recently become more interested in Hidden Markov Models (HMMs) and their application to financial assets to understand their behavior. But what captured my attention the most is the use of asset regimes as information for the portfolio optimization problem. I am referring to this article. I searched many sites for code that applies an asset allocation problem based on HMM estimates, but I couldn't find any.
I am extremely interested, and I would be very grateful if you could provide any code example that applies HMMs to the asset allocation problem.

#### How to calculate this swap rate

What is the 2x5 swap rate? Here, the 2x5 swap rate refers to the 3-year swap, 2 years forward.

#### Exercise: interpretation of terms in black-scholes

I have the following exercise:

This is what I did:

\begin{align} C(K)&= e^{-r\tau} \mathbb{E}^\mathbb{Q}[(S_T - K)^+] \\ &= e^{-r\tau}\mathbb{E}^\mathbb{Q}[(S_T - K)\mathbb{1}_{S_T>K}] \\ &=e^{-r\tau}\mathbb{E}^\mathbb{Q}[S_T \mathbb{1}_{S_T>K}]-Ke^{-r\tau}\mathbb{E}^\mathbb{Q}[\mathbb{1}_{S_T>K}] \end{align}

By $\mathbb{1}$ I mean the indicator function. Now I understand that

$$\mathbb{E}^\mathbb{Q}[\mathbb{1}_{S_T>K}] = \mathbb{Q}(S_T>K)$$

however I don't know how to deal with

$$\mathbb{E}^\mathbb{Q}[S_T \mathbb{1}_{S_T>K}]$$

Is this the right way to solve this exercise?

### CompsciOverflow

#### Does word addressable memory has more bytes than byte addressable memory?

Well, my question (whether word-addressable memory has more bytes than byte-addressable memory) is derived from the fact that in word-addressable memory each address addresses a word, and in byte-addressable memory each address addresses a byte.

In the case of word-addressable memory, if the word size is 4 bytes (a 32-bit architecture, for example), do I have 4*2^32 bytes in my memory? And in the case of byte-addressable memory, will I have 2^32 bytes in my memory?

How is it possible that the same RAM, i.e. given that the RAM size is 4 GB, would contain a different number of bytes depending on whether memory is byte- or word-addressable?

1 GB of RAM has 1*1024*1024*1024 bytes in it. Say our architecture is 32-bit. So in the case of byte-addressable memory there will be 4*1024*1024*1024 virtual addresses per program, but in reality there are 1*1024*1024*1024 physical addresses, each pointing to a byte.

In the case of word-addressable memory, will there be 4*1024*1024*1024 virtual addresses per program, each pointing to a word?
So in this case, will there theoretically be 4*4*1024*1024*1024 bytes available for each program? There will be 1*1024*1024*1024 physical addresses, each pointing to a word. So are there actually 4*1*1024*1024*1024 bytes in the memory in this case?

As you can see, I'm super confused.

#### worst case of insertion sort

What is the worst case of insertion sort?

### Lobsters

#### This is the real JavaScript fatigue

### CompsciOverflow

#### Why are hash map look-ups assumed to be $O(1)$ on average

To look up a key in a hash map you have to

1. calculate its hash
2. find the entry in the resulting hash bucket

Hash calculation takes at least $O(l)$ operations when the hashes are $l$-bit numbers.

When using an index (like a binary tree) for each bucket, finding an entry within a bucket that contains $k$ entries can be done in $O(\log k)$. With $n$ being the total number of entries in the hash map and $m$ being the number of buckets, $k$ averages to $n/m$. Due to $m=2^l$ we thus get $O(\log k) = O(\log (n/m)) = O(\log n - \log m) = O(\log n - l)$.

Combining these two runtimes one gets a total look-up time of $O(l + \log n - l) = O(\log n)$, which conforms to the intuition that a lookup in a collection with $n$ entries is not possible below $O(\log n)$ operations.

In short, it is generally assumed that $l$ and $k$ are both constant with regard to $n$. But if you fix $l$ then $k$ grows with $n$.

Am I missing something here?

### QuantOverflow

#### FX Forward pricing with correlation between FX and Zero-Cupon

I would like to extend my question about FX Forward rates in a stochastic interest rate setup: FX forward with stochastic interest rates pricing

We consider an FX process $X_t = X_0 \exp( \int_0^t(r^d_s-r^f_s)ds -\frac{\sigma^2}{2}t+ \sigma W_t^2)$ where $r^d$ and $r^f$ are stochastic processes not independent of the Brownian motion $W$. The domestic risk-neutral measure is denoted by $\mathbb Q^d$.

The domestic and foreign bank accounts are $\beta^d$ and $\beta^f$ respectively.
The domestic and foreign zero-coupon bond prices of maturity $T$ at time $t$ are respectively $B_d(t,T)$ and $B_f(t,T)$.

The domestic bond follows the SDE

$$\frac{dB_d(t,T)}{B_d(t,T)} = r^d_t \ dt + \sigma(t,T) \ dW^1_t$$

with deterministic initial conditions $B_d(0,T)$. The drift and volatility functions in the SDEs are all deterministic functions, and $W^1$ and $W^2$ are standard Brownian motions such that $d\langle W^1, W^2\rangle_t =\rho \ dt$.

Now consider the domestic $\tau$-forward measure $\mathbb Q^{d,\tau}$

$$\left. \frac{d\mathbb Q^{d,\tau}}{d\mathbb Q^d}\right|_{\mathcal F_t} = \frac{B_d(t,\tau)}{\beta_t^dB_d(0,\tau)} = \mathcal E_t \left( \int_0 ^. \sigma(s,\tau) \ dW^1_s\right)$$

and the $\mathbb Q^{\tau}$-Brownian motions

$$W^{1,\tau}_. := W^1_. + \int_0 ^. \sigma(s,\tau) \ ds$$

$$W^{2,\tau}_. := W^2_. + \rho \int_0 ^. \sigma(s,\tau) \ ds$$

Question

I would like to calculate the non-deliverable FX forward rate. Since the fixing date $t_f$ is such that $t_f< T$, where $T$ is the settlement date, this requires computing the following expectation:

$$\mathbb E^{\mathbb Q^d} _t \left[ \exp(-\int_t^T r^d_s ~ds)\ X_{t_f}\right]$$

I can get to this point:

$$\mathbb E^{\mathbb Q^d} _t \left[ \exp(-\int_t^T r^d_s ~ds)\ X_{t_f}\right]= B_d(t,t_f)\ \mathbb E^{\mathbb Q^{d,t_f}}_t\left[ B_d(t_f,T)X_{t_f}\right].$$

From that point I am struggling to get through all the calculations. Is there a smart way to compute the last expectation?

### CompsciOverflow

#### Is language L in R||RE/R||CO-RE/R||not in CO-RE or RE - any intuition/tips?

I have a test in computational models coming this Sunday, and it seems that no matter how many questions of the type "is this language in R or RE or CoRE or not in CoRE or RE" I solve, I always manage to get it wrong somehow.

Like, I always get bad intuition about what's going on, and once I've already decided that it belongs to one of those, it's pretty easy to create a reduction (obviously an incorrect one) and get it all wrong.
Are there any tips/intuition I can get about how to decide ahead of time whether or not a language is decidable/recognizable?

### StackOverflow

#### Finding the optimal combination of algorithms in an sklearn machine learning toolchain

In sklearn it is possible to create a pipeline to optimize the complete tool chain of a machine learning setup, as shown in the following sample:

    from sklearn.pipeline import Pipeline
    from sklearn.svm import SVC
    from sklearn.decomposition import PCA
    estimators = [('reduce_dim', PCA()), ('svm', SVC())]
    clf = Pipeline(estimators)

Now a pipeline represents by definition a serial process. But what if I want to compare different algorithms on the same level of a pipeline? Say I want to try another feature transformation algorithm in addition to PCA and another machine learning algorithm such as trees in addition to SVM, and get the best of the 4 possible combinations. Can this be represented by some kind of parallel pipe, or is there a meta algorithm for this in sklearn?

### CompsciOverflow

#### Given a set of numbers (negative or positive), and a maximum weight w, find a subset that is maximal whose sum is less than w

The aim of this problem is to find a subset (need not be consecutive) of a given set such that the sum is maximal and less than some given number $w$. (Note, we are trying to find a subset whose sum is less than or equal to $w$, and not closest to $w$.)

For example, given a set $\{1, 3, 5, 9, 10\}$ and maximum weight 17, the maximal subset is $\{3, 5, 9\}$ since its sum is exactly 17.

Another example: given a set $\{1, 3, 4, 9\}$ and maximum weight 15, the maximal subset is $\{1, 4, 9\}$ since its sum is 14, and there are no other subsets whose sum is 15.

Example with both positive and negative numbers: given a set $\{-3, 2, 4\}$ and maximum weight 3, the subset is the set itself since -3 + 2 + 4 = 3.
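The three examples above can be checked with a small exhaustive search. This is only the brute-force baseline over all subsets, written to pin down the problem statement, not an efficient algorithm; the function name `best_subset` is made up for the sketch.

```python
from itertools import combinations


def best_subset(numbers, w):
    """Exhaustively find a subset with the largest sum not exceeding w.

    Returns (subset, total); subset is None if no subset qualifies.
    """
    best, best_sum = None, None
    for size in range(len(numbers) + 1):
        for combo in combinations(numbers, size):
            s = sum(combo)
            if s <= w and (best_sum is None or s > best_sum):
                best, best_sum = combo, s
    return best, best_sum


print(best_subset([1, 3, 5, 9, 10], 17))
print(best_subset([1, 3, 4, 9], 15))
print(best_subset([-3, 2, 4], 3))
```

Because it enumerates every subset, the search takes exponential time and is useful only as a correctness reference for small inputs.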
I know how to solve it with only positive numbers, but I am struggling to find an algorithm for the general case with both positive and negative numbers. Obviously, my goal is not to use the brute-force approach and check every possible subset, since the complexity would be $O(n2^n)$.

I stumbled upon an idea in another post that suggested adding a sufficiently large number to every element in the set and subsequently changing the maximum weight. That is, given a set $R = \{ a_1, a_2, ... , a_n \}$, we add some number $X$ (we can pick some number greater than or equal to the absolute value of the smallest negative number) to get a set that looks like $\{ a_1 + X, a_2 + X, ... , a_n + X \}$ and change the maximum weight to $nX + w$, where $w$ was the original weight. Now we have reduced the problem to only non-negative numbers. However, I could not see a way to actually find the subset that was closest to the original weight, but only whether any elements add up to exactly the original weight (i.e., there is no way to actually find the subset, but only to determine that some subset exists).

Is there any other clever trick like this one to solve the problem for both positive and negative numbers? Any help would be thoroughly appreciated.

### Lobsters

#### A Sensible Intro to FRP

### StackOverflow

#### Does a data structure like this exist?

I'm searching for a data structure that can be sorted as fast as a plain list and that allows removing elements in the following way. Let's say we have a list like this:

    [{2,[1]}, {6,[2,1]}, {-4,[3,2,1]}, {-2,[4,3,2,1]}, {-4,[5,4,3,2,1]},
     {4,[2]}, {-6,[3,2]}, {-4,[4,3,2]}, {-6,[5,4,3,2]},
     {-10,[3]}, {18,[4,3]}, {-10,[5,4,3]},
     {2,[4]}, {0,[5,4]},
     {-2,[5]}]

i.e. a list containing tuples (this is Erlang syntax). Each tuple contains a number and a list that includes the members of a list used to compute the preceding number. What I want to do with the list is the following.
First, sort it, then take the head of the list, and finally clean the list. By clean I mean removing all the elements from the tail that contain elements that are in the head, or, in other words, all the elements from the tail whose intersection with the head is not empty. For example, after sorting, the head is {18,[4,3]}. The next step is removing all the elements of the list that contain 4 or 3, i.e. the resulting list should be this one:

    [{6,[2,1]}, {4,[2]}, {2,[1]}, {-2,[5]}]

The process continues by taking the new head and cleaning again until the whole list is consumed. Note that if the clean process preserves the order, there is no need to re-sort the list at each iteration.

The bottleneck here is the clean process. I would need some structure that allows me to do the cleaning faster than now.

Does anyone know a structure that allows doing this efficiently without losing the order, or that at least allows fast sorting?

#### Using nested reduce in Swift

I have an array which contains arrays of Double, like in the screenshot:

My goal is to get the sum of the products of the Double elements of each array. That is, I want to multiply all elements of each array; in my case this gives 3 values, and then I want the sum of them.

I want to use reduce, flatMap, or any elegant solution.

What have I tried?

    totalCombinations.reduce(0.0) { $0 + ($1[0] * $1[1] * $1[2]) }

But this works only when I know the size of the arrays that contain the doubles.

### CompsciOverflow

#### Should a pure function take all of the functions it calls as arguments?

According to Wikipedia, if a function is pure, then:

> [it] always evaluates the same result value given the same argument value(s).

So: if a function, let's call it f, calls another function, g, then its behavior clearly depends on the structure of the function g. Should f take g as an argument in order to be pure?
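To make the two options in this question concrete, here is a small sketch of both styles. The names `f_closed` and `f_injected` are invented for illustration, and the sketch takes no position on which style a definition of purity requires.

```python
def g(x):
    return x * 2


def f_closed(x):
    # Refers to the module-level g directly.
    return g(x) + 1


def f_injected(x, g_fn):
    # Receives the function it calls as an explicit argument.
    return g_fn(x) + 1


print(f_closed(3), f_injected(3, g))
```

As long as `g` is never rebound, both variants return the same value for the same `x`; the injected form merely makes the dependency on `g` visible in the signature.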
### DataTau

#### Understanding Gaussian Processes with 3D shapes

### TheoryOverflow

#### Irreducible languages

This is not necessarily a research question, just a question out of curiosity:

I am trying to understand if one can define "irreducible" languages. As a first guess I call a language L "reducible" if it can be written as $L = A \cdot B$ with $A \cap B = \emptyset$ and $|A|,|B|>1$; otherwise I call the language "irreducible". Is it true that:

1) If P is irreducible and A, B, C are languages such that $A\cap B = \emptyset$, $P \cap C = \emptyset$ and $A\cdot B = C\cdot P$, then there exists a language $B'$ with $B' \cap P = \emptyset$ such that $B = B'\cdot P$?

This would correspond in the integers to Euclid's lemma and would be useful for proving uniqueness of "factorization".

2) Is it true that every language can be factored into a finite number of irreducible languages?

If someone has a better idea of how to define an "irreducible" language, I would like to hear it. (Or is there maybe already a definition of this, which I am unaware of?)

### StackOverflow

#### Practical use of K-combinator (Kestrel) in javascript

The K-combinator can be implemented as below, and the implementation should not have any side effects.

    const K = x => y => x;

When is it useful? Please help me with practical examples.

### CompsciOverflow

#### Topological Sort without modifying the graph or marking edges

I have a DAG which I want to traverse in topological order. Wikipedia describes two algorithms for topological sorting, which both work in theory but seem impractical to me from a design point of view: Kahn's algorithm modifies the graph (by removing edges), and the DFS-based one marks nodes, which would require me to modify my node classes (by adding a boolean field) and is furthermore not thread safe.

Are there more practical approaches that preserve the asymptotic runtime but do not interfere with my business logic so much?

### QuantOverflow

#### Where can I find API access to historical options data? Paid or free?
I'm looking for a company or website that provides API access to historical options data. I would prefer a provider that has a Python module to access the API. Any leads would be appreciated.

### UnixOverflow

#### Where is the __sysctl function defined in FreeBSD?

I am reading the source code to understand sysctl in FreeBSD. It looks like the most important function

int __sysctl(const int *name, u_int namelen, void *oldp, size_t *oldlenp, const void *newp, size_t newlen);

is not defined in lib/libc/gen/sysctl.c. I tried to grep over FreeBSD's source code, but I failed to find the definition of __sysctl. Where is it defined?

### Lobsters

#### Scheduling Your Kubernetes Pods With Elixir

### Fred Wilson

#### The AI Nexus Lab

In Matt Turck’s recent blog post about the state of NYC’s tech sector, he wrote:

The New York data and AI community, in particular, keeps getting stronger. Facebook’s AI department is anchored in New York by Yann LeCun, one of the fathers of deep learning. IBM Watson’s global headquarter is in NYC. When Slack decided to ramp up its effort in data, it hired NYC-based Noah Weiss, former VP of Product at Foursquare, to head its Search Learning and Intelligence Group. NYU has a strong Center for Data Science (also started by LeCun). Ron Brachman, the new director of the Technion-Cornell Institute, is an internationally recognized authority on artificial intelligence. Columbia has a Data Science Institute. NYC has many data startups, prominent data scientists and great communities (such as our very own Data Driven NYC!).

And now NYC has our very own AI accelerator program based at NYU’s Tandon Engineering School Accelerator, called The AI Nexus Lab.

The 4 month program will immerse early stage AI companies from around the world with NYU AI resources, computing resources at the Data Future Lab, two full time technical staff members, and a student fellow for each company.
Unlike a traditional accelerator, they are recruiting only 5 companies with the goal of market entry and sustainability for all 5. They won’t have a Demo Day; the program will end with a day-long AI conference celebrating AI entrepreneurs, researchers, innovators and funders, during which they will announce the 5 companies. Companies will get a net $75,000 for joining the program.

If you have an early stage AI company and want to join this program, you can apply here.

### StackOverflow

#### Creating embeddings and training on embeddings using a bigram LSTM model in Tensorflow

I'm having trouble figuring out how to create and train on bigram embeddings for LSTMs in Tensorflow.

We are initially given that train_data is a Tensor of shape (num_unrollings, batch_size, 27), i.e. num_unrollings is the total number of batches, batch_size is the size of each batch, and 27 is the size of the one-hot-encoded vector for the characters "a" to "z" plus " ".

The LSTM takes as input a single batch at each time step i.e. it takes in a Tensor of shape (batch_size, 27)

characters() is a function that takes in a Tensor of shape 27 and returns the most likely character that it represents from the one-hot-encodings.

What I have done so far is create an index lookup for each bigram. We have a total of 27*27 = 729 bigrams (because I include the " " character). I chose to represent each bigram by a vector of log2(729) ≈ 10 bits.

In the end, I am trying to make my input to the LSTM a Tensor of shape (batch_size / 2, 10) so that I can train on the bigrams.

Here is the relevant code:

batch_size=64
num_unrollings=10
num_embeddings = 729
embedding_size = 10

bigram2id = dict()

key = ""

# build dictionary of bigrams and their respective indices:
for i in range(ord('z') - ord('a') + 2):
    key = chr(97 + i)
    if (i == 26):
        key = " "
    for j in range(ord('z')- ord('a') + 2):
        if j == 26:
            bigram2id[key + " "] = i*27 + j
            continue
        bigram2id[key + chr(97 + j)] = i*27 + j

graph = tf.Graph()

with graph.as_default():

    # embeddings
    embeddings = tf.Variable(tf.random_uniform([num_embeddings, embedding_size], -1.0, 1.0), trainable=False)

    """
    1) load the training data as we would normally
    2) look up the embeddings of the data then from there get the inputs and the labels
    3) train
    """

    # load training data, labels for both unembedded and embedded data
    train_data = list()
    embedded_train_data = list()
    for _ in range(num_unrollings + 1):
        train_data.append(tf.placeholder(tf.float32, shape=[batch_size,vocabulary_size]))
        embedded_train_data.append(tf.placeholder(tf.float32, shape=[batch_size / 2, embedding_size]))

    # look up embeddings for training data and labels (make sure to set trainable=False)
    for batch_ctr in range(num_unrollings + 1):
        for bigram_ctr in range((batch_size // 2) + 1):
            # get current bigram
            current_bigram = characters(train_data[batch_ctr][bigram_ctr*2]) + characters(train_data[batch_ctr][bigram_ctr*2 + 1])
            # look up id
            current_bigram_id = bigram2id[current_bigram]
            # look up embedding
            embedded_bigram = tf.nn.embedding_lookup(embeddings, embedded_bigram)
            embedded_train_data[batch_ctr][bigram_ctr].append(embedded_bigram)


But right now I am getting the error "Shape (64, 27) must be of rank 1", and even if I fix that, I am not sure whether I am taking the right approach.
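The bigram indexing described above can be made concrete without TensorFlow. The following is only a sketch: the names `ALPHABET`, `bigram_id` and `text_to_bigram_ids` are hypothetical and not part of the question's code, and it assumes non-overlapping bigrams with the space character as the 27th symbol.

```python
import math

ALPHABET = "abcdefghijklmnopqrstuvwxyz "  # 27 symbols, space last (assumed ordering)


def bigram_id(first, second):
    """Map a two-character bigram to an integer in [0, 729)."""
    return ALPHABET.index(first) * len(ALPHABET) + ALPHABET.index(second)


def text_to_bigram_ids(text):
    """Split text into non-overlapping bigrams and return their integer ids."""
    usable = len(text) - len(text) % 2  # drop a trailing odd character
    return [bigram_id(text[i], text[i + 1]) for i in range(0, usable, 2)]


num_bigrams = len(ALPHABET) ** 2
bits_needed = math.ceil(math.log2(num_bigrams))  # smallest bit width that can index every bigram
print(num_bigrams, bits_needed)
print(text_to_bigram_ids("ab z"))
```

Integer ids of this kind are what an embedding lookup expects as indices, so the conversion from characters to ids would normally happen on the data side, before anything is fed to the graph.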

### CompsciOverflow

#### Is zero allowed as an edge's weight, in a weighted graph?

I am trying to write a script that generates random graphs, and I need to know whether an edge in a weighted graph can have a weight of 0.

Actually, it makes sense that 0 could be used as an edge's weight, but I've been working with graphs for the last few days and I have never seen an example of it.
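As an illustration only, here is a sketch of a random directed graph generator that allows weight 0, together with Dijkstra's algorithm, which requires non-negative weights and therefore accepts zero. The function names and the adjacency-list layout are assumptions made for this example.

```python
import heapq
import random


def random_weighted_graph(num_nodes, num_edges, max_weight, seed=0):
    """Return {node: [(neighbor, weight), ...]} with integer weights in 0..max_weight."""
    rng = random.Random(seed)
    graph = {node: [] for node in range(num_nodes)}
    for _ in range(num_edges):
        u, v = rng.sample(range(num_nodes), 2)  # two distinct endpoints
        graph[u].append((v, rng.randint(0, max_weight)))  # weight 0 is allowed
    return graph


def dijkstra(graph, source):
    """Shortest distances from source; valid for any non-negative weights."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u]:
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist


tiny = {0: [(1, 0)], 1: [(2, 3)], 2: []}  # edge 0 -> 1 has weight zero
print(dijkstra(tiny, 0))
```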

### StackOverflow

#### How to write the precision value in report for multi-class classification?

In my dataset I have multiple classes. I have calculated precision and recall for each class separately using a confusion matrix. How should I get one value to write in my report? Should I use a weighted average, or is there another solution? In the case of a weighted average, how do I determine the weights? Please help.
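For reference, three common ways of collapsing per-class precision into one number are the macro average (unweighted mean), the weighted average (weights are the class supports, i.e. the number of true samples per class) and the micro average (pooled counts). The sketch below assumes a confusion matrix with true classes in rows and predictions in columns; the matrix values are made up for illustration.

```python
def per_class_precision(confusion):
    """confusion[i][j] = number of samples of true class i predicted as class j."""
    n = len(confusion)
    precisions = []
    for j in range(n):
        predicted_j = sum(confusion[i][j] for i in range(n))  # column sum
        precisions.append(confusion[j][j] / predicted_j if predicted_j else 0.0)
    return precisions


def macro_precision(confusion):
    """Unweighted mean of the per-class precisions."""
    p = per_class_precision(confusion)
    return sum(p) / len(p)


def weighted_precision(confusion):
    """Mean of per-class precisions, weighted by support (row sums)."""
    p = per_class_precision(confusion)
    support = [sum(row) for row in confusion]
    return sum(pi * si for pi, si in zip(p, support)) / sum(support)


def micro_precision(confusion):
    """Pooled precision; for single-label problems this equals accuracy."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / sum(sum(row) for row in confusion)


cm = [[8, 2, 0], [1, 5, 4], [0, 1, 29]]  # hypothetical 3-class example
print(macro_precision(cm), weighted_precision(cm), micro_precision(cm))
```

Whichever variant goes into a report, it helps to name it explicitly, since the three can differ noticeably on imbalanced data.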

### QuantOverflow

#### One Way CSA Agreements

This is probably an older topic but I don't seem to find any related threads on this forum.

What is the best way to value, let's say, a vanilla IR swap (you receive fixed) that you trade against a sovereign, where you are on a 1-way cash CSA agreement? (i.e. you post collateral when PV is negative but not vice versa).

1. How good an approximation to fair value do you get when you just discount the swap at your cost of funds?

2. a) Or is it better to discount it at some other reference rate (what's a good reference rate?)

and

b) calculate and include a separate valuation adjustment, where the positive and negative PV paths are simulated separately, with the positive MTMs discounted at the reference rate and the negative MTMs discounted at your cost of funds?

3. For the purpose of 2), is it better to value a swap synthetically as short caplets discounted at OIS and long floorlets discounted at your cost of funds? Can you think of any issues when it is done this way? This obviously won't work for exotic derivatives.

Many thanks!

### StackOverflow

#### Tuning two parameters for random forest in Caret package

When I only used the mtry parameter in the tuning grid, it worked, but when I added the ntree parameter, I got the error: Error in train.default(x, y, weights = w, ...): The tuning parameter grid should have columns mtry. The code is below:

require(RCurl)
require(prettyR)
library(caret)
url <- "https://raw.githubusercontent.com/gastonstat/CreditScoring/master/CleanCreditScoring.csv"
# getURL() returns the file as one string, so parse it into a data frame
cs_data <- read.csv(text = getURL(url))
classes <- cs_data[, "Status"]
predictors <- cs_data[, -match(c("Status", "Seniority", "Time", "Age", "Expenses",
                                 "Income", "Assets", "Debt", "Amount", "Price", "Finrat", "Savings"), colnames(cs_data))]

train_set <- createDataPartition(classes, p = 0.8, list = FALSE)
set.seed(123)

cs_data_train = cs_data[train_set, ]
cs_data_test = cs_data[-train_set, ]

# Define the tuned parameter
grid <- expand.grid(mtry = seq(4,16,4), ntree = c(700, 1000,2000) )

ctrl <- trainControl(method = "cv", number = 10, summaryFunction = twoClassSummary,classProbs = TRUE)

rf_fit <- train(Status ~ ., data = cs_data_train,
                method = "rf",
                preProcess = c("center", "scale"),
                tuneGrid = grid,
                trControl = ctrl,
                family= "binomial",
                metric= "ROC" #define which metric to optimize metric='RMSE'
                )
rf_fit


### QuantOverflow

#### How are HFT systems implemented on FPGA nowadays?

Vendors like Cisco claim they have achieved the same results with high performance NIC's (http://www.cisco.com/c/dam/en/us/products/collateral/switches/nexus-3000-series-switches/white_paper_c11-716030.pdf).

My question is, what part of HFT systems are mostly implemented on FPGAs nowadays? Are FPGAs still very popular? Is only the feed handler implemented on the FPGAs? Because some of these systems described above only have a feed handler implemented on the FPGA, because the strategy changes too much, or is too hard to implement on FPGAs. Others claim that they have also implemented trading strategies on FPGAs or using high performance NICs instead of FPGAs to build HFT systems. I've read about different approaches but I find it hard to compare as most of the results are tested on different input sets.

#### What are some quantitative trading strategies used by high-frequency trading companies to make a killing on a market crash day on 24Aug2015?

Virtu Financial (VIRT), the high-speed trading firm that went public earlier this year, was one of the few stocks on the market to log gains on Monday while the S&P 500 dropped nearly 4%. Indeed, Virtu, which claims not to have posted a daily loss in years, just had one of its most profitable trading days in history.

### StackOverflow

#### Train model using queue Tensorflow

I designed a neural network in TensorFlow for my regression problem by following and adapting the TensorFlow tutorial. However, due to the structure of my problem (~300,000 data points and use of the costly FTRLOptimizer), it took too long to execute even on my 32-CPU machine (I don't have GPUs).

According to this comment and a quick confirmation via htop, it appears that I have some single-threaded operations, and the culprit should be feed_dict.

Therefore, as advised here, I tried to use queues to multi-thread my program.

I wrote a simple code file with a queue to train a model, as follows:

import numpy as np
import tensorflow as tf

#Function for enqueueing in parallel my data
sess.run(enqueue_op, feed_dict={x_batch_enqueue: x, y_batch_enqueue: y})

#Set the number of couples (x, y) I use for "training" my model
BATCH_SIZE = 5

#Generate my data where y=x+1+little_noise
x = np.random.randn(10, 1).astype('float32')
y = x+1+np.random.randn(10, 1)/100

#Create the variables for my model y = x*W+b, then W and b should both converge to 1.
W = tf.get_variable('W', shape=[1, 1], dtype='float32')
b = tf.get_variable('b', shape=[1, 1], dtype='float32')

#Prepare the placeholdeers for enqueueing
x_batch_enqueue = tf.placeholder(tf.float32, shape=[None, 1])
y_batch_enqueue = tf.placeholder(tf.float32, shape=[None, 1])

#Create the queue
q = tf.RandomShuffleQueue(capacity=2**20, min_after_dequeue=BATCH_SIZE, dtypes=[tf.float32, tf.float32], seed=12, shapes=[[1], [1]])

#Enqueue operation
enqueue_op = q.enqueue_many([x_batch_enqueue, y_batch_enqueue])

#Dequeue operation
x_batch, y_batch = q.dequeue_many(BATCH_SIZE)

#Prediction with linear model + bias

#MAE cost function
cost = tf.reduce_mean(tf.abs(y_batch-y_pred))

learning_rate = 1e-3
init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)

#Feed the queue

#Train the model
for epoch in range(1000):
    sess.run(train_op)
    print(sess.run(cost))
Wf=sess.run(W)
bf=sess.run(b)


This code doesn't work because each time I call x_batch, one y_batch is also dequeued, and vice versa. As a result, I do not compare the features with the corresponding "result".

Is there an easy way to avoid this problem?

#### Recommendations without ratings (Azure ML)

I'm trying to build an experiment to create recommendations (using the Movie Ratings sample database), but without using the ratings. I simply consider that if a user has rated certain movies, then he would be interested in other movies that have been rated by users who have also rated his movies.

I can consider, for instance, that ratings are 1 (exists in the database) or 0 (does not exist), but in that case, how do I transform the initial data to reflect this?

I couldn't find any kind of examples or tutorials about this kind of scenario, and I don't really know how to proceed. Should I transform the data before injecting it into an algorithm? And/or is there any kind of specific algorithm that I should use?
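To make the "1 if rated, 0 otherwise" idea concrete, here is a plain Python sketch (not Azure ML). The sample rows and the function names are hypothetical. It discards the rating value and scores unseen movies by how many co-rated movies the user shares with each other user, which is one simple co-occurrence heuristic among many.

```python
from collections import defaultdict

# (user, movie, rating) rows; the rating value is deliberately ignored below
ratings = [
    ("u1", "m1", 4.0), ("u1", "m2", 2.5),
    ("u2", "m1", 5.0), ("u2", "m3", 3.0),
    ("u3", "m2", 1.0), ("u3", "m3", 4.5),
]


def to_implicit(rows):
    """Drop the rating value: user -> set of rated movies (present = 1, absent = 0)."""
    seen = defaultdict(set)
    for user, movie, _rating in rows:
        seen[user].add(movie)
    return seen


def recommend(seen, user):
    """Rank movies the user has not rated by overlap with users who rated them."""
    scores = defaultdict(int)
    for other, movies in seen.items():
        if other == user:
            continue
        overlap = len(seen[user] & movies)
        if overlap == 0:
            continue  # no shared movies, so this user tells us nothing
        for movie in movies - seen[user]:
            scores[movie] += overlap
    return sorted(scores, key=lambda m: (-scores[m], m))


seen = to_implicit(ratings)
print(recommend(seen, "u1"))
```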

#### How to calculate the TF and IDF for each word in each document in corpus

I have the following scenario: 1000K documents, and I am trying to:

Calculate only the TF for each word in each document, which means: list[document][word] = TF

Calculate only the IDF for each word in each document, which means: list[document][word] = IDF

I am trying to use TfidfVectorizer, but I think the results I am getting are wrong.

from sklearn.feature_extraction.text import TfidfVectorizer
import os

inputDir = "/Users/1000K_files/"
dirArray = os.listdir(inputDir)
files_content = []

for filename in dirArray:
    inputpath = os.path.join(inputDir, filename)
    inputContent = open(inputpath)
    file_data = inputContent.read()  # whole document as a single string
    inputContent.close()
    files_content.append(file_data)

tf = TfidfVectorizer(analyzer='word', use_idf=1, smooth_idf=1,    min_df=1, stop_words=None, norm=None, strip_accents='unicode',  sublinear_tf=1, ngram_range=(1, 1), max_features=None, token_pattern=r'[a-z]+')
tfidf_matrix = tf.fit_transform(files_content)
feature_names = tf.get_feature_names()

# Trying to extract the IDF values only
z = zip(feature_names, tf.idf_)
words_idf = dict()
z_len = len(z)
for index_documents in range(0, z_len, 1):
    words_idf[z[index_documents][0]] = z[index_documents][1]

# Trying to extract the TFIDF values corpus wise.
words_tfidf = dict()
cx = tfidf_matrix.tocoo()
df = zip(cx.col, cx.data)
for key, val in df:
    words_tfidf[feature_names[key]] = val

# Trying to extract the TF values only
words_tf = dict()
cx = tfidf_matrix.tocoo()
df = zip(cx.row, cx.col, cx.data)
for row, key, val in df:
    key = feature_names[key]
    if not words_tf.has_key(row):
        words_tf[row] = {}
    if words_tf[row].has_key(key):
        words_tf[row][key] = words_tf[row][key] + val
    else:
        words_tf[row][key] = val


I am guessing my methods to do so are wrong. What am I missing here?

Thanks.
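As a point of comparison, raw TF and a textbook IDF can be computed directly in plain Python. This is a sketch under assumptions: the tokenizer mirrors the `[a-z]+` pattern from the question, IDF is taken as ln(N / df), and the three sample documents are invented. Note that TfidfVectorizer with smooth_idf uses ln((1 + N) / (1 + df)) + 1 and, with sublinear_tf, replaces tf by 1 + ln(tf), so its numbers will not match these.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of the letters a-z."""
    return re.findall(r"[a-z]+", text.lower())


def term_frequencies(documents):
    """One dict per document: word -> raw count in that document."""
    return [dict(Counter(tokenize(doc))) for doc in documents]


def inverse_document_frequencies(documents):
    """word -> ln(N / df), where df is the number of documents containing the word."""
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(tokenize(doc)))  # count each word once per document
    return {word: math.log(n_docs / df) for word, df in doc_freq.items()}


docs = ["the cat sat", "the dog sat down", "the bird"]
tf_per_doc = term_frequencies(docs)
idf = inverse_document_frequencies(docs)
print(tf_per_doc[1])
print(idf["cat"], idf["sat"], idf["the"])
```

Since IDF is a property of the word across the corpus, a single word -> IDF dictionary is enough; it does not need to be stored per document.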

#### How to apply RNN to sequence to sequence NLP task?

I'm quite confused about sequence to sequence RNNs on NLP tasks. Previously, I have implemented some neural models for classification tasks. In those tasks, the models take word embeddings as input and use a softmax layer at the end of the network to do classification. But how do neural models do seq2seq tasks? If the input is a word embedding, then what is the output of the neural model? Examples of these tasks include question answering, dialogue systems and machine translation.

### QuantOverflow

#### How are Quandl monthly S&P500 earnings estimates derived?

Can someone explain how the monthly earnings estimates are derived for S&P500? Quandl sources multpl.com, who state:

Yields following March 2015 (including current yield) are estimated based on 12 month earnings through March 2015 — the latest reported by S&P.


Is there a release schedule for S&P500 (ttm) earnings somewhere?

I would like to be able to manually derive and match the Quandl estimates on the first of each month.

Also, how stable are the Quandl estimates? That is to say, if I use some historical record, but try to replicate each estimate on the first of each month, are they often revised such that the 1st of the month estimates would not reflect the historical values?

#### What are some options to execute ML algos against live data using C#, F# or Python for a retail trader?

I'm a retail algorithmic trader. I've written some algorithms that parse intraday movements and make decisions. I still execute trades manually but eventually I need the ability to execute trades on the fly.

I need a platform where I can implement these algos in C#, F#, Java or Python against live data feeds to flag the situations.

Obviously, TD Ameritrade has Thinkscript but it's not really what I need. I need to be able to use regular programming languages against live data.

Any ideas?

#### EuroBSDCon 2016 schedule has been released

The EuroBSDCon 2016 talks and schedule have been released, and oh are we in for a treat!

All three major BSD's have a "how we made the network go fast" talk, nearly every single timeslot has a networking related talk, and most of the non-networking talks look fantastic as well.

The OpenBSD related talks are:
• Embracing the BSD routing table - mpi@
• rc.d(8) on OpenBSD - ajacoutot@
• OpenBSD meets 802.11n - stsp@
• OpenBSD: pf+rdomains create splendid multi-tenancy firewalls - Philipp Buehler (formerly known as pb@)
• Dropping in 80Gbits (hopefully) of stateful firewalling capacity with PF and OpenOSPFd - Gareth Llewellyn
• What we learnt from natively building packages on exotic archs - landry@
• Bidirectional Forwarding Detection (BFD) implementation and support in OpenBSD - phessler@
• Retrofitting privsep into ports tools - espie@
• Why and how you ought to keep multibyte character support simple - ingo@
And an OpenBSD related tutorial is:
• OpenBSD: Building a test-environment for multi-tenancy firewalls - Philipp Buehler

### QuantOverflow

#### CSA discounting vs OIS discounting

In the fixed income literature, is the CSA discounting the same as OIS discounting? Seems they're referring to the same thing, but couldn't find an explicit statement confirming it.

### StackOverflow

#### How to do multiclass classification properly with NLTK?

So, I'm trying to do multiclass text classification. I have been reading a lot of old questions and blog posts, but I still can't fully understand the concept.

I tried some example from this blog post as well. http://www.laurentluce.com/posts/twitter-sentiment-analysis-using-python-and-nltk/

But when it comes to multiclass classification, I don't quite understand it. Let's say I want to classify text into multiple languages: French, English, Italian and German. And I want to use Naive Bayes, which I think would be the easiest to start with. From what I have read in the old questions, the simplest solution would be to use one-vs-all, so each language will have its own model. So, I would have 3 models for French, English and Italian. Then I would run a text against every model and check which one has the highest probability. Am I correct?

But when it comes to coding, in the example above he has tweets like this, which will be classified as either positive or negative.

pos_tweets = [('I love this car', 'positive'),
              ('This view is amazing', 'positive'),
              ('I feel great this morning', 'positive'),
              ('I am so excited about tonight\'s concert', 'positive'),
              ('He is my best friend', 'positive')]

neg_tweets = [('I do not like this car', 'negative'),
              ('This view is horrible', 'negative'),
              ('I feel tired this morning', 'negative'),
              ('I am not looking forward to tonight\'s concert', 'negative'),
              ('He is my enemy', 'negative')]


Each one is either positive or negative. So, when it comes to training one model for French, how should I tag the text? Would it be like this? So this would be the positive?

[('Bon jour', 'French'),
 ('je m\'appelle', 'French')]


And the negative would be

[('Hello', 'English'),
('My name', 'English')]


But would this mean I could just add Italian and German and have just one model for 4 languages? Or do I not really need the negative?

So, the question is: what's the right approach to multiclass classification with NLTK?
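A Naive Bayes classifier is not limited to two labels, so a single model with one label per language is possible; NLTK's NaiveBayesClassifier.train likewise accepts (featureset, label) pairs with any number of distinct labels. The sketch below is a from-scratch illustration in plain Python rather than NLTK code: the training sentences, the character-trigram features and all function names are assumptions chosen for the example.

```python
import math
from collections import Counter, defaultdict

train = [
    ("bonjour je m'appelle marie", "French"),
    ("hello my name is mary", "English"),
    ("ciao mi chiamo maria", "Italian"),
    ("hallo ich heisse maria", "German"),
]


def features(text):
    """Character trigrams of the padded, lowercased text (a hypothetical feature choice)."""
    padded = " " + text.lower() + " "
    return [padded[i:i + 3] for i in range(len(padded) - 2)]


def train_nb(samples):
    """Count features per label; one model covers all labels."""
    counts = defaultdict(Counter)  # label -> feature counts
    label_counts = Counter()
    for text, label in samples:
        label_counts[label] += 1
        counts[label].update(features(text))
    vocab = {f for c in counts.values() for f in c}
    return counts, label_counts, vocab


def classify(model, text):
    """Return the label with the highest log posterior (add-one smoothing)."""
    counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total)
        denom = sum(counts[label].values()) + len(vocab)
        for f in features(text):
            score += math.log((counts[label][f] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train_nb(train)
print(classify(model, "je m'appelle jean"))
```

With this layout there is no separate "negative" set: every other language's examples implicitly play that role for a given label.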

### QuantOverflow

#### Finding optimal drift, importance sampling, least square monte carlo

I am working with importance sampling for Least-Squares Monte Carlo and am having trouble understanding the implementation of the Robbins-Monro algorithm for finding the optimal drift, i.e. the drift that minimizes the variance of my estimate. The original problem formulation, which is now answered, is given here.

The article I am following for the Robbins-Monro algorithm is this link

The problem I want to solve is to find an optimal drift $\theta^*$ by solving:

$H(\theta^*)=\min_{\theta}H(\theta)$

where $H(\theta)=\mathbb{E}\left[ G^2(Z)e^{-\theta Z+\frac{1}{2}\theta^2}\right]$ is the second moment for the payoff function $G(Z)=\max(K-S(t),0)$. At the minimum, we have $\nabla H(\theta^*)=0$.

Now, following the Robbins-Monro algorithm in the link, the general formulation of the stochastic algorithm is given in equation (10):

$X_{n+1}=X_n-\gamma_{n+1}F(X_n,Z_{n+1})$

and, going further to equation (15), the gradient of the second moment $H(\theta)$ is given by:

$h(\theta)=\nabla H(\theta)=\mathbb{E}\left[(\theta-Z)G^2(Z)e^{-\theta Z+\frac{1}{2}\theta^2}\right]$.

Now I wonder: since I don't know the second moment, how should I approximate it numerically in order to evaluate the algorithm? The article doesn't really explain how the second moment is found.

I would appreciate any help. Thank you!
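One way to read the recursion above is that the expectation $h(\theta)$ is never evaluated: each step plugs in a single draw $Z_{n+1}$ and uses $F(\theta, Z) = (\theta - Z)G^2(Z)e^{-\theta Z+\frac{1}{2}\theta^2}$ as a noisy gradient. The sketch below follows that reading. It is not the article's implementation: the geometric Brownian motion terminal price, the parameter values, the step sizes $\gamma_n = 1/n$ and the clipping of $\theta$ to a bounded interval are all assumptions made for illustration.

```python
import math
import random


def payoff(z, s0=1.0, strike=1.0, rate=0.0, sigma=0.2, t=1.0):
    """Put payoff G(Z) = max(K - S(t), 0), with S(t) driven by one normal draw Z."""
    s_t = s0 * math.exp((rate - 0.5 * sigma ** 2) * t + sigma * math.sqrt(t) * z)
    return max(strike - s_t, 0.0)


def gradient_sample(theta, z):
    """One-sample estimate F(theta, Z) of h(theta) = grad H(theta)."""
    return (theta - z) * payoff(z) ** 2 * math.exp(-theta * z + 0.5 * theta ** 2)


def robbins_monro(n_steps=20000, theta0=0.0, bound=5.0, seed=0):
    """Iterate theta_{n+1} = theta_n - gamma_{n+1} * F(theta_n, Z_{n+1})."""
    rng = random.Random(seed)
    theta = theta0
    for n in range(n_steps):
        z = rng.gauss(0.0, 1.0)  # Z_{n+1} ~ N(0, 1)
        gamma = 1.0 / (n + 1)    # steps sum to infinity, squares are summable
        theta -= gamma * gradient_sample(theta, z)
        theta = max(-bound, min(bound, theta))  # crude projection (assumption)
    return theta


theta_hat = robbins_monro()
print(theta_hat)
```

For a put, large payoffs come from negative Z, so the sampled gradient at $\theta = 0$ is positive there and the iterates are pushed toward a negative drift.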

#### Understanding meaningfulness of the BS model for portfolio of 2 assets

This is my first post here.

I would like to discover the beauty of the science hidden behind quantitative finance. I have half of the summer fully open for experiments. I am learning Java for this (chosen as a very popular language), and there is an open source library for many financial calculations.

Currently I am playing with the idea of pricing a European call option with the BS model on my imaginary portfolio of 2 options A and B. I simplified this to "1 asset with a new volatility", and I want to calculate its fair price and its Greeks. I got some results, but I would like to do the most interesting part: find out how vague or how good the data I got is. So I would like to:

1) do sanity checks to get a sense of my model, my approximations and their limitations,

2) construct more complex derivatives, to learn more about finance. Next plan: extend this problem to quanto options. I will have a foreign option for which I want to get the price and the Greeks.

3) and afterwards study swaps, swaptions and equity-linked swaptions.

I have been looking through other posts, Wilmott, Haug, etc., and found that the most obvious checks and answers to my questions are: put-call parity, to check delta; maybe some plots to see the convexity of vega; and checking in which order the price of the option changes with a change in the parameters. Nothing more.

Please tell me if you have any ideas about my issues. Thank you.

My precise questions: sanity checks (their variety and appropriateness), and how to go on with building equity-linked swaptions (I thought of Monte Carlo).

Of course, general advice on how to build my study will be much appreciated.

### CompsciOverflow

#### How can one find an element in a merkle tree?

How can one find an element in a Merkle tree as efficiently as possible?

Each internal node has a hash value. So my first thought was to hash the value I am looking for and, if an internal node has exactly the same value, take its leaf node. But this only works for a tree of depth 2, not in general. Because each internal node stores the hash of the concatenation of its child nodes, the avalanche effect makes the concatenated hash value unpredictable.

So I cannot find the value by hashing it and comparing.
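A Merkle tree is usually used to prove membership rather than to search, so one common arrangement is to keep a separate index from leaf hash to position and use the tree only for the proof. The sketch below assumes SHA-256, duplication of the last hash on odd levels, and hypothetical function names; real systems differ in these details.

```python
import hashlib


def h(data):
    """SHA-256 digest of a bytes value."""
    return hashlib.sha256(data).digest()


def build_levels(leaves):
    """Return all tree levels, from the leaf hashes up to the root."""
    level = [h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:  # duplicate the last hash on odd-sized levels
            level = level + [level[-1]]
            levels[-1] = level
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def proof_for(levels, index):
    """Sibling hashes needed to recompute the root from leaf number `index`."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify(leaf, proof, root):
    """Recompute the path from the leaf to the root and compare."""
    digest = h(leaf)
    for sibling, sibling_is_left in proof:
        digest = h(sibling + digest) if sibling_is_left else h(digest + sibling)
    return digest == root


leaves = [b"a", b"b", b"c"]
levels = build_levels(leaves)
root = levels[-1][0]
index_of = {h(leaf): i for i, leaf in enumerate(leaves)}  # O(1) lookup by leaf hash
position = index_of[h(b"c")]
print(position, verify(b"c", proof_for(levels, position), root))
```

The lookup itself is a dictionary access; the tree contributes a proof of logarithmic size that a verifier holding only the root can check.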

### StackOverflow

#### Classification using SVM

In an attempt to classify text, I want to use an SVM. I want to classify test data into one of the labels (health/adult). The training and test data are text files.

I am using Python's scikit-learn library. When I saved the text to txt files I encoded it in UTF-8, which is why I am decoding them in the snippet. Here's my attempted code:

String = String.decode('utf-8')
String2 = String2.decode('utf-8')
bigram_vectorizer = CountVectorizer(ngram_range=(1, 2),
                                    token_pattern=r'\b\w+\b', min_df=1)

X_2 = bigram_vectorizer.fit_transform(String2).toarray()
X_1 = bigram_vectorizer.fit_transform(String).toarray()
X_train = np.array([X_1,X_2])
print type(X_train)
y = np.array([1, 2])
clf = SVC()
clf.fit(X_train, y)

#prepare test data
print(clf.predict(X))


This is the error I am getting

  File "/Users/guru/python_projects/implement_LDA/lda/apply.py", line 107, in <module>
    clf.fit(X_train, y)
  File "/Users/guru/python_projects/implement_LDA/lda/lib/python2.7/site-packages/sklearn/svm/base.py", line 150, in fit
    X = check_array(X, accept_sparse='csr', dtype=np.float64, order='C')
  File "/Users/guru/python_projects/implement_LDA/lda/lib/python2.7/site-packages/sklearn/utils/validation.py", line 373, in check_array
    array = np.array(array, dtype=dtype, order=order, copy=copy)
ValueError: setting an array element with a sequence.


When I searched for the error, I found some results, but even they didn't help. I think I am logically wrong in how I apply the SVM model here. Can someone give me a hint on this?

Ref: [1][2]

### QuantOverflow

#### Annualized Log Returns

I backtested an investment strategy over ten years (521 weeks to be specific) and calculated the weekly return using log returns. The sum of all weekly returns added up to 145%. How do I annualize this return? Is my assumption correct to simply calculate: (145/521)*52 to get the annual return?
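Because log returns add across time, scaling the total by 52/521 gives an annualized log return; converting that to a simple annual return takes one extra exponentiation. The sketch below works through the numbers from the question, assuming 52 weeks per year; the function name is hypothetical.

```python
import math


def annualize_log_return(total_log_return, periods, periods_per_year=52):
    """Average the summed log return per period, then scale to one year."""
    return total_log_return / periods * periods_per_year


annual_log = annualize_log_return(1.45, 521)  # 145% summed over 521 weeks
annual_simple = math.exp(annual_log) - 1      # equivalent compounded annual return
print(round(annual_log, 4), round(annual_simple, 4))
```

With these inputs the annualized log return is roughly 14.5%, and the corresponding simple annual return is roughly 15.6%.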

### StackOverflow

#### Decorator pattern using Java 8

Wikipedia has an example of a decorator pattern here:

https://en.wikipedia.org/wiki/Decorator_pattern#Second_example_.28coffee_making_scenario.29

I was trying to solve this in a functional style using Java 8. The solution I came up with:

1. CoffeeDecorator.java

import java.util.function.Function;
import java.util.stream.Stream;

public class CoffeeDecorator {

    // Compose the ingredient functions left to right and apply them to the basic coffee.
    public static Coffee getCoffee(Coffee basicCoffee, Function<Coffee, Coffee>... coffeeIngredients) {

        Function<Coffee, Coffee> chainOfFunctions = Stream.of(coffeeIngredients)
                .reduce(Function.identity(), Function::andThen);
        return chainOfFunctions.apply(basicCoffee);
    }

    public static void main(String args[]) {

        Coffee simpleCoffee = new SimpleCoffee();
        printInfo(simpleCoffee);

        Coffee coffeeWithMilk = CoffeeDecorator.getCoffee(simpleCoffee, CoffeeIngredientCalculator::withMilk);
        printInfo(coffeeWithMilk);

        Coffee coffeeWithWSprinkle = CoffeeDecorator.getCoffee(coffeeWithMilk, CoffeeIngredientCalculator::withSprinkles);
        printInfo(coffeeWithWSprinkle);

    }

    public static void printInfo(Coffee c) {
        System.out.println("Cost: " + c.getCost() + "; Ingredients: " + c.getIngredients());
    }

}

2. CoffeeIngredientCalculator.java

public class CoffeeIngredientCalculator {

    public static Coffee withMilk(Coffee coffee) {
        return new Coffee() {

            @Override
            public double getCost() {
                return coffee.getCost() + 0.5;
            }

            @Override
            public String getIngredients() {
                return coffee.getIngredients() + " , Milk";
            }
        };
    }

    public static Coffee withSprinkles(Coffee coffee) {
        return new Coffee() {

            @Override
            public double getCost() {
                return coffee.getCost() + 0.2;
            }

            @Override
            public String getIngredients() {
                return coffee.getIngredients() + " , Sprinkles";
            }
        };
    }

}

Now, I am not so convinced by the solution in the CoffeeIngredientCalculator. If we had a single responsibility in the Coffee interface, getCost(), using the functional style and applying the decorator pattern would seem a lot better and cleaner. It would basically boil down to a Function<Double,Double>; we would not need the abstract class or separate decorators and could just chain the functions.

But in the coffee example, with 2 behaviors (cost and description) on the Coffee object, I am not so convinced that this adds significant value, as we are creating an anonymous class and overriding the 2 methods.

Questions:

1) Is this solution acceptable ?

2) If not, is there a better way to solve it using functional style?

3) Should we stick to the usual GOF way of having an abstract class and separate decorator classes in scenarios where the object that we are decorating has multiple methods?

### QuantOverflow

#### What are the steps for creating an efficient intra day algo trading system? [on hold]

I am trying to create an algo trading system (in C++) using technical analysis strategies to trade on a 1-minute timeframe. Initially I will only use it for paper trading. I want to know the steps for creating an efficient intraday algo trading system. I have an API which can be used to fetch quotes (open, high, low, last, volume) and also for buying and selling, but no historical data API.

This is a sample trading logic

• Long only strategy
• Only trade S&P 500 stocks (they are the most liquid)
• At any given time the portfolio will contain only two stocks
• The money will be allocated like this:

Total amount = 0.4 for stock A + 0.4 for stock B + 0.2 as a buffer

• Buy a stock when the close price is greater than the moving average, and sell it when it is lower

Here is my implementation idea

STEP 1:
Create two C++ programs, scrap.cpp and trader.cpp

STEP 2:
scrap.cpp fetches quotes and saves them in a database (sqlite/mysql/redis, any recommendation?).

STEP 3:
Use trader.cpp to fetch values from the database and then do the technical analysis, for example the moving average calculation. Then buy and sell according to the condition.
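The buy/sell rule from the sample logic can be sketched independently of the database choice. The example below is in Python rather than C++, purely to illustrate the rule; the class name, the simple moving average and the single-position bookkeeping are assumptions, and it does not place orders.

```python
from collections import deque


class MovingAverageSignal:
    """Emit 'buy' / 'sell' / 'hold' from 1-minute closes and a simple moving average."""

    def __init__(self, window):
        self.window = window
        self.closes = deque(maxlen=window)  # keeps only the most recent closes
        self.in_position = False

    def on_close(self, close):
        self.closes.append(close)
        if len(self.closes) < self.window:
            return "hold"  # not enough history yet
        average = sum(self.closes) / self.window
        if close > average and not self.in_position:
            self.in_position = True
            return "buy"
        if close < average and self.in_position:
            self.in_position = False
            return "sell"
        return "hold"


signal = MovingAverageSignal(window=3)
decisions = [signal.on_close(price) for price in [10, 10, 10, 12, 13, 9]]
print(decisions)
```

Keeping the rolling window in memory like this means the trader only needs the database for recovery after a restart, not for every tick.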


These are my questions:

Which database would professionals recommend for intraday trading on a 1-minute timeframe, and why?

What should the database schema be for trading, especially for sqlite?

Have I forgotten anything, or should I include anything else?

END NOTE: I am using C++ because it can download and save quotes very fast, especially when the list of stocks has more than 500 entries.

### CompsciOverflow

#### Negligible functions in definitions of statistical closeness and computational indistinguishability

Statistical closeness implies computational indistinguishability.

Is there any (simple) relationship between the negligible function used in the definition of statistical closeness and the negligible function used in the definition of computational indistinguishability?

Can we say anything about the negligible function used in computational indistinguishability if we know what the negligible function in the definition of statistical closeness looks like?

What about the special case where the negligible function in the definition of statistical closeness is zero?

Definitions from Goldreich, Foundations of Cryptography:

Ensemble = sequence of random variables, $\{X_i\}_{i \in \mathbb{N}}$

Computational indistinguishability:

Two ensembles $\{X_i\}_{i \in \mathbb{N}}$ and $\{Y_i\}_{i \in \mathbb{N}}$ are computationally indistinguishable if for every probabilistic polynomial-time algorithm $D$, every positive polynomial $p$ and all sufficiently large $n$'s,

$|P(D(X_n) = 1) - P(D(Y_n) = 1)| < \frac{1}{p(n)}.$

(this definition actually says that $|P(D(X_n) = 1) - P(D(Y_n) = 1)|$ is a negligible function of $n$)

Statistical closeness:

Ensembles $\{X_i\}_{i \in \mathbb{N}}$ and $\{Y_i\}_{i \in \mathbb{N}}$ are statistically close if their statistical difference $\triangle(n)$ is negligible.

$\triangle(n) = \frac{1}{2} \sum_a |P(X_n = a) - P(Y_n = a)|$
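A standard fact connects the two definitions: for any distinguisher $D$, efficient or not, $|P(D(X_n) = 1) - P(D(Y_n) = 1)| \le \triangle(n)$, so the statistical difference itself serves as the bound in the computational definition, and a statistical difference of zero forces every advantage to be zero. The sketch below checks this numerically on a small, made-up pair of distributions by brute-forcing all deterministic distinguishers, each identified with the set of values it accepts.

```python
from itertools import chain, combinations


def statistical_difference(p, q):
    """Delta = 1/2 * sum_a |P(X = a) - P(Y = a)| for dicts mapping value -> probability."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(a, 0.0) - q.get(a, 0.0)) for a in support)


def best_advantage(p, q):
    """Largest |P(D(X)=1) - P(D(Y)=1)| over all deterministic D (all accept sets)."""
    support = sorted(set(p) | set(q))
    subsets = chain.from_iterable(
        combinations(support, k) for k in range(len(support) + 1)
    )
    return max(abs(sum(p.get(a, 0.0) - q.get(a, 0.0) for a in s)) for s in subsets)


x = {0: 0.5, 1: 0.3, 2: 0.2}    # hypothetical distribution of X_n
y = {0: 0.25, 1: 0.25, 2: 0.5}  # hypothetical distribution of Y_n
print(statistical_difference(x, y), best_advantage(x, y))
```

In this example the best distinguisher accepts exactly the values where X is more likely than Y, and its advantage equals the statistical difference.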

### QuantOverflow

#### Trading days or Calendar days for Compound Annual Growth Rate?

When calculating CAGR for intervals shorter than a year (or intervals that are longer than, but not integer years in length), should you use the 252 trading days or the 365.25 calendar days?

The formula I am using follows:

CAGR = ( Current Value / Initial Value ) ^ (1 / (Days passed / Days in the year)) - 1
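Whichever convention is chosen, the two day counts in the exponent need to be measured the same way: trading days passed with 252, or calendar days passed with 365.25. A small sketch of the formula above, with a hypothetical function name and made-up values:

```python
def cagr(initial_value, current_value, days_passed, days_in_year):
    """CAGR = (current / initial) ** (1 / (days_passed / days_in_year)) - 1."""
    years = days_passed / days_in_year
    return (current_value / initial_value) ** (1 / years) - 1


# Half a year measured in trading days, and two years measured in calendar days
print(cagr(100.0, 110.0, 126, 252))
print(cagr(100.0, 121.0, 730.5, 365.25))
```

Mixing the conventions, for example calendar days passed divided by 252, changes the exponent and overstates or understates the growth rate.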

### CompsciOverflow

#### Simple elementary cellular automata with high period?

I'm looking for a simple rewrite system which displays high period.

In order to do that, I've run a brute-force search over every elementary cellular automaton, for a few fixed memory lengths L. The result is that, when L=7, there are rules with the max possible period (128). Yet, for L=8, no matter which rule is used, I couldn't get a period > 180. For L=9, the maximum period is 133.

I then tried a few variations of the core idea. For example, I tried using 3 symbols instead of just 2 and did a similar brute-force search, but the results are similar.

Thus, I ask: is there any similar system with a rewrite rule which displays high period?
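For reference, a brute-force search of this kind might look like the sketch below. It assumes a ring of L cells with periodic boundary conditions and the standard Wolfram rule numbering. The post does not say which boundary handling was used, so results from this sketch need not match the numbers quoted above. The function names are hypothetical.

```python
def step(state, rule, length):
    """Apply one elementary CA update to `state`, an int holding `length` cells."""
    nxt = 0
    for i in range(length):
        left = (state >> ((i + 1) % length)) & 1   # higher bit is the left neighbor
        center = (state >> i) & 1
        right = (state >> ((i - 1) % length)) & 1
        if (rule >> (left << 2 | center << 1 | right)) & 1:
            nxt |= 1 << i
    return nxt

def max_period(rule, length):
    """Longest cycle reachable from any start state (transients excluded)."""
    succ = [step(s, rule, length) for s in range(1 << length)]
    best = 0
    for start in range(1 << length):
        seen = {}
        state, t = start, 0
        while state not in seen:
            seen[state] = t
            state = succ[state]
            t += 1
        best = max(best, t - seen[state])
    return best

# Search all 256 rules for one ring size.
L = 7
periods = {rule: max_period(rule, L) for rule in range(256)}
best_rule = max(periods, key=periods.get)
print(best_rule, periods[best_rule])
```

Rule 170 is a useful sanity check under these assumptions: it copies each cell's right neighbor, so it acts as a cyclic shift and its longest cycle on a ring of 7 cells has length 7.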

### CompsciOverflow

#### Simple majority classifier question

One of my training questions for my exam is the following:

Suppose you are testing a new algorithm on a data set consisting of 100 positive and 100 negative examples. You plan to use leave-one-out cross-validation (i.e. 200-fold cross-validation) and compare your algorithm to a baseline function, a simple majority classifier. Given a set of training data, the majority classifier always outputs the class that is in the majority in the training set, regardless of the input. You expect the majority classifier to achieve about 50% classification accuracy, but to your surprise, it scores zero every time. Why?

My only idea is that the training data is the inverse of the real data, but I'm not sure about my answer. Can anybody help me?

Regards,

Patrick
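The setup in the exercise is small enough to simulate directly. The sketch below is one way to do so; the function names and the label encoding (1 for positive, 0 for negative) are assumptions for illustration. It runs leave-one-out cross-validation with a majority classifier on 100 positive and 100 negative labels.

```python
from collections import Counter

def majority_class(labels):
    """Most common label in the training set."""
    return Counter(labels).most_common(1)[0][0]

def loocv_accuracy(labels):
    """Leave-one-out accuracy of the majority classifier (inputs are ignored)."""
    correct = 0
    for i, held_out in enumerate(labels):
        training = labels[:i] + labels[i + 1:]
        if majority_class(training) == held_out:
            correct += 1
    return correct / len(labels)

labels = [1] * 100 + [0] * 100
print(loocv_accuracy(labels))  # prints 0.0
```

Holding out one example leaves 99 examples of its class and 100 of the other in the training fold, so the majority class of every fold is the opposite of the held-out label.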

### QuantOverflow

#### Stockmarket (expected profit)

Acme Insurance Ltd. is a public company that has issued two million shares with a \$2 face value. On June 20 this year, it announced a dividend that amounted to a dividend percentage of 6%. The market value of the shares on that day was \$3.50. The company's profit exceeded stock market expectations by 45%. If the company decided to issue 30% of its actual profits as a dividend, what was the expected profit for Acme Insurance for the last financial year? Justify fully.

### CompsciOverflow

#### Deadlock and cycle in a resource allocation graph

Here is a resource allocation graph asked in my Operating Systems Theory midterm. The question is, "Is there a deadlock here? Explain your answer in detail"

Ra and Rb are resource sets and each dot inside them is a resource. Circles are processes. An arrow from a process to a resource set means that the process is requesting a resource from that set. An arrow from a resource set to a process means that the process owns a resource from that set.

I would like your opinions on this, because the lecturer's answer conflicts with mine. The lecturer says there is a deadlock here. But my answer was that, since Py and Pz are not requesting a resource, they will simply continue their execution and terminate, releasing their resources. Then Px and Pw can obtain their requested resources and keep executing. There is obviously a cycle in this graph between Px and Pw, but this does not by itself imply a deadlock. Thus I can't see a way to reach the conclusion "there is a deadlock here".

So is there a deadlock here?
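The reasoning in the question corresponds to the usual deadlock-detection procedure for resources with multiple instances: repeatedly pick a process whose outstanding requests can be met, let it finish, and reclaim what it holds. The sketch below implements that procedure. The original figure is not reproduced here, so the allocation in the example is hypothetical (two units each of Ra and Rb, all of them handed out), and the function name `deadlocked` is an assumption as well.

```python
def deadlocked(available, allocation, request):
    """Return the set of processes that can never finish (empty set = no deadlock)."""
    work = dict(available)
    unfinished = set(allocation)
    progress = True
    while progress:
        progress = False
        for p in sorted(unfinished):
            if all(n <= work.get(r, 0) for r, n in request[p].items()):
                # p can run to completion and release everything it holds.
                for r, n in allocation[p].items():
                    work[r] = work.get(r, 0) + n
                unfinished.remove(p)
                progress = True
                break
    return unfinished

# Hypothetical instance: Py and Pz hold resources but request nothing.
available = {"Ra": 0, "Rb": 0}
allocation = {"Px": {"Ra": 1}, "Pw": {"Rb": 1}, "Py": {"Rb": 1}, "Pz": {"Ra": 1}}
request = {"Px": {"Rb": 1}, "Pw": {"Ra": 1}, "Py": {}, "Pz": {}}
print(deadlocked(available, allocation, request))  # prints set()
```

With only one unit per resource set and just Px and Pw present, the same function returns both processes, which matches the rule that a cycle implies deadlock only when every resource in the cycle has a single instance.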

### Planet Theory

#### GAMES/EC 2016

This week I report from Maastricht in the Netherlands, from GAMES 2016, the 5th World Congress of the Game Theory Society. Because the congress is held only every four years, everyone who is anyone in the game theory community makes a strong effort to be here, including three Nobel laureates: Robert Aumann, Roger Myerson and Eric Maskin. The conference has about 750 participants and up to 19 parallel sessions.

This year the conference is co-located with the Economics and Computation conference (EC), which comes more from the CS community. By co-located I mean that we share the same buildings and many of the events, making it effectively one larger conference (which means in reality 21 parallel sessions).

EC keeps growing, accepting 80 papers out of 242 submissions, all of which are freely downloadable.

My favorite EC talk was the best student paper, Deferred Acceptance with Compensation Chains by
Piotr Dworczak, a graduate student in the Stanford Business School. He gives an algorithm for finding stable matchings with the property that every stable matching can be found by changing the order that the agents get to choose. The paper Which Is the Fairest (Rent Division) of Them All? by
Kobi Gal, Moshe Mash, Ariel Procaccia and Yair Zick won best paper.

Also a shout out to the talk Cadet-Branch Matching in a Quasi-Linear Labor Market solely authored by Ravi Jagadeesan, a rising junior undergraduate at Harvard. I went to grad school with Ravi's mother Lalita, and yes that makes me feel old.

Tim Roughgarden gave the Kalai prize talk for his work on Intrinsic Robustness of the Price of Anarchy. The talk, attended by a good number of the game theorists, gave a general approach to extending price of anarchy bounds to broader classes of equilibria. Tim followed Keith Chen, who heads the analytic team for Uber and discussed how game theory and optimization ideas are driving a major e-commerce company. No major surprises but here's one trade secret: Uber covers its maps with hexagons while Lyft uses squares.

All in all a great week, with packed schedules and crowded activities, but great to see all these game theorists and computer scientists talking with each other.

### StackOverflow

#### Tensorflow ValueError: No variables to save from

I have written a TensorFlow CNN and it is already trained. I wish to restore it to run it on a few samples, but unfortunately it's spitting out:

ValueError: No variables to save

My eval code can be found here:

    import tensorflow as tf

    import main
    import Process
    import Input

    eval_dir = "/Users/Zanhuang/Desktop/NNP/model.ckpt-30"
    checkpoint_dir = "/Users/Zanhuang/Desktop/NNP/checkpoint"

    init_op = tf.initialize_all_variables()
    saver = tf.train.Saver()

    def evaluate():
        with tf.Graph().as_default() as g:
            sess.run(init_op)

            ckpt = tf.train.get_checkpoint_state(checkpoint_dir)

            saver.restore(sess, eval_dir)

            images, labels = Process.eval_inputs(eval_data = eval_data)

            forward_propgation_results = Process.forward_propagation(images)

            top_k_op = tf.nn.in_top_k(forward_propgation_results, labels, 1)

            print(top_k_op)

    def main(argv=None):
        evaluate()

    if __name__ == '__main__':
        tf.app.run()


#### What does Ord mean in Ramda's type annotation?

Ramda's documentation for clamp states:

## clamp

Ord a => a → a → a → a

Restricts a number to be within a range.

Also works for other ordered types such as Strings and Dates.

R.clamp(1, 10, -1) // => 1
R.clamp(1, 10, 11) // => 10
R.clamp(1, 10, 4)  // => 4


I understand what "a → a → a → a" means (a curried function that takes three arguments of the same type and returns a result of the same type as arguments).

What do "Ord" and the fat arrow (=>) mean?
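Read the way Haskell-style signatures are usually read, the part before `=>` is a constraint on the type variable: `Ord a` says that `a` must be a type whose values can be ordered. The Python sketch below is not Ramda's implementation. It is a hypothetical `clamp` written for illustration, using the same argument order as the examples above, and it shows why an ordering is all the function needs.

```python
from datetime import date

def clamp(lo, hi, x):
    """Restrict x to [lo, hi]; works for any type that supports comparison."""
    if lo > hi:
        raise ValueError("lo must not be greater than hi")
    return max(lo, min(hi, x))

print(clamp(1, 10, -1))       # prints 1
print(clamp(1, 10, 11))       # prints 10
print(clamp("b", "d", "a"))   # prints b
print(clamp(date(2016, 1, 1), date(2016, 12, 31), date(2017, 3, 1)))  # prints 2016-12-31
```

Nothing in the body depends on the values being numbers; it relies only on comparison, which is the property the `Ord` constraint expresses.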

### CompsciOverflow

#### Can a Turing Machine (TM) decide whether the halting problem applies to all TMs?

On this site there are many variants on the question whether TMs can decide the halting problem, whether for all other TMs or certain subsets. This question is somewhat different.

It asks whether the fact that the halting problem applies to all TMs can be decided by a TM. I believe the answer is no, and wish to check my reasoning.

1. Define the meta-halting language $L_{MH}$ as the language composed of TMs that decide whether a TM halts.

$$L_{MH} = \{ M : \forall_{M',w}\ M(M', w) \text{ accepts if } M'(w) \text{ halts, rejects otherwise}\}$$

2. $L_{MH}= \emptyset$ due to the halting problem.

Thus, the title question more precisely stated: is it decidable whether $L_{MH} = \emptyset$?

3. Per Rice's theorem, it is undecidable whether an r.e. language is empty.
In both cases, if $L_{MH}$ is or is not r.e., it is undecidable whether $L_{MH} = \emptyset$.

4. Therefore, it is undecidable whether $L_{MH} = \emptyset$.

This proves a TM cannot decide whether the halting problem applies to all TMs.

Is my understanding correct?

UPDATE: I am trying to show that a TM cannot "prove the halting problem" for some definition of "prove" that seems intuitively correct. Below is an illustration of why I think this is correct.

We can create a TM $M_{MH}$ that generates $L_{MH}$ in the following way. The TM takes a tuple $(M_i,M_j,w_k,steps)$. It simulates $M_i(M_j, w_k)$ for $steps$ iterations. If $M_i$ accepts all $(M_j, w_k)$ pairs that halt, and rejects all others then $M_{MH}$ accepts $M_i$. Otherwise, it rejects $M_i$ if $M_i$ decides incorrectly or fails to halt.

$M_{MH}$ does not halt, because it must evaluate an infinite number of pairs for each $M_i$. Additionally, all the $M_i$s will fail to halt. $M_{MH}$ will be unable to accept or reject any $M_i$ as it will not know from the simulation that all $M_i$s will fail to halt. Thus, the language it defines is not r.e. and not decidable.

$M_{MH}$ captures my intuition of what I think it means for a TM to prove the halting problem. Other suggestions, such as $M_{MH}$ rejecting all $M_i$ or outputting a known proof give $M_{MH}$ prior knowledge that the halting problem applies to all $M_i$. This cannot count as $M_{MH}$ proving something since the $M_{MH}$'s premise is the conclusion it is proving, and thus is circular.

### StackOverflow

#### How to make relationship between some factor images to a some subset of images in matlab using ANN

I am a beginner in Matlab. I am trying to establish a nonlinear relationship (using an ANN) between some training images and some factors that affect those training images. In detail:

1. I have a raster layer with only two values: 0 means no forest degradation and 1 means forest degradation.
2. I have several other raster layers (distance to road, distance to market centers where wood is sold, elevation, etc.), with all values normalized to a scale of 0-1.
3. I divided the whole of layer 1 into square blocks and randomly took some of those blocks (subsets of the image) as training samples.

Keep in mind that the layers in item 2 affect layer 1 (forest degradation).

Now I want to learn a relationship between the factors in item 2 and the training blocks from item 3. Based on this relationship, I want to predict the value of layer 1 in the non-sample blocks. That is, for a place outside the training sample area, I want to predict the value (0 or 1) from the values of the layers in item 2, given that training. I am using Matlab 2016.

You can suggest a tutorial for such image processing in matlab too.

### CompsciOverflow

#### Find the optimal way [on hold]

We consider the TSP in Grid-City.

The roads in Grid-City have the form of a grid, so that the intersection points can be described by an integer coordinate system.

The distance of $2$ points $C=(x,y)$ and $D=(x',y')$ is defined as $d(C,D)=|x-x'|+|y-y'|$.

An input for the TSP consists of $23$ points with the following coordinates: $$(i,0), \text{ for } i=0, 1, \dots , 10, \\ (i,2), \text{ for } i=0, 1, \dots , 10, \\ (13, 0)$$

I want to give the optimal TSP-Tour.

We have the following grid, or not?

To find the optimal TSP-Tour (without the approximation algorithms), from which point do we start? From any point we want?

If we choose one point to start, lets consider the $S=(0,0)$.

The second point will be either $A=(0,2)$ or $B=(1,0)$, right? Since $2=d(S,A)>d(S,B)=1$, the second point is $B=(1,0)$.

Or can we consider for the second point also the diagonal one, $(1,2)$ ?

Or is this not the correct way to find the optimal TSP-Tour?

When the roads are just the vertical and horizontal lines, the distance from a point $(i,j)$ to another is either $d_1=1$ or $d_2=2$, or not?

If this is true, is the optimal TSP-Tour the following? $$(0,0)\rightarrow (1,0) \rightarrow (2,0) \rightarrow \dots \rightarrow (10,0)\rightarrow (13,0)\rightarrow (10,2) \rightarrow (9,2) \rightarrow \dots \rightarrow (1,2) \rightarrow (0,2) \rightarrow (0,0)$$ But how do we get from the point $(13,0)$ to the point $(10,2)$ ?



EDIT:

I have to find also the approximation for the TSP-Tour through the NEAREST-NEIGHBOR and then through the NEAREST-INSERTION with starting point $(0,0)$.

Do we get the following result through the NEAREST-NEIGHBOR? $$(0,0)\rightarrow (1,0) \rightarrow (2,0) \rightarrow \dots \rightarrow (10,0)\rightarrow (10,2) \rightarrow (9,2) \rightarrow \dots \rightarrow (1,2) \rightarrow (0,2) \rightarrow (13,0) \rightarrow (0,0)$$

So is the length of this Tour equal to $10+2+10+15+13=50$ ?

The NEAREST-INSERTION algorithm is the following:

    T <- {1}
    while |T| < n do
        j <- vertex with minimal d(T,j), j notin T
        insert j with minimal cost into T
    return T


So, we have the following:

T={(0,0)}
j=(1,0)
T={(0,0), (1,0)}
....
T={(0,0), (1,0), ... , (i,0), ... , (10,0), (10,2), ... , (j,2), ... , (1,2), (0,0)}


or not?

So, do we get the same result with both approximations?
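The tour lengths in question can be checked mechanically. The sketch below builds the 23 points, runs a nearest-neighbor heuristic from $(0,0)$, and measures both the resulting tour and the candidate tour that visits $(13,0)$ right after $(10,0)$. The function names and the tie-breaking rule are assumptions made for this sketch. On this instance no ties actually occur, so the tie-breaking rule does not affect the result.

```python
def dist(c, d):
    """Grid-City (Manhattan) distance."""
    return abs(c[0] - d[0]) + abs(c[1] - d[1])

def tour_length(tour):
    """Length of the closed tour, including the edge back to the start."""
    return sum(dist(tour[i], tour[(i + 1) % len(tour)]) for i in range(len(tour)))

def nearest_neighbor(points, start):
    tour, remaining = [start], set(points) - {start}
    while remaining:
        # Ties are broken by coordinate order; this is an arbitrary choice.
        nxt = min(remaining, key=lambda p: (dist(tour[-1], p), p))
        tour.append(nxt)
        remaining.remove(nxt)
    return tour

points = [(i, 0) for i in range(11)] + [(i, 2) for i in range(11)] + [(13, 0)]
nn_tour = nearest_neighbor(points, (0, 0))
proposed = ([(i, 0) for i in range(11)] + [(13, 0)]
            + [(i, 2) for i in range(10, -1, -1)])
print(tour_length(nn_tour), tour_length(proposed))  # prints "50 30"
```

The step from $(13,0)$ to $(10,2)$ costs $|13-10|+|0-2|=5$, so the candidate tour has length $10+3+5+10+2=30$. Any closed tour must cover the $x$-range $0$ to $13$ and the $y$-range $0$ to $2$ twice, which gives a lower bound of $2\cdot(13+2)=30$, so no tour can be shorter than that candidate.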

#### Quantum circuits for multiply-accumulation

Classically, multiplication can be done in $O(n \ \lg(n) \ 8^{\lg^* n})$ steps on a multi-tape Turing machine via Fürer's algorithm. Using that algorithm, combined with uncomputing, you can make a quantum multiply-accumulate circuit with the same bound: $O(n \ \lg(n) \ 8^{\lg^* n})$ gates.

But quantum circuits have more options available. Maybe they can multiply in some better faster way, instead of just copying the classical algorithm.

Are there more efficient quantum circuits for multiply-accumulating?

## The backstory

I have spectrograms of certain repeating sounds. The background of the spectrograms are black and could easily be made transparent if that would help. The sounds are easily distinct from the background. As shown below, each sound is a series of connected pixels of various colors.

## What I want to do

I would like to separate each of the sounds into separate images or slices. That is, from the above example I would like to isolate the sounds into something like the following example.

If you are familiar with Unity 3D, they do something similar with their "Sprite Slicer" that creates a rectangle around objects on a transparent image.

## My attempt of an algorithm

    Select non-black pixel
    For each neighboring pixel
        If pixel is black
            Try each non-visited neighboring pixel again, traveling a maximum of three pixels away (a sort of threshold)
        If pixel is non-black
            Add to a list of significant pixels and travel to neighboring pixels
    Mark selected pixels as visited and select a new non-black pixel


This would result in a list of lists containing pixels where each list is the extracted sound or cluster of pixels that could be compiled into images.

## My concerns

I know this has been done before and is called something, but for the life of me I cannot find the words to describe it (hence the poor title).

My algorithm feels non-standard, potentially slow, and poorly written. It should work, but it does not feel like the right approach.

## Questions

What is this called, how can I find more about it?

How do I succeed in extracting the sub-images as shown by the example?
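The procedure described above is usually called connected-component labeling (also blob extraction). A minimal sketch using breadth-first flood fill follows. The image is assumed to be a list of rows with 0 as the black background, and the `gap` parameter is a hypothetical stand-in for the "maximum of three pixels away" threshold: with `gap=1` pixels must touch (8-connectivity), while larger values merge pixels separated by small black gaps.

```python
from collections import deque

def connected_components(image, background=0, gap=1):
    """Group non-background pixels; returns a list of lists of (row, col)."""
    rows, cols = len(image), len(image[0])
    visited = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if image[r][c] == background or visited[r][c]:
                continue
            queue, component = deque([(r, c)]), []
            visited[r][c] = True
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for dy in range(-gap, gap + 1):
                    for dx in range(-gap, gap + 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and not visited[ny][nx]
                                and image[ny][nx] != background):
                            visited[ny][nx] = True
                            queue.append((ny, nx))
            components.append(component)
    return components

def bounding_box(component):
    """(min_row, min_col, max_row, max_col) of one component."""
    ys = [y for y, _ in component]
    xs = [x for _, x in component]
    return min(ys), min(xs), max(ys), max(xs)

image = [
    [0, 1, 1, 0, 0, 0],
    [0, 1, 0, 0, 2, 0],
    [0, 0, 0, 0, 2, 2],
]
print([bounding_box(comp) for comp in connected_components(image)])
# prints [(0, 1, 1, 2), (1, 4, 2, 5)]
```

Each bounding box can then be used to crop a sub-image. Every pixel is enqueued at most once, so the running time is linear in the number of pixels for a fixed `gap`.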

#### Similarity between Min-Conflicts and Coordinate Descent in CSPs?

I'm currently writing a library that solves a specific type of problem that involves mainly constraint satisfaction.

I have come across the Min-Conflicts Algorithm, which proved to be rather efficient in the context of the problem.

However, I have recently chanced upon the Coordinate Descent Algorithm and how strikingly it resembles the Min-Conflicts Algorithm.

Probably the only difference is that in Min-Conflicts, a random variable is selected to be modified at each step whereas Coordinate Descent cycles through the variables.

Am I right to say that apart from this difference, min-conflicts and coordinate descent are essentially equivalent? If so, why are they classified differently?
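To make the comparison concrete, here is a sketch of both selection rules inside one local-search loop, using n-queens as a stand-in constraint problem. The problem choice and all function names are assumptions for illustration. One detail: min-conflicts as usually stated picks a random variable among those currently in conflict, while the cyclic rule sweeps over all variables in order. The update step, which sets the chosen variable to a value minimizing the number of violated constraints, is shared.

```python
import random

def conflicts(queens, col, row):
    """Number of queens attacking square (col, row), ignoring column `col`."""
    return sum(1 for c, r in enumerate(queens)
               if c != col and (r == row or abs(r - row) == abs(c - col)))

def local_search(n, pick_variable, max_steps=2000, seed=0):
    """Return a conflict-free assignment, or None if max_steps is exhausted."""
    rng = random.Random(seed)
    queens = [rng.randrange(n) for _ in range(n)]
    for step in range(max_steps):
        conflicted = [c for c in range(n) if conflicts(queens, c, queens[c]) > 0]
        if not conflicted:
            return queens
        col = pick_variable(step, conflicted, n, rng)
        # Shared update: move the chosen variable to a minimum-conflict value.
        costs = [conflicts(queens, col, row) for row in range(n)]
        best = min(costs)
        queens[col] = rng.choice([row for row in range(n) if costs[row] == best])
    return None

def min_conflicts_pick(step, conflicted, n, rng):
    return rng.choice(conflicted)   # random variable that is in conflict

def cyclic_pick(step, conflicted, n, rng):
    return step % n                 # coordinate-descent-style sweep

for pick in (min_conflicts_pick, cyclic_pick):
    print(pick.__name__, local_search(8, pick))
```

Either variant may fail to converge within `max_steps` and return `None`; the sketch makes no claim about which rule performs better.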

### StackOverflow

#### Scala Application does not take parameters compile error

I am trying to come up with a simple function that takes a function and a list of integers and applies the function to every integer in the list:

    def IntOps(f: Int => Int)(values: List[Int]): Int = {
      if(values.isEmpty) 0
      //Getting "Application does not take parameters" in values.tail
    }
    IntOps(x=> x+x)(List(1, 2, 30)


I am getting the compiler error "Application does not take parameters" on values.tail. I am a beginner in both functional programming and Scala, so any pointers or answers that help me understand this would be great.

### Lobsters

#### LastPass: design flaw in communication between privileged and unprivileged components

Key quote:

This allows access to any of the privileged LastPass RPCs, so this is a complete compromise of the lastpass addon. From here an attacker can create and delete files, execute script, steal all passwords, log victims into their own lastpass account so that they can steal anything new saved there, etc, etc.

Note that this is not the same bug as in the post from earlier today.

### CompsciOverflow

#### EQtm is not mapping reducible to its complement

This is a problem from Sipser's book (marked with an asterisk).

$EQ_{TM} = \{(\langle M \rangle, \langle N \rangle)$ where $M$ and $N$ are Turing machines and $L(M) = L(N)\}$

We know that neither $EQ_{TM}$ nor $\overline{EQ_{TM}}$ is recognizable, so I am unsure how to go about proving that there can't be a mapping reduction from one to the other.

Any hints?

### TheoryOverflow

#### How is the VP=VNP question in char 2 different from other char? What is the current frontier in regards to this question?

What are the caveats one should be aware of when pursuing VP=VNP question in char 2 compared to other char? What is the current frontier in regards to this question?

### CompsciOverflow

#### Sorting $n$ balls using a two-pan scale

This is a 2016 interview question:

Show that $n$ balls with distinct weights can be sorted using a two-pan balance in only $\lceil \log_2 n! \rceil$ weighings.

How can this be accomplished?
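The quantity $\lceil \log_2 n! \rceil$ comes from counting: each weighing has two outcomes, so $k$ weighings can distinguish at most $2^k$ of the $n!$ possible orderings. The sketch below computes that bound exactly and counts the weighings used by binary insertion sort, one simple comparison strategy. The function names are hypothetical, and binary insertion is shown only as an illustration; it does not always meet the bound (for $n=5$ its worst case is 8 weighings against a bound of 7).

```python
import math

def weighing_bound(n):
    """ceil(log2(n!)), computed exactly with integer arithmetic."""
    return (math.factorial(n) - 1).bit_length()

def sort_by_weighing(balls, heavier):
    """Binary insertion sort; each call to `heavier(a, b)` is one weighing."""
    ordered = []
    for ball in balls:
        lo, hi = 0, len(ordered)
        while lo < hi:
            mid = (lo + hi) // 2
            if heavier(ball, ordered[mid]):
                lo = mid + 1
            else:
                hi = mid
        ordered.insert(lo, ball)
    return ordered

def make_balance(counter):
    """Return a comparison function that counts its uses in counter[0]."""
    def heavier(a, b):
        counter[0] += 1
        return a > b
    return heavier

counter = [0]
result = sort_by_weighing([5, 2, 4, 1, 3], make_balance(counter))
print(result, counter[0], weighing_bound(5))
```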

### StackOverflow

#### Why L1 regularization works in machine Learning

In machine learning, one way to prevent overfitting is to add L2 regularization, and some say that L1 regularization is better. Why is that? Also, I know that L1 is used to ensure sparsity; what is the theoretical support for this result?
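One standard way to see the sparsity effect is to compare the one-dimensional penalized least-squares updates (proximal operators) of the two penalties. The sketch below uses hypothetical function names and sample weights. For a single weight with unpenalized optimum `z`, the L1 penalty yields soft-thresholding, which returns exactly zero whenever `|z| <= lam`, while the L2 penalty only rescales `z` and never reaches zero unless `z` is already zero.

```python
def prox_l1(z, lam):
    """argmin_w 0.5*(w - z)**2 + lam*|w|  (soft-thresholding)."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def prox_l2(z, lam):
    """argmin_w 0.5*(w - z)**2 + 0.5*lam*w**2  (uniform shrinkage)."""
    return z / (1.0 + lam)

weights = [3.0, 0.4, -0.2, -1.5]
print([prox_l1(z, 0.5) for z in weights])  # prints [2.5, 0.0, 0.0, -1.0]
print([prox_l2(z, 0.5) for z in weights])
```

The small weights are removed entirely under L1 but only shrunk under L2, which is the sense in which L1 regularization produces sparse solutions.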

#### Why shuffling of data matters in Tensorflow?

I am working on an image classification problem in Tensorflow. I have 2 datasets, one for training and one for evaluation. I shuffled the training data using Tensorflow's built-in functions and trained the model, but the evaluation data was not shuffled and was sorted according to labels. When measuring the accuracy of the model, the sorted evaluation file gave an accuracy of 3%, whereas when I randomly shuffled the file and tried again, the accuracy was 60%. Why would shuffling matter for the evaluation data? Please help!

### Planet Emacsen

#### Irreal: The Emacs/Vi Holy War

I saw this tweet and it got me thinking.

The tweet is, of course, snark but it raises an interesting question. Does anyone still care about the holiest of holy wars? This tweet

suggests that some of us do but my sense is that Emacsers and Vimers are pretty much united against all the newcomers as exemplified by this tweet

I always get in trouble when I bring this up but I think it's true that serious developers overwhelmingly prefer either Emacs or Vim. Of course there are exceptions. There are, I'm sure, thousands of excellent developers that use something else but mostly the great developers use Emacs or Vim.

The choice between the two depends on the developer's outlook. If you want the fastest, most composable editor and are focused on simply editing text, you will probably prefer Vim. If, on the other hand, you want an environment that subsumes editing among other things, you will probably prefer Emacs.

My point, though, is that Emacsers and Vimers have pretty much moved from fighting each other to shaking their heads in disbelief about those engineers who are using one of those other editors. So perhaps the holy war isn't over, it's just moved to another domain.

### arXiv Programming Languages

#### AutoPriv: Automating Differential Privacy Proofs. (arXiv:1607.08228v1 [cs.PL])

The growing popularity and adoption of differential privacy in academic and industrial settings has resulted in the development of increasingly sophisticated algorithms for releasing information while preserving privacy. Accompanying this phenomenon is the natural rise in the development and publication of incorrect algorithms, thus demonstrating the necessity of formal verification tools. However, existing formal methods for differential privacy face a dilemma: methods based on customized logics can verify sophisticated algorithms but come with a steep learning curve and significant annotation burden on the programmers, while existing type systems lack expressive power for some sophisticated algorithms.

In this paper, we present AutoPriv, a simple imperative language that strikes a better balance between expressive power and usefulness. The core of AutoPriv is a novel relational type system that separates relational reasoning from privacy budget calculations. With dependent types, the type system is powerful enough to verify sophisticated algorithms where the composition theorem falls short. In addition, the inference engine of AutoPriv infers most of the proof details, and even searches for the proof with minimal privacy cost when multiple proofs exist. We show that AutoPriv verifies sophisticated algorithms with little manual effort.

#### Open and Regionalised Spectrum Repositories for Emerging Countries. (arXiv:1607.08227v1 [cs.NI])

TV White Spaces have recently been proposed as an alternative to alleviate the spectrum crunch, characterised by the need to reallocate frequency bands to accommodate the ever-growing demand for wireless communications. In this paper, we discuss the motivations and challenges for collecting spectrum measurements in developing regions and discuss a scalable system for communities to gather and provide access to White Spaces information through open and regionalised repositories. We further discuss two relevant aspects. First, we propose a cooperative mechanism for sensing spectrum availability using a detector approach. Second, we propose a strategy (and an architecture) on the database side to implement spectrum governance. Other aspects of the work include discussion of an extensive measurement campaign showing a number of white spaces in developing regions, an overview of our experience on low-cost spectrum analysers, and the architecture of zebra-rfo, an application for processing crowd-sourced spectrum data.

#### PANDA: Extreme Scale Parallel K-Nearest Neighbor on Distributed Architectures. (arXiv:1607.08220v1 [cs.DC])

Computing $k$-Nearest Neighbors (KNN) is one of the core kernels used in many machine learning, data mining and scientific computing applications. Although kd-tree based $O(\log n)$ algorithms have been proposed for computing KNN, due to their inherent sequentiality, linear algorithms are being used in practice. This limits the applicability of such methods to millions of data points, with limited scalability for Big Data analytics challenges in the scientific domain. In this paper, we present parallel and highly optimized kd-tree based KNN algorithms (both construction and querying) suitable for distributed architectures. Our algorithm includes novel approaches for pruning search space and improving load balancing and partitioning among nodes and threads. Using TB-sized datasets from three science applications: astrophysics, plasma physics, and particle physics, we show that our implementation can construct a kd-tree of 189 billion particles in 48 seconds using $\sim$50,000 cores. We also demonstrate computation of KNN of 19 billion queries in 12 seconds. We demonstrate almost linear speedup both for shared and distributed memory computers. Our algorithms outperform earlier implementations by more than an order of magnitude, thereby radically improving the applicability of our implementation to state-of-the-art Big Data analytics problems. In addition, we showcase performance and scalability on the recently released Intel Xeon Phi processor showing that our algorithm scales well even on massively parallel architectures.

#### Online Trajectory Segmentation and Summary With Applications to Visualization and Retrieval. (arXiv:1607.08188v1 [cs.CV])

Trajectory segmentation is the process of subdividing a trajectory into parts either by grouping points similar with respect to some measure of interest, or by minimizing a global objective function. Here we present a novel online algorithm for segmentation and summary, based on point density along the trajectory, and based on the nature of the naturally occurring structure of intermittent bouts of locomotive and local activity. We show an application to visualization of trajectory datasets, and discuss the use of the summary as an index allowing efficient queries which are otherwise impossible or computationally expensive, over very large datasets.

#### Android Malware Detection Using Parallel Machine Learning Classifiers. (arXiv:1607.08186v1 [cs.CR])

Mobile malware has continued to grow at an alarming rate despite on-going efforts towards mitigating the problem. This has been particularly noticeable on Android due to its being an open platform that has subsequently overtaken other platforms in the share of the mobile smart devices market. This has incentivized a new wave of emerging Android malware sophisticated enough to evade most common detection methods. This paper proposes and investigates a parallel machine learning based classification approach for early detection of Android malware. Using real malware samples and benign applications, a composite classification model is developed from parallel combination of heterogeneous classifiers. The empirical evaluation of the model under different combination schemes demonstrates its efficacy and potential to improve detection accuracy. More importantly, by utilizing several classifiers with diverse characteristics, their strengths can be harnessed not only for enhanced Android malware detection but also quicker white box analysis by means of the more interpretable constituent classifiers.

#### Adaptive Versus Non-Adaptive Strategies in the Quantum Setting with Applications. (arXiv:1607.08168v1 [quant-ph])

We prove a general relation between adaptive and non-adaptive strategies in the quantum setting, i.e., between strategies where the adversary can or cannot adaptively base its action on some auxiliary quantum side information. Our relation holds in a very general setting, and is applicable as long as we can control the bit-size of the side information, or, more generally, its "information content". Since adaptivity is notoriously difficult to handle in the analysis of (quantum) cryptographic protocols, this gives us a very powerful tool: as long as we have enough control over the side information, it is sufficient to restrict ourselves to non-adaptive attacks.

We demonstrate the usefulness of this methodology with two examples. The first is a quantum bit commitment scheme based on 1-bit cut-and-choose. Since bit commitment implies oblivious transfer (in the quantum setting), and oblivious transfer is universal for two-party computation, this implies the universality of 1-bit cut-and-choose, and thus solves the main open problem of [FKSZZ13]. The second example is a quantum bit commitment scheme proposed in 1993 by Brassard et al. It was originally suggested as an unconditionally secure scheme, back when this was thought to be possible. We partly restore the scheme by proving it secure in (a variant of) the bounded quantum storage model.

In both examples, the fact that the adversary holds quantum side information obstructs a direct analysis of the scheme, and we circumvent it by analyzing a non-adaptive version, which can be done by means of known techniques, and applying our main result.

#### DynaLog: An automated dynamic analysis framework for characterizing Android applications. (arXiv:1607.08166v1 [cs.CR])

Android is becoming ubiquitous and currently has the largest share of the mobile OS market with billions of application downloads from the official app market. It has also become the platform most targeted by mobile malware that are becoming more sophisticated to evade state-of-the-art detection approaches. Many Android malware families employ obfuscation techniques in order to avoid detection and this may defeat static analysis based approaches. Dynamic analysis on the other hand may be used to overcome this limitation. Hence in this paper we propose DynaLog, a dynamic analysis based framework for characterizing Android applications. The framework provides the capability to analyse the behaviour of applications based on an extensive number of dynamic features. It provides an automated platform for mass analysis and characterization of apps that is useful for quickly identifying and isolating malicious applications. The DynaLog framework leverages existing open source tools to extract and log high level behaviours, API calls, and critical events that can be used to explore the characteristics of an application, thus providing an extensible dynamic analysis platform for detecting Android malware. DynaLog is evaluated using real malware samples and clean applications demonstrating its capabilities for effective analysis and detection of malicious applications.

#### N-opcode Analysis for Android Malware Classification and Categorization. (arXiv:1607.08149v1 [cs.CR])

Malware detection is a growing problem particularly on the Android mobile platform due to its increasing popularity and accessibility to numerous third party app markets. This has also been made worse by the increasingly sophisticated detection avoidance techniques employed by emerging malware families. This calls for more effective techniques for detection and classification of Android malware. Hence, in this paper we present an n-opcode analysis based approach that utilizes machine learning to classify and categorize Android malware. This approach enables automated feature discovery that eliminates the need for applying expert or domain knowledge to define the needed features. Our experiments on 2520 samples that were performed using up to 10-gram opcode features showed that an f-measure of 98% is achievable using this approach.

#### Collision-free Operation in High Density WLAN Deployments. (arXiv:1607.08138v1 [cs.NI])

WiFi's popularity has led to crowded scenarios composed of many Access Points (AP) and clients, often operating on overlapping channels, producing interference that gravely degrades performance. This misallocation of resources is often the result of multiple WLAN ownership, that is, networks are frequently set up automatically without considering neighbouring APs. In this work we give an overview of the effect of Overlapping BSS (OBSS) from the perspective of the MAC layer, with special interest in describing the advantages of eliminating collisions with Carrier Sense Multiple Access with Enhanced Collision Avoidance (CSMA/ECA). We propose a single Access Point (AP) and several multi-AP scenarios, including the residential building example proposed for testing the upcoming IEEE 802.11ax amendment. Results using the first NS-3 implementation of CSMA/ECA reveal the advantage of CSMA/ECA's deterministic backoff contention technique, confirming its suitability for very crowded scenarios.

#### Design and Implementation of a Measurement-Based Policy-Driven Resource Management Framework For Converged Networks. (arXiv:1607.08123v1 [cs.NI])

This paper presents the design and implementation of a measurement-based QoS and resource management framework, CNQF (Converged Networks QoS Management Framework). CNQF is designed to provide unified, scalable QoS control and resource management through the use of a policy-based network management paradigm. It achieves this via distributed functional entities that are deployed to co-ordinate the resources of the transport network through centralized policy-driven decisions supported by measurement-based control architecture. We present the CNQF architecture, implementation of the prototype and validation of various inbuilt QoS control mechanisms using real traffic flows on a Linux-based experimental test bed.

#### Statistical Delay Bound for WirelessHART Networks. (arXiv:1607.08102v1 [cs.PF])

In this paper we provide a performance analysis framework for wireless industrial networks by deriving a service curve and a bound on the delay violation probability. For this purpose we use the (min,x) stochastic network calculus as well as a recently presented recursive formula for an end-to-end delay bound of wireless heterogeneous networks. The derived results are mapped to WirelessHART networks used in process automation and were validated via simulations. In addition to WirelessHART, our results can be applied to any wireless network whose physical layer conforms to the IEEE 802.15.4 standard, while its MAC protocol incorporates TDMA and channel hopping, such as ISA100.11a or TSCH-based networks. The provided delay analysis is especially useful during the network design phase, offering further research potential towards optimal routing and power management in QoS-constrained wireless industrial networks.

#### Event-Driven Implicit Authentication for Mobile Access Control. (arXiv:1607.08101v1 [cs.NI])

In order to protect user privacy on mobile devices, an event-driven implicit authentication scheme is proposed in this paper. Several methods of utilizing the scheme for recognizing legitimate user behavior are investigated. The investigated methods compute an aggregate score and a threshold in real-time to determine the trust level of the current user using real data derived from user interaction with the device. The proposed scheme is designed to: operate completely in the background, require minimal training period, enable high user recognition rate for implicit authentication, and prompt detection of abnormal activity that can be used to trigger explicitly authenticated access control. In this paper, we investigate threshold computation through standard deviation and EWMA (exponentially weighted moving average) based algorithms. The result of extensive experiments on user data collected over a period of several weeks from an Android phone indicates that our proposed approach is feasible and effective for lightweight real-time implicit authentication on mobile smartphones.

#### Automatically Reinforcing a Game AI. (arXiv:1607.08100v1 [cs.AI])

A recent research trend in Artificial Intelligence (AI) is the combination of several programs into one single, stronger, program; this is termed portfolio methods. We here investigate the application of such methods to Game Playing Programs (GPPs). In addition, we consider the case in which only one GPP is available - by decomposing this single GPP into several ones through the use of parameters or even simply random seeds. These portfolio methods are trained in a learning phase. We propose two different offline approaches. The simplest one, BestArm, is a straightforward optimization of seeds or parameters; it performs quite well against the original GPP, but performs poorly against an opponent which repeats games and learns. The second one, namely Nash-portfolio, performs similarly in a "one game" test, and is much more robust against an opponent who learns. We also propose an online learning portfolio, which tests several of the GPP repeatedly and progressively switches to the best one - using a bandit algorithm.

#### The Actias system: supervised multi-strategy learning paradigm using categorical logic. (arXiv:1607.08098v1 [cs.DB])

One of the most difficult problems in the development of intelligent systems is the construction of the underlying knowledge base. As a consequence, the rate of progress in the development of this type of system is directly related to the speed with which knowledge bases can be assembled, and on its quality. We attempt to solve the knowledge acquisition problem, for a Business Information System, developing a supervised multistrategy learning paradigm. This paradigm is centred on a collaborative data mining strategy, where groups of experts collaborate using data-mining process on the supervised acquisition of new knowledge extracted from heterogeneous machine learning data models.

The Actias system is our approach to this paradigm. It is the result of applying the graphic logic based language of sketches to knowledge integration. The system is a data mining collaborative workplace, where the Information System knowledge base is an algebraic structure. It results from the integration of background knowledge with new insights extracted from data models, generated for specific data modelling tasks, and represented as rules using the sketches language.

#### Android Malware Detection: an Eigenspace Analysis Approach. (arXiv:1607.08087v1 [cs.CR])

The battle to mitigate Android malware has become more critical with the emergence of new strains incorporating increasingly sophisticated evasion techniques, in turn necessitating more advanced detection capabilities. Hence, in this paper we propose and evaluate a machine learning approach based on eigenspace analysis for Android malware detection using features derived from static analysis characterization of Android applications. Empirical evaluation with a dataset of real malware and benign samples shows that a detection rate of over 96% with a very low false positive rate is achievable using the proposed method.

#### The Descriptive Complexity of Subgraph Isomorphism in the Absence of Order. (arXiv:1607.08067v1 [cs.LO])

Let $C$ be a class of graphs and $\pi$ be a graph parameter. Let $\Phi$ be a formula in the first-order language containing only the adjacency and the equality relations. We say that $\Phi$ *defines $C$ on connected graphs with sufficiently large $\pi$* if there is a constant $k$ such that, for every connected graph $G$ with $\pi(G)\ge k$, $\Phi$ is true on $G$ exactly when $G$ belongs to $C$. For a fixed connected graph $F$, let $S(F)$ denote the class of all graphs containing $F$ as a subgraph. Let $D_\pi(F)$ denote the minimum quantifier depth of a formula $\Phi$ defining $S(F)$ on connected graphs with sufficiently large $\pi$. We have $D_v(F)\ge D_{tw}(F)\ge D_\kappa(F)$, where $v(G)$ denotes the number of vertices in a graph $G$, $tw(G)$ is the treewidth of $G$, and $\kappa(G)$ is the connectivity of $G$. We obtain the following results.

- There are graphs $F$ such that $D_v(F)$ is strictly smaller than the number $n$ of vertices in $F$. In particular, $D_v(P_n)=n-1$ for the path graphs on $n\ge4$ vertices. Moreover, there are some trees $F$ such that $D_v(F)\le n-3$.

- On the other hand, $D_v(F)=D_{tw}(F)=n$ if $F$ has no vertex of degree 1. In general, $D_v(F)>n/2$ unless $F=P_2$ or $P_3$.

- $D_{tw}(F)\ge tw(F)$ for every $F$. Over trees $F$ with $n$ vertices, the values of $D_{tw}(F)$ occupy the almost full spectrum $\{1,5,\ldots,n\}$. The minimum value $D_{tw}(F)=1$ is attained if $F$ is a subtree of a subdivided 3-star $K_{1,3}$. The maximum $D_{tw}(K_{1,n-1})=n$ is attained for the star graphs on $n\ge5$ vertices.

- $D_\kappa(F)\ge\frac mn+2$ whenever the number $m$ of edges in $F$ is larger than the number $n$ of vertices. Over graphs $F$ with $n$ vertices, the values of $D_\kappa(F)$ occupy the almost full spectrum $\{1,3,\ldots,n\}$.

#### Cops and Robbers on Intersection Graphs. (arXiv:1607.08058v1 [math.CO])

The cop number of a graph $G$ is the smallest $k$ such that $k$ cops win the game of cops and robber on $G$. We investigate the maximum cop number of geometric intersection graphs, which are graphs whose vertices are represented by geometric shapes and edges by their intersections. We establish the following dichotomy for previously studied classes of intersection graphs:

The intersection graphs of arc-connected sets in the plane (called string graphs) have cop number at most 15, and more generally, the intersection graphs of arc-connected subsets of a surface have cop number at most $10g+15$ in case of orientable surface of genus $g$, and at most $10g'+15$ in case of non-orientable surface of Euler genus $g'$. For more restricted classes of intersection graphs, we obtain better bounds: the maximum cop number of interval filament graphs is two, and the maximum cop number of outer-string graphs is between 3 and 4.

The intersection graphs of disconnected 2-dimensional sets or of 3-dimensional sets have unbounded cop number even in very restricted settings. For instance, we show that the cop number is unbounded on intersection graphs of two-element subsets of a line, as well as on intersection graphs of 3-dimensional unit balls, of 3-dimensional unit cubes or of 3-dimensional axis-aligned unit segments.

#### Satisfiability Checking meets Symbolic Computation (Project Paper). (arXiv:1607.08028v1 [cs.SC])

Symbolic Computation and Satisfiability Checking are two research areas, both having their individual scientific focus but sharing also common interests in the development, implementation and application of decision procedures for arithmetic theories. Despite their commonalities, the two communities are rather weakly connected. The aim of our newly accepted SC-square project (H2020-FETOPEN-CSA) is to strengthen the connection between these communities by creating common platforms, initiating interaction and exchange, identifying common challenges, and developing a common roadmap from theory along the way to tools and (industrial) applications. In this paper we report on the aims and on the first activities of this project, and formalise some relevant challenges for the unified SC-square community.

#### LWIP and Wi-Fi Boost Link Management. (arXiv:1607.08026v1 [cs.NI])

3GPP LWIP Release 13 technology and its prestandard version Wi-Fi Boost have recently emerged as an efficient LTE and Wi-Fi integration at the IP layer, allowing uplink on LTE and downlink on Wi-Fi. This solves all the contention problems of Wi-Fi and allows an optimum usage of the unlicensed band for downlink. In this paper, we present a new feature of Wi-Fi Boost, its radio link management, which allows the downlink traffic to be steered between LTE and Wi-Fi in an intelligent manner upon congestion detection. This customised congestion detection algorithm is based on IP probing, and can work with any Wi-Fi access point. Simulation results in a typical enterprise scenario show that LWIP R13 and Wi-Fi Boost can enhance network performance up to 5x and 6x over LTE-only, and 4x and 5x over Wi-Fi only networks, respectively, and that the proposed radio link management can further improve Wi-Fi Boost performance over LWIP R13 by up to 19%. Based on the promising results, this paper suggests enhancing LWIP R13 user feedback in future LTE releases.

#### Understanding the limits of LoRaWAN. (arXiv:1607.08011v1 [cs.NI])

The quick proliferation of LPWAN networks, LoRaWAN being one of the most adopted, has raised the interest of industry and network operators and facilitated the development of novel services based on large scale and simple network structures. LoRaWAN brings the desired ubiquitous connectivity to enable most outdoor IoT applications, and its growth and quick adoption are real proofs of that. Yet the technology has some limitations that need to be understood in order to avoid over-use of the technology. In this article we aim to provide an impartial overview of the limitations of this technology and, in a comprehensive manner, bring use case examples to show where the limits are.

#### System-level Scalable Checkpoint-Restart for Petascale Computing. (arXiv:1607.07995v1 [cs.DC])

Fault tolerance for the upcoming exascale generation has long been an area of active research. One of the components of a fault tolerance strategy is checkpointing. Petascale-level checkpointing is demonstrated through a new mechanism for virtualization of the InfiniBand UD (unreliable datagram) mode, and for updating the remote address on each UD-based send, due to lack of a fixed peer. Note that InfiniBand UD is required to support modern MPI implementations. An extrapolation from the current results to future SSD-based storage systems provides evidence that the current approach will remain practical in the exascale generation. This transparent checkpointing approach is evaluated using a framework of the DMTCP checkpointing package. Results are shown for HPCG (linear algebra), NAMD (molecular dynamics), and the NAS NPB benchmarks. In tests up to 24,000 MPI processes on 24,000 CPU cores, checkpointing of a computation with a 29 TB memory footprint in 10 minutes is demonstrated. Runtime overhead is reduced to less than 1%. The approach is also evaluated across three widely used MPI implementations.

#### PIWD: A Plugin-based Framework for Well-Designed SPARQL. (arXiv:1607.07967v1 [cs.DB])

In real-world datasets (e.g., the DBPedia query log), well-designed and-opt patterns (or WDAO-patterns) account for a large proportion of all SPARQL queries. In this paper, we present a plugin-based framework for all SELECT queries built on the notion of WDAO-patterns, named PIWD. Given a WDAO-query Q, PIWD first transforms a WDAO-pattern into a WDAO-tree, whose leaves contain only conjunctive queries (CQ) and whose inner nodes contain only the OPT operation. Second, we employ a CQ framework in its leaves. Finally, PIWD answers the query with a new query plan. The preliminary experimental results (that is, without optimization) show that PIWD can answer all WDAO-queries. Additionally, PIWD can support any CQ framework since it is independent of them.

#### A New Approach to SMS Steganography using Mathematical Equations. (arXiv:1607.07947v1 [cs.CR])

In the era of Information Technology, cyber-crime has always been a worrying issue for online users. Phishing, social engineering, and third party attacks have made people reluctant to share their personal information, even with trusted entities. Messages that are sent via Short Message Service (SMS) are easily copied and hacked by using special software. To enforce the security of sending messages through mobile phones, one solution is SMS steganography. SMS steganography is a technique that hides a secret message in the SMS. We propose a new approach for SMS steganography that uses a mathematical equation as the stego media in order to transmit the data. With this approach, we can hide up to 35 characters (25%) of a secret message in a single SMS with a maximum of 140 characters.

#### $H$-supermagic labelings for firecrackers, banana trees and flowers. (arXiv:1607.07911v1 [cs.DM])

A simple graph $G=(V,E)$ admits an $H$-covering if every edge in $E$ is contained in a subgraph $H'=(V',E')$ of $G$ which is isomorphic to $H$. In this case we say that $G$ is $H$-supermagic if there is a bijection $f:V\cup E\to\{1,\ldots,\lvert V\rvert+\lvert E\rvert\}$ such that $f(V)=\{1,\ldots,\lvert V\rvert\}$ and $\sum_{v\in V(H')}f(v)+\sum_{e\in E(H')}f(e)$ is constant over all subgraphs $H'$ of $G$ which are isomorphic to $H$. In this paper, we show that for odd $n$ and arbitrary $k$, the firecracker $F_{k,n}$ is $F_{2,n}$-supermagic, the banana tree $B_{k,n}$ is $B_{1,n}$-supermagic and the flower $F_n$ is $C_3$-supermagic.

#### Product Offerings in Malicious Hacker Markets. (arXiv:1607.07903v1 [cs.CR])

Marketplaces specializing in malicious hacking products - including malware and exploits - have recently become more prominent on the darkweb and deepweb. We scrape 17 such sites and collect information about such products in a unified database schema. Using a combination of manual labeling and unsupervised clustering, we examine a corpus of products in order to understand their various categories and how they become specialized with respect to vendor and marketplace. This initial study presents how we effectively applied unsupervised techniques to this data as well as the types of insights we gained on various categories of malicious hacking products.

### StackOverflow

#### how to use auto SVM on opencv 3 for image processing applications?

How do I use the automatic SVM training mode in OpenCV 3 for image processing applications? Please give me an example. So far I have used:

    SVM::Params params;
    params.svmType    = SVM::C_SVC;
    params.kernelType = SVM::LINEAR;
    params.termCrit   = TermCriteria(TermCriteria::MAX_ITER, 100, 1e-6);


but the SVM has auto mode.

How can I improve the accuracy of the system?

### Planet Theory

#### Suffix arrays with a twist

Authors: Tomasz Kowalski, Szymon Grabowski, Kimmo Fredriksson, Marcin Raniszewski
Abstract: The suffix array is a classic full-text index, combining effectiveness with simplicity. We discuss three approaches aiming to improve its efficiency even more: changes to the navigation, data layout and adding extra data. In short, we show that $(i)$ how we search for the right interval boundary significantly impacts the overall search speed, $(ii)$ a B-tree data layout easily wins over the standard one, $(iii)$ the well-known idea of a lookup table for the prefixes of the suffixes can be refined using compression, $(iv)$ caching prefixes of the suffixes in a helper array can pose a(nother) practical space-time tradeoff.

#### Approximation and Parameterized Complexity of Minimax Approval Voting

Authors: Marek Cygan, Łukasz Kowalik, Arkadiusz Socała, Krzysztof Sornat
Abstract: We present three results on the complexity of Minimax Approval Voting. First, we study Minimax Approval Voting parameterized by the Hamming distance $d$ from the solution to the votes. We show Minimax Approval Voting admits no algorithm running in time $\mathcal{O}^\star(2^{o(d\log d)})$, unless the Exponential Time Hypothesis (ETH) fails. This means that the $\mathcal{O}^\star(d^{2d})$ algorithm of Misra et al. [AAMAS 2015] is essentially optimal. Motivated by this, we then show a parameterized approximation scheme, running in time $\mathcal{O}^\star(\left({3}/{\epsilon}\right)^{2d})$, which is essentially tight assuming ETH. Finally, we get a new polynomial-time randomized approximation scheme for Minimax Approval Voting, which runs in time $n^{\mathcal{O}(1/\epsilon^2 \cdot \log(1/\epsilon))} \cdot \mathrm{poly}(m)$, almost matching the running time of the fastest known PTAS for Closest String due to Ma and Sun [SIAM J. Comp. 2009].

#### Minmax Tree Facility Location and Sink Evacuation with Dynamic Confluent Flows

Authors: Di Chen, Mordecai Golin
Abstract: Let $G=(V,E)$ be a graph modelling a building or road network in which edges have both travel times (lengths) and capacities associated with them. An edge's capacity is the number of people that can enter that edge in a unit of time.

In emergencies, people evacuate towards the exits. If too many people try to evacuate through the same edge, congestion builds up and slows down the evacuation.

Graphs with both lengths and capacities are known as Dynamic Flow networks. An evacuation plan for $G$ consists of a choice of exit locations and a partition of the people at the vertices into groups, with each group evacuating to the same exit. The evacuation time of a plan is the time it takes until the last person evacuates. The $k$-sink evacuation problem is to provide an evacuation plan with $k$ exit locations that minimizes the evacuation time. It is known that this problem is NP-Hard for general graphs but no polynomial time algorithm was previously known even for the case of $G$ a tree. This paper presents an $O(n k^2 \log^5 n)$ algorithm for the $k$-sink evacuation problem on trees. Our algorithms also apply to a more general class of problems, which we call minmax tree facility location.

#### On maximizing a monotone k-submodular function subject to a matroid constraint

Authors: Shinsaku Sakaue
Abstract: A $k$-submodular function is an extension of a submodular function in that its input is given by $k$ disjoint subsets instead of a single subset. For unconstrained nonnegative $k$-submodular maximization, Ward and Živný proposed a constant-factor approximation algorithm, which was improved by the recent work of Iwata, Tanigawa and Yoshida presenting a $1/2$-approximation algorithm. Iwata et al. also provided a $k/(2k-1)$-approximation algorithm for monotone $k$-submodular maximization and proved that its approximation ratio is asymptotically tight. More recently, Ohsaka and Yoshida proposed constant-factor algorithms for monotone $k$-submodular maximization with several size constraints. However, while submodular maximization with various constraints has been extensively studied, no approximation algorithm has been developed for constrained $k$-submodular maximization, except for the case of size constraints. In this paper, we prove that a greedy algorithm outputs a $1/2$-approximate solution for monotone $k$-submodular maximization with a matroid constraint. The algorithm runs in $O(M|E|(\text{MO} + k\text{EO}))$ time, where $M$ is the size of an optimal solution, $|E|$ is the size of the ground set, and $\text{MO}, \text{EO}$ represent the time for the membership oracle of the matroid and the evaluation oracle of the $k$-submodular function, respectively.

#### Algorithmic statistics: forty years later

Authors: Nikolai Vereshchagin, Alexander Shen
Abstract: Algorithmic statistics has two different (and almost orthogonal) motivations. From the philosophical point of view, it tries to formalize how statistics works and why some statistical models are better than others. After this notion of a "good model" is introduced, a natural question arises: is it possible that for some piece of data there is no good model? If yes, how often do these bad ("non-stochastic") data appear "in real life"?

Another, more technical motivation comes from algorithmic information theory. In this theory a notion of complexity of a finite object (=amount of information in this object) is introduced; it assigns to every object some number, called its algorithmic complexity (or Kolmogorov complexity). Algorithmic statistics provides a more fine-grained classification: for each finite object some curve is defined that characterizes its behavior. It turns out that several different definitions give (approximately) the same curve.

In this survey we try to provide an exposition of the main results in the field (including full proofs for the most important ones), as well as some historical comments. We assume that the reader is familiar with the main notions of algorithmic information (Kolmogorov complexity) theory.

### QuantOverflow

#### Why is CSA currency OIS rate used in discounting instead of local currency OIS?

I have been struggling to understand the logic behind cross currency OIS discounting (where cash flows happen in different currencies than the collateral is paid). I will illustrate my question through example with very much simplified numbers.

Let’s assume a world where:

    JPY OIS = 10% per day, flat
    USD OIS = 0% per day, flat
    USDJPY spot = 100
    USDJPY Forward for tomorrow = 100


My counterparty (you) is paying me tomorrow 100 JPY, and we have a “perfect” (daily calls, USD cash only, pays USD OIS interest) CSA.

Now, all sources I have found (see, for example, this), claim the same about the proper discounting process. We first convert the cash flow with forward rates to CSA currency,

    100 JPY / 100 USDJPY(tomorrow) = 1 USD


and then discount with the CSA currency OIS curve:

1/(1+0.0) = 1 USD is the value of your 100 JPY tomorrow and that you should pay me as collateral today. Any other valuation would give one of us arbitrage opportunity or cause unfair value transfer to one direction or another.

So, for example, the following discounting is completely wrong:

We discount the JPY cash flow with the JPY OIS:

    100 JPY / (1 + 0.1) = 90.91 JPY


And convert that at spot to USD:

    90.91 JPY / 100 USDJPY = 0.9091 USD


Now, clearly, if you default today, I can sell my 0.9091 USD and buy JPY, invest that at JPY OIS and receive 100 JPY tomorrow. So I should be happy. But every source I have claims that if I take only 0.9091 USD instead of 1 USD as collateral, I will lose (or win?) some money to you. I just do not understand where and how. Could someone describe step by step all transactions in detail that show the wealth transfer/arbitrage opportunity?
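The two valuations above can be reproduced in a few lines. This is only a sketch of the arithmetic in the question, using its toy numbers; the variable names are mine. The last quantity, `cip_forward`, is an addition that is not in the question: it is the forward rate that covered interest parity would imply if the two OIS rates were the only funding rates. The two methods give the same present value exactly when the quoted forward equals that number, which is not the case in this toy world.

```python
# Toy numbers from the question (per-day rates, one-day horizon).
jpy_ois = 0.10          # JPY OIS, per day
usd_ois = 0.00          # USD OIS, per day
spot = 100.0            # JPY per USD today
forward = 100.0         # JPY per USD for tomorrow
cash_flow_jpy = 100.0   # JPY received tomorrow

# Method 1: convert at the forward, then discount at the CSA (USD) OIS rate.
pv_csa = cash_flow_jpy / forward / (1 + usd_ois)

# Method 2: discount at JPY OIS, then convert at spot.
pv_local = cash_flow_jpy / (1 + jpy_ois) / spot

# Assumption for illustration: the forward that covered interest parity
# would imply from the two OIS rates alone.
cip_forward = spot * (1 + jpy_ois) / (1 + usd_ois)

print(round(pv_csa, 4), round(pv_local, 4), round(cip_forward, 2))
```

The gap between the two present values (1 USD versus about 0.9091 USD) therefore comes entirely from the gap between the quoted forward (100) and the parity forward (110).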

### Planet Theory

#### Computing exponentially faster: Implementing a nondeterministic universal Turing machine using DNA

Authors: Andrew Currin, Konstantin Korovin, Maria Ababi, Katherine Roper, Douglas B. Kell, Philip J. Day, Ross D. King
Abstract: The theory of computer science is based around Universal Turing Machines (UTMs): abstract machines able to execute all possible algorithms. Modern digital computers are physical embodiments of UTMs. The nondeterministic polynomial (NP) time complexity class of problems is the most significant in computer science, and an efficient (i.e. polynomial P) way to solve such problems would be of profound economic and social importance. By definition nondeterministic UTMs (NUTMs) solve NP complete problems in P time. However, NUTMs have previously been believed to be physically impossible to construct. Thue string rewriting systems are computationally equivalent to UTMs, and are naturally nondeterministic. Here we describe the physical design for a NUTM that implements a universal Thue system. The design exploits the ability of DNA to replicate to execute an exponential number of computational paths in P time. Each Thue rewriting step is embodied in a DNA edit implemented using a novel combination of polymerase chain reactions and site-directed mutagenesis. We demonstrate that this design works using both computational modelling and in vitro molecular biology experimentation. The current design has limitations, such as restricted error-correction. However, it opens up the prospect of engineering NUTM based computers able to outperform all standard computers on important practical problems.

### QuantOverflow

#### Options order "logs" - how is it named? And is it somewhere online? [duplicate]

I am trying to find "logs" of options orders somewhere. I mean: which order was posted when, for which option, and at what size. AFAIK, for stocks this is called the "ticker tape", or level-2. But is there such a thing for options? I have tried to google it, but possibly I just don't know the keywords to search for. What is it called, and can it be found online (not inside TOS etc.)?

I checked What data sources are available online? as it was suggested, but it does not tell anything about options tick data. What I am looking for - data (current and historical) on options orders (time,size,price) on major exchanges.

I think historical data is sold, but is there a way to download at least today's data somewhere?

regards, Anar

### TheoryOverflow

#### Do we take these vertices? [on hold]

I am looking at an exercise about the vertex cover.

We are given the undirected graph $G=(V,E)$ with $V=[10]$ and $E=\{(i, i+1)\mid i=1, \dots , 9\}$.

Before I use the approximation algorithm, I have to give the minimal vertex cover $C$ of $G$ and the length of $C$.

How do we find the minimal vertex cover without the approximation algorithm?

The vertex cover has to contain at least one vertex of each edge, right?

In the given graph, each vertex is connected to the next one, so do we maybe take, for example, the first one, then the third one, etc., or the second one, then the fourth one, etc.?
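Because the graph has only 10 vertices, the question can be checked by brute force. The sketch below is an illustration rather than part of the exercise, and the function name `minimum_vertex_cover` is my own. It tries vertex subsets in order of increasing size and returns the first one that touches every edge. Note that the five edges $(1,2), (3,4), \dots, (9,10)$ share no vertex, so any cover needs at least five vertices.

```python
from itertools import combinations

def minimum_vertex_cover(vertices, edges):
    """Return a smallest set of vertices that touches every edge (brute force)."""
    for size in range(len(vertices) + 1):
        for candidate in combinations(vertices, size):
            chosen = set(candidate)
            if all(u in chosen or v in chosen for u, v in edges):
                return chosen
    return set(vertices)

# The path graph from the exercise: V = {1, ..., 10}, E = {(i, i+1)}.
V = list(range(1, 11))
E = [(i, i + 1) for i in range(1, 10)]

cover = minimum_vertex_cover(V, E)
print(len(cover), sorted(cover))
```

This enumeration is exponential in the number of vertices, so it is only meant for checking small instances like this one.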

### Planet Theory

#### Counting matchings with k unmatched vertices in planar graphs

Abstract: We consider the problem of counting matchings in planar graphs. While perfect matchings in planar graphs can be counted by a classical polynomial-time algorithm, the problem of counting all matchings (possibly containing unmatched vertices, also known as defects) is known to be #P-complete on planar graphs. To interpolate between the hard case of counting matchings and the easy case of counting perfect matchings, we study the parameterized problem of counting matchings with exactly k unmatched vertices in a planar graph G, on input G and k. This setting has a natural interpretation in statistical physics, and it is a special case of counting perfect matchings in k-apex graphs (graphs that can be turned planar by removing at most k vertices).

Starting from a recent #W[1]-hardness proof for counting perfect matchings on k-apex graphs, we obtain that counting matchings with k unmatched vertices in planar graphs is #W[1]-hard. In contrast, given a plane graph G with s distinguished faces, there is an $O(2^s \cdot n^3)$ time algorithm for counting those matchings with k unmatched vertices such that all unmatched vertices lie on the distinguished faces. This implies an $f(k,s)\cdot n^{O(1)}$ time algorithm for counting perfect matchings in k-apex graphs whose apex neighborhood is covered by s faces.

#### A note on "Approximation schemes for a subclass of subset selection problems", and a faster FPTAS for the Minimum Knapsack Problem

Authors: Cédric Bentz, Pierre Le Bodic
Abstract: Pruhs and Woeginger prove the existence of FPTAS's for a general class of minimization and maximization subset selection problems. Without losing generality from the original framework, we prove how better asymptotic worst-case running times can be achieved if a $\rho$-approximation algorithm is available, and in particular we obtain matching running times between maximization and minimization subset selection problems. We directly apply this result to the Minimum Knapsack Problem, for which the original framework yields an FPTAS with running time $O(n^5/\epsilon)$, where $\epsilon$ is the required accuracy and $n$ is the number of items, and obtain an FPTAS with running time $O(n^3/\epsilon)$, thus improving the running time by a quadratic factor in the worst case.

## July 27, 2016

### CompsciOverflow

#### Huffman tree and maximum depth

Knowing the frequencies of each symbol, is it possible to determine the maximum height of the tree without applying the Huffman algorithm? Is there a formula that gives this tree height?
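For comparison, here is a minimal sketch that obtains the height by actually running the Huffman merging process, tracking only subtree weights and heights instead of building the tree. The function name `huffman_height` and the tie-breaking rule (equal weights are ordered by smaller height) are my assumptions; a different tie-breaking rule can produce a different height for the same frequencies. With $n$ symbols the height is at most $n-1$, and Fibonacci-like frequencies reach that bound.

```python
import heapq

def huffman_height(frequencies):
    """Height of a Huffman tree for the given symbol frequencies."""
    if len(frequencies) < 2:
        return 0
    # Each heap entry is (subtree weight, subtree height); ties on weight
    # are broken by the smaller height (an assumption of this sketch).
    heap = [(f, 0) for f in frequencies]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, h1 = heapq.heappop(heap)
        w2, h2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, max(h1, h2) + 1))
    return heap[0][1]

print(huffman_height([1, 1, 2, 3, 5, 8]))  # Fibonacci-like weights: skewed tree
print(huffman_height([1] * 8))             # equal weights: balanced tree
```

The two calls illustrate the extremes: six Fibonacci-like weights give height 5, while eight equal weights give height 3.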

### Lobsters

#### Ten Year Anniversary of Core 2 Duo

Pretty amazing. I have a Core 2 laptop (more than one actually) and I’m still quite happy with their performance, even ten years after introduction. In 2006, I could not even imagine still using a computer from 1996.

### StackOverflow

#### Python implementation of Clustering Based Local Outlier Factor

I'm doing a project that requires unsupervised anomaly detection. Does anyone know of a complete Python implementation of the Clustering Based Local Outlier Factor (CBLOF) as described in this paper?

This is different than the k-nn implementation of LOF which I'm aware is implemented here.

### Fefe

#### Can Microsoft make itself and Windows 10 even more unpopular ...

Can Microsoft make itself and Windows 10 even more unpopular?

#### Infrastructure apocalypse: it rained in Berlin.

Infrastructure apocalypse: it rained in Berlin.

#### The Pokemon bullshit has now led to the closure of a ...

The Pokemon bullshit has now led to the closure of a bridge in Düsseldorf, and confused Pokemon hunters are running onto Bundeswehr firing ranges while live ammunition is being fired there.

I am slowly getting to the point of wanting to leave the matter to Mr. Darwin.

### StackOverflow

#### encoding categorical variables in libsvm

Is there a class in libsvm that can automatically encode string/categorical features? I found something called libsvmstringoutcomedatawriter. Which type of encoding does it use? One-hot encoding?

### QuantOverflow

#### Realized "efficient" frontier. Is this reasonable?

I have performed some out-of-sample analysis of mean-variance optimization with monthly rebalancing. Studying the "realized efficient frontier", I am worried that something is wrong. Since the frontier is the "realized outcome", I am aware that the frontier might not be efficient, which can be seen in the figure (we have higher volatility for lower return). In the research I have read on out-of-sample performance with MVO, the authors usually obtain frontiers that are convex and coincide with the theory (i.e. more risk usually implies more return). I need a second opinion regarding the following figure: is it a reasonable shape? The numbers in the figure are the target volatilities during the portfolio optimization, and the difference between the plots is the method I have used to estimate the covariance matrix.

Figure of the frontiers

### TheoryOverflow

#### Is it possible to verify a typechecker for a total dependently-typed language in that language's logic?

I understand the diagonalization argument against implementing an eval function in a total language, and that typechecking in a dependently typed language requires evaluation, so implementing a typechecker for a total dependently typed language in itself is out. But is it possible to write a typechecker T for total dependently typed language A in (possibly non-total) language B, formalize B in A, and prove the correctness/termination/etc. of T in that formalization (without relying on cheats like including a primitive operation in B that does typechecking)? If not, is it possible to add some (presumably non-computable?) axiom to A that is not included in T and allows the proof to go through without allowing verifying incorrect implementations?

More concretely, suppose I have some formalization of Turing machines in Coq and write a Turing machine that takes a finite text stream (encoded in some reasonable format) and outputs a 1 if it represents a well-typed Gallina fragment and a 0 otherwise. Could I prove termination and correctness of that Turing machine in Coq itself?

### CompsciOverflow

#### Closed form solution for optimization problem

Consider the problem of finding the real-valued matrix $C$ so that

$$\|S-AC\|_F^2\qquad(1)$$

is minimal. ($S$ and $A$ are real-valued matrices and $\|\cdot\|_F$ denotes the Frobenius norm.) This problem has the closed-form solution $C=A^+S$, where $A^+$ is the pseudo-inverse of $A$.

I need to control the magnitude of the elements in $C$, so an extra regularization term $\varepsilon\|C\|_F^2$ must be added to (1). Is there a closed-form solution for minimizing

$$\|S-AC\|_F^2 + \varepsilon\|C\|_F^2\,?\qquad(2)$$
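For reference, a short derivation sketch using the standard ridge (Tikhonov) argument. It assumes $\varepsilon > 0$ and writes $I$ for the identity matrix, which is notation not used above. Setting the gradient of (2) with respect to $C$ to zero gives

$$-2A^\top(S-AC) + 2\varepsilon C = 0 \;\Longrightarrow\; (A^\top A + \varepsilon I)\,C = A^\top S \;\Longrightarrow\; C = (A^\top A + \varepsilon I)^{-1}A^\top S.$$

Since $A^\top A + \varepsilon I$ is positive definite for $\varepsilon > 0$, the inverse exists and the minimizer is unique; as $\varepsilon \to 0$ it tends to the unregularized solution $A^+S$.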

### StackOverflow

#### Education Evaluation and prediction Engine [on hold]

I am going to build an education evaluation engine that evaluates students and predicts their performance, so I have been searching for prediction servers or algorithms. Which server or algorithm would be useful to me?

Note: I use PHP and MySQL.

machine learning algorithms

#### 'ShuffleSplit' object has no attribute 'items' (Python - Machine Learning)

I am trying to run a function that looks like so:

    from sklearn.metrics import fbeta_score, make_scorer
    from sklearn.tree import DecisionTreeRegressor
    from sklearn.cross_validation import ShuffleSplit
    from sklearn import grid_search, datasets

    def fit_model(X, y):
        """ Performs grid search over the 'max_depth' parameter for a
            decision tree regressor trained on the input data [X, y]. """

        # Create cross-validation sets from the training data
        cv_sets = ShuffleSplit(X.shape[0], n_iter = 10, test_size = 0.20, random_state = 0)

        # TODO: Create a decision tree regressor object
        regressor = DecisionTreeRegressor(random_state=0)

        # TODO: Create a dictionary for the parameter 'max_depth' with a range from 1 to 10
        params = {'max_depth': range(1,10)}

        # TODO: Transform 'performance_metric' into a scoring function using 'make_scorer'
        scoring_fnc = make_scorer(performance_metric, beta=2)

        # TODO: Create the grid search object
        grid = grid_search.GridSearchCV(regressor, params, scoring_fnc, cv_sets)

        # Fit the grid search object to the data to compute the optimal model
        grid = grid.fit(X, y)

        # Return the optimal model after fitting the data
        return grid.best_estimator_


Then I have this code:

    # Fit the training data to the model using grid search
    reg = fit_model(X_train, y_train)

    # Produce the value for 'max_depth'
    print "Parameter 'max_depth' is {} for the optimal model.".format(reg.get_params()['max_depth'])


However, when running this, I get this error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-53-192f7c286a58> in <module>()
1 # Fit the training data to the model using grid search
----> 2 reg = fit_model(X_train, y_train)
3
4 # Produce the value for 'max_depth'
5 print "Parameter 'max_depth' is {} for the optimal model.".format(reg.get_params()['max_depth'])

<ipython-input-52-38c7925dbfef> in fit_model(X, y)
25
26     # Fit the grid search object to the data to compute the optimal model
---> 27     grid = grid.fit(X, y)
28
29     # Return the optimal model after fitting the data

/Users/--------/anaconda/lib/python2.7/site-packages/sklearn/grid_search.pyc in fit(self, X, y)
802
803         """
--> 804         return self._fit(X, y, ParameterGrid(self.param_grid))
805
806

/Users/--------/anaconda/lib/python2.7/site-packages/sklearn/grid_search.pyc in _fit(self, X, y, parameter_iterable)
551                                     self.fit_params, return_parameters=True,
552                                     error_score=self.error_score)
--> 553                 for parameters in parameter_iterable
554                 for train, test in cv)
555

/Users/-------/anaconda/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.pyc in __call__(self, iterable)
802             self._iterating = True
803
--> 804             while self.dispatch_one_batch(iterator):
805                 pass
806

/Users/--------/anaconda/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.pyc in dispatch_one_batch(self, iterator)
660                 return False
661             else:
663                 return True
664

/Users/--------/anaconda/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.pyc in _dispatch(self, batch)
568
569         if self._pool is None:
--> 570             job = ImmediateComputeBatch(batch)
571             self._jobs.append(job)
572             self.n_dispatched_batches += 1

/Users/---------/anaconda/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.pyc in __init__(self, batch)
181         # Don't delay the application, to avoid keeping the input
182         # arguments in memory
--> 183         self.results = batch()
184
185     def get(self):

/Users/-----------/anaconda/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.pyc in __call__(self)
70
71     def __call__(self):
---> 72         return [func(*args, **kwargs) for func, args, kwargs in self.items]
73
74     def __len__(self):

/Users/----------/anaconda/lib/python2.7/site-packages/sklearn/cross_validation.pyc in _fit_and_score(estimator, X, y, scorer, train, test, verbose, parameters, fit_params, return_train_score, return_parameters, error_score)
1515     fit_params = fit_params if fit_params is not None else {}
1516     fit_params = dict([(k, _index_param_value(X, v, train))
-> 1517                       for k, v in fit_params.items()])
1518
1519     if parameters is not None:

AttributeError: 'ShuffleSplit' object has no attribute 'items'


What is the issue?
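A likely cause, judging from the traceback: the last frame calls `fit_params.items()` on the `ShuffleSplit` object, which suggests that `cv_sets` was bound to the `fit_params` parameter rather than to `cv`. In the `sklearn.grid_search` module of that era the constructor's positional order was, as far as I recall, `estimator, param_grid, scoring, fit_params, ...`, with `cv` further down the list. The sketch below does not use scikit-learn at all; `grid_search_cv` is a hypothetical stand-in that only mirrors that assumed parameter order, and the string placeholders stand in for the question's objects.

```python
def grid_search_cv(estimator, param_grid, scoring=None, fit_params=None,
                   n_jobs=1, iid=True, refit=True, cv=None):
    """Hypothetical stand-in that only mirrors the assumed parameter order of
    the old sklearn.grid_search.GridSearchCV constructor; it runs no search."""
    return {"scoring": scoring, "fit_params": fit_params, "cv": cv}


# Placeholders standing in for the question's regressor, parameter grid,
# scorer and ShuffleSplit instance.
regressor = "regressor"
params = {"max_depth": range(1, 10)}
scoring_fnc = "scorer"
cv_sets = "cv_sets"

# As written in the question: the fourth positional argument lands in fit_params.
positional = grid_search_cv(regressor, params, scoring_fnc, cv_sets)

# Passing by keyword binds each object to the intended parameter.
by_keyword = grid_search_cv(regressor, params, scoring=scoring_fnc, cv=cv_sets)

print(positional["fit_params"], positional["cv"])  # cv_sets None
print(by_keyword["fit_params"], by_keyword["cv"])  # None cv_sets
```

If that reading is right, the corresponding change in the question's code would be `grid_search.GridSearchCV(regressor, params, scoring=scoring_fnc, cv=cv_sets)`.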

### TheoryOverflow

#### Is there a linear space lower bound for streaming set equality?

Consider two streams. In each stream one string arrives at a time. A query asks: Is the set of strings that has arrived so far the same in both streams?

Is there a linear space randomized lower bound for this problem?

If the sets of strings had been given in advance we could just sort them, concatenate them in each set and then compute a random fingerprint of each of the two concatenated strings which can then be compared quickly using very little space (although there is of course no benefit as we first created something of linear size).
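The offline approach described above can be sketched as follows. This is only an illustration under assumptions of my own: the fingerprint is a Karp-Rabin style polynomial evaluation modulo a Mersenne prime, strings are length-prefixed before concatenation so that the boundaries stay unambiguous, and `set_fingerprint` is a made-up name.

```python
import random

MERSENNE_PRIME = (1 << 61) - 1  # modulus for the polynomial fingerprint


def set_fingerprint(strings, r):
    """Fingerprint of a set of strings: deduplicate, sort, concatenate with
    length prefixes, then evaluate the byte sequence as a polynomial at r."""
    fp = 0
    for s in sorted(set(strings)):
        data = s.encode("utf-8")
        # The 4-byte length prefix keeps {"ab", "c"} distinct from {"a", "bc"}.
        for byte in len(data).to_bytes(4, "big") + data:
            # The +1 makes every coefficient nonzero, so leading zero bytes count.
            fp = (fp * r + byte + 1) % MERSENNE_PRIME
    return fp


r = random.randrange(2, MERSENNE_PRIME)  # evaluation point shared by both sides
stream_a = ["pear", "apple", "fig", "apple"]
stream_b = ["fig", "pear", "apple"]
print(set_fingerprint(stream_a, r) == set_fingerprint(stream_b, r))  # True
```

Equal sets always produce equal fingerprints; different sets collide only if `r` happens to be a root of the difference polynomial, which is unlikely for a random `r`. Note that the sort and deduplication steps are exactly what requires linear space here.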

In the streaming case, we would need an updatable fingerprint it seems to do something similar. I suspect this is not possible to do in sublinear space but I don't see a proof yet.

### QuantOverflow

#### Swap Rates, OIS vs LIBOR, and multiple curves

I was reading through a paper that attempted to present a theoretical explanation for the divergence in value of different LIBOR tenors (and thus for the use of different curves for different tenors). The author's framework explained differences in terms of FRA and basis swap spreads (in Conjecture 6, p.19), and I was wondering:

1. have any theoretical frameworks been developed that are used in practice?
2. are there any other reasons (beyond empirical fitting) for using multiple curves (inspired by the answers to this question)?

Finally, in the framework of multiple curves, is it then appropriate to say that 1Y LIBOR is above 1Y swaps because the swaps use OIS discounting (and are collateralized)? If so, does this extend to explaining any of the difference in long term rates (10Y swaps below 10Y treasuries), or are these dynamics better explained by regulatory environments and repo markets?

Thanks

### StackOverflow

#### How can I use sklearn.naive_bayes with categorical features?

I want to learn a Naive Bayes model for a problem where the class is boolean (takes on one of two values). Some of the features are boolean, but other features are categorical and can take on a small number of values (~5).

If all my features were boolean then I would want to use sklearn.naive_bayes.BernoulliNB. It seems clear that sklearn.naive_bayes.MultinomialNB is not what I want.

One solution is to split up my categorical features into boolean features. For instance, if a variable "X" takes on values "red", "green", "blue", I can have three variables: "X is red", "X is green", "X is blue". That violates the assumption of conditional independence of the variables given the class, so it seems totally inappropriate.

Another possibility is to encode the variable as a real-valued variable where 0.0 means red, 1.0 means green, and 2.0 means blue, and then use GaussianNB. That also seems totally inappropriate (for obvious reasons).

What I'm trying to do doesn't seem weird, but I don't understand how to fit it into the Naive Bayes models that sklearn gives me. It's easy to code up myself, but I prefer to use sklearn if possible for obvious reasons (mostly: to avoid bugs).

[Edit to explain why I don't think multinomial NB is what I want]:

My understanding is that in multinomial NB the feature vector consists of counts of how many times a token was observed in k iid samples.

My understanding is that this is a fit for document classification, where there is an underlying class of document, and then each word in the document is assumed to be drawn from a categorical distribution specific to that class. A document would have k tokens, the feature vector would be of length equal to the vocabulary size, and the sum of the feature counts would be k.

In my case, I have a number of bernoulli variables, plus a couple categorical ones. But there is no concept of the "counts" here.

Example: classes are people who like or don't like math. Predictors are college major (categorical) and whether they went to graduate school (boolean).

I don't think this fits multinomial since there are no counts here.
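To make the "code it up myself" option concrete, here is a minimal sketch of a Naive Bayes classifier in which each feature gets its own categorical distribution per class, so a boolean feature is simply the two-category case. The class name, the Laplace smoothing parameter `alpha`, and the toy data are all made up for illustration. Recent scikit-learn releases also include `sklearn.naive_bayes.CategoricalNB`, which covers the same model, though it did not exist when this question was asked.

```python
import math
from collections import Counter, defaultdict


class CategoricalNaiveBayes:
    """Naive Bayes with one categorical distribution per feature and class."""

    def fit(self, rows, labels, alpha=1.0):
        self.alpha = alpha
        self.class_counts = Counter(labels)
        n_features = len(rows[0])
        # values[j] = set of categories seen for feature j
        self.values = [set(row[j] for row in rows) for j in range(n_features)]
        # counts[c][j][v] = number of class-c rows whose feature j equals v
        self.counts = defaultdict(lambda: [Counter() for _ in range(n_features)])
        for row, c in zip(rows, labels):
            for j, v in enumerate(row):
                self.counts[c][j][v] += 1
        return self

    def log_posterior(self, row):
        """Unnormalized log P(class | row) for every class."""
        total = sum(self.class_counts.values())
        scores = {}
        for c, n_c in self.class_counts.items():
            score = math.log(n_c / total)
            for j, v in enumerate(row):
                # Laplace-smoothed estimate of P(feature j = v | class c)
                num = self.counts[c][j][v] + self.alpha
                den = n_c + self.alpha * len(self.values[j])
                score += math.log(num / den)
            scores[c] = score
        return scores

    def predict(self, row):
        scores = self.log_posterior(row)
        return max(scores, key=scores.get)


# Made-up toy data: (college major, went to graduate school) -> likes math
rows = [("math", True), ("physics", True), ("math", False),
        ("history", False), ("art", False), ("history", True)]
labels = [True, True, True, False, False, False]

model = CategoricalNaiveBayes().fit(rows, labels)
print(model.predict(("math", True)))      # True
print(model.predict(("history", False)))  # False
```

No counts of repeated draws are involved: each row contributes exactly one observation per feature, which is the distinction from the multinomial model discussed above.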

### TheoryOverflow

#### Given a subset of vertices, find a cycle of a minimal number of edges that traverses all vertices in the subset

I am looking for an algorithm that given a connected, bridge-less, undirected graph and a subset of vertices, finds a cycle that traverses all the vertices in the given subset. However, I also need the cycle to contain a minimal number of edges.

For example:

The subset here is the two blackened vertices and the result would be the cycle in red.

Any ideas? Is there a polynomial algorithm that solves this?

Any help will be greatly appreciated. Thanks!

### QuantOverflow

#### FX forward with stochastic interest rates pricing

I would like to extend the following question about FX Forward rates in stochastic interest rate setup: "Expectation" of a FX Forward

We consider a FX process $X_t = X_0 \exp( \int_0^t(r^d_s-r^f_s)ds -\frac{\sigma^2}{2}t+ \sigma W_t)$ where $r^d$ and $r^f$ are stochastic processes not independent of the Brownian motion $W$. As we know the FX Forward rate is $F^X(t,T) = E_t^d\left[X_T \right]$ under the domestic risk-neutral measure.

The question is how to show that $F^X(t,T) = X_t \frac{B_f(t,T)}{B_d(t,T)}$ where $B_d(t,T)$ and $B_f(t,T)$ are respectively the domestic and foreign zero-coupon bond prices of maturity $T$ at time $t$.

Since $X_T = X_t \exp\left( \int_t^T(r^d_s-r^f_s)ds-\frac{\sigma^2}{2}(T-t)+ \sigma (W_T-W_t)\right)$ \begin{align} F^X(t,T) &= X_t E_t^d\left[\exp\left( \int_t^T(r^d_s-r^f_s)ds-\frac{\sigma^2}{2}(T-t) + \sigma (W_T-W_t)\right)\right] \\& =X_t E_t^d\left[\exp\left( \int_t^T(r^d_s-r^f_s)ds \right) \frac{\mathcal E_T(\sigma W )}{\mathcal E_t(\sigma W )}\right] \\&= X_t E_t^d\left[\exp\left( \int_t^T(r^d_s-r^f_s)ds \right) \frac{d\mathcal Q^f}{d\mathcal Q^d} \frac{1}{E_t^d \left[\frac{d\mathcal Q^f}{d\mathcal Q^d}\right]}\right] \\&= X_t E_t^f\left[\exp\left( \int_t^T(r^d_s-r^f_s)ds \right) \right] \end{align}

Now how to conclude, given that $r^d$ and $r^f$ are not necessarily independent of each other since they both depend on the Brownian motion $W$ (by the way, let's assume we are working in the natural filtration of $W$)?

Edit

I would like to extend my question to the pricing of non-deliverable FX forwards. I posted a new question for that here : FX Forward pricing.

### StackOverflow

#### SOM multiple BMU, need to work on one random one or all of it

I am currently training my own self-organizing map (SOM). I am new to this. The nodes are all initialized with random values. I was wondering what I need to do if a training step returns multiple best matching units (BMUs). For the update phase, do I need to take only one of them, or all of them?

#### Compose function signature

I've read that the composition of g :: A -> B and f :: B -> C, pronounced (“f composed of g”), results in another function (arrow) from A -> C. This can be expressed more formally as

f • g = f(g) = compose :: (B -> C) -> (A -> B) -> (A -> C)

Can the above composition also be defined as below? Please clarify. In this case the compose function takes the same two functions f and g as a pair and returns a new function from A -> C.

f • g = f(g) = compose :: ((B -> C), (A -> B)) -> (A -> C)
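The two signatures can be compared side by side in a small sketch. It is written in Python with type hints rather than in the arrow notation above, and the names `compose_curried` and `compose_tupled` are made up for illustration. The curried form takes f and returns a function that awaits g; the tupled form takes the pair (f, g) in one go. Both build the same A -> C function, and each form can be obtained from the other by currying or uncurrying.

```python
from typing import Callable, Tuple, TypeVar

A = TypeVar("A")
B = TypeVar("B")
C = TypeVar("C")


def compose_curried(f: Callable[[B], C]) -> Callable[[Callable[[A], B]], Callable[[A], C]]:
    """(B -> C) -> (A -> B) -> (A -> C): takes f, returns a function awaiting g."""
    return lambda g: lambda x: f(g(x))


def compose_tupled(fg: Tuple[Callable[[B], C], Callable[[A], B]]) -> Callable[[A], C]:
    """((B -> C), (A -> B)) -> (A -> C): takes the pair (f, g) at once."""
    f, g = fg
    return lambda x: f(g(x))


g = len                    # g :: str -> int
f = lambda n: n % 2 == 0   # f :: int -> bool

print(compose_curried(f)(g)("abcd"))   # True
print(compose_tupled((f, g))("abcd"))  # True
```

The practical difference is partial application: `compose_curried(f)` is already a useful value on its own, whereas the tupled version needs both functions before it can be called.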

#### low SVM accuracy on train and test sets in python

I'm porting some matlab/octave scripts for support vector machines (SVMs) to python but I'm getting poor accuracy in one of two scripts with the sklearn method.

ex6_spam.py loads some data and trains a spam-detecting model.

In matlab, the SVM code provided, svmTrain.m, (see below for snippets) gives me ~99% accuracy in both the training and the test sets.

In python, sklearn.svm.SVC().fit() is giving me ~56% if I just use their linear kernel, and ~44% if I precompute the Gram matrix for a linear kernel. (The data and code - ex6_spam.py - are here.)

The odd thing, too, is that the exact same piece of code used in ex6.py gives me proper classification of 2D data points. Its behavior there is almost identical to the matlab/octave script.

I'm not doing much in ex6_spam.py - I load a training set:

mat = scipy.io.loadmat('spamTrain.mat')
X = mat["X"]
y = mat["y"]


I feed it to sklearn.svm.SVC().fit():

C = 0.1
model = svmt.svmTrain(X, y, C, "linear")
# this results in
#        clf = svm.SVC(C = C, kernel=kernelFunction, tol=tol, max_iter=max_passes, verbose=2)
#        return clf.fit(X, y)


and then I make a prediction:

p = model.predict(X)


The matlab/octave equivalent is

load('spamTrain.mat');

C = 0.1;
model = svmTrain(X, y, C, @linearKernel); # see the link to svmTrain.m above

p = svmPredict(model, X);


However, the results are wildly different. Any ideas why? I haven't had the chance to run it on a different computer, but maybe that's a possible reason?

### CompsciOverflow

#### Given an amount of sets with numbers, find a set of numbers not including any of the given

Given a number of sets of numbers (from 0-20, for example), we are asked to find the maximum set of numbers from 0-20 that doesn't include any of the given sets (it can include numbers from a set, but not the whole set). For example, setting the max number to 8 and given the sets

{1,2} {2,3} {7} {3,4} {5,6,4}, one maximum solution is the set {1, 3, 5, 6, 8}. I was thinking of representing it as a graph and then reducing it to the Max Independent Set problem, but that seems to work only if the sets consisted only of pairs, which is not the case. Any idea? Thanks in advance.
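One way to look at it: a set avoids containing any given set entirely exactly when the numbers left out of it "hit" every given set at least once, so the largest valid set is the complement of a smallest hitting set. With pairs only this is the graph case mentioned above; with larger sets it becomes the same question on a hypergraph. The brute-force sketch below illustrates the idea on the example. It is exponential in the worst case, the function name is made up, and it assumes the universe is 1..8 (to match the stated solution) and that every given set lies inside the universe.

```python
from itertools import combinations


def largest_set_avoiding(universe, forbidden):
    """Largest subset of `universe` containing none of the `forbidden` sets
    entirely (assumes each forbidden set is a subset of `universe`).
    Brute force: try removal sets of increasing size and stop at the first
    one that intersects every forbidden set."""
    universe = set(universe)
    forbidden = [set(s) for s in forbidden]
    # Only elements that occur in some forbidden set are worth removing.
    candidates = sorted(set().union(*forbidden))
    for k in range(len(candidates) + 1):
        for removed in combinations(candidates, k):
            if all(s & set(removed) for s in forbidden):
                return universe - set(removed)
    return None  # an empty forbidden set is contained in every subset


given = [{1, 2}, {2, 3}, {7}, {3, 4}, {5, 6, 4}]
best = largest_set_avoiding(range(1, 9), given)
print(sorted(best))  # [1, 3, 5, 6, 8]
```

Here the smallest removal set is {2, 4, 7}: 7 must go because {7} is itself a given set, 2 covers {1,2} and {2,3}, and 4 covers {3,4} and {5,6,4}.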

### StackOverflow

#### TensorFlow: Does it only have SGD algorithms? or does it also have others like LBFGS

I was looking at the video and model on the site, and it appeared to only have SGD as an algorithm for machine learning. I was wondering if other algorithms are also included in tensorflow, such as L-BFGS.

### Lobsters

#### Why It's Nearly Impossible To Stop This Amazon and eBay Scheme

Arbitrage, not just for the big banks anymore.

### QuantOverflow

#### So i found a bizzaro relationship between XAU XAG, any idea's and how to deal with this?

Early warning: I am currently in high school and have not studied math past Algebra 2, so I do not have a thorough understanding of the higher level concepts I am using so if the answer is blatantly simple please don't go hard on me...

Anyways, I have been working on correlation stat arb and up to this point all the results I have seen were in backtest. Now my algo is live, so while I am waiting on results I decided to visualize the relationship between XAUUSD and XAGUSD. I had a hypothesis of a linear relationship with a consistent confidence interval, but I found something bizarre when plotting. It appears that as the price rises, confidence intervals widen dramatically and the curve is not linear in appearance. Brainstorming, I devised a few explanations but would prefer to have insight on this issue.

1st idea: The slope of the relationship is in fact linear but varied during the recovery vs the recession (this is explained by the 3 slopes, two of which are very high in price as gold and silver rose with financial and economic uncertainty). Confidence intervals are also consistent in a smaller sample, but shift wider as volatility rises.

2nd idea: The relationship is not linear, a different regression method should be used to estimate confidence intervals and expected value.

Can anyone explain this plssss?

Any idea where lies the problem? Thank you for suggestions.

#### Basic LIBOR curve question

I'm new to the quant finance and have a very basic question about LIBOR curve.

LIBOR is published every day for 4 different tenors (1M, 3M, 6M, 1Y), and each rate means how much annual interest should be paid when leading banks borrow money from another.

In my understanding, there should be a unique LIBOR yield curve, in which 1M, 3M, 6M, 1Y point values are the same as the quoted value above.

But it doesn't seem to be the case. There's a separate LIBOR curve for each of the 4 different tenors. Given this, what does the value of the 1M LIBOR curve at the 1Y point mean?

And, when you model LIBOR using short rate model, you're modelling the unique LIBOR short rate, not the LIBOR of each tenor separately. Correct?

Thx!

### Planet Emacsen

#### John Stevenson: Spacemacs - Adding Custom Snippets to Yasnippet

Using yasnippet saves time by avoiding the need to write boilerplate code and minimising other commonly typed content. YASnippet contains mode-specific snippets that expand to anything from a simple text replacement to a code block structure that allows you to skip through parameters and other sections of the code block. See YASnippet in action in this Emacs Yasnippet video.

To use a specific snippet simply type the alias and press M-/. For example, in html-mode typing div and pressing M-/ expands to <div id="▮" class="▯">▯</div> and places the cursor so you can type in the id name, then TAB to the class name, finally TAB to the contents of the div.

You can also combine yasnippets with autocompletion and select snippets from the autocompletion menu.

Spacemacs has lots of snippets for most of the languages and modes it supports. However, YASnippet also uses a simple template system in plain text, so it's pretty easy to learn. Let's look at how to add your own snippets with Spacemacs.

In regular Emacs, yasnippet's expand function is usually bound to TAB, but that key is used already in Spacemacs so M-/ is used instead.
If you just want text replacement you can also use Emacs Abbrev mode.

The easiest place to add your own snippet definitions is in the ~/.emacs.d/private/snippets directory. Under this directory structure you should create a folder named after the relevant mode for your snippets, e.g. markdown-mode. Inside this mode folder, create files whose names are based on the snippet alias you wish.

So for a work in progress snippet called wip in markdown mode I created the ~/.emacs.d/private/snippets/markdown-mode/wip file.

You need to load this new snippet into Spacemacs by either restarting or using the command M-x yas-load-snippet-buffer in the buffer of the new snippet you have just written. The snippet will then work within any markdown mode buffer.

Although the private snippets directory is easy to use, it is not under version control. So although it's not overridden by Spacemacs, it also means your private snippets are not backed up anywhere.

If you use the ~/.spacemacs.d/snippets/modename-mode/ directory structure for your snippets then you can version them with Git or similar versioning tools.

# How to write a snippet

Typically each snippet template is contained in its own file, named after the alias of the snippet. So a snippet called wip will be in a filename wip, in a directory named after the relevant Emacs mode.

The basic structure of a snippet template is:

The content can be anything: simple text or, more usefully, a code structure with placeholders for tab stops. You can even include Emacs lisp (elisp) code in there too.

## Example: Simple text replacement

I use markdown mode for writing a lot of content, especially for technical workshops. As I am developing these workshops it's useful to highlight which sections are still work in progress. Rather than type the common message I use, I’ve created a simple snippet called wip.

When you expand this snippet with M-/ then the snippet name is replaced by the content.

## Example: Using tab stops

Let's look at an existing snippet called form in the html-mode. This expands into an html form, but also helps you jump between method, id, action and content.

This snippet is the same as the simpler example, except we have added tab stops using the $ sign and a number. When you expand this snippet, the snippet name is replaced by the content as usual but the cursor is placed at the first tab stop $1. Each time you press TAB you move to the next tab stop.

$0 is our exit point from the snippet, so pressing TAB reverts to the usual behaviour outside of YASnippet.

# Testing your snippets

Once you have written your snippet, you can quickly test it using M-x yas-tryout-snippet. This opens a new empty buffer in the appropriate major mode and inserts the snippet so you can then test it with M-/.

If you just want to try the snippet in an existing buffer, then use M-x yas-load-snippet-buffer to load this new snippet into the correct mode. M-x yas-load-snippet-buffer-and-close does exactly the same except it kills the snippet buffer (prompting to save first if necessary).

There are no default keybindings for these commands in Spacemacs, so you could create a binding under C-o, for example C-o C-s t to try a snippet and C-o C-s l to load a snippet.

# Adding yas-snippets to autocompletion in Spacemacs

By adding the autocompletion layer in Spacemacs the YASnippets can be shown in the autocompletion menu as you type.

By default, snippets are not shown in the auto-completion popup, so set the variable auto-completion-enable-snippets-in-popup to t.

# Summary

Find out more about YASnippets and autocompletion from the Github repository for Spacemacs autocompletion layer.

For more details and examples on writing your own snippets, take a look at:

Thank you.
@jr0cket

### CompsciOverflow

#### Is there a similar categorization to Elementary Cellular Automata?

Elementary cellular automata show different systems and fractals like rule 30, rule 90, and rule 110. I want to know if there is another classification or type which shows similar structures just like in elementary cellular automata (ECA). For example, what other systems can show structures similar to the Sierpinski triangle (which is the same as rule 90 in ECA)? Is there a method to build rule 30 or 60 using a system other than ECA?

### QuantOverflow

#### how to derive yield curve from interest rate swap?

According to some textbooks, to derive the yield curve, quote

- overnight to 1 week: rates from interbank money market deposit,
- 1 month to 1 year: LIBOR;
- 1 year to 7 years: Interest Rate Swap;
- 7 years above: government bond.

I'm a bit lost here: how can an IRS rate be used to derive the yield curve? The yield rate is the discount rate: if $yield(5\text{ years}) = 4.1\%$, it means the NPV of 1 dollar 5 years later is $NPV(1\text{ dollar}, 5\text{ years}) = 1/(1+4.1\%)^5 = 0.818$. An interest rate swap, on the other hand, is a contract between two legs. Assume a 5-year IRS contract is

- leg A pays fixed rate to B @ 8.5%, while A receives floating rate @ LIBOR + 1.5%
- leg B pays floating rate to A @ LIBOR + 1.5%, B receives fixed rate @ 8.5%.

How could this swap contract help in deriving the 5-year yield rate?

### AWS

#### Amazon RDS for SQL Server – Support for Native Backup/Restore to Amazon S3

Regular readers of this blog will know that I am a big fan of Amazon Relational Database Service (RDS). As a managed database service, it takes care of the more routine aspects of setting up, running, and scaling a relational database. We first launched support for SQL Server in 2012. Continuing our effort to add features that have included SSL support, major version upgrades, transparent data encryption, enhanced monitoring and Multi-AZ, we have now added support for SQL Server native backup/restore.

SQL Server native backups include all database objects: tables, indexes, stored procedures and triggers. These backups are commonly used to migrate databases between different SQL Server instances running on-premises or in the cloud. They can be used for data ingestion, disaster recovery, and so forth. The native backups also simplify the process of importing data and schemas from on-premises SQL Server instances, and will be easy for SQL Server DBAs to understand and use.

Support for Native Backup/Restore

You can now take native SQL Server database backups from your RDS instances and store them in an Amazon S3 bucket. Those backups can be restored to an on-premises copy of SQL Server or to another RDS-powered SQL Server instance. You can also copy backups of your on-premises databases to S3 and then restore them to an RDS SQL Server instance. SQL Server Native Backup/Restore with Amazon S3 also supports backup encryption using AWS Key Management Service (KMS) across all SQL Server editions. Storing and transferring backups in and out of AWS through S3 provides you with another option for disaster recovery.

You can enable this feature by adding the SQL_SERVER_BACKUP_RESTORE option to an option group and associating the option group with your RDS SQL Server instance. This option must also be configured with your S3 bucket information and can include a KMS key to encrypt the backups.

Start by finding the desired option group:

Then add the SQL_SERVER_BACKUP_RESTORE option, specify (or create) an IAM role to allow RDS to access S3, point to a bucket, and (if you want) specify and configure encryption:

After you have set this up, you can use SQL Server Management Studio to connect to the database instance and invoke the following stored procedures (available within the msdb database) as needed:

- rds_backup_database – Back up a single database to an S3 bucket.
- rds_restore_database – Restore a single database from S3.
- rds_task_status – Track running backup and restore tasks.
- rds_cancel_task – Cancel a running backup or restore task.

To learn more, take a look at Importing and Exporting SQL Server Data.

Now Available

SQL Server Native Backup/Restore is now available in the US East (Northern Virginia), US West (Oregon), Europe (Ireland), Europe (Frankfurt), Asia Pacific (Sydney), Asia Pacific (Tokyo), Asia Pacific (Singapore), Asia Pacific (Mumbai), and South America (Brazil) Regions.
There are no additional charges for using this feature with Amazon RDS for SQL Server; however, the usage of the Amazon S3 storage will be billed at regular rates.

Jeff;

### StackOverflow

#### How to define a constant in SML in a let binding way?

Is there a way to define a constant in SML in a let binding way? So basically what I'm asking is how to, for example, do constant x = 5 in the way below:

    let .... in ... end

### QuantOverflow

#### VIX ETP Net Vega exposure

Does anyone know the calculation that these EQD desks are using to calculate the net Vega exposure of the VIX ETPs? I am assuming it involves shares outstanding in each ETP and the value of a 30 day constant maturity future or the SPVXSTR Index, but everything I've tried has not yielded the same results. I've reviewed a few of the prospectuses for these ETPs but they just state that they track +1/-1x the performance of the SPVXSTR Index. Could anyone help out with a methodology that is being used here? Thank you.

see example attached:

### CompsciOverflow

#### Should all internal node keys in B+ tree also be in the leaves?

I was reading about B+ tree insertion. The algorithm takes the following form:

1. Insert the new node as the leaf node.
2. If the leaf node overflows, split the node and copy the middle element to the parent index node.
3. If the index node overflows, split that node and move the middle element to the parent index node.

However, adding the new index value in the parent node may cause it, in turn, to split. In fact, all the nodes on the path from a leaf to the root may split when a new value is added to a leaf node. If the root node splits, the tree grows by one level.

Now the book asks to insert 33 in the following tree of order 4:

I was wondering how those [10,20,30] came to be in the root node. Before performing the first split while forming the above tree, these [10,20,30] should have been in some leaf, and in any case they should be present in some leaf.
In other words I feel that all internal node keys should also be present in the leaves. However, that's not the case with [10,20,30]. This is also in line with the fact that in a B+ tree all data is present in the leaves, so all keys should be present in the leaves. Another example on youtube also has 13 and 30 in the root node but not in any leaf. Am I wrong in thinking that all internal node keys should also be in the leaves?

### Lobsters

#### A Peek into F# 4.1

### CompsciOverflow

#### Approximate Nearest Neighbour Problem in Spherical Setting

There has been significant literature on solving the (Approximate) Nearest Neighbour Problem in the spherical setting in $\mathbb{R}^n$ using Angular and Spherical LSH and other lattice sieving techniques. A proper definition of the problem is found in the image below.

(The problem definition is borrowed from Faster sieving for shortest lattice vectors using spherical locality-sensitive hashing by Laarhoven and Weger 2015. Here is the IACR page for the paper.)

(Refer to Sieving for shortest vectors in lattices using angular locality-sensitive hashing by Laarhoven 2015. The link is in the comments.)

I was curious if there is a way to have a similar spherical setting for the approximate NN problem for the finite field $\mathbb{Z}_2^n$. Particularly, I was wondering if there was a sphere definition relevant to $\mathbb{Z}_2^n$ that could be analogous or at least very similar to the one in Definition 4. The one in Definition 4 allows entire lattices to be embedded on the sphere, i.e. $P$ is a lattice. The proposed distance measure could either be the $l_2$ norm or the Hamming distance. It does not seem that it can be simply translated into finite fields.

I apologize if this is a naive question or does not make sense because I am a first time undergraduate researcher who is not very familiar with this forum and the level of questions asked here.

#### Msc Computer Network / Computer Security [on hold]

I am still a CS student studying Cisco Systems, but I don't want to be merely a user. I think I will apply for an MSc in Computer Networks leading to a PhD, because I want to innovate, not just use products. I want to know what we really study and research in an MSc in Computer Networks: for example, what do we learn and research in multi-core and distributed data processing, intelligent systems, and cloud computing? And regarding VoIP, is it a programme that universities might offer an MSc in? Many thanks in advance.

### Lobsters

#### Mercurial users: Why do you prefer mercurial over git?

I have read many good comments on lobsters about mercurial. Why do you prefer mercurial over git? What are the main features that make you use it instead of the more common git?

EDIT: How well does mercurial work with git servers? Does the hg-git bridge work properly in general? Do you use it at work? Most of my client-related work is done on git and I don’t want to screw things up.

#### Edward Snowden at the MIT Media Lab

Edward Snowden and Bunnie Huang present “Against the law: countering lawful abuses of digital surveillance”

### StackOverflow

#### SVR's Predict method in scikit-learn predicting only one number for the test set

I am new to the world of machine learning in Python and I currently have a lot of questions regarding the algorithms themselves and the code. I have developed a Python script that takes in time series data of a stock (timestamps and the Adj. Close). I pre-processed it by taking the log of the Adj. Close, normalizing it using the min-max approach, and splitting the timestamps and Adj. Close into train and test sets (90/10 split going forward). The data I am using covers 891 trading days. I then apply a grid search technique on the SVR to fit the train data of timestamps and Adj. Close.
svr = SVR(kernel='rbf', tol=0.001, C=1.0, shrinking=True, cache_size=200, verbose=False, max_iter=-1)  then the parameters to hunt for is gamma and epsilon. grs = GridSearchCV(svr, param_grid , n_jobs=1, iid=False, cv=StratifiedKFold(train_y, shuffle=False), verbose=0) fit_results = grs.fit(train_X, train_y) par_found = fit_results.best_params_ svr = SVR(kernel='rbf', gamma = par_found['gamma'], tol=0.001, C=1.0, epsilon = par_found['epsilon'], shrinking=True, cache_size=200, verbose=False, max_iter=-1) final_fit = svr.fit(train_X, train_y) pred = svr.predict(test_X)  but when I print pred, I get an array where all the elements are a single number with an array length equal to the length of the test_X. I feel like regression engine is over fitting the data but I am not sure. Also, what feature is best suited to use as X_data? I think that using timestamps is not a best way of doing things. Also, Is there any good literature on understanding the concept and math behind Support vector regression on financial time series? Is there any better regression technique than SVR for regressing time series? Thanks. Your help will be greatly appreciated. #### Machine learning: How to determine the pattern in text file? [on hold] I am interested to find a pattern in the input file. Are there any pattern finding libraries readily available. for Example: 1 4 5 2 4 6 3 4 7 3 2 4 8 2 5 1 2 5  For the above kind of text file, algorithm should be able to identify most occuring patterns such as *4* , *2* and **5  i.e., 1. numeric 4 occurs between 2 other numerics 2. numeric 2 occurs between 2 other numerics 3. 
numeric 5 occurs at the end of the input

### QuantOverflow

#### binary option gap option cash or nothing option [on hold]

I have a lot of trouble understanding binary options, especially the gap option. How can the payoff be negative? And can the premium also be negative? How do we choose the strike price and the cash amount, or the trigger price used in other documents? Thanks for reading.

### Lobsters

#### Coursera Programming Languages Course Part A

About this course: This course is an introduction to the basic concepts of programming languages, with a strong emphasis on functional programming. The course uses the languages ML, Racket, and Ruby as vehicles for teaching the concepts, but the real intent is to teach enough about how any language “fits together” to make you more effective programming in any language – and in learning new ones. This course is neither particularly theoretical nor just about programming specifics – it will give you a framework for understanding how to use language constructs effectively and how to design correct and elegant programs. By using different languages, you will learn to think more deeply than in terms of the particular syntax of one language. The emphasis on functional programming is essential for learning how to write robust, reusable, composable, and elegant programs. Indeed, many of the most important ideas in modern languages have their roots in functional programming. Get ready to learn a fresh and beautiful way to look at software and how to have fun building it.

### QuantOverflow

#### Pricing options using particle swarm optimization (PSO)

I am currently trying to recreate some of the work done to price various types of options using particle swarm optimization. In particular, I am trying to price European options using a method similar to the one that can be found here. The main issue I am struggling to understand is how the evaluation stage works.
As the particles in this particular PSO represent the stock price, how are the particles evaluated by a fitness function so that they can be compared in order to get the option price? From reading similar academic papers it appears that the payoff of the option is used as the fitness function, but I'm thinking that this does not have a global minimum and maximum for the PSO to optimize. For example, would a call option, with payoff $\max(S-K,0)$, just encourage the particles to converge towards a large value of $S$, resulting in a large/infinite option price? I would be so grateful if anyone can work out the answer. I've been trying to work this out for days with no luck!

### CompsciOverflow

#### Which control bit operation is performed in case of two control bit tie in ALU? [on hold]

Suppose we have two control input bits to the ALU: zx (zero the x input) and nx (negate the x input). When both of these bits are set, in which order is the x input manipulated, i.e. which control bit gets priority? The result obtained by applying zx first is different from the scenario where nx is applied first.

### StackOverflow

#### What is the meaning of Caffe - Blob Class - member variables?

In Caffe, as we can see in blob.hpp, there are 6 member variables in each blob object: data_, diff_, shape_data_, shape_, count_, capacity_. data_ contains the normal data that we pass along; diff_ is the gradient computed by the network. Since there are no comments in the source code and the official documentation is lacking, I wanted to know: what is the exact meaning of the others? Thanks,

#### Agent discovery in a multi-agent distributed system with p2p communication

Let's say I have a set of Agents in a distributed network without a centralized unit. I want them to communicate via P2P. So every Agent is a peer, right? The network should build itself, and when a new Agent wants to take part in this network or wants to leave, the whole thing should still run. Even if there are currently no Agents.
(That is a possibility in my case.) So how can an Agent discover another Agent who wants to take part in this network? I thought about this for quite a long time, and I came to the conclusion that a decentralized implementation is not really possible, but I wanted to ask the community.

### TheoryOverflow

#### Complexity of computing the parity of read-twice opposite CNF formula ($\oplus\text{Rtw-Opp-CNF}$)

In a read-twice opposite CNF formula each variable appears twice, once positive and once negative. I'm interested in the $\oplus\text{Rtw-Opp-CNF}$ problem, which consists in computing the parity of the number of satisfying assignments of a read-twice opposite CNF formula. I was unable to find any reference about the complexity of this problem. The closest I was able to find is that the counting version $\#\text{Rtw-Opp-CNF}$ is $\#\text{P}$-complete (see section 6.3 in this paper). Thanks in advance for your help.

Update 10th April 2016

• In this paper, the $\oplus\text{Rtw-Opp-SAT}$ problem is shown to be $\oplus\text{P}$-complete; however, the formula produced by the reduction from $3\text{SAT}$ is not in CNF, and as soon as you try to convert it back into CNF you get a read-thrice formula.
• The monotone version $\oplus\text{Rtw-Mon-CNF}$ is shown to be $\oplus\text{P}$-complete in this paper. In that paper, $\oplus\text{Rtw-Opp-CNF}$ is briefly mentioned at the end of section 4: Valiant says it is degenerate. It is not clear to me what being degenerate exactly means, nor what it implies in terms of hardness.

Update 12th April 2016

It would also be very interesting to know if anyone has ever studied the complexity of the $\Delta\text{Rtw-Opp-CNF}$ problem. Given a read-twice opposite CNF formula, this problem asks to compute the difference between the number of satisfying assignments having an odd number of variables set to true and the number of satisfying assignments having an even number of variables set to true. I've not found any literature about it.
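For small instances, the quantities defined above can at least be checked by brute force. The following is a minimal Python sketch, not an efficient algorithm (it enumerates all $2^n$ assignments). The clause encoding (lists of signed integers, in the DIMACS style) and the function names `is_read_twice_opposite` and `count_stats` are assumptions made for illustration only. It returns the number of satisfying assignments, its parity (the $\oplus$ value), and the odd-minus-even difference (the $\Delta$ value).

```python
from itertools import product


def is_read_twice_opposite(clauses):
    """True if every variable occurs exactly once positively and once negatively."""
    signs_by_var = {}
    for clause in clauses:
        for lit in clause:
            signs_by_var.setdefault(abs(lit), []).append(lit > 0)
    return all(sorted(signs) == [False, True] for signs in signs_by_var.values())


def count_stats(clauses):
    """Return (#satisfying assignments, parity, #odd-weight minus #even-weight)."""
    variables = sorted({abs(lit) for clause in clauses for lit in clause})
    odd = even = 0
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        satisfied = all(
            any(assignment[abs(lit)] == (lit > 0) for lit in clause)
            for clause in clauses
        )
        if satisfied:
            if sum(values) % 2 == 1:
                odd += 1
            else:
                even += 1
    total = odd + even
    return total, total % 2, odd - even


# (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
example = [[1, 2], [-1, 3], [-2, -3]]
assert is_read_twice_opposite(example)
print(count_stats(example))  # prints (2, 0, 0)
```

A checker like this says nothing about the complexity of the problems, but it can be handy for sanity-checking a proposed reduction or a closed-form conjecture on small formulas.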
Update 29th May 2016

As pointed out by Emil Jeřábek in his comment, it is not true that Valiant said that the problem $\oplus\text{Rtw-Opp-CNF}$ is degenerate. He only said that a more restricted version of this problem, $\oplus\text{Pl-Rtw-Opp-3CNF}$, is degenerate. In the meantime, I still do not know what degenerate exactly means, but at least now it seems clear that it is a synonym for lack of expressive power.

### AWS

#### Hot Startups on AWS – July 2016 – Depop, Nextdoor, Branch

Today I would like to introduce a very special guest blogger! My daughter Tina is a Recruiting Coordinator for the AWS team and is making her professional blogging debut with today’s post.

Jeff;

It’s officially summer and it’s hot! Check out this month’s hot AWS-powered startups:

• Depop – a social mobile marketplace for artists and friends to buy and sell products.
• Nextdoor – building stronger and safer neighborhoods through technology.
• Branch – provides free deep linking technology for mobile app developers to gain and retain users.

Depop (UK)

In 2011, Simon Beckerman and his brother, Daniel, set out to create a social, mobile marketplace that would make buying and selling from mobile a fun and interactive experience. The Depop founders recognized that the rise of m-commerce was changing the way that consumers wanted to communicate and interact with each other. Simon, who already ran PIG Magazine and the luxury eyewear brand RetroSuperFuture, wanted to create a space where artists and creatives like himself could share, buy and sell their possessions. After launching organically in Italy, Depop moved to Shoreditch, London in 2012 to establish its headquarters and has since grown considerably, with offices in London, New York, and Milan. With over 4 million users worldwide, Depop is growing and building a community of shop owners with a passion for fashion, music, art, vintage, and lifestyle pieces.
The familiar and user-friendly interface allows users to follow, like, comment, and private message with other users and shop owners. Simply download the app (Android or iOS) and you are suddenly connected to millions of unique items ready for purchase. It’s not just clothes either – you can find home décor, vintage furniture, jewelry, and more. Filtering by location allows you to personalize your feed and shop locally for even more convenience. Buyers can scroll through an endless stream of items ready for purchase and have the option to either pick up in-person or have their items shipped directly to them. Selling items is just as easy – upload a photo, write a short description, set a price, and then list your product. Depop chose AWS in order to move fast without needing a large operations team, following a DevOps approach. They use 12 distinct AWS services including Amazon S3 and Amazon CloudFront for image hosting, and Auto Scaling to deal with the unpredictable and fairly large changes in traffic throughout the day. Depop’s developers are able to support their own services in production without needing to call on a dedicated operations team. Check out Depop’s Blog to keep up with the artists using the app! Nextdoor (San Francisco) Based in San Francisco, Nextdoor has helped more than 100,000 neighborhoods across the United States bring their communities closer together. In 2010, the founders of this startup were surprised to learn from a Pew research study that the majority of American adults knew only some (29%) or none (28%) of their neighbors by name. Recognizing an opportunity to bring back a sense of community to neighborhoods across the country, the idea for Nextdoor was born. Neighbors are using Nextdoor to ask questions, get to know one another, and exchange local advice and recommendations. For example, neighbors are able to help one another to: • Find trustworthy babysitters, plumbers, and dentists in the area. 
• Organize neighborhood events, such as garage sales and block parties. • Get assistance to find lost pets and missing packages. • Sell or give away items, like an old kitchen table or bike. • Report neighborhood crime and share safety concerns. Nextdoor is also giving local agencies such as police and fire departments, and offices of emergency management the ability to connect with verified residents in their jurisdiction through a feature called Nextdoor for Public Agencies. This is incredibly beneficial for agencies to help residents with emergency preparedness, community engagement, crime prevention, and community policing. In his seminal work, Bowling Alone, Harvard Professor Robert Putnam learned that when social capital within a community is high, children do better in school, neighborhoods are safer, people prosper, the government is better, and people are happier and healthier overall. With a comprehensive list of helpful community guidelines, Nextdoor is creating stronger and safer neighborhoods with the power of technology. You can download the Nextdoor app for Android or iOS. AWS is the foundational infrastructure for both the online services in Nextdoor’s technology stack, and all of their offline data processing and analytics systems. Nextdoor uses over 25 different AWS services (Amazon EC2, Elastic Load Balancing, Amazon Cloudfront, Amazon S3, Amazon DynamoDB, Amazon Redshift, and Amazon Kinesis to name a few) to quickly prototype, develop, and deploy new features for community members. Supporting millions of users in the US, Nextdoor runs their services across four AWS Regions worldwide, and has also recently expanded to Europe. In their own words, “Amazon makes it easy for us to flexibly grow our technology footprint with predictable costs in an automated fashion.” Branch (Palo Alto) The idea for Branch came in May 2014 when a group of Stanford business school graduates began working together to build and launch their own mobile app. 
They soon realized how challenging it was to grow their app, and saw that many of their friends were running into the same difficulties. The graduates saw the potential to create a deep linking platform to help apps get discovered, retain users, and grow exponentially. Branch reached its first million users within several months after its inception, and a little over a year later had climbed to one billion users and 5,000 apps. Companies such as Pinterest, Instacart, Mint, and Redfin are partnering with Branch to improve their user experience worldwide. Over 11,000 apps use the platform today. As the number of smartphone users continues to increase, mobile apps are providing better user experiences, higher conversions, and better retention rates than the mobile web. The issue comes when mobile developers want to link users to the content they worked so hard to create – the transition between emails, ads, referrals, and more can often lead to broken experiences. Mobile deep links allow users to share content that is within an app. Normal web links don’t work unless apps are downloaded on a device, and even then there is no standard way to find and share content as it is specific to every app. Branch allows content within apps to be shared just as they would be on the web. For example, imagine you are shopping for a fresh pair of shoes on the mobile web. You are ready to check out, but are prompted to download the store’s app to complete your purchase. Now that you’ve downloaded the app, you are brought back to the store’s homepage and need to restart your search from the beginning. With a Branch deep link, you instead would be linked directly back to checkout once you’ve installed the app, saving time and creating an overall better user experience. Branch has grown exponentially over the past two years, and relies heavily on AWS to scale its infrastructure. 
Anticipating continued growth, Branch builds and maintains most of its infrastructure services with open source tools running on Amazon EC2 instances (Amazon API Gateway, Apache Kafka, Apache Zookeeper, Kubernetes, Redis, and Aerospike), and also use AWS services such as Elastic Load Balancing, Amazon CloudFront, Amazon Route 53, and Amazon RDS for PostgreSQL. These services allow Branch to maintain a 99.999% success rate on links with a latency of only 60 ms in the 99th percentile. To learn more about how they did this, read their recent blog post, Scaling to Billions of Requests a Day with AWS. ### StackOverflow #### Samples with no label assignment using multilabel random forest in scikit-learn I am using Scikit-Learn's RandomForestClassifier to predict multiple labels of documents. Each document has 50 features, no document has any missing features, and each document has at least one label associated with it. clf = RandomForestClassifier(n_estimators=20).fit(X_train,y_train) preds = clf.predict(X_test)  However, I have noticed that after prediction there are some samples that are assigned no labels, even though the samples were not missing label data. >>> y_test[0,:] array([1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]) >>> preds[0,:] array([ 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])  The results of predict_proba align with those of predict. 
>>> probas = clf.predict_proba(X_test) >>> for label in probas: >>> print (label[0][0], label[0][1]) (0.80000000000000004, 0.20000000000000001) (0.94999999999999996, 0.050000000000000003) (0.94999999999999996, 0.050000000000000003) (1.0, 0.0) (1.0, 0.0) (1.0, 0.0) (0.94999999999999996, 0.050000000000000003) (0.90000000000000002, 0.10000000000000001) (1.0, 0.0) (1.0, 0.0) (0.94999999999999996, 0.050000000000000003) (1.0, 0.0) (0.94999999999999996, 0.050000000000000003) (0.84999999999999998, 0.14999999999999999) (0.90000000000000002, 0.10000000000000001) (0.90000000000000002, 0.10000000000000001) (1.0, 0.0) (0.59999999999999998, 0.40000000000000002) (0.94999999999999996, 0.050000000000000003) (0.94999999999999996, 0.050000000000000003) (1.0, 0.0)  Each output above shows that for each label, a higher marginal probability has been assigned to the label not appearing. My understanding of decision trees was that at least one label has to be assigned to each sample when predicting, so this leaves me a bit confused. Is it expected behavior for a multilabel decision tree / random forest to be able to assign no labels to a sample? UPDATE 1 The features of each document are probabilities of belonging to a topic according to a topic model. 
>>>X_train.shape (99892L, 50L) >>>X_train[3,:] array([ 5.21079651e-01, 1.41085893e-06, 2.55158446e-03, 5.88421331e-04, 4.17571505e-06, 9.78104112e-03, 1.14105667e-03, 7.93964896e-04, 7.85177346e-03, 1.92635026e-03, 5.21080173e-07, 4.04680406e-04, 2.68261102e-04, 4.60332012e-04, 2.01803955e-03, 6.73533276e-03, 1.38491129e-03, 1.05682475e-02, 1.79368409e-02, 3.86488757e-03, 4.46729289e-04, 8.82488825e-05, 2.09428702e-03, 4.12810745e-02, 1.81651561e-03, 6.43641626e-03, 1.39687081e-03, 1.71262909e-03, 2.95181902e-04, 2.73045908e-03, 4.77474778e-02, 7.56948497e-03, 4.22549636e-03, 3.78891036e-03, 4.64685435e-03, 6.18710017e-03, 2.40424583e-02, 7.78131179e-03, 8.14288762e-03, 1.05162547e-02, 1.83166124e-02, 3.92332202e-03, 9.83870257e-03, 1.16684231e-02, 2.02723299e-02, 3.38977762e-03, 2.69966332e-02, 3.43221675e-02, 2.78571022e-02, 7.11067964e-02])  The label data was formatted using MultiLabelBinarizer and looks like: >>>y_train.shape (99892L, 21L) >>>y_train[3,:] array([0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

UPDATE 2

The output of predict_proba above suggested that assigning no classes might be an artifact of trees voting on labels (there are 20 trees and all probabilities are approximately multiples of 0.05). However, using a single decision tree, I still find there are some samples that are assigned no labels. The output looks similar to predict_proba above, in that for each sample there is a probability that a given label is or is not assigned to the sample. This seems to suggest that at some point the decision tree is turning the problem into binary classification, though the documentation says that the tree takes advantage of label correlations.

### Fefe

#### A brief announcement from Donald Trump: "Russia, if you're ...

A brief announcement from Donald Trump: "Russia, if you're listening, I hope you're able to find the 30,000 emails that are missing," Trump said. "I think you'll be rewarded mightily by our press!" Oh, really?
So that is the standard under President Trump? I think Julian Assange will remind him of that in due course!

### Lobsters

#### Lolcat Clone in x64 Assembly

### CompsciOverflow

#### Why are regular expressions defined with union, concatenation and star operations?

A regular expression is defined recursively as follows:

1. $a$ for some $a \in \Sigma$ is a regular expression,
2. $\varepsilon$ is a regular expression,
3. $\emptyset$ is a regular expression,
4. $(R_1 \cup R_2)$, where $R_1$ and $R_2$ are regular expressions, is a regular expression,
5. $(R_1 \circ R_2)$, where $R_1$ and $R_2$ are regular expressions, is a regular expression,
6. $(R_1)^*$, where $R_1$ is a regular expression, is a regular expression.

This definition is taken from page 64 of Sipser, Michael. Introduction to the Theory of Computation, 3rd edition. Cengage Learning, 2012.

Now, I have the following questions.

• Why does the definition not contain the intersection, complement or reverse operations?
• If we change the 4th item to $R_1 \cap R_2$, do we get an equivalent definition, i.e. is there a modified regular expression for each regular language, and vice versa?
• I know that this definition is complete and well-defined, but why is it preferred to other equivalent, well-defined and complete definitions?

### QuantOverflow

#### Annualized log return for Equity [on hold]

I came across an old question answered here. My question is theoretical. I'm not a mathematician/quant professional, so please excuse my lack of knowledge. I've read a few papers on forecasting equity prices (single stock), in my case with a 1 year horizon and a 99% confidence level, but most are technical and do not address the basics. 1 year future price = price today adjusted for annualized return. Theoretically, I understand that we are annualizing the daily return and taking the log. The annualized log term can be denoted as 'e' raised to annualized return.
But why would it be multiplied with the current stock price to get the future 1 year stock price, as opposed to adding it? Shouldn't it be price today (1 + log of annualized return)? But in the paper I referenced it was shown as future price = price today x e raised to the annualized return in log terms.

### StackOverflow

#### Saving, loading and predicting with Theano CNN (LeNet)

I'm looking for the right way to save, load and make a prediction on a single image file with a trained Theano CNN (LeNet) model. I already did it with the Theano LogisticRegression and MLP, and it works well. But I can't find out how to do it with the CNN. Actually, I'm not sure which parameters I should store when saving, since there are more layers.

### TheoryOverflow

#### Online/approximate weighted and capacitated bipartite matching

I wish to take a look at the online/approximate weighted and capacitated bipartite matching problem. Consider $G=\{L\cup R, E\}$, $|L|=n_1$, $|R|=n_2$, $|E|=m$ and $E\subseteq L\times R$. Each $r_i\in R$ has capacity $c_i$, which means that at most $c_i$ nodes from $L$ can be matched to $r_i$. The objective function to maximize is $\sum_{i=1}^{n_2}x_iw_i$, where $x_i$ is the number of nodes in $L$ matched to $r_i$ and $w_i>0$ is the weight. The constraints are (1) $x_i\in\{0,...,c_i\}$, (2) any node in $L$ can be matched at most once and (3) any node $l_j$ is allowed to be matched to $r_i$ if $(l_j, r_i)\in E$.

Is there any paper that solves the exact problem as I described it above (providing either an approximate or an online algorithm)? To be clear, I am asking for references; methods are not necessary.

### Lobsters

#### Xen exploitation part 2: XSA-148, from guest to host

### QuantOverflow

#### What is implied volatility?

I always understood implied volatility as the volatility I need to plug into BS in order to get the market price.
My question is: if I am using a different model, does it mean that implied volatility is the volatility I need to plug into the pricing equation of the new model in order to get the market price, or am I still referring to BS?

### Fefe

#### News from our Interior Minister: In addition, Federal Interior Minister ...

News from our Interior Minister:

In addition, Federal Interior Minister Thomas de Maizière concluded a secret agreement with the USA in May that is meant to form the basis for a more intensive transfer of information about Islamists. To this end, the project "Dada" has been set up in the State Security division of the Federal Criminal Police Office to handle the flow of messages. According to information from SPIEGEL ONLINE, the Americans have already transmitted thousands of records on Islamists.

Oh, really? The Interior Minister now has the authority to conclude secret agreements on behalf of the Federal Republic of Germany?! I certainly hope this is a misunderstanding!

On the other hand, I can report that Project Dada already seems to be up and running, or maybe the Twitter employee is just having a bad day today :)

#### You don't say! Do you remember the Munich gunman? ...

You don't say! Do you remember the Munich gunman? The German-Iranian?

### Lobsters

#### Clojure spec Screencast: Testing

### CompsciOverflow

#### Where is pivoting done in the Crout decomposition algorithm?
Consider the following code, found on wikipedia, that implements the Crout decomposition algorithm:  void crout(double const **A, double **L, double **U, int n) { int i, j, k; double sum = 0; for (i = 0; i < n; i++) { U[i][i] = 1; } for (j = 0; j < n; j++) { for (i = j; i < n; i++) { sum = 0; for (k = 0; k < j; k++) { sum = sum + L[i][k] * U[k][j]; } L[i][j] = A[i][j] - sum; } for (i = j; i < n; i++) { sum = 0; for(k = 0; k < j; k++) { sum = sum + L[j][k] * U[k][i]; } if (L[j][j] == 0) { printf("det(L) close to 0!\n Can't divide by 0...\n"); exit(EXIT_FAILURE); } U[j][i] = (A[j][i] - sum) / L[j][j]; } } }  Which part is responsible for pivoting? The thing is, I've read that partial pivoting is still a rather non-trivial operation (on the order$O(n^2)$), but the above algorithm seems similar in terms of complexity to the "naive" LU decomposition without any pivoting. To check, I've run the algorithm on my 28x28 test matrix [ 0 0.291601633 0 -0.062262937 0 0 0 0 0 -0.22949092 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ] [ 0.291601633 0 -0.062262937 0 0 0 0 0 -0.22949092 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ] [ 0 -0.062262937 0 0.633860174 0 -0.203470344 0 -0.185648901 0 -0.182968732 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ] [ -0.062262937 0 0.633860174 0 -0.203470344 0 -0.185648901 0 -0.182968732 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ] [ 0 0 0 -0.203470344 0 0.386208493 0 -0.18368887 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ] [ 0 0 -0.203470344 0 0.386208493 0 -0.18368887 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ] [ 0 0 0 -0.185648901 0 -0.18368887 0 1.163032501 0 -0.044175498 0 0 0 -0.20912 0 0 0 -0.55618 0 0 0 0 0 0 0 0 0 0 ] [ 0 0 -0.185648901 0 -0.18368887 0 1.163032501 0 -0.044175498 0 0 0 -0.20912 0 0 0 -0.55618 0 0 0 0 0 0 0 0 0 0 0 ] [ 0 -0.22949092 0 -0.182968732 0 0 0 -0.044175498 0 0.702145321 0 -0.25202 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ] [ -0.22949092 0 -0.182968732 0 0 0 -0.044175498 0 0.702145321 0 -0.25202 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ] [ 0 0 0 0 0 0 0 0 0 
-0.25202 0 0.883882188 0 0 0 0 0 0 0 0 0 -0.220414179 0 -0.28380561 0 -0.14610303 0 0 ] [ 0 0 0 0 0 0 0 0 -0.25202 0 0.883882188 0 0 0 0 0 0 0 0 0 -0.220414179 0 -0.28380561 0 -0.14610303 0 0 0 ] [ 0 0 0 0 0 0 0 -0.20912 0 0 0 0 0 0.49528 0 -0.17615 0 -0.11001 0 0 0 0 0 0 0 0 0 0 ] [ 0 0 0 0 0 0 -0.20912 0 0 0 0 0 0.49528 0 -0.17615 0 -0.11001 0 0 0 0 0 0 0 0 0 0 0 ] [ 0 0 0 0 0 0 0 0 0 0 0 0 0 -0.17615 0 0.17615 0 0 0 0 0 0 0 0 0 0 0 0 ] [ 0 0 0 0 0 0 0 0 0 0 0 0 -0.17615 0 0.17615 0 0 0 0 0 0 0 0 0 0 0 0 0 ] [ 0 0 0 0 0 0 0 -0.55618 0 0 0 0 0 -0.11001 0 0 0 1.033363204 0 -0.090289125 0 0 0 0 0 0 0 -0.298767964 ] [ 0 0 0 0 0 0 -0.55618 0 0 0 0 0 -0.11001 0 0 0 1.033363204 0 -0.090289125 0 0 0 0 0 0 0 -0.298767964 0 ] [ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -0.090289125 0 0.299090395 0 -0.208861407 0 0 0 0 0 0 ] [ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -0.090289125 0 0.299090395 0 -0.208861407 0 0 0 0 0 0 0 ] [ 0 0 0 0 0 0 0 0 0 0 0 -0.220414179 0 0 0 0 0 0 0 -0.208861407 0 0.429181968 0 0 0 0 0 0 ] [ 0 0 0 0 0 0 0 0 0 0 -0.220414179 0 0 0 0 0 0 0 -0.208861407 0 0.429181968 0 0 0 0 0 0 0 ] [ 0 0 0 0 0 0 0 0 0 0 0 -0.28380561 0 0 0 0 0 0 0 0 0 0 0 0.570852385 0 -0.29792224 0 0 ] [ 0 0 0 0 0 0 0 0 0 0 -0.28380561 0 0 0 0 0 0 0 0 0 0 0 0.570852385 0 -0.29792224 0 0 0 ] [ 0 0 0 0 0 0 0 0 0 0 0 -0.14610303 0 0 0 0 0 0 0 0 0 0 0 -0.29792224 0 0.818338896 0 -0.387730558 ] [ 0 0 0 0 0 0 0 0 0 0 -0.14610303 0 0 0 0 0 0 0 0 0 0 0 -0.29792224 0 0.818338896 0 -0.387730558 0 ] [ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -0.298767964 0 0 0 0 0 0 0 -0.387730558 0 0.68647389 ] [ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -0.298767964 0 0 0 0 0 0 0 -0.387730558 0 0.68647389 0 ]  which is invertible (you can check here), when it returned a number NaNs and Infinities. ### Lambda the Ultimate #### Fully Abstract Compilation via Universal Embedding Fully Abstract Compilation via Universal Embedding by Max S. New, William J. 
Bowman, and Amal Ahmed: A fully abstract compiler guarantees that two source components are observationally equivalent in the source language if and only if their translations are observationally equivalent in the target. Full abstraction implies the translation is secure: target-language attackers can make no more observations of a compiled component than a source-language attacker interacting with the original source component. Proving full abstraction for realistic compilers is challenging because realistic target languages contain features (such as control effects) unavailable in the source, while proofs of full abstraction require showing that every target context to which a compiled component may be linked can be back-translated to a behaviorally equivalent source context. We prove the first full abstraction result for a translation whose target language contains exceptions, but the source does not. Our translation—specifically, closure conversion of simply typed λ-calculus with recursive types—uses types at the target level to ensure that a compiled component is never linked with attackers that have more distinguishing power than source-level attackers. We present a new back-translation technique based on a deep embedding of the target language into the source language at a dynamic type. Then boundaries are inserted that mediate terms between the untyped embedding and the strongly-typed source. This technique allows back-translating non-terminating programs, target features that are untypeable in the source, and well-bracketed effects. Potentially a promising step forward to secure multilanguage runtimes. We've previously discussed security vulnerabilities caused by full abstraction failures here and here. The paper also provides a comprehensive review of associated literature, like various means of protection, back translations, embeddings, etc. 
### High Scalability

#### Economics May Drive Serverless

We've been following an increasing ephemerality curve to get more and more utilization out of our big brawny boxes. VMs, VMs in the cloud, containers, containers in the cloud, and now serverless, which looks to be our first native cloud infrastructure.

Serverless is said to be about functions, but you really need a zip file of code to do much of anything useful, which is basically a container. So serverless isn't so much about packaging as it is about not standing up your own chunky persistent services. Those services, like storage, like the database, etc, have moved to the environment. Your code orchestrates the dance and implements specific behaviours. Serverless is nothing if not a framework writ large.

Serverless also intensifies the developer friendly disintermediation of infrastructure that the cloud started. Upload your code and charge it on your credit card. All the developer has to worry about is their function. Oh, and linking everything together (events, DNS, credentials, backups, etc) through a Byzantine patch panel of a UI; uploading each of your zillions of "functions" on every change; managing versions so you can separate out test, development, and production. But hey, nothing is perfect.

What may drive serverless more than anything else is economics. From markonen:

In my book, the innovation in Lambda is, above everything else, about the billing model. My company moved the work of 40 dedicated servers onto Lambda and in doing so decimated our costs. Paying for 1500 cores (our current AWS limit) in 100ms increments has been a game changer. I'm sure there are upsides to adopting the same programming model with your own hardware or VMs, but the financial benefit of Lambda will not be there.

There are many more quotes like this, but that's the gist of it. And as pointed out by others, the pay off depends on some utilization threshold. If you can drive the utilization of your instances to some high level then running your own instances makes economic sense. For the rest of us taking advantage of the aggregation of a big cloud provider is a winner. Setting up a highly available service on the cloud, dealing with instances and all the other overhead is still a huge PITA. Why deal with all that if you don't have to?

Developers pick winners. Developers follow ease of use. Developers follow the money. So serverless is a winner. You'll just have to get over the name.

### CompsciOverflow

#### The meaning of "set" in NP-complete problem

Garey and Johnson describe in their book many NP-complete problems which are based on sets, for example Hitting Set, Minimum Test Set, Set Packing, Set Splitting, and many more.

The traditional mathematical definition of a set does not allow duplicates; when multiplicities count, the object is called a multiset.

Are the sets in the NP-complete problems described by Garey and Johnson allowed to contain duplicates?

### TheoryOverflow

#### Which papers state a mathematical formulation of a problem of building vehicle routes across an existing hub-and-spoke transportation network?

I'm developing a tool building (near-) optimal routes for an existing set of vehicles which serve a fixed hub-and-spoke network with two hubs. The goal is to minimise the total travel time of all vehicles. A mathematical formulation of the corresponding problem is required. Which scientific papers would you recommend to consider?

### StackOverflow

#### Tensorflow Grid LSTM RNN TypeError

I'm trying to build an LSTM RNN that handles 3D data in Tensorflow. From this paper, Grid LSTM RNNs can be n-dimensional. The idea for my network is to have a 3D volume [depth, x, y] and the network should be [depth, x, y, n_hidden] where n_hidden is the number of LSTM cell recursive calls. The idea is that each pixel gets its own "string" of LSTM recursive calls. The output should be [depth, x, y, n_classes].
I'm doing a binary segmentation -- think foreground and background, so the number of classes is just 2.

# Network Parameters
n_depth = 5
n_input_x = 200 # MNIST data input (img shape: 28*28)
n_input_y = 200
n_hidden = 128 # hidden layer num of features
n_classes = 2

# tf Graph input
x = tf.placeholder("float", [None, n_depth, n_input_x, n_input_y])
y = tf.placeholder("float", [None, n_depth, n_input_x, n_input_y, n_classes])

# Define weights
weights = {}
biases = {}

# Initialize weights
for i in xrange(n_depth * n_input_x * n_input_y):
    weights[i] = tf.Variable(tf.random_normal([n_hidden, n_classes]))
    biases[i] = tf.Variable(tf.random_normal([n_classes]))

def RNN(x, weights, biases):
    # Prepare data shape to match rnn function requirements
    # Current data input shape: (batch_size, n_input_y, n_input_x)
    # Permuting batch_size and n_input_y
    x = tf.reshape(x, [-1, n_input_y, n_depth * n_input_x])
    x = tf.transpose(x, [1, 0, 2])
    # Reshaping to (n_input_y*batch_size, n_input_x)
    x = tf.reshape(x, [-1, n_input_x * n_depth])
    # Split to get a list of 'n_input_y' tensors of shape (batch_size, n_hidden)
    # This input shape is required by rnn function
    x = tf.split(0, n_depth * n_input_x * n_input_y, x)
    # Define a lstm cell with tensorflow
    lstm_cell = grid_rnn_cell.GridRNNCell(n_hidden, input_dims=[n_depth, n_input_x, n_input_y])
    # lstm_cell = rnn_cell.MultiRNNCell([lstm_cell] * 12, state_is_tuple=True)
    # lstm_cell = rnn_cell.DropoutWrapper(lstm_cell, output_keep_prob=0.8)
    outputs, states = rnn.rnn(lstm_cell, x, dtype=tf.float32)
    # Linear activation, using rnn inner loop last output
    # pdb.set_trace()
    output = []
    for i in xrange(n_depth * n_input_x * n_input_y):
        #I'll need to do some sort of reshape here on outputs[i]
        output.append(tf.matmul(outputs[i], weights[i]) + biases[i])
    return output

pred = RNN(x, weights, biases)
pred = tf.transpose(tf.pack(pred),[1,0,2])
pred = tf.reshape(pred, [-1, n_depth, n_input_x, n_input_y, n_classes])
# pdb.set_trace()
temp_pred = tf.reshape(pred, [-1, n_classes])
n_input_y = tf.reshape(y, [-1, n_classes])
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(temp_pred, n_input_y))

Currently I'm getting the error:

TypeError: unsupported operand type(s) for +: 'int' and 'NoneType'

It occurs after the RNN initialization:

outputs, states = rnn.rnn(lstm_cell, x, dtype=tf.float32)

x of course is of type float32.

I am unable to tell what type GridRNNCell returns; any help here? This could be the issue. Should I be defining more arguments to this? input_dims makes sense, but what should output_dims be?

Is this a bug in the contrib code?

GridRNNCell is located in contrib/grid_rnn/python/ops/grid_rnn_cell.py

### QuantOverflow

#### rugarch: GARCH external regressors

I'm currently playing around with the great rugarch package in R. However, I tried to test the external regressor functionality. I implemented a GARCH(1,1) process and compared it with a GARCH(0,1) process where I added the lagged squared returns as external regressor. The results should be the same but aren't. Does any of you know where my mistake is? Thank you very much in advance for your help.

library(rugarch)
library(quantmod)
getSymbols('C', from = '2000-01-01')
C = adjustOHLC(C, use.Adjusted = TRUE)
R_d = ROC(Cl(C), na.pad = FALSE)
extReg = R_d[1:length(R_d)-1]^2
spec = ugarchspec(mean.model = list(armaOrder = c(0, 0),include.mean = FALSE), variance.model = list(model = 'sGARCH', garchOrder = c(1, 1)), distribution = 'norm')
spec2 = ugarchspec(mean.model = list(armaOrder = c(0, 0),include.mean = FALSE), variance.model = list(model = 'sGARCH', garchOrder = c(0, 1),external.regressors=extReg), distribution = 'norm')
fit = ugarchfit(data = R_d[2:length(R_d),1], spec = spec)
fit2 = ugarchfit(data = R_d[2:length(R_d),1], spec = spec2)

The coefficients of the fit model are:

omega: 2.1038530309075e-06
alpha1: 0.0863073049030114
beta1: 0.912692551076183

The coefficients of the fit2 model are:

omega: 8.17097079205033e-07
beta1: 0.999316873189476
vxreg1: 1.01005006640392e-08

### StackOverflow

#### Functional programming preventing side effects

Consider this code that receives an object; in JavaScript, objects are always passed by reference. The function processes the data, removes an element, and returns to the caller, but it also has a side effect on the original data passed in:

removeEntity = function(data) {
    for(var i=0; i<data.length; i++) {
        if(data[i].href == data.href) {
            data.splice(i, 1); //side effect
            break;
        }
    }
    return data;
};

Caller of the function

var body = {
    entity: entity,
    contentElements: data, //data will have side effect
    contentElement: contentElements[0] //entity to remove
};
collection.removeEntity(body); //data has one element less now, it does contradicts functional programming?

data will be affected after the function call. I'm trying to get the philosophy of functional programming right.

Collection.prototype.removeEntity = function(data) {
    data = JSON.parse(JSON.stringify(data)); //to prevent side effects
    for(var i=0; i<data.contentElements.length; i++) {
        if(data.contentElements[i].href == data.contentElement.href) {
            data.contentElements.splice(i, 1);
            break;
        }
    }
    return data;
};

Constructor

Is there a way to have the JSON serialization called every time for every member function in my class?

function Collection() {
    for (var p in this) {
        if (typeof this[p] === "function") {
            this[p] = (function(){
                return function(data){
                    data = JSON.parse(JSON.stringify(data));
                    return data;
                }
            }());
        }
    }
}

For guest271314

var body = {
    name: {first: 'john', last: 'doe'},
    name2: {first: 'jane', last: 'doe'}
};
var data = {};
for (var prop in body) {
    data[prop] = body[prop]
}
body.name.first = 'guest271314';
console.log(body);
console.log(data);

output

{ name: { first: 'guest271314', last: 'doe' },
  name2: { first: 'jane', last: 'doe' } }
{ name: { first: 'guest271314', last: 'doe' },
  name2: { first: 'jane', last: 'doe' } }

#### Tensorflow How to convert words(Strings) from a csv file to proper vectors

Hi, I'm trying to make a small classifier in Tensorflow. I want to read data from a csv file and use it for my training phase. The problem is that the content of my file looks something like this:

object,categorie
the blue balon,toy
a white plastic ship,toy
a big book,other
the wild cat,animal
a wet dolphin,animal
...

So I want to read the sentences and then convert them to vectors for use in a Tensorflow model. All the information I have read was about numerical data, and I have no idea how to use data like this.

The tutorials from the official site use numeric data. The best option so far has been to use a dictionary, but I think there should be a better option. Another option is to write my own method, but it could be imprecise.

Does anyone have any ideas how I can do that? An alternative to my method, or how can I process words in Tensorflow?

Sorry if my English is not good.

EDIT: I tried converting the sentences into multidimensional arrays, but the results were not good. I suspect the poor results arise because some sentences are short and others long, which changes the amount of free space left in each array, and this free space affects the results of the probabilistic model. Any recommendations?

### QuantOverflow

#### Comparison of quality across different fundamentals data sources?

There are a variety of different mechanisms and rules used by each fundamentals data provider to standardize and report company fundamentals. For example, the transformation of reported statements to quarterly statements.

Is there a study comparing the quality and tradeoffs of the techniques employed by the various fundamentals data providers?

#### Intraday stock prices API

I am looking for an API to request intraday data for the London stock exchange. I have seen products like eSignal but this seems to include a lot more than the simple data as XML or JSON and is fairly expensive.

The idea is to request data and analyse it in an application that I have written, so all I need is a real time source. Is there anything available like this?

#### How to build a cross currency swap pricer?

We're looking to build a pricer to convert a funding spread in a given currency over a specific funding basis e.g. 20 bps EUR 3m€ and convert it to a funding spread to a different currency with a different funding basis say USD 6m$L.

We're in the process of sourcing market swap data including discount factors for EONIA, FedFund and LIBOR for different tenors.

Looking for someone to help us with this, could even turn into a paid project, basically I'm totally lost! Thanks!

### StackOverflow

#### What is the output of the ml.evaluation.BinaryClassificationEvaluator?

I want to compare and evaluate the performance/accuracy of three different types of models using the Spark ML library. All models are binary classifiers.

My code snippet:

from pyspark.ml.evaluation import BinaryClassificationEvaluator

evaluator = BinaryClassificationEvaluator(
labelCol="indexLabel", rawPredictionCol="features")

result_glm = evaluator.evaluate(prediction_glm)
result_gbm = evaluator.evaluate(prediction_gbm)
result_rf  = evaluator.evaluate(prediction_rf)

print ("GLM: %g \nGBM: %g \nRF: %g \n" %
(result_glm, result_gbm, result_rf))

GLM: 0.396855
GBM: 0.396855
RF: 0.396855


What does the output mean? Is it the accuracy, the MSE, or something else, and how can I interpret these results? According to the documentation the evaluate function returns a metric, but what kind of metric?
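For context, `BinaryClassificationEvaluator` reports the area under the ROC curve (`areaUnderROC`) unless `metricName` is set to something else. The sketch below is not Spark code; it is a plain-Python illustration of what that number measures, namely the fraction of (positive, negative) pairs in which the positive example gets the higher score. The function name `area_under_roc` and the sample labels and scores are made up for illustration.

```python
def area_under_roc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical labels and model scores, for illustration only.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
auc = area_under_roc(labels, scores)
```

Under this reading, a value near 0.5 means the scores rank examples no better than chance. Three different models producing the identical value may be a hint that the evaluator is not looking at the models' predictions at all, so the column passed as `rawPredictionCol` may be worth checking.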

#### How big should batch size and number of epochs be when fitting a model in Keras?

I am training on 970 samples and validating on 243 samples.

How big should batch size and number of epochs be when fitting a model in Keras to optimize the val_acc? Is there any sort of rule of thumb to use based on data input size?

#### Function signature of Tap (K-combinator)

I've read in a book that the function signature of the tap function (also called the K-combinator) is as follows:

tap :: (a -> *) -> a -> a


"This function takes an input object a and a function that performs some action on a. It runs the given function with the supplied object and then returns the object."

1. Can someone help me understand the meaning of the star (*) in the function signature?
2. Are the implementations below correct?
3. If all three implementations are correct, which one should be used when? Any examples?

### Implementation 1:

const tap = fn => a => { fn(a); return a; };

tap((it) => console.log(it))(10); //10


### Implementation 2:

const tap = a => fn => { fn(a); return a; };

tap(10)((it) => console.log(it)); //10


### Implementation 3:

const tap = (a, fn) => {fn(a); return a; };

tap(10, (it) => console.log(it)); //10


#### R as network simulator

I need to simulate a cellular network with static (base stations) and moving (users) nodes. Since I need to perform some statistics and machine learning techniques, I would like to use R. Do you know any package where I can build such network?

### CompsciOverflow

I would really like to know what you guys are personally looking into these days.

Any latest technological development that you are pretty excited about?

### StackOverflow

#### How to access intermediate layers' outputs using nngraph?

I need to apply a loss function to an intermediate layer (L2) representation in a network which has many layers after the L2 layer. I know how to get access to the output of a network in nngraph as follows:

input = nn.Identity()()
net = nn.Sequential()
output = net(input)

gmod = nn.gModule({input}, {output})


However, I don't know how I can access the result of the second layer and apply a loss function (criterion) and do backprop on it in a neat way. Can anyone give me some help with this?

### TheoryOverflow

#### The number of edges in the ith shortest path in a directed graph

$G$ - directed graph, $n$ - count of nodes

According to Eppstein's Algorithm in this paper, the ith shortest path in a digraph may have $\Omega(ni)$ edges.

Can anybody explain how this estimate is derived?

### StackOverflow

#### Randomly splitting training and testing data

I have around 3000 objects where each object has a count associated with it. I want to randomly divide these objects in training and testing data with a 70% training and 30% testing split. But, I want to divide them based on the count associated with each object but not based on the number of objects.

An example, assuming my dataset contains 5 objects.

Obj 1 => 200
Obj 2 => 30
Obj 3 => 40
Obj 4 => 20
Obj 5 => 110


If I split them with a nearly 70%-30% ratio, my training set should be

Obj 2 => 30
Obj 3 => 40
Obj 4 => 20
Obj 5 => 110


and my testing set would be

Obj 1 => 200

If I split them again, I should get a different training and testing set nearing the 70-30 split ratio. I understand the above split does not give me pure 70-30 split but as long as it nears it, it's acceptable.

Are there any predefined methods/packages to do this in Python?
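One way this could be sketched by hand: shuffle the objects, then walk through them and put an object into the training set only if doing so moves the running count closer to 70% of the total count. This is a greedy heuristic under that assumption, not a predefined library routine; the function name `split_by_count` and its parameters are hypothetical.

```python
import random


def split_by_count(counts, train_fraction=0.7, seed=None):
    """Randomly split the keys of `counts` so that the training side holds
    roughly `train_fraction` of the total count (not of the number of objects).
    """
    rng = random.Random(seed)
    names = list(counts)
    rng.shuffle(names)  # a different seed gives a different split
    target = train_fraction * sum(counts.values())
    train, test, running = [], [], 0
    for name in names:
        # Take the object into training only if that brings the running
        # total at least as close to the target as leaving it out.
        if abs(running + counts[name] - target) <= abs(running - target):
            train.append(name)
            running += counts[name]
        else:
            test.append(name)
    return train, test


# The five objects from the example above.
counts = {"Obj 1": 200, "Obj 2": 30, "Obj 3": 40, "Obj 4": 20, "Obj 5": 110}
train, test = split_by_count(counts, seed=0)
train_share = sum(counts[n] for n in train) / float(sum(counts.values()))
```

With only a handful of objects and very uneven counts the achieved share can land well away from 70%; with around 3000 objects the greedy rule should usually get much closer.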

### Fefe

#### Does anyone here use Lastpass? Use a password manager, ...

Does anyone here use Lastpass?

Use a password manager, they said! Then it will be secure, they said! :-)

### StackOverflow

#### Generate an optimized value that leads to a positive prediction machine learning

I'm working on a machine learning algorithm and my problem is that given certain values of some attributes, I want to generate an optimized one for a specific variable that leads to a positive prediction.

Example : I have 3 attributes (x1 , x2 , x3) and y = 0 or 1 and a trained model with a given ML algorithm

x1 = cte1 , x2 = cte2 , x3 = x that leads to a y =0

I want to find the optimal x3 that results in a y=1, x3 has to be minimal. I can code a simple algorithm using iterations or some dichotomy method using the min - max from the training set but I want to optimize this with a simple function using R or Python.
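The dichotomy idea mentioned above can be sketched in a few lines of Python. The sketch assumes that, with x1 and x2 held fixed, the prediction switches from 0 to 1 exactly once as x3 increases; if the model is not monotone in x3, bisection may miss the true minimum. `minimal_x3` is a hypothetical helper, and `toy_model` is a made-up stand-in for the trained model's predict function.

```python
def minimal_x3(predict, x1, x2, lo, hi, tol=1e-6):
    """Smallest x3 in [lo, hi] with predict(x1, x2, x3) == 1.

    Assumes the prediction flips from 0 to 1 once as x3 grows.
    Returns None if even x3 == hi is predicted as 0.
    """
    if predict(x1, x2, hi) != 1:
        return None
    if predict(x1, x2, lo) == 1:
        return lo
    # Invariant: predict(lo) == 0 and predict(hi) == 1.
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if predict(x1, x2, mid) == 1:
            hi = mid
        else:
            lo = mid
    return hi


def toy_model(x1, x2, x3):
    """Made-up classifier used only to exercise the search."""
    return 1 if x1 + 2 * x2 + 3 * x3 >= 10 else 0


# lo and hi would come from the min and max of x3 in the training set.
best = minimal_x3(toy_model, x1=1.0, x2=1.5, lo=0.0, hi=10.0)
```

For this toy model the decision boundary sits at x3 = 2, so `best` ends up within the tolerance of that value.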

#### Using Scorer Object for Classifier Score Method

I have written my custom scorer object which is necessary for my problem and which I've called "p_value_scoring_object".

For the function sklearn.cross_validation.cross_val_score one of the parameters is "scoring", which allows this scorer object to be used.

However, this option is not available for the score method of a classifier. Is sklearn just lacking that feature, or is there a way around it?

from sklearn.datasets import load_iris
from sklearn.cross_validation import cross_val_score
from sklearn.tree import DecisionTreeClassifier
iris = load_iris()
clf = DecisionTreeClassifier(random_state=0)
cross_val_score(clf, iris.data, iris.target, cv=10,scoring=p_value_scoring_object)


This works. However, this doesn't:

clf.fit(iris.data,iris.target)
clf.score(iris.data,iris.target,scoring=p_value_scoring_object)


#### OpenBSD 6.0 pre-orders up

Pre-orders for the 6.0 CD sets have just been activated.

In addition, one of the six release songs has been released early.
There will be another compilation CD titled "The songs 5.2 - 6.0" alongside the release.

Head on over to the OpenBSD Store to pick up your CD set, poster, or both!

This release has some of the coolest artwork yet.

### StackOverflow

#### ZeroMQ vs Oracle queuing

I'm a junior backend developer, and I'm currently working on a banking project, which is a distributed system. What I knew before was that there are messaging libraries such as ZeroMQ for communication between the components of a distributed system. But in this project, they use Oracle queuing.

My colleague told me that this is better because there is no risk of losing a message, even if processes die accidentally.

My questions:
Q1: If Oracle queuing is better, when should we use things like ZeroMQ?
and
Q2: What are the disadvantages of Oracle queuing compared with ZeroMQ?

#### How to implement a sequence classification LSTM network in CNTK?

I'm working on implementation of LSTM Neural Network for sequence classification. I want to design a network with the following parameters:

1. Input : a sequence of n one-hot-vectors.
2. Network topology : two-layer LSTM network.
3. Output: a probability that a sequence given belong to a class (binary-classification). I want to take into account only last output from second LSTM layer.

I need to implement that in CNTK but I struggle because its documentation is not written really well. Can someone help me with that?

### QuantOverflow

#### Price Barrier Options on Baskets using Quantlib

Is it possible to price barrier options on a basket of stocks using Quantlib, e.g. a Worst-of Down-and-in-Put on a basket of 3 stocks?

I already checked the MCBarrierEngine (does not support multiple stocks) and the MCEuropeanBasketEngine (does not support barrier options), but without any luck.

#### Calculation of option Greek (sensitivity) theta via finite difference

I am able to get good approximations for delta, gamma, and rho via the finite difference method, but not theta. I believe my issue is the value of h. Theta is basically the difference between the price of the option one time step in the future and the price today, divided by the size of the time step, i.e.

theta (approx) = (V(d_v+1) - V(d_v))/(1/365), where V(d_v+1) is the value of the option one time step (1/365) into the future

If I apply this to, for example, the call option quote on 04/18/2013 for ticker A (Agilent, I believe), strike of 40, underlying price of 41.83, expiry of 05/18/2013 (30/365 days to maturity), 1.1% Dividend Yield, 0.3% risk-free rate, I get a theta of -8.9, whereas the actual theta is approximated by a large options data reporting firm as approx -2.2. My other Greek approximations are close enough, but I cannot get a good approximation for theta. Anybody have insight into this issue? Thanks in advance for your help!
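A small sanity check can help separate a wrong step size from a units problem. The sketch below prices a European call with the Black-Scholes formula with a continuous dividend yield, takes the one-day forward difference (one day later means the time to maturity shrinks by 1/365), and compares it with the closed-form theta. The implied volatility is not given in the question, so `sigma = 0.25` is purely an assumption; the question does not say whether the option is European or which model the data vendor uses, so the numbers are illustrative only.

```python
from math import erf, exp, log, pi, sqrt


def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def norm_pdf(x):
    return exp(-0.5 * x * x) / sqrt(2.0 * pi)


def d1_d2(S, K, T, r, q, sigma):
    d1 = (log(S / K) + (r - q + 0.5 * sigma ** 2) * T) / (sigma * sqrt(T))
    return d1, d1 - sigma * sqrt(T)


def bs_call(S, K, T, r, q, sigma):
    """Black-Scholes European call with continuous dividend yield q."""
    d1, d2 = d1_d2(S, K, T, r, q, sigma)
    return S * exp(-q * T) * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)


def bs_call_theta(S, K, T, r, q, sigma):
    """Closed-form theta per year: value change as calendar time passes."""
    d1, d2 = d1_d2(S, K, T, r, q, sigma)
    return (-S * exp(-q * T) * norm_pdf(d1) * sigma / (2.0 * sqrt(T))
            - r * K * exp(-r * T) * norm_cdf(d2)
            + q * S * exp(-q * T) * norm_cdf(d1))


# Inputs from the question; sigma is an ASSUMED volatility.
S, K, r, q = 41.83, 40.0, 0.003, 0.011
T = 30.0 / 365.0
sigma = 0.25
h = 1.0 / 365.0

# One day into the future the option has T - h left to maturity.
theta_fd = (bs_call(S, K, T - h, r, q, sigma) - bs_call(S, K, T, r, q, sigma)) / h
theta_exact = bs_call_theta(S, K, T, r, q, sigma)
theta_fd_per_day = theta_fd / 365.0  # many data vendors quote theta per day
```

Both `theta_fd` and `theta_exact` are per-year figures; dividing by 365 gives a per-day theta. A mismatch of this kind between the two conventions, or a different volatility input, is one possible source of a discrepancy with a vendor's number.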

### StackOverflow

#### In MATLAB PRTools, how do I set continuous labels?

I have a dataset with labels and datapoints. The problem is that rather than a classification problem, I want to get a linear estimator, for example:

dataset=prdataset([2,4,6,8]',[1,2,3,4]')
testset=prdataset([3,5,7,9]')
classifier=dataset*ldc %should probably be changed?
result=testset*classifier


result.data now becomes

ans =

1.0e-307 *

0.2225    0.2225    0.2225    0.2225
0.2225    0.2225    0.2225    0.2225
0.2225    0.2225    0.2225    0.2225
0.2225    0.2225    0.2225    0.2225


which is very wrong.

Ideally it would be [1.5,2.5,3.5,4.5]' or something close to it. Any idea how to do this in PRTools or in something similar? This is a linear dependency, but I would also like to be able to play around with other types of dependencies.
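Independently of PRTools, the expected numbers can be checked with ordinary least squares on a single feature. The sketch below is in Python rather than MATLAB and says nothing about how PRTools mappings should be called; `fit_line` is a hypothetical helper written for this illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / float(n)
    mean_y = sum(ys) / float(n)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Training data and test points from the example above.
slope, intercept = fit_line([2, 4, 6, 8], [1, 2, 3, 4])
predictions = [slope * x + intercept for x in [3, 5, 7, 9]]
```

For this data the fit is y = 0.5 x, so the predictions for 3, 5, 7, 9 come out as 1.5, 2.5, 3.5, 4.5, matching the values the question hopes for.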

Also, it would be a huge bonus if the system were somewhat clever about NaN values, which heavily pollute my real dataset.

I have already found the linearr class, but when I use it I get weirdly sized datasets in return:

dataset=prdataset([2,4,6,8]',[1,2,3,4]')
testset=prdataset([3,5,7,9]')
classifier=dataset*linearr%should probably be changed?
result=testset*classifier


gives me the values

    0.1000   -0.3000   -0.7000   -1.1000
-0.5000   -0.5000   -0.5000   -0.5000
-1.1000   -0.7000   -0.3000    0.1000
-1.7000   -0.9000   -0.1000    0.7000


which is again incorrect.

In chat they suggested using .* instead of *, which resulted in:

Error using *
Inner matrix dimensions must agree.

Error in linearr (line 42)
beta = prinv(X'*X)*X'*gettargets(x);

Error in prmap (line 139)
[d, varargout{:}] = feval(mapp,a,pars{:});

Error in  *

v1 = a*v1;     % train first mapping

Error in prmap (line 139)
[d, varargout{:}] = feval(mapp,a,pars{:});

Error in  *


In the linearr code.

### QuantOverflow

#### Models crumbling down due to negative (nominal) interest rates

Given that a lot of sovereign bonds with maturity under 10 years are trading in negative (nominal) interest rate territory (recently the short-term EURIBOR has also dropped below zero), what are the most striking implications for the models in the financial economics/quant finance field?

By that I mean: which of the so-called "stylized facts" and standard models of modern finance are becoming highly controversial or just plain useless? A couple of examples which spring to mind are the following (they do not necessarily have to do with sovereign bond yields, but with the concept of negative (nominal) interest rates as such):

• The CIR interest rates model completely breaks down due to the square root term
• The proof that an American call option written on a non-dividend paying underlying will not be exercised before the maturity is false
• Markowitz selection obviously encounters difficulties incorporating negative yields

What are the other consequences, on let us say, CAPM, APT, M&M or any other model in finance? Which long held beliefs are hurt the most by negative yields?

#### QuantLib FittedBondDiscountCurve fitResults [Error]

I am trying to use FittedBondDiscountCurve with NelsonSiegelFitting, but I get an error when I call the fitResults() method:

14415     def fitResults(self) -> "FittingMethod const &":
14416         return _QuantLib.FittedBondDiscountCurve_fitResults(self)
14417     __swig_destroy__ = _QuantLib.delete_FittedBondDiscountCurve
14418     __del__ = lambda self: None

RuntimeError: unable to bracket root in 100 function evaluations (last
bracket attempt: f[-2.29538e+025,5.968e+025] -> [-1.#IND,10200.1])


What does this error mean? What could be the solution? Moreover, could you tell me which minimization method is used in "NelsonSiegelFitting"? How can I change it? Or does it use several methods by default? I've uploaded the full code with the input file to my github.

Import QuantLib

from datetime import datetime, date, time
import QuantLib as ql

class Bond(object):

    def __init__(self, issuer, bond_name, price, tenor, face_amount):
        self.dates = []
        self.cashflows = []
        self.issuer = issuer
        self.bond_name = bond_name
        self.price = ql.QuoteHandle(ql.SimpleQuote(price))
        self.tenor = tenor
        self.face_amount = face_amount

day, month, year = map(int, date.split('.'))
self.dates.append(ql.Date(day, month, year))

self.cashflows.append(float(cashflow))


Import Data

face_amount = 1000.0 # for all bonds face_amount = 1000.0
tenor = ql.Period(6, ql.Months) # for all bonds tenor = 6m

bonds = {}

with open('bonds.txt') as f:

    for line in f:
        s = line.rstrip().split(';')
        bond_name = s[1]
        if bond_name not in bonds:
            issuer = s[0]
            price = float(s[4])
            bonds[bond_name] = Bond(issuer, bond_name, price, tenor, face_amount)



Set QuantLib Param

evaluationDate = ql.Date(1, 6, 2016)
ql.Settings.instance().evaluationDate = evaluationDate
calendar = ql.TARGET()
day_counter = ql.Thirty360()
bondSettlementDays = 0
curveSettlementDays = 0


Create QuantLib objects

instruments = []
instruments_names = []

for bond in bonds.keys():

    schedule = ql.Schedule(bonds[bond].dates[0] - bonds[bond].tenor,
                           bonds[bond].dates[-1],
                           bonds[bond].tenor,
                           calendar,
                           accrualConvention,
                           accrualConvention,
                           ql.DateGeneration.Forward,
                           False)

    helperA = ql.FixedRateBondHelper(bonds[bond].price,
                                     bondSettlementDays,
                                     bonds[bond].face_amount,
                                     schedule,
                                     bonds[bond].cashflows,
                                     day_counter,
                                     bussiness_convention)

    instruments.append(helperA)
    instruments_names.append(bond)


QuantLib Optimization

tolerance = 1.0e-5
iterations = 50000
nelsonSiegel = ql.NelsonSiegelFitting()
term_structure = ql.FittedBondDiscountCurve(curveSettlementDays,
                                            calendar,
                                            instruments,
                                            day_counter,
                                            nelsonSiegel,
                                            tolerance,
                                            iterations)
a = term_structure.fitResults()


### StackOverflow

#### "All input arrays and target arrays must have the same number of samples." - Training on a single image to check if the model works in Keras

def obcandidate(inputvgg,outputmodel):
    graph = Graph()
    graph.add_input(name = 'input1', input_shape = (512, 14, 14))
    graph.add_node(Convolution2D(512, 1, 1), name = 'conv11', input = 'input1')
    graph.add_node(Convolution2D(512, 14, 14), name = 'conv112', input = 'conv11')
    graph.add_node(Flatten(), name = 'flatten11', input = 'conv112')
    graph.add_node(Dense(3136), name = 'dense1', input = 'flatten11')
    graph.add_node((Activation('relu')), name = 'relu', input = 'dense1')
    graph.add_node(Reshape((56,56)), name = 'reshape', input = 'relu')
    sgd = SGD(lr = 0.001, decay = .00005, momentum = 0.9, nesterov = True)
    graph.add_output(name = 'output1', input = 'reshape')
    graph.compile(optimizer = sgd, loss = {'output1': 'binary_crossentropy'})
    print 'compile success'
    history = graph.fit({'input1':inputvgg, 'output1':outputmodel}, nb_epoch=1)
    predictions = graph.predict({'input1':inputvgg})
    return graph

""

"main function"

""

if __name__ == "__main__":
    model = VGG_16('vgg16_weights.h5')
    sgdvgg = SGD(lr = 0.1, decay = 1e-6, momentum = 0.9, nesterov = True)
    model.compile(optimizer = sgdvgg, loss = 'categorical_crossentropy')
    finaloutputmodel = outputofconvlayer(model)
    finaloutputmodel.compile(optimizer = sgdvgg, loss = 'categorical_crossentropy')
    mean_pixel = [103.939, 116.779, 123.68]
    img = img.astype(np.float32, copy = False)
    for c in range(3):
        img[: , : , c] = img[: , : , c] - mean_pixel[c]
    img = img.transpose((2, 0, 1))
    img = np.expand_dims(img, axis = 0)
    imgout[imgout!=0]=1
    out=imgout
    inputvgg = np.asarray(finaloutputmodel.predict(img))
    obcandidate(inputvgg,out)


Hi, above is my code, where I am trying to segment object candidates with a graph model.

I want to check whether the code works for a single input, so I am giving it one input image and the corresponding output image.

But Keras gives me an error: "All input arrays and target arrays must have the same number of samples."

Can anyone tell me what to do to see if my model runs? I am training on one input so that I can verify that my model is correct before I start the real training. Is there any other way to do it?

#### How to use ckpt data model into tensorflow iOS example?

I am quite new to machine learning, and I am working on an iOS app for object detection using Tensorflow. I have been using the sample data model that is provided by the Tensorflow example in the form of a .pb (graph.pb) file, which works just fine for object detection.

But my backend team has given me model2_BN.ckpt as the data model file. I have tried to research how to use this file and I have no clue. Is it possible to use the ckpt file on the client side as the data model? If yes, how can I use it in the iOS Tensorflow example as the data model?

### Fred Wilson

#### The Tortoise And The Hare

One of my favorite childhood stories is Aesop’s The Tortoise And The Hare.

I just love the idea that slow and steady ultimately wins the race.

Mobile games have these explosive take up rates but don’t last forever.

Contrast that with something like Minecraft which emerged slowly but seems to chug along getting more and more popular each year.

And, outside of the games sector, I can’t really think of any super popular technology product (app or device) that blasted off and sustained itself over a decade or more.

When I ran this question by my brother in law last night, he mentioned the iPhone and the iPad, but both of those were relatively slow builds, certainly compared to these mobile game launches.

We could not think of a huge product, in tech or outside of tech, that blasted off and was a sustainably popular product for a decade or more.

Can you?

### infra-talk

#### Dropwizard Deep Dive – Part 2: Authorization

Welcome back! This is Part 2 of a three-part series on extending Dropwizard to have custom authentication, authorization, and multitenancy. In Part 1, we set up custom authentication. When we left off, we had just used the Java annotations @RolesAllowed and @PermitAll to authenticate our resource methods, so they will only run for credentialed users. In this part, we’ll cover Dropwizard authorization. We are going to extend the code we added to check the role assigned to a user and further restrict our methods based on whether that matches.

We can turn role-checking on by enabling another dynamic feature within Jersey. In order for it to work, we just need to set up a SecurityContext object that can tell if a given role applies and set that security context on each incoming request. Most of the code and techniques here are actually a core part of JAX-RS and can be used entirely outside of Dropwizard. All of the example code I’m going to show in my series lives in this repo if you want to follow along.

## Enabling Role-Checking

To make Jersey check role annotations before each request, we need to enable the RolesAllowedDynamicFeature, which is a core part of Jersey, not Dropwizard. We can enable it in our app like so:

environment.jersey().register(RolesAllowedDynamicFeature.class);


If you just activate that, you’ll notice that you can no longer use any of the endpoints annotated with @RolesAllowed (though those with @PermitAll still work). These endpoints will return a 403, because they have no way to validate their set of roles against the logged-in user. Fixing that is our next step.

## Custom Security Context

A SecurityContext is a core JAX-RS object that is attached to a request context for the purposes of validating security. We are going to modify our auth filter to attach a security context to each authenticated request. The security context we attach will have logic to check the roles in the @RolesAllowed annotation against the authenticated user (to which we will also add a role field).

First off, we need to create a custom implementation of the SecurityContext interface that can check our roles:

public class CustomSecurityContext implements SecurityContext {
    private final CustomAuthUser principal;
    private final SecurityContext securityContext;

    public CustomSecurityContext(CustomAuthUser principal, SecurityContext securityContext) {
        this.principal = principal;
        this.securityContext = securityContext;
    }

    @Override
    public Principal getUserPrincipal() {
        return principal;
    }

    @Override
    public boolean isUserInRole(String role) {
        return role.equals(principal.getRole().name());
    }

    @Override
    public boolean isSecure() {
        return securityContext.isSecure();
    }

    @Override
    public String getAuthenticationScheme() {
        return "CUSTOM_TOKEN";
    }
}


The most important part of this is the isUserInRole method which drives our Dropwizard authorization code. It will be called once for each role we define in our @RolesAllowed annotation, and if it returns “true” for any of them, we are authorized to use the method.

Now we need to update our auth filter method to attach a security context whenever we authenticate a user. We don’t have to do anything when there is no authenticated user, because the default security context has no user attached and will fail authorization checks. We also need to make sure to set the @Priority of our auth filter to Priorities.AUTHENTICATION so this code will run before any other filters that depend on authentication.

@PreMatching
@Priority(Priorities.AUTHENTICATION)
public class CustomAuthFilter extends AuthFilter<CustomCredentials, CustomAuthUser> {
    private CustomAuthenticator authenticator;

    public CustomAuthFilter(CustomAuthenticator authenticator) {
        this.authenticator = authenticator;
    }

    @Override
    public void filter(ContainerRequestContext requestContext) throws IOException {
        Optional<CustomAuthUser> authenticatedUser;
        try {
            CustomCredentials credentials = getCredentials(requestContext);
            authenticatedUser = authenticator.authenticate(credentials);
        } catch (AuthenticationException e) {
            throw new WebApplicationException("Unable to validate credentials", Response.Status.UNAUTHORIZED);
        }
        if (authenticatedUser.isPresent()) {
            SecurityContext securityContext = new CustomSecurityContext(authenticatedUser.get(), requestContext.getSecurityContext());
            requestContext.setSecurityContext(securityContext);
        } else {
            throw new WebApplicationException("Credentials not valid", Response.Status.UNAUTHORIZED);
        }
    }
    ...
}


## Dropwizard Authorization Complete

With the RolesAllowedDynamicFeature enabled and our security context attached to authenticated requests, we now have role-based authorization on every incoming request. If you’ve been following Parts 1 and 2 of this post, you’ll see that we have both Dropwizard authentication and Dropwizard authorization for our API. This is probably enough for many apps, but in Part 3, I’ll show you how you can also add multitenancy to a Dropwizard application using a similar annotation-based approach.

You can see the code for just what we’ve done for Parts 1 and 2 here and the complete code for all three parts here.

The post Dropwizard Deep Dive – Part 2: Authorization appeared first on Atomic Spin.

### QuantOverflow

#### Stochastic Calculus Rescale Exercise

I have the following system of SDE's

$dA_t = \kappa_A(\bar{A}-A_t)dt + \sigma_A \sqrt{B_t}dW^A_t \\ dB_t = \kappa_B(\bar{B} - B_t)dt + \sigma_B \sqrt{B_t}dW^B_t$

If $\sigma_B > \sigma_A$ I would consider the volatility $B_t$ to be more volatile than $A_t$ because

$d\langle A_\bullet\rangle_t = \sigma_A^2 B_t dt$ and $d\langle B_\bullet\rangle_t = \sigma_B^2 B_t dt$

Now, if I rescale the process $B$ by $\sigma_A^2$ and define $\sigma_A^2B =\tilde{B}$, I get an equivalent system of SDE's

$dA_t = \kappa_A(\bar{A}-A_t)dt + \sqrt{\tilde{B}_t}dW^A_t \\ d\tilde{B}_t = \kappa_B(\sigma_A^2\bar{B} - \tilde{B}_t)dt + \sigma_A\sigma_B \sqrt{\tilde{B}_t}dW^B_t$
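For reference, the rescaled dynamics can be checked by multiplying the equation for $B_t$ by the constant $\sigma_A^2$. This is only a sanity check of the substitution, using the symbols already defined and assuming $\sigma_A > 0$ so that $\sqrt{\tilde{B}_t} = \sigma_A\sqrt{B_t}$:

$$d\tilde{B}_t = \sigma_A^2\,dB_t = \kappa_B(\sigma_A^2\bar{B} - \sigma_A^2 B_t)\,dt + \sigma_A^2\sigma_B\sqrt{B_t}\,dW^B_t = \kappa_B(\sigma_A^2\bar{B} - \tilde{B}_t)\,dt + \sigma_A\sigma_B\sqrt{\tilde{B}_t}\,dW^B_t$$

Likewise, the diffusion term of $A$ becomes $\sigma_A\sqrt{B_t}\,dW^A_t = \sqrt{\tilde{B}_t}\,dW^A_t$, so the substitution itself is consistent.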

But now the claim "If $\sigma_B > \sigma_A$ I would consider the volatility $\tilde{B}_t$ to be more volatile than $A_t$" does not hold anymore. Consider $1>\sigma_B>\sigma_A$ and

$d\langle A_\bullet\rangle_t = \tilde{B}_t dt$ and $d\langle \tilde{B}_\bullet\rangle_t = \sigma_A^2\sigma_B^2 \tilde{B}_t dt$.

In this case the volatility $\tilde{B}$ of $A$ is more volatile than $A$ only if $\sigma_A^2\sigma_B^2>1$, which is completely different from the condition above ($\sigma_B > \sigma_A$).

What went wrong? Is there some error in the rescaling?

### StackOverflow

#### Picking a training set from the larger application set

I'm trying to perform sentiment analysis on a dataset. But there is no existing corpus that my classifier can be trained on that is similar to the dataset that I want to analyze. My question is as follows: Can I use a randomly sampled subset of this data for the training/validation phases and then use the trained classifier to perform analysis on the larger dataset? I plan to introduce some variability by adding data points to the training set that are similar to the application dataset but not from that set. Is this a valid approach?

### CompsciOverflow

#### Shortest cycle for each vertex in a directed, weighted graph

Given a directed graph with positive edge weights, find the shortest cycle through each vertex.

Since the weights are positive, a modified Dijkstra's algorithm can be used. The algorithm finds the shortest paths from a source to all other vertices.

Question: how can it be extended to find shortest paths not just from one source, but from each vertex separately (after one source is processed, the next vertex becomes the source)?

If the algorithm traverses from the source to all other vertices in each run, how do I get the shortest cycle for each vertex?
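One possible way to approach this, offered as a minimal sketch rather than a definitive answer: run Dijkstra once from every vertex `s`, then close the cycle with any edge `(u, s)` that leads back to `s`, taking the minimum of `dist[u] + w(u, s)`. The graph representation (a dict of adjacency lists) and the function names below are assumptions made for illustration.

```python
import heapq
from math import inf


def dijkstra(graph, source):
    """Shortest distances from source; graph maps u -> list of (v, weight)."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, inf):
            continue  # stale queue entry, a shorter path was already found
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def shortest_cycles(graph):
    """For each vertex s, the length of the shortest cycle through s (inf if none)."""
    vertices = set(graph)
    for edges in graph.values():
        vertices.update(v for v, _ in edges)
    result = {}
    for s in vertices:
        dist = dijkstra(graph, s)
        best = inf
        # Close the cycle with every edge (u, s) whose tail u is reachable from s.
        for u, d in dist.items():
            for v, w in graph.get(u, []):
                if v == s:
                    best = min(best, d + w)
        result[s] = best
    return result


example = {
    "a": [("b", 1)],
    "b": [("c", 2), ("a", 5)],
    "c": [("a", 3)],
    "d": [("a", 1)],  # d has no incoming edge, so it lies on no cycle
}
cycles = shortest_cycles(example)
```

With a binary heap this costs one Dijkstra run per vertex, roughly O(V · E log V) overall.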

### Lobsters

#### A Practical Guide to (Correctly) Troubleshooting with Traceroute (2009)

Worth it just for this (slide 4):

The default starting port in UNIX traceroute is 33434. This comes from 32768 (2^15, or the max value of a signed 16-bit integer) + 666 (the mark of Satan).

### TheoryOverflow

#### Could you explain to me the reduction? [on hold]

I am looking at the following solved exercise:

I haven't really understood the part of the reduction where we construct, for each number $a_i$, a package with measurements $(\frac{4}{A}a_i, 5,3)$. Why do we consider these measurements?

### Fefe

#### Today is one of those days when the news ...

Today is one of those days when the news satirizes itself.

It started with Michelle Obama giving a speech for Hillary at the Democratic convention, in which she made the point, among others, that the White House was built by slaves.

To which everyone went: oh, was it? We had no idea! The fact checkers all ran around in circles and found out: yes, it was.

And then along comes Bill O'Reilly, one of the most repulsive far-right talk-show blockheads on US television (Fox News, of course), and states for the record:

O'Reilly: Slaves Who Built White House Were "Well-Fed And Had Decent Lodgings Provided By The Government"
But we always fed them enough!!1!

#### Earlier today at a client's, a brilliant business idea came to us. ...

Earlier today at a client's, a brilliant business idea came to us. I am publishing it here so that nobody can patent it.

Honeypots for catching Pokemon!

### QuantOverflow

#### Cross Currency Swap pricing

I have seen two methods for calculating the value of a xccy swap -

1) Convert the future foreign payments to the base currency using forward FX rates, net with the base currency payments and discount using the risk-free rate for the base currency.

2) Discount the foreign payments using the foreign risk free curves and convert to the base currency using the spot rate. Discount the base currency payments with the base/foreign basis curve and net with the foreign payments.

It seems to me that if I calculate the forward fx prices using a simple interest rate differential, then the basis curve should match the base risk free curve. Am I correct in this view and if so, how does one calculate forward fx rates to yield a result equivalent to the basis curve method?

### CompsciOverflow

#### simulation of fluid deformation in COMSOL [on hold]

How to simulate laser induced fluid surface deformation using COMSOL MultiPhysics software?

### StackOverflow

#### sklearn Perceptron learning

I'm trying to understand how Perceptron from sklearn.linear_model performs fit() function (Documentation). Question comes from this piece of code:

clf = Perceptron()

accuracy: 0.7


I thought the goal of fitting was to create a classification function that gives answers with 100% accuracy on test data, but in the example above it gives only 70%. I have tried one more data set, where the accuracy was 60%.

What do I misunderstand in fitting process?

#### Type Mismatch in Scala Code

Here is the function for balancing parentheses in Scala. I am getting

Error:(36, 10) type mismatch;
found   : Unit
required: Int
a=a+1

var a = 0

def balance(chars: List[Char]): Boolean = {
if(chars.isEmpty)
return Nil
{
a=a-1
a=a+1
}
if (a == -1)
return false
if ((a == 1 || a == 0) && chars.tail.isEmpty!= 0)
balance(chars.tail)
if (a == 0 && chars.tail.isEmpty)
return true
}


Can anyone tell me why this error is coming?

### QuantOverflow

#### EM for conditional Gaussian model

Let $$X_1\sim N(\mu_{X_1},\sigma_{X_1}^2)$$ $$X_2\sim N(\mu_{X_2}, \sigma_{X_2}^2)$$ where $\mu_{X_2}=c+aX_1$. Also, I have data $D$ (with missing values on $X_1,X_2$).

How can I update/estimate the parameters $\mu_{X_1},\sigma_{X_1},\mu_{X_2},a,c,\sigma_{X_2}$ using EM? i.e. what is the formula for updating $\sigma_{X_2}$?

My model is a conditional Gaussian, which is a conditional form of a bivariate Gaussian $(X_1,X_2)$ with mean vector $(\mu_1,\mu_2)^\top$ and covariance matrix $$\left( \begin{matrix} \Sigma _{11} & \Sigma _{12} \\ \Sigma _{21} & \Sigma _{22} \\ \end{matrix} \right)$$

Here is a reference to convert bivariate Gaussian to conditional Gaussian: $$\mu_{2|1}=\mu_{2}+\Sigma_{21}\Sigma_{11}^{-1}(X_1-\mu_1)\quad,\quad \Sigma_{22|1}=\Sigma_{22}-\Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}$$ which yields my model.

It seems that when $X_1$ has different observations and $X_2$ is unobserved, the variance of $X_2$ remains unchanged. So how do I update $\sigma_{X_2}$? Do I need to estimate the covariance matrix?

Initial setting for the model $$X_1\sim N(5,7)\quad,\quad X_2\sim N(0.5X_1,8)$$

Data:

$$X_1:\operatorname{9\,\,4\,\,NA}$$

$$\quad X_2:\operatorname{NA\,\,NA\,\,3}$$

### CompsciOverflow

#### LL(k) vs Strong LL(K)

What is the difference between the definitions of LL(k) and strong LL(k) grammars?

LL(k):
For every pair of production rules A→α and A→β the following condition holds. FIRSTk(αy) ∩ FIRSTk(βy) = ∅ for all wAy / S =>* wAy

Strong LL(k):
For every pair of production rules A→α and A→β the following condition holds. FIRSTk( α FOLLOWk (A)) ∩ FIRSTk( β FOLLOWk (A)) = ∅

Isn't 'y' equal to FOLLOWk(A)?

### QuantOverflow

#### Where can I find CMS swap trading prices?

I am writing a paper about CMS swap. To do so, I'd like to compare different theoretical pricing methods of these instruments to the "real prices" i.e. prices used in the marketplace.

But I don't know where I could find such data. I have access to Bloomberg, but I did not find CMS swap there. Maybe I just don't know the right Bloomberg function... Otherwise I also have access to the following databases:

• Bloomberg (as previously said)
• Datastream
• Thomson One Banker
• ResearchMonitor
• Factiva
• IMF e-library
• SDC Platinum

### StackOverflow

#### Any ideas for creating a phone-buying recommender system? (analyzing online phone news or forums)

I'm going to create a phone recommender for my school project.

The software would generate a recommendation score for the cell phone that the user is considering buying, based on one or more phone websites. Users would then have a reference score to help them decide whether to buy the phone.

Input: the user selects the phone model (e.g. iPhone 6S).

Output: the system generates a recommendation score for the iPhone 6S, such as 8.9/10. A score of 8.9 would be high, suggesting that the user buy the phone. In contrast, 3.5 would be low, suggesting that the user not buy it.

The programming language should be Java.

The whole program interface should use JFrame, JPanel, ...

Here is my working idea:

1) Let the user input the phone model, like iPhone 6S, and store it in a String variable. If the input string can't be found, the program shows "No result", e.g. for Samsung S2, a model whose information is not in my .txt file.

2) Web crawling

- Get people's comments about the cell phone (user reviews or phone news? I don't know which one is easier or more suitable for the later data analysis).

For example, for phone news data, crawl and store the latest article titles using Jsoup (or is another Java library more suitable?). If a title contains that string, store the article content in temp.txt using the I/O library.

user reviews website examples :

http://www.gsmarena.com/apple_iphone_6s-reviews-7242p3.php
http://www.cnet.com/products/apple-iphone-6/user-reviews/
http://www.phonearena.com/phones/Apple-iPhone-6_id8346/reviews


phone news website examples:

http://www.techradar.com/news/phone-and-communications/mobile-phones


3) Training the clusters (should I do this before setting up the system, or train while users are running the system?)

Find ten good and ten bad passages of phone news as samples. Use LDA to select 10 good and 10 bad keywords as sample keywords.

Should the LDA code be embedded in the main Java code, or should I run an external LDA program before setting up the system?

Are there any Java LDA libraries you would suggest?

4) Use those 20 good and bad keywords to identify sentiment.

For example, suppose 1000 article titles mention the iPhone 6S. The content of those 1000 articles is selected and stored in a .txt file. Then I use the 20 good and bad keywords to determine whether each article is positive or negative about the iPhone 6S.

Using if statements?

5) Output a recommendation score for the iPhone 6S using a simple formula designed by me (maybe using + - * /), e.g. positive articles / all articles * 100%.
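Steps 4 and 5 can be illustrated with a small sketch. It is written in Python purely for brevity (the same logic ports directly to Java), and the function names, the keyword sets and the 0-10 scaling are all assumptions made for the example, not part of the proposal itself.

```python
def is_positive(article, good_words, bad_words):
    """Step 4: call an article positive if it has more good than bad keyword hits."""
    words = article.lower().split()
    good_hits = sum(words.count(w) for w in good_words)
    bad_hits = sum(words.count(w) for w in bad_words)
    return good_hits > bad_hits


def recommendation_score(articles, good_words, bad_words):
    """Step 5: share of positive articles, scaled to a 0-10 score."""
    if not articles:
        return None  # corresponds to the "No result" case
    positive = sum(is_positive(a, good_words, bad_words) for a in articles)
    return round(10.0 * positive / len(articles), 1)


# Hypothetical keywords and article texts, standing in for the LDA output
# and the crawled content.
good = {"great", "good"}
bad = {"terrible", "slow", "buggy"}
articles = [
    "great battery and great camera",
    "terrible screen",
    "good value but slow and buggy",
]
score = recommendation_score(articles, good, bad)
```

Plain keyword counting ignores negation ("not good") and ties count as negative here, so this is a baseline to compare against rather than a finished classifier.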

I'm a rookie at this kind of machine-learning work, so this is just my idea and my proposal. Perhaps there are better or more correct ways to implement this system. My choice of libraries (Jsoup, ...), algorithms or analysis methods (LDA) may also be wrong because I have misunderstood them. You are welcome to correct my initial workflow or to suggest a completely new idea.

My idea is similar to:

Tools for getting intent from Twitter statuses?

Web page recommender system

### StackOverflow

I was looking at the training data available in sklearn here. As per the documentation, it contains 20 classes of documents, based on some newsgroup collection. It does a fairly good job of classifying documents belonging to those categories. However, I need to add more articles for categories like cricket, football, nuclear physics, etc.

I have set of documents for each class ready, like sports -> cricket, cooking -> French, etc.. How do I add those documents and classes in sklearn so that the interface which now returns 20 classes will return those 20 plus the new ones as well? If there is some training that I need to do, either through SVM or Naive Bayes, where do I do it before adding it to the dataset?

### QuantOverflow

#### Determining discount factors for non-standard maturities

Let's say we'd like to find a par rate for a 1 month forward starting 20-year interest rate swap. In this case, we'd need to discount cash flows for the payment periods shifted +1 month from standard semiannual or quarterly payments (which we can find by bootstrapping from frequent 1st year values and yearly rates). And annual rates should be available well past the final date.

Is there some standard approach to find these discount factors? I assume that since the par rate is something which defines swaptions' fixing values (its strike), there should be some agreed procedure and not just interpolation.

### CompsciOverflow

#### Is there any error detection decoder that can correct 2-bit consecutive errors? [on hold]

If yes, then please name them. Under what channel conditions do two-bit errors occur? Can 2-bit consecutive errors be corrected by convolutional encoders and decoders?

### TheoryOverflow

#### Is sparse embedding of a NP-complete problem in a polynomial problem NP-complete?

Consider the following problem P: Input is a finite graph G. If the number of vertices in G is 2^2^i for some integer i, then output a minimum vertex cover of G; otherwise output empty set. Can I say that the problem P is NP-hard?

### CompsciOverflow

#### Registering 3D-NIR image to thermal image and vice versa

In the past I have thought a bit about how to register a NIR-image and a thermal image and noticed that this is not trivial - one statement was that if I had the depth information for each pixel, the task would be much easier.

Now consider having a 3D-NIR camera (e.g. asus xtion) and a thermal camera and I want to map the thermal information to the depth map (the 3d cloud) or vice versa and the NIR information to the thermal information (and vice versa).

I thought I could do it simply like this:

1. Conduct a stereo camera calibration for the two cameras (-> R1,R2,T)
2. Use R1, R2 and T to transform the 3D points from 3D-NIR to the thermal camera's coordinate system
3. Use the camera matrix of the thermal camera to project the 3D points to its image plane
4. Now I know where the 3D points "fall" in the thermal image, which gives me a depth value (and intensity value, because the depth map and NIR-image are aligned) for each pixel in the thermal image.

Which to me sounds similar to the procedure described in this answer by D.W.. However, I can't find any article describing this method. Always there seems to be some form of feature matching step, e.g. here.
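The four steps above can be sketched for a plain pinhole model as follows. This is only an illustration under stated assumptions: a single rotation `R` and translation `t` from the 3D-NIR frame to the thermal frame (however they are derived from the stereo calibration), thermal intrinsics `fx, fy, cx, cy`, and no lens distortion. All names are hypothetical.

```python
def project_points(points, R, t, fx, fy, cx, cy):
    """Map 3D points from the depth camera frame into thermal image pixels.

    points: iterable of (x, y, z) in the depth camera frame
    R, t:   3x3 rotation (nested lists) and translation (length 3) taking
            depth-camera coordinates to thermal-camera coordinates
    Returns one (u, v, depth) tuple per point, or None when the point lies
    behind the thermal camera.
    """
    pixels = []
    for p in points:
        # Step 2: rigid transform into the thermal camera's coordinate system.
        x, y, z = (sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))
        if z <= 0:
            pixels.append(None)
            continue
        # Step 3: pinhole projection with the thermal camera matrix.
        pixels.append((fx * x / z + cx, fy * y / z + cy, z))
    return pixels


identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
projected = project_points(
    [(0.0, 0.0, 1.0), (0.1, 0.2, 2.0), (0.0, 0.0, -1.0)],
    identity, [0.0, 0.0, 0.0], fx=100.0, fy=100.0, cx=50.0, cy=40.0,
)
```

When several 3D points land on the same thermal pixel, keeping only the one with the smallest depth (a z-buffer) is one way to handle part of the visibility problem, though not all of it.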

Also, thinking about this, I believe the above solution cannot really work. Consider the case where the two cameras look at an object from different sides. The 3D camera will calculate the depth for a point p1 in the world. When projecting p1 to the thermal camera's image plane it will fall in pixel x. However, since the thermal camera looked at another side of the object, another point in the world p2 actually formed pixel x when the thermal camera took the image, and that's what the measured temperature in x is for (i.e., p1 and p2 roughly lie on a line viewed from the thermal camera).

So, all in all:

1. Is my statement that the outlined solution can't always work correct?
2. Under which circumstances does the outlined solution work?
3. What do I need to keep in mind when building the setup?

Thanks for reading this long post!

### StackOverflow

#### Spark Naive Bayes ML OutofMemory Error - Prediction

I'm trying to build a Machine Learning program with Spark 1.6

I have started the Spark shell with the following settings:

spark-shell --driver-class-path sqljdbc_6.0/enu/sqljdbc42.jar --driver-memory 25G --executor-memory 30G --num-executors 180 --conf spark.driver.maxResultSize=0 --conf spark.ui.port=4042 --conf spark.default.parallelism=100 --conf spark.sql.shuffle.partitions=1000

My code works until I try to predict/use the model. After executing this code:

scala> val predictionAndLabel = test.map(p => (model.predict(p.features), p.label))


I get this error message:

/usr/bin/spark-shell: line 41: 33686 Killed
"$FWDIR"/bin/spark-submit --class org.apache.spark.repl.Main --name "Spark shell" "$@"

I hope somebody can help me because I don't have any idea how I could make this code run smoothly!

Here is the Link to the complete full track of the error. https://app.box.com/s/w247yaoaiuogqot2zr76qjbwr9rzeb7b

#### ValueError: Found array with dim 3. Estimator expected <= 2. With own dataset

I am trying to generate my own training data for a recognition problem. I have two folders, s0 and s1, inside a containing folder named data. images and lables are the two lists; lables contains the names of the folders.

|—- data
|    |—- s0
|    |    |—- 1.pgm
|    |    |—- 2.pgm
|    |    |—- 3.pgm
|    |    |—- 4.pgm
|    |    |—- ...
|    |—- s1
|    |    |—- 1.pgm
|    |    |—- 2.pgm
|    |    |—- 3.pgm
|    |    |—- 4.pgm
|    |    |—- ...


Below is the code, it's showing me an error on line classifier.fit(images, lables)

 Traceback (most recent call last):
File "mint.py", line 34, in <module>
classifier.fit(images, lables)
File "/usr/local/lib/python2.7/dist-packages/sklearn/svm/base.py",  line 150, in fit
X = check_array(X, accept_sparse='csr', dtype=np.float64, order='C')
File "/usr/local/lib/python2.7/dist-packages/sklearn/utils/validation.py", line 396, in check_array
% (array.ndim, estimator_name))


ValueError: Found array with dim 3. Estimator expected <= 2. here

code:

import os,sys
import cv2
import numpy as np
from sklearn.svm import SVC
fn_dir ='/home/aquib/Desktop/Natural/data'

#Create a list of images and a list of corresponding names
(images, lables, names, id) = ([], [], {}, 0)
for (subdirs, dirs, files) in os.walk(fn_dir):
    for subdir in dirs:
        names[id] = subdir
        mypath = os.path.join(fn_dir, subdir)
        for item in os.listdir(mypath):
            if '.png' in item:
                label=id
                r_image = np.resize(image,(30,30))
                if image is not None:
                    images.append(r_image)
                    lables.append(int(label))
        id += 1
#Create a Numpy array from the two lists above
(images, lables) = [np.array(lis) for lis in [images, lables]]
classifier = SVC(verbose=0, kernel='poly', degree=3)
classifier.fit(images, lables)

I really don't understand how to convert it to 2 dimensions. I have tried the code below, but the error is the same:
images = np.array(images)
im_sq = np.squeeze(images).shape
images = images.reshape(images.shape[:2])


#### Data Science for Quiz Application

Hi, I am exploring and learning Data Science by creating a Quiz Application.

I have tried the following:

• Add new questions to a subcategory (classification)
• Find similar questions (NLP)

Now I would like to build an automatic question selection module:

• Instead of showing questions randomly, increase the difficulty of the questions if the user is answering them easily, or decrease it if the user is not able to answer

What ML/Data Science algorithms and solutions are suitable for this?

Also, what other data science possibilities can I explore and use to make the app more intelligent? Suggestions please.

### StackOverflow

#### Suggestions for NLP TF-IDF implementations [on hold]

I have a problem that involves predicting the genre of an App from its text description. I only have the dataset with the already calculated tf-idf values for each word. Each row is an App with 13026 possible words (features) for each app description.

I've been trying to optimize the prediction without any success so far. I am just a beginner in Machine Learning and would like some advice on what I can try to improve my prediction. I am working with a sparse matrix that contains mostly 0's, since a description has only a few words compared to all the 13026 available.

What I've tried so far:

• PCA, TruncatedSVD and SVD for dimensionality reduction before running my algorithms
• Trying to normalize and/or center the data before applying algorithms
• Tried the following algorithms: Gaussian Naive Bayes, Multinomial Naive Bayes, SVM, Logistic Regression, KNN, Random Forest and Decision Trees

The best accuracy I've got so far is 60%, can't go past that, I know that I might be missing something really simple for this scenario but I can't find out what it is. I would really appreciate any help :)

#### black box script execution?

I have a client that would like to examine results of a script I have written. I don't want the client to see the inner workings of the script or I lose my value to them but I want them to be able to run it as many times as they want and observe the results.

I am not sure if there is a general solution to this or specific to a language. If the latter applies, I have scripts in Python and R.

Thanks

#### Way to compute the value of the loss function on data for an SGDClassifier?

I'm using an SGDClassifier in combination with the partial fit method to train with lots of data. I'd like to monitor when I've achieved an acceptable level of convergence, which means I'd like to know the loss every n iterations on some data (possibly training, possibly held-out, maybe both).

I know this information is available if I pass verbose=1 in the constructor of the classifier, but I'd like to query it programmatically rather than visually. I also know I can use the score method to get accuracy, but I'd like actual loss as measured by my chosen loss function.

Does anyone know how to do this?
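One way to get at this programmatically, sketched here for the binary case only, is to compute the data loss yourself from the classifier's raw scores. The helper names below are hypothetical. The scores are assumed to come from `clf.decision_function(X)`, and the labels are assumed to be mapped to -1/+1 with `clf.classes_[1]` as +1. The regularization penalty that SGD also optimizes is not included.

```python
import math


def mean_hinge_loss(scores, labels):
    """Average hinge loss max(0, 1 - y * f(x)), labels in {-1, +1}."""
    return sum(max(0.0, 1.0 - y * s) for s, y in zip(scores, labels)) / len(labels)


def mean_log_loss(scores, labels):
    """Average logistic loss log(1 + exp(-y * f(x))), labels in {-1, +1}.

    Written as max(0, -z) + log1p(exp(-|z|)) so large |z| cannot overflow.
    """
    total = 0.0
    for s, y in zip(scores, labels):
        z = y * s
        total += max(0.0, -z) + math.log1p(math.exp(-abs(z)))
    return total / len(labels)


# Hypothetical decision values and labels for three held-out samples.
scores = [2.0, 0.5, -1.0]
labels = [1, -1, -1]
hinge = mean_hinge_loss(scores, labels)
logistic = mean_log_loss(scores, labels)
```

Calling one of these on a fixed held-out batch after every n calls to `partial_fit` gives a loss curve that can be thresholded for convergence.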

### CompsciOverflow

#### Basic maths on python [on hold]

I'm getting an invalid syntax error on the l of the diagonal =

I can't figure out the issue.

polygon = int(input("How many sides are on your polygon?")

diagonal = 2 ** (polygon)

print ("There are", diagonals, "for your polygon")

### QuantOverflow

#### How to calculate the JdK RS-Ratio

Anyone have a clue how to calculate the JdK RS-Ratio?

Let's say I want to compare the Relative strength for these:

• EWA iShares MSCI Australia Index Fund

• EWC iShares MSCI Canada Index Fund

• EWD iShares MSCI Sweden Index Fund

• EWG iShares MSCI Germany Index Fund

• EWH iShares MSCI Hong Kong Index Fund

• EWI iShares MSCI Italy Index Fund

• EWJ iShares MSCI Japan Index Fund

• EWK iShares MSCI Belgium Index Fund

• EWL iShares MSCI Switzerland Index Fund

• EWM iShares MSCI Malaysia Index Fund

• EWN iShares MSCI Netherlands Index Fund

• EWO iShares MSCI Austria Index Fund

• EWP iShares MSCI Spain Index Fund

• EWQ iShares MSCI France Index Fund

• EWS iShares MSCI Singapore Index Fund

• EWU iShares MSCI United Kingdom Index Fund

• EWW iShares MSCI Mexico Index Fund

• EWT iShares MSCI Taiwan Index Fund

• EWY iShares MSCI South Korea Index Fund

• EWZ iShares MSCI Brazil Index Fund

• EZA iShares MSCI South Africa Index Fund

Each of them should be compared to the SP500 (SPY index). Calculate the relative strength of each of them to SPY and have it normalized (I think it is the only solution)

### CompsciOverflow

#### effect on storage of Tertiary language [duplicate]

Binary language, or machine language, is the basic language of a computer system and consists of zeros and ones. It is a well-known fact that modern digital computers understand and store data in binary form, i.e. 0 and 1, where ‘0’ represents OFF/Absent while ‘1’ represents ON/Present in electronic circuits.

Now assume that you are designing a computer system that is supposed to operate on tertiary language i.e. -1, 0, 1.

In your opinion how will it affect the storage of computer systems? Support your answer with valid arguments.
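As a starting point for the storage argument, one can count how many digits each base needs to label the same number of distinct values. The sketch below uses a hypothetical helper name and ignores how a three-state cell would be built physically.

```python
def digits_needed(n_values, base):
    """Smallest number of base-`base` digits that can label n_values distinct values."""
    digits, capacity = 0, 1
    while capacity < n_values:
        capacity *= base
        digits += 1
    return digits


bits = digits_needed(256, 2)   # binary digits for 256 values
trits = digits_needed(256, 3)  # ternary digits for the same 256 values
```

Each ternary digit carries log2(3) ≈ 1.585 bits of information, so fewer storage cells are needed, but each cell has to distinguish three states instead of two.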

### StackOverflow

#### Accuracy.meas function in ROSE package of R

I am using the accuracy.meas function of the ROSE package in R. I got the error "Response must have two levels", so I checked both parameters, response and predicted1, but both are numeric. Are there some limitations on the usability of the accuracy.meas function?

Note- The answer is wrong but it has nothing to do with error

accuracy.meas(test$Walc,predicted1,threshold = 0.5)
Error in accuracy.meas(response=test$Walc,predicted= predicted1, threshold = 0.5) :
Response must have two levels.

>test$Walc
[1] 1 1 1 3 3 3 1 1 2 2 1 2 1 1 3 3 1 1 1 1 3 1 1 4 2 1 1 1 1 4 4 4 5 1 1 1 1 3 1 2 3
[42] 1 5 1 4 4 1 2 2 2 1 2 2 3 2 3 1 2 1 5 1 1 3 2 2 1 1 1 1 1 1 1 2 1 1 3 3 3 2 3 1 2
[83] 2 2 1 1 3 1 1 1 2 3 3 1 1 3 1 2 1 5 2 2 1 2 1 1 2 2 1 1 3 1 2 1 1 1 3 1 1 1 1 1 1
[124] 3 3 3 4 1 1 1 1 4 1 1 1 1 3 2 1 3 3 1 1 1 1 1 1 1 1 5 1 1 1 3 1 1 1 3 4 1 3 2 4 5
[165] 2 1 1 2 1 1 2 3 1 4 1 2 1 4 4 5 1 1 5 3 5 4 5 2 4 2 2 4 1 5 5 4 2 2 1 4 4 4 2 3 4
[206] 2 3 4 4 5 2 3 4 5 5 3 2 4 4 1 5 5 5 3 2 2 4 1 5 5 2 1 1 1 2 3 3 2 1 1 3 4 1 1 1 4
[247] 1 3 1 2 2 3 3 2 2 2 2 1 2 1 1 1 1 3 1 1 1 1 1 1 1 2 1 1 3 1 1 4 3 5 2 2 4 3 4 2 3
[288] 5 5 3 1 1 3 4 4 4 3 4 5 3 3 3 3 3 4 4 3 1 3 3 4 3
> predicted1
[1] 2 2 1 2 2 2 1 1 1 2 2 2 1 1 4 4 1 1 1 1 3 2 2 3 2 2 1 2 2 2 2 2 5 3 3 2 2 2 1 1 2
[42] 1 3 2 3 3 2 2 2 2 2 2 2 3 1 3 2 1 2 4 2 3 2 3 3 1 2 2 2 1 1 2 2 1 1 2 2 3 1 2 2 2
[83] 2 2 1 1 3 2 2 1 1 3 3 1 2 2 2 3 1 3 3 3 1 2 1 2 1 2 3 1 3 2 2 2 2 2 2 2 2 2 2 1 2
[124] 4 1 4 4 2 1 1 2 1 1 2 1 1 2 2 2 3 3 1 1 1 1 2 1 1 1 4 2 1 1 2 2 1 2 2 3 1 2 2 3 4
[165] 2 2 2 3 2 1 2 2 2 4 1 2 2 4 4 5 1 1 5 2 5 4 4 2 4 3 2 2 1 4 4 2 2 2 1 4 2 3 2 3 4
[206] 3 2 4 4 5 2 2 4 4 5 4 3 3 3 2 4 4 4 3 1 2 2 2 4 4 1 1 2 2 2 3 3 1 2 1 2 2 1 1 3 2
[247] 2 2 1 4 2 2 4 2 2 2 2 2 2 2 1 1 3 2 1 2 2 2 2 1 1 2 2 2 4 4 2 3 3 5 2 2 3 3 3 3 3
[288] 3 5 4 2 2 4 4 5 4 3 4 5 3 4 4 3 3 3 3 3 2 4 4 2 3

### QuantOverflow

#### Monetary Policy and the Yield Curve PART TWO

The Fed has a number of tools/targets with which they manage monetary policy. I'm looking to refine a concise summary of them and looking for guidance/correction/validation.

Think I understand these first three. Please correct me if I'm wrong:

1. Open Market Operations: The Federal Open Market Committee (FOMC) will often instruct the Federal Reserve Bank of New York to engage in open market operations (buying and selling of US securities) to influence interest rates.
Movement at all maturities on the yield curve can reflect such operations; the Fed has been known to try and alter the shape/slope of the curve.

2. The Discount Window: offers various types of credit at the discount rate; designed for times of stress; rates are high (penalty rates - see Bagehot's Dictum); use of discount window credit may spark regulator investigation. Discount Window credit is typically overnight (primary/secondary) or less than 9 months in the case of seasonal loans. Changes in the discount rate only affect the short end of the yield curve.

3. The Fed Fund rate: overnight rate at which reserve balances, held by banks at the Fed, can be lent to each other. This rate is calculated from market transactions. The Fed determines its FF rate target and uses open market operations to move the Fed Funds rate toward a particular level. Whilst the Fed Fund rate is an overnight rate, it can be related to longer term movements on the yield curve (1-month treasury bill for example) but there are differences; notably the Fed Funds rate, being a market rate, does vary, while the yield on a 1-month Treasury is effectively fixed at the time of purchase. The relationship between the expected values of a fixed rate and a floating rate is expressed through Overnight Indexed Swap values, and 1-month OIS on the Fed Funds rate is the best direct indication of the expected value of compounded overnight borrowing in the Fed Funds market.

I'm looking for further confirmation/understanding on the next two:

1. The reverse repo program, which enables it to set a floor under short-term secured borrowing rates. This makes sense: reverse repo = sell security, collect payment from bank, reduce their fed reserve balance, decrease supply of money in the system and put upwards pressure on the federal funds rate for example. Is this logic correct?

2.
The interest rate on excess reserves (IOER); from comments on my prior question, I understand that this rate sets the ceiling for fed funds. IOER = interest paid on balances above the required level; how does that set a ceiling? Sounds more like a floor; for a bank to lend its excess reserves, they would want a higher rate than the IOER?

This is a follow on from part one which was posted here.

### StackOverflow

#### Applying iterative function to every group in pandas DataFrame

I have a large pandas DataFrame with the following format:

            prod_id     timestamp     text
    150523  0006641040  9.393408e+08  text_1
    150500  0006641040  9.408096e+08  text_2
    150499  0006641041  1.009325e+09  text_3
    150508  0006641041  1.018397e+09  text_4
    150524  0006641042  1.025482e+09  text_5

The DataFrame is sorted by prod_id and timestamp. What I am trying to do is to enumerate a counter for every prod_id based on timestamp, from earliest to latest. For example, I am trying to achieve something like this:

            prod_id     timestamp     text    enum
    150523  0006641040  9.393408e+08  text_1  1
    150500  0006641040  9.408096e+08  text_2  2
    150499  0006641041  1.009325e+09  text_3  1
    150508  0006641041  1.018397e+09  text_4  2
    150524  0006641042  1.025482e+09  text_5  1

I can do this iteratively quite easily by going through each row and increasing a counter, but is there a way to do this in a more functional programming fashion?

Thanks

### XKCD

#### Politifact

### StackOverflow

#### Optimize Choices - Maximize Set From Selecting A Limited Number Of Bags From A List Of Bags

I have a list of bags. If allowed N selections, how do I choose the N bags that will maximize my set?

e.g.

    choices = [ [A,A,Z], [B,A,E], [Z,Z,B,W], [Q], ...]

If N = 2 I would like to know that...

    choices[1, 2]

Maximizes my set...

    set = [B, A, E, Z, W]

I'm trying to force fit this into a gradient descent format but I'm having trouble creating a cost function for this. Is this a correct/reasonable approach? What is the best way to solve this?
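
One possible direction, sketched under assumptions rather than offered as a definitive answer: choosing N bags to maximize the number of distinct items is an instance of the maximum coverage problem, and the usual starting point for it is a greedy heuristic (repeatedly take the bag that adds the most new items) rather than gradient descent. The function name `greedy_cover` and the tie-breaking rule (lowest index wins) are illustrative choices, not part of the question.

```python
def greedy_cover(choices, n):
    """Greedily pick up to n bags, each time taking the bag that adds the most new items."""
    covered = set()                       # distinct items collected so far
    picked = []                           # indices of the chosen bags, in pick order
    remaining = set(range(len(choices)))  # indices of bags not yet chosen
    for _ in range(n):
        if not remaining:
            break
        # Gain = number of items in the bag that are not covered yet.
        # Ties are broken in favour of the lowest index (an arbitrary choice).
        best = max(remaining, key=lambda i: (len(set(choices[i]) - covered), -i))
        if not set(choices[best]) - covered:
            break                         # no bag adds anything new
        picked.append(best)
        covered |= set(choices[best])
        remaining.discard(best)
    return picked, covered


# The example from the question, with letters written as strings.
choices = [["A", "A", "Z"], ["B", "A", "E"], ["Z", "Z", "B", "W"], ["Q"]]
picked, covered = greedy_cover(choices, 2)
print(picked, sorted(covered))
```

On the example above this selects bags 1 and 2, matching the expected set. The greedy result is not guaranteed to be optimal in general, but for maximum coverage it is known to reach at least a (1 - 1/e) fraction of the optimum, which may be good enough when a local optimum is acceptable.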
Notes: Assume the list of choices is large enough that computing every possible combination of choices is not possible. Assume a local optimum solution is acceptable.

### QuantOverflow

#### Matlab Neural Network data organization

I'm trying to train a NARX network using time series data. I've got 80 sets of data I'd like to train the network with. For clarification, one set of data comprises 6 financial indicators of X company as the input and the default probability of the company as the target, with 8 timesteps each. All in all I have the financial indicators and the default probability of 80 companies, hence 80 datasets.

I would like to train the network to handle any time series of any company and then to do a one step prediction of the future default probability.

1. This is essentially a time series problem with independent samples. I'd really like some advice on how to organize my data for training as my limited understanding of Matlab's NN is that it can't train a network with independent sets of time series data.

2. For my mentioned purpose, does generating a script or the genfunction option in Matlab's GUI make a difference?

Thanks a lot!

### CompsciOverflow

#### How to determine (with some confidence) if two extremely long lists of numbers (terabytes) are similar?

I have two very large files with numbers separated by commas. What is the most efficient/fastest way to say with some confidence that these two files have the same numbers?

The main criterion is whether the two distributions are the same; the order of the data also matters. For example:

    A = {1 2 4 5 6}
    B = {6 5 4 2 1}
    C = {0.5 0.5 1 2 3 4 5 6}

In the above case A and C are still similar, but A and B are totally different.

To be more clear, let's split the question into 3 main tasks:

1. Given large lists A, B, find if, let's say, 50% of A is contained in B (most efficient technique without comparing all numbers from the list).

2. 
Given that A, B match the above condition, check that the overlapping numbers have a similar order as mentioned above, i.e. if 2 in A is followed by 1, the same should happen in B. The positions of 1, 2 can be different in B.

3. Select two independent batches from A and B and test if those are from a Poisson distribution, binomial distribution.

### Lobsters

#### The hardest problem in computer science

### StackOverflow

#### H2O machine learning platform for Python incurs EnvironmentError while building models

I am new to the h2o machine learning platform and am having the issue below while trying to build models. When I was trying to build 5 GBM models with a not so large dataset, it produced the following error:

    gbm Model Build Progress: [##################################################] 100%
    gbm Model Build Progress: [##################################################] 100%
    gbm Model Build Progress: [##################################################] 100%
    gbm Model Build Progress: [##################################################] 100%
    gbm Model Build Progress: [#################                                 ] 34%

    EnvironmentError    Traceback (most recent call last)
    <ipython-input-22-e74b34df2f1a> in <module>()
         13 params_model={'x': features_pca_all, 'y': response, 'training_frame': train_holdout_pca_hex, 'validation_frame': validation_holdout_pca_hex, 'ntrees': ntree, 'max_depth':depth, 'min_rows': min_rows, 'learn_rate': 0.005}
         14
    ---> 15 gbm_model=h2o.gbm(**params_model)
         16
         17 #store model

    C:\Anaconda2\lib\site-packages\h2o\h2o.pyc in gbm(x, y, validation_x, validation_y, training_frame, model_id, distribution, tweedie_power, ntrees, max_depth, min_rows, learn_rate, nbins, nbins_cats, validation_frame, balance_classes, max_after_balance_size, seed, build_tree_one_node, nfolds, fold_column, fold_assignment, keep_cross_validation_predictions, score_each_iteration, offset_column, weights_column, do_future, checkpoint)
       1058 parms = {k:v for k,v in locals().items() if k in ["training_frame", "validation_frame", "validation_x", "validation_y", "offset_column", "weights_column", "fold_column"] or v is not None}
       1059 parms["algo"]="gbm"
    -> 1060 return h2o_model_builder.supervised(parms)
       1061
       1062

    C:\Anaconda2\lib\site-packages\h2o\h2o_model_builder.pyc in supervised(kwargs)
         28 algo = kwargs["algo"]
         29 parms={k:v for k,v in kwargs.items() if (k not in ["x","y","validation_x","validation_y","algo"] and v is not None) or k=="validation_frame"}
    ---> 30 return supervised_model_build(x,y,vx,vy,algo,offsets,weights,fold_column,parms)
         31
         32 def unsupervised_model_build(x,validation_x,algo_url,kwargs): return _model_build(x,None,validation_x,None,algo_url,None,None,None,kwargs)

    C:\Anaconda2\lib\site-packages\h2o\h2o_model_builder.pyc in supervised_model_build(x, y, vx, vy, algo, offsets, weights, fold_column, kwargs)
         16 if not is_auto_encoder and y is None: raise ValueError("Missing response")
         17 if vx is not None and vy is None: raise ValueError("Missing response validating a supervised model")
    ---> 18 return _model_build(x,y,vx,vy,algo,offsets,weights,fold_column,kwargs)
         19
         20 def supervised(kwargs):

    C:\Anaconda2\lib\site-packages\h2o\h2o_model_builder.pyc in _model_build(x, y, vx, vy, algo, offsets, weights, fold_column, kwargs)
         86 do_future = kwargs.pop("do_future") if "do_future" in kwargs else False
         87 future_model = H2OModelFuture(H2OJob(H2OConnection.post_json("ModelBuilders/"+algo, **kwargs), job_type=(algo+" Model Build")), x)
    ---> 88 return future_model if do_future else _resolve_model(future_model, **kwargs)
         89
         90 def _resolve_model(future_model, **kwargs):

    C:\Anaconda2\lib\site-packages\h2o\h2o_model_builder.pyc in _resolve_model(future_model, **kwargs)
         89
         90 def _resolve_model(future_model, **kwargs):
    ---> 91 future_model.poll()
         92 if '_rest_version' in kwargs.keys(): model_json = H2OConnection.get_json("Models/"+future_model.job.dest_key, _rest_version=kwargs['_rest_version'])["models"][0]
         93 else: model_json = H2OConnection.get_json("Models/"+future_model.job.dest_key)["models"][0]

    C:\Anaconda2\lib\site-packages\h2o\model\model_future.pyc in poll(self)
          8
          9 def poll(self):
    ---> 10 self.job.poll()
         11 self.x = None

    C:\Anaconda2\lib\site-packages\h2o\job.pyc in poll(self)
         39 time.sleep(sleep)
         40 if sleep < 1.0: sleep += 0.1
    ---> 41 self._refresh_job_view()
         42 running = self._is_running()
         43 self._update_progress()

    C:\Anaconda2\lib\site-packages\h2o\job.pyc in _refresh_job_view(self)
         52
         53 def _refresh_job_view(self):
    ---> 54 jobs = H2OConnection.get_json(url_suffix="Jobs/" + self.job_key)
         55 self.job = jobs["jobs"][0] if "jobs" in jobs else jobs["job"][0]
         56 self.status = self.job["status"]

    C:\Anaconda2\lib\site-packages\h2o\connection.pyc in get_json(url_suffix, **kwargs)
        410 if __H2OCONN__ is None:
        411 raise ValueError("No h2o connection. Did you run h2o.init() ?")
    --> 412 return __H2OCONN__._rest_json(url_suffix, "GET", None, **kwargs)
        413
        414 @staticmethod

    C:\Anaconda2\lib\site-packages\h2o\connection.pyc in _rest_json(self, url_suffix, method, file_upload_info, **kwargs)
        419
        420 def _rest_json(self, url_suffix, method, file_upload_info, **kwargs):
    --> 421 raw_txt = self._do_raw_rest(url_suffix, method, file_upload_info, **kwargs)
        422 return self._process_tables(raw_txt.json())
        423

    C:\Anaconda2\lib\site-packages\h2o\connection.pyc in _do_raw_rest(self, url_suffix, method, file_upload_info, **kwargs)
        476
        477 begin_time_seconds = time.time()
    --> 478 http_result = self._attempt_rest(url, method, post_body, file_upload_info)
        479 end_time_seconds = time.time()
        480 elapsed_time_seconds = end_time_seconds - begin_time_seconds

    C:\Anaconda2\lib\site-packages\h2o\connection.pyc in _attempt_rest(self, url, method, post_body, file_upload_info)
        526
        527 except requests.ConnectionError as e:
    --> 528 raise EnvironmentError("h2o-py encountered an unexpected HTTP error:\n {}".format(e))
        529
        530 return http_result

    EnvironmentError: h2o-py encountered an unexpected HTTP error:
     ('Connection aborted.', BadStatusLine("''",))

My hunch is that the cluster memory is only around 247.5 MB, which is not enough to handle the model building, hence the aborted connection to h2o. Here is the code I used to initiate h2o:

    # initialization of h2o module
    import h2o
    import subprocess as sp
    import sys
    import os.path as p

    # path of h2o jar file
    h2o_path = p.join(sys.prefix, "h2o_jar", "h2o.jar")

    # subprocess to launch h2o
    # the command can be further modified to include virtual machine parameters
    sp.Popen("java -jar " + h2o_path)

    # h2o.init() call to verify that h2o launch is successful
    h2o.init(ip="localhost", port=54321, size=1, start_h2o=False, enable_assertions=False, \
             license=None, max_mem_size_GB=4, min_mem_size_GB=4, ice_root=None)

and here is the returned status table:

Any ideas on the above would be greatly appreciated!!

### QuantOverflow

#### What data sources are available online?

What sources of financial and economic data are available online? Which ones are free or cheap? What has your experience been like with these data sources?

#### VAR FPCA analysis paper replication

I've been trying to replicate the following publication: toronto.edu/sjaimung/papers/VAR-FPCA.pdf but I haven't been able to get the same results estimating the $\beta_{k}$ parameters.

First, I got the data from Bloomberg (CL Comdty contracts) for the period specified in the paper, but I don't get how the $\tau$ parameter is specified. I am using it as number of years to maturity, but with the 70 contracts available in bbg it seems like I won't fit the $\tau$ that appears in the publication's charts.

Also, assuming I'm correct, when I fit the $\beta_{k}$ using MLS with the common formula ($\beta_{k} = (X^{T}X)^{-1}X^{T}y$) I get, for the first beta, a similar chart but with a different scale:

I guess that the parameter calculation is correct, because when I plot the curve against the data it seems very close and the error is quite small, but I can't get the same $\beta_{k}$ scale:

any suggestions? Thanks to all!!
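
For reference, a small sketch of the least-squares estimate $\hat\beta = (X^{T}X)^{-1}X^{T}y$, written in plain Python on synthetic data (not the Bloomberg data or the paper's basis functions). It also illustrates one possible source of a scale mismatch, offered only as an assumption to check: if a column of $X$ is rescaled by a constant $c$, the fitted curve is unchanged but the corresponding coefficient is rescaled by $1/c$. The helper names `solve` and `ols` are illustrative.

```python
def solve(a, b):
    """Solve the square linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix [a | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):                      # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(X, y):
    """Return beta = (X^T X)^{-1} X^T y by solving the normal equations."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic data generated exactly as y = 2*x1 + 3*x2.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
y = [2.0 * a + 3.0 * b for a, b in X]
beta = ols(X, y)

# Same data, but with the first regressor multiplied by 10.
X_scaled = [[10.0 * a, b] for a, b in X]
beta_scaled = ols(X_scaled, y)

print(beta, beta_scaled)
```

Here the first coefficient drops by a factor of 10 after rescaling while the fit itself stays the same, so a chart with the right shape but a different scale could come from a different normalization of the regressors (or of $\tau$) than the one used in the paper.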
### arXiv Distributed, Parallel, and Cluster Computing

#### Exploiting Workload Cycles for Orchestration of Virtual Machine Live Migrations in Clouds. (arXiv:1607.07846v1 [cs.DC])

Virtual machine live migration in cloud environments aims at reducing energy costs and increasing resource utilization. However, its potential has not been fully explored because of simultaneous migrations that may cause user application performance degradation and network congestion. Research efforts on live migration orchestration policies still mostly rely on system level metrics. This work introduces an Application-aware Live Migration Architecture (ALMA) that selects suitable moments for migrations using application characterization data. This characterization consists in recognizing resource usage cycles via Fast Fourier Transform. From our experiments, live migration times were reduced by up to 74% for benchmarks and by up to 67% for real applications, when compared to migration policies with no application workload analysis. Network data transfer during the live migration was reduced by up to 62%.

#### Multi-Variant Execution of Parallel Programs. (arXiv:1607.07841v1 [cs.CR])

Multi-Variant Execution Environments (MVEEs) are a promising technique to protect software against memory corruption attacks. They transparently execute multiple, diversified variants (often referred to as replicae) of the software receiving the same inputs. By enforcing and monitoring the lock-step execution of the replicae's system calls, and by deploying diversity techniques that prevent an attacker from simultaneously compromising multiple replicae, MVEEs can block attacks before they succeed. Existing MVEEs cannot handle non-trivial multi-threaded programs because their non-deterministic behavior introduces benign system call inconsistencies in the replicae, which trigger false positive detections and deadlocks in the MVEEs.
This paper for the first time extends the generality of MVEEs to protect multi-threaded software by means of secure and efficient synchronization replication agents. On the PARSEC 2.1 parallel benchmarks running with four worker threads, our prototype MVEE incurs a run-time overhead of only 1.32x.

#### Regular Behaviours with Names. (arXiv:1607.07828v1 [cs.LO])

Nominal sets provide a framework to study key notions of syntax and semantics such as fresh names, variable binding and $\alpha$-equivalence on a conveniently abstract categorical level. Coalgebras for endofunctors on nominal sets model, e.g., various forms of automata with names as well as infinite terms with variable binding operators (such as $\lambda$-abstraction). Here, we first study the behaviour of orbit-finite coalgebras for functors $\bar F$ on nominal sets that lift some finitary set functor $F$. We provide sufficient conditions under which the rational fixpoint of $\bar F$, i.e. the collection of all behaviours of orbit-finite $\bar F$-coalgebras, is the lifting of the rational fixpoint of $F$. Second, we describe the rational fixpoint of the quotient functors: we introduce the notion of a sub-strength of an endofunctor on nominal sets, and we prove that for a functor $G$ with a sub-strength the rational fixpoint of each quotient of $G$ is a canonical quotient of the rational fixpoint of $G$. As applications, we obtain a concrete description of the rational fixpoint for functors arising from so-called binding signatures with exponentiation, such as those arising in coalgebraic models of infinitary $\lambda$-terms and various flavours of automata.

#### Natural Steganography: cover-source switching for better steganography. (arXiv:1607.07824v1 [cs.MM])

This paper proposes a new steganographic scheme relying on the principle of cover-source switching, the key idea being that the embedding should switch from one cover-source to another.
The proposed implementation, called Natural Steganography, considers the sensor noise naturally present in the raw images and uses the principle that, by the addition of a specific noise, the steganographic embedding tries to mimic a change of ISO sensitivity. The embedding methodology consists in 1) perturbing the image in the raw domain, 2) modeling the perturbation in the processed domain, 3) embedding the payload in the processed domain. We show that this methodology is easily tractable whenever the processes are known and makes it possible to embed large and undetectable payloads. We also show that already used heuristics such as synchronization of embedding changes or detectability after rescaling can be respectively explained by operations such as color demosaicing and down-scaling kernels.

#### Energy-Efficient Real-Time Scheduling for Two-Type Heterogeneous Multiprocessors. (arXiv:1607.07763v1 [cs.DC])

We propose three novel mathematical optimization formulations that solve the same two-type heterogeneous multiprocessor scheduling problem for a real-time taskset with hard constraints. Our formulations are based on a global scheduling scheme and a fluid model. The first formulation is a mixed-integer nonlinear program, since the scheduling problem is intuitively considered as an assignment problem. However, by changing the scheduling problem to first determine a task workload partition and then to find the execution order of all tasks, the computation time can be significantly reduced. Specifically, the workload partitioning problem can be formulated as a continuous nonlinear program for a system with continuous operating frequency, and as a continuous linear program for a practical system with a discrete speed level set. The task ordering problem can be solved by an algorithm with a complexity that is linear in the total number of tasks. The work is evaluated against existing global energy/feasibility optimal workload allocation formulations.
The results illustrate that our algorithms are both feasibility optimal and energy optimal for both implicit and constrained deadline tasksets. Specifically, our algorithm can achieve up to 40% energy saving for some simulated tasksets with constrained deadlines. The benefit of our formulation compared with existing work is that our algorithms can solve a more general class of scheduling problems due to incorporating a scheduling dynamic model in the formulations and allowing for a time-varying speed profile. Moreover, our algorithms can be applied to both online and offline scheduling schemes.

#### From Graph Isoperimetric Inequality to Network Connectivity -- A New Approach. (arXiv:1607.07761v1 [cs.DC])

We present a novel approach to obtaining a network's connectivity. More specifically, we show that there exists a relationship between a network's graph isoperimetric properties and its conditional connectivity. A network's connectivity is the minimum number of nodes whose removal will cause the network to be disconnected. It is a basic and important measure for the network's reliability, hence its overall robustness. Several conditional connectivities have been proposed in the past for the purpose of accurately reflecting various realistic network situations, with extra connectivity being one such conditional connectivity. In this paper, we will use isoperimetric properties of the hypercube network to obtain its extra connectivity. The result of the paper for the first time establishes a relationship between the age-old isoperimetric problem and network connectivity.

#### New security notions and feasibility results for authentication of quantum data. (arXiv:1607.07759v1 [cs.CR])

We give a new class of security definitions for authentication in the quantum setting. Our definitions capture and strengthen several existing definitions, including superposition attacks on classical authentication, as well as full authentication of quantum data.
We argue that our definitions resolve some of the shortcomings of existing definitions. We then give several feasibility results for our strong definitions. As a consequence, we obtain several interesting results, including: (1) the classical Carter-Wegman authentication scheme with 3-universal hashing is secure against superposition attacks, as well as adversaries with quantum side information; (2) quantum authentication where the entire key can be reused if verification is successful; (3) conceptually simple constructions of quantum authentication; and (4) a conceptually simple QKD protocol.

#### Leveraging the Potential of Control-Flow Error Resilient Techniques in Multithreaded Programs. (arXiv:1607.07727v1 [cs.PL])

This paper presents a software-based technique to recover control-flow errors in multithreaded programs. Control-flow error recovery is achieved through inserting additional instructions into a multithreaded program at compile time according to two dependency graphs. These graphs are extracted to model control-flow and data dependencies among basic blocks and thread interactions between different threads of a program. In order to evaluate the proposed technique, three multithreaded benchmarks (quick sort, matrix multiplication and linked list) are run on a multi-core processor, and a total of 5000 transient faults has been injected into several executable points of each program. The results show that this technique detects and corrects between 91.9% and 93.8% of the injected faults with acceptable performance and memory overheads.

#### Discovering, quantifying, and displaying attacks. (arXiv:1607.07720v1 [cs.CR])

In the design of software and cyber-physical systems, security is often perceived as a qualitative need, but can only be attained quantitatively. Especially when distributed components are involved, it is hard to predict and confront all possible attacks.
A main challenge in the development of complex systems is therefore to discover attacks, quantify them to comprehend their likelihood, and communicate them to non-experts for facilitating the decision process. To address this three-sided challenge we propose a protection analysis over the Quality Calculus that (i) computes all the sets of data required by an attacker to reach a given location in a system, (ii) determines the cheapest set of such attacks for a given notion of cost, and (iii) derives an attack tree that displays the attacks graphically. The protection analysis is first developed in a qualitative setting, and then extended to quantitative settings following an approach applicable to a great many contexts. The quantitative formulation is implemented as an optimisation problem encoded into Satisfiability Modulo Theories, allowing us to deal with complex cost structures. The usefulness of the framework is demonstrated on a national-scale authentication system, studied through a Java implementation of the framework.

#### Domains and Random Variables. (arXiv:1607.07698v1 [cs.LO])

The aim of this paper is to establish a theory of random variables on domains. Domain theory is a fundamental component of theoretical computer science, providing mathematical models of computational processes. Random variables are the mainstay of probability theory. Since computational models increasingly involve probabilistic aspects, it's only natural to explore the relationship between these two areas. Our main results show how to cast results about random variables using a domain-theoretic approach. The pay-off is an extension of the results from probability measures to sub-probability measures. We also use our approach to extend the class of domains for which we can classify the domain structure of the space of sub-probability measures.

#### Parallelized Proximity-Based Query Processing Methods for Road Networks. (arXiv:1607.07696v1 [cs.DC])

In this paper, we propose a paradigm for processing in parallel graph joins in road networks. The methodology we present can be used for distance join processing among the elements of two disjoint sets R, S of nodes from the road network, with R preceding S, where we search for the pairs of vertices (u,v), with u in R and v in S, such that dist(u,v) < $\theta$. Another variation of the problem would involve retrieving the k closest pairs (u,v) in the road network with u in R and v in S, such that dist(u,v) <= dist(w,y), where w,y do not belong in the result. We reckon that this is an extremely useful paradigm with many practical applications. A typical example of usage of our methods would be to find the pairs of restaurants and bars (in that order) from which to select for a night out, that either fall within walking distance for example, or just the k closest pairs, depending on the parameters. Another entirely different scenario would involve finding the points of two distinct trajectories that are within a certain distance predicate, or the k closest such points. For example, we would like to transfer a few tons of freight from one train to another, and hence we want to minimize the distance we have to cover for moving the cargo from the carrying train to the other. We reckon that this endeavor of ours covers exactly those needs for processing such queries efficiently. Moreover, for the specific purposes of this paper, we also propose a novel heuristic graph partitioning scheme. It resembles a recursive bisection method, and is tailored to the requirements of the problem, aiming at establishing well separated partitions, so as to allow computations to be performed simultaneously and independently within each partition, unlike hitherto work that aims at minimizing either the number of edges among different partitions, or the number of nodes thereof.

#### The Price of Anarchy in Auctions. (arXiv:1607.07684v1 [cs.GT])

This survey outlines a general and modular theory for proving approximation guarantees for equilibria of auctions in complex settings. This theory complements traditional economic techniques, which generally focus on exact and optimal solutions and are accordingly limited to relatively stylized settings. We highlight three user-friendly analytical tools: smoothness-type inequalities, which immediately yield approximation guarantees for many auction formats of interest in the special case of complete information and deterministic strategies; extension theorems, which extend such guarantees to randomized strategies, no-regret learning outcomes, and incomplete-information settings; and composition theorems, which extend such guarantees from simpler to more complex auctions. Combining these tools yields tight worst-case approximation guarantees for the equilibria of many widely-used auction formats.

#### Context-based Pseudonym Changing Scheme for Vehicular Adhoc Networks. (arXiv:1607.07656v1 [cs.CR])

Vehicular adhoc networks allow vehicles to share their information for safety and traffic efficiency. However, sharing information may threaten the driver privacy because it includes spatiotemporal information and is broadcast publicly and periodically. In this paper, we propose a context-adaptive pseudonym changing scheme which lets a vehicle decide autonomously when to change its pseudonym and how long it should remain silent to ensure unlinkability. This scheme adapts dynamically based on the density of the surrounding traffic and the user privacy preferences. We employ a multi-target tracking algorithm to measure privacy in terms of traceability in realistic vehicle traces. We use Monte Carlo analysis to estimate the quality of service (QoS) of a forward collision warning application when vehicles apply this scheme.
According to the experimental results, the proposed scheme provides a better compromise between traceability and QoS than a random silent period scheme.

#### Minimum rank and zero forcing number for butterfly networks. (arXiv:1607.07522v1 [math.CO])

The minimum rank of a simple graph $G$ is the smallest possible rank over all symmetric real matrices $A$ whose nonzero off-diagonal entries correspond to the edges of $G$. Using the zero forcing number, we prove that the minimum rank of the butterfly network is $\frac19\left[(3r+1)2^{r+1}-2(-1)^r\right]$ and that this is equal to the rank of its adjacency matrix.

#### Realistic DNA De-anonymization using Phenotypic Prediction. (arXiv:1607.07501v1 [cs.CR])

There are a number of vectors for attack when trying to link an individual to a certain DNA sequence. Phenotypic prediction is one such vector; linking DNA to an individual based on their traits. Current approaches are not overly effective, due to a number of real world considerations. This report will improve upon current phenotypic prediction, and suggest a number of methods for defending against such an attack.

#### Adaptive Closed Loop OFDM-Based Resource Allocation Method using Machine Learning and Genetic Algorithm. (arXiv:1607.07494v1 [cs.NI])

In this paper, the concept of Machine Learning (ML) is introduced to the Orthogonal Frequency Division Multiple Access-based (OFDMA-based) scheduler. Similar to the impact of the Channel Quality Indicator (CQI) on the scheduler in the Long Term Evolution (LTE), ML is utilized to provide the scheduler with pertinent information about the User Equipment (UE) traffic patterns, demands, Quality of Service (QoS) requirements, instantaneous user throughput and other network conditions. An adaptive ML-based framework is proposed in order to optimize the LTE scheduler operation. The proposed technique targets multiple objective scheduling strategies.
The weights of the different objectives are adjusted to optimize the resource allocation per transmission based on the UEs' demand pattern. In addition, it overcomes the trade-off problem of the traditional scheduling methods. The technique can be used as a generic framework with any scheduling strategy. In this paper, a Genetic Algorithm-based (GA-based) multi-objective scheduler is considered to illustrate the efficiency of the proposed adaptive scheduling solution. Results show that using the combination of clustering and classification algorithms along with the GA optimizes the GA scheduler functionality and makes use of the ML process to form a closed loop scheduling mechanism.

#### Füredi-Hajnal limits are typically subexponential. (arXiv:1607.07491v1 [math.CO])

A binary matrix is a matrix with entries from the set $\{0,1\}$. We say that a binary matrix $A$ contains a binary matrix $S$ if $S$ can be obtained from $A$ by removal of some rows, some columns, and changing some $1$-entries to $0$-entries. If $A$ does not contain $S$, we say that $A$ avoids $S$. A $k$-permutation matrix $P$ is a binary $k \times k$ matrix with exactly one $1$-entry in every row and one $1$-entry in every column. The Füredi-Hajnal conjecture, proved by Marcus and Tardos, states that for every permutation matrix $P$, there is a constant $c_P$ such that for every $n \in \mathbb{N}$, every $n \times n$ binary matrix $A$ with at least $c_P n$ $1$-entries contains $P$. We show that $c_P \le 2^{O(k^{2/3}\log^{7/3}k / (\log\log k)^{1/3})}$ asymptotically almost surely for a random $k$-permutation matrix $P$. We also show that $c_P \le 2^{(4+o(1))k}$ for every $k$-permutation matrix $P$, improving the constant in the exponent of a recent upper bound on $c_P$ by Fox.
We also consider a higher-dimensional generalization of the Stanley-Wilf conjecture about the number of $d$-dimensional $n$-permutation matrices avoiding a fixed $d$-dimensional $k$-permutation matrix, and prove almost matching upper and lower bounds of the form $(2^k)^{O(n)} \cdot (n!)^{d-1-1/(d-1)}$ and $n^{-O(k)} k^{\Omega(n)} \cdot (n!)^{d-1-1/(d-1)}$, respectively.

#### Automatic Construction of Statechart-Based Anomaly Detection Models for Multi-Threaded Industrial Control Systems. (arXiv:1607.07489v1 [cs.CR])

Traffic of Industrial Control System (ICS) between the Human Machine Interface (HMI) and the Programmable Logic Controller (PLC) is known to be highly periodic. However, it is sometimes multiplexed, due to asynchronous scheduling. Modeling the network traffic patterns of multiplexed ICS streams using Deterministic Finite Automata (DFA) for anomaly detection typically produces a very large DFA, and a high false-alarm rate. We introduce a new modeling approach that addresses this gap. Our Statechart DFA modeling includes multiple DFAs, one per cyclic pattern, together with a DFA-selector that de-multiplexes the incoming traffic into sub-channels and sends them to their respective DFAs. We demonstrate how to automatically construct the Statechart from a captured traffic stream. Our unsupervised learning algorithm builds a Discrete-Time Markov Chain (DTMC) from the stream. Next it splits the symbols into sets, one per multiplexed cycle, based on symbol frequencies and node degrees in the DTMC graph. Then it creates a sub-graph for each cycle, and extracts Euler cycles for each sub-graph. The final Statechart is comprised of one DFA per Euler cycle. The algorithms allow for non-unique symbols, that appear in more than one cycle, and also for symbols that appear more than once in a cycle. We evaluated our solution on traces from a production ICS using the Siemens S7-0x72 protocol.
We also stress-tested our algorithms on a collection of synthetically-generated traces that simulated multiplexed ICS traces with varying levels of symbol uniqueness and time overlap. The algorithms were able to split the symbols into sets with 99.6% accuracy. The resulting Statechart modeled the traces with a low median false-alarm rate of 0.483%. In all but the most extreme scenarios the Statechart model drastically reduced both the false-alarm rate and the learned model size compared to a naive single-DFA model.

### QuantOverflow

#### Calculate the 0.50 Beta of an Index

I am trying to come up with a benchmark 0.50 Beta S&P 500 Index.

I have 1 year of time series data for the 500 constituents of the S&P 500 Index. Using the standard stock beta calculation method, how do I calculate or implement the 0.50 beta using the standard formula?

So I have the standard conventional beta calculation:

Beta = Covariance(STOCK % Daily Change, INDEX % Daily Change) / VAR(INDEX % Daily Change)

But I am pretty much stuck here. I have not been able to implement the 0.50 in the computation. I even wonder if it can be done.

I read somewhere that I need to use the 3 Month T-Bill Index in the implementation. Not sure if I need to use this.

Thank you for any help I can get here.

### CompsciOverflow

#### Average Cost Threshold Protocol with Minimum Thresholds: How to find the price?

The protocol is defined here, but I'll give a summary.

Okay, so a number of agents want a certain public good to be constructed (a public good is something like a book, a program, or a statue, something that can optionally benefit everyone once constructed.) It costs $P$ to construct this good. The agents each give an interval that they are willing to pay. Agent $i$'s interval is $[m_i,M_i]$. (In the case of a rational agent, $m_i=0$, but if the agent is somewhat altruistic, $m_i>0$. It has been proven that $M_i$ will always be how much value a rational agent $i$ expects to extract from the public good.)
Now, in this protocol, each person pays the same price $p$, or $m_i$, whichever is greater, or pays nothing at all if $p > M_i$. (Those who pay get access to the good, those who don't are excluded from the good.)

What I am trying to do is find the minimum $p$, such that the total amount paid (let's call it $F(p)$) is equal to or greater than $P$, efficiently.

If all $m_i$ are $0$ and there are $n$ agents, I can do it in $\mathcal O(n \log n)$. I first sort the agents by $M_i$. Then starting with the agent with least $M_i$, I figure out how much funds will be available with $p=M_i$. If $F(M_i) \gt P$, then $M_i$ is the minimum $p$. Otherwise, we go to the next agent, and so on. $F(M_i)$ can be calculated in $\mathcal O (n)$, by multiplying $M_i$ by how many agents come after (or are) agent $i$ (for each agent $j$ that comes after $i$, $M_j \ge M_i$. Since $p=M_i$, not $p>M_j$, and agent $j$ pays $p=M_i$.)

(Note, I've left out the analysis for when $p$ is between the $M_i$'s; it doesn't change it too much.)

My question is, how does one solve this problem efficiently in general, when the $m_i$ may not be $0$?

Note: Sorry if I didn't phrase this clearly. Feel free to ask for clarification in the comments.

### Lobsters

#### I wrote a literate program

### Planet Theory

#### Low-Congestion Shortcuts without Embedding

Authors: Bernhard Haeupler, Taisuke Izumi, Goran Zuzic

Download: PDF

Abstract: Distributed optimization algorithms are frequently faced with solving sub-problems on disjoint connected parts of a network. Unfortunately, the diameter of these parts can be significantly larger than the diameter of the underlying network, leading to slow running times. Recent work by [Ghaffari and Haeupler; SODA'16] showed that this phenomenon can be seen as the broad underlying reason for the pervasive $\Omega(\sqrt{n} + D)$ lower bounds that apply to most optimization problems in the CONGEST model.
On the positive side, this work also introduced low-congestion shortcuts as an elegant solution to circumvent this problem in certain topologies of interest. Particularly, they showed that there exist good shortcuts for any planar network and more generally any bounded genus network. This directly leads to fast $O(D \log^{O(1)} n)$ distributed algorithms for MST and Min-Cut approximation, given that one can efficiently construct these shortcuts in a distributed manner. Unfortunately, the shortcut construction of [Ghaffari and Haeupler; SODA'16] relies heavily on having access to a genus embedding of the network. Computing such an embedding distributedly, however, is a hard problem - even for planar networks. No distributed embedding algorithm for bounded genus graphs is in sight. In this work, we side-step this problem by defining a restricted and more structured form of shortcuts and giving a novel construction algorithm which efficiently finds a shortcut which is, up to a logarithmic factor, as good as the best shortcut that exists for a given network. This new construction algorithm directly leads to an $O(D \log^{O(1)} n)$-round algorithm for solving optimization problems like MST for any topology for which good restricted shortcuts exist - without the need to compute any embedding. This includes the first efficient algorithm for bounded genus graphs.

### Planet Emacsen

#### Irreal: Why Literate Programming: One Man's Answer

Shane Celis has a nice post on why literate programming makes sense. He tells the story of how he and an equally matched colleague worked on a relatively simple project. It was, he says, an ideal situation. One colleague with a similar technical background. What could go wrong? You can read Celis' post to see what went wrong but the TL;DR is that the code quickly diverged from the spec and only one of the researchers understood it.
Celis suggests that using literate programming to document what the code is trying to do would have made the collaboration easier. I've been coding so long that presenting code in its natural order seems like the right way to me so I don't feel the need to document the code in an order different from what the compiler sees. That said, I think it makes a lot of sense to write a document explaining the code in which the actual code is embedded in the document.

Of course, this is ideally suited for Org mode, which is what I would use, but others may find another method better. Regardless, we should all consider how literate programming can enhance our workflow. See Howard Abrams' posts on Literate Devops for a fine example of this.

### Planet Theory

#### A Scalable Algorithm for Tracking an Unknown Number of Targets Using Multiple Sensors

Authors: Florian Meyer, Paolo Braca, Peter Willett, Franz Hlawatsch

Download: PDF

Abstract: We propose a method for tracking an unknown number of targets based on measurements provided by multiple sensors. Our method achieves low computational complexity and excellent scalability by running belief propagation on a suitably devised factor graph. A redundant formulation of data association uncertainty and the use of "augmented target states" including binary target indicators make it possible to exploit statistical independencies for a drastic reduction of complexity. An increase in the number of targets, sensors, or measurements leads to additional variable nodes in the factor graph but not to higher dimensions of the messages. As a consequence, the complexity of our method scales only quadratically in the number of targets, linearly in the number of sensors, and linearly in the number of measurements per sensor. The performance of the method compares well with that of previously proposed methods, including methods with a less favorable scaling behavior.
In particular, our method can outperform multisensor versions of the probability hypothesis density (PHD) filter, the cardinalized PHD filter, and the multi-Bernoulli filter.

#### Unrelated Machine Scheduling of Jobs with Uniform Smith Ratios

Authors: Christos Kalaitzis, Ola Svensson, Jakub Tarnawski

Download: PDF

Abstract: We consider the classic problem of scheduling jobs on unrelated machines so as to minimize the weighted sum of completion times. Recently, for a small constant $\varepsilon > 0$, Bansal et al. gave a $(3/2-\varepsilon)$-approximation algorithm improving upon the natural barrier of $3/2$ which follows from independent randomized rounding. In simplified terms, their result is obtained by an enhancement of independent randomized rounding via strong negative correlation properties.

In this work, we take a different approach and propose to use the same elegant rounding scheme for the weighted completion time objective as devised by Shmoys and Tardos for optimizing a linear function subject to makespan constraints. Our main result is a $1.21$-approximation algorithm for the natural special case where the weight of a job is proportional to its processing time (specifically, all jobs have the same Smith ratio), which expresses the notion that each unit of work has the same weight. In addition, as a direct consequence of the rounding, our algorithm also achieves a bi-criteria $2$-approximation for the makespan objective. Our technical contribution is a tight analysis of the expected cost of the solution compared to the one given by the Configuration-LP relaxation - we reduce this task to that of understanding certain worst-case instances which are simple to analyze.
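
To make the "uniform Smith ratio" condition in the abstract above concrete, here is a minimal single-machine sketch in Python. It is not the paper's algorithm: the helper names `weighted_completion_time` and `smith_order` and the job data are made up for illustration, and jobs are assumed to be `(processing_time, weight)` pairs. It shows Smith's rule (sort by weight over processing time) on a general instance, and then checks that when every weight is proportional to its processing time, the order on a single machine no longer affects the cost, so the remaining difficulty lies in assigning jobs to machines.

```python
from itertools import permutations


def weighted_completion_time(jobs):
    """Sum of weight * completion time for jobs run back to back on one machine.

    `jobs` is a sequence of (processing_time, weight) pairs in schedule order.
    """
    elapsed = 0
    total = 0
    for processing_time, weight in jobs:
        elapsed += processing_time          # completion time of this job
        total += weight * elapsed
    return total


def smith_order(jobs):
    """Order jobs by non-increasing Smith ratio (weight / processing time)."""
    return sorted(jobs, key=lambda job: job[1] / job[0], reverse=True)


# General instance (illustrative data): Smith's rule matches the brute-force optimum.
jobs = [(3, 1), (1, 4), (2, 2)]
best = min(weighted_completion_time(order) for order in permutations(jobs))
smith_cost = weighted_completion_time(smith_order(jobs))

# Uniform Smith ratio: weight = 2 * processing_time for every job.
uniform = [(p, 2 * p) for p in (3, 1, 2)]
uniform_costs = {weighted_completion_time(order) for order in permutations(uniform)}

print(smith_cost, best, uniform_costs)
```

On this toy data every ordering of the uniform-ratio jobs has the same cost, which is why the ordering within a machine drops out of the problem in that special case.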
### CompsciOverflow

#### "Not declared in this scope" error in a school riddle coding [on hold]

```cpp
#include <iostream>

using namespace std;

void rumuslebar(int digit[], int b)
{
    for (int j=0; j<b; j++)
    {
        for (int k=0; k<b; k++)
        {
            if (digit[j]>digit[k])
            {
                int isi;
                isi = digit[k];
                digit[k]= digit [j];
                digit [j]=isi;
                cout<<digit[k]<<digit[j]<<endl;
            }
        }
    }
}

void panjang()
{
    int a=1 ; a+=1;
    int b=3 ; b*=3;
    int c=7 ; c%=10;
    cout<<a<<b<<","<<c;
}

int main()
{
    cout << "Nilai Panjang:";
    int digit[2];
    int jumlahdigit=2;
    int nilaiawal=0;
    for (int i=0; i<2; i++)
    {
        nilaiawal+=1;
        digit[i]=nilaiawal;
    }
    rumuspanjang(digit,jumlahdigit);
    cout <<"Nilai lebar:";
    lebar();
    return 0;
}
```

Error: 'lebar' was not declared in this scope

Error: 'rumuspanjang' was not declared in this scope

I tried to bring the formulas outside of int main, but it didn't work. Thanks

### Planet Theory

#### Computing the k Nearest-Neighbors for all Vertices via Dijkstra

Authors: Sariel Har-Peled

Download: PDF

Abstract: We are given a directed graph $G = (V,E)$ with $n$ vertices and $m$ edges, with positive weights on the edges, and a parameter $k > 0$. We show how to compute, for every vertex $v \in V$, its $k$ nearest-neighbors. The algorithm runs in $O( k ( n \log n + m ) )$ time, and follows by a somewhat careful modification of Dijkstra's shortest path algorithm.

This result is probably folklore, but we were unable to find a reference to it -- thus, this note.

#### Fast Global Convergence of Online PCA

Authors: Zeyuan Allen-Zhu, Yuanzhi Li

Download: PDF

Abstract: We study online principal component analysis (PCA), that is to find the top $k$ eigenvectors of a $d\times d$ hidden matrix $\bf \Sigma$ with online data samples drawn from covariance matrix $\bf \Sigma$. We provide *global* convergence for the low-rank generalization of Oja's algorithm, which is popularly used in practice but lacks theoretical understanding.
Our convergence rate matches the lower bound in terms of the dependency on error, on eigengap and on dimension $d$; in addition, our convergence rate can be made gap-free, that is proportional to the approximation error and independent of the eigengap.

In contrast, for general rank $k$, before our work (1) it was open to design any algorithm with efficient global convergence rate; and (2) it was open to design any algorithm with (even local) gap-free convergence rate.

#### Finding Detours is Fixed-parameter Tractable

Authors: Ivona Bezáková, Radu Curticapean, Holger Dell, Fedor V. Fomin

Download: PDF

Abstract: We consider the following natural "above guarantee" parameterization of the classical Longest Path problem: For given vertices s and t of a graph G, and an integer k, the problem Longest Detour asks for an (s,t)-path in G that is at least k longer than a shortest (s,t)-path. Using insights into structural graph theory, we prove that Longest Detour is fixed-parameter tractable (FPT) on undirected graphs and actually even admits a single-exponential algorithm, that is, one of running time exp(O(k)) poly(n). This matches (up to the base of the exponential) the best algorithms for finding a path of length at least k.

Furthermore, we study the related problem Exact Detour that asks whether a graph G contains an (s,t)-path that is exactly k longer than a shortest (s,t)-path. For this problem, we obtain a randomized algorithm with running time about 2.746^k, and a deterministic algorithm with running time about 6.745^k, showing that this problem is FPT as well. Our algorithms for Exact Detour apply to both undirected and directed graphs.

#### A Selectable Sloppy Heap

Authors: Adrian Dumitrescu

Download: PDF

Abstract: We study the selection problem, namely that of computing the $i$th order statistic of $n$ given elements.
Here we offer a data structure handling a dynamic version in which upon request: (i) a new element is inserted or (ii) an element of a prescribed quantile group is deleted from the data structure. Each operation is executed in (ideal!) constant time, and is thus independent of $n$ (the number of elements stored in the data structure). The design demonstrates how slowing down a certain computation can reduce the response time of the data structure.

#### Quantum Advantage on Information Leakage for Equality

Authors: Juan Miguel Arrazola, Dave Touchette

Download: PDF

Abstract: We prove a lower bound on the information leakage of any classical protocol computing the equality function in the simultaneous message passing (SMP) model. Our bound is valid in the finite length regime and is strong enough to demonstrate a quantum advantage in terms of information leakage for practical quantum protocols. We prove our bound by obtaining an improved finite size version of the communication bound due to Babai and Kimmel, relating randomized communication to deterministic communication in the SMP model. We then relate information leakage to randomized communication through a series of reductions. We first provide alternative characterizations for information leakage, allowing us to link it to average length communication while allowing for shared randomness (pairwise, with the referee). A Markov inequality links this with bounded length communication, and a Newman type argument allows us to go from shared to private randomness. The only reduction in which we incur more than a logarithmic additive factor is in the Markov inequality; in particular, our compression method is essentially tight for the SMP model with average length communication.

### CompsciOverflow

#### Is C actually Turing-complete?

I was trying to explain to someone that C is Turing-complete, and realized that I don't actually know if it is, indeed, technically Turing-complete.
(C as in the abstract semantics, not as in an actual implementation.)

The "obvious" answer (roughly: it can address an arbitrary amount of memory, so it can emulate a RAM machine, so it's Turing-complete) isn't actually correct, as far as I can tell, as although the C standard allows for size_t to be arbitrarily large, it must be fixed at some length, and no matter what length it is fixed at it is still finite. (In other words, although you could, given an arbitrary halting Turing machine, pick a length of size_t such that it will run "properly", there is no way to pick a length of size_t such that all halting Turing machines will run properly.)

So: is C99 Turing-complete?

### StackOverflow

#### Errors occur when compiling caffe files in its installation (Makefile.config)

Could someone help me with these errors while compiling caffe during installation?

This is the Makefile.config of the caffe files, as modified by me:

```makefile
## Refer to http://caffe.berkeleyvision.org/installation.html
# Contributions simplifying and improving our build system are welcome!

# cuDNN acceleration switch (uncomment to build with cuDNN).
USE_CUDNN := 1

# CPU-only switch (uncomment to build without GPU support).
# CPU_ONLY := 1

# uncomment to disable IO dependencies and corresponding data layers
# USE_OPENCV := 0
# USE_LEVELDB := 0
# USE_LMDB := 0

# uncomment to allow MDB_NOLOCK when reading LMDB files (only if necessary)
# You should not set this flag if you will be reading LMDBs with any
# possibility of simultaneous read and write
# ALLOW_LMDB_NOLOCK := 1

# Uncomment if you're using OpenCV 3
# OPENCV_VERSION := 3

# To customize your choice of compiler, uncomment and set the following.
# N.B. the default for Linux is g++ and the default for OSX is clang++
# CUSTOM_CXX := g++

# CUDA directory contains bin/ and lib/ directories that we need.
CUDA_DIR := /usr/local/cuda
# On Ubuntu 14.04, if cuda tools are installed via
# "sudo apt-get install nvidia-cuda-toolkit" then use this instead:
# CUDA_DIR := /usr

# CUDA architecture setting: going with all of them.
# For CUDA < 6.0, comment the *_50 lines for compatibility.
CUDA_ARCH := -gencode arch=compute_20,code=sm_20 \
        -gencode arch=compute_20,code=sm_21 \
        -gencode arch=compute_30,code=sm_30 \
        -gencode arch=compute_35,code=sm_35 \
        -gencode arch=compute_50,code=sm_50 \
        -gencode arch=compute_50,code=compute_50

# BLAS choice:
# atlas for ATLAS (default)
# mkl for MKL
# open for OpenBlas
BLAS := ATLAS
# Custom (MKL/ATLAS/OpenBLAS) include and lib directories.
# Leave commented to accept the defaults for your choice of BLAS
# (which should work)!
# BLAS_INCLUDE := /path/to/your/blas
# BLAS_LIB := /path/to/your/blas

# Homebrew puts openblas in a directory that is not on the standard search path
# BLAS_INCLUDE := $(shell brew --prefix openblas)/include
# BLAS_LIB := $(shell brew --prefix openblas)/lib

# This is required only if you will compile the matlab interface.
# MATLAB directory should contain the mex binary in /bin.
# MATLAB_DIR := /usr/local
# MATLAB_DIR := /Applications/MATLAB_R2012b.app

# NOTE: this is required only if you will compile the python interface.
# We need to be able to find Python.h and numpy/arrayobject.h.
# PYTHON_INCLUDE := /usr/include/python2.7 \
#       /usr/lib/python2.7/dist-packages/numpy/core/include
# Anaconda Python distribution is quite popular. Include path:
# Verify anaconda location, sometimes it's in root.
ANACONDA_HOME := $(HOME) /home/desmond/anaconda2
PYTHON_INCLUDE := $(ANACONDA_HOME) /home/desmond/anaconda2/include \
        $(ANACONDA_HOME) /home/desmond/anaconda2/include/python2.7 \
        $(ANACONDA_HOME) /home/desmond/anaconda2/lib/python2.7/site-packages/numpy/core/include \

# Uncomment to use Python 3 (default is Python 2)
# PYTHON_LIBRARIES := boost_python3 python3.5m
# PYTHON_INCLUDE := /usr/include/python3.5m \
#       /usr/lib/python3.5/dist-packages/numpy/core/include

# We need to be able to find libpythonX.X.so or .dylib.
#PYTHON_LIB := /usr/lib
PYTHON_LIB := $/home/desmond/anaconda2/lib

# Homebrew installs numpy in a non standard path (keg only)
# PYTHON_INCLUDE += $(dir $(shell python -c 'import numpy.core; print(numpy.core.__file__)'))/include
# PYTHON_LIB += $(shell brew --prefix numpy)/lib

# Uncomment to support layers written in Python (will link against Python libs)
WITH_PYTHON_LAYER := 1

# Whatever else you find you need goes here.
INCLUDE_DIRS := $(PYTHON_INCLUDE) /usr/local/include
LIBRARY_DIRS := $(PYTHON_LIB) /usr/local/lib /usr/lib

# If Homebrew is installed at a non standard location (for example your home directory) and you use it for general dependencies
# INCLUDE_DIRS += $(shell brew --prefix)/include
# LIBRARY_DIRS += $(shell brew --prefix)/lib

# Uncomment to use pkg-config to specify OpenCV library paths.
# (Usually not necessary -- OpenCV libraries are normally installed in one of the above $LIBRARY_DIRS.)
# USE_PKG_CONFIG := 1

# N.B. both build and distribute dirs are cleared on make clean
BUILD_DIR := build
DISTRIBUTE_DIR := distribute

# Uncomment for debugging. Does not work on OSX due to https://github.com/BVLC/caffe/issues/171
DEBUG := 1

# The ID of the GPU that 'make runtest' will use to run unit tests.
TEST_GPUID := 0

# enable pretty build (comment to see full commands)
Q ?= @
```

And then the corresponding compiling result(or errors) (Tip: 错误 means error or wrong)

desmond@desmond-Lenovo-IdeaPad-Y400:~/caffe-master$make all -j4 CXX src/caffe/util/db_leveldb.cpp CXX src/caffe/parallel.cpp CXX src/caffe/util/db_lmdb.cpp CXX src/caffe/util/upgrade_proto.cpp In file included from ./include/caffe/util/device_alternate.hpp:40:0, from ./include/caffe/common.hpp:19, from ./include/caffe/util/db.hpp:6, from ./include/caffe/util/db_leveldb.hpp:10, from src/caffe/util/db_leveldb.cpp:2: ./include/caffe/util/cudnn.hpp: In function ‘void caffe::cudnn::createPoolingDesc(cudnnPoolingStruct**, caffe::PoolingParameter_PoolMethod, cudnnPoolingMode_t*, int, int, int, int, int, int)’: ./include/caffe/util/cudnn.hpp:136:9: error: ‘CUDNN_PROPAGATE_NAN’ was not declared in this scope CUDNN_PROPAGATE_NAN, h, w, pad_h, pad_w, stride_h, stride_w)); ^ ./include/caffe/util/cudnn.hpp:15:28: note: in definition of macro ‘CUDNN_CHECK’ cudnnStatus_t status = condition; \ ^ ./include/caffe/util/cudnn.hpp:136:68: error: there are no arguments to ‘cudnnSetPooling2dDescriptor_v4’ that depend on a template parameter, so a declaration of ‘cudnnSetPooling2dDescriptor_v4’ must be available [-fpermissive] CUDNN_PROPAGATE_NAN, h, w, pad_h, pad_w, stride_h, stride_w)); ^ ./include/caffe/util/cudnn.hpp:15:28: note: in definition of macro ‘CUDNN_CHECK’ cudnnStatus_t status = condition; \ ^ ./include/caffe/util/cudnn.hpp:136:68: note: (if you use ‘-fpermissive’, G++ will accept your code, but allowing the use of an undeclared name is deprecated) CUDNN_PROPAGATE_NAN, h, w, pad_h, pad_w, stride_h, stride_w)); ^ ./include/caffe/util/cudnn.hpp:15:28: note: in definition of macro ‘CUDNN_CHECK’ cudnnStatus_t status = condition; \ ^ ./include/caffe/util/cudnn.hpp: At global scope: ./include/caffe/util/cudnn.hpp:141:40: error: variable or field ‘createActivationDescriptor’ declared void inline void createActivationDescriptor(cudnnActivationDescriptor_t* activ_desc, ^ ./include/caffe/util/cudnn.hpp:141:40: error: ‘cudnnActivationDescriptor_t’ was not declared in this scope 
./include/caffe/util/cudnn.hpp:141:69: error: ‘activ_desc’ was not declared in this scope inline void createActivationDescriptor(cudnnActivationDescriptor_t* activ_desc, ^ ./include/caffe/util/cudnn.hpp:142:27: error: expected primary-expression before ‘mode’ cudnnActivationMode_t mode) { ^ In file included from ./include/caffe/util/device_alternate.hpp:40:0, from ./include/caffe/common.hpp:19, from ./include/caffe/util/db.hpp:6, from ./include/caffe/util/db_lmdb.hpp:10, from src/caffe/util/db_lmdb.cpp:2: ./include/caffe/util/cudnn.hpp: In function ‘void caffe::cudnn::createPoolingDesc(cudnnPoolingStruct**, caffe::PoolingParameter_PoolMethod, cudnnPoolingMode_t*, int, int, int, int, int, int)’: ./include/caffe/util/cudnn.hpp:136:9: error: ‘CUDNN_PROPAGATE_NAN’ was not declared in this scope CUDNN_PROPAGATE_NAN, h, w, pad_h, pad_w, stride_h, stride_w)); ^ ./include/caffe/util/cudnn.hpp:15:28: note: in definition of macro ‘CUDNN_CHECK’ cudnnStatus_t status = condition; \ ^ ./include/caffe/util/cudnn.hpp:136:68: error: there are no arguments to ‘cudnnSetPooling2dDescriptor_v4’ that depend on a template parameter, so a declaration of ‘cudnnSetPooling2dDescriptor_v4’ must be available [-fpermissive] CUDNN_PROPAGATE_NAN, h, w, pad_h, pad_w, stride_h, stride_w)); ^ ./include/caffe/util/cudnn.hpp:15:28: note: in definition of macro ‘CUDNN_CHECK’ cudnnStatus_t status = condition; \ ^ ./include/caffe/util/cudnn.hpp:136:68: note: (if you use ‘-fpermissive’, G++ will accept your code, but allowing the use of an undeclared name is deprecated) CUDNN_PROPAGATE_NAN, h, w, pad_h, pad_w, stride_h, stride_w)); ^ ./include/caffe/util/cudnn.hpp:15:28: note: in definition of macro ‘CUDNN_CHECK’ cudnnStatus_t status = condition; \ ^ ./include/caffe/util/cudnn.hpp: At global scope: ./include/caffe/util/cudnn.hpp:141:40: error: variable or field ‘createActivationDescriptor’ declared void inline void createActivationDescriptor(cudnnActivationDescriptor_t* activ_desc, ^ 
./include/caffe/util/cudnn.hpp:141:40: error: ‘cudnnActivationDescriptor_t’ was not declared in this scope ./include/caffe/util/cudnn.hpp:141:69: error: ‘activ_desc’ was not declared in this scope inline void createActivationDescriptor(cudnnActivationDescriptor_t* activ_desc, ^ ./include/caffe/util/cudnn.hpp:142:27: error: expected primary-expression before ‘mode’ cudnnActivationMode_t mode) { ^ make: *** [.build_debug/src/caffe/util/db_leveldb.o] 错误 1 make: *** 正在等待未完成的任务.... make: *** [.build_debug/src/caffe/util/db_lmdb.o] 错误 1 In file included from ./include/caffe/util/device_alternate.hpp:40:0, from ./include/caffe/common.hpp:19, from src/caffe/util/upgrade_proto.cpp:8: ./include/caffe/util/cudnn.hpp: In function ‘void caffe::cudnn::createPoolingDesc(cudnnPoolingStruct**, caffe::PoolingParameter_PoolMethod, cudnnPoolingMode_t*, int, int, int, int, int, int)’: ./include/caffe/util/cudnn.hpp:136:9: error: ‘CUDNN_PROPAGATE_NAN’ was not declared in this scope CUDNN_PROPAGATE_NAN, h, w, pad_h, pad_w, stride_h, stride_w)); ^ ./include/caffe/util/cudnn.hpp:15:28: note: in definition of macro ‘CUDNN_CHECK’ cudnnStatus_t status = condition; \ ^ ./include/caffe/util/cudnn.hpp:136:68: error: there are no arguments to ‘cudnnSetPooling2dDescriptor_v4’ that depend on a template parameter, so a declaration of ‘cudnnSetPooling2dDescriptor_v4’ must be available [-fpermissive] CUDNN_PROPAGATE_NAN, h, w, pad_h, pad_w, stride_h, stride_w)); ^ ./include/caffe/util/cudnn.hpp:15:28: note: in definition of macro ‘CUDNN_CHECK’ cudnnStatus_t status = condition; \ ^ ./include/caffe/util/cudnn.hpp:136:68: note: (if you use ‘-fpermissive’, G++ will accept your code, but allowing the use of an undeclared name is deprecated) CUDNN_PROPAGATE_NAN, h, w, pad_h, pad_w, stride_h, stride_w)); ^ ./include/caffe/util/cudnn.hpp:15:28: note: in definition of macro ‘CUDNN_CHECK’ cudnnStatus_t status = condition; \ ^ ./include/caffe/util/cudnn.hpp: At global scope: 
./include/caffe/util/cudnn.hpp:141:40: error: variable or field ‘createActivationDescriptor’ declared void inline void createActivationDescriptor(cudnnActivationDescriptor_t* activ_desc, ^ ./include/caffe/util/cudnn.hpp:141:40: error: ‘cudnnActivationDescriptor_t’ was not declared in this scope ./include/caffe/util/cudnn.hpp:141:69: error: ‘activ_desc’ was not declared in this scope inline void createActivationDescriptor(cudnnActivationDescriptor_t* activ_desc, ^ ./include/caffe/util/cudnn.hpp:142:27: error: expected primary-expression before ‘mode’ cudnnActivationMode_t mode) { ^ make: *** [.build_debug/src/caffe/util/upgrade_proto.o] 错误 1 In file included from ./include/caffe/util/device_alternate.hpp:40:0, from ./include/caffe/common.hpp:19, from ./include/caffe/blob.hpp:8, from ./include/caffe/caffe.hpp:7, from src/caffe/parallel.cpp:12: ./include/caffe/util/cudnn.hpp: In function ‘void caffe::cudnn::createPoolingDesc(cudnnPoolingStruct**, caffe::PoolingParameter_PoolMethod, cudnnPoolingMode_t*, int, int, int, int, int, int)’: ./include/caffe/util/cudnn.hpp:136:9: error: ‘CUDNN_PROPAGATE_NAN’ was not declared in this scope CUDNN_PROPAGATE_NAN, h, w, pad_h, pad_w, stride_h, stride_w)); ^ ./include/caffe/util/cudnn.hpp:15:28: note: in definition of macro ‘CUDNN_CHECK’ cudnnStatus_t status = condition; \ ^ ./include/caffe/util/cudnn.hpp:136:68: error: there are no arguments to ‘cudnnSetPooling2dDescriptor_v4’ that depend on a template parameter, so a declaration of ‘cudnnSetPooling2dDescriptor_v4’ must be available [-fpermissive] CUDNN_PROPAGATE_NAN, h, w, pad_h, pad_w, stride_h, stride_w)); ^ ./include/caffe/util/cudnn.hpp:15:28: note: in definition of macro ‘CUDNN_CHECK’ cudnnStatus_t status = condition; \ ^ ./include/caffe/util/cudnn.hpp:136:68: note: (if you use ‘-fpermissive’, G++ will accept your code, but allowing the use of an undeclared name is deprecated) CUDNN_PROPAGATE_NAN, h, w, pad_h, pad_w, stride_h, stride_w)); ^ 
./include/caffe/util/cudnn.hpp:15:28: note: in definition of macro ‘CUDNN_CHECK’ cudnnStatus_t status = condition; \ ^ ./include/caffe/util/cudnn.hpp: At global scope: ./include/caffe/util/cudnn.hpp:141:40: error: variable or field ‘createActivationDescriptor’ declared void inline void createActivationDescriptor(cudnnActivationDescriptor_t* activ_desc, ^ ./include/caffe/util/cudnn.hpp:141:40: error: ‘cudnnActivationDescriptor_t’ was not declared in this scope ./include/caffe/util/cudnn.hpp:141:69: error: ‘activ_desc’ was not declared in this scope inline void createActivationDescriptor(cudnnActivationDescriptor_t* activ_desc, ^ ./include/caffe/util/cudnn.hpp:142:27: error: expected primary-expression before ‘mode’ cudnnActivationMode_t mode) { ^ make: *** [.build_debug/src/caffe/parallel.o] 错误 1

#### n_steps and batch_size in many to one LSTM

```python
input_dim = 50
hidden_dim = 70
output_dim = 20

# Sequences we will provide at runtime
seq_input = tf.placeholder(tf.float32, [n_steps, batch_size, input_dim])
# What timestep we want to stop at
early_stop = tf.placeholder(tf.int32)

# Inputs for rnn needs to be a list, each item being a timestep.
# we need to split our input into each timestep, and reshape it because
# split keeps dims by default
inputs = [tf.reshape(i, (batch_size, input_dim)) for i in tf.split(0, n_steps, seq_input)]

with tf.device("/cpu:0"):
    cell1 = LSTMCell(hidden_dim, input_dim, initializer=initializer)
    initial_state1 = cell1.zero_state(batch_size, tf.float32)
    outputs1, states1 = rnn.rnn(cell1, inputs, initial_state=initial_state1, sequence_length=early_stop, scope="RNN1")
```

Details:

• I have an array of shape 50 {input_dimension} x LengthOfTotalData, which contains the vectorized data.
• For a segment of the above array of length 100, I have a single target of dimension 20.

1. If I use the above code, in case of a many-to-one prediction, what should n_step and batch_size be?
2. sequence_length=early_stop: what does this mean? Isn't n_step the length of the segment where the NN will output a prediction?

### TheoryOverflow

#### What are some of the existing methods (preferably with implementations) that cluster dynamic brain network data with signed edge weights? [on hold]

I have dynamic graph data with nodes and edges attributed to each timestep. The problem is to find how many communities are found at each timestep and what their membership is.

I have an existing framework that does a fairly ok job of finding these communities/clusters at each timestep. The method uses consensus clustering with K-means at each time step and computes correspondences between two adjacent time steps. The disadvantage of this approach is that it does not utilize temporal smoothness in coming up with the results. As a result, the clustering results are unreasonable (as in there is an unpredictable jump in the number and the assignment of clusters).

An introduction to this problem can be found in http://www.cs.cmu.edu/~deepay/mywww/papers/kdd06-evolutionary.pdf

I know that this is a really hard problem in knowledge discovery. Some of the existing heuristic works in clustering dynamic networks that have tried to address this issue include DYNMOGA and FACTNET. My end goal is to look for a temporally smooth, meaningful clustering result with little variation between timesteps.

What are some of the existing methods (preferably with implementations) that cluster dynamic network data with signed edge weights?

### HN Daily

#### Daily Hacker News for 2016-07-26

### Planet Theory

#### Complexity of Token Swapping and its Variants

Authors: Édouard Bonnet, Tillmann Miltzow, Paweł Rzążewski

Download: PDF

Abstract: In the Token Swapping problem we are given a graph with a token placed on each vertex. Each token has exactly one destination vertex, and we try to move all the tokens to their destinations, using the minimum number of swaps, i.e., operations of exchanging the tokens on two adjacent vertices.
As the main result of this paper, we show that Token Swapping is $W[1]$-hard parameterized by the length $k$ of a shortest sequence of swaps. In fact, we prove that, for any computable function $f$, it cannot be solved in time $f(k)n^{o(k / \log k)}$, where $n$ is the number of vertices of the input graph, unless the ETH fails. This lower bound almost matches the trivial $n^{O(k)}$-time algorithm. We also consider two generalizations of Token Swapping, namely Colored Token Swapping (where the tokens have different colors and tokens of the same color are indistinguishable), and Subset Token Swapping (where each token has a set of possible destinations). To complement the hardness result, we prove that even the most general variant, Subset Token Swapping, is FPT in nowhere-dense graph classes. Finally, we consider the complexities of all three problems in very restricted classes of graphs: graphs of bounded treewidth and diameter, stars, cliques, and paths, trying to identify the borderlines between polynomial and NP-hard cases.

#### A Hierarchy of Lower Bounds for Sublinear Additive Spanners

Authors: Amir Abboud, Greg Bodwin, Seth Pettie

Download: PDF

Abstract: Spanners, emulators, and approximate distance oracles can be viewed as lossy compression schemes that represent an unweighted graph metric in small space, say $\tilde{O}(n^{1+\delta})$ bits. There is an inherent tradeoff between the sparsity parameter $\delta$ and the stretch function $f$ of the compression scheme, but the qualitative nature of this tradeoff has remained a persistent open problem. In this paper we show that the recent additive spanner lower bound of Abboud and Bodwin is just the first step in a hierarchy of lower bounds that fully characterize the asymptotic behavior of the optimal stretch function $f$ as a function of $\delta \in (0,1/3)$.
Specifically, for any integer $k\ge 2$, any compression scheme with size $O(n^{1+\frac{1}{2^k-1} - \epsilon})$ has a sublinear additive stretch function $f$:

$$f(d) = d + \Omega(d^{1-\frac{1}{k}}).$$

This lower bound matches Thorup and Zwick's (2006) construction of sublinear additive emulators. It also shows that Elkin and Peleg's $(1+\epsilon,\beta)$-spanners have an essentially optimal tradeoff between $\delta$, $\epsilon$, and $\beta$, and that the sublinear additive spanners of Pettie (2009) and Chechik (2013) are not too far from optimal. To complement these lower bounds we present a new construction of $(1+\epsilon, O(k/\epsilon)^{k-1})$-spanners with size $O((k/\epsilon)^{h_k} kn^{1+\frac{1}{2^{k+1}-1}})$, where $h_k < 3/4$. This size bound improves on the spanners of Elkin and Peleg (2004), Thorup and Zwick (2006), and Pettie (2009). According to our lower bounds neither the size nor stretch function can be substantially improved.

## July 26, 2016

### QuantOverflow

#### Where can I get two to four years worth of historic data news for companies included in DJ and S&P?

Where can I get two to four years' worth of historical news data for companies included in the DJ and S&P? I mean not just historical price data but also news. Preferably for free and in CSV or a similar format.

### StackOverflow

#### General purpose immutable classes in C#

I am writing code in a functional style in C#. Many of my classes are immutable, with methods for returning a modified copy of an instance. For example:

    sealed class A
    {
        readonly X x;
        readonly Y y;

        // Constructor: every field is set exactly once.
        public A(X x, Y y)
        {
            this.x = x;
            this.y = y;
        }

        // Each "setter" returns a new instance and leaves this one unchanged.
        public A SetX(X nextX) { return new A(nextX, y); }
        public A SetY(Y nextY) { return new A(x, nextY); }
    }

This is a trivial example, but imagine a much bigger class, with many more members. The problem is that constructing these modified copies is very verbose. Most of the methods only change one value, but I have to pass all of the unchanged values into the constructor.
Is there a pattern or technique to avoid all of this boilerplate when constructing immutable classes with modifier methods? Note: I do not want to use a struct, for reasons discussed elsewhere on this site. Update: I have since discovered this is called a "copy and update record expression" in F#.

#### Need a text dump of literary novels for data analysis

I need to do some data analysis on literary novels, and I was wondering if there is any way to get access to txt files for popular literary works. I know Project Gutenberg is one possibility, but it is limited to old classics whose copyrights have expired. Is there something like a Creative Commons license that would allow me to get txt versions of more modern copyrighted literature to use for my data analysis? So far I'm running a spider to scrape all of Gutenberg, but I need more modern texts to include in my analysis. Does anyone with data science experience in textual analysis know of a way to get access to such texts?

### CompsciOverflow

#### I am looking for a practical and worthy PhD topic in IT Security [on hold]

I will be starting my PhD research soon. I am looking for a research topic related to information security. I have reviewed many topics and read many papers, but I have not yet found a topic that I would like to dedicate 3 years of my life to. So I am asking this community: what are the major security challenges in today's world and in the near future? This could be related to security in IoT, clouds or mobile communication, or secure SDN, or new authentication methods, or anything else that is related to infosec. What problems do you find in the space of computer security that are not yet addressed by industry and academia? I am looking for a practical topic, not one that is mostly theory. Topics such as computational complexity, however important, are not my cup of tea.
### StackOverflow #### How to implement VAE + GAN in Torch (preferably using nngraph) I am trying to implement a network shown in the shown image (from the paper 'Autoencoding beyond pixels using a similarity metric'). I haven't been able to wrap my head around on how to implement such a network with a non-standard training techniques (as described below) I wonder, what would be the best strategy to implement such a network? There are a few parts that one needs to take into account when implementing such a network: 1- The gradients of the discriminator should not go beyond the decoder and affect the encoder weights 2- There is a "Distance" error function that computes the difference between representations of a hidden layer for a reconstruction and an original image on the discriminator. 3- The gradients of the "Distance" layer should be multiplied by a scalar and then sum up with the gradients of the discriminator. Then the total gradient is used to update the decoder weight of the decoder Please take a look at the attachment "Algorithm1.png" to get an idea on how to implement it. So far, I've been trying to implement the network using nngraph, but haven't been able to wrap my head around on how exactly take all those subtleties into account. I took the VAE implementation in Torch using nngraph code and am trying to modify it. 
Here's my attempt so far: encoder = VAE.get_encoder(opt.modelType, modelParams) decoder = VAE.get_decoder(opt.modelType, modelParams) local input, mean, log_var, z, zPrior, inputOriginal input = nn.Identity()() mean, log_var = encoder(input):split(2) z = nn.Sampler()({mean, log_var}) meanPrior = nn.Identity()() log_varPrior = nn.Identity()() zPrior = nn.Sampler()({meanPrior, log_varPrior}) -- Build the discriminator discriminator = VAE.get_discriminator(opt.modelType, modelParams) local reconstruction, reconPrior, model reconstruction = decoder(z) reconPrior = decoder(zPrior) model = nn.gModule({input, meanPrior, log_varPrior, originalInput},{reconstruction, reconPrior, mean, log_var})  In my rough implementation I am missing the Distance error function and use its gradient for updating the other network parts as I don't know how I should implement it. Does anyone know how I can implement the network in the paper properly? Thanks ### CompsciOverflow #### What does pixel depth do with an image I'm reading this summary for black & white picture Summary of the process for creating a digital image, for the specific case of a 5 megapixel digital camera: • Use the zoom control and the viewfinder to identify the part of the scene that will be recorded. • Divide the scene into pixels: 2500 across the width of the scene and 2000 across the height. • Measure the average brightness in each pixel and calculate its square root. • Record the number 255 for the brightest pixels, the number 0 where there was no light in the scene, and a proportional number in between for each of the other pixels. • List all 5,000,000 numbers, starting in the upper left-hand corner.Tack on a few extra numbers that will tell everyone how to interpret this long list of numbers. This is confusing me in term of Image quality If Phone A has 5 Megapixels camera: in short it can spawn 5 million squares per object. 
But assuming it has 65536 possible brightness combination per pixel Phone B has 10 Megapixels camera: in short it can spawn 10 million squares per object. But assuming it has 255 possible brightness combination per pixel Now which one could probably produce better image? Phone B could (if my assumption is right) produce image in more details. But what's with the Phone A who has bigger possible brightness combination? ### Lobsters #### OpenBSD: Release Songs: 6.0: "Another Smash of the Stack" http://marc.info/?l=openbsd-cvs&m=146957264927350&w=2 2016/07/26 16:36:54 Activate pre-orders for 6.0. Release the first of the songs (“6 for 6”) Comments #### Shut up snitch! - reverse engineering and exploiting a critical Little Snitch vulnerability ### QuantOverflow #### How can you find change in working capital and capital expenditures without a balance sheet? I'm working with the following information trying to work through a valuation exercise and I'm absolutely stuck. How can I find ∆WC and CAPX with this information? ### CompsciOverflow #### A question on the graph induced by the transition function of a deterministic Turing machine Let$M$be a deterministic Turing machine with totally defined transition function$\delta$and working alphabet$\Gamma$. Let$Q$denote the statespace of$M$'s finite control. Let$G_M$be the graph induced by$\delta$which is defined as$G_M = (V,E)$with$V = Q$and$(q,q') \in E$iff there exist$a,b \in \Gamma$and$m \in \{\pm 1\}$such that$\delta(q,a) = (q',b,m)$. Are there any results which translate properties of the machine$M$to graph theoretic properties of$G_M$and vice versa? ### DataTau #### Words2map: A reco framework built w/ word2vec, t-SNE and HDBSCAN ### Lobsters #### EuroBSDcon 2016 Talks & Schedule #### Web Design in 4 minutes ### TheoryOverflow #### Incomplete basis of combinators This is inspired by this question. Let$\mathcal{C}$be the collection of all combinators which only have two bound variables. 
Is$\mathcal{C}$combinatorially complete? I believe the answer is negative, however I was not able to find a reference for this. I would also be interested in references for proofs of combinatorial incompleteness of sets of combinators (I can see why the set$\mathcal{D}$consisting of combinators with only one bound variable is incomplete, so these sets ought to contain more than just elements of$\mathcal{D}$). ### CompsciOverflow #### Minimal hypergraph transversals the exact complexity for hypergraph transversal problem is yet unknown and is an open research problem. However, I would need a fast way to compute, on paper, the minimal transversal of a hypergraph. A transversal intersects all hyperedges of the graph. A transversal is minimal if it does not contain any transversal as a proper subset. Let us look at this example, where V = {a,b,c,d,e} Hyperedge 1: de Hyperedge 2: abce Hyperedge 3: bd  How can I compute, on paper, the minimal transversal of this hypergraph? Computing the minimal transversal with only 2 hyperedges is trivial, but in the 3 hyperedges case, the complexity increases. Looking forward to your input. #### Is hedge union always as fast as divide and conquer? Adams describes a divide-and-conquer algorithm for finding the union of two sets (represented as weight-balanced binary search trees). He then describes a then-new "hedge union" algorithm which he claims improves on the divide-and-conquer one. However, he does not offer a proof, or even a real explanation, of why it should be$O(m + n)$, let alone why it should be faster than divide-and-conquer. Blelloch, Ferizovic, and Sun show that Adams's divide-and-conquer algorithm actually attains the theoretically optimal$\Theta (m \log (n/m + 1))$where$m \le n$. They do not, however, address the hedge union algorithm. Is hedge union, in fact, as efficient as divide-and-conquer? The least obvious part is the inner trim. 
It appears, at least superficially, to duplicate work between the left and right subtrees that the full split shares between them. Perhaps this is okay for some reason, but I don't know why. A further inquiry: Haskell's Data.Set and Data.Map use hedge variants of intersection and difference, as well as union. I have not found any published discussion of those algorithms at all. Similar questions apply to these as well.

### StackOverflow

#### Ensemble multiple ensemble methods

I am a newbie to machine learning, and my question is whether I can ensemble multiple ensemble methods together to improve performance. I am working on a binary classification problem, and I am trying to leverage multiple learning methods through a majority-voting mechanism to improve the precision/recall rate. According to my tests, the following combination of models performs well under majority voting:

1. LinearSVC
2. Adaboost
3. GradientBoosting

My evaluation shows that the ensembled voting of these three methods can outperform each individual model. However, my concern is that Adaboost and GradientBoosting are already "ensemble methods", so is it reasonable to ensemble the ensembled methods? I am looking for some papers or evidence to support the legitimacy of my double-ensembling approach. Or can anyone tell me why I cannot ensemble already ensembled methods? Am I clear? Thank you!

### Fefe

#### Were you also so worried during the coup in Turkey ...

Were you also so worried during the coup in Turkey? No, not about the Turks. About the CSU! Beaten in its own law-and-order stadium! By an amateur! A Turkish amateur, at that! A foreigner!! If the CSU has its way, the answer to the events of the past few days must be: more police, more checks, and a stricter approach to refugees and asylum seekers. Yes sir!

#### I do wonder what they showed Bernie Sanders from their ...

I do wonder what they showed Bernie Sanders from their suitcase of kompromat, but he has caved so completely that the SPD is breaking out in pimples of envy. His supporters were left with nothing to do but boo.

### QuantOverflow

#### Implied Vol in Different Payoffs

Let's say I have a black box stock price model I run Monte Carlo on to estimate European call prices. For a given strike $K$ and expiration $T$, I then back out the Black-Scholes implied volatility $\sigma(K, T)$ from the Monte Carlo price $C_{MC}$, and this assumes the model is lognormal. I now want to price a digital option using this black box model at the same $K$ and $T$ for which I computed the implied vol. I use Monte Carlo for this and obtain a price $D_{MC}$. Let's say I also compute the price of this digital using the closed-form Black-Scholes price and use $\sigma(K,T)$ as my vol. I obtain a price of $D_{BS}$. Now, as I understand it, $\sigma(K, T)$ is exactly the volatility I had to plug into my Black-Scholes price to obtain $C_{MC}$. I am now pricing another type of option (digital) for this $K$ and $T$ and use $\sigma(K, T)$ in my closed form. My question is, what is the relation between $D_{MC}$ and $D_{BS}$? Would one expect these to be equal? Would they be equal only if the black box model was actually just the lognormal model?

#### How to price a reversion?

A reversion is the right to exclusive possession of a property after a specified date (‘reversion date’). Assume no ground rent is payable. Its complement is a leasehold, the right to exclusive possession before the reversion date. Both rights together add up to total rights to the property, hence the sum of leasehold and reversion = vacant value of property (approximately, since there is something called ‘marriage value’, which I shall ignore). How should we price the reversion? It’s a controversial issue in real estate appraisal. One method is to estimate the present value of the foregone rent on the property, but even this is not simple.
The problem is that the corresponding leasehold is like a long term rental for the property. Short term rentals are a guide, but with short term rentals (a) there is the problem of void periods between tenancies, which will reduce the effective rental, and importantly (b) short term rentals can be re-fixed upwards after the short term rental expires. Also (c) when a leasehold approaches expiry, and where the lessee chooses not to apply for an extension, the lessee may have little interest in maintaining the property. They may choose to rent to people who will wreck the property, and they certainly will not pay for structural repairs if not in the terms of the lease. And (d) the owner of the reversion has comparatively few management costs, particularly in the case of long dates greater than 99 years. Any ideas appreciated. (I have my own, but sitting on them for now). ### Lobsters #### DeepWarp: Photorealistic Image Resynthesis for Gaze Manipulation ### CompsciOverflow #### How Jump instruction is executed based on value of Out- The Alu Output Figure from The Elements of Computer System (Nand2Tetris) Have a look at the scenario where  j1 = 1 (out < 0 ) j2 = 0 (out = 0 ) j3 = 1 (out > 0 )  How this scenario is possible as out < 0 is true as well as out > 0 but out = 0 is false. How out can have both positive and negative values at the same time? In other words when JNE instruction is going to execute although it theoretically seems possible to me but practically its not? ### StackOverflow #### Pandas, deal with mixed character, like 1a2 I have my training data csv loaded to Pandas dataframe, but some cols are in object. Especially data type are mixed, for example 1a1, 1a2, 2a1. Is there any way to handle this? Transfer to category or even numerical. For machine learning purpose. What I searched most cases were try to deal with {'111', 'bbb', '222'}, not {'1a2','1a2','2a1'} or {'c2a','c2b','c2c','c2d','c2e'} or {from 'A0' to 'AN'} Any solution? Thanks in advance! 
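One possible starting point for the mixed-token columns described above, as a minimal sketch rather than a definitive recipe: treat each distinct string as a category and map it to an integer code, and optionally split a token such as `1a2` into its digit and letter runs so each part can become its own feature. The helper names `split_token` and `category_codes` are hypothetical, and the sketch uses only the standard library; in pandas the whole-string encoding is usually obtained with `astype("category")` followed by `.cat.codes`.

```python
import re
from typing import Dict, List, Tuple


def split_token(token: str) -> Tuple[str, ...]:
    """Split a mixed token such as '1a2' into runs of digits and letters."""
    return tuple(re.findall(r"\d+|[A-Za-z]+", token))


def category_codes(values: List[str]) -> Tuple[List[int], Dict[str, int]]:
    """Assign an integer code to each distinct label (labels sorted as strings)."""
    mapping = {label: code for code, label in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping


# Hypothetical column values, for illustration only.
column = ["1a1", "1a2", "2a1", "1a2"]
codes, mapping = category_codes(column)
parts = [split_token(v) for v in column]
```

Integer codes impose an arbitrary order on the labels, so for models that are sensitive to that order a one-hot encoding of the same categories may be the safer choice.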
### UnixOverflow #### Is the (Free)BSD Codebase ANSI Compliant? Google does not seem to shed any light on this topic - But the question is quite simple: Is the FreeBSD (Or any BSD) CodeBase ANSI (c89) Compliant, or does it use c99, c11 or non-standard features like the Linux Kernel Does? ### StackOverflow #### TensorFlow RNN PTB trouble with train, validate, and test file formats I have been playing around with TensorFlow for about a week or two, and have gone through some of the tutorials. (My python isn't too great either). In the /model/rnn/ptb/ path there are two python files, reader.py and ptb_word_lm.py found here Inside the reader.py file are the following:  train_path = os.path.join(data_path, "ptb.train.txt") valid_path = os.path.join(data_path, "ptb.valid.txt") test_path = os.path.join(data_path, "ptb.test.txt")  When I follow the TensorFlow tutorial, the given txt files work fine. However, I want to use my own train/valid/test txt files. How do I format these? Here is what the format I am using: 12345 The label is the number and these are the words to be predicted 22222 The same goes for this example 54321 This is yet another example  However I get a "KeyError" traceback from MY ptb.valid.txt file, which leads me to believe that the format of the txt file is incorrect. It's basically just a number (the label) followed by words (to build the prediction model). Has anyone tried to adapt their own train/valid/test files to the existing TensorFlow python files, ptb_word_lm.py & reader.py? Thank you ### CompsciOverflow #### Finding an optimal topological ordering I have some jobs, which calculate values. Some of these jobs require the calculated values of other jobs for their own calculation. An execution plan for these jobs can be found with a topological sort on the dependency graph. However, these calculated values are huge and occupy a large chunk of memory. I want to find a topological order that minimizes memory usage. 
To give a simple example, consider the following dependency graph: There are three possible topological orders:$1\, 2\, 3\, 4$,$1\, 3\, 2\, 4$, and$3\, 1\, 2\, 4$. Let's look at each of them. In the first case, the values of$1$and$3$can be discarded immediately after$2$and$4$have finished, respectively, so each of them must be kept in memory for one additional step. The value of$2$has to be kept for two additional steps until$4$has finished. So in total we have to keep$4$results in memory over the whole execution. In the second case the values of$1$and$3$have to be kept for two steps and the value of$2$for one step. This makes$5$in total, so this ordering is worse in terms of memory usage. In the third case the value of$3$has to be kept for three steps and the values of$1$and$2$for one step each. Again we have a total of$5$, so this ordering is also worse than the first one. Thus the first ordering should be chosen. Is there an efficient algorithm to find an optimal ordering without needing to inspect all possible orderings? Bonus: is it also possible to give an optimal solution, if the values would need different amounts of memory per job? ### StackOverflow #### Why do neural networks work so well? I understand all the computational steps of training a neural network with gradient descent using forwardprop and backprop, but I'm trying to wrap my head around why they work so much better than logistic regression. For now all I can think of is: A) the neural network can learn it's own parameters B) there are many more weights than simple logistic regression thus allowing for more complex hypotheses Can someone explain why a neural network works so well in general? I am a relative beginner. ### CompsciOverflow #### Checking membership in DFA with fixed length using AC1 circuit? I'm supposed to find circuits , which can solve the question of membership in a regular language A with fixed length. The depth is limited by O(log(n)) and the size by O(n). 
Divide and Conquer should be the way to go, but I always exceed the max size. Would really appreciate any help ### StackOverflow #### How can I classify datasets? How can I classify new datasets into classes A and B by using the bellow training data?  1.0 0.9 0.8 0.7 0.6 0.5 0.4 0.3 0.2 0.1 class Dataset 1 42 13 22 324 270 96 107 93 80 228 A Dataset 2 45 23 14 596 445 135 153 124 132 331 A Dataset 3 42 22 16 479 407 130 150 121 128 342 A Dataset 4 37 63 10 481 397 155 143 159 172 394 B Dataset 5 46 18 10 387 356 127 118 129 136 359 B Dataset 6 23 34 9 550 436 147 166 164 208 467 B  It will be very ideal if there is a equation that can divide the datasets. For example if # of 1.0 + # of 0.9 is higher than 55 it is class A.(This might be wrong but something like this) ### QuantOverflow #### How to take care of newly auctioned yield/price in fixed income data This is a financial data cleaning question. I have raw price and yield data for US cash treasury across the curve. In the time-series there are jumps on the day after the treasury auction results come out. Prior to using the data, is it good practice to manually remove the jumps? Thanks. ### CompsciOverflow #### Streaming set equality If you have two streams of arriving strings, is there a fast and space efficient way to determine when the two streams have seen the same set of strings? It feels like there ought to be a randomised hashing method but I can't see it at the moment. ### StackOverflow #### Text Classification/Document Classification with Sequence Tagging with Mallet I have documents arranged in folders as classes called categories. For a new input (such as a question asked), I have to identify its category. What is be the best way to do this using MALLET? I've gone through multiple articles about this, but couldn't find such a way. Also, do I need to do sequence tagging on the input text? 
### QuantOverflow #### Why are Quantquote historical trades different vom ActiveTick historical trades I bought quantquote.com historical data of AAPL on second basis. To comapre I also got activetick.com For activetick I used the historical trading API. If you look at around 15:13:53 you see that ActiveTick reports Quotes where QuantCode claims non exist in specific second. Does QuantQuote only consider quotes form certain exchanges?  ActiveTick (quotes) Datetime,Price,Size,Exchange,Cond1,Cond2,Cond3,Cond4 2014-07-21 15:13:40.134,94.045,1100,Q,37,0,14,0 2014-07-21 15:13:41.003,94.047,3,D,0,0,0,37 2014-07-21 15:13:43.429,94.05,100,D,0,0,0,0 2014-07-21 15:13:43.472,94.045,900,D,0,0,0,0 2014-07-21 15:13:47.612,94.0401,50,D,0,0,0,37 2014-07-21 15:13:47.869,94.04,12,D,0,0,0,37 2014-07-21 15:13:48.508,94.0401,21,D,0,0,0,37 2014-07-21 15:13:48.648,94.0449,159,D,0,0,0,0 2014-07-21 15:13:49.291,94.0497,100,D,0,0,0,0 2014-07-21 15:13:50.350,94.047,105,D,0,0,0,0 2014-07-21 15:13:51.449,94.05,100,Y,0,0,0,0 2014-07-21 15:13:51.450,94.05,95,J,0,0,0,37 2014-07-21 15:13:51.450,94.05,100,J,0,0,0,0 2014-07-21 15:13:51.450,94.045,100,Z,0,0,0,0 2014-07-21 15:13:51.450,94.045,500,Z,0,0,0,0 2014-07-21 15:13:51.450,94.05,100,Z,0,0,0,0 2014-07-21 15:13:51.623,94.05,400,D,0,0,0,0 2014-07-21 15:13:51.623,94.05,200,D,0,0,0,0 2014-07-21 15:13:51.623,94.05,200,D,0,0,0,0 2014-07-21 15:13:52.683,94.045,200,Z,0,0,14,0 2014-07-21 15:13:54.319,94.0499,50,D,0,0,0,37 2014-07-21 15:13:55.278,94.045,200,Z,0,0,0,0 2014-07-21 15:13:58.186,94.05,20,Z,0,0,0,37 2014-07-21 15:13:58.212,94.05,100,D,0,0,0,0 2014-07-21 15:13:58.346,94.0499,20,D,0,0,0,37   QuantQuote (seconds) datetime_unix, datetime,time,usd_open,usd_high,usd_low,usd_close,volume 1405969978 2014-07-21 15:12:58 33178000 94.63 94.6394.63 94.63 55 1405969988 2014-07-21 15:13:08 33188000 94.62 94.6294.62 94.621000 1405969993 2014-07-21 15:13:13 33193000 94.62 94.6294.62 94.62 300 1405969994 2014-07-21 15:13:14 33194000 94.62 94.6294.62 94.62 200 
1405969998 2014-07-21 15:13:18 33198000 94.62 94.6294.62 94.621919 1405970000 2014-07-21 15:13:20 33200000 94.62 94.6294.62 94.62 100 1405970006 2014-07-21 15:13:26 33206000 94.63 94.6394.63 94.63 25 1405970010 2014-07-21 15:13:30 33210000 94.62 94.6294.62 94.622000 1405970016 2014-07-21 15:13:36 33216000 94.62 94.6294.62 94.622400 1405970017 2014-07-21 15:13:37 33217000 94.62 94.6294.61 94.61 100 1405970023 2014-07-21 15:13:43 33223000 94.63 94.6494.62 94.642100 1405970029 2014-07-21 15:13:49 33229000 94.63 94.6394.63 94.63 100 1405970030 2014-07-21 15:13:50 33230000 94.64 94.6494.63 94.631900 1405970033 2014-07-21 15:13:53 33233000 94.63 94.6394.63 94.63 100 1405970035 2014-07-21 15:13:55 33235000 94.63 94.6394.63 94.63 200 1405970037 2014-07-21 15:13:57 33237000 94.63 94.6394.63 94.634323 1405970039 2014-07-21 15:13:59 33239000 94.63 94.6394.63 94.63 400 1405970048 2014-07-21 15:14:08 33248000 94.63 94.6394.63 94.631000 1405970049 2014-07-21 15:14:09 33249000 94.63 94.6394.63 94.63 200 1405970061 2014-07-21 15:14:21 33261000 94.63 94.6394.63 94.63 200 1405970062 2014-07-21 15:14:22 33262000 94.62 94.6294.62 94.62 100 1405970072 2014-07-21 15:14:32 33272000 94.63 94.6394.63 94.63 100 1405970073 2014-07-21 15:14:33 33273000 94.63 94.6394.63 94.633176 1405970106 2014-07-21 15:15:06 33306000 94.64 94.6494.64 94.64 265 1405970112 2014-07-21 15:15:12 33312000 94.65 94.6594.65 94.65 100 1405970120 2014-07-21 15:15:20 33320000 94.65 94.6594.65 94.65 200 1405970133 2014-07-21 15:15:33 33333000 94.66 94.6694.66 94.663339 1405970134 2014-07-21 15:15:34 33334000 94.66 94.6694.66 94.66 400 1405970135 2014-07-21 15:15:35 33335000 94.66 94.6694.66 94.66 200 1405970142 2014-07-21 15:15:42 33342000 94.66 94.6694.66 94.66 296 1405970146 2014-07-21 15:15:46 33346000 94.66 94.6794.66 94.67 400  EDIT: ActiveTick includes the following exchanges in the data:  exchange value ExchangeAMEX A ExchangeNasdaqOmxBx B ExchangeNationalStockExchange C ExchangeFinraAdf D ExchangeCQS E 
ExchangeForex F ExchangeInternationalSecuritiesExchange I ExchangeEdgaExchange J ExchangeEdgxExchange K ExchangeChicagoStockExchange M ExchangeNyseEuronext N ExchangeNyseArcaExchange P ExchangeNasdaqOmx Q ExchangeCTS S ExchangeCTANasdaqOMX T ExchangeOTCBB U ExchangeNNOTC u ExchangeChicagoBoardOptionsExchange W ExchangeNasdaqOmxPhlx X ExchangeBatsYExchange Y ExchangeBatsExchange Z ExchangeCanadaToronto T ExchangeCanadaVenture V ExchangeComposite  ### StackOverflow #### Combine different "containers" in cats XorT For example, we have some services with different "containers" Future and Option: //first service with Future class FirstService { getData(): XorT[Future, ServiceError, SomeData] } //second service with Optin class SecondService { getData(): XorT[Option, ServiceError, SomeData] }  How do we can combine them to use one for comprehension to avoid type mismatch? val result = for { data1 <- firstService.getData() data2 <- secondService.getData() // type mismatch required XorT[Future, ServiceError, SomeData] } yield mergeResult(data1, data2)  #### Access actual Features after a Feature Selection Pipeline in SciKit-Learn I use a feature selection in combination with a pipeline in SciKit-Learn. As a feature selection strategy I use SelectKBest. The pipeline is created and executed like this: select = SelectKBest(k=5) clf = SVC(decision_function_shape='ovo') parameters = dict(feature_selection__k=[1,2,3,4,5,6,7,8], svc__C=[0.01, 0.1, 1], svc__decision_function_shape=['ovo']) steps = [('feature_selection', select), ('svc', clf)] pipeline = sklearn.pipeline.Pipeline(steps) cv = sklearn.grid_search.GridSearchCV(pipeline, param_grid=parameters) cv.fit( features_training, labels_training )  I know that I can get the best-parameters afterwards via cv.best_params_. However, this only tells me that a k=4 is optimal. But I would like to know which features are these? How can this be done? 
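A hedged pointer for the feature-selection question above: a fitted `SelectKBest` exposes `get_support()`, which returns a boolean mask over the input columns, and after the grid search the fitted selector should be reachable as `cv.best_estimator_.named_steps['feature_selection']` (assuming the default refit behaviour). The sketch below shows only the last step, turning such a mask into feature names, in plain Python; the names and the mask are made up for illustration, and `selected_feature_names` is a hypothetical helper.

```python
from typing import List, Sequence


def selected_feature_names(feature_names: Sequence[str],
                           support_mask: Sequence[bool]) -> List[str]:
    """Return the names whose mask entry is True, preserving column order."""
    if len(feature_names) != len(support_mask):
        raise ValueError("mask and feature names must have the same length")
    return [name for name, keep in zip(feature_names, support_mask) if keep]


# Hypothetical column names and mask; in practice the mask would come from
# the fitted selector's get_support() call.
names = ["f0", "f1", "f2", "f3", "f4", "f5"]
mask = [True, False, True, True, False, True]
chosen = selected_feature_names(names, mask)
```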
### TheoryOverflow #### Finding a minimum tree which is isomorphic to a subtree of$T_1$but not to a subtree of$T_2$Consider the problem that receives two trees$T_1$,$T_2$, and asks to find a minimum size tree$T$such that there exists a subtree of$T_1$which is isomorphic to$T$, but there is no such isomorphic subtree in$T_2$. What is known about the complexity of this problem? ### QuantOverflow #### Close form solution for Geometric Brownian Motion I have a very fundamental problem, please help me out. I am little confused with the derivation for the close form solution for the Geometric Brownian Motion, from the very fundamental stock model: $$$$dS(t)=\mu S(t)dt+\sigma S(t)dW(t)$$$$ The close form of the above model is following: $$$$S(T)=S(t)\exp((\mu-\frac1 2\sigma^2)(T-t)+\sigma(W(T)-W(t)))$$$$ I believe this is quite straightforward for most of you guys, but I really dont know how did you get the$(\mu-\frac 1 2 \sigma^2)$term. It is clear for me the other way round (from bottom to top), but I fail to derive directly from the top to bottom. I checked some material online, it was saying something with the drift term, which some terms are artificially added during the derivation. Your answer and detailed explanation will be greatly appreciated. Thanks in advance! ### TheoryOverflow #### Computing Minima of the Projection of a Binary Cube The problem is as follows: I want to compute the minima (with respect to the canonical partial order on vectors "$\leq$") of the linear projection of the extreme points of an$n$-dimensional$\{0,1\}$-cube into the plane. Thus, I am given a$2\times n$-matrix with integer entries which encodes my projection. Note that the result of the projection is a set of isolated points and that these points do not lie in convex position in general. Is it possible to compute these points in an output-sensitive way? Is this a (well-)known problem? What are problems which might be related to this one? 
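To make the cube-projection problem above concrete, here is a brute-force baseline, sketched under the assumption that "minima" means points not dominated componentwise by any other projected point. It enumerates all $2^n$ vertices, so it is exponential in $n$ and is not the output-sensitive method the question asks for; it is only useful for checking small instances. The function names and the example matrix are hypothetical.

```python
from itertools import product
from typing import List, Sequence, Set, Tuple

Point = Tuple[int, int]


def projected_points(matrix: Sequence[Sequence[int]]) -> Set[Point]:
    """Project every vertex of the {0,1}^n cube with a 2 x n integer matrix."""
    row_x, row_y = matrix
    points = set()
    for vertex in product((0, 1), repeat=len(row_x)):
        x = sum(a * v for a, v in zip(row_x, vertex))
        y = sum(b * v for b, v in zip(row_y, vertex))
        points.add((x, y))
    return points


def minima(points: Set[Point]) -> List[Point]:
    """Return the points that no other point dominates componentwise."""
    result = []
    best_y = None
    # Scan by increasing x (ties by increasing y); a point is minimal exactly
    # when its y is strictly below every y seen so far.
    for x, y in sorted(points):
        if best_y is None or y < best_y:
            result.append((x, y))
            best_y = y
    return result


# Hypothetical 2 x 3 projection matrix, for illustration only.
example = [[1, -2, 3], [-1, 2, 1]]
example_minima = minima(projected_points(example))
```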
#### Subtypes as subsets of SML datatypes

One of the few things that I dislike about Okasaki's book on purely functional data structures is that his code is littered with inexhaustive pattern matching. As an example, I'll give his implementation of real-time queues (refactored to eliminate unnecessary suspensions):

infixr 5 :::
datatype 'a stream = Nil | ::: of 'a * 'a stream lazy

structure RealTimeQueue :> QUEUE =
struct
(* front stream, rear list, schedule stream *)
type 'a queue = 'a stream * 'a list * 'a stream

(* the front stream is one element shorter than the rear list *)
fun rotate (x ::: $xs, y :: ys, zs) = x ::: $rotate (xs, ys, y ::: $zs)
| rotate (Nil, y :: nil, zs) = y ::: $zs

fun exec (xs, ys, _ ::: $zs) = (xs, ys, zs)
| exec args = let val xs = rotate args in (xs, nil, xs) end

(* public operations *)
val empty = (Nil, nil, Nil)
fun snoc ((xs, ys, zs), y) = exec (xs, y :: ys, zs)
fun uncons (x ::: $xs, ys, zs) = SOME (x, exec (xs, ys, zs))
| uncons _ = NONE
end

As can be seen, rotate isn't exhaustive, because it doesn't cover the case where the rear list is empty. Most Standard ML implementations will generate a warning about it. We know that the rear list can't possibly be empty, because rotate's precondition is that the rear list is one element longer than the front stream. But the type checker doesn't know, and it can't possibly know, because this fact is inexpressible in ML's type system.

Right now, my solution to suppress this warning is the following inelegant hack:

fun rotate (x ::: $xs, y :: ys, zs) = x ::: $rotate (xs, ys, y ::: $zs)
| rotate (_, ys, zs) = foldl (fn (x, xs) => x ::: $xs) zs ys

But what I really want is a type system that can understand that not every triplet is a valid argument to rotate. I'd like the type system to let me define types like:

type 'a triplet = 'a stream * 'a list * 'a stream

subtype 'a queue of 'a triplet
= (Nil, nil, Nil)
| (xs, ys, zs) : 'a queue => (_ ::: $xs, _ :: ys, zs)
| (xs, ys, zs) : 'a queue => (_ ::: $xs, ys, _ ::: $zs)


And then infer:

subtype 'a rotatable of 'a triplet
= (xs, ys, _) : 'a rotatable => (_ ::: $xs, _ :: ys, _)
| (Nil, y :: nil, _)

subtype 'a executable of 'a triplet
= (xs, ys, zs) : 'a queue => (xs, ys, _ ::: $zs)
| (xs, ys, Nil) : 'a rotatable => (xs, ys, Nil)

val rotate : 'a rotatable -> 'a stream
val exec : 'a executable -> 'a queue


However, I don't want full-blown dependent types, or even GADTs, or any of the other crazy things certain programmers use. I just want to define subtypes by “carving out” inductively defined subsets of existing ML types. Is this feasible?

### StackOverflow

#### Totally independent jobs on same data on Hadoop?

I need to optimize some hyperparameters for a machine learning problem. This involves launching many jobs on the same input data and saving their outputs, completely independently of each other. On every job distribution system that I've ever used, this is a very common use case, which is handled with a few switches on the command line and/or a job config file. Now I'm on a cluster whose job distribution system is Hadoop/Yarn, which I haven't used before. Despite much searching, the only way to do this on Hadoop seems to be to submit each run as a separate job. This would incur the job submission overhead for each run, of which there can be 1000's. Is there a simple way around that? Maybe some kind of MR job without any R? (BTW, my ML code is in C++ so I guess I need to use Hadoop Streaming.) I'll learn Java if I have to, but it seems like a disproportionate amount of effort for something so simple.
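The "MR job without any R" idea could look roughly like the sketch below: a map-only Hadoop Streaming job (zero reducers, commonly configured through `mapreduce.job.reduces=0`, which is worth checking against the cluster's Hadoop version) whose input is a text file with one hyperparameter setting per line, and whose mapper launches the C++ program once per line. Everything specific here is an assumption made for illustration: the `./train` binary, the `key=value` line format, and the `--key value` flag convention are hypothetical. Depending on the input format, Streaming may also prefix each line with a key and a tab, which this sketch does not handle.

```python
import shlex
import subprocess
import sys


def build_command(config_line, binary="./train"):
    """Turn 'lr=0.1 depth=3' into ['./train', '--lr', '0.1', '--depth', '3']."""
    args = [binary]
    for pair in shlex.split(config_line):
        key, _, value = pair.partition("=")
        args += ["--" + key, value]
    return args


def run_mapper(lines, runner=subprocess.call):
    """Run one process per non-empty line and yield 'config<TAB>exit status'."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        status = runner(build_command(line))
        yield "{}\t{}".format(line, status)


def main():
    # Hadoop Streaming feeds the input split on stdin and collects stdout.
    for record in run_mapper(sys.stdin):
        print(record)
```

With this shape there is a single job submission, and the number of parallel runs is governed by how the input file is split across mappers rather than by the number of jobs.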

### TheoryOverflow

#### Applications for set theory, ordinal theory, infinite combinatorics and general topology in computer science?

I am a mathematician interested in set theory, ordinal theory, infinite combinatorics and general topology.

Are there any applications for these subjects in computer science? I have looked a bit, and found a lot of applications (of course) for finite graph theory, finite topology, low dimensional topology, geometric topology etc.

However, I am looking for applications of the infinite objects of these subjects, i.e. infinite trees (Aronszajn trees for example), infinite topology etc.

Any ideas?

Thank you!!

### StackOverflow

#### Idiomatic Ramda for generating higher order functions?

My goal is to create a custom map function that first filters the list so that, for example, only even items remain, and then invokes the supplied function on every remaining item. I need the function to be curried and its first parameter to be the function, not the list. I believe the signature would look like this: (a -> b) -> [a] -> [b]

There are of course many ways to do this. Here is what my first attempt looked like.

var isEven = x => x % 2 === 0;

var filterEvensMap = R.curry((fn, items) => R.map(fn, R.filter(isEven, items)));

filterEvensMap(R.negate, [1,2,3,4]); // [-2, -4]


However, since the above uses an anonymous function with the fn and items "glue parameters", I'm not sure this is the way that Ramda was intended to be used.

Below I included another way to do it. It seems to be more in the spirit of Ramda but I'm not sure if I'm over complicating things.

var filterEvensMap = R.compose(
R.flip,
R.uncurryN(2)
)(R.compose(
R.flip(R.map),
R.filter(isEven)
));


Am I overcomplicating with the multiple composes and uncurryN? Is there a more idiomatic way to achieve this? In your experience, does it matter?

### StackOverflow

#### Removing Image background for feeding the neural network

The image contains a monitor that is displaying a picture, plus the monitor's surroundings (a computer lab). My aim is to detect the monitor and feed everything it displays to a neural network. To do this I have to crop the image so that it contains only the monitor's content.

I am currently using OpenCV for image processing and Octave for machine learning.

I have tried contours using cv2.findContours, but this is not usable: the picture shown on the monitor also produces contours, which makes it difficult to detect the monitor and crop its content.

### QuantOverflow

#### Time series of European sovereign credit ratings by the Big Three?

I need time series, from 2000 to 2015 if possible, of sovereign credit ratings by Moody's, S&P and Fitch. Could you suggest a source or provide such a dataset? Thank you very much!

### StackOverflow

#### Tensorflow: graph building dependent on evaluation

I am writing a tensorflow graph of the following format:

def process_label(label):
    return some_operation(label.eval())

Input: X, Label
output,state = NN_Layers()
processed_output = process_data(output)
processed_label = process_label(processed_output, Label)
loss = cross_entropy(processed_output, processed_label)

Session.run(optimize, feed_dict={X: input, Label: label})


The problem with this model is that I need the values of output in order to process my label the way I want it so I can calculate the loss. If I try to use output.eval(session) here, it will not work, because my session will execute the whole graph and therefore I can only call it in the end.

I wonder if there's a way to break this graph apart in two and still have gradient descent work through the network, or if there's any other way to do this.

### StackOverflow

#### Accuracy Analysis in case of Classification algorithm in R [on hold]

In a regression problem I use the accuracy function to compare predicted values with actual values.

Ex-

accuracy(stest$class,clusters2)
               ME     RMSE      MAE       MPE     MAPE
Test set 1.555129 3.022764 2.130393 0.4099895 50.74556

For a classification problem we just need to know the percentage accuracy, and the accuracy function is not appropriate. So what do we use for a classification problem?

### QuantOverflow

#### How do you check your option calculations?

I'm implementing a bunch of different algorithms to price options/find Greeks: finite difference, Monte Carlo, binomial...

I'm not really sure how to check my calculations. I tried using QuantLib to price things for me, but it seems to use actual dates a lot (whereas I'm just interested in year fractions) and the documentation is lacking.

I implemented a finite difference algorithm as described in Wilmott's "Mathematics of Financial Derivatives" and he has some numbers in his book. But my "implementation" of just the analytical Black-Scholes formula already gives different results than his (not by much though).

Again, I just typed up the down and out call option formula from Zhang's Exotic options. He actually goes through explicit examples for each of his formulas. But for a down and out call with $S = 100$, $K= 92$, $H = 95$, $r = 0.08$, $q = 0.03$, $\sigma = 0.2$, $\tau = 0.5$ he gets \$6.936 and I get \$6.908.

So my question is, what is your go-to reference for option prices for checking your code?

### DataTau

#### Tools in demand in the analytics job market

#### Gradient Boosting: explained in 3d

### StackOverflow

#### Is it possible to implement a bidirectional relationship between two structs using only constant properties?

My use case is based on the following model:

struct Person {
    let name: String
    let houses: [House]
}

struct House {
    let owner: Person
}

Now, ideally I would like to maintain a bidirectional relationship that requires every house to have exactly one owner, where an owner should also know all its houses.
Using the above data structures, is it possible to create instances of House and Person such that there is a relationship between the two and the objects are essentially pointing at each other? I guess the phrasing of this is already somewhat misleading, because due to the value semantics of structs, they don't really point anywhere but only hold copies of values. It seems like it should be obvious that it's not possible to create these two objects with a bidirectional relationship, but I still wanted to be sure and ask this question here!

An obvious solution would also be to make houses and owner variables using var instead of let when declaring them; then the relationship could be maintained in the initializer of each struct:

struct Person {
    let name: String
    var houses: [House]

    init(name: String, houses: [House]) {
        self.name = name
        self.houses = houses
        self.houses = houses.map { (house: House) in
            var newHouse = house
            newHouse.owner = self
            return newHouse
        }
    }
}

struct House {
    var owner: Person

    init(owner: Person) {
        self.owner = owner
        var newHouses = self.owner.houses
        newHouses.append(self)
        self.owner = Person(name: owner.name, houses: newHouses)
    }
}

However, what if I want to keep houses and owner constant? As I said, it seems obvious that it's not possible, but I'm wondering if there's some fancy (maybe functional) way to achieve this. I was thinking about lenses, which can be used as getters and setters when dealing with immutable models.

### QuantOverflow

#### What is the effect of mean-reversion on an upper barrier knock-out call option?

Consider a mean-reverting normal model for an underlying, $dX^{(1)}_t=-\kappa X^{(1)}_tdt+\sigma^{(1)} dW^{(1)}_t$, for fixed time-independent constants $\kappa$ (mean-reversion) and $\sigma^{(1)}$ (volatility), and Brownian motion $W^{(1)}_t$.
Suppose that using this model, I calculate option prices for all $t$, then calibrate the time-dependent local vol, $\sigma_t^{(2)}$, of a second normal model (without mean-reversion), $dX^{(2)}_t=\sigma_t^{(2)} dW^{(2)}_t$, so that the two models give the same prices for vanilla options at all times.

Will a continuous upper barrier knock-out call option be cheaper in the first or second model? For simplicity, take $X^{(1)}_0=X^{(2)}_0=0$, and assume that the upper barrier, $B$, is larger than the strike, $K$.

### StackOverflow

#### ValueError: Found array with dim 3. Estimator expected <= 2 [on hold]

I am trying to generate my own training data for a recognition problem. I have two folders, s0 and s1, inside a containing folder named data. images and lables are two lists; lables contains the names of the folders.

|—- data
|   |—- s0
|   |   |—- 1.pgm
|   |   |—- 2.pgm
|   |   |—- 3.pgm
|   |   |—- 4.pgm
|   |   |—- ...
|   |—- s1
|   |   |—- 1.pgm
|   |   |—- 2.pgm
|   |   |—- 3.pgm
|   |   |—- 4.pgm
|   |   |—- ...

Below is the code. It shows me an error on the line classifier.fit(images, lables):

Traceback (most recent call last):
  File "mint.py", line 34, in <module>
    classifier.fit(images, lables)
  File "/usr/local/lib/python2.7/dist-packages/sklearn/svm/base.py", line 150, in fit
    X = check_array(X, accept_sparse='csr', dtype=np.float64, order='C')
  File "/usr/local/lib/python2.7/dist-packages/sklearn/utils/validation.py", line 396, in check_array
    % (array.ndim, estimator_name))

ValueError: Found array with dim 3. Estimator expected <= 2.

Here is the code:

import os,sys
import cv2
import numpy as np
from sklearn.svm import SVC
fn_dir ='/home/aquib/Desktop/Natural/data'

# Create a list of images and a list of corresponding names
(images, lables, names, id) = ([], [], {}, 0)
for (subdirs, dirs, files) in os.walk(fn_dir):
    for subdir in dirs:
        names[id] = subdir
        mypath = os.path.join(fn_dir, subdir)
        for item in os.listdir(mypath):
            if '.png' in item:
                label=id
                image = cv2.imread(os.path.join(mypath, item),0)
                r_image = np.resize(image,(30,30))
                if image is not None:
                    images.append(r_image)
                    lables.append(int(label))
        id += 1

#Create a Numpy array from the two lists above
(images, lables) = [np.array(lis) for lis in [images, lables]]
classifier = SVC(verbose=0, kernel='poly', degree=3)
classifier.fit(images, lables)

I really don't understand how to bring the data down to 2 dimensions. I am trying the code below, but the error is the same:

images = np.array(images)
im_sq = np.squeeze(images).shape
images = images.reshape(images.shape[:2])

### Lobsters

#### Why do we automate?

#### Reading a Postgres EXPLAIN ANALYZE Query Plan

### CompsciOverflow

#### What's the difference between a binary search tree and a binary heap?

These two seem very similar and have almost an identical structure. What's the difference? What are the runtime complexities of each?

### Planet Theory

#### postdoc at Ulm University (apply by September 2016)

A postdoc position for up to 2 years is available with Thomas Thierauf at Ulm University, Germany. The starting date is ideally in October 2016, but can be slightly shifted to a later point. Topic is complexity theory, with a focus on Polynomial Identity Testing (PIT) and related subjects.

Email: thomas.thierauf@uni-ulm.de

### CompsciOverflow

#### Why are Python 2 and 3 so incompatible with each other?

Every piece of software should have backward compatibility (at least some sense of it); this ensures that old code can run on future, updated systems without problems in most situations.
But why did the Python team decide to make v3 a new language? The community has to split in two, and every computer has to install both v2 and v3 in order to run all Python code. This is a huge mess.

Is there a computer science reason behind this decision?

### Lobsters

#### The Shoemaker’s Son - an Erlang story about Dialyzer

### QuantOverflow

#### volatility of a mid curve option

Question: When checking the volatility surface for, let's say, a swaption where the option expires in 1Y and the underlying starts in 1Y and ends in 5Y, one would check the volatility surface for the quoted volatilities and pick the volatility from Exp. 1Yx5Y.

What happens to the volatility of a mid curve option? How do you relate or interpolate the volatility in this case? Let's say the option expires in 1Y, and the asset starts in 6Y and ends 5Y after the start. Where on the volatility surface should the volatility of a mid curve option be situated?

Or in other words, how do you get the volatility for the 6Y fwd 5Y swap for an option that expires in 1Y?

### Lobsters

#### EventQL - open-source distributed SQL analytics database (C++11)

### StackOverflow

#### Use both Recursive Feature Elimination and Grid Search in SciKit-Learn

I have a machine learning problem and want to optimize my SVC estimators as well as the feature selection.

For optimizing SVC estimators I use essentially the code from the docs. Now my question is, how can I combine this with recursive feature elimination with cross validation (RFECV)? That is, for each estimator combination I want to do the RFECV in order to determine the best combination of estimators and features.

I tried the solution from this thread, but it yields the following error:

ValueError: Invalid parameter C for estimator RFECV. Check the list of available parameters with estimator.get_params().keys().

My code looks like this:

tuned_parameters = [{'kernel': ['rbf'], 'gamma': [1e-4,1e-3],'C': [1,10]},
                    {'kernel': ['linear'],'C': [1, 10]}]
estimator = SVC(kernel="linear")
selector = RFECV(estimator, step=1, cv=3, scoring=None)
clf = GridSearchCV(selector, tuned_parameters, cv=3)
clf.fit(X_train, y_train)

The error appears at clf = GridSearchCV(selector, tuned_parameters, cv=3).

### QuantOverflow

#### Stratonovich Integral and Ito's lemma

Let $(\Omega, \mathcal{F},\mathbb{P},\{\mathcal{F}_t\})$ be a filtered probability space and $W_t$ be a standard Wiener process. I want to show, using Ito's lemma, that the Stratonovich integral of $W_t$, i.e. $\int_{0}^{t} W_s \circ dW_s$, is not a martingale.

Thanks.

### Lobsters

#### Cost-efficient container scheduling with Docker Swarm

### StackOverflow

#### Spark ML naive bayes class value to probability index mapping

This question may sound very obvious, but I have searched a lot for the answer and have not found a foolproof solution.

I am using the Spark ML package, and after running Naive Bayes I get proper results for the probabilities, but I cannot find a way to map a particular class value to a probability index. (By looking at the result I can tell which index points to which class value in the label column, but I want a programmatic way.) Is there any way to do this? In one document I found that the class in the label column which occurs most often gets index 0, and so on, but what if multiple classes have the same number of occurrences?

### QuantOverflow

#### Skewed Student t distribution MLE and Simulation

I have financial LOB data and I feel that a skewed t distribution will fit best. I have a problem trying to find the parameters numerically using MLE, since Matlab's built-in function does not allow for a skewed t distribution.

Can somebody point me to some code which will find the parameters? Or can someone offer advice on an easy way to do this?

I also need to simulate using these parameters, but I think this is easier.

Cheers

#### Where to source security ID data (ISIN, CUSIP)?
I am looking for a list of ISINs or CUSIPs for all equity securities traded on the NASDAQ and NYSE.

While not quantitative, this question is directly related to the data collection process of many quant finance studies, so I apologize in advance if this is not the appropriate place to ask.

Thanks,

#### How to choose the correct ticker for rates?

I would like to calculate funding liquidity following Asness/Moskowitz/Pedersen (2013). Among others, they calculate the LIBOR minus term repo rate, and the Swap-T-bill, LIBOR minus interest rate swaps.

I checked the FRED homepage and CRSP for the term repo rate and interest swap rate and found numerous different versions of both variables.

Are there any industry standards in research for which particular data (e.g. Swap for 1y or 30y) I should choose for these two variables?

### Lobsters

#### Lessons Learned the Hard Way: Postgres in Production at GoCardless

Slides, for those less inclined to watch a video!

### QuantOverflow

#### Correct Alphabet (Google) market cap calculation?

Given the definition:

Market capitalization (market cap) is the total market value of the shares outstanding of a publicly traded company; it is equal to the share price times the number of shares outstanding.

I find it quite puzzling to find different numbers everywhere for the market capitalisation of Google (now renamed Alphabet). I have summarised my findings in this Google sheet (https://docs.google.com/spreadsheets/d/1C8sSp7Kf3wdiiYFHCGM2I3qvjESbLjZR04OCFZHBDi0/edit?usp=sharing); I hope you can all access it.

The numbers range from totally wrong for market cap (see CNBC), to inconsistent (see Nasdaq, WSJ and Yahoo finance), to differences in the number of shares (Google finance and Bloomberg.com don't seem to agree on the number of outstanding shares).

My aim is first to understand what the right number is for outstanding shares and market cap, and second what the right "price" is for the Class B shares that are unlisted.
Data in the sheet is as of Feb 4th, 2016 11AM Sydney time (so based on closing prices, well after market close).

#### Where do I get historic performance data of the MSCI World Growth/Value index

I'm looking for a free data source of historical performance data of the MSCI World Value and Growth indices. The data should be calculated with reinvested dividends.

#### What do you call a group consisting of stocks, etfs, and futures?

In the command line interface to my program, the user can create a basket of stocks, etfs, or futures by saying:

basket = stocks
basket = etfs
basket = futures
basket = options

But I want to make a pre-combined basket, consisting of stocks, etfs and futures. Yes, I could make the user enter:

basket = stocks,etfs,futures

But is there a single word appropriate for this predefined basket? "Securities" doesn't include etfs and futures. Is there an appropriate word?

### StackOverflow

#### How do I do a "break" or "continue" when in a functional loop within Kotlin?

In Kotlin, I cannot do a break or continue within a functional loop and its lambda, like I can from a normal for loop. For example, this does not work:

(1..5).forEach {
    continue@forEach // not allowed, nor break@forEach
}

There is old documentation that mentions this being available, but it appears it was never implemented. What is the best way to get the same behavior when I want to continue or break from within the lambda?

Note: this question is intentionally written and answered by the author (Self-Answered Questions), so that the idiomatic answers to commonly asked Kotlin topics are present in SO. It is also meant to clarify some really old answers, written for alphas of Kotlin, that are not accurate for current-day Kotlin.
#### R programming: how to cut dendrogram in 3 steps and show the plot in texts

Please refer to the attached code:

train <- read.csv("~/Desktop/R/2014data.csv")
d <- dist(train, method = "euclidean") # distance matrix
fit <- hclust(d, method="ward")
plot(fit)

Here is the output figure:

How can I cut this dendrogram into 4-5 layers and show the output as text?

### QuantOverflow

#### Risk Free Rate vs LIBOR

Theoretically, in pricing derivatives, most textbooks refer to the risk-free rate. What is obtainable in practice? The risk-free rate or the LIBOR rate?

### StackOverflow

#### Tensorflow: InvalidArgumentError: logits must be 2-dimensional

I have a dataset where each row is a (x, y) tuple. So, each row is a point of a curve in the X-Y plane. I would like to do logistic regression for it. Following the examples given here, I have created the model in the following chunk of code.

# tf Graph Input
X = tf.placeholder("float")
Y = tf.placeholder("float")

# Set model weights
W = tf.Variable(rng.randn(), name="weight")
b = tf.Variable(rng.randn(), name="bias")

# Construct a logistic model
pred = tf.nn.softmax(tf.mul(X, W) + b) # Softmax

# Mean squared error
cost = tf.reduce_sum(tf.pow(pred-Y, 2))/(2.0*n_samples)

# Gradient descent
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

# Initializing the variables
init = tf.initialize_all_variables()

# Launch the graph
with tf.Session() as sess:
    sess.run(init)

    # Fit all training data
    for epoch in range(training_epochs):
        for (x, y) in zip(train_X, train_Y):
            sess.run(optimizer, feed_dict={X: x, Y: y})

        #Display logs per epoch step
        if (epoch+1) % display_step == 0:
            c = sess.run(cost, feed_dict={X: train_X, Y:train_Y})
            print "Epoch:", '%04d' % (epoch+1), "cost=", "{:.9f}".format(c), \
                "W=", sess.run(W), "b=", sess.run(b)

I am getting the following error in the last line.

InvalidArgumentError: logits must be 2-dimensional [[Node: Softmax = SoftmaxT=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]]

I have two 1D vectors, one for the X values and the other for the Y values. I am not sure how I would do logistic regression if the logits must be 2D.

### Lobsters

#### NIST declares the age of SMS-based 2-factor authentication over

#### Bitcoin’s not money, judge rules as she tosses money-laundering charge

### QuantOverflow

#### CBOE Index Minute Data

I am doing some small research and am looking for a place to purchase historical minute CBOE Index data. I am interested in:

VIX - CBOE Volatility Index
VVIX - CBOE VIX VOLATILITY INDEX
VXV - CBOE VIX VOLATILITY INDEX
VXST - CBOE Short-Term Volatility Index
VXMT - CBOE Mid-Term Volatility Index
VIN - CBOE Near-term VIX Index
VIF - CBOE Far-term VIX Index

Is there a reliable place to get such data?

### Lobsters

#### Apple and the History of Personal Computer Design (computers as material culture)

### StackOverflow

#### Spell check NLP, machine learning

Is there a machine learning algorithm that corrects typos in text, taking into account not only the training sample but also the part of speech in a sentence?

For example: drag - drug

"Failure to report can result to withdrawal of number the new drug application"

But:

"Drag me down"

#### SVC (support vector classification) with categorical (string) data as labels

I use scikit-learn to implement a simple supervised learning algorithm. In essence I follow the tutorial here (but with my own data).

I try to fit the model:

clf = svm.SVC(gamma=0.001, C=100.)
clf.fit(features_training,labels_training)

But at the second line, I get an error: ValueError: could not convert string to float: 'A'

The error is expected because labels_training contains string values which represent three different categories, such as A, B, C.
So the question is: how do I use SVC (support vector classification) if the labelled data represents categories in the form of strings? One intuitive solution seems to be to simply convert each string to a number, for instance A = 0, B = 1, etc. But is this really the best solution?

### Lobsters

#### What “Full Stack” really means to the job market

### QuantOverflow

#### Stress Testing for VaR

I am trying to perform stress testing for VaR and have taken two methods into consideration:

1. Sensitivity analysis
2. Historical scenario analysis

According to the Derivatives Policy Group we need to take 5 factors into consideration:

- Parallel yield curve shifts of ±100 basis points.
- Yield curve shifts of ±25 basis points.
- Stock index changes of ±10%.
- Currency changes of ±6%.
- Volatility changes of ±20%.

1. I am trying to perform the stress testing through sensitivity analysis in Excel, but I am not able to figure out how to adjust the prices for equities, bonds and derivatives for the above factors through the Excel data table function. For instance, if I take the third factor above, stock index changes of ±10%, and one of the stocks in my portfolio is listed in the Dow Jones, how can I adjust the prices for a particular time period (say 6 months)?

2. Secondly, if I use historical scenario analysis with, for instance, the 1997 Asian crisis as the scenario, how do I adjust the prices in this scenario? In this case my portfolio contains asset classes which were all issued in the last 10 years, and therefore I don't have any data (prices etc.) for them from the 1997 Asian crisis. So how do I adjust the prices in this case?

P.S.: I am using the variance-covariance method for calculating VaR.

Eagerly waiting for valuable suggestions on this.
### infra-talk

#### Dropwizard Deep Dive – Part 1: Custom Authentication

This is Part 1 of a three-part series on extending Dropwizard with custom authentication, authorization, and multitenancy. For Part 1, we are going to go over adding custom authentication to Dropwizard.

If you don’t already know, Dropwizard is an awesome Java web API framework. It is my preferred web stack. I’ve written about it previously in: Serving Static Assets with DropWizard, Using Hibernate DAOs in DropWizard Tasks, and Hooking up Custom Jersey Servlets in Dropwizard (note that some of those posts are out-of-date for newer versions of Dropwizard).

It already comes with out-of-the-box support for HTTP basic authentication and OAuth in the dropwizard-auth package. However, in a recent project, I needed to integrate a Dropwizard API with an existing API from another platform using a custom authentication scheme. Fortunately for me, Dropwizard exposes a set of extensible primitives for authentication that I was able to leverage in my solution.

All of the example code I’m going to share in my posts lives in this repo, if you want to follow along.

## Disclaimers

Before we begin, I want to share a couple of disclaimers:

1. This was written for Dropwizard 0.9.x (the current version as of this writing). Future changes to Dropwizard may alter or invalidate details of this post. Be aware if you are using a later version.
2. This post uses Hibernate for the database integration. While much of what we do here is possible with JDBI or other database integrations, I am not going to cover or discuss those. I will likely not be able to answer questions about them.

Now let’s get to work.

## Adding Auth Annotations

Dropwizard allows us to use the role annotations under the javax.annotation.security package, @RolesAllowed, @PermitAll, and @DenyAll, to enforce authentication on our resource methods.
You can add them to each method to set the permissions like so:

@POST
@UnitOfWork
@RolesAllowed({"MANAGER"})
public Widget createWidget(Widget widget) {
    WidgetModel widgetModel = widgetDAO.createWidget(widget);
    return new Widget(widgetModel.getId(), widgetModel.getName(), widgetModel.getScope());
}

@GET
@Path("/public")
@UnitOfWork
@PermitAll
public List getPublicWidgets() {
    return getWidgetsForScope(WidgetScope.PUBLIC);
}

@GET
@Path("/private")
@UnitOfWork
@RolesAllowed({"EMPLOYEE", "MANAGER"})
public List getPrivateWidgets() {
    return getWidgetsForScope(WidgetScope.PRIVATE);
}

@GET
@Path("/top-secret")
@UnitOfWork
@RolesAllowed({"MANAGER"})
public List getTopSecretWidgets() {
    return getWidgetsForScope(WidgetScope.TOP_SECRET);
}

In the above example, you can see that only users with the MANAGER role are allowed to create new widgets or view top-secret widgets. Users with the EMPLOYEE or MANAGER role can see private widgets, and anyone can see public widgets.

It is important to note that if we don’t put any of these annotations on a resource method, it will be open to anyone by default. This is different from the @PermitAll annotation, which still authenticates a user and just disregards what roles they have. To protect against this, I usually use reflection to write a test like the one here that ensures every resource method has one of these annotations.

If you just add these roles and start up your application, you are going to be disappointed. These annotations don’t have any handler by default, so we are going to need to add an AuthFilter to Dropwizard to make them do something.

## Adding an AuthFilter

In order to make our annotations do something, we need to use the tools in the dropwizard-auth package (which you’ll need to add to your dependency list). If we were just using HTTP basic auth or OAuth, we could use the out-of-the-box tools for those and hook them up to a role schema.
However, since we are hooking into a custom authentication scheme, we are going to have to create our own stuff using the package primitives.

The first thing we need to create is a Jersey filter that will run before each request and execute our authentication code. Dropwizard provides a convenient base class called AuthFilter that will do the job. A bare-bones AuthFilter is parameterized to a type of credentials and security principal and would look like this:

@PreMatching
@Priority(Priorities.AUTHENTICATION)
public class CustomAuthFilter extends AuthFilter<CustomCredentials, CustomAuthUser> {
    @Override
    public void filter(ContainerRequestContext requestContext) throws IOException {
        throw new WebApplicationException(Response.Status.UNAUTHORIZED);
    }
}

We can register the filter inside our Dropwizard application’s run method using the AuthDynamicFeature like so:

CustomAuthFilter filter = new CustomAuthFilter();
environment.jersey().register(new AuthDynamicFeature(filter));

Now our filter will run before each request annotated with @RolesAllowed, @PermitAll, or @DenyAll to authenticate the user. Right now, though, our filter just rejects every request with a 401 status code. The next thing we need to do is add an Authenticator which will actually run the logic of authenticating a user’s credentials.

## Adding an Authenticator

Our authenticator exposes a single method, authenticate, which takes in a CustomCredentials as an argument. The authenticator then uses the userId and token in the credentials to authenticate the user against the token stored in our database for that user. If the token matches, we return the user wrapped as an optional. Otherwise, we return an empty optional.
Also note that since we are using Hibernate, we need to instantiate our authenticator using UnitOfWorkAwareProxyFactory like so:

    CustomAuthenticator authenticator = new UnitOfWorkAwareProxyFactory(hibernate)
        .create(CustomAuthenticator.class,
                new Class[]{TokenDAO.class, UserDAO.class},
                new Object[]{tokenDAO, userDAO});

And our whole authenticator looks like this:

    public class CustomAuthenticator implements Authenticator<CustomCredentials, CustomAuthUser> {
        private TokenDAO tokenDAO;
        private UserDAO userDAO;

        public CustomAuthenticator(TokenDAO tokenDAO, UserDAO userDAO) {
            this.tokenDAO = tokenDAO;
            this.userDAO = userDAO;
        }

        @Override
        @UnitOfWork
        public Optional<CustomAuthUser> authenticate(CustomCredentials credentials) throws AuthenticationException {
            CustomAuthUser authenticatedUser = null;
            Optional user = userDAO.getUser(credentials.getUserId());

            if (user.isPresent()) {
                Optional<TokenModel> token = tokenDAO.findTokenForUser(user.get());

                if (token.isPresent()) {
                    TokenModel tokenModel = token.get();

                    // Only accept the request if the stored token matches the one presented.
                    if (tokenModel.getId().equals(credentials.getToken())) {
                        authenticatedUser = new CustomAuthUser(tokenModel.getUser().getId(), tokenModel.getUser().getName());
                    }
                }
            }

            return Optional.fromNullable(authenticatedUser);
        }
    }

Now we just need to hook up our CustomAuthFilter to our CustomAuthenticator by calling the authenticate method with some credentials. We’ll create the credentials in our auth filter by parsing our request context. (In our case, this means pulling the credentials out of cookies.)

    public class CustomAuthFilter extends AuthFilter<CustomCredentials, CustomAuthUser> {
        private CustomAuthenticator authenticator;

        public CustomAuthFilter(CustomAuthenticator authenticator) {
            this.authenticator = authenticator;
        }

        @Override
        public void filter(ContainerRequestContext requestContext) throws IOException {
            Optional<CustomAuthUser> authenticatedUser;

            try {
                CustomCredentials credentials = getCredentials(requestContext);
                authenticatedUser = authenticator.authenticate(credentials);
            } catch (AuthenticationException e) {
                throw new WebApplicationException("Unable to validate credentials", Response.Status.UNAUTHORIZED);
            }

            if (!authenticatedUser.isPresent()) {
                throw new WebApplicationException("Credentials not valid", Response.Status.UNAUTHORIZED);
            }
        }

        private CustomCredentials getCredentials(ContainerRequestContext requestContext) {
            CustomCredentials credentials = new CustomCredentials();

            try {
                String rawToken = requestContext
                    .getCookies()
                    .get("auth_token")
                    .getValue();

                String rawUserId = requestContext
                    .getCookies()
                    .get("auth_user")
                    .getValue();

                credentials.setToken(UUID.fromString(rawToken));
                credentials.setUserId(Long.valueOf(rawUserId));
            } catch (Exception e) {
                // A missing or malformed cookie is treated the same as bad credentials.
                throw new WebApplicationException("Unable to parse credentials", Response.Status.UNAUTHORIZED);
            }

            return credentials;
        }
    }

Notice that we parse the credentials out of the request cookies and create a CustomCredentials instance. We then pass those credentials to the CustomAuthenticator in the filter method. If the authenticator returns a user, we are properly authenticated. Otherwise, we abort the request with a 401 error.

## Custom Authentication Complete

And with that, we now have custom authentication on all of our annotated resource methods. None of them can be successfully called without a valid userId and token combination in the request cookies.

But what about those roles that we listed on the methods with the @RolesAllowed annotation? Right now, every annotated method is open to any authenticated user.
Checking the roles on the user requires a little more work on our part, which we will cover in Part 2. You can see the code for Part 1 here and the complete code for all three parts here.

The post Dropwizard Deep Dive – Part 1: Custom Authentication appeared first on Atomic Spin.

### Fefe

#### It's time once again for a few reader letters. 1. ...

It's time once again for a few reader letters.

1. Since the attack, one reads that UMFs [unaccompanied minor refugees] have the option of getting trauma therapy, e.g. here. That is not really true, though. In fact, you cannot even get places in psychiatric care for young refugees, apart from those institutions with such a bad reputation that they simply no longer get any local clientele. The same goes for psychologists: the only one I know of in [region redacted] who treats UMFs simply stuffs them full of psychotropic drugs, and that is the end of the treatment.
2. While we are on the subject of doctors: many refuse to treat UMFs, or make demands that cannot be met, e.g. that an interpreter come along to the appointment.
3. While we are on the subject of interpreters: I can count on one hand the times I actually got one when I needed one (it costs money, after all).
4. And people like to save money. The article linked above says the guardian is responsible for applying for family reunification. Sounds great? In both districts, applications from minors are systematically delayed. And as long as the application has not been processed, family reunification cannot be applied for. Miraculously, the immigration office only ever gets around to it when the young person turns 18 and thereby loses the entitlement to family reunification. Incidentally, this reunification is what the smugglers use to get families to send their underage sons on the journey.
5. There is a similarly nice practice in assistance for young adults. By law they must continue to receive care if they (still) need help. But since the state likes to save money, they are put into the GU/DU [GU = Gemeinschaftsunterkunft (shared accommodation); I don't know what DU is either] without any further review. If they are lucky, there is someone who looks after them. If they are unlucky, there isn't. That is how it is done here, even when a blind man can see that the person in question (for whatever reason) will not get by on their own. The only exceptions are certified disabilities.
6. The UMF accommodation. The staff are for the most part not trained to deal with traumatized people. They are for the most part not trained for migration work. They are for the most part not trained for working with young people. The staff in the accommodation are often more exhausting than the young people. Almost none of the staff speak English. The operators try to save wherever they can and, in my estimation, in some cases also make very good money out of the whole thing. Accordingly, they hire far too few, far too poorly qualified people, who then sometimes work 60 or 70 hours a week (and in the worst case more). The home supervisory authority is very lenient about this because there are too few places (because, of course, nobody could have expected that people would want to come here!). In [region redacted] they have filled two posts that deal with the UMF groups and are responsible for the district of [region redacted]. Only the really awful outfits (e.g. [name redacted], a "social" bulky-waste dealer, or rather a "social temp agency", that wanted to shore up its ailing finances with its foray into the field) were seriously inspected and shut down. But none of them meets the standards.
7. The schooling is crap too. Groups that are far too large, teachers who in some cases are not adequately trained. And nowhere near enough school places. In the area, about 200 minors are waiting for a school place.
8. Now something positive to finish: the refugees I work with, young or old, are much more polite and motivated than the Germans who have similar access to education. Actually they are all very open and friendly, apart from extreme exceptional cases (who then also have a corresponding biography). In everyday life they are also much less chauvinistic (whether towards other religions, the other sex, or other sexualities) than the comparison group. There are problems with violence now and then, but those are harmless scuffles among teenagers. And even here, it feels more harmless among the refugees. I and everyone else in the field are optimistic that the integration of the young people will work out. If the problems described were eliminated, though, it would be considerably easier for us.

Update: Another reader on this:

This kind of delaying of applications and the unjustified termination of assistance upon reaching adulthood does not only happen with refugees. As a foster family we experienced pretty much exactly the same things; the youth welfare offices like to handle inconvenient or expensive applications in just the same way. And we are talking here about German foster children from German families of origin. We regularly delivered applications directly at the office against the personal signature of the cler^h^h^h^hsocial worker, and often had to claim benefits that the children were entitled to under SGB VIII and that were needed, under threat of legal action or through actual legal action. The usual principle here is that the youth welfare office first tries to sit things out, then threatens to take the children out of the family since one is obviously overwhelmed, then quickly caves in during the proceedings because it clearly has no chance, and still haggles over every cent (literally) of a benefit. It is never about the people affected, but about following the letter of the law, and doing so with as little financial outlay as possible. Yet the laws in SGB VIII exist to regulate benefits according to the needs of the person requiring help. Judges, too, regularly rule accordingly that a benefit must be provided according to need, yet these rulings are ignored by the authorities time and again.

So in Germany we treat everyone seeking help equally shittily.

#### What does the French government actually need ...

What does the French government actually need the permanent state of emergency for?

#### And another submission from a reader: In my work ...

And another submission from a reader:

In my work in the psychiatric field I have never seen as many crises as right now. People from BEW (betreutes Einzelwohnen, assisted individual living), but also from TWGs (therapeutische Wohngemeinschaften, therapeutic shared housing), are really freaking out at the moment. Similar things are being reported from nursing homes, by the way. There the SpD (sozialpsychiatrischer Dienst, social psychiatric service) is currently being called out very often because the ladies and gentlemen are showing psychotic experiences. The reason; hold on tight: the nurses cannot manage to get enough water into the old people. The heat does the rest. Dehydration can lead to psychotic experiences.

And so we come to the conspiracy theory of the week: climate change is to blame for shooting rampages and terrorism!

#### Shots fired at the Benjamin-Franklin-Klinikum in Berlin. OK, ...

Shots fired at the Benjamin-Franklin-Klinikum in Berlin. OK, we need EU-wide air conditioning. This can't go on like this.

Update: The perpetrator in Berlin apparently shot a doctor and then killed himself. Oh, and in France two men took hostages in a church and shot the 84-year-old priest dead. Who on earth shoots an 84-year-old priest or a doctor in a hospital!?

#### Please go and watch the John Oliver piece on the RNC ...

Please go and watch the John Oliver piece on the RNC. He dug up an interview with Newt Gingrich there that answers a whole bunch of questions about how politics works these days.

### QuantOverflow

#### Historic and future (next) round of machine readable dividend info

Is there a source of historic dividend dates, ex-dividend dates and dividend values? Also, the next dividend date / ex-dividend date. Ideally in a machine readable form?

The publicly available APIs don't seem to offer the above info as an option. UpcomingDividends seems to have some of the future data and Google Finance seems to have the last dividend info (although no date).

### StackOverflow

#### How do I get TensorFlow's 'import_graph_def' to return Tensors

If I attempt to import a saved TensorFlow graph definition with

    import tensorflow as tf
    from tensorflow.python.platform import gfile

    with gfile.FastGFile(FLAGS.model_save_dir.format(log_id) + '/graph.pb', 'rb') as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
    x, y, y_ = tf.import_graph_def(graph_def,
                                   return_elements=['data/inputs',
                                                    'output/network_activation',
                                                    'data/correct_outputs'],
                                   name='')

the returned values are not Tensors as expected, but something else: instead, for example, of getting x as

    Tensor("data/inputs:0", shape=(?, 784), dtype=float32)

I get

    name: "data/inputs_1"
    op: "Placeholder"
    attr {
      key: "dtype"
      value {
        type: DT_FLOAT
      }
    }
    attr {
      key: "shape"
      value {
        shape {
        }
      }
    }

That is, instead of getting the expected tensor x, I get x.op. This confuses me because the documentation seems to say I should get a Tensor (though there are a bunch of ors there that make it hard to understand).

How do I get tf.import_graph_def to return specific Tensors that I can then use (e.g. in feeding the loaded model, or running analyses)?

#### Adapting binary stacking example to multiclass in Python

I have been studying this example of stacking. In this case, each set of K-folds produces one column of data, and this is repeated for each classifier. I.e., the matrices for blending are:

    dataset_blend_train = np.zeros((X.shape[0], len(clfs)))
    dataset_blend_test = np.zeros((X_submission.shape[0], len(clfs)))

I need to stack predictions from a multiclass problem (probabilities for 15 different classes per sample). This will produce an n*15 matrix for each clf.

Should these matrices just be concatenated horizontally? Or should they be combined in some other way, before logistic regression is applied? Thanks.

### QuantOverflow

#### Given Brownian motion $B_t,B_s$ and $t>s$, how to calculate $P(B_t>0,B_s<0)$?

As stated, this is an interview question. Given Brownian motion $B_t,B_s$ and $t>s$, how to calculate $P(B_t>0,B_s<0)$?

### Fred Wilson

#### Hailo and MyTaxi

The news broke this morning that our portfolio company Hailo is combining forces with MyTaxi. The combined company, which will operate under the MyTaxi brand, will be the dominant taxi hailing app in Western Europe.

Hailo is huge in the UK and Ireland and has a strong position in Spain. MyTaxi operates in Germany, Australia, Italy, Poland, Portugal, Spain, and Sweden. So this combination is a great strategic fit and the new company will benefit from a lot of synergies.

Andrew Pinnington, the current CEO of Hailo, will become the CEO of MyTaxi and the company will consolidate its operations in Hamburg, Germany. The combined company will be majority owned by Daimler.

Unlike in the US, the regulated taxi business in Europe got on the ridesharing bandwagon early and it is as simple and easy to hail a taxi in Europe as it is to use Uber. If you travel to Berlin frequently, you will know that ridesharing in Berlin is all about taxis.

I can’t reveal numbers, but the combined MyTaxi/Hailo business will operate at a scale that puts it in the big leagues along with Uber and a number of other emerging winners in the ridesharing business.

This is a great outcome for Hailo and I would be remiss if I didn’t thank Andrew Pinnington for his incredible leadership at Hailo over the past 18 months. Without that, none of this would have been possible.

### StackOverflow

#### Time Series Clustering With Dynamic Time Warping Distance (DTW) with dtwclust

I am trying to perform time series clustering with Dynamic Time Warping (DTW) distance using the dtwclust package.

I use this function:

    dtwclust(data = NULL, type = "partitional", k = 2L, method = "average",
             distance = "dtw", centroid = "pam", preproc = NULL, dc = NULL,
             control = NULL, seed = NULL, distmat = NULL, ...)

I saved my data as a list of time series of different lengths, like the example below.

    $a
    [1]  0  0  0  0  2  3  6  7  8  9 11 13

    $b
    [1]  0  1  1  2  4  7  8 11 13 15 17 19 22 25 28 31 34 35

    $c
    [1]  1  2  4  4  4  4  4  4  4  4  5  5  5  5  5  5  5  6  6  6  6  7  7  8  8  9 10 10 12 14 15 17 19

    $d
    [1]  0  0  0  0  0  1  2  4  4  4

    $e
    [1]  0  1  1  3  5  6  9 12 14 17 19 20 22 24 28 31 32 34


Now, my problems are:

(1) I can only choose dtw, dtw2 or sbd for my distance and dba, shape or pam for my centroid (because the series in the list have different lengths). But I don't know which distance and centroid are correct.

(2) I have plotted some graphs, but I don't know how to choose the right, reasonable one.

k = 6, distance = dtw, centroid = dba:

k = 4, distance = dtw, centroid = dba (the cluster center seems weird?)

I have tried all the combinations, with k from 4 to 13, but I have no idea how to choose the right one.

### QuantOverflow

#### Deduce expected exposure profile from option/structure delta?

I am wondering whether there is a relationship between the delta of an option (or any structured derivative) and its expected positive/negative exposure.

An intuitive question would be the following: a forward has a delta of 1; given the above exposure profile and the delta of an option on the same underlying, can I deduce that the exposure profile of the option equals Delta * Forward_Exposure?

However, after running some simulations I see that this is not the case, part of the reason being (I think) that for exposure generation one simulates values for all relevant risk parameters and not just the one which corresponds to the delta/sensitivity.

If there are any questions about the definitions of the terms I used, I am happy to clarify. Image taken from Jon Gregory's book on CVA.

### TheoryOverflow

#### Ordering of a DAG minimizing some definition of cost

Consider a DAG $(V,A)$ with a topological ordering $(v_1,v_2,\ldots,v_n)$. I define the cost of this ordering as the maximum over all $1\leq i\leq n$ of $|\{j\leq i \mid \exists k>i: (v_j,v_k)\in A\}|$. The problem is: given a DAG, find a topological ordering with minimum cost.

In other words, for each $i$, I consider that a vertex that appeared before $i$ is pending if it still has one or more out-neighbours that did not appear yet, and I want to minimize the maximum number of pending vertices at any time.

It looks like some graph measures (say treewidth, etc.), but I didn't manage to find this in the literature. Has it been studied before?

I'd say it's almost-certainly NP-hard (although I don't have a proof)... would there be any approximation algorithm, or at least some smart heuristic?

Edit: for a little bit of context. This came up while I was trying to program a tool solving a completely unrelated string problem. To make things short: the graph models my input, and I need to compute a set of strings $\mathcal S(v)$ for each node $v$. To compute each $\mathcal S(v)$, I need to know $\mathcal S(u)$ for each node $u$ with an arc $u\rightarrow v$. In the end I'm only interested in $\mathcal S(v)$ for the targets of the graph. Now I want to improve the memory needs: the sets $\mathcal S$ are huge, so I want to keep only a minimum number of them in memory at any time: they correspond to the "pending vertices" defined above.
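
To make the cost definition concrete, here is a minimal Python sketch. It only evaluates the cost of a given ordering and, for tiny graphs, brute-forces the minimum; it is not an answer to the hardness or approximation question. The function names `ordering_cost` and `min_cost_ordering` and the arc-list representation are assumptions made for illustration.

```python
from itertools import permutations


def ordering_cost(order, arcs):
    """Maximum number of pending vertices over all prefixes of `order`.

    A vertex at position p whose last out-neighbour sits at position q
    is pending for every prefix ending at positions p, p+1, ..., q-1.
    Raises ValueError if `order` is not a topological ordering.
    """
    pos = {v: i for i, v in enumerate(order)}
    last_use = {}
    for u, v in arcs:
        if pos[u] >= pos[v]:
            raise ValueError("not a topological ordering")
        last_use[u] = max(last_use.get(u, -1), pos[v])

    # Difference array: +1 where a vertex becomes pending, -1 where it stops.
    delta = [0] * len(order)
    for u, q in last_use.items():
        delta[pos[u]] += 1
        delta[q] -= 1

    cost = running = 0
    for d in delta:
        running += d
        cost = max(cost, running)
    return cost


def min_cost_ordering(vertices, arcs):
    """Exhaustive search over all orderings; only usable for very small DAGs."""
    best = None
    for order in permutations(vertices):
        try:
            cost = ordering_cost(order, arcs)
        except ValueError:
            continue  # skip orderings that violate an arc
        if best is None or cost < best[0]:
            best = (cost, list(order))
    return best
```

In the memory-saving application described above, `ordering_cost` corresponds to the peak number of sets $\mathcal S(v)$ that must be kept alive at once, counting a set from the moment it is computed until its last consumer has been processed.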

### Lobsters

#### Zagat - Restaurant reviews, trusted ratings, photos, new places, best-of lists, neighborhood guide on the App Store

I don’t want to be spammy but I did work on this for the past year and we just launched it today so I’m quite excited about it. US-only for now, unfortunately.

### StackOverflow

#### OneHotEncoded features causing error when input to Classifier

I’m trying to prepare data for input to a Decision Tree and Multinomial Naïve Bayes Classifier.

This is what my data looks like (pandas dataframe)

Label  Feat1  Feat2  Feat3  Feat4

0        1     3       2      1
1        0     1       1      2
2        2     2       1      1
3        3     3       2      3


I have split the data into dataLabel and dataFeatures. Prepared dataLabel using dataLabel.ravel()

I need to discretize features so the classifiers treat them as being categorical not numerical.

I’m trying to do this using OneHotEncoder

enc = OneHotEncoder()

enc.fit(dataFeatures)
chk = enc.transform(dataFeatures)
from sklearn.naive_bayes import MultinomialNB

mnb = MultinomialNB()

from sklearn import metrics
from sklearn.cross_validation import cross_val_score
scores = cross_val_score(mnb, Y, chk, cv=10, scoring='accuracy')


I get this error - bad input shape (64, 16)

This is the shape of label and input

dataLabel.shape = 72 chk.shape = 72,16

Why won't the classifier accept the onehotencoded features?

EDIT - Entire Stack trace code

/root/anaconda2/lib/python2.7/site-packages/sklearn/utils /validation.py:386: DeprecationWarning: Passing 1d arrays as data is deprecated in 0.17 and willraise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample.
DeprecationWarning)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/root/anaconda2/lib/python2.7/site-packages/sklearn /cross_validation.py", line 1433, in cross_val_score
for train, test in cv)
File "/root/anaconda2/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.py", line 800, in __call__
while self.dispatch_one_batch(iterator):
File "/root/anaconda2/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.py", line 658, in dispatch_one_batch
File "/root/anaconda2/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.py", line 566, in _dispatch
job = ImmediateComputeBatch(batch)
File "/root/anaconda2/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.py", line 180, in __init__
self.results = batch()
File "/root/anaconda2/lib/python2.7/site-packages/sklearn/externals/joblib/parallel.py", line 72, in __call__
return [func(*args, **kwargs) for func, args, kwargs in self.items]
File "/root/anaconda2/lib/python2.7/site-packages/sklearn/cross_validation.py", line 1531, in _fit_and_score
estimator.fit(X_train, y_train, **fit_params)
File "/root/anaconda2/lib/python2.7/site-packages/sklearn/naive_bayes.py", line 527, in fit
X, y = check_X_y(X, y, 'csr')
File "/root/anaconda2/lib/python2.7/site-packages/sklearn/utils/validation.py", line 515, in check_X_y
y = column_or_1d(y, warn=True)
File "/root/anaconda2/lib/python2.7/site-packages/sklearn/utils/validation.py", line 551, in column_or_1d


ValueError: bad input shape (64, 16)

### QuantOverflow

#### Is there a way to meaningfully generate daily returns from monthly?

I have a set of 7 investments in a portfolio and I need to optimize the weightings based on some exposures to various markets/styles/economic factors. I was hoping to do some sort of simple exposure analysis or 'factor analysis' (not actual Factor Analysis, but more just a bunch of regressions), using daily returns of various risk factors (for example, SPX, TBills, MSCI, FamaFrench Factors, etc).

I only have daily returns for 5 of the 7 investments in the portfolio. I have monthly returns for the remaining two. Is there an easy way to do some sort of generation of daily returns from monthly returns, possibly modelling the monthly against the factors' monthly returns, and then generating daily returns based on the model? (I know this is circular, but I am spitballing.) The problem is that I need some way to tie or anchor the modeled daily returns back to the actual monthly returns.

Any ideas? And does this make sense?
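
One simple way to "anchor" modeled daily returns to a known monthly return can be sketched as follows. This is an illustrative assumption, not an established method from the question: it shifts every modeled daily log-return in a month by the same constant so that the days compound exactly to the actual monthly figure. The function names and the sample numbers are hypothetical.

```python
import math


def anchor_daily_to_monthly(modeled_daily, actual_monthly):
    """Shift modeled daily returns so they compound to `actual_monthly`.

    Works in log-return space: the gap between the actual monthly
    log-return and the sum of the modeled daily log-returns is spread
    evenly over the days of the month.
    """
    log_daily = [math.log1p(r) for r in modeled_daily]
    gap = math.log1p(actual_monthly) - sum(log_daily)
    shift = gap / len(log_daily)
    return [math.expm1(x + shift) for x in log_daily]


def compound(returns):
    """Compound a sequence of simple returns into one simple return."""
    total = 1.0
    for r in returns:
        total *= 1.0 + r
    return total - 1.0


if __name__ == "__main__":
    # Hypothetical factor-model output for a three-day "month".
    adjusted = anchor_daily_to_monthly([0.01, -0.02, 0.005], 0.03)
    print(adjusted, compound(adjusted))
```

The even shift preserves the day-to-day pattern implied by the factor model but leaves the daily volatility of the generated series entirely model-driven, which matters if the regressions are later used for risk estimates.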

#### Is there a better, more rigorous explanation for why this partial derivative is 0 using Ito's Lemma?

I encountered the following slide in a lecture on Ito's Lemma.

The lecturer explained that $$\frac{\partial V}{\partial t} = 0$$ because the first two derivatives on the slide already took into account time into the change of the value of V.

I'm not convinced. If $V = \log S(t)$ is a function of time, why wouldn't we have to use the chain rule for the third derivative on the slide?

$$\frac{\partial V}{\partial t} = \frac{\partial V}{\partial S(t)} \cdot \frac{\partial S(t)}{\partial t} = S^{-1} \cdot \frac{\partial S(t)}{\partial t} = ...$$

I'm not sure where to go from here to show that it is in fact 0.
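
One way to make the claim precise, sketched here under the assumption (the slide is not reproduced) that $S$ follows geometric Brownian motion $dS = \mu S\,dt + \sigma S\,dW$: in Ito's Lemma, $V$ is treated as a function $V(S,t)$ of two independent arguments, and $\partial V/\partial t$ measures only the explicit dependence on $t$ with $S$ held fixed. For $V = \log S$ there is no such dependence, and the time evolution of $S(t)$ enters through the $dS$ terms instead:

$$\frac{\partial V}{\partial t} = 0, \qquad \frac{\partial V}{\partial S} = \frac{1}{S}, \qquad \frac{\partial^2 V}{\partial S^2} = -\frac{1}{S^2}$$

$$dV = \frac{\partial V}{\partial t}\,dt + \frac{\partial V}{\partial S}\,dS + \frac{1}{2}\frac{\partial^2 V}{\partial S^2}\,(dS)^2 = \left(\mu - \frac{\sigma^2}{2}\right)dt + \sigma\,dW$$

where $(dS)^2 = \sigma^2 S^2\,dt$ has been used. The chain-rule expression above would need $\partial S(t)/\partial t$, which does not exist for a path driven by Brownian motion.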

### TheoryOverflow

#### distance between codewords and preimages

Let $\varepsilon>0$. Does there exist an $[n,k,d]$ code over the field $\mathbf{F}_2$ that satisfies:

$d(Cx,Cy)\in [\alpha(1-\varepsilon)d(x,y), \alpha(1+\varepsilon)d(x,y)]$ (where $C$ is the generator matrix for the code) for every $x,y \in \{0,1\}^k$ and some $\alpha\in \mathbf{Z}$. In some sense, I want a code where the distances between the pre-images are "sort of" preserved, up to a multiplicative factor $\alpha(1\pm \varepsilon)$, by this code.

### StackOverflow

#### Good image preprocessing method for image obtained from webcam for better classification

I am working on a project to recognize facial expression of a human face. We used JAFFE database as our dataset and Local Binary Patterns(LBP) as feature extraction. SVM was used as the classifier. We were able to obtain an average accuracy of around 80% on the dataset but when testing with the image from the live webcam, the accuracy/prediction was sub-par. So I assumed that it was because of the quality of the image from the webcam. So what preprocessing method can be used to obtain better prediction?

### QuantOverflow

#### Step By Step method to calculating VaR using MonteCarlo Simulations

In trying to find VaR for 5 financial assets with prices over a long period of time (2000 days' worth of data), how would I do the following:

1. Carry out a Monte Carlo simulation to find a VaR value, assuming all 5 assets are standard normally distributed.
2. Carry out a Monte Carlo simulation for all 5 assets to find a VaR value, assuming they follow a Student's t-distribution with 10 degrees of freedom.

I am trying to do this at both the 95% and 90% confidence levels, and to simulate the data with 10,000 replications. Any help would be greatly appreciated. I have already created a Cholesky decomposition matrix, but I am not sure how to use it to get the VaR figure.
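
The steps above could be sketched roughly as follows. This is a minimal illustration under several assumptions that are not in the question: `chol` is the lower-triangular Cholesky factor of the covariance matrix of daily returns, returns have zero mean, `weights` are portfolio weights, and VaR is reported in return units as the negated empirical quantile of simulated portfolio returns. For the Student-t case the draws are rescaled to unit variance before the Cholesky factor is applied. The function name `simulate_var` is hypothetical.

```python
import math
import random


def simulate_var(chol, weights, levels=(0.90, 0.95), n_sims=10_000,
                 dof=None, seed=0):
    """Monte Carlo VaR for a portfolio of correlated assets.

    chol    -- lower-triangular Cholesky factor (list of rows)
    weights -- portfolio weights, one per asset
    dof     -- None for normal shocks, else degrees of freedom of a
               multivariate Student-t (must be > 2)
    Returns {confidence level: VaR as a positive loss number}.
    """
    rng = random.Random(seed)
    n = len(weights)
    pnl = []
    for _ in range(n_sims):
        # Independent standard normal shocks.
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if dof is not None:
            # Chi-square(dof) is Gamma(shape=dof/2, scale=2); dividing by
            # sqrt(chi2/dof) gives t shocks, and the extra factor
            # sqrt((dof-2)/dof) rescales them to unit variance.
            chi2 = rng.gammavariate(dof / 2.0, 2.0)
            scale = math.sqrt((dof - 2.0) / chi2)
            z = [zi * scale for zi in z]
        # Correlate the shocks: r = chol @ z (lower triangle only).
        r = [sum(chol[i][k] * z[k] for k in range(i + 1)) for i in range(n)]
        pnl.append(sum(w * ri for w, ri in zip(weights, r)))

    pnl.sort()
    # Empirical lower-tail quantile, negated so that VaR is a positive loss.
    return {lvl: -pnl[int((1.0 - lvl) * n_sims)] for lvl in levels}
```

Calling `simulate_var(chol, weights)` gives the normal case and `simulate_var(chol, weights, dof=10)` the Student-t case, both at the 90% and 95% levels with 10,000 replications by default.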

### StackOverflow

#### Java for loop to lambda expression [on hold]

What would be the best way to refactor the following code using the new Java 8 features (lambda expressions)

public List<String> getNames(List<User> users){
for (User user : users){
}
}


Also, will there be any difference in performance? if so, what is the difference?

### QuantOverflow

#### forward space vs zero space in finance jargon

Does anyone know what it means to value an asset in "forward space" versus "zero space"? Where does one start when trying to dig into the meaning of this? Thanks in advance.

### TheoryOverflow

#### Can the "mutual independence" condition in the Lovász local lemma be weakened?

The Lovász local lemma, as stated in Corollary 5.1.2 here, is given as follows.

Lemma. Let $A_1, \ldots, A_k$ be events such that each $A_i$ has probability at most $p$ and such that each $A_i$ is mutually independent of all but at most $d$ of the $A_j$'s. Then if $ep(d+1) \leq 1$, the probability that none of the $A_i$'s occur is positive.

I have a few questions about the assumptions needed for the above lemma. Looking at the proof given in the Wikipedia article here, it appears that one does not require each $A_i$ to be mutually independent of all but at most $d$ of the $A_j$'s. Instead, all one requires is that$$\text{Prob}\left(A_i \mid \bigwedge_{j \in S} \overline{B_j} \right) = \text{Prob}(A_i),$$where $S = \{1, \ldots, k\} - \Gamma(A_i)$, where $\Gamma(A_i)$ is the dependency locus of $A_i$. I think that the above condition appears to be considerably weaker than mutual independence; indeed, mutual independence means that $$\text{Prob}\left(A_i \wedge \bigwedge_{j \in S} \overline{B_j}\right) = \text{Prob}(A_i) \cdot \prod_{j \in S} \text{Prob}(\overline{B_j}).$$

Am I right to say that mutual independence is not required?

Also, is there a reason why the constant $e$ in the statement of the Lovász local lemma is optimal? Several sources seem to agree on this. It seems somewhat arbitrary to me, and I feel like a smaller constant can be achieved by being more careful in the proof.

### StackOverflow

#### Python keras how to transform a dense layer into a convolutional layer

I have a problem finding the correct mapping of the weights in order to transform a dense layer into a convolutional layer.

This is an excerpt of a ConvNet that I'm working on:

model.add(Convolution2D(512, 3, 3, activation='relu'))


After the MaxPooling, the input is of shape (512,7,7). I would like to transform the dense layer into a convolutional layer to make it look like this:

model.add(Convolution2D(512, 3, 3, activation='relu'))


However, I don't know how I need to reshape the weights in order to correctly map the flattened weights to the (4096,512,7,7) structure that is needed for the convolutional layer? Right now, the weights of the dense layer are of dimension (25088,4096). I need to somehow map these 25088 elements to a dimension of (512,7,7) while preserving the correct mapping of the weights to the neurons. So far, I have tried multiple ways of reshaping and then transposing but I haven't been able to find the correct mapping.

An example of what I have been trying would be this:

weights[0] = np.transpose(np.reshape(weights[0],(512,7,7,4096)),(3,0,1,2))


but it doesn't map the weights correctly. I verified whether the mapping is correct by comparing the output for both models. If done correctly, I expect the output should be the same.
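
The index arithmetic can be checked on a tiny example. The sketch below uses plain Python lists and makes assumptions that the question does not confirm: the dense layer sees the input flattened in C order (channel, row, column), and the replacement layer uses one full-size kernel per dense unit in "valid" mode. Under those assumptions the reshape-then-transpose shown above is the right mapping for a cross-correlation; if the backend computes a true convolution (kernels flipped, as Theano's conv2d does by default), the kernels would additionally need reversing along both spatial axes. All function names here are hypothetical.

```python
def dense_to_conv_kernels(dense_w, channels, height, width):
    """Map dense weights [C*H*W][units] to kernels [units][C][H][W]."""
    n_units = len(dense_w[0])
    return [
        [[[dense_w[(c * height + h) * width + w][u] for w in range(width)]
          for h in range(height)]
         for c in range(channels)]
        for u in range(n_units)
    ]


def flip_spatial(kernels):
    """Reverse every kernel along its two spatial axes."""
    return [[[row[::-1] for row in plane[::-1]] for plane in kernel]
            for kernel in kernels]


def dense_forward(x, dense_w):
    """Dense layer (no bias) on the C-order flattening of x[C][H][W]."""
    flat = [v for plane in x for row in plane for v in row]
    n_units = len(dense_w[0])
    return [sum(f * row[u] for f, row in zip(flat, dense_w))
            for u in range(n_units)]


def correlate_full(x, kernels):
    """Valid-mode cross-correlation with kernels as large as the input."""
    C, H, W = len(x), len(x[0]), len(x[0][0])
    return [sum(k[c][h][w] * x[c][h][w]
                for c in range(C) for h in range(H) for w in range(W))
            for k in kernels]


def convolve_full(x, kernels):
    """Valid-mode true convolution (kernel flipped) with full-size kernels."""
    C, H, W = len(x), len(x[0]), len(x[0][0])
    return [sum(k[c][H - 1 - h][W - 1 - w] * x[c][h][w]
                for c in range(C) for h in range(H) for w in range(W))
            for k in kernels]
```

With these helpers, `correlate_full(x, dense_to_conv_kernels(...))` and `convolve_full(x, flip_spatial(dense_to_conv_kernels(...)))` both reproduce `dense_forward(x, dense_w)`, which is one way to test which convention a given backend follows.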

#### Cost for image similarity

I'm trying to build a spatial transformer network. The goal is to crop a region of interest. I have pairs of (image, ROI) (e.g., the image is a picture of a road and the ROI is the traffic sign). The question is how I can define an efficient cost function that measures the similarity between the label (ROI) and the output image (from the network). I simply stated the cost as:

cost = T.mean(lasagne.objectives.squared_error(output, Y))

Are there any better ways to do that? With this cost, even a tiny shift between the output and the ROI changes the cost a lot.

### CompsciOverflow

#### Timely lower bounded Turing machines

Let M be a deterministic Turing machine which has the properties:

1) $\forall x,y \in \Sigma^* : t_M(xy) \ge t_M(x) + t_M(y)$

2) $\forall a \in \Sigma: t_M(a) \ge 1$ (property 2 should be obvious for every DTM).

Then it follows that for all $x \in \Sigma^* : t_M(x) \ge |x|$. The graph $G_M$ induced by the transition function contains a cycle: To see this, choose a word $w$ whose length $|w|$ is $> |Q|$, where $Q$ is the set of states of $M$. Then we have $t_M(w) \ge |w| > |Q|$. Since $M$ is in exactly one state at every time step, $M$ must visit at least one state twice in $t_M(w) > |Q|$ time steps, hence the graph $G_M$ must contain a cycle.

My question is this: Can we construct for every DTM $M'$ an equivalent DTM $M$ with the properties above? My intuition says this is possible: Just construct $M$ such that it reads all the input, writes what it has read, moves the pointer to the beginning of the word and then gives control to $M'$. But is it possible to give a more formal proof for this? Or is my intuition wrong?

#### Approximability of convex programming (convex optimization)

Convex optimization is defined here: http://cstheory.stackexchange.com/questions/22314/definition-of-convex-optimization-problem-by-stephen-boyd-and-lieven-vandenbergh

But is anything known about the approximation complexity of the problem? Does it have a PTAS (poly time approximation scheme)? Or is there a proof that a PTAS is impossible unless P=NP? Are there any known results on upper/lower bounds on its approximability?

Any information will be much appreciated.

#### Hash multiple integers directly using FNV-1a

An alternative version of the FNV-1a hash is circulating on the internet; it operates directly on integers instead of bytes. The offset basis and prime are the same as those used in the original version, which operates on bytes.

With this version, is the statistical quality of the produced hash similar to the original algorithm?

Alternative version operating on integers:

#include <cstdint>

#define OFFSET_BASIS 2166136261ul
#define FNV_PRIME 16777619ul

uint32_t hash(uint32_t i, uint32_t j, uint32_t k)
{
    return ((((((OFFSET_BASIS ^ i) * FNV_PRIME) ^ j) * FNV_PRIME) ^ k) * FNV_PRIME);
}


Original version operating on bytes:

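#include <cstddef>
#include <cstdint>

#define OFFSET_BASIS 2166136261ul
#define FNV_PRIME 16777619ul

uint32_t hash(const char* data, size_t bytes)
{
    uint32_t h = OFFSET_BASIS;

    for (size_t i = 0; i < bytes; ++i)
    {
        // Cast to unsigned char so that bytes >= 0x80 are not sign-extended.
        h = (h ^ static_cast<unsigned char>(data[i])) * FNV_PRIME;
    }

    return h;
}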
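To make the two variants easy to compare side by side, here is a minimal Python sketch of both. The names `fnv1a_bytes`, `fnv1a_ints` and `MASK32` are made up for this illustration, and the explicit mask stands in for the 32-bit wraparound that `uint32_t` gives in C++. It says nothing about statistical quality; it only shows that the two versions perform the same xor-then-multiply step and differ in how much input each step consumes.

```python
OFFSET_BASIS = 2166136261  # 32-bit FNV offset basis
FNV_PRIME = 16777619       # 32-bit FNV prime
MASK32 = 0xFFFFFFFF        # emulates uint32_t wraparound


def fnv1a_bytes(data):
    """Byte-wise FNV-1a: one xor-multiply round per input byte."""
    h = OFFSET_BASIS
    for byte in data:
        h = ((h ^ byte) * FNV_PRIME) & MASK32
    return h


def fnv1a_ints(*values):
    """Integer variant: one xor-multiply round per 32-bit integer."""
    h = OFFSET_BASIS
    for value in values:
        h = ((h ^ (value & MASK32)) * FNV_PRIME) & MASK32
    return h
```

For integers below 256 the two functions agree, because xor-ing a one-byte value is the same operation in both; for larger integers the variant folds four bytes into a single round instead of four.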


### StackOverflow

#### Fast Information Gain computation

I need to compute Information Gain scores for >100k features in >10k documents for text classification. The code below works fine, but on the full dataset it is very slow: it takes more than an hour on a laptop. The dataset is 20newsgroup and I am using scikit-learn; the chi2 function provided in scikit-learn works extremely fast.

Any idea how to compute Information Gain faster for such a dataset?

import numpy as np

def information_gain(x, y):

    def _entropy(values):
        counts = np.bincount(values)
        probs = counts[np.nonzero(counts)] / float(len(values))
        return - np.sum(probs * np.log(probs))

    def _information_gain(feature, y):
        feature_set_indices = np.nonzero(feature)[1]
        feature_not_set_indices = [i for i in feature_range if i not in feature_set_indices]
        entropy_x_set = _entropy(y[feature_set_indices])
        entropy_x_not_set = _entropy(y[feature_not_set_indices])

        return entropy_before - (((len(feature_set_indices) / float(feature_size)) * entropy_x_set)
                                 + ((len(feature_not_set_indices) / float(feature_size)) * entropy_x_not_set))

    feature_size = x.shape[0]
    feature_range = range(0, feature_size)
    entropy_before = _entropy(y)
    information_gain_scores = []

    for feature in x.T:
        information_gain_scores.append(_information_gain(feature, y))
    return information_gain_scores, []


EDIT:

I merged the internal functions and ran cProfiler as below (on a dataset limited to ~15k features and ~1k documents):

cProfile.runctx(
    """for feature in x.T:
    feature_set_indices = np.nonzero(feature)[1]
    feature_not_set_indices = [i for i in feature_range if i not in feature_set_indices]

    values = y[feature_set_indices]
    counts = np.bincount(values)
    probs = counts[np.nonzero(counts)] / float(len(values))
    entropy_x_set = - np.sum(probs * np.log(probs))

    values = y[feature_not_set_indices]
    counts = np.bincount(values)
    probs = counts[np.nonzero(counts)] / float(len(values))
    entropy_x_not_set = - np.sum(probs * np.log(probs))

    result = entropy_before - (((len(feature_set_indices) / float(feature_size)) * entropy_x_set)
                               + ((len(feature_not_set_indices) / float(feature_size)) * entropy_x_not_set))
    information_gain_scores.append(result)""",
    globals(), locals())


Result top 20 by tottime:

ncalls  tottime percall cumtime percall filename:lineno(function)
1       60.27   60.27   65.48   65.48   <string>:1(<module>)
16171   1.362   0   2.801   0   csr.py:313(_get_row_slice)
16171   0.523   0   0.892   0   coo.py:201(_check)
16173   0.394   0   0.89    0   compressed.py:101(check_format)
210235  0.297   0   0.297   0   {numpy.core.multiarray.array}
16173   0.287   0   0.331   0   compressed.py:631(prune)
16171   0.197   0   1.529   0   compressed.py:534(tocoo)
16173   0.165   0   1.263   0   compressed.py:20(__init__)
16171   0.139   0   1.669   0   base.py:415(nonzero)
16171   0.124   0   1.201   0   coo.py:111(__init__)
32342   0.123   0   0.123   0   {method 'max' of 'numpy.ndarray' objects}
48513   0.117   0   0.218   0   sputils.py:93(isintlike)
32342   0.114   0   0.114   0   {method 'sum' of 'numpy.ndarray' objects}
16171   0.106   0   3.081   0   csr.py:186(__getitem__)
32342   0.105   0   0.105   0   {numpy.lib._compiled_base.bincount}
32344   0.09    0   0.094   0   base.py:59(set_shape)
210227  0.088   0   0.088   0   {isinstance}
48513   0.081   0   1.777   0   fromnumeric.py:1129(nonzero)
32342   0.078   0   0.078   0   {method 'min' of 'numpy.ndarray' objects}
97032   0.066   0   0.153   0   numeric.py:167(asarray)


It looks like most of the time is spent in _get_row_slice. I am not entirely sure about the first row; it seems to cover the whole block I passed to cProfile.runctx, but I don't know why there is such a big gap between the first line (tottime=60.27) and the second one (tottime=1.362). Where was the difference spent? Is it possible to check this in cProfile?

Basically, it looks like the problem is with sparse matrix operations (slicing, getting elements), so the solution would probably be to calculate Information Gain using matrix algebra (as chi2 is implemented in scikit-learn). But I have no idea how to express this calculation in terms of matrix operations. Does anyone have an idea?
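One way to see what a count-based formulation could look like is the following pure-Python sketch. It is an illustration under assumptions, not the scikit-learn approach: documents are given as sets of feature indices (a stand-in for sparse rows), and the names `entropy` and `information_gain_counts` are made up here. The idea is that the class counts for "feature not set" can be obtained by subtracting the "feature set" counts from the overall class counts, so no list of not-set document indices is ever built. In the original code, the test `i not in feature_set_indices` scans an array for every document and feature, which is a plausible candidate for the time attributed to the first profile row.

```python
import math
from collections import Counter


def entropy(counts, total):
    """Shannon entropy (natural log) of a class-count collection."""
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def information_gain_counts(docs, labels, n_features):
    """docs[i] is the set of feature indices present in document i."""
    n_docs = len(docs)
    class_totals = Counter(labels)
    entropy_before = entropy(class_totals.values(), n_docs)

    # One pass over the nonzero entries: class counts per feature
    # among the documents that contain the feature.
    set_counts = [Counter() for _ in range(n_features)]
    for features, label in zip(docs, labels):
        for f in features:
            set_counts[f][label] += 1

    scores = []
    for counts in set_counts:
        n_set = sum(counts.values())
        n_not = n_docs - n_set
        # "Not set" class counts follow by subtraction from the totals.
        not_counts = [class_totals[c] - counts[c] for c in class_totals]
        h_set = entropy(counts.values(), n_set)
        h_not = entropy(not_counts, n_not)
        scores.append(entropy_before
                      - (n_set / n_docs) * h_set
                      - (n_not / n_docs) * h_not)
    return scores
```

The same subtraction trick is what a matrix formulation would exploit: the per-feature, per-class counts are a single product of the transposed document-feature matrix with a one-hot label matrix.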

#### Array of literal Objects without duplicates in ES6 using Set

The code to get an array without repeated items has become elegant since ES6:

[...new Set(array)];


That's it!

However, this is only removing duplicates if the array has elements with a primitive data type (string, boolean, number, ...).

What about a Set of object literals? How to make that work without getting duplicates, using a syntax close to the syntax used above?

var array=["aaa","bbb","aaa","cc","aaa","bbb"];
var out=[...new Set(array)];
console.log(out)

//----Literal Object

array=[{n:"J",last:"B"},{n:"J",last:"B"}];
out=[...new Set(array)];
console.log(out)

The code above produces a set with 2 elements, yet I want it to only have one in this case.

I could use serialize/de-serialize methodology to achieve this:

[...new Set(array.map(
  //-- SERIALIZE:
  (e) => `${e.n}:${e.last}`
))].map(
  //-- DE-SERIALIZE:
  (e) => ({ n: `${e.split(':')[0]}`, last: `${e.split(':')[1]}` })
)


However, I am looking for an ES6 built-in.

### CompsciOverflow

#### Data Compression Algorithm for Less repetitive pattern (redundancy) [duplicate]

This question is an exact duplicate of:

Context:

Lossless data compression (source coding) algorithms rely heavily on repetitive patterns (redundancy).

Questions

Which data compression method/algorithm deals specifically with data that has few repetitive patterns (little redundancy)?

### QuantOverflow

#### Systematic credit-risk factor estimation / retail portfolio

I have a question in the field of credit risk models. I work in a small bank, and we are planning to establish the IRB Approach (Credit Metrics). For this reason, I need to estimate systematic risk factors. I have a good idea how to do this for corporate clients (e.g. perform a regression analysis of the individual stock price against market development), but I am seeking some advice regarding retail business.

Does anyone have an idea how to estimate the systematic risk factor for a retail credit portfolio? Any reference to literature would also be very welcome!

Thx & Regards Georg

### TheoryOverflow

#### Is the infinitely-often version of Ladner's theorem known?

We say two languages $L, L' \subseteq \{0,1\}^*$ agree infinitely-often with each other
if and only if there are infinitely many $n$ such that $L \cap \{0,1\}^n = L' \cap \{0,1\}^n$.

For a language $L$ let io-$L$ be the set of languages which agree infinitely-often with $L$.
Let io-P be the set of languages that agree infinitely-often with some language in P.

Let io-NPH be the infinitely-often version of NPH (NP-hard w.r.t. Cook reductions):
$L \in$ io-NPH iff for all $L' \in$ NP, some language in io-$L'$ is polynomial-time Turing reducible to $L$.

Is "io-P does not contain NP" known to imply that "io-P $\cup$ io-NPH does not contain NP"?

### QuantOverflow

#### Fit Simple VAR model in Matlab

I've been trying to fit the following model in Matlab:

$\beta_{t}=a+Mt+A\beta_{t-1}+\epsilon_{t}$

Where a is a constant, M is a vector of trend parameters and A a cross-factor interaction matrix. I've been looking at vgxset but it doesn't have the option to add a trend estimation.

Any ideas? Thanks,

### StackOverflow

#### Tensorflow: curve fitting with logistic regression

I have a dataset where each row is a (x, y) tuple. So, each row is a point of a curve in the X-Y plane. I would like to do logistic regression for it.

Following the examples given here, I have created the model in the following chunk of code.

# tf Graph Input
X = tf.placeholder("float")
Y = tf.placeholder("float")

# Set model weights
W = tf.Variable(rng.randn(), name="weight")
b = tf.Variable(rng.randn(), name="bias")

# Construct a logistic model
pred = tf.nn.softmax(tf.mul(x, W) + b) # Softmax


I am getting the following error in the last line.

ValueError: Shape () must have rank 2

I have two 1D vectors one for the X values and the other for the Y values. I am not sure why I should have something with the shape of rank 2.

### QuantOverflow

#### Show that if an Arrow price vector $\pi$ exists, then the law of one price holds

Now, the proof I have read goes like this:

Take assets 1 and 2, entirely identical. By assumption there is a pricing vector, i.e. $\sum_s\pi_sd^1_s=q^1$ and $\sum_s\pi_sd^2_s=q^2$ where $d^i_j$ is the payoff of asset $i$ in state $j$ and $\pi_i$ is the $i$-th element of the pricing vector $\vec{\pi}$. Since $d^1_s=d^2_s$ for all $s$ (identical assets), we must have $q^1=q^2$.
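Written out in one line, the step above is just linearity in the payoffs. This only restates the quoted argument for any one fixed $\vec{\pi}$ that prices both assets and assumes nothing beyond it:

$$q^1 - q^2 = \sum_s \pi_s d^1_s - \sum_s \pi_s d^2_s = \sum_s \pi_s \,(d^1_s - d^2_s) = 0.$$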

My question about this proof is that the assumption does not say only one $\vec{\pi}$ exists. It seems possible to me that there can be another price vector and we would have got a different price for the same asset. For example, price vector $(1,2,3)$ and $(2,3,4)$.

### StackOverflow

#### does call with current continuation ignore its own continuation?

call_cc :: ((a -> (b -> r) -> r) -> (a -> r) -> r) -> (a -> r) -> r
call_cc f k = f (\t x -> k t) k


As described by this signature and implementation.

However, we can see that the x parameter is never used here. Does this mean that any continuation passed to f is always ignored, and that the initial continuation k always replaces it? In that case, does it mean that call-with-cc can only ever call a function that is one level deep and no more (because the next function that would be called in a normal control flow with a continuation x is ignored)?

In that case it seems very limiting, what is its practical use?

### StackOverflow

#### Machine Learning Algorithm selection

I am new to machine learning. My problem is to make a machine select a university for a student according to the student's location and area of interest, i.e. it should select a university in the same city as the student's address. I am unsure which algorithm to choose. Can I use the Perceptron algorithm for this task?

### QuantOverflow

#### ARIMA prediction for currencies

I was browsing TradingEconomics.com and I came across their forecast models which immediately captivated my interest. They describe them as "projected using an autoregressive integrated moving average (ARIMA) model calibrated using our analysts expectations. We model the past behaviour of Japanese Yen using vast amounts of historical data and we adjust the coefficients of the econometric model by taking into account our analysts assessments and future expectations."

Here is a picture below: So, my question is: how can I create the band throughout the whole data set? From what I have seen thus far in YouTube tutorials, ARIMA is only forward-looking; it doesn't go through the whole data set. Also, can this be done in R? Thank you very much for your time!

### TheoryOverflow

#### Why is it impossible to work with polylog length encoding schemes for quantum circuits?

I am going through Quantum Computational Complexity by John Watrous. On page $12$, he said:

The encoding disallows compression: it is not possible to work with encoding schemes that allow for extremely short (e.g., polylogarithmic-length) encodings of circuits; so for simplicity it is assumed that the length of every encoding of a quantum circuit is at least the size of the circuit.

My question:

Why is it impossible to work with polylogarithmic-length encoding schemes for quantum circuits?

### StackOverflow

#### What are some machine learning algorithms [on hold]

I'm kinda confused about machine learning: is classification in machine learning an algorithm? And are supervised and unsupervised learning algorithms or types of ML? What are some machine learning algorithms?

### Wes Felter

#### SDxCentral: Mirantis Pegs OpenStack's Future to Kubernetes

SDxCentral: Mirantis Pegs OpenStack's Future to Kubernetes:

Again, what is OpenStack needed for in such a setup? And didn’t OpenStack decide that Go code is not allowed?

### CompsciOverflow

#### Graph isomorphism problem for labeled graphs

In the case of unlabeled graphs, the graph isomorphism problem can be tackled by a number of algorithms which perform very well in practice. That is, although the worst case running time is exponential, one usually has a polynomial running time.

I was hoping that the situation is similar in the case of labeled graphs. However, I have a really hard time finding any reference which proposes a "practically efficient" algorithm.

Remark: Here, we require that the isomorphism preserves the labels. That is, an isomorphism between two finite automata/process algebra terms would imply that the automata/terms are essentially "equal up to renaming of the nodes".

The only reference I found was the one in Wikipedia that states that the isomorphism problem of labeled graphs can be polynomially reduced to that of ordinary graphs. The underlying paper, however, is more about complexity theory than practical algorithms.

Am I missing something, or is it really the case that there are no efficient "heuristic" algorithms to decide whether two labeled graphs are isomorphic?

Any hint or reference would be great.
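To make "preserves the labels" concrete, here is a small Python sketch of colour refinement seeded with the node labels. This is a swapped-in, well-known heuristic and not something taken from the references mentioned above; the function name `may_be_isomorphic` and the input format (undirected adjacency dictionaries plus a node-to-label dictionary) are assumptions made for the illustration. It is only a necessary-condition filter: a `False` result proves the two labeled graphs are not isomorphic, while `True` means the test could not tell them apart. Edge labels are not handled.

```python
from collections import Counter


def may_be_isomorphic(adj1, labels1, adj2, labels2):
    """Return False if colour refinement separates the two labeled graphs."""
    # Work on the disjoint union so colours are comparable across graphs.
    adj = {(1, v): [(1, u) for u in nbrs] for v, nbrs in adj1.items()}
    adj.update({(2, v): [(2, u) for u in nbrs] for v, nbrs in adj2.items()})
    raw = {(1, v): labels1[v] for v in adj1}
    raw.update({(2, v): labels2[v] for v in adj2})

    # Initial colours are the node labels, mapped to small integers.
    palette = {lab: i for i, lab in enumerate(sorted(set(raw.values()), key=repr))}
    colour = {n: palette[raw[n]] for n in adj}

    while True:
        # Signature: own colour plus the sorted colours of the neighbours.
        sig = {n: (colour[n], tuple(sorted(colour[u] for u in adj[n])))
               for n in adj}
        palette = {s: i for i, s in enumerate(sorted(set(sig.values())))}
        refined = {n: palette[sig[n]] for n in adj}
        if len(set(refined.values())) == len(set(colour.values())):
            break  # no class was split, so the partition is stable
        colour = refined

    hist1 = Counter(c for (g, _), c in colour.items() if g == 1)
    hist2 = Counter(c for (g, _), c in colour.items() if g == 2)
    return hist1 == hist2
```

For example, two three-node paths labeled x-y-x and x-x-y carry the same multiset of labels, yet the refinement separates them after one round because the neighbourhoods of the x nodes differ.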

### QuantOverflow

#### Send TRAIL STOP order when price hits a certain level, with IB TWS

Posting here after searching around and not finding any responses to basically the same question that I saw on EliteTrader, with another variant posted 10 years ago (update: the same question on Money):

Say I bought X at 100 and want to have TWS automatically submit a TRAIL STOP order when the price reaches 110, with a trailing amount of 2. Assume I don't care how long it takes for the price to reach that level. How can I do that?

Example:

1. X is now at 95. The order should wait until X hits 110. No order parameters should change.
2. X hits 110. Trailing order is submitted, with trailing amount 2.
3. X goes to 115. Great. Stop price is now 113.
4. X drops to 112. Market order triggers.
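The four steps above can be written down as a tiny state machine. The Python sketch below only simulates the desired behaviour on a list of prices; it does not use or represent any TWS or IB API, and the function name and parameters are made up for the illustration.

```python
def simulate_triggered_trailing_stop(prices, trigger, trail):
    """Return (step, price, stop) when the stop fires, or None if it never does."""
    armed = False
    stop = None
    for step, price in enumerate(prices):
        if not armed:
            # Before the trigger level is hit, nothing changes.
            if price >= trigger:
                armed = True
                stop = price - trail
            continue
        if price <= stop:
            # Stop touched: this is where the market order would be sent.
            return step, price, stop
        # The stop only ever moves up, never down.
        stop = max(stop, price - trail)
    return None
```

With the prices from the example, `simulate_triggered_trailing_stop([95, 110, 115, 112], trigger=110, trail=2)` returns `(3, 112, 113)`: the order is armed at 110 with a stop of 108, the stop rises to 113 at 115, and it fires at 112.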

I tried attaching a condition, but got this error:

Conditional submission of orders is supported for Limit, Market, Relative and Snap order types only.

The goal is to lock in some profit and have unlimited upside, at the expense of downside risk. A Trailing Market if Touched sounds right for selling, but apparently isn't:

A sell trailing market if touched order moves with the market price, and continually recalculates the trigger price at a fixed amount above the market price, based on the user-defined "trailing" amount.

This goes against what I want for step 1 above.

Same for Trailing Limit if Touched:

As the market price falls, the trigger price falls by the user-defined trailing amount, but if the price rises, the trigger price remains the same. When the trigger price is touched, a market order is submitted.

I don't want the trigger price to fall. It should stay 110.

### CompsciOverflow

#### Is the halting problem decidable for 3 symbol one dimensional cellular automata?

I've been trying to figure out if the halting problem is decidable for 3-symbol one-dimensional cellular automata.

Definition. Let $f(w,i)$ denote the configuration of the system at time step $i$. More formally $f:A^*\times \mathbb{N} \to A^*$, where $A$ is the alphabet.

Definition. A cellular automaton has halted in configuration $f(w,i)$, if $\forall k\in \mathbb{N}$ we have that $f(w,i)=f(w,i+k)$.

The halting problem for a given cellular automaton is as follows:

Input: a finite word $w$
Question: will the automaton halt in some state $s$?

Elementary cellular automata (with 2 symbols) are defined here. I am focused on the same sort of cellular automata, except that I'm interested in the case of CAs with 3 symbols instead of just 2 symbols.

From now on, I will denote my rules in the form of $***\to*$, meaning that 3 neighboring symbols produce another one beneath them.

### The halting problem is decidable for elementary, 2-symbol cellular automata

I will use $0$ to denote a white cell and $1$ to denote a black one.

If we have rules $000 \to 1$, $001 \to 1$, $100 \to 1$ we know the automaton won't halt. Because with the first rule, since our grid is infinite, we will always have 3 white cells that will generate a black cell. With the second and 3rd rules the word will be expanding to the sides and the automaton will never halt.

In the rest of the cases we can let it evolve for $2^n$ steps and see if it halts. If it does halt, then ok, it halts, if it doesn't then it is repeating some combinations and is stuck in a loop, so we can also conclude that it won't halt.

### What I have figured out for the 3 symbol case

It is obvious that it won't halt if we have rules $000 \to 1$ or $000 \to 2$. But the side rules of the form $00x \to y$ and $x00 \to y$ are harder to analyze, because what if we have rules $002 \to 1$ and $001 \to 0$?

Here's what I came up with:

let's consider all combinations of such rules:

1. $001 \to 0$ and $002 \to 0$
2. $001 \to 0$ and $002 \to 1$
3. $001 \to 0$ and $002 \to 2$
4. $001 \to 1$ and $002 \to 0$
5. $001 \to 1$ and $002 \to 1$
6. $001 \to 1$ and $002 \to 2$
7. $001 \to 2$ and $002 \to 0$
8. $001 \to 2$ and $002 \to 1$
9. $001 \to 2$ and $002 \to 2$

I didn't write the cases for the rules of the form $x00 \to y$, because those are symmetrical.

So, in the first case it's obvious that the input word won't be expanding to the sides, because those side symbol rules produce zeros.

In cases 5, 6, 8, 9 it's obvious that the automaton will never halt, because the input word will be expanding.

Cases 2,3,4,7 are more interesting. First, let's note, that case 2 is similar to case 7 and case 3 is similar to case 4. So, let's just consider cases 2 and 3 for conciseness.

I'm gonna consider case 3 first, because it's easier.

We have $001 \to 0$ and $002 \to 2$. It is obvious that if the first or last symbol of our input word is $2$, then we can conclude that the automaton won't halt. But if they are '1', then we have to look at more stuff, in particular, let's look at rules that can turn the last or first symbols into $2$, because if we have those, then after they do produce that $2$, we can conclude that the automaton won't halt. (the word will be expanding to the side(s)).

Here are all combinations that we need to consider:

010 011 012
0   0   0
0   0   1
0   0   2
0   1   0
0   1   1
........... etc


### An explanation of what happens if we have the first triple from the above table

We have a word $w$, written on the grid. The first and last symbols are $1$. Let's say we have rules $010 \to 0$, $011 \to 0$, $012 \to 0$ (the first triple) from above. Then we know that with each next step our input word will be getting smaller by 2 symbols, because these rules erase the first and last symbols, but if at some point we get a $2$, then the rule $002 \to 2$ will make the word grow to one side or the other (or both) and the automaton will never halt. So, all in all, in this case we can let the automaton do $|w|/2$ steps, and if the word becomes empty, then the automaton halts, if not, then it doesn't.

### Generalized case 3

I generalized it and noticed that we can simply let the automaton do $3^n$ steps and if at any one of those steps we have a $2$ as first or last symbol, then the automaton won't halt. If that doesn't happen and the automaton still didn't halt, then it's repeating some configuration, so it's stuck in a loop and won't halt. If it halts, then it halts.
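The "run it and watch for a fixed point or a repeated configuration" idea can be tried out with a small simulator. The Python sketch below is an experiment aid under assumptions, not a decision procedure: it requires $000 \to 0$ so that the infinite background stays blank, it represents a configuration as a position offset plus the trimmed word, and the names `make_rule`, `normalize`, `step` and `classify` are made up here. It reports "unknown" when neither a fixed point nor a repetition shows up within the step budget, which is exactly the situation of an ever-expanding word.

```python
from itertools import product


def make_rule(fn):
    """Build the 27-entry rule table from a function of (left, centre, right)."""
    return {t: fn(*t) for t in product(range(3), repeat=3)}


def normalize(offset, cells):
    """Trim blank (0) cells on both sides, tracking where the word starts."""
    cells = list(cells)
    while cells and cells[0] == 0:
        cells.pop(0)
        offset += 1
    while cells and cells[-1] == 0:
        cells.pop()
    return (offset if cells else 0, tuple(cells))


def step(config, rule):
    """Apply the rule once; the word can grow by one cell on each side."""
    offset, cells = config
    padded = (0, 0) + cells + (0, 0)
    new = tuple(rule[padded[i:i + 3]] for i in range(len(padded) - 2))
    return normalize(offset - 1, new)


def classify(word, rule, max_steps):
    """Return 'halts', 'loops' (never halts), or 'unknown' within max_steps."""
    assert rule[(0, 0, 0)] == 0, "background must stay blank"
    config = normalize(0, tuple(word))
    seen = {config}
    for _ in range(max_steps):
        nxt = step(config, rule)
        if nxt == config:
            return "halts"   # fixed point: f(w,i) = f(w,i+k) for all k
        if nxt in seen:
            return "loops"   # revisits an earlier configuration
        seen.add(nxt)
        config = nxt
    return "unknown"
```

For instance, a rule that maps every triple to 0 gives `classify([1, 2, 1], rule, 10) == "halts"`, a rule that swaps 1 and 2 in place gives `"loops"` on the word `[1]`, and the expanding rule `make_rule(max)` gives `"unknown"` on `[2]` for any finite budget.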

### Where I get stuck

Now let's consider case 2.

We have rules $001 \to 0$ and $002 \to 1$.

And here is where I got stuck and don't know what to do.

I also wrote out a table of rules that start with $1$. I wrote those out because they seemed to be the first thing I should analyze: even if we have the input word with first or last (or both) symbol as $2$, at the next step those $2$'s will turn into a $1$, and we will have to deal with rules of the form $01x \to y$.

Here's the table:

010 011 012
0   0   0
0   0   1
0   0   2
0   1   0
0   1   1
0   1   2
0   2   0
0   2   1
0   2   2
1   0   0
1   0   1
1   0   2
1   1   0
1   1   1
1   1   2
1   2   0
1   2   1
1   2   2
2   0   0
2   0   1
2   0   2
2   1   0
2   1   1
2   1   2
2   2   0
2   2   1
2   2   2


It is also obvious, that if among our 27 rules, we have a triple from this table in which no rule derives a $2$, then we have nothing to worry about and can simply let the automaton evolve for $3^n$ steps, because it won't really expand, since the side rules will not produce a $2$.

But looking at the triples that do have a $2$, it's actually very hard to analyze, and whether the word will expand or not also seems to depend on the input word.

Can you guys tell me how to solve this? I can't seem to wrap my head around this.

Or, if this 3 symbol cellular automaton looks like something for which the halting problem has been proven to be undecidable, how can I reduce that something to 3 symbol cellular automata?

### StackOverflow

#### Keras. ValueError: I/O operation on closed file

My code:

model = Sequential()

model.compile(optimizer='rmsprop',
loss='binary_crossentropy',
metrics=['accuracy'])

X_train_shape = X_train.reshape(len(X_train), 1)
Y_train_shape = Y_train.reshape(len(Y_train), 1)
model.fit(X_train, Y_train, nb_epoch=5, batch_size=32)


And I get an error. It is somewhat random, and sometimes one or two epochs complete first:

Epoch 1/5 4352/17500 [======>.......................]

--------------------------------------------------------------------------- ValueError Traceback (most recent call last) in () 2 # of 32 samples 3 #sleep(0.1) ----> 4 model.fit(X_train, Y_train, nb_epoch=5, batch_size=32) 5 #sleep(0.1)

C:\Anaconda3\envs\py27\lib\site-packages\keras\models.pyc in fit(self, x, y, batch_size, nb_epoch, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, **kwargs) 395 shuffle=shuffle, 396 class_weight=class_weight, --> 397 sample_weight=sample_weight) 398 399 def evaluate(self, x, y, batch_size=32, verbose=1,

C:\Anaconda3\envs\py27\lib\site-packages\keras\engine\training.pyc in fit(self, x, y, batch_size, nb_epoch, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight) 1009 verbose=verbose, callbacks=callbacks, 1010
val_f=val_f, val_ins=val_ins, shuffle=shuffle, -> 1011 callback_metrics=callback_metrics) 1012 1013 def evaluate(self, x, y, batch_size=32, verbose=1, sample_weight=None):

C:\Anaconda3\envs\py27\lib\site-packages\keras\engine\training.pyc in _fit_loop(self, f, ins, out_labels, batch_size, nb_epoch, verbose, callbacks, val_f, val_ins, shuffle, callback_metrics) 753 batch_logs[l] = o 754 --> 755 callbacks.on_batch_end(batch_index, batch_logs) 756 757 epoch_logs = {}

C:\Anaconda3\envs\py27\lib\site-packages\keras\callbacks.pyc in on_batch_end(self, batch, logs) 58 t_before_callbacks = time.time() 59 for callback in self.callbacks: ---> 60 callback.on_batch_end(batch, logs) 61 self._delta_ts_batch_end.append(time.time() - t_before_callbacks) 62 delta_t_median = np.median(self._delta_ts_batch_end)

C:\Anaconda3\envs\py27\lib\site-packages\keras\callbacks.pyc in on_batch_end(self, batch, logs) 187 # will be handled by on_epoch_end 188 if self.verbose and self.seen < self.params['nb_sample']: --> 189 self.progbar.update(self.seen, self.log_values) 190 191 def on_epoch_end(self, epoch, logs={}):

C:\Anaconda3\envs\py27\lib\site-packages\keras\utils\generic_utils.pyc in update(self, current, values) 110 info += ((prev_total_width - self.total_width) * " ") 111 --> 112 sys.stdout.write(info) 113 sys.stdout.flush() 114

C:\Anaconda3\envs\py27\lib\site-packages\ipykernel\iostream.pyc in write(self, string) 315 316 is_child = (not self._is_master_process()) --> 317 self._buffer.write(string) 318 if is_child: 319 # newlines imply flush in subprocesses

ValueError: I/O operation on closed file

### QuantOverflow

#### Nasdaq 100 Index Liberty Media Tracking Stocks

Having trouble getting the exact changes to Nasdaq 100 Index for Liberty Media split. What were the Liberty Media related stocks in Nasdaq 100 before April 18th 2016 and then after April 18th 2016 ?

April 18th. Liberty Media splits to have 3 stocks: Liberty SiriusXM (LSXMA) Liberty Atlanta Braves (BATRA) Liberty Media (LMCA) Reference: http://seekingalpha.com/article/3966248-liberty-media-stock-happened

June 20. Dentsply Sirona (XRAY) becomes member of N100 replacing LMCA LMCK BATRA BATRK Reference: http://www.nasdaq.com/press-release/dentsply-sirona-inc-to-join-the-nasdaq100-index-beginning-june-20-2016-20160610-00582

### TheoryOverflow

#### Approximability of convex programming (convex optimization) [on hold]

Apologies for multiple posting (also posted at cs.SE), but I think this question is more relevant here at cstheory:

Convex optimization is defined here. The problem is NP-hard.

But is anything known about the approximation complexity of the problem? Does it have a PTAS (poly time approximation scheme)? Or is there a proof that a PTAS is impossible unless P=NP? Are there any known results on upper/lower bounds on its approximability?

Any information will be much appreciated. This thread is relevant, but not quite there.

### StackOverflow

#### What is denotational semantics?

I am looking for an accurate and understandable definition. The ones I have found differ from each other:

• From a book on functional reactive programming

Denotational semantics is a mathematical expression of the formal meaning of a programming language.

• However, wikipedia refers to it as an approach and not a math expression

Denotational semantics is an approach of formalizing the meanings of programming languages by constructing mathematical objects (called denotations) that describe the meanings of expressions from the languages

#### How does one keep track of the training error of a Neural Network when using Batch Normalization in TensorFlow?

I wanted to keep track of my training error as the Neural Network is trained. During testing, it is customary to remove the batch normalization layer. For example:

# when test
# is_training determines if Batch-norm is off or on
error = sess.run([opt, loss], feed_dict={x: bx, y: by, is_training: False})


However, say that I want to track the training error of a Neural Network as the number of iterations increases. For simplicity, assume that the data sets are large enough to train the model and small enough that computing the error on the entire data set is feasible (I know that it's possible to use batches or moving averages to make computations more efficient, but that is beside the point of my question). In this case, is the correct way to track the training error to turn off batch normalization, as follows:

bx, by = X_train, Y_train
train_error = sess.run([opt, loss], feed_dict={x: bx, y: by, is_training: False})


i.e. should we use the data that we trained but turn off batch normalization?

Notice that during training, is_training should invariably be True, i.e. the training step is:

bx, by = get_batch(X_train, Y_train)
b_error = sess.run([opt, loss], feed_dict={x: bx, y: by, is_training: True})


However, what confuses me is whether we should report b_error or train_error. In other words, when I want to track my training error, should the batch normalization layer be off or on? Obviously it should be on during training and off when I pass in the test set, but when I want to report the train error during training, should it be off?

Notice that it's obvious that the batch normalization layer should be off when passing the test or cross validation (CV) data:

bx, by = X_test, Y_test # or X_cv, Y_cv
test_error = sess.run([opt, loss], feed_dict={x: bx, y: by, is_training: False})


#### Avoid using global variable in Java 8 stream reduce method

I am trying to use Java 8 to rewrite the implementation of Moore’s Voting Algorithm to find the Majority Element in an array.

The Java 7 implementation will be something like this:

public int findCandidate(int[] nums) {

    int maj_index = 0, count = 1;
    for(int i=1; i<nums.length;i++){
        if(count==0){
            count++;
            maj_index=i;
        }else if(nums[maj_index]==nums[i]){
            count++;
        } else {
            count--;
        }
    }
    return nums[maj_index];
}


The method I can think of is using stream reduce to get the final result

public int findCandidate(int[] nums) {
    int count = 1;
    Arrays
        .asList(nums)
        .stream()
        .reduce(0, (result, cur) -> {
            if (count == 0) {
                result = cur;
                count++;
            } else if (result == cur){
                count++;
            } else {
                count --;
            }
        });
    return result;
}


But this method has a compile error; besides, it also breaks functional purity. I encounter this situation many times, so what is the best way to deal with a global variable inside a lambda expression?
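One common way to avoid the outside variable is to carry the counter inside the accumulator, so the fold's state is a (candidate, count) pair. The sketch below shows that idea in Python rather than Java, purely as an illustration; the name `find_candidate` and the use of `functools.reduce` are choices made for this example, and it is not claimed to be the idiomatic Java 8 solution.

```python
from functools import reduce


def find_candidate(nums):
    """Moore's voting algorithm as a fold over (candidate, count) state."""
    def vote(state, cur):
        candidate, count = state
        if count == 0:
            return (cur, 1)              # adopt a new candidate
        if cur == candidate:
            return (candidate, count + 1)
        return (candidate, count - 1)

    candidate, _ = reduce(vote, nums, (None, 0))
    return candidate
```

Because `vote` returns a new pair instead of mutating anything, there is no shared variable left for the lambda to capture. Note that this fold depends on processing the elements in order, just like the loop version.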

### arXiv Discrete Mathematics

#### Symmetry-free SDP Relaxations for Affine Subspace Clustering. (arXiv:1607.07387v1 [math.OC])

We consider clustering problems where the goal is to determine an optimal partition of a given point set in Euclidean space in terms of a collection of affine subspaces. While there is vast literature on heuristics for this kind of problem, such approaches are known to be susceptible to poor initializations and getting trapped in bad local optima. We alleviate these issues by introducing a semidefinite relaxation based on Lasserre's method of moments. While a similar approach is known for classical Euclidean clustering problems, a generalization to our more general subspace scenario is not straightforward, due to the high symmetry of the objective function that weakens any convex relaxation. We therefore introduce a new mechanism for symmetry breaking based on covering the feasible region with polytopes. Additionally, we introduce and analyze a deterministic rounding heuristic.

#### Spark Parameter Tuning via Trial-and-Error. (arXiv:1607.07348v1 [cs.DC])

Spark has been established as an attractive platform for big data analysis, since it manages to hide most of the complexities related to parallelism, fault tolerance and cluster setting from developers. However, this comes at the expense of having over 150 configurable parameters, the impact of which cannot be exhaustively examined due to the exponential amount of their combinations. The default values allow developers to quickly deploy their applications but leave the question as to whether performance can be improved open. In this work, we investigate the impact of the most important of the tunable Spark parameters on the application performance and guide developers on how to proceed to changes to the default values. We conduct a series of experiments with known benchmarks on the MareNostrum petascale supercomputer to test the performance sensitivity. More importantly, we offer a trial-and-error methodology for tuning parameters in arbitrary applications based on evidence from a very small number of experimental runs. We test our methodology in three case studies, where we manage to achieve speedups of more than 10 times.

#### Noetherian Quasi-Polish Spaces. (arXiv:1607.07291v1 [math.GN])

In the presence of suitable power spaces, compactness can be characterized as the singleton containing the empty set being open in the hyperspace of closed subsets. Equivalently, this means that universal quantification over a compact space preserves open predicates.

Using the language of represented spaces, one can make sense of notions such as a $\Sigma^0_2$-subset of the space of $\Sigma^0_2$-subsets of a given space. This suggests higher-order analogues to compactness: We can, e.g., investigate the spaces where the singleton containing the empty set is a $\Delta^0_2$-subset of the space of $\Delta^0_2$-subsets. Call this notion $\nabla$-compactness. As $\Delta^0_2$ is self-dual, we find that both universal and existential quantification over $\nabla$-compact spaces preserve $\Delta^0_2$ predicates.

Recall that a space is called Noetherian iff every subset is compact. Within the setting of Quasi-Polish spaces, we can fully characterize the $\nabla$-compact spaces: A Quasi-Polish space is Noetherian iff it is $\nabla$-compact. Note that the restriction to Quasi-Polish spaces is sufficiently general to include plenty of examples.

#### Session Types for Link Failures (Technical Report). (arXiv:1607.07286v1 [cs.LO])

We strive to use session type technology to prove behavioural properties of fault-tolerant distributed algorithms. Session types are designed to abstractly capture the structure of (even multi-party) communication protocols. The goal of session types is the analysis and verification of the protocols' behavioural properties. One important such property is progress, i.e., the absence of (unintended) deadlock. Distributed algorithms often resemble (compositions of) multi-party communication protocols. In contrast to protocols that are typically studied with session types, they are often designed to cope with system failures. An essential behavioural property is (successful) termination, despite failures, but it is often elaborate to prove for distributed algorithms.

We extend multi-party session types with nested sessions by optional blocks that cover a limited class of link (and crash) failures. This allows us to automatically derive termination of distributed algorithms that come within these limits. To illustrate our approach, we prove termination for an implementation of the "rotating coordinator" Consensus algorithm.

#### An Evolutionary Algorithm to Learn SPARQL Queries for Source-Target-Pairs: Finding Patterns for Human Associations in DBpedia. (arXiv:1607.07249v2 [cs.AI] UPDATED)

Efficient usage of the knowledge provided by the Linked Data community is often hindered by the need for domain experts to formulate the right SPARQL queries to answer questions. For new questions they have to decide which datasets are suitable and in which terminology and modelling style to phrase the SPARQL query.

In this work we present an evolutionary algorithm to help with this challenging task. Given a training list of source-target node-pair examples our algorithm can learn patterns (SPARQL queries) from a SPARQL endpoint. The learned patterns can be visualised to form the basis for further investigation, or they can be used to predict target nodes for new source nodes.

Amongst others, we apply our algorithm to a dataset of several hundred human associations (such as "circle - square") to find patterns for them in DBpedia. We show the scalability of the algorithm by running it against a SPARQL endpoint loaded with > 7.9 billion triples. Further, we use the resulting SPARQL queries to mimic human associations with a Mean Average Precision (MAP) of 39.9 % and a Recall@10 of 63.9 %.

#### Approximating Multicut and the Demand Graph. (arXiv:1607.07200v1 [cs.DM])

In the minimum Multicut problem, the input is an edge-weighted supply graph $G=(V,E)$ and a simple demand graph $H=(V,F)$. Either $G$ and $H$ are directed (DMulC) or both are undirected (UMulC). The goal is to remove a minimum weight set of edges in $G$ such that there is no path from $s$ to $t$ in the remaining graph for any $(s,t) \in F$. UMulC admits an $O(\log k)$-approximation where $k$ is the vertex cover size of $H$ while the best known approximation for DMulC is $\min\{k, \tilde{O}(n^{11/23})\}$. These approximations are obtained by proving corresponding results on the multicommodity flow-cut gap. In contrast to these results some special cases of Multicut, such as the well-studied Multiway Cut problem, admit a constant factor approximation in both undirected and directed graphs. Motivated by both concrete instances from applications and abstract considerations, we consider the role that the structure of the demand graph $H$ plays in determining the approximability of Multicut.

In undirected graphs our main result is a $2$-approximation in $n^{O(t)}$ time when the demand graph $H$ excludes an induced matching of size $t$. This gives a constant factor approximation for a specific demand graph that motivated this work.

In contrast to undirected graphs, we prove that in directed graphs such approximation algorithms can not exist. Assuming the Unique Games Conjecture (UGC), for a large class of fixed demand graphs DMulC cannot be approximated to a factor better than worst-case flow-cut gap. As a consequence we prove that for any fixed $k$, assuming UGC, DMulC with $k$ demand pairs is hard to approximate to within a factor better than $k$. On the positive side, we prove an approximation of $k$ when the demand graph excludes certain graphs as an induced subgraph. This generalizes the Multiway Cut result to a much larger class of demand graphs.
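To make the problem statement concrete, here is a minimal brute-force sketch for the undirected case (UMulC) on tiny instances. It enumerates every edge subset, so it is exponential and has nothing to do with the approximation algorithms in the paper; the function names and the example instance are made up for illustration.

```python
from itertools import combinations


def components(n, edges):
    """Label each vertex 0..n-1 with its component root (union-find)."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v, _weight in edges:
        parent[find(u)] = find(v)
    return [find(v) for v in range(n)]


def min_multicut(n, weighted_edges, demands):
    """Exhaustively find a minimum-weight edge set separating every demand pair.

    weighted_edges: list of (u, v, weight) for the supply graph G.
    demands: list of (s, t) pairs, i.e. the edges of the demand graph H.
    Returns (weight, indices of removed edges).
    """
    best_weight, best_cut = float("inf"), None
    m = len(weighted_edges)
    for k in range(m + 1):
        for removed in combinations(range(m), k):
            kept = [e for i, e in enumerate(weighted_edges) if i not in removed]
            comp = components(n, kept)
            if all(comp[s] != comp[t] for s, t in demands):
                weight = sum(weighted_edges[i][2] for i in removed)
                if weight < best_weight:
                    best_weight, best_cut = weight, removed
    return best_weight, best_cut


if __name__ == "__main__":
    # Path 0 - 1 - 2 - 3 with edge weights 3, 1, 2.
    path = [(0, 1, 3), (1, 2, 1), (2, 3, 2)]
    print(min_multicut(4, path, [(0, 3)]))
    print(min_multicut(4, path, [(0, 3), (0, 1)]))
```

On the path example, separating only the pair (0, 3) is cheapest through the middle edge, while adding the demand (0, 1) forces the first edge to be cut, which then also separates 0 from 3.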

#### On the Hourglass Model. (arXiv:1607.07183v2 [cs.NI] UPDATED)

The hourglass model is widely used as a means of describing the design of the Internet, and can be found in the introduction of many modern textbooks. It arguably also applies to the design of other successful spanning layers, notably the Unix operating system kernel interface, meaning the primitive system calls and the interactions between user processes and the kernel. The impressive success of the Internet has led to a wider interest in using the hourglass model in other layered systems, with the goal of achieving similar results. However, application of the hourglass model has often led to controversy, perhaps in part because the language in which it has been expressed has been informal, and arguments for its validity have not been precise. Making a start on formalizing such an argument is the goal of this paper.

#### The $k$-strong induced arboricity of a graph. (arXiv:1607.07174v1 [math.CO])

The induced arboricity of a graph $G$ is the smallest number of induced forests covering the edges of $G$. This is a well-defined parameter bounded from above by the number of edges of $G$ when each forest in a cover consists of exactly one edge. Not all edges of a graph necessarily belong to induced forests with larger components. For $k\geq 1$, we call an edge $k$-valid if it is contained in an induced tree on $k$ edges. The $k$-strong induced arboricity of $G$, denoted by $f_k(G)$, is the smallest number of induced forests with components of sizes at least $k$ that cover all $k$-valid edges in $G$. This parameter is highly non-monotone. However, we prove that for any proper minor-closed graph class $\mathcal{C}$, and more generally for any class of bounded expansion, and any $k \geq 1$, the maximum value of $f_k(G)$ for $G \in \mathcal{C}$ is bounded from above by a constant depending only on $\mathcal{C}$ and $k$.

We prove that $f_2(G) \leq 3\binom{t+1}{3}$ for any graph $G$ of tree-width $t$ and that $f_k(G) \leq (2k)^d$ for any graph of tree-depth $d$. In addition, we prove that $f_2(G) \leq 310$ when $G$ is planar, which implies that the maximum adjacent closed vertex-distinguishing chromatic number of planar graphs is constant.

#### Multiparty Quantum Private Comparsion with Individually Dishonest Third Parties for Strangers. (arXiv:1607.07119v1 [quant-ph])

This study explores a new security problem existing in various state-of-the-art quantum private comparison (QPC) protocols, where a malicious third-party (TP) announces fake comparison (or intermediate) results. In this case, the participants could eventually be led to a wrong direction and the QPC will become fraudulent. In order to resolve this problem, a new level of trustworthiness for TP is defined and a new QPC protocol is proposed, where a second TP is introduced to monitor the first one. Once a TP announces a fake comparison (or intermediate) result, participants can detect the fraud immediately. Besides, due to the introduction of the second TP, the proposed protocol allows strangers to compare their secrets privately, whereas the state-of-the-art QPCs require the involved clients to know each other before running the protocol.

#### Revenue Gaps for Discriminatory and Anonymous Sequential Posted Pricing. (arXiv:1607.07105v1 [cs.GT])

We consider the problem of selling a single item to one of $n$ bidders who arrive sequentially with values drawn independently from identical distributions, and ask how much more revenue can be obtained by posting discriminatory prices to individual bidders rather than the same anonymous price to all of them. The ratio between the maximum revenue from discriminatory pricing and that from anonymous pricing is at most $2-1/n$ for arbitrary distributions and at most $1/(1-(1-1/n)^n)\leq e/(e-1)\approx 1.582$ for regular distributions, and these bounds can in fact be obtained by using one of the discriminatory prices as an anonymous one. The bounds are shown via a relaxation of the discriminatory pricing problem rather than virtual values and thus apply to distributions without a density, and they are tight for all values of $n$. For a class of distributions that includes the uniform and the exponential distribution we show the maximization of revenue to be equivalent to the maximization of welfare with an additional bidder, in the sense that both use the same discriminatory prices. The problem of welfare maximization is the well-known Cayley-Moser problem, and this connection can be used to establish that the revenue gap between discriminatory and anonymous pricing is approximately $1.037$ for the uniform distribution and approximately $1.073$ for the exponential distribution.
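As a quick numerical sanity check of the two bounds quoted above, the following sketch evaluates $2-1/n$ and $1/(1-(1-1/n)^n)$ for a few values of $n$ and compares the latter with its limit $e/(e-1)$. The function names are chosen here for illustration only.

```python
import math


def arbitrary_bound(n: int) -> float:
    """Bound 2 - 1/n quoted for arbitrary distributions."""
    return 2.0 - 1.0 / n


def regular_bound(n: int) -> float:
    """Bound 1 / (1 - (1 - 1/n)^n) quoted for regular distributions."""
    return 1.0 / (1.0 - (1.0 - 1.0 / n) ** n)


# Limit of regular_bound(n) as n grows.
LIMIT = math.e / (math.e - 1.0)

if __name__ == "__main__":
    for n in (1, 2, 5, 10, 100, 1000):
        print(n, round(arbitrary_bound(n), 4), round(regular_bound(n), 4))
    print("e/(e-1) =", round(LIMIT, 4))
```

Because $(1-1/n)^n$ increases towards $1/e$, the regular-distribution bound increases with $n$ and stays below $e/(e-1)$, while the bound for arbitrary distributions approaches 2.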

#### Inverse Optimization of Convex Risk Functions. (arXiv:1607.07099v1 [math.OC])

The theory of convex risk functions has now been well established as the basis for identifying the families of risk functions that should be used in risk averse optimization problems. Despite its theoretical appeal, the implementation of a convex risk function remains difficult, as there is little guidance regarding how a convex risk function should be chosen so that it also well represents one's own risk preferences. In this paper, we address this issue through the lens of inverse optimization. Specifically, given solution data from some (forward) risk-averse optimization problems we develop an inverse optimization framework that generates a risk function that renders the solutions optimal for the forward problems. The framework incorporates the well-known properties of convex risk functions, namely, monotonicity, convexity, translation invariance, and law invariance, as the general information about candidate risk functions, and also the feedbacks from individuals, which include an initial estimate of the risk function and pairwise comparisons among random losses, as the more specific information. Our framework is particularly novel in that unlike classical inverse optimization, no parametric assumption is made about the risk function, i.e. it is non-parametric. We show how the resulting inverse optimization problems can be reformulated as convex programs and are polynomially solvable if the corresponding forward problems are polynomially solvable. We illustrate the imputed risk functions in a portfolio selection problem and demonstrate their practical value using real-life data.

#### On the edge capacitated Steiner tree problem. (arXiv:1607.07082v1 [cs.DM])

Given a graph G = (V,E) with a root r in V, positive capacities {c(e)|e in E}, and non-negative lengths {l(e)|e in E}, the minimum-length (rooted) edge capacitated Steiner tree problem is to find a tree in G of minimum total length, rooted at r, spanning a given subset T of vertices, and such that, for each e in E, there are at most c(e) paths, linking r to vertices in T, that contain e. We study the complexity and approximability of the problem, considering several relevant parameters such as the number of terminals, the edge lengths and the minimum and maximum edge capacities. For all but one combinations of assumptions regarding these parameters, we settle the question, giving a complete characterization that separates tractable cases from hard ones. The only remaining open case is proved to be equivalent to a long-standing open problem. We also prove close relations between our problem and classical Steiner tree as well as vertex-disjoint paths problems.
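The capacity constraint can be illustrated with a small feasibility checker: in a tree rooted at r, the number of root-to-terminal paths through an edge equals the number of terminals below that edge. The sketch assumes the tree is given by parent pointers; the function names and the toy instance are hypothetical, and it only verifies a given tree, it does not solve the optimization problem.

```python
def edge_loads(parent, root, terminals):
    """Count, for each tree edge, the root-to-terminal paths that use it.

    parent: dict mapping every non-root tree vertex to its parent.
    Returns a dict keyed by (parent, child) edges.
    """
    loads = {}
    for t in terminals:
        v = t
        while v != root:
            edge = (parent[v], v)
            loads[edge] = loads.get(edge, 0) + 1
            v = parent[v]
    return loads


def respects_capacities(parent, root, terminals, capacity):
    """True if every tree edge e carries at most capacity[e] paths."""
    loads = edge_loads(parent, root, terminals)
    return all(load <= capacity[edge] for edge, load in loads.items())


if __name__ == "__main__":
    # Tree: r - a, with terminals t1 and t2 both hanging off a.
    parent = {"a": "r", "t1": "a", "t2": "a"}
    terminals = ["t1", "t2"]
    print(edge_loads(parent, "r", terminals))
    tight = {("r", "a"): 1, ("a", "t1"): 1, ("a", "t2"): 1}
    loose = {("r", "a"): 2, ("a", "t1"): 1, ("a", "t2"): 1}
    print(respects_capacities(parent, "r", terminals, tight))
    print(respects_capacities(parent, "r", terminals, loose))
```

In the toy tree both terminals route through the edge (r, a), so a capacity of 1 on that edge makes the tree infeasible while a capacity of 2 is enough.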

#### Joint Source-Channel Secrecy Using Analog Coding: Towards Secure Source Broadcast. (arXiv:1607.07040v1 [cs.IT])

This paper investigates joint source-channel secrecy for Gaussian broadcast communication in the Shannon cipher system. We use a recently proposed secrecy measure, list secrecy, in which an eavesdropper is allowed to produce a list of reconstruction sequences and the secrecy is measured by the minimum distortion over the entire list. For the achievability part, we propose a novel joint source-channel secrecy scheme, which cascades the traditional linear joint source-channel coding (analog coding) with a random orthogonal transform. For this scheme, we characterize the set of achievable tuples (secret key rate, list rate, eavesdropper distortion, distortions of all legitimate users). In addition, we analyze an existing scheme, the sign-change based scheme, and show by comparison that the proposed scheme outperforms it. For the converse part, we provide a necessary condition on the achievability of any tuple; comparing this converse with the achievability result of the proposed scheme shows that the proposed scheme is optimal under certain conditions. We also extend the proposed scheme and the corresponding achievability result to the vector Gaussian communication scenario. In both the scalar and the vector Gaussian case, the codebook in the proposed scheme consists of a sequence of random matrices. This is very different from the traditional codebook construction in information theory, which usually consists of a sequence of random samples.

#### Analytical Modeling of IEEE 802.11 Type CSMA/CA Networks with Short Term Unfairness. (arXiv:1607.07021v1 [cs.NI])

We consider single-hop topologies with saturated transmitting nodes, using IEEE 802.11 DCF for medium access. However, unlike the conventional WiFi, we study systems where one or more of the protocol parameters are different from the standard, and/or where the propagation delays among the nodes are not negligible compared to the duration of a backoff slot. We observe that for several classes of protocol parameters, and for large propagation delays, such systems exhibit a certain performance anomaly known as short term unfairness, which may lead to severe performance degradation. The standard fixed point analysis technique (and its simple extensions) does not predict the system behavior well in such cases; a mean field model based asymptotic approach is also not adequate to predict the performance for networks of practical sizes in such cases. We provide a detailed stochastic model that accurately captures the system evolution. Since an exact analysis of this model is computationally intractable, we develop a novel approximate, but accurate, analysis that uses a parsimonious state representation for computational tractability. Apart from providing insights into the system behavior, the analytical method is also able to quantify the extent of short term unfairness in the system, and can therefore be used for tuning the protocol parameters to achieve desired throughput and fairness objectives.

#### Architecture for Community-scale Critical Infrastructure Coordination for Security and Resilience. (arXiv:1607.06992v1 [cs.CR])

Our Critical Infrastructure (CI) systems are, by definition, critical to the safe and proper functioning of society. Nearly all of these systems utilize industrial Process Control Systems (PCS) to provide clean water, reliable electricity, critical manufacturing, and many other services within our communities - yet most of these PCS incorporate very few cyber-security countermeasures. CI is becoming an attractive target for cyber-attacks. While many vendor solutions are starting to be deployed at CI sites, these solutions are largely based on network monitoring for intrusion detection. As such, they are not process-aware, nor do they account for interdependencies among other CI sites in their community. What is proposed is an architecture for coordinating all CI within a community, which defines characteristics to enhance its integration, its resilience to failure and attack, and its ultimate acceptance by CI operators.

#### A Space of Phylogenetic Networks. (arXiv:1607.06978v1 [math.CO])

A classic problem in computational biology is constructing a phylogenetic tree given a set of distances between n species. In most cases, a tree structure is too constraining. We consider a circular split network, a generalization of a tree in which multiple parallel edges signify divergence. A geometric space of such networks is introduced, forming a natural extension of the work by Billera, Holmes, and Vogtmann on tree space. We explore properties of this space, and show a natural embedding of the compactification of the real moduli space of curves within it.

#### Satisfiability Checking and Symbolic Computation. (arXiv:1607.06945v1 [cs.SC])

Symbolic Computation and Satisfiability Checking are viewed as individual research areas, but they share common interests in the development, implementation and application of decision procedures for arithmetic theories. Despite these commonalities, the two communities are currently only weakly connected. We introduce a new project SC-square to build a joint community in this area, supported by a newly accepted EU (H2020-FETOPEN-CSA) project of the same name. We aim to strengthen the connection between these communities by creating common platforms, initiating interaction and exchange, identifying common challenges, and developing a common roadmap. This abstract and accompanying poster describes the motivation and aims for the project, and reports on the first activities.

We present a static deadlock analysis approach for C/pthreads. The design of our method has been guided by the requirement to analyse real-world code. Our approach is sound (i.e., misses no deadlocks) for programs that have defined behaviour according to the C standard, and precise enough to prove deadlock-freedom for a large number of programs. The method consists of a pipeline of several analyses that build on a new context- and thread-sensitive abstract interpretation framework. We further present a lightweight dependency analysis to identify statements relevant to deadlock analysis and thus speed up the overall analysis. In our experimental evaluation, we succeeded in proving deadlock-freedom for 262 programs from the Debian GNU/Linux distribution, with 2.6 MLOC in total, in less than 11 hours.

#### Fixing improper colorings of graphs. (arXiv:1607.06911v1 [cs.DM])

In this paper we consider a variation of a recoloring problem, called Color-Fixing. Suppose we are given some non-proper $r$-coloring $\varphi$ of a graph $G$. We investigate the problem of finding a proper $r$-coloring of $G$ which is "the most similar" to $\varphi$, i.e. the number $k$ of vertices that have to be recolored is the minimum possible. We observe that the problem is NP-complete for any $r \geq 3$, even for bipartite planar graphs. On the other hand, the problem is fixed-parameter tractable when parameterized by the number of allowed transformations $k$. We provide a $2^n \cdot n^{\mathcal{O}(1)}$ algorithm for the problem (for any fixed $r$) and a linear algorithm for graphs with bounded treewidth. We also show several lower complexity bounds, using standard complexity assumptions. Finally, we investigate the *fixing number* of a graph $G$. It is the maximum possible distance (in the number of transformations) between some non-proper coloring of $G$ and a proper one.
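The Color-Fixing objective can be made concrete with a naive exhaustive search that tries all $r^n$ colorings and keeps the proper one differing from $\varphi$ in the fewest vertices. This is only a definitional sketch for tiny graphs, not the $2^n \cdot n^{\mathcal{O}(1)}$ algorithm mentioned in the abstract; the function name `fix_coloring` is made up here.

```python
from itertools import product


def fix_coloring(n, edges, phi, r):
    """Brute-force Color-Fixing on vertices 0..n-1.

    phi: the given (possibly improper) coloring as a tuple of colors in 0..r-1.
    Returns (k, coloring) with the minimum number k of recolored vertices,
    or None if the graph has no proper r-coloring.
    """
    best = None
    for candidate in product(range(r), repeat=n):
        # Skip colorings in which some edge is monochromatic.
        if any(candidate[u] == candidate[v] for u, v in edges):
            continue
        k = sum(1 for new, old in zip(candidate, phi) if new != old)
        if best is None or k < best[0]:
            best = (k, candidate)
    return best


if __name__ == "__main__":
    triangle = [(0, 1), (1, 2), (0, 2)]
    path = [(0, 1), (1, 2)]
    print(fix_coloring(3, triangle, (0, 0, 0), 3))
    print(fix_coloring(3, path, (0, 0, 0), 2))
    print(fix_coloring(3, triangle, (0, 0, 0), 2))
```

For the monochromatic triangle two vertices must change color, for the monochromatic path recoloring the middle vertex suffices, and with only two colors the triangle has no proper coloring at all.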

#### Rank Correlation Measure: A Representational Transformation for Biometric Template Protection. (arXiv:1607.06902v1 [cs.CV])

Although a variety of theoretically sound techniques have been proposed for biometric template protection, there is rarely a practical solution that guarantees non-invertibility, cancellability, non-linkability and performance simultaneously. In this paper, a ranking-based representational transformation is proposed for fingerprint templates. The proposed method transforms a real-valued feature vector into an index code such that the pairwise-order measure in the resultant codes is closely correlated with the rank similarity measure. Such a ranking-based technique offers two major merits: 1) resilience to noise/perturbations in numeric values; and 2) a highly nonlinear embedding based on partial order statistics. The former preserves accuracy by mitigating numeric noise/perturbations, while the latter offers a strongly non-invertible transformation via nonlinear feature embedding from Euclidean to rank space, which makes inversion hard. The experimental results demonstrate reasonable accuracy on the benchmark FVC2002 and FVC2004 fingerprint databases, thus confirming the proposition of the rank correlation. Moreover, the security and privacy analysis justifies the strong capability against the existing major privacy attacks.

#### Dial One for Scam: Analyzing and Detecting Technical Support Scams. (arXiv:1607.06891v1 [cs.CR])

In technical support scams, cybercriminals attempt to convince users that their machines are infected with malware and are in need of their technical support. In this process, the victims are asked to provide scammers with remote access to their machines, who will then "diagnose the problem", before offering their support services which typically cost hundreds of dollars. Despite their conceptual simplicity, technical support scams are responsible for yearly losses of tens of millions of dollars from everyday users of the web. In this paper, we report on the first systematic study of technical support scams and the call centers hidden behind them. We identify malvertising as a major culprit for exposing users to technical support scams and use it to build an automated system capable of discovering, on a weekly basis, hundreds of phone numbers and domains operated by scammers. By allowing our system to run for more than 8 months we collect a large corpus of technical support scams and use it to provide insights on their prevalence, the abused infrastructure, and the current evasion attempts of scammers. Finally, by setting up a controlled, IRB-approved, experiment where we interact with 60 different scammers, we experience first-hand their social engineering tactics, while collecting detailed statistics of the entire process. We explain how our findings can be of use to law-enforcing agencies and propose technical and educational countermeasures for helping users avoid being victimized by technical support scams.

#### Searching for the Internet of Things on the Web: Where It Is and What It Looks Like. (arXiv:1607.06884v1 [cs.IR])

The Internet of Things (IoT), in general, is a compelling paradigm that aims to connect everyday objects to the Internet. Nowadays, IoT is considered one of the main technologies that will contribute towards reshaping our daily lives in the next decade. IoT unlocks many exciting new opportunities in a variety of applications in research and industry domains. However, many have complained about the absence of real-world IoT data. Unsurprisingly, a common question that arises regularly nowadays is "Does the IoT already exist?". So far, little has been known about the real-world situation of IoT, its attributes, the presentation of data and user interests. To answer this question, in this work, we conduct an in-depth analytical investigation on real IoT data. More specifically, we identify IoT data sources over the Web and develop a crawler engine to collect large-scale real-world IoT data for the first time. We make the results of our work available to the public in order to assist the community in future research. In particular, we collect the data of nearly two million Internet connected objects and study trends in IoT using a real-world query set from an IoT search engine. Based on the collected data and our analysis, we identify the typical characteristics of IoT data. The most intriguing finding of our study is that IoT data is mainly disseminated using Web Mapping while the emerging IoT solutions, such as the Web of Things, are currently not well adopted. On top of our findings, we further discuss future challenges and open research problems in the IoT area.

#### Deciding whether there are infinitely many prime graphs with forbidden induced subgraphs. (arXiv:1607.06864v1 [cs.DM])

A homogeneous set of a graph $G$ is a set $X$ of vertices such that $2\le \lvert X\rvert <\lvert V(G)\rvert$ and no vertex in $V(G)-X$ has both a neighbor and a non-neighbor in $X$. A graph is prime if it has no homogeneous set.

We present an algorithm to decide whether a class of graphs given by a finite set of forbidden induced subgraphs contains infinitely many non-isomorphic prime graphs.
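The definition of a homogeneous set can be checked directly on small graphs by exhaustive search. The sketch below is exponential and unrelated to the decision algorithm of the paper; it only illustrates the definition, and the function names are chosen here.

```python
from itertools import combinations


def find_homogeneous_set(vertices, edges):
    """Return some homogeneous set of the graph, or None if there is none.

    X is homogeneous if 2 <= |X| < |V| and every vertex outside X is adjacent
    to either all of X or none of X.
    """
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    n = len(vertices)
    for size in range(2, n):
        for subset in combinations(vertices, size):
            x = set(subset)
            if all(len(adj[v] & x) in (0, size) for v in vertices if v not in x):
                return x
    return None


def is_prime(vertices, edges):
    """A graph is prime if it has no homogeneous set."""
    return find_homogeneous_set(vertices, edges) is None


if __name__ == "__main__":
    p4 = [(0, 1), (1, 2), (2, 3)]          # path on four vertices
    c4 = [(0, 1), (1, 2), (2, 3), (3, 0)]  # cycle on four vertices
    print(is_prime([0, 1, 2, 3], p4))
    print(find_homogeneous_set([0, 1, 2, 3], c4))
```

The path on four vertices is prime, whereas in the 4-cycle two opposite vertices share exactly the same neighbors and therefore form a homogeneous set.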

#### Decentralized Bayesian learning in dynamic games. (arXiv:1607.06847v1 [cs.GT])

We study the problem of decentralized Bayesian learning in a dynamical system involving strategic agents with asymmetric information. In a series of seminal papers in the literature, this problem has been studied under a simplifying model where selfish players appear sequentially and act once in the game, based on private noisy observations of the system state and public observation of past players' actions. It is shown that there exist information cascades where users discard their private information and mimic the action of their predecessor. In this paper, we provide a framework for studying Bayesian learning dynamics in a more general setting than the one described above. In particular, our model incorporates cases where players participate for the whole duration of the game, and cases where an endogenous process selects which subset of players will act at each time instance. The proposed methodology hinges on a sequential decomposition for finding perfect Bayesian equilibria (PBE) of a general class of dynamic games with asymmetric information, where user-specific states evolve as conditionally independent Markov process and users make independent noisy observations of their states. Using our methodology, we study a specific dynamic learning model where players make decisions about investing in the team, based on their estimates of everyone's types. We characterize a set of informational cascades for this problem where learning stops for the team as a whole.

#### A Search Algorithm for Simplicial Complexes

Authors: Subhrajit Bhattacharya
Abstract: We present the `Basic S*' algorithm for computing shortest paths through a metric simplicial complex. In particular, we consider the Rips complex constructed out of a given metric graph, $G$. Such a complex, and hence shortest paths in it, represents the underlying metric space (whose discrete representation is the graph) more closely than a graph would. While discrete graph representations of continuous spaces are convenient for motion planning in configuration spaces of robotic systems, the metric induced in them by the ambient configuration space is different from the metric of the configuration space itself. We remedy this problem using a simplicial complex instead. Our algorithm is local in nature, requiring only an abstract graph, $G=(V,E)$, and a cost/length function, $d:E\rightarrow \mathbb{R}_+$, and no other global information such as an embedding is required. The complexity of our algorithm is comparable to that of Dijkstra's search algorithm, but, as the results presented in this paper demonstrate, the shortest paths obtained using the proposed algorithm represent/approximate the geodesic paths in the original metric space much more closely.
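To illustrate the gap the abstract refers to, the sketch below runs plain Dijkstra (the baseline, not Basic S*) on a 4-connected unit grid and compares the graph distance between opposite corners with the straight-line distance in the plane. The helper names `grid_graph` and `dijkstra` are introduced here for the example.

```python
import heapq
import math


def grid_graph(k):
    """4-connected k-by-k grid; each edge has length 1."""
    adj = {}
    for x in range(k):
        for y in range(k):
            nbrs = []
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nx, ny = x + dx, y + dy
                if 0 <= nx < k and 0 <= ny < k:
                    nbrs.append(((nx, ny), 1.0))
            adj[(x, y)] = nbrs
    return adj


def dijkstra(adj, source):
    """Shortest-path distances from source using a binary heap."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in adj.get(u, ()):
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


if __name__ == "__main__":
    dist = dijkstra(grid_graph(3), (0, 0))
    print("graph distance:", dist[(2, 2)])
    print("Euclidean distance:", math.hypot(2, 2))
```

On this grid the graph distance between the corners is the Manhattan distance, noticeably longer than the Euclidean geodesic; this is the kind of discrepancy that searching the simplicial complex is meant to reduce.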

#### Covering segments with axis-parallel unit squares

Authors: Ankush Acharyya, Subhas C. Nandy, Supantha Pandit, Sasanka Roy
Abstract: We study various kinds of line segments covering problems with axis-parallel unit squares in two dimensions. A set $S$ of $n$ line segments is given. The objective is to find the minimum number of axis-parallel unit squares which cover at least one end-point of each segment. We may have different variations of this problem depending on the orientation and length of the input segments. We prove some of these problems to be NP-hard, and give constant factor approximation algorithms for those problems. For some variations, we have polynomial time exact algorithms. Further, we show that our problems have connections with the problems studied by Arkin et al. (2015) on conflict-free covering problem. We also improve approximation factors of some of their problems.

#### Exploiting Symmetry and/or Manhattan Properties for 3D Object Structure Estimation from Single and Multiple Images

Authors: Yuan Gao, Alan L. Yuille
Abstract: Many man-made objects have intrinsic symmetries and Manhattan structure. By assuming an orthographic projection model, this paper addresses the estimation of 3D structures and camera projection using symmetry and/or Manhattan structure cues, for the two cases when the input is a single image or multiple images from the same category, e.g. multiple different cars. Specifically, analysis of the single-image case implies that Manhattan structure alone is sufficient to recover the camera projection, after which the 3D structure can be reconstructed uniquely by exploiting symmetry. But Manhattan structure can be hard to observe from a single image due to occlusion. Hence, we extend to the multiple-image case, which can also exploit symmetry but does not require Manhattan axes. We propose a new rigid structure from motion method, exploiting symmetry, using multiple images from the same category as input. Our results on the Pascal3D+ dataset show that our methods can significantly outperform baseline methods.

#### Sliding k-Transmitters: Hardness and Approximation

Authors: Therese Biedl, Saeed Mehrabi, Ziting Yu
Abstract: A sliding k-transmitter in an orthogonal polygon P is a mobile guard that travels back and forth along an orthogonal line segment s inside P. It can see a point p in P if the perpendicular from p onto s intersects the boundary of P at most k times. We show that guarding an orthogonal polygon P with the minimum number of k-transmitters is NP-hard, for any fixed k>0, even if P is simple and monotone. Moreover, we give an O(1)-approximation algorithm for this problem.

#### Unfolding Convex Polyhedra via Radially Monotone Cut Trees

Authors: Joseph O'Rourke
Abstract: A notion of "radially monotone" cut paths is introduced as an effective choice for finding a non-overlapping edge-unfolding of a convex polyhedron. These paths have the property that the two sides of the cut avoid overlap locally as the cut is infinitesimally opened by the curvature at the vertices along the path. It is shown that a class of planar, triangulated convex domains always have a radially monotone spanning forest, a forest that can be found by an essentially greedy algorithm. This algorithm can be mimicked in 3D and applied to polyhedra inscribed in a sphere. Although the algorithm does not provably find a radially monotone cut tree, it in fact does find such a tree with high frequency, and after cutting unfolds without overlap. This performance of a greedy algorithm leads to the conjecture that spherical polyhedra always have a radially monotone cut tree and unfold without overlap.

#### Unique Set Cover on Unit Disks and Unit Squares

Authors: Saeed Mehrabi
Abstract: We study the Unique Set Cover problem on unit disks and unit squares. For a given set $P$ of $n$ points and a set $D$ of $m$ geometric objects both in the plane, the objective of the Unique Set Cover problem is to select a subset $D'\subseteq D$ of objects such that every point in $P$ is covered by at least one object in $D'$ and the number of points covered uniquely is maximized, where a point is covered uniquely if the point is covered by exactly one object in $D'$. In this paper, (i) we show that the Unique Set Cover is NP-hard on both unit disks and unit squares, and (ii) we give a PTAS for this problem on unit squares by applying the mod-one approach of Chan and Hu (Comput. Geom. 48(5), 2015).
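
The objective described in this abstract can be made concrete with a short sketch. It only evaluates a candidate subset $D'$ of unit squares against a point set $P$; it is not the paper's PTAS or hardness construction. The names `covers` and `unique_coverage`, and the choice to represent each axis-aligned unit square by its centre, are assumptions made for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def covers(center: Point, p: Point, side: float = 1.0) -> bool:
    """True if the axis-aligned square with the given centre and side contains p."""
    half = side / 2.0
    return abs(p[0] - center[0]) <= half and abs(p[1] - center[1]) <= half


def unique_coverage(points: List[Point], chosen: List[Point]) -> Tuple[bool, int]:
    """Evaluate a candidate subset D' (given as square centres) on the point set P.

    Returns (feasible, n_unique): feasible is True when every point is covered
    by at least one chosen square; n_unique counts the points covered by
    exactly one chosen square, which is the quantity to maximise.
    """
    counts = [sum(covers(c, p) for c in chosen) for p in points]
    feasible = all(k >= 1 for k in counts)
    n_unique = sum(1 for k in counts if k == 1)
    return feasible, n_unique


if __name__ == "__main__":
    P = [(0.0, 0.0), (0.4, 0.0), (1.0, 0.0)]
    D_prime = [(0.0, 0.0), (0.8, 0.0)]
    # The middle point lies in both squares, so only the two outer points
    # are covered uniquely.
    print(unique_coverage(P, D_prime))
```

A brute-force search over all subsets of $D$ using this evaluator would take exponential time, which is consistent with the NP-hardness result stated in the abstract.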

#### Incremental $2$-Edge-Connectivity in Directed Graphs

Abstract: In this paper, we initiate the study of the dynamic maintenance of $2$-edge-connectivity relationships in directed graphs. We present an algorithm that can update the $2$-edge-connected blocks of a directed graph with $n$ vertices through a sequence of $m$ edge insertions in a total of $O(mn)$ time. After each insertion, we can answer the following queries in asymptotically optimal time: (i) Test in constant time if two query vertices $v$ and $w$ are $2$-edge-connected. Moreover, if $v$ and $w$ are not $2$-edge-connected, we can produce in constant time a "witness" of this property, by exhibiting an edge that is contained in all paths from $v$ to $w$ or in all paths from $w$ to $v$. (ii) Report in $O(n)$ time all the $2$-edge-connected blocks of $G$. To the best of our knowledge, this is the first dynamic algorithm for $2$-connectivity problems on directed graphs, and it matches the best known bounds for simpler problems, such as incremental transitive closure.

#### Polynomial Time Algorithm for $2$-Stable Clustering Instances

Abstract: Clustering with most objective functions is NP-hard, even to approximate well in the worst case. Recently, there has been work on exploring different notions of stability which lend structure to the problem. The notion of stability that we study in this paper, $\alpha$-perturbation resilience, was originally introduced by Bilu et al. \cite{Bilu10}. The works of Awasthi et al. \cite{Awasthi12} and Balcan et al. \cite{Balcan12} provide polynomial time algorithms for $3$-stable and $(1+\sqrt{2})$-stable instances, respectively. This paper provides a polynomial time algorithm for $2$-stable instances, improving on these results and answering an open question in \cite{Balcan12}.