Planet Primates

May 29, 2016


Difference between let, fun and function in F#

I'm learning F# and I cannot figure out what the difference between let, fun and function is, and my text book doesn't really explain that either. As an example:

 let s sym = function
 | V x -> Map.containsKey x sym
 | A(f, es) -> Map.containsKey f sym && List.forall (s sym) es;;

Couldn't I have written this without the function keyword? Or could I have written that with fun instead of function? And why do I have to write let when I've seen some examples where you write

fun s x = 

What's the difference really?

by eee at May 29, 2016 11:10 AM

compiler backend for a functional language

I have written an interpreter for a functional language, which I niw try to bootstrap using a compiler. The language has a dynamic type system and uses lists, numbers and strings. It also is functional and a function takes its argumenta as a list (like perls @_, or js arguments). Functions are first class and can be nested. Which language should I target with the compiler, btw. I think targetting a static typed imperative language, like C would be hard. The language should support dynamic typing and functional programming (js wouldnt be nice - the language should have a compiler itself, like common lisp)

by ein mensch at May 29, 2016 11:01 AM

Fred Wilson

11 Years of the USV Investment Team

As part of the hiring process for our two year analyst position, we asked everyone who has worked at USV in an investment team role to make a two minute video (the same thing we ask applicants to do). We asked them to answer these two questions – when did you join USV and how has your perspective on tech changed since then? Here’s a compilation of all of the two minute videos. In its entirety it is about 25 mins long.

11 Years of the USV Investment Team from Union Square Ventures on Vimeo.

by Fred Wilson at May 29, 2016 10:53 AM


Is there a broader class of total functions than $PR$?

In total functional programming programs are restricted to total computable functions. A well-known class of total functions are the primitive recursive functions ($PR$). However the Ackermann function and others aren't part of $PR$. Is there a well-defined class of total functions which is broader than $PR$?

Bonus question: I heard lately about Walther and substructural recursion, but I can't find any information whether they are just another definition of $PR$ or different classes of total functions. Could you please clarify? Also a little example on how the definition differ from $PR$ would be great.

by Peter at May 29, 2016 10:41 AM

Is it possible to derive a deterministic CSPRNG given two functions, at least one of which is a CSPRNG?

Let f and g be two functions with integer range 0..m-1. They may keep state and interact with the world (for example setting a seed or reading the current time), calling them multiple times may produce different results. f and g can see each other's state, but cannot modify the other's state.

Assume at least one of f and g is a cryptographically secure pseudorandom number generator, but it is unknown which one. Is it possible to create a function h that uses f and g and behaves as a CSPRNG? h is not allowed to set or read external state directly, the only way it can modify or read state is by calling f and g and observing their results. h should work for any given f and g.

Of course, h is not allowed to use any "true" source of randomness, and ideally the construction should not involve passing randomness tests.

As a related problem, I believe that if f and g were perfectly random, then f + g (mod m) would also be perfectly random. But I think in this deterministic case, it's always possible to create a g such that it "cancels out" f in h. Not sure how to prove this.

by quantumtremor at May 29, 2016 10:35 AM



Minimum Variance Portfolio problem

Minimum Variance Portfolio Suppose there are N stocks in the investmentable universe and we have a fully invested portfolio investing 100% of the capital. The Covariance matrix is denoted as ∑. We are interested in finding the portfolio with minimum variance. An investor choosing this portfolio is only concerned about the risk of the portfolio. Denoting a vector of ones by i=(1,….,1)^' , we have the following optimization problem:
$$Minimize \frac{1}{2}w'∑w$$

$$Subject to : w’ .i = w1 + w2 +……+ w_N =1$$

I would like to solve above problem using Lagrangian multipliers.

by Atul Agarawal at May 29, 2016 10:32 AM


Decrease distance between max and min

Let $a:=(a_1,a_2,\ldots,a_n) \in \mathbb{Z}^n $ and $k \in \mathbb{N}^*$, with

$$f: \begin{cases} \hfill \mathbb{Z}^n \times \mathbb{N}^* \hfill &\rightarrow \mathbb{Z}^n \\ \hfill ((a_1,a_2,\ldots,a_n), k) \hfill & \mapsto (b_1,b_2,\ldots,b_n)\\ \end{cases}$$

$f$ satisfies the following constraints:

  1. $(\forall b_{i=1,2,\ldots,n}\in \mathbb{Z})(\exists! a_{j=1,2,\ldots,n}\in \mathbb{Z}): b_i= a_j \pm k$
  2. $b_{\max} - b_{\min} \le a_{\max} - a_{\min}$

The task is to find a function $f$(or an algorithm) that satisfies the listed constraints. Could anyone give some hints how can I deal with this problem?


Example 1: $$f((1,7),4)=(5,3)$$ First constraint: $$5 = 1 + 4$$ $$3 = 7 - 4$$

Second constraint: $$5 - 3 \le 7 - 1$$

Example 2: $$f((1,2,3,4,5,6,7),5)=(6,7,\ldots,12)$$ First constraint: $$6 = 1 + 5$$ $$7 = 2 + 5$$ $$\vdots$$ $$12 = 7 + 5$$ Second constraint: $$12 - 6 \le 7 - 1$$

Thanks in advance

by Ayoub Falah at May 29, 2016 10:08 AM



Copulas and default probability

Assume a basket of 3 credits, each with some unconditional default probability ${q_i}(t) = \Pr [{\tau _i} \le t]$.

Consider the joint CDF $H$ of the default times is given by $H(t,t,t) = \Pr [{\tau _1} \le t,{\tau _2} \le t,{\tau _3} \le t] = C({q_1}(t),{q_2}(t),{q_3}(t))$, where $C$ is a known copula function (e.g. Archimedan).

My question is: is there some (possibly Copula-based) representation of a function $G$ defined as $G(t,t,t) = \Pr [{\tau _1} > t,{\tau _2} \le t,{\tau _3} \le t]$ ?

I know a survival copula ${\bar C}$ can be constructed from $C$ but this is not entirely what I want as I want a joint probability of the last two names to default and the first name to survive.


by user2743931 at May 29, 2016 09:36 AM

How are Quandl monthly S&P500 earnings estimates derived?

Can someone explain how the monthly earnings estimates are derived for S&P500? Quandl sources, who state:

Yields following March 2015 (including current yield) are estimated based on 12 month earnings through March 2015 — the latest reported by S&P. 

Is there a release schedule for S&P500 (ttm) earnings somewhere?

I would like to be able to manually derive and match the Quandl estimates on the first of each month.

Also, how stable are the Quandl estimates? That is to say, if I use some historical record, but try to replicate each estimate on the first of each month, are they often revised such that the 1st of the month estimates would not reflect the historical values?

by pat at May 29, 2016 09:14 AM

Estimating time-varying tail dependence for Archimedean copulas

Patton (2006) defines the upper tail dependence coefficient for a time-varying bivariate SJC copula as $$\tau^u_t=\Lambda \left(\omega_u + \beta_u \tau^u_{t-1}+\alpha_u \frac{1}{10}\sum^{10}_{i=1}|u_{t-i}-v_{t-i}| \right)$$ where $\tau^u_t$ is the upper tail dependence coefficient at time $t$, $u_t$ and $v_t$ and the univariate transformations, and $\Lambda$ is a transformation function needed to keep $\tau^u_t$ in $[0,1]$.

Although I'm working on an extension for multivariate copulas, I'd like to use a similar equation. My (simple) question though (which I guess holds for way simpler cases like an ARMA(1,1) for any time series): how many observations do I need to estimate $\omega_u$, $\alpha_u$ and $\beta_u$; what is the value of $d$ in $(\tau^u_t)_{t\in\{1, ..., d\}}$ in order to have a reliable estimate?

Patton (2006):

by Kondo at May 29, 2016 09:09 AM

Beta = 1 and 0. Type of portfolios

I read in E. Quian's "Quantitative Equitity Portoflio Management" the following:

A traditional long-only portfolio [with unit beta] would have most of its risk in the market risk. However, a zero beta portfolio, typically a long-short market-neutral portfolio, would have no systematic risk.

Why are Beta=1 portfolios typically long-only and Beta=0 portfolios typically long-short and market neutral?

by Josh at May 29, 2016 09:03 AM


Scaling subproduct tree computation

Consider the following description of a subproduct tree.

We define a tree T for some points x[0] to x[n-1], and define m = log_2(n).

Tree T is represented as a matrix where each row-column entry i, j is the product of pairwise nodes in the row above it. In total, there are m rows; the 1st row is has n column entries, and the last row has 1 column entry.

In short, how does this computation scale for a large n? Once m > 4, the coefficients of the entries become greater than 64-bits. Consider the product of (x - i) for i = 0 to 15. The coefficients become very large.

Unless we are working in a finite field, it seems as though one cannot construct such a subproduct tree for say n = 2^20.



by user51870 at May 29, 2016 08:49 AM

Why is BNF considered an unsatisfactory technique for describing a language?

I was reading the paper Fundamental Concepts in Programming Languages by C. Strachey the other day, wherein I read something that was quite strange to me. Quoting directly (with the strange part highlighted by me):

Faced with the situation as it exists today, where there is a generally known method of describing a certain class of grammars (known as BNF or context-free), the first instinct of these mathematicians seems to be to investigate the limits of BNF—what can you express in BNF even at the cost of very cumbersome and artificial constructions? This may be a question of some mathematical interest (whatever that means), but it has very little relevance to programming languages where it is more important to discover better methods of describing the syntax than BNF (which is already both inconvenient and inadequate for ALGOL) than it is to examine the possible limits of what we already know to be an unsatisfactory technique.

Is there a specific reason the author considers the BNF to be an unsatisfactory technique for describing languages? Could it be because you can only describe syntax and not semantics with a single BNF grammar (though, you can extend it to describe operational semantics via turning it into an attribute grammar)?

by NlightNFotis at May 29, 2016 08:28 AM



Python : How to find Accuracy Result in SVM Text Classifier Algorithm for Multilabel Class

I have used following set of code: And I need to check accuracy of X_train and X_test

The following code works for me in my classification problem over multi-labeled class

import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import LinearSVC
from sklearn.feature_extraction.text import TfidfTransformer
from sklearn.multiclass import OneVsRestClassifier

X_train = np.array(["new york is a hell of a town",
                    "new york was originally dutch",
                    "the big apple is great",
                    "new york is also called the big apple",
                    "nyc is nice",
                    "people abbreviate new york city as nyc",
                    "the capital of great britain is london",
                    "london is in the uk",
                    "london is in england",
                    "london is in great britain",
                    "it rains a lot in london",
                    "london hosts the british museum",
                    "new york is great and so is london",
                    "i like london better than new york"])
y_train = [[0],[0],[0],[0]
X_test = np.array(['nice day in nyc',
                   'the capital of great britain is london',
                   'i like london better than new york',
target_names = ['Class 1', 'Class 2','Class 3']

classifier = Pipeline([
    ('vectorizer', CountVectorizer(min_df=1,max_df=2)),
    ('tfidf', TfidfTransformer()),
    ('clf', OneVsRestClassifier(LinearSVC()))]), y_train)
predicted = classifier.predict(X_test)
for item, labels in zip(X_test, predicted):
    print '%s => %s' % (item, ', '.join(target_names[x] for x in labels))


nice day in nyc => Class 1
the capital of great britain is london => Class 2
i like london better than new york => Class 3

I would like to check the accuracy between Training and Test Dataset. Score Function doesn't work for me, it shows an error stating that multilabel value can't accepted

>>> classifier.score(X_train, X_test)

NotImplementedError: score is not supported for multilabel classifiers

Kindly help me get accuracy results for training and test data and choose an algorithm for our classification case.

by user_az at May 29, 2016 07:56 AM


Is EXPTIME "solvable" or "checkable" in exponential time?

According to this video, EXP has problems that are exponentially difficult to check.

But according to this video, EXP are problems that are exponentially difficult to solve.

It would make sense to me, that if EXP contains problems that are exponentially difficult to check, that they would contain problems that are harder than NP-complete ones.

However, if it's true that EXP is problems that are just solvable in exponential time, why wouldn't they be equal? Wouldn't EXP therefore be a subset of NP (and so would R), as problems that are harder (like R) are still solvable in "non-deterministic" time? Because the term "non-deterministic" extends to any finite amount of time, correct?

Thank you in advance for resolving my confusion. If you could explain it as simply as possible, I would appreciate it, as I get easily bogged down by the set theory.

by rb612 at May 29, 2016 07:56 AM


Why is ML on PC so much slower [on hold]

I am doing work in Machine Learning mainly based on an quite old iMAC with OSX 10.11 and a 3GHz CPU based Intel Duo Core with 8GB of RAM. As a metric may training takes about 500s for each epoch to calculate. My partner uses a PC with a rather recent i5 and 12GB of RAM. I was expecting this quadcore to go much faster, but it runs at least 2x slower than my iMAC. For presentations to clients, I finally use a PC again on Windows10 with a 2.5GHz i5 core and 4GB or RAM and here I see performance even more sluggish with epochs training at 10-20x slower than iMAC.

Anyone who could give me a good explanation of this wide performance spread?

I use Keras/Theano/Python as environment for ML.

by Peter HIRT at May 29, 2016 07:24 AM

How to implement decoder in keras?

I am trying to implement an encoder-decoder framework for image segmentation (given below). Can anyone please tell me how to implement the decoder side in Keras? enter image description here

by Avijit Dasgupta at May 29, 2016 07:13 AM

Caffe Python Layer for input, set data type to uint8

I am trying to train a CNN in caffe. I wanted to do a lot of data augmentation, so I'm using a "Python" layer for input, as suggested here.

However, I see from the log that Caffe is using the datatype float32 for all my data. This is really wasteful, because I'm only dealing with 8-bit integers. Is there a way to tell caffe to use dtype='uint8'?

I have tried to typecast the data while setting the top:

top[0].data[...] = someArray.astype(np.uint8, copy=False)

but this doesn't work.

Any suggestions?

by gunner at May 29, 2016 06:54 AM

MongoDB + K means clustering

I'm using MongoDB as my datastore and wish to store a "clustered" configuration of my documents in a separate collection.

So in one collection, I'd have my original set of objects, and in my second, it'd have

kMeansCollection: {
     1: [mongoObjectCopy1], [mongoObjectCopy2]...
     2: [mongoObjectCopy3], [mongoObjectCopy4]... 

I'm following the implementation a K-means for text clustering here,, but I'm having a hard time thinking about how I'd tie the outputs back into MongoDB.

An example (taken from the link):

if __name__ == "__main__":
    tags = collection.find({}, {'tag_data': 1, '_id': 0})
    clusters = cluster_texts(tags, 5) #algo runs here with 5 clusters

The var "tags" is the required input for the algo to run. It must be in the form of an array, but currently tags returns an array of objects (I must therefore extract the text values from the query)

However, after magically clustering my collection 5 ways, how can I reunite them with their respective object entry from mongo?

I am only feeding specific text content from one property of the object.

Thanks a lot!

by Anthony Chung at May 29, 2016 06:53 AM


How many recursive calls does it take to compute binomial probability weights?

I was given an algorithm and I was asked to estimate how often it would be called if I was trying to calculate ${100 \choose 50}0,25^{50}0.75^{50}$, the Binomial distribution of $50$ elements chosen within $100$ elements with $0.25$ their probability to appear.

This is the method:

binomial1(int N, int k, double p) {
    if (N == 0 && k == 0) return 1.0;
    if (N < 0 || k < 0) return 0.0;

    return (1.0 - p) * binomial1(N-1, k, p) 
         + p * binomial1(N-1, k-1, p);  

The question is, how many calls to binomial1 would be required to evaluate binomial1(100, 50, 0.25)?

I tried first the fool strategy to implements a system.out.println within the code with an indice increasing each time the algorithm was called. But I realised that it was something fool when I noticed I reached:

passed 2485266 times

Thus I was wondering that as far that the algorithm is called twice every time it is called, I was wondering if it was simply called $2^{100}$ times? Yet, p*binomial1(N-1, k-1, p) seemed to be called only $50$ times with a parameter $k=50$...

I'm only asking for a hint.

by Marine1 at May 29, 2016 06:29 AM

Is it possible to combine the resources of a gaming laptop and desktop?

I have a laptop with i5, 1GB video RAmM and AMD graphics card, 8GB RAM. My PC has a less powerful processor and a 2 GB Nvidia 3D graphics card, a 7.1 surround sound card, but the RAM burnt out and now i have only 4GB RAM. Is it possible to build a distributed system in which i can use the processing power and RAM of my laptop and sound card and graphics card of my PC to play games? Or is virtualization possible to map the OS such that it uses resources as if locally present?

Ps: I posted this here as although it is to build a gaming system, it involves the concept of using distributed computing.

by duckvader at May 29, 2016 06:21 AM


Handle outliers in a classification task in machine learning

I am using a dictionary learning framework for doing a face recognition classification. However it performs really well on some classes(~95%) but does very bad in other cases(12%)

Now the classes where the performance is bad are very similar so the features belonging to these classes are also similar. I am thinking of adapting it to fix the outliers.

Is there a algorithm I could use to fix the outliers? Any suggestions are welcome..

by user2359877 at May 29, 2016 06:09 AM

Planet Theory

Call for feedback: upcoming “Introduction to Property Testing”

Forwarding an announcement by Oded Goldreich:

I would like to seek the help of this community regarding my forthcoming book Introduction to Property Testing (see Specifically, I would be grateful for feedback regarding any aspect and part of the current text, although detailed comments are likely to be more helpful at this time than suggestions for drastic and/or structural changes. Please do not shy from indicating relevant works that may be mentioned (rather than described in detail), since it is quite possible that I was not aware of them or forgot to mention them (rather than chose not to include a detailed description due to various considerations).

Since I am likely to continue revising the text and posting the revisions on the website, please indicate the text-locations to which you refer in a manner that is most robust to revision (e.g., indicate relative location with respect to sectioning and/or theorem-environment numbers rather than with respect to page numbers).

by Clement Canonne at May 29, 2016 05:33 AM


Check whether an undirected graph contains a simple cycle of length four using the "Squares" method

The following problem is exercise 4.3 of the book "Algorithms" by S. Dasgupta, C. Papadimitriou, and U. Vazirani.

Squares. Design and analyze an algorithm that takes as input an undirected graph $G = (V,E)$ and determines whether $G$ contains a simple cycle (i.e., a cycle which does not intersect itself) of length four in $O(|V|^{3})$ time.

This can be solved by checking for each pair of vertices $u$ and $v$ whether they share at least two common neighbors; see the answer.

However, in the book, the authors gave a single-word hint: Squares.

Following the hint, I square the adjacency matrix $M$ of $G$ twice (i.e., $M \to M^2 \to M^4$) and check whether $M^4[i][i]$ is true for some $i$. But this attempt fails because it produces "false positives" by considering $a-b-a-b-a$ or $a-b-c-b-a$ as cycles of length four.

How to fix the "false positives" in the squaring method above?

by hengxin at May 29, 2016 05:18 AM


Strategy Risk and Portfolio Allocation Model (copy from nuclear phynance)

Here is a very interesting question that I found at Nuclear Phynance (original author: Strange); I though it is so interesting that it is worthwhile to ask it here:

I have $N$ strategies, across a variety of assets and time frames (from intra-day to month holding times). Each strategy is sort-of-kind-of market-neutral and mostly carrying idiosyncratic risk. Since we all know it's not really true, I do evaluate the market-factor risk for each strategy position (delta, vega, correga, whatever applicable) to get an understanding of my gap/systemic risk. On the other hand, strategies do go boom because of idiosyncratic reasons - some stock goes bust like Enron in one of my single name strategies or maybe Obama decides to sell tons of S&P skew to improve the morale. This is non-controllable risk that I simply need to diversify away.

So, this brings multiple questions:

  1. assuming that I got the market-risk modeling under control, how do I assign an idiosyncratic risk value to a strategy? I was thinking of some sort of Poisson process, but can't even think of how to properly combine multiple distributions in that case.
  2. is there a clean way to construct a portfolio allocation model that on one hand will neutralize market risk but on the other hand would keep idiosyncratic risk diverse enough for me to sleep at night?
  3. how do i include the dynamic nature of each strategy in the allocation process? For example, some strategies are expiration-date specific, some fire fairly infrequently etc. Do I allocate hard dollars? Do I look at the new signals and allocate given the current risk in the portfolio?

by lehalle at May 29, 2016 04:47 AM


compute max speed of processing for the following data with 10 PE [on hold]

compute max speed of processing for the following data with 10 PE( Processing Element). where
processing speed of PE- 1 MIPS
Data handled - 64 bits
Time to store result in memory - 0.25 μsec
Bus speed - 40 Mega Bytes/Sec. Which Formula should i Use to solve this one?

by user3902444 at May 29, 2016 04:26 AM


Portfolio with a prearranged pay-off curve

My question has been motivated by the following topic:

Is Behavioral Finance relevant to quants?

I have begun searching relevant optimization option's portfolio models which can describe a prearranged pay-off curve (objective function) under assumptions on how to limit a portfolio's maximal loss, whether could be a portfolio's cost to enter a zero cost or even a negative one.

For instance, I would like to construct an option's portfolio with the non-linear pay-off curve (here's link to an example). I hope that such portfolio can be realized by combining long and short position in put and call contracts based on the same underlying asset but with different strike prices. This is an integer programming problem.

Theoretical option pricing models generally assume that the underlying asset return follows a normal distribution. In contrast this stiff assumption, I have found the paper by Das S. R., Statman M. (2013) Options and structured products in behavioral portfolios. Journal of Economic Dynamics & Control 37 137–153, where authors proposed the method to optimize portfolios without of normal distribution assumption of the portfolio’s return. The objective function the portfolio optimization problem is the expected return which was written down as an integral function. Also the optimization problem includes constraints on short sale as well as the probability of not reaching thresholds. Authors theoretically found that optimal behavioral portfolios are likely to include combination of derivative securities: put, call, and other structured products.

In the study (Bartoňová M. (2012) Hedging of Sales by Zero-cost Collar and its Financial Impact. Journal of Competitiveness. 4 (2) 111-127) was demonstrated the usage of zero-cost option’s strategy in hedging of sales.

I would like to design an option's portfolio without using historical data on the underlying asset return. I want to write objective function of pay-off in the form:

$max F(S_c, S_p, X, M_f)$,

where $S_c$, $S_p$ are lists of strike prices on the call and put on the same underlying asset, $M_f$ is a forecast price of the underlying asset at the end of the strategy term and $X$ is an integer $n$-size vector, entries correspond to the call and put volumes in the portfolio.

My question: on what standard works on this topics I can developing the integer programming problem?

by Nick at May 29, 2016 03:58 AM


The Rust Community

tedu made a good point about the Rust community that got me thinking about things we can collectively do to improve how we represent ourselves and Rust. This post is a collection of thoughts on the issue that hopefully will spur some discussion within the Rust community, but whose arguments I believe are more broadly applicable to tech evangelism.


by brinker at May 29, 2016 03:36 AM


Is the "define" primitive of Scheme an imperative languages feature? Why or why not?

(define hypot
 (lambda (a b)
   (sqrt (+ (* a a) (* b b)))))

This is a Scheme programming language.

  • "define" creates a variable and global binding
  • lambda creates a procedure

I would like to know if "define" would be considered as an imperative language feature! As long as I know imperative feature is static scoping. I think it is an imperative feature since "define" create a global binding and static scoped looks at global binding for any variable definition where as in dynamic it looks at the current most active binding.

Please help me find the correct answer!! And I would like to know why or why not?

by user2884707bond at May 29, 2016 03:12 AM

How to get/set model's weight (delta) in torch?

Sorry for asking as a newbie to torch, but I promise to have search a lot through the documents and Internet.

There are two main demands I need, the first one is to get the weight delta after training for one or more batches, the second one is to set the new weight to model.

That means I want to update the weights by my own methods (with external library), will it be possible to achieve that in torch?

It seems that torch has an abstract module class [1] but its interface doesn't fit all my needs.


by huxuan at May 29, 2016 02:55 AM


Binary rank of binary matrix

Let $M$ be a binary ($0-1$) matrix of size $n \times m$. We define binary rank of $M$ as the smallest positive integer $r$ for which there exists a product decomposition $M = UV$, where $U$ is $n \times r$ and $V$ is $r \times m$, and all entries of $U$ and $V$ come from $\{0, 1\}$.

My question is that is there known algorithmic way to determine the binary rank of $M$; and Singular value decomposition that support the binary rank. Any reference in this regard would highly help.

by Ram at May 29, 2016 02:52 AM



How to create contiguous clusters in Self Organizing Maps using Kohonen package in R

I have developed an SOM model on a dataset and developed cluster image using the following codes

som_model <- som(data_som_scaled, grid = somgrid(15, 15, "hexagonal"), rlen = 500)
pretty_palette <- c("#1f77b4", '#ff7f0e', '#2ca02c', '#d62728', '#9467bd', '#8c564b', '#e377c2')
mypath <- paste("somepath", "clusters", ".png", sep = "")
png(file=mypath, width=2000,height=1200)
som_cluster <- cutree(hclust(dist(som_model$codes)), 6)
plot(som_model, type="mapping", bgcol = pretty_palette[som_cluster], main = "Clusters", cex = 3, font.main= 4)
add.cluster.boundaries(som_model, som_cluster)

enter image description here

The clusters are not contiguous. I want to know in what way can the code be changed so that contiguous clusters can be formed like the one below enter image description here

by Bharath Kumar at May 29, 2016 02:10 AM

Precision & recall

What to do if the precision or recall of denominator is 0 ?

by user197109 at May 29, 2016 02:03 AM



Demand Paging and File System Shared and Segregated Cache - Differences, Advantages?

This is from a computer science class, so I suppose it could be considered a computer science question...

I have a final in a few days, and I'm preparing for it. We were given a review sheet, and one of these questions was related to the caches a demand paging system and a file system may share.

Now, I'm not completely sure about the distinction between the two besides their names (unified cache combines two different caches for the two into one, segregated doesn't is my guess). So, hopefully, someone can clear this up for me. I've had some trouble searching for clarification online, as the answers I'm receiving online are not really for the file system/demand paging cache.

These questions are:

What is a high-level constraint used by a unified cache, but not a segregated cache?

What are the general advantages of a unified cache?

Give a situation where a segregated cache performs better than a unified cache?

I could answer these questions if I had more information on them, but it's been tough going. Could anyone provide some help, please?

by Yet Another Guest at May 29, 2016 01:15 AM

EQtm is not mapping reducible to its complement

This is a problem from Sipser's book (marked with an asterisk).

$EQ_{TM} = \{(\langle M \rangle, \langle N \rangle)$ where $M$ and $N$ are Turing machines and $L(M) = L(N)\}$

We know that neither $EQ_{TM}$ nor $\overline{EQ_{TM}}$ are recognizable so unsure how to go about proving there can't be a mapping reduction from one to the other.

Any hints?

by user1255841 at May 29, 2016 01:13 AM


Correlation between asset A and Portfolio X (which contains A)

After a few hours trying to solve this I give up! I need help.

I need to calculate the BETA of an asset with respect to a portfolio that contains this asset. I have the volatility and correlations for all the portfolio, and the allocation.

I could use a formula to calculate the correlation between asset A and the portfolio. Because I could get Beta with the usual formula.

I need a formula, not a method to calculate this based on historical prices.

So this is what I have:


   A    B      C
A  1    0.85  0.78
B 0.85   1    0.84
C 0.78  0.84   1


  A       B      C
19.74% 25.76% 31.19%


  A       B      C
25.00% 25.00% 50.00%

Hope you can help me!!

by Nicolas Galarza Ricci at May 29, 2016 12:52 AM



Searching the best trading route - algorithm

Imagine a village with people trading goods. Each person has his own offer in this format: Giving amount a of good b for amount c of good d.

I have made a simple table to point out what I am trying to explain:

enter image description here

In this case there are three different goods: Wood, Sand and Gras.

I also live in the village and noticed that the prices of the traders vary greatly. I have 1 wood and want to increase it by simply trading between the five dealers. But: I must not visit a dealer more than once.

There would be different routes, for example I could do Dave - Adam, which would result in +1 wood for me. A better route would be Earl - Berta - Adam, because it would mean +2 wood.

I think at that point you might already know what I am asking for. I am looking for a technique / algorithm to find the route which maximizes my input. The rules are:

  • Input good = Output good
  • Output value > Input value
  • Each node ("dealer") must not be visited more than once
  • You do not have to visit all dealers
  • Each dealer is limited to one trade ("Giving amount a of good b for amount c of good d").
  • $a\in\mathbb{R}_{>0}$ and $c\in\mathbb{R}_{>0}$
  • The benefit is calculated at the end by output - input, i.e., by the increase in the quantity of the goods that you're starting with (if you start with 1 wood, then all that matters is the number of woods you end with).
  • It is important that the input good is the same as the output good ("Input good = Output good"). If I start with wood, I need to end with wood. If I start with sand, I need to end with sand, etc.

I don't even know how such a problem is called. That means that anything like algorithm names to solve it, the general name for this kind of problem, etc. would help me to continue with my research. I have already tried approaching it by converting the table to a graph and then run path-finding algorithms. That did not work, because I wasn't sure how to weight the edges and the Longest-Path-Problem was also in my way, which is NP-Hard for non-DAGs.

The traveling salesman problem (TSP) was already mentioned. I had a look at the Wiki page of the TSP. Quote: "The travelling salesman problem (TSP) asks the following question: Given a list of cities and the distances between each pair of cities, what is the shortest possible route that visits each city exactly once and returns to the origin city?" I think my problem is very different, because:

  1. I look for the most profitable route, not the shortest.
  2. I don't have to visit every dealer ("city").
  3. I must not return to the origin, because as I said each node ("dealer") must not be visited more than once.

The "village" can get quite big. About 200 different goods and 500 dealers, so the algorithm should be able to perform good in this scenario.

The last possibility would be to brute-force, but I really want to avoid that.

by Bobface at May 29, 2016 12:18 AM


Das Schweizer Alpendorf Oberwil-Lieli zahlt lieber ...

Das Schweizer Alpendorf Oberwil-Lieli zahlt lieber sechsstellig Strafe als 10 Flüchtlinge aufzunehmen. Aber mit Rassismus hat das nichts zu tun, sagt der Bürgermeister (von der rechtspopulistischen Nationalistenpartei SVP).

Update: Da scheint der Independent ziemlich ins Klo gegriffen zu haben. NZZ sagt, der entsprechende Antrag auf Freikaufen wurde abgelehnt, dann will der Typ das nicht akzeptieren und auch der Tagesanzeiger bestätigt das.

May 29, 2016 12:00 AM

Wer schonmal mit Delta geflogen ist, wird sich über ...

Wer schonmal mit Delta geflogen ist, wird sich über diese Meldung hier freuen.

Die TSA ist so dermaßen inkompetent, dass Delta eine deutlich effizientere Security-Theater-Installation hingekriegt hat. Delta!!

Delta wird von Reisenden gerne zu "Don't Expect Luggage To Arrive" expandiert :-)

May 29, 2016 12:00 AM

Die Story hier handelt von Korruption im Militärisch-Industriellen ...

Die Story hier handelt von Korruption im Militärisch-Industriellen Komplex. Money Quote:
The target was not a terrorist, nor a spy for a foreign power, nor the kingpin of a drug cartel. But rather a 350-pound defense contractor nicknamed Fat Leonard, who had befriended a generation of Navy leaders with cigars and liquor whenever they made port calls in Asia.
Wie im Mafia-Filmen! Harr harr harr

May 29, 2016 12:00 AM

HN Daily

May 28, 2016


Add custom loss function in Torch

What are the necessary steps to implement a custom loss function in Torch?

It seems like you have to write an implementation for updateOutput and updateGradInput.

Is that all? So then you basically create a new class:

local CustomCriterion, parent =   torch.class('CustomCriterion','nn.Criterion')

and implement the following two functions:

function CustomCriterion:updateOutput(input, target)
function CustomCriterion:updateGradInput(input, target)

Is that correct, or is there more to be done?

Also, for the provided criterions these functions are implemented in C, but I suppose a Lua implementation will also work, albeit possibly a little slower?

by DrMad at May 28, 2016 11:48 PM

Splitting a Torch tensor into a table of tensors with some desired dimension

Suppose I have a convolutional network which outputs a tensor with the size [M x C x H x W] where M is the batch size, C is the number of channels (a multiple of 2) and H = W is an integer number. I want to split the output into a table with two elements each element having the size [M x C / 2 x H x W]. I think it's possible to be done with nn.SplitTable() but I haven't been to do it properly.

Can anyone give me some hints?

by Amir at May 28, 2016 11:32 PM


Maximum cost path in integer matrix

In preparation for my design and algorithms exam, I encountered the following problem.

Given a $2 \times N$ integer matrix $(a[i][j] \in [-1000, 1000])$ and an unsigned integer $k$, find the maximum cost path from the top left corner $(a[1][1])$ and the bottom right corner $a[2][N]$, given the following:

$\bullet$ The path may not go through the same element more than once

$\bullet$ You can move from one cell to the other vertically or horizontally

$\bullet$ The path may not contain more than $k$ consecutive elements from the same line

$\bullet$ The cost of the path is determined by the sum of all the values stored within the cells it passes through

I've been thinking of a simple greedy approach that seems to work for the test cases I've tried, but I'm not sure if it's always optimal. Namely, while traversing the matrix, I select a the maximum cost cell on the vertical or horizontal and then go from there. I've got a counter I only increment if the current cell is on the same line with the previously examined one and reset it otherwise. If at some point the selected element happens to be one that makes the counter go over the given value of $k$, I simply go with the other option that's left.

However, I feel that I'm missing out on something terribly important here, but I just can't see what. Is there some known algorithm or approach that may be used here?

Also, the problem asks for an optimal solution (regarding temporal complexity).

by user43389 at May 28, 2016 11:22 PM

Compression method by replacing

I was wondering if this method can actually compress data:

I have a table of combinations:

00000000 00000001 ... 11111111 And so on...

What if I decide start enumerating these combinations with a special char (tested, I have enough chars) And then replace the exact string?

For example: In a file I have 000000100010101 Then I will read 8 bytes and replace this with 8 bits (a char): A And then do the same thing with the whole file... Is it going to be useful?

by Ben at May 28, 2016 10:53 PM

Why are regular expressions defined with union, concatenation and star operations?

A regular expresssion is defined recursively as

  1. $a$ for some $a \in \Sigma$ is a regular expression,
  2. $\varepsilon$ is a regular expression,
  3. $\emptyset$ is a regular expression,
  4. $(R_1 \cup R_2)$ where $R_1$ and $R_2$ are regular expressions is a regular expression,
  5. $(R_1 \circ R_2)$ where $R_1$ and $R_2$ are regular expressions is a regular expression,
  6. $(R_1)^*$ where $R_1$ is a regular expression is a regular expression.

This definition is taken from page 64 of

Sipser, Michael. Introduction to the Theory of Computation, 3rd edition. Cengage Learning, 2012.

Now, I have the following questions.

  • Why do not the definition contain the intersection, complement or reverse operations?
  • If we change the 4th item to $R_1 \cap R_2$, do we get an equivalent definition, i.e. for each regular language, there is a modified regular expression and vice versa?
  • I know that this definition is complete and well-defined, but why is it preferred to other equivalent, well defined and complete definitions?

by Ali Shakiba at May 28, 2016 10:42 PM

Planet Emacsen

Pragmatic Emacs: Weather in Emacs

I recently came across this very nice emacs config file which has lots of neat bits and pieces. One fun thing that caught my eye was getting the weather forecast using the wttrin.el package. To install the package use

;; weather from
(use-package wttrin
  :ensure t
  :commands (wttrin)
  (setq wttrin-default-cities '("Cheltenham"

This sets the list of available cities. You can then use M-x wttrin to see the weather. This will prompt for the city and then show a buffer with the forecast. You can use g to change city and q to quit.

The only problem is that if the Emacs frame is not wide enough and the lines in the wttrin buffer wrap, it becomes unreadable. My solution is to save the window and frame configuration, then resize the frame to be big enough for the wttrin buffer. The previous configuration is then restored when exiting wttrin. These functions take care of that for us:

;;advise wttrin to save frame arrangement
;;requires frame-cmds package
(defun bjm/wttrin-save-frame ()
  "Save frame and window configuration and then expand frame for wttrin."
  ;;save window arrangement to a register
  (window-configuration-to-register :pre-wttrin)
  ;;save frame setup and resize
  (set-frame-width (selected-frame) 130)
  (set-frame-height (selected-frame) 48)
(advice-add 'wttrin :before #'bjm/wttrin-save-frame)

(defun bjm/wttrin-restore-frame ()
  "Restore frame and window configuration saved prior to launching wttrin."
  (jump-to-register :pre-wttrin)
(advice-add 'wttrin-exit :after #'bjm/wttrin-restore-frame)

by Ben Maughan at May 28, 2016 10:36 PM



The edit distance of BWT of two strings with one difference

Let $BWT$ stands for the Burrows-Wheeler transform on strings. What is the edit distance of $BWT(w)$ and $BWT(u)$ if $w$ and $u$ differ only in character.

by J.Doe at May 28, 2016 10:21 PM

P vs EXP (Can anybody help me where my proof should have failed?) [on hold]

First things first... I am not a mathematician that is why you may find the paper as "having not enough rigor as much as it had to have".

Second, I tried to combine the basic informations on set theory and computational complexity in order to compare P and EXP.

P = $\bigcup_{m} o(n^m)$

What I propose is that, since the definition of P is infinite unification of $o(n^m)$, it can be represented by single time class, $o(n^\omega)$. (I think this is like saying “1 = $0.\bar{9}$” or “two parallel lines touch each other at infinity”.)

And $o(n^\omega)$ has enough time resource to host every execution of any algorithm of EXP. This means P = EXP (and of course this implies P = NP = EXP.)

I had criticism for also that the paper I wrote uses "infinite representations of algorithms". Even though my paper is not based on the algorithms' representations, even if it had been so, $o(n^\omega)$ would still have enough resource to host every algorithm of EXP.

You can find the paper here.

by Mehmet Kaplan at May 28, 2016 10:20 PM

Translation from Lustre to LTL

How to translate the following Lustre expressions into LTL?

a => pre b

a -> pre b

a -> (pre b and q)

by Imed at May 28, 2016 10:17 PM



Generate code for sklearn's GradientBoostingClassifier

I want to generate code (Python for now, but ultimately C) from a trained gradient boosted classifier (from sklearn). As far as I understand it, the model takes an initial predictor, and then adds predictions from sequentially trained regression trees (scaled by the learning factor). The chosen class is then the class with the highest output value.

This is the code I have so far:

def recursep_gbm(left, right, threshold, features, node, depth, value, out_name, scale):
    # Functions for spacing
    tabs = lambda n: (' ' * n * 4)[:-1]
    def print_depth():
        if depth: print tabs(depth),
    def print_depth_b():
        if depth: 
            print tabs(depth), 
            if (depth-1): print tabs(depth-1),

    if (threshold[node] != -2):
        print "if " + features[node] + " <= " + str(threshold[node]) + ":"
        if left[node] != -1:
            recursep_gbm(left, right, threshold, features, left[node], depth+1, value, out_name, scale)
        print "else:"
        if right[node] != -1:
            recursep_gbm(left, right, threshold, features, right[node], depth+1, value, out_name, scale)
        # This is an end node, add results
        print out_name + " += " + str(scale) + " * " + str(value[node][0, 0])

def print_GBM_python(trees, feature_names, X_data, l_rate):
    print "PYTHON CODE"

    # F0
    f0_probs = np.mean(clf.predict_log_proba(X_data), axis=0)
    probs    = ", ".join([str(prob) for prob in f0_probs])
    print "# Initial probabilities (F0)"
    print "scores = np.array([%s])" % probs

    print "# Update scores for each estimator"
    for j, tree_group in enumerate(trees):
        for k, tree in enumerate(tree_group):
            left      = tree.tree_.children_left
            right     = tree.tree_.children_right
            threshold = tree.tree_.threshold
            features  = [feature_names[i] for i in tree.tree_.feature]
            value = tree.tree_.value

            recursep_gbm(left, right, threshold, features, 0, 0, value, "scores[%i]" % k, l_rate)

    print "# Get class with max score"
    print "return np.argmax(scores)"

I modified the tree generating code from this question.

This is an example of what it generates (with 3 classes, 2 estimators, 1 max depth and 0.1 learning rate):

# Initial probabilities (F0)
scores = np.array([-0.964890, -1.238279, -1.170222])

# Update scores for each estimator
if X1 <= 57.5:
    scores[0] += 0.1 * 1.60943587225
    scores[0] += 0.1 * -0.908433703247
if X2 <= 0.000394500006223:
    scores[1] += 0.1 * -0.900203054177
    scores[1] += 0.1 * 0.221484425933
if X2 <= 0.0340005010366:
    scores[2] += 0.1 * -0.848148803219
    scores[2] += 0.1 * 1.98100820717

if X1 <= 57.5:
    scores[0] += 0.1 * 1.38506104792
    scores[0] += 0.1 * -0.855930587354
if X1 <= 43.5:
    scores[1] += 0.1 * -0.810729087535
    scores[1] += 0.1 * 0.237980820334
if X2 <= 0.027434501797:
    scores[2] += 0.1 * -0.815242297324
    scores[2] += 0.1 * 1.69970863021

# Get class with max score
return np.argmax(scores)

I used the log probability as F0, based on this.

For one estimator it gives me the same predictions as the predict method on the trained model. However when I add more estimators the predictions start to deviate. Am I supposed to incorporate the step length (described here)? Also, is my F0 correct? Should I be taking the mean? And should I convert the log-probabilities to something else? Any help is greatly appreciated!

by Pokey McPokerson at May 28, 2016 09:29 PM


Avahi on FreeBSD: Machine is Seen but Does Not See

I'm in the process of setting up Avahi 0.6.31 on a FreeBSD box on a network that uses mDNS for address resolution. I have two other boxen on the same network: one Ubuntu 14.04 with Avahi 0.6.31, and one OSX 10.4 with mDNSResponder. Both these machines can ping and ssh each other, and both machines can ping and ssh the FreeBSD box, but all attempts to ping or ssh from FreeBSD to the other machines are met with Could not resolve hostname.

Both Avahi machines have identical avahi-daemon.conf files, and each machine's nsswitch.conf contains the line

hosts: files mdns4_minimal [NOTFOUND=return] dns mdns4

What have I missed?

by Darwin von Corax at May 28, 2016 09:26 PM


Structural induction hypothesis on derivations

I am relatively new to the field of language semantics, however i am a bit confused about using operational semantics and structural induction on Derivations. If I have two languages $L1$ and $L2$ and I need to prove that Processes in the two languages are equivalent (The Processes of $L1$ and $L2$ may execute conditional (boolean), arithmetical, sequence of commands. Can i prove equivalence in the processes using structural induction on the derivations, using the below induction hypothesis ?

proc : processes of the two languages, state : current execution state

The property on all derivations d : $P(d)\Leftrightarrow \forall p1,p2 \in proc,p1 \in L1,p2 \in L2, \sigma,\sigma_0 \in states. d \models <p1,\sigma_0> \rightarrow \sigma \& <p2,\sigma_0> \rightarrow \sigma \Rightarrow p1 \equiv p2$

where $<p1,\sigma_0> \rightarrow \sigma \& <p2,\sigma_0> \rightarrow \sigma$ are the conclusions of the rule

Is my induction hypothesis correct?, that when the process $p1$ and $p2$ execute in the same state $\sigma_0$ and yield same state $\sigma$ then it implies that the two processes are equivalent?

and this needs to proved for all cases where $p1,p2 \equiv boolen$, $p1,p2 \equiv arithmatic$ and $p1,p2 \equiv sequence$

by anil keshav at May 28, 2016 09:21 PM


Realized "efficient" frontier. Is this reasonable?

I have performed some out-of-sample analysis of mean-variance optimization with monthly rebalancing. Studying the "realized efficient frontier", I am worried that something is wrong. Since the frontier is the "realized outcome", I am aware that the frontier might not be efficient which can be seen in the figure (we have higher volatility for lower return). Having read research on out-of-sample performance with MVO, they usually obtain frontiers that are convex and coincide with the theory (i.e. more risk usually implies more return). I need a second opinion regarding the following figure, is it a reasonable shape? The numbers in the figure are the target volatilities during the portfolio optimization and the difference between the plots is the method I have used to estimated the covariance matrix.

Figure of the frontiers

by B.Swanx at May 28, 2016 09:11 PM



Time complexity of nested for loop function involving mod to filter out the execution [duplicate]

This question already has an answer here:

I have to find the time complexity of the following program:

function(int n)
    for(int i=0;i<n;i++)                   //O(n) times
        for(int j=i;j<i*i;j++)             //O(n^2) times
            {  //O(n) times
                for(int k=0;k<j;k++)       //O(n^2) times

I analysed this function as follows:

i : O(n)  : 1        2        3          4             5
j :       : 1        2..3     3..8       4..15         5..24           (values taken by j)
    O(n^2): 1        2        6          12            20              (Number of times executed)
j%i==0    : 1        2        3,6        4,8,12        5,10,15,20      (Values for which the condition is true)
    O(n)  : 1        1        2          3             4
k         : 1        2        3,6        4,8,12        5,10,15,20      (Number of times printf is executed)
    Total : 1        2        9          24            50              (Total)

However I am unable to bring about any conclusions since I don't find any correlation between $i$ which is essentially $O(n)$ and Total of k (last line). In fact I don't understand if we should be looking at the time complexity in terms of number of times printf is executed since that will neglect $O(n^2)$ execution of j-for loop. The answer given was $O(n^5)$ which I presume is wrong but then whats correct?

by anir at May 28, 2016 08:49 PM


On permanent mod $p$

Permanent of $0/1$ $n\times n$ integer matrix $\bmod p$ is $\#P$-complete if prime $p>n$. We have FPTAS for $0/1$ integer matrix permanent over reals. Moreover we can obtain $n\times n$ $0/1$ matrix permanent in polynomial time if permanent is bounded by $n^c$ for fixed $c>0$.

Is it possible to obtain permanent of $0/1$ $n\times n$ integer matrix $\bmod p$ in polynomial time if permanent $\bmod p$ is bounded by $(\log p)^c$ for fixed $c>0$.

by Student. at May 28, 2016 08:48 PM


Number of winning combination in Nim

I am studying game theory and I am wondering how many winning combinations are possible for Nim game? Suppose, stones = 500 and piles = 5. With these number, there are many initial game positions are possible. like [1, 2, 4, 3, 490], [100, 100, 100, 100, 100], [100, 200, 100, 100], [0, 200, 200, 50, 50] and so on. So basically there are stones^piles combinations possible. Among these, some will yield Xor > 0 and considered as winning possible.

My question is - how many winning initial positions are possible for given stones and piles? Instead of checking all possible combinations(which is impossible for larger inputs), is there any smart way to solve this?

This is not a homework or programming contest problem.

by Kaidul Islam at May 28, 2016 08:40 PM

Blocking in two phase commit protocol

Blocking is a disadvantage in 2 phase commit protocol. But I read a statement that "Blocking in a 2 phase commit protocol can't be completely eliminated EVEN when participants/COHORTs select a new coordinator", Why is that so? Tried searching Internet, couldn't find answer :P

by Salim at May 28, 2016 08:15 PM


Second Moment of Stock Process

I have a stock process which I have decided to model as $$S_T=S_t\exp((r-q-\frac{1}{2}\sigma^2)(T-t)+\sigma(W_T-Wt))-D_T$$ where $D_T$ is a cash dividend at time $T$. This dividend is known. I then calculated its second moment as $$\mathbb{E}(S_T^2)=[S_t\exp((r-q)(T-t))]^2\exp(\sigma^2(T-t))-2D_TS_t\exp((r-q)(T-t))+D_T^2$$ My questions is, how would this expression change if there were say $n$ dividends (all different) between times 0 and T.

by Jim at May 28, 2016 08:11 PM


Configuring Apache 2.4 for CGI on FreeBSD

I am trying to run CGI on FreeBSD 9.2.

  1. I installed Apache 4.2 (pkg install apache42)
  2. Configured it to load CGI module.
  3. Also, I did chmod a+x on files in cgi-bin directory.

And when I connect to the server to a test CGI script, the server printed this error message.

AH01215: (13)Permission denied: exec of '/usr/local/www/apache24/cgi-bin/test-cgi' failed
End of script output before headers: test-cgi

What's wrong and how to fix this problem?

by Eonil at May 28, 2016 08:08 PM


Formal specification for ADT

I need to define a series of Abstract Data Types (ADT) using discrete mathematics for a Formal Specification. For example, to define Empty of a Set ADT I would do the following

A.empty() = {true, if |A| = 0 
             false, if |A| > 0

I am attempting to create one for the Stack ADT for Push, Pop, and Top. I am drawing a blank. I have done some research, but keeping coming up with nothing. Should the Stack ADT. These three are the last bits of information I need to complete my Specification document. How might I convert those three into set notation?

by srtmonkey at May 28, 2016 07:43 PM



Showing that the number of primitive-recursion programs for each function is countably-infinite

Problem Statement

Prove that if a function $f$ is primitive recursive, then there are countably infinite number of primitive recursive definitions of $f$

Yes, this is a homework question.

My Work

I proved that there are infinite number of definitions using the following construction

$$g_k(n) = \underbrace{P_{1}^1(P_{1}^1( \dots( P_{1}^1}_{k \text{ times}}(f(n))\dots), \ \forall \ k \in \mathbb{N}$$

Here $P_1^1$ is the projection basis function which picks the $1^{\text{st}}$ element from a vector of size $1$

Now $k$ is unbounded and we will get an equivalent definition for $f$ for each $k$

Now the issue is that this is not the only way to construct an infinite number of equivalent definitions i.e. I could have also use the multiplication operator and multiplied the output of $f$ by 1 as many times as I liked.

How can I show that there is a $1-1$ correspondence of the set of all primitive recursive definitions with the set of naturals numbers? Unlike bit strings, I am unable to comprehend in what way I should assign some number to each function. In fact I don't even know in how many ways can I create equivalent definitions, so how can I even start to enumerate these functions?

Any hints?

by Banach Tarski at May 28, 2016 07:11 PM

Computing gaussian kernel for image blurring

When I compute a gaussian kernel for image blurring should I normalize the 1D vectors? Because when I apply the raw values sometimes the image gets lighter or darker. The function I'm using is $f(x) = \frac{1}{\sqrt{2\pi}\sigma}e^{\frac{-x^2}{2\sigma^2}}$ or the 2D function. So, should I just apply the results using cross correlation/convolution or should I normalize first?

by Júlio César Batista at May 28, 2016 07:08 PM

Why do we have to trade abstraction for speed?

Why can high-level languages seemingly never reach lower-level languages in terms of speed? Examples of high-level languages would be Python, Haskell, and Java. Low-level languages would be trickier to define, but let's say C. Comparisons can be found all around the Internet$^1$ and they all agree that C is a lot faster, sometimes by a factor of 10 or more.

What is it that causes such big difference in performance and why can't high-level languages catch up?

Initially I believed that this is all the compilers' fault and that things will improve in the future, but some of the most popular higher-level languages have been around for decades and they still lag behind when it comes to speed. Can't they simply compile to a similar to C syntax tree and then follow the same procedures that generate the machine code? Or perhaps it's something to do with the syntax itself?

$^1$ Examples:

by goblin at May 28, 2016 07:03 PM


wishart stochastic volatility models

Stochastic volatility models assume that volatility follow a random process.In the emerging market the volatility tend to be high. why is it that the wishart stochastic volatility model fit well the emerging market

by topi at May 28, 2016 07:03 PM


All output functions of a truth table enter image description here

I can name one more: XNOR. But besides these, what other output functions are there?

by Rafiduzzaman Sonnet at May 28, 2016 06:39 PM


Selection of optimal backtesting parameters

Suppose I backtest some strategy on in-sample data while varying two parameters, say $X$ and $Y$. $X$ can take the values $\{3,6,9,12,15,18\}$ while $Y$ can take $\{10,15,20,25,30\}$. I want to select appropriate values of $X$ and $Y$ for testing the strat on out-of-sample data. The tables of Sharpe and Sortino ratios is as follows:

    > sharpe
  No_of_stocks  X3.month  X6.month  X9.month X12.month X15.month X18.month
1           10 0.2923854 0.2485804 0.3116992 0.2356674 0.2711520 0.2535123
2           15 0.2801226 0.2757317 0.3362495 0.2420944 0.2369459 0.2293062
3           20 0.2887139 0.2953232 0.2952627 0.2979134 0.2553015 0.2249027
4           25 0.2736581 0.3268325 0.2971468 0.2896665 0.2401743 0.2240485
5           30 0.2761537 0.3423867 0.2964909 0.2905532 0.2948999 0.2137761
> sortino
  No_of_stocks  X3.month  X6.month  X9.month X12.month X15.month X18.month
1           10 0.4080662 0.3380257 0.4144185 0.3087768 0.3521293 0.3231242
2           15 0.4013694 0.3842653 0.4503256 0.3174395 0.3080369 0.3006281
3           20 0.4172279 0.4103027 0.3873160 0.3958244 0.3307235 0.2933315
4           25 0.3925792 0.4787884 0.3940304 0.3848995 0.3095552 0.2892468
5           30 0.3987750 0.4990707 0.3906656 0.3826982 0.3863327 0.2721931
> dd
  No_of_stocks  X3.month  X6.month  X9.month X12.month X15.month X18.month
1           10 0.5153225 0.5414108 0.4568199 0.5361848 0.5332630 0.6036963
2           15 0.4821441 0.4207504 0.3996013 0.5099167 0.5355697 0.5306460
3           20 0.4246441 0.3970251 0.4178547 0.3985710 0.4945658 0.5100034
4           25 0.4326678 0.2433439 0.3900689 0.4038422 0.5093099 0.4805518
5           30 0.3173227 0.2467464 0.3621063 0.3928437 0.3686759 0.4893400

As you can see, the 6-month;30-stock combination gives the best results in general. Someone told me that if, with a small variation in any one of the parameters, the value of the metric changes drastically, I should probably not select those combination of parameters (I guess it could be called an estimation error?). If we move from 6-month;30-stock to 9-month;30-stock, the sortino ratio changes pretty drastically from .499 to .39, and the max drawdown changes from .247 to .362.

So by that logic, I probably need to select the 3-month and 20-stock combination, because the metrics don't change a lot if we vary the no. of stocks or no. of months. My question is: is it okay to just blindly select the combination of parameters that give the best results (6-months,30-stocks) or should I also take the variation in metrics into account (3-months,20-stocks)?

Thanks in advance!

by Shirish Kulhari at May 28, 2016 06:34 PM


In FreeBSD's pkg, which actions mark a package as non-automatic

I'm trying to keep my packages up-to-date and unused packages off my system. pkg has a feature called autoremove that will prune unused installed packages that were installed to fulfill dependencies. Internally, it stores this in a flag named automatic. Which operations with pkg, portinstall, portupgrade, and portmaster would make a package non-automatic?

by David Ehrmann at May 28, 2016 06:33 PM


How to use pyspark's SVMModel?

Since I am new with pyspark I would like to see how to fit several classifying models. Therefore I started with a simple desicion tree classifier:

dt = DecisionTreeClassifier(featuresCol=idf.getOutputCol(), labelCol=string_indexer.getOutputCol())

# Instanciate a Pipeline
pipeline = Pipeline(stages=[tokenizer, 

# Transform the data and train the classifier on the training set
pipeline_model =

# Transform the data and perform predictions on the test set
df_test_pred = pipeline_model.transform(df_test)

And it actually works. However, I would like to know how to use SVMs, I try this:

from pyspark.mllib.classification import SVMModel
svm = SVMModel(featuresCol=idf.getOutputCol(), labelCol=string_indexer.getOutputCol())

And I get this exception:

TypeError                                 Traceback (most recent call last)
<ipython-input-35-4467c68d7740> in <module>()
---> 14 svm = SVMModel(featuresCol=idf.getOutputCol(), labelCol=string_indexer.getOutputCol())
     16 # Instanciate a Pipeline

TypeError: __init__() got an unexpected keyword argument 'labelCol'

I went to the documentation. Nevertheless, I did not understood how to use SVMModel. Could someone help me to understand how to use SVMModel?

by john doe at May 28, 2016 06:14 PM


How is the environment designed for testing a reinforcement learning algorithm?

I'm working on a project, and have a candidate algorithm which I'd like to test. Before I go any further, I need to get the hang of how to code the "structure" of the environment in which my system is embedded, like what the relevant signals are, how they interact with each other, how they evolve with time, and so on.

I'm looking for resources on design of the environment, for artificial intelligence problems, as I don't want to reinvent any wheels. Any information would be welcome, especially introductory to intermediate level. Thanks!

by Joebevo at May 28, 2016 06:11 PM

Why %RSD of execution times, while sorting hundreds of arrays, is lower for larger arrays of random integers?

I am experimenting with the sorting of arrays and their execution times. While using bubblesort, insertsort and quicksort, I have observed that relative standard deviation (%RSD) of execution times, for five hundred arrays of the same size containing random integers, decreases if the size of the arrays increases.

For example: if the %RSD of execution times while sorting 500 arrays, of size fifty elements each, is observed as 7%, the same for arrays with five hundred elements each is about 2.5%.

So the question is: why this value of change decreases with the increase in the array size?

P.S. I am using a cycle-accurate simulator (gem5) and taking exact clock cycles.

by tod at May 28, 2016 05:58 PM

How to prove the following xor equation?

In a test paper, a question is given as :

Let, x1^x2^x3^x4 =0 where x1, x2, x3, x4 are Boolean variables,and ^ is the XOR operator.

Which one of the following must always be TRUE?
(A) x1x2x3x4 = 0
(B) x1x3+x2 = 0
(C)  x1'^x3' = x2'^x4'
(D) x1+x2+x3+x4 = 0

+ is OR and concatenation is AND operator.

Correct answer is C, which I guessed because when all booloean variables are 1, then, neither A,B, or D will hold,But what is the correct way to get to option C?

by Sumeet Singh at May 28, 2016 05:51 PM


Why don't implement any of spark.mllib algorithms?

Following the Spark MLlib Guide we can read that Spark has two machine learning libraries:

  • spark.mllib, built on top of RDDs.
  •, built on top of Dataframes.

According to this and this question on StackOverflow, Dataframes are better (and newer) than RDDs and should be used whenever possible.

The problem is that I want to use common machine learning algorithms (e.g: Frequent Pattern Mining,Naive Bayes, etc.) and (for dataframes) don't provide such methods, only spark.mllib(for RDDs) provides this algorithms.

If Dataframes are better than RDDs and the referred guide recommends the use of, why aren't common machine learning methods implemented in that lib?

What's the missing point here?

by Fernando Paladini at May 28, 2016 05:51 PM


Number of ways to construct a Regular graph

For a regular graph (each vertex has same degree k), a part of graph upto level l is given, how to compute the total number of ways in which the graph can be completed given that m edges and n vertices are yet to be added to the preconstructed graph ?

Background : I am working on size of frontier for BFS traversal for different types of graphs. The size of the frontier, depends on the structure of actual graph. For regular graphs, multiple structures can be derived with same number of vertices. I tried deriving all possible structures for small k(degree) and n(number of vertices) but unable to get a general formulation for the same.

by Ravikant at May 28, 2016 05:41 PM


Calculating TP,FP,TN,FN for multiple class label in decision tree

I am a beginner in machine learning and classification, I tried to code decision tree algorithm for the prediction of 2 classes.It worked well, but the problem now am facing is , what if i try to predict more than 2 class labels,?? how to calculate the precision and recall for dataset having 5 classes.. I came to know that , this can be possible by calculating the precision and recall values for all the classes and taking average of that, but can anyone help me how do i need to calculate the TP,TN,FP,FN if i have 5 classes?

by user3448883 at May 28, 2016 05:40 PM

Normalizing [0,1] big amount of columns

I am working on Machine Learning, particularly I have a dataset with 50+ columns and 100,000 rows. I need to get the data normalized with ranging to [0,1] (not with standardization) and I've split the dataset in a 80/20 percentatge for the training/test sets.

My question is: I must normalize first the training set and then normalize the test set with the means and deviations extracted from the training set normalization. How can I do that to each one of the columns? I mean, is there a defined method to get the (mean, deviation) tuples for every column in the training set in order to be able to normalize the test set with those values?

by vandermies at May 28, 2016 05:35 PM

Multinomial Logistic Regression in spark ml vs mllib

Spark Version 2.0.0 has the stated goal to bring feature parity between the ml and the now-deprecated mllib packages.

Presently however the ml package provides ElasticNet support but only with binary regression. To obtain multinomial apparently we have to accept using the deprecated mllib?

The downsides of using mllib:

  • It is deprecated. So we will have the "why are you using the old stuff" questions to field
  • They do not use the ml workflow so it does not integrate cleanly
  • For the above reasons we would eventually have to do a rewrite.

Is there an approach available to achieving the one-vs-all multinomial with the ml package?

by javadba at May 28, 2016 05:19 PM

Many-state nominal variables modelling

I was reading about neural networks and found this: "Many-state nominal variables are more difficult to handle. ST Neural Networks has facilities to convert both two-state and many-state nominal variables for use in the neural network. Unfortunately, a nominal variable with a large number of states would require a prohibitive number of numeric variables for one-of-N encoding, driving up the network size and making training difficult. In such a case it is possible (although unsatisfactory) to model the nominal variable using a single numeric index; a better approach is to look for a different way to represent the information."

This is exactly what is happening when I am building my input layer. One-of-N encoding is making the model very complex to design. However, it is mentioned in the above that you can use a numeric index, which I am not sure what he/she means by it. What is a better approach to represent the information? Can neural networks solve a problem with many-state nominal variables?


by M J at May 28, 2016 04:56 PM

Activity recognition [on hold]

Can someone help me getting the source code of this project

I searched in Github,but all codes doesnt work for me Thanks

by Azza Allouch at May 28, 2016 04:54 PM


ex ante tracking error correlation between funds

I have two portfolio's called Comb & Global. They both have the same investable universe lets says 3000 stocks & are measured against the same benchmark. So it is possible that both funds hold the same stocks. I would like to examine the correlation of the ex-ante between the two funds.

I know I can calculate the ex-ante tracking error as below,

te = sqrt((port_wgt - bm_wgt)' * cov_matrix * (port_wgt - bm_wgt))

I also know the correlation is calculated by

 p = cov(x,y) / stdev(x) * stdev(y)

I was wondering the best way to calculate the ex ante correlation between the two funds? Is there a relationship between the two funds weights that I can make use of?


I should have mentioned that the two portfolios are sub portfolios and are combined into one portfolio. So I wanted to see the correlation of the ex ante tracking error between the two sub portfolio's.

I realised I can do the following,

port_wgts - number_of_companies x 2 matrix
cov_matrix - number_of_companies x number_of_companies matrix

so the below line will return a 2x2 covariance matrix.

port_wgts' * cov_matrix * prt_wgts

So I have the variances of both sub portfolios - taking the square root of this gives me the tracking error for both.

Convert the 2 X 2 covariance matrix to a correlation matrix by the following

  D = Diag(cov_matrix)^(1/2)
  corr_matrix = D^-1 * cov_matrix * D^-1

So I now have the correlation between the two sub portfolios just using the weights.

by mHelpMe at May 28, 2016 04:11 PM


Planet Theory

Progress On Modular Computation

New results on computing with modular gates


Shiteng Chen and Periklis Papakonstaninou have just written an interesting paper on modular computation. Its title, “Depth Reduction for Composites,” means converting a depth-{d}, size-{s} {\mathsf{ACC^0}} circuit into a depth-2 circuit that is not too much larger in terms of {d} as well as {s}.

Today Ken and I wish to talk about their paper on the power of modular computation.

One of the great mysteries in computation, among many others, is: what is the power of modular computation over composite numbers? Recall that a gate {\mathsf{MOD}_{r}(x)} outputs {1} if {(x_{1}+\cdots+x_{n)} \equiv 0 \bmod r} and {0} otherwise. It is a simple computation: Add up the inputs modulo {r} and see if the sum is {0}. If so output {1}, else output {0}. This can be recognized by a finite-state automaton with {r} states. It is not a complex computation by any means.

But there lurk in this simple operation {\mathsf{MOD}_{r}} some dark secrets. When {r} is a prime the theory is fairly well understood. There remain some secrets but by Fermat’s Little Theorem a {\mathsf{MOD}_r} gate has the same effect as a polynomial. In general, when {r} is composite, this is not true. This makes understanding {\mathsf{MOD}} gates over composites much harder: simply because polynomials are easy to handle compared to other functions. As I once heard someone say:

“Polynomials are our friends.”

Chen and Papakonstaninou (CP) increase our understanding of modular gates by proving a general theorem about the power of low depth circuits with modular gates. This theorem is an exponential improvement over previous results when the depth is regarded as a parameter rather than constant. Their work also connects with the famous work of Ryan Williams on the relation between {\mathsf{NEXP}} and {\mathsf{ACC^0}}.

Main Theorem

We will just state their main result and then state one of their key lemmas. Call a circuit of {\mathsf{AND}}, {\mathsf{OR}}, {\mathsf{NOT}}, and {\mathsf{MOD}_{r}} gates (for some {r}) an {\mathsf{ACC}}-circuit.

Theorem 1 There is an efficient algorithm that given an {\mathsf{ACC}} circuit of depth {d}, input length {n}, and size {s \ge n}, outputs a depth-2 circuit of the form {\mathsf{SYM}\circ\mathsf{AND}} of size {2^{(log s)^{O(d)}}}, where {\mathsf{SYM}} denotes some gate whose output depends only on the number of {1}s in its input.

This type of theorem is a kind of normal-form theorem. It says that any circuit of a certain type can be converted into a circuit of a simpler type, and this can be done without too much increase in size. In complexity theory we often find that it is very useful to replace a complicated type of computational circuit with a much cleaner type of circuit even if the new circuit is bigger. The import of such theorems is not that the conversion can happen, but that it can be done in a manner that does not blow up the size too much.

This happens all through mathematics: finding normal forms. What makes computational complexity so hard is that the conversion to a simpler type often can be done easily—but doing so without a huge increase in size is the rub. For example, every map

\displaystyle  f \colon A \rightarrow \mathbb{Z},

can be easily shown to be equal to an integer-valued polynomial with coefficients in {\mathbb{Q}} provided {A} is a finite subset of {\mathbb{Z}^{n}}. For every point {a = (a_1,\dots,a_n) \in A}, set

\displaystyle  g_a(x_1,\dots,x_n) = \prod_{i=1}^n \prod_{b_i \neq a_i} (x_i - b_i),

where the inner product is over the finitely many {b_i} that appear in the {i}-th place of some member of {A}. Then {v_a = g_a(a_1,\dots,a_n)} is an integer and is the only nonzero value of {g_a} on {A}. We get

\displaystyle  f'(x_1,\dots,x_n) = \sum_{a \in A}\frac{f(a)}{v_a}g_a(x_1,\dots,x_n),

which is a polynomial that agrees with {f} on {A}.

Well, this is easy but brutish—and exponential size if {A} is. The trick is to show that when {f} is special in some way then the size of the polynomial is not too large.

Lemma 5

One of the key insights of CP is a lemma, Lemma 5 in their paper, that allows us to replace a product of many {\mathsf{MOD}} gates by a summation. We have changed variables in the statement around a little; see the paper for the full statement and context.

Lemma 5 Let {x_{1},\dots,x_{n}} be variables over the integers and let {r,q} be relatively prime. Then there exist {m \leq r^{n+1}} integral linear combinations {y_{1},\dots,y_{m}} of the variables and integer coefficients {c_{j} \in [q]} so that

\displaystyle  \prod_{1 \le i \le n} \mathsf{MOD}_{r}(x_{i}) = \left(\sum_{1 \le j \le m} c_{j}\mathsf{MOD}_{r}(y_{j})\right) \bmod{q}.

The value of {r} can be composite. The final modulus can be {s = qr} in place of {q} and this helps in circuit constructions. Three points to highlight—besides products being replaced by sums—are:

  1. The bound {m \leq r^{n+1}} on the number of terms in the sum depends only on {r}, not {q} (or {s}); this may have other uses.

  2. The arithmetic of the sum is over an extension of {r} to {s}.

  3. The coefficients are integral. That is, the conclusion is doing more than just linearizing with complex coefficients, which as they remark in their proof would be easier. This also helps in building circuits where bits set to {1} are being counted.

Further all of this can be done in a uniform way, so the lemma can be used in algorithms. This is important for their applications. Note this is a type of normal form theorem like we discussed before. It allows us to replace a product by a summation. The idea is that going from products to sums is often a great savings. Think about polynomials: the degree of a multi-variate polynomial is a often a better indicator of its complexity of a polynomial than its number of terms. It enables them to remove layers of large {\mathsf{AND}} gates that were implementing the products (Lemma 8 in the paper) and so avoids the greatest source of size blowup in earlier constructions.

A final point is that the paper makes a great foray into mixed-modulus arithmetic, coupled with the use of exponential sums. This kind of arithmetic is not so “natural” but is well suited to building circuits. Ken once avoided others’ use of mixed-modulus arithmetic by introducing new variables—see the “additive” section of this post which also involves exponential sums.

Open Problems

The result of CP seems quite strong. I am, however, very intrigued by their Lemma 5. It seems that there should be other applications of this lemma. Perhaps we can discover some soon.

by RJLipton+KWRegan at May 28, 2016 04:05 PM


Training neural network for XOR in Ruby

I am trying to train a feedforward network to work to perform an XOR operations with the Ruby Library AI4R. However, when I evaluate for the XOR after training it. I am not getting the correct output. Has anyone used this library before and gotten it to learn the XOR operation.

I am using two input neurons, three neurons in a hidden layer, and one layer for the output, as I saw a precomputed XOR feed forward neural network like this before.

require "rubygems"
require "ai4r"

# Create the network with:
 #   2 inputs
 #   1 hidden layer with 3 neurons
 #   1 outputs
 net =[2, 3, 1])  

 example = [[0,0],[0,1],[1,0],[1,1]]
 result = [[0],[1],[1],[0]]

 # Train the network
 400.times do |i|
   j = i % result.length
   puts net.train(example[j], result[j])

 # Use it: Evaluate data with the trained network
puts "evaluate 0,0: #{net.eval([0,0])}"  # =>  evaluate 0,0: 0.507531383375123
puts "evaluate 0,1: #{net.eval([0,1])}"  # =>  evaluate 0,1: 0.491957823618629
puts "evaluate 1,0: #{net.eval([1,0])}"  # =>  evaluate 1,0: 0.516413912471401
puts "evaluate 1,1: #{net.eval([1,1])}"  # =>  evaluate 1,1: 0.500197884691668


by Flethuseo at May 28, 2016 04:00 PM

Simple Neural Network can't learn XOR

I'm trying to learn about neural networks and coded a simple, back-propagation, neural network that uses sigmoid activation functions, random weight initialization, and learning/gradient momentum.

When configured with 2 inputs, 2 hidden nodes, and 1 it fails to learn XOR and AND. However, it will correctly learn OR.

I fail to see what I have done wrong and so any help would be greatly appreciated.


EDIT: As stated, I tested with 2 hidden nodes but the code below shows a configuration of 3. I simply forgot to change this back to 2 after running tests using 3 hidden nodes.


module Neural

class Network

    attr_accessor :num_inputs, :num_hidden_nodes, :num_output_nodes, :input_weights, :hidden_weights, :hidden_nodes, 
                    :output_nodes, :inputs, :output_error_gradients, :hidden_error_gradients,
                    :previous_input_weight_deltas, :previous_hidden_weight_deltas

    def initialize(config)

    def initialize_input(config)
        self.num_inputs = config[:inputs]
        self.inputs =
        self.inputs[-1] = -1

    def initialize_nodes(config)
        self.num_hidden_nodes = config[:hidden_nodes]
        self.num_output_nodes = config[:output_nodes]
        # treat threshold as an additional input/hidden node with no incoming inputs and a value of -1
        self.output_nodes =
        self.hidden_nodes =
        self.hidden_nodes[-1] = -1

    def initialize_weights
        # treat threshold as an additional input/hidden node with no incoming inputs and a value of -1
        self.input_weights ={}
        self.hidden_weights ={}
        self.previous_input_weight_deltas ={{0}}
        self.previous_hidden_weight_deltas ={{0}}

    def set_random_weights(weights)
        (0...weights.size).each do |i|
            (0...weights[i].size).each do |j|
                weights[i][j] = (rand(100) - 49).to_f / 100

    def calculate_node_values(inputs)
        inputs.each_index do |i|
            self.inputs[i] = inputs[i]

        set_node_values(self.inputs, input_weights, hidden_nodes)
        set_node_values(hidden_nodes, hidden_weights, output_nodes)

    def set_node_values(values, weights, nodes)
        (0...weights.size).each do |i|
            nodes[i] = Network::sigmoid([i]).map{|v,w| v*w}.inject(:+))

    def predict(inputs)
        output_nodes.size == 1 ? output_nodes[0] : output_nodes

    def train(inputs, desired_results, learning_rate, momentum_rate)
        backpropogate_weights(desired_results, learning_rate, momentum_rate)

    def backpropogate_weights(desired_results, learning_rate, momentum_rate)
        output_error_gradients = calculate_output_error_gradients(desired_results)
        hidden_error_gradients = calculate_hidden_error_gradients(output_error_gradients)
        update_all_weights(inputs, desired_results, hidden_error_gradients, output_error_gradients, learning_rate, momentum_rate)

    def self.sigmoid(x)
        1.0 / (1 + Math::E**-x)

    def self.dsigmoid(x)
        sigmoid(x) * (1 - sigmoid(x))

    def calculate_output_error_gradients(desired_results){|desired, result| (desired - result) * Network::dsigmoid(result)}

    def reversed_hidden_weights
        # array[hidden node][weights to output nodes]
        reversed ={}
        hidden_weights.each_index do |i|
            hidden_weights[i].each_index do |j|
                reversed[j][i] = hidden_weights[i][j];


    def calculate_hidden_error_gradients(output_error_gradients)
        reversed = reversed_hidden_weights do |node, i|
            Network::dsigmoid(hidden_nodes[i]) *[i]).map{|error, weight| error*weight}.inject(:+)

    def update_all_weights(inputs, desired_results, hidden_error_gradients, output_error_gradients, learning_rate, momentum_rate)
        update_weights(hidden_nodes, inputs, input_weights, hidden_error_gradients, learning_rate, previous_input_weight_deltas, momentum_rate)
        update_weights(output_nodes, hidden_nodes, hidden_weights, output_error_gradients, learning_rate, previous_hidden_weight_deltas, momentum_rate)

    def update_weights(nodes, values, weights, gradients, learning_rate, previous_deltas, momentum_rate)
        weights.each_index do |i|
            weights[i].each_index do |j|
                delta = learning_rate * gradients[i] * values[j]
                weights[i][j] += delta + momentum_rate * previous_deltas[i][j]
                previous_deltas[i][j] = delta






load "network.rb"

learning_rate = 0.3
momentum_rate = 0.2

nn = => 2, :hidden_nodes => 3, :output_nodes => 1)
10000.times do |i|
    # XOR - doesn't work
    nn.train([0, 0], [0], learning_rate, momentum_rate)
    nn.train([1, 0], [1], learning_rate, momentum_rate)
    nn.train([0, 1], [1], learning_rate, momentum_rate)
    nn.train([1, 1], [0], learning_rate, momentum_rate)

    # AND - very rarely works
    # nn.train([0, 0], [0], learning_rate, momentum_rate)
    # nn.train([1, 0], [0], learning_rate, momentum_rate)
    # nn.train([0, 1], [0], learning_rate, momentum_rate)
    # nn.train([1, 1], [1], learning_rate, momentum_rate)

    # OR - works
    # nn.train([0, 0], [0], learning_rate, momentum_rate)
    # nn.train([1, 0], [1], learning_rate, momentum_rate)
    # nn.train([0, 1], [1], learning_rate, momentum_rate)
    # nn.train([1, 1], [1], learning_rate, momentum_rate)

puts "--- TESTING ---"
puts "[0, 0]"
puts "result "+nn.predict([0, 0]).to_s
puts "[1, 0]"
puts "result "+nn.predict([1, 0]).to_s
puts "[0, 1]"
puts "result "+nn.predict([0, 1]).to_s
puts "[1, 1]"
puts "result "+nn.predict([1, 1]).to_s

by growlyface at May 28, 2016 04:00 PM

Planet Emacsen

Irreal: Writing A Custom Emacs Package

Al Scott is an Emacs user and JavaScript developer who's interested in using the TDD method. He likes to use Mocha to run JavaScript scripts for testing. Unfortunately, there's no easy way to do that from Emacs so Scott decided to write a custom package, tied to the Emacs compile command, that would implement a Mocha test runner. His post walks us through the steps needed to write such a package.

Scott starts by noting that Emacs comes with a lot of built-in facilities to ease Elisp development and that by adding a couple of packages like flycheck and company-mode you have a pretty complete IDE for Elisp. The purpose of his Mocha-runner is to bring some of the same capabilities to JavaScript development.

The rest of the post shows the step by step development of writing the package. He even mentions the steps needed to add the package to the MELPA package repository. Even if you're not a JavaScript developer or interested in TDD, there's a lot of useful information in the post and I recommend it to any Elisp developer.

by jcs at May 28, 2016 03:57 PM


Planet Emacsen

Alex Schroeder: nginx as a caching proxy

Remember the fiasco at the end of 2014 and the beginning of 2015, when Emacs Wiki was down as I tried to switch to Fast CGI, and it kept staying down as I tried to use mod_perl, when finally Nic Ferrier stepped in and started paying for new hosting. He used nginx as a caching proxy for Apache, which continued to call the wiki as a simply CGI script.

The comments reveal quite a bit of caching issues, though. The problem was that nginx seemed to ignore that two pages with an identical URL but for different language preferences could be different. This in turn meant that a German visitor to a page not in the cache would put the page in the cache but with German menus at the bottom, for example (“Diese Seite bearbeiten” instead of “Edit this page”). And if the next visitor to the same page didn’t understand German, this was annoying.

I thought the answer was that Apache should send the “Vary” header telling the cache that accept-language should be part of the cache key. See this discussion of the Vary header, for example: Understanding the HTTP Vary Header and Caching Proxies. But it just didn’t work.

Recently, I suddenly realized that perhaps nginx doesn’t care about the Vary header Apache is sending. I learned about the proxy_header_key setting in the nginx caching guide. As you can see, the default is inadequate! $scheme$proxy_host$request_uri is not what I’m expecting. I needed to add http_accept_language to that key:

# proxy cache key is $scheme$proxy_host$request_uri by default
proxy_cache_key $scheme$proxy_host$request_uri$http_accept_language;

I get the feeling that things are working, right now! If so, it only took 17½ month to find the solution. I’m writing it down in case anybody else stumbles over this problem (and in case I need to remember in the future).

Perhaps the problem is also related to how I compute Etags for my pages: the wiki doesn’t take accept-language into account when computing Etags. Now that I think about it, perhaps that would also have helped solve the problem?


by Alex Schroeder at May 28, 2016 03:19 PM


Stateful function pipeline

The code explains itself.

val s = Seq(1,1,1)
val res: Seq[Int] =
                     .check(count how many 1s, if > 2 throw Exception)

I am searching the simple solution to this check function .

  • I can use map and closure to count and throw, but I want pure function.
  • I can use filter and size or reduce, but it return a value while I want it to be able to composable within the pipeline.

How do I make a pure and stateful check pipe to the pipeline ?

by WeiChing Lin at May 28, 2016 03:09 PM


Classify the set of all TMs whose languages reduce to A_TM

Let $$L = \{ \langle M \rangle \mid M \text{ is a Turing machine so } A_{TM} \leq_m L(M) \}$$ The question is whether $L$ is in $\mathcal{R}, \mathcal{RE}, co-\mathcal{RE}$ or in $\overline{\mathcal{RE} \cup co-\mathcal{RE}}$. I started by denoting $\mathcal{C} = \{L \in \mathcal{RE} \mid A_{TM} \leq_m L \}$ and try to use Rice theorem to show that $L_{\mathcal{C}}=L\notin\mathcal{R}$.

I know that $H_{TM} \leq_m A_{TM}$ so $\mathcal{C} \neq \emptyset$ but I'm stuck with showing that $\mathcal{C} \neq \mathcal{RE}$. I'm starting to think it's not the best strategy.

Another (successful) attempt to show that $L \notin A_{TM}$:

Define reduction $f:A_{TM}\rightarrow L$ on input $\left\langle M,w\right\rangle$ returns:

  • if $M$ accepts $w$ return $ \left\langle U_{TM}\right\rangle $ ($L \left(U_{TM}\right)=A_{TM}$ so $\left\langle U_{TM}\right\rangle \in L$)
  • if $\left\langle M\right\rangle$ rejects $w$ return 1 (not a TM encoding hence not in L)

$f$ is computable and $\left\langle M,w\right\rangle \in A_{TM}\iff f\left(\left\langle M,w\right\rangle \right)\in L$ hence $A_{TM}\leq_m L \implies L\notin co-\mathcal{RE}$.

Now I want to show that $L \notin \mathcal{RE}$. And I'm stuck..


$A_{TM} = \{ \langle M,w \rangle \mid M \text{ is a TM}, w \in L(M)\}$

$H_{TM} = \{ \langle M,w \rangle \mid M \text{ is a TM and $M$ halts on $w$} \}$

by Uria Mor at May 28, 2016 03:04 PM

Is a union of any TM language and the halting problem language decidable

I need to find if the following language is decidable (in $R$):

$L=\{ \langle M \rangle \mid M \text{ is a TM}, L(M)\cup H_{TM}\in RE\}$

Where $H_{TM}$ is of the halting problem.

My intuition is that $RE$ is closed under union. But that does not help me prove anything.

by Mugen at May 28, 2016 03:04 PM


How (and Why) to Log Your Entire Bash History

For the last three and a half years, every single command I’ve run from the command line on my MacBook Pro has been logged to a set of log files.

Uncompressed, these files take up 16 MB of disk space on my laptop. But the return I’ve gotten on that small investment is immense. Being able to go back and find any command you’ve run in the past is so valuable, and it’s so easy to configure, you should definitely set it up today. I’m going to share how to do this so you can take advantage of it as well.

Bash Configuration File

You’ll need to configure an environment variable so that it’s loaded in every command line session. On my MacBook Pro, I use the .bash_profile file. On other operating systems, the .bashrc file is an option. See this blog post on .bash_profile vs .bashrc for more on the differences.


The Bash Prompt HOWTO describes the PROMPT_COMMAND environment variable as follows:

Bash provides an environment variable called PROMPT_COMMAND. The contents of this variable are executed as a regular Bash command just before Bash displays a prompt.

We’re going to set the PROMPT_COMMAND variable to be something that logs the most recent line of history to a file. To do this, add the following to your chosen Bash configuration file (.bash_profile for me):

export PROMPT_COMMAND='if [ "$(id -u)" -ne 0 ]; then echo "$(date "+%Y-%m-%d.%H:%M:%S") $(pwd) $(history 1)" >> ~/.logs/bash-history-$(date "+%Y-%m-%d").log; fi'

First, this checks to make sure we’re not root.

If that checks out, it appends a line that includes the current timestamp, the current working directory, and the last command executed to a log file that includes the current date in the filename.

Having the commands stored in separate files like this really helps when you’re trying to find a command you ran sometime last month, for example.

> grep -h logcat ~/.logs/bash-history-2016-04*
2016-04-01.10:18:03 /Users/me 66555  adb logcat
2016-04-01.10:19:56 /Users/me 66555  adb logcat
2016-04-01.11:01:36 /Users/me 66555  adb logcat
2016-04-05.09:50:25 /Users/me/git/android-project 66368  adb logcat
2016-04-05.13:42:54 /Users/me/git/android-project 66349  adb -s emulator-5554 logcat
2016-04-06.10:40:08 /Users/me/git/android-project 66390  adb logcat
2016-04-06.10:48:54 /Users/me/git/android-project 66342  adb logcat


It will only take a few seconds to update your PROMPT_COMMAND so that it logs every command to a file.

And the next time you’re trying to remember the command line options you used with find that one time (but can’t find in your current session’s history), you’ll be able to look it up in the log files.

Oh, and if you want to know how many times you’ve done a git push in the last three and a half years, you can look that up, too (5,585 git pushes for me)!

The post How (and Why) to Log Your Entire Bash History appeared first on Atomic Spin.

by Patrick Bacon at May 28, 2016 03:00 PM


The Solar Eye - Cyberpunks [webcomics]

Enough coding!

Time for a little break with this June issue of the Fisheye Placebo - The Solar Eye. :-)

If you haven’t read the first edition or not aware about the context, read the comics introduction & its discussion here on lobsters. While the entire code/manuscript is available on Github I don’t see any reason why you or someone you know would fork it and not go hiking all weekend instead.

Have fun lobsters, hope you like it!

P.S. I’m looking for projects or friends who’d like their books to be converted into responsive superbooks. Reach me on or you could use message on Lobsters too.


by marvindanig at May 28, 2016 02:59 PM


Number of ordinal trees with n nodes, of depth d, with l leaves

What is the number of ordinal trees (aka rose trees) with $n$ nodes, of depth $d$, with $l$ leaves?

I thought that it was a known results but I could not find it, and neither did the various combinatorics whom I asked. It seems that it should be computed using generating functions, but beside the fact that such a generating function would have two variables, there is also the fact that the number of nodes and of leaves are additive between the various children of a tree, whereas the depth of a tree is one plus the max of the depth of each of its children: I don't know how to express those in a generating functions.

The reason I ask is that, when considering (some variant of) the cartesian tree of a multiset in order to support Range Minimum Queries on it, the number of runs in the multisets corresponds exactly to the number of leaves, and the number of distinct elements in the multiset gives a bound on the depth of the ordinal tree: the logarithm of the number of such trees will give a lower bound on the space usage of any compressed index supporting Range Minimum Queries, in the worst case over multisets of size $n$ with $l$ runs and $d$ distinct values. (And, with only a bit more work I believe, an asymptotically tight upper bound.)

Demaine et al define an ordinal tree as "a rooted tree of arbitrary degree in which the children are ordered" It seems that they are also called Rose trees in some communities. They should not be confused with cardinal trees, also called $k$-ary trees.

(Cross posted on Math.Stackexchange as suggested by a colleague)

by Jeremy at May 28, 2016 02:55 PM

Testing Isomorphism of projective planes

Miller showed that isomorphism testing of projective planes can be done in $v^{O(\log \log v)}$. I would like to know whether Babai's techniques that led to the quasipolynomial time algorithm for GI would colapse isomorphism testing for projective planes to polynomial time.

Is Miller's algorithm still the best known upper-bound? What other natural hard problems can be solved in $n^{O(\log \log n)}$ time?

Here, hard means the problem has no known polynomial-time algorithm.

by Mohammad Al-Turkistany at May 28, 2016 02:53 PM


Set of vertex-disjoint cycles maximizing different colored vertices

Let $G=(V,E)$ be a directed graph whose vertices $v \in V$ have colors and its edges $e\in E$ have costs $cost(e)$.

I am looking to find a set of vertex-disjoint cycles that:

  1. First maximizes the number of differently colored covered vertices.
  2. In case of ties, break ties by secondly maximizing the number of chosen edges.
  3. In case of any further ties, third minimizes the total cost of the chosen edges.

Original Graph

This is very similar to this problem, with additional constraint (#1). Without the first constraint (maximize the colors) this problem is easily solved by:

  • Splitting each vertex $v$ to $v^+$ and $v^-$.
  • Each vertex $v^+$ and $v^-$ provides a supply of $+1$ and $-1$, respectively.
  • Replacing an original edge $e': v \to w$ with $e: v^+ \to w^-$. Such each edge has cost $cost(e') = cost(e)$ and unit capacity.
  • Adding a "bind" edge $e'' : v^+ \to v^-$. Its cost is $c(e'') \gg 1$

Here is a representation:

Solution without color constraint

And the problem has been reduced to a minimum-flow problem (MCF). A MCF algorithm will choose $A_1 \to B_1 \to A_2 \to B2 \to A_3 \to B_3 \to A_1$.

My question: is it possible to apply a similar modeling to satisfy the color constraint in polynomial time? If so, is it possible to reduce the problem to a MCF problem?

Here is how far I have managed to take it:

Attempt with color constraint

Here, bind-edges $e''$ go through a collective set of nodes, one per color, e.g., "A no+" and "A no-". These represent the corresponding vertices not chosen. Between "A no+" and "A no-" three edges are added with unit capacities and costs $ c(e_{red}) >> c(e_{orange}) >> 1$

I basically model that the cost of not choosing one or even two yellow items is insignificant to the cost of not choosing all three yellow items. Therefore, the goal is to choose $A_1 \to C \to B_3 \to A_1$.

Nevertheless, this did not quite work. Basically, if a flow goes through $A_1^+ \to A_{no}^+ \to A_{no}^-$, it's not guaranteed that it will return to $A_1^-$.

I have tried other approaches, including toying around with non-unit capacities, but I didn't find a solution. Can it be solved in polynomial time?

by Adama at May 28, 2016 02:32 PM


Is there an API to perform request about stocks and financial derivatives? [closed]

Is there an API to make a request such as: all the companies which thier P/E ratio is greater than 50 ? I know I can crawl google finance and have it locally on my desktop, but I prefer something more dynamic.

by 0x90 at May 28, 2016 02:20 PM


ZeroMQ Publish and Subscribe concurrently

I'm working on a program in C++ that needs to be able to send / receive JSON-payloads from an arbitrary number of other clients.

At first, I tried to implement PubNub service, but I figured I can't get and publish messages at the same time (even using two different contexts on distinct threads). I need to be able to do that. I also found that PubNub has too much latency for my liking.

I came across the ZeroMQ library which has a PUB/SUB model that would suit my needs. But all examples I came across explain how to implement it in a way that one process is a Publisher OR a Subscriber, and not both at the same time.

Ideally, I would like to program a server that would relay all messages coming from anyone, to anyone subscribed to a specific channel specified in the message. Anyone should be able to receive and publish messages to anyone else on the network, provided they are subscribed to the correct channel.


Note : I do not need insurance of receipt because payload N+1 will take precedence over payload N. I want a send and forget mean of communication (UDP-like).

As requested : The PubNub limit of 32 kB per JSON-payload was perfect for me, I do not need more. In fact, my payloads are around 4 kB in average. All instances of clients will run on the same local network, so latency should be less than 5 ms ideally. As for the number of clients, there won't be more than 4 clients subscribed to the same channel/topic at a time.


I cannot predict how many channels/topics will exist ahead of time, but it will be in the order of dozens (most of the time), hundreds (at peak). Not thousands.


Q1: - Can I implement such a behavior using ZeroMQ ?
Q2: - Is there any working sample demonstrating that (preferably in C++) ?
Q3: - If not, any suggestions for a library in C++ ?

pubsub architecture

by Hippolyte Barraud at May 28, 2016 02:13 PM

Creation of Labels for Label Spreading in skicit-learn

I am using Label Propagation in skicit-learn for finding labels for the unknown ones

My Input is 'data_list' containing 400 000 sentences in sanskrit language like :

['tatra yad tad mahABAga SaMkara arDa kAyin', 'gam sa bala SrImant duryoDana ariMdama', 'Sigru pattra Bava SAka rucya vAta kaPa apaha']

I gave this 'data_list' as a parameter to method of fit_transform of Tfidf vectorizer and got a sparse matrix 'X'

Link :

    X = fit_transform(data_list)

I then gave this matrix 'X' to TruncatedSVD to obtain a matrix 'X_reduced' which is dimensionally reduced

    svd = TruncatedSVD()
    X_reduced = svd.fit_transform(X)

Now to use the 'fit' function of Label Spreading I need to give two parameters 1)the above matrix X_reduced and 2) Matrix with labels

Link :

    label_spread = LabelSpreading(),labels) 
    #fit(X, y)
    #where X : array-like, shape = [n_samples, n_features]
    #y : array_like, shape = [n_samples]

For creating Labels, I have a list of words (these are the words that occur in above sanskrit sentences) with name 'unique_words' which has labels of '1' or '0' and '-1' for all other words not in the list

The problem I am facing is X_reduced has no. of rows (n_samples) = no. of sanskrit sentences i.e., no. of elements in the above 'data_list' and I should create a labels matrix of same numbers of rows as X_reduced

So how to create labels matrix to give as a parameter to 'fit' function of Label Spreading in skicit-learn

by user102023 at May 28, 2016 01:47 PM


Hadamard gate on entangled qubit

TL;DR: If you have two entangled qubits in the state $|00\rangle + |11\rangle$, what is the result of applying the Hadamard gate on the second qubit, and why?

I am trying to understand $\text{PSPACE} \subseteq \text{QIP}(3)$ (Watrous, 2003), and have troubles understanding the following.

Hadamard gate: Given some qubit, one can test that it is in a uniform superposition by applying the Hadamard gate on it as it will map it to $|0\rangle$ in this case, $$H(|0\rangle + |1\rangle) = |0\rangle.$$ By measuring it, you have some probability to get a $1$ iff. the qubit was not in a perfectly uniform state.

With entanglement: Now, the paper claims that you can use the Hadamard gate to detect entanglement as well. Given two qubits, $x,y$, that are perfectly entangled, you can have them in the superposition $$xy = |00\rangle + |11\rangle.$$ I understand that if you measure $x$, then apply the Hadamard gate to $y$, it will not map it to $|0\rangle$ as it will be entirely determinated by the measurement made on $x$.

What I don't understand: They do not claim to be able to measure/collapse the qubits that are entangled with the qubits they want to test, so how does it work? If you have two entangled qubits in the state $|00\rangle + |11\rangle$, what is the result of applying the Hadamard gate on the second qubit, and why?

Where is it in the paper:

  • When skecthing the protocol at the end of the introduction section (End of first paragraph, page 2):

    If there is significant correlation between the low-index prover responses and the high-index verifier-messages, the uniformity test [Hadamard gate, measure 0s] will fail with high probability

  • When discussing the completeness of the protocol, in particular step 2. of the verifier (start of page 8):

    Next, the verifier applies the Hadamard transform to every qubit in each register of $\bar{R^u}$. If $\bar{R^u}$ now contains only 0 values, the verifier accepts, otherwise the verifier rejects. In short, the verifier is projecting the state of $\bar{R^u}$ onto the state where $\bar{R^u}$ is uniformly distributed over all possible values. It is easy to check that in the case of the honest prover the registers $\bar{R^u}$ are not entangled with any other registers, as each register of $P^u$ depends only on those of $R^u$, and are in a uniform superposition over all possible values. Thus, the verifier accepts with certainty in this case.

  • Also in the proof of soundess, same page

by Winks at May 28, 2016 01:37 PM


long fra and a short ed future with same fixing dates, is convexivity negative or positive?

If you are long a FRA (forward rate agreement) and short a ED (Eurodollars) future with the same fixing dates, do you have positive convexity or negative convexity? Why?

According to the following thread a Eurodollar future has no convexity and the FRA has positive convexity. Why is this true? Or if it's not true, why is it not true? Long Forward Rate Agreement, short Eurodollar futures

I think that the answer is you will have a negative convexity because the convexity adjustment would make a long future have a higher convexity than the FRA. Can anyone tell me if my logic and answer is right?

by doru at May 28, 2016 01:34 PM


Wollt ihr mal einen so richtig schlechten Verlierer ...

Wollt ihr mal einen so richtig schlechten Verlierer sehen?

Die frisch vor Gericht geschlagenen Oracle-Anwälte versuchen jetzt, mit Anti-GPL-FUD die Stimmung in der Community zu ihren Gunsten zu drehen.

Und WENN jemand als GPL-Freund bekannt ist, dann ist es ja wohl Oracle! Schon Sun hat lieber ihre eigene Bullshit-Lizenz für jeden Scheiß erfunden, und die Liste der Projekte, die Oracle unter GPL gestellt hat, lassen sich an Null Fingern abzählen. Oh und auch ansonsten …

Was für eine Farce.

May 28, 2016 01:00 PM

Gute Nachrichten! Die Bundesregierung vergibt einen ...

Gute Nachrichten! Die Bundesregierung vergibt einen Folgeauftrag über eine halbe Milliarde Euro an Tollcollect!

Ein Glück, dass der Bevölkerung ihr Langzeitgedächtnis per Schund-TV wegtrainiert wurde.

May 28, 2016 01:00 PM

Ein britischer Think Tank hat mal frauenfeindliche ...

Ein britischer Think Tank hat mal frauenfeindliche Tweets ausgewertet. Ergebnis: Die Hälfte von denen kommt von Frauen.

Update: Sachen wie "slut walk" und "slut shaming" haben sie schon rausgerechnet, sagen sie.

May 28, 2016 01:00 PM


Can a momentum strategy be cast as a multilinear regression model?

Disclaimer: the question is similar to

Can momentum strategies be quantitative in nature?

and (to an extent)

What is the expected return I should use for the momentum strategy in MV optimization framework?

However (on the surface of it), I did not quite find a desirable answer.

Can a momentum strategy be expressed as something to the effect of $$r = \beta_0 + \beta_1 \frac{ten\_day\_moving\_average}{hundred\_day\_moving\_average} + \epsilon$$ or any other combination of predictors ($x_{2}$, $x_{3}$, ...); or maybe as some nonlinear model?

by A.L. Verminburger at May 28, 2016 01:00 PM



Finding Max Value in Col of a Frequency Table in R Language

I have created a frequency table (table(thing1, thing2)) and I am trying to find a maximum value for one of the columns. Here is my code:

apmData$bin <- as.factor(apmData$bin)
apmBinTable <-table(apmData$bin, apmData$duration1) 
prop.table(apmBinTable, 1)

This is a fraction of my data (bad to good obvi):

bin id    bad      good                
416082 0.010033445 0.989966555
416084 0.017421603 0.982578397
416085 0.023041475 0.976958525
416086 0.019943020 0.980056980
416087 0.005813953 0.994186047
416093 0.017667845 0.982332155
416095 0.015822785 0.984177215

Here is my problem. I don't see a way to get the max value for the column "bad". Technically "bad" and "good" aren't columns because apmBinTable isn't a data.frame. Any suggestions?

by Pullmetotown at May 28, 2016 12:57 PM


Using Mathematical Language Symbols to Map Programs [on hold]

I'm just getting into CS112. I know there's a way to express programs made by software engineers with mathematical symbols. Context-free grammar, would be one of them and this is on the site. Is there a website or a book I can buy to try and map out programs that I would like to test out during the summer to try and get ahead? I'm fairly intellectual, but I don't know the name for the style of using mathematic symbol to define; array, int [E| {matrix}]

I was following through with some CFG, but I'm still not seeing what the word is for the type, before CFG is needed. I understand that m^n && n^m = mm + nn; since the && applies both log[subx]n^m + log[subx]m^n but this is where I don't follow (mind you I'm Algebra II). nth -> (pointer) m, nth -> n assigns both value and is that how C++ uses copy using the equal sign for strings instead of strcpy? That's the main idea of CFG right?

Back to the question, is there anything related to this type of rationalizing with mathematic symbols through MD5, SHA-256, SHA-3, little endian and big endian via hash algorithm as logs? as a program talking to another program via hash because in the end, you run out of variables and names to make up. Can you just keep appending (1++) and make sense?

by mydiax at May 28, 2016 12:42 PM


Two way binding in RxSwift

I read the two way binding operator in sample code of RxSwift.

func <-> <T>(property: ControlProperty<T>, variable: Variable<T>) -> Disposable {
    let bindToUIDisposable = variable.asObservable()
    let bindToVariable = property
        .subscribe(onNext: { n in
            variable.value = n
        }, onCompleted:  {

    return StableCompositeDisposable.create(bindToUIDisposable, bindToVariable)

When property changed, it will notify variable, and set the variable's value, while the variable's value is set, it will notify the property. I think it will lead to endless loop...

by leizh00701 at May 28, 2016 12:26 PM

DragonFly BSD Digest

In Other BSDs for 2016/05/28

Fun links, this week.


by Justin Sherrill at May 28, 2016 12:23 PM

Planet Emacsen


How do I get systemd to start my ZFS mount script before doing anything else?

Right now I'm using zfsonlinux on Fedora 19, and it works excellently. Mounting works at bootup as planned using the initscript (included with zfs) that systemd calls, but there's a pretty significant flaw: ZFS is mounted way, WAY too late during boot, to the point that transmission and other services get very lost without /zfs mounted and can't recover, hence failing them.

What I'm after is a way to get systemd to fire up the script before doing anything else. I know how to do it using initscripts (just set it to a lower runlevel, say 2), but I haven't the foggiest when it comes to doing it with systemd's target system.

I've tried setting things in the services that require zfs to be up, but it doesn't work, barring Type=idle (but there must be a better way of doing it regardless):

And for reference's sake, here's the initscript provided by Fedora. Due to the nature of how ZFS mounts on linux (?), I can't just plop an entry into fstab and be done with it, it has to be mounted using commands.

by luaduck at May 28, 2016 11:58 AM


Formal program verification in practice

As a software engineer, I write a lot of code for industrial products. Relatively complicated stuff with classes, threads, some design efforts, but also some compromises for performance. I do a lot of testing, and I am tired of testing, so I got interested in formal proof tools, such as Coq, Isabelle... Could I use one of these to formally prove that my code is bug-free and be done with it? - but each time I check out one of these tools, I walk away unconvinced that they are usable for everyday software engineering. Now, that could only be me, and I am looking for pointers/opinions/ideas about that :-)

Specifically, I get the impression that to make one of these tools work for me would require a huge investment to properly define to the prover the objects, methods... of the program under consideration. I then wonder if the prover wouldn't just run out of steam given the size of everything it would have to deal with. Or maybe I would have to get rid of side-effects (those prover tools seem to do really well with declarative languages), and I wonder if that would result in "proven code" that could not be used because it would not be fast or small enough. Also, I don't have the luxury of changing the language I work with, it needs to be Java or C++: I can't tell my boss I'm going to code in OXXXml from now on, because it's the only language in which I can prove the correctness of the code...

Could someone with more experience of formal proof tools comment? Again - I would LOVE to use a formal prover tool, I think they are great, but I have the impression they are in an ivory tower that I can't reach from the lowly ditch of Java/C++... (PS: I also LOVE Haskell, OCaml... don't get the wrong idea: I am a fan of declarative languages and formal proof, I am just trying to see how I could realistically make that useful to software engineering)

Update: Since this is fairly broad, let's try the following more specific questions: 1) are there examples of using provers to prove correctness of industrial Java/C++ programs? 2) Would Coq be suitable for that task? 3) If Coq is suitable, should I write the program in Coq first, then generate C++/Java from Coq? 4) Could this approach handle threading and performance optimizations?

by Frank at May 28, 2016 11:47 AM



Chain repetitions of same generator in python

Consider the following simple generator:

def simple_gen():
    for number in range(5):
        yield number ** 2

I'm willing to user itertools.repeat and itertools.chain to chain n times the generator to itself. For clarity consider the following (non-generator) example of the same:

array = [1,2,3,4]
repetitions = itertools.repeat( array ,2)
list(itertools.chain.from_iterable(repetitions)) -> [1,2,3,4,1,2,3,4]

I want the same but using my own generator (simple_gen) instead of array. Of course a simple substitution does not work because itertools.repeat repeats the same object and therefore the following repetitions of the generator will be exhausted.

Some ideas of how to achieve this using the itertools module?

I do not want to cast the generator to a list or another container.

by Dargor at May 28, 2016 11:22 AM

Classification using Approximate Nearest Neighbors in Scikit-Learn

I have a labeled dataset having a 46D featureset and around 5000 samples that I want to classify using Approximate Nearest Neighbors.

Since I'm familiar with Scikit-Learn, I want to utilize it to achieve this goal.

The scikit documentations lists LSHForest as one of the probable methods for ANN, but it's unclear to me how to apply that for classification purposes.

by Amol Agrawal at May 28, 2016 11:19 AM


Docker and volumes hell

We're increasingly using Docker to build packages, a fresh chroot in which we prepare a number of packages, builds typically for ruby (rvm) , or python (virtualenv) or node stuf where the language ecosystem fails on us ... and fpm the whole tree as a working artifact.

An example of such a build is my work on packaging Dashing.

Now part of that build is running the actual build script in docker with a local volume mounted inside the container This is your typical -v=/home/src/dashing-docker/package-scripts:/scripts param.

Earlier this week however I was stuck on a box where that combo did not want to work as expected. Docker clearly mounted the local volume, as it could execute the script in the directory, but for some reason it didn't want to write in the mounted volume.

docker run -v=/home/src/dashing-docker/package-scripts:/scripts dashing/rvm /scripts/packagervm
Usage of loopback devices is strongly discouraged for production use. Either use `--storage-opt dm.thinpooldev` or use `--storage-opt dm.no_warn_on_loop_devices=true` to suppress this warning.
corefines: Your Ruby doesn't support refinements, so I'll fake them using plain monkey-patching (not scoped!).
/usr/local/share/gems/gems/corefines-1.9.0/lib/corefines/support/fake_refinements.rb:26: warning: Refinements are experimental, and the behavior may change in future versions of Ruby!
/usr/share/ruby/fileutils.rb:1381:in `initialize': Permission denied - rvm-1.27.0-1.x86_64.rpm (Errno::EACCES)

So what was I doing wrong, did the Docker params change, did I invert the order of the params, did I mistype them ? I added debugging to the script, (ls , chmod, etc..) and I couldn't seem to read or modify the directory. So I asked a coworker to be my wobbling duck.

He did more .. he wondered if this wasn't selinux.

And he was right..

Apr 23 21:47:00 audit[9570]: AVC avc: denied { write } for pid=9570 comm="fpm" name="package-scripts" dev="dm-2" ino=368957 scontext=system_u:system_r:svirt_lxc_net_t:s0:c47,c929 tcontext=unconfined_u:object_r:unlabeled_t:s0 tclass=dir permissive=0
Apr 23 21:47:02 python3[9597]: SELinux is preventing fpm from write access on the directory /home/src/dashing-docker/package-scripts.

So while I was looking for errors in docker, it was just my selinux set to enforce acting up and me not noticing it.

The quick way to verify obvisously was to setenforce 0 and trigger the build again that however is not a long term fix so I changed the

semanage fcontext -a -t cgroup_t '/home/src/dashing-docker/package-scripts'
restorecon -v '/home/src/dashing-docker/package-scripts'

That solves the problem

by Kris Buytaert at May 28, 2016 11:16 AM


Models crumbling down due to negative (nominal) interest rates

Given that the negative interest rates on a lot of sovereign bonds with maturity under 10 years are trading in the negative (nominal) interest rate territory (recently also the short term EURIBOR has dropped below zero), which are the most striking applications for the models in financial economics/quant finance field?

By that I mean which of the so called "stylized facts" and standard models of modern finance are becoming highly controversial or just plain useless? As a couple of examples which spring to mind are the following (do not necessarily have to do with sovereign bond yields, but the concept of negative (nominal) interest rates as such):

  • The CIR interest rates model completely breaks down due to the square root term
  • The proof that an American call option written on a non-dividend paying underlying will not be exercised before the maturity is false
  • Markowitz selection obviously encounters difficulties incorporating negative yields

What are the other consequences, on let us say, CAPM, APT, M&M or any other model in finance? Which long held beliefs are hurt the most by negative yields?

by user3612816 at May 28, 2016 11:10 AM

Fred Wilson

Video Of The Week: Three Freedoms For The Future

This is an interview that Andrew Keen did with my partner Albert about three freedoms we need for the future:

Here is a link to the Gitbook version of Albert’s book that is mentioned in the video.

by Fred Wilson at May 28, 2016 10:56 AM



Axiomatisation in the presence of recursion

I read Klaus Havelund's thesis on the Fork Calculus:

He develops the Fork calculus for reasoning about concurrent functional programs, the motivation being Concurrent ML.

In chapter 5, he writes:

First, we add a recursion operator, allowing us to define processes 
with infinite behaviour. Recursion is not difficult to deal with 
(thanks to the use of operational semantics), except that the 
axiomatisation cannot be shown complete in presence of recursion.

There is no literature reference here, so I ask you: where can I read about why recursion makes a programming language's axiomatisation cannot be proved complete, and in precisely what meaning?

by Gergely at May 28, 2016 10:12 AM


Stochastic Calculus Rescale Exercise

I have the following system of SDE's

$ dA_t = \kappa_A(\bar{A}-A_t)dt + \sigma_A \sqrt{B_t}dW^A_t \\ dB_t = \kappa_B(\bar{B} - B_t)dt + \sigma_B \sqrt{B_t}dW^B_t $

If $\sigma_B > \sigma_A$ I would consider the volatility $B_t$ to be more volatile than $A_t$ because

$ d\langle A_\bullet\rangle_t = \sigma_A^2 B_t dt$ and $ d\langle B_\bullet\rangle_t = \sigma_B^2 B_t dt$

Now, if I rescale the process $B$ by $\sigma_A^2$ and define $\sigma_A^2B =\tilde{B}$, I get the an equivalent system of SDE's

$ dA_t = \kappa_A(\bar{A}-A_t)dt + \sqrt{\tilde{B}_t}dW^A_t \\ d\tilde{B}_t = \kappa_B(\sigma_A^2\bar{B} - \tilde{B}_t)dt + \sigma_A\sigma_B \sqrt{\tilde{B}_t}dW^B_t $

But now the claim "If $\sigma_B > \sigma_A$ I would consider the volatility $\tilde{B}_t$ to be more volatile than $A_t$" does not hold anymore. Consider $1>\sigma_B>\sigma_A$ and

$ d\langle A_\bullet\rangle_t = \tilde{B}_t dt$ and $ d\langle \tilde{B}_\bullet\rangle_t = \sigma_A^2\sigma_B^2 \tilde{B}_t dt$.

In this case the volatility $\tilde{B}$ of $A$ is more volatile than $A$ only if $\sigma_A^2\sigma_B^2>1$, which is completely different from the condition above ($\sigma_B > \sigma_A$).

What went wrong? Is there some error in the rescalling?

by Phun at May 28, 2016 10:10 AM



How to resolve weired beaviour of JDK1.8?

while learning lambada expressions I'm unable to use functional interface

(abstract method of functional interface with default method of other interface)

my code is as follow :

1 . interface with abstract mehtod

public interface MathOperation {
    public int operation(int a, int b);


public interface GreetingService {
    public void sayMessage(String message);


3 . interface with default method

public interface MathOperatonWithGrettingService {
    default public int operate(int a, int b, MathOperation mathOperaton) {
        return mathOperaton.operation(a, b);


public class BasicLambadaExampleOne implements MathOperatonWithGrettingService{

    public static void main(String[] args) {
        BasicLambadaExampleOne tester = new BasicLambadaExampleOne();
        MathOperation addition = (int a, int b) -> (a + b);
        MathOperation subtraction = (a, b) -> (a - b);
        MathOperation multiplication = (a, b) -> {
            return a * b;
        GreetingService serviceOne = (message) -> System.out.println("Hello from serivce One" + message);
        GreetingService serviceTwo = (message) -> System.out.println("Hello from service two" + message);

        System.out.println("1. ");
        serviceOne.sayMessage("hi from lambada");
        System.out.println("2. ");
        serviceTwo.sayMessage("and keep learning");
        System.out.println("3. ");
        System.out.println(tester.operate(19, 19, multiplication ));//error 


Error Log 1. Hello from serivce Onehi from lambada 2. Hello from service twoand keep learning 3. Exception in thread "main" java.lang.ClassCastException: cannot be cast to at


My system configuration :

windows xp

Pentium dual core processor

3gb ram

1 Many of you might see the output, but when I run the program, it shows me the above error.

2 Why is it happening and how to resolve the JDK behavior so, it won't bug in future??

by holidayCoder at May 28, 2016 09:45 AM



How to work out the forward outright price from the bid/ask quotes?

I'm facing this problem:

Spot AUD/USD is quoted at 0.7634/39; six-months swaps are 112.1/111.1; at what forward outright rate can a price taker sell USD value spot/6 months?

On the spot side, the market is willing to buy the base currency (AUD) at 0.7634 (best bid), and it is willing to sell the base currency at 0.07639 (best ask). The spread is 5 pips.

On the swap side, the thing is a bit more complicated. From my understanding, in a general swap contract:

  • The buyer goes short on base currency (AUD) and long on counter currency (USD)
  • The seller goes long on base currency (AUD) and short on counter currency (USD)

So, in our case the agent wants to sell USD, therefore she is the seller of the swap contract, matching the swap market on the buy-side. She will therefore sell a swap contract at 111.1 (best bid).

The formula to compute the forward outright rate is

$$ FOR = \text{Spot Price} \frac{1+(i_C \frac{\text{Days} }{360 } )}{1+(i_B \frac{\text{Days} }{360 } )} $$

where $i_C$ is the counter currency rate $i_B$ is the base currency rate

But actually I'm missing many inputs here, or I don't see how I can do it since the quotes I have for the swap don't look like rates. Is there anyone who has an idea on how to proceed? Thank you.

by Puzzle at May 28, 2016 09:09 AM


Lamer News


Bag of Features / Visual Words + Locality Sensitive Hashing


I'm really new to Computer Vision/Image Processing and Machine Learning (luckily, I'm more expert on Information retrieval), so please be kind with this filthy peasant! :D


We have a mobile application where the user takes a photo (the query) and the system returns the most similar picture thas was previously taken by some other user (the dataset element). Time performances are crucial, followed by precision and finally by memory usage.


First of all, it's quite obvious that this is a 1-Nearest Neighbor problem (1-NN). LSH is a popular, fast and relatively precise solution for this problem. In particular, my LSH impelementation is about using Kernalized Locality Sensitive Hashing to achieve a good precision to translate a d-dimension vector to a s-dimension binary vector (where s<<d) and then use Fast Exact Search in Hamming Space with Multi-Index Hashing to quickly find the exact nearest neighbor between all the vectors in the dataset (transposed to hamming space).

In addition, I'm going to use SIFT since I want to use a robust keypoint detector&descriptor for my application.


Well, it seems that I already decided everything, right? Actually NO: in my linked question I face the problem about how to represent the set descriptor vectors of a single image into a vector. Why do I need it? Because a query/dataset element in LSH is vector, not a matrix (while SIFT keypoint descriptor set is a matrix). As someone suggested in the comments, the commonest (and most efficient) solution is using the Bag of Features (BoF) model, which I'm still not confident with yet.

So, I read this article, but I have still some questions (see QUESTIONS below)!


First and most important question: do you think that this is a reasonable approach?

  1. Is k-means used in the BoF algorithm the best choice for such an application? What are alternative clustering algorithms?
  2. The dimension of the codeword vector obtained by the BoF is equal to the number of clusters (so k parameter in the k-means approach)?
  3. If 2. is correct, bigger is k then more precise is the BoF vector obtained?
  4. There is any "dynamic" k-means? Since the query image must added to the dataset after the computation is done (remember: the dataset is formed by the images of all submitted queries) the cluster can change in time.
  5. Given a query image, is the process to obtain the codebook vector the same as the one for a dataset image, e.g. we assign each descriptor to a cluster and the i-th dimension of the resulting vector is equal to the number of descriptors assigned to the i-th cluster?

by justHelloWorld at May 28, 2016 07:58 AM


Edit option is gone after sometime for comments [on hold]

Under answer in stack exchange we can start adding comments. But that comment is editable for only for few minutes. After that it will be read only mode for even the owner of the comment.
I have seen this kind of approach in some other places also. I just want to know what is the idea behind this? is this will save some memory lookup in database for modification? OR is it a simple good practice?

by Kanagavelu Sugumar at May 28, 2016 07:57 AM

Decidability of the TM's computing a none empty subset of total functions

I have this HW problem: Let $F$ be the set of computable total functions, and let $\emptyset\subsetneq S\subseteq F$. Denote $$L_S=\{ \langle M \rangle | M \text{ is a TM that computes a function and } f_M\in S \}$$

where $f_M$ is the function $M$ computes.

Prove that for every such none-trivial $S$, $L_S \notin \mathcal{R}$.

I tried to construct

  • $L_{f}=\left\{ x\#y\in{\Sigma^*}|y=f\left(x\right)\right\} $
  • $\tilde{S}=\left\{ L_{f}|f\in S\right\} $
  • $L_{\tilde{S}}=\left\{ \left\langle M\right\rangle |L\left\langle M\right\rangle \in\tilde{S}\right\} $

and then show with Rice that $L_{\tilde{S}} \notin \mathcal{R}$, when the idea behind it was to eventually show that $L_{\tilde{S}} \leq_m L_S $. But the problem here is that I couldn't show a mapping reduction from $L_{\tilde{S}}$ to $L_S$ without assuming $L_{\tilde{S}} \in \mathcal{RE}$ (which I'm quite sure is not true).
So any other directions will be warmly welcomed!

by Uria Mor at May 28, 2016 07:55 AM


Machine learning Curse of dimensionality

I have word vectors from a word2vec model in 500 dim and 1000dim. I am computing the euclidean distance between some example vectors in 500 and 1000 dim. My problem is that I have read papers about the curse of dimensionality: Euclidean distance does not work in high dimension space. But here the results are quite similar for both dimensions. I computed the euclidean distance between 1000 dim vectors:

distance beween girl and boy 
cosine between girl and boy
l1 distance beween girl and boy
distance between girl and neither 
cosine between girl and neither 
distance between boy and neither 
 cosine between boy and neither
distance between girl and charger 
cosine between girl and charger
distance between either and neither 
cosine between either and neither

In 500 dim it is:

distance between girl and boy 
cosine between girl and boy 0.864196148736 
l1 distance between girl and boy 
distance between girl and neither 
cosine between girl and neither 
distance between boy and neither
cosine between boy and neither 
distance between girl and charger 
cosine between girl and charger
distance between either and neither 
cosine between either and neither 

Can someone explain why this is so? Is it related to sparcity?

by code_lover at May 28, 2016 07:42 AM


Could someone teach me how to construct the portfolios by compute (like using R, Excel or Eviews)

Recently, I am doing my dissertation that covers asset pricing theory. The empirical test of Fama 3 factors model is an important part of this dissertation. Please let me review the fama model.

Fama 3 factors model is $r-R_f=α+β_m(K_m−R_f)+\beta_s⋅SMB+\beta_v⋅HML+e$

where $R_f$ is risk free return, ($K_m−R_f$) is premium return and $K_m$ is market return, SMB is the "Small Minus Big" market capitalization risk factor. HML is is the "High Minus Low" value premium risk factor.

Please let mt detail my question. I have over 500 stocks and their monthly return ,monthly market value and monthly book-to-market ratio (call it b/m). The time interval is from 01.2010 to 12.2015, then because the data is monthly so I have 60 periods. Those data is in one excel document.

its methodology and what I want to do is:

  • For all stocks at each period, compute the mean for market value.
  • Divide them into two groups (say big and small size) by comparing each stock with mean. If stock's market value greater than mean, put them into big size group. If small than mean, put them into small size group. Thus, for each period we have two groups, big and small group in terms of market value.
  • For each period, we compare all the stocks' b/m within each group, then furtherly divide them into 3 groups , named high b/m (top 30%) group, medium b/m (middle hierarchy from 30% to 70%) group and low b/m (bottom 30%) group. Namely, for each period, we finally divide 6 groups. They are S/L, S/M, S/H and B/L, B/M, B/H. For example, S/L group contains all stocks that are small market value and low b/m ratio simultaneously.

These 6 groups are just 6 portfolio. So for each period, we have 6 different portfolios.

So far this is my first stage. Then I have second stage:

  1. After we get 6 different portfolios at each period, we need to compute weighted average return for all stocks at each period.
  2. We need to compute the SMB and HML for each period.


Where S/L,S/M....B/L is the summation return of each group. Like S/L is the summation of each stock's return.

Finally, we got time series data, then can regress this series data, and figure out α and 3 $\beta$.

So, basically those are want I want to do.

Now I compute those manually although I use excel, this is really a huge amount of workload, so tired to do this. I am still in step 1 at first stage. Is anyone interesting in telling me how to achieve dividing stocks into portfolios by compute, and compute those data by compute?

People usually just download the data from French's website. But my goal is to check Fama model in Asian stock market, like Shanghai and Hong Kong. All answers welcome, really really crazy and tired to do this manually.

Anyway, thanks everyone in advanced.

by Zabbka osckey at May 28, 2016 07:14 AM


Connecting a webservice to a database in Rust

Second article in this series of covering development with rust.


by fkr at May 28, 2016 06:57 AM

Creating a basic webservice in Rust

Kick-Off into series of posts on development with rust


by fkr at May 28, 2016 06:56 AM


Optimised .reduce() which "breaks out" when answer found [duplicate]

This question already has an answer here:

I'd like to have an optimised version of .reduce { } which "breaks out" when an answer was found. I realise that the point of this "functional programming jewel" is to be simple and clean, but sometimes I can't sacrifice clean over faster run time.

Unfortunately, Is it possible to cut short evaluation of a higher-level function? doesn't provide an answer.

In Array contains a complete subarray @dasblinkenlight has provided the tight code below. I'd like to use it because it's more elegant than my solution in the same post. However, my issue is that if the subArray was found somewhere in the beginning of the main array then the .reduce { } still continues running. This may become a problem if the self array is long.

extension Array where Element : Equatable {
    func indexOfContiguous(subArray:[Element]) -> Int? {
        // This is to prevent construction of a range from zero to negative
        if subArray.count > self.count {
            return nil
        // The index of the match could not exceed data.count-part.count
        return (0...self.count-subArray.count).reduce(nil) { prev, ind in
            // If there is an earlier match "prev", return it;
            // Otherwise, construct a sub-array from current index,
            // and compare its content to what we are looking for.
            prev ?? ([Element](self[ind..<ind+ subArray.count]) == subArray ? ind : nil)

by Daniel at May 28, 2016 06:46 AM


Time complexity for Breadth-First search

I would like to know why the average number of nodes at level d in BFS in a search tree is $\frac{1+b^d}{2}$ as given in this lecture(p.15)?(Here b is the branching factor of the tree and d is the depth of the tree)

I think that as average tells us to take the sum of the numbers and divide by how many numbers are present here we should consider the sum of nodes at level d(considering the deepest level in search tree) and the number of nodes at level d.

Sum of nodes at level d : $b+\dots+b$(d+1 times since we consider the initial depth as zero)=$b(d+1)$

Number of nodes at level d : $b^d$

So the average number of nodes at level d would be $\frac{b(d+1)}{b^d}$

by justin at May 28, 2016 06:26 AM

Planet Theory

TR16-085 | Depth-reduction for composites | Shiteng Chen, Periklis Papakonstantinou

We obtain a new depth-reduction construction, which implies a super-exponential improvement in the depth lower bound separating $NEXP$ from non-uniform $ACC$. In particular, we show that every circuit with $AND,OR,NOT$, and $MOD_m$ gates, $m\in\mathbb{Z}^+$, of polynomial size and depth $d$ can be reduced to a depth-$2$, $SYM\circ AND$, circuit of size $2^{{(\log n)}^{O(d)}}$. This is an exponential size improvement over the traditional Yao-Beigel-Tarui, which has size blowup $2^{{(\log n)}^{2^{O(d)}}}$. Therefore, depth-reduction for composite $m$ matches the size of the Allender-Hertrampf construction for primes from 1989. One immediate implication of depth reduction is an improvement of the depth from $o(\log\log n)$ to $o(\log n/\log\log n)$, in Williams' program for $ACC$ circuit lower bounds against $NEXP$. This is just short of $O(\log n/\log\log n)$ and thus pushes William's program to the $NC^1$ barrier, since $NC^1$ is contained in $ACC$ of depth $O(\log n/\log\log n)$. A second, but non-immediate, implication regards the strengthening of the $ACC$ lower bound in the Chattopadhyay-Santhanam interactive compression setting.

May 28, 2016 05:27 AM


How to calculate the JdK RS-Ratio

Anyone have a clue how to calculate the JdK RS-Ratio?

Let's say I want to compare the Relative strength for these:

  • EWA iShares MSCI Australia Index Fund

  • EWC iShares MSCI Canada Index Fund

  • EWD iShares MSCI Sweden Index Fund

  • EWG iShares MSCI Germany Index Fund

  • EWH iShares MSCI Hong Kong Index Fund

  • EWI iShares MSCI Italy Index Fund

  • EWJ iShares MSCI Japan Index Fund

  • EWK iShares MSCI Belgium Index Fund

  • EWL iShares MSCI Switzerland Index Fund

  • EWM iShares MSCI Malaysia Index Fund

  • EWN iShares MSCI Netherlands Index Fund

  • EWO iShares MSCI Austria Index Fund

  • EWP iShares MSCI Spain Index Fund

  • EWQ iShares MSCI France Index Fund

  • EWS iShares MSCI Singapore Index Fund

  • EWU iShares MSCI United Kingdom Index Fund

  • EWW iShares MSCI Mexico Index Fund

  • EWT iShares MSCI Taiwan Index Fund

  • EWY iShares MSCI South Korea Index Fund

  • EWZ iShares MSCI Brazil Index Fund

  • EZA iShares MSCI South Africa Index Fund

Each of them should be compared to the SP500 (SPY index). Calculate the relative strength of each of them to SPY and have it normalized (I think it is the only solution)

More info on the concept.

enter image description here

enter image description here

by Donedge at May 28, 2016 04:08 AM

Which is better for quantitative finance, a computer science PhD or an applied mathematics PhD? [on hold]

In the world of quantitative finance which of the following would be more highly regarded or useful when it comes to applying for jobs:

  1. A computer science PhD focused on machine learning used to optimize industrial processes.
  2. An applied mathematics PhD focused on acoustics/electromagnetics

by csss at May 28, 2016 03:37 AM


Side-Effects and Execution Time [on hold]

How do side-effect free functions affect the execution time of programs? Does it have both positive and negative effects?

My assumption is that it has both positive and negative effects. No side-effects mean that we need to allocate more memory to certain functions, such as 'quick sort', making execution time somewhat slower.

On the other hand, the absence of side effects mean that we can verify and prove that a function holds for all inputs, and so we don't need to check the output of a function for any errors.

Any input would be greatly appreciated.

by spitz at May 28, 2016 03:34 AM


Monetary Policy and the Yield Curve PART TWO

The Fed has a number of tools/targets with which they manage monetary policy. I'm looking to refine a concise summary of them and looking for guidance/correction/validation.

Think I understand these first three. Please correct me if I'm wrong:

  1. Open Market Operations: The Federal Open Market Committee (FOMC) will often instruct the Federal Reserve Bank of New York to engage in open market operations (buying and selling of US securities) to influence interest rates. Movement at all maturities on the yield curve can reflect such operations; the Fed has been known to try and alter the shape/slope of the curve.
  2. The Discount Window: offers various types of credit at the discount rate; designed for times of stress; rates are high (penalty rates - see Bagehot's Dictum); use of discount window credit may spark regulator investigation. Discount Window credit is typically overnight (primary/secondary) or less than 9 months in the case of seasonal loans. Changes in discount rate only affects the short end of the yield curve.

  3. The Fed Fund rate: overnight rate at which reserve balances, held by banks at the fed can be lent to each other. This rate is calculated from market transactions. The Fed determines their FF rate target and use open market operations to move the Fed Funds rate toward a particular level. Whilst the Fed Fund rate is an overnight rate, it can be related to longer term movements on the yield curve (1-month treasury bill for example) but there are differences; notably the Fed Funds rate, being a market rate, does vary, while the yield on a 1-month Treasury is effectively fixed at the time of purchase. The relationship between the expected values of a fixed rate and a floating rate is expressed through Overnight Indexed Swap values, and 1-month OIS on the Fed Funds rate is the best direct indication of the expected value of compounded overnight borrowing in the Fed Funds market.

I'm looking for further confirmation/understanding no the next two:

  1. The reverse repo program, which enables it to set a floor under short-term secured borrowing rates. This makes sense: reverse repo = sell security, collect payment from bank, reduce their fed reserve balance, decrease supply of money in the system and put upwards pressure on the federal funds rate for example. Is this logic correct?

  2. The interest rate on excess reserves (IOER); from comments on my prior question, I understand that this rate sets the ceiling for fed funds. IOER = interest paid on balances above the required level; how does that set a ceiling? Sounds more like a floor; for a bank to lend its excess reserves, they would want a higher rate than the IOER?

This is a follow on from part one which was posted here.

by Pete at May 28, 2016 03:07 AM


Windows port of SQLite4's LSM storage engine

Still under heavy flux the LSM storage engine for SQLite 4 is pretty cool. Unfortunately it has no Windows port. So I decided to port it to Windows as well.


by mxp at May 28, 2016 03:02 AM

The EFF is Orwellian as fuck

Apologies for the sensational title, but the trend of “policies of convenience” or even “principles of convenience” is a concern.


by tedu at May 28, 2016 02:53 AM


Bug des Tages: Der ASN.1-Parser im Linux-Kernel ist ...

Bug des Tages: Der ASN.1-Parser im Linux-Kernel ist braun.

Wie überhaupt jemand auf die Idee kommen konnte, einen ASN.1-Parser in den Kernel zu tun, erschließt sich mir nicht. Das ist auf der Scheißidee-Skala kurz vor "wir tun GDI inklusive Font-Parsen in den Kernel".

Falls jetzt jemand denkt, hey, kleines Fehlerchen, passiert bestimmt nicht wieder: Es gab da noch einen zweiten ASN.1-Parser im Kernel.

May 28, 2016 02:02 AM



How can time complexities be elements of others? [duplicate]

This question already has an answer here:

I have a problem that I solved, but I'm unsure if my answers are correct or not. I've never seen the implementation of time complexities as elements of others.

f(n) element of O(log(n))
g(n) element of theta(n)

Prove if the below are true are false, provide a counter example if it's false.

1) g(n) element of omega(f(n))
2) f(n)g(n) element of O(nlog(n))
3) f(n)g(n) element of theta(nlog(n))

Below are my attempts, which was just because I thought it was the logical solution. Any ideas on how I can go about solving these types of questions?

1) False, g(n) has a tight bound at (n). If it had a lower bound of f(n), then g(n) should allow log(n), but that's too high of a complexity.

2) True

3) False, f(n) having an upper bound of log(n) implies it could be even faster. Setting a tight bound contradicts that wherein it cant go faster.

by Andrew Raleigh at May 28, 2016 01:33 AM

Prove the halting problem is undecidable using Rice's theorem

Is it possible to prove that the Halting problem is undecidable using Rice's theorem? Here's what I've tried and failed:

  • We want to reduce Rice's Theorem (decide if a language has the nontrivial property P) to the Halting Problem
  • We want to say "if the Halting Problem can be solved, then we can decide if a language as nontrivial property P"

Here's how I thought the proof might go:

  • Suppose we have a machine H that solves the halting problem. We want to use this machine to decide if a language has property P.
  • We construct a machine that uses H to decide a language has property P.

Example: Take P to be "language contains the string 1".

We want to use H to decide if the language of a TM M contains the string 1, i.e that the TM halts and accepts when given input 1.

Now, if M halts on 1, that doesn't mean it accepts 1, so we don't know if L(M) contains 1. On the other hand if M does not halt on 1, that certainly means that L(M) does not contain 1.

And that's where I'm stuck. Is it actually possible to use Rice's theorem to prove that the halting problem is undecidable?

by an idiot at May 28, 2016 01:29 AM


Mixture of 1D Gaussians fit to data in Matlab / Python

I have a discrete curve y=f(x). I know the locations and amplitudes of peaks. I want to approximate the curve by fitting a gaussian at each peak. How should I go about finding the optimized gaussian parameters ? I would like to know if there is any inbuilt function which will make my task simpler.


I have fixed mean of gaussians and tried to optimize on sigma using lsqcurvefit() in matlab. MSE is less. However, I have an additional hard constraint that the value of approximate curve should be equal to the original function at the peaks. This constraint is not satisfied by my model. I am pasting current working code here. I would like to have a solution which obeys the hard constraint at peaks and approximately fits the curve at other points. The basic idea is that the approximate curve has fewer parameters but still closely resembles the original curve.

fun = @(x,xdata)myFun(x,xdata,pks,locs); %pks,locs are the peak locations and amplitudes already available
x0=w(1:6)*0.25; % my initial guess based on domain knowledge 

[sigma resnorm] = lsqcurvefit(fun,x0,xdata,ydata); %xdata and ydata are the original curve data points
recons = myFun(sigma,xdata,pks,locs);
figure;plot(ydata,'r');hold on;plot(recons);

    function f=myFun(sigma,xdata,a,c)
       % a is constant , c is mean of individual gaussians
       for i = 1:6 %use 6 gaussians to approximate function
          f = f + a(i) * exp(-(xdata-c(i)).^2 ./ (2*sigma(i)^2));      

by droidmad at May 28, 2016 01:22 AM


Special term for 'intersection' of option price

Suppose, I have written two ordered lists:

$S_{call}= (\textbf{8000, 8050, 8100}, 8150, 8200, 8250)$ and
$S_{put} = (7850, 7900, 7950, \textbf{8000, 8050, 8100})$.

Entities are correspond to strike prices of call and put on the same underlying asset XYZ.


Spot price is equal to $8067.6$, then XYZ 8050 call and XYZ 8050 put are "at-the-money" options, XYZ 8000 call and XYZ 8100 put are "in-the-money" options, and the remaining options would be "out-of-the-money".

How to name strike prices which are marked with bold? Is there a special term?

by Nick at May 28, 2016 01:14 AM

Bond Duration with Bond portfolio returns

if I have given CRSP bond portfolio returns with different maturities (1m-12m, etc), how is it possible to compute the Future price and the duration? Beside that I do also have the Nelson-svensson-siegel yield curve in form of zero coupon, par yield, instantaneous forward rate etc.

Thank you very much for any help!



by user20280 at May 28, 2016 12:47 AM


curry example in javascript [duplicate]

This question already has an answer here:

I would like to be able to chain my function calls so that I can manipulate them in a functional style. The example I am looking at is below.

describe('can curry functions', function () {

  function toHtml(txt) {
    return '<html>' + txt + '</html>';

  function toHex(txt) {
    return txt
     .map(function(c) { return c.charCodeAt(0).toString(16); })

  it('can call hex then html', function() {

  it('can call html then hex', function() {


How can I achieve this with Javascript?

by Filype at May 28, 2016 12:41 AM


Is there an easy way to not log a specific sshd event?

The monitoring tool of my hoster is making regular connections with ssh on port 22 to see or the service is up.

This is cluttering my /var/log/messages with this kind of messages:

Dec 13 22:20:17  sshd[29487]: Did not receive identification string from 80.xx.xx.xx

Is there an easy way of muting this message "Did not receive identification string from" for that specific ip?

by Peter Smit at May 28, 2016 12:36 AM


Using R with princomp to create hedge baskets

I am experimenting to try to find better ways to hedge some of our equity portfolios. It's easy enough to use R to get a PCA breakdown of exposure for a portfolio but I can't figure out how to then use that to actually hedge something.

In the example below I'm just using four securities. In real life we would use more, but the small set it to keep this example manageable. Here's the R code:

# Get publicly available data
px.spy=getYahooData("SPY", 20130101, 20160101)$Close
px.iwm=getYahooData("IWM", 20130101, 20160101)$Close
px.tlt=getYahooData("TLT", 20130101, 20160101)$Close
px.gld=getYahooData("GLD", 20130101, 20160101)$Close


px=merge( px.spy  , px.iwm )
px=merge( px  , px.tlt )
px=merge( px , px.gld )

# turn it into something that we can put into princomp %>% mutate( SPYr = (lag(SPY,1) - SPY )/SPY , IWMr = (lag(IWM,1) - IWM )/IWM  ,
 TLTr = (lag(TLT,1) - TLT )/TLT , GLDr= (lag(GLD,1) - GLD )/GLD) %>% filter( complete.cases(.)) %>% select(SPYr,IWMr,TLTr,GLDr )


Now we the loadings in pc$loadings. Now, say we wanted to take a portfolio of $100mm of SPY and hedge it with IWM,TLT,GLD based on the PCA exposures. What's the right way to extract that simply and to create a frame with the weightings?

Thanks, Josh

by JoshK at May 28, 2016 12:34 AM

HN Daily

May 27, 2016


Understanding shorthand closure syntax for map function in Swift

I'm trying to understand some of the short hand syntax used by the map function.

The following is the setup

    let array = [1, 2, 3]

    // these make sense
    let arr1 ={String($0)})
    let arr2 ={String($0)}
    let arr3 ={ number in
        return String(number)
    let arr4 ={ (number) -> String in

Here is where the confusion lays. In swift I can forgo the curly braces for map, but this seems like something that can't be done, for my own functions where I have a trailing closure. Some magical inference that's being made perhaps? Also why is the String initialized in this way?

    // this doesn't make sense. Foregoing the curly braces? I can't do that!!!
    let arr5 =    
    let arr6 =    // Compile Error: Cannot convert value of type 'String' to expected argument type '@noescape (Int) throws -> _'

This is me trying to use similar syntax as map

func crap(block:(Int)-> String) {
    print("Int to string block" + block(1));
// works obviously
crap{ "\($0) some garbage" }
// compile error : Anonymous closure argument not contained in a closure
crap( "\($0) some garbage" )

by Biclops at May 27, 2016 11:57 PM


Complexity and number of bits of square root number

Let an integer a, and b is the number of bits a.

1) If I have a number s = sqrt(a), the number of bits s will be up to b/2?

Considering the code:

while (i <= sqrt(a)) {
    if (a % i == 0) return i;

2) The complexity is $O(sqrt(a))$. But in terms of bits? $O(2^(b/2^))$?

by eightShirt at May 27, 2016 11:19 PM



Tarjan's Bridge Finding Algorithm Psuedocode? [on hold]

I was looking for a verbose and well-constructed psuedocode for finding Bridges in an undirected Graph. If possible please share any psueodocode, thank you!

by Danish Amjad Alvi at May 27, 2016 11:00 PM




Predicting next event

I have a set of data, which has 3 possible events. There are 24 features that effect which of the three events will happen. I have training data with all the 24 features and which events happened.

What I want to do is using this data predict which of the three events will happen next, given all the 24 feature values are known.

Could you suggest some machine learning algorithm that I should use to solve this problem

by ila at May 27, 2016 10:49 PM

Using custom classifier for mutilabel classification with GridSearchCV and OneVsRestClassifier

I am trying to use the OneVsRestClassifier to do multilabel classification on a set of comments. My objective is to tag each comment to a possible list of topics. My custom classifier uses a manually curated list of words and their corresponding tags in a csv to tag each comment. I am trying to combine the results obtained from the Bag of Words technique and my custom classifier using the VotingClassifier. Here is part of my existing code:

import numpy as np

from sklearn.base import BaseEstimator, ClassifierMixin
from sklearn.ensemble import VotingClassifier
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.grid_search import GridSearchCV
from sklearn.linear_model import SGDClassifier
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MultiLabelBinarizer

class CustomClassifier(BaseEstimator, ClassifierMixin):
    def __init__(self, word_to_tag):
        self.word_to_tag = word_to_tag

    def fit(self, X, y=None):
        return self

    def predict_proba(self, X):
        prob = np.zeros(shape=(len(self.word_to_tag), 2))

        for index, comment in np.ndenumerate(X):
            prob[index] = [0.5, 0.5]
            for word, label in self.word_to_tag.iteritems():
                if (label == self.class_label) and (comment.find(word) >= 0):
                    prob[index] = [0, 1]

        return prob

    def _get_label(self, ...):
        # Need to have a way of knowing which label being classified
        # by OneVsRestClassifier (self.class_label)

bow_clf = Pipeline([('vect', CountVectorizer(stop_words='english', min_df=1, max_df=0.9)), 
                    ('tfidf', TfidfTransformer(use_idf=False)),
                    ('clf', SGDClassifier(loss='log', penalty='l2', alpha=1e-3, n_iter=5)),
custom_clf = CustomClassifier(word_to_tag_dict)

ovr_clf = OneVsRestClassifier(VotingClassifier(estimators=[('bow', bow_clf), ('custom', custom_clf)],

params = { 'estimator_weights': ([1, 1], [1, 2], [2, 1]) }
gs_clf = GridSearchCV(ovr_clf, params, n_jobs=-1, verbose=1, scoring='precision_samples')

binarizer = MultiLabelBinarizer(), binarizer.fit_transform(y))

My intention is to use this manually curated list of words obtained by several heuristics to improve the results obtained by solely applying bag of words. Currently I am struggling to find a way to know which label is being is classified while predicting, since a copy of CustomClassifier is created for each label using OneVsRestClassifier.

by dshah at May 27, 2016 10:47 PM

calculate p-value for AFT survival model in Spark

I am using Spark ml library to do some survival analysis. Here is the documentation. After training an AFT survival model, I cannot get the p-value directly as in R. What is available for the model are a coefficients vector, scale, intercept, predictions and input data(Y and features). How can I use these variables to calculate p-value? Any advice is welcome. Thanks

by Woaibanzhuan at May 27, 2016 10:36 PM


AI : how can i study it by myself? [on hold]

Peace for All

I'm interested in studying artificial intelligence, i bought a book "artificial intelligence Modern Approach " But I could not begin to delve into it for some reasons So I need your advice for the optimal way to study the topics of artificial intelligence, Note: I've designed several programs, such as 8-queens ,tictactoe But, it is completely free of artificial intelligence techniques, I want to apply these techniques on a programmatic

thank you very much

by user49658 at May 27, 2016 10:34 PM


W^X now mandatory in OpenBSD

Traditional Unix has allowed memory to be mapped W | X. Everyone now knows that’s a bad practice from a security standpoint, but the software ecosystem hasn't made much progress in this area. Theo de Raadt has just committed a change to begin blocking W^X violations in OpenBSD.

Module name:	src
Changes by:	2016/05/27 13:45:04

Modified files:
	lib/libc/sys   : mmap.2 mount.2 mprotect.2 
	sbin/mount     : mntopts.h mount.8 mount.c 
	sbin/mount_ffs : mount_ffs.c 
	sbin/mount_nfs : mount_nfs.c 
	sys/kern       : kern_sysctl.c vfs_syscalls.c 
	sys/sys        : mount.h sysctl.h 
	sys/uvm        : uvm_mmap.c 
	usr.sbin/pstat : pstat.c 

Log message:
W^X violations are no longer permitted by default.  A kernel log message
is generated, and mprotect/mmap return ENOTSUP.  If the sysctl(8) flag
kern.wxabort is set then a SIGABRT occurs instead, for gdb use or coredump

W^X violating programs can be permitted on a ffs/nfs filesystem-basis,
using the "wxallowed" mount option.  One day far in the future
upstream software developers will understand that W^X violations are a
tremendously risky practice and that style of programming will be
banished outright.  Until then, we recommend most users need to use the
wxallowed option on their /usr/local filesystem.  At least your other
filesystems don't permit such programs.

May 27, 2016 10:27 PM


Harvey OS ~ A Fresh Take on Plan 9

Harvey is an effort to get the Plan 9 code working with gcc and clang.


by voidzero at May 27, 2016 10:15 PM


Bug des Tages: systemd killt beim Ausloggen Hintergrundprozesse. ...

Bug des Tages: systemd killt beim Ausloggen Hintergrundprozesse. Das ist eine Option, deren Default jetzt umgestellt wurde, und prompt gehen Detach+Reattach bei screen und tmux nicht mehr richtig, wenn man sich in der Mitte ausgeloggt hat.

Ihr werdet es euch schon gedacht haben: Ist gar kein Bug. Sowas geschieht nicht aus Versehen. Ist Absicht.

May 27, 2016 10:01 PM


Merton model riskless self-financing derivation

Suppose $dA_t = A_t[\mu dt+\sigma dW_t]$ (assets' value) under the physical measure, plus the other assumptions of the Merton model.

Suppose further that debt and equity are tradeable assets that satisfy $A_t = D_t+E_t$ and follow processes $D_t = D(t,A_t)$, $E_t = E(t,A_t)$ for differentiable functions.

By considering a locally risk-free self-financing portfolio of bonds and equity(which by necessity will earn the risk-free rate of return), prove directly that both $D$, $E$ satisfy the Black-Scholes equation: $$\partial_t f+\frac{1}{2}\sigma^2 A^2 \partial_A^2 f+r A \partial_A f-r f = 0$$

by davidpaich at May 27, 2016 09:50 PM



What is the best known FPT result for 3-hitting set?

My research problem involves solving a special instance of the 3-hitting set problem, and I was wondering whether my result is actually significant (i.e. if it is better than the best known result for the 3-hitting set).

I'm specifically interested in the case where the only parameter is the optimal hitting set size.

I know the paper proposes O(2.27^k + n) algorithm, but the paper is pretty old.

by user1441057 at May 27, 2016 09:12 PM


Understanding what Self.Generator.Element is in Swift map definition

Swift noob here.

Many functions have the Self.Generator.Element in the parameters of the closure definition. What the heck is it? I tried to go to the definition and it took me no where.

public func map<T>(@noescape transform: (Self.Generator.Element) throws -> T) rethrows -> [T]

by Biclops at May 27, 2016 09:10 PM


Petrank's proof for the APX-hardness of MAX k-VERTEX COVER on subcubic graphs

My question is about the following maximization problem, which is the "fixed cardinality" version of MIN VERTEX COVER. I am interested in the restriction to subcubic graphs (i.e. of maximum degree 3).

INPUT: a graph $G$, an integer $k$
OUTPUT: a set $C\subseteq V(G)$ of vertices of size $k$
MEASURE: the number of edges of $G$ that have at least one endpoint in $C$

MAX k VERTEX COVER is proved to be APX-hard on subcubic graphs, by a reduction from MIN VERTEX COVER (itself well-known to be APX-hard on subcubic graphs), in this 1994 paper by Petrank (Theorem 5.4). However, the proof is very succinct and in a formalism that I do not know, hence I could not grasp the details of it. I would like to have a formal proof in a more standard language. (For example, a proof that this reduction is an L-reduction would be perfect!)

Question: Can someone explain the proof of Petrank's reduction (for example in terms of standard L-reductions)? Alternatively, give (or point to) a different reduction with such a proof.

Below I give my (rather informal, perhaps incorrect) try at understanding Petrank's reduction.

My understading of Petrank's reduction. Assume that there is a PTAS for MAX $k$-VERTEX COVER. Given $\epsilon'>0$, we will transform a $(1+\epsilon)$-approximation algorithm $A$ for MAX $k$-VERTEX COVER into a $(1+\epsilon')$-approximation algorithm $A'$ for MIN VERTEX COVER, contradicting the APX-hardness for the latter. The reduction is trivial in the sense that the graph remains unchanged.

Let $G$ be a subcubic graph. Run $A$ on $G$ to obtain a set $A(G)$ of $k$ vertices that is a $(1+\epsilon)$-approximation for MAX $k$-VERTEX COVER on $G$ (the value of $k$ will be specified later). Let $C:=A(G)$. For each edge uncovered by $A(G)$, select one of its endpoints and add it to $C$. In the end, $C$ is a vertex cover of $G$. Moreover, on a fraction of roughly $\lambda=k/|V(G)|$ vertices, $C$ is a $(1+\epsilon)$-approximation (?), while on the remaining $(1-\lambda)$-fraction, $C$ is a $3$-approximation (because $G$ is subcubic). So, overall $C$ is an approximation with ratio $(\lambda(1+\epsilon)+(1-\lambda)3)=\lambda(\epsilon-2)+3$. If we fix $\lambda=\frac{\epsilon'-2}{\epsilon-2}$, $C$ is indeed a $(1+\epsilon')$-approximation. So, MAX $k$-VERTEX COVER has no PTAS.

Note: in Lanberg's 1998 master thesis, a stronger result than Petrank's is proved (the hardness remains valid for a wide range of values of $k$), but only for graphs of maximum degree 30 (see Lemma 3.1). I do not know if other results of APX-hardness for the problem exist in the literature.

by Florent Foucaud at May 27, 2016 09:09 PM


How can you find change in working capital and capital expenditures without a balance sheet?

I'm working with the following information trying to work through a valuation exercise and I'm absolutely stuck. How can I find ∆WC and CAPX with this information?

enter image description here

by Parker at May 27, 2016 09:07 PM


exercise on Huffman encode

I have these 4 symbols with their probabilities:

 x   P(x)
 1   0.3
 2   0.3
 3   0.2
 4   0.2

I built the Huffman tree in this way:

enter image description here

and I obtainded:

 x   P(x)   C(x)
 1   0.3     0
 2   0.3     10
 3   0.2     110
 4   0.2     111

it's correct? Because according to the solution the results should be:

 x   P(x)   C(x)
 1   0.3     00
 2   0.3     01
 3   0.2     10
 4   0.2     11

Why? Yet I followed the steps shown here.

by four at May 27, 2016 08:46 PM

a theorem about the enumeration of R and a subset of always halting TMs

The question:

Let $L_1,L_2,...$ be an enumeration of $\mathcal{R}$ and define $A_i = \{\langle M\rangle \ | \ L(M) = L_i\}$. Let $L$ be a language in $\mathcal{RE}$ such that $L \subset \{\langle M\rangle \ | \ \text{M is a TM that always halts}\}$. Prove that there exists an $i$ for which $L∩A_i = ∅$.

Hint $\downarrow$

My approach

I thought of the Hint but I am not sure that the language I get is decidable and I don't know how to prove that all its "always halting" TMs are not in $L$.

by Lior at May 27, 2016 08:32 PM

Overcoming Bias

Rating Ems vs AIs

Everyone without exception believes his own native customs, and the religion he was brought up in, to be the best (Herodotus 440BC).

I’ve given about sixty talks so far on the subject of my book The Age of Em. A common response is to compare my scenario to one where instead of ems, it is non-emulation-based software that can first replace humans on most all jobs. While some want to argue about which tech may come first, most prefer to evaluate which tech they want to come first.

Most who compare to ems to non-em-AI seem to prefer the latter. Some say they are concerned because they see ems as having a lower quality of life than we do today (more on that below). But honestly I mostly hear about humans losing status. Even though both meat humans and ems can both be seen as our descendants, people identify more with meat as “us” and see ems as “them.” So they lament meat no longer being the top dog in-charge center-of-attention.

The two scenarios have many similarities. In both scenarios, meat humans must all retire, and robots take over managing the complex details of this new world, which humans are too slow, distant, and stupid to manage. The world economy can grow very fast, letting meat get collectively very rich, and which meat soon starves depends mostly on how well meat insures and shares among themselves. But it is hard to offer much assurance of long run stability, as the world can plausibly change so fast.

Ems, however, seem more threatening to status than other kinds of sophisticated capable machinery. You can more vividly imagine ems more clearly winning the traditional contests whereby humans compete for status, and then afterward acting superior, such as by laughing at meat humans. In contrast, other machines can be so alien that we may not be tempted to make status comparisons with them.

If, in contrast, your complaint about the em world is that ems have a lower quality of life, then you have to either care about something more like an average quality of life, or you have to argue that the em quality of life is below some sort of “zero”, i.e., the minimum required for a life to be worth living (or having existed). And this seems to me a hard case to make.

Oh I can see you thinking that em lives aren’t as good as yours; pretty much all cultures find ways to see their culture as superior. But unless you argue that em lives are much worse than the typical human life in history, then either you must say the typical human life was not worth living, or you must accept em lives as worth living. And if you claim that the main human lives that have been worth living are those in your culture, I’ll shake my head at your incredible cultural arrogance.

(Yes, some like Nick Bostrom in Superintelligence, focus on which scenario reduces existential risk. But even he at one point says “On balance, it looks like the risk of an AI transition would be reduced if whole brain emulation comes before AI,” and in the end he can’t seem to rank these choices.)

by Robin Hanson at May 27, 2016 08:20 PM

DragonFly BSD Digest

garbage[27]: All these chat programs suck, I’ll create another one

The garbage[27] podcast is out, and it’s covering OpenBSD, iOS, and Android topics, or at least that’s what I guess from the summary, cause I’m still at work.

by Justin Sherrill at May 27, 2016 08:15 PM



Error in recursive function / trying to find all subdirectories of a given directory

I'm trying to write a function that returns all subdirectories of a given directory (recursive).

What I have so far is:

import System.Directory
import Control.Monad
import Data.List

getSubDirs :: FilePath -> IO [FilePath]
getSubDirs dir =
        getDirectoryContents dir
        >>= filterM (\x -> return $ x /= "." && x /= "..")
        >>= mapM (makeAbsolute . (\x -> dir ++ "/" ++ x))
        >>= filterM doesDirectoryExist
        >>= return . reverse

getDirTree :: [FilePath] -> IO [FilePath]
getDirTree [] = return []
getDirTree l@(x:xs) = do
        a <- getSubDirs x >>= getDirTree
        b <- getDirTree xs
        return (nub $ l ++ a ++ b)

Which seems to work. But there is something wrong with the recursion - I would like to get rid of the nub in getDirTree.

by Nausika at May 27, 2016 07:58 PM


What is the intuition on why the longest path problem does not have optimal substructure?

I was learning about longest paths and came across the fact that longest paths in general graphs is not solvable by dynamic programming because the problem lacked optimal substructure (which I think the statement needs to be corrected to longest simple paths on general graphs is not solvable by dynamic programming).

If we assume they have to be simple (for some reason this is required which I don't appreciate yet) and longest, then its easy to create a counter example. Consider the square graph with 4 nodes A→B→C→D→A.

enter image description here

A longest path from A to D is clearly A→B→C-D. But the longest path from B to C is not part of the longest path to from A to D because the longest path from B to C is B→A→D→C which is clearly not the same as the path B→C (which in this case is in fact a shortest path!).

This seems to work only because we required the paths to be simple. Is it necessary to assume that the paths must be simple for us to prove that optimal substructure is not present?

I think that the counter example should be good evidence/proof (which I don't deny), I just don't find the counter example very enlightening at all. I see why it proves the point that it doesn't allow optimal substructure but it fails to provide me real understanding or intuition why it should be obvious that there should be no longest path optimal substructure. Like for example, why doesn't a cut and paste argument work? If the subpath isn't longest, then just stick in a longer path! It sounds very tempting, I mean, if we put in place something longer then of course it should get longer...though, this is for some reason wrong. Does the example actually in fact prove that DP can never solve longest (simple?) paths efficiently? I don't require for a general proof that its not in P (since thats might be asking for a P vs NP solution). I am just after a proof that its not solvable by DP (or at least very strong evidence that DP can never solve this longest paths problem or that the problem does not have optimal substructure).

I have definitively seen in wikipedia that the problem should be NP-Hard, which means that there is probably no fast algorithm. Not sure if that is the only type of evidence/intuition that exists to provide that optimal substructure should be obviously lacking (or if its not lacking, that it can't be used to make the problem faster). Is that the only way to show that it should not be solvable by a fast dynamic program?

Is the reason we require simple just because if we don't place that requirement the problem becomes trivial/uninteresting? In other words, if there is any cycle then one has solved the longest path problem to all nodes reachable from that cycle (by finding any path to that cycle). For the nodes that don't have any cycle reachable, we have a acyclic graph, which is solvable by DP (if the weights are positive?). In addition if there are cycles, we have automatically made things have optimal substructure (I think) because any longest path just is made up of longest path which covers two cases, 1 the paths through the cycle or 2 the paths through the DAG, which both contain optimal substructure. So the problem has become trivial without the requirement of simple paths? Or am I missing something or are there better explanations to why simple paths are required? Doesn't the algorithm in this paragraph show that general longest paths are solvable by DP?

I am also not 100% sure what requirements are needed to guarantee that DP can't be used. Is it necessary to negative edge weights, positive, un-weighted, directed, undirected...what are the requirements?

by Charlie Parker at May 27, 2016 07:41 PM

Complexity of choosing a suitable disjunctive domain for program analysis

Do you know the complexity class of the following problem?


  • A sequential nondeterministic program over integer variables, equipped with initial values of these variables. The exact programming language is not that important; let us take Disjktra's GCL (Guarded Command Language).
  • A finite set S of integer linear inequalities over the program variables, where the program counter is also represented as an integer variable.


Is there a nontautological inductive invariant as a disjunction over a subset of S?

I can prove that the above problem lies in PSPACE, but I have no idea about the lower bound. I need an answer of the form "X-complete" for a suitable X. My hunch is X = PSPACE, but I cannot prove it.

by Mark McGregor at May 27, 2016 07:38 PM


The Great Lobster Boil

Early lobsters communication technology

Hey folks!

  • Do you like discussing things with other crustaceans?
  • Do you hate leaving the house, or do you just live inconveniently far from other people?
  • Do you have a functioning microphone/camera/keyboard?

If you answered “yes”, “sure”, or “what are you selling?” to any of the above, then you’re the lobsters we’re looking for!

I’d like to start a thread to give site users the opportunity to coordinate on meeting each other online and having a friendly chat!


  1. If interested, make a post in this thread. Include preferred contact method, areas of expertise, hobbies, and things you don’t want to talk about.
  2. Users who want to pair up reply to your post.
  3. Using PMs, orchestrate virtual meetup with partner.
  4. PM me (angersock) when you’ve had a meetup so I can see how many people did it. If we get enough interest, we can maybe do a reflection thread later (mods willing).


  • Don’t be a jerk to other lobsters. Don’t be skeevy. Be polite.
  • Try keeping conversations between 0.5 and 1.0 hour, because that’s easier to schedule.
  • Consider using voice-only for communications if you aren’t comfortable, at least for this first run.
  • Consider using throwaway accounts (with your username in there somewhere) for the first run.
  • Talk with lobsters who know more than you about something.
  • Have fun!

EDIT: Shoutout to @ngoldbaum for the new name!

by angersock at May 27, 2016 07:25 PM

DragonFly BSD Digest

ral(4) may suddenly work

If you have a ral(4) wireless card that didn’t function as expected, it may suddenly work for you now on DragonFly 4.5 due to the large wifi update.  The ral(4) driver covers a lot of hardware, so check the man page for all the commercial names.

by Justin Sherrill at May 27, 2016 07:23 PM


Taking intersection in large search

As I understand, you can build the the word -> pages index in Google or large SQL database since indexed search has complexity O(1) -- lookup gives you a billion-page result at once

сomputer -> About 2.14 bln results 
science -> About 1.93 bln results

computer science -> About 377 mln results
"computer science" -> About 147 mln results
science computer -> About 0.97 bln results
"science computer" -> About 452k results

So, you have got 2 billion pages immediately. To find out the pages which contain both words, you seem to need to make an intersection. How do they make it immediately? I do not believe that they try all 10^18 matches

for p in pages(word1):
 for q in pages(word2):
  if p == q yield p

by Little Alien at May 27, 2016 07:20 PM

Computational complexity of logistic map

My question is pretty simple and to the point. Is there a known way to efficiently compute logistic maps to within a specified precision? In other words, the input is a value $x$ and integers $d,n$; the desired output is the result of $n$ iterations of the logistic map applied to $x$, to $d$ bits of precision.

I know of a way to do this using exact real arithmetic but the representations of real numbers that I know and the algorithms I know all take exponential time with respect to the requested number of bits of precision. Using fixed-point arithmetic doesn't work because each multiplication doubles the number of bits of precision needed, so the number of bits needed is exponential in the number of iterations. Is there a known efficient way to compute logistic maps to with specified precision?

by Jake at May 27, 2016 07:04 PM




What does permanent in poly time over $\Bbb F_p$ give?

A paper on arXiv [1] states in abstract that permanent of an $n\times n$ matrix over $\Bbb F_p$ in polynomial time gives $NP=RP$.

My query is why is 'permanent of an $n\times n$ matrix over $\Bbb F_p$ in polynomial time gives $NP=RP$' true and why is stronger conclusion $P=NP$ not conceivable in that case?

  1. A Polynomial-time Algorithm for Computing the Permanent in GF($3^q$) by V. Tarin (2007)

by Student. at May 27, 2016 06:52 PM

Planet Theory

The Man Who Knew Infinity

I generally avoid movies about mathematicians, or mathematics.

I didn't watch Beautiful Mind, or even the Imitation game. Often, popular depiction of mathematics and mathematicians runs as far away from the actual mathematics as possible, and concocts all kinds of strange stories to make a human-relatable tale.

Not that there's anything wrong with that, but it defeats the point of talking about mathematics in the first place, by signalling it as something less interesting.

So I was very worried about going to see The Man Who Knew Infinity, about Srinivas Ramanujan and his collaboration with G. H. Hardy. In addition to all of the above, watching movies about India in the time of the British Raj still sets my teeth on edge.

To cut a long story short, I was happy to be proven completely wrong. TMWKI is a powerful and moving depiction of someone who actually deserves the title of genius. The movie focuses mostly on the time that Ramanujan spent at Cambridge during World War I working with Hardy. There are long conversations about the nature of intuition and proof that any mathematician will find exquisitely familar, and even an attempt to explain the partition function. The math on the boards is not hokey at all (I noticed that Manjul Bhargava was an advisor on the show).

You get a sense of a real tussle between minds: even though the actual discussions of math were only hinted at, the way Hardy and Ramaujan (and Littlewood) interact is very realistic. The larger context of the war, the insular environment of Cambridge, and the overt racism of the British during that period are all signalled without being overbearing, and the focus remains on the almost mystical nature of Ramanujan's insights and the struggle of a lifelong atheist who finally discovers something to believe in.

It took my breath away. Literally. Well done.

by Suresh Venkatasubramanian ( at May 27, 2016 06:35 PM


Virtual Address,Page size,Maximum number of pages [on hold]

In a paging system a virtual address consists of 24 bits in which 16 bits are displacement and 8 bits for page number. Calculate (a) Page size (b) Maximum number of pages (c) Maximum virtual address space

by user39196 at May 27, 2016 06:27 PM



Who pays for sovereign ratings?

Does the "issuer-pay" model hold also for sovereign credit ratings? Do States pay for having their bond being rated?

by Elena De Falco at May 27, 2016 06:21 PM


How to compute the sum of this series involving golden ratio, efficiently?


  1. Let $\tau$ be a function on natural numbers defined as $\tau(n)=\lceil n*\phi^2\rceil$ where $n$ is some natural number and $\phi=\frac{1+\sqrt{5}}{2}$ is the golden ratio. This series can also be looked up here : A004957, with first few terms being $3,6,8,11,14...$ .
  2. Let $t$ be the largest natural number such that $\tau(t) \le 10^{16}$.
  3. Let $S_i$ for $1 \le i \le t$ be defined as $S_i=-(10^{16}-\tau(i)+1)*i+ 2*\sum_{j=\tau(i)}^{j=10^{16}} j $.


How can I compute the sum $(S_1+S_2+...S_t)$ % $7^{10}$ efficiently ? I tried thinking by matrix exponentiation, but can't come up with a method due to the form of the $\tau$ function.

PS: This question is my way of trying to solve the problem : stone game II.

by sasha at May 27, 2016 06:16 PM

Complexity of choosing a suitable conjunctive domain for program analysis

Do you know the complexity class of the following problem?


  • A sequential nondeterministic program over integer variables, equipped with initial values of these variables. The exact programming language is not that important; let us take Disjktra's GCL (Guarded Command Language).
  • A finite set S of integer linear inequalities over the program variables, where the program counter is also represented as an integer variable.


Is there a nontautological inductive invariant as a conjunction over a subset of S?

I need an answer of the form "X-complete" for a suitable X. My hunch is X = PSPACE, but I cannot prove it. The only thing I can prove is that the above problem lies in PSPACE, but I'm not sure about the lower bound.

PS. Notice that here, in a conjunctive domain, only formulas are allowed which are conjunctions of predicates. In a similar question about disjunctive domains, only formulas are allowed which are disjunctions of predicates. It is unclear whether the two problems are formally related or even reducible (e.g., by logspace-reductions) to one another.

by Mark McGregor at May 27, 2016 05:58 PM

Complexity of choosing a suitable Boolean algebra domain for program analysis

Do you know the complexity class of the following problem?


  • A sequential non-deterministic program over integer variables, equipped with initial values of these variables. The exact programming language is not that important; let us take Disjktra's GCL (Guarded Command Language).
  • A finite set S of integer linear inequalities over the program variables, where the program counter is also represented as an integer variable.


Is there a non-tautological inductive invariant described by a quantifier-free Boolean formula over a subset of S?

I need an answer of the form "X-complete" for some X. My guess is X = PSPACE, but I cannot prove it.

PS. Notice that arbitrary (quantifier-free) Boolean formulas are allowed to involve negation, while a similar question on "conjunctive domains" does not allow negations.

by Mark McGregor at May 27, 2016 05:49 PM

Priority Queue using an AVL tree, run time question

This is a question I want to answer in pseudocode: This is regarding a sort of priority queue using an AVL tree. I initialize a global variable (named GLOB) with 0. I receive from the user an input integer of VALUE and i'm required to insert it into an AVL tree. So i created an AVL tree, where each node contains a key that equals the original value of the node plus the global variable. (key = Value + Glob) I input n number of objects into the AVL tree (I assume each one has a unique key)

The question is: I want to increase GLOB by a certain amount, and therefor increase the KEY of each node by that amount. Will this happen in a run time of O(1) or O(n) since i need to "update" all of the nodes, even though they simply contain pointers to GLOB?

by LifeOfPai at May 27, 2016 05:48 PM


Compose functions into another function

Is there a function for Haskell in the standard library which takes three functions and returns a function which applies the return values of the first two functions to the third function, something like this:

compact :: (a -> b) -> (a -> c) -> (b -> c -> d) -> a -> d
compact a b c = \x -> c (a x) (b x)

Or this:

import Control.Arrow
compact' :: (a -> b) -> (a -> c) -> (b -> c -> d) -> a -> d
compact' a b c = uncurry c . (a &&& b)

So that:

compact (take 1) (drop 2) (++) [1,2,3,4] == [1,3,4]
compact (+10) (*2) (<) 11 == True
compact (+10) (*2) (<) 9 == False

by foxterfox at May 27, 2016 05:38 PM


Where can I get equivalent of 3 months libor or swap historical data?

Please note: I have already checked your standard "Historical data sources" link, but it does not have the data I need:

I am looking for 5 years of libor/swap data for major currencies. Daily, or even better hourly.

Is this available anywhere?

An example of what I would like is: Bloomberg ADSW2 CMPL Curncy.

Is there a free equivalent?

by ManInMoon at May 27, 2016 05:07 PM


Should names of built-in funtions be part of BNF grammar?

We're creating our own programming language for a subject. And we're confused whether to add the names of the functions we want to have in our language (like our own version of print) in the BNF grammar.

Reserved words should be recognized by the lexer, right? How about function names?

by Aquila0916 at May 27, 2016 04:55 PM



Formal proof using well-founded induction for relationship between processes in languages [on hold]

I am having to prove that two processes in two different languages are deterministic and have a equivalence relationship between them. I plan to prove this by using well-founded induction on the structure of derivations . In this regards is my below Theorem correct that i need to prove?.

enter image description here

i.e There exists a relation R between P1 and P2 if both executed in same state provide a resulting state and i can prove the resulting state are equal, then it implies that the processes are related ? .

Or have i grossly missed out something?

by anil keshav at May 27, 2016 04:46 PM



Finding a minimum tree which is isomorphic to a subtree of $T_1$ but not to a subtree of $T_2$

Consider the problem that receives two trees $T_1$, $T_2$, and asks to find a minimum size tree $T$ such that there exists a subtree of $T_1$ which is isomorphic to $T$, but there is no such isomorphic subtree in $T_2$.

What is known about the complexity of this problem?

by R B at May 27, 2016 04:43 PM


Time series of European sovereign credit ratings by the Big Three?

I would need time series, from 2000 to 2015 (if possible) of sovereign credit ratings by Moody's, S&P and Fitch. Could you suggest me a source or provide me such a dataset? Thank you very much!

by Elena De Falco at May 27, 2016 04:35 PM

How to simulate correlated Geometric brownian motion for n assets?

So I'm trying to simulate currency movements for several currencies with a given correlation matrix. I have the initial price, drift and volatility for each of the separate currencies, and I want to simulate their prices against USD with correlations following the matrix. I'm doing this in Excel. I read somewhere that multiplying a vector of independent GBMs with the Cholesky decomposition of the correlation matrix gives the required result, but doesn't work.

Any help?

by Yashvardhan at May 27, 2016 04:26 PM

How to take care of newly auctioned yield/price in fixed income data

This is a financial data cleaning question. I have raw price and yield data for US cash treasury across the curve. In the time-series there are jumps on the day after the treasury auction results come out. Prior to using the data, is it good practice to manually remove the jumps? Thanks.

by A1122 at May 27, 2016 04:01 PM


Wolltet ihr schon immer mal wissen, wie die fiese russische ...

Wolltet ihr schon immer mal wissen, wie die fiese russische Propagandamaschine funktioniert?

Heute ist eure Gelegenheit!


instead of trying to silence the western propaganda – they actually actively promote it!
Und wenn man nicht in der Propagandablase lebt, aus der die Sprüche kommen, dann wirken die natürlich alle grotesk und lächerlich. Russen sind mit schlechter Propaganda aufgewachsen. Die erkennen schlechte Propaganda, wenn sie sie sehen. Also zeigt man ihnen die aus dem Ausland über Russland!

Und so gibt es in Russland auch immer wieder Talkshow-Spots für ukrainische Nationalisten und "Journalisten" aus den USA, wenn sie sich auf russisch verständigen können. Und da muss man dann gar nichts weiter machen, einfach die sich um Kopf und Kragen reden lassen :-) (via)

Update: Bei der Propagandaschau gibt es eine deutsche Übersetzung.

May 27, 2016 04:01 PM


How do you find distribution and parameters for real data? (Python 3)

I have a dataset from sklearn and I plotted the distribution of vectorized data.

Using Python 3, How can I get the distribution-type and parameters of the distribution this most closely resembles?

All I know it's positive and skewed. . . Is there a way in Python to give it a few distributions and then get the best fit for it? Or, to actually suggest a fit based on the data that's given? That would be realllllly useful.

from sklearn.datasets import load_diabetes
import matplotlib.pyplot as plt
import seaborn as sns; sns.set()
import pandas as pd

#Get Data
data = load_diabetes()
X, y_ =,

#Organize Data
SR_y = pd.Series(y_, name="y_ (Target Vector Distribution)")

#Plot Data
fig, ax = plt.subplots()
sns.distplot(SR_y, bins=25, color="g", ax=ax)

enter image description here

Note: Ultimately I'm trying to figure out good prior parameter defaults for Bayesian inference. I know you're not supposed to use the data at all in your priors but I have no idea what default parameters to use so I have to start somewhere. Anyone with good stats background who could give some insight into this your help would be GREATLY APPRECIATED

by O.rka at May 27, 2016 03:58 PM


Does the Halting Problem prove that true Artificial Intelligence is impossible?

The Halting Problem demonstrates that there are things that a machine can never be programmed to do.

Is this proof that true Artificial Intelligence - that is, the ability for a machine to think and reason like humans - is not possible?

by CodyBugstein at May 27, 2016 03:46 PM

Complexity of choosing a suitable positive domain for program analysis

Do you know the complexity class of the following problem?


  • A sequential nondeterministic program over integer variables, equipped with initial values of these variables. The exact programming language is not that important; let us take Disjktra's GCL (Guarded Command Language).
  • A finite set S of integer linear inequalities over the program variables, where the program counter is also represented as an integer variable.


Is there a nontautological inductive invariant described by a quantifier-free positive Boolean combination over a subset of S?

I need an answer of the form "X-complete" for a suitable X. My hunch is X = PSPACE, but I cannot prove it. The only direction I can prove is PSPACE-hardness by a reduction from proving (non)reachability of a location in a sequential nonrecursive program on Boolean variables.

by Mark McGregor at May 27, 2016 03:36 PM

Lossless Condensing, Modification, and Decondensing

Given a string $\alpha$ that is derived from context-free grammar $G$, what is an algorithm $f$ such that there exists a string $\beta$ (derived from an unrestricted grammar) where

  • $f(\alpha)=\beta$,
  • $f^{-1}(\beta)=\alpha$,
  • $\forall \alpha_0,\alpha_1,\alpha_2\in G. d(\alpha_0,\alpha_1)<d(\alpha_0,\alpha_2)\leftrightarrow d(f(\alpha_0),f(\alpha_1))<d(f(\alpha_0),f(\alpha_2))$ where $d(\alpha_0,\alpha_1)$ is the Hamming distance between $\alpha_0$ and $\alpha_1$ (i.e., small perturbations of $\beta$ lead to small perturbations of $\alpha$, and vice versa), and
  • for any perturbation $\beta'$ of $\beta$ (say, by substituting a letter) we have $f^{-1}(\beta')=\alpha'$ such that
    • $\alpha'$ is a string derived from $G$ (i.e., "valid" w.r.t. $G$) and
    • $\alpha'$ is a perturbation of $\alpha$.

On a high level, I would like translate arbitrary modifications to some "dense" representation of a string $\alpha$ into "valid" modifications of $\alpha$ ("valid" means that $\alpha'\in G$).

Since arbitrary modifications of $\beta$ yield valid modifications of $\alpha$, we can say that $\beta$ is a "condensed" representation of $\alpha$, $f$ is the "condensing" operator, and $f^{-1}$ is the "de-condensing" operator.

It doesn't necessarily need to be complete such that all valid modifications of $\alpha$ can be generated but it should be sound such that only "valid modifications of $\alpha$ can be generated.

My first idea was to conceptually index each string that can be generated by $G$. For instance, if $\alpha$ has index $\beta=12$, we could generate a perturbation $\beta'=15$ and find $\alpha'$ as the string derived from $G$ with index 15. However, $\alpha'$ may be very different from $\alpha$ e.g., in terms of Hamming distance and I would need to establish an order over all strings and be able to enumerate the first $n$ strings.

Any ideas or pointers?

by Marcel at May 27, 2016 03:30 PM


k-NN algorithm accuracy for iris dataset

I'm currently studding k-NN algorithm and I've implemented it in the simplest way (I used euclidean distance, no normalization etc.). I choose iris dataset:

I ran my program 100 times and calculated the average accuracy. And I did this several times. The average accuracy lays between 95.0% and 97.0%, but in most cases it is about 96.0%. However for single launches the accuracy may be even 100.0%.

So my question: is such results good for this algorithm and this dataset? I thought that the average accuracy for iris dataset should be extremely high, more than 98.0%. Is such a big (about 2.0%) difference between the average accuracy values normal?

Here is my code in Python (now k = 3):

from random import random
from collections import defaultdict

class kNN:
    def __init__(self, filename, k, split, n_of_tests = 1, print_info = False):
        self.filename = filename
        self.k = k
        self.split = split
        self.n_of_tests = n_of_tests
        self.print_info = print_info

    def get_datasets(self):
        with open(self.filename, 'r') as f:
            dataset = []
            for i in f:
                temp = i.rstrip('\n').split(',')
                dataset.append(list(map(float, temp[:-1])) + [temp[-1]])
        self.train_set = []
        self.test_set = []
        for i in dataset:
            if random() < self.split:

    def get_neighbors(self, instance):
        distances = [[euclidean_distance(instance, i), i] for i in self.train_set]
        neighbors = [i[-1] for i in distances][:self.k]
        return neighbors

    def get_instance_class(neighbors):
        neighbors_classes = defaultdict(int)
        for i in neighbors:
            neighbors_classes[i[-1]] += 1
        return max(neighbors_classes, key = neighbors_classes.get)

    def run(self):
        if (self.print_info):
            print('train set:', len(self.train_set))
            print('test set:', len(self.test_set))
        count = 0
        for i in self.test_set:
            predicted_class = self.get_instance_class(self.get_neighbors(i))
            if predicted_class == i[-1]:
                count += 1
            if self.print_info:
                print('predicted: ' + repr(predicted_class) + \
                  ', actual: ' + repr(i[-1]))
        if self.print_info:
            print('accuracy:', count / len(self.test_set) * 100)
        return count / len(self.test_set) * 100

    def test(self):
        results = []
        for i in range(self.n_of_tests):
        return sum(results) / len(results)

def euclidean_distance(a, b):
    return sum([(i[0] - i[1]) ** 2 for i in zip(a[:-1], b[:-1])]) ** 0.5

classifier = kNN('', 3, 0.66, 100)

by Count Zero at May 27, 2016 03:25 PM


Proving correctness of an exponentiation routine

I have the following exponentiation routine, which takes $O(\log n)$ steps

function power(a:real, n:positive integer)
  while i > 0 do
    if(i % 2 == 1) then   //i odd
    if(i > 0) then

I'm trying to prove that it always gives $a^n$. I think it might be provable via induction on $n$. in the examples i went through i found that it always ends with $i=1$ before the final division down to $0$.

by miniparser at May 27, 2016 03:20 PM


Calculate n number and n operator in C

I want to write an console application to Calculate n number and n operator in C language using CodeBlock.

For Example I need a program that can Calculate this:




How can I do this? Please help me. Thanks a lot

by Mahdi.s at May 27, 2016 03:14 PM

Classify a text with weka after classifier has been trained

I am a beginner with weka.

I have managed to import dataset from the disk (one folder by category, all text related to this category inside the folder), apply StringToWordVector with tokenizer, train a Naive Multniomial categorizer ... The code is below (it is c# but Java is ok of course)

However, I can hardly find information on how to use the categorizer on a project. Say I have a text with unknown category, input by a user, how can I just apply the categorizer to this text and infer the category it belongs to ? (code "// what to do here below"). Any help would be greatly appreciated ;-)

Thanks in advance


string filepath = @"C:\Users\Julien\Desktop\Meal\";
    ClassificationDatasetHelper classHelper = new ClassificationDatasetHelper();
    weka.core.converters.TextDirectoryLoader tdl = new

    weka.core.Instances insts = tdl.getDataSet();

    weka.filters.unsupervised.attribute.StringToWordVector swv = new weka.filters.unsupervised.attribute.StringToWordVector();
    weka.core.tokenizers.NGramTokenizer tokenizer = new weka.core.tokenizers.NGramTokenizer();

    insts = weka.filters.Filter.useFilter(insts, swv);


    weka.classifiers.Classifier cl = new weka.classifiers.bayes.NaiveBayesMultinomial();
    int trainSize = insts.numInstances() * percentSplit / 100;
    int testSize = insts.numInstances() - trainSize;
    weka.core.Instances train = new weka.core.Instances(insts, 0, trainSize);

    string s = "Try to classify this text";
    weka.core.Instance instanceToClassify = new weka.core.Instance();

    // what to do here
    // ???

    double predictedClass = cl.classifyInstance(instanceToClassify);


by JulienS at May 27, 2016 03:13 PM


Is there an approximation algorithm for MAX k DOUBLE SET COVER?

Given a set system $(X,\mathcal S)$, let us say that a subset $\mathcal C\subseteq \mathcal S$ doubly covers a vertex $x$ in $X$ if $x$ is contained in at least two sets of $\mathcal C$. Let us define the following maximization problem:

INPUT: a set system $(X,\mathcal S)$, an integer $k$
OUTPUT: a set $\mathcal C\subseteq \mathcal S$, $|\mathcal C| \le k$
MEASURE: the number of elements of $X$ that are doubly covered by $\mathcal C$

MAX k DOUBLE SET COVER is a generalization of the problem MAX k SET COVER, where we only require simple coverage instead of double coverage (see the celebrated 1998 paper by Feige, "A threshold of $\ln n$ for approximating Set Cover"), where a $(1-1/e)$-approximation is given).

My question: Is there any known nontrivial positive approximation bound for MAX k DOUBLE SET COVER?
Note: I am especially interested in the restriction where the set system has maximum intersection one, that is, any two sets in $\mathcal S$ share at most one element (SET COVER for these set systems remains hard to approximate, see this paper).

As pointed out in the answer to this other question, the problem DENSEST k-SUBGRAPH is a special instance restriction of MAX k DOUBLE SET COVER with set systems with intersection one: an input graph $G$ corresponds to the hypergraph $(E(G),\mathcal S_G)$ where $\mathcal S_G$ has a set $S_v$ for each vertex $v$ of $V(G)$ containing the edges incident to $v$ in $G$.
DENSEST k-SUBGRAPH has an $O(n^{1/4})$-approximation algorithm (this paper); perhaps this algorithm can been (has been) extended to MAX k DOUBLE SET COVER (for set systems with intersection one)?
(The complexity of DENSEST k-SUBGRAPH being well-studied and a tough open problem, for MAX k DOUBLE SET COVER I am not expecting anything better than an approximation ratio of the form $O(n^{\delta})$ for some $\delta<1$.)

Any related observation or reference is welcome. Thanks!

by Florent Foucaud at May 27, 2016 03:09 PM


The meaning of $A\in co-NP$?

I trying to understand the meaning of $A\in co-NP$. That's mean that $\bar{A}\in NP$? (I followed the definition).

$A\subseteq \Sigma^{*}$.

Thank you!

by Yoar at May 27, 2016 03:07 PM



Why is the set of perfect squares in P?

I am reading an article by Cook [1]. In it he writes:

The set of perfect squares is in P, since Newton's method can be used to efficiently approximate square roots.

I can see how to use Newton's method to decide whether a natural number is a perfect square or not, but it seems tricky to analyse.

Is there an easier way to see that the set of perfect squares is in P?

  1. S. Cook, The P versus NP problem, Online

by Wei-Cheng Liu at May 27, 2016 02:54 PM



How to use feature selection and dimensionality reduction in Unsupervised learning?

I've been working on classifying emails from two authors. I've been successful in executing the same using supervised learning along with TFIDF vectorization of text, PCA and SelectPercentile feature selection. I used scikit-learn package to achieve the same.

Now I wanted to try the same using Unsupervised Learning KMeans algorithm to cluster the emails into two groups. I have created dataset wherein I have each data point as a single line in the python list. Since I am a newbie to unsupervised so I wanted to ask if I can apply the same dimensionality reduction tools as used in supervised (TFIDF, PCA and SelectPercentile). If not then what are their counterparts? I am using scikit-learn for coding it up.

I looked around on stackoverflow but couldn't get a satisfactory answer. I am really stuck at this point.

Please help!

by user2168281 at May 27, 2016 02:39 PM



What is the effect of mean-reversion on an upper barrier knock-out call option?

Consider a mean-reverting normal model for an underlying

$dX^{(1)}_t=-\kappa X^{(1)}_tdt+\sigma^{(1)} dW^{(1)}_t$,

for fixed time-independent constants, $\kappa$ (mean-reversion) and $\sigma^{(1)}$ (volatility) and Brownian motion, $W^{(1)}_t$. Suppose that using this model, I calculate options prices for all $t$, then calibrate the time-dependent local vol, $\sigma_t^{(2)}$, of a second normal model (without mean-reversion)

$dX^{(2)}_t=\sigma_t^{(2)} dW^{(2)}_t$,

so that the two models give the same prices for vanilla options at all times.

Will a continuous upper barrier knock-out call option be cheaper in the first or second model?

For simplicity, take $X_0=Y_0=0$, and assume that the upper barrier, $B$, is larger than the strike, $K$.

by Jase Uknow at May 27, 2016 02:15 PM

Lamer News



Useful equivalence relations on $X^{\ast}$ (like the Nerode and syntactic equivalence relations)

I want to have an overview of all the meaningful equivalence relations defined on $X^{\ast}$, in particular when the languages in question are regular. They typically arise in connection with the minimization of automata. Further I want to know all the relations among them. Below I write what I know, I encourage everyone to add useful facts to this list, give other equivalence relations and how they relate to the given ones!?

Common equivalence relations on $X^{\ast}$ are \begin{align*} u \equiv_L v & :\Leftrightarrow \forall w \in X^{\ast} : uw \in L \leftrightarrow vw \in L \\ u \equiv_R v & :\Leftrightarrow \forall w \in X^{\ast} : wu \in L \leftrightarrow wv \in L \\ u \equiv_S v & :\Leftrightarrow \forall w_1, w_2 \in X^{\ast} : w_1uw_2 \in L \leftrightarrow w_1vw_2 \in L \end{align*} If we have some deterministic and complete automaton $\mathcal A = (X,Q,\delta,q_0,F)$ with $L = L(\mathcal A)$, then every $u \in X^{\ast}$ corresponds to a function $\varphi_u : Q \to Q$. Then we can define $$ u \equiv_A v :\Leftrightarrow \delta(q_0, u) = \delta(q_0,v) \Leftrightarrow \varphi_u(q_0) = \varphi_v(q_0). $$ Also for states $q, q' \in Q$ we could define $$ q \equiv q' :\Leftrightarrow \forall u \in X^{\ast} : \delta(q, u) = \delta(q',u) \Leftrightarrow \varphi_u(q) = \varphi_v(q') $$ and by incorporating the final states (and thereby fitting it more to the conrete language $L$) we could define $$ q \equiv^L q' :\Leftrightarrow \forall u \in X^{\ast} : \delta(q, u) \in F \leftrightarrow \delta(q', u) \in F. $$ And if we have some recognizing monoid $M$ with homomorphism $\varphi : X^{\ast} \to M$ we could define $$ u \equiv_h v :\Leftrightarrow h(u) = h(v). $$ We have (the automaton is arbitrary) \begin{align*} u \equiv_L v & \Leftrightarrow \delta(q_0, u) \equiv^L \delta(q_0,v) \\ u \equiv_R v & \Leftrightarrow \forall q \in Q : \delta(q, u) \in F \leftrightarrow \delta(q, v) \in F \\ u \equiv_S v & \Leftrightarrow \forall q \in Q : \delta(q, u) \equiv^L \delta(q,v) \\ u \equiv_h v & \Rightarrow u \equiv_S v \\ u \equiv_S v & \Rightarrow u \equiv_L v \land u \equiv_R v \\ u \equiv_A v & \Rightarrow u \equiv_L v \\ \varphi_u = \varphi_v & \Rightarrow u \equiv_S v \\ u \equiv_S v & \Leftrightarrow \forall q \in Q : \varphi_u(q) \in F \leftrightarrow \varphi_v(q) \in F \end{align*} And we have that the syntactic monoid is precisely the transformation monoid of the minimal automaton (which could be constructed with $\equiv_L$). Also the above notions could be described in terms of language quotients $u^{-1}L := \{ w : uw \in L \}, Lu^{-1} := \{ w : wu \in L \}$ (and the minimal automaton could also be constructed with the quotients $u^{-1}L$) or the sets $L_q(\mathcal A) := (X,Q,\delta,q,F)$ (i.e. the language accepted if $\mathcal A$ is started at $q$).

by Stefan at May 27, 2016 01:48 PM


Updating an MST $T$ when the weight of an edge not in $T$ is decreased

Given an undirected, connected, weighted graph $G = (V,E,w)$ where $w$ is the weight function $w: E \to \mathbb{R}$ and a minimum spanning tree (MST) $T$ of $G$.
Now we decrease the weight by $k$ of an edge $e$ which does not belong to $T$.

How to efficiently update $T$ to make it an MST (denoted $T'$) of $G'=(V,E,w')$, where $w'$ is the same as $w$ except that $w'(e) = w(e) - k$?

The algorithm for updating $T$ to $T'$ is easy: Adding $e$ to $T$ creates a cycle $C$ in $T$. Let $e'$ be a maximum-weighted edge in the cycle $C$. If $w(e') > w(e)$, then $T' = T \cup \{e\} - \{e'\}$ is the MST as desired. Otherwise, $T' = T$.

I have difficulty in proving its correctness by contradiction. Suppose $T''$ is a spanning tree of $G'$ and $w'(T'') < w'(T')$.

  • $e \notin T''$: we have $w(T'') = w'(T'') < w'(T') < w(T)$. Contradicts with the fact that $T$ is an MST of $G$.
  • $e \in T''$: I am stuck here.

Two Notes:

  • The accepted answer here to the same question is too general for me to follow.

  • I prefer to proofs which do not rely on any concrete MST algorithms, such as Kruskal's and Prim's algorithms. However, you don't need to prove it by contradiction or separate the two cases $e \notin T''$ and $e \in T''$ as I did.

by hengxin at May 27, 2016 01:47 PM


CountVectorizer reading and writing vocabulary

I am currently working on a fairly trivial sentiment classification program. Everything works well in the training phase. However, I am having trouble using CountVectorizer to test new strings of text that contain unseen words.

For this reason I am trying to write a lookup vocabulary for vectorization in the testing phase. However, I don't know how to create and retrieve the vocabulary object to pass as a parameter.

My two methods currently appear as follows:

def trainingVectorTransformation (messages):
    #--> ReviewText to vectors    
    vect = CountVectorizer(analyzer=split_into_lemmas).fit(messages['reviewText'])

    messages_bow = vect.transform(messages['reviewText'])

    feature_list = vect.get_feature_names()
    with open("vocab.txt", "w") as text_file:

    tfidf_transformer = TfidfTransformer().fit(messages_bow)

    messages_tfidf = tfidf_transformer.transform(messages_bow)
    return messages_tfidf


def testingVectorTransformation (messages):
    #--> ReviewText to vectors
    txt = open("vocab.txt")
    vocabulary =

    vect = CountVectorizer(analyzer=split_into_lemmas, vocabulary = vocabulary).fit(messages['reviewText'])

    messages_bow = vect.transform(messages['reviewText'])

    tfidf_transformer = TfidfTransformer().fit(messages_bow)

    messages_tfidf = tfidf_transformer.transform(messages_bow)
    return messages_tfidf

If anyone has any advice on how to properly create and use the vocabulary I would very much appreciate it.

by 1540303 at May 27, 2016 01:39 PM

Planet Emacsen

Jorgen Schäfer: Circe 2.3 released

We just released version 2.3 of Circe, a Client for IRC in Emacs.

The package is available from github, MELPA stable and MELPA unstable. The latter will track further development changes, so use at your own risk.


  • Circe (Lui) now has a track bar. Use (enable-lui-track-bar) to get a bar where you stopped reading when you did hide a buffer.
  • Buffers are now by default limited to 100k, as large buffers cause unreasonable slowdown in Emacs.
  • Autopaste now defaults to and also knows about
  • A number of faces have been updated to be nicer to the eye.
  • Improve compatibility with the Slack IRC gateway.
  • Lots of bug fixes.

Thanks to defanor, Janis Dzerins and Vasilij Schneidermann for their contributions.

by Jorgen Schäfer ( at May 27, 2016 01:24 PM


What is the point of the "respect" requirement in cut property of minimum spanning tree?

The cut property stated in terms of Theorem 23.1 in Section 23.1 of CLRS (2nd edition) is as follows.

Theorem 23.1 Let $G = (V, E)$ be a connected, undirected graph with a real-valued weight function $w$ defined on $E$. Let $A$ be a subset of $E$ that is included in some minimum spanning tree for $G$, let $(S, V-S)$ be any cut of $G$ that respect $A$ (emphasis added), and let $(u,v)$ be a light edge crossing $(S,V-S)$. Then, edge $(u,v)$ is safe for $A$.

Why does this theorem require that the cut $(S,V-S)$ respect $A$? How is this requirement used in the correctness proof? I do not see what would fail if the requirement was removed.

Some Definitions:

  • Cut: A cut $(S, V-S)$ of an undirected graph $G=(V,E)$ is a partition of $V$.
  • Cross: An edge $(u,v) \in E$ crosses the cut $(S,V-S)$ if one of its endpoints is in $S$ and the other is in $V-S$.
  • Respect: A cut respects a set $A$ of edges if no edge in $A$ crosses the cut.
  • Light edge: An edge is a light edge crossing a cut if its weight is the minimum of any edge crossing the cut.

by hengxin at May 27, 2016 01:06 PM



Can't find 8GB data in FreeBSD [on hold]

I can't find 8GB data in FreeBSD. I use df -h and see this:

/dev/da1s1a      19G     16G    1.4G    92%    /
devfs            1.0k    1.0k      0B   100%    /dev
zzdata/jail      341G     22G    318G     7%    /jail
ssd/radius       183G     10G    173G     5%    /jail/mysql1/var/db/mysql/radius
ssd              173G     31k    173G     0%    /ssd
zzdata           355G     37G    318G    10%    /usr/local
zzdata/ports     319G    906M    318G     0%    /usr/ports
zzdata/ports/distfiles    319G    457M    318G     0%    /usr/ports/distfiles
zzdata/src       318G    386M    318G     0%    /usr/src
/bin             19G     16G    1.4G    92%    /jail/mysql1/bin
/sbin            19G     16G    1.4G    92%    /jail/mysql1/sbin
/lib             19G     16G    1.4G    92%    /jail/mysql1/lib
/libexec         19G     16G    1.4G    92%    /jail/mysql1/libexec
/usr/bin         19G     16G    1.4G    92%    /jail/mysql1/usr/bin
/usr/sbin        19G     16G    1.4G    92%    /jail/mysql1/usr/sbin
/usr/include     19G     16G    1.4G    92%    /jail/mysql1/usr/include
/usr/lib         19G     16G    1.4G    92%    /jail/mysql1/usr/lib
/usr/libdata     19G     16G    1.4G    92%    /jail/mysql1/usr/libdata
/usr/libexec     19G     16G    1.4G    92%    /jail/mysql1/usr/libexec
/usr/share       19G     16G    1.4G    92%    /jail/mysql1/usr/share
/usr/ports       319G    906M    318G     0%    /jail/mysql1/usr/ports
/usr/ports/distfiles   319G  457M  318G   0%    /jail/mysql1/usr/ports/distfiles
devfs            1.0k    1.0k      0B   100%    /jail/mysql1/dev
fdescfs          1.0k    1.0k      0B   100%    /dev/fd

I try to check via du -Aclnx all files mounted to UFS filesystem and nonetheless I find 9525Mb files. Any suggestion?

My fstab:

/dev/da1s1a / ufs rw 1 1
/dev/da1s1b none swap sw 0 0 /mnt/nas nfs rw,tcp,noauto,noatime,async 0 0

by David at May 27, 2016 12:46 PM


Why are regular expressions defined with union, concatenation and star operations? [migrated]

A regular expresssion is defined recursively as

  1. $a$ for some $a \in \Sigma$ is a regular expression,
  2. $\varepsilon$ is a regular expression,
  3. $\emptyset$ is a regular expression,
  4. $(R_1 \cup R_2)$ where $R_1$ and $R_2$ are regular expressions is a regular expression,
  5. $(R_1 \circ R_2)$ where $R_1$ and $R_2$ are regular expressions is a regular expression,
  6. $(R_1)^*$ where $R_1$ is a regular expression is a regular expression.

This definition is taken from page 64 of

Sipser, Michael. Introduction to the Theory of Computation, 3rd edition. Cengage Learning, 2012.

Now, I have the following questions.

  • Why do not the definition contain the intersection, complement or reverse operations?
  • If we change the 4th item to $R_1 \cap R_2$, do we get an equivalent definition, i.e. for each regular language, there is a modified regular expression and vice versa?
  • I know that this definition is complete and well-defined, but why is it preferred to other equivalent, well defined and complete definitions?

by Ali Shakiba at May 27, 2016 12:36 PM



Are all functions with constant space complexity in $REG$?

The Wikipedia article about regular languages mentions that $DSPACE(O(1))$ is equal to $REG$. Can I conclude from this that every function in $R$ with constant space complexity is in $REG$?

by Peter at May 27, 2016 12:22 PM



Wenn er sich schon nicht für die Atombombe entschuldigt, ...

Wenn er sich schon nicht für die Atombombe entschuldigt, was hat Obama dann in Hiroshima gemacht? Eine Welt ohne Atomwaffen gefordert.

Hey, Obama, tolle Idee! Mach doch mal!

May 27, 2016 12:01 PM


Take Your Emacs to the Next Level by Writing Custom Packages

I wrote recently about using Emacs as a JavaScript development environment. One of my chief complaints was the inability to easily run JavaScript tests from within Emacs. I practice TDD frequently, and having to context-switch out of the editor I’m using to run tests is a big annoyance for me.

I knew it was possible to do what I wanted from within Emacs, as evidenced by other test runner modes like RSpec-mode. Armed with that knowledge, I decided to go through the process of learning enough Emacs Lisp to make a Mocha test runner. In the process, I learned a lot about developing Emacs packages and ended up with a really useful tool, so I thought I would share some of the things I learned.

There is a lot of content here, and we are going to cover three main topics: using Emacs as a Lisp IDE, writing a simple package, and publishing that package for others to use.

Emacs as an Emacs Lisp IDE

Unsuprisingly, Emacs itself is an excellent development environment for Emacs Lisp code. It can be easily cofigured to include IDE features for Lisp development, such as autocompletion, popup documentation, integrated debugging, and a REPL.

A few recommendations

Most of these features are built in, although I highly recommend installing the third-party packages company-mode (autocompletion) and Flycheck (real-time syntax checking) if you’re going to do Emacs Lisp development.

I also recommend turning on the built-in eldoc-mode, which will pop up documentation and signatures for various functions and symbols as you write code.

Lastly, I recommend familiarizing yourself with the built-in debugging and evaluation functions for Emacs Lisp. For evaluating code to test it, you can use the built-in Lisp-interaction-mode, which the *scratch* buffer usually has enabled by default. With the mode, you can paste Emacs Lisp code and then press C-x C-e to evaluate the code and see the results.

Emacs also comes with Edebug, a built-in stepping debugger for Emacs Lisp code. There are several ways to use it, but I most commonly use the interactive function edebug-defun. When run inside the body of a function, it sets a breakpoint at the start of that function that will be hit the next time you run it.

Making a Custom Compilation Mode

Mocha is a CLI tool, and Emacs has a number of built-in utilities for running external CLI programs.

Compilation buffer

The most relevant one for something like a test runner is a compilation buffer. In Emacs, this runs an external CLI process and displays the output in a buffer. This is useful for programs where you care about the output, like a compiler or test runner. It also includes some built-in niceties like the ability to highlight errors and jump to them.

In fact, you don’t even need to write any code to run an external command in a compilation buffer. You can just use the M-x compile command like so:

Running a compilation command

This is a solid approach for a static compilation command like the default make -k. However, it doesn’t scale well to something like a test runner, which needs to do the following:

  1. Run a local script, requiring a consistent working directory or an absolute path (M-x compile will use the directory of the current file as the working directory).
  2. Pass dynamic configuration options like the file to test the runner.

Custom compilation mode

The solution in Emacs is to programmatically create a custom compilation mode that can take these options and run using an interactive function. This is easy to do. In fact, the compilation mode for Mocha.el is only a couple of lines:

(require 'compile)
(defvar node-error-regexp-alist
  `((,node-error-regexp 1 2 3)))
(defun mocha-compilation-filter ()
  "Filter function for compilation output."
  (ansi-color-apply-on-region compilation-filter-start (point-max)))
(define-compilation-mode mocha-compilation-mode "Mocha"
  "Mocha compilation mode."
    (set (make-local-variable 'compilation-error-regexp-alist) node-error-regexp-alist)
    (add-hook 'compilation-filter-hook 'mocha-compilation-filter nil t)

While some of the syntax is a little cryptic (thanks, Lisp!), what it does is very simple. We use the built-in define-compilation-mode macro to define a compilation mode named mocha-compilation-mode, and we do two things with it:

  1. Pass it a regular expression that maps Node.js error output to filenames, line numbers, and column numbers.
  2. Add a processing hook which interprets ANSI escape codes and formats them properly.

The first enables us to quickly jump to the point of failure in a test. The second makes everything look nicer.

Running Test Commands

Now that we have a custom compilation mode that will nicely display our command output, we need to generate a test command and run it with the custom mode. Doing this will involve several simple steps.

Find project root

Many types of command line utilities need to be run from the project root. Fortunately, project roots are generally easily identified by the presence of a particular file or directory (like a source control directory). Since this is such a common need, Emacs has a built-in function, locate-dominating-file, to recursively search up a directory tree for a particular file name. The Emacs documentation on this function explains how to use it better than I could:

(locate-dominating-file FILE NAME)
Look up the directory hierarchy from FILE for a directory containing NAME. Stop at the first parent directory containing a file NAME, and return the directory. Return nil if not found. Instead of a string, NAME can also be a predicate taking one argument (a directory) and returning a non-nil value if that directory is the one for which we’re looking.

Customize configuration

Unlike an actual compilation, which would involve rerunning a single static command, something like a test runner needs to be dynamically configurable. Fortunately, Emacs has Customize, an awesome built-in and extensible configuration interface for packages (and the core editor). Customize exposes several macros which can be used to define custom configuration parameters for a package and display them in an editable GUI.

For example, here are the configurations we expose for our Mocha runner:

(defgroup mocha nil
  "Tools for running mocha tests."
  :group 'tools)
(defcustom mocha-which-node "node"
  "The path to the node executable to run."
  :type 'string
  :group 'mocha)
(defcustom mocha-command "mocha"
  "The path to the mocha command to run."
  :type 'string
  :group 'mocha)
(defcustom mocha-environment-variables nil
  "Environment variables to run mocha with."
  :type 'string
  :group 'mocha)
(defcustom mocha-options "--recursive --reporter dot"
  "Command line options to pass to mocha."
  :type 'string
  :group 'mocha)
(defcustom mocha-debug-port "5858"
  "The port number to debug mocha tests at."
  :type 'string
  :group 'mocha)

And those show up in the customize GUI like so:

GUI interface for configuring our package

Since many of these options make sense to configure on a per-project rather than global basis, Emacs also supports a special file called .dir-locals.el, which can override these settings on a per-directory basis. A typical .dir-locals.el file might look like this:

((nil . (
            (mocha-which-node . "/Users/ajs/.nvm/versions/node/v4.2.2/bin/node")
            (mocha-command . "node_modules/.bin/mocha")
            (mocha-environment-variables . "NODE_ENV=test")
            (mocha-options . "--recursive --reporter dot -t 5000")
            (mocha-project-test-directory . "test")

The syntax is a little cryptic, but if your Emacs working directory is in the same directory as this file or below it, it will respect these options in favor of any global configuration.

Once we have these configuration options defined, it is easy to write a function that will concatenate all the strings together to create our test runner command!

(defun mocha-generate-command (debug &optional mocha-file test)
  "The test command to run.
If DEBUG is true, then make this a debug command.
If MOCHA-FILE is specified run just that file otherwise run
IF TEST is specified run mocha with a grep for just that test."
  (let ((path (or mocha-file mocha-project-test-directory))
        (target (if test (concat "--grep "" test "" ") ""))
        (node-command (concat mocha-which-node (if debug (concat " --debug=" mocha-debug-port) "")))
        (options (concat mocha-options (if debug " -t 21600000"))))
    (concat mocha-environment-variables " "
            node-command " "
            mocha-command " "
            options " "

Generating and Running Compile Command

Now that we can configure our test command and find the root of our project, we are ready to run it with the custom compilation mode we made earlier. I’m going to show you the most important code for doing that below, and then break it down and explain the different parts.

(defun mocha-run (&optional mocha-file test)
  "Run mocha in a compilation buffer.
If MOCHA-FILE is specified run just that file otherwise run
IF TEST is specified run mocha with a grep for just that test."
  (save-some-buffers (not compilation-ask-about-save)
                     (when (boundp 'compilation-save-buffers-predicate)
(when (get-buffer "*mocha tests*")
    (kill-buffer "*mocha tests*"))
  (let ((test-command-to-run (mocha-generate-command nil mocha-file test)) (root-dir (mocha-find-project-root)))
    (with-current-buffer (get-buffer-create "*mocha tests*")
      (setq default-directory root-dir)
      (compilation-start test-command-to-run 'mocha-compilation-mode (lambda (m) (buffer-name))))))

Whew! That is some pretty dense code, so let’s break it down bit by bit.

Check for unsaved buffers

The first thing this function does is check if there are any unsaved buffers open, and then prompt the user to save them. Sounds pretty complex, but since this is such a common operation, Emacs makes it possible with just a couple of lines.

  (save-some-buffers (not compilation-ask-about-save)
                     (when (boundp 'compilation-save-buffers-predicate)

Clean up test buffer

Next, we search for the named buffer we use to run tests to see if it is still around from a previous test run. If it is, we kill it so we can get a fresh start.

  (when (get-buffer "*mocha tests*")
    (kill-buffer "*mocha tests*"))

Bind values

After that, the real work begins. We start by binding two values: the actual test command we are going to run and the path to the project root directory. Both values are calculated using the techniques and code we defined above.

  (let ((test-command-to-run (mocha-generate-command nil mocha-file test)) (root-dir (mocha-find-project-root)))

Run test command

Finally, now that we have those two values, we actually run our test command. This is a three-step process of:

  1. Creating and switching to the buffer our tests will run in.
  2. Changing the working directory to our project root.
  3. Running our test command in the buffer with our custom compilation mode.

All of this is done with the last three lines of code:

    (with-current-buffer (get-buffer-create "*mocha tests*")
      (setq default-directory root-dir)
      (compilation-start test-command-to-run 'mocha-compilation-mode (lambda (m) (buffer-name))))))

Expose interface to users

Now that we have the code to run our test commands, we need to expose it to users. For explicit actions like running commands, Emacs uses interactive functions, which can be called interactively by a user via either the M-x interface or a hotkey.

To make a function interactive, you just include the (interactive) special form at the top of the function body like so:

(defun mocha-test-file ()
  "Test the current file."
  (mocha-run (buffer-file-name)))

If you are not exporting the function as part of a mode, it is also customary to add the ;;;###autoload magic comment before the function, which helps other Emacs files referencing your package find the function so it can be used (for example, to bind them to a hotkey).

Once a function is defined as interactive, it will appear in the M-x interface and can be activated by a user.

Interactive functions for our mode

And there you have it. With only a couple of functions and big dose of Emacs magic, we have created a highly configurable test runner that is integrated into our development environment.

Distributing on MELPA

Having done all the work to create a custom package, don’t you just want to share it with the world? Fortunately for you, Emacs has a built-in package manager that makes this pretty easy. The package manager is backed by several different repositories, so making your package publicly available is just a matter of getting it into one of these repositories.

The three main package repositories are ELPA, Marmalade, and MELPA. ELPA is the offical GNU repository that comes with Emacs, while Marmalade and MELPA are third-party repositories. There are a number of differences between each of the repositories, the most significant being how they deal with licensing.

ELPA and Marmalade both require that all packages are GPL- or GPL-compliant licensed. Additionally, ELPA requires you to complete an FSF copyright assignment form. MELPA, on the other hand, has no licensing requirements, although it does have a code review process that all newly added packages must go through to ensure the code is of suitable quality.

Which package repositories you choose to put your code on is up to you, but I personally use MELPA and will talk about the process of getting a package into that repository.

There are two basic steps to getting a project on to MELPA.

Format the package file

First, you need to follow standard Emacs Lisp conventions for formatting a package file, which includes adding a description header and several other sections to the file. The Flycheck package for Emacs is invaluable here, because it will mark all of the required sections that are missing as errors and guide you through adding them. Doing this correctly is important because the Emacs package manager actually parses these sections as metadata to use.

Add your recipe

Once your code is properly formatted, all you need to do is fork the MELPA project on GitHub and add a recipe for your project. MELPA has docs for configuring more complex projects, but for a simple one-file package, the recipe is really easy.

The recipe for the Mocha runner looks like this:

 :repo "scottaj/mocha.el"
 :fetcher github)

That’s it, just a path to the GitHub repository. Once the recipe is added, you can open a pull request against MELPA. Someone will review your package and may suggest code changes. Once those are done, your pull request will be merged and MELPA will start publishing your package in its regular builds. The best part is, since MELPA pulls your code straight from your source repository, you don’t have to do anything to push updates to MELPA. It will just automatically pull down the latest version of your code.

Well, that is my short guide to creating and publishing an Emacs package. You can find the Mocha.el package I used as an example here and my Emacs config here. Drop me a comment if you have any questions!

The post Take Your Emacs to the Next Level by Writing Custom Packages appeared first on Atomic Spin.

by Al Scott at May 27, 2016 12:00 PM




prove that 2.n3 + 3.n + 10  O (n4) [duplicate]

This question already has an answer here:

plz provide the answer of below question prove that 2.n3 + 3.n + 10  O (n4)enter image description here

by Muhammad Rameez at May 27, 2016 11:18 AM

Shortest distance from a set of points

Consider an unweighted, undirected, simple graph $G=(V,E)$. We have some subset $S \subseteq V$, and we want to determine the shortest distance from any vertex $v\in V$ to some vertex $s\in S$.

To be clear, each $v\in V$ has a distance $d_1, d_2,\dots , d_{|S|}$ to each $s\in S$, and we want to find the smallest value of all $d_1, d_2,\dots , d_{|S|}$, and do this for every $v\in V$. So every $s\in S$, for example, would have a distance of $0$.

The algorithm should run in $O(|V|+|E|)$ time. Note that this is independent of $|S|$. I was thinking of running simultaneous BFS's on each $s\in S$ and only assign a vertex a distance if it doesnt already have a distance estimate (i.e. no other node in $s$ reached to $v$ earlier). I'm not really sure how to make the runtime work out though, since we would have to cut off the BFS at some point to prevent a factor of $|S|$ in the runtime.

Also, this is not a homework problem, it's a question from a midterm from a class a few years ago that I am using to study for my final coming up, and the solutions are not available. Thanks for the help!

by Raj at May 27, 2016 11:15 AM


Correct Alphabet (Google) market cap calculation?

Given the definition: Market capitalization (market cap) is the total market value of the shares outstanding of a publicly traded company; it is equal to the share price times the number of shares outstanding.

I find it quite puzzling to find different numbers everywhere for Google's (now renamed Alphabet) market capitalisation. I have summarised my findings in this Google sheet ( hope you can all access it.

It ranges from the totally wrong for Market Cap (see CNBC), to inconsistent (see Nasdaq, WSJ and Yahoo finance) to differences in number of shares (Google finance and don't seem to agree on the number of outstanding shares). My aim is first to understand what is the right number for outstanding shares and market cap and second what is the right "price" for the Class B shares that are unlisted.

Data in the sheet is as of Feb 4th, 2016 11AM Sydney time (so based on closing prices, way after market closes).

by Bondifrench at May 27, 2016 11:06 AM


Bluecoat hat jetzt übrigens ein ganz offizielles Intermediate ...

Bluecoat hat jetzt übrigens ein ganz offizielles Intermediate Cert von Symantec/Verisign.

Bluecoat ist die Firma, die Abhör-Proxyserver für die Gestapos der Unterdrückungsregimes dieser Welt baut.

Leider haben die Browser im Allgemeinen kein UI, um Intermediate Certs zu invalidieren. Man müsste also das Verisign-Root-Cert rausschmeißen.

Oder halt sowas wie Certificate Patrol ausrollen. Aber das macht ja seit Letsencrypt nur Ärger.

Update: Bei diesem konkreten Zertifikat ist die Pathlen auf 0 gesetzt. Das entschärft es weitgehend (keine Sub-CAs erlaubt). Die müssten den privaten Schlüssel zu dem Zertifikat auf ihren Appliances ausliefern, wo er geklaut werden könnte. Wobei hey, vielleicht tun sie das ja?

May 27, 2016 11:00 AM

Lamer News

Fred Wilson

Feature Friday: Learn To Code on an iPhone

Hopscotch is a visual programming environment, like Scratch or Blockly, that runs on an iPhone.

If your kids like to grab your phone and watch videos or play games on it, put Hopscotch on your phone and encourage them to make games instead of just playing them.

Here’s a piece from Wired that explains how it works (with some screen shots) and why it is so cool.

You might ask, why should my kids learn to code? And there are many great answers to that but I always like to answer that question by reminding people that instructing machines what to do is becoming an important life skill. And it will only get more important in the coming years. So getting your kids comfortable doing that at a young age is a great thing and Hopscotch is a great way to do that.

by Fred Wilson at May 27, 2016 10:55 AM


Is this arithmetic problem $\#P$-complete?

Fix $q^{(\log q)^{c'}}$ subsets of size $O(\sqrt{q})$ and use $O((\log q)^{c'})$ random bits to select $T\subseteq\Bbb F_q$ with $|T|=O(\sqrt{q})$.

Define character $\chi(a)=1$ at every quadratic residue $a\neq0$ and $\chi(a)=-1$ at every quadratic non-residue and $\chi(0)=0$.

Is the decision problem 'Given $b\in\Bbb F_q$ and $c\in\Bbb N$ is $|\{a\in T:\chi(a+b)=1\}|>c$?' outside $\mathsf{PH}$ and is counting $|\{a\in T:\chi(a+b)=1\}|$ $\#P$-complete?

Is the decision problem 'Given $b\in\Bbb F_q$ and $c\in\Bbb N$ is $|\{a\in T:\chi(a+b)=1\}|\bmod k\neq0$?' $\mathsf{Mod}_k\mathsf{P}$-complete?

@EmilJeřábek There are $\binom{q}{o(\sqrt{q})}\approx q^{q/2}\approx 2^{q\log q}$ subsets. We can pick ra which

by Turbo at May 27, 2016 10:45 AM


What approaches are there for keeping local and remote order books in sync?

Scenario: developing a custom application which defines a structured workflow for manual order submission to Bloomberg EMSX; it should minimize own persisted state and rely on the remote order book as much as possible; it should be resilient against failures and downtime and prevent duplicate order submission; it should only care about what happened within the current one day.

Are there any standard patterns of handling this? Ideally not Bloomberg-specific.

My current thinking is to have three lists:

  • proposed orders - not submitted, not executed;
  • submitted orders - submitted for execution, but no confirmation received yet;
  • actioned orders - submission confirmation received, fills are being received.

Then there will be some process that will "try hard" to keep the three lists in sync:

  • user submits order: remove from proposed orders list and add to submitted orders list;
  • confirmation/fills received: remove from submitted orders list and add to actioned orders list.

More practically it will reactively act on three respective sets of event streams and manage the respective order lists.

Orders will have a unique ID. Execution platform-specific duplicate prevention will be used as well e.g. EMSX_REQUEST_SEQ, but it doesn't mean there shouldn't be earlier checks done.

What do you think?

by Den at May 27, 2016 10:36 AM


How to provide a score value to an image based on pattern information in it?

I saw a few image processing and analysis related questions on this forum and thought I could try this forum for my question. I have a say 30 two-dimensional arrays (to make things simple, although I have a very big data set) which form 30 individual images. Many of these images have similar base structure, but differ in intensities for different pixels. Due to this intensity variation amongst pixels, some images have a prominent pattern (say a larger area with localised intense pixels or high intensity pixels classifying an edge). Some images, also just contain single high intensity pixels randomly distributed without any prominent feature (so basically noise). I am now trying to build an algorithm, which can give a specific score to an image based on different factors like area fraction of high intensity pixels, mean standard deviation, so that I can find out the image with the most prominent pattern (in order words rank them). But these factors depend on a common factor i.e. a user defined threshold, which becomes different for every image. Any inputs on how I can achieve this ranking or a image score in an automated manner (without the use of a threshold)? I initially used Matlab to perform all the processing and area fraction calculations, but now I am using R do the same thing.

Can some amount of machine learning/ random forest stuff help me here? I am not sure. Some inputs would be very valuable.

P.S. If this not the right forum to post, any suggestions on where I can get good advise?

by novicegeek at May 27, 2016 10:31 AM


Preparation for interview: influx of power of the moon

I am preparing myself for an interview for a quantitative analyst position and one of the sample questions asked in previous examinations was:

"Suppose the moon were to disintegrate, and fall to earth over 5000 years. How does this influx of power compare to that of the Sun? Much more, about the same, or much less?"

My idea was to consider the moon as a star and compute its hypothetical flux of energy, but the "fall to earth over 5000 years" puzzles me a bit.

How would you approach the problem?


by Giuseppe at May 27, 2016 10:16 AM

Stress Testing for VaR

I am trying to perform stress testing for VaR and have taken into consideration two methods:- 1. Sensitivity analysis 2. Historical scenario analysis.

According to the Derivatives Policy group we need to take into consideration 5 factors which are:- o Parallel yield curve in ±100 basis points. o Yield curve shifts of ±25 basis points. o Stock index changes of ±10%. o Currency changes of ±6%. o Volatility changes of ±20%.

  1. I am trying to perform the stress testing through sensitivity analysis in excel for which I am not able to figure out how to mould the prices for equities,bonds and derivatives by taking into account above factors through the excel function data table. For instance, if I take into account the 3rd factor mentioned above as STOCK INDEX CHANGES OF +- 10% and one of my stock in my portfolio is listed in Dow Jones, so how can I adjust the prices for a particular time period (say 6 months).?

2.Secondly if I take historical scenario analysis in which I am taking the scenario for instance 1997 Asian crisis, how do I adjust the prices in this scenario also. In this case, for instance, my portfolio contains all the asset class which are issued in the last 10 years and therefore I dont have any data (prices etc.) for them related to the 1997 asian crisis. SO how do I adjust the prices in this case also?.

P.S :-I am using variance covariance method for calculating VaR. Eagerly waiting for valuable suggestions on this.

by Manish Pratap Singh at May 27, 2016 10:06 AM

Lamer News


What is the formal justification for the correctness of the second formulation of rod cutting DP solution

CLRS on section 15.1 3rd edition has a good discussion of the rod cutting problem. I will add a description at the end of the question for reference.

Define $r_j$ to be the optimal way to cut a rod of integer length $j$. Since we do not know where is the optimal place to cut the rod we consider every location $ i \in \{0,\cdots,j\}$ (essentially doing exhaustive search) to do the cut. Then we are left with two rods of length $i$ and $j-i$ to cut optimally. If we consider every location we could do a cut and then cut the remaining rods optimally, then we have essentially solved $r_j$ optimally. i.e. we get the following recursion:

$$ r_j = \max \{p_n, r_1 + r_{k-1}, r_2 +r_{j-2}, ..., r_{j-1} + r_1 \}$$

However there seems to be an alternative (or maybe simpler strategy to solve such a problem):

$$r_n = \max_{1 \leq i \leq n} \{ p_i + r_{n-i} \}$$

CLRS has the following discussion about it:

In a related, but slightly simpler, way to arrange a recursive structure for the rod- cutting problem, we view a decomposition as consisting of a first piece of length $i$ cut off the left-hand end, and then a right-hand remainder of length $n -i$. Only the remainder, and not the first piece, may be further divided. We may view every decomposition of a length-n rod in this way: as a first piece followed by some decomposition of the remainder. When doing so, we can couch the solution with no cuts at all as saying that the first piece has size $i = n$ and revenue $p_j$ and that the remainder has size $0$ with corresponding revenue $r_0 = 0$. In this formulation, an optimal solution embodies the solution to only one related subproblem-the remainder-rather than two.

Personally I've always found it harder to convince myself that this in fact optimal (and that is therefore equivalent to the first one). Therefore I was wondering if anyone had a good argument or maybe a good formal proof to arguing why this second formulation is in fact optimal.

Usually I've tried to justify it to myself by having the following thought experiment: well, if I want to have an optimal revenue I need to make sure that I some how explore all good solutions exhaustively and not miss any. If I want the solution to the whole thing $r_j$ then it could be that the full rod is required, then the cut is at position $i=n$. That is if the optimal solution contains no cuts. But if we have a cut we will have some part of the rod that must remain un cut. If that is the case, then just consider all possible full parts of the rod from the beginning of the rod to the end that remain uncut. Basically, it seems that the solution is as follow, every solution to $r_j$ must contain a piece of rod from the beginning that is uncut to some cutting location $i$. Since this is part of the optimal solution, we just consider all beginnings that are "full" rods up to $i$ and then make sure to solve the remaining optimally. This justification feels close to being the argument of correctness but for some reason it just seems to be missing something (wish I knew what it was).

Does someone know how to write a better clear proof of correctness of this? If its really formal or rigorous that is ok with me, I just sort of want some proof for the second recursion where there is no doubt that its correct.

Thanks for the help!

Recall the rod cutting problem. You manufacture steel rods of some fixed length and sell them in segments of integer length and because of the laws of supply and demand, the price per length is not constant. Given a set of prices (i.e. an array $p_i$ indicating the price for an integer cut of length $i$) for each possible length segment, how should you cut a given rod of length $n$ into integer length segments to maximize revenue? You may assume that no cost or loss incurred with each cut.

by Charlie Parker at May 27, 2016 09:55 AM


Pointfree Join Array to String, by key in object, in Ramda

Can this be done pointfree?

var joinByKey = R.curry(function(key, model){
    return R.assoc(key, R.join(',' ,R.prop(key, model)), model);

var input = { a: ['1', '2', '3'] };
var result = joinByKey("a", input); // {"a": "1,2,3"}

by tomaj at May 27, 2016 09:52 AM



String Distance Matrix in Python

How to calculate Levenshtein Distance matrix of strings in Python

              str1    str2    str3    str4    ...     strn
      str1    0.8     0.4     0.6     0.1     ...     0.2
      str2    0.4     0.7     0.5     0.1     ...     0.1
      str3    0.6     0.5     0.6     0.1     ...     0.1
      str4    0.1     0.1     0.1     0.5     ...     0.6
      .       .       .       .       .       ...     .
      .       .       .       .       .       ...     .
      .       .       .       .       .       ...     .
      strn    0.2     0.1     0.1     0.6     ...     0.7

Using Ditance function we can calculate distance betwwen 2 words. But here I have 1 list containing n number of strings. I wanted to calculate distance matrix after that I want to do clustering of words.

by Ajay Jadhav at May 27, 2016 09:30 AM



NFA to accept strings where the third AND the third-last element are b

Alphabet is {a,b}, the third element from the start must be "b" (e.g: aab..., abb...) and the third-last element must also be "b" (e.g: ...baa, ...bbb). I can do them separately but at the same time I don't know where to start. Help please.

by Marx Guimarães at May 27, 2016 09:15 AM

ML functions from polymorphic lists to polymorphic lists

I'm learning programming in ML (OCaml), and earlier I asked about ML functions of type 'a -> 'b. Now I've been experimenting a bit with functions of type 'a list -> 'b list. There are some obvious simple examples:

let rec loop l = loop l
let return_empty l = []
let rec loop_if_not_empty = function [] -> []
                                   | l -> loop_if_not_empty l

What I can't figure out is how to make a function that does something other than return the empty list or loop (without using any library function). Can this be done? Is there a way to return non-empty lists?

Edit: Yes, if I have a function of type 'a -> 'b, then I can make another one, or a function of type 'a list -> 'b list, but what I'm wondering here is how to make the first one.

by Gilles at May 27, 2016 09:14 AM

What is the difference between finite automata and Büchi automata?

as the title suggests, I was struggling to define the differences between finite and Büchi automata and how they are represented.

From an assignment I'm working on, this automaton can be depicted as both infinite automaton and Büchi automaton.

Can you give me an example for each of those automatons to understand the differences?

by csessh at May 27, 2016 09:13 AM


Lamer News



did a wthingdsfdfgffffffffffffffffffffffffffffffffffffffffffff

by BMVagabond at May 27, 2016 08:47 AM


C# extension using function Func as parameter

Please don't be confuse with the code, the code is wrong. Focused on the bold question below.

I've been preparing to study functional programming and just so at least be ready for its prerequisites, I've been studying extensions, functions and lambda expressions.

This code below is NOT working I just thought this is how it should be coded:


class Program
    static void Main(string[] args)
        int s = 10;
        int t = s.CreatedMethod(2, 3); // <-- the one that calls the extension

    static int RegularMethod(int v1, int v2)
        return v1 * v2; // <-- I also wanted to multiply int 's' like this s*v1*v2


public static class Extension
    public static int CreatedMethod(this int number, Func<int, int, int> fn) 
       // I'm expecting that the code here will read the 
       // RegularMethod() above
       // But I don't know how to pass the parameter of the function being passed
       return @fn(param1, param2)// <-- don't know how to code here ??

As you can see, the CreateMethod extended my integer 's'. My plan is to pass the two parameters in CreateMethod() above and multiply those two parameters into 's'

In the example above, the answer should be 60.

Can you help me do it using extension?

by user1960948 at May 27, 2016 08:45 AM

Machine Learning Multiclass Classification for thousands of Classes

I have a few million Entities with 1 to 10 attributes describing each of them and about one hundred thousand Classes to sort them into.

Are there any Machine Learning algorithms (ideally available on SQL Server, Azure or as .NET library) or a stand-alone tools for massive Multiclass Classification capable of suggesting the top few best matching Classes for each of the Entities?

I have found this research along the lines: Learning compact class codes for fast inference in large multi class classification, but could not find any implementations.

At the moment I have sort of a K-nearest neighbours based on Full-Text Search with a couple of other dimensions weighted at 1/3 each to improve the results. I am looking for the ways to improve both performance and accuracy.

by Y.B. at May 27, 2016 08:42 AM


Numerical Boardgame -- Help with with a pruning method (alpha-beta : minimax)

I'm developing a simple AI to play a boardgame which consists of N x N squares, each with a predefined value.

44 33 75 12 99 23
53 89 14 01 32 98
34 98 67 45 02 38
89 87 54 21 34 87
65 84 99 72 95 17

Each player can either (a) instant claim an unclaimed cell anywhere on the board, or (b) rush an unclaimed cell from a cell already owned. In the case of (b), any enemy cells that are touching the newly claimed cell are converted to your side.


  • Player 1: (1,1) -- {89}
  • Player 2: (3,1) -- {87}
  • Player 1: rush->(2,1) -- {98}

In the scenario above, after the rush move, Player 1 now owns all three cells (since (3,1) was an adjacent cell to (2,1)).

I have the MiniMax algorithm working great for this. It's actually quite fun to watch the AI compete against each other. My problem is determining an effective way to prune moves from the tree. I attempted a trivial pruning method of:

 - - If the cell we're claiming is empty
 - - - If the neighbors of the cell we're claiming is empty
 - - - - If the neighbor's-neighbor is an enemy-owned cell
 - - - - - Prune this move

The idea above being, we don't want to claim a cell that could immediately result in the opponent rushing towards us, claiming for themselves the cell we just claimed.

This didn't go well.

I'm a bit at a loss for a decent heuristic for pruning moves at this point. Move ordering is a bit difficult because I can't just choose the highest remaining cell (can I?), since there are plenty of opportunities for the opponent to right-out rush that cell as soon as it's claimed. That is to say, if our opponent owns cell (4,4), it wouldn't be ideal for us to choose (4,2) since it could wind up being immediately rushed. But maybe it wouldn't hurt to start at the largest values and just test them out? I don't know.

Any help here is greatly appreciated.

by ctote at May 27, 2016 08:34 AM


predict() R function caret package errors: "newdata" rows different, "type" not accepted

  • I am running a logistic regression analysis using the caret package.

  • Data is input as a 18x6 matrix

  • everything is fine so far except the predict() function.

  • R is telling me the type parameter is supposed to be raw or prob but raw just spits out an exact copy of the last column (the values of the binomial variable). prob gives me the following error:

"Error in dimnames(out)[[2]] <- modelFit$obsLevels : length of 'dimnames' [2] not equal to array extent In addition: Warning message: 'newdata' had 7 rows but variables found have 18 rows"

install.packages('e1071', dependencies=TRUE)
#install.packages('caret', dependencies = TRUE)


  nrow = 18,
  ncol = 6,
  byrow = FALSE)  #"bycol" does NOT exist
################### data set as vectors


Train <- createDataPartition(f, p=0.6, list=FALSE)  #60% of data set is used as training.
training <- A[ Train, ]
testing <- A[ -Train, ]

#this is the logistic formula:
#estimates from logistic regression characterize the relationship between the predictor and response variable on a log-odds scale
mod_fit <- train(f ~ a + b + c + d +e,  data=training, method="glm", family="binomial")

#this isthe exponential function to calculate the odds ratios for each preditor:

predict(mod_fit, newdata=training)
predict(mod_fit, newdata=testing, type="prob")

by user6318255 at May 27, 2016 08:31 AM

Planet Theory

Polymath 10 post 6: The Erdos-Rado sunflower conjecture, and the Turan (4,3) problem: homological approaches.

In earlier posts I proposed a homological approach to the Erdos-Rado sunflower conjecture.  I will describe again this approach in the second part of this post. Of course, discussion of other avenues for the study of the conjecture are welcome. The purpose of this post is to discuss another well known problem for which a related  homological approach,  might be relevant.  The problem is the Turan (4,3) problem from 1940.

We will not regard attacks on the sunflower conjecture based on the recent solution of the cap set problem and the Erdos Szemeredi sunflower conjecture (which is a weaker version of the Erdos-Rado conjecture that we considered in post 5) as part of the present polymath10 project, but, of course, I will be happy to hear also reports on progress or comments on these directions. Some updates on this story: Eric Naslund and Will Sawin gave a direct proof based on the polynomial method for the Erdos-Szemeredi sunflower conjecture, and an even  stronger result is given by Naslund here. (Eric also has stronger quantitative bounds for Erdos-Szemeredi based on bounds for cap sets.)   More cap set updates are added to this post, and can be found in the comment threads here and here.

Turan’s (4,3) problem

Turan’s 1940 problem is very easy to state.

What is the minimum of edges in a 3-uniform hypergraph on n vertices with the property that every four vertices span at least one triangle? 

We talked about the problem (and a little about the homological approach to it) in this post and this post.

Four things that the Erdos-Rado sunflower problem and the Turan (4,3) problem have in common.

4. 1000 dollars or so Erdos prize

If I remember correctly, the Turan’s problem is the only question not asked by Erdos himself for which he offered a monetary award.

3.  The homological approach  might be relevant.

This is what this post is going to be about. But while this connection is still tentative and speculative, the next connections between the two problems are solid.

2. Sasha Razborov

Sasha used the Erdos-Rado sunflower theorem in his famous theorem on superpolynomial lower bounds for monotone circuits, and his flag-algebra theory have led to remarkable progress and the best known upper bound for the Turan (4,3) problem. (See here and here .)

1. Sasha Kostochka

Sasha holds the best known upper bound  for the sunflower conjecture and he found a large class of examples for the equality cases for the Turan (4,3) conjecture.

Quasi-isomorphism of hypergraphs

Compound matrices, weak isomorphism and dominance.

Let X=(x_{ij})_{1 \le i \le n, 1 \le j \le n} be a generic matrix. The k-th compound matrix C_k(X) is the matrix of k by k minors. Namely, C_k(X)=x_{ST}, S,T \in {{[n]} \choose {k}} where x_{ST}=det(x_{ij})_{i \in S,j\in T}.

Given two k-unform hypergraphs F,G we say that F and G are weakly isomorphic if the m\times m minor C_{F,G}(X) of C_k(X) whose rows and columns correspond to sets in F and G respectively is non-singular. (It is fairly easy to see that if F and G are isomorphic then they are weakly isomorphic. This relies on the condition that X is generic.) We will say that F dominates G if  |F|\ge |G| and C_{F,G}(X) is full rank. Thus, F and G are weakly isomorphic if each dominates the other.

A class of matroids defined on hypergraphs

Let G(k,r,n) be the k-uniform hypergraph which consists of all k-subsets of [n] that intersect [r]. For a k-uniform hypergraph F on the vertex set [n] let rank_r(F) = rank C_{F,G(k,r,n)}. For the complete k-uniform hypergraph K on n vertices (n \ge r) rank_r(K)={{n}\choose {k}} - {{n-r} \choose {k}}.

 Homological approach to Turan’s (4,3) problem.

We refer to a 3-uniform hypergraph  with the property that every four vertices span at least one triangle as a Turan (4,3)-hypergraph.

Conjecture: If F is a Turan (4,3)-hypergraph with n vertices then

rank_r(F) \ge r {{n-r-1}\choose {2}}.

This conjecture is a refinement of Turan’s conjecture. (The conjectured 1940 lower bound by Turan can be written as  \max (r{{n-r-1} \choose {2}}: r\ge 1).) It is known for r=1 (see this post) and an analogous statement is known for Turan (3,2)-graphs.

We would like to find even stronger algebraic statements which amounts not only to upper bounds on certain homology-like groups but to stronger statements about vanishing of certain homology-like  groups.

I am also curious about

Question: Are all extremal examples with n vertices for the Turan (4,3) problem quasi-isomorphic? If this is the case we can also conjecture that every Turan (4,3) hypergraph dominates  Turan’s example on the same number of vertices.

I hope to be able to present some experimental data on these problems.

Algebraic shifting

A family F of k-subsets of [n] is shifted if whenever S \in F and R is obtained by replacing an element by a smaller element then R \in F.  It can be shown that two shifted families of the same size are weakly isomorphic only if they are equal! We can use our compound matrix to describe an operation (in fact, several operations) called shifting which associated to a family F a shifted family \Delta_T(F).  Suppose that |F|=m. \Delta (F) is the lexicographically smallest family of sets which is weakly isomorphic to F. In more details: we first consider a total ordering T on k-subsets of [n]. Then we greedily  build a hypergraph which is dominated by F.  When we reach m sets we obtain a  hypergraph \Delta_T(F) weakly isomorphic to F.

Now, if the total ordering is the lexicographic order on {{[n]} \choose {k}} then we denote \Delta(F)=\Delta_T(F)) and call \Delta(F) the “algebraic shifting” of F. In this case, it can be shown that the resulting family is shifted.

Also of special importance is the case that T=RL is the reverse lexicographic order.

For sets of integers of size k, the lexicographic order refers to the lexicographic order when we ordered the elements of every set from increasingly, and the reverse lexicographic order is the lexicographic order when we order the elements decreasingly.

From 12<13<14<15<…<21<22<…<31<32<… we get

\{1,2\} <_L \{1,3\} <_L \{1,4\} \dots <_L \{ 2,3\} <_L \{2,4\} <_L \dots

and from  21<31<32<41<42<43<51<52<…  we get

\{1,2\} <_{RL} \{1,3\}<_{RL}\{2,3\}<_{RL}\{1,4\}\dots

We mention again some connection with acyclicity and homology: F  is acyclic if all sets in \Delta (F) contains ‘1’. F is [r,1]-acyclic iff all sets in in \Delta (F) intersect [r]. F is [m,m]-acyclic iff all sets in \Delta(F) contains [r]. For general b,c, [b,c]-acyclicity is not expressed by those two versions of algebraic shifting, however, F is [s,k]-cyclic if \Delta_{RL}(F) \subset {{[s]}\choose{k}}.

The (current) homological approach to the Sunflower problem: a reminder.

Our ultimate conjecture is:

Main Conjecture:  If F has no sunflower of size r then it is [rk,k]-acyclic. (I.e., Z_{k-1}[(rk,k](K)=0.)

An equivalent formulation in terms of reverse lexicographic shifting is:

(**) \Delta_{RL}(F) \subset {{[rk]}\choose{k}}.

Our current proposal  is to use the following theorem and (a greatly modify) Conjecture B. (For general families.)

Theorem: Let F is a family of k-sets without a sunflower of size r. Then

(*) For every family of i-sets F' which is the link of a set of size k-i (including the case F'=F) , Every set in \Delta _{RL}(F') intersect [(r-1)i].

Conjecture B (from post 4, greatly modified from the one discussed in posts 2,3): For a family F of k-sets satisfying (*) we have

(**) \Delta_{RL}(F) \subset {{[rk]}\choose{k}}.

Also here, I hope to be able to present some experimental data soon.

by Gil Kalai at May 27, 2016 08:11 AM

Lamer News

Planet Theory

TR16-084 | Low-Sensitivity Functions from Unambiguous Certificates | Shalev Ben-David

We provide new query complexity separations against sensitivity for total Boolean functions: a power 3 separation between deterministic (and even randomized or quantum) query complexity and sensitivity, and a power 2.1 separation between certificate complexity and sensitivity. We get these separations by using a new connection between sensitivity and a seemingly unrelated measure called one-sided unambiguous certificate complexity ($UC_{\min}$). Finally, we show that $UC_{\min}$ is lower-bounded by fractional block sensitivity, which means we cannot use these techniques to get a super-quadratic separation between bs(f) and s(f).

May 27, 2016 07:30 AM


Any tutorial on building a chatbot with R? [on hold]

Are there any tutorials on building a chatbot with R?

I am looking at building a customer service chatbot for banking industry which has a customer facing web app. It has to be able to provide human like responses to questions raised.

Please let me know if more details are required.

by Ganesh Bhat at May 27, 2016 07:18 AM

Lamer News


Do we need Feller condition if volatility process jumps?

It is fairly known that in affine processes, as Heston model \begin{equation} \begin{aligned} dS_t &= \mu S_t dt + \sqrt{v_t} S_t dW^{S}_{t} \\ dv_t &= k(\theta - v_t) dt + \xi \sqrt{v_t} dW^{v}_{t} \end{aligned} \end{equation} the SV $v_t$ is a strictly positive process if the drift is stronger enough, i.e. if drift parameters ($k$, the speed of mean-reverting, and $\theta$, mean-reverting level) and the Vol-of-Vol $\xi$ satisfy: \begin{equation} k \theta > \frac{1}{2} \xi^2 \end{equation} which is known as Feller condition. I know this condition can be generalized to multi-factor affine processes. For example, if the volatility of the returns $\log S_t$ is made of several independent factors $v_{1,t},v_{2,t},...,v_{n,t}$, then the Feller condition applies to each factor separately (check here at page 705, for example). Moreover Duffie and Kan (1996) provide a multidimensional extension of the Feller condition.

But I still don't understand if we still need the (or a sort of) Feller condition in case of jump-diffusion. You may consider for example the simple case of a volatility factor with exponentially distributed jumps: \begin{equation} dv_t = k(\theta - v_t) dt + \xi \sqrt{v_t} dW^{v}_{t} + dJ^{v}_{t} \end{equation} where $J^{v}_{t}$ is a compound Poisson process, independent of the Wiener $W^{v}_{t}$. The Poisson arrival intensity is a constant $\lambda$ with mean $\gamma$. I observe that in this case, the long term mean reverting level is jump-adjusted: \begin{equation} \theta \Longrightarrow \theta ^{*}=\theta + \frac{\lambda}{k} \gamma \end{equation} so I suspect if a sort of Feller condition applies it must depends on jumps.

Nevertheless, from a purely intuitive perspective, even if the barrier at $v_t = 0$ is absorbent, jump would pull back from 0 again.

Thanks for your time and attention.

by Gabriele Pompa at May 27, 2016 05:58 AM


How do i draw a flowchart for this problem? [on hold]

Draw a complete Flowchart for the following case. If age is greater than 59 then continue to check whether the status is ‘W’. If status is ‘W’ then display “Working senior” otherwise display “Retired senior”. If age is not greater than 59 then check whether age is greater than 20. If age is greater than 20 then display “Adult” otherwise proceed to check whether age is greater than 12. If age is greater than 12 then display “Teen” otherwise display “Child”.

by Alicia Tan at May 27, 2016 05:51 AM


Roll-a-Sketch drawings from Maker Faire!


Here are a few of the many Roll-a-Sketch drawings I did at Maker Faire this year!

Thank you to everyone who came by and said hello; I saw a lot of returning faces. Maker Faire is a super fun environment to be immersed in for a few days — you’re surrounded by robots, and art cars, and 3D printers, and drones, and everyone’s just interesting. I had many lovely and surprising conversations with creative folks.

I have never been to any of the other, subsidiary Maker Faires — only the flagship show in San Mateo, which has a wholly enclosed, Vatican-City-style craft fair section that is technically what I am there as part of. But in the coming year I hope to make fewer papergoods and more tangible “weird things” so who knows, there may be room for me at another city’s Maker Faire yet.















ALSO: We’re getting close to the end of May, which means it’s time to reveal the Wondermark Cast Card that’ll be going out to Patreon subscribers!

This month’s card features Boutrous, the camel from the classic Wondermark #558, “In which a Camel is merely ordinary”.


(This particular comic was also narrated in audio form by voice actor SungWon Cho, here. It’s…super good.)

The monthly Patreon Cast Card subscription, so far, has been going out to a HIGHLY EXCLUSIVE GROUP. Later in the year, there will be another new “achievement” Cast Card that you’ll be able to get in a different way… And of course, the Roll-a-Sketch “Patron of the Arts” card is available at all my convention appearances this year.

Thank you very much to all you Patreonauts, at any level! We’re very close to a $500 goal that will prompt the start of a monthly annotation series, in which I go back through the archives and puzzle out just what I was thinking for comics deep in the archive.

Last month we crossed the THIRTEEN YEAR MARK with Wondermark. A decade ago, I was a very different person, in a very different place in my life. I don’t write comics the same way now as I did then, mostly for the better (I hope), but who knows. I think it’ll be interesting to go back and interrogate my younger self. That’ll become a monthly thing once we crest $500 on Patreon.

I am very grateful for your support via any method: Patreon, the store, saying hello at a show, or just a word to a friend. I’m thankful for every kindness — it’s always been, and continues to be, a privilege to do this for you, and by reading this you are by definition a Cool Person with Good Taste.



by David Malki at May 27, 2016 05:23 AM


Simulations of CPU [on hold]

I'm trying to create a simulation of the CPU just to teach myself how it works.

So there's an input string of zeros and ones. There's an output string of zeros and ones. Does the output also include a piece of information about where it's to be sent?

I'm imagining output of the form: (Send to ALU, 0011---11), (Send to GPU, 110--1), etc... Obviously the "Send to ..." would also be in binary.

EDIT: I'm sorry for the ambiguity. I am looking at the CPU as a black box, a mapping between input and output strings of bits. I was imagining that part of the bits in the output directed where the command should be sent, like to a network location or to the GPU or to the ALU. From Mohammad's answer I guess it's done mostly through dedicated registry locations.

TLDR: How does the CPU direct its output to the appropriate location? Is this information encoded within the bits of the output itself?

by Anthony Ebert at May 27, 2016 04:23 AM



Ramda JS: How to perform a map where I call R.replace for a given property on each object?

Given the following data:

const my_data = [
    name: "John",
    age: 22
    name: "Johnny",
    age: 15
    name: "Dave",
    age: 27

I want to transform the data such that the substring "John" is replaced with "Ben" in each of the name properties so it looks like this:

    name: "Ben",
    age: 22
    name: "Benny",
    age: 15
    name: "Dave",
    age: 27

I want to do so in the proper functional way (I think is points-free but I am still learning), so I can reuse this in a pipeline, say first reducing by age and then doing the replace, or doing the replace first then doing a sort. How would I do this using the Ramda functions?

var fix_names = ???
var fixed_data = fix_names, my_data );

by Ryan Jarvis at May 27, 2016 03:39 AM

Confusion about flatMapLatest in RxSwift

I learn the sample code in RxSwift. In the file GithubSignupViewModel1.swift, the definition of validatedUsername is:

validatedUsername = input.username //the username is a textfiled.rx_text
    .flatMapLatest { username -> Observable<ValidationResult> in
        return validationService.validateUsername(username)
            .catchErrorJustReturn(.Failed(message: "Error contacting server"))

the validateUsername method is finally called the following method:

func usernameAvailable(username: String) -> Observable<Bool> {
    // this is ofc just mock, but good enough
    let URL = NSURL(string: "\(username.URLEscaped)")!
    let request = NSURLRequest(URL: URL)
    return self.URLSession.rx_response(request)
        .map { (maybeData, response) in
            return response.statusCode == 404

Here is my confusion:

whenever I input a character quickly in the username textfield, message -------->1:, -------->2: showed, and a little later message -------->3: showed, but only showed one -------->3: message.

When I input characters slower, message -------->1:, -------->2:, -------->3: showed successively.

But when I change the flatMapLatest to flatMap, how many characters I input, I will get the same number of -------->3: message.

So how did the flatMapLatest work here?

How the flatMapLatest filter the early response from NSURLResponse ?

I read some about the flatMapLatest, but none of them will explain my confusion.

What I saw is something like:

let a = Variable(XX)

When changed a.value to another Variable, the Variable(XX) will not influence the subscriber of a.

But the input.username isn't changed, it is always a testfield.rx_text! So how the flatMapLatest work?

by leizh00701 at May 27, 2016 03:28 AM


Help with Rock Paper Scissors game [on hold]

I would like some help with my code. I still quite new to CS so please forgive me if I make a mistake. The main thing about this Rock Paper Scissors game is that every 4 turns, each player must use rock , paper, and scissors. The computer is supposed to be able to go and "guess" what to put next based the the opponenents previous moves due to the restriction given. Any help is appreciated. Thank you.

 * RockPaperScissors
 * @author -

import java.util.*;

public class RPS16

{ // class variable declarations // initialize arrays static String inputs[] = {"p", "r", "s", "x"}; //allowable user inputs static String rps[] = {"r", "p", "s"}; //allowable computer choices static String cMoves[] = {"", "", "", ""}; //computer move history static String uMoves[] = {"", "", "", ""}; //user move history

    // initialize data variables
static String computerMove = "d"; //dummy d to start
static String userMove = "d";  //dummy d to start
static String again = "y";  //initialize loop to yes - play again
static String winner = ""; // who won this round
static String scomputerMove="r";

static String tcomputerMove="r";

static int rounds = 0;   //number of rounds played

//variables for scoring
static int cScore=0, cWin=0,cLoss=0,cTie=0, cPenalty=0;  //c = computer
static int uScore=0, uWin=0,uLoss=0,uTie=0, uPenalty=0;  //u = user(player)

public static void main(String a[])

Scanner kb = new Scanner(;
System.out.println("enter 's' to start Rock-Paper-Scissors");
String dummy =;       //start?  press any key to start

//do while loop will start here

do{ rounds += 1; //update round count

// calculate RPS values and find legal next moves for the player

//calculate best move for computer

shiftArrays(cMoves,computerMove); bestMove(cMoves, uMoves); updateNewMoves(cMoves,computerMove); if(legalMove(cMoves)==false) { computerMove=scomputerMove; if(rounds>4) { cMoves[3]=computerMove; } else { cMoves[rounds-1]=computerMove; }



System.out.print("My selection is " + computerMove + "       ");
//update computer moves
//input user move

{   System.out.print("Your selection(r,p,s, or x to exit) ");
    userMove =;
    userMove = userMove.toLowerCase();
    {System.out.println("invalid entry");
}while (!legalInput(userMove));

// check to exit
{//System.out.print("should exit here");
//update move arrays with new entries


//check for legal move from both

legalMove(cMoves); legalMove(uMoves); if(legalMove(cMoves) == false)

if(legalMove(uMoves) == false) { System.out.print("Invalid user move"); uPenalty++; } // continue to score after checking for no illegal moves

//check who won computer, player, tie determineWinner(); if(determineWinner().equals("c")) { cWin++; uLoss++; } if(determineWinner().equals("cu")) { cTie++; uTie++; } if(determineWinner().equals("u")) { cLoss++; uWin++; } // if computer wins update computer scores

// if user (player) wins update player scores // if a tie, update both players

// calculate scores calcScore(); // display Scores displayScore(); // shift arrays for(int x = 0; x

System.out.print(cMoves[x]+" ,"); } System.out.println(); for(int x = 0; x

System.out.print(uMoves[x]+" ,"); } System.out.println(); }while (userMove != "x");


}  // end main

public static String bestMove(String cMoves[],String uMoves[])  //selects best move based on each
    String output = "";                  //opponents allowable moves

if(rounds>3) { int urCount = 0; int usCount = 0; int upCount = 0; int crCount = 0; int csCount = 0; int cpCount = 0; for(int x = 0; x

for(int x = 0; x<uMoves.length; x++)

if(urCount==2||upCount==2||usCount==2) {

                computerMove = "s";

scomputerMove = "p"; tcomputerMove = "r";

                computerMove = "r";

scomputerMove = "s"; tcomputerMove = "p";


                computerMove = "r";

scomputerMove = "s"; tcomputerMove = "p"; output="r"; } if(urCount==0) { computerMove = "p"; scomputerMove = "r"; tcomputerMove = "s"; output="p"; } } else {

                computerMove = "p";

scomputerMove = "r"; tcomputerMove = "s"; output="p"; } if(upCount==0) { computerMove = "s"; scomputerMove = "p"; tcomputerMove = "r"; output="s"; } } else { } }



else { int p = (int)(Math.random()*10)+1; if(p>3) { if(p<=6) { output = "s"; computerMove="s"; scomputerMove="r"; tcomputerMove="p"; } else if(p>6) { output = "p"; computerMove="p"; scomputerMove="r"; tcomputerMove="s"; } } else if(p<=3) output = "r"; computerMove="r"; scomputerMove="p"; tcomputerMove="s"; } // return next move return computerMove; }

public static boolean legalMove(String moves[])  //checks for legal move
    boolean legality = true;

if(rounds <4) { int rCount=0; int pCount=0; int sCount=0; for(int x = 0; x2||sCount>2||pCount>2) { legality = false; } else { legality = true; }


else {

    int rCount=0;
    int pCount=0;
    int sCount=0;
    for(int x = 0; x<moves.length; x++)
                    legality = true;

            legality = false;
            legality = false;
            legality = false;
            legality = false;
                    legality = true;
            return legality;

//true if legal, false if illegal move (must have one each r,p,s)

public static boolean legalInput(String input)  //checks for legal move
    int index = Arrays.binarySearch(inputs, input);
    //System.out.println("input index = " + index);
    return index >= 0;  //true if legal(in inputs list)

public static String determineWinner()
    // return t for tie, c for computer, u for user(player)
    {return "t";}

    if ((computerMove.equals("r")) && (userMove.equals("s")))
        return "c";

    if ((computerMove.equals("p")) && (userMove.equals("r")))
        return "c";
    if ((computerMove.equals("s")) && (userMove.equals("p")))
        return "c";
        return "cu";
            return "u";

public static void shiftArrays(String moves[],String s)
    if(rounds < 4)

        for(int r = 0; r < moves.length; r++)
            moves[r]= moves[r+1];

    // in rounds 1,2,3 no shifting is needed

public static void updateNewMoves(String moves[], String s)
    if(rounds < 4)
        moves[rounds - 1] = s;
        moves[3] = s;

public static void calcScore()
//  cScore
cScore = (cWin*3) + (cTie*1)+(cPenalty*(-10));
//  uScore
uScore = (uWin*3) + (uTie*1)+(uPenalty*(-10));

public static void displayScore()
if (cScore == uScore)
    {System.out.println("     The score is tied!");
if (cScore > uScore)
    {System.out.println("     I am ahead by " + (cScore - uScore));
if (uScore > cScore)
    {System.out.println("     You are ahead by " + (uScore - cScore));

public static void printFinalScore()
System.out.println("Computer scores " + cWin + " wins   " + cLoss + " losses  "
     + cTie + " Ties " + cPenalty + " penalties  equals " + cScore);
System.out.println("User scores " + uWin + " wins   " + uLoss + " losses  "
     + uTie + " Ties " + uPenalty + " penalties  equals " + uScore);
System.out.println("computer score  " + cScore +
    "   user score  " + uScore);



by upvote at May 27, 2016 03:08 AM



Is the halting problem decidable for 3 symbol one dimensional cellular automata?

I've been trying to figure out if the halting problem is decidable for 3-symbol one-dimensional cellular automata.

Definition Let $f(w,i)$ denote the configuration of the system at time step $i$. More formally $f:A^*\times \mathbb{N} \to A^*$, where $A$ is the alphabet.

Definition. A cellular automaton has halted in configuration $f(w,i)$, if $\forall k\in \mathbb{N}$ we have that $f(w,i)=f(w,i+k)$.

The halting problem for a given cellular automaton is as follows:

Input: a finite word $w$
Question: will the automaton halt in some state $s$?

Elementary cellular automata (with 2 symbols) are defined here. I am focused on the same sort of celullar automata, except that I'm interested in the case of CA's with 3 symbols instead of just 2 symbols.

From now on, I will denote my rules in the form of $***\to*$, meaning that 3 neighboring symbols produce another one beneath them.

The halting problem is decidable for elementary, 2-symbol cellular automata

I will use $0$ to denote a white cell and $1$ to denote a black one.

If we have rules $000 \to 1$, $001 \to 1$, $100 \to 1$ we know the automaton won't halt. Because with the first rule, since our grid is infinite, we will always have 3 white cells that will generate a black cell. With the second and 3rd rules the word will be expanding to the sides and the automaton will never halt.

In the rest of the cases we can let it evolve for $2^n$ steps and see if it halts. If it does halt, then ok, it halts, if it doesn't then it is repeating some combinations and is stuck in a loop, so we can also conclude that it won't halt.

What I have figured out for the 3 symbol case

It is obvious that it won't halt if we have rules $000 \to 1$ or $000 \to 2$. But the side rules of the form $00x \to y$ and $x00 \to y$ are harder to analyze, because what if we have rules $002 \to 1$ and $001 \to 0$?

Here's what I came up with:

let's consider all combinations of such rules:

  1. $001 \to 0$ and $002 \to 0$
  2. $001 \to 0$ and $002 \to 1$
  3. $001 \to 0$ and $002 \to 2$
  4. $001 \to 1$ and $002 \to 0$
  5. $001 \to 1$ and $002 \to 1$
  6. $001 \to 1$ and $002 \to 2$
  7. $001 \to 2$ and $002 \to 0$
  8. $001 \to 2$ and $002 \to 1$
  9. $001 \to 2$ and $002 \to 2$

I didn't write the cases for the rules of the form $x00 \to y$, because those are symmetrical.

So, in the first case it's obvious that the input word won't be expanding to the sides, because those side symbol rules produce zeros.

In cases 5, 6, 8, 9 it's obvious that the automaton will never halt, because the input word will be expanding.

Cases 2,3,4,7 are more interesting. First, let's note, that case 2 is similar to case 7 and case 3 is similar to case 4. So, let's just consider cases 2 and 3 for conciseness.

I'm gonna consider case 3 first, because it's easier.

We have $001 \to 0$ and $002 \to 2$. It is obvious that if the first or last symbol of our input word is $2$, then we can conclude that the automaton won't halt. But if they are '1', then we have to look at more stuff, in particular, let's look at rules that can turn the last or first symbols into $2$, because if we have those, then after they do produce that $2$, we can conclude that the automaton won't halt. (the word will be expanding to the side(s)).

Here are all combinations that we need to consider:

010 011 012
 0   0   0
 0   0   1
 0   0   2
 0   1   0
 0   1   1
........... etc

An explanation of what happens if we have the first triple from the above table

We have a word $w$, written on the grid. The first and last symbols are $1$. Let's say we have rules $010 \to 0$, $011 \to 0$, $012 \to 0$ (the first triple) from above. Then we know that with each next step our input word will be getting smaller by 2 symbols, because these rules erase the first and last symbols, but if at some point we get a $2$, then the rule $002 \to 2$ will make the word grow to one side or the other (or both) and the automaton will never halt. So, all in all, in this case we can let the automaton do $|w|/2$ steps, and if the word becomes empty, then the automaton halts, if not, then it doesn't.

Generalized case 3

I generalized it and noticed that we can simply let the automaton do $3^n$ steps and if at any one of those steps we have a $2$ as first or last symbol, then the automaton won't halt. If that doesn't happen and the automaton still didn't halt, then it's repeating some configuration, so it's stuck in a loop and won't halt. If it halts, then it halts.

Where I get stuck

Now let's consider case 2.

We have rules $001 \to 0$ and $002 \to 1$.

And here is where I got stuck and don't know what to do.

I also wrote out a table of rules that start with $1$. I wrote those out, because they seemed to be the first thing I should analyze, because even if we have the input word with first or last (or both) symbol as $2$, at the next step those $2's$ will turn into a $1$. And we will have to deal with rules of the form $01x \to y$.

Here's the table:

010 011 012
 0   0   0
 0   0   1
 0   0   2
 0   1   0
 0   1   1
 0   1   2
 0   2   0
 0   2   1
 0   2   2
 1   0   0
 1   0   1
 1   0   2
 1   1   0
 1   1   1
 1   1   2
 1   2   0
 1   2   1
 1   2   2
 2   0   0
 2   0   1
 2   0   2
 2   1   0
 2   1   1
 2   1   2
 2   2   0
 2   2   1
 2   2   2

It is also obvious, that if among our 27 rules, we have a triple from this table in which no rule derives a $2$, then we have nothing to worry about and can simply let the automaton evolve for $3^n$ steps, because it won't really expand, since the side rules will not produce a $2$.

But looking at the triples that do have a $2$, it's actually very hard to analyze, and whether the word will expand or not also seems to depend on the input word.

Can you guys tell me how to solve this? I can't seem to wrap my head around this.

Or, if this 3 symbol cellular automaton looks like something for which the halting problem has been proven to be undecidable, how can I reduce that something to 3 symbol cellular automata?

by paulpaul1076 at May 27, 2016 02:40 AM



Maximum subset pairwise not divisible by $K$

I have a set of numbers, and want to calculate the maximum subset such that the sum of any two of it's elements is not divisible by an integer $K$. I tried to solve this problem, but I have found the quadratic solution, which is not efficient response.
$K < 100, N < 10000$, where $N$ is the number of elements and $K$ is given constant. Is there better than quadratic solution?

by manduinca at May 27, 2016 02:05 AM


What is the heat-map method of calculating VaR?

I'm familiar with the historical full revaluation, VcV, and Delta-gamma methods, but a client keeps talking about a heat-map method and I'm not sure what he's talking about.

Any ideas?

by Brian Wilson at May 27, 2016 02:03 AM


How can I cleanly map to a method that returns java.util.Optional? [duplicate]

This question already has an answer here:

This code works:

private Optional<Long> getMaybeLong(String string) {

private void foo(String[] strings) {
    List<Long> longs =

However I find it a bit clunky to call Optional#isPresent followed by Optional#get. Is there a cleaner way to go from each Optional<Long> to a List<Long>?

by Andrew Swan at May 27, 2016 01:53 AM

arXiv Networking and Internet Architecture

Performance Enhancement of the Golden Code by Utilizing ORIOL Antenna. (arXiv:1605.08427v1 [cs.IT])

In this paper, a novel method is exposed to improve the performance of the Golden code, by using octagonal reconfigurable isolated orthogonal element (ORIOL) antennas, instead of a conventional microstrip patch antenna. The aforementioned antenna, should be employed in both the transmitter and the receiver sides, to approach the mentioned improvement. As a matter fact, in this paper, we recommend space-time-polarization diversity instead of space-time singly; therefore it is obvious that by employing the aforementioned technique, the system obtains more strength against destructive fading. The simulations for different rates have confirmed that, utilizing ORIOL antenna outperforms patch microstrip one, which is roughly about $2$ to $3$ dB, according to the rates.

by <a href="">Vahid Amiri</a>, <a href="">Mohammadali Sadat Hosseini</a>, <a href="">Ali Lotfi-Rezaabad</a>, <a href="">Siamak Talebi</a> at May 27, 2016 01:30 AM

Advancing the State-of-the-Art in Hardware Trojans Design. (arXiv:1605.08413v1 [cs.CR])

Electronic Design Automation (EDA) industry heavily reuses third party IP cores. These IP cores are vulnerable to insertion of Hardware Trojans (HTs) at design time by third party IP core providers or by malicious insiders in the design team. State of the art research has shown that existing HT detection techniques, which claim to detect all publicly available HT benchmarks, can still be defeated by carefully designing new sophisticated HTs. The reason being that these techniques consider the HT landscape to be limited only to the publicly known HT benchmarks, or other similar (simple) HTs. However the adversary is not limited to these HTs and may devise new HT design principles to bypass these countermeasures.

In this paper, we discover certain crucial properties of HTs which lead to the definition of an exponentially large class of Deterministic Hardware Trojans $H_D$ that an adversary can (but is not limited to) design. The discovered properties serve as HT design principles, based on which we design a new HT called 'XOR-LFSR' and present it as a 'proof-of-concept' example from the class $H_D$. These design principles help us understand the tremendous ways an adversary has to design a HT, and show that the existing publicly known HT benchmarks are just the tip of the iceberg on this huge landscape. This work, therefore, stresses that instead of guaranteeing a certain (low) false negative rate for a small constant set of publicly known HTs, a rigorous HT detection tool should take into account these newly discovered HT design principles and hence guarantee the detection of an exponentially large class (exponential in number of wires in IP core) of HTs with negligible false negative rate.

by <a href="">Syed Kamran Haider</a>, <a href="">Chenglu Jin</a>, <a href="">Marten van Dijk</a> at May 27, 2016 01:30 AM

Fast plurality consensus in regular expanders. (arXiv:1605.08403v1 [cs.DM])

Pull voting is a classic method to reach consensus among $n$ vertices with differing opinions in a distributed network: each vertex at each step takes on the opinion of a random neighbour. This method, however, suffers from two drawbacks. Even if there are only two opposing opinions, the time taken for a single opinion to emerge can be slow and the final opinion is not necessarily the initially held majority.

We refer to a protocol where 2 neighbours are contacted at each step as a 2-sample voting protocol. In the two-sample protocol a vertex updates its opinion only if both sampled opinions are the same. Not much was known about the performance of two-sample voting on general expanders in the case of three or more opinions. In this paper we show that the following performance can be achieved on a $d$-regular expander using two-sample voting. We suppose there are $k \ge 3$ opinions, and that the initial size of the largest and second largest opinions is $A_1, A_2$ respectively.

We prove that, if $A_1 - A_2 \ge C n \max\{\sqrt{(\log n)/A_1}, \lambda\}$, where $\lambda$ is the absolute second eigenvalue of matrix $P=Adj(G)/d$ and $C$ is a suitable constant, then the largest opinion wins in $O((n \log n)/A_1)$ steps with high probability.

For almost all $d$-regular graphs, we have $\lambda=c/\sqrt{d}$ for some constant $c>0$. This means that as $d$ increases we can separate an opinion whose majority is $o(n)$, whereas $\Theta(n)$ majority is required for $d$ constant.

This work generalizes the results of Becchetti et. al (SPAA 2014) for the complete graph $K_n$.

by <a href="">Colin Cooper</a>, <a href="">Tomasz Radzik</a>, <a href="">Nicol&#xe1;s Rivera</a>, <a href="">Takeharu Shiraga</a> at May 27, 2016 01:30 AM

On Modeling Coverage and Rate of Random Cellular Networks under Generic Channel Fading. (arXiv:1605.08381v1 [cs.IT])

In this paper we provide an analytic framework for computing the expected downlink coverage probability, and the associated data rate of cellular networks, where base stations are distributed in a random manner. The provided expressions are in computable integral forms that accommodate generic channel fading conditions. We develop these expressions by modelling the cellular interference using stochastic geometry analysis, then we employ them for comparing the coverage resulting from various channel fading conditions namely Rayleigh and Rician fading, in addition to the fading-less channel. Furthermore, we expand the work to accommodate the effects of random frequency reuse on the cellular coverage and rate. Monte-Carlo simulations are conducted to validate the theoretical analysis, where the results show a very close match.

by <a href="">Akram Al-Hourani</a>, <a href="">Sithamparanathan Kandeepan</a> at May 27, 2016 01:30 AM

Probabilistic Inference Modulo Theories. (arXiv:1605.08367v1 [cs.AI])

We present SGDPLL(T), an algorithm that solves (among many other problems) probabilistic inference modulo theories, that is, inference problems over probabilistic models defined via a logic theory provided as a parameter (currently, propositional, equalities on discrete sorts, and inequalities, more specifically difference arithmetic, on bounded integers).

While many solutions to probabilistic inference over logic representations have been proposed,

SGDPLL(T) is simultaneously (1) lifted, (2) exact and (3) modulo theories, that is, parameterized by a background logic theory.

This offers a foundation for extending it to rich logic languages such as data structures and relational data.

By lifted, we mean algorithms with constant complexity in the domain size (the number of values that variables can take). We also detail a solver for summations with difference arithmetic and show experimental results from a scenario in which SGDPLL(T) is much faster than a state-of-the-art probabilistic solver.

by <a href="">Rodrigo de Salvo Braz</a>, <a href="">Ciaran O&#x27;Reilly</a>, <a href="">Vibhav Gogate</a>, <a href="">Rina Dechter</a> at May 27, 2016 01:30 AM

Hitting minors, subdivisions, and immersions in tournaments. (arXiv:1605.08366v1 [cs.DM])

The Erd\H{o}s-P\'osa property usually relates parameters of covering and packing of combinatorial structures, and has been mostly studied in the setting of undirected graphs. In this note, we use results of Chudnovsky, Fradkin, Kim, and Seymour to show that for every directed graph $H$, the class of all minor-expansions (resp. subdivisions) of $H$ has the vertex-Erd\H{o}s-P\'osa property in the class of tournaments. We also prove that if $H$ is a strongly connected directed graph, the class of all immersion-expansions of $H$ has the edge-Erd\H{o}s-P\'osa property in the class of tournaments. Our results are orthogonal to the recent results of Amiri et al. [arXiv:1603.02504, March 2016] in the sense that we restrict the class of "host graphs", whereas they restrict the class of "guest graphs".

by <a href="">Jean-Florent Raymond</a> at May 27, 2016 01:30 AM

MobileAppScrutinator: A Simple yet Efficient Dynamic Analysis Approach for Detecting Privacy Leaks across Mobile OSs. (arXiv:1605.08357v1 [cs.CR])

Smartphones, the devices we carry everywhere with us, are being heavily tracked and have undoubtedly become a major threat to our privacy. As "tracking the trackers" has become a necessity, various static and dynamic analysis tools have been developed in the past. However, today, we still lack suitable tools to detect, measure and compare the ongoing tracking across mobile OSs. To this end, we propose MobileAppScrutinator, based on a simple yet efficient dynamic analysis approach, that works on both Android and iOS (the two most popular OSs today). To demonstrate the current trend in tracking, we select 140 most representative Apps available on both Android and iOS AppStores and test them with MobileAppScrutinator. In fact, choosing the same set of apps on both Android and iOS also enables us to compare the ongoing tracking on these two OSs. Finally, we also discuss the effectiveness of privacy safeguards available on Android and iOS. We show that neither Android nor iOS privacy safeguards in their present state are completely satisfying.

by <a href="">Jagdish Prasad Achara</a>, <a href="">Vincent Roca</a>, <a href="">Claude Castelluccia</a>, <a href="">Aurelien Francillon</a> at May 27, 2016 01:30 AM

User Performance in Small Cells Networks with Inter-Cell Mobility. (arXiv:1605.08353v1 [cs.NI])

We analyze the impact of intra-cell mobility on user performance in dense networks such as that enabled by LTE-A and 5G. To this end, we consider a homogeneous network of small cells and first show how to reduce the evaluation of user performance to the case of a single representative cell. We then propose simple analytical models that capture mobility through the distribution of the residual sojourn time of mobile users in the cell. An approximate model, based on Quasi-Stationary (QS) assumptions, is developed in order to speed up computation in the Markovian framework. We use these models to derive the average throughput of both mobile and static users, along with the probability of handover for mobile users. Numerical evaluation and simulation results are provided to assess the accuracy of the proposed models. We show, in particular, that both classes of users benefit from a throughput gain induced by the "opportunistic" displacement of mobile users among cells.

by <a href="">Philippe Olivier</a>, <a href="">Alain Simonian</a> at May 27, 2016 01:30 AM

Theano-MPI: a Theano-based Distributed Training Framework. (arXiv:1605.08325v1 [cs.LG])

We develop a scalable and extendable training framework that can utilize GPUs across nodes in a cluster and accelerate the training of deep learning models based on data parallelism. Both synchronous and asynchronous training are implemented in our framework, where parameter exchange among GPUs is based on CUDA-aware MPI. In this report, we analyze the convergence and capability of the framework to reduce training time when scaling the synchronous training of AlexNet and GoogLeNet from 2 GPUs to 8 GPUs. In addition, we explore novel ways to reduce the communication overhead caused by exchanging parameters. Finally, we release the framework as open-source for further research on distributed deep learning

by <a href="">He Ma</a>, <a href="">Fei Mao</a>, <a href="">Graham W. Taylor</a> at May 27, 2016 01:30 AM

Near Optimal Channel Assignment for Interference Mitigation in Wireless Mesh Networks. (arXiv:1605.08321v1 [cs.NI])

In multi-radio multi-channel (MRMC) WMNs, interference alleviation is affected through several network design techniques e.g., channel assignment (CA), link scheduling, routing etc., intelligent CA schemes being the most effective tool for interference mitigation. CA in WMNs is an NP-Hard problem, and makes optimality a desired yet elusive goal in real-time deployments which are characterized by fast transmission and switching times and minimal end-to-end latency. The trade-off between optimal performance and minimal response times is often achieved through CA schemes that employ heuristics to propose efficient solutions. WMN configuration and physical layout are also crucial factors which decide network performance, and it has been demonstrated in numerous research works that rectangular/square grid WMNs outperform random or unplanned WMN deployments in terms of network capacity, latency, and network resilience. In this work, we propose a smart heuristic approach to devise a near-optimal CA algorithm for grid WMNs (NOCAG). We demonstrate the efficacy of NOCAG by evaluating its performance against the minimal-interference CA generated through a rudimentary brute-force technique (BFCA), for the same WMN configuration. We assess its ability to mitigate interference both, theoretically (through interference estimation metrics) and experimentally (by running rigorous simulations in NS-3). We demonstrate that the performance of NOCAG is almost as good as the BFCA, at a minimal computational overhead of O(n) compared to the exponential of BFCA.

by <a href="">Ranadheer Musham</a>, <a href="">Srikant Manas Kala</a>, <a href="">Pavithra Muthyap</a>, <a href="">Pavan Kumar Reddy Mule</a>, <a href="">Bheemarjuna Reddy Tamma</a> at May 27, 2016 01:30 AM

Overview of ISM bands and Software-defined Radio Experimentation. (arXiv:1605.08309v1 [cs.NI])

Wireless systems using low-power wireless communication protocol are rapidly gain popularity in the license-free industrial scientific, and medical (ISM) frequency bands. One such emerging trend in ISM frequency bands is home automation. Historically, all the home devices were once unconnected, today are now being connected either by a wired or wireless connection. The low-power wireless communication protocols enable integration of all the digital home devices into a single system and enhance end user product experience. The rapid prototyping of these end user product using programmable radio system such as software-defined radio (SDR) has been the game-changer in the field of home automation. In this article, we present an overview of popular ISM bands in different regions of the world. Furthermore, we analyze the signals from the wireless electric switch using a software-defined radio in 433 MHz ISM band.

by <a href="">Abhaykumar Kumbhar</a> at May 27, 2016 01:30 AM

Privacy Odometers and Filters: Pay-as-you-Go Composition. (arXiv:1605.08294v1 [cs.CR])

In this paper we initiate the study of adaptive composition in differential privacy when the length of the composition, and the privacy parameters themselves can be chosen adaptively, as a function of the outcome of previously run analyses. This case is much more delicate than the setting covered by existing composition theorems, in which the algorithms themselves can be chosen adaptively, but the privacy parameters must be fixed up front. Indeed, it isn't even clear how to define differential privacy in the adaptive parameter setting. We proceed by defining two objects which cover the two main use cases of composition theorems. A privacy filter is a stopping time rule that allows an analyst to halt a computation before his pre-specified privacy budget is exceeded. A privacy odometer allows the analyst to track realized privacy loss as he goes, without needing to pre-specify a privacy budget. We show that unlike the case in which privacy parameters are fixed, in the adaptive parameter setting, these two use cases are distinct. We show that there exist privacy filters with bounds comparable (up to constants) with existing privacy composition theorems. We also give a privacy odometer that nearly matches non-adaptive private composition theorems, but is sometimes worse by a small asymptotic factor. Moreover, we show that this is inherent, and that any valid privacy odometer in the adaptive parameter setting must lose this factor, which shows a formal separation between the filter and odometer use-cases.

by <a href="">Ryan Rogers</a>, <a href="">Aaron Roth</a>, <a href="">Jonathan Ullman</a>, <a href="">Salil Vadhan</a> at May 27, 2016 01:30 AM

A counterexample to Thiagarajan's conjecture. (arXiv:1605.08288v1 [cs.FL])

We provide a counterexample to a conjecture by Thiagarajan (1996 and 2002) that regular event structures correspond exactly to finite 1-safe Petri nets. The same counterexample is used to disprove a closely related conjecture by Badouel, Darondeau, and Raoult (1999) that domains of regular event structures with bounded $\natural$-cliques are recognizable by finite trace automata. A necessary condition for both conjectures to be true is that domains of respective regular event structures admit a regular nice labeling. Our counterexample of a regular event domain with bounded $\natural$-cliques, and not admitting a regular nice labeling is based on (i) the bijection between event domains, median graphs, and CAT(0) cube complexes and is derived from (ii) an example by Wise (1996 and 2007) of a nonpositively curved square complex ${\bf X}$ with six squares, whose edges are colored in five colors, and whose universal cover $\widetilde{\bf X}$ is a CAT(0) square complex containing a particular plane with an aperiodic tiling.

by <a href="">J&#xe9;r&#xe9;mie Chalopin</a>, <a href="">Victor Chepoi</a> at May 27, 2016 01:30 AM

K-method of cognitive mapping analysis. (arXiv:1605.08243v1 [cs.SI])

Introduced a new calculation method (K-method) for cognitive maps. K - method consists of two consecutive steps. In the first stage, allocated subgraph composed of all paths from one selected node (concept) to another node (concept) from the cognitive map (directed weighted graph) . In the second stage, after the transition to an undirected graph (symmetrization adjacency matrix) the influence of one node to another calculated with Kirchhoff method. In the proposed method, there is no problem inherent in the impulse method. In addition to "pair" influence of one node to another, the average characteristics are introduced, allowing to calculate the impact of the selected node to all other nodes and the influence of all on the one selected. For impulse method similar to the average characteristics in the case where the pulse method "works" are introduced and compared with the K-method.

by <a href="">A.A. Snarskii</a>, <a href="">D.I. Zorinets</a>, <a href="">D.V. Lande</a>, <a href="">A.V. Levchenko</a> at May 27, 2016 01:30 AM

EPEM: A General and Validated Energy Complexity Model for Multithreaded Algorithms. (arXiv:1605.08222v1 [cs.DC])

Like time complexity models that have significantly contributed to the analysis and development of fast algorithms, energy complexity models for parallel algorithms are desired as crucial means to develop energy efficient algorithms for ubiquitous multicore platforms. Ideal energy complexity models should be validated on real multicore platforms and applicable to a wide range of parallel algorithms. However, existing energy complexity models for parallel algorithms are either theoretical without model validation or algorithm-specific without ability to analyze energy complexity for a wide-range of parallel algorithms.

This paper presents a new general validated energy complexity model for parallel (multithreaded) algorithms. The new model abstracts away possible multicore platforms by their static and dynamic energy of computational operations and data access, and derives the energy complexity of a given algorithm from its work, span and I/O complexity. The new energy complexity model is validated by different sparse matrix vector multiplication (SpMV) algorithms and dense matrix multiplication (matmul) algorithms running on high performance computing (HPC) platforms (e.g., Intel Xeon and Xeon Phi). The new model is able to characterize and compare the energy consumption of SpMV and matmul kernels according to three aspects: different algorithms, different input matrix types and different platforms. The prediction of the new model regarding which algorithm consumes more energy with different inputs on different platforms, is confirmed by the experimental results. In order to improve the usability and accuracy of the new model for a wide range of platforms, the platform parameters of EPEM model are provided for eleven platforms including HPC, accelerator and embedded platforms.

by <a href="">Vi Ngoc-Nha Tran</a>, <a href="">Phuong Hoai Ha</a> at May 27, 2016 01:30 AM

The Symbolic Interior Point Method. (arXiv:1605.08187v1 [cs.AI])

A recent trend in probabilistic inference emphasizes the codification of models in a formal syntax, with suitable high-level features such as individuals, relations, and connectives, enabling descriptive clarity, succinctness and circumventing the need for the modeler to engineer a custom solver. Unfortunately, bringing these linguistic and pragmatic benefits to numerical optimization has proven surprisingly challenging. In this paper, we turn to these challenges: we introduce a rich modeling language, for which an interior-point method computes approximate solutions in a generic way. While logical features easily complicates the underlying model, often yielding intricate dependencies, we exploit and cache local structure using algebraic decision diagrams (ADDs). Indeed, standard matrix-vector algebra is efficiently realizable in ADDs, but we argue and show that well-known optimization methods are not ideal for ADDs. Our engine, therefore, invokes a sophisticated matrix-free approach. We demonstrate the flexibility of the resulting symbolic-numeric optimizer on decision making and compressed sensing tasks with millions of non-zero entries.

by <a href="">Martin Mladenov</a>, <a href="">Vaishak Belle</a>, <a href="">Kristian Kersting</a> at May 27, 2016 01:30 AM

Analyzing the Gadgets Towards a Metric to Measure Gadget Quality. (arXiv:1605.08159v1 [cs.SE])

Current low-level exploits often rely on code-reuse, whereby short sections of code (gadgets) are chained together into a coherent exploit that can be executed without the need to inject any code. Several protection mechanisms attempt to eliminate this attack vector by applying code transformations to reduce the number of available gadgets. Nevertheless, it has emerged that the residual gadgets can still be sufficient to conduct a successful attack. Crucially, the lack of a common metric for "gadget quality" hinders the effective comparison of current mitigations. This work proposes four metrics that assign scores to a set of gadgets, measuring quality, usefulness, and practicality. We apply these metrics to binaries produced when compiling programs for architectures implementing Intel's recent MPX CPU extensions. Our results demonstrate a 17% increase in useful gadgets in MPX binaries, and a decrease in side-effects and preconditions, making them better suited for ROP attacks.

by <a href="">Andreas Follner</a>, <a href="">Alexandre Bartel</a>, <a href="">Eric Bodden</a> at May 27, 2016 01:30 AM

The Many Worlds of Modal {\lambda}-calculi: I. Curry-Howard for Necessity, Possibility and Time. (arXiv:1605.08106v1 [cs.LO])

This is a survey of {\lambda}-calculi that, through the Curry-Howard isomorphism, correspond to constructive modal logics. We cover the prehistory of the subject and then concentrate on the developments that took place in the 1990s and early 2000s. We discuss logical aspects, modal {\lambda}-calculi, and their categorical semantics. The logics underlying the calculi surveyed are constructive versions of K, K4, S4, and LTL.

by <a href="">G. A. Kavvos</a> at May 27, 2016 01:30 AM

Proceedings First Workshop on Pre- and Post-Deployment Verification Techniques. (arXiv:1605.08096v1 [cs.LO])

The PrePost (Pre- and Post-Deployment Verification Techniques) workshop aimed at bringing together researchers working in the field of computer-aided validation and verification to discuss the connections and interplay between pre- and post-deployment verification techniques. Examples of the topics covered by the workshop are the relationships between classic model checking and testing on the one hand and runtime verification and statistical model checking on the other, and between type systems that may be checked either statically or dynamically through techniques such as runtime monitoring.

by <a href="">Luca Aceto</a> (ICE-TCS, School of Computer Science, Reykjavik University), <a href="">Adrian Francalanza</a> (Computer Science, Faculty of Information &amp; Communications Technology, University of Malta), <a href="">Anna Ingolfsdottir</a> (ICE-TCS, School of Computer Science, Reykjavik University) at May 27, 2016 01:30 AM

A generalization of unit distances, angular properties and convexity. (arXiv:1605.08066v1 [cs.DM])

In this paper, we prove that unit distance graphs on convex point sets with n vertices have O(n) edges improving the previous known bound of O(n log n).

by <a href="">Abhijeet Khopkar</a> at May 27, 2016 01:30 AM

Cryptographic applications of capacity theory: On the optimality of Coppersmith's method for univariate polynomials. (arXiv:1605.08065v1 [cs.CR])

We draw a new connection between Coppersmith's method for finding small solutions to polynomial congruences modulo integers and the capacity theory of adelic subsets of algebraic curves. Coppersmith's method uses lattice basis reduction to construct an auxiliary polynomial that vanishes at the desired solutions. Capacity theory provides a toolkit for proving when polynomials with certain boundedness properties do or do not exist. Using capacity theory, we prove that Coppersmith's bound for univariate polynomials is optimal in the sense that there are \emph{no} auxiliary polynomials of the type he used that would allow finding roots of size $N^{1/d+\epsilon}$ for monic degree-$d$ polynomials modulo $N$. Our results rule out the existence of polynomials of any degree and do not rely on lattice algorithms, thus eliminating the possibility of even superpolynomial-time improvements to Coppersmith's bound. We extend this result to constructions of auxiliary polynomials using binomial polynomials, and rule out the existence of any auxiliary polynomial of this form that would find solutions of size $N^{1/d+\epsilon}$ unless $N$ has a very small prime factor.

by <a href="">Ted Chinburg</a>, <a href="">Brett Hemenway</a>, <a href="">Nadia Heninger</a>, <a href="">Zachary Scherr</a> at May 27, 2016 01:30 AM

$\ell_p$-Box ADMM: A Versatile Framework for Integer Programming. (arXiv:1604.07666v2 [cs.CV] UPDATED)

This paper revisits the integer programming (IP) problem, which plays a fundamental role in many computer vision and machine learning applications. The literature abounds with many seminal works that address this problem, some focusing on continuous approaches (e.g. linear program relaxation) while others on discrete ones (e.g., min-cut). However, a limited number of them are designed to handle the general IP form and even these methods cannot adequately satisfy the simultaneous requirements of accuracy, feasibility, and scalability. To this end, we propose a novel and versatile framework called $\ell_p$-box ADMM, which is based on two parts. (1) The discrete constraint is equivalently replaced by the intersection of a box and a $(n-1)$-dimensional sphere (defined through the $\ell_p$ norm). (2) We infuse this equivalence into the ADMM (Alternating Direction Method of Multipliers) framework to handle these continuous constraints separately and to harness its attractive properties. More importantly, the ADMM update steps can lead to manageable sub-problems in the continuous domain. To demonstrate its efficacy, we consider an instance of the framework, namely $\ell_2$-box ADMM applied to binary quadratic programming (BQP). Here, the ADMM steps are simple, computationally efficient, and theoretically guaranteed to converge to a KKT point. We demonstrate the applicability of $\ell_2$-box ADMM on three important applications: MRF energy minimization, graph matching, and clustering. Results clearly show that it significantly outperforms existing generic IP solvers both in runtime and objective. It also achieves very competitive performance vs. state-of-the-art methods specific to these applications.

by <a href="">Baoyuan Wu</a>, <a href="">Bernard Ghanem</a> at May 27, 2016 01:30 AM


Compiling a Functional Language

This paper summarizes my experience in implementing a compiler for a functional language. The language is ML1 [Milner 84] and the compiler was first implemented in 1980 as a personal project when I was a postgraduate student at the University of Edinburgh2. In this paper, “the” ML compiler refers to my VAX implementation.

At the time, I was familiar with programming language semantics but knew very little about compiler technology; interpreters had been my main programming concern. Major influences in the design of this compiler have been [Steele 77] [Steele 78] and the implementation folklore for statically and dynamically scoped dialects of Lisp [Allen 78]. As a result, the internal structure of the compiler is fairly unorthodox, if compared for example with [Aho 78].

Anyway, a compiler for a language like ML has to be different. ML is interactive, statically scoped, strongly typed, polymorphic, and has first class higher-order functions, type inference and dynamic allocation. These features preclude many well-known implementation styles, particularly the ones used for Lisp (because of static scoping), the Algol family (because of functional values) and C (because of nested scoping and strong typing). The interaction of these features is what gives ML its “character”, and makes compilation challenging.


by lojikil at May 27, 2016 01:19 AM

Planet Theory

Towards large-scale deliberative decision-making: small groups and the importance of triads

Authors: Ashish Goel, David T. Lee
Download: PDF
Abstract: Though deliberation is a critical component of democratic decision-making, existing deliberative processes do not scale to large groups of people. Motivated by this, we propose a model in which large-scale decision-making takes place through a sequence of small group interactions. Our model considers a group of participants, each having an opinion which together form a graph. We show that for median graphs, a class of graphs including grids and trees, it is possible to use a small number of three-person interactions to tightly approximate the wisdom of the crowd, defined here to be the generalized median of participant opinions, even when agents are strategic. Interestingly, we also show that this sharply contrasts with small groups of size two, for which we prove an impossibility result. Specifically, we show that it is impossible to use sequences of two-person interactions satisfying natural axioms to find a tight approximation of the generalized median, even when agents are non-strategic. Our results demonstrate the potential of small group interactions for reaching global decision-making properties.

May 27, 2016 01:13 AM


Common Lisp or Scheme for server-side?

I wonder if some functional languages are used for web development and which are most useful and supported with that goal?

by Alexander Ivanov at May 27, 2016 01:08 AM


Are there specific rules for programming languages applicable to Quantum Computers?

What would be the requirements for programming languages that are used for quantum computers? Do standard procedural languages with their constructs be sufficient, and succinctly capture the computing capabilities of a Quantum Computer? To illustrate, languages such as Assembler would most likely be closest to the hardware, and the programmer programs to the Assemblers' specifications. CSP is an example of specification for concurrent processes. Are there any formalisms such as CSP for quantum computers, or even languages such as Assembly for quantum computers?

by user3483902 at May 27, 2016 01:03 AM

Planet Theory

A resource-frugal probabilistic dictionary and applications in (meta)genomics

Authors: Camille Marchet, Antoine Limasset, Lucie Bittner, Pierre Peterlongo
Download: PDF
Abstract: Genomic and metagenomic fields, generating huge sets of short genomic sequences, brought their own share of high performance problems. To extract relevant pieces of information from the huge data sets generated by current sequencing techniques, one must rely on extremely scalable methods and solutions. Indexing billions of objects is a task considered too expensive while being a fundamental need in this field. In this paper we propose a straightforward indexing structure that scales to billions of element and we propose two direct applications in genomics and metagenomics. We show that our proposal solves problem instances for which no other known solution scales-up. We believe that many tools and applications could benefit from either the fundamental data structure we provide or from the applications developed from this structure.

May 27, 2016 01:01 AM

Analysis of Resparsification

Authors: Jakub Pachocki
Download: PDF
Abstract: We show that schemes for sparsifying matrices based on iteratively resampling rows yield guarantees matching classic 'offline' sparsifiers (see e.g. Spielman and Srivastava [STOC 2008]).

In particular, this gives a formal analysis of a scheme very similar to the one proposed by Kelner and Levin [TCS 2013].

May 27, 2016 01:01 AM


PDA to CFG equivalence (No. of states and stack symbols)

All examples of problems on converting PDA to CFG I have encountered always have a number of states EQUAL to the stack symbols. Is that always the case?

I get confuse since in making productions for the start symbol, you use combinations of qXp (where q is the first state, X is in stack, p could be any of the states in line with the stack symbol). For example, States: {q0,q1} Stack symbols: {Z,X} Then productions for start has 2 possibilities: S -> [q0Zq0] or S -> [qoXq1]

What will happen if the number of states is not equal to that of the stack symbols? For example, States: {q0,q1,q2} Stack symbols: {Z,X} How can I pair the 3rd state with a stack symbol? since S -> [q0Zq0] or S -> [qoXq1] already taken?

Thank you.

Just preparing for class defense :)

by ramzel at May 27, 2016 12:25 AM


Step By Step method to calculating VaR using MonteCarlo Simulations

In trying to find VaR for 5 financial assets with prices over a long period of time(2000 days worth of data) how would I do the following:

  1. Carry out monte-carlo simulation in order to find a VaR value, assuming all 5 assets are standard normally distributed.
  2. Carry out monte-carlo simulation for all 5 assets to find a VaR value, assuming they follow a student-t distribution with 10 degrees of freedom?

I am trying to do this at both the 95% and 90% confidence levels, and simulate the data with 10,000 replications. Any help would be greatly appreciated. I have already created a Cholesky Decomposition Matrix but not sure how to use the data in it to get the VaR figure.

by Dr. Dre at May 27, 2016 12:03 AM

HN Daily

Planet Theory

Dominance Products and Faster Algorithms for High-Dimensional Closest Pair under $L_\infty$

Authors: Omer Gold, Micha Sharir
Download: PDF
Abstract: We give improved algorithmic time bounds for two fundamental problems, and establish a new complexity connection between them. The first is computing dominance product: given a set of $n$ points $p_1,\ldots, p_n$ in $\mathbb{R}^d$, compute a matrix $D$, such that $D[i,j] = \bigl| \{k \mid p_i[k] \leq p_j[k]\} \bigr|$; this is the number of coordinates at which $p_j$ dominates $p_i$. Dominance product computation has often been applied in algorithm design over the last decade.

The second problem is the $L_\infty$ Closest Pair in high dimensions: given a set $S$ of $n$ points in $\mathbb{R}^d$, find a pair of distinct points in $S$ at minimum distance under the $L_\infty$ metric. When $d$ is constant, there are efficient algorithms that solve this problem, and fast approximate solutions are known for general $d$. However, obtaining an exact solution in very high dimensions seems to be much less understood. We significantly simplify and improve previous results, showing that the problem can be solved by a deterministic strongly-polynomial algorithm that runs in $O(DP(n,d)\log n)$ time, where $DP(n,d)$ is the time bound for computing the dominance product for $n$ points in $\mathbb{R}^d$. For integer coordinates from some interval $[-M, M]$, and for $d=n^r$ for some $r > 0 $, we obtain an algorithm that runs in $\tilde{O}\left(\min\{Mn^{\omega(1,r,1)},\, DP(n,d)\}\right)$ time, where $\omega(1,r,1)$ is the exponent of multiplying an $n \times n^r$ matrix by an $n^r \times n$ matrix.

May 27, 2016 12:00 AM

May 26, 2016


How to show that an MINLP with L0 regularization is NP-hard?

I am currently working on a project that involves a mixed-integer non-linear optimization problem, and wondering if I can state that this problem NP-hard in a research paper. I'm not looking for a formal proof as much as a reasonable argument and/or a citation.

The MINLP has the following general form:

$\begin{align} \min_{x} & \quad f(x) + C ~\|x\|_0 \\ \text{s.t.} & \qquad x \in \mathbb{Z}^d \cap [10,10]^d \end{align}$


  • $\|x\|_0 = \sum_{i=1}^d 1[x_i \neq 0]$ is the $l_0$-norm

  • $C > 0$ is fixed constant that represents the $l_0$ regularization parameter

  • $f: \mathbb{R}^d \rightarrow \mathbb{R}^+$ is a strictly convex non-linear function. In practice, $f$ is a loss function and is part of the input. As an example, consider $f(x) = \log(1+\exp(-x^T z_k))$ where $z_1,\ldots,z_K \in \mathbb{R}^d$ are fixed parameters derived from a dataset.

In practice, I represent and implement this MINLP using the following formulation:

$\begin{align} \min_{x} & \quad \sum_{k=1}^K v_k + C ~\sum_{i=0}^d y_i & \\ \text{s.t.} & \qquad f_k = \log(1+\exp(-x^T z_k)) & \text{for } k = 1,\ldots,K\\ & \qquad |x_i| \leq 10 y_i & \text{for } i = 1,\ldots,d\\ & \qquad x_i \in \{-10,\ldots,10\} & \text{for } i = 1,\ldots,d\\ & \qquad y_i \in \{0,1\} & \text{for } i = 1,\ldots,d\\ \end{align}$

This formulation requires $K + 2d$ total variables ($f_k, x_i, y_i$) of which $K$ are real and $2d$ are discrete. In addition, it uses $K + 2d$ constraints (this accounts for the fact that we have to express the $|x_i| \leq 10 y_i$ constraints using two constraints, but does not account for the lower and upper bound constraints on $x_i$ and $y_i$).

Some thoughts: I am pretty sure that the problem is NP-hard, since it has the elements of other problems that are NP-hard such as $l_0$ regularization and integer constraints.

The problem can be reformulated as a "convex" MINLP (i.e., an optimization problem with a convex objective function with integer constraints). To see this, just replace the $\|x\|_0$ term in the objective with $\sum_{i=0}^d y_i$ and add the constraints $y_i \in \{0,1\}$ and $|x_i| \leq 10 y_i$ for $i = 1,\ldots,d$ to obtain:

$\begin{align} \min_{x} & \quad f(x) + C ~\sum_{i=0}^d y_i \\ \text{s.t.} & \qquad x \in \mathbb{Z}^d \cap [10,10]^d \\ & \qquad |x_i| \leq 10 y_i \text{ for } i = 1,\ldots,d \\ & \qquad y_i \in \{0,1\} \text{ for } i = 1,\ldots,d \end{align}$

This paper argues that most convex MINLP instances are NP-hard in the introduction, but also that they might be but says that it might also be instance (since they constitute a wide variety of optimization problems).

by Berk U. at May 26, 2016 11:51 PM

Wald's equation for dependent decrements

I'm taking a course on Coursera about approximation algorithms for fun and one of the modules uses probability to analyze the cost of a linear program with randomized rounding solution to the set cover problem.

One of the techniques employed is Wald's equation for dependent decrements described here:

For Alice's coin-flipping example, can somebody please describe to me how δ is increasing if it's chosen to be δ(nt) =1 / 6 * ceil(nt)? As the number of coins left decreases doesn't δ(nt) also decrease?

Clearly there's something missing in my understanding of how this works.

by Steadman Dillroper at May 26, 2016 11:44 PM

Ambiguity vs. context-sensitivity

It is said that attributes supply some semantic information to the grammar. Meantime, the same attributes let you to resolve ambiguities. Text books agree that it is worth haveing a CF grammar which admits ambiguities because the ambiguities will go away anyway once you evaluate your AST with attributes anyway. For this reason it is useless to invent an intractable CS grammar for your language.

This strongly associates ambiguities with context sensitivity in my mind. However, there is an expert in the field who states brazen faced that context sensitivity is absolutely irrelevant to ambiguity but supplies no support to ground his statement. Can you untangle the conflicting opinions by some light?

I cannot refrain from asking the same question in the following form. My mind is highly influenced by this description the of grammar type comparison in "Parsing Techniques: A Practical Guide"

Type n grammars are more powerful than Type n+1 grammars, for n = 0,1,2”, and one often reads statements like “A regular (Type 3) grammar is not powerful enough to match parentheses”. It means that more powerful grammars can define more complicated boundaries between correct and incorrect sentences. Some boundaries are so fine that they cannot be described by any grammar (that is, by any generative process). This idea has been depicted metaphorically in Figure 2.33, in which a rose is approximated by increasingly finer outlines. In this metaphor, the rose corresponds to the language (imagine the sentences of the language as molecules in the rose); the grammar serves to delineate its silhouette. A regular grammar only allows us straight horizontal and vertical line segments to describe the flower; the result is a coarse and mechanical-looking picture. A CF grammar would approximate the outline by straight lines at any angle and by circle segments. The result is stilted but recognizable. A CS grammar would present us with a smooth curve tightly enveloping the flower, but the curve is too smooth: it cannot follow all the sharp turns, and it deviates slightly at complicated points; still, a very realistic picture results. An unrestricted phrase structure grammar can represent the outline perfectly. The rose itself cannot be caught in a finite description; its essence remains forever out of our reach.

enter image description here

A more prosaic and practical example can be found in the successive sets of Java programs that can be generated by the various grammar types:

  1. The set of all lexically correct Java programs can be generated by a regular grammar. A Java program is lexically correct if there are no newlines inside strings, comments are terminated before end-of-file, all numerical constants have the right form, etc.
  2. The set of all syntactically correct Java programs can be generated by a context free grammar. These programs conform to the (CF) grammar in the manual.
  3. The set of all semantically correct Java programs can be generated by a CS grammar. These are the programs that pass through a Java compiler without error.
  4. The set of all Java programs that would terminate in finite time when run with a given input can be generated by an unrestricted phrase structure grammar. Such a grammar would incorporate detailed descriptions of the Java lib and the Java run-time system.
  5. The set of all Java programs that solve a given problem cannot be generated by a grammar (although the description of the set is finite).

I see that CF can generate the syntactically valid programs and you need CS grammar to accommodate the semantics of the language. This more complete grammar can resolve if x(y) is a call of a function or type conversion. It will also iron out the associativity ambiguity in a+b+c telling that difference between (a+b)+c and a+(b+c) is immaterial. For this reason I think that CF -> CS also resolves ambiguities. Why telling that context sensitivity and ambiguity are orthogonal after that?

by Valentin Tihomirov at May 26, 2016 11:40 PM


Why are my TensorFlow network weights and costs NaN when I use RELU activations?

I can't get TensorFlow RELU activations (neither tf.nn.relu nor tf.nn.relu6) working without NaN values for activations and weights killing my training runs.

I believe I'm following all the right general advice. For example I initialize my weights with

weights = tf.Variable(tf.truncated_normal(w_dims, stddev=0.1))
biases = tf.Variable(tf.constant(0.1 if neuron_fn in [tf.nn.relu, tf.nn.relu6] else 0.0, shape=b_dims))

and use a slow training rate, e.g.,

tf.train.MomentumOptimizer(0.02, momentum=0.5).minimize(cross_entropy_loss)

But any network of appreciable depth results in NaN for cost and and at least some weights (at least in the summary histograms for them). In fact, the cost is often NaN right from the start (before training).

I seem to have these issues even when I use L2 (about 0.001) regularization, and dropout (about 50%).

Is there some parameter or setting that I should adjust to avoid these issues? I'm at a loss as to where to even begin looking, so any suggestions would be appreciated!

by raxacoricofallapatorius at May 26, 2016 11:32 PM


Linearity of Expectation

I'm taking a course on Coursera about approximation algorithms for fun and one of the modules uses probability to analyze the cost of a linear program with randomized rounding solution to the set cover problem.

One of the problem set questions is a basic probability question and is as follows:

You ask ten of your friends to give you one dollar (coin) each so that you can buy your favourite pizza.

When you reach the pizza place you realize that it's closed. So you return the coins to your friends by giving each of them a random coin.

1) How likely is that each of them gets the coin they put back?

2) How many of them will get in expectation the coin they put back? (Try to prove this elegantly using techniques discussed in the course)Define the random variables you use.

3) Now suppose that you take one coin to buy a coffee somewhere else. You redistribute the others dollar coins giving each of your friends one coin chosen uniformly at random. How many of your friends will get in expectation the coin they put back?

I've solved the first two parts and I'm thinking for the third part that I should define a random variable Xi that represents whether or not persion i get the coin back that she put in and E[Xi] is the probability that she gets it back.

The probability that she gets it back is 9 / 10 * 1 / 10 = 9 / 100. There's a 9 / 10 chance that her coin was not the coin that I bought coffee with and a 1 / 10 chance that she picks her coin or is left out as there are 9 coins for 10 remaining people.

Taking X as the number of folks that get their coins back I have

E[X] = sum(E[Xi]) over all persons E[X] = 90 / 100 = .9 people are expected to get their coin back.

Is my reasoning sound here? If not, can somebody show me the error of my ways?

Thanks so much,


by Steadman Dillroper at May 26, 2016 11:30 PM


Functional programming - is immutability expensive? [closed]

The question is in two parts. The first is conceptual. The next looks at the same question more concretely in Scala.

  1. Does using only immutable data structures in a programming language make implementing certain algorithms/logic inherently more computationally expensive in practice? This draws to the fact that immutability is a core tenet of purely functional languages. Are there other factors that impact this?
  2. Let's take a more concrete example. Quicksort is generally taught and implemented using mutable operations on an in-memory data structure. How does one implement such a thing in a PURE functional way with a comparable computational and storage overhead to the mutable version. Specifically in Scala. I have included some crude benchmarks below.

More Details:

I come from an imperative programming background (C++, Java). I have been exploring functional programming, specifically Scala.

Some of the primary principles of pure functional programming:

  1. Functions are first class citizens.
  2. Functions do not have side effects and hence objects/data structures are immutable.

Even though modern JVMs are extremely efficient with object creation and garbage collection is very inexpensive for short lived objects, it's probably still better to minimize object creation right? At least in a single-threaded application where concurrency and locking is not an issue. Since Scala is a hybrid paradigm, one can choose to write imperative code with mutable objects if necessary. But, as someone who has spent a lot of years trying to reuse objects and minimize allocation. I would like a good understanding of the school of thought that would not even allow that.

As a specific case, I was a little surprised by this code snippet in this tutorial 6 . It has a Java version of Quicksort followed by a neat looking Scala implementation of the same.

Here is my attempt to benchmark the implementations. I haven't done detailed profiling. But, my guess is that the Scala version is slower because the number of objects allocated is linear (one per recursion call). Is there any way chance that tail call optimizations can come into play? If I am right, Scala supports tail call optimizations for self-recursive calls. So, it should only be helping it. I am using Scala 2.8.

Java version

public class QuickSortJ {

    public static void sort(int[] xs) {
      sort(xs, 0, xs.length -1 );

    static void sort(int[] xs, int l, int r) {
      int pivot = xs[(l+r)/2];
      int a = l; int b = r;
      while (a <= b){
        while (xs[a] < pivot) { a = a + 1; }
        while (xs[b] > pivot) { b = b - 1; }
        if (a <= b) {
          swap(xs, a, b);
          a = a + 1;
          b = b - 1;
      if (l < b) sort(xs, l, b);
      if (b < r) sort(xs, a, r);

    static void swap(int[] arr, int i, int j) {
      int t = arr[i]; arr[i] = arr[j]; arr[j] = t;

Scala version

object QuickSortS {

  def sort(xs: Array[Int]): Array[Int] =
    if (xs.length <= 1) xs
    else {
      val pivot = xs(xs.length / 2)
        sort(xs filter (pivot >)),
        xs filter (pivot ==),
        sort(xs filter (pivot <)))

Scala Code to compare implementations

import java.util.Date
import scala.testing.Benchmark

class BenchSort(sortfn: (Array[Int]) => Unit, name:String) extends Benchmark {

  val ints = new Array[Int](100000);

  override def prefix = name
  override def setUp = {
    val ran = new java.util.Random(5);
    for (i <- 0 to ints.length - 1)
      ints(i) = ran.nextInt();
  override def run = sortfn(ints)

val benchImmut = new BenchSort( QuickSortS.sort , "Immutable/Functional/Scala" )
val benchMut   = new BenchSort( QuickSortJ.sort , "Mutable/Imperative/Java   " )

benchImmut.main( Array("5"))
benchMut.main( Array("5"))


Time in milliseconds for five consecutive runs

Immutable/Functional/Scala    467    178    184    187    183
Mutable/Imperative/Java        51     14     12     12     12

by smartnut007 at May 26, 2016 11:18 PM

Wes Felter

"Unix Philosophy? What did that get us? A bunch of failing vendors in the 1990s, and the..."

“Unix Philosophy? What did that get us? A bunch of failing vendors in the 1990s, and the inevitability of Windows NT.”


Don Marti sets Linux elitists straight

I disagree with “Designing working software based on philsophy is like writing real network software based on the OSI 7-layer burrito model.” though; systemd and the rest of Lennartix is based on a philsophy—just a better one.

May 26, 2016 11:16 PM


Fixed parameter tractability

Lets say I have an algorithm with complexity $O(n^k)$ where $n$ is the size of the input and $k$ is a parameter. Clearly this is superpolynomial; but in fixed parameter tractability we restrict $k$ to some value. It becomes fixed parameter tractable if the resulting complexity is $O(f(k)*n^c)$. My gut tells me that this is the case for this example, but I can't connect the dots yet. Can I just subsitute the $c$ in the equation by $k$ and then I'm done?

by Héctor van den Boorn at May 26, 2016 11:15 PM



Is there a way to meaningfully generate daily returns from monthly?

I have a set of 7 investments in a portfolio and I need to optimize the weightings based on some exposures to various markets/styles/economic factors. I was hoping to do some sort of simple exposure analysis or 'factor analysis' (not actual Factor Analysis, but more just a bunch of regressions), using daily returns of various risk factors (for example, SPX, TBills, MSCI, FamaFrench Factors, etc).

I only have daily returns for 5 of the 7 investments in the portfolio. I have monthly returns for the remaining two. Is there an easy way to do some sort of generation of daily returns from monthly returns, possibly modelling the monthly against the factors' monthly returns, and then generating daily returns based on the model? (I know this is circular, but I am spitballing.) The problem is that I need some way to tie or anchor the modeled daily returns back to the actual monthly returns.

Any ideas? And does this make sense?

by lukehawk at May 26, 2016 11:13 PM


What happens to quantum algorithms such as BB84 if P=NP

Under the hypothesis that P=NP, many cryptographic protocols are no longer secure (i.e. attacks are feasible).

The BB84 algorithm is based on the idea that by observing a quantum state, one has to (in quantum theory) disturb that state. This makes them efficient against eavesdropping.

Are quantum algorithms based on BB84 also subject to the hypothesis that P=NP? Or do they remain unaffected by the hypothesis?

by user13675 at May 26, 2016 11:12 PM

Is it possible to entangle more than 2 qubits? If so, is it useful?

With 2 qubits, there are 4 ways to entangle the qubits - the four maximally entangled two-qubit Bell states, of which one is $1/\sqrt{2}(|00\rangle + |11\rangle)$.

Is it possible to entangle more than two qubits together? If so, is there any practical use to doing so?

by Alan Wolfe at May 26, 2016 11:04 PM

Generally, does the time it takes to compute (n mod m) depend on the size of n?

Example: Suppose I have a number a with 100 decimal digits, b with 200 decimal digits, and m with 10 decimal digits.

Would the speed of the computation a mod m vary greatly from b mod m because of the size of a vs b?

by user51720 at May 26, 2016 10:59 PM


Using reduce, map or other function to avoid for loops in python

I have a program working for calculating the distance and then apply the k-means algorithm. I tested on a small list and it's working fine and fast, however, my original list is very big (>5000), so it's taking forever and I ended it up terminating the running. Can I use outer() or any other parallel function and apply it to the distance function to make this faster?? On the small set that I have:

strings = ['cosine cos', 'cosine', 'cosine???????', 'l1', 'l2', 'manhattan']

And its distance 3D array returns like this:

[[[ 0.          0.25        0.47826087  1.          1.          0.89473684]
  [ 0.25        0.          0.36842105  1.          1.          0.86666667]
  [ 0.47826087  0.36842105  0.          1.          1.          0.90909091]
  [ 1.          1.          1.          0.          0.5         1.        ]
  [ 1.          1.          1.          0.5         0.          1.        ]
  [ 0.89473684  0.86666667  0.90909091  1.          1.          0.        ]]]

Each line of the array above represents the distance for one item in the strings list. My way of doing it using the for loops is:

strings = ['cosine cos', 'cosine', 'cosine???????', 'l1', 'l2', 'manhattan']

data1 = []

for j in range(len(np.array(list(strings)))):

     for i in range(len(strings)):
       data1.append(1-Levenshtein.ratio(np.array(list(strings))[j], np.array(list(strings))[i]))

#n =(map(Levenshtein.ratio, strings))
#n =(reduce(Levenshtein.ratio, strings))

arr_3d = data2.reshape((1,k,k))

Where arr_3d is the array above. How can I use any of outer() or map() to replace the for loops above, because when the list strings is big, it's taking hours and never got the results even. I appreciate the help. Levenshtein.ratio is a built in funciton in python.

by Lelo at May 26, 2016 10:58 PM

If functional languages are really concise, why don't they have a better rank in the language shootout game?

I compared the languages at the language shootout game by their code size only. Here is a summary of what I got (shortest first, grouped by similar score).

  1. Python, Ruby, JavaScript, Perl, Lua, PHP, Mozart/OZ
  2. OCaml, Erlang, Racket, Go, Scala, F#, Smalltalk
  3. Pascal, Clean, Haskell, Common Lisp, C#, Java, C
  4. C++, Ada, ATS

I wonder why. The winners seem to be plain old dynamic languages. Erlang, Racket (née PLT Scheme), and F# are doing OK. Haskell and Common Lisp don't look more concise than claimed-to-be-verbose Java.


I've found an insightful post on this topic with charts. I also found a similar comparison of languages for a larger program (a simple ray tracer). Altogether, I wouldn't say I got "the" answer, but I got some food for thought.

by Adam Schmideg at May 26, 2016 10:55 PM

Are there any good Clojure benchmarks?

Edit: The Clojure benchmarks are up on the Benchmarks Game.

I have made this question community wiki and invite others to keep it updated.

Is anyone aware of benchmarks of Clojure's performance?

I have done some of my own (although nothing too formal) and it didn't fair too well in comparison to other functional languages (tried Haskell and OCaml). But how does it look compared to Java or another language on the JVM (e.g. Scala)? And how does it compare to other Lisps?

There was some discussion on the Computer Language Benchmarks Game forum about adding Clojure on there, but nothing has been done yet.

Edit: I'm going to continue to add to this as I find more:

@igouy pointed out that the benchmark scripts for clojure are being created by jafingerhut on github.

Two very relevant threads from the Clojure discussion group:

And separately, these blog posts:

And lastly, a related question on stackoverflow:

Most of these discussions lead me to think that Clojure's performance is very favorable in comparison to other languages running on the JVM, although there is no doubt that it can be very difficult to come to a broad conclusion when comparing languages because their performance can vary dramatically dependent on the task.


Lau Jensen just posted a great discussion about benchmarking with JVM languages on his blog: "Getting benchmarking right".

by Shane at May 26, 2016 10:40 PM

How do I alter functions on different files?

I'm curious if it's possible for me to change the functions from different files in Python.

What file does is that it receives two datas from each file that will go against each other in rock paper scissors and deciding which player have won. Of course, file contains other functions, too, but I just inserted one function since other ones don't really matter for my question.

I'm trying to edit the function that's in file so that team3's (there are team1, team2, team3 and they play rock paper scissors against one another. (i.e. team1 and team 2 go against each other then team 1 and 3 after. vice versa)) result will always be rock and the opponent's result will be scissors so that team3 can win no matter what.

However, I am struggling and don't know what to do.. :( I barely started learning Python and it's pretty challenging task for me to do. I would love it if you can help me out a little bit ;)

# This is a function from file
import team3

def round(player1, player2, history1='', history2=''):

    # Get player 1's move. 
    move1 = player1(my_history=history1, their_history=history2)
    # Get player 2's move. 
    move2 = player2(my_history=history2, their_history=history1)

    if valid(move1) and valid(move2):      
        if move1 == move2:
            score1, score2 = 0, 0
        elif move1+move2 in ['rs', 'sp', 'pr']:
            score1, score2 = 1, -1
            score1, score2 = -1, 1
    else: #one of the moves was invalid
        if valid(move1): # only move2 was invalid
            move2 = 'x'
            score1, score2 = 1, -1
        elif valid(move2): # only move1 was invalid
            move1 = 'x'
            score1, score2 = -1, 1
        else: # Both moves invalid
            move1, move2 = 'x', 'x'
            score1, score2 = -1, -1

    return move1, move2, score1, score2

...And I'm trying to edit this function from another file named team3...

# Team 3 File
# -*- coding: utf-8 -*-
import psrock

def round(player1, player2, history1='', history2=''):
    move1 = player1(my_history=history1, their_history=history2)
    if player1 == team3:
        move1 = 'r'
        move2 = 's'
    elif player2 == team3:
        move1 = 's'
        move2 = 'r'

The Files:

  1. Download the File
  2. Extract the zip file
  3. Open all of the files in the same tab
  4. Play the file

by David Lim at May 26, 2016 10:21 PM


Suggest a data-structure that supports the following operations with time complexity of $ O(log(n)) $

Iv'e been struggling a lot with this one. I am looking for a data-structure (could be a modification of an existing type of data-structure, or a combination of more than one data-structure), which supports the following methods, all with time complexity (at worst-case) of $O(log(n))$:

  1. Insert(v) - insert v to the structure
  2. search(v) - search for v
  3. delete(v) - delete node with value v
  4. updateKeys(x,v) - for each node with a value greater then x, add v to it's value, where v is a positive number.

For example, if I had the following set of elements: ${1,2,5,8,10,15,20}$ , using updateKeys the following way: $ updateKeys(10,5) $ would change the set to : $ {1,2,5,8,15,20,25} $.

Well, I had a lot in my mind. Basically, my assignment is on the following data-structures:

  1. Hash tables - which sounded good at first, but I couldn't think of a way to find all elements larger than or equal to x.

  2. AVL tree - it does support the basic methods with this complexity, but in order to find which elements are larger, I need to search through all tree which can cost more than $ log(n) $. Further more, I can mess the tree up if I for instance make a left son be larger than it's parent.

  3. A combination of the two above - looked promising at first, but again, I couldn't find a way to find those elements.
  4. Heap - for some reason it sounds like a good direction since I have some sort of order, but update the values with log(n) time? seems impossible.

  5. Skip list - here I thought about using skip list since the most bottom layer is sorted in increasing order, so I thought about finding the minimum value larger than x, and then update all of it's followers from the right. Sounds good, but it is definitely not within $ O(log(n)) $, since for the following set ${1,2,3,4...10}$, and for $ x=1 $, I need to iterate 10 numbers, which is $ O(n) $

We also studied B-trees, ADT structures...

Then, almost hopeless, I started reading about data structures that weren't even taught in our course, such as range trees, BIT trees and so on.For some reason, some of them seemed suitable for this purpose, but since it's not in our syllabus, I don't think we are even allowed to use them.

Thanks in advance.

by superuser123 at May 26, 2016 10:01 PM


How to apply different cost functions to different output channels of a convolutional network?

I have a convolutional neural network whose output is a 4-channel 2D image. I want to apply sigmoid activation function to the first two channels and then use BCECriterion to computer the loss of the produced images with the ground truth ones. I want to apply squared loss function to the last two channels and finally computer the gradients and do backprop. I would also like to multiply the cost of the squared loss for each of the two last channels by a desired scalar.

So the cost has the following form:

cost = crossEntropyCh[{1, 2}] + l1 * squaredLossCh_3 + l2 * squaredLossCh_4

The way I'm thinking about doing this is as follow:

criterion1 = nn.BCECriterion()
criterion2 = nn.MSECriterion()

error = criterion1:forward(model.output[{{}, {1, 2}}], groundTruth1) + l1 * criterion2:forward(model.output[{{}, {3}}], groundTruth2) + l2 * criterion2:forward(model.output[{{}, {4}}], groundTruth3)

However, I don't think this is the correct way of doing it since I will have to do 3 separate backprop steps, one for each of the cost terms. So I wonder, can anyone give me a better solution to do this in Torch?

by Amir at May 26, 2016 10:01 PM


Im endlosen Kampf von Oracle gegen Google hat die Jury ...

Im endlosen Kampf von Oracle gegen Google hat die Jury jetzt entschieden, dass Googles Benutzung der Java-APIs "Fair Use" ist.

Damit ist Oracle gescheitert mit ihrem Versuch, bei Google mit beiden Händen kräftig in den Geldspeicher zu greifen.

Man kann gar nicht überbewerten, wie wichtig diese Entscheidung war. Da hingen Milliarden von Dollars in dem konkreten Verfahren dran, und die ganze Industrie stand auf dem Teppich, den Oracle da wegziehen wollte. Man stelle sich mal vor, was dann plötzlich für alle möglichen Leute Nachforderungen stellen könnten. Du benutzt einen Browser, der Javascript unterstützt? Das API ist von mir! Rück Kohle rüber! Autoren irgendwelcher Bibliotheken hätten sich gegenseitig verklagen können. Das wäre eine Abmahnwelle von biblischen Dimensionen geworden.

Und warum bei Programmiersprachen stoppen? Das betrifft ja auch Protokolle im Internet und anderswo. Plugin-Schnittstellen. Alleine SGI mit OpenGL hätte einmal die ganze Industrie zur Kasse bitten können.

May 26, 2016 10:00 PM


Passing optional callback into Swift function

I'm learning Swift lang, but I cannot pass optional callback argument into function:

func dismiss(completion: () -> Void) {
    if (completion) {
        return self.dismissViewControllerAnimated(true, completion: completion)

This shows me an error - Type () -> Void does not conform to protocol 'LogicValue'

Any suggestions?

by Kosmetika at May 26, 2016 10:00 PM


Can I use PyAlgoTrade for Forex?

Is it possible to use PyAlgoTrade for Forex related algorithms? If it is, please point me in the right direction on how to get PyAlgoTrade to work on Forex data.

by Ian Whittington at May 26, 2016 09:56 PM

DragonFly BSD Digest

BSDNow 143: One small step for DRM, one giant leap for BSD

BSDNow 143 has the usual roundup of news, plus a conversation with Matthew Macy about graphics improvements in FreeBSD.

We need DragonFly people interviewed, since DragonFly graphics improvements have been leading the pack, so to speak.  I’m linking to the Jupiter Broadcasting site again since I don’t see this episode up on the BSDNow site yet.

by Justin Sherrill at May 26, 2016 09:36 PM



What is a monad?

Having briefly looked at Haskell recently, what would be a brief, succinct, practical explanation as to what a monad essentially is?

I have found most explanations I've come across to be fairly inaccessible and lacking in practical detail.

by ljs at May 26, 2016 09:13 PM



Why is this output wrong ? - Sequential Consistency

The way I understand sequential consistency model the output marked as wrong should be valid, what am I missing ? enter image description here

by Computerphile at May 26, 2016 09:10 PM



Deduce expected exposure profile from option/structure delta?

I am thinking about whether there exists a relationship between the delta of an option (or any structured derivative) and it's expected positive/negative exposure?

PFE of a Forward

An intuitive question would be the following: A Foward has a Delta of 1 and given the above exposure profile and the Delta of an Option with the same underlying, can I deduce that the exposure profile of the Option equals Delta * Forward_Exposure?

However, after running some simulations I see that this is not the case, part of the reason being (I think) that for exposure generation one simulates values for all relevant risk parameters and not just the one which corresponds to the Delta/sensitivity.

If there are any questions on Definitions of terms I used, I am happy to clarify. Image taken from Jon Gregory's book on CVA.

by Tim at May 26, 2016 08:54 PM


Longest path in a cyclic, directed and weighted graph

I am looking for the longest simple path in a directed, cyclic and weighted graph with positive and negative weights.

In my research so far I have found out that you need to generate -G from graph G and then run a shortest path algorithm on it to find the longest path in G. Of course this won't work if -G contains negative cycles. But that's excatly what I need to do. Consider following graph (which I created quickly in paint..):

enter image description here

The red letters are just there for identification. The red numbers are the value of the vertex and the blue numbers are the edge weights. The edge weights are generated with the following formula:

$$ Weight = \frac{destination}{source} $$

I am looking for the longest path from A to F - but the longest simple path. That means that a vertex must not be visited more than once. By simply looking at it I can see that it is A - B - E - F. But if I tried to generate -G and then run the Bellman-Ford algorithm on it it would fail because there would be negative cycles like B - E - D - B.

I am aware that it is not possible if you look for paths in which a vertex can be visited more than once. But if you look for simple paths only, is it still not possible?

by Bobface at May 26, 2016 08:43 PM

Is a PDA's stack size linear bounded in input size?

I was thinking as follows: At each step, a PDA can put arbitrary many symbols onto the stack. But this number is constant for every individual PDA, so it can't be more than, say, $k$ symbols per step. Using only regular transitions, the stack can rise to maximally (more or less) $kn$ symbols in a run on an input sized $n$.

But what about $\epsilon$-transitions? Is there a simple argument why their maximum number should as well be independent of the input size?

So, in short: Is a PDA's stack size linear in the input size?

by lukas.coenig at May 26, 2016 08:35 PM



Constraint on Universal set of hash functions

I was reading hashing from CLRS. In it author says:

Let $\mathscr{H}$ be a finite collection of hash functions that map a given universe $U$ of keys into the range ${0,1,...,m-1}$. Such a collection is said to be universal if for each pair of distinct keys $k,l\in U$, the number of hash functions $h\in\mathscr{H}$ for which $h(k)= h(l)$ is at most $|\mathscr{H}|/m$.

So basically in universal set of hash functions, the number of hash functions a finite collection of hash functions is said to be universal if number of hash functions $h$ for which $h(k)=h(l)$ is at most $\frac{\text{number of hash functions}}{\text{size of hash table}}=\frac{|\mathscr{H}|}{m}$

However next the author says:

In other words, with a hash function randomly chosen from $\mathscr{H}$, the chance of a collision between distinct keys $k$ and $l$ is no more than the chance $1/m$ of a collision if $h(k)$ and $h(l)$ were randomly and independently chosen from the set $\{0,1,...,m-1\}$.

I didnt get the last part "$h(k)$ and $h(l)$ were randomly and independently chosen from the set $\{0,1,...,m-1\}$". How $|\mathscr{H}|=$[$h(k)$ and $h(l)$ were randomly and independently chosen from the set $\{0,1,...,m-1\}$]

by anir at May 26, 2016 08:25 PM

Time complexities [duplicate]

This question already has an answer here:

I have a problem that I solved, but I'm unsure if my answers are correct or not. Below is the question, my solution is at the end. Any help would be appreciated :)

f(n) element of O(log(n))
g(n) element of theta(n)

Prove if the below are true are false, provide a counter example for false.

1) g(n) element of omega(f(n))
2) f(n)g(n) element of O(nlog(n))
3) f(n)g(n) element of theta(nlog(n))

1) False, g(n) has a tight bound at (n). If it had a lower bound of f(n), then g(n) should allow log(n), but that's too high of a complexity.

2) True

3) False, f(n) having an upper bound of log(n) implies it could be even faster. Setting a tight bound contradicts that wherein it cant go faster.

by Andrew Raleigh at May 26, 2016 08:25 PM



Amazon ElastiCache Update – Export Redis Snapshots to Amazon S3

Amazon ElastiCache supports the popular Memcached and Redis in-memory caching engines. While Memcached is generally used to cache results from a slower, disk-based database, Redis is used as a fast, persistent key-value store. It uses replicas and failover to support high availability, and natively supports the use of structured values.

Today I am going to focus on a helpful new feature that will be of interest to Redis users. You already have the ability to create snapshots of a running Cache Cluster. These snapshots serve as a persistent backup, and can be used to create a new Cache Cluster that is already loaded with data and ready to go. As a reminder, here’s how you create a snapshot of a Cache Cluster:

You can now export your Redis snapshots to an S3 bucket. The bucket must be in the same Region as the snapshot and you need to grant ElastiCache the proper permissions (List, Upload/Delete, and View Permissions) on it. We envision several uses for this feature:

Disaster Recovery – You can copy the snapshot to another environment for safekeeping.

Analysis – You can dissect and analyze the snapshot in order to understand usage patterns.

Seeding – You can use the snapshot to seed a fresh Redis Cache Cluster in another Region.

Exporting a Snapshot
To export a snapshot, simply locate it, select it, and click on Copy Snapshot:

Verify the permissions on the bucket (read Exporting Your Snapshot to learn more):

Then enter a name and select the desired bucket:

ElastiCache will export the snapshot and it will appear in the bucket:

The file is a standard Redis RDB file, and can be used as such.

You can also exercise this same functionality from your own code or via the command line. Your code can call CopySnapshot while specifying the target S3 bucket. Your scripts can use the  copy-snapshot command.

This feature is available now and you can start using it today! There’s no charge for the export; you’ll pay the usual S3 storage charges.



by Jeff Barr at May 26, 2016 08:20 PM

Planet Emacsen

Irreal: Emacs as a (Python) IDE

A couple of years ago, Drew Werner gave a nice talk to the New York Emacs Meetup group on An Intelligent Python IDE with Emacs, Projectile, and Jedi. I came across a reference to it today and watched the video of it. It's well worth the 47 and half minutes even if you're not a Python user.

Werner's idea is that you can take his basic method of setting up a Python IDE and use it to set up an IDE for any language. Indeed, the first two components are pretty much language independent. He first sets up Projectile to handle your workflow at the project level. Then he installs auto-complete to provide context sensitive completion. The problem is that auto-complete doesn't understand Python so its completion is rudimentary.

He solves that problem by adding EPC and Jedi, which provide Python language parsing and enable auto-complete to provide much better completions. Auto-complete has engines for other languages so with a little work you can build an IDE for many environments. See my post on Átila Neves' talk on setting up Emacs as a C++ IDE as an example.

If you're a Python user you should definitely watch the video. If you're not, you may still find the ideas that Werner talks about useful for enhancing your own workflow.

by jcs at May 26, 2016 07:58 PM


Error using ghyp-distribution function

I want to fit multivariate GH distribution on my data, and then generate simulations for that distribution. Using the instructions given in ghyp package, I wrote following lines of code in R.

> data(smi.stocks)
> fitted <- fit.ghypmv(data = smi.stocks, silent=TRUE)
> rghyp(n = 100, fitted)
Error in ghypCheckPars(param) : 
  (list) object cannot be coerced to type 'double'

However, it gives me the error as shown above.

Any thoughts on why is it happening?

by HasnainMamdani at May 26, 2016 07:54 PM

Conditional probability of geometric brownian motion

I created paths using GBM to implement The stochastic mesh method. But the method requires the conditional distribution, given some S(t) the probability of S(t+1).

I've searched and can't find this formula, Does someone know it? Is there any other way to calculate the conditional probability?


Let me explain a little better, I'm working on the stochastic mesh method The method require I create N paths over M steps of time. I choose GBM to do this, so in a single path simulation I have

$$ S = S_{0}, S_{1}, ... ,S_{M} $$

Next I need to calculate the continuation value for wich I need Weights

$$ w_{ij} = p_{ij} / \sum_k p_{kj} $$ $$ p_{ij} = P (S_{t + \bigtriangleup t} \in A \mid S{t} = x) $$ The last is the formula I'm looking for the GBM

by nmunozsi at May 26, 2016 07:49 PM


Custom kernels for SVM, when to apply them?

I am new to machine learning field and right now trying to get a grasp of how the most common learning algorithms work and understand when to apply each one of them. At the moment I am learning on how Support Vector Machines work and have a question on custom kernel functions.
There is plenty of information on the web on more standard (linear, RBF, polynomial) kernels for SVMs. I, however, would like to understand when it is reasonable to go for a custom kernel function. My questions are:

1) What are other possible kernels for SVMs?
2) In which situation one would apply custom kernels?
3) Can custom kernel substantially improve prediction quality of SVM?

by Vitali Halapjan at May 26, 2016 07:44 PM



In an example of "call options"

The following is an excerpt from Introduction to the Mathematics of Finance by Roman:

As a more concrete example, suppose that IBM is selling for $\$100$ per share at this moment. A $3$ month call option on IBM with strike price $\$102$ is a contract between the buyer and the seller of the option that says that the buyer may (but is not required to) purchase $\color{blue}{\bf 100}$ shares of IBM from the seller for $\$102$ per share at any time during the next $3$ months.

In this example is the number $\color{blue}{100}$ in blue arbitrarily picked by the author? Is it relevant to the concept of "call option"?

by Jack at May 26, 2016 07:35 PM

Interest rates - Swaptions implied volatility - Volatility anchoring with Black and with normal volatilities

In a LMM+ with displacement factor a volatility anchoring technique is used, i.e. a long term volatility assumptions is applied, derived from historic time series. Should I adjust this historic volatility (e.g. 10%) when using Black or the normal implied volatilities?

by BlueAngel at May 26, 2016 07:23 PM


Implementing batch normalization with tensorflow

I am trying to implement a batch normalization layer in tensorflow. I am having am having no problem running the train step of this using tf.moments to get the mean and variance. For test time, I'd like to set up an exponential moving average to track the mean and variance. I am trying to do it like this:

def batch_normalized_linear_layer(state_below, scope_name, n_inputs, n_outputs, stddev, wd, eps=.0001):
  with tf.variable_scope(scope_name) as scope:
    weight = _variable_with_weight_decay(
      "weights", shape=[n_inputs, n_outputs],
      stddev=stddev, wd=wd
    act = tf.matmul(state_below, weight)
    # get moments
    act_mean, act_variance = tf.nn.moments(act, [0])
    # get mean and variance variables
    mean = _variable_on_cpu('bn_mean', [n_outputs], tf.constant_initializer(0.0))
    variance = _variable_on_cpu('bn_variance', [n_outputs], tf.constant_initializer(1.0))
    # assign the moments
    assign_mean = mean.assign(act_mean)
    assign_variance = variance.assign(act_variance)

    act_bn = tf.mul((act - mean), tf.rsqrt(variance + eps),"_bn")

    beta = _variable_on_cpu("beta", [n_outputs], tf.constant_initializer(0.0))
    gamma = _variable_on_cpu("gamma", [n_outputs], tf.constant_initializer(1.0))
    bn = tf.add(tf.mul(act_bn, gamma), beta)
    output = tf.nn.relu(bn,
  return output, mean, variance

Where _variable_on_cpu is defined as:

def _variable_on_cpu(name, shape, initializer):
  """Helper to create a Variable stored on CPU memory.

    name: name of the variable
    shape: list of ints
    initializer: initializer for Variable

    Variable Tensor
  with tf.device('/cpu:0'):
    var = tf.get_variable(name, shape, initializer=initializer)
  return var

I believe that I am setting

assign_mean = mean.assign(act_mean)
assign_variance = variance.assign(act_variance)

Incorrectly, but I am not sure how. When I use tensorboard to track these mean and variance variables, they are just flat that their initialized values.

by wgrathwohl at May 26, 2016 07:22 PM



Modelling a dependency of multiple transitions on data in one place

We are modeling our process using a colored Petri net. One of the limitations we have is that when multiple transitions depend on one place, only one of those transitions will fire because then the token and data is consumed.

How can we model our process, or what type of Petri net property can we use, to facilitate the fact that multiple transitions can depend on one place and all have to be able to consume the same token?

by Erwin Rooijakkers at May 26, 2016 06:57 PM

Identify webmail traffic from HTTPS [on hold]

I am seeing encrypted (HTTPS) coomunication(payload is http) using wireshark.HTTPS communication(http used in payload) can correspond to various activities like web browsing,VoIP,Web Downloads,Web-mail Traffic(, ,etc.

Is there a technique to identify out/differentiate the webmail traffic from the rest of HTTPS traffic based on some flow,periodic communication properties.

by user10012 at May 26, 2016 06:57 PM



$N$-th term of a quadratic recurrence [on hold]

I have a sequence defined as follows:

$a_1 = A$

$a_n = a_{n-1}^2 + B$

$A, B$ are positive integers. I want to design an algorithm, which would calculate $N$-th term of this recurrence modulo $M$ for very big values of $N$, like $10^9$. If it was linear recursion, I would use matrix multiplication, but this time it won't work.

by J. Abraham at May 26, 2016 06:44 PM



Inconsistent results with Q-learning

I'm running a q-learning agent in tic tac toe against a minimax agent and measuring the tie rate (since you can never win against minimax). When I run 10,000 training games, I've found that there seem to be two drastically different patterns: enter image description here 1) Q-learning agent has a bad start and ends up with an overall tie rate of between 0.06 and 0.09. enter image description here 2) Q-learning agent improves consistently and has an overall tie rate of around 0.7.

Why is this happening? Is the q-learning algorithm getting stuck in some local optima? Is this normal behavior for q-learning?

by muri at May 26, 2016 06:33 PM


linear-regression with torch7 demo

I am following this demo-

feval = function(x_new)
   -- set x to x_new, if differnt
   -- (in this simple example, x_new will typically always point to x,
   -- so the copy is really useless)
   if x ~= x_new then

   -- select a new training sample
   _nidx_ = (_nidx_ or 0) + 1
   if _nidx_ > (#data)[1] then _nidx_ = 1 end

   local sample = data[_nidx_]
   local target = sample[{ {1} }]      -- this funny looking syntax allows
   local inputs = sample[{ {2,3} }]    -- slicing of arrays.
   local loss_x = criterion:forward(model:forward(inputs), target)
   model:backward(inputs, criterion:backward(model.output, target))
   return loss_x, dl_dx

I have a few doubts in this function

  1. Where is the argument x_new (or its copy x) used in the code?
  2. What does _nidx_ = (_nidx_ or 0) + 1 mean?
  3. what is the value of nidx when the function is first called?
  4. Where is dl_dx updated? Ideally it should have been just after local loss_x is updated, but it isnt written explicitly


My point#4 is very clear now. For those who are interested- (source- deep learning, oxford, practical 3 lab sheet)

enter image description here

by user144608 at May 26, 2016 06:24 PM


SABR Implied Volatility and Option Prices

I am trying to understand the SABR model. Specifically, I am having difficulty to understand how to calibrate the model parameters, that is,

  • initial variance,
  • volatility of variance,
  • exponent for the forward rate, and
  • correlation between the Brownian motions.

Small example on the above would be useful. Thanks in advance

by user1131338 at May 26, 2016 06:07 PM

Cointegration for forex using ARMA model to forecast the spread

I am working on an automatized quantitative strategy that use cointegration in Forex. I am backtesting this strategy in Python.

Please see below the python file:

See below the data that I used to backtest:

My Algorithm is:

  • Read File
  • Transform the data with the log return
  • Treat Outliers
  • Realize the linear regression
  • Test for cointegration with ADF test
  • If the spread is stationary then apply the best ARMA model
  • Forecast and using beta calculation to control the risk

The equity curve of this strategy is: Equity Curve of Cointegration Strategy

Could you please help to understand how can I improve this algo?


by David Hoareau at May 26, 2016 06:00 PM

Calibration Merton Jump-Diffusion

Consider the following SDE $dV_t = rV_tdt +\sigma V_t dW_t + dJ_t$

where $J_t$ is a Compound poisson process with log-Normal jump size $Y_i$.

How am I supposed to calibrate this model to CDS spreads? The problem of course is there doesn't exist an analytical formula for the survival probability function...

[EDIT] Well, what I'd need is in fact the distribution of the first hitting time, that is

$\tau = \inf\{t>0 : V_t = x\}$

where x is some barrier $\in R$

$Pr\left\{V_0 e^{(r-(1/2) \sigma^2)t + \sigma W_t + \sum_{i=0}^{N(t)} Y_i} = x \right\} =\\Pr \left\{(r-(1/2)\sigma^2)t + \sigma W_t + \sum_{i=0}^{N(t)}Y_i =\ln(x/V_0) \right\} = \\ Pr\left\{\sigma W_t + \sum_{i=0}^{N(t)}Y_i =\ln(x/V_0) - (r-(1/2)\sigma^2)t \right\}$

The problem is here...I don't know which distribution comes out in the left hand side

by Vittorio Apicella at May 26, 2016 05:54 PM

Is This A Viable Alternative Options Pricing Method?

i'm currently a high school student who hasn't gone past Algebra II, and thus I have minimal Calculus knowledge. I know the basics of Integration and Derivation (drop the coefficient, raise to the coefficient), Infinite series, and some basics of random walks (ex. t^0.5 = sigma).

I am unable to solve the Black Scholes equation due to it's use of SDEs. I am however familiar with probability theory and have attempted to approximate the CDF of a Gaussian distribution using a Monte Carlo Method. I plugged in the standard deviation and strike price as (strike/sigma) then multiplied by the payout as a percent and added the risk free rate (10 year treasury) times days till expiration divided by 365 (365 days in a year duh). This gave me a value almost equal to the price on the open market. I assume that the inaccuracy is due to loss of precision when retrieving values such as the risk free rate or the asset price.

What I want to know is whether or not this method of pricing options is viable, or was the semi-successful calculation due to chance? Basically I am seeking peer review since I don't know anyone working in finance who can verify whether I messed up or not.

My proposed method:

  1. a function that generates a random number using the Gaussian distribution and if the number is greater than the strike price it adds 0.0001 to an increment variable, otherwise it does not add anything. Repeat 10k times and this approximates the probability of hitting the strike where each random number is one basis point significance.
  2. Multiply this probability by the payout in order to find the price where the expected return of the option is zero.
  3. Add the interest accrualed by the risk free rate over the period of time that the option is open interest because for some reason this is included in things like Sharpe Ratio and the Black Scholes Model.
  4. That value is the option price calculated by this method that seems to fit close with the price of options trading on exchange.

by FreepromTech at May 26, 2016 05:07 PM


Processing email in R [categorization supervised learning]

I'm working on a project in R where i have roughly 1200 emails from a company, most of them are labeled {1 or 2} which are the types of requests. ( roughly 1000 mails from category 1 and 200 in category 2 )

My goal is to automatically categorize a new mail.

But after a lot of pre-processing (parsing, remove stopwords etc.), and tried some typical algorithms (SVM, decision tree ...) on a document term matrix, my confusion matrix was horrible with a lot of False positive and false negative and not so much in false negative ( only 1 with SVM )...

So I'm wondering how could I improve my results ? Do i need to use oversampling, bigram ? I guess the problem is that the topics of the two categories are really close...


by MarcelRitos at May 26, 2016 05:07 PM

Regularized logistic regression code in matlab

I'm trying my hand at regularized LR, simple with this formulas in matlab:

The cost function:

J(theta) = 1/m*sum((-y_i)*log(h(x_i)-(1-y_i)*log(1-h(x_i))))+(lambda/2*m)*sum(theta_j)

The gradient:

∂J(theta)/∂theta_0 = [(1/m)*(sum((h(x_i)-y_i)*x_j)] if j=0

∂j(theta)/∂theta_n = [(1/m)*(sum((h(x_i)-y_i)*x_j)]+(lambda/m)*(theta_j) if j>1

This is not matlab code is just the formula.

So far I've done this:

function [J, grad] = costFunctionReg(theta, X, y, lambda)

J = 0;
grad = zeros(size(theta));

temp_theta = [];

%cost function

%get the regularization term

for jj = 2:length(theta)

    temp_theta(jj) = theta(jj)^2;

theta_reg = lambda/(2*m)*sum(temp_theta);

temp_sum =[];

%for the sum in the cost function

for ii =1:m

   temp_sum(ii) = -y(ii)*log(sigmoid(theta'*X(ii,:)'))-(1-y(ii))*log(1-sigmoid(theta'*X(ii,:)'));


tempo = sum(temp_sum);

J = (1/m)*tempo+theta_reg;

%theta 0

reg_theta0 = 0;

for jj=1:m
 reg_theta0(jj) = (sigmoid(theta'*X(m,:)') -y(jj))*X(jj,1)

reg_theta0 = (1/m)*sum(reg_theta0)

grad_temp(1) = reg_theta0

%for the rest of thetas

reg_theta  = [];
thetas_sum = 0;

for ii=2:size(theta)
    for kk =1:m
        reg_theta(kk) = (sigmoid(theta'*X(m,:)') - y(kk))*X(kk,ii)
    thetas_sum(ii) = (1/m)*sum(reg_theta)+(lambda/m)*theta(ii)
    reg_theta = []

for i=1:size(theta)

    if i == 1
        grad(i) = grad_temp(i)
        grad(i) = thetas_sum(i)

And the cost function is giving correct results, but I have no idea why the gradient (one step) is not, the cost gives J = 0.6931 which is correct and the gradient grad = 0.3603 -0.1476 0.0320, which is not, the cost starts from 2 because the parameter theta(1) does not have to be regularized, any help? I guess there is something wrong with the code, but after 4 days I can't see it.Thanks

by Pedro.Alonso at May 26, 2016 05:04 PM



Is DTIME(n) = DTIME(2n) true? (unlike Rosenberg's results)

I'm reading Homer and Selman's "Computability and Complexity" book. In some Corollary 5.3 it says:

For all ε‎ > 0, DTIME(O(n)) = DTIME( (1+ε‎‎) n).

Now I'm confused with this corollary and Rosenberg's result (p87 in the same book):

DTIME(n) ≠ DTIME(2n).

Why can we not use the corollary to show that DTIME(n) = DTIME(2n)?

by math at May 26, 2016 04:16 PM


A simple problem whose decidability is not known

I am preparing for a talk aimed at undergraduate math majors, and as part of it, I am considering discussing the concept of decidability. I want to give an example of a problem which we do not currently know to be decidable or undecidable. There are many such problems, but none seem to stand out as nice examples so far.

What is a simple-to-describe problem whose decidability is open?

by Lev Reyzin at May 26, 2016 04:12 PM

Separation of a preprocessed polyhedron and a plane

I have serious trouble understanding one step in the paper of Dobkin and Kirkpatrick about the separation of polyhedra. I am trying to understand this version:

It argues that after we know the best separation of $P_{i}$ and $S$, realized by $r_i$ and $s_i$, we can find the best separation of $P_{i-1}$ and $S$ in $O(1)$ steps. This is done in the following way. We take the plane parallel to $S$ through $r_i$ and cut $P_{i-1}$ into two parts with it. On one side, the closest point to $S$ is $r_i$ and on the other we have an ``elementary'' polyhedron that we can check in $O(1)$ time. My problem is - how do we find this elementary polyhedron? Note that the degree of $r_i$ in $P_{i-1}$ might be unbounded.

In the pdf to prove Thm 5.1 from page 9, they use Thm 3.1 from page 4, which makes the whole thing harder to follow.

by domotorp at May 26, 2016 04:05 PM


Object Detection with OpenCV and Machine Learning (OpenCV)

Currently, I'm programming a module for a robot. Basically, the required task is to grab the objects to a specific location in a box.

Details about the 5 objects and the box are in the attached image below (sorry for not enough reputation):

Box and objects

Question: How can I obtain the position and rotation of each individual object (including the box) , using Machine Learning?

Some info:

  • The color of the five objects and the box are fixed.
  • Machine Learning is the requirement for this project.
  • For the moment, there will be no object of uninterest in the background.
  • The environment may be prone to lightning.

I'm really new to this field, and any idea is welcome.

NOTE: I have checked the answer here. The result is for circle only, and I want to find methods for other objects also.

by User0402 at May 26, 2016 04:02 PM

Planet Emacsen

Raimon Grau: keysnail plugin to navigate relations

I'm using keysnail as my emacsy browser. It's heavier than conkeror, but I'd say it's also more compatible with common plugins (adblockers, cookiemanagers, RES,...)

A feature not present in keysnail (until now) was the ability to navigate through a hierarchy of a web without reaching to the actual link (if any).

So I wrote this super simple keysnail-navigate-relations plugin that provides 3 commands (go-next, go-prev, go-up) and 3 keybindings (]], [[, ^) so you can navigate much more easily through structured webs.

Possible uses for it are:

As always, feedback is more than welcome.

by Raimon Grau ( at May 26, 2016 03:58 PM


What is the need of re-sampling the image in HOG features?

I read Dalal and Triggs paper for HOG description and a blog by Chris McCormick regarding the same. The blog says that the image needs to be re-sampled at different scales to recognize different person.

My question is: Already we have a window which we place on the image having a size of 64*128 and which slides over the image. Then why re-sampling instead of sliding the whole window over the image which can detect the persons instead. ?

Please rectify if I am wrong, thanks in advance !!

by Ashutosh Mishra at May 26, 2016 03:47 PM

Facial Expression Recognition Data Preparation for CNN

I am quite new to the area of facial expression recognition and currently I'm doing a research on this via Deep Learning specifically CNN. I have some questions with regard to preparing and/or preprocessing my data.

I have segmented videos of frontal facial expressions (e.g. 2-3 seconds video of a person expressing a happy emotion based on his/her annotations).

Note: expressions displayed by my participants are quite of low intensity (not exaggerated expressions/micro-expressions)

General Question: Now, how should I prepare my data for training with CNN (I am a bit leaning on using a deep learning library, TensorFlow)?

Question 1: I have read some deep learning-based facial expression recognition (FER) papers that suggest to take the peak of that expression (most probably a single image) and use that image as part of your training data. How would I know the peak of an expression? What's my basis? If I'm going to take a single image, wouldn't some important frames of the subtlety of expression displayed by my participants be lost?

Question 2: Or would it be also correct to execute the segmented video in OpenCV in order to detect (e.g. Viola-Jones), crop and save the faces per frame, and use those images as part of my training data with their appropriate labels? I'm guessing some frames of faces to be redundant. However, since we knew that the participants in our data shows low intensity of expressions (micro-expressions), some movements of the face could also be important.

I would really appreciate anyone who can answer, thanks a lot!

by Renz at May 26, 2016 03:38 PM



How to modify Alexnet Architecture for GreyScale Images?

I need to extract features from fc7 from alexnet architecture. But I only have greyscale images. How can I modify alexnet for the same ?

by Abhinav Bhatt at May 26, 2016 03:29 PM


A2: Analog Malicious Hardware

LtU’s article on it has some additional text commenting on it.


by zedoary at May 26, 2016 03:12 PM


Neural network output [on hold]

I have my neural network trained with decimal values of output, but my app needs it to return prediction on 4 bit values. I'm always getting highest prediction on {0,1,0,0}, what should i do?

by Endyanod at May 26, 2016 03:04 PM


How to perform risk budgeting for non-linear portfolios?

I am using this question to compute optimal weights following a risk budgeting approach. The problem is I am using non-linear portfolios (options,equity,fixed income,fx).

What I am looking for is that each asset class contributes the same amount of risk to the portfolio, and I am sure I can't use the regular approach if I have derivatives in my portfolio.

by FernandoG at May 26, 2016 02:54 PM



Modeling the Stock Market

Hi I was wondering what is the model that best describes the price movement of the stock market?

  1. A Brownian motion Process with drift?
  2. An Ornstein Uhlenbeck_process? (where the long term mean is described by an ax+b function for the drift component)
  3. A model that uses stochastic volatility?
  4. Jump Diffusion Models?
  5. Or something else?


by hammond at May 26, 2016 02:37 PM



Is there a meaningful difference between $O(1)$ and $O(\log n)$?

A computer can only process numbers smaller than say $2^{64}$ in a single operation, so even an $O(1)$ algorithm only takes constant time if $n<2^{64}$. If I somehow had an array of $2^{1000}$ elements to process, even an $O(1)$ operation such as an index lookup will start to take longer as the index has to be calculated in multiple operations. I think it will take at least $O(\log n)$.

Similarly even if an algorithm has $O(\log n)$ complexity, $\log n$ cannot possibly grow larger than about a hundred, so could be ignored as no larger than a small constant.

So, is it really meaningful to treat $O(1)$ and $O(\log n)$ as different?

The same applies of any difference of $\log n$, like between $O(n)$, $O(n\log n)$ and $O(n/\log n)$.

by Tor Klingberg at May 26, 2016 02:30 PM


Ich fragte die Tage, wieso sich Obama eigentlich bei ...

Ich fragte die Tage, wieso sich Obama eigentlich bei seinem historischen Hiroshima-Besuch nicht für die Atombombe entschuldigen will.

Heute hat die Süddeutsche eine einleuchtende Erklärung dafür: Die Japaner haben darum gebeten, dass er es nicht tut.

Nanu? Nun, wie sich rausstellt, wünscht sich die japanische Bevölkerung mit knapper Mehrheit (56%) eine Entschuldigung, aber die Politik hat im Vorfeld schon abgewunken, sie wollen keine. Und zwar deshalb, weil die Japaner die Aufarbeitung ihrer eigenen Kriegsverbrechen bisher wunderschön unter den Teppich der Trümmer von Hiroshima kehren konnten. Auf der (länger widerlegten) Lüge, dass die Bombe notwendig war oder gar unterm Strich Leben gerettet habe, haben Japan und die USA eine Militärallianz aufgebaut, und haben die Japaner sich in ihrem Selbstverständnis vom Täter zum Opfer umgedichtet nach dem Krieg. Das wäre jetzt alles gefährdet.

Wie widersprüchlich die Nische ist, in der es sich Japan bequem gemacht hat, haben sie in diesem Absatz schön zusammengefasst:

In internationalen Gremien tritt Japan - als das einzige Opfer von Atombomben - für die nukleare Abrüstung ein. Aber seine Sicherheitspolitik stützt sich wesentlich auf den atomaren Schutzschild der USA. Die Regierung in Tokio nutzt die Hibakusha aus, um Japans Opferrolle zu belegen, vernachlässigt sie aber sonst. Viele Betroffene mussten sich Hilfe vor Gericht erstreiten, manchen wurde sie verweigert.

May 26, 2016 02:00 PM


How To Change Attributes in List of CustomObject in Scala?

I am having following Case class

case class BIC( id: Option[Int], name: String, description: Option[String],
                                    bId: Option[Int], amount: BigDecimal,
                                    createdBy: Option[Int], createdAt: Option[DateTime] )

my REST-API url is like following

POST    /api/v3/user/2/bid/23/create

Now userId = 2 and bId = 23 are coming From URL path params I am also having List of BIC incoming from POST Request JSON which does not contain userId and bId

I am directly injecting incoming JSON to List[BIC], but all the objects in this list doesn't have createdBy (userId) and bId set.

if I want to set these 2 parameters in each BIC objects of List[BIC] what is the approach I should follow?

            val lsttemp:List[BIC] = bicList
            val lst:List[BIC]
              x.createdBy = userId

I am trying above code but its not working.

by praxnet at May 26, 2016 01:46 PM



Vectorized Algorithm for finding the Shortest Path in a Graph

I know that you can calculate the shortest path in a vectorized fashion using Floyd-Warshall, e.g. like proposed by Han and Kang, however I want the matrix, they call "via", the actual route taken through the graph.

Is it possible to retrieve this "via" matrix from a similar vectorized algorithm?

Actually I do not need all-pairs shortest path, but have two nodes, that I just wish to calculate the distance between, however I want to do it vectorized. I cannot seem to find algorithms, that allow me to get the actual path, and not just the distance, when doing it vectorized.

Is it at all possible?

by Mads Ohm Larsen at May 26, 2016 01:42 PM

Why do we set conditions f(n) ≥ n resp. f(n) ≥ log(n) the Time resp. Space Hierarchy?

In the Space (Time) Hierarchy Theorem and also fully space (time) constructibility of two function we have the condition: being greater than $log(n)$ (being greater than $n$).

Why do we have these conditions?

by math at May 26, 2016 01:37 PM


Programming language semantics

When it comes to programming language semantics, is it the case that the following are equivalent :

  1. "declarative", "denotational" and "model-theoretical"
  2. "operational", "procedural" and "proof-theoretical" ?

Also, is there such a thing as "evaluation of a semantics" and does this apply to both classes above ? For example, if an algorithm is a special type of evaluation for an operational semantics what would be the equivalent in the declarative world, i.e can one say for instance that a partiular categorical semantics is an evaluation ? What is the difference between the semantics and evaluation ?

by kittykatron at May 26, 2016 01:28 PM


How can I go about applying azure machine learning algorithms to sales order report?

I have 2 month sales record. i need to predict next weak sales history using azure machine learning to archive this concept. i need sample code or example images.

by Ravi G at May 26, 2016 01:15 PM



Are all pseudo-random number generators ultimately periodic?

Are all pseudo-random number generators ultimately periodic? Or are they periodic at all in the end?

By periodic I mean that, like rational numbers, they in the end generate a periodic subsequence...

And pseudo-random means algorithmic/mathematical generation of random numbers...

by user13675 at May 26, 2016 01:04 PM

Lamer News



How do I train a neural network from many sets of data?

I have many sets of related data that I want to use to train a neural network. The data is from racing pigeons that fly a set distance. The inputs could be weight, age, size, wing span, sex, distance, time etc. sampled lets say every minute. I am trying to predict the typical time a pigeon, given the inputs, would take in completing a race. The input data would be normalised to improve the efficiency of the network. A typical line of data may look like for weight, age, size, wing span, sex, distance, time would be -1.34, -1.23, -0.49, 0.47, 0.97, -1, -1.44, -0.75 There would be many of these lines as the race progresses and at the end there would be a time result for each pigeon. Ie 3 hours 5 minutes. I would then have a complete set of data for this race but obviously each race will be different so I need a way of training it for many races. My question is, because each race is a discreet event what methods should I use to train the network. What I am confused about is this isn't a continuous stream of data that I can just let it get to work on. Should I train it with one set\race until it has learnt that and then move onto the next one etc or is there a methodology to use for this type of problem.


by SpeedOfSpin at May 26, 2016 12:34 PM


Planet Theory

Theory Jobs 2016

In the fall we point to theory jobs, in the spring we see who got them. How is the CS enrollment explosion affecting the theory job market? We've seen some big name moves but that's only part of the picture.

Like last year and years past I created a fully editable Google Spreadsheet to crowd source who is going where. Ground rules:
  • I set up separate sheets for faculty, industry and postdoc/visitors.
  • People should be connected to theoretical computer science, broadly defined.
  • Only add jobs that you are absolutely sure have been offered and accepted. This is not the place for speculation and rumors.
  • You are welcome to add yourself, or people your department has hired.
This document will continue to grow as more jobs settle. So check it often.


by Lance Fortnow ( at May 26, 2016 12:23 PM

Planet Theory

Sad news: Zoltán Ésik passed away yesterday

Our good colleague and friend Zoltán Ésik passed away suddenly yesterday afternoon in the hotel room where he was staying with his wife during a visit to our group at Reykjavik University. He had delivered a survey talk at Reykjavik University on the Equational Logic of Fixed Point Operations on Tuesday and we were making plans for the coming days.

Zoltán was a scientist of the highest calibre; he was one of the prime movers in the monumental development of Iteration Theories, amongst many other achievements. He had served the EATCS as a member of its Council for many years, had recently been named as one of the 2016 EATCS Fellows and he had been a member of the Presburger Award Committee for the last two years.

There is much more to be said about his scientific achievements and warm personality, but this will have to wait for better times.

R.I.P. Zoltán. 

by Luca Aceto ( at May 26, 2016 12:17 PM


Proof of correctness for the greedy solution of uva10249 The Grand Dinner [on hold]

This problem is at here.

Given $M$ teams and $N$ tables, can we find a way to divide the teams over the tables such that no two members of the same team end up at the same table? The number of team members of each team is given as $m_1,\dots,m_M$, and likewise the capacity of the tables is also specified $n_1,\dots,n_N$.

A greedy solution is: assign one member of the team with currently most people to the table with currently biggest capacity.(After the assignment, both the number of people and capacity of table will be updated.) How to prove the correctness of this greedy solution?

If this solution is not correct, what is the correct one? How can you prove its correctness?

by YangZhao at May 26, 2016 12:04 PM


how does theano.scan's updates work?

theano.scan return two variables: values variable and updates variable. For example,

a = theano.shared(1)

values, updates = theano.scan(fn=lambda a:a+1, outputs_info=a,  n_steps=10)

However, I notice that in most of the examples I work with, the updates variable is empty. It seems only when we write the function in theano.scan is a certain way, we get the updates. For example,

a = theano.shared(1)

values, updates = theano.scan(lambda: {a: a+1}, n_steps=10)

Can someone explain to me why in the first example the updates is empty, but in the second example, the updates variable is not empty? and more generally, how does the updates variable in theano.scan work? Thanks.

by user5016984 at May 26, 2016 12:02 PM


Rust Sysroots for Easier Embedded Projects

Rust now has usable support for Rust sysroots. This makes it must easier and useful to use cargo to build your embedded projects. You should be able to even use stable rust once 1.9 is out, but you will have to use a nightly for now.

Build libcore for your target

Use the instructions in my last blog post here to build a libcore for the target platform.

Make a sysroot directory structure:

The directory naming and structure in your sysroot is important. Put the libraries you built in the previous step in:


Replace $target_name with your desired target, in my case this is thumbv7em-none-eabi so I have:


(you can name my-sysroot whatever you want)

Set RUSTFLAGS to specify the desired target and sysroot

You have two options for specifying the sysroot. You can either use the RUSTFLAGS environment variable:

RUSTFLAGS=”–sysroot=my-sysroot” cargo build –target thumbv7em-none-eabi

or you can specify these in the .cargo/config file:

target = "thumbv7em-none-eabi"
rustflags = ["--sysroot", "my-sysroot"]


and then just do:

cargo build

To build your cargo library against your desired target and sysroot.

I usually have a few more flags. Something like:

target = "thumbv7em-none-eabi"
rustflags = ["-C", "opt-level=2", "-Z", "no-landing-pads", "--emit", "obj", "--sysroot", "my-sysroot"]

and then build with a:

cargo build --verbose

The --emit obj causes cargo to emit an object file as an artifact of the build which you can link into an existing C embedded project. The --verbose shows the arguments passed to rustc under the hood which is useful to verify that cargo is calling rustc as you expect.

The post Rust Sysroots for Easier Embedded Projects appeared first on Atomic Spin.

by Job Vranish at May 26, 2016 12:00 PM


Inverse Function Theorem for Recursive Functions

I need inverse function theorem for recursive functions. Has it been described? Any pointers will be greatly appreciated. Thanks!

by Kasthuri at May 26, 2016 11:57 AM


"Risk" Factor vs Double Sorts

With regards to a cross-sectional asset pricing (stocks) study, I am testing if one variable can explain another. One common approach to do this, is to use the double-sorting portfolio technique (sort on variable a into portfolios, then dependent on a - sort b into portfolio). This approach seems to be adequate for such a problem, if you have a large sample so that you get reasonable amount of dispersion in your sort variables.

Another approach I have seen in the literature is to create factors of your preferred explanatory variables by using the Fama French factor methodology (HML,SML etc), then sort you portfolios on the first variable (the variable you would like to explain) and run time series regression of a Fama French three factor model (or CAPM) augmented with your new factor.

I am wonder if there are any arguments for performing the second over the first approach? Could these approaches complement each other,

by Nicolai at May 26, 2016 11:54 AM


What is the difference between Multilabel and Multitask classification?

Please provide inputs on what is the fundamental difference between the Multilabel and Multitask classification. Is it possible to model a Multilabel problem as Multitask and vice-versa?

by Monika Gupta at May 26, 2016 11:47 AM

Fred Wilson


I am really into embed codes. I think the ability to take content and software from one app and inject it into another app is one of the coolest things about the Internet.

Biodigital is a 3D visual model of the human body. Think of it as Google Maps for the body.

I’ve had issues with my MCL in my left knee for the past five or six years. I slightly tore it and although it has healed, I still get pain in it from time to time.

So I went into Biodigital today to take a look at what the MCL actually looks like. Here it is:

That’s pretty cool, right?

If you want to get lost in the human body for the next hour, go check out Biodigital.

by Fred Wilson at May 26, 2016 11:22 AM



Merkt ihr, wie die Linkspartei völlig in der Belanglosigkeit ...

Merkt ihr, wie die Linkspartei völlig in der Belanglosigkeit versinkt? Früher waren die noch die Protestpartei, das ist heute die AfD.

Und in den Bundesländern, in denen die Linken an der Macht waren, haben sie es nicht geschafft, sich hinreichend von den anderen Parteien zu unterscheiden. Argumente für ihre Wiederwahl haben sie jedenfalls keine hinterlassen.

Angesichts des epischen Verlierertums bei den linken Parteien in Europa frage ich mich langsam, wie der Kommunismus jemals ganze Länder übernehmen konnte.

May 26, 2016 11:00 AM


How to build a cross currency swap pricer?

We're looking to build a pricer to convert a funding spread in a given currency over a specific funding basis e.g. 20 bps EUR 3m€ and convert it to a funding spread to a different currency with a different funding basis say USD 6m$L.

We're in the process of sourcing market swap data including discount factors for EONIA, FedFund and LIBOR for different tenors.

Looking for someone to help us with this, could even turn into a paid project, basically I'm totally lost! Thanks!

by Rob Taylor at May 26, 2016 10:47 AM


Does google prediction api work for image recogniton

I read the official documentation for the api, but I wanted to make sure that it's possible for it to perform object recognition in images. More specifically, my idea is to provide a lot of images of parking lots with the number of parking spots currently available. I wanna get a model to predict how many spots are available given an image of the parking lot.

Does anybody have previous experience with using the API for a similar goal?

by niko at May 26, 2016 10:26 AM

Incrementally trainable one class classifier

I'm working on a classification problem where I have data about only One Class, so I wanna classify between that "Target"class against all other possibilities which is the "Outlier" Class in incremental learning. So, I have found some libraries, but none of them support updating classifier.

Do you know any library that supports one-class classifier with updating pre-existed classifier especially in java or matlab?

by Arman at May 26, 2016 10:19 AM


Volatility Targeting, Leverage and Backtesting

I am trying to figure out how the best way to include volatility targeting / leverage in my backtests. Multiplying returns by intended leverage naturally increases volatility, so my pushing this up and down I can get the intended annual volatility target. Let's say 25%. Using a simple momentum strategy,

import pandas as pd
import numpy as np 

def crossover(df,col,lev):
    signals = pd.DataFrame(index=df.index) 
    signals['signal'] = 0 
    short_ma = pd.rolling_mean(df[col], 40, min_periods=1) 
    long_ma = pd.rolling_mean(df[col], 100, min_periods=1) 
    signals['signal'] = np.where(short_ma > long_ma, 1, 0) 
    df['signal'] = signals['signal'].shift(1) 
    df['ret'] = df[col].pct_change() * df['signal']
    ret = df.ret.dropna() * lev
    return ret

# see the link below for the raw file
ff = 'CRUDE_W_price.csv'
px = pd.read_csv(ff,parse_dates=True,index_col=0)
ret = crossover(px,'PRICE',lev=1.1) 
print ret.std()*np.sqrt(252)
print 'APR', ((**(252./len(ret)))-1
print 'Sharpe', np.sqrt(252.)*np.mean(ret)/np.std(ret)    

I get

Annual volatility 0.242445046081
APR 0.0734578424568
Sharpe 0.404703408221

The leverage I applied was 1.1. But what if there was an implicit leverage, such as 10:1 leverage for futures (through the margin) or 2:1 for stocks and ETFs? Should that have been included somehow in the calculations above? Or should I leave the calculation above the way it is, but use 1.1 / 10 = 0.11 as a way to allocate funds within my portfolio?


by user616 at May 26, 2016 10:13 AM


What is the need of re-sampling the image for HOG features?

I read Dalal and Triggs paper for HOG description and a blog Chris McCormick HOG regarding the same. The blog says that the image needs to be re-sampled at different scales to recognize different person.

My question is: Already we have a window which we place on the image having a size of 64*128 and which slides over the image. Then why re-sampling instead of sliding the whole window over the image which can detect the persons instead. ?

Please rectify if I am wrong, thanks in advance !!

by Ashutosh Mishra at May 26, 2016 10:13 AM

tips on how to define a state space for reinforcement learning

I have read some reinforcement learning model examples of various things and I was surprised by how varied unintuitive some of the state spaces are.

For example: in the standard grid world, it is easily understandable that the position is the state of an agent.

In the mountain car problem, the current position is also defined as the state. I have seen versions where the distance to the goal is the definition of the state of the agent.

In a task where an arm must learn to reach a stochastic target, I thought the state space definition would be the distance to the target. But the definition of the state space was the relative position of the target with respect to the hand: if it is up, down, left, or right.

Are there rules, tips or tricks to construct the state space?

by cgo at May 26, 2016 10:12 AM


Advice using the theano implementation of Conv3D

I am trying to run a 3D convolutional neural network using theano, however I am not completely sure of the usage of the function theano.tensor.nnet.Conv3d.

I am used to using lasagne, however due to not having access to a GPU at this time I am unable to use the lasagne.layers.dnn.Conv3DDNNLayerfunction.

Is anyone able to advise me in terms of inputs and outputs what I need to do to be able to use the theano function? I have data in the form N x 9 x 9 x 9 with 1 channel, and have initialised the theano tensors. I have also created an input layer with lasagne like so:

input = lasagne.layers.InputLayer((None, 1, 9, 9, 9), input_var=input_var)

Any advice is very welcome! Thanks.

by i-am-spartacus at May 26, 2016 10:11 AM



Algorithm to solve a tangram [on hold]

I'm given a set of pieces and a shape, and I've to come up with one(any) solution. I'm totally stuck on where to begin. I'd like to get some suggestions to get me started. Particularly, I'd like to know what algorithm could I use to fit the pieces into the shape.


<svg version="1.1" xmlns="">
    <path d="M 90 90 L 90 110 L 110 110 L 110 70 z" fill="yellow"/>
    <path d="M 130 80 L 170 120 L 170 80 z" fill="red"/>
    <path d="M 290 150 L 270 170 L 210 110 L 250 110  z" fill="olive"/>
    <path d="M 190 120 L 190 160 L 210 180 L 230 160 z" fill="magenta"/>
    <path d="M 320 80 L 300 100 L 280 100 L 280 80 z" fill="blue"/>
    <path d="M 320 150 L 360 150 L 320 190 z" fill="green"/>
    <path d="M 230 30 L 230 70 L 210 90 L 190 70 L 190 30 z" fill="purple"/>


<svg version="1.1" xmlns="">
    <path d="M 50 10 L 90 10 L 90 50 L 130 50 L 130 90 L 90 90 L 90 130 L 50 130 L 50 90 L 10 90 L 10 50 L 50 50 z" fill="brown"/>

sample solution:

<svg version="1.1" xmlns="">
    <path d="M 50 10 L 70 10 L 70 30 L 50 50 z" fill="yellow"/>
    <path d="M 70 10 L 90 10 L 90 50 L 70 30 z" fill="blue"/>
    <path d="M 10 50 L 50 50 L 10 90 z" fill="green"/>
    <path d="M 90 50 L 130 50 L 130 90 z" fill="red"/>
    <path d="M 10 90 L 70 30 L 90 50 L 50 90 z" fill="olive"/>
    <path d="M 90 50 L 130 90 L 90 90 L 70 70 z" fill="magenta"/>
    <path d="M 70 70 L 90 90 L 90 130 L 50 130 L 50 90 z" fill="purple"/>

by user51737 at May 26, 2016 10:07 AM


Accessing object key using reduce WITHOUT passing it as param

Without altering any of this code, is it possible to write a wrapper function that calls reduce on randomObject and accesses the keys through it?

This is assuming that:

  • reduce acts just like the version below
  • reduce doesn't include an index parameter like the real deal-- only each does

    var randomObject = {a: 6, b: 5}; 
    var each = function(obj, iterator){
        for(key in obj){
            iterator(obj[key], key);
    var reduce = function(accumulator, collection, callback){
        each(randomObject, function(item){
            accumulator = callback(accumulator, item);
        return accumulator;

by cheesenthusiast at May 26, 2016 10:01 AM


Gute Nachrichten: Im Zuge der Modernisierung kriegen ...

Gute Nachrichten: Im Zuge der Modernisierung kriegen die Atomstreitkräfte der USA bald Rechner ohne Disketten.
Die Atomstreitkräfte der USA benutzen teilweise noch vollkommen veraltete Computer und Floppy Disks. Das kritisierte nun der US-Rechnungshof (GAO) in einem Bericht. So laufe im Verteidigungsministerium ein Kommando- und Kontrollsystem auf einem IBM-Computer der Serie 1, die aus den Siebzigerjahren stammt. Eine Sprecherin des Pentagons sagte, das System bleibe in Gebrauch, weil es noch immer funktioniere.
Wenn die von Disketten und 70ern reden, meinen sie glaube ich die ganz großen 8"-Dinger, die sich wie eine wegfließende Schallplatte anfühlen.

May 26, 2016 10:00 AM

Heute werden alle 19 französichen Atomkraftwerke bestreikt. ...

Heute werden alle 19 französichen Atomkraftwerke bestreikt. Aber natürlich schalten sie sie nicht einfach aus, sondern sie lassen sie weiterlaufen. Prinzip Hoffnung, wie immer beim Atomstrom.

Was kann da schon schiefgehen. Keinerlei Bedrohung für die Bevölkerung!

May 26, 2016 10:00 AM

Lamer News



Describing the language of a CFG

Let G be a CFG such that:

S -> aSb | bY | Ya
Y -> bY | aY | lambda

Give a simple description of L(G) in English.

From what I can tell from this CFG, there are many string that can be generated from it, but not so sure what this review means about giving a simple description?

by Darkflame at May 26, 2016 09:58 AM


A webapp built generally in functional style? [on hold]

I can't find good examples of single-page webapps built in JavaScript with the primary functional-style approach. I have read some books on the topic, and authors say that in functional style you approach components of the app as functions, not objects, as it would be with OOP.

Assume, I have an application with some components in it - Dialog, Button, ScrollArea, as I would call them if I used OOP style.

In a regular way, I would do some things like this:

  1. Instantiating a new dialog:

    var profileDialog = new Dialog({type: 'profile'});
  2. Instantiating a new button that listens to click and opens the dialog:

    var profileButton = new Button({size: 'large', label: 'Edit profile'});

Now, I wonder, how can I implement this basic part of an app in a functional style? I suppose, I have to have profileDialog as an object with methods, or can I look at it as at a function? How would then methods of it look and where would they be located? As separate functions?

Are there practical examples of how functional style can be used in a JavaScript frontend application, besides things like processing objects and arrays with data?

Is functional style suitable for for things like UI, events management and other functionality that is basic for webapps?

by Sergei Basharov at May 26, 2016 09:56 AM

Detect irregular horizontal and vertical lines in scatterplot

I plot a scatter plot, in which I can see see certain patterns that I want to detect. They look like either vertical or horizontal lines (but are quite irregular as well).

allpoints[['day', 'flags']].set_index('day').dr_h.plot(figsize=(10,8), style='.', color='g', alpha=.2)

enter image description here

by tesgoe at May 26, 2016 09:47 AM


Free high resolution financial data

As thebonnotgang(1) stopped updating their database, I was wondering if there are some other free sources of high-frequency data available.

I found a proper tick data api (ca. 25 day history) hosted by onvista which is however undocumented. As some one might be interested I will add a short howto:


[TradingDay]: Can be calculated by fixing a date e.g. 26-01-2016 as 1453791600 and subtracting the number of days multiplied by 86400. You can find the current trading day here[2]. Just klick on 'Anzeigen' as soon as the page loaded.

[ISIN]: self-explanatory I guess

[AssetType]: Up to know I was only downloading stocks so I used 'Aktie'

[ExchangeID]: I'm not able to offer a full list of all exchanges however '1937897' refers to Xetra and '9385907' refers to Tradegate. You can find more by playing around with the stock exchange dropdown menu [here][3] and looking at &notation=[***] within the url.

Example Allianz 06.01.2016:

(I am afraid this link will be broken in a few days due to the time horizon. So it should serve just as an example)

As onvista is restructuring their homepage I assume they will take this service down.

So did someone discover something similar?

A similar question[4] has been asked several years ago and provides more trivial/googled results.





I didn't have 'sufficient reputation to post the links in correct mode.

by user47489 at May 26, 2016 09:22 AM


Number theoretic problem that is $\#P$-complete

Is there a number theoretic problem that is $\#P$ complete?

by Student. at May 26, 2016 08:59 AM


Open source equity/bond index data

I have been using the tseries package of R (get.hist.quote) to get historical quotes for various indices from yahoo finance. I am interested in DAX, VDAX, EB.REXX and DJ UBS Commodity Index. When I tried to expand the time window for my analyses I saw that all time series except DAX and VDAX are discontinued.

My questions:

1) Do you know why EB.REXX (the symbol was REX.DE) dissapeared on yahoo finance (I now use EB.REXX 10 years, REX0.DE, but it is also discontinued) and why I can not find DJ UBS Cdty Index (symbol: ^DJUBS) anymore?

I use code like


get.hist.quote(instrument="REX0.DE",start="2006-01-01",quote=c("AdjClose"),compression="d") get.hist.quote(instrument="^DJUBS",start="2006-01-01",quote=c("AdjClose"),compression="d")

but both times series end in the 2nd half of 2012.

2) Do you know any R-compatible open data source where I can get

  1. a price or performance index for German or core-EURO government bonds (like eb.rexx)
  2. a price or performance index for broad commodities (like DJ UBS Cdty Index)?

EDIT: I started to try getSymbols of the quantmode package.

  1. In google finance I found INDEXDB:RXPG for EB.REXX and INDEXDJX:DJUBS for DJ UBS - are these the correct indices? Where do I find any description of the data?
  2. The example taken from the manual - getSymbols("MSFT",src="google") - works, but what I would need for the index data - getSymbols("INDEXDB:RXPG",src="google") - does not ...

by Richard at May 26, 2016 08:53 AM

Non-contractual accounts behavioural study

I need to carry a non-contractual accounts behavoiural study for a bank. The objective is to estimate core/non core ratios and then bucket and ftp them. Any recipe where to start? I have 3yrs of historical data, daily closing balances. From what I googled I understand that I need some kind of seasonal vs growth trend segregation. But only guidelines, nothing in particular. Visually represented my data has (e.g. current accounts) very heavy seasonal bias with highs in shoulder seasons and lows in the festive seasons ;)). How to isolate it? How do I then calculate the true core/volatile ratio?

by Peaches at May 26, 2016 07:53 AM


ordered uniform distribution

We are given $n$ objects with individual weights $w_1 , w_2 , \ldots , w_n$ and $m$ buckets in which these objects are to be inserted but in order. Here order means if object $i$ goes in bucket $m_i$ and if object $j$ goes in bucket $m_j$ then if $i < j$ then $m_i \le m_j$. Objective of the problem is the minimize the weight (sum of weight of objects in the bucket) of the bucket which has the maximum weight (amongst all buckets) in such a distribution.

by user6370652 at May 26, 2016 07:47 AM



How to infer category of a single text with weka Classifier? [on hold]

I am a beginner with weka. I have managed to import dataset from the disk (one folder by category, all text related to this category inside the folder), apply StringToWordVector with tokenizer, train a Naive Multniomial categorizer ... The code is below (it is c# but Java is ok of course) However, I can hardly find information on how to use the categorizer on a project. Say I have a text with unknown category, input by a user, how can I just apply the categorizer to this text and infer the category it belongs to ? (code "// what to do here below"). Any help would be greatly appreciated ;-) Thanks in advance


string filepath = @"C:\Users\Julien\Desktop\Meal\";
ClassificationDatasetHelper classHelper = new ClassificationDatasetHelper();
weka.core.converters.TextDirectoryLoader tdl = new

            weka.core.Instances insts = tdl.getDataSet();

            weka.filters.unsupervised.attribute.StringToWordVector swv = new weka.filters.unsupervised.attribute.StringToWordVector();
            weka.core.tokenizers.NGramTokenizer tokenizer = new weka.core.tokenizers.NGramTokenizer();

            insts = weka.filters.Filter.useFilter(insts, swv);


            int percentSplit = 90;

            weka.classifiers.Classifier cl = new weka.classifiers.bayes.NaiveBayesMultinomial();
            Console.WriteLine("Performing " + percentSplit + "% split evaluation.");

            int trainSize = insts.numInstances() * percentSplit / 100;
            int testSize = insts.numInstances() - trainSize;
            weka.core.Instances train = new weka.core.Instances(insts, 0, trainSize);


            string s = "Try to classify this text";
            weka.core.Instance instanceToClassify = new weka.core.Instance();

            // what to do here
            // ???

            double predictedClass = cl.classifyInstance(instanceToClassify);

by JulienS at May 26, 2016 07:34 AM


reinforcement learning in gridworld with subgoals

Andrew Ng, Daishi Harada, Stuart Russell published a conference paper entitled "Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping"

There is a specific example there that I am extremely curious/interested about. It is in Figure 2(a) of the paper. It is about a 5x5 gridworld with start and goal states in opposite corners. The catch is, the agent must learn to go from start to end BY VISITING specific subgoals 1,2,3,4 IN ORDER.

Has anyone seen/understood the code for this? I want to know how the reward function/shaping is given in this kind of problem.

I am interested to know how the flow of this modification to the grid world is written.

by cgo at May 26, 2016 07:08 AM

Find internal surfaces in B-rep geometry

I have a solid with internal holes. My solid is in Boundary-representation (B-rep). More exactly my solid is a list of polygons that are list of points. I want to distinguish internal surfaces from external ones. My list of polygons has a random order.

I want to be able to distinguish external surfaces from internal ones. It surely exists in literature but I can't find it.

I think BSP-trees can help me solve such a problem, but I can't find out how.


Here is an example of my situation:

I have an idea on how to do it. My algorithm seems robust, but a bit stupid, I'm sure in literature there is something smarter :

First I need to get a point inside my object

  • I calculate the bounding box containing my polygons
  • I can then easily have a point outside if its x,y,z are superior or inferior to any other x,y,z of the bounding box

Find one external face

  • From this point I do take a ray (the choice of the may be random, but a ray may fail to intersect the object. A ray between my exterior point and the center of the boundingbox should always be good)
  • With ray tracing, I can find out the first polygon it touches. It may touch a vertex first. Then I'd just take one of the polygon that contains this vertex.

Split my list of polygons into smaller lists

Let's call P my list of polygons, and L a list of list of polygons, then li a list of polygons included in L.
li should include only polygons that touches each other.
Let's define what is touching each other : We define a polygon as a list of edges.
A general definition would be that two polygons (polygon1,polygon2) touch each others if ∃(edge1,edge2)∈ polygon1 x polygon2

For my simplified case : My polygons are list of points, and I know that two polygons that touches each others implies that they have a common vertex. I will use this simplified definition in practice, but it doesn't alter the methodology.

  1. I take the first polygon in P
  2. I remove it from P and add it to li
  3. I see if it touches any other polygon in P
  4. I get those polygons, remove them from P and add them to li
  5. I see if those polygons touch other polygons in P
  6. With those newly touched polygons, I repeat the step 4. But if it didn't touch anything at step 5, I go to next step.
  7. I put li in L
  8. if P still contains at least one item, I go to step 1

Only one of the sublists contain external li the external polygon I have found above. The others are internal.
When to check which sublists contain the external polygon probably depend in the data structure, but it's a matter of optimization, not robustness.


Maybe I misunderstood the idea behind the Boundray-represenation and wrongly implemented it. But for the moment I just have a list of polygons (in a really random order) that face outward the solid. Facing outward means either the normal vector face toward the external space, or the internal holes. It's easy for me to do it manually, but algorithmically I didn't find a simple solution to distinguish holes from outerspace. The polygons that face the holes and those who face the outer space are both in the same list.


I have a data structure that looks like that :
Class Building Element:

  • Object ID (Input)
  • Implicit geometry (Input)
  • Explicit geometry :
    I calculate explicit geometry from the implicit geometry, if I have for example a XY rectangle and a vector, I can calculate the 6 faces of the wall.
    It gets more complicated when I have curves and weirder stuff. Sometimes I have a problem with complicated forms, where even 4 points that I calculated and are supposed to be planar are not (numerical computation error I think) so I split them into triangles. I try to not use triangles in most cases, as it makes my next computations heavier (more polygons to take into account)
    My room is a list of building elements that bound its space.
    I can't give any real example because of lab's and partners policy unfortunately.

by Marouane at May 26, 2016 06:51 AM


Using SelectKBest on Text Data

I've got a corpus, and in order to improve the accuracy of some classifiers I am using, I would like to use SelectKBest for feature selection.

Is there someway I could fit SelectKBest into my pipeline?

SVC_clf = Pipeline([('vect', CountVectorizer(ngram_range = (1,1), max_features = 10000, max_df = 0.75)),
                     ('tfidf', TfidfTransformer(use_idf = True)),

If not, how can I use SelectKBest to find the best features?

by Demetri P at May 26, 2016 06:39 AM


Runtime complexity for selection sort using recursion [duplicate]

This question already has an answer here:

I have implemented selection sort using recursion for which I am getting following function for var inputArr = [99,88,77,66,55,44,33,22,11,10];

$(-1 + (-1)^n + 8 n + 2 n^2)/8$

How would I calculate Asymptotic notation from above function?

by Amit Gupta at May 26, 2016 06:16 AM


How to perform batch-trading using Interactive Broker API?

My definition of batch-trading: Given $N$ BUY orders, $M$ SELL orders and $O$ ($O < N$) as the max number of open positions to be held. Batch-trading should monitor the orders and when $O$ BUY orders are filled it would cancel the $N-O$ BUY Orders.

To perform this, below is the pseudo-code:

for every PositionMessage that comes in: if number_of_filled_openpositions >= Max: // cancel is performed only ONCE, I set a "flag" to make sure // cancel is called only once during this entire process cancel_unfilled_BUY_orders_one_by_one() // OpenPositions have a TimeUpdated field, // Sell order will be generated starting from the last filled orders // Sell order is generated only after the position is fully filled Sell_excess_filled_orders()

The problem I am facing now is, depending on the day, I might get correct number of cancellations and correct number of excess_positions_sold. But on other days, I sometimes run into max number of messages exceeds limit.. or remote host forcibly disconnected your connection or incorrect number of cancellation or excess_filled_sold.

I have tried introducing a time-delay within cancel_unfilled_BUY_orders_one_by_one() so that each cancel order is sent after a time-delay.

I am finding it hard to debug because I need the market-to-be-open to test exhaustively, which is 00:00 (midnight) for me and on weekends the markets are not open (this is a hobby project).

What else should I consider? what am I missing? Has anyone else tried doing something similar?

Happy to elaborate more if required.

by user2979010 at May 26, 2016 06:15 AM



How is currying in groovy closures different from currying in functional programs?

Read about them and looked at examples. Can't understand the minute distinction. Can someone please explain?

by Prachi at May 26, 2016 06:06 AM



How is the fundamental theorem of asset pricing used?

I know that a multi-period market model is complete and arbitrage free if there's a unique equivalent martingale measure. The thing is, I have absolutely no clue how to apply this theorem to a simple binomial tree. I just don't get what the two things even have to do with one another.

For example, consider:


Yes, I know that $u = 1.1$. I know that $d = 0.9$. But what does this have anything to do with the complicated theorem which talks about conditional expectations and equivalent martingale measures? I guess $q = (R - d)/(u-d)$ and $1 - q$ is this equivalent martingale measure but why? And why is it unique?

by Min at May 26, 2016 05:38 AM


Why Convex Sets for I-projections?

I have been reviewing the literature on information theoretic methods in statistics, and in particular, the method of I-projections. Given a discrete, finite alphabet $\mathcal{X}$, let $\prod$ denote a set of probability mass functions (pmfs) on $\mathcal{X}$. Suppose $Q$ is a pmf on $\mathcal{X}$ which does not belong to $\prod$.

The I-projection of $Q$ on $\prod$ is defined as the element $P^{*}\in \prod$ such that

\begin{eqnarray}P^{*}=\arg \min \limits_{P\in \prod} D(P||Q),\end{eqnarray} where $D(P||Q)$ denotes the Kullback-Leibler divergence (KL-divergence) between pmfs $P$ and $Q$, defined as \begin{eqnarray}D(P||Q)=\sum\limits_{x\in \mathcal{X}} P(x)\log\left(\frac{P(x)}{Q(x)}\right).\end{eqnarray}

A sufficient condition for the existence of a $P^{*}$ as above requires $\prod$ to be closed. In addition, in the literature, $\prod$ is considered to be a convex set. For example, the family

\begin{eqnarray} \mathcal{P}=\lbrace P: \sum\limits_{x\in \mathcal{X}} P(x)f_{i}(x)=\alpha_{i}, ~1\leq i\leq k\rbrace, \end{eqnarray} where $f_{i}:\mathcal{X}\rightarrow \mathbb{R}$ are measurable functions and $\alpha_{i}\in \mathbb{R}$, called a linear family of pmfs, is typically considered in place of $\prod$ in the literature.

Question: Can someone please explain the significance of using convex sets? Do things fail if we do not consider convex sets?

by Karthik at May 26, 2016 05:35 AM



which programming languages uses for my career? [on hold]

Myself Deepen Patel. Current, i am working on python. I am looking career in AI, Machine Learning, Linux Development, High Performance Computing. so, which programming language i have to choose for this? I have lots of confusion to choose any one. so, suggest me on this.

by Deepen Patel at May 26, 2016 04:52 AM


Both shared-memory and distributed-memory

As you see in the following picture its a shared-memory architecture which has been modeled in a form of complete graph. Why we can consider the following architecture (which is a complete graph) both as shared-memory and distributed-memory architecture?

enter image description here

I know it seems to be an distributed-memory architecture but can we say that if we consider local memory of one of the processor as shared-memory for other ones(I mean the shared-memory is located in one of the processor) then we can consider it as shared-memory architecture not distributed?

by Abbas Foroughi at May 26, 2016 04:39 AM



Is there any problem which is in NP but not a NPC and NPH? [on hold]

I just started learning Complexity theory. And I am searching from the last four five days, only one thing. Is there any problem which is in NP but not a NPC and NPH. Look in this diagram,

Consider P in not Equal to NP

Ther is space outside of P, which is not a part of NPC. I am wondering, is there any problem exist or not there?

by user39176 at May 26, 2016 03:42 AM


How to map XGBoost predictions to the corresponding data rows?

XGBoost generates a list of predictions for the test dataset. My question is how can I map the generated predictions to the actual test file rows ? Is it strictly safe to assume that the nth prediction corresponds to the nth data row ? XGBoost leverages multi-threading for its operations. So, in such a setting can it be trusted that the prediction results strictly map to the test data rows ? Ideally would have really loved if there was a way to annotate the predictions with some row identifier from the test data file ?

I am using this example and working with DMatrix data format of XGBoost.

by invinc4u at May 26, 2016 03:41 AM

Planet Theory

Pre-announcement for the SODA call for papers

SODA PC chair Phil Klein asked me to publicize this quick announcement because of delays in getting this information online in a more official way: for next January's SODA (ACM-SIAM Symposium on Discrete Algorithms) in Barcelona, the submission deadlines will be July 6 (short abstract and paper registration) and July 13 (full submission).

For now, you can visit for the basic information (deadlines, submission site, and program committee).

May 26, 2016 03:11 AM