# Planet Primates

## October 07, 2016

### Planet Theory

#### Linear algebraic structure of word meanings

Word embeddings capture the meaning of a word using a low-dimensional vector, and are ubiquitous in natural language processing (NLP). (See my earlier post 1 and post 2.) It has always been unclear how to interpret the embedding when the word in question is polysemous, that is, has multiple senses. For example, tie can mean an article of clothing, a drawn sports match, or a physical action.

Polysemy is an important issue in NLP and much work relies upon WordNet, a hand-constructed repository of word senses and their interrelationships. Unfortunately, good WordNets do not exist for most languages, and even the one in English is believed to be rather incomplete. Thus some effort has been spent on methods to find different senses of words.

In this post I will talk about my joint work with Li, Liang, Ma, and Risteski, which shows that word senses are in fact easily accessible in many current word embeddings. This goes against conventional wisdom in NLP, which holds that word embeddings cannot capture polysemy, since they use a single vector to represent a word regardless of whether it has one sense or a dozen. Our work shows that the major senses of a word lie in linear superposition within its embedding, and are extractable using sparse coding.

This post uses embeddings constructed using our method and the Wikipedia corpus, but similar techniques also apply (with some loss in precision) to other embeddings described in post 1 such as word2vec, GloVe, or even the decades-old PMI embedding.

## A surprising experiment

Take the viewpoint (simplistic yet instructive) that a polysemous word like tie is a single lexical token representing unrelated words tie1, tie2, … Here is a surprising experiment suggesting that the embedding for tie should be approximately a weighted sum of the (hypothetical) embeddings of tie1, tie2, …

Take two random words $w_1, w_2$. Combine them into an artificial polysemous word $w_{new}$ by replacing every occurrence of $w_1$ or $w_2$ in the corpus by $w_{new}.$ Next, compute an embedding for $w_{new}$ using the same embedding method while deleting embeddings for $w_1, w_2$ but preserving the embeddings for all other words. Compare the embedding $v_{w_{new}}$ to linear combinations of $v_{w_1}$ and $v_{w_2}$.

Repeating this experiment with a wide range of values for the ratio $r$ between the frequencies of $w_1$ and $w_2$, we find that $v_{w_{new}}$ lies close to the subspace spanned by $v_{w_1}$ and $v_{w_2}$: the cosine of its angle with the subspace is on average $0.97$ with standard deviation $0.02$. Thus $v_{w_{new}} \approx \alpha v_{w_1} + \beta v_{w_2}$. We find that $\alpha \approx 1$ whereas $\beta \approx 1- c\lg r$ for some constant $c\approx 0.5$. (Note this formula is meaningful when the frequency ratio $r$ is not too large, i.e. when $r < 10^{1/c} \approx 100$.) Thanks to this logarithm, the infrequent sense is not swamped out in the embedding, even if it is 50 times less frequent than the dominant sense. This is an important reason behind the success of our method for extracting word senses.
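The geometry being measured here can be sketched in a few lines of numpy. This is only a toy illustration of "cosine of the angle with a subspace" on random vectors, not the paper's actual embedding pipeline:

```python
# Toy sketch: cosine of the angle between a vector and the subspace
# spanned by two others. Vectors here are random stand-ins, not real
# word embeddings.
import numpy as np

def cos_to_subspace(v, u1, u2):
    """Norm of the projection of the unit vector v/|v| onto span{u1, u2}."""
    # Orthonormal basis for span{u1, u2} via QR decomposition
    Q, _ = np.linalg.qr(np.stack([u1, u2], axis=1))
    v_hat = v / np.linalg.norm(v)
    return np.linalg.norm(Q.T @ v_hat)

rng = np.random.default_rng(0)
u1, u2 = rng.normal(size=300), rng.normal(size=300)
v = 1.0 * u1 + 0.6 * u2                  # a vector genuinely in the span
noise = v + 0.1 * rng.normal(size=300)   # a slightly perturbed copy

print(cos_to_subspace(v, u1, u2))        # exactly in the span: cosine 1
print(cos_to_subspace(noise, u1, u2))    # close to 1, like the 0.97 reported
```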

This experiment (to which we were led by our theoretical investigations) is very surprising because the embedding is the solution to a complicated, nonconvex optimization, yet it behaves in such a strikingly linear way. You can read our paper for an intuitive explanation using our theoretical model from post 2.

## Extracting word senses from embeddings

The above experiment suggests that

$$v_{tie} \approx \alpha_1 v_{tie1} + \alpha_2 v_{tie2} + \alpha_3 v_{tie3} + \cdots \qquad (1)$$

but this alone is insufficient to mathematically pin down the senses, since $v_{tie}$ can be expressed in infinitely many ways as such a combination. To pin down the senses we will interrelate the senses of different words: for example, relate the “article of clothing” sense tie1 with shoe, jacket, etc.

The word senses tie1, tie2, … correspond to “different things being talked about”; in other words, to different word distributions occurring around tie. Now recall that our earlier paper described in post 2 gives an interpretation of “what’s being talked about”: it is called discourse, and it is represented by a unit vector in the embedding space. In particular, the theoretical model of post 2 imagines a text corpus as being generated by a random walk on discourse vectors. When the walk is at a discourse $c_t$ at time $t$, it outputs a few words using a loglinear distribution:

$$\Pr[w \text{ emitted at time } t \mid c_t] \propto \exp(\langle c_t, v_w \rangle). \qquad (2)$$

One imagines there exists a “clothing” discourse that has high probability of outputting the tie1 sense, and also of outputting related words such as shoe, jacket, etc. Similarly there may be a “games/matches” discourse that has high probability of outputting tie2 as well as team, score etc.

By equation (2) the probability of a word being output by a discourse is determined by the inner product, so one expects the vector for the “clothing” discourse to have high inner product with all of shoe, jacket, tie1, etc., and thus it can stand in as a surrogate for $v_{tie1}$ in expression (1)! This motivates the following global optimization:

Given word vectors in $\Re^d$, totaling about $60,000$ in this case, a sparsity parameter $k$, and an upper bound $m$, find a set of unit vectors $A_1, A_2, \ldots, A_m$ such that

$$v_w = \sum_{j=1}^m \alpha_{w,j} A_j + \eta_w \qquad (3)$$

where at most $k$ of the coefficients $\alpha_{w,1},\dots,\alpha_{w,m}$ are nonzero (the so-called hard sparsity constraint), and $\eta_w$ is a noise vector.

Here $A_1, \ldots A_m$ represent important discourses in the corpus, which we refer to as atoms of discourse.

Optimization (3) is a surrogate for the desired expansion of $v_{tie}$ in (1) because one can hope that the atoms of discourse will contain atoms corresponding to clothing, sports matches etc. that will have high inner product (close to $1$) with tie1, tie2 respectively. Furthermore, restricting $m$ to be much smaller than the number of words ensures that each atom needs to be used for multiple words, e.g., reuse the “clothing” atom for shoes, jacket etc. as well as for tie.

Both $A_j$’s and $\alpha_{w,j}$’s are unknowns in this optimization. This is nothing but sparse coding, useful in neuroscience, image processing, computer vision, etc. It is nonconvex and computationally NP-hard in the worst case, but can be solved quite efficiently in practice using something called the k-SVD algorithm described in Elad’s survey, lecture 4. We solved this problem with sparsity $k=5$ and using $m$ about $2000$. (Experimental details are in the paper. Also, some theoretical analysis of such an algorithm is possible; see this earlier post.)
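As an illustration, here is a toy sparse-coding run. The paper uses k-SVD, which has no scikit-learn implementation; `DictionaryLearning` with OMP as the transform enforces the same hard sparsity constraint (at most $k$ nonzero coefficients per vector). The dimensions below are tiny placeholders, not the paper's settings:

```python
# Toy sparse coding on random "word vectors": learn m atoms such that
# each vector is approximated by a combination of at most k atoms.
import numpy as np
from sklearn.decomposition import DictionaryLearning

rng = np.random.default_rng(0)
n_words, d, m, k = 200, 30, 10, 5
V = rng.normal(size=(n_words, d))   # rows play the role of word vectors

dl = DictionaryLearning(n_components=m,
                        transform_algorithm='omp',       # hard sparsity
                        transform_n_nonzero_coefs=k,
                        max_iter=20, random_state=0)
alphas = dl.fit_transform(V)        # the alpha_{w,j} coefficients
atoms = dl.components_              # rows are the "atoms of discourse"

# Each word vector uses at most k atoms (the hard sparsity constraint).
print((np.count_nonzero(alphas, axis=1) <= k).all())
```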

## Experimental Results

Each discourse atom defines via (2) a distribution on words, which due to the exponential appearing in (2) strongly favors words whose embeddings have a larger inner product with it. In practice, this distribution is quite concentrated on as few as 50-100 words, and the “meaning” of a discourse atom can be roughly determined by looking at a few nearby words. This is how we visualize atoms in the figures below. The first figure gives a few representative atoms of discourse.

And here are the discourse atoms used to represent two polysemous words, tie and spring:

You can see that the discourse atoms do correspond to senses of these words.
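Concretely, the "nearby words" visualization is just a ranking by inner product with the atom vector. A minimal sketch, with a made-up five-word vocabulary and toy orthonormal vectors:

```python
# Visualize an atom by its nearest words: rank the vocabulary by inner
# product with the atom vector. Vocabulary and vectors are invented.
import numpy as np

def nearest_words(atom, word_vecs, vocab, topn=2):
    scores = word_vecs @ atom              # inner product with the atom
    order = np.argsort(-scores)[:topn]     # highest first
    return [vocab[i] for i in order]

vocab = ["shoe", "jacket", "score", "team", "season"]
word_vecs = np.eye(5)                      # toy orthonormal word vectors
clothing_atom = word_vecs[0] + word_vecs[1]  # a "clothing" direction

print(nearest_words(clothing_atom, word_vecs, vocab))  # ['shoe', 'jacket']
```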

Finally, we also have a technique that, given a target word, generates representative sentences according to its various senses as detected by the algorithm. Below are the sentences returned for ring. (N.B. The mathematical meaning was missing in WordNet but was picked up by our method.)

## A new testbed for testing comprehension of word senses

Many tests have been proposed to measure an algorithm’s grasp of word senses. They often involve hard-to-understand metrics such as distance in WordNet, or are tied to performance on specific applications like web search.

We propose a new, simple test, inspired by the word-intrusion tests for topic coherence of Chang et al. 2009, which has the advantage of being easy to understand and can also be administered to humans.

We created a testbed using 200 polysemous words and their 704 senses according to WordNet. Each “sense” is represented by a set of 8 related words; these were collected from WordNet and online dictionaries by college students, who were told to identify the most relevant other words occurring in the online definitions of the word sense as well as in the accompanying illustrative sentences. These 8 words are treated as the ground-truth representation of the word sense: e.g., for the “tool/weapon” sense of axe they were: handle, harvest, cutting, split, tool, wood, battle, chop.

Police line-up test for word senses: the algorithm is given a random one of these 200 polysemous words and a set of $m$ senses, which contains the true senses of the word as well as some distractors, which are randomly picked senses from other words in the testbed. The test taker has to identify the word’s true senses among these $m$ senses.

As usual, accuracy is measured using precision (what fraction of the algorithm/human’s guesses were correct) and recall (how many correct senses were among the guesses).
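On a hypothetical line-up these two metrics look like the following (the sense labels are invented purely for illustration):

```python
# Precision and recall for the line-up test: the test taker guesses a
# set of senses and is scored against the true senses.
def precision_recall(guessed, true):
    correct = len(guessed & true)
    return correct / len(guessed), correct / len(true)

guessed = {"clothing", "sports match", "rope"}   # made-up sense labels
true = {"clothing", "sports match"}

p, r = precision_recall(guessed, true)
print(p, r)  # precision 2/3 (one wrong guess), recall 1.0 (all senses found)
```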

For $m=20$ and $k=4$, our algorithm succeeds with precision $63\%$ and recall $70\%$, and performance remains reasonable for $m=50$. We also administered the test to a group of grad students. Native English speakers had precision/recall scores in the $75$ to $90$ percent range. Non-native speakers had scores roughly similar to our algorithm.

Our algorithm works something like this: If $w$ is the target word, then take all discourse atoms computed for that word, and compute a certain similarity score between each atom and each of the $m$ senses, where the words in the senses are represented by their word vectors. (Details are in the paper.)

## Takeaways

Word embeddings have been useful in a host of other settings, and now it appears that they also can easily yield different senses of a polysemous word. We have some subsequent applications of these ideas to other previously studied settings, including topic models, creating WordNets for other languages, and understanding the semantic content of fMRI brain measurements. I’ll describe some of them in future posts.

## August 24, 2016

### Fefe

#### North Korea has now demonstrated a missile launch from a ...

North Korea has now demonstrated a missile launch from a submarine. That is an important signal for North Korea; it marks the point from which no country can afford to attack North Korea anymore. The message is: even if you wipe us out completely in a nuclear strike, we can still flatten you in retaliation from our submarines.

Granted, nobody was planning to wipe out North Korea anyway. But I suspect all the paranoiacs in North Korea will sleep better from now on.

### StackOverflow

#### Apply a list of Python functions in order elegantly

I have an input value val and a list of functions to be applied in the order:

funcs = [f1, f2, f3, ..., fn]


How do I apply them elegantly, without writing

fn( ... (f3(f2(f1(val))) ... )


and also without using a for loop:

tmp = val
for f in funcs:
    tmp = f(tmp)
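A common idiom that avoids both the nesting and the explicit loop, shown here as a sketch, is `functools.reduce`, which folds the value through the functions in order:

```python
# Thread a value through a list of functions using functools.reduce:
# the accumulator is passed to each function in turn.
from functools import reduce

def apply_all(val, funcs):
    return reduce(lambda acc, f: f(acc), funcs, val)

funcs = [lambda x: x + 1, lambda x: x * 2, str]
print(apply_all(3, funcs))  # str((3 + 1) * 2) -> "8"
```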


### QuantOverflow

#### IR Swaps - Curve sensitivity at maturity node

I was recently trying to price some IR swaps in BBG. I noticed that when I shock the yield curve up by 1bps at a single specific node, the DV01 is close to zero except at the node nearest the maturity. Nearly 100% of the DV01 for a parallel shift comes from the shock to the node near maturity.

I don't really understand this, since I would expect every node to have similar risk, perhaps slightly increasing the further away you are.

I see this trend with every IR Swap that I look at.

Clearly I am missing some understanding of the exposure of IR Swaps, could anyone here help me?

Thanks!

Note: I'm looking at the combined legs in this case.

### StackOverflow

#### How to inject dependencies in Aurelia without using ES6 class feature

How do I inject dependencies when exporting a function instead a class?

#### What impact does vocabulary_size have on word2vec tensorflow implementation?

I've performed the steps in this guide to generate a vector representation of words.

Now I'm running word2vec on a custom dataset of 45,000 words.

To run I modified word2vec_basic.py to use my own dataset by modifying https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/tutorials/word2vec/word2vec_basic.py#L57 to words = read_data('mytextfile.zip')

I encountered an issue similar to https://github.com/tensorflow/tensorflow/issues/2777 and so reduced the vocabulary_size to 200. It now runs, but the results do not appear to be capturing the context. For example, here is a sample output:

Nearest to Leave: Employee, it, •, due, You, appeal, Employees, which,


What can I infer from this output ? Will increasing/decreasing vocabulary_size improve results ?

I'm using Python 3, so to run it I use python3 word2vec_basic2.py.

#### Image clustering by its similarity in python

I have a collection of photos and I'd like to distinguish clusters of the similar photos. Which features of an image and which algorithm should I use to solve my task?

### QuantOverflow

#### How to predict VaR changes on a DoD basis?

I am trying to predict change in VaR on a DoD basis. So let's say at t=0, I have my VaR based on full valuation. On t=1, I will have another VaR based on full valuation. I am trying to predict this VaR on t=1 without full valuations. I also have Composite VaR and Incremental VaR at t=0. At t=1, I will also know how my risk factors have changed and if there is any new or dropped trades in my portfolio as well.

What will be the best way to proceed? Links to any reference material will also be highly appreciated.

### CompsciOverflow

#### reinforcement learning in gridworld with subgoals

Andrew Ng, Daishi Harada, Stuart Russell published a conference paper entitled Policy Invariance Under Reward Transformations: Theory and Application to Reward Shaping.

There is a specific example there that I am extremely curious/interested about. It is in Figure 2(a) of the paper:

It is about a 5x5 gridworld with start and goal states in opposite corners. The catch is, the agent must learn to go from start to end BY VISITING specific subgoals 1,2,3,4 IN ORDER.

Has anyone seen/understood the code for this? I want to know how the reward function/shaping is given in this kind of problem.

I am interested to know how the flow of this modification to the grid world is written.

### Lobsters

#### OpenPiton - open source research processor

OpenSPARC core scaled up

### UnixOverflow

#### No touchpad on freebsd on an acer aspire11

First time using FreeBSD. I installed it on an Acer Aspire V11. Neither the touchpad nor the touchscreen works. From what I understand there should be a device /dev/psm0 that represents the touchpad, but there is none.

Searching for how to get a touchpad working gives results that all seem to assume psm0 exists. Guides for Linux suggest setting i8042.nopnp as a kernel option, but I don't know what the equivalent here would be.

### StackOverflow

#### Update a field in an Elm-lang record via dot function?

Is it possible to update a field in an Elm record via a function (or some other way) without explicitly specifying the precise field name?

Example:

> fields = { a = 1, b = 2, c = 3 }
> updateField fields newVal fieldToUpdate = { fields | fieldToUpdate <- newVal }
> updateField fields 5 .a -- does not work


### UPDATE:

To add some context, I'm trying to DRY up the following code:

UpdatePhraseInput contents ->
    let currentInputFields = model.inputFields
    in { model | inputFields <- { currentInputFields | phrase <- contents }}

UpdatePointsInput contents ->
    let currentInputFields = model.inputFields
    in { model | inputFields <- { currentInputFields | points <- contents }}


Would be really nice if I could call a mythical updateInput function like this:

UpdatePhraseInput contents -> updateInput model contents .phrase
UpdatePointsInput contents -> updateInput model contents .points


#### Tensorflow Training and Validation Input Queue Separation

I tried to replicate the Fully Convolutional Network results using TensorFlow. I used Marvin Teichmann's implementation from GitHub, https://github.com/MarvinTeichmann/tensorflow-fcn; I only needed to write the training wrapper. I create two graphs that share variables and two input queues, one for training and one for validation. To test my training wrapper, I used two short lists of training and validation files, and I run a validation immediately after every training epoch. I also printed out the shape of every image from the input queue to check whether I get the correct input. However, after I started the training, it seems that only images from the training queue are being dequeued, so both the training and validation graphs take input from the training queue and the validation queue is never accessed. Can anyone help explain and solve this problem?

Here's part of the relevant code:

def get_data(image_name_list, num_epochs, scope_name, num_class = NUM_CLASS):
    with tf.variable_scope(scope_name) as scope:
        images_path = [os.path.join(DATASET_DIR, i+'.jpg') for i in image_name_list]
        gts_path = [os.path.join(GT_DIR, i+'.png') for i in image_name_list]
        seed = random.randint(0, 2147483647)
        image_name_queue = tf.train.string_input_producer(images_path, num_epochs=num_epochs, shuffle=False, seed=seed)
        gt_name_queue = tf.train.string_input_producer(gts_path, num_epochs=num_epochs, shuffle=False, seed=seed)
        # (the reader ops that produce image_value and gt_value are omitted from the question)
        my_image = tf.image.decode_jpeg(image_value)
        my_image = tf.cast(my_image, tf.float32)
        my_image = tf.expand_dims(my_image, 0)
        # gt stands for ground truth
        my_gt = tf.cast(tf.image.decode_png(gt_value, channels=1), tf.float32)
        my_gt = tf.one_hot(tf.cast(my_gt, tf.int32), NUM_CLASS)
    return my_image, my_gt

train_image, train_gt = get_data(train_files, NUM_EPOCH, 'training')
val_image, val_gt = get_data(val_files, NUM_EPOCH, 'validation')

with tf.variable_scope('FCN16') as scope:
    train_vgg16_fcn = fcn16_vgg.FCN16VGG()
    train_vgg16_fcn.build(train_image, train=True, num_classes=NUM_CLASS, keep_prob=KEEP_PROB)
    scope.reuse_variables()
    val_vgg16_fcn = fcn16_vgg.FCN16VGG()
    val_vgg16_fcn.build(val_image, train=False, num_classes=NUM_CLASS, keep_prob=1)

"""
Define the loss, evaluation metric, summary, saver in the computation graph. Initialize variables and start a session.
"""

for epoch in range(starting_epoch, NUM_EPOCH):
    for i in range(train_num):
        _, loss_value, shape = sess.run([train_op, train_entropy_loss, tf.shape(train_image)])
        print shape
    for i in range(val_num):
        loss_value, shape = sess.run([val_entropy_loss, tf.shape(val_image)])
        print shape


### TheoryOverflow

#### Something-Treewidth Property

Let $s$ be a graph parameter (e.g. diameter, domination number, etc.).

A family $\mathcal{F}$ of graphs has the $s$-treewidth property if there is a function $f$ such that for any graph $G\in \mathcal{F}$, the treewidth of $G$ is at most $f(s(G))$.

For instance, let $s = \mathit{diameter}$, and $\mathcal{F}$ be the family of planar graphs. Then it is known that any planar graph of diameter at most $s$ has treewidth at most $O(s)$. More generally, Eppstein showed that a family of graphs has the diameter-treewidth property if and only if it excludes some apex graph as a minor. Examples of such families are graphs of constant genus, etc.

As another example, let $s = \mathit{domination{-}number}$. Fomin and Thilikos proved an analogue of Eppstein's result by showing that a family $\mathcal{F}$ of graphs has the domination-number-treewidth property if and only if $\mathcal{F}$ has bounded local treewidth. Note that this happens if and only if $\mathcal{F}$ has the diameter-treewidth property.

Questions:

1. For which graph parameters $s$ is the $s$-treewidth property known to hold on planar graphs?
2. For which graph parameters $s$ is the $s$-treewidth property known to hold on graphs of bounded local-treewidth?
3. Are there any other families of graphs, not comparable to graphs of bounded local-treewidth for which the $s$-treewidth property holds for some suitable parameter $s$?

I have a feeling that these questions have some relation with the theory of bidimensionality. Within this theory, there are several important parameters. For instance, the sizes of feedback vertex set, vertex cover, minimum maximal matching, face cover, dominating set, edge dominating set, R-dominating set, connected dominating set, connected edge dominating set, connected R-dominating set, etc.

1. Does any parameter $s$ encountered in bidimensionality theory have the $s$-treewidth property for some suitable family of graphs?

### CompsciOverflow

#### How to prove that the reversal of the concatenation of two strings is the concatenation of the reversals?

Given languages $L_1$ and $L_2$, how do we prove that $$(L_1L_2)^{\mathrm{rev}} = (L_2^{\mathrm{rev}})(L_1^{\mathrm{rev}})\,,$$ where ${}^{\mathrm{rev}}$ denotes reversal?

I think it can be shown by mathematical induction: first prove it for strings of length $1$, then induct on one of the lengths, and then repeat with the other length. Is there a simpler way of proving this without the induction becoming so large?
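For what it's worth, a single induction on the length of the second string suffices at the string level, and the language identity then follows pointwise. A sketch:

```latex
\textbf{Strings.} Induct on $|y|$.
Base case: $(x\varepsilon)^{\mathrm{rev}} = x^{\mathrm{rev}}
  = \varepsilon^{\mathrm{rev}}\, x^{\mathrm{rev}}$.
Inductive step: write $y = za$ for a string $z$ and a symbol $a$; then
\[
(xy)^{\mathrm{rev}} = \big((xz)a\big)^{\mathrm{rev}}
  = a\,(xz)^{\mathrm{rev}}
  = a\,z^{\mathrm{rev}}x^{\mathrm{rev}}
  = (za)^{\mathrm{rev}}x^{\mathrm{rev}}
  = y^{\mathrm{rev}}x^{\mathrm{rev}},
\]
using the definition $(wa)^{\mathrm{rev}} = a\,w^{\mathrm{rev}}$ and the
induction hypothesis on $z$.
\textbf{Languages.} $(L_1L_2)^{\mathrm{rev}}
  = \{(xy)^{\mathrm{rev}} : x \in L_1,\ y \in L_2\}
  = \{y^{\mathrm{rev}}x^{\mathrm{rev}} : x \in L_1,\ y \in L_2\}
  = L_2^{\mathrm{rev}}L_1^{\mathrm{rev}}$.
```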

### Fefe

#### At the thought that some beaches in southern France ...

At the thought that some beaches in southern France are hunting down burka wearers and imposing fines, certain Nazi comparisons come to mind for some people (myself included): the kind of marauding skinhead mobs that want to go "beat up the foreigners". So what does this look like in practice?

Quote:

The French ban on the burkini is threatening to turn into a farce as police officers armed with pepper spray and batons marched onto a beach today and ordered a woman to strip off.

Four burly cops stood over the middle-aged woman, who had been quietly sunbathing on the Promenade des Anglais beach in Nice - yards from the scene of the Bastille Day lorry attack - and watched her take off a Muslim-style garment which protected her modesty.

Now, some caution is called for, because the Daily Mail is a) a sinister tabloid rag, b) conservative and British, with the French as its traditional bogeyman, but on the other hand c) itself likes to rant against "the terrorists". Still, I think the pictures speak for themselves.

And as a bonus, the picture of the nuns at the beach has surfaced, the one that recently cost the Italian imam his Facebook account.

Update: And in case anyone doesn't understand the burka debate in the first place: here is some help.

Update: A reader writes:

I'm in Nice right now. The beach where this happened is probably a ten-minute walk from me. You see the clothing shown in the photo every now and then. But over the past few days I never noticed anyone staring or the police showing up.

Yesterday we were at a beach in a suburb of Nice, and neither the police nor the other beachgoers cared that a mother was wearing this clothing. The police merely pointed out the smoking ban at that beach to a few smokers, though without issuing any fines, and the cigarettes were lit again as soon as the patrol was out of sight. Just to put things in perspective.

Update: Here is another good piece on the burkini debate. Money quote:

“Over 40 percent of our sales are from non-Muslim women,” she says. “The Jewish community embraces it. I’ve seen Mormons wearing it. A Buddhist nun purchased it for all of her friends. I’ve seen women who have issues with skin cancer or body image, moms, women who are not comfortable exposing their skin — they’re all wearing it.
There's just one detail I don't understand about it. Why lie in the sun in a burkini? Isn't the point of sunbathing that the sun reaches your skin? And isn't that exactly what the burkini prevents? Not that I would want to dictate to anyone whether and how to lie on the beach, but that detail escapes me.

### CompsciOverflow

#### Finding a value in a sorted array in log R time, R is the number of distinct elements

The standard binary search algorithm takes log N time, where N is the total number of elements in the array. When the array has duplicates, I don't see how you could detect those duplicates ahead of time. (Iterating through the array takes N time, which is too much.) So how do you improve the performance from log N to log R?

#### If you have a Cook reduction in both directions, do you also have a Karp reduction?

If there exists a polynomial reduction of a decision problem $\mathcal{P}_1$ into another decision problem $\mathcal{P}_2$ and also a polynomial reduction of $\mathcal{P}_2$ into $\mathcal{P}_1$, then is there also a polynomial transformation between $\mathcal{P}_1$ and $\mathcal{P}_2$?

These are the definitions I use:

Cook reduction
$\mathcal{P}_1$ polynomially reduces to $\mathcal{P}_2$ if there is a polynomial-time oracle algorithm for $\mathcal{P}_1$ using an oracle for $\mathcal{P}_2$.

Karp reduction
$\mathcal{P}_1=(X_1,Y_1)$ polynomially transforms to $\mathcal{P}_2=(X_2,Y_2)$ if there is a function $f:X_1\rightarrow X_2$ computable in polynomial time such that for all $x\in X_1$, $x\in Y_1$ if and only if $f(x)\in Y_2$.

### Planet Theory

#### Proceedings of ICALP 2016

The proceedings of ICALP 2016 are now available from the LIPIcs web site. Many thanks to all the colleagues who have worked so hard to make this possible.

I hope that many of you will read the papers in the proceedings, which were selected by Davide, Michael, Yuval and their PCs, and build on their research contributions.

### StackOverflow

#### About how to balance imbalanced data

When I read Decision Tree in Scikit learn, I find:

Balance your dataset before training to prevent the tree from being biased toward the classes that are dominant. Class balancing can be done by sampling an equal number of samples from each class, or preferably by normalizing the sum of the sample weights (sample_weight) for each class to the same value.

I am confused.

(1)

Class balancing can be done by sampling an equal number of samples from each class

If I do it like this, should I also add a proper sample weight for each sample in each class (or add class sample...)?

For example, if I have two classes: A and B with number of samples

A:100 B:10000

Can I input 10000 samples for each and set weight:

input samples of A:10000, input samples of B:10000

weight of A:0.01 , weight of B: 1.0

(2)

But it still said:

preferably by normalizing the sum of the sample weights (sample_weight) for each class to the same value

I'm totally confused by it. Does it mean I should input 100 samples of A and 10000 samples of B and then set weights:

input samples of A:100, input samples of B:10000

weight of A:1.0 , weight of B: 1.0

But it seems I did nothing to balance the imbalanced data.

Which way is better and what's the meaning of second way in Scikit learn? Can anyone help me clarify it?

### StackOverflow

#### Training Hidden Markov Model in R

Is it possible to train a Hidden Markov Model in R? I have a set of observations with their corresponding labels, and I need to train an HMM to obtain the Markov parameters (i.e. the transition probability matrix, the emission probability matrix, and the initial distribution), so that I can make predictions for future observations.

In other words, I need the opposite of the forward-backward algorithm.

### Fred Wilson

#### Trapped In A System

A book that has really stayed with me since I read it is The Prize, the story of the attempt to reform the Newark public school system.

And there is a particular scene in that book that really sums it up for me.

The author is at an anti-charter school protest and meets a woman who had spent that morning trying to get her son into a new charter school that had opened in Newark. The author asks the woman how it is possible that on the same day she would spend the morning trying to get her son into a charter school and the afternoon at an anti-charter protest.

The woman explains that most of her family are employed in good paying union jobs in the district schools and that the growth of charters is a threat to those jobs.

As I read that story I was struck by how rational the woman was acting. She was helping to preserve a system that provided an economic foundation for her family and at the same time opting her son out of it.

In some ways that story is a microcosm of what is happening in the economy right now. Many people in the US (and around the world) are employed by (and trapped in) a system that no longer works very well. And although they realize the system is broken, they fight to support it because it underpins their economic security.

My partner Albert argues for a universal basic income to replace the old and broken system so we as a society can free ourselves from outdated approaches that don’t work anymore and move to adopt new and better systems.

I think it is worth a shot to be honest.

### QuantOverflow

#### What is a Short Option Hedging Portfolio?

In his book 'Stochastic Calculus for Finance II', Shreve uses the term 'Short Option Hedging Portfolio' on page 156 (4.5.3). Can someone please explain this term with some kind of an example? It is preventing me from understanding why the portfolio value evolution is equated with the option value evolution to derive the Black-Scholes-Merton differential equation.

Thanks!

### StackOverflow

#### Tensorflow multi-variable logistic regression not working

I am trying to create a program which will classify a point as either 1 or 0 using Tensorflow. I am trying to create an oval shape around the center of this plot, where the blue dots are:

Everything in the oval should be classified as 1, every thing else should be 0. In the graph above, the blue dots are 1s and the red x's are 0s.

However, every time I try to classify a point, it always chooses 1, even for points I trained it with that were labeled 0.

My question is simple: Why is the guess always 1, and what am I doing wrong or should do differently to fix this problem? This is my first machine learning problem I have tried without a tutorial, so I really don't know much about this stuff.

Here's my code:

#!/usr/bin/env python3

import tensorflow as tf
import numpy
import matplotlib.pyplot as plt

training_in = numpy.array([[0, 0], [1, 1], [2, 0], [-2, 0], [-1, -1], [-1, 1], [-1.5, 1], [3, 3], [3, 0], [-3, 0], [0, -3], [-1, 3], [1, -2], [-2, -1.5]])
training_out = numpy.array([1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0])

def transform_data(x):
    return [x[0], x[1], x[0]**2, x[1]**2, x[0]*x[1]]

new_training_in = numpy.apply_along_axis(transform_data, 1, training_in)

feature_count = new_training_in.shape[1]

x = tf.placeholder(tf.float32, [None, feature_count])
y = tf.placeholder(tf.float32, [None, 1])

W = tf.Variable(tf.zeros([feature_count, 1]))
b = tf.Variable(tf.zeros([1]))

guess = tf.nn.softmax(tf.matmul(x, W) + b)

cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(tf.matmul(x, W) + b, y))

# (the definition of the training op `opti` is missing from the question)
init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)

for i in range(1000):
    for (item_x, item_y) in zip(new_training_in, training_out):
        sess.run(opti, feed_dict={x: [item_x], y: [[item_y]]})

print(sess.run(W))
print(sess.run(b))

plt.plot(training_in[:6, 0], training_in[:6, 1], 'bo')
plt.plot(training_in[6:, 0], training_in[6:, 1], 'rx')

results = sess.run(guess, feed_dict={x: new_training_in})

for i in range(training_in.shape[0]):
    xx = [training_in[i:, 0]]
    yy = [training_in[i:, 1]]
    res = results[i]

    # this always prints [ 1.]
    print(res)

    # uncomment these lines to see the guesses
    # if res[0] == 0:
    #     plt.plot(xx, yy, 'c+')
    # else:
    #     plt.plot(xx, yy, 'g+')

plt.show()


### QuantOverflow

#### Optimal Portfolio Asset Weights

I have calculated an optimal portfolio, using a historical covariance matrix, and determined the weights of n risky assets in the optimal portfolio.

The utility function is represented by U=E(R)-0.5*A*(variance).

I am wondering: what makes certain assets receive high weights, and what makes others receive low weights?
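For the unconstrained problem, maximizing $U = E(R) - 0.5\,A\,\sigma^2$ gives $w^* = \frac{1}{A}\Sigma^{-1}\mu$ (set the gradient $\mu - A\Sigma w$ to zero): weights are high for assets with high expected return, low variance, and low correlation with the rest. A sketch with hypothetical inputs:

```python
import numpy as np

# hypothetical expected returns and covariance matrix for 3 assets
mu = np.array([0.08, 0.05, 0.03])
Sigma = np.array([[0.04, 0.01, 0.00],
                  [0.01, 0.02, 0.00],
                  [0.00, 0.00, 0.01]])
A = 4.0  # risk-aversion coefficient

# unconstrained maximiser of U = w.mu - 0.5 * A * w.Sigma.w
w = np.linalg.solve(Sigma, mu) / A
print(w)
```

With a budget constraint (weights summing to one) or no-short-sale constraints, the solution changes, but the same intuition about return, variance, and correlation carries over.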

### StackOverflow

#### Best range of parameters in grid search?

I would like to run a naive implementation of grid search with MLlib, but I am a bit confused about choosing the 'best' ranges of parameters. Obviously, I do not want to waste too many resources on a combination of parameters that will probably not give an improved model. Any suggestions from your experience?

// set parameter ranges:

val intercept   : List[Boolean]  = List(false)
val classes     : List[Int]      = List(2)
val validate    : List[Boolean]  = List(true)
val tolerance   : List[Double]   = List(0.0000001, 0.000001, 0.00001, 0.0001, 0.001, 0.01, 0.1, 1.0)
val corrections : List[Int]      = List(5, 10, 15)
val iters       : List[Int]      = List(1, 10, 100, 1000, 10000)
val regparam    : List[Double]   = List(0.0, 0.0001, 0.001, 0.01, 0.1, 1.0, 10.0, 100.0)
val updater     : List[Updater]  = List(new SimpleUpdater(), new L1Updater(), new SquaredL2Updater())


// perform grid search:

val combinations = for (a <- intercept;
                        b <- classes;
                        c <- validate;
                        d <- tolerance;
                        e <- corrections;
                        f <- iters;
                        g <- regparam;
                        h <- updater) yield (a, b, c, d, e, f, g, h)

for ((interceptS, classesS, validateS, toleranceS, correctionsS, itersS, regParamS, updaterS) <- combinations.take(3)) {

  val lr: LogisticRegressionWithLBFGS = new LogisticRegressionWithLBFGS().
    setNumClasses(numClasses = classesS).
    setValidateData(validateData = validateS)

  lr.
    optimizer.
    setConvergenceTol(tolerance = toleranceS).
    setNumCorrections(corrections = correctionsS).
    setRegParam(regParam = regParamS).
    setUpdater(updater = updaterS)

}


### QuantOverflow

#### Open source equity/bond index data

I have been using the tseries package of R (get.hist.quote) to get historical quotes for various indices from yahoo finance. I am interested in DAX, VDAX, EB.REXX and DJ UBS Commodity Index. When I tried to expand the time window for my analyses I saw that all time series except DAX and VDAX are discontinued.

My questions:

1) Do you know why EB.REXX (the symbol was REX.DE) disappeared from yahoo finance (I now use EB.REXX 10 years, REX0.DE, but it is also discontinued), and why I cannot find the DJ UBS Cdty Index (symbol: ^DJUBS) anymore?

I use code like

require(tseries)

get.hist.quote(instrument="REX0.DE", start="2006-01-01", quote=c("AdjClose"), compression="d")
get.hist.quote(instrument="^DJUBS", start="2006-01-01", quote=c("AdjClose"), compression="d")

but both time series end in the second half of 2012.

2) Do you know any R-compatible open data source where I can get

1. a price or performance index for German or core-EURO government bonds (like eb.rexx)
2. a price or performance index for broad commodities (like DJ UBS Cdty Index)?

EDIT: I started to try getSymbols from the quantmod package.

1. In google finance I found INDEXDB:RXPG for EB.REXX and INDEXDJX:DJUBS for DJ UBS - are these the correct indices? Where do I find any description of the data?
2. The example taken from the manual - getSymbols("MSFT",src="google") - works, but what I would need for the index data - getSymbols("INDEXDB:RXPG",src="google") - does not ...

### StackOverflow

#### How to add another feature (length of text) to current bag of words classification? Scikit-learn

I am using bag of words to classify text. It's working well but I am wondering how to add a feature which is not a word.

Here is my sample code.

import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import LinearSVC
from sklearn.feature_extraction.text import TfidfTransformer
from sklearn.multiclass import OneVsRestClassifier

X_train = np.array(["new york is a hell of a town",
                    "new york was originally dutch",
                    "new york is also called the big apple",
                    "nyc is nice",
                    "the capital of great britain is london. london is a huge metropolis which has a great many number of people living in it. london is also a very old town with a rich and vibrant cultural history.",
                    "london is in the uk. they speak english there. london is a sprawling big city where it's super easy to get lost and i've got lost many times.",
                    "london is in england, which is a part of great britain. some cool things to check out in london are the museum and buckingham palace.",
                    "london is in great britain. it rains a lot in britain and london's fogs are a constant theme in books based in london, such as sherlock holmes. the weather is really bad there."])
y_train = [[0], [0], [0], [0], [1], [1], [1], [1]]

X_test = np.array(["it's a nice day in nyc",
                   "i loved the time i spent in london, the weather was great, though there was a nip in the air and i had to wear a jacket."])
target_names = ['Class 1', 'Class 2']

classifier = Pipeline([
    ('vectorizer', CountVectorizer(min_df=1, max_df=2)),
    ('tfidf', TfidfTransformer()),
    ('clf', OneVsRestClassifier(LinearSVC()))])
classifier.fit(X_train, y_train)
predicted = classifier.predict(X_test)
for item, labels in zip(X_test, predicted):
    print '%s => %s' % (item, ', '.join(target_names[x] for x in labels))


Now it is clear that the text about London tends to be much longer than the text about New York. How would I add length of the text as a feature? Do I have to use another way of classification and then combine the two predictions? Is there any way of doing it along with the bag of words? Some sample code would be great -- I'm very new to machine learning and scikit learn.
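A common pattern (which scikit-learn's `FeatureUnion` with a small custom transformer automates inside a `Pipeline`) is to horizontally stack extra numeric columns, such as token count, next to the bag-of-words matrix. A minimal numpy sketch of the idea, with a toy vocabulary standing in for `CountVectorizer`:

```python
import numpy as np

docs = ["new york is a hell of a town", "london is a huge metropolis"]

# toy bag-of-words: counts over a fixed vocabulary (stand-in for CountVectorizer)
vocab = ["new", "york", "london", "is", "a"]
bow = np.array([[d.split().count(w) for w in vocab] for d in docs], dtype=float)

# extra feature: document length in tokens
length = np.array([[len(d.split())] for d in docs], dtype=float)

# final design matrix: word counts plus the length column
X = np.hstack([bow, length])
print(X.shape)  # (2, 6)
```

In practice the length column should be scaled (tf-idf values are small, raw lengths are not), e.g. with `StandardScaler`, so the SVM does not over-weight it.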

### QuantOverflow

#### How much to invest to reach a target?

Your current wealth is $W$. Each day you can invest some of it; there's a probability $p$ that you will win as much as you invested, $1-p$ that you will lose it. You want to reach a target wealth $W_T$ within $n$ days. Each day, you can choose the fraction $f$ of your wealth to invest. How do you choose $f$ to maximise the chance to hit your target in time?

If it helps, assume $p > 0.5$, $n \gg 1$.

This is essentially a pure maths problem but I thought it would be interesting for quants. I have seen discussions of similar problems (e.g. "Can you do better than Kelly in the short run?", Browne (2000)), but they assume a continuous outcome and a few other things. I'd also be happy with a way to find $f$ via simulations, an analytical formula is not essential.

[Edit: you cannot bet more than you currently have. I should have specified this earlier.]
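One way to explore the problem numerically, as the question allows, is Monte Carlo over constant betting fractions. The true optimum is wealth- and time-dependent, so this only ranks constant-$f$ strategies, but it gives a baseline. A sketch with hypothetical parameter values:

```python
import random

def hit_prob(f, p=0.6, W=1.0, WT=2.0, n=50, trials=2000, seed=0):
    """Estimate P(reach WT within n days) betting a constant fraction f of wealth."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        w = W
        for _ in range(n):
            stake = f * w              # cannot exceed current wealth since f <= 1
            w += stake if rng.random() < p else -stake
            if w >= WT:
                hits += 1
                break
    return hits / trials

# scan a few constant fractions
for f in (0.1, 0.2, 0.4, 0.8):
    print(f, hit_prob(f))
```

The exact optimal policy can be computed by backward induction over (wealth, days left), but a scan like this is often enough to see the trade-off between aggressive and cautious fractions.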

### StackOverflow

#### Difference between standardscaler and Normalizer in sklearn.preprocessing

What is the difference between StandardScaler and Normalizer in the sklearn.preprocessing module? Don't both do the same thing, i.e. remove the mean and scale using the deviation?
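They do not do the same thing: `StandardScaler` standardizes each feature (column) to zero mean and unit variance, while `Normalizer` rescales each sample (row) to unit norm. What the two compute, sketched in plain numpy (the sklearn classes add options such as `with_mean` and the choice of norm):

```python
import numpy as np

X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0]])

# StandardScaler: per-column (feature-wise) standardization
scaled = (X - X.mean(axis=0)) / X.std(axis=0)

# Normalizer: per-row (sample-wise) rescaling to unit L2 norm
normed = X / np.linalg.norm(X, axis=1, keepdims=True)

print(scaled.mean(axis=0))             # ~[0, 0]
print(np.linalg.norm(normed, axis=1))  # [1, 1, 1]
```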

#### Should a neural network be able to have a perfect train accuracy?

The title says it all: Should a neural network be able to have a perfect train accuracy? Mine saturates at ~0.9 accuracy and I am wondering if that indicates a problem with my network or the training data.

Training instances: ~4500 sequences with an average length of 10 elements. Network: Bi-directional vanilla RNN with a softmax layer on top.

#### A function that can both compose and chain (dot notation) in Javascript

I'm trying to convert an old API that uses a lot of dot-notation chaining, which needs to be kept, i.e.:

[1,2,3,4].newSlice(1,2).add(1) // [3]


I'd like to add the functional style of composition. In this example I use Ramda, but lodash or others would be fine:

const sliceAddOne = R.compose(add(1), newSlice(1,2))


My question is: how can I support both chaining and composition in my function newSlice? What would this function look like?

I have a little jsBin example.

### QuantOverflow

#### Conditional Expected Shortfall

I would like to know how to forecast the conditional mean.

I have fitted an AR(1)-GARCH(1,1) model to my data and want to estimate the conditional expected shortfall.
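For an AR(1)-GARCH(1,1) fit, the one-step conditional mean is $\mu_{t+1} = c + \phi r_t$ and the conditional variance follows the GARCH recursion $\sigma^2_{t+1} = \omega + \alpha\,\varepsilon_t^2 + \beta\,\sigma^2_t$; assuming normal innovations, the lower-tail expected shortfall at level $q$ is $\mu - \sigma\,\varphi(z_q)/q$ with $z_q = \Phi^{-1}(q)$. A sketch with hypothetical fitted parameter values:

```python
from statistics import NormalDist
import math

# hypothetical fitted AR(1)-GARCH(1,1) parameters
c, phi = 0.01, 0.30                    # AR(1): r_t = c + phi*r_{t-1} + eps_t
omega, alpha, beta = 0.02, 0.10, 0.85  # GARCH(1,1) on eps_t

def one_step(r_t, eps_t, sigma2_t, q=0.05):
    """One-step-ahead conditional mean, volatility, and normal ES at level q."""
    mu_next = c + phi * r_t                              # conditional mean
    sigma_next = math.sqrt(omega + alpha * eps_t**2 + beta * sigma2_t)
    nd = NormalDist()
    z_q = nd.inv_cdf(q)                                  # q-quantile of N(0,1)
    es = mu_next - sigma_next * nd.pdf(z_q) / q          # lower-tail ES of returns
    return mu_next, sigma_next, es

print(one_step(r_t=0.02, eps_t=0.01, sigma2_t=0.03))
```

If the fitted innovations are Student-t rather than normal, the ES formula changes accordingly; the conditional mean and variance recursions stay the same.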

### Fefe

#### Wikileaks has just made a massive blunder and published personal ...

Wikileaks has just made a massive blunder and published personal data. With that, they have for the first time actually put people in danger, which is what the Pentagon has been accusing them of for years without being able to document even a single such case.

What do the pirates in Asterix say? Sic transit gloria mundi. Sigh.

### QuantOverflow

#### Non-contractual accounts behavioural study

I need to carry out a non-contractual accounts behavioural study for a bank. The objective is to estimate core/non-core ratios and then bucket and FTP them. Any recipe for where to start? I have 3 years of historical data, daily closing balances. From what I googled I understand that I need some kind of seasonal vs. growth-trend segregation, but I have found only guidelines, nothing specific. Visually, my data (e.g. current accounts) has a very heavy seasonal bias, with highs in the shoulder seasons and lows in the festive seasons ;)). How do I isolate it? How do I then calculate the true core/volatile ratio?
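A first pass at isolating the seasonal component is classical decomposition: remove a moving-average trend, average the detrended series by calendar phase, and treat the residual around trend-plus-seasonal as the volatile (non-core) part. A hedged numpy sketch on synthetic data with a 12-period season (real daily balances would use a yearly period):

```python
import numpy as np

period = 12
t = np.arange(10 * period)
rng = np.random.default_rng(0)
# synthetic balances: growth trend + seasonal cycle + noise (stand-in for real data)
x = 100 + 0.5 * t + 10 * np.sin(2 * np.pi * t / period) + rng.normal(0, 1, t.size)

# trend: centred moving average over one full period
kernel = np.ones(period) / period
trend = np.convolve(x, kernel, mode="valid")
xc = x[period // 2 : period // 2 + trend.size]  # align with window centres
tc = t[period // 2 : period // 2 + trend.size]

# seasonal: average detrended value at each phase of the cycle
detrended = xc - trend
seasonal = np.array([detrended[tc % period == p].mean() for p in range(period)])

# residual: the volatile, non-seasonal part around trend + seasonal
resid = detrended - seasonal[tc % period]
print(resid.std())
```

The core balance is then read off the trend (e.g. a low percentile of trend-plus-seasonal), and the residual volatility sizes the non-core bucket; statsmodels' `seasonal_decompose` packages the same idea.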

### StackOverflow

#### I am getting an error when I run an Azure ML experiment in Excel

Error! {"error":{"code":"LibraryExecutionError","message":"Module execution encountered an internal library error.","details":[{"code":"TableSchemaColumnCountMismatch","target":" (AFx Library)","message":"data: The table column count (0) must match the schema column count (17)."}]}}


Can you help me to solve this problem?

Thanks, Smitha

#### How to associate two groups of clusters in a user-item matrix (a bit like collaborative filtering)?

I have constructed a user-item matrix by Python. This matrix is like this: user-item matrix example

In the schematic diagram of the matrix above, I use k-means to cluster the rows and columns respectively. I determine k using the x-means algorithm (Bayesian Information Criterion), and naturally I get two groups of clusters:

clusters based on rows:

1. WangMing, BaiLi (indicates Chinese)

2. Alice, Bob (indicates American)

3. Sakura, Naruto (indicates Japanese)

clusters based on columns:

1. noodles, dumplings (indicates Chinese food)

2. McDonald's, KFC, Burger King (indicates American fast food)

3. Sushi, salmon (indicates Japanese cuisine)

For my purposes, I want to associate the two groups of clusters; for example, I want this output according to the clustering result above:

WangMing, BaiLi (Chinese) -> noodles, dumplings (Chinese food)

Alice, Bob (American) -> McDonald's, KFC, Burger King (American fast food)

Sakura, Naruto (Japanese) -> Sushi, salmon (Japanese cuisine)

Because the clusterings are run independently, I don't know how to do this association. I have already constructed the user-item matrix and am new to machine learning, so could you please show me some code, a GitHub project, or some papers on how to handle this problem?

Thank you very much! I really need guidance.
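Since the two clusterings are run independently, one simple heuristic is to score every (user-cluster, item-cluster) pair by the mean value of the corresponding submatrix and match each user cluster to its highest-scoring item cluster. A sketch on a toy matrix with hypothetical cluster labels:

```python
import numpy as np

# toy user-item matrix (rows: users, cols: items); nonzero = interaction strength
M = np.array([
    [5, 4, 0, 0, 0],   # WangMing
    [4, 5, 0, 1, 0],   # BaiLi
    [0, 0, 5, 4, 0],   # Alice
    [1, 0, 4, 5, 0],   # Bob
    [0, 0, 0, 0, 5],   # Sakura
])
row_labels = np.array([0, 0, 1, 1, 2])  # hypothetical k-means output on rows
col_labels = np.array([0, 0, 1, 1, 2])  # hypothetical k-means output on columns

k_rows, k_cols = 3, 3
score = np.zeros((k_rows, k_cols))
for r in range(k_rows):
    for c in range(k_cols):
        # mean rating inside the (user-cluster r, item-cluster c) submatrix
        score[r, c] = M[np.ix_(row_labels == r, col_labels == c)].mean()

# each user cluster is associated with its highest-mean item cluster
match = score.argmax(axis=1)
print(match)  # [0 1 2]: each user group maps to its own cuisine group
```

Co-clustering methods (e.g. scikit-learn's `SpectralCoclustering`) produce matched row/column blocks in a single pass and avoid the separate matching step entirely.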

### StackOverflow

#### Python: tf-idf-cosine: to find document similarity

I was following a tutorial which was available at Part 1 & Part 2; unfortunately the author didn't have time for the final section, which involves using cosine similarity to actually find the similarity between two documents. I followed the examples in the article with the help of the following link from stackoverflow, and I have included the code from that link just to make answerers' lives easy.

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction.text import TfidfTransformer
from nltk.corpus import stopwords
import numpy as np
import numpy.linalg as LA

train_set = ["The sky is blue.", "The sun is bright."] #Documents
test_set = ["The sun in the sky is bright."] #Query
stopWords = stopwords.words('english')

vectorizer = CountVectorizer(stop_words = stopWords)
#print vectorizer
transformer = TfidfTransformer()
#print transformer

trainVectorizerArray = vectorizer.fit_transform(train_set).toarray()
testVectorizerArray = vectorizer.transform(test_set).toarray()
print 'Fit Vectorizer to train set', trainVectorizerArray
print 'Transform Vectorizer to test set', testVectorizerArray

transformer.fit(trainVectorizerArray)
print
print transformer.transform(trainVectorizerArray).toarray()

transformer.fit(testVectorizerArray)
print
tfidf = transformer.transform(testVectorizerArray)
print tfidf.todense()


As a result of the above code I have the following matrices:

Fit Vectorizer to train set [[1 0 1 0]
[0 1 0 1]]
Transform Vectorizer to test set [[0 1 1 1]]

[[ 0.70710678  0.          0.70710678  0.        ]
[ 0.          0.70710678  0.          0.70710678]]

[[ 0.          0.57735027  0.57735027  0.57735027]]


I am not sure how to use this output to calculate cosine similarity. I know how to implement cosine similarity for two vectors of the same length, but here I am not sure how to identify the two vectors.
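Each row printed above is a document vector over the same vocabulary, so the "two vectors" are the query row and one train row at a time; cosine similarity is their dot product divided by the product of their norms (and tf-idf rows are already unit length, so it reduces to a dot product). A sketch using the exact numbers from the output:

```python
import numpy as np

# tf-idf rows from the question: two train docs and one query, same 4 columns
train = np.array([[0.70710678, 0.0, 0.70710678, 0.0],
                  [0.0, 0.70710678, 0.0, 0.70710678]])
query = np.array([0.0, 0.57735027, 0.57735027, 0.57735027])

def cosine(u, v):
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

sims = [cosine(query, row) for row in train]
print(sims)  # query is closer to the second train document
```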

### StackOverflow

#### Converting code to pure functional form

In a sense, any imperative code can be converted to pure functional form by making every operation receive and pass on a 'state of the world' parameter.

However, suppose you have some code that is almost in pure functional form except that, buried many layers of function calls deep, are a handful of imperative operations that modify global or at least widely accessed state e.g. calling a random number generator, updating a counter, printing some debug info, and you want to convert it to pure functional form, algorithmically, with the minimum of changes.

Is there a way to do this without essentially turning the entire program inside out?
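For the RNG/counter cases specifically, the state-passing transformation only has to touch the impure leaves and the call chain above them, not the whole program. A sketch in Python with a toy linear congruential generator standing in for the global RNG:

```python
def next_rand(seed):
    """Pure stand-in for a global RNG: returns (value, new_seed)."""
    new_seed = (1103515245 * seed + 12345) % (2 ** 31)
    return new_seed / 2 ** 31, new_seed

def noisy_double(x, seed):
    r, seed = next_rand(seed)   # was: r = random()
    return 2 * x + r, seed      # result plus threaded state

def pipeline(xs, seed):
    out = []
    for x in xs:
        y, seed = noisy_double(x, seed)  # state flows through the call chain
        out.append(y)
    return out, seed

ys, final_seed = pipeline([1, 2, 3], seed=42)
# same seed in, same results out: the function is now referentially transparent
assert pipeline([1, 2, 3], 42) == (ys, final_seed)
```

In languages with monads this threading is hidden behind a State (or Writer, for logging) monad, which is exactly the "without turning the program inside out" answer: the plumbing is mechanical, so the abstraction can do it for you.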

### StackOverflow

#### Functional JavaScript, remove assignment

How can I avoid the self variable here?

function urlBuilder(endpoint){
  var status = ["user_timeline", "home_timeline", "retweets_of_me", "show", "retweets",
                "update", "update_with_media", "retweet", "unretweet", "retweeters", "lookup"],
      friendships = ["incoming", "outgoing", "create", "destroy", "update", "show"];
  let endpoints = {
    status: status,
    friendships: friendships
  };

  var self = {};

  endpoints[endpoint].forEach(e => {
    self[e] = endpoint + "/" + e;
  });

  return self;
}


Somewhat better, but still an assignment statement:

return [{}].map(el => {
  endpoints[endpoint].forEach(e => {
    el[e] = endpoint + "/" + e;
  });
  return el;
})[0];


### CompsciOverflow

#### Algorithm to optimize the auctioning profit [on hold]

I am trying to come up with an optimal algorithm for an auctioning system where people who want to buy a set of items can collectively place bid for them.

Input: N items, along with M bids from people; each bid is of the form [price; list of items]

Output: Maximize the total profit of the auction holder from the auction

By profit I mean the sum of the accepted bid values. The condition on any two accepted bids is that their lists of items must not have any item in common.

I thought of some greedy solutions but they didn't work very well.

Which algorithms can be used here? Are there any algorithms which can fairly optimize the profits if not maximize (like if there's a maximum time for the algorithm to run)?

By an algorithm that fairly optimizes the profit I mean something like hill climbing, where the problem is modeled as a greedy local search which might not give the maximum profit but would still give a fairly decent profit (at a local maximum) in less time.

EDIT: This is an example to demonstrate the problem.

Input : N = 6 and following are the bids from 4 people -

Person 1 : 3000 for 0, 1, 4

Person 2 : 2000 for 0, 1, 5

Person 3 : 1000.5 for 2, 3

Person 4 : 1525.75 for 0, 1, 2, 3, 4, 5

Then picking bids from person 1 and 3 would give maximum profit (4000.5) to the auction holder. (When the bids become too high then even an approximate algorithm which fairly optimizes if not maximizes this profit would do).
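This is winner determination in a combinatorial auction, equivalent to weighted set packing and NP-hard in general, which is why greedy attempts fall short. For small M, exhaustive search over disjoint bid subsets is exact; for larger inputs, ILP solvers or local search (as suggested) trade optimality for time. A sketch on the example above:

```python
from itertools import combinations

# the four bids from the example: (price, set of wanted items)
bids = [(3000.0, {0, 1, 4}),
        (2000.0, {0, 1, 5}),
        (1000.5, {2, 3}),
        (1525.75, {0, 1, 2, 3, 4, 5})]

best, best_set = 0.0, ()
# enumerate all subsets of bids; keep the best whose item sets are pairwise disjoint
for k in range(1, len(bids) + 1):
    for subset in combinations(range(len(bids)), k):
        items, total, ok = set(), 0.0, True
        for i in subset:
            price, wanted = bids[i]
            if items & wanted:     # overlap: subset is infeasible
                ok = False
                break
            items |= wanted
            total += price
        if ok and total > best:
            best, best_set = total, subset

print(best, best_set)  # 4000.5 from bids 0 and 2 (persons 1 and 3)
```

This is O(2^M); for larger M, formulate it as a 0-1 integer program (one binary variable per bid, one disjointness constraint per item) and hand it to a solver, or use the hill-climbing/local-search approach mentioned above with swaps of accepted bids as the neighbourhood.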

### StackOverflow

#### How to machine learn (Tree) on several attributes?

I am using python and scikit-learn's tree classifier on a little fictitious machine learning problem. I have a binary outcome variable (wc_measure) and I believe it depends on a few other variables (cash, crisis, and industry). I tried the following:

#   import necessary packages
import pandas as pd
import numpy as np
import sklearn as skl
from sklearn import tree
from sklearn.cross_validation import train_test_split as tts

#   import data and give a little overview

s = sample

#   What I want to learn on
X = [s.crisis, s.cash, s.industry]
y = s.wc_measure
X_train, X_test, y_train, y_test = tts(X, y, test_size = .5)

#let's learn a little

my_tree = tree.DecisionTreeClassifier()
clf = my_tree.fit(X_train, y_train)
predictions = my_tree.predict(X_test)


I get the following error: Number of labels=50 does not match number of samples=1. If I base X on a single variable (e.g. X = s.crisis) I am asked to reshape X. I don't fully understand why I have either of these issues... Ideas?

PS: This is the return of print(X)

[0     4.0
1     4.0
2     5.0
3     3.0
4     4.0
5     2.0
6     2.0
7     1.0
8     3.0
9     3.0
10    4.0
11    3.0
12    2.0
13    4.0
14    5.0
15    4.0
16    2.0
17    2.0
18    3.0
19    2.0
20    5.0
21    4.0
22    2.0
23    4.0
24    5.0
25    1.0
26    5.0
27    3.0
28    4.0
29    2.0
...
70    1.0
71    4.0
72    4.0
73    1.0
74    4.0
75    3.0
76    4.0
77    2.0
78    2.0
79    5.0
80    2.0
81    3.0
82    5.0
83    4.0
84    4.0
85    5.0
86    3.0
87    3.0
88    4.0
89    2.0
90    2.0
91    3.0
92    3.0
93    4.0
94    3.0
95    1.0
96    4.0
97    2.0
98    3.0
99    4.0
Name: crisis, dtype: float32, 0      450.283417
1      113.472214
2       11.811784
3     1007.507446
4      293.895142
5     1133.297729
6     2237.830322
7     1475.787109
8      283.363678
9      626.888794
10      38.865730
11     991.999390
12    1115.746948
13     373.537231
14      97.570717
15     136.079193
16    2560.691406
17     667.062073
18    1378.384521
19     152.716400
20       5.779267
21     481.511566
22     677.809631
23     722.521790
24      32.927990
25    2504.450928
26      17.422865
27     651.585083
28     549.469177
29     297.458527
...
70    1198.370239
71     471.343933
72     389.709290
73    2962.622803
74     581.519287
75    1148.822388
76      67.653664
77    1346.391602
78    1764.086914
79      14.308219
80     973.152161
81     552.576904
82       2.863116
83     425.520752
84     321.773682
85      63.597332
86    1351.122559
87     735.856567
88     745.656677
89    2784.453125
90    1438.272705
91     768.780823
92     827.021423
93     591.778015
94     885.169434
95    1143.088867
96     399.816803
97    1517.454834
98    1311.692505
99     533.062561
Name: cash, dtype: float32, 0     5.0
1     2.0
2     3.0
3     5.0
4     4.0
5     3.0
6     5.0
7     1.0
8     1.0
9     2.0
10    1.0
11    5.0
12    2.0
13    4.0
14    6.0
15    2.0
16    6.0
17    2.0
18    5.0
19    1.0
20    3.0
21    4.0
22    2.0
23    6.0
24    4.0
25    4.0
26    3.0
27    3.0
28    5.0
29    1.0
...
70    2.0
71    4.0
72    3.0
73    6.0
74    6.0
75    5.0
76    1.0
77    3.0
78    5.0
79    4.0
80    2.0
81    3.0
82    2.0
83    5.0
84    3.0
85    5.0
86    5.0
87    4.0
88    6.0
89    6.0
90    4.0
91    3.0
92    4.0
93    6.0
94    3.0
95    2.0
96    3.0
97    4.0
98    6.0
99    4.0


PPS: Here is how I generate the data in Stata:

clear matrix
clear all
set more off

set obs 100
gen id = _n

*Basics
gen industry = round(runiform()*5+1)
gen activity = round(runiform()*5+1)
gen crisis = round(runiform()*4+1)
egen min_crisis = min(crisis)
egen max_crisis = max(crisis)
gen n_crisis = (crisis - min_crisis)/(max_crisis-min_crisis)

*Company details
gen staff = round((0.5 * industry + 0.3 * activity - 0.2 * crisis) * runiform()*100+1)

gen revenue = (0.5 * industry + 0.2 * activity - 0.3 * crisis ) * 1000 + runiform()
replace revenue = 0 if revenue<0

*Working Capital (wc)
gen stock = runiform()*0.5*crisis*revenue
gen receivables = runiform()*0.5*crisis*revenue
gen payables = runiform()*-0.5*crisis*revenue
replace payables = 0 if payables < 0
gen wc = stock + receivables - payables
egen avg_wc = mean(wc), by(industry)

*Liquidity
gen loan = (0.5 * industry + 0.2 * activity - 0.3 * crisis ) * 1000 + runiform()
replace loan = 0 if loan<0
egen pc_loan = pctile(loan), p(0.2) by(industry)
replace loan = 0 if loan<pc_loan

gen current_debt = n_crisis * loan + runiform()*100

gen cash = (1-n_crisis)*revenue + runiform()*100

*Measures

*WC-measure (binary)
gen wc_status = (wc-avg_wc)
egen max_wc_status = max(wc_status), by(industry)
egen min_wc_status = min(wc_status), by(industry)
gen n_wc_status = (wc_status - min_wc_status) / (max_wc_status-min_wc_status)
gen wc_measure = round(n_wc_status)


#### train logistic regression model with different feature dimension in scikit learn

Using Python 2.7 on Windows. I want to fit a logistic regression model using features T1 and T2 for a classification problem; the target is T3.

I show the values of T1 and T2 below, as well as my code. The question is: since each sample's T1 feature has dimension 4 and its T2 feature has dimension 1, how should we pre-process them so that they can be leveraged correctly by scikit-learn's logistic regression training?

BTW, I mean that for training sample 1, its T1 feature is [ 0 -1 -2 -3] and its T2 feature is [0]; for training sample 2, its T1 feature is [ 1  0 -1 -2] and its T2 feature is [1]; ...

import numpy as np
from sklearn import linear_model, datasets

arc = lambda r,c: r-c
T1 = np.array([[arc(r,c) for c in xrange(4)] for r in xrange(5)])
print T1
print type(T1)
T2 = np.array([[arc(r,c) for c in xrange(1)] for r in xrange(5)])
print T2
print type(T2)
T3 = np.array([0,0,1,1,1])

logreg = linear_model.LogisticRegression(C=1e5)

# create a logistic regression classifier and fit the data,
# using T1 and T2 as features and T3 as target
logreg.fit(T1+T2, T3)


T1,

[[ 0 -1 -2 -3]
[ 1  0 -1 -2]
[ 2  1  0 -1]
[ 3  2  1  0]
[ 4  3  2  1]]


T2,

[[0]
[1]
[2]
[3]
[4]]
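One likely issue in the code above: `T1 + T2` broadcasts and adds the arrays element-wise, so the model never sees T2 as a separate feature. Concatenating along the column axis gives each sample a 5-dimensional feature vector:

```python
import numpy as np

arc = lambda r, c: r - c
T1 = np.array([[arc(r, c) for c in range(4)] for r in range(5)])  # shape (5, 4)
T2 = np.array([[arc(r, c) for c in range(1)] for r in range(5)])  # shape (5, 1)

# wrong: T1 + T2 broadcasts T2 across T1's columns (element-wise sum)
# right: stack columns so each row is [T1 features..., T2 feature]
X = np.hstack([T1, T2])
print(X.shape)  # (5, 5)
print(X[0])     # [ 0 -1 -2 -3  0]
```

`logreg.fit(X, T3)` then trains on all five features; `np.column_stack` or pandas `concat(axis=1)` do the same job.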


### CompsciOverflow

#### Why are there two different until ($\cup$) semantics in Timed Computation Tree Logic?

Background:
In the book Principles of Model Checking (Christel Baier and Joost-Peter Katoen, MIT Press, 2008), Section 9.2, page 701, the semantics of the until modality is defined over a time-divergent path $\pi = s_0 \Rightarrow^{d_0} s_1 \Rightarrow^{d_1} \cdots \Rightarrow^{d_{i-1}} s_i \Rightarrow^{d_i} \cdots$ as follows:

$\pi \models \Phi \cup^{J} \Psi \iff$
$\exists i \ge 0. s_i + d \models \Psi \textrm{ for some } d \in [0,d_i] \textrm{ with } \sum_{k=0}^{i-1}d_k + d \in J \textrm{ and }$ $\forall j \le i. s_j + d' \models \Phi \lor \Psi \textrm{ for any } d' \in [0,d_j] \textrm{ with } \sum_{k=0}^{j-1} d_k + d' \le \sum_{k=0}^{i-1} d_k + d$.

Intuitively, a time-divergent path $\pi = s_0 \Rightarrow^{d_0} s_1 \Rightarrow^{d_1} \cdots \Rightarrow^{d_{i-1}} s_i \Rightarrow^{d_i} \cdots$ satisfies $\Phi \cup^{J} \Psi$ whenever at some time point in $J$ a state is reached satisfying $\Psi$, and at all previous time instants $\Phi \lor \Psi$ holds.

However, in the book of Model Checking by E.M. Clarke (Section 16.3, Page 256), the semantics of the until modality is given as follows:

$s \models E[\Phi \cup_{[a,b]} \Psi]$ if and only if there exists a path $\pi = s_0 s_1 s_2 \cdots$ starting at $s = s_0$ and some $i$ such that $a \le i \le b$ and $s_i \models \Psi$ and for all $j < i, s_j \models \Phi$.

As indicated, the second definition is stricter than the first in that it does not allow the case $\lnot \Phi \land \Psi$ to hold before a state satisfying $\Psi$ is reached.

Questions:

1. Why are there two different until ($\cup$) semantics in Timed Computation Tree Logic (TCTL)?

2. Which one is more official?

### StackOverflow

#### Should I prefer joined() or flatMap(_:) in Swift 3?

Swift 3 recently added joined(). I'm curious about the performance characteristics of these two ways of flattening an array:

let array = [[1,2,3],[4,5,6],[7,8,9]]
let j = Array(array.joined())
let f = array.flatMap({$0})

They both flatten the nested array into [1, 2, 3, 4, 5, 6, 7, 8, 9]. Should I prefer one over the other for performance? Also, is there a more readable way to write the calls?

### CompsciOverflow

#### Can random suitless $52$ playing card data be compressed to approach, match, or even beat entropy encoding storage? If so, how?

I have real data I am using for a simulated card game. I am only interested in the ranks of the cards, not the suits. However, it is a standard $52$ card deck, so there are only $4$ of each rank possible in the deck. The deck is shuffled well for each hand, and then I output the entire deck to a file. So there are only $13$ possible symbols in the output file, which are $2,3,4,5,6,7,8,9,T,J,Q,K,A$ ($T$ = ten rank). Of course we can bitpack these using $4$ bits per symbol, but then we are wasting $3$ of the $16$ possible encodings. We can do better if we group $4$ symbols at a time and then compress them, because $13^4 = 28{,}561$, and that can fit rather "snugly" into $15$ bits instead of $16$. The theoretical bitpacking limit is $\log(13)/\log(2) = 3.70044$ bits per symbol for data with $13$ random symbols for each possible card. However, we cannot have $52$ kings, for example, in this deck. We MUST have only $4$ of each rank in each deck, so the entropy encoding drops by about half a bit per symbol, to about $3.2$.

Ok, so here is what I am thinking. This data is not totally random. We know there are $4$ of each rank in each block of $52$ cards (call it a shuffled deck), so we can make several assumptions and optimizations. One of those is that we do not have to encode the very last card, because we will know what it should be. Another savings would be if we end on a single rank; for example, if the last $3$ cards in the deck are $777$, we wouldn't have to encode those, because the decoder would be counting cards up to that point, see that all the other ranks have been filled, and assume the $3$ "missing" cards are all $7$s.
So my question to this site is: what other optimizations are possible to get an even smaller output file on this type of data, and if we use them, can we ever beat the theoretical (simple) bitpacking entropy of $3.70044$ bits per symbol, or even approach the ultimate entropy limit of about $3.2$ bits per symbol on average? If so, how?

When I use a ZIP type program (WinZip, for example), I only see about a $2:1$ compression, which tells me it is just doing a "lazy" bitpack to $4$ bits. If I "pre-compress" the data using my own bitpacking, it seems to like that better, because then when I run that through a zip program, I get a little over $2:1$ compression. What I am thinking is: why not do all the compression myself, since I have more knowledge of the data than the Zip program does? I am wondering if I can beat the entropy "limit" of $\log(13)/\log(2) = 3.70044$. I suspect I can, with the few "tricks" I mentioned and a few more I can probably find out. The output file of course does not have to be "human readable". As long as the encoding is lossless, it is valid.

Here is a link to $3$ million human readable shuffled decks ($1$ per line). Anyone can "practice" on a small subset of these lines and then let it rip on the entire file. I will keep updating my best (smallest) filesize based on this data.

https://drive.google.com/file/d/0BweDAVsuCEM1amhsNmFITnEwd2s/view

By the way, in case you are interested in what type of card game this data is used for, here is the link to my active question (with a $300$ point bounty). I am being told it is a hard problem to solve (exactly) since it would require a huge amount of data storage space. Several simulations agree with the approximate probabilities, though. No purely mathematical solutions have been provided (yet). It's too hard, I guess.
http://math.stackexchange.com/questions/1882705/probability-2-player-card-game-with-multiple-patterns-to-win-who-has-the-advant

I have a good algorithm that is showing $168$ bits to encode the first deck in my sample data. This data was generated randomly using the Fisher-Yates shuffle algorithm. It is real random data, so my newly created algorithm seems to be working VERY well, which makes me happy.

Regarding the compression "challenge", I can give you more information. I have my 3 million decks sorted in a database so I can run queries against them. I immediately noticed that the first 4 cards are the same for a few dozen decks each, so many decks (for example) start with 2222 after sorting. That means this information is redundant and can be stored once, with the remainder of the changing parts of the decks stored alongside it. There should be a significant savings of bits per deck using this scheme, since the normal overhead of the first 4 cards is 3.7 bits per card, which is 14.8 bits total. We would have to give 1 bit back to tell the decoder whether we are encoding an "abbreviated" deck or a full deck, so I am expecting maybe 13 bits or so saved per deck on average, thus putting it BELOW the 166 bit per deck "limit". However, I do not have all 3 million bitpatterns from the output of my packing scheme yet, so I don't yet know if the redundancy will be similar to that of the raw data. I suspect it should be similar, but not identical.

#### Dynamic Programming

I am learning algorithms, but I got stuck at dynamic programming. Theoretically, I get the idea, but I am unable to implement it; I have difficulty identifying the recursion for problems. Could someone please suggest a way to approach dynamic programming problems?

### StackOverflow

#### AUC calculation in decision tree in scikit-learn

Using scikit-learn with Python 2.7 on Windows, what is wrong with my code to calculate AUC? Thanks.
from sklearn.datasets import load_iris
from sklearn.cross_validation import cross_val_score
from sklearn.tree import DecisionTreeClassifier

clf = DecisionTreeClassifier(random_state=0)
iris = load_iris()
#print cross_val_score(clf, iris.data, iris.target, cv=10, scoring="precision")
#print cross_val_score(clf, iris.data, iris.target, cv=10, scoring="recall")
print cross_val_score(clf, iris.data, iris.target, cv=10, scoring="roc_auc")


Traceback (most recent call last):
  File "C:/Users/foo/PycharmProjects/CodeExercise/decisionTree.py", line 8, in <module>
    print cross_val_score(clf, iris.data, iris.target, cv=10, scoring="roc_auc")
  File "C:\Python27\lib\site-packages\sklearn\cross_validation.py", line 1433, in cross_val_score
    for train, test in cv)
  File "C:\Python27\lib\site-packages\sklearn\externals\joblib\parallel.py", line 800, in __call__
    while self.dispatch_one_batch(iterator):
  File "C:\Python27\lib\site-packages\sklearn\externals\joblib\parallel.py", line 658, in dispatch_one_batch
    self._dispatch(tasks)
  File "C:\Python27\lib\site-packages\sklearn\externals\joblib\parallel.py", line 566, in _dispatch
    job = ImmediateComputeBatch(batch)
  File "C:\Python27\lib\site-packages\sklearn\externals\joblib\parallel.py", line 180, in __init__
    self.results = batch()
  File "C:\Python27\lib\site-packages\sklearn\externals\joblib\parallel.py", line 72, in __call__
    return [func(*args, **kwargs) for func, args, kwargs in self.items]
  File "C:\Python27\lib\site-packages\sklearn\cross_validation.py", line 1550, in _fit_and_score
    test_score = _score(estimator, X_test, y_test, scorer)
  File "C:\Python27\lib\site-packages\sklearn\cross_validation.py", line 1606, in _score
    score = scorer(estimator, X_test, y_test)
  File "C:\Python27\lib\site-packages\sklearn\metrics\scorer.py", line 159, in __call__
    raise ValueError("{0} format is not supported".format(y_type))
ValueError: multiclass format is not supported

### Undeadly

#### Reminder: Early registration for EuroBSDcon 2016 ends Aug 24
EuroBSDcon 2016 (see earlier article) is on from 22 to 25 September 2016, in Belgrade, Serbia. Early registration ends 2016-08-24 23:59 CEST, so get in now for discounted prices on great (Open)BSD talks and tutorials!

### StackOverflow

#### scikit-learn logistic regression precision calculation weird warning

Using scikit-learn with Python 2.7 on Windows. Here is my code; it produces no warning if I change precision to precision_weighted for the scoring parameter. But I do not know what the warning means, and what is the reason to explicitly specify average as one of (None, 'micro', 'macro', 'weighted', 'samples')? Actually, in my case I want to treat all samples with equal weight, but it seems there is no such option among the 5 choices?

from sklearn import linear_model, datasets
from sklearn.cross_validation import cross_val_score

# import some data to play with
iris = datasets.load_iris()
X = iris.data[:, :2]  # we only take the first two features.
Y = iris.target
h = .02  # step size in the mesh

logreg = linear_model.LogisticRegression(C=1e5)

# create a classifier instance and fit the data
logreg.fit(X, Y)
print cross_val_score(logreg, X, Y, cv=10, scoring="precision")
#print cross_val_score(logreg, X, Y, cv=10, scoring="precision_weighted")


Warning message,

C:\Python27\lib\site-packages\sklearn\metrics\classification.py:1203: DeprecationWarning: The default weighted averaging is deprecated, and from version 0.18, use of precision, recall or F-score with multiclass or multilabel data or pos_label=None will result in an exception. Please set an explicit value for average, one of (None, 'micro', 'macro', 'weighted', 'samples'). In cross validation use, for instance, scoring="f1_weighted" instead of scoring="f1".
```
[ 0.66666667 0.80555556 0.9047619 0.86666667 0.80555556 0.875 0.94444444 0.80555556 0.82222222 0.80555556]
```

### XKCD

#### Proofs

### StackOverflow

#### Are Gaussian clusters linearly separable?

Imagine you have two Gaussian probability distributions in two dimensions, the first centered at (0,1) and the second at (0,-1). (For simplicity, assume they have the same variance.) Can one consider the clusters of data points sampled from these two Gaussians to be linearly separable?

Intuitively, it's clear that the boundary separating the two distributions is linear, namely the abscissa in our case. However, the formal requirement for linear separability is that the convex hulls of the clusters do not overlap. This cannot be the case with Gaussian-generated clusters, since their underlying probability distributions pervade all of R^2 (albeit with negligible probabilities far away from the mean). So, are Gaussian-generated clusters linearly separable? How can one reconcile the requirement of non-overlapping convex hulls with the fact that a straight line is the only conceivable "boundary"? Or does the boundary perhaps effectively cease to be linear once non-equal variances come into the picture?
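One way to probe the question above empirically: strict linear separability of two *finite* samples can be decided exactly as a linear-program feasibility problem (find $w, b$ with $w \cdot x + b \ge 1$ on one cluster and $\le -1$ on the other; by rescaling $(w, b)$, such a pair exists iff the samples are strictly separable). The sketch below is not from the original question; it assumes NumPy and SciPy are available, and the sample sizes and seed are arbitrary choices.

```python
import numpy as np
from scipy.optimize import linprog

def linearly_separable(A, B):
    """Decide strict linear separability of two finite point sets.

    Feasibility LP: find (w, b) with w.x + b >= 1 for every x in A and
    w.x + b <= -1 for every x in B; such a pair exists iff A and B are
    strictly linearly separable.
    """
    n_dim = A.shape[1]
    # Unknowns are [w_1, ..., w_d, b]; constraints in A_ub @ z <= b_ub form.
    rows = np.vstack([
        np.hstack([-A, -np.ones((len(A), 1))]),  # -(w.x + b) <= -1
        np.hstack([B, np.ones((len(B), 1))]),    #   w.x + b  <= -1
    ])
    rhs = -np.ones(len(A) + len(B))
    res = linprog(np.zeros(n_dim + 1), A_ub=rows, b_ub=rhs,
                  bounds=[(None, None)] * (n_dim + 1))
    return res.status == 0  # 0: feasible (separable); 2: infeasible

rng = np.random.RandomState(0)
for n in (5, 50, 500):
    top = rng.randn(n, 2) + np.array([0.0, 1.0])      # samples from N((0,1), I)
    bottom = rng.randn(n, 2) + np.array([0.0, -1.0])  # samples from N((0,-1), I)
    print(n, linearly_separable(top, bottom))
```

Since the convex hulls of the two samples overlap with probability tending to 1 as n grows, large samples should generally come out non-separable, matching the tension described in the question.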
### CompsciOverflow

#### periodicity extraction of cubic lattice substitutional system

I would like to ask what the state-of-the-art methods are for extracting periodicities from a lattice substitutional system. Given a 12 by 12 by 12 periodic cubic system, where each grid point holds a number, say 1, 0, ..., the question is: what method could be used to extract an "approximate subsystem" that has a much lower periodicity?

#### How to Determine Existence of Turing Reducible Languages?

Are finite and recursive instances of $L$ possible with the following constraints? $L \subseteq \{0,1\}^*$ and $L \leq HPL \leq \overline L$, where $\overline L = \{ x \in \{0,1\}^* : x \notin L \}$ and $HPL$ denotes the halting problem language.

1. Does there exist a finite language $L$ that meets the above constraints? In other words, given the above information, is it possible that $L$ is finite? I'm thinking that a finite language $L$ is recursive and can thus be reduced to $HPL$, a recursively enumerable language.
2. Does there exist a recursive language $L$ that meets the above constraints? In other words, given the above information, is it possible that $L$ is recursive?

### Lobsters

#### Using security features to do bad things

### TheoryOverflow

#### Approximating set cover when it is known that an exact set cover exists

Suppose $U = \{1, 2, \cdots, n\}$ is a universe and $\mathcal S = \{S_1, S_2, \cdots, S_m\}$ is a collection of sets such that each set contains exactly $c$ elements, where $c$ is a constant. In this case, a $c$-approximation is easy. It is also possible to improve that to a $(\ln c + 1)$-approximation. My question is the following: suppose that along with this special set cover instance, you are told that an exact cover (of size $n/c$) exists. Is it possible to get a better approximation factor? What is known about hardness of approximation in this case?

### StackOverflow

#### How do I read the output of sparsenn on my test set?
After running sparsenn on my training set, I get this output:

```
pass 0  tacc 0.54629 sacc 0.54629 trms 0.99939 srms 0.99939 tauc 0.50530 sauc 0.50530 ( [ {
pass 10 tacc 0.54629 sacc 0.54629 trms 0.99600 srms 0.99600 tauc 0.50550 sauc 0.50550 ) [ {
pass 20 tacc 0.54629 sacc 0.54629 trms 0.99569 srms 0.99569 tauc 0.50286 sauc 0.50286 ) [ }
pass 30 tacc 0.54629 sacc 0.54629 trms 0.99572 srms 0.99572 tauc 0.50526 sauc 0.50526 ) ] }
pass 40 tacc 0.54629 sacc 0.54629 trms 0.99573 srms 0.99573 tauc 0.50539 sauc 0.50539 ) ] }
```

which is alright. Now after I run this on my test set, I get an output with a number on each line - here's a sample:

```
0.070450
0.070450
0.070550
0.070450
0.070461
0.070524
0.070483
0.070450
0.070608
0.070450
```

I have 30000 lines in my output. What do these numbers represent? How do I understand the AUC from these? I would ideally like to generate the AUC curve. These are the commands I'm running:

```
./nnlearn -e 170 -h 32 -r 0.1 nnoutput90.txt nnoutput90.txt qwerty
./nnclassify nntest.txt qwerty.auc p.txt
```

### Lobsters

#### Advocating Against Android Fragments

### QuantOverflow

#### How to automatically get all options data for a particular stock into microsoft excel?

I'm looking for a way to get the entire options chain (all options expiries) for a particular stock into Excel without manually copy-pasting anything. It does not have to be real time, and I will only be searching for one stock's options at a time. A free method would be ideal.

### arXiv Networking and Internet Architecture

#### Mobile and Residential INEA Wi-Fi Hotspot Network. (arXiv:1608.06606v1 [cs.NI])

Since 2012 INEA has been developing and expanding the network of IEEE 802.11 compliant Wi-Fi hotspots (access points) located across the Greater Poland region.
This network consists of 330 mobile (vehicular) access points carried by public buses and trams, and over 20,000 fixed residential hotspots distributed throughout the homes of INEA customers to provide Internet access via the "community Wi-Fi" service. This paper is aimed at sharing the insights gathered by INEA throughout 4 years of experience in providing hotspot-based Internet access. The emphasis is put on daily and hourly trends in order to evaluate user experience, to determine key patterns, and to investigate influences such as public transportation trends, user location and mobility, as well as radio frequency noise and interference.

#### Application of Public Ledgers to Revocation in Distributed Access Control. (arXiv:1608.06592v1 [cs.CR])

There has recently been a flood of interest in potential new applications of blockchains, as well as proposals for more generic designs called public ledgers. Most of the novel proposals have been in the financial sector. However, the public ledger is an abstraction that solves several of the fundamental problems in the design of secure distributed systems: global time in the form of a strict linear order of past events, a globally consistent and immutable view of the history, and enforcement of some application-specific safety properties. This paper investigates the applications of public ledgers to access control and, more specifically, to group management in distributed systems where entities are represented by their public keys and authorization is encoded into signed certificates. It is particularly difficult to handle negative information, such as revocation of certificates or group membership, in the distributed setting. The linear order of events and global consistency simplify these problems, but the enforcement of internal constraints in the ledger implementation often presents problems. We show that different types of revocation require slightly different properties from the ledger.
We compare the requirements with Bitcoin, the best known blockchain, and describe an efficient ledger design for membership revocation that combines ideas from blockchains and from web-PKI monitoring. While we use certificate-based group-membership management as the case study, the same ideas can be applied more widely to rights revocation in distributed systems.

#### Syntax and analytic semantics of LISA. (arXiv:1608.06583v1 [cs.PL])

We provide the syntax and semantics of the LISA (for "Litmus Instruction Set Architecture") language. The parallel assembly language LISA is implemented in the herd7 tool (this http URL) for simulating weak consistency models.

#### Boosting PLC Networks for High-Speed Ubiquitous Connectivity in Enterprises. (arXiv:1608.06574v1 [cs.NI])

Powerline communication (PLC) provides inexpensive, secure and high-speed network connectivity by leveraging the existing power distribution networks inside buildings. While PLC technology has the potential to improve connectivity and is considered a key enabler for sensing, control, and automation applications in enterprises, it has mainly been deployed for improving connectivity in homes. Deploying PLCs in enterprises is more challenging, since the power distribution network is more complex than in homes. Moreover, existing PLC technologies such as HomePlug AV have not been designed for and evaluated in enterprise deployments. In this paper, we first present a comprehensive measurement study of PLC performance in enterprise settings, analyzing PLC channel characteristics across the space, time, and spectral dimensions using commodity HomePlug AV PLC devices. Our results uncover the impact of distribution lines, circuit breakers, AC phases and electrical interference on PLC performance. Based on our findings, we show that careful planning of PLC network topology, routing and spectrum sharing can significantly boost the performance of enterprise PLC networks.
Our experimental results show that multi-hop routing can increase throughput performance by 5x in scenarios where direct PLC links perform poorly. Moreover, our trace-driven simulations for multiple deployments show that our proposed fine-grained spectrum sharing design can boost the aggregated and per-link PLC throughput by more than 20% and 100% respectively, in enterprise PLC networks.

#### High-Quality Synthesis Against Stochastic Environments. (arXiv:1608.06567v1 [cs.LO])

In the classical synthesis problem, we are given an LTL formula psi over sets of input and output signals, and we synthesize a transducer that realizes psi. One weakness of automated synthesis in practice is that it pays no attention to the quality of the synthesized system. Indeed, the classical setting is Boolean: a computation satisfies a specification or does not satisfy it. Accordingly, while the synthesized system is correct, there is no guarantee about its quality. In recent years, researchers have considered extensions of the classical Boolean setting to a quantitative one. The logic LTL[F] is a multi-valued logic that augments LTL with quality operators. The satisfaction value of an LTL[F] formula is a real value in [0,1], where the higher the value is, the higher the quality in which the computation satisfies the specification. Decision problems for LTL become search or optimization problems for LTL[F]. In particular, in the synthesis problem, the goal is to generate a transducer that satisfies the specification with the highest possible quality. Previous work considered the worst-case setting, where the goal is to maximize the quality of the computation with the minimal quality. We introduce and solve the stochastic setting, where the goal is to generate a transducer that maximizes the expected quality of a computation, subject to a given distribution of the input signals.
Thus, rather than being hostile, the environment is assumed to be probabilistic, which corresponds to many realistic settings. We show that the problem is 2EXPTIME-complete, like classical LTL synthesis, and remains so in two extensions we consider: one that maximizes the expected quality while guaranteeing that the minimal quality is, with probability $1$, above a given threshold, and one that allows assumptions on the environment.

#### Improving FPGA resilience through Partial Dynamic Reconfiguration. (arXiv:1608.06559v1 [cs.DC])

This paper explores advances in the reconfiguration properties of SRAM-based FPGAs, namely Partial Dynamic Reconfiguration, to improve the resilience of critical systems that take advantage of this technology. Commercial off-the-shelf state-of-the-art FPGA devices use SRAM cells for the configuration memory, which allows an increase in both performance and capacity. The fast access times and unlimited number of writes of this technology reduce reconfiguration delays and extend the device lifetime but, at the same time, make it more sensitive to radiation effects, in the form of Single Event Upsets. To overcome this limitation, manufacturers have proposed a few fault tolerant approaches, which rely on space/time redundancy and configuration memory content recovery - scrubbing. In this paper, we first present radiation effects on these devices and investigate the applicability of the most commonly used fault tolerant approaches, and then propose an approach to improve FPGA resilience through the use of a less intrusive failure-prediction scrubbing. It is expected that this approach relieves the system designer from dependability concerns and reduces both time intrusiveness and overall power consumption.

#### Robust Flows over Time: Models and Complexity Results. (arXiv:1608.06520v1 [cs.DM])

We study dynamic network flows with uncertain input data under a robust optimization perspective.
In the dynamic maximum flow problem, the goal is to maximize the flow reaching the sink within a given time horizon $T$, while flow requires a certain travel time to traverse an arc. In our setting, we account for uncertain travel times of flow. We investigate maximum flows over time under the assumption that at most $\Gamma$ travel times may be prolonged simultaneously due to delay. We develop and study a mathematical model for this problem. As the dynamic robust flow problem generalizes the static version, it is NP-hard to compute an optimal flow. However, our dynamic version is considerably more complex than the static version. We show that it is NP-hard to verify feasibility of a given candidate solution. Furthermore, we investigate temporally repeated flows and show that in contrast to the non-robust case (i.e., without uncertainties) they no longer provide optimal solutions for the robust problem, but rather yield a worst-case optimality gap of at least $T$. We finally show that for infinite delays, the optimality gap is at most $O(k \log T)$, where $k$ is a newly introduced instance characteristic. The results obtained in this paper yield a first step towards understanding robust dynamic flow problems with uncertain travel times.

#### Wireless Sensor Networks: Local Multicast Study. (arXiv:1608.06511v1 [cs.NI])

In wireless sensor networks and ad-hoc networks, regional multicast (geocasting) means delivering a message from a source point to all nodes in a given geographical area. Practical applications of regional multicast include broadcasting location-related business information to a specified area, extensive advertising, and sending urgent messages. The design goal of a regional multicast protocol is to guarantee message delivery at low transmission cost. Most of the protocols proposed so far do not guarantee message delivery, while some other protocols that do guarantee delivery incur high transmission costs.
The goal of this research is therefore a local multicast protocol, and its algorithm, that guarantees message delivery at low transmission cost, in order to promote the development of geocast communication. This paper introduces the research background and results, and proposes solutions and ideas for the open problems.

#### Adaptive Data Collection Mechanisms for Smart Monitoring of Distribution Grids. (arXiv:1608.06510v1 [cs.SY])

Smart Grid systems not only transport electric energy but also information, which will be an active part of the electricity supply system. This has led to the introduction of intelligent components on all layers of the electrical grid in power generation, transmission, distribution and consumption units. For electric distribution systems, information from Smart Meters can be utilized to monitor and control the state of the grid. Hence, it is indeed inherent that data from Smart Meters should be collected in a resilient, reliable, secure and timely manner, fulfilling all the communication requirements and standards. This paper presents a proposal for smart data collection mechanisms to monitor electrical grids with adaptive smart metering infrastructures. A general overview of a platform is given for testing, evaluating and implementing mechanisms to adapt Smart Meter data aggregation. Three main aspects of the system's adaptiveness are studied: adaptiveness to smart metering application needs, adaptiveness to changing communication network dynamics, and adaptiveness to security attacks. Tests will be executed in a real field experimental set-up and in an advanced hardware-in-the-loop test-bed with power and communication co-simulation for validation purposes.

#### Dijkstra Monads for Free. (arXiv:1608.06499v1 [cs.PL])

Dijkstra monads are a means by which a dependent type theory can be enhanced with support for reasoning about effectful code.
These specification-level monads, which compute weakest preconditions, and their closely related counterparts, Hoare monads, provide the basis on which verification tools like F*, Hoare Type Theory (HTT), and Ynot are built. In this paper we show that Dijkstra monads can be derived "for free" by applying a continuation-passing style (CPS) translation to the standard monadic definitions of the underlying computational effects. Automatically deriving Dijkstra monads provides a correct-by-construction and efficient way of reasoning about user-defined effects in dependent type theories. We demonstrate these ideas in EMF*, a new dependently typed calculus, validating it both by formal proof and via a prototype implementation within F*. Besides equipping F* with a more uniform and extensible effect system, EMF* enables within F* a mixture of intrinsic and extrinsic proofs that was previously impossible.

#### Delay Evaluation of OpenFlow Network Based on Queueing Model. (arXiv:1608.06491v1 [cs.DC])

As one of the most popular south-bound protocols of software-defined networking (SDN), OpenFlow decouples the network control from forwarding devices. It offers flexible and scalable functionality for networks. These advantages may come with performance penalties in terms of packet processing speed. It is important to understand the performance of OpenFlow switches and controllers for its deployments. In this paper we model the packet processing time of OpenFlow switches and controllers. We mainly analyze how the probability of packet-in messages impacts the performance of switches and controllers. Our results show that there is a performance penalty in OpenFlow networks. However, the penalty is not severe when the probability of packet-in messages is low. This model can be used by a network designer to approximate the performance of her deployments.

#### Multivariate Cryptography with Mappings of Discrete Logarithms and Polynomials.
(arXiv:1608.06472v1 [cs.CR]) In this paper, algorithms for multivariate public key cryptography and digital signature are described. Plain messages and encrypted messages are arrays consisting of elements from a fixed finite ring or field. The encryption and decryption algorithms are based on multivariate mappings. The security of the private key depends on the difficulty of solving a system of parametric simultaneous multivariate equations involving polynomial or exponential mappings. The method is a general purpose utility for most data encryption, digital certificate or digital signature applications.

#### Warehousing Complex Archaeological Objects. (arXiv:1608.06469v1 [cs.DB])

Data organization is a difficult and essential component in cultural heritage applications. Over the years, a great amount of archaeological ceramic data has been created and processed by various methods and devices. Such ceramic data are stored in databases that concur to increase the amount of available information rapidly. However, such databases typically focus on one type of ceramic descriptor, e.g., qualitative textual descriptions, petrographic or chemical analysis results, and do not interoperate. Thus, research involving archaeological ceramics cannot easily take advantage of combining all these types of information. In this application paper, we introduce an evolution of the Ceramom database that includes text descriptors of archaeological features, chemical analysis results, and various images, including petrographic and fabric images. To illustrate what new analyses are permitted by such a database, we source it to a data warehouse and present a sample on-line analytical processing (OLAP) scenario to gain deep understanding of ceramic context.

#### RELARM: A rating model based on relative PCA attributes and k-means clustering.
(arXiv:1608.06416v1 [q-fin.CP]) Following the concept of relative attributes, widely used in visual recognition, the article establishes a definition of relative PCA attributes for a class of objects defined by vectors of their parameters. A new rating model (RELARM) is built using relative PCA attribute ranking functions for rating object description and the k-means clustering algorithm. The assignment of each rating object to a rating category is derived by projecting cluster centers onto a specially selected rating vector. An empirical study has shown a high level of approximation to the existing S&P, Moody's and Fitch ratings.

#### Learning to Communicate: Channel Auto-encoders, Domain Specific Regularizers, and Attention. (arXiv:1608.06409v1 [cs.LG])

We address the problem of learning efficient and adaptive ways to communicate binary information over an impaired channel. We treat the problem as reconstruction optimization through impairment layers in a channel autoencoder and introduce several new domain-specific regularizing layers to emulate common channel impairments. We also apply a radio transformer network based attention model on the input of the decoder to help recover canonical signal representations. We demonstrate some promising initial capacity results from this architecture and address several remaining challenges before such a system could become practical.

#### Phased Exploration with Greedy Exploitation in Stochastic Combinatorial Partial Monitoring Games. (arXiv:1608.06403v1 [cs.GT])

Partial monitoring games are repeated games where the learner receives feedback that might be different from the adversary's move or even the reward gained by the learner. Recently, a general model of combinatorial partial monitoring (CPM) games was proposed \cite{lincombinatorial2014}, where the learner's action space can be exponentially large and the adversary samples its moves from a bounded, continuous space, according to a fixed distribution.
The paper gave a confidence bound based algorithm (GCB) that achieves $O(T^{2/3}\log T)$ distribution-independent and $O(\log T)$ distribution-dependent regret bounds. The implementation of their algorithm depends on two separate offline oracles, and the distribution-dependent regret additionally requires the existence of a unique optimal action for the learner. Adopting their CPM model, our first contribution is a Phased Exploration with Greedy Exploitation (PEGE) algorithmic framework for the problem. Different algorithms within the framework achieve $O(T^{2/3}\sqrt{\log T})$ distribution-independent and $O(\log^2 T)$ distribution-dependent regret respectively. Crucially, our framework needs only the simpler "argmax" oracle from GCB, and the distribution-dependent regret does not require the existence of a unique optimal action. Our second contribution is another algorithm, PEGE2, which combines gap estimation with a PEGE algorithm to achieve an $O(\log T)$ regret bound, matching the GCB guarantee but removing the dependence on the size of the learner's action space. However, like GCB, PEGE2 requires access to both offline oracles and the existence of a unique optimal action. Finally, we discuss how our algorithm can be efficiently applied to a CPM problem of practical interest: namely, online ranking with feedback at the top.

#### Formalization of Fault Trees in Higher-order Logic: A Deep Embedding Approach. (arXiv:1608.06392v1 [cs.LO])

Fault Tree (FT) is a standard failure modeling technique that has been extensively used to predict the reliability, availability and safety of many complex engineering systems. In order to facilitate the formal analysis of FT based analyses, a higher-order-logic formalization of FTs has recently been proposed.
However, this formalization is quite limited in terms of handling large systems and the transformation of FT models into their corresponding Reliability Block Diagram (RBD) structures, i.e., a frequently used transformation in reliability and availability analyses. In order to overcome these limitations, we present a deep embedding based formalization of FTs. In particular, the paper presents a formalization of the AND, OR and NOT FT gates, which are in turn used to formalize other commonly used FT gates, i.e., NAND, NOR, XOR, Inhibit, Comparator and majority Voting, and the formal verification of their failure probability expressions. For illustration purposes, we present a formal failure analysis of a communication gateway software for the next generation air traffic management system.

#### A New Parallelization Method for K-means. (arXiv:1608.06347v1 [cs.DC])

K-means is a popular clustering method used in the data mining area. To work with large datasets, researchers have proposed PKMeans, a parallel k-means on MapReduce. However, the existing k-means parallelization methods, including PKMeans, have many limitations. PKMeans can't finish all its iterations in one MapReduce job, so it has to repeat cascading MapReduce jobs in a loop until convergence. On the most popular MapReduce platform, Hadoop, every MapReduce job introduces significant I/O overheads and extra execution time at the job start-up and shuffling stages. Even worse, it has been proved that in the worst case the number of MapReduce jobs k-means needs to converge grows with n, the number of data instances, which means huge overheads for large datasets. Additionally, in PKMeans, at most one reducer can be assigned to and update each centroid, so PKMeans can only make use of a limited number of parallel reducers.
In this paper, we propose an improved parallel method for k-means, IPKMeans, which has a parallel preprocessing stage using a k-d tree, can finish k-means in one single MapReduce job with many more reducers working in parallel and lower I/O overheads than PKMeans, and has a fast post-processing stage generating the final result. In our method, both the k-d tree and the new improved parallel k-means are implemented using MapReduce and tested on Hadoop. Our experiments show that with the same dataset and initial centroids, our method has up to 2/3 lower I/O overheads and consumes less time than PKMeans to get a very close clustering result.

#### Job Placement Advisor Based on Turnaround Predictions for HPC Hybrid Clouds. (arXiv:1608.06310v1 [cs.DC])

Several companies and research institutes are moving their CPU-intensive applications to hybrid High Performance Computing (HPC) cloud environments. Such a shift depends on the creation of software systems that help users decide where a job should be placed, considering execution time and the queue wait time to access on-premise clusters. Relying blindly on turnaround prediction techniques will negatively affect response times inside HPC cloud environments. This paper introduces a tool to make job placement decisions in HPC hybrid cloud environments taking into account the inaccuracy of execution and waiting time predictions. We used job traces from real supercomputing centers to run our experiments, and compared the performance between environments using real speedup curves. We also extended a state-of-the-art machine learning based predictor to work with data from the cluster scheduler.
Our main findings are: (i) depending on workload characteristics, there is a turning point where predictions should be disregarded in favor of a more conservative decision to minimize job turnaround times, and (ii) scheduler data plays a key role in improving predictions generated with machine learning using job trace data---our experiments showed around 20% prediction accuracy improvements.

### CompsciOverflow

#### Relationship between PP and PH

Toda's theorem says that $PH \subset P^{PP}$. Does this imply any relationship between $PH$ and $PP$ that does not involve oracles? Does it imply either that $PH \subset PP$ or that $PP \subset PH$? Is it known or conjectured whether either of those holds?

### Lobsters

#### Where the Database Market Goes From Here

### StackOverflow

#### How to upload datasets and download updated weights file on AWS EC2, Tensorflow?

I learned how to use AWS - creating instances, stopping and terminating them - by following this tutorial. But I don't know how to upload datasets to EC2, or how to download the updated variables (weights) file from EC2 for later use.

### CompsciOverflow

#### How to prevent overflow and underflow in the Euclidean distance and Mahalanobis distance

I was working on my project when I was struck by the question of whether it would be necessary, or at least prudent, to prevent overflow and underflow in the calculation of these two distances. I remembered that there is an implementation of the hypotenuse calculation that prevents this, provided by most language implementations as hypot(). The calculation of the Euclidean distance follows the same "pattern", and I thought that if hypot() guards against overflow and underflow, the Euclidean distance should too. I was disappointed to find that the language we use, and others, do not guard against overflow and underflow when calculating this distance. Is this "additional effort" not worth spending?
I did a search and came to a question on Math.StackExchange. There is no definitive answer to this issue, and it is somewhat old. The first thing I wondered is: is this okay? I think so, seeing that it is a generalization of the same procedure that Hypot() performs. I decided to extrapolate this concept to the Mahalanobis distance. The original is as follows:

$$D_M(X,Y,L) = \sqrt{\sum_{i=1}^{n}\left(\frac{X_i-Y_i}{L_i}\right)^2}$$

where $L$ is the vector of eigenvalues. My proposal is this:

$$D_M(X,Y,L) = C\sqrt{\sum_{i=1}^{n}\left(\frac{X_i-Y_i}{L_i}\frac{1}{C}\right)^2}$$

which is the same as:

$$D_M(X,Y,L) = C\sqrt{\sum_{i=1}^{n}\left(\frac{X_i-Y_i}{L_i C}\right)^2}$$

where $C$ is the maximum value of $|(X_i-Y_i)/L_i|$:

$$C = \max_{i}\left(\left|\frac{X_i-Y_i}{L_i}\right|\right)$$

Is this okay?

#### Algorithm for cycle-detecting comparison

I am looking for an algorithm that I can use to compare nested and potentially recursive data structures, for example to implement the Scheme equal? function. equal? recursively compares two objects for equality and properly handles cycles. Specifically, the algorithm needs to return true iff the (possibly infinite) unfoldings of the graphs would be equal, e.g.

    (letrec ((a (cons 1 (cons 2 a)))
             (b (cons 1 (cons 2 (cons 1 (cons 2 b))))))
      (equal? a b))

is true because a and b are both cyclic lists that repeat the sequence (1 2) infinitely.

Destructive modification of traversed nodes is thread-unsafe, requires a spare bit in object headers, and requires a separate traversal to reset the bit. Using a hash table to store object addresses is not safe in the presence of a moving garbage collector unless the GC is blocked for the duration of the traversal (so the operation cannot be implemented except as a primitive).

### StackOverflow

#### Trying to adapt TensorFlow's MNIST example gives NAN predictions

I'm playing with TensorFlow, using the 'MNIST for beginners' example (initial code here).
I've made some slight adaptations:

    mnist = input_data.read_data_sets(FLAGS.data_dir, one_hot=True)
    sess = tf.InteractiveSession()

    # Create the model
    x = tf.placeholder(tf.float32, [None, 784])
    W = tf.Variable(tf.zeros([784, 10]))
    b = tf.Variable(tf.zeros([10]))
    y = tf.nn.softmax(tf.matmul(x, W) + b)

    # Define loss and optimizer
    y_ = tf.placeholder(tf.float32, [None, 10])
    cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
    train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

    fake_images = mnist.train.images.tolist()

    # Train
    tf.initialize_all_variables().run()
    for i in range(10):
        batch_xs, batch_ys = fake_images, mnist.train.labels
        train_step.run({x: batch_xs, y_: batch_ys})

    # Test trained model
    print(y.eval({x: mnist.test.images}))

Specifically, I'm only running the training step 10 times (I'm not concerned about accuracy, more about speed). I'm also running it on all the data at once (for simplicity). At the end, I'm outputting the predictions TF is making, instead of the accuracy percentage. Here's (some of) the output of the above code:

    [ 1.08577311e-02   7.29394853e-01   5.02395593e-02 ...,   2.74689011e-02
      4.43389975e-02   2.32385024e-02]
    ...,
    [ 2.95746652e-03   1.30554764e-02   1.39354384e-02 ...,   9.16484520e-02
      9.70732421e-02   2.57733971e-01]
    [ 5.94450533e-02   1.36338845e-01   5.22132218e-02 ...,   6.91468120e-02
      1.95634082e-01   4.83607128e-02]
    [ 4.46179360e-02   6.66685810e-04   3.84704918e-02 ...,   6.51754031e-04
      2.46591796e-03   3.10819712e-03]]

Which appears to be the probabilities TF is assigning to each of the possibilities (0-9). All is well with the world.

My main goal is to adapt this to another use, but first I'd like to make sure I can give it other data. This is what I've tried:

    fake_images = np.random.rand(55000, 784).astype('float32').tolist()

Which, as I understand it, should generate an array of random junk that is structurally the same as the data from MNIST.
But making the change above, here's what I get:

    [[ nan  nan  nan ...,  nan  nan  nan]
     [ nan  nan  nan ...,  nan  nan  nan]
     [ nan  nan  nan ...,  nan  nan  nan]
     ...,
     [ nan  nan  nan ...,  nan  nan  nan]
     [ nan  nan  nan ...,  nan  nan  nan]
     [ nan  nan  nan ...,  nan  nan  nan]]

Which is clearly much less useful. Looking at each option (mnist.train.images and the np.random.rand option), it looks like both are a list of lists of floats. Why won't TensorFlow accept this array? Is it simply complaining because it recognizes that there's no way it can learn from a bunch of random data? I would expect not, but I've been wrong before.

### Lobsters

#### Types of Data

### Planet Theory

#### Online bin packing with cardinality constraints resolved

Authors: János Balogh, József Békési, György Dósa, Leah Epstein, Asaf Levin
Download: PDF
Abstract: Cardinality constrained bin packing, or bin packing with cardinality constraints, is a basic bin packing problem. In the online version with the parameter $k \geq 2$, items having sizes in $(0,1]$ associated with them are presented one by one to be packed into unit-capacity bins, such that the capacities of bins are not exceeded, and no bin receives more than $k$ items. We resolve the online problem in the sense that we prove a lower bound of 2 on the overall asymptotic competitive ratio. This closes a long-standing open problem, since an algorithm of absolute competitive ratio 2 is known. Additionally, we significantly improve the known lower bounds on the asymptotic competitive ratio for every specific value of $k$. The novelty of our constructions is based on full adaptivity that creates large gaps between item sizes. Thus, our lower bound inputs do not follow the common practice for online bin packing problems of having an input, known in advance, consisting of batches for which the algorithm needs to be competitive on every prefix of the input.
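The constraint in this abstract can be made concrete with a simple online baseline. Below is a sketch (my own illustration, not the paper's lower-bound construction) of First Fit restricted to bins holding fewer than $k$ items:

```python
def first_fit_cardinality(items, k):
    """Online First Fit with a cardinality constraint: place each item
    in the first bin that has room for its size and holds < k items."""
    bins = []        # each bin: (remaining_capacity, item_count)
    assignment = []  # bin index chosen for each item
    for size in items:
        for i, (cap, count) in enumerate(bins):
            if size <= cap and count < k:
                bins[i] = (cap - size, count + 1)
                assignment.append(i)
                break
        else:  # no existing bin fits: open a new unit-capacity bin
            bins.append((1.0 - size, 1))
            assignment.append(len(bins) - 1)
    return len(bins), assignment

# With k = 2, six items of size 0.2 need three bins even though
# two bins would suffice by volume alone.
n_bins, _ = first_fit_cardinality([0.2] * 6, k=2)
# n_bins == 3
```

The paper's lower bounds are adversarial sequences against any online algorithm; this baseline only illustrates how the cardinality constraint can force extra bins beyond what item sizes alone require.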
#### On Low-High Orders of Directed Graphs: Incremental Algorithms and Applications

Authors: Loukas Georgiadis, Aikaterini Karanasiou, Giannis Konstantinos, Luigi Laura
Download: PDF
Abstract: A flow graph $G=(V,E,s)$ is a directed graph with a distinguished start vertex $s$. The dominator tree $D$ of $G$ is a tree rooted at $s$, such that a vertex $v$ is an ancestor of a vertex $w$ if and only if all paths from $s$ to $w$ include $v$. The dominator tree is a central tool in program optimization and code generation, and has many applications in other diverse areas including constraint programming, circuit testing, biology, and algorithms for graph connectivity problems. A low-high order of $G$ is a preorder $\delta$ of $D$ that certifies the correctness of $D$ and has further applications in connectivity and path-determination problems. In this paper, we first consider how to maintain efficiently a low-high order of a flow graph incrementally under edge insertions. We present algorithms that run in $O(mn)$ total time for a sequence of $m$ edge insertions in an initially empty flow graph with $n$ vertices. These immediately provide the first incremental certifying algorithms for maintaining the dominator tree in $O(mn)$ total time, and also imply incremental algorithms for other problems. Hence, we provide a substantial improvement over the $O(m^2)$ simple-minded algorithms, which recompute the solution from scratch after each edge insertion. We also show how to apply low-high orders to obtain a linear-time $2$-approximation algorithm for the smallest $2$-vertex-connected spanning subgraph problem (2VCSS). Finally, we present efficient implementations of our new algorithms for the incremental low-high and 2VCSS problems, and conduct an extensive experimental study on real-world graphs taken from a variety of application areas. The experimental results show that our algorithms perform very well in practice.
#### Fast binary embeddings with Gaussian circulant matrices: improved bounds

Authors: Sjoerd Dirksen, Alexander Stollenwerk
Download: PDF
Abstract: We consider the problem of encoding a finite set of vectors into a small number of bits while approximately retaining information on the angular distances between the vectors. By deriving improved variance bounds related to binary Gaussian circulant embeddings, we largely fix a gap in the proof of the best known fast binary embedding method. Our bounds also show that well-spreadness assumptions on the data vectors, which were needed in earlier work on variance bounds, are unnecessary. In addition, we propose a new binary embedding with a faster running time on sparse data.

#### A PTAS for the Steiner Forest Problem in Doubling Metrics

Authors: T-H. Hubert Chan, Shuguang Hu, Shaofeng H.-C. Jiang
Download: PDF
Abstract: We achieve a (randomized) polynomial-time approximation scheme (PTAS) for the Steiner Forest Problem in doubling metrics. Before our work, a PTAS was given only for the Euclidean plane in [FOCS 2008: Borradaile, Klein and Mathieu]. Our PTAS also shares similarities with the dynamic programming for sparse instances used in [STOC 2012: Bartal, Gottlieb and Krauthgamer] and [SODA 2016: Chan and Jiang]. However, extending previous approaches requires overcoming several non-trivial hurdles, and we make the following technical contributions.

(1) We prove a technical lemma showing that Steiner points have to be "near" the terminals in an optimal Steiner tree. This enables us to define a heuristic to estimate the local behavior of the optimal solution, even though the Steiner points are unknown in advance. This lemma also generalizes previous results in the Euclidean plane, and may be of independent interest for related problems involving Steiner points.

(2) We develop a novel algorithmic technique known as "adaptive cells" to overcome the difficulty of keeping track of multiple components in a solution.
Our idea is based on, but significantly different from, the previously proposed "uniform cells" in the FOCS 2008 paper, whose techniques cannot be readily applied to doubling metrics.

#### Quantum Communication Complexity of Distributed Set Joins

Authors: Stacey Jeffery, François Le Gall
Download: PDF
Abstract: Computing set joins of two inputs is a common task in database theory. Recently, Van Gucht, Williams, Woodruff and Zhang [PODS 2015] considered the complexity of such problems in the natural model of (classical) two-party communication complexity and obtained tight bounds for the complexity of several important distributed set joins. In this paper we initiate the study of the *quantum* communication complexity of distributed set joins. We design a quantum protocol for distributed Boolean matrix multiplication, which corresponds to computing the composition join of two databases, showing that the product of two $n\times n$ Boolean matrices, each owned by one of two respective parties, can be computed with $\widetilde{O}(\sqrt{n}\ell^{3/4})$ qubits of communication, where $\ell$ denotes the number of non-zero entries of the product. Since Van Gucht et al. showed that the classical communication complexity of this problem is $\widetilde{\Theta}(n\sqrt{\ell})$, our quantum algorithm outperforms classical protocols whenever the output matrix is sparse. We also show a quantum lower bound and a matching classical upper bound on the communication complexity of distributed matrix multiplication over $\mathbb{F}_2$. Besides their applications to database theory, the communication complexity of set joins is interesting due to its connections to direct product theorems in communication complexity. In this work we also introduce a notion of *all-pairs* product theorem, and relate this notion to standard direct product theorems in communication complexity.
#### Privacy Amplification Against Active Quantum Adversaries

Authors: Gil Cohen, Thomas Vidick
Download: PDF
Abstract: Privacy amplification is the task by which two cooperating parties transform a shared weak secret, about which an eavesdropper may have side information, into a uniformly random string uncorrelated from the eavesdropper. Privacy amplification against passive adversaries, where it is assumed that the communication is over a public but authenticated channel, can be achieved in the presence of classical as well as quantum side information by a single-message protocol based on strong extractors.

In 2009 Dodis and Wichs devised a two-message protocol to achieve privacy amplification against active adversaries, where the public communication channel is no longer assumed to be authenticated, through the use of a strengthening of strong extractors called non-malleable extractors, which they introduced. Dodis and Wichs only analyzed the case of classical side information.

We consider the task of privacy amplification against active adversaries with quantum side information. Our main result is showing that the Dodis-Wichs protocol remains secure in this scenario provided its main building block, the non-malleable extractor, satisfies a notion of quantum-proof non-malleability which we introduce. We show that an adaptation of a recent construction of non-malleable extractors due to Chattopadhyay et al. is quantum proof, thereby providing the first protocol for privacy amplification that is secure against active quantum adversaries. Our protocol is quantitatively comparable to the near-optimal protocols known in the classical setting.

#### Communication complexity of approximate Nash equilibria

Authors: Yakov Babichenko, Aviad Rubinstein
Download: PDF
Abstract: For a constant $\epsilon$, we prove a poly(N) lower bound on the communication complexity of $\epsilon$-Nash equilibrium in two-player $N \times N$ games.
For n-player binary-action games we prove an exp(n) lower bound for the communication complexity of $(\epsilon,\epsilon)$-weak approximate Nash equilibrium, which is a profile of mixed actions such that at least a $(1-\epsilon)$-fraction of the players are $\epsilon$-best replying.

### QuantOverflow

#### Up and Down days in GBPUSD and a Filter

I want to study whether the odds of an up or down day in a forex pair are 50-50. I just count the total number of up and down days in X years and compare it with the total number of days. The results are very similar to a 50-50 chance.

Now I want to see if, by applying an EMA200 filter, you have a higher probability of an up day when the closing price is above the EMA, and vice versa for a closing price below the EMA. The results show that it is more probable to obtain an up day when the closing price is above the EMA.

The question is: does the test have any bias? I am worried that the results aren't true because of a bias in the test. Because the EMA depends on the price, maybe it's just obvious that there are more up days when the price is above the EMA.

    $ema = EMA(Close, 200);
    foreach (NewDay) {
        $Totaldays++;
        if (Today(Close) > Today(Open))  { $Totalup++; }
        if (Today(Close) < Today(Open))  { $Totaldown++; }
        if (Today(Close) == Today(Open)) { $Totaldojis++; }

        if (Today(Close) > Today($ema)) {
            $Totalabove++;
            if (Today(Close) > Today(Open))  { $Upabove++; }
            if (Today(Close) < Today(Open))  { $Downabove++; }
            if (Today(Close) == Today(Open)) { $Dojisabove++; }
        }
        if (Today(Close) < Today($ema)) {
            $Totalbelow++;
            if (Today(Close) > Today(Open))  { $Upbelow++; }
            if (Today(Close) < Today(Open))  { $Downbelow++; }
            if (Today(Close) == Today(Open)) { $Dojisbelow++; }
        }
    }
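The suspected bias can be probed on synthetic data. The following self-contained Python sketch (the helper names are mine, the prices are a driftless random walk, and a trailing simple moving average stands in for the EMA) compares the same-day conditioning used above, where today's close appears on both sides of the test, with conditioning only on yesterday's close versus yesterday's average:

```python
import random

def sma(values, n):
    """Trailing simple moving average; None until n values are available."""
    return [sum(values[i - n + 1:i + 1]) / n if i >= n - 1 else None
            for i in range(len(values))]

def up_day_rate(opens, closes, condition):
    """Fraction of up days among the days where condition(i) holds."""
    hits = [closes[i] > opens[i] for i in range(len(closes)) if condition(i)]
    return sum(hits) / len(hits) if hits else float("nan")

random.seed(0)
closes = [100.0]
for _ in range(5000):  # driftless random walk: a fair test should stay near 50%
    closes.append(closes[-1] + random.gauss(0, 1))
opens, closes = closes[:-1], closes[1:]  # today's open = yesterday's close
avg = sma(closes, 200)

# Same-day conditioning: today's close is on both sides of the comparison.
biased = up_day_rate(opens, closes,
                     lambda i: avg[i] is not None and closes[i] > avg[i])
# Lagged conditioning: only information available before today is used.
lagged = up_day_rate(opens, closes,
                     lambda i: i > 0 and avg[i - 1] is not None
                     and closes[i - 1] > avg[i - 1])
```

On a random walk the lagged rate stays near 50% while the same-day rate tends to be inflated, which suggests the effect observed in the question is at least partly mechanical; rerunning the original count against yesterday's EMA would be a fairer test.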

### StackOverflow

#### How do I set TensorFlow RNN state when state_is_tuple=True?

I have written an RNN language model using TensorFlow. The model is implemented as an RNN class. The graph structure is built in the constructor, while RNN.train and RNN.test methods run it.

I want to be able to reset the RNN state when I move to a new document in the training set, or when I want to run a validation set during training. I do this by managing the state inside the training loop, passing it into the graph via a feed dictionary.

In the constructor I define the RNN like so:

    cell = tf.nn.rnn_cell.LSTMCell(hidden_units)
    rnn_layers = tf.nn.rnn_cell.MultiRNNCell([cell] * layers)
    self.reset_state = rnn_layers.zero_state(batch_size, dtype=tf.float32)
    self.state = tf.placeholder(tf.float32, self.reset_state.get_shape(), "state")
    self.outputs, self.next_state = tf.nn.dynamic_rnn(rnn_layers, self.embedded_input,
                                                      time_major=True, initial_state=self.state)


The training loop looks like this

    for document in documents:
        state = session.run(self.reset_state)
        for x, y in document:
            _, state = session.run([self.train_step, self.next_state],
                                   feed_dict={self.x: x, self.y: y, self.state: state})


x and y are batches of training data from a document. The idea is that I pass the latest state along after each batch, except when I start a new document, at which point I zero out the state by running self.reset_state.

This all works. Now I want to change my RNN to use the recommended state_is_tuple=True. However, I don't know how to pass the more complicated LSTM state object via a feed dictionary. Also I don't know what arguments to pass to the self.state = tf.placeholder(...) line in my constructor.

What is the correct strategy here? There still isn't much example code or documentation for dynamic_rnn available.

TensorFlow issue 2695 appears relevant, but I haven't fully digested it.

### StackOverflow

#### Cartesian product of multiple arrays in JavaScript

How would you implement the Cartesian product of multiple arrays in JavaScript?

As an example,

    cartesian([1, 2], [10, 20], [100, 200, 300])
    // should be
    // [[1,10,100],[1,10,200],[1,10,300],[2,10,100],[2,10,200] ...]
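Whatever the language, the underlying algorithm is a left fold: start from a single empty combination and extend every partial result by each element of the next array. A Python sketch of that structure (illustrative only; in JavaScript, Array.prototype.reduce plays the same role as functools.reduce here):

```python
from functools import reduce

def cartesian(*arrays):
    """Cartesian product via a left fold: extend every partial
    combination by each element of the next array."""
    return reduce(
        lambda acc, arr: [partial + [x] for partial in acc for x in arr],
        arrays,
        [[]],  # one empty combination to start from
    )

cartesian([1, 2], [10, 20], [100, 200, 300])
# → [[1, 10, 100], [1, 10, 200], [1, 10, 300], [1, 20, 100], ...] (12 in total)
```

In Python, itertools.product already provides this; the explicit reduce version is shown because its shape translates directly to a JavaScript reduce over the argument arrays.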


### CompsciOverflow

#### The importance of the language semantics for code generation and frameworks for code generation in model-driven development

I am implementing a workflow where code in industrial programming languages (JavaScript and Java) should be generated from formal (formally verified) expressions (from ontologies as objects and rule formulas as behaviors). What is the best practice for such code generation? Are there frameworks available for this?

In my opinion the semantics of the programming language is required, and I should be able to do the code generation in two steps: 1) translate my formal expressions into semantic expressions of the target language; 2) translate semantic expressions into executable code. In reality I cannot find any work that connects the semantics of a programming language with code generation.

Is there a special kind of programming-language semantics that is usable not only for the analysis of programs but also for their generation?

My guess is that this should be a really useful approach for generating formally verified code, but I cannot find research work about this. Are there better keywords available for this?

Maybe a more relevant question is: what kind of compilers/translators do Model Driven Development tools use for the generation of source code (platform-dependent code), and how can the semantics of a programming language be used in the construction of such compilers?

Note added. There already is a complete unifying (denotational and operational) semantics of Java, JavaScript and other industrial programming languages in the K framework. So this is more a question about the application of the K framework for code generation, if that is possible at all.

### StackOverflow

#### How is the gradient and hessian of logarithmic loss computed in the custom objective function example script in xgboost's github repository?

I would like to understand how the gradient and hessian of the logloss function are computed in an xgboost sample script.

I've simplified the function to take numpy arrays, and generated y_hat and y_true which are a sample of the values used in the script.

Here is a simplified example:

    import numpy as np

    def loglikelihoodloss(y_hat, y_true):
        prob = 1.0 / (1.0 + np.exp(-y_hat))
        grad = prob - y_true
        hess = prob * (1.0 - prob)
        return grad, hess

    y_hat = np.array([1.80087972, -1.82414818, -1.82414818,  1.80087972, -2.08465433,
                      -1.82414818, -1.82414818,  1.80087972, -1.82414818, -1.82414818])
    y_true = np.array([1.,  0.,  0.,  1.,  0.,  0.,  0.,  1.,  0.,  0.])

    loglikelihoodloss(y_hat, y_true)


The log loss function is the sum of $-\big(y_i \log p_i + (1-y_i)\log(1-p_i)\big)$ over the examples, where $p_i = \frac{1}{1+e^{-\hat{y}_i}}$.

The gradient (with respect to $p$) is then $\frac{p-y}{p(1-p)}$, however in the code it is $p - y$.

Likewise the second derivative (with respect to $p$) is $\frac{y}{p^2}+\frac{1-y}{(1-p)^2}$, however in the code it is $p(1-p)$.

How are the equations equal?
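The standard resolution is that the script differentiates with respect to the raw score $\hat{y}$ rather than $p$: by the chain rule, $\frac{\partial L}{\partial \hat{y}} = \frac{p-y}{p(1-p)} \cdot p(1-p) = p - y$, and differentiating once more gives $p(1-p)$. A small finite-difference check (my own sketch, independent of the xgboost script) confirms this:

```python
import math

def loss(y_hat, y_true):
    """Log loss as a function of the raw score y_hat (before the sigmoid)."""
    p = 1.0 / (1.0 + math.exp(-y_hat))
    return -(y_true * math.log(p) + (1.0 - y_true) * math.log(1.0 - p))

def grad_hess(y_hat, y_true):
    """Closed forms used in the script, read as derivatives w.r.t. y_hat."""
    p = 1.0 / (1.0 + math.exp(-y_hat))
    return p - y_true, p * (1.0 - p)

# Central finite differences of the loss with respect to y_hat.
y_hat, y_true, eps = 0.7, 1.0, 1e-5
g_num = (loss(y_hat + eps, y_true) - loss(y_hat - eps, y_true)) / (2 * eps)
h_num = (loss(y_hat + eps, y_true) - 2 * loss(y_hat, y_true)
         + loss(y_hat - eps, y_true)) / eps ** 2
g, h = grad_hess(y_hat, y_true)
```

The numerical derivatives agree with $p-y$ and $p(1-p)$, so the two sets of formulas coincide once you account for which variable is being differentiated.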

### Planet Theory

#### Stack and Queue Layouts via Layered Separators

Authors: Vida Dujmović, Fabrizio Frati
Abstract: It is known that every proper minor-closed class of graphs has bounded stack-number (a.k.a. book thickness and page number). While this includes notable graph families such as planar graphs and graphs of bounded genus, many other graph families are not closed under taking minors. For fixed $g$ and $k$, we show that every $n$-vertex graph that can be embedded on a surface of genus $g$ with at most $k$ crossings per edge has stack-number $\mathcal{O}(\log n)$; this includes $k$-planar graphs. The previously best known bound for the stack-number of these families was $\mathcal{O}(\sqrt{n})$, except in the case of $1$-planar graphs. Analogous results are proved for map graphs that can be embedded on a surface of fixed genus. None of these families is closed under taking minors. The main ingredient in the proof of these results is a construction proving that $n$-vertex graphs that admit constant layered separators have $\mathcal{O}(\log n)$ stack-number.

## August 23, 2016

### StackOverflow

#### How do I pass a scalar via a TensorFlow feed dictionary

My TensorFlow model uses tf.random_uniform to initialize a variable. I would like to specify the range when I begin training, so I created a placeholder for the initialization value.

    init = tf.placeholder(tf.float32, name="init")
    v = tf.Variable(tf.random_uniform((100, 300), -init, init), dtype=tf.float32)
    initialize = tf.initialize_all_variables()


I initialize variables at the start of training like so.

    session.run(initialize, feed_dict={init: 0.5})


This gives me the following error:

    ValueError: initial_value must have a shape specified: Tensor("Embedding/random_uniform:0", dtype=float32)


I cannot figure out the correct shape parameter to pass to tf.placeholder. I would think for a scalar I should do init = tf.placeholder(tf.float32, shape=0, name="init") but this gives the following error:

    ValueError: Incompatible shapes for broadcasting: (100, 300) and (0,)


If I replace init with the literal value 0.5 in the call to tf.random_uniform it works.

How do I pass this scalar initial value via the feed dictionary?

#### Ansible: given a list of ints in a variable, define a second list in which each element is incremented

Let's assume that we have an Ansible variable that is a list_of_ints.

I want to define an incremented_list, whose elements are obtained incrementing by a fixed amount the elements of the first list.

For example, if this is the first variable:

    ---
    # file: somerole/vars/main.yml

    list_of_ints:
    - 1
    - 7
    - 8


assuming an increment of 100, the desired second list would have this content:

    incremented_list:
    - 101
    - 107
    - 108


I was thinking of something on the lines of:

    incremented_list: "{{ list_of_ints | map('add', 100) | list }}"


Sadly, Ansible has custom filters for logarithms or powers, but not for basic arithmetic, so I can easily calculate the log10 of those numbers, but not increment them.

Any ideas, apart from a pull request on https://github.com/ansible/ansible/blob/v2.1.1.0-1/lib/ansible/plugins/filter/mathstuff.py ?
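Short of a pull request, a local filter plugin closes the gap. The sketch below assumes Ansible's standard convention of loading plugins from a filter_plugins/ directory next to the playbook; the file name and filter name are my own:

```python
# filter_plugins/math_extra.py (hypothetical file name)

class FilterModule(object):
    """Expose an 'add' filter so a template can write
    {{ list_of_ints | map('add', 100) | list }}."""

    def filters(self):
        return {"add": lambda value, amount: value + amount}

# Outside Ansible, the filter is just a plain function:
add = FilterModule().filters()["add"]
# [add(x, 100) for x in [1, 7, 8]] == [101, 107, 108]
```

With the plugin in place, the map('add', 100) expression attempted above should work as written.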

### CompsciOverflow

#### Back propagation in neural networks

I just finished watching these 3 Coursera videos on back propagation in neural networks. I get the idea of what we're trying to do, but I don't get how we achieve that by calculating error in each step as weight * cascaded error (eg. the formula at the top right of screen at 12:07 in the linked video). Let's say we start off with all the weights (theta) at zero. Wouldn't back propagation always calculate 0 error for everything, causing nothing to change?
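The concern is easy to check numerically. With all-zero weights the back-propagated hidden deltas (weight times cascaded error, as in the video's formula) are indeed zero, but the output unit's delta is not, so the output weights do start to move; the deeper problem is that identically initialized hidden units stay symmetric, which is why weights are initialized randomly in practice. A minimal sketch on a hypothetical 2-2-1 sigmoid network:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical 2-2-1 network with every weight initialized to zero.
w_hidden = [[0.0, 0.0], [0.0, 0.0]]  # w_hidden[j][i]: input i -> hidden j
w_out = [0.0, 0.0]                   # hidden j -> output
x, target = [1.0, 0.5], 1.0

# Forward pass: zero weights give every activation sigmoid(0) = 0.5.
h = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
o = sigmoid(sum(w * hj for w, hj in zip(w_out, h)))

# Backward pass with sigmoid deltas.
delta_o = (o - target) * o * (1 - o)  # nonzero: the target breaks the tie
delta_h = [w_out[j] * delta_o * h[j] * (1 - h[j]) for j in range(2)]
# delta_h == [0.0, 0.0]: the hidden deltas vanish on the first step,
# while the output-weight gradients delta_o * h[j] do not.
```

So back propagation does change something from step one, but only the output layer; the hidden layer cannot differentiate its units until the symmetry is broken.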

### Lambda the Ultimate Forum

#### Whither FRP?

Hi, I was re-reading an LtU blast from the past about FRP, and the discussions there made me think to ask this community to post some concise updates on how FRP research has been going of late, in case any of you in that field have that much free time.

### QuantOverflow

#### Fortune 1000 companies: which are public utilities?

I am reviewing the Fortune 1000 list of companies. Is there a reliable way using information from EDGAR filings to determine which of these companies are primarily acting as regulated public utilities?

#### Fixed Income Ex Ante Tracking Error

Anyone know a good source that walks through how to calculate Ex Ante tracking error for a fixed income portfolio?

### StackOverflow

#### Labeling data for neural net training

Does anyone know of or have a good tool for labeling image data to be used in training a DNN?

Specifically, labeling 2 points in an image, like upperLeftCorner and lowerRightCorner, which then calculates a bounding box around the specified object. That's just an example, but I would like to be able to follow the MSCoco data format.

Thanks!
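While evaluating tools, the two-click-to-box step itself is easy to script. A sketch using the COCO bbox convention of [x, y, width, height] (the function name is mine; the annotation UI itself is out of scope here):

```python
def bbox_from_corners(p1, p2):
    """COCO-style [x, y, width, height] from two clicked corner
    points, accepted in either click order."""
    (x1, y1), (x2, y2) = p1, p2
    return [min(x1, x2), min(y1, y2), abs(x2 - x1), abs(y2 - y1)]

bbox_from_corners((120, 40), (80, 100))
# → [80, 40, 40, 60]
```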

### Lambda the Ultimate Forum

#### language handling of memory and other resource failures

My idea here is to introduce a place to discuss ideas about handling memory exhaustion, or related resource limit management. The goal is to have something interesting and useful to talk about, informing future designs of programming language semantics or implementation. Thoughtful new solutions are more on topic than anecdotes about old problems. (Feel free to tell an anecdote if followed by analysis of a workable nice solution.) Funny failure stories are not very useful.

Worst case scenarios are also of interest: situations that would very hard to handle nicely, as test cases for evaluating planned solutions. For example, you might be able to think of an app that would behave badly under a given resource management regime. This resembles thinking of counter examples, but with emphasis on pathology instead of contradiction.

In another discussion topic, Keean Schupke argued that failure due to out-of-memory is an effect, while others suggested it was semantically more out-of-scope than an effect in some languages. I am less interested in what is an effect, and more interested in how to handle problems. (The concept is on topic when the focus is what to do about it. Definitions without use cases seem adrift of practical concerns.)

Relevant questions include: After partial failure, what does code still running do about it? How is it presented semantically? Can failed things be cleaned up without poisoning the survivors afterward? How are monitoring, cleanup, and recovery done efficiently with predictable quality? How do PL semantics help a dev plan and organize system behavior after resource failures, especially memory?

### Planet Theory

#### TR16-132 | On the Sensitivity Conjecture for Read-k Formulas | Mitali Bafna, Satyanarayana V. Lokam, Sébastien Tavenas, Ameya Velingker

Various combinatorial/algebraic parameters are used to quantify the complexity of a Boolean function. Among them, sensitivity is one of the simplest and block sensitivity is one of the most useful. Nisan (1989) and Nisan and Szegedy (1991) showed that block sensitivity and several other parameters, such as certificate complexity, decision tree depth, and degree over R, are all polynomially related to one another. The sensitivity conjecture states that there is also a polynomial relationship between sensitivity and block sensitivity, thus supplying the "missing link". Since its introduction in 1991, the sensitivity conjecture has remained a challenging open question in the study of Boolean functions. One natural approach is to prove it for special classes of functions. For instance, the conjecture is known to be true for monotone functions, symmetric functions, and functions describing graph properties. In this paper, we consider the conjecture for Boolean functions computable by read-k formulas. A read-k formula is a tree in which each variable appears at most k times among the leaves and has Boolean gates at its internal nodes. We show that the sensitivity conjecture holds for read-once formulas with gates computing symmetric functions. We next consider regular formulas with OR and AND gates. A formula is regular if it is a leveled tree with all gates at a given level having the same fan-in and computing the same function. We prove the sensitivity conjecture for constant depth regular read-k formulas for constant k.
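For readers new to these parameters, sensitivity is simple enough to compute by brute force on small functions, which makes the definition concrete. A sketch (my own illustration, not from the paper):

```python
from itertools import product

def sensitivity(f, n):
    """Max over inputs x of the number of single-bit flips that change f(x)."""
    best = 0
    for x in product((0, 1), repeat=n):
        flips = 0
        for i in range(n):
            y = list(x)
            y[i] ^= 1  # flip coordinate i
            if f(tuple(y)) != f(x):
                flips += 1
        best = max(best, flips)
    return best

# OR on 3 bits: the all-zero input is sensitive to every coordinate.
or3 = lambda x: int(any(x))
sensitivity(or3, 3)
# → 3
```

Block sensitivity is defined analogously with disjoint blocks of coordinates flipped together; the conjecture discussed in the abstract asks whether it is polynomially bounded in terms of sensitivity.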

### StackOverflow

#### Is that possible to use optimization toolbox for gradient boosting model?

I have a decision stump boosting model; here is the essential part where, for a squared loss function, we fit a new model to the residuals.

    fit = DecisionStump(y ~ ., d)
    y_p[1,] = fit$predictions

    for (i in 2:Niter) {
        # adjust objective and rebuild data frame
        y_adj = d$y - y_p[i-1,]
        d_adj = data.frame(x1 = d$x1, x2 = d$x2, y = y_adj)

        # fit base learner to adjusted data
        fit2 = DecisionStump(y ~ ., d_adj)

        # update learner
        y_p[i,] = y_p[i-1,] + v * fit2$predictions
    }

My question: is it possible to rewrite the function so that we can use the R optimization toolbox, with something like:

    f <- function(x) x^2
    g <- function(x) 2*x
    optim(1, f, g, method="BFGS")

### CompsciOverflow

#### on "On the cruelty of really teaching computing science"

Dijkstra, in his essay On the cruelty of really teaching computing science, makes the following proposal for an introductory programming course:

On the one hand, we teach what looks like the predicate calculus, but we do it very differently from the philosophers. In order to train the novice programmer in the manipulation of uninterpreted formulae, we teach it more as boolean algebra, familiarizing the student with all algebraic properties of the logical connectives. To further sever the links to intuition, we rename the values {true, false} of the boolean domain as {black, white}.

On the other hand, we teach a simple, clean, imperative programming language, with a skip and a multiple assignment as basic statements, with a block structure for local variables, the semicolon as operator for statement composition, a nice alternative construct, a nice repetition and, if so desired, a procedure call. To this we add a minimum of data types, say booleans, integers, characters and strings. The essential thing is that, for whatever we introduce, the corresponding semantics is defined by the proof rules that go with it.

Right from the beginning, and all through the course, we stress that the programmer's task is not just to write down a program, but that his main task is to give a formal proof that the program he proposes meets the equally formal functional specification. While designing proofs and programs hand in hand, the student gets ample opportunity to perfect his manipulative agility with the predicate calculus.
Finally, in order to drive home the message that this introductory programming course is primarily a course in formal mathematics, we see to it that the programming language in question has not been implemented on campus so that students are protected from the temptation to test their programs.

He emphasises that this is a serious proposal, and outlines various possible objections, including that his idea is "utterly unrealistic" and "far too difficult." But he responds:

But that kite won't fly either for the postulate has been proven wrong: since the early 80's, such an introductory programming course has successfully been given to hundreds of college freshmen each year. [Because, in my experience, saying this once does not suffice, the previous sentence should be repeated at least another two times.]

Which course is Dijkstra referring to, and is there any other literature available that discusses it? The essay appeared in 1988 when Dijkstra was at the University of Texas at Austin, which is probably a clue -- they host the Dijkstra archive but it is huge, and I'm particularly interested in hearing from others about this course. I don't want to discuss whether Dijkstra's idea is good or realistic here. I considered posting this on cstheory.se or cs.se but settled on here because a) a community of educators might be more likely to have someone who can answer easily, and b) Dijkstra himself emphasises that his course is "primarily a course in formal mathematics." Feel free to flag for migration if you disagree.

#### Homography from 3D plane to plane parallel to image plane

I have an image in which there is a calibration target (known geometry) in a scene (let's say a simple 2" x 2" square lying on a table). I would like to perform a perspective transformation so that the resulting image is an orthogonal view of the table (as if the camera axis were parallel with the table normal).
The general procedure for computing a homography is from a general plane to a different general plane where at least 4 correspondences are known in two images of the same scene. In this case, where I only have one image, is the correct thing to do to simply "make up" a plane and force the correspondences to some arbitrary position on that plane? For example, in this situation I would simply make correspondences between the 4 detected corners (A,B,C,D) in the image and four points of my choosing (which essentially just define the pixel->real world scale). For example, I could choose A' = (0,0), B' = (20,20), C' = (0,20), D' = (20,0) to indicate that in the resulting image there are 10 pixels per inch. Of course I could choose any scale here, and I could also choose any position for the square target to land in the output (i.e. A' = (100,100), B' = (120,120), C' = (100,120), D' = (120,100) ). Is this the "correct" way to do this? Is there a better way to compute a projective transform that looks directly at a plane defined by a set of points in the image known to be in the plane? ### StackOverflow #### Backpropagation - Neural Networks How does the output value of step 1 result in the value of "0.582"? I am looking at an example of the usage of backpropagation in order to have a basic understanding of it. However, I fail to understand how the value "0.582" is formed as the output for the example below. EDIT: I have attempted Feed-Forward, which has resulted in the output value of "0.5835...". Now I am unsure whether the example's output value above is correct or whether the method I have used is correct. EDIT2: ## My FF Calculation f(x) = 1/(1+e^-x) . > NodeJ = f( 1*W1j+0.4*W2j+0.7*W3j ) NodeJ = f( 1(0.2)+0.4(0.3)+0.7(-0.1) ) = f(0.25) NodeJ = 1/(1+e^-0.25) = 0.562... . NodeI = f( 1*W1i+0.4*W2i+0.7*W3i ) NodeI = f( 1(0.1)+0.4(-0.1)+0.7(0.2) ) = f(0.25) NodeI = 1/(1+e^-0.25) = 0.562... .
NodeK = f( NodeJ * Wjk + NodeI * Wik) NodeK = f( 0.562(0.1)+0.562(0.5)) = f(0.3372) NodeK = 1/(1+e^-0.3372) = 0.5835 Output = 0.5835 ### CompsciOverflow #### Why does an unsafe state not always cause deadlock? I was reading Operating Systems by Galvin and came across the line below: Not all unsafe states are deadlocks, however. An unsafe state may lead to deadlock Can someone please explain how deadlock != unsafe state? I also caught the same line here: If a safe sequence does not exist, then the system is in an unsafe state, which MAY lead to deadlock. (All safe states are deadlock free, but not all unsafe states lead to deadlocks.) ### StackOverflow #### The output of a softmax isn't supposed to have zeros, right? I am working on a net in tensorflow which produces a vector which is then passed through a softmax, which is my output. Now I have been testing this and, weirdly enough, the vector (the one that passed through softmax) has zeros in all coordinates but one. Based on the softmax's definition with the exponential, I assumed that this wasn't supposed to happen. Is this an error? EDIT: My vector is 120x160 = 19200. All values are float32. ### CompsciOverflow #### Coin change problem I have a homework question as below: There is a currency system that has coins of value v1, v2, ..., vk for some integer k > 1 such that v1 = 1. You have to pay a person V units of money using this currency. Answer the following: (a) (16 points) Let v2 = c^1, v3 = c^2, ..., vk = c^(k−1) for some fixed integer constant c > 1. Design a greedy algorithm that minimises the total number of coins needed to pay V units of money for any given V. Give pseudocode, discuss running time, and give a proof of correctness. I have this intuition that we should pick the maximum valued coin whenever possible. But I can't get anywhere while trying to prove this. I am trying the "greedy stays ahead" method. #### Is there research into associative/commutative optimizations?
While playing around with optimization sets in LLVM, it occurs to me that the order in which optimizations are run matters greatly since, in general, A(B(src)) is not equal to B(A(src)), where A and B are some optimizations of type source -> source and src is of type source. Are there optimizations for which that property holds? Are there projects or research that attempt to formalize or otherwise create these types of optimizations? ### QuantOverflow #### NFLVR and HJM framework The no-arbitrage HJM drift condition is well known for the traditional (ELMM: Equivalent Local Martingale Measure) formulation of no-arbitrage. My question is: is there a known necessary and sufficient condition on HJM models $f_t(T)$ to satisfy the NFLVR (No Free Lunch With Vanishing Risk) condition? If so, what is it and what is a good reference? #### quantlib python : missing methods? I'm reading Introduction to Selected Classes of the QuantLib Library I by Dimitri Reiswich and trying to "convert" it to Python. It seems to me that some C++ possibilities aren't available in Python. I'm not familiar with SWIG but I guess it's a matter of declaring them in the appropriate *.i files. For instance, both of these work, following the pdf text: January: either QuantLib::January or QuantLib::Jan print(ql.Date(12, 12, 2015)) print(ql.Date(12, ql.January, 2015))  But why doesn't Jan work? print(ql.Date(12, ql.Jan, 2015))  In the Calendar description the 2 following commented lines return an error; browsing through the code, I failed at finding them. Would someone be kind enough to point me in the right direction on how to make them available?
import QuantLib as ql def calendarTesting(): frankfurtCal = ql.Germany(ql.Germany.FrankfurtStockExchange) saudiArabCal = ql.SaudiArabia() myEve = ql.Date(31, 12, 2009) print('is BD: {}'.format(frankfurtCal.isBusinessDay(myEve))) print('is Holiday: {}'.format(frankfurtCal.isHoliday(myEve))) # print('is weekend: {}'.format(saudiArabCal.isWeekend(ql.Saturday))) print('is last BD: {}'.format(frankfurtCal.isEndOfMonth(ql.Date(30, 12, 2009)))) # print('last BD: {}'.format(frankfurtCal.endOfMonth(myEve))) calendarTesting()  ### StackOverflow #### compare the similarity/overlap of two high-dimensional datasets [on hold] I want to predict whether a client will pay their credit card or not, with 20 features (such as salary, age, single/married.....). However, I only have training data for good customers (clients who pay their credit card). Now I need to predict whether an unknown client will pay their credit card or not. Since I don't have training data for bad customers, I don't think I can use a supervised ML algorithm. My guess is I need to use an unsupervised ML algorithm such as KNN clustering to predict by similarity. However, I am a little confused about how to do it. Does anyone know how to solve this? And, is there any python library that I can use (such as scikit-learn) and examples of similar problems? Thanks for your help! ### UnixOverflow #### How can I boot the PC-BSD live DVD-ISO IMAGE directly via GRUB2? Via the loopback command, GRUB2 allows booting an ISO file directly. Now, I've configured the corresponding menuentry to boot the PC-BSD Live DVD ISO, but when I try to boot it, the FreeBSD bootstrap loader outputs: can't load 'kernel'  Here is the GRUB2 menuentry I currently use: menuentry "PC-BSD" { search --no-floppy --fs-uuid --set root 0d11c28a-7186-43b9-ae33-b4bd351c60ad loopback loop /PCBSD9.0-RC1-x64-DVD-live.iso kfreebsd (loop)/boot/loader }  Does anyone know how I'd need to amend that in order to be able to boot the PC-BSD live system?
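On the credit-scoring question above: when training data exists only for the "good" class, this is usually framed as one-class classification (novelty detection) rather than clustering; in scikit-learn, sklearn.svm.OneClassSVM and sklearn.ensemble.IsolationForest are the standard tools. A dependency-free toy sketch of the idea, with made-up numbers standing in for real customer features:

```python
import statistics

# Hypothetical (salary_k, age) rows for known-good (paying) customers.
good = [
    (52, 34), (48, 29), (60, 41), (55, 38), (50, 33), (58, 36),
]
means = [statistics.mean(col) for col in zip(*good)]
stds = [statistics.pstdev(col) for col in zip(*good)]

def looks_like_good_customer(row, z_cut=3.0):
    """Accept a row only if every feature is within z_cut standard deviations
    of the training mean; otherwise treat it as unlike the good customers."""
    return all(abs(x - m) <= z_cut * s for x, m, s in zip(row, means, stds))

print(looks_like_good_customer((54, 35)))  # True
print(looks_like_good_customer((5, 90)))   # False
```

A real model would use a proper multivariate distance (e.g. Mahalanobis) or one of the scikit-learn estimators above rather than independent per-feature cut-offs; this only illustrates the one-class framing.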
### StackOverflow #### Is normalizing features always standard practice? It seems like every single machine learning method (perceptron, SVM, etc) warns you about the need to normalize all the features during preprocessing. Is this always true for all common machine learning methods? Or am I just running into the few that require normalized features? #### Choose the best cluster partition based on a cost function I have a string that I'd like to cluster: s = 'AAABBCCCCC'  I don't know in advance how many clusters I'll get. All I have is a cost function that can take a clustering and give it a score. There is also a constraint on the cluster sizes: they must be in a range [a, b]. In my example, for a=3 and b=4, all possible clusterings are: [ ['AAA', 'BBC', 'CCCC'], ['AAA', 'BBCC', 'CCC'], ['AAAB', 'BCC', 'CCC'], ]  Concatenation of each clustering must give the string s. The cost function is something like this: cost(clustering) = alpha*l + beta*e + gamma*d  where: • l = variance(cluster_lengths) • e = mean(clusters_entropies) • d = 1 - (nb_characters_in_b_that_are_not_in_a/size_of_b) (for b the consecutive cluster of a) • alpha, beta, gamma are weights This cost function gives a low cost (0) for the best case: 1. Where all clusters have the same size. 2. Content inside each cluster is the same. 3. Consecutive clusters don't have the same content. Theoretically, the solution is to calculate the cost of all possible compositions for this string and choose the lowest, but it will take too much time. Is there any clustering algorithm that can find the best clustering according to this cost function in a reasonable time? #### Determine the Initial Probabilities of an HMM So I have managed to estimate most of the parameters in a particular Hidden Markov Model (HMM) given the training dataset. These parameters are: the emission probabilities of the hidden states and the transition matrix $P$ of the Markov chain. I used Gibbs sampling for the learning.
Now there is one set of parameters still missing: the initial probabilities $\pi$ (the probability distribution of where the chain starts), and I want to deduce it from the learned parameters. How can I do it? Also, is it true that $\pi$ is the same as the stationary probability distribution of $P$? ### CompsciOverflow #### Bellman Ford Algorithm fails to compute shortest path for a directed edge-weighted graph I was recently studying shortest path algorithms when I encountered the problem below in the book Algorithms, 4th edition, Robert Sedgewick and Kevin Wayne. Suppose that we convert an EdgeWeightedGraph into a Directed EdgeWeightedGraph by creating two DirectedEdge objects in the EdgeWeightedDigraph (one in each direction) for each Edge in the EdgeWeightedGraph and then use the Bellman-Ford algorithm. Explain why this approach fails spectacularly. I've tried many examples on paper, but am unable to find scenarios where the directed graph generated would have new negative cycles in them, simply by converting an edge into two edges in opposite directions. I assume that there were no pre-existing negative cycles in the original undirected edge-weighted graph. #### Time Complexity for repeated element An integer array of size n contains integer values from the range 0 to n-2. Only one of these integers appears twice in the array and all others appear only once. An algorithm has been designed to find this repeated integer from the given array. If this algorithm focuses on using the minimum number of comparison operations, what will be the order of comparison operations used by it to get the correct answer? #### How do I retrieve text from an embedded script using casperjs?
[on hold] the html element is <script> window.sawXmlIslandidClientStateXml="<nqw xmlns:saw=\x22com.siebel.analytics.web/report/v1.1\x22 xmlns:xsi=\x22http://www.w3.org/2001/XMLSchema-instance\x22 xmlns:sawst=\x22com.siebel.analytics.web/state/v1\x22>\u003csawst:clientState>\u003csawst:stateRef>\u003csawst:envState xmlns:sawst=\"com.siebel.analytics.web/state/v1\" xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\" xmlVersion=\"200811100\">\u003csawst:container cid=\"d:dashboard\" xsi:type=\"sawst:topLevelStateContainer\">\u003csawst:container cid=\"p:mco0pb0nob7sqjvg\" xsi:type=\"sawst:page\">\u003csawst:container cid=\"s:42263r43nih80fd1\" xsi:type=\"sawst:section\" rendered=\"true\">\u003csawst:container cid=\"g:c452lvndqssjqa45\" xsi:type=\"sawst:dashprompt\" links=\"-\" promptAutoCompleteState=\"off\"/>\u003c/sawst:container>\u003csawst:container cid=\"r:q4g2fiisnvk4nusv\" xsi:type=\"sawst:report\" links=\"fd\" defaultView=\"compoundView!1\" searchId=\"fvup02s9lt0o6urkplv4pqa5ri\" folder=\"/shared/Sales\" itemName=\"All Sales and Inventory Data\"/>\u003csawst:container cid=\"f:dpstate\" xsi:type=\"sawst:dashpromptstate\" statepoolId=\"ih2bj24l46bkgt558qsef04jeq\"/>\u003csawst:container cid=\"s:b0003tc6gnahvsfq\" xsi:type=\"sawst:section\" rendered=\"true\"/>\u003csawst:container cid=\"s:c5j314uterctfb08\" xsi:type=\"sawst:section\" rendered=\"true\"/>\u003c/sawst:container>\u003c/sawst:container>\u003c/sawst:envState>\u003c/sawst:stateRef>\u003csawst:reportXmlRefferedTo>\u003cref statePath=\"d:dashboard~p:mco0pb0nob7sqjvg~r:q4g2fiisnvk4nusv\" searchID=\"8oh8erup3kcqav10ukp36jaof2\">\u003c/ref>\u003c/sawst:reportXmlRefferedTo>\u003c/sawst:clientState></nqw>"; </script>  I want to retrieve the string ih2bj24l46bkgt558qsef04jeq under the identifier statepoolId from this script section. So how do I find this script in the HTML and get the string using casperjs? 
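In CasperJS itself this would be done in JavaScript (e.g. grabbing the page HTML and applying a regex); the same extraction idea, sketched here in Python on a shortened, hypothetical stand-in for the script body:

```python
import re

# Shortened stand-in for the <script> body from the question; the attribute
# value is escaped as statepoolId=\"...\" inside the JavaScript string.
script_text = r'<script> window.x="...statepoolId=\"ih2bj24l46bkgt558qsef04jeq\"/>..."; </script>'

# Allow an optional backslash before each quote so both the escaped form
# (inside the JS string) and a plain XML attribute would match.
m = re.search(r'statepoolId=\\?"([^"\\]+)\\?"', script_text)
state_pool_id = m.group(1) if m else None
print(state_pool_id)  # ih2bj24l46bkgt558qsef04jeq
```

The same pattern, translated to a JavaScript regex inside the CasperJS evaluate step, would pull the id straight out of the raw page source without parsing the embedded XML.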
### Lobsters #### Erlang Easter-Eggs ### QuantOverflow #### translating performance from EUR to USD [on hold] can someone please let me know how to translate a performance return from USD to EUR. For instance, I have a time series (return) over 7 years from an US hedge fund and would like to translate it into EUR return. I am not interested in any form of hedging, I just need the return to be denominated in EUR Thank you very much! ### AWS #### AWS Week in Review – Coming Back With Your Help! Back in 2012 I realized that something interesting happened in AWS-land just about every day. In contrast to the periodic bursts of activity that were the norm back in the days of shrink-wrapped software, the cloud became a place where steady, continuous development took place. In order to share all of this activity with my readers and to better illustrate the pace of innovation, I published the first AWS Week in Review in the spring of 2012. The original post took all of about 5 minutes to assemble, post and format. I got some great feedback on it and I continued to produce a steady stream of new posts every week for over 4 years. Over the years I added more and more content generated within AWS and from the ever-growing community of fans, developers, and partners. Unfortunately, finding, saving, and filtering links, and then generating these posts grew to take a substantial amount of time. I reluctantly stopped writing new posts early this year after spending about 4 hours on the post for the week of April 25th. After receiving dozens of emails and tweets asking about the posts, I gave some thought to a new model that would be open and more scalable. Going Open The AWS Week in Review is now a GitHub project (https://github.com/aws/aws-week-in-review). I am inviting contributors (AWS fans, users, bloggers, and partners) to contribute. Every Monday morning I will review and accept pull requests for the previous week, aiming to publish the Week in Review by 10 AM PT. 
In order to keep the posts focused and highly valuable, I will approve pull requests only if they meet our guidelines for style and content. At that time I will also create a file for the week to come, so that you can populate it as you discover new and relevant content. Content & Style Guidelines Here are the guidelines for making contributions: • Relevance -All contributions must be directly related to AWS. • Ownership – All contributions remain the property of the contributor. • Validity – All links must be to publicly available content (links to free, gated content are fine). • Timeliness – All contributions must refer to content that was created on the associated date. • Neutrality – This is not the place for editorializing. Just the facts / links. I generally stay away from generic news about the cloud business, and I post benchmarks only with the approval of my colleagues. And now a word or two about style: • Content from this blog is generally prefixed with “I wrote about POST_TITLE” or “We announced that TOPIC.” • Content from other AWS blogs is styled as “The BLOG_NAME wrote about POST_TITLE.” • Content from individuals is styled as “PERSON wrote about POST_TITLE.” • Content from partners and ISVs is styled as “The BLOG_NAME wrote about POST_TITLE.” There’s room for some innovation and variation to keep things interesting, but keep it clean and concise. Please feel free to review some of my older posts to get a sense for what works. Over time we might want to create a more compelling visual design for the posts. Your ideas (and contributions) are welcome. Sections Over the years I created the following sections: • Daily Summaries – content from this blog, other AWS blogs, and everywhere else. • New & Notable Open Source. • New SlideShare Presentations. • New YouTube Videos including APN Success Stories. • New AWS Marketplace products. • New Customer Success Stories. • Upcoming Events. • Help Wanted. 
Some of this content comes to my attention via RSS feeds. I will post the OPML file that I use in the GitHub repo and you can use it as a starting point. The New & Notable Open Source section is derived from a GitHub search for aws. I scroll through the results and pick the 10 or 15 items that catch my eye. I also watch /r/aws and Hacker News for interesting and relevant links and discussions. Over time, it is possible that groups or individuals may become the primary contributor for a section. That’s fine, and I would be thrilled to see this happen. I am also open to the addition of new sections, as long as they are highly relevant to AWS. Automation Earlier this year I tried to automate the process, but I did not like the results. You are welcome to give this a shot on your own. I do want to make sure that we continue to exercise human judgement in order to keep the posts as valuable as possible. Let’s Do It I am super excited about this project and I cannot wait to see those pull requests coming in. Please let me know (via a blog comment) if you have any suggestions or concerns. I should note up front that I am very new to Git-based collaboration and that this is going to be a learning exercise for me. Do not hesitate to let me know if there’s a better way to do things! Jeff; ### Fefe #### De Maiziere fervently hopes that none of you ... De Maiziere fervently hopes that none of you caught the Shadow Brokers story. Because if the US intelligence agency, the one you always ask when your own people are too incompetent, gets its backdoor keys stolen, how can you then stand in front of a microphone and pretend that OUR backdoors are secure? They wouldn't even be secure from GCHQ and the NSA! Probably not even from Jugend Forscht. Quite apart from the fact that I don't want to live in a country where the state believes it must be able to decrypt my diary.
#### Laugh of the day: "Rote Flora: police paint over portraits of undercover investigators". Found it! ... Laugh of the day: "Rote Flora: police paint over portraits of undercover investigators". Found it! HAHAHA ### StackOverflow #### Matlab fitcsvm gives me a zero training error and 40% in testing I know it's over-fitting to the training data set, yet I don't know how to change the parameters to avoid this. I have tried changing the BoxConstraint across 1e-5, 1e0, 1e1, 1e10 and got the same situation. tTargets = ones(size(trainTargets,1),1); tTargets(trainTargets(:,2)==1)=-1; svmModel = fitcsvm(trainData, ... tTargets,... 'KernelFunction','rbf',... 'BoxConstraint',1e0); [Group, score] = predict(svmModel, trainData); tTargets = ones(size(trainTargets,1),1); tTargets(trainTargets(:,2)==1)=-1; svmTrainError = sum(tTargets ~= Group)/size(trainTargets,1); [Group, score] = predict(svmModel, testData); tTargets = ones(size(testTargets,1),1); tTargets(testTargets(:,2)==1)=-1; svmTestError = sum(tTargets ~= Group)/size(testTargets,1);  I hope someone can help with this. Thanks, ### Lobsters #### gRPC: "a true internet-scale RPC framework is now 1.0 and ready for production deployments" ### CompsciOverflow #### Path in directed, weighted, cyclic graph with total distance closest to D? Input: Directed, weighted, cyclic graph G. Two distinct vertices in that graph, A and B, where there exists a path from A to B. A distance d. Output: A path between A and B with distance closest to d. The path need not be a simple path - it can contain repeated edges and vertices. Which algorithms exist to solve this problem? I'm looking for an optimal solution, but I'm also interested to see if there are any approximation algorithms as well. Efficiency is not a huge concern - I just want to get an idea of the algorithms available. Bonus question: are there any algorithms to compute a Hamiltonian path from A to B that visits every node, while finding the distance closest to d?
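For the non-simple-path version of the question above, one brute-force approach is breadth-first search over (vertex, accumulated distance) states, which naturally allows revisiting vertices and edges at new distances. A minimal sketch assuming positive integer weights, with a hypothetical cutoff beyond which longer walks are ignored:

```python
from collections import deque

def walk_distance_closest_to(adj, start, goal, d, cutoff):
    """Among walks from start to goal (repeats allowed), return the achievable
    total distance closest to d, exploring distances up to cutoff only."""
    seen = {(start, 0)}
    queue = deque([(start, 0)])
    best = None
    while queue:
        u, dist = queue.popleft()
        if u == goal and (best is None or abs(dist - d) < abs(best - d)):
            best = dist
        for v, w in adj.get(u, []):
            state = (v, dist + w)  # same vertex, new distance = new state
            if dist + w <= cutoff and state not in seen:
                seen.add(state)
                queue.append(state)
    return best

# A -> B of length 3 plus a B -> A back-edge: distances 3, 9, 15, ... are achievable.
adj = {'A': [('B', 3)], 'B': [('A', 3)]}
print(walk_distance_closest_to(adj, 'A', 'B', d=10, cutoff=20))  # 9
```

The state count is O(|V| * cutoff), so this is pseudo-polynomial, in the same spirit as the knapsack dynamic programs discussed elsewhere on this page; any achievable distance beyond the cutoff is simply not considered.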
### StackOverflow #### counting patterns in image I'm working on an algorithm that counts patterns (bars) in a specific image. It seemed to me very simple at the first look, but I realized the complexity quickly. I have tried simple thresholding, template matching (small sliding windows), edge detection... I have just few images like this one. so I think that a machine learning algorithm can't give better results! but I still need suggestions. ### CompsciOverflow #### Running time of naive recursive implementation of unbounded knapsack problem How does one go about analyzing the running time of a naive recursive solution to the unbounded knapsack problem? Lots of sources say the naive implementation is "exponential" without giving more detail. For reference, here's a bit of Python code that implements the brute-force solution. Note that this can run for a long time even on smallish inputs. One of the interesting things about Knapsack is some inputs are lot harder than others. import random, time import sys class Item: def __init__(self, weight, value): self.weight = weight self.value = value def __repr__(self): return "Item(weight={}, value={})".format(self.weight, self.value) def knapsack(capacity): if capacity==0: return ([], 0) max_value = 0 max_contents = [] for item in items: if item.weight <= capacity: (contents, value) = knapsack(capacity-item.weight) if value + item.value > max_value: max_value = value + item.value max_contents = [item] max_contents.extend(contents) return (max_contents, max_value) def generate_items(n, max_weight, vwratio=1): items = [] weights = random.sample(range(1,max_weight+1),n) for weight in weights: variation = weight/10 value = max(1, int(vwratio*weight + random.gauss(-variation, variation))) item = Item(weight, value) items.append(item) return items n=30 max_item_weight=100 capacity=100 items = generate_items(n=n, max_weight=max_item_weight, vwratio=1.1) st = time.time() solution, value = knapsack(capacity) print("completed in 
%f"%(time.time() - st))  Note that this algorithm can be improved upon nicely by memoizing, yielding an O(nW) pseudo-polynomial time solution, but I was interested in understanding how to analyze the brute-force algorithm more precisely. ### Lobsters #### Browser buttons not seen as buttons A user fails to recognize native unstyled buttons in the browser. Something amazing has happened: every website has replaced buttons with custom styled widgets and users never see real buttons anymore. There was a time when users would recognize a button, even if it didn’t say “click me!”, because it looked like a button. Original title: I can’t even. Comments ### StackOverflow #### Scala Higher Order Functions and Implicit Typing I recently started working with Functional Programming in Scala and am learning Scala in the process. While attempting one of the Chapter 2 exercises to define a function that curries another function, I ran into this: If I write def curry[A,B,C](f: (A,B) => C): A => B => C = a: A => b: B => f(a, b)  then I get Chapter2.scala:49: error: ';' expected but ':' found. a: A => b: B => f(a, b) _______^ one error found BUT if I write def curry[A,B,C](f: (A,B) => C): A => B => C = a => b => f(a, b)  then it compiles fine, with no warnings, and works. What's the difference? #### Unbalanced labels - Better results in Confusion Matrix I have unbalanced labels. That is, in a binary classifier, I have more positive (1) data and fewer negative (0) data. I'm using Stratified K Fold Cross Validation and getting true negatives as zero. Could you please let me know what options I have to get a value greater than zero for true negatives? Thanks in advance! #### set the number of folds within GridSearchCV (scikit-learn) How do I set the number of folds within GridSearchCV? I'm using GridSearchCV in order to tune multiple hyperparameters.
One of them is the number of folds, but I can't set it in the way I'm used to: grid = GridSearchCV( cv=StratifiedKFold(y, n_folds=2) # <= change n_folds ) params = { 'cv': [5,7,10], # this doesn't work }  #### Redux store with math formulas as functions While writing an engineering application with a React Redux framework we have come across an issue of having a database of products that have functions to work out their load capacities and other properties. I know it is not a good idea to load the functions into the store, and retrieving the functions from another location in the reducer breaks purity and makes the reducer much harder to test. Is there any React Redux way of supplying the reducers with the database of product functions as a parameter, or similar, without putting them in the store and without breaking purity? Edit: Each of the products has functions that might describe, for example, the relationship between jack extension and load capacity. This relationship is usually non-linear and has a graph that will relate the capacity over its extendable range. We have used curve fitting tools to match these graphs to functions over their range. I would like to be able to use these functions in a reducer such that when someone selects a product and extension we can obtain the capacity and check its suitability against other calculated loads in the state. #### Mismatch between training error and results on training set examples after rbm re-training/tuning I’m having some trouble tuning an RBM in a specific fashion. My goal is to retrain an RBM in such a way that a full generation of an example can be made from only half the input. In my case a full input vector would be two separate digits from the MNIST dataset. The RBM I’m currently stuck with is the top layer of a system, with two parallel RBMs below it, that is supposed to be a sort of association layer.
The idea is to disconnect the bidirectional weights of the RBM into recognition(bottom-up) and generation(top-down) weights. The RBM implementation used is a modified example from the Theano deep learning tutorial: http://hastebin.com/javehiqaxe.py Plotting examples directly after training, using a full input vector, produced good results, since the input is correctly generated after a full pass through the system. The code I’m currently using is this: http://hastebin.com/zarucokepi.py It currently calculates an error between a target activation(using the weights as they were after initial training), using the whole input vector, and a partial activation that is created after setting the second half of the input to 0(this uses the recognition weights that are being trained). The gradient of this error function with respect to the recognition weights (only used in the partial activation generation) and the recognition weight are updates accordingly. The error decreases during this training/tuning step, sometimes as low as an error of 0.007 per batch of 25 after 50 epochs, but the results aren’t very good. The first half of the input is represented decently well when an example is plotted, but the second half is not represented well at all. This conflicts with the low error rates shown during the training/tuning step and I have no idea why this is. Especially since the examples are selected from the training set. The dataset consists of 25k examples with length 1000 that are the results of the bottom two RBMs stitched together. I’ve tried numerous update rules for the recognition weights(including direct update formulas), as well as different batch sizes and learning rates, but nothing seems to work. #### How to write testable code in Swift So this question of mine all began when I started to do unit testing for a simple 2 line of postNotification and addObserver. 
From this similar question here you can see that to make it testable you need to add ~20 lines and depart from the common way of writing your code. Facing this problem was actually the first time I understood the difference between Unit Testing and TDD. Unit testing is easy if your code is testable, i.e. if you are following the TDD mindset. Next I was brought to how I can write testable code, for which I didn't find many guidelines; every tutorial simply jumps into writing a unit test. Apple's own documentation has nothing for this. My initial thought was that I need to aim for 'functional programming' and write my functions in a pure-function way. But then again this is very time-consuming and may require lots of refactoring in existing code, or even for new projects lots of added lines, and I'm not even sure if that is the correct approach. Are there any suggested guidelines or standards for writing testable code? I am not asking how to write testable code for NSNotificationCenter, I am asking for general guidelines for writing testable code. #### Is my RNN underperforming? After getting a hold of backprop and implementing my first simple feed-forward NN in numpy, I decided to try making it recurrent. I'm not sure if I'm doing this completely correctly, but here's what I've done: I've defined a set of weight matrices, one per hidden layer, to connect the hidden layers of the previous state to those of the current state. For forward propagation, I'm taking the dot product of the previous hidden layer states and their respective weight matrices (previous hidden to current hidden), and then I'm adding this new activation to the current state's activation for each hidden layer. Then I'm passing the new combined activation through a sigmoid function.
For backprop, I'm calculating the deltas for these weights by taking the dot product of the current state's hidden layer errors (for each hidden layer) and the previous state's hidden layer activations, and then I'm subtracting the deltas from the (previous state to current state) weight matrices. I tested it on simple examples like getting it to learn how to replicate the word "hello" while only being trained on one letter at a time (like the example from Karpathy's article on RNNs) and it works, but it requires large hidden layers and over a thousand training iterations to actually learn how to produce "hello" given the seed letter "h" (I got it working with a single hidden layer with 50 nodes). Not sure if it's relevant, but for the "hello" example I was using 28 inputs (1-of-k encoded vector for 26 letters plus " " and ".") instead of just 4 inputs like in Karpathy's example. (EDIT: just tried with only 4 inputs (and 10 hidden nodes) and it converges in around 5-20 epochs most of the time, but sometimes it never converges if it has multiple hidden layers. Much more reasonable, but not sure if this is still worse than expected.) Is this normal for a vanilla RNN, or have I done something wrong? It seems like it's underperforming, but I don't really have a reference point. Thanks for the help! #### What is the output of a machine learning algorithm? I'm starting to study machine learning. I have a basic knowledge about it. If I consider a generic machine learning algorithm M, I would like to know its precise inputs and outputs. I'm not referring to some kind of implementation in any particular programming language. I'm talking about the theory of machine learning. Take the example of supervised learning. The input of M should be the collection of pairs related to the function f the algorithm must learn. So, it will build some function h which approximates f. Should the output of M be h? And what about unsupervised machine learning?
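The supervised picture described above can be made concrete with a toy learner: the input to M is a set of pairs (x, f(x)) for an unknown target f, and the output is a hypothesis h approximating f. Here M is ordinary least-squares line fitting, an illustrative choice rather than the only one:

```python
# Samples of the "unknown" function f, which the learner only sees as pairs.
xs = [i / 10 for i in range(-10, 11)]
f = lambda x: 3 * x + 1
ys = [f(x) for x in xs]

# M: least-squares fit of a line; its output is the hypothesis h.
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
intercept = my - slope * mx
h = lambda x: slope * x + intercept

print(round(slope, 6), round(intercept, 6))  # recovers 3.0 and 1.0
```

For unsupervised learning the input is just the x's, and the output is a structure over them (clusters, a density estimate, a lower-dimensional embedding) rather than an approximation of a labelled target.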
### CompsciOverflow

#### Is NP-completeness defined in terms of polynomial reductions or polynomial transformations? [duplicate]

This question already has an answer here: How do you know that a decision problem $X$ is NP-complete? Is it NP-complete if all other NP problems polynomially transform to $X$, or if all other NP problems polynomially reduce to $X$ (i.e., there is a polynomial-time algorithm for every problem in NP given an oracle for $X$)? Definitions seem to differ all over the web. Thanks!

### StackOverflow

#### Multi-label feature selection using sklearn

I'm looking to perform feature selection with a multi-label dataset using sklearn. I want to get the final set of features across labels, which I will then use in another machine learning package. I was planning to use the method I saw here, which selects relevant features for each label separately:

from sklearn.svm import LinearSVC
from sklearn.pipeline import Pipeline
from sklearn.feature_selection import chi2, SelectKBest
from sklearn.multiclass import OneVsRestClassifier

clf = Pipeline([('chi2', SelectKBest(chi2, k=1000)),
                ('svm', LinearSVC())])
multi_clf = OneVsRestClassifier(clf)

I then plan to extract the indices of the included features, per label, using this:

selected_features = []
for i in multi_clf.estimators_:
    selected_features += list(i.named_steps["chi2"].get_support(indices=True))

Now, my question is: how do I choose which selected features to include in my final model? I could use every unique feature (which would include features that were relevant for only one label), or I could do something to select features that were relevant for more labels. My initial idea is to create a histogram of the number of labels a given feature was selected for, and to identify a threshold based on visual inspection. My concern is that this method is subjective. Is there a more principled way of performing feature selection for multi-label datasets using sklearn?
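One way to make the histogram idea concrete is to count, for each feature, how many per-label selectors kept it and then apply a cutoff. The helper below is my own illustration (the function name and the cutoff are arbitrary), not an sklearn API:

```python
from collections import Counter

def features_above_threshold(per_label_selections, min_labels):
    """per_label_selections: list of iterables of feature indices, one per
    label. Keep features selected for at least min_labels labels."""
    counts = Counter()
    for selected in per_label_selections:
        counts.update(set(selected))  # each label votes at most once per feature
    return sorted(f for f, c in counts.items() if c >= min_labels)

# Toy example: 3 labels, each with its own chi2-selected feature indices
per_label = [[0, 1, 2], [1, 2, 5], [2, 5, 7]]
print(features_above_threshold(per_label, min_labels=2))  # [1, 2, 5]
```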
### QuantOverflow

#### Calculating Implied Forward Rates from Eurodollar Futures Quotes

I'm trying to calculate the implied forward rates of the Eurodollar (USD) curve, knowing that the Eurodollar curve is supposed to be a mirror of the yield curve (else arb). I have this formula for the value of the strip:

$$Strip = \frac{\prod_{i=1}^{n}\left(1 + R_i \cdot \frac{days_i}{360}\right) - 1}{\frac{term}{360}}$$

Using this for current values of LIBOR, I have /GEZ6, /GEH7, /GEM7, /GEU7 to replicate a 1-year forward curve. The rates are $R_1 = 93.5bp$, $R_2 = 95bp$, $R_3 = 98bp$, $R_4 = 101bp$. Using this formula gives me the value of the strip at 97.2 basis points, which I'm confident is wrong. How do I value the 1-year interest rate forward at December?

### Lobsters

#### Build Your Own Web Framework In Go (2014)

### TheoryOverflow

#### Estimating the rank of a large sparse matrix

Consider a large sparse $n \times n$ matrix. Are there any methods to estimate its rank in time roughly proportional to the number of elements in the matrix?

### Lobsters

#### Sharp Tools

### QuantOverflow

#### What is the analytic solution of the expected shortfall for annual losses?

Assume we have annual losses $Z_i \sim Lognormal(0, 1)$, and $Z = \sum_{i=1}^N Z_i$ with $N$ fixed. What is the closed form of the expected shortfall $ES_{0.99}$?

### CompsciOverflow

#### Average prefix code length of every 4-sized frequency vector is bounded by 2

I'm trying to show that for every frequency vector $(p_1, p_2, p_3, p_4)$ such that $\sum_{i=1}^4 p_i = 1$, the average word length output by the Huffman algorithm is bounded by 2: if $(w_1, w_2, w_3, w_4)$ is the output code, then $\sum_{i=1}^4 p_i |w_i| \le 2$. I've tried looking at the tree generated by the Huffman algorithm, but several different tree structures match different 4-sized frequency vectors, and I can't say anything general about all of them. Also, is there a more general theorem for $k, n$ (here $k=4, n=2$)?
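The claimed bound can at least be checked numerically. The sketch below (my own code) runs Huffman's algorithm on random 4-symbol frequency vectors and asserts that the average codeword length never exceeds 2; this is consistent with the simple observation that the balanced code of four 2-bit words achieves average length exactly 2, and Huffman is optimal:

```python
import heapq
import random

def huffman_lengths(freqs):
    """Return the codeword lengths produced by Huffman's algorithm."""
    # Each heap entry: (subtree weight, tiebreak value, symbol indices in subtree)
    heap = [(p, i, [i]) for i, p in enumerate(freqs)]
    heapq.heapify(heap)
    lengths = [0] * len(freqs)
    while len(heap) > 1:
        w1, _, s1 = heapq.heappop(heap)
        w2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:  # every merge adds one bit to these symbols' codes
            lengths[s] += 1
        heapq.heappush(heap, (w1 + w2, len(heap), s1 + s2))
    return lengths

random.seed(0)
for _ in range(10000):
    p = [random.random() for _ in range(4)]
    total = sum(p)
    p = [x / total for x in p]  # normalize to a frequency vector
    avg = sum(pi * li for pi, li in zip(p, huffman_lengths(p)))
    assert avg <= 2 + 1e-12, (p, avg)
print("bound holds on all samples")
```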
#### Why do Tarjan's and Kosaraju's algorithms for finding strongly connected components have the same running-time complexity?

I followed an explanation of Kosaraju's and Tarjan's strongly-connected-components algorithms, and they say that both have O(|V|+|E|) time complexity. That didn't make sense to me, since Kosaraju's uses two DFS passes and computes the transposed graph, but Tarjan's uses only one DFS.

### Lobsters

#### Help us name a new Mercurial feature

Jun Wu of Facebook is proposing a new hg feature which helps with editing a commit stack. Basically, it looks through your recent draft commits (commits which haven't been pushed to a public repo) and your working directory, and automatically updates the right draft commits with the changes in your working directory that best correspond according to hg annotate/blame information. The intent, of course, is to make it easier and automatic to clean up a series of WIP commits. What should this be called? Current proposals are things like hg amend --ancestors or hg histedit --smart. Jun Wu called it hg smartfixup. I don't really like "smart" myself, as I don't find it very descriptive, and we already use "amend" essentially as a synonym for "fixup". We take naming things kind of seriously, as they are our UI. Like the Master said, "If language is not correct, then what is said is not what is meant; if what is said is not what is meant, then what must be done remains undone; if this remains undone, morals and art will deteriorate; if justice goes astray, the people will stand about in helpless confusion. Hence there must be no arbitrariness in what is said. This matters above everything." ;-)

### Fefe

#### Oh wow, Nvidia has lost serious market share ...

Oh wow, Nvidia has lost serious market share. I did not see that coming. I guess I have to take back my criticism of AMD's RX480 strategy. Wow, that is quite a bloodbath. It also explains why Nvidia cranked its prices up so high.
Not only because they could, but because they thought they had to.

### Lobsters

#### Contextual Bandits - An introduction

### StackOverflow

#### How to get probability output using libSVM (package e1071) in R?

I'm trying to get probability output from libSVM (package e1071 in R), but the output is only TRUE or FALSE with my dataset. Here is the code:

dadosBrutos <- read.csv("Dataset/circle.data", header = FALSE)
svm.modelo <- svm(V3 ~ ., data=conjuntoTreinamento, type='C-classification', probability=TRUE)
# other options tried: cost=c, kernel='linear', scale=FALSE, verbose=FALSE
svm.predict <- predict(svm.modelo, subset(conjuntoTreinamento, select = -V3), probability=TRUE)
posterior <- as.matrix(svm.predict)

But when I use the Iris dataset, for example, the output is class probabilities rather than just class names:

library(e1071)
model <- svm(Species ~ ., data = iris, probability=TRUE)
pred <- predict(model, iris, probability=TRUE)
head(attr(pred, "probabilities"))
#      setosa versicolor   virginica
# 1 0.9803339 0.01129740 0.008368729
# 2 0.9729193 0.01807053 0.009010195
# 3 0.9790435 0.01192820 0.009028276
# 4 0.9750030 0.01531171 0.009685342
# 5 0.9795183 0.01164689 0.008834838
# 6 0.9740730 0.01679643 0.009130620

Can someone help me understand this? Thank you, Albert F. J. Costa

#### Using SMOTE on an unbalanced dataset

I have a two-class unbalanced dataset where the ratio is 20:1. I am using SMOTE to oversample the minor class, and I wanted to know, when using SMOTE to develop a usable model, whether it is best to oversample until the classes are balanced (i.e. 1:1), or to establish through trial and error the lowest ratio that improves the model to an acceptable level (e.g. F1 score > 0.7), so as not to use too many synthetic samples, if that makes sense. Any thoughts/advice appreciated.

### Lobsters

#### Top Eight Must-Listen Developer Podcasts

This is a list of programming podcasts.
To the list I'll add garbage, by lobste.rs's own @jcs. Comments

### QuantOverflow

#### Calculating net interest income and net interest margin

The table below contains financial statement information from the CBA 2013 annual report. I am asked to find the net interest income and net interest margin. My answers are as follows: net interest income = interest income - interest expense = 35 - 21 = 14; net interest margin = (673 - 632)/673. I'm not sure how to calculate the net interest margin, but would anyone be able to verify whether my answers are correct?

### Lobsters

#### Elastic Routing in Runnable

### StackOverflow

#### Is it possible to build TensorFlow for a GTX 1070?

I have an Ubuntu 14.04 LTS 64-bit machine with an Nvidia video card, a GTX 1070 (10-series). I'm trying to build TensorFlow. I tried building it with CUDA 7.5 and cuDNN 5, but it turned out the CUDA 7.5 I installed requires the 352.63 video driver, while the driver I downloaded from Nvidia for the GTX 1070 was 367.35, a newer version. TensorFlow managed to build, but when I ran the example, there was a problem at runtime:

boyko@boyko-pc:~/Desktop/tensorflow/tensorflow/models/image/mnist$ LD_LIBRARY_PATH=/usr/local/cuda-7.5/targets/x86_64-linux/lib python3 convolutional.py

It failed to find CUDA because of driver mismatch:

E tensorflow/stream_executor/cuda/cuda_driver.cc:491] failed call to cuInit: CUDA_ERROR_NO_DEVICE
E tensorflow/stream_executor/cuda/cuda_diagnostics.cc:296] kernel version 367.35.0 does not match DSO version 352.63.0 -- cannot find working devices in this configuration

Full log - http://pastebin.com/xiYtNsHk

CUDA 7.5 needs the 352.63 video driver, but the GTX 1070 needs 367.35. The problem is that TensorFlow officially supports only CUDA 7.5, so the requirements are contradictory.

What do I need to do? Is it possible to use the 352.63 driver on a GTX 1070? Will it run, even if only with a limited feature set? Or is there a CUDA 7.5 version built against this driver, or a way to build TensorFlow against CUDA 8.0?

This is a related question I found - Tensorflow Bazel 0.3.0 build CUDA 8.0 GTX 1070 fails.

### QuantOverflow

#### Positive Cross-Autocorrelation

Can anyone explain to me what positive cross-autocorrelation is? Lo and MacKinlay (1990) refer to it. I am aware of positive autocorrelation and positive cross-correlation, but I can't get my head around positive cross-autocorrelation.

### StackOverflow

#### Variable method reference in java 8

I am trying to create a method reference from a variable that holds the name of some method of an object:

SomeClass obj = new SomeClass();
String methodName = "someMethod";


I am looking for a way to create exactly obj::someMethod, but using the variable methodName. Is it possible?

I know how to create functional interface instance from methodName and obj:

() -> {
    try {
        return obj.getClass().getMethod(methodName).invoke(obj);
    } catch (NoSuchMethodException | IllegalAccessException | InvocationTargetException e) {
        return null;
    }
};


but I am wondering whether this can be done in a more shorthand way.

### CompsciOverflow

#### Is “Binary Rectangle Tree” NP-hard?

It's the 2D version of this problem: http://cstheory.stackexchange.com/questions/33982/is-binary-interval-tree-np-hard

The input is a set of $n$ rectangles $R=\{R_1, \dots,R_n\}$, where each $R_i=I_1 \times I_2$ and $I_1, I_2$ are real intervals. The output should be the following rooted binary tree: each leaf node corresponds to a rectangle from $R$, and each interior node contains a rectangle enclosing the rectangles of both of its child nodes. The goal is to minimize the sum of surface areas of the rectangles in the interior nodes.

Example. Input $R=\{R_1, R_2, R_3\}=\{[0,1]\times[0,1],[1,2]\times[0,2],[20,25]\times[1,5]\}$

leaf nodes: $L_1=R_1,~L_2=R_2,~L_3=R_3$

interior nodes $I_1=(L_1,L_2),~I_2=(I_1,L_3)$

$SA(I_1)=4,~SA(I_2)=125$

total sum of surface areas is $129$
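The example's cost can be verified mechanically. The brute force below (my own sketch, exponential time, fine for three rectangles) enumerates all binary merge trees and reports the minimum total interior area:

```python
from itertools import combinations

def enclose(a, b):
    # a, b are ((x1, x2), (y1, y2)) axis-aligned rectangles
    (ax1, ax2), (ay1, ay2) = a
    (bx1, bx2), (by1, by2) = b
    return ((min(ax1, bx1), max(ax2, bx2)), (min(ay1, by1), max(ay2, by2)))

def area(r):
    (x1, x2), (y1, y2) = r
    return (x2 - x1) * (y2 - y1)

def best_tree_cost(rects):
    """Exhaustively try every pairwise merge order and return the minimum
    total area of the interior bounding rectangles."""
    rects = tuple(rects)
    if len(rects) == 1:
        return 0
    best = float("inf")
    for i, j in combinations(range(len(rects)), 2):
        merged = enclose(rects[i], rects[j])
        rest = [r for k, r in enumerate(rects) if k not in (i, j)]
        best = min(best, area(merged) + best_tree_cost(rest + [merged]))
    return best

R = [((0, 1), (0, 1)), ((1, 2), (0, 2)), ((20, 25), (1, 5))]
print(best_tree_cost(R))  # prints 129, matching the example
```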

The 1D version can be solved in polynomial time by dynamic programming. I can use dynamic programming here as well, but it needs exponential memory. I don't see any obvious reduction from an NP-complete problem.

Is the 2D version NP-hard?

### TheoryOverflow

#### Why don't we transmit at rates higher than the Shannon capacity if we are going to get a nonzero probability of error anyway?

Shannon capacity $C$ is the upper limit on a rate $R$ defined as the number of information symbols $k$ divided by the number of transmitted symbols $n$, that can be transmitted over a channel such that as $n \rightarrow \infty$, the probability of error goes to zero. If a rate $R > C$ is used, the probability of error is bounded away from zero.

Since $n$ is finite in practical applications, there is some nonzero probability of error. So why would it matter if we transmit at a rate higher than $C$? Shannon's theorem predicts that such a rate will have a nonzero probability of error, but we already have a nonzero probability of error because $n$ is finite. In other words, why don't we transmit at rates higher than $C$ if we are going to get a nonzero probability of error anyway?
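For context, a small illustration of where the capacity number comes from (my own example, not from the question): for a binary symmetric channel with crossover probability $p$, the capacity is $C = 1 - H(p)$, where $H$ is the binary entropy function. Below $C$ the block error probability can be driven toward zero as $n$ grows; above $C$ it stays bounded away from zero.

```python
import math

def binary_entropy(p):
    """H(p) in bits; H(0) = H(1) = 0 by convention."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_capacity(p):
    """Capacity of a binary symmetric channel with crossover probability p."""
    return 1.0 - binary_entropy(p)

print(bsc_capacity(0.11))  # roughly 0.5 bits per channel use
```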

### CompsciOverflow

#### TCTL / UPPAAL: how to verify a certain order of events?

I'd like to check whether a certain order of events happens if another property holds true, using UPPAAL and TCTL.

If A==true then eventually (B==true and eventually (C==true and eventually (D==true)))

But since "In contrast to TCTL, Uppaal does not allow nesting of path formulae." I'm not sure how to do this.

What I've got so far is only something like:

E$\diamond$ (A && B)

E$\diamond$ (A && C)

E$\diamond$ (A && D)

### UnixOverflow

#### Does ZFS for Linux over stress VirtualBox?

I've been using MD raid + LVM for many years, but recently decided to take a look at ZFS. In order to try it, I created a VirtualBox VM with a similar layout to my main server - 7 'SATA' drives of various sizes.

I set it up with an approximation of my current MD+LVM configuration and worked out the steps I needed to follow to rearrange files, LVs, VGs etc., to make space to try ZFS. All seemed OK - I moved and rearranged PVs until I had the space set up, over a period of 3 days' uptime.

Finally, I created the first ZPool:

  pool: tank
state: ONLINE
scan: none requested
config:

tank        ONLINE       0     0     0
raidz1-0  ONLINE       0     0     0
sdb1    ONLINE       0     0     0
sdc1    ONLINE       0     0     0
sdd1    ONLINE       0     0     0
sde1    ONLINE       0     0     0
sdg1    ONLINE       0     0     0

errors: No known data errors


I created a couple of ZFS datasets and started copying files using both cp and tar, e.g. cd /data/video; tar cf - . | (cd /tank/video; tar xvf -).

I then noticed that I was getting SATA errors in the virtual machine, although the host system shows no errors.

Apr  6 10:24:56 model-zfs kernel: [291246.888769] ata4.00: exception Emask 0x0 SAct 0x400 SErr 0x0 action 0x6 frozen
Apr  6 10:24:56 model-zfs kernel: [291246.888801] ata4.00: failed command: WRITE FPDMA QUEUED
Apr  6 10:24:56 model-zfs kernel: [291246.888830] ata4.00: cmd 61/19:50:2b:a7:01/00:00:00:00:00/40 tag 10 ncq 12800 out
Apr  6 10:24:56 model-zfs kernel: [291246.888830]          res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
Apr  6 10:24:56 model-zfs kernel: [291246.888852] ata4.00: status: { DRDY }
Apr  6 10:24:56 model-zfs kernel: [291246.888883] ata4: hard resetting link
Apr  6 10:24:57 model-zfs kernel: [291247.248428] ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Apr  6 10:24:57 model-zfs kernel: [291247.249216] ata4.00: configured for UDMA/133
Apr  6 10:24:57 model-zfs kernel: [291247.249229] ata4.00: device reported invalid CHS sector 0
Apr  6 10:24:57 model-zfs kernel: [291247.249254] ata4: EH complete


This error occurs multiple times on various different drives, occasionally with a failed command of 'READ FPDMA QUEUED' or (twice) 'WRITE DMA', to the extent that the kernel eventually reports:

Apr  6 11:51:32 model-zfs kernel: [296442.857945] ata4.00: NCQ disabled due to excessive errors


This does not stop the errors being reported.

An internet search showed that this error had been logged on the VirtualBox.org web sites about 4 years ago (https://www.virtualbox.org/ticket/8311) for version 4.0.2 of VirtualBox and was apparently considered fixed, but then reopened.

I'm running VirtualBox 4.3.18_Debian r96516 on Debian (Sid) kernel version 3.16.0-4-amd64 (which is the guest OS as well as the host OS). ZFS is version 0.6.3, from ZFSonLinux.org/debian.html.

I would have thought more work had been done on this in the intervening years. I can't believe I'm the only person to try out ZFS under VirtualBox, so I would have expected this error to have been identified and resolved, especially as versions of both ZFS and VirtualBox are maintained by Oracle.

Or is it simply the case that ZFS stresses the virtual machine to its limits and the simulated drive/controller just can't respond fast enough?

Update:

In the 14 hours since I created the pool, the VM has reported 204 kernel ATA errors. Most of the failed commands are 'WRITE FPDMA QUEUED', followed by 'READ FPDMA QUEUED', 'WRITE DMA', and a single 'FLUSH CACHE'. Presumably ZFS retried the commands, but so far I am wary of using ZFS on a real server if it produces so many errors in a virtual machine!

### TheoryOverflow

#### Original reference for Huffman shaped Merge Sort?

What is the first publication of the concept of optimizing merge sort by

1. identifying sequences of consecutive positions in increasing order (aka runs) in linear time; then
2. repeatedly merging the two shortest such sequences and adding the result of this merging to the list of sorted fragments.
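The two steps above can be sketched as follows (my own illustration, not code from any of the cited publications):

```python
import heapq
from itertools import count

def merge(a, b):
    """Standard linear merge of two sorted lists."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i]); i += 1
        else:
            out.append(b[j]); j += 1
    out.extend(a[i:]); out.extend(b[j:])
    return out

def runs(xs):
    """Split xs into maximal non-decreasing runs in one linear pass."""
    if not xs:
        return [[]]
    result, cur = [], [xs[0]]
    for x in xs[1:]:
        if x >= cur[-1]:
            cur.append(x)
        else:
            result.append(cur); cur = [x]
    result.append(cur)
    return result

def huffman_merge_sort(xs):
    """Repeatedly merge the two shortest runs, as in a Huffman tree."""
    tie = count()  # tiebreaker so the heap never compares the lists themselves
    heap = [(len(r), next(tie), r) for r in runs(xs)]
    heapq.heapify(heap)
    while len(heap) > 1:
        _, _, a = heapq.heappop(heap)
        _, _, b = heapq.heappop(heap)
        m = merge(a, b)
        heapq.heappush(heap, (len(m), next(tie), m))
    return heap[0][2]

print(huffman_merge_sort([3, 4, 5, 1, 2, 9, 0, 7, 8]))  # [0, 1, 2, 3, 4, 5, 7, 8, 9]
```

Merging short runs first is what makes the total merge cost adaptive to the entropy of the run-length distribution, which is the point of the trick.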

In some of my publications (e.g. http://barbay.cl/publications.html#STACS2009, http://barbay.cl/publications.html#TCS2013) I used this trick to sort faster and to generate a compressed data structure for permutation.

It seems that this trick was introduced before, just in the context of faster sorting, but neither I nor my student have been able to track down the reference.

### Fred Wilson

#### Understanding VCs

I saw Joe Fernandez‘ tweet a few days ago and thought “he is making an important point.”

VCs are not heroes. We are just one part of the startup ecosystem. We provide the capital allocation function and are rewarded when we do it well and eventually go out of business when we don’t do it well. I know. I’ve gone out of business for not doing it well.

If there are heroes in the startup ecosystem, they are the entrepreneurs who take the biggest risks and create the products, services, and companies that we increasingly rely on as tech seeps into everything.

VCs do have a courtside seat to the startup world by virtue of meeting and getting pitched by hundreds of founding teams a year and sitting in board meetings for many of these groundbreaking tech companies. We get to see things that most people don’t see and the result of that is that we often have insights that come from this unique view we are given of the startup sector.

Another thing that is important to know about VCs is that we operate in a highly competitive sector where usually only one or two VC firms are allowed to make a hotly contested investment. So in order to succeed, VCs need to market ourselves to entrepreneurs. There are many ways to do that and the best way is to back the most successful companies and be known for doing that. There is a reason that Mike Moritz and John Doerr were invited to lead Google’s initial VC round. By the time that happened, they had established themselves as the top VCs in the bay area and their firms, Sequoia and Kleiner Perkins, had established themselves as the top firms in the bay area.

Another way that VCs market ourselves to entrepreneurs is via social media. And blogging is one of the main forms of social media that VCs can use to do this. And, given that VCs have this unique position to gather insights from the startup sector, we can share these insights that we gain from our daily work with the world, and in particular entrepreneurs. If anyone has played this blogging game well enough to get into the top tier, it is me. I know of what I speak.

So how should entrepreneurs use this knowledge that is being imparted by VCs on a regular basis? Well first and foremost, you should see it as content marketing. That is what it is. That doesn’t mean it isn’t useful or insightful. It may well be. But you should understand the business model supporting all of this free content. It is being generated to get you to come visit that VC and offer them to participate in your Seed or Series A round. That blog post that Joe claimed is not scripture in his tweet is actually an advertisement. Kind of the opposite of scripture, right?

But you should also know that there is data behind that blog post, gained from hundreds (or thousands) of pitches and dozens (or hundreds) of board meetings. If VCs are good at anything, we are good at pattern recognition and inferring what these patterns are leading to. And so these blog posts that are not scripture, and are in fact advertising, can also contain information and sometimes even wisdom. So they should not be ignored either.

What I recommend to entrepreneurs is to listen carefully but not act too quickly. Get multiple points of view on important issues and decisions. And then carefully consider what to do with all of that information, filter it with your values, your vision, and your gut instinct. That’s what we do in our business and that is what entrepreneurs should do in their businesses.

If you are at a board meeting and a VC says “you should do less email marketing and more content marketing”, would you go see your VP Marketing after the meeting and tell them to cut email and double down on content? I sure hope not. I hope you would treat that VC comment as a single data point, to be heard, but most likely not acted on unless you get a lot of similar feedback.

VCs are mostly not idiots and can be quite helpful. But we are not gods and our word is not scripture. If you treat us like that, you are making a huge mistake. And I appreciate Joe making that point last week and am happy to amplify it with this post.

### infra-talk

#### Conference Room A/V Build-Out

We recently moved to our new building at 1034 Wealthy. We took the opportunity to update the A/V equipment for our conference rooms. Previously, we largely relied on projectors for presentation capabilities, an external USB microphone/speaker for audio, built-in webcams on laptops for video, and a table where we staged everything. This worked, but it was certainly not ideal. With the new building, I had the opportunity to standardize a new conference room A/V build-out that would be better suited to our needs.

All of our new conference rooms now have a mobile TV stand which holds all of our A/V equipment. This includes a large flatscreen TV, dedicated webcam, dedicated microphone/speaker, and all necessary cables and connectors. Our new setup provides important capabilities required for many of our meetings, especially teleconferences: mobility, audio input, audio output, video input, and video output.

## Capabilities

### Mobility

I chose the Kanto Living MTM82PL mobile TV mount, which includes the mounting hardware for a flatscreen TV, a small shelf, and a shelf for a webcam above the TV. It is a sleek, yet sturdy platform which allows our A/V build-out to be mobile. While largely dedicated to conference rooms, it can also be moved out to other areas–such as our cafe–for events or meet-ups.

### Video Output

The Samsung 65″ Class KU6300 6-Series 4K UHD TV was selected as our primary display. This provides a much better picture and much higher resolution than the old projectors we were using. It has a native resolution of 3840 x 2160, a 64.5″ screen (diagonal), and 3 HDMI ports. While not all of our devices can support that resolution at this point (for example, AppleTVs only support up to 1080p), it still seemed like a worthwhile investment to help future-proof the solution.

### Video Input

I chose the Logitech HD Pro Webcam C920 for video capabilities. It supports 1080p video when used with Skype for Windows, and 720p video when used with most other clients. The primary benefit of this webcam is that it can be mounted above the TV on the mobile stand, providing a wide view of the entire room–rather than just the person directly in front of the built-in laptop webcam.

### Audio Input/Output

We had previously made use of the Phoenix Audio Duet PCS as a conference room “telephone” for web meetings–it provides better audio capabilities for a group of people than a stand-alone laptop. We placed one of these in each of the conference rooms as part of the A/V build-out. It acts as the microphone and speaker, while using the Logitech webcam for video input and the Samsung TV for video output.

## Helpers

Of course, I needed a few other items to tie all of these different capabilities together.

### Cabling

I purchased 20 ft. Luxe Series High-Speed HDMI cables so people can connect directly to the Samsung TVs for presentations. This type of connection allows computers to utilize the full resolution of the new TVs.

The Moshi Mini DisplayPort to HDMI adapter provides connectivity for those Atoms whose MacBooks do not natively support HDMI.

### Presentation Helpers

I decided to purchase Apple TVs to allow for wireless presentation capabilities. With AirPlay, Macs (and other compatible devices) can transmit wirelessly to the TV–without the need for an HDMI cable. This is convenient for getting up and running quickly without any cable clutter, but it isn’t always appropriate (which is why a direct HDMI connection is available as well).

### Cable Management

In addition to the standard cable ties and other cable management tricks, I’ve found that Cozy Industries, makers of the popular MagCozy, also makes a DisplayCozy. This helps keep the Moshi HDMI adapter with the HDMI cable.

### Power Distribution

While the mobile TV cart provides a great deal of flexibility, the new building also has wide spaces between electrical outlets. To ensure that the A/V build-out would be usable in most spaces, I decided to add a surge protector with an extra-long cord. The Kensington Guardian 15′ works well for this.

## Finished Product

The post Conference Room A/V Build-Out appeared first on Atomic Spin.

### QuantOverflow

#### Shifted SABR for negative strikes

I am trying to apply SABR on EUR inflation caplets, with positive forward and negative strikes. Classical BS pricing is undefined, and so is SABR. I have read about the shifted SABR, which is supposed to accept negative strikes, but I was wondering whether anyone is aware of an existing implementation on Matlab for instance.

I have fitted the standard SABR parameters on positive strikes and modified some existing SABR code, adding the shift to the strike and the forward rate in the volatility equation. Now, I am feeding this modified volatility equation with my negative strikes and forward + the fitted SABR parameters, but nothing seems to have changed: it is still impossible to compute vols for negative strikes.

Do I have to feed the original strikes or the shifted strikes to the shifted model? Do I have to shift the forward as well? Is anyone aware of a better procedure for using shifted SABR with negative rates?

Thanks a lot!

### StackOverflow

#### Easy print / auto push button / programming

Please can you help me with printing in Windows via the right mouse button? I tried it, but the program (labelStar) shows me a print dialogue. I tried to edit the .exe in Resource Hacker and PE Explorer, but I don't know how to edit the file. I would like to make this dialogue confirm automatically, or edit registry parameters for automatic confirmation. Here is the exe file: https://ufile.io/fcce6

thank you

### StackOverflow

#### (Query, Document, Relevance) free dataset for building an information retrieval system

I'm interested in finding a dataset like the "English Relevance Judgements File List": http://trec.nist.gov/data/qrels_eng

This dataset contains labelled pairs of queries and documents. However, it depends on a non-free corpus, called "Data - English Documents": http://trec.nist.gov/data/docs_eng.html

Do you know of any free dataset(s) similar to this one?

Side-note: The dataset will be used in a research project for building an information retrieval system based on neural networks.

### Planet Emacsen

#### Irreal: Scimax

The ACM Technews newsletter has a short piece on John Kitchin's Scimax project. Here's the CMU article on Scimax, which gives an overview of the project.

Basically, Scimax is the collection of (mostly) Elisp utilities that Kitchin has put together to help with his group's writing and publishing of papers. It features using Org mode to write the papers in a reproducible research way and then publish them to the format required by the journal they are submitting the paper to. There are also some tools to aid in teaching. For more details, check out Kitchin's Scimax page.

The nice thing about Scimax is that all the utilities are packaged up into a single project repository that anyone interested can download and use. The project is hosted on Github if you're interested.

### StackOverflow

#### Why are Scala vals not lazy by default?

I have noticed that I almost exclusively use lazy val assignments, as they often avoid unnecessary computation, and I can't see many situations where one would not want to do so (dependency on mutable variables being a notable exception, of course).

It would seem to me that this is one of the great advantages of functional programming and one should encourage its use whenever possible and, if I understood correctly, Haskell does exactly this by default.

So why are Scala values not lazy by default? Is it solely to avoid issues relating to mutable variables?

#### Error in graphlab.SFrame('home_data.gl/')

I am doing the Machine Learning course from Coursera by the University of Washington, in which I am using IPython's GraphLab. During practice, when I execute the command below:

sales = graphlab.SFrame('home_data.gl/')


I am getting the error:

InvalidProductKey                         Traceback (most recent call last)
<ipython-input-3-c5971b60b216> in <module>()
----> 1 sales=graphlab.SFrame('home_data.gl/')

/opt/conda/lib/python2.7/site-packages/graphlab/data_structures/sframe.pyc in __init__(self, data, format, _proxy)
865             self.__proxy__ = _proxy
866         else:
--> 867             self.__proxy__ = UnitySFrameProxy(glconnect.get_client())
868             _format = None
869             if (format == 'auto'):

/opt/conda/lib/python2.7/site-packages/graphlab/connect/main.pyc in get_client()
138     """
139     if not is_connected():
--> 140         launch()
141     assert is_connected(), ENGINE_START_ERROR_MESSAGE
142     return __CLIENT__

/opt/conda/lib/python2.7/site-packages/graphlab/connect/main.pyc in launch(server_addr, server_bin, server_log, auth_token, server_public_key)
90         if server:
91             server.try_stop()
---> 92         raise e
93     server.set_log_progress(True)
94     # start the client



(Note: the IPython notebook and home_data.gl are in the same folder.)

#### Python implementation of voted perceptron

Does anyone know how to implement the voted perceptron algorithm (described in this article: http://cseweb.ucsd.edu/~yfreund/papers/LargeMarginsUsingPerceptron.pdf) in Python? And what would a dual representation of the voted perceptron look like?

Thanks
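A minimal sketch of the primal voted perceptron along the lines of Freund and Schapire's formulation (my own illustration, not code from the paper): store every intermediate weight vector together with the number of examples it survived, and predict by a weighted vote of their individual predictions.

```python
import numpy as np

def train_voted_perceptron(X, y, epochs=10):
    """X: (n, d) array, y: labels in {-1, +1}.
    Returns a list of (weight_vector, survival_count) pairs."""
    n, d = X.shape
    w = np.zeros(d)
    c = 1  # survival count of the current weight vector
    perceptrons = []
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (w @ xi) <= 0:  # mistake: retire old w, then update
                perceptrons.append((w.copy(), c))
                w = w + yi * xi
                c = 1
            else:
                c += 1
    perceptrons.append((w, c))
    return perceptrons

def predict_voted(perceptrons, x):
    """Weighted majority vote over all stored perceptrons."""
    s = sum(c * np.sign(w @ x) for w, c in perceptrons)
    return 1 if s >= 0 else -1

# Toy usage on linearly separable points
X = np.array([[2.0, 1.0], [1.0, 3.0], [-1.0, -2.0], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])
model = train_voted_perceptron(X, y, epochs=20)
print([predict_voted(model, x) for x in X])  # [1, 1, -1, -1]
```

The dual representation follows the same pattern, but each weight vector is kept implicitly as a sum of mistake examples, so the dot products can be replaced by kernel evaluations.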

### Fefe

#### Old and busted: Anrufbeantworter. New hotness: Abmahnbeantworter!

Old and busted: the Anrufbeantworter (answering machine).

New hotness: the Abmahnbeantworter (an automatic responder for cease-and-desist letters)!

### Planet Emacsen

#### Pragmatic Emacs: Search or swipe for the current word

It is often handy to search for the word at the current cursor position. By default, you can do this by starting a normal isearch with C-s and then hitting C-w to search for the current word. Keep hitting C-w to add subsequent words to the search.

If, like me, you use swiper for your searches, you can obtain the same effect using M-j after you start swiper.

This is all very nice, but both of those solutions above search for the string from the cursor position to the end of the word, so if “|” marks the cursor position in the word prag|matic, then either method above would search for matic. I made a small tweak to the relevant function in the ivy library that powers swiper so that the whole of the word is used, so in the example above M-j would search for the full pragmatic string.

Here is the code:

;; version of ivy-yank-word to yank from start of word
(defun bjm/ivy-yank-whole-word ()
  "Pull next word from buffer into search string."
  (interactive)
  (let (amend)
    (with-ivy-window
      ;; move to last word boundary
      (re-search-backward "\\b")
      (let ((pt (point))
            (le (line-end-position)))
        (forward-word 1)
        (if (> (point) le)
            (goto-char pt)
          (setq amend (buffer-substring-no-properties pt (point))))))
    (when amend
      (insert (replace-regexp-in-string "  +" " " amend)))))

;; bind it to M-j
(define-key ivy-minibuffer-map (kbd "M-j") 'bjm/ivy-yank-whole-word)


### StackOverflow

#### What are alternatives of Gradient Descent?

Gradient descent suffers from the problem of local minima: we may need to run it exponentially many times to find the global minimum.

Can anybody tell me about alternatives to gradient descent, with their pros and cons?

Thanks.
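To make the local-minima issue concrete, here is a small illustrative sketch (the test function, step size, and restart scheme are my own choices): plain gradient descent converges to whichever basin it starts in, and random restarts are the simplest mitigation.

```python
import numpy as np

# Tilted double well: local minimum near x ~ +0.96, global minimum near x ~ -1.06.
def f(x):
    return (x**2 - 1)**2 + 0.3 * x

def grad(x):
    return 4 * x * (x**2 - 1) + 0.3

def gradient_descent(x0, lr=0.01, steps=500):
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x   # converges to the local minimum of whichever basin x0 is in

# random restarts: run gradient descent from many starts, keep the best endpoint
rng = np.random.default_rng(0)
starts = np.concatenate([rng.uniform(-2.0, 2.0, size=20), [-1.5, 1.5]])
best_x = min((gradient_descent(x0) for x0 in starts), key=f)
```

As for named alternatives: momentum, RMSProp, and Adam are still gradient-based (same local-minima caveat, often faster); simulated annealing and genetic algorithms can escape local minima but are slow and give weak guarantees; Newton-type and coordinate-descent methods shine on convex problems, where local minima are not an issue in the first place.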

### Fefe

#### Are you all using ad blockers? You should be. The latest ...

Are you all using ad blockers? You should be. The latest trend from the USA is websites spamming their visitors by e-mail. No, not websites where you set up an account and told them your e-mail address. Ad networks can identify users across sites; that is precisely their function. And once you have entered your e-mail address anywhere within the network, any other participant in that network can hose you down with spam.

#### Well, some submissions in response to my call yesterday have ...

Well, some submissions in response to my call yesterday have come in. Here is the first batch.
To start with: I have nothing to say on behalf of the foundation, but I do have a tinfoil-hat theory for you:

Maybe the scandal around the AA foundation is a backfire against Bertelsmann now censoring public opinion on Facebook. That is actually a much bigger blunder, but amid the AA debate it seems to be going almost unnoticed.

Good point! The press reported that Facebook has outsourced its censorship to Arvato, but not everyone will know that Arvato belongs to Bertelsmann.

Another reader pointed to this blog post at Mobilegeeks, which I cannot take seriously, however, because it opens straight away with this:

imputing an intent to restrict freedom of expression to precisely those who are trying to make it possible again for everyone to state their own opinion without fear
Sorry, that goes too far for me. Not even the AA foundation itself claims to stand up for the fearless expression of opinion. The author seems to have taken a wrong turn somewhere along the way.

A third reader points to this EU paper for more tolerance, which, read from the appropriate angle, sounds like the introduction of an indoctrination machine (especially sections 8 and 9).

Gentle greetings, Sir, what are your extremes?

Someone else pointed to the AA foundation's self-defense, which, to be honest, strikes me as rather embarrassing. But maybe you see it differently.

The only advocate I can take seriously unfortunately put a pile of personal information into his contribution, which I will cautiously try to filter out as long as I do not know for certain that I am allowed to publish it.

For years I campaigned against the brown house in our village [name redacted]. We founded a citizens' initiative for that purpose and personally stood up to the Nazis. Yes, that is the house in which Ralf Wohlleben, Rene Kapke and co. lived and operated. Yes, I got to know these people personally and argued with them both publicly and privately. Yes, that is the district in which the three NSU people grew up. Yes, that is the district with the garage in which the pipe bombs were found, whose discovery marked the start of the NSU going underground. Yes, that is the house in which, according to the indictment, the weapons delivery and the support for the NSU took place. The house has since been demolished. Our task is done.

But back to the point: our initiative received support from the AA foundation. We had financial support in the form of prize money (not from the foundation). When we wanted to run a poster campaign calling on schools to engage with the topic of right-wing extremism, the AA foundation supported us vigorously. They made sure that the best posters were placed on the big billboards along the arterial roads, without touching our money and without dumping the paperwork on us.

In short, I experienced the AA foundation only as a low-key organization supporting local initiatives. Admittedly, all of that was a few years ago. I cannot say anything about the "pillory" campaigns. They never reached the networks I know and were never a topic there.

So I would ask you to distinguish between the foundation's various activities and not to extrapolate from one disaster to all the other areas.

Update: Regarding the Mobilegeeks piece, a buddy of mine thinks I dismissed it too hastily. His reading is that the foundation does not want to censor anyone, but only to fight the excesses that make it impossible for people to voice their opinions without fear. So read it for yourselves.

Update: A reader letter:

not sure whether this is the kind of thing you are looking for, because it does not explain/defend whatever else the AA does, but they keep a chronicle of anti-refugee incidents, which in my eyes is a very important service (at least I have found nowhere else such a complete and well-sourced list of this kind)
But presumably that is not what you are looking for...
On the contrary, that is exactly what I am looking for.

Update: And another great reader letter:

the Amadeu-Antonio-Stiftung is really stepping in it right now, but we still need it for these other reasons.
It is definitely the lesser evil compared to the Nazis!
(Not much of a showstopper so far, but stay with me)
It may be hard to imagine from Tegernsee or Munich, from Charlottenburg or Prenzlauer Berg, but here, for instance, in eastern Mecklenburg-Vorpommern / northeastern Brandenburg, as in large parts of the (eastern) provinces, there is a dull brown subculture that has been established for decades, and by now the third generation is growing up in this milieu. Until the turn of the millennium there was, apart from a few pastors, simply nobody here offering anything to young people who had no appetite for Landser music and punk-bashing. The official state doctrine was one of "accepting youth work", i.e. the young skinheads went to the municipally funded youth center to do their Hitler routine.

For people aged 12 to 18 who have no use for the church and do not want to swim with the racist current, the only real option is to leave.
The AAS's achievement is to provide encouragement, strengthening, and regional networking for such people under conditions like these. With a multitude of small socio-cultural projects they stay active across the region and enable activities such as multicultural festivals, intercultural events, and courses at vocational school centers, where the kids can, perhaps for the first time in their lives, really talk with someone from Cameroon or Lebanon, or simply make music and cook together, and the like.

The AAS supports mobile counseling teams that, in the backwaters where the skinheads have once again trashed a non-right-wing club or are rioting outside the refugee shelter, back up ordinary people and encourage them to stand up to the mob.
Not least, the local support initiatives for refugees thrown into this hostile climate can get know-how, technical help, and every kind of support from the AAS.
It is also the AAS that keeps showing up at the ministries and subordinate agencies, demanding action and funding against the mainstream of resentment and ignorance.

I myself am a tradesman and contribute my small share to a more open, more tolerant society, privately and at work, as best I can.

Originally at home in Berlin, I have been observing and living everyday life out here for almost 15 years.

So I am not professionally active in the social sector and view the developments more as an engaged citizen (in the sense of citoyen).
My impression is that quite a few people would no longer be here without the steady, small-scale local work the AAS does.
Especially since 2014/15, with the large influx of migrants into the flat countryside, you can see how necessary the presence of a small but active faction of open-minded, non-xenophobic people is.
It feels good not to be left alone with the blockheads, and I owe that feeling in good part to initiatives and campaigns from the AAS portfolio, because in the political arena in particular there is hardly anyone else who would have stayed the course here for so long.

I can say little about this whole no-hate-speech affair, since I use neither Twitter nor Facebook.
Possibly the foundation simply picked up the wrong people for its online activities.
Apart from Don Alphonso's private crusade against certain Berlin ex-Pirates, I do have to agree with your reader that the broad Bertelsmann censorship of the so-called social media seems to me the bigger scandal compared to the few thousand euros the family ministry sank into a bad New-Right wiki.

Looking at who is now practicing outrage, and with what arguments ("Stasi! Stasi! STAAASI!!!"), one can only marvel at which and how many AAS experts and watchdogs there are in this country.

Wonderful, many thanks to the letter writers. This is what I had hoped for. Audiatur et altera pars.

#### Deal of the day: Now Chancellor Merkel has, in a ...

Deal of the day:
Now Chancellor Merkel has, in a newspaper interview, demanded loyalty from all citizens of Turkish origin. In return she will "try to keep an open ear".
The key word is "try", got it? That is how Merkel sees her job!

#### Where is Tor headed, now that all the ...

Where is Tor headed, now that all the volunteers are distancing themselves?

The collection-plate model, with a helping of FUD along the lines of "if volunteers do it, it will come to nothing anyway and carries risks". So better run it as a commercial service, with industry partners from the Five Eyes!

### CompsciOverflow

#### Why do GF(2) and CRC not give the same result?

I am trying to understand (and implement functions for) binary polynomial division.

My first step was to understand and compare the results of two online tools. The first is a formal GF(2) polynomial calculator. The second is a CRC polynomial calculator.

I expected the remainder of the formal calculator being equal to the checksum of CRC calculator.

So I entered the following data to the formal calculator:

A = 0100000101000001 (should be same as "AA" ASCII data)
B = 11111


And I entered the following to the CRC calculator:

CRC order = 4
CRC polynom = F
Data sequence = AA
Initial = 0
Direct, no reverse input, no reverse output


I used width 4 and polynomial F (instead of 5 and 1F) since (as far as I understand) the CRC calculator expects polynomials in standard notation (that omits the leading 1-bit).

The CRC calculator says the checksum is 2, while the formal calculator says the binary remainder is 100 = 4.

Why don't I get the same results?
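I cannot vouch for those particular online tools, but one plausible source of the discrepancy is that a CRC divides the message with `width` zero bits appended (i.e. the message polynomial multiplied by x^width), while a plain GF(2) division divides the message as-is. A small carry-less division sketch reproduces both numbers:

```python
def gf2_mod(dividend, divisor):
    """Remainder of GF(2) (carry-less) polynomial division on bit-packed ints."""
    dlen = divisor.bit_length()
    while dividend.bit_length() >= dlen:
        # XOR the divisor in, aligned to the dividend's current top bit
        dividend ^= divisor << (dividend.bit_length() - dlen)
    return dividend

msg = 0b0100000101000001          # "AA" in ASCII
poly = 0b11111                    # x^4 + x^3 + x^2 + x + 1

print(gf2_mod(msg, poly))         # 4 -> the formal calculator's remainder (100)
print(gf2_mod(msg << 4, poly))    # 2 -> the CRC calculator's checksum
```

So both tools divide correctly; they just divide different dividends. (Initial value, input/output reflection, and final XOR are further knobs that can make two CRC tools disagree, but here they are all off/zero.)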

#### Is this a correct argument for the O(n log n) bound on sorting algorithms? [on hold]

Let $A$ be the array to be sorted, and assume $A$ has size $n$. Now there are $n!$ ways to permute the elements of $A$, but any sorting algorithm must determine the "correct" permutation. This means that the algorithm must take at least $\lg (n!)$ steps. Now $\lg (n!) = \sum_{i=1}^n \lg i \sim \int_1^n \lg x \, dx \sim n \lg n$. QED
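For reference, the integral step can be replaced by an elementary two-sided bound (note the argument also implicitly assumes the comparison model, where each comparison yields at most one bit of information):

```latex
\frac{n}{2}\,\lg\frac{n}{2}
\;\le\; \sum_{i=\lceil n/2\rceil}^{n}\lg i
\;\le\; \lg(n!) \;=\; \sum_{i=1}^{n}\lg i
\;\le\; n\lg n,
\qquad\text{so}\qquad \lg(n!) = \Theta(n\log n).
```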

### StackOverflow

#### Application of "Cocktail party effect" in phones

I recently learned about "Cocktail party problem":

The cocktail party effect is the ability to focus on a specific human voice while filtering out other voices or background noise. [1]

Is this problem "solved" in phones? It doesn't look like it is, because when I talk with someone who is at, e.g., a party or some other noisy place, I still cannot hear his/her voice very clearly. If the solution isn't applied in phones, my question is: why not? Is it that the microphones would have to be good enough, and it would bring only a very small improvement?

### CompsciOverflow

#### What is this approximation/error-reduction method called?

I'm wondering if anyone could help me find my footing in an approach I am taking with a student in my audio programming class for creating more accurate pitch detection algorithms. The approach is not limited to pitch detection and in fact seems similar to Newton's method, Euler's method, Horner's method, and so on. It is a very simple and general idea, and must have some background in numerical methods. I am looking for pointers to the literature.

Here is the idea. We have a function f which takes a signal and returns the fundamental frequency (such algorithms are close cousins of the Discrete Fourier Transform). To test its accuracy, I created simple sine wave signals of precise frequencies, ran the algorithm, and graphed the errors over a particular range; a perfect f would basically be the identity function, so we just had to record the deviation from the identity. The errors are basically sinusoidal. So I stored the errors in an array, used cubic interpolation to create a continuous error function, and built that into the last stage of the algorithm. Of course, there is a problem: the recorded errors showed the deviation from a perfect f, but the original f is not perfect, so there would be errors in the errors, so to speak. So I iterated the process, correcting successively for errors in the errors, and the algorithm gets better each time. I have not yet figured out whether it will converge to some minimal error. I also have not tested it in musical settings. But it is very promising, and seems like a generally useful technique.

Separate from a programming trick, I would like to understand some of its properties such as convergence and so on. Anyone have any pointer, keywords, etc. for me to pursue this? I'm guessing it is a standard technique in numerical methods.
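The iteration can be sketched abstractly as follows (a toy sinusoidal-bias estimator and linear rather than cubic interpolation, purely to illustrate the fixed-point behavior; the real detector's convergence will depend on its own error function):

```python
import numpy as np

# Toy "detector" with a systematic sinusoidal bias, standing in for f.
def f(x):
    return x + 0.05 * np.sin(x)

xs = np.linspace(1.0, 10.0, 200)   # calibration frequencies
err = np.zeros_like(xs)            # tabulated error estimate, indexed by xs

for _ in range(5):
    # corrected detector: subtract the interpolated error at the *measured* value
    corrected = f(xs) - np.interp(f(xs), xs, err)
    # remaining deviation updates the table: the "errors in the errors"
    err += corrected - xs

residual = np.max(np.abs(f(xs) - np.interp(f(xs), xs, err) - xs))
```

Each pass the residual shrinks by roughly the product of the error's magnitude and slope, which is the usual contraction condition for this kind of scheme; useful search keywords are fixed-point iteration, iterative refinement, and (instrument) calibration via inverse problems.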

### QuantOverflow

#### Does the Knock-out option price go to $0$ when the stock price goes to the barrier $B$?

I am reading Steven Shreve's book "Stochastic Calculus for Finance II: Continuous-Time Models", page 304. My intuition is that as the stock price gets closer to the barrier, it becomes more and more likely that the price will exceed the barrier in the near future, hence the option has a large probability of becoming worthless. As a consequence, the price of the option should be closer and closer to zero. But I cannot justify this intuition from the formula on page 304. Can someone explain this? Thanks a lot.

The formula is $$V(0)=S(0)I_1-KI_2-S(0)I_3+KI_4$$ where $$\quad I_1=\frac{1}{\sqrt{2\pi T}}\displaystyle\int_{k}^be^{\sigma w-rT+\alpha w-\frac{1}{2}\alpha^2T-\frac{1}{2T}w^2}dw$$

$$I_2=\frac{1}{\sqrt{2\pi T}}\displaystyle\int_{k}^be^{-rT+\alpha w-\frac{1}{2}\alpha^2T-\frac{1}{2T}w^2}dw$$ and $$\quad I_3=\frac{1}{\sqrt{2\pi T}}\displaystyle\int_{k}^be^{\sigma w-rT+\alpha w-\frac{1}{2}\alpha^2T-\frac{2}{T}b^2+\frac{2}{T}bw-\frac{1}{2T}w^2}dw$$

$$I_4=\frac{1}{\sqrt{2\pi T}}\displaystyle\int_{k}^be^{-rT+\alpha w-\frac{1}{2}\alpha^2T-\frac{2}{T}b^2+\frac{2}{T}bw-\frac{1}{2T}w^2}dw$$
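One way to see the limit directly from these formulas (with $b=\frac{1}{\sigma}\ln(B/S(0))$): when $S(0)$ reaches the barrier, $b=0$, so the reflection factor $e^{-\frac{2}{T}b^2+\frac{2}{T}bw}$ in $I_3$ and $I_4$ is identically 1, giving $I_1=I_3$ and $I_2=I_4$, hence $V(0)=S(0)I_1-KI_2-S(0)I_3+KI_4=0$. A numerical sketch with made-up parameter values:

```python
import numpy as np

# Assumed illustrative parameters; b = 0 encodes S(0) sitting at the barrier.
sigma, r, T, alpha, b, k = 0.3, 0.05, 1.0, 0.2, 0.0, -0.5

w = np.linspace(k, b, 20001)
dw = w[1] - w[0]
base = np.exp(-r*T + alpha*w - 0.5*alpha**2*T - w**2/(2*T)) / np.sqrt(2*np.pi*T)
refl = np.exp(-2*b**2/T + 2*b*w/T)   # reflection factor; identically 1 when b = 0

# simple Riemann sums for the four integrals
I1 = np.sum(np.exp(sigma*w) * base) * dw
I2 = np.sum(base) * dw
I3 = np.sum(np.exp(sigma*w) * base * refl) * dw
I4 = np.sum(base * refl) * dw
```

Away from the barrier ($b>0$) the reflection factor is strictly less than 1 on the interior of the integration range, so $I_1>I_3$ and $I_2>I_4$ and the value is positive, shrinking continuously to 0 as $b\to 0$.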

### CompsciOverflow

#### How to calculate sum of binomial coefficients efficiently?

I want to compute the sum

$$\binom{n}{0}+\binom{n}{2}+\binom{n}{4}+\binom{n}{6}+\dots+\binom{n}{k} \bmod 10^9+7$$

where $n$ and $k$ can be up to $10^{14}$ and $k\le n$.

I found several links on Stack Overflow about calculating sums of binomial coefficients, but none of them works for constraints as large as $10^{14}$. I tried changing the series using the relation $\binom{n}{k}=\binom{n-1}{k-1}+\binom{n-1}{k}$ and came up with a brute-force solution, which is of no use. Is there any way to do it efficiently?

This question is from the TCS codevita 2016 round 2 contest, which has ended.

### QuantOverflow

#### Is a bondfuture an IRD or a Credit Derivative?

I need to categorize a bond-future trade into one of the five major asset classes and I am not sure whether I should put it in the interest rate asset class or the credit asset class.

A quick (and dirty) thought is to split the bond trade into an IR swap and a CDS.

For example, buying a fixed-rate bond could be 'linked' with going short an IR swap and short a CDS on the issuer.

Any other ideas?

Thanks

#### How to calculate a forward-starting swap with forward equations?

I have been trying to resolve this problem for some time but I cannot get the correct answer. The problem is the following one.

Compute the initial value of a forward-starting swap that begins at $t=1$, with maturity $T=10$ and a fixed rate of 4.5%. (The first payment then takes place at $t=2$ and the final payment takes place at $t=11$, as we are assuming, as usual, that payments take place in arrears.) You should assume a swap notional of 1 million and that you receive floating and pay fixed.

We also know that

• $r_{0,0}=5\%$
• $u=1.1$
• $d=0.9$
• $q=1−q=1/2$

Using forward equations from $t=1$ to $t=9$, I cannot solve the problem:

Here is what I have done in Excel, with a final result of -31076, but it is not the correct answer:
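For comparison, here is a sketch of the forward-equation approach in Python (my own reconstruction of the standard lattice setup: short rate $r_{t,j}=r_{0,0}u^jd^{t-j}$, risk-neutral probability 1/2, and each net payment $(r-K)/(1+r)$ valued at its setting node via elementary prices; I have not checked the resulting number against the official answer, but note the sign: receiving floating at rates centered near 5% against paying fixed 4.5% should come out positive, so a result of -31076 suggests at least a sign error):

```python
r00, u, d, q, K, notional = 0.05, 1.1, 0.9, 0.5, 0.045, 1_000_000
T = 10  # last rate-setting date; payments in arrears at t = 2..11

# short-rate lattice: r[t][j] with j up-moves after t periods
r = [[r00 * u**j * d**(t - j) for j in range(t + 1)] for t in range(T + 1)]

# forward equations: elementary (Arrow-Debreu) prices P[t][j]
P = [[0.0] * (t + 1) for t in range(T + 1)]
P[0][0] = 1.0
for t in range(T):
    for j in range(t + 1):
        disc = P[t][j] / (1.0 + r[t][j])
        P[t + 1][j + 1] += q * disc          # up move
        P[t + 1][j] += (1.0 - q) * disc      # down move

# each net payment is set at node (t, j) and paid one period later,
# so its value at that node is (r - K) / (1 + r); sum over t = 1..10
V = notional * sum(P[t][j] * (r[t][j] - K) / (1.0 + r[t][j])
                   for t in range(1, T + 1) for j in range(t + 1))
print(round(V))
```

As a sanity check, the elementary prices at each date should sum to the corresponding zero-coupon bond price (e.g. $\sum_j P^e_{1,j} = 1/1.05$).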

#### Why is the Black 76 model not considered an interest rate model?

The Black 76 model is one of the standard models for pricing interest rate derivatives like caps, floors, swaptions, etc.

The Black 76 model is given as $$dF_t = \sigma F_t dW_t$$ so it models the dynamics of the forward rate $F_t$, which implies a certain term structure. Why is the Black 76 model not considered an interest rate model (like Vasicek) in the literature, even though it is used for pricing interest rate derivatives?
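For concreteness, here is a minimal Black-76 caplet pricer (standard formula; the discount factor `df`, accrual `tau`, and the example inputs are placeholders of my choosing). The point usually made in the literature is that Black 76 models each forward $F_t$ in isolation as a driftless lognormal under its own forward measure, so it prices individual caplets or swaptions without specifying consistent joint dynamics for the whole curve, which is exactly what a short-rate model like Vasicek provides.

```python
from math import erf, log, sqrt

def norm_cdf(x):
    # standard normal CDF via the error function
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def black76_caplet(F, K, sigma, T, df, tau=1.0, notional=1.0):
    """Black-76 caplet on forward rate F with strike K, vol sigma, expiry T,
    discount factor df to the payment date, accrual fraction tau."""
    d1 = (log(F / K) + 0.5 * sigma**2 * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return notional * df * tau * (F * norm_cdf(d1) - K * norm_cdf(d2))
```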

#### HFT to blame for Flash Crashes?

Some people [1, 2, 3] claim that high-frequency trading is partly to blame for the extreme volatility in the markets yesterday (24 August 2015).

Is that true?

Is the amount HFTs move even enough to push the markets down like that? Does this behaviour align with the way they operate?

How can you explain the low Dow Jones market open? Isn't it more likely that private investors just sell at the open? Why would HFTs even trade right at the open when there is no arbitrage to be made?

### QuantOverflow

#### Using quantlib to price swaps with different payment and calculation resets for floating leg

I understand the VanillaSwap object assumes that payment and calculation resets are the same, so is there any way to use QuantLib to price a swap with different reset and calculation frequencies (say, payment is semiannual but the reset is annual)?

A few candidates I've considered are:

1. NonstandardSwap: however, I think this does not allow different payment and reset schedules either.

2. Swap: it takes 2 legs, but Leg itself is virtual. There are several ways of implementing this; one is to use IborCoupon, but that seems to require creating every single coupon individually in order to construct the Leg.

Is there any simpler way to deal with this, given that everything else is the same as in a VanillaSwap except for the different payment and calculation dates?

### StackOverflow

#### scikit learn decision tree model evaluation

Here are the related code and documentation. I am wondering: for the default cross_val_score, without explicitly specifying a scorer, does the output array contain precision, AUC, or some other metric?

Using Python 2.7 with miniconda interpreter.

http://scikit-learn.org/stable/modules/generated/sklearn.tree.DecisionTreeClassifier.html

>>> from sklearn.datasets import load_iris
>>> from sklearn.cross_validation import cross_val_score
>>> from sklearn.tree import DecisionTreeClassifier
>>> clf = DecisionTreeClassifier(random_state=0)
>>> iris = load_iris()
>>> cross_val_score(clf, iris.data, iris.target, cv=10)
...
...
array([ 1.     ,  0.93...,  0.86...,  0.93...,  0.93...,
        0.93...,  0.93...,  1.     ,  0.93...,  1.      ])


regards, Lin

#### My loss starts increasing as I decay learning rate, Tensorflow?

I'm using exponential decay to decay my learning rate after every 10 epochs. As you can see in the output below, as my learning rate starts to decrease, my loss starts to increase. I've tried several variations and the same thing happens every time. What could be going wrong?

global_step = tf.Variable(0, name="global_step", trainable=False)
decayed_learning_rate = tf.train.exponential_decay(learning_rate=0.0001,
                                                   global_step=global_step,
                                                   decay_steps=1000,
                                                   decay_rate=0.6,
                                                   staircase=True)
optimizer = tf.train.MomentumOptimizer(learning_rate=decayed_learning_rate, momentum=0.9)
minimize_loss = optimizer.minimize(loss, global_step=global_step)


Here is the output:

 1
Epoch Finished
Loss after one Epoch(Training) = 8.291080, Training Accuracy= 0.18000
Loss after one Epoch(Validation) = 8.464677, Validation Accuracy= 0.10000
Loss after one Epoch(Test) = 8.631430, Test Accuracy= 0.13000
2
Epoch Finished
Loss after one Epoch(Training) = 4.619487, Training Accuracy= 0.12000
Loss after one Epoch(Validation) = 4.835144, Validation Accuracy= 0.14000
Loss after one Epoch(Test) = 5.233496, Test Accuracy= 0.09000
3
Epoch Finished
Loss after one Epoch(Training) = 4.591153, Training Accuracy= 0.10000
Loss after one Epoch(Validation) = 4.878084, Validation Accuracy= 0.09000
Loss after one Epoch(Test) = 4.112285, Test Accuracy= 0.11000
4
Epoch Finished
Loss after one Epoch(Training) = 4.530641, Training Accuracy= 0.13000
Loss after one Epoch(Validation) = 4.874103, Validation Accuracy= 0.10000
Loss after one Epoch(Test) = 4.225502, Test Accuracy= 0.14000
5
Epoch Finished
Loss after one Epoch(Training) = 3.664831, Training Accuracy= 0.26000
Loss after one Epoch(Validation) = 3.207108, Validation Accuracy= 0.29000
Loss after one Epoch(Test) = 4.435939, Test Accuracy= 0.17000
6
Epoch Finished
Loss after one Epoch(Training) = 3.682740, Training Accuracy= 0.26000
Loss after one Epoch(Validation) = 3.794605, Validation Accuracy= 0.21000
Loss after one Epoch(Test) = 3.890673, Test Accuracy= 0.17000
7
Epoch Finished
Loss after one Epoch(Training) = 3.638363, Training Accuracy= 0.27000
Loss after one Epoch(Validation) = 4.057161, Validation Accuracy= 0.21000
Loss after one Epoch(Test) = 4.400304, Test Accuracy= 0.19000
8
Epoch Finished
Loss after one Epoch(Training) = 3.290856, Training Accuracy= 0.13000
Loss after one Epoch(Validation) = 3.573865, Validation Accuracy= 0.02000
Loss after one Epoch(Test) = 3.289892, Test Accuracy= 0.13000
9
Epoch Finished
Loss after one Epoch(Training) = 3.249848, Training Accuracy= 0.13000
Loss after one Epoch(Validation) = 3.816904, Validation Accuracy= 0.09000
Loss after one Epoch(Test) = 3.365518, Test Accuracy= 0.09000
10
Epoch Finished
Loss after one Epoch(Training) = 3.261417, Training Accuracy= 0.13000
Loss after one Epoch(Validation) = 3.051553, Validation Accuracy= 0.13000
Loss after one Epoch(Test) = 3.935049, Test Accuracy= 0.10000
11
Epoch Finished
Loss after one Epoch(Training) = 3.274293, Training Accuracy= 0.12000
Loss after one Epoch(Validation) = 3.341079, Validation Accuracy= 0.12000
Loss after one Epoch(Test) = 3.465601, Test Accuracy= 0.09000
12
Epoch Finished
Loss after one Epoch(Training) = 3.245074, Training Accuracy= 0.12000
Loss after one Epoch(Validation) = 3.655849, Validation Accuracy= 0.09000
Loss after one Epoch(Test) = 3.890745, Test Accuracy= 0.11000
13
Epoch Finished
Loss after one Epoch(Training) = 3.242341, Training Accuracy= 0.12000
Loss after one Epoch(Validation) = 3.527991, Validation Accuracy= 0.04000
Loss after one Epoch(Test) = 3.207819, Test Accuracy= 0.12000
14
Epoch Finished
Loss after one Epoch(Training) = 3.277830, Training Accuracy= 0.12000
Loss after one Epoch(Validation) = 3.797029, Validation Accuracy= 0.10000
Loss after one Epoch(Test) = 3.317770, Test Accuracy= 0.11000
15
Epoch Finished
Loss after one Epoch(Training) = 3.269509, Training Accuracy= 0.12000
Loss after one Epoch(Validation) = 3.074466, Validation Accuracy= 0.12000
Loss after one Epoch(Test) = 3.887167, Test Accuracy= 0.10000
16
Epoch Finished
Loss after one Epoch(Training) = 4.100363, Training Accuracy= 0.14000
Loss after one Epoch(Validation) = 4.208894, Validation Accuracy= 0.10000
Loss after one Epoch(Test) = 4.150678, Test Accuracy= 0.15000
17
Epoch Finished
Loss after one Epoch(Training) = 4.037428, Training Accuracy= 0.14000
Loss after one Epoch(Validation) = 4.366947, Validation Accuracy= 0.10000
Loss after one Epoch(Test) = 4.501517, Test Accuracy= 0.09000
18
Epoch Finished
Loss after one Epoch(Training) = 4.048151, Training Accuracy= 0.14000
Loss after one Epoch(Validation) = 4.315053, Validation Accuracy= 0.10000
Loss after one Epoch(Test) = 3.972508, Test Accuracy= 0.10000
19
Epoch Finished
Loss after one Epoch(Training) = 4.046428, Training Accuracy= 0.14000
Loss after one Epoch(Validation) = 4.649216, Validation Accuracy= 0.08000
Loss after one Epoch(Test) = 4.125694, Test Accuracy= 0.11000
20
Epoch Finished
Loss after one Epoch(Training) = 4.082591, Training Accuracy= 0.13000
Loss after one Epoch(Validation) = 3.639134, Validation Accuracy= 0.16000
Loss after one Epoch(Test) = 4.476624, Test Accuracy= 0.16000
21
Epoch Finished
Loss after one Epoch(Training) = 4.068653, Training Accuracy= 0.13000
Loss after one Epoch(Validation) = 4.141028, Validation Accuracy= 0.10000
Loss after one Epoch(Test) = 4.086758, Test Accuracy= 0.15000
22
Epoch Finished
Loss after one Epoch(Training) = 4.066084, Training Accuracy= 0.13000
Loss after one Epoch(Validation) = 4.252730, Validation Accuracy= 0.10000
Loss after one Epoch(Test) = 4.357038, Test Accuracy= 0.10000
23
Epoch Finished
Loss after one Epoch(Training) = 4.031103, Training Accuracy= 0.14000
Loss after one Epoch(Validation) = 4.360917, Validation Accuracy= 0.10000
Loss after one Epoch(Test) = 3.916987, Test Accuracy= 0.11000
24
Epoch Finished
Loss after one Epoch(Training) = 4.031075, Training Accuracy= 0.14000
Loss after one Epoch(Validation) = 4.653004, Validation Accuracy= 0.07000
Loss after one Epoch(Test) = 4.183711, Test Accuracy= 0.10000
25
Epoch Finished
Loss after one Epoch(Training) = 4.039016, Training Accuracy= 0.14000
Loss after one Epoch(Validation) = 3.654388, Validation Accuracy= 0.15000
Loss after one Epoch(Test) = 4.228384, Test Accuracy= 0.18000


### StackOverflow

#### numpy.ndarray syntax understanding for confirmation

I am referring to the code example here (http://scikit-learn.org/stable/auto_examples/linear_model/plot_iris_logistic.html), and am specifically confused by the line iris.data[:, :2]. Since iris.data is 150 (rows) * 4 (columns) dimensional, I think it means: select all rows, and the first two columns. I am asking here to confirm whether my understanding is correct, since I have spent some time on it but cannot find this syntax defined in the official documentation.

Another question: I am using the following code to get the number of rows and the number of columns, and am not sure if there are better, more elegant ways. My code is more native Python style, and I am not sure whether numpy has a better idiom for getting these values.

print len(iris.data) # for number of rows
print len(iris.data[0]) # for number of columns


Using Python 2.7 with miniconda interpreter.
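For confirmation, here is a toy stand-in array (not the iris data) showing the slicing and the more idiomatic `shape` attribute:

```python
import numpy as np

data = np.arange(12).reshape(3, 4)   # 3 rows, 4 columns, like a mini iris.data

first_two_cols = data[:, :2]         # all rows, first two columns
n_rows, n_cols = data.shape          # idiomatic replacement for the len() calls
```

The `[rows, cols]` form is numpy's basic (slice-based) indexing, documented under "Indexing" in the numpy reference; `data.shape` returns the full dimension tuple, so `data.shape[0]` and `data.shape[1]` give the row and column counts directly.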

print(__doc__)

# Code source: Gaël Varoquaux
# Modified for documentation by Jaques Grobler

import numpy as np
import matplotlib.pyplot as plt
from sklearn import linear_model, datasets

# import some data to play with
iris = datasets.load_iris()
X = iris.data[:, :2]  # we only take the first two features.
Y = iris.target

h = .02  # step size in the mesh

logreg = linear_model.LogisticRegression(C=1e5)

# we create an instance of Neighbours Classifier and fit the data.
logreg.fit(X, Y)

# Plot the decision boundary. For that, we will assign a color to each
# point in the mesh [x_min, m_max]x[y_min, y_max].
x_min, x_max = X[:, 0].min() - .5, X[:, 0].max() + .5
y_min, y_max = X[:, 1].min() - .5, X[:, 1].max() + .5
xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
Z = logreg.predict(np.c_[xx.ravel(), yy.ravel()])

# Put the result into a color plot
Z = Z.reshape(xx.shape)
plt.figure(1, figsize=(4, 3))
plt.pcolormesh(xx, yy, Z, cmap=plt.cm.Paired)

# Plot also the training points
plt.scatter(X[:, 0], X[:, 1], c=Y, edgecolors='k', cmap=plt.cm.Paired)
plt.xlabel('Sepal length')
plt.ylabel('Sepal width')

plt.xlim(xx.min(), xx.max())
plt.ylim(yy.min(), yy.max())
plt.xticks(())
plt.yticks(())

plt.show()


regards, Lin

#### How to run a theano.function on TensorVariable

I want to do something similar to looping over TensorVariables. After some research I noticed I could make use of theano.scan to simulate a loop.

I wrote the following code and ran theano.function on a TensorVariable whose numeric value is not yet determined. But apparently theano.function expects numerical values, not symbolic TensorVariables. Is there a way to run a function on symbolic TensorVariables, or alternatively, a way to convert TensorVariables into numpy arrays for input to theano.function?

The code is as follows:

def CI(observed, event_times, estimated_risk):  # C_index in tensor mode
    ti = T.dvector('ti')
    tj = T.dvector('tj')
    o = T.dvector('o')
    has_ones = T.matrix('has_ones')

    omega, updates = theano.scan(fn=lambda tj_element, ti_vector, o_vector: T.dot(-ti_vector + tj_element, o_vector),
                                 outputs_info=None,
                                 sequences=tj,
                                 non_sequences=[ti, o])
    calculate_omega = theano.function(inputs=[tj, ti, o], outputs=omega)
    om = calculate_omega(event_times, event_times, observed)
    om = transform_positives_to_ones(om)
    om_count = count_ones(om)


and this is the error that I get:

'Expected an array-like object, but found a Variable: '
TypeError: ('Bad input argument to theano function with name "../differentiableCIndex.py:116"  at index 2(0-based)', 'Expected an array-like object, but found a Variable: maybe you are trying to call a function on a (possibly shared) variable instead of a numeric array?')


The error is raised for om = calculate_omega(event_times, event_times, observed), because both event_times and observed are TensorVariables.

#### Sentiment analysis Using BernoulliNB Algorithm in C

I have chosen this topic as my college project. I'm interested in learning sentiment analysis, but I don't know where to start with the coding.

Need Help.

I only studied about BernoulliNB till now.
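As a starting point, here is the Bernoulli naive Bayes training and prediction arithmetic in Python rather than C, to keep the sketch short (the toy documents and names are mine); a C version is a direct transcription of the same counting and log-probability steps:

```python
import math

def train_bernoulli_nb(docs, labels):
    """docs: list of sets of words; labels: class per doc.
    Returns priors, per-class word presence probabilities, and the vocabulary."""
    classes = set(labels)
    vocab = set().union(*docs)
    prior, cond = {}, {}
    for c in classes:
        idx = [i for i, y in enumerate(labels) if y == c]
        prior[c] = len(idx) / len(labels)
        for w in vocab:
            df = sum(1 for i in idx if w in docs[i])     # docs of class c containing w
            cond[(w, c)] = (df + 1) / (len(idx) + 2)     # Laplace smoothing
    return prior, cond, vocab

def predict(doc, prior, cond, vocab):
    best, best_lp = None, float("-inf")
    for c in prior:
        lp = math.log(prior[c])
        for w in vocab:  # Bernoulli NB scores absent words too, unlike multinomial NB
            p = cond[(w, c)]
            lp += math.log(p) if w in doc else math.log(1.0 - p)
        if lp > best_lp:
            best, best_lp = c, lp
    return best
```

For sentiment analysis, the documents become sets of tokens from positive/negative reviews; the key design point of the Bernoulli variant is that it also penalizes a class for words that are *absent* from the document.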

#### caffe test error no field named "net" on testing MNIST

I have the same problem as Caffe error: no field named "net" on testing MNIST.

Running

keides2@ubuntu:~/caffe$ build/tools/caffe test -model examples/mnist/lenet_solver.prototxt -weights examples/mnist/lenet_iter_10000.caffemodel -iterations 100

I get the following output:

I0820 11:31:33.820005 113569 caffe.cpp:279] Use CPU.
[libprotobuf ERROR google/protobuf/text_format.cc:245] Error parsing text-format caffe.NetParameter: 2:4: Message type "caffe.NetParameter" has no field named "net".
F0820 11:31:33.844912 113569 upgrade_proto.cpp:79] Check failed: ReadProtoFromTextFile(param_file, param) Failed to parse NetParameter file: examples/mnist/lenet_solver.prototxt
Check failure stack trace:
    @ 0x7f3f9744edaa (unknown)
    @ 0x7f3f9744ece4 (unknown)
    @ 0x7f3f9744e6e6 (unknown)
    @ 0x7f3f97451687 (unknown)
    @ 0x7f3f977fc0c7 caffe::ReadNetParamsFromTextFileOrDie()
    @ 0x7f3f97834b0f caffe::Net<>::Net()
    @ 0x407843 test()
    @ 0x405f7b main
    @ 0x7f3f9645af45 (unknown)
    @ 0x406677 (unknown)
    @ (nil) (unknown)

'lenet_solver.prototxt' and 'lenet_train_test.prototxt' are original (not modified).

And then:

keides2@ubuntu:~/caffe$ printenv PYTHONPATH
/home/keides2/caffe/python


Could you help me?

### QuantOverflow

#### Forecast of ARMA-GARCH model in R

I managed to forecast a GARCH model yesterday and run a Monte Carlo simulation in R. Nevertheless, I can't do the same with an ARMA-GARCH. I tested four different methods without achieving an ARMA-GARCH simulation with my data.

The packages and the data I used:

library(quantmod)
library(tseries)
library(TSA)
library(betategarch)
library(mcsm)
library(PerformanceAnalytics)
library(forecast)
library(fGarch)
library(GEVStableGarch)

getSymbols("DEXB.BR",from="2005-07-01", to="2015-07-01")
STOCK = DEXB.BR
STOCK.rtn=diff(STOCK[,6] )
STOCK.diff = STOCK.rtn[2:length(STOCK.rtn)]
ARI_2_1 = arima(STOCK[,6], order = c(2,1,1))
GA_1_1 = garch(ARI_2_1$residuals, order = c(1,1))

# First tested method
specifi = garchSpec(model = list(ar = c(0.49840, -0.0628), ma = c(-0.4551),
                                 omega = 8.393e-08, alpha = 1.356e-01,
                                 beta = 8.844e-01))
garchSim(spec = specifi, n = 500, n.start = 200, extended = FALSE)

This led to a "NaN" forecast.

garchSim(spec = specifi, n = 500)
n = 1000
armagarch.sim_1 = rep(0, n)
armagarch.sim_50 = rep(0, n)
armagarch.sim_100 = rep(0, n)
for (i in 1:n) {
  armagarch.sim = garchSim(spec = specifi, n = 500, n.start = 200, extended = FALSE)
  armagarch.sim_1[i] = armagarch.sim[1]
  armagarch.sim_50[i] = armagarch.sim[50]
  armagarch.sim_100[i] = armagarch.sim[100]
}

# Second tested method
GSgarch.Sim(N = 500, mu = 0, a = c(0.49840, -0.0628), b = c(-0.4551),
            omega = 8.393e-08, alpha = c(1.356e-01), gm = c(0),
            beta = c(8.844e-01), cond.dist = "norm")

This part works.

n = 10000
Garmagarch.sim_1 = rep(0, n)
Garmagarch.sim_50 = rep(0, n)
Garmagarch.sim_100 = rep(0, n)
for (i in 1:n) {
  Garmagarch.sim = GSgarch.Sim(N = 500, mu = 0, a = c(0.49840, -0.0628), b = c(-0.4551),
                               omega = 8.393e-08, alpha = c(1.356e-01), gm = c(0),
                               beta = c(8.844e-01), cond.dist = "norm")
  Garmagarch.sim_1[i] = Garmagarch.sim[1]
  Garmagarch.sim_50[i] = Garmagarch.sim[50]
  Garmagarch.sim_100[i] = Garmagarch.sim[100]
}

The simulation runs, but

> Garmagarch.sim[1]$model
[1] "arma(2,1)-aparch(1,1) ## Intercept:FALSE"


and

> Garmagarch.sim[50]

#### SNI support added to libtls, httpd in -current

Joel Sing (jsing@) has added server-side Server Name Indication (SNI) support to libtls and, based on that, to httpd.

### StackOverflow

#### Machine learning in skewed data

I am training a neural network on an emotion recognition dataset for five classes of emotion. I have some problems:

1. The dataset is skewed: class1 has 100 observations, but class2 has 3000.

I tried to use Smote to balance the data.

2. The training data is not similar to the testing data; that is a big problem.
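For the skew in point 1, a common alternative to resampling methods like SMOTE is cost-sensitive training with inverse-frequency class weights. Below is a minimal sketch of the "balanced" heuristic (weight_c = n_samples / (n_classes · count_c)); the helper name and the reuse of the example's two class counts are just for illustration:

```python
# Sketch: inverse-frequency class weights for an imbalanced dataset.
# "Balanced" heuristic: weight_c = n_samples / (n_classes * count_c).
def balanced_class_weights(counts):
    n_samples = sum(counts.values())
    n_classes = len(counts)
    return {c: n_samples / (n_classes * n) for c, n in counts.items()}

# Counts mirror the example above (illustrative).
counts = {"class1": 100, "class2": 3000}
weights = balanced_class_weights(counts)
# The rare class gets a proportionally larger weight in the loss.
print(weights)  # class1 is weighted 30x more than class2
```

Most neural-network and scikit-learn loss functions accept such per-class weights directly, which sidesteps generating synthetic samples.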

### TheoryOverflow

#### Fermi level or Fermi energy?

I have searched almost the whole internet for the definitions of Fermi level and Fermi energy as used in semiconductors. There are so many definitions, and I still don't know which is the best one to understand. I really want to understand these Fermi "things" because they have much to do with semiconductors, and I really want to understand semiconductors so I can better understand transistors.

### QuantOverflow

#### Hedging, Delta, Gamma, Vega

I sometimes find it difficult to see how to hedge a portfolio.

Let's say I created a product consisting of an Asian call (strike 1), a vanilla call (strike 2), and an Asian put (strike 1) on a stock called ABC. Now let's say the delta of the total product is 60%, the gamma is 1.5% and the vega is 1.5.

Now if I SHORT this "product", then I can delta-hedge the portfolio by going LONG the underlying (stock ABC) by 0.70 for each product I sell. I think this is correct?

But what about the gamma and the vega?

So I can gamma-hedge as well, but here I cannot just buy/sell the underlying. I need an option on the underlying? And this option needs to have a gamma of 1.5, but do I need to buy or sell the option?

I hope you guys can help me! Thanks,
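For what it's worth, the bookkeeping of a gamma hedge can be sketched with made-up numbers (the hedge option's Greeks below are invented, so this illustrates the mechanics rather than answering for this specific product): buy enough options to cancel the gamma, then redo the delta hedge with the stock, because the options bring their own delta:

```python
# All Greeks are per unit of product/option; the numbers are hypothetical.
prod_delta, prod_gamma = 0.60, 0.015   # the short product's delta and gamma
opt_delta, opt_gamma = 0.40, 0.005     # a traded hedge option (invented Greeks)

# Being short one product leaves portfolio gamma at -prod_gamma.
# Buy n_options so that n_options * opt_gamma - prod_gamma = 0:
n_options = prod_gamma / opt_gamma     # positive => buy options

# Re-do the delta hedge after adding the options: the options contribute
# n_options * opt_delta of delta, so the stock position absorbs the remainder.
stock_qty = prod_delta - n_options * opt_delta   # negative => short stock
print(n_options, stock_qty)
```

Vega works the same way with a second instrument; to neutralize gamma and vega simultaneously you generally need two different options, then a final delta adjustment with the underlying.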

### HN Daily

#### Daily Hacker News for 2016-08-22

The 10 highest-rated articles on Hacker News on August 22, 2016 which have not appeared on any previous Hacker News Daily are:

## August 22, 2016

### CompsciOverflow

#### A Turing Machine that exclusively accepts an infinite string

While reading some proofs in computability theory, I came to the following conclusion:

We can design a Turing Machine which exclusively accepts finite strings (obvious).

Now, while trying the same for infinite strings, I came to the following conclusion:

There exists no Turing machine which exclusively accepts infinite strings.

Proof: Since acceptance by a Turing machine is defined in terms of a finite amount of time, whenever we say that an infinite string $w$ is accepted by a Turing machine, we are really talking about searching for a "finite pattern" in $w$. So for any machine designed for $w$ there exists a finite string $w'$ which contains the same "finite pattern" that we try to find in $w$.

Example of "finite pattern" : A string that contains a 0. (Assuming Binary Strings)

So my question is: is my proof right? And if yes, is there a better way to prove this?

### Lobsters

#### Building a Really, Really Small Android App

An exploration on where bloat in an Android app might live.

I wish we had a performance tag for this sort of thing.

### StackOverflow

#### what method/classifier should I use for a training set with lots of attributes but few examples

Each training example has 100 numeric attributes plus one output class, and about 80% of the attribute values are zero (meaning no data was collected). The attribute values vary in a small range, like (-20, 20). I have 100 examples like this. What method/classifier should I use? I tried KNN, Naive Bayes, SVM, and random forests/trees; none of these methods gives me accuracy above 50% (using 10-fold cross-validation). What should I do?

### UnixOverflow

#### Mplayer garbling top 30 pixels in full screen

On FreeBSD 10.3, switching mplayer (MPlayer SVN-r37862-snapshot-3.4.1) to full screen when watching a video at 1920 × 1080 resolution garbles the top ~30 pixel rows. For example, it looks like this. The top 30 pixel rows are remnants from the frame seen directly before going to full-screen mode.

Luckily mplayer behaves normally if started in full-screen mode, like mplayer -fs file.avi; otherwise it would be completely unusable. But it is still a very annoying problem. Do you have any ideas?

### StackOverflow

#### How to use Dirichlet Process Gaussian Mixture Model in Scikit-learn? (n_components?)

My understanding of "an infinite mixture model with the Dirichlet Process as a prior distribution on the number of clusters" is that the number of clusters is determined by the data as they converge to a certain amount of clusters.

This R implementation https://github.com/jacobian1980/ecostates decides on the number of clusters in this way. The R implementation uses a Gibbs sampler, though, and I'm not sure if that affects this.

What confuses me is the n_components parameters. n_components: int, default 1 : Number of mixture components. If the number of components is determined by the data and the Dirichlet Process, then what is this parameter?

Ultimately, I'm trying to get:

(1) the cluster assignment for each sample;

(2) the probability vectors for each cluster; and

(3) the likelihood/log-likelihood for each sample.

It looks like (1) is the predict method, and (3) is the score method. However, the output of (1) is completely dependent on the n_components hyperparameter.

My apologies if this is a naive question, I'm very new to Bayesian programming and noticed there was Dirichlet Process in Scikit-learn that I wanted to try out.

Here's an example of usage: http://scikit-learn.org/stable/auto_examples/mixture/plot_gmm.html

Here's my naive usage:

from sklearn.mixture import DPGMM
Mod_dpgmm = DPGMM(n_components=3)
Mod_dpgmm.fit(X)
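As a side note, in later scikit-learn versions the Dirichlet-process behaviour lives in BayesianGaussianMixture, where n_components is only an upper bound: the fit drives the weights of unneeded components toward zero. A sketch with synthetic data (the blob parameters are arbitrary):

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.RandomState(0)
# Two well-separated synthetic blobs in 2-D.
X = np.vstack([rng.normal(0, 0.3, (50, 2)), rng.normal(5, 0.3, (50, 2))])

# n_components is a cap; the DP prior prunes components it does not need.
m = BayesianGaussianMixture(
    n_components=10,
    weight_concentration_prior_type="dirichlet_process",
    random_state=0,
).fit(X)

labels = m.predict(X)        # (1) cluster assignment per sample
weights = m.weights_         # (2) mixing weight per component
loglik = m.score_samples(X)  # (3) log-likelihood per sample
print(sum(w > 0.05 for w in weights))  # components with non-negligible weight
```

With well-separated data, most of the 10 allowed components end up with negligible weight, which is the sense in which "the data determines the number of clusters" while n_components stays an upper bound.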


### UnixOverflow

#### make don't know how to make CXXFLAGS. Stop

I am very new to both FreeBSD and compiling code from source and would really appreciate any help. I am trying to compile fastText from source. When I execute the make command it returns the following message:

make don't know how to make CXXFLAGS. Stop


Here are the first few lines from the Makefile (the complete file is available in the fastText GitHub repo mentioned above):

CXX = c++
OBJS = args.o dictionary.o matrix.o vector.o model.o utils.o
INCLUDES = -I.

opt: CXXFLAGS += -O3 -funroll-loops
opt: fasttext

debug: CXXFLAGS += -g -O0 -fno-inline
debug: fasttext


FreeBSD version: 10.3
FreeBSD clang version: 3.4.1
gmake version: 4.1_2

#### Why do some FreeBSD system account names start with an underscore?

Some account names in the FreeBSD passwd file begin with an underscore:

_dhcp:*:65:65:dhcp programs:/var/empty:/usr/sbin/nologin


but others do not:

www:*:80:80:World Wide Web Owner:/nonexistent:/usr/sbin/nologin


What's the significance of this underscore? Is it purely historical or does it serve a practical purpose?

Some more examples can be seen in the FreeBSD ports/UIDs file.

### QuantOverflow

#### Proof of optimal exercise time theorem for American derivative security in N-period binomial asset-pricing model

At least two textbooks (Shreve's Stochastic Calculus for Finance - I, theorem 4.4.5, or Campolieti & Makarov's Financial Mathematics, proposition 7.8) prove the optimal exercise theorem, which says that the stopping time $\tau^* = \min \{n : V_n = G_n\}$ maximizes $$V_n = \max_{\tau \in S_n} \tilde{\mathrm{E}}\Big[\mathrm{I}_{\tau \leq N}\frac{1}{(1+r)^{\tau-n}}G_{\tau}\Big] \qquad (1)$$ by demonstrating that the stopped process $\frac{1}{(1+r)^{n \wedge \tau^*}}V_{n \wedge \tau^*}$ is a martingale under the risk-neutral probability measure.

But how can someone conclude from this fact that $\tau^*$ is actually maximizing $(1)$?
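For orientation, here is a compressed sketch of the textbook argument (hedged; see the cited proofs for the details):

```latex
% Sketch: the discounted value process $(1+r)^{-n}V_n$ is a supermartingale
% that dominates the discounted payoff, so for ANY stopping time $\tau \in S_n$
% the optional sampling theorem gives
V_n \;\ge\; \tilde{\mathrm{E}}_n\Big[\mathrm{I}_{\tau \leq N}\frac{1}{(1+r)^{\tau-n}}G_{\tau}\Big].
% For $\tau^*$ the stopped process is a martingale, so the inequality holds
% with equality, and $V_{\tau^*} = G_{\tau^*}$ on $\{\tau^* \leq N\}$ by the
% very definition of $\tau^*$:
V_n \;=\; \tilde{\mathrm{E}}_n\Big[\mathrm{I}_{\tau^* \leq N}\frac{1}{(1+r)^{\tau^*-n}}G_{\tau^*}\Big].
% Hence $\tau^*$ attains the maximum in (1): every stopping time is bounded
% above by $V_n$, and $\tau^*$ achieves that bound.
```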

### StackOverflow

#### Alize LIA_RAL installation [on hold]

I managed to install Alize, and now when I try to install LIA_RAL I'm getting errors.

I'm on a VM running Ubuntu 16.04.

The errors occur when I run ./configure and make.

### Lobsters

#### The Exceptional Beauty of Doom 3's Source Code

This might be the only time I’ll ever submit a Kotaku article here.

It’s kinda interesting seeing somebody else’s take on that.

### QuantOverflow

#### Pricing a Vanilla swap between coupons; What rates to use?

Vanilla swap question. I entered into a 5Y fixed-for-floating HUF swap. The fixed leg pays annual coupons; the floating leg pays semi-annual coupons.

One month later I want to price it. I set up my future values for the fixed coupons for the next 5Y plus the notional at the end, and the next [coupon + notional] for the float (the coupon is now in 5 months, and a floating leg is valued at par right after it pays its coupon).

I have the BUBOR rates. For my discount factors for my PV, do I use straight line interpolation of the rates? Or use the next interest rate? For example, with .39Y to go before the floating rate coupon, do I use the 0.5Y rate, the .25Y rate, or the interpolated (weighted average of rate and time) of both?

Also, under continuous compounding (e), since my fixed leg is ACT/365 and BUBOR is ACT/360, do I have to multiply the BUBOR rate by (365/360) before computing my discount rate to make them equivalent?
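As a numeric illustration of the two sub-questions (the zero rates below are invented, so this shows the arithmetic, not the market convention to choose): straight-line interpolation is linear in time between the two pillar rates, and an ACT/360 quote is put on an ACT/365 basis by scaling with 365/360:

```python
import math

# Hypothetical annualized zero rates at the 0.25y and 0.50y pillars.
r_25, r_50 = 0.0140, 0.0155
t = 0.39  # time to the floating coupon, in years

# Straight-line (linear in time) interpolation between the two pillars.
w = (t - 0.25) / (0.50 - 0.25)
r_t = (1 - w) * r_25 + w * r_50

# Restate an ACT/360 quote on an ACT/365 basis.
r_365 = r_t * 365 / 360

# Continuously compounded discount factor for time t.
df = math.exp(-r_365 * t)
print(round(r_t, 6), round(df, 6))
```

Using the nearest pillar rate instead of interpolating would just replace r_t with r_25 or r_50; the day-count rescaling step is unchanged.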

### Planet Theory

#### Christian Comment on the Jesus Wife Thing misses the important point

In 2012 a Professor of Divinity at Harvard, Karen King, announced that she had a fragment that seemed to indicate that Jesus had a wife. It was later found to be fake. The article that really showed it was a fake was in the Atlantic Monthly, here. A Christian publication called Breakpoint told the story here.

When I read a story about person X being proven wrong, the question uppermost in my mind is: how did X react? If they retract, then they still have my respect and can keep doing whatever work they were doing. If they dig in their heels and insist they are still right, or that a minor fix will make the proof correct (more common in our area than in history), then they lose all my respect.

The tenth paragraph has the following:

Within days of the article’s publication, King admitted that the fragment is probably a forgery. Even more damaging, she told Sabar that “I haven’t engaged the provenance questions at all” and that she was “not particularly” interested in what he had discovered.

Dr. King should have been more careful and more curious initially (though hindsight is wonderful). However, her admitting it was probably a forgery (probably?) is... okay. I wish she were more definite in her admission, but... I've seen far worse.

A good scholar will admit when they are wrong. A good scholar will look at the evidence and be prepared to change their minds.

Does Breakpoint itself do this when discussing homosexuality or evolution or global warming? I leave that to the reader.

However, my major point is that the difference between a serious scientist and a crank is what one does when confronted with evidence that one is wrong.

### Lobsters

#### Firsthand report of the challenges of libre firmware on Eastern SoCs

From the EOMA86 team.

### StackOverflow

#### Python (scikit learn) lda collapsing to single dimension

I'm very new to scikit learn and machine learning in general.

I am currently designing an SVM to predict whether a specific amino acid sequence will be cut by a protease. So far the SVM method seems to be working quite well.

I'd like to visualize the distance between the two categories (cut and uncut), so I'm trying to use linear discriminant analysis, which is similar to principal component analysis, using the following code:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

lda = LinearDiscriminantAnalysis(n_components=2)
targs = np.array([1 if _ else 0 for _ in XOR_list])
DATA = np.array(data_list)
X_r2 = lda.fit(DATA, targs).transform(DATA)
plt.figure()
for c, i, target_name in zip("rg", [1, 0], ["Cleaved", "Not Cleaved"]):
    plt.scatter(X_r2[targs == i], X_r2[targs == i], c=c, label=target_name)
plt.legend()
plt.title('LDA of cleavage_site dataset')


However, the LDA is only giving a 1D result

In: print X_r2[:5]
Out: [[ 6.74369996]
[ 4.14254941]
[ 5.19537896]
[ 7.00884032]
[ 3.54707676]]


However, PCA gives two dimensions with the data I am inputting:

pca = PCA(n_components=2)
X_r = pca.fit(DATA).transform(DATA)
print X_r[:5]
Out: [[ 0.05474151  0.38401203]
[ 0.39244191  0.74113729]
[-0.56785236 -0.30109694]
[-0.55633116 -0.30267444]
[ 0.41311866 -0.25501662]]


edit: here is a link to two google-docs with the input data. I am not using the sequence information, just the numerical information that follows. The files are split up between positive and negative control data. Input data: file1 file2
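One note on the 1-D result: it is expected behaviour rather than a bug. With $C$ classes, LDA can produce at most $C-1$ discriminant directions, because the between-class scatter matrix has rank at most $C-1$; with two classes that is a single axis. A small sketch with synthetic data demonstrating the rank bound:

```python
import numpy as np

rng = np.random.RandomState(0)
# Synthetic two-class data in 5 dimensions.
X = rng.randn(40, 5)
y = np.array([0] * 20 + [1] * 20)

# Between-class scatter: sum over classes of n_c * (mu_c - mu)(mu_c - mu)^T
mu = X.mean(axis=0)
S_b = np.zeros((5, 5))
for c in (0, 1):
    Xc = X[y == c]
    d = (Xc.mean(axis=0) - mu).reshape(-1, 1)
    S_b += len(Xc) * (d @ d.T)

# With two classes the class-mean deviations are collinear, so the rank
# is 1: LDA can only produce one useful component, whatever n_components says.
print(np.linalg.matrix_rank(S_b))  # 1
```

PCA has no such class-based cap, which is why it happily returns two components on the same data.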

### Planet Emacsen

#### Ben Simon: Well Duh: a more intelligent emacs file opening strategy

Last week I finally modernized my PHP emacs setup. I did so by selecting two powerful modes (php-mode and web-mode) and implementing a bit of code to easily toggle between the two. I included this comment in my blog post:

At some point, I could codify this so that files in the snippets directory, for example, always open in web-mode, whereas files loaded from under lib start off in php-mode.

When I wrote the above statement I assumed that I'd need to dust off the ol' Emacs Lisp manual and write some code to analyze the directory of the file being opened. Turns out, I was vastly overthinking this.

The standard way to associate a mode with a file is by using the elisp variable auto-mode-alist. This is Emacs 101 stuff, and is something I've been doing for 20+ years. In my emacs config file I had this line:

(add-to-list 'auto-mode-alist '("[.]php$" . php-mode))

Which says to open .php files in php-mode. What I'd never done, nor considered, is that you don't have to limit yourself to matching the base filename. The auto-mode-alist is matched against the entire path. To open up 'snippet' files in web-mode is trivial. I just put the following code in my .emacs file:

(add-to-list 'auto-mode-alist '("[.]php$" . php-mode))
(add-to-list 'auto-mode-alist
             '("\\(pages\\|snippets\\|templates\\)/.*[.]php?$" . web-mode))

The order is key here. add-to-list pushes new items to the front of the list. So the first line adds a general rule to open up all .php files in php-mode, and the second line adds a specific rule: if the full path to the file contains the word pages, snippets or templates, then open the file in web-mode. It's not perfect, but files matching this path convention are far more likely to be in the right mode for me. While I'm a bonehead for not seeing this sooner, I sure do appreciate trivial solutions.

### StackOverflow

#### Java 8 unbound reference syntax struggle

I'm trying to create a method that puts a Function's results into a Consumer using unbound references (I think). Here's the scenario. With JDBC's ResultSet you can get row values by index. I have a Bean instance I want to place selected values into. I'm looking for a way to avoid writing boilerplate mapping code and instead achieve something like:

static <T> void copy(Consumer<T> setter, Function<T, Integer> getter, Integer i);

And call it like:

copy(Bean::setAValue, ResultSet::getString, 0)

I don't want to bind Bean and ResultSet to instances too early, since I want this to be usable with any bean or ResultSet. The example I've been trying to work from is:

public static <T> void println(Function<T,String> function, T value) {
    System.out.println(function.apply(value));
}

Called via:

println(Object::toString, 0L);

### Lobsters

#### Deniz Altinbüken on Chain Replication (old and new) & Wes Chow (Mini) on Tiered Replication | Papers We Love

### StackOverflow

#### tensorflow rnn model path

I have trained the language model using TensorFlow as given in the tutorial. For training I used the following command:

bazel-bin/tensorflow/models/rnn/ptb/ptb_word_lm --data_path=./simple-examples/data/ --model small

The training was successful with the following output at the end.
Epoch: 13 Train Perplexity: 37.196
Epoch: 13 Valid Perplexity: 124.502
Test Perplexity: 118.624

But I am still confused about where the trained model is stored and how to use it.

#### Combining K Means with anomaly detection via normal distribution

I have some questions concerning machine learning and anomaly detection. My task is to detect anomalies in a big dataset of variables. First I extracted some features, both continuous and boolean. Next I perform scaling (normalization = (x-μ)/σ) followed by KMeans clustering. Then I would like to focus on anomalies in the big clusters (I assume those observations that fall far away from the centers of the big clusters could also be treated as anomalies). Following the tips from the Coursera course taught by Andrew Ng, I would like to use the normal distribution to do that. For each big cluster:

1. Fit a normal distribution to each feature x_i (find the parameters μ_i and σ²_i) to compute probabilities.
2. Compute the product of the probabilities of each feature x_i (assuming the features are independent); in this step I compute the probability of the observation x.
3. Flag those observations whose probability is smaller than ϵ (0.05).

Detailed info is here: https://www.youtube.com/watch?v=reDIsljRhcc

For small clusters (up to 50 obs) I treat all of the observations as anomalies.

Unfortunately my results have not been satisfactory so far. I ended up with probability equal to 1200 (!!) for some x and 0 for others. So that is why I am asking for your help. Maybe someone can point out what I am doing wrong?

1. In general, does the idea of combining clustering and anomaly detection via the normal distribution sound OK?
2. How should I deal with boolean variables, or variables which take on integer values in a small range (for example 1, 2, 5, 8, 9)?
3. Might the extremely high probability values be caused by the scaling of the variables? For some features (in some clusters) I obtained μ = 0.000006 and σ² = 0.0000008.
I found on Wikipedia the following paragraph: "In the limit when σ tends to zero, the probability density f(x) eventually tends to zero at any x ≠ μ, but grows without limit if x = μ". It seems like this is the problem I am facing. How do I deal with it? Thanks for any help!

### Lobsters

#### GitLab 8.11 released with Issue Boards and Merge Conflict Resolution

#### Sinatra 2.0

### StackOverflow

#### How can I get my array to only be manipulated locally (within a function) in Ruby?

Why is my array globally manipulated when I run the Ruby code below? And how can I get arrays to be manipulated only within the function's scope?

a = [[1,0],[1,1]]

def iterate(array)
  array.map { |row| return true if row.keep_if { |i| i != 1 } == [] }
  false
end

puts a.to_s
puts iterate(a)
puts a.to_s

$ ruby test.rb output:

[[1, 0], [1, 1]]
true
[[0], []]


I can't get it to work. I've even tried .select{true} and assigning the result to a new name. How does scope work in Ruby for arrays? Just for reference, $ ruby -v:

ruby 2.2.1p85 (2015-02-26 revision 49769) [x86_64-linux]

### Lobsters

#### What are you working on this week?

This is the weekly thread to discuss what you have done recently and are working on this week. Please be descriptive and don't hesitate to champion your accomplishments or ask for help, advice or other guidance.

#### Android 7.0 Nougat is now rolling out to Nexus devices

### CompsciOverflow

#### How can I efficiently find the optimal order to apply special offers to a shopping cart?

Given a list of items which represent items in a shopping cart, and a list of available special offers which replace one or more regular items to lower the cost of those items, how can I decide the order to apply the special offers in to minimize the final basket price?

For example, I have 4 items in my cart:

• Coke \$2
• Coke \$2
• Sandwich \$3
• Chocolate bar \$1

Total: \$8

There are two special offers in store:

• Buy one get one free coke (\$2 saving).
• Coke, chocolate bar and a sandwich for \$4.50 (\$1.50 saving).

One method of determining the order offers are applied might be to sort them by the savings they give. After applying the offers using this method my cart now looks like this:

• Buy one get one free coke \$2
• Sandwich \$3
• Chocolate bar \$1

Total: \$6

There is no meal deal offer applied, because after the coke deal is applied there are no coke items left to make a meal deal. This method of sorting by savings may seem to work, but there are cases in which it can fail, for example if the same deals were in place and my cart looked like this:

• Coke \$2
• Sandwich \$3
• Chocolate bar \$1
• Coke \$2
• Sandwich \$3
• Chocolate bar \$1

Total: \$12

After deals are applied, the two coke items are substituted for the promotional offer first (it being the deal with the greatest saving). There is no other applicable deal, so the algorithm ends, reducing the basket price by \$2. Obviously there is an error here, because if two meal deals were applied before the coke deal, the price would have been reduced by \$3.

The naive solution to this problem would be to enumerate each possible permutation of the list of special offers, and find the one that minimizes the basket total when applied. This would have a factorial runtime based on the number of special offers available.

Is it possible to improve on a factorial runtime and if not, are there any efficient approximate solutions?
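The naive approach can be sketched directly: try every permutation of the offers, greedily apply each offer as often as the remaining items allow, and keep the cheapest outcome. The encoding below (a Counter of consumed items plus a bundle price) is one possible way to model the example's offers:

```python
from itertools import permutations
from collections import Counter

# Offers encoded as (items consumed, bundle price); prices from the example.
OFFERS = [
    (Counter({"coke": 2}), 2.0),                            # BOGOF coke
    (Counter({"coke": 1, "sandwich": 1, "choc": 1}), 4.5),  # meal deal
]
PRICES = {"coke": 2.0, "sandwich": 3.0, "choc": 1.0}

def best_total(cart):
    """Brute force: try every offer ordering, apply greedily, keep the cheapest."""
    best = float("inf")
    for order in permutations(OFFERS):
        items, total = Counter(cart), 0.0
        for needed, price in order:
            # Apply this offer as many times as the remaining items allow.
            while all(items[k] >= v for k, v in needed.items()):
                items -= needed
                total += price
        # Remaining items are charged at full price.
        total += sum(PRICES[k] * n for k, n in items.items())
        best = min(best, total)
    return best

cart = ["coke", "sandwich", "choc"] * 2
print(best_total(cart))  # 9.0: two meal deals beat applying the coke deal first
```

This is factorial in the number of offers, as the question says; it mainly serves as a correctness baseline against which faster heuristics (or a dynamic program over item multisets) could be checked.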

### QuantOverflow

#### Adding an asset with negative expected return to a portfolio

Say I have a portfolio with expected return $10\%$ and volatility $20\%$. Suppose I have another asset that is one of:

1. Negatively correlated
2. Positively correlated
3. Uncorrelated

with negative expected return $\mu < 0$ and volatility $\sigma$.

From intuition I think that if we are allowed to use leverage, we should be adding this to the portfolio under scenarios 1 and 3 to reduce risk (and apply leverage to achieve the desired rate of return). Is this true? How would I size this position if I want to target 10%?

Is this scenario similar to the case of shorting one asset and buying another that is positively correlated with it? In both instances (long/short positively correlated, or long/long negatively or zero correlated), they should be risk reducing. And if we're allowed to use leverage, we should be able to achieve the target return at lower risk? Though this also depends on the bounds of expected return and correlation?

Basically, is it ever smart to add something with negative expected value to a portfolio, depending on its correlation to the portfolio?

### Lobsters

#### Open Source synchronous multi-room audio player

### Fefe

#### Oh, what dripping irony, when the Süddeutsche ...

Oh, what dripping irony, when the Süddeutsche censors its own article about censorship :-)

#### I don't know about you, but when ...

I don't know about you, but when what feels like the majority of public discourse agrees with me, I get the feeling that I'm fighting for the wrong side. That's how I feel right now about the Amadeu-Antonio-Stiftung. I notice that at the moment I can't think of a single advocate for them. There is nobody. Sure, there are people who get personal and call me "Querfront" or an asshole. Fine; there will always be those. But substantive arguments for this foundation and its conduct? Nothing! Nowhere!
I therefore assume that adverse, probably self-inflicted circumstances are keeping me from perceiving the (eloquent, valid) pro-arguments for this foundation. Hence this appeal: if anyone knows pro-arguments, or better yet, people who will themselves passionately present the pro-arguments for this foundation, then please put me in contact. I would hate to do them an injustice. Precisely when all the facts look crystal clear, as if the position were obvious, it generally isn't; one has merely got stuck in a filter bubble. But beware: I don't want to hear tactical arguments here (along the lines of "yes, they're screwing up right now, but we still need them for other reasons for this other campaign"). I want to hear a defense of what they do, not why they are perhaps the lesser evil compared to the Nazis. That's not enough for me.

#### While we wait for advocates of the Amadeu-Antonio-Stiftung ...

While we wait for advocates of the Amadeu-Antonio-Stiftung, here is another blog post that applies the standards of the Basic Law and comes to ugly conclusions. I never thought that I, as an old atheist, would link to evangelisch.de again.

### StackOverflow

#### Caffe constant multiply layer

How can I define a multiply-by-constant layer in Caffe, like MulConstant in Torch? I need a way to add it manually, with a predefined constant, to an existing network. I tried the following, but Caffe fails to parse it:

layers {
  name: "caffe.ConstantMul_0"
  type: "Eltwise"
  bottom: "caffe.SpatialConvolution_0"
  top: "caffe.ConstantMul_0"
  eltwise_param {
    op: MUL
    coeff: 0.85
  }
}

### CompsciOverflow

#### Are vector clocks useful in centralized systems?

Vector clocks seem to be a common way to synchronize the partial ordering of events across all clients in a distributed, peer-to-peer system.
Is there any benefit to using them to order events in a centralized system, where one node has the power to order all events anyway? If one computer can decide the order anyway, there would be no need for vector clocks, right?

### Lobsters

#### Idea Fight - A web application for prioritizing ideas

#### Testing, for people who hate testing

### StackOverflow

#### Text Classification/Document Classification with Sequence Tagging with Mallet

I have documents arranged in folders as classes, called categories. For a new input (such as a question asked), I have to identify its category. What is the best way to do this using MALLET? I've gone through multiple articles about this but couldn't find such a way. Also, do I need to do sequence tagging on the input text?

#### Kotlin on Android: map a cursor to a list

In Kotlin, what's the best way to iterate through an Android Cursor object and put the results into a list? My auto-converted Java:

val list = ArrayList<String>()
while (c.moveToNext()) {
    list.add(getStringFromCursor(c))
}

Is there a more idiomatic way? In particular, can it be done in a single assignment of a read-only list? E.g....

val list = /*mystery*/.map(getStringFromCursor)

... or some other arrangement, where the list is assigned fully formed.

#### tensorflow merge input and output

I would like to use two models in TensorFlow in a row, fitting the first one and using it directly as input for the second one. But I haven't found a good way to do it.
I tried to proceed as follows:

x = tf.placeholder('float', shape=[None, image_size[0], image_size[1]])
y1_ = tf.placeholder('float', shape=[None, image_size[0], image_size[1], 1])
y2_ = tf.placeholder('float', shape=[None, image_size[0], image_size[1], labels_count])
image = tf.reshape(x, [-1, image_size[0], image_size[1], 1])

# y1 first output, to fit
W_conv = weight_variable([1, 1, 1, labels_count])
b_conv = bias_variable([labels_count])
y1 = conv2d(image, W_conv) + b_conv
cross_entropy1 = tf.reduce_sum(tf.nn.sigmoid_cross_entropy_with_logits(y1, y1_))
train_step1 = tf.train.GradientDescentOptimizer(LEARNING_RATE).minimize(cross_entropy)

# Then use as input the following
im_y1 = tf.zeros_initializer([None, image_size[0], image_size[1], 2])
im_y1[:,:,:,0] = x
im_y1[:,:,:,1] = y1

The idea is to first minimise cross_entropy(y1, y1_) with parameters W_conv and b_conv, then use y1 as an input by constructing im_y1 as described. But as written it doesn't work, because tf.zeros_initializer refuses to take the argument None. What is the right way to pipeline different fits in the same model in TensorFlow? Thanks for any comments!

### Lobsters

#### Announcing Citus 5.2

### TheoryOverflow

#### Enumerating all simply typed lambda terms of a given type

How can I enumerate all simply typed lambda terms which have a specified type? More precisely, suppose we have the simply typed lambda calculus augmented with numerals and iteration, as described in this answer. How can I enumerate all lambda terms of type N (natural number)? For example, the first few lambda terms of type N are:

zero
succ zero
succ (succ zero), K zero zero
succ (succ (succ zero)), K zero (suc zero), K (suc zero) zero, iter zero suc zero

and so on. How can I systematically continue this pattern, while ensuring that only well-typed terms are generated?
### Lobsters

#### Google Intrusion Detection Problems

### QuantOverflow

#### Sharpe Ratio and its annualization

My question is related to "How to annualize Sharpe Ratio?" but is a bit different. Under the assumption of IID returns, if the excess return is positive, the SR increases with the time horizon, with factor $\sqrt{T}$. Looked at in this way, it seems that simply by increasing the time horizon the risk/reward improves. But if we take the variance instead of the standard deviation, this effect disappears; moreover, the ratio remains constant over time. This fact seems strange to me. What do you think?

### StackOverflow

#### Grouping GPS coordinates [on hold]

Hello, I am new to data mining. I am using K-means to cluster coordinates using Euclidean distance. The question is: is there any way by which I can map the clustered coordinates to their respective attributes in the original dataset? The original dataset looks like this:

GLOBALEVENTID|Year|EventCode|ActionGeo_Lat|ActionGeo_Long|
|534550298|2015| 046| 35.7449| -86.7489|
|534550299|2015| 0331| -37.5627| 143.863|
|534550300|2015| 071| -38.2348| 146.395|
|534550301|2015| 010| 35.6501| -80.5164|
|534550302|2015| 020| 23.0| -102.0|
|534550303|2015| 193| 23.0| -102.0|
|534550304|2015| 193| 23.0| -102.0|
|534550305|2015| 020| 37.7334| -84.2999|
|534550306|2015| 020| 42.3442| -75.1704|
|534550307|2015| 020| -18.15| 177.5|
|534550308|2015| 012| 11.0| 78.0|
|534550309|2015| 051| -2.0729| 146.937|
|534550310|2015| 051| -2.0729| 146.937|
|534550311|2015| 012| 11.0| 78.0|
|534550312|2015| 012| 11.0| 78.0|
|534550313|2015| 012| 41.5834| -72.7622|
|534550314|2015| 138| 39.0| 35.0|
|534550315|2015| 120| -10.0| -55.0|
|534550316|2015| 080| 10.5167| 76.2167|
|534550317|2015| 020| 41.5834| -72.7622|

This is the output of one cluster after clustering the GPS points:
Cluster with a size of 17 starts here:

[[-16.5272, 29.9841], [-16.5272, 29.9841], [-17.8178, 31.0447], [-17.8178, 31.0447], [-17.8178, 31.0447], [-16.8925, 34.6558], [-16.8925, 34.6558], [-16.8925, 34.6558], [-16.8925, 34.6558], [-15.6667, 35.2], [-15.6667, 35.2], [-17.8178, 31.0447], [-17.8178, 31.0447], [-13.5, 34.0], [-16.6389, 32.022], [-16.6389, 32.022], [-16.6389, 32.022]]

Cluster ends here.

So I want a way to retrieve the attributes from the original data set for the clustered coordinates. Is this possible, or is there an alternative solution to the problem?

#### Predicting SPC (Statistical Process Control)

I will give a brief explanation of my scenario. The company mass-produces components like valves/nuts/bolts etc. which need to be measured for dimensions (like length, radius, thickness etc.) for quality purposes. As it is not feasible to inspect all the pieces, they are chosen in batches. For example: from every batch of 100 pieces, 5 will be randomly selected, and the mean of their dimensions is measured and noted for drawing SPC control charts (which plot the mean dimension on the y axis and the batch number on the x axis).

Even though there are a number of factors (like operator efficiency, machine/tool condition etc.) which affect the quality of the product, they don't seem to be measurable. My objective is to develop a machine learning model to predict the (mean) product dimensions of the coming batch samples. This will help the operator forecast whether there is going to be any significant dimensional variation, so that he can pause work, figure out potential reasons, and thus prevent wastage of product/material.

I have some idea about R programming and machine learning techniques like decision trees/regression etc., but I couldn't land on a proper model for this, mainly because I couldn't think of the independent variables for this situation. I don't have much idea about time series modelling, though. Will someone throw some insights/ideas/suggestions on how to tackle this?
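Before reaching for a learned model, the batch means themselves already define the classic Shewhart chart limits; a minimal sketch (the numbers are made up) that flags a new batch mean falling outside mean ± 3 standard deviations of the in-control history:

```python
import statistics

# Hypothetical batch means of a measured dimension (mm); made-up numbers
batch_means = [10.02, 9.98, 10.01, 9.99, 10.00, 10.03, 9.97, 10.02, 10.00, 10.60]

# baseline limits from the earlier, in-control batches
center = statistics.mean(batch_means[:-1])
sigma = statistics.stdev(batch_means[:-1])
ucl, lcl = center + 3 * sigma, center - 3 * sigma

# flag the newest batch if it falls outside the control limits
latest = batch_means[-1]
out_of_control = not (lcl <= latest <= ucl)
```

A forecasting model (e.g. a time-series model on the sequence of batch means) could then try to predict the next mean before it is measured, but the control-limit check above is the conventional baseline to compare against.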
I am sorry that I had to write a long story, but I just wanted to make things as clear as possible. Thanks in advance. Sreenath

#### Is there anything wrong with using this custom MySQLi escaping PHP function?

I wrote a basic wrapper function to escape a string using MySQLi. Is there anything wrong with using this? Is it better than the original? Is it useful?

The function takes two arguments: $conn, which is the MySQLi connection, and &$var, which is the string you want to escape.

function escapestr($conn, &$var) {
    $var = $conn->real_escape_string($var);
    return $var;
}

Usage:

$conn = mysqli_connect("localhost", "username", "password", "my_favourite_db");
$userInput = $_GET["input"]; // value: this is my "inputted" string
$userInput = escapestr($conn, $userInput); // value: this is my \"inputted\" string

Or, since the argument is passed by reference, it can update the variable directly:

$conn = mysqli_connect("localhost", "username", "password", "my_favourite_db");
$userInput = $_GET["input"]; // value: this is my "inputted" string
escapestr($conn, $userInput); // value: this is my \"inputted\" string


### QuantOverflow

#### Definition of BSE's Investor Categorywise Turnover

I am not entirely sure that this question is on-topic here, but the intent of the question is to understand the definition of a financial metric precisely, and to understand what use it serves in quantitative financial analysis.

I was trying to understand the definition of some data that I have downloaded from BSE's website. It is ostensibly called F&O Investor Categorywise Turnover, and is available here. I have downloaded a certain history of the data and placed it here.

This is what it looks like:

From the data, or the chart, it can be seen that the daily buy and sell quantities are the same for all investor categories. I can believe that this would be true for proprietary traders who cannot carry over their positions, but this also seems to be true for "Others", which includes private and domestic institutional investors, as well as foreign institutional investors.

My questions:

• is there a definition for these metrics that means that they must (almost) balance at the end of the day? Is there an equilibrium condition that causes them to be almost equal?
• is there a use for these metrics, either macroeconomic or financial?

Thanks.

### UnixOverflow

#### NFS mount fails at boot time

I have the following in /etc/fstab on FreeBSD:

venture:/usr/redacted    /usr/local/redacted   nfs     rw      0       0


This fails during boot. However, after boot, the following command succeeds

mount -t nfs venture:/usr/redacted /usr/local/redacted


Two related questions:

1) The last time I rebooted at the console (this machine is in a datacenter), I'm pretty sure I saw an explanatory message at boot time regarding the failure to mount. I think it had something to do with resolving the hostname. However, this message does not appear in /var/log/messages with the other boot-time messages; is there someplace else I should be looking?

2) Any thoughts about what could be preventing the hostname from resolving at boot time, yet causing no problem 30 seconds later from the command prompt?
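If the failure really is name resolution not being ready yet, one commonly suggested workaround (an assumption about the cause, not a confirmed diagnosis) is to let the system retry the mount after the network is up, using the bg and/or late options in fstab:

```
venture:/usr/redacted    /usr/local/redacted   nfs     rw,bg,late      0       0
```

With bg, a failed NFS mount is retried in the background instead of blocking the boot; the late keyword defers the mount to the late stage of boot. Enabling the netwait facility in rc.conf is another option worth looking at.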

### StackOverflow

#### difference between (>>=) and (>=>)

I need some clarification regarding (>>=) and (>=>).

*Main Control.Monad> :type (>>=)
(>>=) :: Monad m => m a -> (a -> m b) -> m b
*Main Control.Monad> :type (>=>)
(>=>) :: Monad m => (a -> m b) -> (b -> m c) -> a -> m c


I know about the bind operator (>>=), but I am not getting the context where (>=>) is useful. Please explain with a simple toy example.

Edit: Corrected based on @Thomas's comments
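A rough Python analogue (using None to stand in for a failed computation; an analogy, not real Haskell semantics): (>>=) feeds a value into a failure-prone function, while (>=>) composes two failure-prone functions directly, with no intermediate value in sight:

```python
# bind, i.e. (>>=): pipe a possibly-absent value into a function that may fail
def bind(x, f):
    return None if x is None else f(x)

# Kleisli composition, i.e. (>=>): glue two failure-prone functions together
def kleisli(f, g):
    return lambda a: bind(f(a), g)

def half(n):            # fails (returns None) on odd input
    return n // 2 if n % 2 == 0 else None

def dec(n):             # fails on non-positive input
    return n - 1 if n > 0 else None

# one pipeline, no temporaries: half, then dec, failure propagating throughout
half_then_dec = kleisli(half, dec)
```

So (>=>) is to monadic functions what ordinary function composition is to plain functions: it lets you build pipelines without naming the value flowing through them.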

### Fefe

#### The International Criminal Court in The Hague has ...

The International Criminal Court in The Hague has heard a case of cultural destruction for the first time. It concerns Islamists who destroyed historic shrines in Timbuktu. Here is what happened:

It is the first time that the court in The Hague has tried a case of cultural destruction.

It is also the first time a suspected Islamist militant has stood trial at the ICC and the first time a suspect has pleaded guilty.

Of course, that doesn't help the shrines much any more. But I do find it remarkable that the first person to stand trial for his deeds there is an Islamist.

#### The Süddeutsche is trying to find out how ...

The Süddeutsche is trying to find out how, and how much, Facebook actually censors.

Funny: for a few years now, nobody has asked me any more why I am not on Facebook and don't distribute my content there. My impression is that we have moved into the guilty-conscience phase. People all still have accounts, but they only use them with a guilty conscience. Just like Geocities and Myspace back in the day. It is only lethargy that keeps you there.

#### The Russians are otherwise known for their tactful diplomacy ...

The Russians are otherwise known rather for their tactful diplomacy, but that doesn't always work. Current case: the Russians brag that they are bombing Syria from an Iranian air base, and Iran throws them off the base.

### CompsciOverflow

#### Prove that $C = \{ x \in N : [0,x] \subseteq W_x \}$ is not saturated

How would one go about proving, using the second recursion theorem, that $C = \{ x \in N : [0,x] \subseteq W_x \}$ (where $W_x$ is the domain of $\phi_x$) is not saturated?

Below is my attempt at a proof so far. I suspect it's wrong, but I can't find the mistake, if there is one. In the latter case, how can it be fixed?

EDIT: There is. The Second Recursion Theorem guarantees that $n$ exists, but it doesn't guarantee that $n \in C$.

A set $A$ is saturated $\overset{\Delta}{\equiv}$ $x \in A \wedge \phi_x = \phi_y \Rightarrow y \in A$

So, suppose $C$ is saturated.

The second recursion theorem states that for all $h$ total, computable: $$\exists n : \phi_n = \phi_{h(n)}$$

It would then be sufficient to construct a total computable $h$ and show that for all $n \in C$, $h(n) \not\in C$, including the $n$ such that $\phi_n = \phi_{h(n)}$, guaranteed to exist by the 2RT.

This way we would have $n \in C$ such that $\phi_n = \phi_{h(n)}$ but $h(n) \not\in C$, so the saturation property wouldn't hold.

Let

$$g(x,y) = \begin{cases} \uparrow & y = 0 \\ y & y \neq 0 \end{cases}$$

The smn theorem guarantees that there is a computable function $s$ s.t. $\phi_{s(x)} (y) = g(x,y)$.

Clearly $0 \not\in Dom(\phi_{s(x)})$ for any $x$ and thus $[0,x] \not\subseteq W_{s(x)}$: thus $s(x) \not\in C$ for all $x$.

Let $h = s$.

Now $\forall x \in C,\ h(x) \not\in C$, and by the second recursion theorem we have a counterexample that shows that the saturation property does not hold:

$$\exists n : n \in C \ \wedge\ \phi_n = \phi_{h(n)} \ \wedge\ h(n) \not\in C$$

This concludes the proof.

#### Formal way to model or describe distributed systems architecture

I've been tasked to create the systems architecture for a distributed system.

One approach to designing this system is to pick systems architecture patterns, and then evaluate different technologies that implement those architectural patterns.

For example, a particular architecture might call for a message bus, and given that, I could choose between various off-the-shelf open source or commercial projects that implement a message bus.

While this approach yields a nice white-board diagram, and a high-level understanding of how the system will work, some drawbacks are:

• it's difficult to gauge the performance of the system as a whole without fully implementing it
• it's difficult to determine how well each pattern / implementation will mesh with the other components
• because of that, choosing between patterns tends to come down to gut feelings, like "Kafka is cool, I used it on project X and it did really well"
• there are no hard guarantees about the behavior of the system as a whole (consistency, availability, etc.)

Is there a formal approach to modeling distributed systems? Ideally one that provides a way to abstract patterns, and provide analytic tools for making predictions about the behavior of the system?

### UnixOverflow

#### mysqld_safe not starting after install MySQL 5.7.13 port in FreeBSD/amd64 10.3 using pkg

I have a problem running a MySQL server. I've just installed FreeBSD 10.3 and I want to run a MySQL server on it, but the process doesn't start.

Here are all the commands I ran after installing FreeBSD, step by step:

portsnap fetch extract
pkg update
pkg install mysql57-server


/* Here MySQL mentions a .mysql_secret file with the root password, but it is not generated at all. I can search for it, but there is no result... */

find / -iname .mysql_secret


When I try to first run MySQL using this command:

mysqld_safe --initialize --user=mysql


I get this one:

mysqld_safe Logging to '/var/db/mysql/host.err'
mysqld_safe Starting mysqld daemon with databases from /var/db/mysql
mysqld_safe mysqld from pid file /var/db/mysql/host.pid ended


Here you are /var/db/mysql/host.err

2016-08-22T11:56:27.6NZ mysqld_safe Starting mysqld daemon with databases from /var/db/mysql
2016-08-22T11:56:27.533572Z 0 [ERROR] --initialize specified but the data directory has files in it. Aborting.
2016-08-22T11:56:27.533635Z 0 [ERROR] Aborting

2016-08-22T11:56:27.6NZ mysqld_safe mysqld from pid file /var/db/mysql/host.pid ended


I found something similar:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=209512

There is still no solution there. Any ideas? I really need MySQL. I have tried MySQL 5.6 too; same problem...

And finally, /usr/local/etc/mysql/my.cnf:

# $FreeBSD: branches/2016Q3/databases/mysql57-server/files/my.cnf.sample.in 414707 2016-05-06 14:39:59Z riggs$

[client]
port                            = 3306
socket                          = /tmp/mysql.sock

[mysql]
prompt                          = \u@\h [\d]>\_
no_auto_rehash

[mysqld]
user                            = mysql
port                            = 3306
socket                          = /tmp/mysql.sock
basedir                         = /usr/local
tmpdir                          = /var/db/mysql_tmpdir
secure-file-priv                = /var/db/mysql_secure
log-bin                         = mysql-bin
log-output                      = TABLE
master-info-repository          = TABLE
relay-log-info-repository       = TABLE
relay-log-recovery              = 1
slow-query-log                  = 1
server-id                       = 1
sync_binlog                     = 1
sync_relay_log                  = 1
binlog_cache_size               = 16M
expire_logs_days                = 30
enforce-gtid-consistency        = 1
gtid-mode                       = ON
safe-user-create                = 1
lower_case_table_names          = 1
explicit-defaults-for-timestamp = 1
myisam-recover-options          = BACKUP,FORCE
open_files_limit                = 32768
table_open_cache                = 16384
table_definition_cache          = 8192
net_retry_count                 = 16384
key_buffer_size                 = 256M
max_allowed_packet              = 64M
query_cache_type                = 0
query_cache_size                = 0
long_query_time                 = 0.5
innodb_buffer_pool_size         = 1G
innodb_data_home_dir            = /var/db/mysql
innodb_log_group_home_dir       = /var/db/mysql
innodb_data_file_path           = ibdata1:128M:autoextend
innodb_temp_data_file_path      = ibtmp1:128M:autoextend
innodb_flush_method             = O_DIRECT
innodb_log_file_size            = 256M
innodb_log_buffer_size          = 16M
innodb_autoinc_lock_mode        = 2

[mysqldump]
max_allowed_packet              = 256M
quote_names
quick


### StackOverflow

#### Poor results with tensorflow DNNClassifier and cross_val_score

I am using python 3.5, tensorflow 0.10 and its DNNClassifier. If I perform a single training and testing stage, as below, the test result is decent: accuracy = 0.9333

import tensorflow as tf
from tensorflow.contrib import learn
from sklearn.cross_validation import cross_val_score, ShuffleSplit, train_test_split
from sklearn.metrics import accuracy_score
import numpy as np
from sklearn import datasets, cross_validation

# load the iris data set used below
iris = datasets.load_iris()

feature_columns = learn.infer_real_valued_columns_from_input(iris.data)

x_train, x_test, y_train, y_test = train_test_split(iris.data, iris.target, test_size=0.20, random_state = 20)

model = learn.DNNClassifier(hidden_units=[5],
n_classes=3,
feature_columns=feature_columns,
)

model.fit(x_train, y_train, steps=1000)
predicted = model.predict(x_test)

print('Accuracy on test set: %f' % accuracy_score(y_test, predicted))


If I use sklearn's cross_val_score, the final result is much poorer, about 0.33 accuracy:

model = learn.DNNClassifier(hidden_units=[5],
n_classes=3,
feature_columns=feature_columns,
)

scores = cross_val_score(estimator=model,
X=iris.data,
y=iris.target,
scoring = 'accuracy',
cv=5,
fit_params={'steps': 1000},
#                          verbose=100
)

print(scores)
print(np.mean(scores))


The scores and their mean are:

[ 0.          0.33333333  1.          0.33333333  0.        ]
0.333333333333


What's wrong with my cross-validation code?
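One plausible culprit, offered as an assumption rather than a confirmed diagnosis: the iris targets come sorted by class, and if cross_val_score falls back to an unshuffled K-fold split (for instance because the TF estimator is not recognized as an sklearn classifier), each test fold is a contiguous, class-skewed block. A splitter-free sketch of the effect:

```python
import numpy as np

# iris-style labels: 150 samples, sorted by class (50 of each)
y = np.repeat([0, 1, 2], 50)

# an unshuffled 5-fold split tests on contiguous blocks of 30 samples
folds = np.array_split(np.arange(150), 5)
test_classes = [set(y[idx]) for idx in folds]
# e.g. the first fold tests only on class 0, so scores collapse
```

If that is the cause, passing an explicitly shuffled splitter via the `cv` argument (ShuffleSplit is already imported in the question's code), or shuffling X and y together beforehand, should restore sensible scores.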

### CompsciOverflow

#### Relation between "syntax" and "grammar" in CS

I am sure that "grammar" and "syntax" are two different things in CS, e.g.:

The syntax of the Java language is defined by a context-free grammar.

My questions are:

What is the difference between the definitions of "grammar" and "syntax" in CS?

What is the relation between them? Can we describe it using set theory?

#### Set notation of the set of all strings

How do I present the complement using set notation?

I guess it has to be shown as the universal set minus {aa, bb}, but I do not know how to represent the universal set in set notation, since the strings of the universal set can be anything over the alphabet {a, b}. So how do I represent it? I guess something like {{{a,b}*}*} might work if we are going from outside to inside. Any help is appreciated.
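For reference, and assuming the language being complemented is $\{aa, bb\}$ over the alphabet $\Sigma = \{a, b\}$, the standard way to write this is:

$$\overline{\{aa, bb\}} = \Sigma^* \setminus \{aa, bb\} = \{a, b\}^* \setminus \{aa, bb\}$$

Here $\{a, b\}^*$ already denotes the set of all strings over the alphabet, and since $(\Sigma^*)^* = \Sigma^*$, the extra stars in $\{\{\{a,b\}^*\}^*\}$ add nothing.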

### Lobsters

#### Suggested hardcopy book for algorithms and patterns for my upcoming 16hr trip to China [& 16 back]?

As you may know from my previous post I am going to China to see family. I am learning specifically Perl 6 but want a good refresh of basics and a book might help me. Thank you in advance.

Edited — Changed first word from “Best” to “Suggested” and added a question mark for clarity. - Author.

### StackOverflow

#### Classes with static arrow functions

I'm currently implementing the static land specification (an alternative to fantasy land). I want to use not only plain objects as types but also ES2015 classes with static methods. I've implemented these static methods as arrow functions in curried form instead of normal functions. However, this isn't possible with ES2015 classes:

class List extends Array {
static map = f => xs => xs.map(x => f(x))
static of = x => [x]
}


My map doesn't need its own this, because it is merely a curried function on the List constructor. To make it work I have to write static map(f) { return xs => xs.map(x => f(x)) }, which is very annoying.

• Why can't I use arrow functions along with an assignment expression in ES2015 classes?
• Is there a concise way to achieve my goal anyway?

#### How do I map a range to my custom slider?

I'm making a custom slider. I have two numbers:

Min = 120, Max = 400

I want: 120 = 0%, 400 = 100%

I don't know the equation for this. I want to convert a value in a custom range to a percentage. How do I do it?
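The mapping is linear interpolation between the two endpoints; a small sketch (the function names are mine):

```python
def to_percent(value, lo=120.0, hi=400.0):
    # linear map: lo -> 0%, hi -> 100%
    return (value - lo) / (hi - lo) * 100.0

def from_percent(pct, lo=120.0, hi=400.0):
    # inverse map: 0% -> lo, 100% -> hi
    return lo + pct / 100.0 * (hi - lo)
```

The same two formulas work for any slider range; only lo and hi change.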

#### Using tso to identify outliers warning

I'm trying to identify the outliers in my data using the code below with the function tso from the package tsoutliers in R. I'm fairly certain my data has level shifts. I'm getting the warning below, and I'm unclear what the issue is.

Code:

library("tsoutliers")

tsSeries<-ts(xData, frequency=168)

product.outlier<-tso(tsSeries,types=c("AO","LS","TC"))
plot(product.outlier)


Warning messages:

1: In auto.arima(x = c(11, 14, 17, 5, 5, 5.5, 8, NA, 5.5, 6.5, 8.5,  :
Unable to fit final model using maximum likelihood. AIC value approximated
2: In auto.arima(x = c(11, 14, 17, 5, 5, 5.5, 8, NA, 5.5, 6.5, 8.5,  :
Unable to fit final model using maximum likelihood. AIC value approximated


I updated my code and added na.approx. The code now finds outliers in the data and returns the set of warnings below. Why does the code now find outliers when it didn't before? Are these legitimate outliers? What do the warnings below mean, and are there settings I should change in tso to resolve them? All tips very much appreciated.

Updated code:

test<-ts(na.approx(xData), frequency=168)

product.outlier<-tso(test,types=c("AO","LS","TC"))


Warning messages:

1: In auto.arima(x = c(11, 14, 17, 5, 5, 5.5, 8, 6.75, 5.5, 6.5, 8.5,  :
Unable to fit final model using maximum likelihood. AIC value approximated
2: In sqrt(diag(fit$var.coef)[id]) : NaNs produced 3: In auto.arima(x = c(11, 14, 17, 5, 5, 5.5, 8, 6.75, 5.5, 6.5, 8.5, : Unable to fit final model using maximum likelihood. AIC value approximated 4: In auto.arima(x = c(11, 14, 17, 5, 5, 5.5, 8, 6.75, 5.5, 6.5, 8.5, : Unable to fit final model using maximum likelihood. AIC value approximated  Data: dput(xData) c(11, 14, 17, 5, 5, 5.5, 8, NA, 5.5, 6.5, 8.5, 4, 5, 9, 10, 11, 7, 6, 7, 7, 5, 6, 9, 9, 6.5, 9, 3.5, 2, 15, 2.5, 17, 5, 5.5, 7, 6, 3.5, 6, 9.5, 5, 7, 4, 5, 4, 9.5, 3.5, 5, 4, 4, 9, 4.5, 6, 10, NA, 9.5, 15, 9, 5.5, 7.5, 12, 17.5, 19, 7, 14, 17, 3.5, 6, 15, 11, 10.5, 11, 13, 9.5, 9, 7, 4, 6, 15, 5, 18, 5, 6, 19, 19, 6, 7, 7.5, 7.5, 7, 6.5, 9, 10, 5.5, 5, 7.5, 5, 4, 10, 7, 5, 12, 6, NA, 4, 2, 5, 7.5, 11, 13, 7, 8, 7.5, 5.5, 7.5, 15, 7, 4.5, 9, 3, 4, 6, 17.5, 11, 7, 6, 7, 4.5, 4, 4, 5, 10, 14, 7, 7, 4, 7.5, 11, 6, 11, 7.5, 15, 23.5, 8, 12, 5, 9, 10, 4, 9, 6, 8.5, 7.5, 6, 5, 8, 6, 5.5, 8, 11, 10.5, 4, 6, 7, 10, 11.5, 11.5, 3, 4, 16, 3, 2, 2, 8, 4.5, 7, 4, 8, 11, 6.5, 7.5, 17, 6, 6.5, 9, 12, 17, 10, 5, 5, 9, 3, 8.5, 11, 4.5, 7, 16, 11, 14, 6.5, 15, 8.5, 7, 6.5, 11, 2, 2, 13.5, 4, 2, 16, 11.5, 3.5, 9, 16.5, 2.5, 4.5, 8.5, 5, 6, 7.5, 9.5, NA, 9.5, 8, 2.5, 4, 12, 13, 10, 4, 6, 16, 16, 13, 8, 12, 19, 19, 5.5, 8, 6.5, NA, NA, NA, 15, 12, NA, 6, 11, 8, 4, 2, 3, 4, 10, 7, 5, 4.5, 4, 5, 11.5, 12, 10.5, 4.5, 3, 4, 7, 15.5, 9.5, NA, 9.5, 12, 13.5, 10, 10, 13, 6, 8.5, 15, 16.5, 9.5, 14, 9, 9.5, 11, 15, 14, 5.5, 6, 14, 16, 9.5, 23, NA, 19, 12, 5, 11, 16, 8, 11, 9, 13, 6, 7, 3, 5.5, 7.5, 19, 6.5, 5.5, 4.5, 7, 8, 7, 10, 11, 13, NA, 12, 1.5, 7, 7, 12, 8, 6, 9, 15, 9, 3, 5, 11, 11, 8, 6, 3, 7.5, 4, 7, 7.5, NA, NA, NA, NA, 6.5, 2, 16.5, 7.5, 8, 8, 5, 2, 7, 4, 6.5, 4.5, 10, 6, 4.5, 6.5, 9, 2, 6, 3.5, NA, 5, 7, 3.5, 4, 4.5, 13, 19, 8.5, 10, 8, 13, 10, 10, 6, 13.5, 12, 11, 5.5, 6, 3.5, 9, 8, NA, 6, 5, 8.5, 3, 12, 10, 9.5, 7, 24, 7, 9, 11.5, 5, 7, 11, 6, 5.5, 3, 4.5, 4, 5, 5, 3, 4.5, 6, 10, 5, 4, 4, 9.5, 5, 7, 6, 3, 13, 5.5, 5, 7.5, 
3, 5, 6.5, 5, 5.5, 6, 4, 3, 5, NA, 5, 5, 6, 7, 8, 5, 5.5, 9, 6, 8.5, 9.5, 8, 9, 6, 12, 5, 7, 5, 3.5, 4, 7.5, 7, 5, 4, 4, NA, 7, 5.5, 6, 8.5, 6.5, 9, 3, 2, 8, 15, 6, 4, 10, 7, 13, 14, 9.5, 9, 18, 6, 5, 4, 6, 4, 11.5, 17.5, 7, 8, 10, 4, 7, 5, 9, 6, 5, 4, 8, 4, 2, 1.5, 3.5, 6, 5.5, 5, 4, 8, 10.5, 4, 11, 9.5, 5, 6, 11, 21, 9.5, 11, 13.5, 7.5, 13, 10, 7, 9.5, 6, 10, 5.5, 6.5, 12, 10, 10, 6.5, 2, 8, NA, 10, 5, 4, 4.5, 5, 7.5, 12, 22, 5, 8.5, 2.5, 3, 10.5, 4, 7, 13, 4, 3, 5, 6.5, 3, 9, 9.5, 16, NA, 4, 12, 4.5, 7, 5.5, 8, 14, 3, 8, 12, 14, 7, 8, 6, 8.5, 6, 6.5, 15.5, 13, 3.5, 12, 7, 6, NA, 3, 5.5, 8.5, 9, 12, 13, 8, 6.5, 8, 3, 5, 16.5, 2, 7, 6, 2, 5, 6.5, 3, 3, 7, 2, NA, 13, 7, 16, 13, 12.5, 12, 7, 13, 11, 21.5, 16, 20, 3, 4, 5, 7, 11, 7, 9, 11, 7, 13, 4, 14, 5, 12, 6, 7, 9, 12, 7, 12.5, 6.5, 16, 5, 12, 9, 9.5, 9, 7, 9.5, 3, 13, 8, 7, 7, 7, 9, 6, 6, 11, 15, 9, 6, 19, 10.5, 4, 6, 14.5, 9, 17, 14, 4, 16, 5, 6.5)

### Fefe

#### Do you know the hypothesis that the state is fundamentally ...

Do you know the hypothesis that the state is fundamentally incompetent, and that it is better to let private industry handle things? Well, in Scotland they tested that with the public schools.

More than 200 schools built in Scotland under private finance initiative (PFI) schemes are now at least partially owned by offshore investment funds.

Yeah! Now that is a good choice! Finally somebody is doing it properly! All that waste of money by the public sector! And? How is it going? Utopia?

The 17 schools built in Edinburgh under PPP1 were closed for repairs earlier this year after construction faults were found.

Oh. Hmm. Well, er, that would surely have happened under other circumstances too!1!!

### CompsciOverflow

#### What is the problem of sorting into contiguous runs called?

I am having a bit of brain fail and I can't remember the name of the following problem (so I can find some literature around it...).
Given a sequence of values, sort it in such a way that equal elements are compacted into runs (contiguous subsequences of identical elements). For instance:

$$\{1, 2, 4, 2, 1, 3\} \rightarrow \{ 2, 2, 4, 3, 1, 1 \}$$

The runs are not otherwise sorted (only equality comparison is required, not ordering), and they are compacted (there should not be two different runs containing equal elements).

### Lobsters

#### Shit non-mathematicians say about maths

#### Reflections On NixOS

### CompsciOverflow

#### Reducing k Vertex Cover to SAT (last clause problem)

I am working on a transformation from k Vertex Cover to SAT and I have some issues regarding the last clause in the boolean formula. Here is my approach:

$$\forall \text{ nodes } n_i \in V, \text{ invent variables } v_i$$

$$\forall \text{ edges } (n_i, n_j) \in E, \text{ invent the terms } (v_i \lor v_j) \Rightarrow C_{edges}$$

$$\text{encode the proposition: exactly } k \ v_i \text{ variables are true} \Rightarrow C_{prop}$$

$$\varphi = C_{edges} \land C_{prop}$$

where

$$C_{edges} = \bigwedge_{(n_i, n_j) \in E} (v_i \lor v_j)$$

Now, we invent $\frac{n(n-1)}{2}$ new variables, $y_{ij}$, with the following meaning:

$$y_{ij} = 1 \iff \text{exactly } j \text{ variables in } v_1 \cdots v_i \text{ are } 1$$

Formally,

$$y_{ij} \equiv (y_{i-1,j} \land \neg v_i) \lor (y_{i-1,j-1} \land v_i)$$

with

$$i = 2 \cdots n, \ j = 1 \cdots i, \text{ and for } i = 1 \text{ we have } y_{11} \equiv v_1$$

Thus, $C_{prop} = y_{nk}$.

Now, I should write $y_{nk}$ in terms of $v_1 \cdots v_n$, but I suppose that might exceed the polynomial-time constraint of the transformation. Is this approach correct, or am I missing something? Is there a way I can express $y_{nk}$ without writing it in terms of the $v_i$ in the end?
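The recurrence itself can be sanity-checked directly (a small brute-force test, not the CNF encoding): evaluating $y_{ij}$ by the definition above should agree with literally counting the true variables. In an actual CNF, one would keep the $y_{ij}$ as fresh variables and add Tseitin-style clauses for each equivalence, so $y_{nk}$ never needs to be expanded in terms of the $v_i$:

```python
from itertools import product

def y(v, i, j):
    # y_{ij}: exactly j of v[0..i-1] are true, via the question's recurrence
    if j < 0 or j > i:
        return False
    if i == 0:
        return j == 0
    return (y(v, i - 1, j) and not v[i - 1]) or (y(v, i - 1, j - 1) and v[i - 1])

# brute-force agreement with direct counting, over all assignments of 4 variables
ok = all(y(v, 4, k) == (sum(v) == k)
         for v in product([False, True], repeat=4)
         for k in range(5))
```

Since each $y_{ij}$ equivalence only mentions three variables, Tseitin conversion yields a constant number of clauses per equivalence, keeping the whole encoding polynomial in $n$.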
### QuantOverflow

#### The Relation Between the Ricci Flow and the Black-Scholes-Merton Equation

Grisha Perelman once wrote that

The Ricci-flow equation, a type of heat equation, is a distant relative of the Black-Scholes equation that bond traders around the world use to price stock and bond options.

Wilmott has shown how to get from the BS equation to the heat equation, but I wonder whether there is any proof that you can get the BS equation from the Ricci flow.

### infra-talk

#### The Tradeoff of Multiple Repositories

More often than I expect, I come across software projects that consist of multiple source control repositories. The reasons vary. Perhaps it's thought that the web frontend and backend aren't tightly coupled and don't need to be in the same repository. Perhaps there's code that's meant to be used throughout an entire organization. Regardless, there are real costs involved in the decision to have a development team work in distinct, yet related, repositories. I believe these costs are always overlooked.

## Double (or n Times) the Gruntwork

The most obvious cost involved is additional gruntwork. Let's imagine a project with a mobile app and a web service, each having its own Git repository. When it's time to start a new feature, the feature branch will need to be created twice. When the work is finished, two pull requests will need to be made. When it's appropriate to make a commit, it might need to be done twice. When it's time to push, it might need to be done twice. To help manage all of this, an extra terminal might be appropriate.

Individually, none of these costs is very significant. Collectively, they represent a moderate inconvenience and cognitive burden. I've seen developers weigh this and decide it's worth the cost, because they are trying to achieve some other ideal. Ultimately, these inconveniences are just symptoms of a more fundamental, and easily overlooked, tradeoff.
## Context: Not Version-Controlled

A repository is essentially a set of snapshots in time. For any commit, it's easy to see not only what changes were made, but also precisely what other files existed and what they contained at that point in time. This is pretty obvious; after all, it's one of the biggest selling points of version control. With a project consisting of one single repository, that snapshot encapsulates everything there is to know about the source code.

Once there are multiple repositories involved in a single project, this context is fragmented. This fragmentation manifests in various ways. Let's look at some examples:

• When moving code between repositories, neither one has knowledge of the other. Information about where the code came from or went is lost.
• If a branch in your frontend repo depends on the server running a corresponding branch, there's no native or reasonable way to express that relationship. Information is lost.

## The Real Tradeoff of Multiple Repositories

Breaking a project into multiple repositories involves a fundamental tradeoff. By doing so, information about the broader context of the application is pushed entirely outside of version control. Although it's possible to work to counteract this, for example by establishing team practices, using Git submodules, or building custom machinery, it will require work. That's work spent to regain what you get for free by using a single repository.

Therefore, the most likely place that this information will move is into the culture and individual minds of the team. This is a much more ephemeral and unreliable place than a source repository. It makes it harder to onboard new developers and to coordinate things like continuous integration.

## Conclusion

It's up to your unique situation whether it's a win or a loss to split your code into multiple repositories, but the costs are both real and easily overlooked. I'd strongly suggest weighing these tradeoffs thoughtfully.
And, if you find yourself on a project where these costs are bringing you down, I've written a blog post on how to super-collide your repositories together.

The post The Tradeoff of Multiple Repositories appeared first on Atomic Spin.

### Fred Wilson

#### The Spillover Effect

The New York Times has a piece today about how Bay Area tech companies are giving the Phoenix, Arizona economy a boost.

I think this is a trend we are just seeing the start of. A big theme of board meetings I've been in over the past year is the crazy high cost of talent in the big tech centers (SF, NYC, LA, Boston, Seattle) and the need to grow headcount in lower-cost locations. This could mean outside of the US in places like Eastern Europe, Asia, and India, but for the most part the discussions I have been in have centered on cities in the US where there is a good, well-educated workforce, an increasing number of technically skilled workers, and a much lower cost of living. That could be Phoenix, or it could be Indianapolis, Pittsburgh, Atlanta, or a host of other really good places to live in the US.

Just like we are seeing tech seep into the strategic plans of big Fortune 1000 companies, we are seeing tech seep into the economic development plans of cities around the US (and around the world). Tech is where the growth opportunities are right now.

A good example of how this works is Google's decision to build a big office in NYC in the early part of the last decade and build (and buy) engineering teams in that office. Google is now a major employer in NYC, and the massive organization they have built has now spilled over into the broader tech sector in NYC. My partner Albert calls Google's NYC office "the gift that Google gave NYC."

We will see that story play out across many cities in the US (and outside of the US) in the next five to ten years. It is simply too expensive for most companies to house all of their employees in the Bay Area or NYC.
And so they will stop doing that and go elsewhere for talent. That's a very healthy and positive dynamic for everyone, including the big tech centers that are increasingly getting too expensive for many tech employees to live in.

### Lobsters

#### Website enumeration insanity: how our personal data is leaked

#### Crypto Tokens and the Coming Age of Protocol Innovation

### StackOverflow

#### How can I improve my LSTM code on TensorFlow?

I am trying to predict household power consumption using an LSTM. The following is a small portion of the input data (the total training input is about ~1M records) that I am using to train my model, followed by my LSTM code using TensorFlow. I need some help with:

(1) Verification of my model: I would like a network with 2 LSTM layers of size 512 with time_step 10. I think I only have 1 LSTM hidden layer, but I am not sure how to add another.

(2) General improvement of my model: the accuracy does not seem to converge to any specific value, and I am not quite sure where to look at this point. Any advice on modeling/stacking layers/choice of optimizer/etc. will be much appreciated.
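For (1): in the TF 0.x API used in the question, a stack of LSTM layers is usually built by wrapping the cell in tf.nn.rnn_cell.MultiRNNCell. An untested sketch against that (now long-deprecated) API, reusing the question's names (n_hidden, num_layers, x):

```
# inside RNN(), replacing the single-cell lines:
lstm_cell = tf.nn.rnn_cell.BasicLSTMCell(n_hidden)
stacked_cell = tf.nn.rnn_cell.MultiRNNCell([lstm_cell] * num_layers)  # num_layers = 2
outputs, states = tf.nn.rnn(stacked_cell, x, dtype=tf.float32)
```

For (2), if the 1201 classes are mutually exclusive, a softmax cross-entropy loss is more usual than the sigmoid variant, and a smaller learning rate or an adaptive optimizer is worth trying; these are general suggestions, not guarantees.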
Input data:

16/12/2006;17:24:00;4.216;0.418;234.840;18.400;0.000;1.000;17.000
16/12/2006;17:25:00;5.360;0.436;233.630;23.000;0.000;1.000;16.000
16/12/2006;17:26:00;5.374;0.498;233.290;23.000;0.000;2.000;17.000
16/12/2006;17:27:00;5.388;0.502;233.740;23.000;0.000;1.000;17.000
16/12/2006;17:28:00;3.666;0.528;235.680;15.800;0.000;1.000;17.000
16/12/2006;17:29:00;3.520;0.522;235.020;15.000;0.000;2.000;17.000
16/12/2006;17:30:00;3.702;0.520;235.090;15.800;0.000;1.000;17.000
16/12/2006;17:31:00;3.700;0.520;235.220;15.800;0.000;1.000;17.000
16/12/2006;17:32:00;3.668;0.510;233.990;15.800;0.000;1.000;17.000
16/12/2006;17:33:00;3.662;0.510;233.860;15.800;0.000;2.000;16.000

LSTM code:

from getdata_new import DataFromFile
import numpy as np
import tensorflow as tf
import datetime

# ==========
# MODEL
# ==========

# Parameters
learning_rate = 0.01
training_iters = 100000
batch_size = 1000
display_step = 10

# Network Parameters
seq_len = 10     # Sequence length
n_hidden = 512   # hidden layer num of features
n_classes = 1201
n_input = 13
num_layers = 2

trainset = DataFromFile(filename="/tmp/train_data.txt", delim=";")

# Define weights
weights = {
    'out': tf.Variable(tf.random_normal([n_hidden, n_classes]))
}
biases = {
    'out': tf.Variable(tf.random_normal([n_classes]))
}

x = tf.placeholder("float", [None, seq_len, n_input])
y = tf.placeholder("float", [None, n_classes])

def RNN(x, weights, biases):
    x = tf.transpose(x, [1, 0, 2])
    x = tf.reshape(x, [-1, n_input])
    x = tf.split(0, seq_len, x)
    lstm_cell = tf.nn.rnn_cell.BasicLSTMCell(n_hidden)
    outputs, states = tf.nn.rnn(lstm_cell, x, dtype=tf.float32)
    # Linear activation, using outputs computed above
    return tf.matmul(outputs[-1], weights['out']) + biases['out']

pred = RNN(x, weights, biases)

# Define loss and optimizer
cost = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(pred, y))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate).minimize(cost)

# Evaluate model
correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

# test
testPred = tf.argmax(pred, 1)

# Initializing the variables
init = tf.initialize_all_variables()

# Launch the graph
with tf.Session() as sess:
    sess.run(init)
    acc = 0.0
    for step in range(1, training_iters + 1):
        batch_x, batch_y = trainset.train_next(batch_size, seq_len)
        # Run optimization op (backprop)
        sess.run(optimizer, feed_dict={x: batch_x, y: batch_y})
        if step % display_step == 0:
            # Calculate batch accuracy
            acc = sess.run(accuracy, feed_dict={x: batch_x, y: batch_y})
            # Calculate batch loss
            loss = sess.run(cost, feed_dict={x: batch_x, y: batch_y})
            print "Iter " + str(step) + ", Minibatch Loss= " + \
                  "{:.6f}".format(loss) + ", Training Accuracy= " + \
                  "{:.5f}".format(acc)
    print "Optimization Finished!"

Result:

Iter 99860, Minibatch Loss= 0.015933, Training Accuracy= 0.29900
Iter 99870, Minibatch Loss= 0.015993, Training Accuracy= 0.26200
Iter 99880, Minibatch Loss= 0.015783, Training Accuracy= 0.30500
Iter 99890, Minibatch Loss= 0.016071, Training Accuracy= 0.27200
Iter 99900, Minibatch Loss= 0.015390, Training Accuracy= 0.40300
Iter 99910, Minibatch Loss= 0.015247, Training Accuracy= 0.43700
Iter 99920, Minibatch Loss= 0.015264, Training Accuracy= 0.42700
Iter 99930, Minibatch Loss= 0.015212, Training Accuracy= 0.43800
Iter 99940, Minibatch Loss= 0.016164, Training Accuracy= 0.26500
Iter 99950, Minibatch Loss= 0.015923, Training Accuracy= 0.30800
Iter 99960, Minibatch Loss= 0.016338, Training Accuracy= 0.22600
Iter 99970, Minibatch Loss= 0.016327, Training Accuracy= 0.19000
Iter 99980, Minibatch Loss= 0.016322, Training Accuracy= 0.22300
Iter 99990, Minibatch Loss= 0.016608, Training Accuracy= 0.15400
Iter 100000, Minibatch Loss= 0.016809, Training Accuracy= 0.10700
Optimization Finished!

### CompsciOverflow

#### Booth bit-pair recoding technique

In the Booth bit-pair recoding technique, how do we multiply a multiplicand by -2 or 2?
For example, while multiplying 01101 (+13, the multiplicand) by 11010 (-6, the multiplier), the multiplier recodes to the digits 0, -1, -2, so we compute 01101 x (0, -1, -2). How do we multiply the multiplicand by -2?

### QuantOverflow

#### Sending orders to CME [on hold]

For example, I'm getting quotes (CQG, Rithmic, etc.), doing some math (bash, C#, Python, etc.), and then sending orders to CME. What do I need in order to send orders? I heard that brokers have APIs for such purposes, right? CQG API, Rithmic API, IBPy - is that what I need? Or should I rent a CME membership? My aim is not HFT, just ordinary intraday trading, so I don't need anything like a super-low-latency API/datafeed.

### CompsciOverflow

#### Query regarding the structure of the graph over all (known/unknown) NPC problems

Consider the set of all NP-complete problems. Since every problem in the set is reducible to/from at least one known NP-complete problem, let's create a directed graph with the following convention:

1. If a problem B is reducible from problem A, create an edge from A to B. We assume an oracle who knows all possible reductions creates the edges.

Here are a few questions regarding the graph.

Q1. Is the graph strongly connected (i.e. every problem in the graph is reachable from, or reducible from, every other problem for every instance)? Guess: I presume the answer is yes.

Q2. For any two problems (say A, B), no matter what the distance between them is, is there guaranteed to be at most a polynomial blowup in problem size when we reduce from A to B? Guess: Unsure. For any problem P, if we reduce from one of its neighboring problems, there is a polynomial increase or decrease in space. But I am not sure this holds if the problems are arbitrarily far apart. The definition of NP-completeness needs one single reduction in polynomial time, but a space analogue of reduction for all NPC problem pairs seems out of reach.

#### ATL Property Pages [on hold]

I am trying to learn ATL and COM and I'm currently looking at property pages.
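On the Booth bit-pair recoding question above: a recoded digit of ±2 is applied with a one-bit left shift (multiplying by 2), negating in two's complement when the digit is -2. A small sketch, using a hypothetical 10-bit product width:

```python
def to_twos_complement(value, bits=10):
    """Encode a (possibly negative) integer in fixed-width two's complement."""
    return value & ((1 << bits) - 1)

def apply_digit(multiplicand, digit, bits=10):
    """Apply a bit-pair recoded digit of 2 or -2: shift left once,
    then negate (two's complement) if the digit is negative."""
    shifted = multiplicand << 1          # multiplying by 2 is a one-bit left shift
    if digit < 0:
        shifted = -shifted               # for -2: shift, then two's-complement negate
    return to_twos_complement(shifted, bits)

# +13 * 2 = +26, and +13 * -2 = -26 encoded in 10 bits (1024 - 26 = 998).
```

In hardware the negation is done the usual way: complement the shifted multiplicand and add 1 via the adder's carry-in.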
I added a property page to a blank project. The project builds successfully, but when I try to run it I get a pop-up error window saying:

    Unable to start program \ATLProject\Debug\ATLProject.dll
    \ATLProject\Debug\ATLProject.dll is not a valid Win32 application.

I am new to ATL, so the Microsoft documentation doesn't help me much here. Can someone recommend a good tutorial or tell me what I need to do in order to be able to open the property page? Thank you!

### Lobsters

#### Systemd Rolls Out Its Own Mount Tool

Lennart Poettering commented on reddit. Comments

### Planet Emacsen

#### Irreal: Mark Rectangle

If you're like me, you don't often have occasion to mark rectangles, so it's easy to forget how simple it is to do. Here's a nice reminder from Tony Garnock-Jones.

### QuantOverflow

#### Discount factor from Euribor future [on hold]

I have U6 = 100.339, which represents the Euribor interest rate future. How do I get the corresponding discount factor?

### Planet Emacsen

#### Flickr tag 'emacs': DSCF2762_gmpd

MGdesigner posted a photo: The secret attendee in COSCUP 2016

#### Flickr tag 'emacs': DSCF2758_gpmd

MGdesigner posted a photo: The secret attendee in COSCUP 2016

### StackOverflow

#### Partial Function Application in Scala

I'm learning functional programming by following the book Functional Programming in Scala by Paul Chiusano and Rúnar Bjarnason. I'm on chapter 3, where I am implementing some companion functions to a class representing a singly-linked list, which the authors provided.
```scala
package fpinscala.datastructures

sealed trait List[+A]
case object Nil extends List[Nothing]
case class Cons[+A](head: A, tail: List[A]) extends List[A]

object List {
  def sum(ints: List[Int]): Int = ints match {
    case Nil => 0
    case Cons(x, xs) => x + sum(xs)
  }

  def product(ds: List[Double]): Double = ds match {
    case Nil => 1.0
    case Cons(0.0, _) => 0.0
    case Cons(x, xs) => x * product(xs)
  }

  def apply[A](as: A*): List[A] =
    if (as.isEmpty) Nil
    else Cons(as.head, apply(as.tail: _*))

  def tail[A](ls: List[A]): List[A] = ls match {
    case Nil => Nil
    case Cons(x, xs) => xs
  }

  // ... (more functions)
}
```

The functions I am implementing go inside the object List, being companion functions. While implementing dropWhile, whose signature is:

```scala
def dropWhile[A](l: List[A])(f: A => Boolean): List[A]
```

I came across some questions regarding partial function application. In the book, the authors say that the predicate f is passed in a separate argument group to help the Scala compiler with type inference: if we do this, Scala can determine the type of f without any annotation, based on what it knows about the type of the List, which makes the function more convenient to use. So, if we passed f in the same argument group, Scala would force the call to become something like this:

```scala
val total = List.dropWhile(example, (x: Int) => 6 % x == 0)
```

where we define the type of x explicitly, and we would "lose" the possibility of partial function application. Am I right? However, why is partial function application useful in this case? Only to allow for type inference? Does it make sense to "partially apply" a function like dropWhile without applying the predicate f to it? It seems to me that the computation becomes "halted" before being useful if we don't apply f... So why is partial function application useful? And is this how it's always done, or is it something specific to Scala? I know Haskell has something called "complete inference", but I don't know its exact implications...
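Partial application is not Scala-specific; the same idea exists in most functional settings. As a rough illustration (in Python rather than Scala, with a hypothetical `drop_while` standing in for the book's `dropWhile`), fixing the list argument first yields a reusable function that is still waiting for its predicate:

```python
from functools import partial

def drop_while(items, pred):
    """Skip the leading items that satisfy pred and return the rest."""
    for i, x in enumerate(items):
        if not pred(x):
            return items[i:]
    return []

# Fix the data argument now; supply the predicate later, possibly many times.
drop_from = partial(drop_while, [1, 2, 3, 9, 4])
small_removed = drop_from(lambda x: x < 3)      # drops the leading 1, 2
odd_removed = drop_from(lambda x: x % 2 == 1)   # drops the leading 1
```

So even when the computation is "halted", partial application lets you build a family of related functions from one; in Scala, the argument-group order additionally drives type inference, as the book notes.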
Thanks in advance.

### QuantOverflow

#### Regarding the Hurst Exponent

I tried calculating the Hurst exponent using C#, and compared the results to series with known exponents. I am having the following issue in my calculations: all my results are negative instead of positive. I get numbers close to the following: mean-reverting series: -1, random series: -0.5, trending series: 0. The series have Hurst exponents of 0, 0.5, and 1.0, respectively. It is as if I take the true value and subtract 1. I have been trying to figure out where my error is for a couple of days and can't seem to find it. Has anyone come across this in the past? Are there any suggestions on how to fix it?

#### Other numeraire choices when applying Feynman-Kac

All of the books and notes I have seen apply the Feynman-Kac formula mostly under the risk-neutral measure, i.e. to different interest rate models, stochastic volatility, etc. I think the risk-neutral measure can be replaced with any other measure associated with a traded numeraire $N(t)$ such that $$\frac{V(t)}{N(t)}=\mathbb{E}_t^N\left[\frac{V(T)}{N(T)}\right]$$ So what came to my mind is the annuity measure with swaption prices, or the forward measure with cap prices. However, I could not find any references on those PDEs. Can someone point me to some references, or provide examples with different measures and how the PDE is derived in each case? It would be especially useful if the example is a "real application" that can be seen in practice when pricing financial instruments.

### StackOverflow

#### How do I improve this object design in TypeScript?

I have created a class in TypeScript that implements a simple stream (FRP). Now I want to extend it with client-side functionality (streams of events). To illustrate my problem, here is some pseudo-code:

```typescript
class Stream<T> {
  map<U>(f: (value: T) => U): Stream<U> {
    // Creates a new Stream instance that maps the values.
  }
  // Quite a few other functions that return new instances.
}
```

This class can be used both on the server and on the client. For the client side, I created a class that extends this one:

```typescript
class ClientStream<T> extends Stream<T> {
  watch(events: string, selector: string): Stream<Event> {
    // Creates a new ClientStream instance
  }
}
```

Now the ClientStream class knows about map but the Stream class doesn't know about watch. To circumvent this, functions call a factory method:

```typescript
protected create<U>(.....): Stream<U> {
  return new Stream<U>(.....)
}
```

The ClientStream class overrides this function to return ClientStream instances. However, the compiler complains that ClientStream.map returns a Stream, not a ClientStream. That can be 'solved' using a cast, but besides being ugly it prevents chaining. I don't really like this pattern, but I have no more elegant solution. Things I've thought about:

• Use composition (decorator). Not really an option given the number of methods I would have to proxy through, and I want to be able to add methods to Stream later without having to worry about ClientStream.
• Mix Stream into ClientStream. More or less the same problem: ClientStream has to know the signatures of the functions that are going to be mixed in (or not? Please tell).
• Merge these classes into one. This is a last resort; the watch function has no business being on the server.

Do you have a better (more elegant) solution? If you have an idea that gets closer to a more functional style, I'd be happy to hear about it. Thanks!

#### I am running GBT in Spark ML for CTR prediction and getting an exception because of the maxBins parameter

Exception details:

```
Exception in thread "main" java.lang.IllegalArgumentException: requirement failed:
DecisionTree requires maxBins (= 32) to be at least as large as the number of values
in each categorical feature, but categorical feature 4139 has 16094 values.
Considering remove this and other categorical features with a large number of values,
or add more training examples.
    at scala.Predef$.require(Predef.scala:233)
    at org.apache.spark.mllib.tree.impl.DecisionTreeMetadata$.buildMetadata(DecisionTreeMetadata.scala:133)
    at org.apache.spark.mllib.tree.RandomForest.run(RandomForest.scala:137)
    at org.apache.spark.mllib.tree.DecisionTree.run(DecisionTree.scala:60)
    at org.apache.spark.mllib.tree.GradientBoostedTrees$.org$apache$spark$mllib$tree$GradientBoostedTrees$$boost(GradientBoostedTrees.scala:208)
```

```java
GBTClassifier gbt = new GBTClassifier()
    .setLabelCol("indexedclick")
    .setFeaturesCol("features_index")
    .setMaxIter(20)
    .setMaxBins(16094)
    .setMaxDepth(30)
    .setMinInfoGain(0.0001)
    .setStepSize(0.00001)
    .setSeed(200)
    .setLossType("logistic")
    .setSubsamplingRate(0.2);
```

I want to know what the correct maxBins size should be, because even setting a large value of maxBins causes the same exception. Your help will be highly appreciated.

### QuantOverflow

#### Plain Vanilla Interest Rate Swap

I'm trying to build an intuitive understanding of the following:

The price of the replicating portfolio at time $t$ of the floating rate receiver is $P_t^{swap}=P_{t,t_0}-P_{t,t_N}-\bar{R}\sum_{n=1}^N(t_n-t_{n-1})P_{t,t_n}$.

(Some notation: $\bar{R}$ is the fixed rate. $P_{t,t_n}$ is the value at time $t$ of a zero coupon bond with maturity $t_n$. And we have future times $t_0,\ldots,t_N$.)

My understanding of this is still very young, so I have several questions. Is $P_t^{swap}$ essentially the money it would take to buy the side of the swap (at time $t$) that receives the floating rate and therefore pays out the fixed rate? E.g. if $P_t^{swap}=0$, you wouldn't make or lose money by entering this swap. As we're the floating rate receiver here, we have to pay out the fixed rate at every $t_n$, hence the final term in the expression? It's just been modelled as a sum of zero coupon bonds? What does $P_{t,t_0}-P_{t,t_N}$ really mean?
The value of a zero coupon bond maturing at time $t_0$ minus the value of a zero coupon bond maturing at time $t_N$ (I hope that's right to say) is surely always $>0$, as who would rather buy a zero coupon bond that matures at a later time? And finally, how is the combination of these three terms the value of the floating rate receiver's replicating portfolio? I hope it's clear what these questions mean, and apologies for anything I've missed.

### Lobsters

#### The Elegance of Deflate

### QuantOverflow

#### Best bid and best ask quotes from a quotes dataset

I have a dataset containing bid and ask quotes for a single day and stock. It has multiple quotes for some of the timestamps. Can I get best bid and best ask quotes from it?

### StackOverflow

#### Can I predict a data price based on a survey in Azure Machine Learning?

I want to predict my input price based on a list of questions/answers using Azure Machine Learning. I built a model using "Bayesian linear regression", but it seems that it is predicting the price based on the prices I have in my dataset and not based on the Q/A. Am I on the wrong path, or am I missing something? Any suggestion would be helpful.

### UnixOverflow

#### Invalid ZFS file system has no data

Background: I have a FreeNAS box with a boot SSD and 2x 3TB HDDs. I know only enough Linux and FreeNAS to get me in trouble, and must have gotten it up and running a while ago. I transferred data to the drive (somehow) and backed it up to CrashPlan (since disappeared). I moved the box to the garage to get it out of the middle of the floor and forgot about it. Recently, I went to retrieve data off the hard drive by pulling it out of the box and putting it in my Windows box. The drive was seen by disk management with two partitions, but I was unable to assign a drive letter (disk1). Starting to panic, I grabbed the other drive and put it in the Windows box, to find that Windows did see it and assign it a drive letter, but it was empty (disk2).
I cloned the drive that I couldn't mount (disk1) to the drive Windows could mount (disk2) so I could go about recovering the partition. I loaded up EaseUS to recover the GPT partition and found that it said "invalid ZFS file system". I grabbed the SSD from the FreeNAS box, put it in the computer I'm working on, and booted FreeNAS. I was able to get in and saw that FreeNAS saw a pool, but it stated that 2.7TB were empty, which is not right. Here is what I know. If I copied the original data to the FreeNAS pool, it would have been set up for disk1 to be mirrored to disk2, so I don't think I destroyed any parity information during the clone. I don't think disk2 had any data, unless the partition was damaged and it stated it was empty when it wasn't. I have the original FreeNAS box, but at this point I don't remember which SATA port each drive was plugged into (if that makes a difference). I REALLY would like to get this data, as it is pictures of my wedding and of when we were dating. If I need to leave this to a professional, please recommend someone and tell me what I need to tell them (is my ZFS file system invalid?).

### Lobsters

#### Lagrange Points

### StackOverflow

#### How to use promise functions in JavaScript functional helpers like forEach and reduce?
I am using promise functions like this:

```javascript
// WORKS
let res = {approveList: [], rejectList: [], errorId: rv.errorId, errorDesc: rv.errorDesc};
for (let i = 0; i < rv.copyDetailList.length; i++) {
  const item = rv.copyDetailList[i];
  const v = await convertCommonInfo(item);
  if (!item.errorId) {
    res.approveList.push(v);
  } else {
    res.rejectList.push(merge(v, {errorId: item.errorId, errorDesc: item.errorMsg}));
  }
}
```

This works well, but I want to try some functional helpers. I find that I have to use map and then reduce:

```javascript
// WORKS, but with two traversals
const dataList = await Promise.all(rv.copyDetailList.map((item) => convertCommonInfo(item)));
const res = dataList.reduce((obj, v, i) => {
  const item = rv.copyDetailList[i];
  if (!item.errorId) {
    obj.approveList.push(v);
  } else {
    obj.rejectList.push(merge(v, {errorId: item.errorId, errorDesc: item.errorMsg}));
  }
  return obj;
}, {approveList: [], rejectList: [], errorId: rv.errorId, errorDesc: rv.errorDesc});
```

I find that forEach does not work:

```javascript
// DOES NOT WORK: does not wait for the async function
rv.copyDetailList.forEach(async function(item) {
  const v = await convertCommonInfo(item);
  if (!item.errorId) {
    res.approveList.push(v);
  } else {
    res.rejectList.push(merge(v, {errorId: item.errorId, errorDesc: item.errorMsg}));
  }
});
```

This just returns the initial value. In fact this puzzles me: since I await the function, why doesn't it work? I also wanted to use reduce:

```javascript
// DOES NOT WORK: TypeScript cannot compile it
rv.copyDetailList.reduce(async function(prev, item) {
  const v = await convertCommonInfo(item);
  if (!item.errorId) {
    prev.approveList.push(v);
  } else {
    prev.rejectList.push(merge(v, {errorId: item.errorId, errorDesc: item.errorMsg}));
  }
}, res);
```

But since I am using TypeScript, I get an error like this:

```
error TS2345: Argument of type '(prev: { approveList: any[]; rejectList: any[]; errorId: string; errorDesc: string; }, item: Resp...'
is not assignable to parameter of type '(previousValue: { approveList: any[]; rejectList: any[]; errorId: string; errorDesc: string; }, c...'.
Type 'Promise<void>' is not assignable to type '{ approveList: any[]; rejectList: any[]; errorId: string; errorDesc: string; }'.
Property 'approveList' is missing in type 'Promise<void>'.
```

So I want to know two things:

1. Why does forEach with await not work?
2. Can I use a promise function in reduce?

#### Model measurement in scikit-learn

I am working on logistic regression; a code sample and the relevant document are below. I am wondering if there is any built-in API in scikit-learn which calculates model measurements, like precision, recall, AUC, etc., for any model's prediction results. Thanks. Working with Python 2.7.

http://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html#sklearn.linear_model.LogisticRegression.fit

```python
#!/usr/bin/python
# -*- coding: utf-8 -*-
"""
=========================================================
Logistic Regression 3-class Classifier
=========================================================

Show below is a logistic-regression classifiers decision boundaries on the
iris <http://en.wikipedia.org/wiki/Iris_flower_data_set>_ dataset. The
datapoints are colored according to their labels.
"""
print(__doc__)

# Code source: Gaël Varoquaux
# Modified for documentation by Jaques Grobler
# License: BSD 3 clause

import numpy as np
import matplotlib.pyplot as plt
from sklearn import linear_model, datasets

# import some data to play with
iris = datasets.load_iris()
X = iris.data[:, :2]  # we only take the first two features.
Y = iris.target

h = .02  # step size in the mesh

logreg = linear_model.LogisticRegression(C=1e5)

# we create an instance of Neighbours Classifier and fit the data.
logreg.fit(X, Y)

# Plot the decision boundary. For that, we will assign a color to each
# point in the mesh [x_min, x_max]x[y_min, y_max].
x_min, x_max = X[:, 0].min() - .5, X[:, 0].max() + .5
y_min, y_max = X[:, 1].min() - .5, X[:, 1].max() + .5
xx, yy = np.meshgrid(np.arange(x_min, x_max, h), np.arange(y_min, y_max, h))
Z = logreg.predict(np.c_[xx.ravel(), yy.ravel()])

# Put the result into a color plot
Z = Z.reshape(xx.shape)
plt.figure(1, figsize=(4, 3))
plt.pcolormesh(xx, yy, Z, cmap=plt.cm.Paired)

# Plot also the training points
plt.scatter(X[:, 0], X[:, 1], c=Y, edgecolors='k', cmap=plt.cm.Paired)
plt.xlabel('Sepal length')
plt.ylabel('Sepal width')

plt.xlim(xx.min(), xx.max())
plt.ylim(yy.min(), yy.max())
plt.xticks(())
plt.yticks(())

plt.show()
```

regards, Lin

### Planet Emacsen

#### sachachua: 2016-08-22 Emacs News

### StackOverflow

#### How to express parsing logic in the Parsec ParserT monad

I was working on "Write Yourself a Scheme in 48 Hours" to learn Haskell and I've run into a problem I don't really understand. It's question 2 from the exercises at the bottom of this section. The task is to rewrite

```haskell
import Text.ParserCombinators.Parsec

parseString :: Parser LispVal
parseString = do
    char '"'
    x <- many (noneOf "\"")
    char '"'
    return $ String x
```


such that quotation marks which are properly escaped (e.g. in "This sentence \" is nonsense") get accepted by the parser.

In an imperative language I might write something like this (roughly pythonic pseudocode):

```python
def parseString(input):
    if input[0] != "\"" or input[len(input) - 1] != "\"":
        return error
    input = input[1:len(input) - 1]  # slice off quotation marks
    output = ""  # This is the 'zero' that accumulates over the following loop
    # If there is a '"' in our string we want to make sure the previous char
    # was '\'
    for n in range(len(input)):
        if input[n] == "\"":
            try:
                if input[n - 1] != "\\":
                    return error
            catch IndexOutOfBoundsError:
                return error
        output += input[n]
    return output
```


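For reference, the loop above can be made runnable. The sketch below (a hypothetical `parse_string`, not from the book) is slightly stricter: it tracks a pending escape, unescapes the body, and returns None instead of a sentinel error:

```python
def parse_string(s):
    """Validate a double-quoted string whose inner quotes must be escaped.

    Returns the unescaped body, or None if the input is malformed.
    """
    if len(s) < 2 or s[0] != '"' or s[-1] != '"':
        return None
    out = []
    escaped = False
    for ch in s[1:-1]:
        if escaped:
            out.append(ch)        # character following a backslash is literal
            escaped = False
        elif ch == '\\':
            escaped = True        # remember the escape, emit nothing yet
        elif ch == '"':
            return None           # bare quote inside the body
        else:
            out.append(ch)
    if escaped:
        return None               # dangling backslash before the closing quote
    return ''.join(out)
```

The state flag replaces the `input[n - 1]` look-behind, which also removes the out-of-bounds case.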
I've been looking at the docs for Parsec and I just can't figure out how to work this as a monadic expression.

I got to this:

```haskell
parseString :: Parser LispVal
parseString = do
    char '"'
    regular <- try $ many (noneOf "\"\\")
    quote <- string "\\\""
    char '"'
    return $ String $ regular ++ quote
```

But this only works for one quotation mark, and it has to be at the very end of the string. I can't think of a functional expression that does the work my loops and if-statements do in the imperative pseudocode. I appreciate you taking the time to read this and give me advice.

#### numpy reshape confusion with negative shape values

I am always confused by how numpy reshape handles a negative shape parameter. Here is an example of code and output; could anyone explain what happens for reshape [-1, 1] here? Thanks. Related document (using Python 2.7): http://docs.scipy.org/doc/numpy/reference/generated/numpy.reshape.html

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder
from sklearn.preprocessing import OneHotEncoder

S = np.array(['box', 'apple', 'car'])
le = LabelEncoder()
S = le.fit_transform(S)
print(S)
ohe = OneHotEncoder()
one_hot = ohe.fit_transform(S.reshape(-1, 1)).toarray()
print(one_hot)
```

Output:

```
[1 0 2]
[[ 0.  1.  0.]
 [ 1.  0.  0.]
 [ 0.  0.  1.]]
```

#### Separating a tree regression model based on unique values of one column

I have a data set of 20,000,000 rows. Each row has 30 columns. One of the columns contains 7000 unique product numbers. Each row contains a unit cost value that I would like to predict using all the columns other than the unit cost. I would like to build a unique decision tree, or a unique branch of a decision tree, to model the data for each product number - essentially partitioning the rows by product number and modelling each product number in isolation. I would like to train a single model in Azure to do this if possible. Any suggestions?

### TheoryOverflow

#### Finding a degree-two subfield

Let $K=\frac{\mathbb{Q}[x]}{\langle f(x)\rangle}$ where $f(x)$ is irreducible over $\mathbb{Q}$ and has even degree. I want to find $K_2$ such that $\mathbb{Q} \subseteq K_2 \subseteq K$ and $[K_2:\mathbb{Q}]=2$.
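For the numpy reshape question above: a -1 in a shape asks numpy to infer that dimension from the array's total size and the other given dimensions, so reshape(-1, 1) turns a length-n 1-D array into an n x 1 column, which is the 2-D layout OneHotEncoder expects. A minimal demonstration:

```python
import numpy as np

a = np.array([1, 0, 2])            # shape (3,)
col = a.reshape(-1, 1)             # -1 is inferred as 3, giving shape (3, 1)

# Same idea with more elements; at most one axis may be -1.
b = np.arange(12).reshape(-1, 4)   # 12 elements / 4 columns => 3 rows inferred
```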
If $K$ is a Galois extension over $\mathbb{Q}$ then the discriminant of $f(x)$ solves the problem. But what if $K$ is not a Galois extension?

### StackOverflow

#### OneHotEncoder confusion in scikit-learn

Using Python 2.7 (miniconda interpreter). I am confused by the example below about OneHotEncoder - in particular, why is the enc.n_values_ output [2, 3, 4]? If anyone could help clarify, that would be great.

http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.OneHotEncoder.html

```python
>>> from sklearn.preprocessing import OneHotEncoder
>>> enc = OneHotEncoder()
>>> enc.fit([[0, 0, 3], [1, 1, 0], [0, 2, 1], [1, 0, 2]])
OneHotEncoder(categorical_features='all', dtype=<... 'float'>,
       handle_unknown='error', n_values='auto', sparse=True)
>>> enc.n_values_
array([2, 3, 4])
>>> enc.feature_indices_
array([0, 2, 5, 9])
>>> enc.transform([[0, 1, 1]]).toarray()
array([[ 1.,  0.,  0.,  1.,  0.,  0.,  1.,  0.,  0.]])
```

regards, Lin

### XKCD

#### Meteorite Identification

### Lobsters

#### New tag suggestion: Pony

From https://lobste.rs/s/1nwfty/on_state_pony . Pony is a high performance concurrent programming language: http://www.ponylang.org/

### QuantOverflow

#### Borrower, platform, SPV relationship with borrower payment dependent notes?

As you know, real estate crowdfunding platforms are taking off at the moment. Platforms connect investors to real estate assets. I am having trouble understanding how exactly borrower payment dependent notes work; I simply want more details about the SPVs involved. Is the underlying loan simply a loan and not a debt security? Does the platform lend the money directly, or does it do so through a different SPV? And does the platform then create a different SPV to which it issues these borrower payment dependent notes? Would the investors then be buying a pro rata share of the SPV? Thank you so much.

### Lobsters

#### Mainframes as a lifestyle choice

### StackOverflow

#### How to build a simple computing engine for Hadoop?
I am a student with some basic knowledge. I want to build a SIMPLE (minimum dependencies) computing engine for Hadoop 2 for my research, broadly similar to Apache Tez, but I don't know how to do it. Could you show me some steps I need to take to achieve my goal?

• What should I learn first, from basic to advanced?
• What books do I need to read?
• Tutorial links: blogs, YouTube, ...

[Optional] Could you show me steps, from basic to advanced, to optimize my engine (if I can finish the simple version)? I don't care how long it takes, I just want to learn. Thanks.

### Lobsters

#### Safe Systems Programming in C# and .NET

### QuantOverflow

#### Volatility of a mid-curve option

Question: When checking the volatility surface for, say, a swaption where the option expires in 1Y and the underlying starts in 1Y and ends in 5Y, one would look up the quoted volatilities and pick the volatility at expiry 1Yx5Y. What happens to the volatility of a mid-curve option? How do you relate/interpolate the volatility in this case - say the option expires in 1Y, and the underlying starts in 6Y and ends 5Y after its start? Where on the volatility surface should the volatility of a mid-curve option be situated? In other words, how do you get the volatility for the 6Y-forward 5Y swap for an option that expires in 1Y?

### CompsciOverflow

#### Understanding LEADING and TRAILING operations of an operator precedence grammar

I want to understand what the LEADING and TRAILING of a non-terminal in an operator precedence grammar physically mean. I am confused by the various definitions I have read. I understand that the LEADING of a non-terminal is the first terminal which can be present in its derivation, while the TRAILING of a non-terminal is the last terminal which can be present in its derivation.
In the following example:

```
E -> E + T   -- I
E -> T       -- II
T -> T * F   -- III
T -> F       -- IV
F -> ( E )   -- V
F -> id      -- VI
```

By my understanding:

```
LEADING(E) = { +, *, (, id }
LEADING(T) = { *, (, id }
LEADING(F) = { (, id }
```

This turns out fine, but my problem is with the TRAILING sets:

```
TRAILING(F) = { id, ) }
TRAILING(T) = TRAILING(F) = { id, ) }   -- (1)
TRAILING(E) = TRAILING(T) = { id, ) }   -- (2)
```

The reason for (2) is that, according to productions I and II, the last terminal in a derivation of E will be the last terminal in a derivation of T; hence TRAILING(E) = TRAILING(T). Similarly, TRAILING(T) = TRAILING(F). Unfortunately, the solution to this problem states:

```
TRAILING(F) = { id, ) }
TRAILING(T) = TRAILING(F) union { * } = { *, id, ) }
TRAILING(E) = TRAILING(T) union { + } = { +, *, id, ) }
```

I don't see how * or + can be the last terminals in a derivation of E. Any derivation of E will always end with either an id or ). Similarly for T.

### TheoryOverflow

#### Is there a terminology for these concepts?

In $P/poly$ we essentially want to find polynomial-sized advice strings that help solve all instances of some fixed length in polynomial time. If a problem is $NP$-complete then it has a short certificate for YES instances. If NO instances also had short certificates that could be verified in polynomial time, then $NP=coNP$. Assume we live in a world where $NP\neq coNP$ and $NP\cup coNP\subsetneq P/poly$ hold. Then there are no short proofs for NO instances. However, could there be a scenario where we have short certificates for every length-$n$ NO instance of an NP-complete problem that can be verified using a polynomial-size circuit of size $n^c$ (which may take exponential time to compute)? What if we seek a scenario where short certificates for NO instances can be verified by a randomized algorithm which runs in polynomial time with probability $2/3$ at any length?
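On the LEADING/TRAILING question above: the discrepancy disappears with the usual operator-precedence definition, in which TRAILING(A) collects the last terminal of any sentential form derivable from A - a trailing non-terminal may still follow it. In T -> T * F, the * is such a terminal, since F need not be derived away. A small fixpoint computation sketching this definition (a generic helper, not from any textbook):

```python
def leading_trailing(productions):
    """Fixpoint computation of LEADING and TRAILING for an operator grammar.

    productions: list of (lhs, rhs) pairs with rhs a list of symbols;
    the non-terminals are exactly the lhs symbols.
    """
    nts = {lhs for lhs, _ in productions}
    lead = {A: set() for A in nts}
    trail = {A: set() for A in nts}
    changed = True
    while changed:
        changed = False
        for A, rhs in productions:
            new_lead, new_trail = set(lead[A]), set(trail[A])
            # LEADING: first terminal, possibly after one leading non-terminal
            if rhs[0] in nts:
                new_lead |= lead[rhs[0]]
                if len(rhs) > 1 and rhs[1] not in nts:
                    new_lead.add(rhs[1])
            else:
                new_lead.add(rhs[0])
            # TRAILING: mirror image at the right end of the production
            if rhs[-1] in nts:
                new_trail |= trail[rhs[-1]]
                if len(rhs) > 1 and rhs[-2] not in nts:
                    new_trail.add(rhs[-2])
            else:
                new_trail.add(rhs[-1])
            if new_lead != lead[A] or new_trail != trail[A]:
                lead[A], trail[A] = new_lead, new_trail
                changed = True
    return lead, trail

G = [('E', ['E', '+', 'T']), ('E', ['T']),
     ('T', ['T', '*', 'F']), ('T', ['F']),
     ('F', ['(', 'E', ')']), ('F', ['id'])]
lead, trail = leading_trailing(G)
```

Run on the grammar above, this reproduces the posted solution, including TRAILING(T) = { *, id, ) }.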
What if there is a randomized algorithm which runs in polynomial time and refutes NO instances with probability $1/2+1/2^{n^\alpha}$, where $\alpha\in(0,1)$, at a fixed input length $n$? Are such scenarios possible, and if so, is there a terminology for these concepts? We do not worry about YES instances. For example, in coding theory, if we ask 'is there a codeword of weight < w?', a minimum-weight codeword is the standard short certificate that can be verified in polynomial time for YES instances, while for NO instances we can have a scenario where short certificates can be verified in P/poly, BPP or PP.

### arXiv Networking and Internet Architecture

#### A Concise Forwarding Information Base for Scalable and Fast Flat Name Switching. (arXiv:1608.05699v1 [cs.NI])

Forwarding information base (FIB) scalability is a fundamental problem of numerous new network architectures that propose to use location-independent network names. We propose Concise, a FIB design that uses very little memory to support fast queries over a large number of location-independent names. Concise makes use of minimal perfect hashing, relies on the SDN framework, and supports fast name classification. Our conceptual contribution is to optimize memory efficiency and query speed in the data plane while moving the relatively complex construction and update components to the resource-rich control plane. We implemented Concise on three platforms. Experimental results show that Concise uses significantly less memory to achieve faster query speeds than existing FIBs for flat name switching.

#### Automata Theory Approach to Predicate Intuitionistic Logic. (arXiv:1608.05698v1 [cs.LO])

Predicate intuitionistic logic is a well-established fragment of dependent types. According to the Curry-Howard isomorphism, proof construction in the logic corresponds to synthesis of a program whose type is a given formula.
We present a model of automata that can handle proof construction in full intuitionistic first-order logic. The automata are constructed in such a way that any successful run corresponds directly to a normal proof in the logic. This makes it possible to discuss formal languages of proofs or programs, the closure properties of the automata, and their connections with the traditional logical connectives.

#### Revisiting Reuse in Main Memory Database Systems. (arXiv:1608.05678v1 [cs.DB])

Reusing intermediates in databases to speed up analytical query processing has been studied in the past. Existing solutions typically require intermediate results of individual operators to be materialized into temporary tables to be considered for reuse in subsequent queries. However, these approaches are fundamentally ill-suited for modern main memory databases. The reason is that modern main memory DBMSs are typically limited by the bandwidth of the memory bus, so query execution is heavily optimized to keep tuples in the CPU caches and registers. Adding materialization operations to a query plan therefore not only adds traffic to the memory bus but, more importantly, destroys important cache- and register-locality opportunities, resulting in high performance penalties. In this paper we study a novel reuse model for intermediates which caches internal physical data structures materialized during query processing (due to pipeline breakers) and externalizes them so that they become reusable for upcoming operations. We focus on hash tables, the most commonly used internal data structure in main memory databases for join and aggregation operations. As queries arrive, our reuse-aware optimizer reasons about the reuse opportunities for hash tables, employing cost models that take into account hash table statistics together with the CPU and data movement costs within the cache hierarchy.
Experimental results based on our HashStash prototype demonstrate performance gains of $2\times$ for typical analytical workloads with no additional overhead for materializing intermediates.

#### Hierarchical Shape Abstraction for Analysis of Free-List Memory Allocators. (arXiv:1608.05676v1 [cs.PL])

We propose a hierarchical abstract domain for the analysis of free-list memory allocators that tracks shape and numerical properties of both the heap and the free lists. Our domain is based on Separation Logic extended with predicates that capture the pointer arithmetic constraints of the heap list and the shape of the free list. These predicates are combined using a hierarchical composition operator to specify the overlapping of the heap list by the free list. In addition to expressiveness, this operator leads to a compositional and compact representation of abstract values and simplifies the implementation of the abstract domain. The shape constraints are combined with numerical constraints over integer arrays to track properties of the allocation policies (best-fit, first-fit, etc.). Such properties are out of the scope of existing analyzers. We implemented this domain and show its effectiveness on several implementations of free-list allocators.

#### lpopt: A Rule Optimization Tool for Answer Set Programming. (arXiv:1608.05675v2 [cs.LO] UPDATED)

State-of-the-art answer set programming (ASP) solvers rely on a program called a grounder to convert non-ground programs containing variables into variable-free, propositional programs. The size of this grounding depends heavily on the size of the non-ground rules, and thus reducing the size of such rules is a promising approach to improve solving performance. To this end, in this paper we announce lpopt, a tool that decomposes large logic programming rules into smaller rules that are easier to handle for current solvers.
The tool is specifically tailored to handle the standard syntax of the ASP language (ASP-Core) and makes it easier for users to write efficient and intuitive ASP programs, which would otherwise often require significant hand-tuning by expert ASP engineers. It is based on an idea proposed by Morak and Woltran (2012) that we extend significantly in order to handle the full ASP syntax, including complex constructs like aggregates, weak constraints, and arithmetic expressions. We present the algorithm, the theoretical foundations on how to treat these constructs, as well as an experimental evaluation showing the viability of our approach. #### POLYPATH: Supporting Multiple Tradeoffs for Interaction Latency. (arXiv:1608.05654v1 [cs.OS]) Modern mobile systems use a single input-to-display path to serve all applications. In meeting the visual goals of all applications, the path has a latency inadequate for many important interactions. To accommodate the different latency requirements and visual constraints by different interactions, we present POLYPATH, a system design in which application developers (and users) can choose from multiple path designs for their application at any time. Because a POLYPATH system asks for two or more path designs, we present a novel fast path design, called Presto. Presto reduces latency by judiciously allowing frame drops and tearing. We report an Android 5-based prototype of POLYPATH with two path designs: Android legacy and Presto. Using this prototype, we quantify the effectiveness, overhead, and user experience of POLYPATH, especially Presto, through both objective measurements and subjective user assessment. We show that Presto reduces the latency of legacy touchscreen drawing applications by almost half; and more importantly, this reduction is orthogonal to that of other popular approaches and is achieved without any user-noticeable negative visual effect. 
When combined with touch prediction, Presto is able to reduce the touch latency below 10 ms, a remarkable achievement without any hardware support.

#### Polynomial Kernels and Wideness Properties of Nowhere Dense Graph Classes. (arXiv:1608.05637v1 [cs.DM])

Nowhere dense classes of graphs are very general classes of uniformly sparse graphs with several seemingly unrelated characterisations. From an algorithmic perspective, a characterisation of these classes in terms of uniform quasi-wideness, a concept originating in finite model theory, has proved to be particularly useful. Uniform quasi-wideness is used in many fpt-algorithms on nowhere dense classes. However, the existing constructions showing the equivalence of nowhere denseness and uniform quasi-wideness imply a non-elementary blow up in the parameter dependence of the fpt-algorithms, making them infeasible in practice. As a first main result of this paper, we use tools from logic, in particular from a subfield of model theory known as stability theory, to establish polynomial bounds for the equivalence of nowhere denseness and uniform quasi-wideness. A powerful method in parameterized complexity theory is to compute a problem kernel in a pre-computation step, that is, to reduce the input instance in polynomial time to a sub-instance of size bounded in the parameter only (independently of the input graph size). Our new tools allow us to obtain for every fixed value of $r$ a polynomial kernel for the distance-$r$ dominating set problem on nowhere dense classes of graphs. This result is particularly interesting, as it implies that for every class $\mathcal{C}$ of graphs which is closed under subgraphs, the distance-$r$ dominating set problem admits a kernel on $\mathcal{C}$ for every value of $r$ if, and only if, it admits a polynomial kernel for every value of $r$ (under the standard assumption of parameterized complexity theory that $\mathrm{FPT} \neq W[2]$).
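For intuition about the problem being kernelized: a set $D$ is a distance-$r$ dominating set if every vertex of the graph is within $r$ hops of some member of $D$. A brute-force checker (a small sketch for illustration only; the paper's contribution is the polynomial kernel, not this exponential search) can be written as:

```python
from collections import deque
from itertools import combinations

def within_distance(adj, source, r):
    """BFS: return the set of vertices at distance <= r from source."""
    seen = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        if seen[v] == r:
            continue  # do not expand beyond radius r
        for w in adj[v]:
            if w not in seen:
                seen[w] = seen[v] + 1
                queue.append(w)
    return set(seen)

def has_distance_r_dominating_set(adj, r, k):
    """Exhaustively check whether some set of at most k vertices
    dominates every vertex within distance r."""
    vertices = list(adj)
    balls = {v: within_distance(adj, v, r) for v in vertices}
    for size in range(1, k + 1):
        for cand in combinations(vertices, size):
            if set().union(*(balls[v] for v in cand)) == set(vertices):
                return True
    return False
```

A polynomial kernel, as obtained in the paper for nowhere dense classes, would shrink the input graph to a sub-instance of size polynomial in $k$ before any such exhaustive search is attempted.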
#### Symbolic Abstract Contract Synthesis in a Rewriting Framework. (arXiv:1608.05619v1 [cs.PL]) We propose an automated technique for inferring software contracts from programs that are written in a non-trivial fragment of C, called KernelC, that supports pointer-based structures and heap manipulation. Starting from the semantic definition of KernelC in the K framework, we enrich the symbolic execution facilities recently provided by K with novel capabilities for assertion synthesis that are based on abstract subsumption. Roughly speaking, we define an abstract symbolic technique that explains the execution of a (modifier) C function by using other (observer) routines in the same program. We implemented our technique in the automated tool KindSpec 2.0, which generates logical axioms that express pre- and post-condition assertions by defining the precise input/output behaviour of the C routines. #### CurryCheck: Checking Properties of Curry Programs. (arXiv:1608.05617v1 [cs.PL]) We present CurryCheck, a tool to automate the testing of programs written in the functional logic programming language Curry. CurryCheck executes unit tests as well as property tests which are parameterized over one or more arguments. In the latter case, CurryCheck tests these properties by systematically enumerating test cases so that, for smaller finite domains, CurryCheck can actually prove properties. Unit tests and properties can be defined in a Curry module without being exported. Thus, they are also useful to document the intended semantics of the source code. Furthermore, CurryCheck also supports the automated checking of specifications and contracts occurring in source programs. Hence, CurryCheck is a useful tool that contributes to the property- and specification-based development of reliable and well tested declarative programs. #### A short-key one-time pad cipher. 
(arXiv:1608.05613v1 [cs.CR]) A process for the secure transmission of data is presented that has, to a certain degree, the advantages of the one-time pad (OTP) cipher, that is, simplicity, speed, and information-theoretic security, but overcomes its fundamental weakness: the necessity of securely exchanging a key that is as long as the message. For each transmission, a dedicated one-time pad is generated for encrypting and decrypting the plaintext message. This one-time pad is built from a randomly chosen set of basic keys taken from a public library. Because the basic keys can be chosen and used multiple times, the method is called the multiple-time pad (MTP) cipher. The information on the choice of basic keys is encoded in a short keyword that is transmitted by secure means. The process is made secure against known-plaintext attack by additional design elements. The process is particularly useful for high-speed transmission of mass data and video or audio streaming.

#### On Joining Graphs. (arXiv:1608.05594v1 [cs.DB])

In the graph database literature the term "join" does not refer to an operator used to merge two graphs. In particular, a counterpart of the relational join is not present in existing graph query languages, and consequently no efficient algorithms have been developed for this operator. This paper provides two main contributions. First, we define a binary graph join operator that acts on the vertices as a standard relational join and combines the edges according to a user-defined semantics. Then we propose the "CoGrouped Graph Conjunctive $\theta$-Join" algorithm running over data indexed in secondary memory. Our implementation outperforms the execution of the same operation in Cypher and SPARQL on major existing graph database management systems by at least one order of magnitude, also including indexing and loading time.

#### Logical Data Independence in the 21st Century -- Co-Existing Schema Versions with InVerDa.
(arXiv:1608.05564v1 [cs.DB]) We present InVerDa, a tool for end-to-end support of co-existing schema versions within one database. While it is state of the art to run multiple versions of a continuously developed application concurrently, the same is hard for databases. In order to keep multiple co-existing schema versions alive, which all access the same data set, developers usually employ handwritten delta code (e.g. views and triggers in SQL). This delta code is hard to write and hard to maintain: if a database administrator decides to adapt the physical table schema, all handwritten delta code needs to be adapted as well, which is expensive and error-prone in practice. With InVerDa, developers instead use a simple bidirectional database evolution language that carries enough information to generate all the delta code automatically. Without additional effort, new schema versions become immediately accessible and data changes in any version are visible in all schema versions at the same time. We formally validate the correctness of this propagation. InVerDa also allows for easily changing the physical table designs without affecting the availability of co-existing schema versions. This greatly increases robustness (264 times fewer lines of code) and allows for significant performance optimization.

#### Relationship between the Reprogramming Determinants of Boolean Networks and their Interaction Graph. (arXiv:1608.05552v1 [cs.DM])

In this paper, we address the formal characterization of targets triggering cellular trans-differentiation in the scope of Boolean networks with asynchronous dynamics. Given two fixed points of a Boolean network, we are interested in all the combinations of mutations which allow switching from one fixed point to the other, either possibly, or inevitably. In the case of existential reachability, we prove that the set of nodes to (permanently) flip lies only and necessarily within certain connected components of the interaction graph.
In the case of inevitable reachability, we provide an algorithm to identify a subset of possible solutions.

#### Goal-Oriented Reduction of Automata Networks. (arXiv:1608.05548v1 [cs.LO])

We consider networks of finite-state machines having local transitions conditioned by the current state of other automata. In this paper, we depict a reduction procedure tailored to a given reachability property of the form 'from global state s there exists a sequence of transitions leading to a state where automaton g is in local state T'. By exploiting a causality analysis of the transitions within the individual automata, the proposed reduction removes local transitions while preserving all the minimal traces that satisfy the reachability property. The complexity of the procedure is polynomial in the total number of local states and transitions, and exponential in the number of local states within one automaton. Applied to automata networks modelling the dynamics of biological systems, we observe that the reduction significantly shrinks the reachable state space, enhancing the tractability of model-checking for large networks.

#### A Survey on Routing in Anonymous Communication Protocols. (arXiv:1608.05538v1 [cs.CR])

The Internet has undergone dramatic changes in the past 15 years, and now forms a global communication platform that billions of users rely on for their daily activities. While this transformation has brought tremendous benefits to society, it has also created new threats to online privacy, ranging from profiling of users for monetizing personal information to nearly omnipotent governmental surveillance. As a result, public interest in systems for anonymous communication has drastically increased. Several such systems have been proposed in the literature, each of which offers anonymity guarantees in different scenarios and under different assumptions, reflecting the plurality of approaches for how messages can be anonymously routed to their destination.
Understanding this space of competing approaches with their different guarantees and assumptions is vital for users to understand the consequences of different design options. In this work, we survey previous research on designing, developing, and deploying systems for anonymous communication. To this end, we provide a taxonomy for clustering all prevalently considered approaches (including Mixnets, DC-nets, onion routing, and DHT-based protocols) with respect to their unique routing characteristics, deployability, and performance. This, in particular, encompasses the topological structure of the underlying network; the routing information that has to be made available to the initiator of the conversation; the underlying communication model; and performance-related indicators such as latency and communication layer. Our taxonomy and comparative assessment provide important insights about the differences between the existing classes of anonymous communication protocols, and it also helps to clarify the relationship between the routing characteristics of these protocols, and their performance and scalability. #### Private and Truthful Aggregative Game for Large-Scale Spectrum Sharing. (arXiv:1608.05537v1 [cs.GT]) Thanks to the rapid development of information technology, the size of the wireless network becomes larger and larger, which makes spectrum resources more precious than ever before. To improve the efficiency of spectrum utilization, game theory has been applied to study the spectrum sharing in wireless networks for a long time. However, the scale of wireless network in existing studies is relatively small. In this paper, we introduce a novel game and model the spectrum sharing problem as an aggregative game for large-scale, heterogeneous, and dynamic networks. The massive usage of spectrum also leads to easier privacy divulgence of spectrum users' actions, which calls for privacy and truthfulness guarantees in wireless network. 
In a large decentralized scenario, each user has no prior knowledge of other users' decisions, which forms an incomplete information game. A "weak mediator", e.g., the base station or licensed spectrum regulator, is introduced and turns this game into a complete one, which is essential to reach a Nash equilibrium (NE). By utilizing past experience on the channel access, we propose an online learning algorithm to improve the utility of each user, achieving NE over time. Our learning algorithm also provides a no-regret guarantee to each user. Our mechanism admits an approximate ex-post NE. We also prove that it satisfies joint differential privacy and is incentive-compatible. Efficiency of the approximate NE is evaluated, and innovative scaling-law results are disclosed. Finally, we provide simulation results to verify our analysis.

#### Towards Reversible Computation in Erlang. (arXiv:1608.05521v1 [cs.PL])

In a reversible language, any forward computation can be undone by a finite sequence of backward steps. Reversible computing has been studied in the context of different programming languages and formalisms, where it has been used for debugging and for enforcing fault-tolerance, among others. In this paper, we consider a subset of Erlang, a concurrent language based on the actor model. We formally introduce a reversible semantics for this language. To the best of our knowledge, this is the first attempt to define a reversible semantics for Erlang.

#### Network Volume Anomaly Detection and Identification in Large-scale Networks based on Online Time-structured Traffic Tensor Tracking. (arXiv:1608.05493v1 [cs.NI])

This paper addresses network anomography, that is, the problem of inferring network-level anomalies from indirect link measurements. This problem is cast as a low-rank subspace tracking problem for normal flows under incomplete observations, and an outlier detection problem for abnormal flows.
Since traffic data is large-scale time-structured data accompanied with noise and outliers under partial observations, an efficient modeling method is essential. To this end, this paper proposes an online subspace tracking of a Hankelized time-structured traffic tensor for normal flows based on the Candecomp/PARAFAC decomposition exploiting the recursive least squares (RLS) algorithm. We estimate abnormal flows as outlier sparse flows via sparsity maximization in the underlying under-constrained linear-inverse problem. A major advantage is that our algorithm estimates normal flows by low-dimensional matrices with time-directional features as well as the spatial correlation of multiple links without using the past observed measurements and the past model parameters. Extensive numerical evaluations show that the proposed algorithm achieves faster convergence per iteration of model approximation, and better volume anomaly detection performance compared to state-of-the-art algorithms. #### Efficient Computation of Slepian Functions on the Sphere. (arXiv:1608.05479v1 [cs.DM]) In this work, we develop a new method for the fast and memory-efficient computation of Slepian functions on the sphere. Slepian functions, which arise as the solution of the Slepian concentration problem on the sphere, have desirable properties for applications where measurements are only available within a spatially limited region on the sphere and/or a function is required to be analyzed over the spatially limited region. Slepian functions are currently not easily computed for large band-limits (L > 100) for an arbitrary spatial region due to high computational and large memory storage requirements. For the special case of a polar cap, the symmetry of the region enables the decomposition of the Slepian concentration problem into smaller sub-problems and consequently the efficient computation of Slepian functions for large band-limits. 
By exploiting the efficient computation of Slepian functions for the polar cap region on the sphere, we develop a formulation, supported by a fast algorithm, for the computation of Slepian functions for an arbitrary spatial region to enable the analysis of modern data-sets that support large band-limits. For the proposed algorithm, we carry out accuracy analysis, computational complexity analysis and a review of memory storage requirements. We illustrate, through numerical experiments, that the proposed method enables faster computation, and has smaller storage requirements, while allowing for sufficiently accurate computation of the Slepian functions.

#### Papers presented at the 32nd International Conference on Logic Programming (ICLP 2016). (arXiv:1608.05440v1 [cs.PL])

This is the list of the full papers accepted for presentation at the 32nd International Conference on Logic Programming, New York City, USA, October 18-21, 2016. In addition to the main conference itself, ICLP hosted four pre-conference workshops, the Autumn School on Logic Programming, and a Doctoral Consortium. The final versions of the full papers will be published in a special issue of the journal Theory and Practice of Logic Programming (TPLP). We received eighty-eight abstract submissions, of which twenty-seven papers were accepted for publication as TPLP rapid communications. Papers deemed of sufficiently high quality to be presented at the conference, but not to appear in TPLP, will be published as Technical Communications in the OASIcs series. Fifteen papers fell into this category.

#### Quantum Entanglement Distribution in Next-Generation Wireless Communication Systems. (arXiv:1608.05188v1 [quant-ph] CROSS LISTED)

In this work we analyze the distribution of quantum entanglement over communication channels in the millimeter-wave regime.
The motivation for such a study is the possibility for next-generation wireless networks (beyond 5G) to accommodate such a distribution directly - without the need to integrate additional optical communication hardware into the transceivers. Future wireless communication systems are bound to require some level of quantum communications capability. We find that direct quantum-entanglement distribution in the millimeter-wave regime is indeed possible, but that its implementation will be very demanding from both a system-design perspective and a channel-requirement perspective.

#### Shortest unique palindromic substring queries in optimal time

Authors: Yuto Nakashima, Hiroe Inoue, Takuya Mieno, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda

Download: PDF

Abstract: A palindrome is a string that reads the same forward and backward. A palindromic substring $P$ of a string $S$ is called a shortest unique palindromic substring ($\mathit{SUPS}$) for an interval $[x, y]$ in $S$, if $P$ occurs exactly once in $S$, this occurrence of $P$ contains interval $[x, y]$, and every palindromic substring of $S$ which contains interval $[x, y]$ and is shorter than $P$ occurs at least twice in $S$. The $\mathit{SUPS}$ problem is, given a string $S$, to preprocess $S$ so that for any subsequent query interval $[x, y]$ all the $\mathit{SUPS}\mbox{s}$ for interval $[x, y]$ can be answered quickly. We present an optimal solution to this problem. Namely, we show how to preprocess a given string $S$ of length $n$ in $O(n)$ time and space so that all $\mathit{SUPS}\mbox{s}$ for any subsequent query interval can be answered in $O(k+1)$ time, where $k$ is the number of outputs.
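The $\mathit{SUPS}$ definition above can be checked naively by enumerating all palindromic substrings; a brute-force sketch for intuition only (cubic-ish time, nothing like the paper's $O(n)$-preprocessing, $O(k+1)$-query solution):

```python
def is_palindrome(s):
    return s == s[::-1]

def sups(s, x, y):
    """Return all shortest unique palindromic substrings of s whose
    single occurrence contains the 0-indexed, inclusive interval [x, y]."""
    candidates = []
    for i in range(len(s)):
        for j in range(i + 1, len(s) + 1):
            p = s[i:j]
            if not is_palindrome(p):
                continue
            # all (possibly overlapping) occurrences of p in s
            occ = [k for k in range(len(s) - len(p) + 1) if s.startswith(p, k)]
            # unique occurrence that covers [x, y]
            if len(occ) == 1 and occ[0] <= x and y <= occ[0] + len(p) - 1:
                candidates.append((occ[0], len(p)))
    if not candidates:
        return []
    shortest = min(length for _, length in candidates)
    return [s[i:i + l] for i, l in candidates if l == shortest]
```

For example, `sups("aabb", 1, 1)` returns `["aa"]`: the single character `"a"` covers position 1 but occurs twice, so the unique palindrome `"aa"` is the shortest answer.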
#### Thrill: High-Performance Algorithmic Distributed Batch Data Processing with C++ Authors: Timo Bingmann, Michael Axtmann, Emanuel Jöbstl, Sebastian Lamm, Huyen Chau Nguyen, Alexander Noe, Sebastian Schlag, Matthias Stumpp, Tobias Sturm, Peter Sanders Download: PDF Abstract: We present the design and a first performance evaluation of Thrill -- a prototype of a general purpose big data processing framework with a convenient data-flow style programming interface. Thrill is somewhat similar to Apache Spark and Apache Flink with at least two main differences. First, Thrill is based on C++ which enables performance advantages due to direct native code compilation, a more cache-friendly memory layout, and explicit memory management. In particular, Thrill uses template meta-programming to compile chains of subsequent local operations into a single binary routine without intermediate buffering and with minimal indirections. Second, Thrill uses arrays rather than multisets as its primary data structure which enables additional operations like sorting, prefix sums, window scans, or combining corresponding fields of several arrays (zipping). We compare Thrill with Apache Spark and Apache Flink using five kernels from the HiBench suite. Thrill is consistently faster and often several times faster than the other frameworks. At the same time, the source codes have a similar level of simplicity and abstraction ### QuantOverflow #### Efficiently storing real-time intraday data in an application agnostic way What would be the best approach to handle real-time intraday data storage? For personal research I've always imported from flat files only into memory (historical EOD), so I don't have much experience with this. I'm currently working on a side project, which would require daily stock quotes updated every minute from an external feed. For the time being, I suppose any popular database solution should handle it without sweating too much in this scenario. 
But I would like the adopted solution to scale easily when real-time ticks become necessary. A similar problem has been mentioned by Marko, though it was mostly specific to R. I'm looking for a universal data store accessible both from lightweight web front-ends (PHP/Ruby/Flex) and from the analytical back-end (C++, R or Python, don't know yet). From what chrisaycock mentioned, column-oriented databases should be the most viable solution, and it seems to be the case. But I'm not sure I understand all the intricacies of column-oriented storage in some exemplary usage scenarios:

• Fetching all or a subset of price data for a specific ticker for front-end charting. Compared to row-based solutions, fetching price data should be faster because it's a sequential read. But how does storing multiple tickers in one place influence this? For example, a statement like "select all timestamps and price data where ticker is equal to something" - don't I have to compare the ticker on every row I fetch? And in the situation where I have to provide complete data for some front-end application, wouldn't serving a raw flat file for the requested instrument be more efficient?

• Analytics performed in the back-end. Things like computing single values for a stock (e.g. variance, return for last x days) and dependent time series (daily returns, technical indicators, etc.). Fetching input data for computations should be more efficient, as in the preceding case, but what about writing? The gain I see is bulk-writing the final result (like the value of a computed indicator for every timestamp), but still I don't know how the database handles my mashup of different tickers in one table. Does horizontal partitioning/sharding handle it for me automatically, or am I better off splitting manually into a table-per-instrument structure (which seems unnecessarily cumbersome)?

• Updating the database with new incoming ticks. Using row-based orientation would be more efficient here, wouldn't it? And the same goes for updating aggregated data (for example, daily OHLC tables). Won't this be a possible bottleneck?

All this is in the context of available open source solutions. I thought initially about InfiniDB or HBase, but I've seen MonetDB and InfoBright being mentioned around here too. I don't really need "production quality" (at least not yet) as mentioned by chrisaycock in the referenced question, so would any of these be a better choice than the others? And the last issue - from approximately which load point are specialized time-series databases necessary? Unfortunately, things like kdb+ or FAME are out of scope in this case, so I'm contemplating how much can be done on commodity hardware with standard relational databases (MySQL/PostgreSQL) or key-value stores (like Tokyo/Kyoto Cabinet's B+ tree) - is that really a dead end? Should I just stick with one of the aforementioned column-oriented solutions, given that my application is not mission critical, or is even that an unnecessary precaution? Thanks in advance for your input on this. If some part is too convoluted, let me know in a comment. I will try to amend accordingly.

EDIT: It seems that strictly speaking HBase is not a column-oriented store but rather a sparse, distributed, persistent multidimensional sorted map, so I've crossed it out from the original question. After some research I'm mostly inclined towards InfiniDB. It has all the features I need, supports SQL (standard MySQL connectors/wrappers can be used for access) and a full DML subset. The only thing missing in the open source edition is on-the-fly compression and scaling out to clusters. But I guess it's still good bang for the buck, considering it's free.

### UnixOverflow

#### How to send audit logs with audisp-remote and receive them with netcat

I am trying to configure a CentOS 7 running in VirtualBox to send its audit logs to the host, which is FreeBSD 10.3.
Ideally, I'd like to receive the logs with FreeBSD's auditdistd(8), but for now I'd just like to be able to use netcat for that. My problem is that netcat doesn't get any data.

# Details

1. When I run `service auditd status` I get the following results:

```
Redirecting to /bin/systemctl status auditd.service
auditd.service - Security Auditing Service
   Loaded: loaded (/usr/lib/systemd/system/auditd.service; enabled; vendor preset: enabled)
   Active: active (running) since Fri 2016-08-19 11:35:42 CEST; 3s ago
  Process: 2216 ExecStartPost=/sbin/augenrules --load (code=exited, status=1/FAILURE)
 Main PID: 2215 (auditd)
   CGroup: /system.slice/auditd.service
           ├─2215 /sbin/auditd -n
           └─2218 /sbin/audispd

Aug 19 11:35:42 hephaistos audispd[2218]: plugin /sbin/audisp-remote was restarted
Aug 19 11:35:42 hephaistos audispd[2218]: plugin /sbin/audisp-remote terminated unexpectedly
Aug 19 11:35:42 hephaistos audispd[2218]: plugin /sbin/audisp-remote was restarted
Aug 19 11:35:42 hephaistos audispd[2218]: plugin /sbin/audisp-remote terminated unexpectedly
Aug 19 11:35:42 hephaistos audispd[2218]: plugin /sbin/audisp-remote was restarted
Aug 19 11:35:42 hephaistos audispd[2218]: plugin /sbin/audisp-remote terminated unexpectedly
Aug 19 11:35:42 hephaistos audispd[2218]: plugin /sbin/audisp-remote was restarted
Aug 19 11:35:42 hephaistos audispd[2218]: plugin /sbin/audisp-remote terminated unexpectedly
Aug 19 11:35:42 hephaistos audispd[2218]: plugin /sbin/audisp-remote has exceeded max_restarts
Aug 19 11:35:42 hephaistos audispd[2218]: plugin /sbin/audisp-remote was restarted
```

# Setup

## Network Setup

1. CentOS and FreeBSD are connected on a host-only network. I've assigned them the following IPs:

• CentOS: 192.168.56.101
• FreeBSD: 192.168.56.1

## FreeBSD Setup

1. I've got netcat listening on port 60:

```
nc -lk 60
```

The connection works. I can use `nc 192.168.56.1 60` on CentOS to send data to FreeBSD.

## CentOS Setup

1. The kernel version is: 4.7.0-1.el7.elrepo.x86_64 #1 SMP Sun Jul 24 18:15:29 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux.
2. The version of the Linux Audit userspace is 2.6.6.
3. auditd is running and actively logging to /var/log/audit.log.
4. The auditing rules in /etc/audit/rules.d/ are well configured.
5. The configuration of /etc/audisp/audisp-remote.conf looks like this:

```
remote-server = 192.168.56.1
port = 60
local_port = any
transport = tcp
mode = immediate
```

6. I've got two default files in /etc/audisp/plugins.d/: syslog.conf and af_unix.conf, and both of them are not active. I've added af-remote.conf and it looks like this:

```
# This file controls the audispd data path to the
# remote event logger. This plugin will send events to
# a remote machine (Central Logger).

active = yes
direction = out
path = /sbin/audisp-remote
type = always
#args =
format = string
```

It is a modified example from the official repository (link).

7. Here's the content of /etc/audisp/audispd.conf:

```
q_depth = 150
overflow_action = SYSLOG
priority_boost = 4
max_restarts = 10
name_format = HOSTNAME
```

I'll be happy to provide more details if needed.

### StackOverflow

#### Reversing a list in another list in Haskell

I'm quite new to Haskell and I'm trying to reverse a list. At the same time I want to reverse the lists in that list. So for example:

```
Prelude> rev [[3,4,5],[7,5,2]]
[[2,5,7],[5,4,3]]
```

I know that the following code reverses a list:

```haskell
rev :: [[a]] -> [[a]]
rev [[]] = [[]]
rev [[x]] = [[x]]
rev xs = last xs : reverse (init xs)
```

I have been struggling for a while; I have made some additions to the code but it still isn't working and I'm stuck.

```haskell
rev :: [[a]] -> [[a]]
rev [[]] = [[]]
rev [[x]] = [[x]]
rev xs = last xs : reverse (init xs)
rev [xs] = last [xs] : reverse (init [xs])
```

I'd appreciate any help. Thanks in advance.
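For what it's worth, the behavior asked for above is just two reversals composed: reverse the outer list, and reverse each inner list (in Haskell terms, one standard way is `rev = reverse . map reverse`). The same target semantics, sketched in Python for illustration:

```python
def rev(lists):
    """Reverse the outer list and every inner list, mirroring the
    behaviour asked for above (Haskell: reverse . map reverse)."""
    return [inner[::-1] for inner in reversed(lists)]
```

For example, `rev([[3, 4, 5], [7, 5, 2]])` yields `[[2, 5, 7], [5, 4, 3]]`, matching the session shown in the question.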
### Lobsters #### SHILL - A Secure, SHELL, Scripting Language (2014) ### StackOverflow #### ConvNet layers not showing activity and dropping to zero after a few minibatches I am attempting to train a convolutional neural network using Tensorflow. The structure of my network is similar to VGG except smaller - I'm using 3 pooling layers, 2 fully connected, and 252 target classes. I am using several layers of abstraction to make the final graph more readable. Base conv2d is like this def conv2d(inputs, n_out, kernel, step, **kwargs): name = kwargs.pop('name','conv') relu = kwargs.pop('relu', True) bias = kwargs.pop('bias', True) padding = kwargs.pop("padding", "SAME") channles_in = inputs.get_shape().as_list()[-1] with tf.variable_scope(name) as scope: weights = tf.get_variable('weights', [kernel, kernel, channles_in, n_out], initializer=xavier_initializer()) convolve = tf.nn.conv2d(inputs, weights, [1, step, step, 1], padding=padding) if bias is True: bias_layer = tf.get_variable("biases", [n_out], initializer=tf.constant_initializer(0.01)) convolve = tf.nn.bias_add(convolve, bias_layer) if relu is True: convolve = tf.nn.relu(convolve) return convolve  Affine layer def affine(inputs, n_out, **kwargs): name = kwargs.pop("name", "affine") bias = kwargs.pop("bias", True) relu = kwargs.pop("relu", True) input_shape = inputs.get_shape().as_list() if len(input_shape) == 4: n_in = reduce(mul, input_shape[1:], 1) inputs = tf.reshape(inputs, shape=[-1, n_in]) else: n_in = input_shape[-1] with tf.variable_scope(name) as scope: weights = tf.get_variable('weights', [n_in, n_out], initializer=xavier_initializer()) fc = tf.matmul(inputs, weights) if bias is True: bias_layer = tf.get_variable("biases", [n_out], initializer=tf.constant_initializer(0.01)) fc = tf.nn.bias_add(fc, bias_layer) if relu is True: fc = tf.nn.relu(fc) return fc  max pooling op def max_pool(inputs, ksize, stride, **kwargs): name = kwargs.pop("name", "max_pool") padding = kwargs.pop("padding", "SAME") with 
tf.variable_scope(name): pool = tf.nn.max_pool(inputs, ksize=[1, ksize, ksize, 1], strides=[1, stride, stride, 1], padding=padding) return pool  batch normalization for input images def batch_norm(inputs): with tf.variable_scope('batch_norm'): mean, var = tf.nn.moments(inputs, axes=[0, 1, 2]) return tf.nn.batch_normalization(inputs, mean, var, offset=0, scale=1.0, variance_epsilon=1e-6)  max pooling group (only using 2 convolutions per group) def max_pool_group(name, inputs, n_out, conv_k=3, conv_s=1, pool_k=2, pool_s=2): with tf.variable_scope(name): conv1 = conv2d(inputs, n_out, conv_k, conv_s, name='conv1') conv2 = conv2d(conv1, n_out, conv_k, conv_s, name='conv2') pool = max_pool(conv2, pool_k, pool_s, name='pool1') variable_summaries(pool, name) return pool  With all that, my model (minus the other Tensorflow boilerplate stuff) is defined as inputs = tf.placeholder(tf.float32, name='input', shape=[batch_size, 224, 224, 3]) target = tf.placeholder(tf.float32, name='target', shape=[batch_size, n_classes]) learning_rate_ph = tf.placeholder(tf.float32, name='learning_rage', shape=[]) def model(inputs): normed = batch_norm(inputs) pool1 = max_pool_group('pool1', normed, 64) pool2 = max_pool_group('pool2', pool1, 128) pool3 = max_pool_group('pool3', pool2, 256) fc1 = affine(pool3, 2048, name='fc1') variable_summaries(fc1, 'fc1') fc2 = affine(fc1, 2048, name='fc2') variable_summaries(fc2, 'fc2') logits = affine(fc2, n_classes, relu=False, name='logits') variable_summaries(logits, 'logits') return logits logits = model(inputs) loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits, target)) tf.scalar_summary('loss', loss) optimizer = tf.train.GradientDescentOptimizer(learning_rate_ph).minimize(loss) train_prediction = tf.nn.softmax(logits)  The issue I'm having is that after only a few minibatches, pool1 and pool2 drop to 0 and show no activity. You can see in the image that parameters are still being adjusted in other layers. 
I've checked the input images and they all check out, labels all line up, and I have a scheme to drop the learning rate after every 30 epochs. I've changed the fully connected layer sizes between 2048 and 4096, and I've changed the depth of the network to look like VGG-D, with the same response -- pooling layers converge to 0. Training on a handful of images (2-10), the model will hit targets and the loss will drop, but the layers still converge to 0. What could I be doing wrong? Is there something obvious I'm missing? Thanks in advance for your help.

### CompsciOverflow

#### What data structure for this operation? [duplicate]

This question already has an answer here: Say I have two arrays, energy and score, and they could have elements like this: E: 1 3 4 7 S: 16 10 5 1  What I want is the best score with the best energy. What data structure can support inserting items such that I never keep an item with both less energy and less score than another item, i.e. for any i,j where i!=j => score[i] > score[j] || energy[i] > energy[j]. When inserting, I do three steps: 1- if any existing item has more or equal score and energy, return; 2- if any existing item has less or equal score and energy, remove that item; 3- insert the new item. Here are some examples: 1- insert e=8, s=1. The arrays become: E: 1 3 4 8 S: 16 10 5 1 ^  2- insert e=5, s=6. The arrays become: E: 1 3 5 8 S: 16 10 6 1 ^  3- insert e=5, s=11. The arrays become: E: 1 5 8 S: 16 11 1 ^ (3,10) is removed because (5,11) has more energy and more score than it.  What data structure can support this in (hopefully) O(log n) time?

### QuantOverflow

#### How to extract all the ticker symbols of an exchange with Quantmod in R?

I am using the Quantmod package in R for some data analysis.
Now I can download the price history of particular stocks or an index with the following code: library(quantmod) # Loading quantmod library getSymbols("^DJI", from = as.character(Sys.Date()-365*3))  I want to download all the ticker symbols that are constituents of a particular index, such as the DJI for example. What would be the best way to do that through R? Thanks a lot in advance.

### CompsciOverflow

#### Multi-point evaluations of a polynomial mod p

Given a polynomial of degree $n$ modulo a prime number $p$, I want to evaluate that polynomial at multiple values of the variable $x$. What is the best way to do this? I tried using Berlekamp's algorithm for factorization, but it takes $O(n^3)$ just to factorize and then $O(n)$ per point evaluation. Is there any other way to bring the complexity down considerably, say to $n\log(q)$ where $q$ is the number of points I want to evaluate the polynomial at? Or possibly polynomial time? All the coefficients and the values of $x$ that the polynomial is to be evaluated at lie between $0$ and $p - 1$; the prime is of order $10^6$.

### DataTau

#### Seven ways to be data-driven off a cliff

### HN Daily

#### Daily Hacker News for 2016-08-21

## August 21, 2016

### QuantOverflow

#### Is this a viable method for testing market making strategies?

I found a video game market (the Steam Community Market) which allows trading of in-game items between users. Most items are <0.25 USD each, and market capitalization appears to be maybe $5-$10 USD on some items. Something to be noted is that the transaction fee is 15%, which does limit the possibilities a bit. Some discoveries that should be taken into account: many items appear to have very consistent supply and demand, leaving them in a sort of equilibrium, so the drift in price is very low. One thing I did find while manually market making is that occasionally a user will sell an item at a price lower than the bid, effectively making spreads go negative. In that case, the highest bidder then receives the item.
This occurs on average about 1 out of 15 times a trade is cleared, but will occasionally go as long as 100 trades between opportunities. I created a program that constantly re-lists bid and offer quotes at the same price as soon as they are filled. By clearing very high volumes (over 1k items a day) I managed to turn $1 into $14 USD within 3 weeks. This isn't as good as it sounds, given that the returns capped out at about 5 bucks invested, and I even got blocked from accessing the server after making 8k+ requests in an hour. Given that some less popular items have capitalizations of less than 5 USD, it is possible for a dealer to accumulate almost all copies of the item in existence, allowing for the fixing of prices, perhaps useful for reducing volatility. There is a one-week holding period before the in-game item is delivered to your inventory; this was implemented into the market before I attempted arbitrage based on negative spreads, and was why I stopped my high-volume strategy. A one-week delivery means a very large open interest is required in order to provide liquidity 24/7. Another thing I have discovered is that certain items are pretty much identical, but the amount they have been used in the game affects the value of the item and creates a spread; this could perhaps be used for correlation-based stat-arb? Modeling the market: for an item in equilibrium, the price will not drift much, so the time series can be assumed to be stationary.
Since it is easy to accumulate a massive open interest, at least compared to volume, one can assume for the sake of the model that there is unlimited buying and selling potential on behalf of the liquidity provider, and I will assume no one else is providing liquidity to the market and that all other bids/offers are from individuals seeking the utility of the item, rather than to make gains trading it. $b$ will represent the dealer's bid. $o$ will represent the dealer's offer. $B_t$ will represent a process for the given item's market bid rate, defined as $\ge b$, with an unknown distribution. $O_t$ will represent a process for the given item's market offer rate, defined as $\le o$. $S_t$ can be defined as $O_t - B_t$. One hypothesis I have is that $B_t - b$ and $o - O_t$ are perhaps log-normally distributed. A method I propose for verifying the distribution is to sample the percentage of negative-spread arbitrage opportunities and compare it to the expected number of opportunities a given distribution predicts for the process $S_t$, which could be used for fitting a distribution. My question: can any research done into this in-game market be applied to real market making for financial markets, or are there factors not accounted for?

### CompsciOverflow

#### Find ellipsoid that contains intersection of an ellipsoid and a hyperplane

I have an $n$-dimensional ellipsoid $E$ and a hyperplane $H$. This hyperplane cuts $E$ into two parts: $E_1$ and $E_2$ (whose disjoint union is $E$). I want to find another ellipsoid $E'$ that has minimal hyper-volume and contains $E_1$. Is there an efficient algorithm to do this? My first thought was to formulate it as an optimization problem, but I am having difficulty doing so, as I don't know how to formulate the containment ($E_1 \subseteq E'$) constraint. An approximation of the minimal hyper-volume ellipsoid is also good for my needs.
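A classical closed form addresses exactly this: the deep-cut update from the ellipsoid method gives the minimum-volume ellipsoid containing the piece of an ellipsoid on one side of a hyperplane (see, e.g., Grötschel, Lovász and Schrijver, *Geometric Algorithms and Combinatorial Optimization*). A sketch, writing $E = \{z : (z-c)^\top P^{-1}(z-c) \le 1\}$ with the kept halfspace $a^\top z \le b$; the function name is made up:

```python
import numpy as np

def halfspace_ellipsoid(P, c, a, b):
    """Minimum-volume ellipsoid containing {z in E : a.z <= b}.

    Deep-cut ellipsoid-method update. With alpha = (a.c - b)/sqrt(a.P.a):
    alpha >= 1 means the kept region is empty; alpha <= -1/n means the
    cut removes nothing, so the original ellipsoid is already minimal.
    """
    n = len(c)
    Pa = P @ a
    denom = np.sqrt(a @ Pa)
    alpha = (a @ c - b) / denom
    if alpha >= 1:
        raise ValueError("cut leaves an empty intersection")
    if alpha <= -1.0 / n:
        return P, c
    tau = (1 + n * alpha) / (n + 1)
    sigma = 2 * (1 + n * alpha) / ((n + 1) * (1 + alpha))
    delta = n * n / (n * n - 1.0) * (1 - alpha * alpha)
    c_new = c - tau * Pa / denom
    P_new = delta * (P - sigma * np.outer(Pa, Pa) / (a @ Pa))
    return P_new, c_new

# Central cut of the unit disc by x1 <= 0: the result is the Löwner–John
# ellipsoid of the half-disc, centered at (-1/3, 0).
P1, c1 = halfspace_ellipsoid(np.eye(2), np.zeros(2), np.array([1.0, 0.0]), 0.0)
```

For the original question, $a$ and $b$ come from the hyperplane $H$, choosing the sign of $a$ so that $E_1$ lies in $a^\top z \le b$.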
#### Higher order verification in a complete logic

I'd like to design a language that is able to reason over itself, that is, able to take as input code in that language (which might have gone through some external redundant preprocessing, or "reflection" to use another term) and reason over it. MLTT is of course a natural choice. But I'm seeking a logically complete language, obviously sacrificing expressiveness (in fact it must not be able to express arithmetic; otherwise, by Gödel, or more directly by the mortal matrix problem or Hilbert's 10th problem, it cannot be complete). Therefore as a first candidate I thought of MSO over graphs. My question is how MSO over graphs can reason over itself, or more precisely: if we had a prolog-like (kb&query) language in MSO-over-graphs logic, how could it interpret or compile itself (and as a byproduct also reason over itself)?

#### Can Tree Transducers Self-Interpret?

Is it possible for MSO graph/tree transducers to reflect themselves, namely to create an interpreter of tree/graph transducers using tree/graph transducers? If yes, I'll be happy for some design guidelines.

### QuantOverflow

#### What is the Toxic FX Flow debate?

So, basically I want to debate and find out the real reason behind being flagged by ECNs and venues as "toxic". How does one avoid being flagged? What kinds of strategies are toxic, and why? Below is an article from a brokerage firm... so the opinion in that article may not be that objective. Article about toxic FX flow. Any thoughts?

### StackOverflow

#### Scikit-Learn: Std.Error, p-Value from LinearRegression

I've been trying to get the standard errors & p-values using LinearRegression from scikit-learn, but without success.
I've end up finding up this article: but the std error & p-value does not match that from the statsmodel.api OLS method import numpy as np from sklearn import datasets from sklearn import linear_model import regressor import statsmodels.api as sm boston = datasets.load_boston() which_betas = np.ones(13, dtype=bool) which_betas[3] = False X = boston.data[:,which_betas] y = boston.target #scikit + regressor stats ols = linear_model.LinearRegression() ols.fit(X,y) xlables = boston.feature_names[which_betas] regressor.summary(ols, X, y, xlables) # statsmodel x2 = sm.add_constant(X) models = sm.OLS(y,x2) result = models.fit() print result.summary()  Output as follows: Residuals: Min 1Q Median 3Q Max -26.3743 -1.9207 0.6648 2.8112 13.3794 Coefficients: Estimate Std. Error t value p value _intercept 36.925033 4.915647 7.5117 0.000000 CRIM -0.112227 0.031583 -3.5534 0.000416 ZN 0.047025 0.010705 4.3927 0.000014 INDUS 0.040644 0.055844 0.7278 0.467065 NOX -17.396989 3.591927 -4.8434 0.000002 RM 3.845179 0.272990 14.0854 0.000000 AGE 0.002847 0.009629 0.2957 0.767610 DIS -1.485557 0.180530 -8.2289 0.000000 RAD 0.327895 0.061569 5.3257 0.000000 TAX -0.013751 0.001055 -13.0395 0.000000 PTRATIO -0.991733 0.088994 -11.1438 0.000000 B 0.009827 0.001126 8.7256 0.000000 LSTAT -0.534914 0.042128 -12.6973 0.000000 --- R-squared: 0.73547, Adjusted R-squared: 0.72904 F-statistic: 114.23 on 12 features OLS Regression Results ============================================================================== Dep. Variable: y R-squared: 0.735 Model: OLS Adj. R-squared: 0.729 Method: Least Squares F-statistic: 114.2 Date: Sun, 21 Aug 2016 Prob (F-statistic): 7.59e-134 Time: 21:56:26 Log-Likelihood: -1503.8 No. Observations: 506 AIC: 3034. Df Residuals: 493 BIC: 3089. Df Model: 12 Covariance Type: nonrobust ============================================================================== coef std err t P>|t| [95.0% Conf. Int.] 
------------------------------------------------------------------------------ const 36.9250 5.148 7.173 0.000 26.811 47.039 x1 -0.1122 0.033 -3.405 0.001 -0.177 -0.047 x2 0.0470 0.014 3.396 0.001 0.020 0.074 x3 0.0406 0.062 0.659 0.510 -0.081 0.162 x4 -17.3970 3.852 -4.516 0.000 -24.966 -9.828 x5 3.8452 0.421 9.123 0.000 3.017 4.673 x6 0.0028 0.013 0.214 0.831 -0.023 0.029 x7 -1.4856 0.201 -7.383 0.000 -1.881 -1.090 x8 0.3279 0.067 4.928 0.000 0.197 0.459 x9 -0.0138 0.004 -3.651 0.000 -0.021 -0.006 x10 -0.9917 0.131 -7.547 0.000 -1.250 -0.734 x11 0.0098 0.003 3.635 0.000 0.005 0.015 x12 -0.5349 0.051 -10.479 0.000 -0.635 -0.435 ============================================================================== Omnibus: 190.837 Durbin-Watson: 1.015 Prob(Omnibus): 0.000 Jarque-Bera (JB): 897.143 Skew: 1.619 Prob(JB): 1.54e-195 Kurtosis: 8.663 Cond. No. 1.51e+04 ============================================================================== Warnings: [1] Standard Errors assume that the covariance matrix of the errors is correctly specified. [2] The condition number is large, 1.51e+04. This might indicate that there are strong multicollinearity or other numerical problems.  
I've also found the following articles, but neither of the code samples in the SO links compiles. Here is the code & data that I'm working on; I'm still unable to get the std errors & p-values: import pandas as pd import statsmodels.api as sm import numpy as np import scipy from sklearn.linear_model import LinearRegression from sklearn import metrics def readFile(filename, sheetname): xlsx = pd.ExcelFile(filename) data = xlsx.parse(sheetname, skiprows=1) return data def lr_statsmodel(X,y): X = sm.add_constant(X) model = sm.OLS(y,X) results = model.fit() print (results.summary()) def lr_scikit(X,y,featureCols): model = LinearRegression() results = model.fit(X,y) predictions = results.predict(X) print 'Coefficients' print 'Intercept\t' , results.intercept_ df = pd.DataFrame(zip(featureCols, results.coef_)) print df.to_string(index=False, header=False) # Query:: The numbers matches with Excel OLS but skeptical about relating score as rsquared rSquare = results.score(X,y) print '\nR-Square::', rSquare # This looks like a better option # source: http://scikit-learn.org/stable/modules/generated/sklearn.metrics.r2_score.html#sklearn.metrics.r2_score r2 = metrics.r2_score(y,results.predict(X)) print 'r2', r2 # Query: No clue at all! http://scikit-learn.org/stable/modules/model_evaluation.html#regression-metrics print 'Rsquared?!' , metrics.explained_variance_score(y, results.predict(X)) # INFO:: All three of them are providing the same figures! # Adj-Rsquare formula @ https://www.easycalculation.com/statistics/learn-adjustedr2.php # In ML, we don't use all of the data for training, and hence its highly unusual to find AdjRsquared.
Thus the need for manual calculation N = X.shape[0] p = X.shape[1] adjRsquare = 1 - ((1 - rSquare ) * (N - 1) / (N - p - 1)) print "Adjusted R-Square::", adjRsquare # calculate standard errors # mean_absolute_error # mean_squared_error # median_absolute_error # r2_score # explained_variance_score mse = metrics.mean_squared_error(y,results.predict(X)) print mse print 'Residual Standard Error:', np.sqrt(mse) # OLS in Matrix : https://github.com/nsh87/regressors/blob/master/regressors/stats.py n = X.shape[0] X1 = np.hstack((np.ones((n, 1)), np.matrix(X))) se_matrix = scipy.linalg.sqrtm( metrics.mean_squared_error(y, results.predict(X)) * np.linalg.inv(X1.T * X1) ) print 'se',np.diagonal(se_matrix) # https://github.com/nsh87/regressors/blob/master/regressors/stats.py # http://regressors.readthedocs.io/en/latest/usage.html y_hat = results.predict(X) sse = np.sum((y_hat - y) ** 2) print 'Standard Square Error of the Model:', sse if __name__ == '__main__': # read file fileData = readFile('Linear_regression.xlsx','Input Data') # list of independent variables feature_cols = ['Price per week','Population of city','Monthly income of riders','Average parking rates per month'] # build dependent & independent data set X = fileData[feature_cols] y = fileData['Number of weekly riders'] # Statsmodel - OLS # lr_statsmodel(X,y) # ScikitLearn - OLS lr_scikit(X,y,feature_cols)

My data-set:

| City | Number of weekly riders (Y) | Price per week (X1) | Population of city (X2) | Monthly income of riders (X3) | Average parking rates per month (X4) |
|------|------|------|------|------|------|
| 1 | 1,92,000 | $15 | 18,00,000 | $5,800 | $50 |
| 2 | 1,90,400 | $15 | 17,90,000 | $6,200 | $50 |
| 3 | 1,91,200 | $15 | 17,80,000 | $6,400 | $60 |
| 4 | 1,77,600 | $25 | 17,78,000 | $6,500 | $60 |
| 5 | 1,76,800 | $25 | 17,50,000 | $6,550 | $60 |
| 6 | 1,78,400 | $25 | 17,40,000 | $6,580 | $70 |
| 7 | 1,80,800 | $25 | 17,25,000 | $8,200 | $75 |
| 8 | 1,75,200 | $30 | 17,25,000 | $8,600 | $75 |
| 9 | 1,74,400 | $30 | 17,20,000 | $8,800 | $75 |
| 10 | 1,73,920 | $30 | 17,05,000 | $9,200 | $80 |
| 11 | 1,72,800 | $30 | 17,10,000 | $9,630 | $80 |
| 12 | 1,63,200 | $40 | 17,00,000 | $10,570 | $80 |
| 13 | 1,61,600 | $40 | 16,95,000 | $11,330 | $85 |
| 14 | 1,61,600 | $40 | 16,95,000 | $11,600 | $100 |
| 15 | 1,60,800 | $40 | 16,90,000 | $11,800 | $105 |
| 16 | 1,59,200 | $40 | 16,30,000 | $11,830 | $105 |
| 17 | 1,48,800 | $65 | 16,40,000 | $12,650 | $105 |
| 18 | 1,15,696 | $102 | 16,35,000 | $13,000 | $110 |
| 19 | 1,47,200 | $75 | 16,30,000 | $13,224 | $125 |
| 20 | 1,50,400 | $75 | 16,20,000 | $13,766 | $130 |
| 21 | 1,52,000 | $75 | 16,15,000 | $14,010 | $150 |
| 22 | 1,36,000 | $80 | 16,05,000 | $14,468 | $155 |
| 23 | 1,26,240 | $86 | 15,90,000 | $15,000 | $165 |
| 24 | 1,23,888 | $98 | 15,95,000 | $15,200 | $175 |
| 25 | 1,26,080 | $87 | 15,90,000 | $15,600 | $175 |
| 26 | 1,51,680 | $77 | 16,00,000 | $16,000 | $190 |
| 27 | 1,52,800 | $63 | 16,10,000 | $16,200 | $200 |


I've exhausted all my options and everything I could make sense of. So any guidance on how to compute the std errors & p-values so that they match statsmodels.api would be appreciated.

EDIT: I'm trying to find the std errors & p-values for the intercept and all the independent variables.
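For what it's worth, the statistics statsmodels reports can be recomputed from a fitted scikit-learn model with the classical OLS formulas. A sketch (the helper name and the synthetic data are made up): the key detail is the degrees-of-freedom correction $\hat\sigma^2 = SSE/(n - p - 1)$; dividing by $n$ instead, as `mean_squared_error` does, is one common source of mismatches like the one above, though other differences (e.g. how the design matrix is built) can matter too.

```python
import numpy as np
from scipy import stats
from sklearn.linear_model import LinearRegression

def ols_inference(model, X, y):
    """Std errors, t-stats and p-values for a fitted LinearRegression.

    Classical OLS formulas, which is what statsmodels' OLS reports:
    sigma^2 = SSE / (n - p - 1), Var(beta) = sigma^2 * (X'X)^-1,
    with an intercept column prepended to X.
    """
    n, p = X.shape
    X1 = np.column_stack([np.ones(n), X])  # design matrix with intercept
    resid = y - model.predict(X)
    sigma2 = resid @ resid / (n - p - 1)   # unbiased residual variance
    cov = sigma2 * np.linalg.inv(X1.T @ X1)
    se = np.sqrt(np.diag(cov))             # order: [intercept, coef_1, ...]
    beta = np.concatenate([[model.intercept_], model.coef_])
    t = beta / se
    pvals = 2 * stats.t.sf(np.abs(t), df=n - p - 1)
    return se, t, pvals

# synthetic data (made up): y depends on columns 0 and 2, not on column 1
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 1.0 + X @ np.array([2.0, 0.0, -1.5]) + rng.normal(size=200)
m = LinearRegression().fit(X, y)
se, t, p = ols_inference(m, X, y)
```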

### CompsciOverflow

#### Sum of size of distinct set of descendants $d$ distance from a node $u$, over all $u$ and $d$ is $\mathcal{O}(n\sqrt{n})$

Let's consider a rooted tree $T$ of $n$ nodes. For any node $u$ of the tree, define $L(u,d)$ to be the list of descendants of $u$ that are distance $d$ away from $u$. Let $|L(u,d)|$ denote the number of nodes that are present in the list $L(u,d)$.

Prove that the sum of $|L(u,d)|$ over all distinct lists $L(u,d)$ is bounded by $\mathcal{O}(n\sqrt{n})$.

# My work

Consider all $L(u,d)$ such that the left most node on the level $Level(u) + d$ is some node $v$. The pairs $u, d$ for all such $L(u,d)$ must be distinct and the sum of all $d_i$ will correspond to the number of nodes $x$ in the tree with $Level(x) \le Level(u) + d$.

This is because if some sequence of nodes $v_1, v_2, \dots v_k$ corresponds to the descendants of some node $u$ at a distance $d$ and the sequence of nodes $v_1, v_2, \dots v_{k'}$ where $k' > k$ corresponds to the descendants of some node $u'$ at a distance $d+1$, then there must also exist a node $u''$ such that $L(u'', d) = v_{k+1}, v_{k+2}, \dots v_{k'}$. This would also mean that $u''$ is not in the subtree of $u$, and thus there are at least $d$ distinct nodes in the subtree of $u''$ up to a distance $d$ from $u''$.

If the distinct distances are $d_1, d_2, \dots d_k$ then $n \ge \sum_{i}d_i \ge \sum_{i=1}^{k}i = \frac{k(k+1)}{2}$.

$\implies k \le \sqrt{2n} = \mathcal{O}(\sqrt{n})$

After this I tried to show that there can be only $\mathcal{O}(\sqrt{n})$ distinct lists $L(u,d)$, so that I could then trivially obtain the upper bound of $\mathcal{O}(n\sqrt{n})$, but I could not make any more useful observations.

This link claims that such an upper bound does exist but has not provided the proof.

Any ideas how we might proceed to prove this?
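As a sanity check (not a proof), one can enumerate the distinct lists $L(u,d)$ on sample trees and compare their total size against $n\sqrt{n}$. A sketch, with all names made up:

```python
import math
import random
from collections import defaultdict

def distinct_list_mass(children, root=0):
    """Sum of |L(u,d)| over *distinct* lists L(u,d) in a rooted tree.

    L(u,d) is the list of descendants of u at distance d below u;
    identical lists reached from different (u,d) pairs count once.
    """
    lists = {}  # (u, d) -> tuple of nodes

    def collect(u):
        # per_depth[d] = descendants of u at distance d (d=0 is u itself)
        per_depth = defaultdict(list)
        per_depth[0] = [u]
        for v in children[u]:
            for d, nodes in collect(v).items():
                per_depth[d + 1].extend(nodes)
        for d, nodes in per_depth.items():
            if d > 0:
                lists[(u, d)] = tuple(nodes)
        return per_depth

    collect(root)
    return sum(len(t) for t in set(lists.values()))

# random tree on n nodes: the parent of node i is a random earlier node
random.seed(1)
n = 500
children = [[] for _ in range(n)]
for i in range(1, n):
    children[random.randrange(i)].append(i)
print(distinct_list_mass(children), "vs n*sqrt(n) =", n * math.sqrt(n))
```

On a path 0-1-2, for instance, the lists are $L(0,1)=(1)$, $L(0,2)=(2)$ and $L(1,1)=(2)$; the last two coincide, so the distinct mass is 2.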

# Problem

While taking notes from a Haskell book, this code example should return Left [NameEmpty, AgeTooLow], but it only returns the first case, Left [NameEmpty]. Then when I call mkPerson2 with arguments for which it should return Right (Person _ _), I get back a Non-exhaustive patterns error. I've looked over this code for quite some time, but it looks right to me. What am I missing here? Any explanation on the subject would be absolutely appreciated, thanks!

Book I'm using

# Code

module EqCaseGuard where

type Name = String
type Age  = Integer
type ValidatePerson a = Either [PersonInvalid] a

data Person = Person Name Age deriving Show

data PersonInvalid = NameEmpty
| AgeTooLow
deriving Eq

ageOkay :: Age -> Either [PersonInvalid] Age
ageOkay age = case age >= 0 of
True  -> Right age
False -> Left [AgeTooLow]

nameOkay :: Name -> Either [PersonInvalid] Name
nameOkay name = case name /= "" of
True  -> Right name
False -> Left [NameEmpty]

mkPerson2 :: Name -> Age -> ValidatePerson Person
mkPerson2 name age = mkPerson2' (nameOkay name) (ageOkay age)

mkPerson2' :: ValidatePerson Name -> ValidatePerson Age -> ValidatePerson Person
mKPerson2' (Right nameOk) (Right ageOk) = Right (Person nameOk ageOk)


# Error

*EqCaseGuard> mkPerson2 "jack" 22
*** Exception: eqCaseGuard.hs:(54,1)-(55,53): Non-exhaustive patterns in function mkPerson2'

*EqCaseGuard> mkPerson2 "" (-1)
Left [NameEmpty]
*EqCaseGuard>
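For reference, the behavior the exercise expects — accumulating *every* validation failure rather than stopping at the first — can be sketched outside Haskell like this (illustrative only; in the Haskell above, `mkPerson2'` would also need equations covering the `Left` cases, e.g. combining two `Left`s with `++`):

```python
def validate_person(name, age):
    """Accumulate all validation errors, mirroring the intended
    Either [PersonInvalid] semantics from the question."""
    errors = []
    if name == "":
        errors.append("NameEmpty")
    if age < 0:
        errors.append("AgeTooLow")
    if errors:
        return ("Left", errors)        # all failures reported together
    return ("Right", (name, age))      # the validated Person

print(validate_person("", -1))  # → ('Left', ['NameEmpty', 'AgeTooLow'])
```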


### Fefe

#### One thing I always find absolutely priceless about the US ...

One thing I always find absolutely priceless about the US is the blatantly euphemistic cover names the lobby groups there give themselves. The big deregulation lobby, for example, is called "Americans for Prosperity", as if deregulation led to prosperity (except for the already super-rich, of course). The gun-regulation lobby calls itself things like "Americans for Responsible Solutions" or "Independence USA" (lolwut?), while the more-guns-for-everyone lobby calls itself the "Institute for Legislative Action" (which of course does the opposite, obstructing gun-control legislation). One of the better-known lobby groups against regulation of the fast food, meat, alcohol and tobacco industries is called the "Center for Consumer Freedom".

No, really!

And what does it do?

You'll NEVER guess!

It puts up pro-AfD campaign posters!

### QuantOverflow

#### interactive brokers market order slippage

I have been considering to use the interactive brokers API for my automated trading platform, however I would like to see if anybody here has experience with the quality of the service.

My main concern is how much slippage is on average present with market orders.

I understand that the answer depends strongly on the type of future being traded, so assume only highly liquid large-cap names such as those listed on the Dow or the S&P 500.

Since I am still developing and testing my quantitative strategy, I would just like to be aware of what kind of typical slippage I should account for in my testing models.

Edit: also, if someone can recommend a service they feel is better than Interactive Brokers, please let me know (I'm located in Canada).

### StackOverflow

#### RXJava with Retrofit2 I can't retrieve the server response, nor a simple Log.d

I can see in the debugger that the array is retrieved, but when I put a Log.d with the first position of the array in onNext, nothing is logged to the console, and I do not know if I am accessing the first position of the array in the correct way:

Log.d("IVO", "onNext" + stackOverflowQuestions.items.get(0).title.toString());

this is the Main

package com.vogella.android.retrofitstackoverflow;

import (..)

public class MainActivity extends ListActivity {
//original https://api.stackexchange.com/2.2/search?order=desc&sort=activity&tagged=android&site=stackoverflow

protected void onCreate(Bundle savedInstanceState) {
super.onCreate(savedInstanceState);

requestWindowFeature(Window.FEATURE_INDETERMINATE_PROGRESS);
requestWindowFeature(Window.FEATURE_PROGRESS);
android.R.layout.simple_list_item_1,
android.R.id.text1,
new ArrayList<Question>());
setProgressBarIndeterminateVisibility(true);
setProgressBarVisibility(true);
}

@Override
return true;
}

@Override
setProgressBarIndeterminateVisibility(true);
Gson gson = new GsonBuilder()
.setDateFormat("yyyy-MM-dd'T'HH:mm:ssZ")
.create();
Retrofit retrofit = new Retrofit.Builder()
.baseUrl("https://api.stackexchange.com")
.build();

// prepare call in Retrofit 2.0
StackOverflowAPI stackOverflowAPI = retrofit.create(StackOverflowAPI.class);

//the real call to the server

.subscribeOn(Schedulers.io())
.subscribe(new Subscriber<StackOverflowQuestions>() {
@Override
public void onCompleted() {
Log.d("IVO", "completed");
}

@Override
public void onError(Throwable e) {

}

@Override
public void onNext(StackOverflowQuestions stackOverflowQuestions) {

Log.d("IVO", "onNext" + stackOverflowQuestions.items.get(0).title.toString());
Log.d("IVO", "onNext" );
}
});

return true;
}


}

this is StackOverflowQuestions

package com.vogella.android.retrofitstackoverflow;

import java.util.List;

public class StackOverflowQuestions {
List<Question> items;
}


this is Question

package com.vogella.android.retrofitstackoverflow;

// This is used to map the JSON keys to the object by GSON
public class Question {

String title;

@Override
public String toString() {
return(title);
}
}


EDIT StackOverflowAPI as requested:

package com.vogella.android.retrofitstackoverflow;

import android.util.Log;

import retrofit2.Callback;
import retrofit2.http.GET;
import retrofit2.http.Query;
import retrofit2.Call;
import rx.Observable;

public interface StackOverflowAPI {
@GET("/2.2/questions?order=desc&sort=creation&site=stackoverflow")

}


### Wes Felter

#### The Merkle: John McAfee Declares War On Kim Dotcom

The Merkle: John McAfee Declares War On Kim Dotcom:

Can we get Captain Crunch and Mark Karpeles to serve as seconds?

### Overcoming Bias

You can often learn about your own world by first understanding some other world, and then asking if your world is more like that other world than you had realized. For example, I just attended WorldCon, the top annual science fiction convention, and patterns that I saw there more clearly also seem echoed in wider worlds.

At WorldCon, most of the speakers are science fiction authors, and the modal emotional tone of the audience is one of reverence. Attendees love science fiction, revere its authors, and seek excuses to rub elbows with them. But instead of just having social mixers, authors give speeches and sit on panels where they opine on many topics. When they opine on how to write science fiction, they are of course experts, but in fact they mostly prefer to opine on other topics. By presenting themselves as experts on a great many future, technical, cultural, and social topics, they help preserve the illusion that readers aren’t just reading science fiction for fun; they are also part of important larger conversations.

When science fiction books overlap with topics in space, physics, medicine, biology, or computer science, their authors often read up on those topics, and so can be substantially more informed than typical audience members. And on such topics actual experts will often be included on the agenda. Audiences may even be asked if any of them happen to have expertise on such a topic.

But the more that a topic leans social, and has moral or political associations, the less inclined authors are to read expert literatures on that topic, and the more they tend to just wing it and think for themselves, often on their feet. They less often add experts to the panel or seek experts in the audience. And relatively neutral analysis tends to be displaced by position taking – they find excuses to signal their social and political affiliations.

The general pattern here is: an audience has big reasons to affiliate with speakers, but prefers to pretend those speakers are experts on something, and they are just listening to learn about that thing. This is especially true on social topics. The illusion is exposed by facts like speakers not being chosen for knowing the most about a subject discussed, and those speakers not doing much homework. But enough audience members are ignorant of these facts to provide a sufficient fig leaf of cover to the others.

This same general pattern repeats all through the world of conferences and speeches. We tend to listen to talks and panels full of not just authors, but also generals, judges, politicians, CEOs, rich folks, athletes, and actors. Even when those are not the best informed, or even the most entertaining, speakers on a topic. And academic outlets tend to publish articles and books more for being impressive than for being informative. However, enough people are ignorant of these facts to let audiences pretend that they mainly listen to learn and get information, rather than to affiliate with the statusful.

Added 22Aug: We feel more strongly connected to people when we together visibly affirm our shared norms/values/morals. Which explains why speakers look for excuses to take positions.

### Planet Theory

#### TR16-131 | Threshold Secret Sharing Requires a Linear Size Alphabet | Andrej Bogdanov, Siyao Guo, Ilan Komargodski

We prove that for every $n$ and $1 < t < n$ any $t$-out-of-$n$ threshold secret sharing scheme for one-bit secrets requires share size $\log(t + 1)$. Our bound is tight when $t = n - 1$ and $n$ is a prime power. In 1990 Kilian and Nisan proved the incomparable bound $\log(n - t + 2)$. Taken together, the two bounds imply that the share size of Shamir's secret sharing scheme (Comm. ACM '79) is optimal up to an additive constant even for one-bit secrets for the whole range of parameters $1 < t < n$. More generally, we show that for all $1 < s < r < n$, any ramp secret sharing scheme with secrecy threshold $s$ and reconstruction threshold $r$ requires share size $\log((r + 1)/(r - s))$. As part of our analysis we formulate a simple game-theoretic relaxation of secret sharing for arbitrary access structures. We prove the optimality of our analysis for threshold secret sharing with respect to this method and point out a general limitation.
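For context, the scheme whose share size the result shows is optimal is Shamir's: shares are elements of a prime field, so each share costs about $\log p$ bits, which is why the lower bounds above are phrased in terms of alphabet size. A minimal sketch (the field choice and function names here are illustrative, not from the paper):

```python
import random

P = 2**61 - 1  # a Mersenne prime; shares live in GF(P)

def share(secret, t, n):
    """Shamir t-out-of-n sharing: random degree-(t-1) polynomial f
    over GF(P) with f(0) = secret; share i is the point (i, f(i))."""
    coeffs = [secret] + [random.randrange(P) for _ in range(t - 1)]
    def f(x):
        acc = 0
        for c in reversed(coeffs):  # Horner evaluation mod P
            acc = (acc * x + c) % P
        return acc
    return [(i, f(i)) for i in range(1, n + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0 from any t shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, P - 2, P)) % P
    return secret

shares = share(1, t=3, n=5)
print(reconstruct(shares[:3]))  # → 1
```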

### CompsciOverflow

I have to work on cloud computing for my graduation project. It will be a cloud-based application, not research. I've read about combining embedded systems with clouds, but I can't think of a cloud-based embedded application that would benefit from the cloud. Are there any good ideas for such a project, or for any other cloud-based application?

### Planet Emacsen

#### Grant Rettke: Emacs-wgrep Provides Writable Grep Buffers That Apply The Changes To The Files

Emacs-wgrep provides writable grep buffers that apply your edits back to the underlying files.

The idea is intuitive and familiar if you already like editable dired buffers.

### StackOverflow

#### What is the definition of a pattern in Rust and what is pattern matching? [on hold]

I am a programmer who is very familiar with languages like C and C++, but I have very little experience with functional languages. I am attempting to learn Rust and would like to know how Rust defines a pattern, and what pattern matching with a match expression is in Rust.

### Planet Emacsen

#### Grant Rettke: It Is Time To Migrate from grep to ag

ag is fast, does what you expect, and works well with Emacs. Maybe it is time for you to switch.

### QuantOverflow

#### Creating a Beta-Neutral Portfolio

Given a portfolio of assets (say 10) and a trading signal for each (1 = hold long, -1 = hold short, 0 = no position):

      ___________________   Day Count  ______________________

Asset |0|1|2|3|4|5|6|7|8|9|10|11| ... |30|31|32|33|34|35| ...
--------+------------------------------------------------------
1. IBM  |1|1|1|1|1|1|1|0|0|0| 0| 0| ... | 0|-1|-1|-1| 0| 0| ...
2. APPL |0|0|0|1|1|1|1|1|0|0|-1|-1| ... |-1| 0| 0| 0| 0| 0| ...
:                        :                 :
:                        :                 :
10.TSLA |0|0|0|0|0|1|1|1|0|0| 0| 0| ... | 0|-1|-1|-1| 0| 0| ...


1. IBM : Buy on Day0 and Sell on Day7; then Short on Day31 and Buy back on Day34, and so on.
2. APPL: Buy on Day3 and Sell on Day8; then Short on Day10 and Buy back on Day31, and so on.
3. TSLA: Buy on Day5 and Sell on Day8; then Short on Day31 and Buy back on Day34, and so on.

My question is: given that the rebalancing times are not fixed and that on some days there are long-only or short-only positions, how can one make this portfolio beta-neutral?
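One common approach can be sketched as follows: each day, sum the dollar beta of the open positions and offset it with an index instrument of beta one. The betas, prices, and signal matrix below are made up for illustration; this is a sketch of the idea, not a full solution.

```python
import numpy as np

def index_hedge(signals, betas, prices, index_price):
    """Index shares that zero out the portfolio's dollar beta each day.

    signals: (n_assets, n_days) matrix of +1 long / -1 short / 0 flat,
    one share per position for simplicity; index beta is taken as 1.
    """
    signals = np.asarray(signals, dtype=float)
    betas = np.asarray(betas, dtype=float)
    prices = np.asarray(prices, dtype=float)
    # Dollar beta contributed by each open position, summed per day.
    dollar_beta = (signals * (betas * prices)[:, None]).sum(axis=0)
    # Offsetting index position (negative = short the index).
    return -dollar_beta / index_price

signals = [[1, 1, 0, -1],   # asset A: long, long, flat, short
           [0, 1, 1, 0]]    # asset B: flat, long, long, flat
hedge = index_hedge(signals, betas=[1.2, 0.8],
                    prices=[100.0, 50.0], index_price=200.0)
```

Because the hedge is recomputed from whatever positions happen to be open, irregular rebalancing and long-only or short-only days are handled automatically: on those days the index leg simply absorbs the full net beta.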

### CompsciOverflow

#### Is it inefficient to use Unity to turn 32kb of Javascript into a mobile app? Are there alternatives?

Q: What is Unity actually doing? Is it simply wrapping a shell around the code, or compiling it into lower-level code?

Q: Is it possible to just make a simple app that is merely a browser, and run the JS game code in that?

Part of the reason I ask is that my game code is <32kb and all I need is a menu and a way to connect to a server for PvP.

I tried a Unity-built Minesweeper clone, which is a similarly simple, grid-based puzzle that actually has fewer in-game elements and much less complexity. It took forever to load. The project has ~50x more objects than I have functions, and the project files take up 10,000x the space.

Unity seems more like a graphics engine and my code expresses a simple, abstract, non-trivial combinatorial game.

### StackOverflow

#### An example on how to perform nearest neighbor search after dimensionality reduction [on hold]

I am following Matlab's implementation of probabilistic principal component analysis (PPCA) for dimensionality reduction. It seeks to relate a p-dimensional observation vector y to a corresponding k-dimensional vector of latent (unobserved) variables x, which is normal with mean zero and unit covariance. The relationship is

$y_n = W x_n + \mu + \varepsilon,$

where $y_n$ is the row vector of the n-th observed sample of dimension p, and $x$ is the latent/unobserved variable of dimension k (p >> k). It is not clear to me how to perform nearest neighbor search after doing dimensionality reduction using PPCA. It would be of immense help if an example were provided for any data set.
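The mechanics can be sketched as follows, using ordinary PCA from scikit-learn as a stand-in for Matlab's ppca (the data here is synthetic): fit the model once, transform both the dataset and the query points into the k-dimensional score space with the same fitted model, and run nearest-neighbor search there.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
Y = rng.normal(size=(500, 20))                     # 500 observations, p = 20
queries = Y[:5] + 0.01 * rng.normal(size=(5, 20))  # near-copies of rows 0..4

pca = PCA(n_components=3)            # k = 3 latent dimensions
X = pca.fit_transform(Y)             # k-dim scores of the training data
Xq = pca.transform(queries)          # project queries with the SAME fit

nn = NearestNeighbors(n_neighbors=1).fit(X)
dist, idx = nn.kneighbors(Xq)        # idx[i, 0] = nearest training row
```

With PPCA specifically, the analogous step is to compute the posterior mean of x given y (Matlab's ppca returns the needed W, mu, and noise variance) and search in that latent space; the key point in either case is that queries must be projected with the model fitted on the training data, never refitted.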

#### Unit testing function that calls other function

Say I have the following two functions:

add_five (number) -> number + 2



As you can see, add_five has a bug.

If I now test add_six, it fails because the result is incorrect, even though add_six's own code is correct.

Imagine you have a large tree of functions calling each other, it would be hard to find out which function contains the bug, because all the functions will fail (and not only the one with the bug).

So my question is: should unit tests fail because of incorrect behaviour (wrong results), or only because of incorrect code in the function under test (bugs)?
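One way to frame the trade-off: behaviour-style tests propagate the bug to every caller, while unit-style tests that stub dependencies localize it. A Python sketch (the body of add_six is an assumption, since the post only shows add_five):

```python
from unittest import mock

def add_five(number):
    return number + 2            # the bug from the question

def add_six(number):
    return add_five(number) + 1  # correct, but depends on add_five

# Behaviour-style check: fails here even though add_six's code is fine.
behaviour_ok = (add_six(1) == 7)

# Unit-style check: stub the dependency so only add_six's logic is tested.
with mock.patch(__name__ + ".add_five", side_effect=lambda n: n + 5):
    unit_ok = (add_six(1) == 7)
```

Under the unit-style framing, only add_five's own test fails, which pinpoints the bug in a large call tree at the cost of extra stubbing work.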

### CompsciOverflow

#### Scheduling a sequence of queue operations that push and pop items at specified times

What is the time complexity of the following problem?

## Definitions

A FIFO is a queue functional unit supporting four commands: PUSH (data to back of queue), POP (the head of the queue), PNP (POP the head of queue and PUSH it to the back), NOP (do nothing). Each command takes one unit of time to execute.

FIFO code (or a schedule of commands) is a sequence of commands to execute.

## Problem Description

We are given $n$ items of data $T_1,\dots,T_n$, and $n$ triplets $(T_1,t^{in}_1,t^{out}_1),\dots,(T_n,t^{in}_n,t^{out}_n)$. $t^{in}_i$ and $t^{out}_i$ identify the time when $T_i$ is PUSHed and POPed respectively. We're guaranteed that $t^{in}_i<t^{out}_i$ for every $i$ and $t^{in}_i,t^{out}_i$ are all unique.

The goal is to produce FIFO code (a schedule of commands) that pushes each $T_i$ at time $t^{in}_i$ and pops it at time $t^{out}_i$, by adding NOP and PNP commands between the given PUSH and POP commands. No extra PUSH or POP commands can be added: the resulting code must contain exactly $n$ PUSHes and $n$ POPs.

## Example

Input: $(T_1,2,4)$, $(T_2,1,5)$

Solution:

1. PUSH $T_2$
2. PUSH $T_1$
3. PNP
4. POP // T_1
5. POP // T_2
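The schedule above can be checked mechanically. A small simulator (my own sketch, not part of the problem statement), where the i-th command executes at time i:

```python
from collections import deque

def run(schedule):
    """Execute FIFO commands; the i-th command runs at time i (1-based).
    Returns {item: (push_time, pop_time)} for verification."""
    q, times = deque(), {}
    for t, cmd in enumerate(schedule, start=1):
        if cmd[0] == "PUSH":
            q.append(cmd[1])
            times[cmd[1]] = [t, None]
        elif cmd[0] == "POP":
            times[q.popleft()][1] = t
        elif cmd[0] == "PNP":
            q.append(q.popleft())   # pop the head, push it to the back
        # "NOP" does nothing
    return {item: tuple(ts) for item, ts in times.items()}

# The example: (T_1, 2, 4), (T_2, 1, 5)
schedule = [("PUSH", "T2"), ("PUSH", "T1"), ("PNP",), ("POP",), ("POP",)]
```

Running it reproduces the required times: T2 is pushed at time 1 and popped at 5, T1 pushed at 2 and popped at 4, with the PNP at time 3 rotating T2 behind T1.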

### QuantOverflow

#### Portfolio risk analysis in Options & Mixed portfolios

I am currently working on a risk analysis model that is primarily focused on options portfolios, but will likely be later expanded to cover mixed (options, stocks, bond, futures, etc...) portfolios. This will be used at a non-professional but advanced level to identify overweighted risks and show how proposed positions would affect the portfolio risk balance.

The goal is to be able to clearly show risks in a number of scenarios: market moves up/down, a correction down (with an IV shock), individual symbol shocks, etc.

I want to be able to show the effect of risks to Portfolio performance and also to the greeks and the resulting risk profile.

The basic portfolio analysis methods such as beta weighting and VaR models seem to be very limited and don't have any concept of IV change or the effects of volatility shocks. I could mix some different models, but I still need the basic underlying models to do that.

Could anyone offer some suggestions for a risk modeling framework or even specific analysis techniques that could be used in simulations to get the results I need? At this point, I am searching but finding little that directly applies. Some guidance would be very welcome.
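For the scenario part specifically, one simple building block is to revalue each position over a grid of spot and implied-vol shocks. The sketch below uses plain Black-Scholes as a stand-in pricer with made-up parameters; a real framework would swap in its own pricer per instrument and sum the grids across positions.

```python
from math import log, sqrt, exp, erf

def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(S, K, T, r, sigma):
    """Plain Black-Scholes call price (stand-in for a real pricer)."""
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)

def scenario_pnl(S0, K, T, r, sigma0, spot_shocks, vol_shocks):
    """P&L grid of one long call over spot-shock x IV-shock scenarios."""
    base = bs_call(S0, K, T, r, sigma0)
    return [[bs_call(S0 * (1 + ds), K, T, r, sigma0 + dv) - base
             for dv in vol_shocks]
            for ds in spot_shocks]

pnl = scenario_pnl(S0=100, K=100, T=0.25, r=0.01, sigma0=0.20,
                   spot_shocks=[-0.10, 0.0, 0.10],  # market down/flat/up
                   vol_shocks=[-0.05, 0.0, 0.10])   # IV crush/flat/shock
```

A "correction with IV shock" is then just the grid cell with a negative spot shock and a positive vol shock, which is exactly the joint effect that pure beta weighting and plain VaR miss.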

### infra-talk

#### Puppet Lint Plugins – 2.0 Upgrade and new repo

After the recent puppet-lint 2.0 release and the success of our puppet-lint 2.0 upgrade at work it felt like the right moment to claw some time back and update my own (11!) puppet-lint plugins to allow them to run on either puppet-lint 1 or 2. I’ve now completed this and pushed new versions of the gems to rubygems so if you’ve been waiting for version 2 compatible gems please feel free to test away.

Now that I’ve realised exactly how many plugins I’ve ended up with, I’ve created a new GitHub repo, unixdaemon-puppet-lint-plugins, that will serve as a nicer discovery point for all of my plugins and a basic introduction to what they do. It’s quite bare-bones at the moment, but it’s already a nicer approach than clicking around my GitHub profile looking for matching repo names.

### Planet Emacsen

#### Irreal: Capturing BibTeX Entries with Google Scholar

Brad Collins has a nice post on collecting BibTeX citations. As he notes, there are plenty of articles on how to generate a citation in Org mode from a BibTeX entry but not on how to gather the entries to begin with.

He starts with a simple Org mode template to capture the citation once you retrieve it from Google Scholar. The idea is that you copy it from Google Scholar and paste it into the capture buffer. If you do this a lot, it would be pretty easy to write a bit of Elisp to automatically copy the citation, bring up the capture buffer, and paste the entry into it.

If you're using Firefox or Chrome you can make things easier by installing the Google Scholar button and then follow Collins' workflow. If you're on a Mac using Safari—or, I suppose, on Windows using one of the Microsoft browsers—his basic workflow still works. Just follow these steps:

1. Go to the Google Scholar page
2. Search for the paper you're interested in
3. Click the “Cite” link at the bottom of the article description
4. Choose BibTeX in the popup
5. A tab will open with the plain text citation in BibTeX format
6. Copy and paste the citation as described in Collins' post

If you're writing a lot of papers for school or for work, Collins' method is an easy way to build up your bibliography database. Even if gathering a citation is an occasional thing, knowing how to use Google Scholar to retrieve it is useful.

### StackOverflow

#### offset randomforestclassifier scikit learn

I wrote a program in python to use a machine learning algorithm to make predictions on data. I use the function RandomForestClassifier from Scikit Learn to create a random forest to make predictions.

The purpose of the program is to predict whether an unknown astrophysical source is a pulsar or an AGN. It trains the forest on known data, labelled as pulsar or AGN, and then makes predictions on unknown data, but it doesn’t work: the program predicts that the unknown sources are all pulsars or all AGNs, and on the rare occasions it predicts a mix, the result is still not correct.

Below I describe the passages of my program.

It creates a data frame, all_df, with the data for all the sources. It is made of ten columns: nine used as predictors and one as the target:

predictors=all_df[['spec_index','variab_index','flux_density','unc_ene_flux100','sign_curve','h_ratio_12','h_ratio_23','h_ratio_34','h_ratio_45']]
targets=all_df['type']


The type column contains the label “pulsar” or “agn” for each source.

The values of predictors and targets are used later in the program to train the forest.

The program divides the predictors and the targets into two sets, the training set (70% of all_df) and the test set (the remaining 30%), using the function train_test_split from Scikit Learn:

pred_train, pred_test, tar_train, tar_test=train_test_split(predictors, targets, test_size=0.3)


Data in these sets are shuffled, so the program resets the indexes of these sets without changing the data order:

pred_train=pred_train.reset_index(drop=True)
pred_test=pred_test.reset_index(drop=True)
tar_train=tar_train.reset_index(drop=True)
tar_test=tar_test.reset_index(drop=True)


After that, the program creates and trains the random forest:

clf=RandomForestClassifier(n_estimators=1000,oob_score=True,max_features=None,max_depth=None,criterion='gini')#,random_state=1)
clf=clf.fit(pred_train,tar_train)


Now the program makes predictions on the test set:

predictions=clf.predict(pred_test)


At this point, the program seems to work.

Now it passes another data frame, with the unknown data, to the forest created above, and I get the bad result described before. Can you help me? The problem could be an offset in RandomForestClassifier, but I got no significant improvement by modifying the RandomForestClassifier options. If needed, I can give further explanations. Thanks in advance.

Bye, Fabio

PS: I tried cross validation too: I divided the training set into train and test again, with the same proportions (0.7 and 0.3), to create, train and test the forest before testing it on the initial test set, modifying the RandomForestClassifier options to obtain better results, but I saw no improvement.
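Not a diagnosis of this particular dataset, but one common cause of "everything predicted as one class" is class imbalance between the pulsar and AGN labels. A sketch of a quick check, on synthetic data with an assumed 90/10 skew (the skew and feature values are made up for illustration):

```python
import numpy as np
from collections import Counter
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 9))                         # 9 predictors, as above
y = np.where(rng.random(1000) < 0.9, "agn", "pulsar")  # synthetic 90/10 skew

class_counts = Counter(y)   # first thing to inspect on the real data

# stratify keeps the pulsar/agn ratio identical in train and test;
# class_weight compensates for the imbalance during training.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=1)
clf = RandomForestClassifier(n_estimators=200, class_weight="balanced",
                             random_state=1).fit(X_tr, y_tr)
```

If the real label counts turn out to be heavily skewed, stratifying the split and setting class_weight are the usual first things to try before tuning other forest options.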

### CompsciOverflow

#### Is my implementation of a Disjoint Set fast?

When I'm reading about a new data structure I try to read just a little bit about it, and then implement it. Only then do I read how it is actually implemented, which I think gives me a better understanding.

Below is my implementation of a disjoint set (I think), and also some testing.

from random import shuffle

class DisjointSets(object):
    def __init__(self, size):
        self.set = list(range(size))

    def find(self, i):
        if self.set[i] != self.set[self.set[i]]:
            self.set[i] = self.find(self.set[i])
        return self.set[i]

    def union(self, i, j):
        self.set[self.find(i)] = self.find(j)

# Create disjoint sets
N = 2 ** 20
s = DisjointSets(N)

# Join all sets
scrambled = list(range(1, N))
shuffle(scrambled)

for i in scrambled:
    s.union(i, i - 1)

# Assert we have one big set
for i in range(N):
    assert s.find(i) == s.find(0)


After benchmarking, union and find seem to be O(1), at least amortized. (I set N to different values between 2^15 and 2^20.) The thing is that I am not using any rank heuristic, which I thought was required to get O(1) performance (ignoring the inverse Ackermann factor).

Any input?
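For comparison, the textbook variant adds union by rank on top of path compression (here with iterative path halving, which also avoids deep recursion). This is an independent sketch, not a rewrite of the code above; path compression alone already gives near-flat amortized behaviour, which is why benchmarks without ranks can still look O(1):

```python
class DSU:
    """Disjoint set with union by rank plus path halving."""
    def __init__(self, size):
        self.parent = list(range(size))
        self.rank = [0] * size          # upper bound on tree height

    def find(self, i):
        while self.parent[i] != i:
            self.parent[i] = self.parent[self.parent[i]]  # path halving
            i = self.parent[i]
        return i

    def union(self, i, j):
        ri, rj = self.find(i), self.find(j)
        if ri == rj:
            return
        if self.rank[ri] < self.rank[rj]:  # attach shorter under taller
            ri, rj = rj, ri
        self.parent[rj] = ri
        if self.rank[ri] == self.rank[rj]:
            self.rank[ri] += 1

d = DSU(8)
for a, b in [(0, 1), (1, 2), (4, 5), (5, 6)]:
    d.union(a, b)
```

Only the combination of both heuristics gives the inverse-Ackermann bound in the worst case; a benchmark on random unions is unlikely to distinguish it from compression alone.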

### StackOverflow

#### Machine Learning and Kalman filtering

I have a dataset with a large number of features and some thousands of observations. However, a number of observations are still missing and I want to estimate those missing observations by exploiting the pattern or periodicity of the available data. My questions are

1. Is it possible to apply Kalman filtering to estimate those missing values?
2. Is Kalman filtering used in anomaly detection problems?
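The answer to both questions is yes in principle. For (1), the standard trick is to skip the measurement update at missing time steps, letting the prediction carry through. A sketch with a scalar local-level model (all parameters and data are assumptions for illustration, not tuned to any real dataset):

```python
import numpy as np

def kalman_impute(y, q=1e-3, r=1e-1):
    """Filter a local-level model (x_t = x_{t-1} + w, y_t = x_t + v);
    NaNs skip the measurement update and are filled with the prediction."""
    filled = np.array(y, dtype=float)
    x, P = filled[~np.isnan(filled)][0], 1.0   # init from first observation
    for t in range(len(filled)):
        P = P + q                              # predict step
        if np.isnan(filled[t]):
            filled[t] = x                      # impute with the prediction
        else:
            K = P / (P + r)                    # Kalman gain
            x = x + K * (filled[t] - x)        # measurement update
            P = (1.0 - K) * P
    return filled

y = [1.0, 1.1, np.nan, np.nan, 1.3, 1.2]
est = kalman_impute(y)
```

For (2), the same machinery supports anomaly detection: flag observations whose innovation (observed value minus prediction) is large relative to its predicted variance P + r.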

Lastly, if I have unlabelled data, how can I train the classifier? My understanding is that for supervised machine learning, example data with labels is available; from that example dataset we train the classifier and use it to classify unseen observations and detect outliers. I have one dataset that doesn't have outliers, but it has no labels. There is another dataset with the same features that does contain outliers, and it also contains a much larger number of observations than the uncorrupted dataset. How should I tackle the outlier detection problem in this case?

### StackOverflow

#### How to display the premise and consequence when the setCar is set to true?

I want to get the premise and consequence for each line of the generated rules after running the Apriori algorithm in Weka 3.8.0.

apriori.setNumRules(NUMBER_OF_RULES);
apriori.setMinMetric(MINIMUM_CONFIDENCE);
apriori.setLowerBoundMinSupport(MINIMUM_SUPPORT);

apriori.setCar(true);

apriori.buildAssociations(instances);


I tried the code below to get the rules but it gives me an exception (weka.associations.ItemSet cannot be cast to weka.associations.AprioriItemSet):

AssociationRules arules = apriori.getAssociationRules();


Also, I tried using the getAllTheRules() method but it gives me a different result.

ArrayList<Object>[] arules = apriori.getAllTheRules();
System.out.println(((ItemSet) arules[0].get(1)).getRevision()); // 12014
System.out.println(((ItemSet) arules[0].get(2)).getRevision()); // 12014
System.out.println(((ItemSet) arules[0].get(5)).getRevision()); // 12014


### CompsciOverflow

#### computer networks and graph theory [on hold]

What is the fastest way to find the distance between two nodes if the intermediate nodes change at runtime?

### StackOverflow

#### Getting only NaN when using Tensorflow's C++ API

So I have this function in my code:

tensorflow::Tensor predictV1(tensorflow::Session* sess, tensorflow::Tensor X)
{
    assert(X.dim_size(1) == 480);
    assert(X.dim_size(2) == 640);
    assert(X.dim_size(3) == 3);
    tensorflow::Tensor keep_prob(tensorflow::DT_FLOAT, tensorflow::TensorShape());
    keep_prob.scalar<float>()() = 1.0;

    std::cout << "Created keep_prob" << std::endl;
    std::vector<std::pair<std::string, tensorflow::Tensor>> inputs = {
        { "x", X },
        { "keep_prob", keep_prob },
    };
    std::cout << "Created the input vector" << std::endl;
    std::vector<tensorflow::Tensor> outputs;
    // Run the session, evaluating our "y" operation from the graph
    tensorflow::Status status = sess->Run(inputs, {"y"}, {"y"}, &outputs);
    if (!status.ok()) {
        std::cout << status.ToString() << "\n";
        return X;
    }
    tensorflow::Tensor Y = outputs[0];
    auto T_M = Y.tensor<float, 4>();
    std::cout << T_M(0, 0, 0, 0) << std::endl;
    return Y;
}


Now, sess is created by the LoadGraph function in TF's C++ documentation here. So I create a frozen graph (using the freeze_graph script in TF) and feed it to LoadGraph, which in turn I feed to this function.

When I run the program, all values of Y are -nan. So I went back and tested it in Python, and the problem does not occur there. (Note: the graph is created by this script.)

So where does the error come from?

EDIT: I forgot the cnn_functions

### QuantOverflow

#### SEC 10-Q/K Filings

I am working on some research that requires parsing of SEC 10 K/Q filings. We have built a parser that will parse the raw txt SEC filing that usually contains many blocks of unencoded files (html, xml, pdfs, images, spreadsheets, etc). A typical decoded 10 K/Q (as of CY 2014) has a set of files that looks like the following:

Does anyone have any documentation or guidance that explains what the R1.htm - RX.htm files are supposed to contain, and more broadly any documentation that describes what is typically found in a decoded 10 K/Q? The SEC doesn't have any documentation at this level of granularity. (The reason may be that the submission exemplified above comes from a particular filing-prep vendor or software; however, this format seems to be the most pervasive as of CY2014.)

Thank you in advance for any guidance.

### StackOverflow

#### Functional Programming exercise with Scala

I have recently started reading the book Functional Programming in Scala by Paul Chiusano and Rúnar Bjarnason, as a means to learn FP. I want to learn it because it will open my head a bit, twist my way of thinking and also hopefully make me a better programmer overall, or so I hope.

In their book, Chp. 3, they define a basic singly-linked-list type as follows:

package fpinscala.datastructures

sealed trait List[+A]
case object Nil extends List[Nothing]
case class Cons[+A](head: A, tail: List[A]) extends List[A]

object List {
  def sum(ints: List[Int]): Int = ints match {
    case Nil => 0
    case Cons(x, xs) => x + sum(xs)
  }

  def product(ds: List[Double]): Double = ds match {
    case Nil => 1.0
    case Cons(0.0, _) => 0.0
    case Cons(x, xs) => x * product(xs)
  }

  def apply[A](as: A*): List[A] =
    if (as.isEmpty) Nil
    else Cons(as.head, apply(as.tail: _*))
}


I'm now working on implementing the tail method, which shall work similarly to the tail method defined in the Scala libraries. I guess that the idea here is to define a tail method inside the List object (its companion object), and then call it normally in another file (like a Main file).

So far, I have this:

def tail[A](ls: List[A]): List[A] = ls match {
  case Nil => Nil
  case Cons(x, xs) => xs
}


Then I created a Main file in another folder:

package fpinscala.datastructures

object Main {
  def main(args: Array[String]): Unit = {
    println("Hello, Scala !! ")
    val example = Cons(1, Cons(2, Cons(3, Nil)))
    val example2 = List(1, 2, 3)
    val example3 = Nil
    val total = List.tail(example)
    val total2 = List.tail(example3)
    println(total)
  }
}


This works and gives me:

Hello, Scala !!
Cons(2,Cons(3,Nil))


My question is:

Is this the correct way to write the tail method, as the authors possibly intended? And is this package structure correct? It feels very wrong to me, although I just followed the authors' package layout.

I also don't know if I should have used a specific type instead of writing a polymorphic method (is this the name?)...

Bear with me, for I am a newbie in the art of FP.

### QuantOverflow

#### Why does one get a delta greater than 1 when using the likelihood estimator?

I'm using the likelihood estimator as derived in this pdf to calculate the delta of a European call option. However, I consistently get a delta exceeding one. I'm taking 10,000 samples of Z to calculate the delta.

Thanks
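Without seeing the code, one relevant observation is that this estimator has high variance, so a point estimate slightly above one can be plain Monte Carlo noise for an in-the-money option. For reference, a sketch of the standard likelihood-ratio delta under Black-Scholes dynamics (parameters are made up; the true delta for these values is about 0.64):

```python
import numpy as np

def lr_delta(S0, K, T, r, sigma, n, seed=0):
    """Likelihood-ratio estimator of a European call delta:
    E[ e^{-rT} * max(S_T - K, 0) * Z / (S0 * sigma * sqrt(T)) ],
    with S_T = S0 * exp((r - sigma^2/2) T + sigma sqrt(T) Z)."""
    rng = np.random.default_rng(seed)
    Z = rng.standard_normal(n)
    ST = S0 * np.exp((r - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * Z)
    samples = (np.exp(-r * T) * np.maximum(ST - K, 0.0)
               * Z / (S0 * sigma * np.sqrt(T)))
    # Return the estimate and its standard error.
    return samples.mean(), samples.std(ddof=1) / np.sqrt(n)

est, se = lr_delta(S0=100.0, K=100.0, T=1.0, r=0.05, sigma=0.2, n=10_000)
```

If the estimate sits many standard errors above one, common culprits include dropping the 1/(S0·σ·√T) factor or multiplying by S_T instead of Z in the score term.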

#### Is there a specific meaning to the word "convoluted" in maths or mathematical finance?

I'm reading about copula estimation in the book Financial Modeling Under Non-Gaussian Distributions by Jondeau, Poon and Rockinger. They say that full maximum likelihood can be difficult because of i) dimensionality and because ii) "the copula parameter may be a "convoluted expression" of the margins parameter."

I've been looking for a translation of that word, but I only find difficult or complicated as synonyms of convoluted. I believe there is more to it than just "complicated". Is there maybe a more specific meaning for this word? Would you know what the authors mean with "convoluted expression" in this context?