# Planet Primates

## May 02, 2016

### TheoryOverflow

#### On classes $AWPP$ and $APP$

(1) $PP$ contains problems like "is $\mathrm{perm}(M) > k$?". So what problems do $AWPP$ and $APP$ contain with respect to the permanent?

(2) Since it is not known whether $NP$ is in $AWPP$ or $APP$, is there a candidate problem in $NP$ not in $AWPP$, other than $NP$-complete problems?

### QuantOverflow

#### Daycount Actual/Actual AFB example

This question is about the following example in Wikipedia on computing the time factor under the Actual/Actual AFB daycount.

Assume that the $t_1=\text{28 Feb 2004}$ and $t_2=\text{29 Feb 2008}$.

There are $4$ years in between. According to the rule, we define $t_3=\text{29 Feb 2004}$ and add the number of days between $t_1$ and $t_3$ divided by either $365$ or $366$ depending on whether there is a $\text{29Feb}$ between ...

Here is one of the points where I am unclear: should it be between $t_1$ and $t_2$, or from $t_1$ to $t_3$?

The other part I am unclear about is whether the interval should be considered open at the right endpoint. That is, whether the $\text{29 Feb}$ should be in the interval $[t_1,t_2)$ or $[t_1,t_2]$, for the case in which $t_2$ is the relevant endpoint.

In Wikipedia, the example currently obtains the same daycount: $4 + \frac{1}{366}$ regardless of whether the ISDA convention is applied or not.
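For concreteness, here is a minimal Python sketch of the AFB rule as I read it (count whole years back from $t_2$, then divide the stub from $t_1$ to $t_3$ by 366 when a 29 Feb falls inside it); the function names and the closed-interval choice for the leap-day test are my own assumptions, not part of the convention text, and $t_1 \le t_2$ is assumed:

```python
from datetime import date

def is_leap(y):
    return y % 4 == 0 and (y % 100 != 0 or y % 400 == 0)

def minus_years(d, n):
    # Step back n calendar years; a 29 Feb anchor falls back to 28 Feb
    # when the target year is not a leap year.
    y = d.year - n
    if d.month == 2 and d.day == 29 and not is_leap(y):
        return date(y, 2, 28)
    return date(y, d.month, d.day)

def year_fraction_afb(t1, t2):
    # Count whole years backward from t2 (assumes t1 <= t2).
    n = 0
    while minus_years(t2, n + 1) >= t1:
        n += 1
    t3 = minus_years(t2, n)
    # Stub period runs from t1 to t3; use 366 iff a 29 Feb lies in [t1, t3].
    denom = 366 if any(
        is_leap(y) and t1 <= date(y, 2, 29) <= t3
        for y in range(t1.year, t3.year + 1)
    ) else 365
    return n + (t3 - t1).days / denom
```

With $t_1=\text{28 Feb 2004}$ and $t_2=\text{29 Feb 2008}$ this gives $4$ whole years, $t_3=\text{29 Feb 2004}$, a one-day stub containing a 29 Feb, and hence $4 + \frac{1}{366}$, matching the Wikipedia figure.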

### CompsciOverflow

#### Trivial pathfinding problems

Are there any trivial pathfinding problems like the TSP?
I know that it is possible to solve the TSP using A*, best-first search, spanning trees, and other things as heuristics, but are there other problems well known to the AI community?
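As a small illustration of the A* approach mentioned above, here is a generic sketch (a toy setup of my own, not a standard benchmark) finding a shortest path on a 5x5 grid with a Manhattan-distance heuristic:

```python
import heapq

def astar(start, goal, neighbors, h):
    """Generic A*: returns the shortest path cost, assuming h is admissible."""
    open_heap = [(h(start), 0, start)]  # entries are (f = g + h, g, node)
    best = {start: 0}
    while open_heap:
        f, g, node = heapq.heappop(open_heap)
        if node == goal:
            return g
        if g > best.get(node, float("inf")):
            continue  # stale entry
        for nxt, cost in neighbors(node):
            ng = g + cost
            if ng < best.get(nxt, float("inf")):
                best[nxt] = ng
                heapq.heappush(open_heap, (ng + h(nxt), ng, nxt))
    return None

# 4-connected 5x5 grid with unit step cost
def grid_neighbors(p):
    x, y = p
    return [((x + dx, y + dy), 1)
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
            if 0 <= x + dx < 5 and 0 <= y + dy < 5]

cost = astar((0, 0), (4, 4), grid_neighbors,
             lambda p: abs(p[0] - 4) + abs(p[1] - 4))
```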

#### How to calculate Huffman code length without generating the actual Huffman code?

Given a vector of elements, how can I calculate the length of the Huffman codewords without generating the Huffman code itself?

Using Matlab, I was able to compute the Huffman code and get the length of the codewords, but is it possible to get the lengths alone without computing the codewords?
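One way to do this (a sketch in plain Python rather than Matlab, with hypothetical names) is to simulate the Huffman merges while tracking only each symbol's depth, never materializing the codewords:

```python
import heapq

def huffman_code_lengths(freqs):
    """Return codeword lengths without constructing the actual codes.

    Simulates Huffman tree construction: each time two subtrees are
    merged, every symbol inside them moves one level deeper.
    """
    if len(freqs) == 1:
        return [1]  # convention: a lone symbol gets a 1-bit code
    depths = [0] * len(freqs)
    # heap entries: (total weight, unique tie-breaker, symbol indices in subtree)
    heap = [(w, i, [i]) for i, w in enumerate(freqs)]
    heapq.heapify(heap)
    counter = len(freqs)
    while len(heap) > 1:
        w1, _, s1 = heapq.heappop(heap)
        w2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:
            depths[s] += 1   # everything in the merged subtree deepens by one
        heapq.heappush(heap, (w1 + w2, counter, s1 + s2))
        counter += 1
    return depths
```

This is quadratic in the worst case because of the index-list copying, but the lengths it returns are exactly those of a valid Huffman code for the given frequencies.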

### StackOverflow

#### Erratic behavior of train_test_split() in scikit-learn

Python 3.5 (Anaconda install), scikit-learn 0.17.1

I just can't understand why train_test_split() has been giving me what I consider unreliable splits of a list of training cases.

Here's an example. My list trnImgPaths has 3 classes, each one with 67 images (total 201 images):

    ['/Caltech101/ferry/image_0001.jpg',
     ... thru ...
     '/Caltech101/ferry/image_0067.jpg',
     '/Caltech101/laptop/image_0001.jpg',
     ... thru ...
     '/Caltech101/laptop/image_0067.jpg',
     '/Caltech101/airplane/image_0001.jpg',
     ... thru ...
     '/Caltech101/airplane/image_0067.jpg']


My list of targets trnImgTargets matches this in length, and the classes themselves align perfectly with trnImgPaths.

    In[148]: len(trnImgPaths)
    Out[148]: 201
    In[149]: len(trnImgTargets)
    Out[149]: 201


If I run:

    [trnImgs, testImgs, trnTargets, testTargets] = \
        train_test_split(trnImgPaths, trnImgTargets, test_size=141, train_size=60, random_state=42)


or

    [trnImgs, testImgs, trnTargets, testTargets] = \
        train_test_split(trnImgPaths, trnImgTargets, test_size=0.7, train_size=0.3, random_state=42)


or

    [trnImgs, testImgs, trnTargets, testTargets] = \
        train_test_split(trnImgPaths, trnImgTargets, test_size=0.7, train_size=0.3)


I end up getting:

    In[150]: len(trnImgs)
    Out[150]: 60
    In[151]: len(testImgs)
    Out[151]: 141
    In[152]: len(trnTargets)
    Out[152]: 60
    In[153]: len(testTargets)
    Out[153]: 141


I never get a perfect 20-20-20 split for the training set. I can tell both by manual checking and by a sanity check with the confusion matrix. Here are the results for each experiment above, respectively:

    [[19  0  0]
     [ 0 21  0]
     [ 0  0 20]]

    [[19  0  0]
     [ 0 21  0]
     [ 0  0 20]]

    [[16  0  0]
     [ 0 22  0]
     [ 0  0 22]]


I expected the split to be perfectly balanced. Any thoughts on why this is happening?

It even appears it may be misclassifying a few cases a priori, because there should never be n = 22 training cases for a given class.
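The behavior described is consistent with plain uniform sampling rather than per-class sampling. A hypothetical pure-Python simulation (not scikit-learn itself) shows why a 60-case sample is rarely exactly 20/20/20 unless the sampling is done within each class:

```python
import random

random.seed(42)
labels = ['ferry'] * 67 + ['laptop'] * 67 + ['airplane'] * 67  # 201 cases

# A plain (non-stratified) split just samples 60 indices uniformly:
train_idx = random.sample(range(len(labels)), 60)
counts = {}
for i in train_idx:
    counts[labels[i]] = counts.get(labels[i], 0) + 1
# per-class counts generally differ from 20/20/20

# A stratified split samples within each class, guaranteeing balance:
by_class = {}
for i, y in enumerate(labels):
    by_class.setdefault(y, []).append(i)
strat_idx = [i for idxs in by_class.values() for i in random.sample(idxs, 20)]
strat_counts = {}
for i in strat_idx:
    strat_counts[labels[i]] = strat_counts.get(labels[i], 0) + 1
```

The second half mimics what passing per-class quotas (stratification) would do; with uniform sampling, per-class deviations of one or two cases, as in the confusion matrices above, are entirely expected.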

### CompsciOverflow

#### On power of $P/poly$

(1) We know that $EXP \not\subseteq P/poly \Rightarrow BPP \subseteq SUBEXP$. Does $SUBEXP \not\subseteq P/poly$ mean $P=BPP$ or anything close?

(2) We know that if $NP$ is in $P/poly$ then $PH$ collapses to the second level. What is the consequence if $\oplus P$ is in $P/poly$?

### TheoryOverflow

#### Deep Neural Network

Every deep network is typically trained to carry out a particular task (sentence prediction, image classification, or speech recognition, for example). However, a single human brain is capable of performing all these tasks simultaneously. Describe how you would go about training a "Jack-of-all-trades" network that can do many tasks reasonably well. What kinds of issues do you expect while training such a model? Describe what regularization methods, weight-decay methods, and loss functions you would consider in your approach.
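One common starting point for such a network is hard parameter sharing: a trunk shared by all tasks plus small task-specific output heads, trained on alternating mini-batches from each task. A minimal NumPy shape sketch (all sizes, names, and tasks here are made up for illustration, not from the question):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hard parameter sharing: one trunk shared by every task,
# plus a small task-specific output head per task.
W_shared = rng.normal(scale=0.1, size=(16, 8))
heads = {
    "classify": rng.normal(scale=0.1, size=(8, 3)),  # e.g. a 3-way classifier
    "regress": rng.normal(scale=0.1, size=(8, 1)),   # e.g. a scalar prediction
}

def forward(x, task):
    h = np.tanh(x @ W_shared)   # representation shared across all tasks
    return h @ heads[task]      # task-specific head

# Training would alternate mini-batches across tasks, each step updating
# W_shared plus only the active task's head, possibly with per-task loss
# weights to keep one task from dominating the shared trunk.
x = rng.normal(size=(4, 16))
out_c = forward(x, "classify")
out_r = forward(x, "regress")
```

The usual issues this design surfaces are exactly the ones the question hints at: negative transfer between tasks, imbalance in per-task gradients (hence loss weighting), and the need for regularization such as weight decay on the shared trunk.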

### StackOverflow

#### decision tree for significant variables

How can I use a decision tree graph to determine the significant variables? I know that the variable with the largest information gain should be at the root of the tree, which means it has small entropy. So, given my graph, if I want to know which variables are significant, how do I interpret it?

Thanks for the help.
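As a quick check on the "largest information gain at the root" intuition, here is a small self-contained Python sketch (toy data and hypothetical feature names of my own) that ranks features by information gain:

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(labels, feature):
    # gain = H(labels) - weighted H(labels | feature value)
    n = len(labels)
    remainder = 0.0
    for v in set(feature):
        subset = [y for y, f in zip(labels, feature) if f == v]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder

# toy data: "shape" perfectly predicts the label, "color" is uninformative
labels = [0, 0, 1, 1]
features = {"shape": ["a", "a", "b", "b"], "color": ["r", "g", "r", "g"]}
ranked = sorted(features, key=lambda f: information_gain(labels, features[f]),
                reverse=True)
# the most significant variable comes first in `ranked`
```

In a fitted tree this is what the split order reflects: variables chosen near the root reduced entropy the most, so reading significance off the graph amounts to reading splits top-down.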

### CompsciOverflow

#### Equivalence of data-flow analysis, abstract interpretation and type inference?

@Babou's answer to a recent question reminds me that at one time I think I read a paper about the equivalence (in terms both of the facts that can be inferred or proved and the time complexity of running the inference algorithm) of data-flow analysis, abstract interpretation, and type inference.

In some sub-cases (like between forward context-sensitive interprocedural data-flow analysis and abstract interpretation) the equivalence is relatively obvious to me, but the question seems more subtle for other comparisons. For example, I can't figure out how Hindley-Milner type inference could be used to prove some of the properties that can be proved with flow-sensitive data-flow analysis.

What are the seminal references discussing the equivalences (or differences) between data-flow analysis, abstract interpretation and type inference?

#### How can I explain to my parents that I study programming languages? (soft question)

(the title is oversimplified, but hey, it is just a title)

I am currently finishing my MSc in computer science. I am interested in programming languages, especially in type systems. I got interested in research in this field and next semester I will start a PhD on the subject.

Now here is the real question: how can I explain what I (want to) do to people with no previous knowledge in either computer science or related fields?

The title comes from the fact that I am not even able to explain what I do to my parents, friends, and so on. Yeah, I can say "the whole point is to help software developers write better software", but I do not think it is really useful: they are not aware of "programming"; they have no clue what it means. It feels like I am saying I am an auto mechanic to someone from the Middle Ages: they simply do not know what I am talking about, let alone how to improve it.

Does anyone have good analogies with the real world? Enlightening examples causing "a-ha" moments? Should I actually show a short and simple snippet of code to a 60+ year-old with no computer science (or academic) experience? If so, which language should I use? Did anyone here face similar issues?

I know it is a broad question, maybe border-line with the policy of this site, but I think that some good, straight-to-the-point answers could come. If you know how to rephrase the question to better fit the policies, please edit or leave a comment.

(Is there a soft-question tag? Maybe its absence means this is not really a good question for this site...)

### StackOverflow

#### Machine learning library support for Spark Streaming

I am working on a Spark Streaming project where machine learning needs to be implemented. I have seen that a couple of algorithms are present in Spark, but mostly for batch processing only. For streaming, only streaming linear regression and streaming k-means are present. Is that true? Can I use other machine learning algorithms (e.g., random forest) in streaming?

#### LSTM network learning

I have attempted to program my own LSTM (long short-term memory) neural network. I would like to verify that the basic functionality is working. I have implemented a backpropagation through time (BPTT) algorithm to train a single-cell network.

Should a single-cell LSTM network be able to learn a simple sequence, or is more than one cell necessary? The network does not seem to be able to learn a simple sequence such as 1 0 0 0 1 0 0 0 1 0 0 0 1.

I am sending the sequence of 1s and 0s one by one, in order, into the network, and feeding it forward. I record each output for the sequence.

After running the whole sequence through the LSTM cell, I feed the mean error signals back into the cell, saving the weight changes internal to the cell in a separate collection. After running all the errors through one by one and calculating the new weights after each error, I average the new weights together to get the new weight, for each weight in the cell.

Am I doing something wrong? I would very much appreciate any advice.

Thank you so much!
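For reference when checking basic functionality, a single-cell forward pass can be sketched like this in NumPy (random weights and my own naming, not the poster's code; biases are omitted for brevity):

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

n_in, n_h = 1, 1   # one input feature, a single cell
# one weight matrix per gate, acting on [x, h_prev] concatenated
Wf, Wi, Wo, Wg = (rng.normal(scale=0.5, size=(n_in + n_h, n_h))
                  for _ in range(4))

def step(x, h, c):
    z = np.concatenate([x, h])
    f = sigmoid(z @ Wf)   # forget gate
    i = sigmoid(z @ Wi)   # input gate
    o = sigmoid(z @ Wo)   # output gate
    g = np.tanh(z @ Wg)   # candidate cell update
    c = f * c + i * g
    h = o * np.tanh(c)
    return h, c

seq = [1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1]
h, c = np.zeros(n_h), np.zeros(n_h)
outputs = []
for x in seq:
    h, c = step(np.array([float(x)]), h, c)
    outputs.append(float(h[0]))   # record each output, as in the question
```

Comparing each gate activation against a hand-computed step like this is a common way to sanity-check BPTT gradients before worrying about whether one cell has enough capacity for the period-4 pattern.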

### CompsciOverflow

#### Why is determining if there is a solution to a Battleship puzzle NP-Complete?

This paper http://www.mountainvistasoft.com/docs/BattleshipsAsDecidabilityProblem.pdf says that the decision problem, "Given a particular puzzle, is there a solution?" is NP-Complete. I don't understand why this can't be done in polynomial time. Given constraints that no two ships can be orthogonally or diagonally adjacent, why not just create a grid where there are 2 times as many columns as "bins" with enough rows to put a "separator" run in between every ship. I've seen the reduction demonstrated this way and it seems like it could be done in polynomial time.

#### Backward vs Forward Data-flow Analysis

I understand how both forward and backward data-flow analysis work, but in what situations would we use each? Why do we need to be able to do it both ways? Do compilers of certain types perform one way more efficiently than the other?
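A concrete way to see the difference: liveness is the classic backward problem (a use propagates against the control flow), while reaching definitions is forward. Below is a small worklist-style sketch of backward liveness over a toy three-statement CFG (the encoding is my own):

```python
def liveness(blocks, succ):
    """Backward data-flow to a fixed point:
    live_in[b]  = use[b] | (live_out[b] - def[b])
    live_out[b] = union of live_in over b's successors
    """
    live_in = {b: set() for b in blocks}
    live_out = {b: set() for b in blocks}
    changed = True
    while changed:
        changed = False
        for b in blocks:
            out = set().union(*(live_in[s] for s in succ[b])) if succ[b] else set()
            use, defs = blocks[b]
            inn = use | (out - defs)
            if inn != live_in[b] or out != live_out[b]:
                live_in[b], live_out[b] = inn, out
                changed = True
    return live_in, live_out

# 1: a = 0      2: b = a + 1      3: print(b)
blocks = {1: (set(), {"a"}), 2: ({"a"}, {"b"}), 3: ({"b"}, set())}
succ = {1: [2], 2: [3], 3: []}
live_in, live_out = liveness(blocks, succ)
```

The same worklist skeleton run forward (propagating along `succ` instead of against it, with gen/kill sets for definitions) gives reaching definitions, which is why compilers need both directions: the property dictates the direction.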

### QuantOverflow

#### Approximating an fBm stochastic integral

Suppose I have the following stochastic integral:

$$\int_a^b f(t)dB_H(t)$$

with $dB_H(t)$ the increment of a fractional Brownian motion with Hurst parameter $H$.

Is it true that for $H \in (1/2,1)$, we have the following result?

$$\int_a^b f(t)\,dB_H(t) := \lim_{\Delta t_k \rightarrow 0} \sum_k f(t_k)\,[B_H(t_{k+1})-B_H(t_k)]$$

### StackOverflow

#### Regression prediction with Azure

I have a date column and a "value" column (with values 1 or 0). I want to predict the value column given a date (when it will be 1 and when it will be 0).

I have created a regression model, and obtained the same scored label for every input: 0.698689.

I have created the predictive model, and added the project column as you said, so that it does not ask me for the value input. Now it asks me for the date, and I obtain the scored label, but the value is not obtained. I understand that the scored label is the accuracy, but not the predicted value.

Could you help me?

Thanks

### StackOverflow

#### Microsoft Tay -like learning algorithm?

What kind of algorithm did a system like Microsoft Tay use to "learn new knowledge"?

I just started studying machine learning (ML) and natural language processing (NLP), and they seem to be all about classifications, finding relations, calculating similarities, but nothing about actual understanding/meaning.

I know that "understanding" a complex topic, so my question is how an autonomous system like Tay was able to respond to user statements and keep a dialog?

By using naive Bayes classification, for example, I can find the document most related to the user's statement, then inspect each paragraph, find the one most related, and present it to the user. This would be a super-primitive QA system, but it is infinitely far from a conversation.

Tools like word2vec also help in finding the best matches and most related documents, but this is also not a conversation.

I imagine that there might be a combination of state machines, maybe some logic programming with Prolog (to keep some structured knowledge), some NLP filtering and classification to process user statements, and some other strategies to promote some intention or objective to keep the conversation wheel spinning.

(I know that Tay algorithms are closed, but is there any open-source arrangement or paper I could look for?)

### UnixOverflow

#### Restart specific racoon tunnel

I have several gif* interfaces on my FreeBSD box. They are representing tunnels, encrypted using racoon+ipsec. If, at some moment, one of the tunnels hangs up, I am forced to reset racoon this way:

    /usr/local/etc/rc.d/racoon restart


But in that case all tunnels are reset, which leads to a short absence of connectivity on all my tunnels (3-5 seconds, but nevertheless).

Is there any method to reset one specific gif tunnel, while not touching any other tunnels?

### TheoryOverflow

#### Minmax vs Maxmin

I'm reading this paper about building a combat simulator for 8-unit vs 8-unit mini combats in StarCraft: Brood War. The basic idea is to build a search tree simulating these small combats in order to determine the next best move in combat scenarios in StarCraft game play. Section 3.2 (which is probably necessary to understand my question) is the part I am having trouble with, where he talks about approximating a combat (where simultaneous moves occur) with a version where alternating moves occur. This allows him to use minmax trees and alpha-beta search.

The part I don't understand is when he describes how building the minmax tree gives an advantage to one player, while a maxmin tree gives an advantage to the other player. For one, his diagram labels the tree where "max" goes first as "maxmin", but his text (and Wikipedia :P) describes the tree where max goes first as "minmax". Perhaps that is just a mislabelling.

The main part I am confused about is when he goes on to say:

Proposition 1: For stacked matrix games G, we have $$mini(G) \leq Nash(G) \leq maxi(G)$$

My understanding of this is that $Nash(G)$ is the real final game state taking into account simultaneous moves. Then $mini(G)$ is the final game state if we approximate with MAX moving first, and $maxi(G)$ is the final game state if we approximate with MIN moving first. Besides the Nash equilibrium stuff, which is a bit beyond my education so far, I don't understand how the inequality $$mini(G) \leq maxi(G)$$ can be true. Below are 4 examples. The leaf node values come, I believe, from an evaluation function of that game state's value to MAX:

The two on the left are maxmin, meaning MIN goes first. The two on the right are minmax, with MAX going first. The top pair contains one set of leaf node values that lead to $$5=maxi(G) \lt mini(G)=8$$ This goes against the proposition above. However, the bottom pair contains a different set of leaf node values that lead to $$4=maxi(G) \gt mini(G)=3$$ Here the proposition seems to hold. I believe I could also construct a scenario where $mini(G) = maxi(G)$.

What am I missing here? Is there really a hard relation between $mini(G)$ and $maxi(G)$ or doesn't it just depend on the leaf node values?
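One relevant standard fact (stated here as background, not as a resolution of the specific paper): for a single fixed matrix of leaf values, the player who moves second can react, so the first mover's guaranteed value satisfies $\max_i \min_j A_{ij} \le \min_j \max_i A_{ij}$, with the value of the simultaneous (Nash) game in between. A tiny Python check on one shared matrix (values chosen arbitrarily):

```python
# Payoffs to MAX. If MAX commits first and MIN reacts, MAX guarantees
# max_i min_j A[i][j]; if MIN commits first, the outcome is
# min_j max_i A[i][j]. For the SAME matrix, maximin <= minimax always.
payoff = [[5, 1], [2, 8]]   # arbitrary example values

maximin = max(min(row) for row in payoff)                    # MAX moves first
minimax = min(max(payoff[i][j] for i in range(len(payoff)))  # MIN moves first
              for j in range(len(payoff[0])))

assert maximin <= minimax   # holds for every payoff matrix
```

So an apparent violation like $maxi(G) < mini(G)$ can only arise if the two trees being compared are evaluated over different sets of leaf values, which may be worth checking in the figure's examples.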

### QuantOverflow

#### Are the money market account (MMA) numeraire measure and the forward measure equivalent?

Suppose we have a risk-neutral measure $\tilde{\mathbb{P}}$. The money market account is given as $M(t) = e^{\int^t_0 R(s) ds}$, while the price of the zero-coupon bond at time $t$ that matures at $T$ is denoted $B(t,T)$.

So, the forward measure is defined to be the measure with $B(t,T)$ taken as the numeraire. However, I am curious whether taking $M(t)$ as the numeraire will also make the measure a forward measure. If this is not true in general, does it work when the interest rate is constant, $R(t) = r$? This would imply that $B(t,T) = e^{-r(T-t)}$, so that $B(0,T) = \frac{1}{M(T)}$ and $B(T,T) = \frac{1}{M(0)}$, which seems to suggest a connection between the two measures just by looking at the Radon-Nikodym derivative $Z$.

Also, I have an additional question about the usefulness of the forward measure. It seems that forward measures are useful in option pricing because we can take the discount out of the risk-neutral pricing formula, so that $V(t) = B(t,T)\, \tilde{\mathbb{E}}^F[V(T) \mid {\cal{F}}(t)]$. But are there any other advantages of using the forward measure?

### TheoryOverflow

#### Patience sort + ping-pong merge implementation

A recent paper out of Microsoft Research describes a new, faster implementation of the patience sort algorithm. A key part of the implementation is an improved merging strategy dubbed the "ping-pong" merge. I am confused as to why this merge strategy uses two arrays to perform the merging, instead of just using a single array and always performing a "blind merge" as described in the paper. It seems that always performing blind merges, and thus only using a single array to perform the merge, would cut down memory usage with no change in runtime.
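For context, a plain patience sort looks like the sketch below (a generic version of my own with a simple k-way merge of the piles, not the paper's ping-pong merge); the paper's contribution concerns precisely how that final merge stage is organized:

```python
import bisect
from heapq import merge

def patience_sort(seq):
    piles = []   # each pile is a stack whose top (last item) only decreases
    tops = []    # tops[i] == piles[i][-1], kept sorted so bisect works
    for x in seq:
        # leftmost pile whose top is >= x keeps each pile non-increasing
        i = bisect.bisect_left(tops, x)
        if i == len(piles):
            piles.append([x])
            tops.append(x)
        else:
            piles[i].append(x)
            tops[i] = x
    # each pile read top-to-bottom (reversed) is a sorted run; merge the runs
    return list(merge(*[reversed(p) for p in piles]))
```

Here `heapq.merge` plays the role of the merge stage; whether that stage copies between two arrays (ping-pong) or merges blindly in place is exactly the design question the post raises about the paper.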

### StackOverflow

#### How to use Theano/TensorFlow/Keras for running SGD without neural nets?

Given a model equation, a specific loss function and gradients (that I've already derived), how do I use something like Theano/TensorFlow (or Keras since it's more generic) to train the model without using neural nets?

I simply want to use SGD to minimize the regularized logistic loss. Is this a good example: http://www.deeplearning.net/tutorial/logreg.html ?

Equations (1) and (2) of http://arxiv.org/pdf/1510.04935v2.pdf are, for instance, something I'm trying to work with.
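If the goal is just SGD on a regularized logistic loss, it may help to see the bare loop with no framework at all; below is a pure-Python sketch with made-up 1-D toy data (Theano/TensorFlow/Keras would add automatic differentiation and GPU execution on top of exactly this structure):

```python
import math
import random

random.seed(0)
# hypothetical toy data: class 0 clustered near -2, class 1 near +2
data = [(random.gauss(-2, 1), 0) for _ in range(50)] + \
       [(random.gauss(2, 1), 1) for _ in range(50)]

w, b, lam, lr = 0.0, 0.0, 0.01, 0.1   # weight, bias, L2 strength, step size

def loss():
    total = 0.0
    for x, y in data:
        z = w * x + b
        # logistic loss: log(1+e^{-z}) for y=1, log(1+e^{z}) for y=0
        total += math.log1p(math.exp(-z)) if y == 1 else math.log1p(math.exp(z))
    return total / len(data) + lam * w * w

before = loss()
for _ in range(500):
    x, y = random.choice(data)            # one random example per SGD step
    p = 1 / (1 + math.exp(-(w * x + b)))
    g = p - y                             # d(logistic loss)/dz
    w -= lr * (g * x + 2 * lam * w)       # gradient step incl. L2 term
    b -= lr * g
after = loss()
```

With autodiff frameworks the only change is that `g` comes from the library's gradient of the symbolic loss rather than the hand-derived `p - y`, which is why the logistic-regression tutorial linked above is a reasonable template even for non-neural models.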

#### Creating a filterable list with RxJS

I'm trying to get into reactive programming. I use array-functions like map, filter and reduce all the time and love that I can do array manipulation without creating state.

As an exercise, I'm trying to create a filterable list with RxJS without introducing state variables. In the end it should work similarly to this:

I would know how to accomplish this with naive JavaScript or AngularJS/ReactJS but I'm trying to do this with nothing but RxJS and without creating state variables:

    var list = [
        'John',
        'Marie',
        'Max',
        'Eduard',
        'Collin'
    ];

    Rx.Observable.fromEvent(document.querySelector('#filter'), 'keyup')
        .map(function(e) { return e.target.value; });

    // i need to get the search value in here somehow:
    Rx.Observable.from(list).filter(function() {});


Now how do I get the search value into my filter function on the observable that I created from my list?

Thanks a lot for your help!

### QuantOverflow

#### SABR Calibration: Normal vs Log-Normal Market Data

This question is about getting some clarification on how to understand market quotes for normal and log-normal vols, together with certain model assumptions.

So let us define

#### Dynamic Clustering and Sleep Mode Strategies for Small Cell Networks. (arXiv:1604.08758v1 [cs.NI])

In this paper, a novel cluster-based approach for optimizing the energy efficiency of wireless small cell networks is proposed. A dynamic mechanism based on the spectral clustering technique is proposed to dynamically form clusters of small cell base stations. Such clustering enables intra-cluster coordination among the base stations for optimizing the downlink performance through load balancing, while satisfying users' quality-of-service requirements. In the proposed approach, the clusters use an opportunistic base station sleep-wake switching mechanism to strike a balance between delay and energy consumption. The inter-cluster interference affects the performance of the clusters and their choices of active or sleep state. Due to the lack of inter-cluster communications, the clusters have to compete with each other to make decisions on improving the energy efficiency. This competition is formulated as a noncooperative game among the clusters that seek to minimize a cost function which captures the tradeoff between energy expenditure and load. To solve this game, a distributed learning algorithm is proposed using which the clusters autonomously choose their optimal transmission strategies. Simulation results show that the proposed approach yields significant performance gains in terms of reduced energy expenditures up to 40% and reduced load up to 23% compared to conventional approaches.

#### Opportunistic Sleep Mode Strategies in Wireless Small Cell Networks. (arXiv:1604.08756v1 [cs.NI])

The design of energy-efficient mechanisms is one of the key challenges in emerging wireless small cell networks. In this paper, a novel approach for opportunistically switching ON/OFF base stations to improve the energy efficiency in wireless small cell networks is proposed. The proposed approach enables the small cell base stations to optimize their downlink performance while balancing the load among each another, while satisfying their users' quality-of-service requirements. The problem is formulated as a noncooperative game among the base stations that seek to minimize a cost function which captures the tradeoff between energy expenditure and load. To solve this game, a distributed learning algorithm is proposed using which the base stations autonomously choose their optimal transmission strategies. Simulation results show that the proposed approach yields significant performance gains in terms of reduced energy expenditures up to 23% and reduced load up to 40% compared to conventional approaches.

#### Enabling Relaying Over Heterogeneous Backhauls in the Uplink of Wireless Femtocell Networks. (arXiv:1604.08744v1 [cs.NI])

In this paper, we develop novel two-tier interference management strategies that enable macrocell users (MUEs) to improve their performance, with the help of open-access femtocells. To this end, we propose a rate-splitting technique using which the MUEs optimize their uplink transmissions by dividing their signals into two types: a coarse message that is intended for direct transmission to the macrocell base station and a fine message that is decoded by a neighboring femtocell and subsequently relayed over a heterogeneous (wireless/wired) backhaul. For deploying the proposed technique, we formulate a non-cooperative game between the MUEs in which each MUE can decide on its relaying femtocell while maximizing a utility function that captures both the achieved throughput and the expected backhaul delay. Simulation results show that the proposed approach yields up to 125% rate improvement and up to 2 times delay reduction with wired backhaul and, 150% rate improvement and up to 10 times delay reduction with wireless backhaul, relative to classical interference management approaches, with no cross-tier cooperation.

#### Verifying Buchberger's Algorithm in Reduction Rings. (arXiv:1604.08736v1 [cs.SC])

In this paper we present the formal, computer-supported verification of a functional implementation of Buchberger's critical-pair/completion algorithm for computing Gröbner bases in reduction rings. We describe how the algorithm can be implemented and verified within one single software system, which in our case is the Theorema system.

In contrast to existing formal correctness proofs of Buchberger's algorithm in other systems, e.g. Coq and ACL2, our work is not confined to the classical setting of polynomial rings over fields, but considers the much more general setting of reduction rings; this, naturally, makes the algorithm more complicated and the verification more difficult.

The correctness proof is essentially based on some non-trivial results from the theory of reduction rings, which we formalized and formally proved as well. This formalization already consists of more than 800 interactively proved lemmas and theorems, making the elaboration an extensive example of higher-order theory exploration in Theorema.

#### System Level Performance Evaluation of LTE-V2X Network. (arXiv:1604.08734v1 [cs.NI])

Vehicles are among the fastest growing type of connected devices. Therefore, there is a need for Vehicle-to-Everything (V2X) communication i.e. passing of information from a Vehicle-to-Vehicle (V2V) or Vehicle-to-Infrastructure (V2I) and vice versa. In this paper, the main focus is on the communication between vehicles and road side units (RSUs) commonly referred to as V2I communication in a multi-lane freeway scenario. Moreover, we analyze network related bottlenecks such as the maximum number of vehicles that can be supported when coverage is provided by the Long Term Evolution Advanced (LTE-A) network. The performance evaluation is assessed through extensive system-level simulations. Results show that new resource allocation and interference mitigation techniques are needed in order to achieve the required high reliability requirements, especially when network load is high.

#### Exploring Social Networks for Optimized User Association in Wireless Small Cell Networks with Device-to-Device Communications. (arXiv:1604.08727v1 [cs.NI])

In this paper, we propose a novel social network aware approach for user association in wireless small cell networks. The proposed approach exploits social relationships between user equipments (UEs) and their physical proximity to optimize the network throughput. We formulate the problem as a matching game between UEs and their serving nodes (SNs). In our proposed game, the serving node can be a small cell base station (SCBS) or an important node with device-to-device capabilities. In this game, the SCBSs and UEs maximize their respective utility functions capturing both the spatial and social structures of the network. We show that the proposed game belongs to the class of matching games with externalities. Subsequently, we propose a distributed algorithm using which the SCBSs and UEs interact and reach a stable matching. We show the convergence of the proposed algorithm and study the properties of the resulting matching. Simulation results show that the proposed socially-aware user association approach can efficiently offload traffic while yielding a significant gain reaching up to 63% in terms of data rates as compared to the classical (social-unaware) approach.

#### "Knowing value" logic as a normal modal logic. (arXiv:1604.08709v1 [cs.AI])

Recent years witness a growing interest in nonstandard epistemic logics of "knowing whether", "knowing what", "knowing how" and so on. These logics are usually not normal, i.e., the standard axioms and reasoning rules for modal logic may be invalid. In this paper, we show that the conditional "knowing value" logic proposed by Wang and Fan (2013) can be viewed as a disguised normal modal logic by treating the negation of Kv operator as a special diamond. Under this perspective, it turns out that the original first-order Kripke semantics can be greatly simplified by introducing a ternary relation $R_i^c$ in standard Kripke models which associates one world with two $i$-accessible worlds that do not agree on the value of constant $c$. Under intuitive constraints, the modal logic based on such Kripke models is exactly the one studied by Wang and Fan (2013,2014). Moreover, there is a very natural binary generalizations of the "knowing value" diamond, which, surprisingly, does not increase the expressive power of the logic. The resulting logic with the binary diamond has a transparent normal modal system which sharpens our understanding of the "knowing value" logic and simplifies some previous hard problems.

#### A Modest Proposal for Open Market Risk Assessment to Solve the Cyber-Security Problem. (arXiv:1604.08675v1 [cs.CR])

We introduce a model for a market based economic system of cyber-risk valuation to correct fundamental problems of incentives within the information technology and information processing industries. We assess the makeup of the current day marketplace, identify incentives, identify economic reasons for current failings, and explain how a market based risk valuation system could improve these incentives to form a secure and robust information marketplace for all consumers by providing visibility into open, consensus based risk pricing and allowing all parties to make well informed decisions.

#### Dependence between External Path-Length and Size in Random Tries. (arXiv:1604.08658v1 [math.CO])

We study the size and the external path length of random tries and show that they are asymptotically independent in the asymmetric case but strongly dependent with small periodic fluctuations in the symmetric case. Such an unexpected behavior is in sharp contrast to the previously known results that the internal path length is totally positively correlated to the size and that both tend to the same normal limit law. These two examples provide concrete instances of bivariate normal distributions (as limit laws) whose correlation is $0$, $1$ and periodically oscillating.

Exploiting the unlicensed spectrum is considered by 3GPP as one promising solution to meet the ever-increasing traffic growth. As a result, one major enhancement for LTE in Release 13 has been to enable its operation in the unlicensed spectrum via Licensed-Assisted Access (LAA). In this article, we provide an overview of the Release 13 LAA technology including motivation, use cases, LTE enhancements for enabling the unlicensed band operation, and the coexistence evaluation results contributed by 3GPP participants.

#### Throughput and range characterization of IEEE 802.11ah. (arXiv:1604.08625v1 [cs.NI])

The most essential part of Internet of Things (IoT) infrastructure is the wireless communication system that acts as a bridge for the delivery of data and control messages. However, the existing wireless technologies lack the ability to support a huge amount of data exchange from many battery driven devices spread over a wide area. In order to support the IoT paradigm, the IEEE 802.11 standard committee is in process of introducing a new standard, called IEEE 802.11ah. This is one of the most promising and appealing standards, which aims to bridge the gap between traditional mobile networks and the demands of the IoT. In this paper, we first discuss the main PHY and MAC layer amendments proposed for IEEE 802.11ah. Furthermore, we investigate the operability of IEEE 802.11ah as a backhaul link to connect devices over a long range. Additionally, we compare the aforementioned standard with previous notable IEEE 802.11 amendments (i.e. IEEE 802.11n and IEEE 802.11ac) in terms of throughput (with and without frame aggregation) by utilizing the most robust modulation schemes. The results show an improved performance of IEEE 802.11ah (in terms of power received at long range while experiencing different packet error rates) as compared to previous IEEE 802.11 standards.

#### Stringer: Balancing Latency and Resource Usage in Service Function Chain Provisioning. (arXiv:1604.08618v1 [cs.NI])

Network Functions Virtualization, or NFV, enables telecommunications infrastructure providers to replace special-purpose networking equipment with commodity servers running virtualized network functions (VNFs). A service provider utilizing NFV technology faces the SFC provisioning problem of assigning VNF instances to nodes in the physical infrastructure (e.g., a datacenter), and routing Service Function Chains (sequences of functions required by customers, a.k.a. SFCs) in the physical network. In doing so, the provider must balance between various competing goals of performance and resource usage. We present an approach for SFC provisioning, consisting of three elements. The first element is a fast, scalable round-robin heuristic. The second element is a Mixed Integer Programming (MIP) based approach. The third element is a queueing-theoretic model to estimate the average latency associated with any SFC provisioning solution. Combined, these elements create an approach that generates a set of SFC provisioning solutions, reflecting different tradeoffs between resource usage and performance.

#### On the Inefficiency of Standard Multi-Unit Auctions. (arXiv:1303.1646v3 [cs.GT] UPDATED)

We study two standard multi-unit auction formats for allocating multiple units of a single good to multi-demand bidders. The first one is the Discriminatory Auction, which charges every winner his winning bids. The second is the Uniform Price Auction, which determines a uniform price to be paid per unit. Variants of both formats find applications ranging from the allocation of state bonds to investors, to online sales over the internet, facilitated by popular online brokers. For these formats, we consider two bidding interfaces: (i) standard bidding, which is most prevalent in the scientific literature, and (ii) uniform bidding, which is more popular in practice. In this work, we evaluate the economic inefficiency of both multi-unit auction formats for both bidding interfaces, by means of upper and lower bounds on the Price of Anarchy for pure Nash equilibria and mixed Bayes-Nash equilibria. Our developments improve significantly upon bounds that have been obtained recently in [Markakis, Telelis, ToCS 2014] and [Syrgkanis, Tardos, STOC 2013] for submodular valuation functions. Moreover, we consider for the first time bidders with subadditive valuation functions for these auction formats. Our results signify that these auctions are nearly efficient, which provides further justification for their use in practice.

#### A partition of the hypercube into maximally nonparallel Hamming codes. (arXiv:1210.0010v2 [cs.IT] UPDATED)

By using the Gold map, we construct a partition of the hypercube into cosets of Hamming codes such that for every two cosets the corresponding Hamming codes are maximally nonparallel, that is, their intersection cardinality is as small as possible to admit nonintersecting cosets.

### CompsciOverflow

#### Extracting features for texture classification

I'm a beginner in the field of pattern recognition and computer vision. I'm working on a project right now to classify t-shirt patterns into three categories, i.e. solids, stripes and checks. I have close-up training images of the t-shirts. A sample shirt image looks like this

I have looked at a bank of Gabor filter features, but they are computationally expensive. It would be of great help if someone could point me in the general direction for working forward. Any help is appreciated.

EDIT - I found the solution in D.W.'s answer below, though my solution is not very good. I'm classifying solid patterns by counting the number of line segments in the image. If they fall below a certain number, I classify them as solid. If not, I further classify them into stripes or checkered using HoG features and a linear SVM. The accuracy achieved was around 91%. It was a little low due to some misclassified samples in the training set.
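As a toy illustration of how little machinery the three-way split can need on idealized inputs (this is my own sketch, not the HoG + SVM pipeline mentioned in the edit, and the thresholds are arbitrary): the mean absolute intensity difference along each axis is near zero for solids, strongly one-sided for stripes, and balanced for checks.

```python
import numpy as np

def classify_pattern(img, flat_thresh=0.05, ratio_thresh=0.5):
    """Toy classifier for idealized grayscale patterns (values in [0, 1]).

    Solids have little gradient energy anywhere; stripes have it mostly
    along one axis; checks have it along both. Thresholds are arbitrary.
    """
    gx = np.abs(np.diff(img, axis=1)).mean()  # variation across columns
    gy = np.abs(np.diff(img, axis=0)).mean()  # variation across rows
    if max(gx, gy) < flat_thresh:
        return "solid"
    if min(gx, gy) / max(gx, gy) < ratio_thresh:
        return "stripes"
    return "checks"

# Synthetic 32x32 test patterns
solid = np.full((32, 32), 0.7)
stripes = np.tile(np.arange(32) % 2, (32, 1)).astype(float)
checks = (np.add.outer(np.arange(32), np.arange(32)) % 2).astype(float)
```

Real photos of fabric would of course need preprocessing (grayscale conversion, smoothing, perspective), which is where learned features like HoG earn their keep.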

### Planet Theory

#### On the Complexity of Solving Zero-Dimensional Polynomial Systems via Projection

Authors: Cornelius Brand, Michael Sagraloff
Abstract: Given a zero-dimensional polynomial system consisting of n integer polynomials in n variables, we propose a certified and complete method to compute all complex solutions of the system as well as a corresponding separating linear form l with coefficients of small bit size. For computing l, we need to project the solutions into one dimension along O(n) distinct directions but no further algebraic manipulations. The solutions are then directly reconstructed from the considered projections. The first step is deterministic, whereas the second step uses randomization, thus being Las-Vegas.

The theoretical analysis of our approach shows that the overall cost for the two problems considered above is dominated by the cost of carrying out the projections. We also give bounds on the bit complexity of our algorithms that are exclusively stated in terms of the number of variables, the total degree and the bitsize of the input polynomials.

#### Complexity Hierarchies and Higher-Order Cons-Free Rewriting

Authors: Cynthia Kop, Jakob Grue Simonsen
Abstract: Constructor rewriting systems are said to be cons-free if, roughly, constructor terms in the right-hand sides of rules are subterms of constructor terms in the left-hand side; the computational intuition is that rules cannot build new data structures. It is well-known that cons-free programming languages can be used to characterize computational complexity classes, and that cons-free first-order term rewriting can be used to characterize the set of polynomial-time decidable sets.

We investigate cons-free higher-order term rewriting systems, the complexity classes they characterize, and how these depend on the order of the types used in the systems. We prove that, for every k $\geq$ 1, left-linear cons-free systems with type order k characterize E$^k$TIME if arbitrary evaluation is used (i.e., the system does not have a fixed reduction strategy).

The main difference with prior work in implicit complexity is that (i) our results hold for non-orthogonal term rewriting systems with possible rule overlaps with no assumptions about reduction strategy, (ii) results for such term rewriting systems have previously only been obtained for k = 1, and with additional syntactic restrictions on top of cons-freeness and left-linearity.

Our results are apparently among the first implicit characterizations of the hierarchy E = E$^1$TIME $\subseteq$ E$^2$TIME $\subseteq$ .... Our work confirms prior results that having full non-determinism (via overlaps of rules) does not directly allow characterization of non-deterministic complexity classes like NE. We also show that non-determinism makes the classes characterized highly sensitive to minor syntactic changes such as admitting product types or non-left-linear rules.

### Fefe

#### Greenpeace says they want to publish TTIP today. The ...

Greenpeace says they want to publish TTIP today.

But somehow I have lost my appetite for the press on this one. Just look at what they made of the Panama Papers. Extracted a few dirty scoops, took a piss in Russia's direction once, and then dropped the whole thing like a hot potato.

No comparison to the published raw data and the search engine that Wikileaks always built for things like this.

Lately, with things like this, I feel as if someone were trying to lead me through the village by a nose ring. No, dear SZ, I would like to do my own thinking. You can keep your thoughts. Put the data on the Internet and leave the rumor-mongering and murmuring to smelly Internet forums. They are better at it than you anyway.

A bit of contextualizing is fine too, but not from a "we are the only ones who have the data" position.

### Planet Theory

#### Designing optimal- and fast-on-average pattern matching algorithms

Authors: Gilles Didier, Laurent Tichit
Abstract: Given a pattern $w$ and a text $t$, the speed of a pattern matching algorithm over $t$ with regard to $w$ is the ratio of the length of $t$ to the number of text accesses performed while searching for $w$ in $t$. We first propose a general method for computing the limit of the expected speed of pattern matching algorithms, with regard to $w$, over iid texts. Next, we show how to determine the greatest speed that can be achieved among a large class of algorithms, together with an algorithm achieving this speed. Since the complexity of this determination makes it impossible to deal with patterns of length greater than 4, we propose a polynomial heuristic. Finally, our approaches are compared with pre-existing pattern matching algorithms from both a theoretical and a practical point of view, i.e. both in terms of limit expected speed on iid texts and in terms of observed average speed on real data. In all cases, the pre-existing algorithms are outperformed.

#### Ortho-polygon Visibility Representations of Embedded Graphs

Authors: Emilio Di Giacomo, Walter Didimo, William S. Evans, Giuseppe Liotta, Henk Meijer, Fabrizio Montecchiani, Stephen K. Wismath
Abstract: An ortho-polygon visibility representation of an $n$-vertex embedded graph $G$ (OPVR of $G$) is an embedding preserving drawing of $G$ that maps every vertex to a distinct orthogonal polygon and each edge to a vertical or horizontal visibility between its end-vertices. The vertex complexity of an OPVR of $G$ is the minimum $k$ such that every polygon has at most $k$ reflex corners. We present polynomial time algorithms that test whether $G$ has an OPVR and, if so, compute one of minimum vertex complexity. We argue that the existence and the vertex complexity of an OPVR of $G$ are related to its number of crossings per edge and to its connectivity. Namely, we prove that if $G$ has at most one crossing per edge (i.e. $G$ is a $1$-plane graph) an OPVR of $G$ always exists while this may not be the case if two crossings per edge are allowed. Also, if $G$ is a $3$-connected $1$-plane graph, we can compute in $O(n)$ time an OPVR of $G$ whose vertex complexity is bounded by a constant. However, if $G$ is a $2$-connected $1$-plane graph, the vertex complexity of any OPVR of $G$ may be $\Omega(n)$. In contrast, we describe a family of $2$-connected $1$-plane graphs for which, in $O(n)$ time, an embedding that guarantees constant vertex complexity can be computed. Finally, we present the results of an experimental study on the vertex complexity of ortho-polygon visibility representations of $1$-plane graphs.

#### A Linear Time Parameterized Algorithm for Node Unique Label Cover

Authors: Daniel Lokshtanov, M. S. Ramanujan, Saket Saurabh
Abstract: The optimization version of the Unique Label Cover problem is at the heart of the Unique Games Conjecture which has played an important role in the proof of several tight inapproximability results. In recent years, this problem has been also studied extensively from the point of view of parameterized complexity. Cygan et al. [FOCS 2012] proved that this problem is fixed-parameter tractable (FPT) and Wahlstr\"om [SODA 2014] gave an FPT algorithm with an improved parameter dependence. Subsequently, Iwata, Wahlstr\"om and Yoshida [2014] proved that the edge version of Unique Label Cover can be solved in linear FPT-time. That is, there is an FPT algorithm whose dependence on the input-size is linear. However, such an algorithm for the node version of the problem was left as an open problem. In this paper, we resolve this question by presenting the first linear-time FPT algorithm for Node Unique Label Cover.

#### Optimal Computation of Avoided Words

Authors: Yannis Almirantis, Panagiotis Charalampopoulos, Jia Gao, Costas S. Iliopoulos, Manal Mohamed, Solon P. Pissis, Dimitris Polychronopoulos
Abstract: The deviation of the observed frequency of a word $w$ from its expected frequency in a given sequence $x$ is used to determine whether or not the word is avoided. This concept is particularly useful in DNA linguistic analysis. The value of the standard deviation of $w$, denoted by $std(w)$, effectively characterises the extent of a word by its edge contrast in the context in which it occurs. A word $w$ of length $k>2$ is a $\rho$-avoided word in $x$ if $std(w) \leq \rho$, for a given threshold $\rho < 0$. Notice that such a word may be completely absent from $x$. Hence computing all such words na\"{\i}vely can be a very time-consuming procedure, in particular for large $k$. In this article, we propose an $O(n)$-time and $O(n)$-space algorithm to compute all $\rho$-avoided words of length $k$ in a given sequence $x$ of length $n$ over a fixed-sized alphabet. We also present a time-optimal $O(\sigma n)$-time and $O(\sigma n)$-space algorithm to compute all $\rho$-avoided words (of any length) in a sequence of length $n$ over an alphabet of size $\sigma$. Furthermore, we provide a tight asymptotic upper bound for the number of $\rho$-avoided words and the expected length of the longest one. We make available an open-source implementation of our algorithm. Experimental results, using both real and synthetic data, show the efficiency of our implementation.

#### An I/O-efficient Generator for Massive Complex Networks with Explicit Community Structure

Authors: Michael Hamann, Manuel Penschuck
Abstract: The LFR benchmark is a popular benchmark graph model used to evaluate community detection algorithms. We present the first external memory algorithm that is able to generate massive complex networks following the LFR model. Its most expensive component is the generation of random graphs with prescribed degree sequences which can be divided in two steps: They are first materialized as deterministic graphs using the Havel-Hakimi algorithm and then randomized. Our main contribution are HP-HH and ES-TFP, two I/O-efficient external memory algorithms for these two steps. In an experimental evaluation we demonstrate their performance: our implementation is able to generate graphs with more than 10 billion edges on a single machine, is competitive with a parallel massively distributed algorithm and on smaller instances faster than a state-of-the-art internal memory implementation.

#### Relative Convex Hull Determination from Convex Hulls in the Plane

Authors: P. Wiederhold, H. Reyes
Abstract: A new algorithm for the determination of the relative convex hull in the plane of a simple polygon A with respect to another simple polygon B which contains A, is proposed. The relative convex hull is also known as geodesic convex hull, and the problem of its determination in the plane is equivalent to find the shortest curve among all Jordan curves lying in the difference set of B and A and encircling A. Algorithms solving this problem known from Computational Geometry are based on the triangulation or similar decomposition of that difference set. The algorithm presented here does not use such decomposition, but it supposes that A and B are given as ordered sequences of vertices. The algorithm is based on convex hull calculations of A and B and of smaller polygons and polylines, it produces the output list of vertices of the relative convex hull from the sequence of vertices of the convex hull of A.

#### On Approximating Functions of the Singular Values in a Stream

Authors: Yi Li, David P. Woodruff
Abstract: For any real number $p > 0$, we nearly completely characterize the space complexity of estimating $\|A\|_p^p = \sum_{i=1}^n \sigma_i^p$ for $n \times n$ matrices $A$ in which each row and each column has $O(1)$ non-zero entries and whose entries are presented one at a time in a data stream model. Here the $\sigma_i$ are the singular values of $A$, and when $p \geq 1$, $\|A\|_p^p$ is the $p$-th power of the Schatten $p$-norm. We show that when $p$ is not an even integer, to obtain a $(1+\epsilon)$-approximation to $\|A\|_p^p$ with constant probability, any $1$-pass algorithm requires $n^{1-g(\epsilon)}$ bits of space, where $g(\epsilon) \rightarrow 0$ as $\epsilon \rightarrow 0$ and $\epsilon > 0$ is a constant independent of $n$. However, when $p$ is an even integer, we give an upper bound of $n^{1-2/p} \textrm{poly}(\epsilon^{-1}\log n)$ bits of space, which holds even in the turnstile data stream model. The latter is optimal up to $\textrm{poly}(\epsilon^{-1} \log n)$ factors.

Our results considerably strengthen lower bounds in previous work for arbitrary (not necessarily sparse) matrices $A$: the previous best lower bound was $\Omega(\log n)$ for $p\in (0,1)$, $\Omega(n^{1/p-1/2}/\log n)$ for $p\in [1,2)$ and $\Omega(n^{1-2/p})$ for $p\in (2,\infty)$. We note for $p \in (2, \infty)$, while our lower bound for even integers is the same, for other $p$ in this range our lower bound is $n^{1-g(\epsilon)}$, which is considerably stronger than the previous $n^{1-2/p}$ for small enough constant $\epsilon > 0$. We obtain similar near-linear lower bounds for Ky-Fan norms, SVD entropy, eigenvalue shrinkers, and M-estimators, many of which could have been solvable in logarithmic space prior to our work.
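The even-integer case is special because $\|A\|_p^p$ is then a polynomial in the entries of $A$: for example $\|A\|_4^4 = \sum_i \sigma_i^4 = \operatorname{trace}((A^T A)^2)$. A quick numerical check of this identity (illustration only, not the streaming algorithm; the matrix and seed are arbitrary):

```python
import numpy as np

# For even p, the p-th power of the Schatten p-norm is a polynomial in the
# matrix entries; here p = 4: sum(sigma_i^4) == trace((A^T A)^2).
rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5))

direct = float((np.linalg.svd(A, compute_uv=False) ** 4).sum())
poly = float(np.trace(np.linalg.matrix_power(A.T @ A, 2)))
```

For odd or fractional $p$ no such polynomial identity exists, which matches the paper's separation between the two cases.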

#### On the Erdos-Szekeres convex polygon problem

Authors: Andrew Suk
Abstract: Let $ES(n)$ be the smallest integer such that any set of $ES(n)$ points in the plane in general position contains $n$ points in convex position. In their seminal 1935 paper, Erdos and Szekeres showed that $ES(n) \leq {2n - 4\choose n-2} + 1 = 4^{n -o(n)}$. In 1960, they showed that $ES(n) \geq 2^{n-2} + 1$ and conjectured this to be optimal. Despite the efforts of many researchers, no improvement in the order of magnitude has ever been made on the upper bound over the last 81 years. In this paper, we nearly settle the Erdos-Szekeres conjecture by showing that $ES(n) =2^{n +o(n)}$.

#### Derivative-free Efficient Global Optimization on High-dimensional Simplex

Authors: Priyam Das
Abstract: In this paper, we develop a novel derivative-free deterministic greedy algorithm for global optimization of any objective function of parameters belonging to a unit simplex. The main principle of the proposed algorithm is to make jumps of varying step-sizes within the simplex parameter space and to search for the best direction to move in a greedy manner. Unlike most other existing methods of constrained optimization, here the objective function is evaluated at independent directions within an iteration; thus the incorporation of parallel computing makes it even faster. The requirement for parallelization grows only in the order of the dimension of the parameter space, which makes it convenient for solving high-dimensional optimization problems on a simplex parameter space using parallel computing. A comparative study of the performance of this algorithm and other existing algorithms is shown for some moderate- and high-dimensional optimization problems, along with some transformed benchmark test functions on the simplex. Around a 20-300 fold improvement in computation time has been achieved using the proposed algorithm over a genetic algorithm, with a more accurate solution.

#### Decomposing Cubic Graphs into Connected Subgraphs of Size Three

Authors: Laurent Bulteau, Guillaume Fertin, Anthony Labarre, Romeo Rizzi, Irena Rusu
Abstract: Let $S=\{K_{1,3},K_3,P_4\}$ be the set of connected graphs of size 3. We study the problem of partitioning the edge set of a graph $G$ into graphs taken from any non-empty $S'\subseteq S$. The problem is known to be NP-complete for any possible choice of $S'$ in general graphs. In this paper, we assume that the input graph is cubic, and study the computational complexity of the problem of partitioning its edge set for any choice of $S'$. We identify all polynomial and NP-complete problems in that setting, and give graph-theoretic characterisations of $S'$-decomposable cubic graphs in some cases.

## May 01, 2016

### StackOverflow

#### How is scan implemented in a purely functional way?

Functional reactive programming uses scan on streams/observables. But it also follows the functional principle of being stateless. How is this implemented without state, especially in languages like Haskell?

EDIT: by stateless, I meant no mutable values. I know you can use recursion for reduce, but it seems like that would be harder with scan because the function needs to pause until the next element comes in.
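One way to see it: the accumulator is threaded through recursive calls rather than stored in a mutable variable. Here is a rough Python transliteration of Haskell's scanl (Haskell's own version is lazy, which is what lets it "pause" until the next stream element arrives):

```python
from operator import add

def scanl(f, acc, xs):
    # scanl f z [x1, x2, ...] == [z, f(z, x1), f(f(z, x1), x2), ...]
    # No mutation: each step passes a *new* accumulator down the recursion.
    if not xs:
        return [acc]
    return [acc] + scanl(f, f(acc, xs[0]), xs[1:])

scanl(add, 0, [1, 2, 3, 4, 5])  # [0, 1, 3, 6, 10, 15]
```

In an FRP setting the list is replaced by a stream, and laziness/corecursion means the next accumulator is simply computed when the next event shows up; nothing is ever overwritten.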

### QuantOverflow

#### Mathematically: How does increasing the number of assets reduce idiosyncratic risk?

As part of an Asset Pricing module I am currently taking, while looking at APT (Ross, 1974), we saw how, according to this model, risk originates from both systematic and idiosyncratic, asset-specific sources.

We first considered an $N$-asset portfolio with equal weights, to show how increasing the number of assets $N$ decreases the idiosyncratic residual variance (here $e_P$ denotes the residual error term of the portfolio $P$):

$$\operatorname{Var}(e_P) = \frac{1}{N}\,\overline{\sigma^2_{e}}$$

It is clear that as $N$ increases, the variance of $e_P$ decreases.

However, my question concerns the case where the $N$ assets are held in unequal proportions. We end up with the following expression for $\operatorname{Var}(e_P)$:

$$\operatorname{Var}(e_P) = \sum_{i=1}^{N} w_i^2\, \sigma^2_{e_i} + \text{all covariance terms}$$

From our assumptions at the outset, the idiosyncratic risk of asset $i$ does not affect asset $j$, so all the covariance terms in the above equation equal $0$.

My question, for the equation below, where $w_i$ is the weight of asset $i$:

$$\operatorname{Var}(e_P) = \sum_{i=1}^{N} w_i^2\, \sigma^2_{e_i}$$

How can we see here that increasing N reduces idiosyncratic risk?
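For a quick numerical illustration (my own, with randomly drawn, normalized weights): with bounded residual variances the portfolio term is driven by $\sum_i w_i^2$, which shrinks as $N$ grows whenever no single weight stays bounded away from zero:

```python
import numpy as np

rng = np.random.default_rng(0)

def sum_sq_weights(n):
    # Sum of squared weights of a random, fully invested,
    # unequal-weight portfolio of n assets.
    w = rng.random(n)
    w /= w.sum()          # weights are positive and sum to 1
    return float((w ** 2).sum())
```

Since $\operatorname{Var}(e_P) \le \big(\max_i \sigma^2_{e_i}\big) \sum_i w_i^2$, the idiosyncratic term still vanishes as $N \to \infty$ even without equal weights, provided the portfolio stays reasonably diversified.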

Thanks

### StackOverflow

#### Why do I get good accuracy with the IRIS dataset with a single hidden node?

I have a minimal example of a neural network with a back-propagation trainer, which I am testing on the IRIS data set. I started off with 7 hidden nodes and it worked well.

I lowered the number of nodes in the hidden layer to 1 (expecting it to fail), but was surprised to see that the accuracy went up.

I set up the experiment in Azure ML, just to validate that it wasn't my code. Same thing there: 98.3333% accuracy with a single hidden node.

Can anyone explain to me what is happening here?

### TheoryOverflow

#### Topological properties of classes

Is there any reasonable, natural way to think about closure properties of complexity classes, such as the closure of $PP$ or $PH$ being $PSPACE$?

$P/poly$ being in the interior of the class $PP$, and so on?

I found this link: https://en.wikipedia.org/wiki/Resource_bounded_measure.

Is there a reasonable way to concoct a topological view of the complexity landscape from this?

### StackOverflow

#### Keras. ValueError: I/O operation on closed file

I use a Jupyter notebook with Anaconda. I am using Keras for the first time, and I can't get through the tutorial. There are two threads on StackOverflow about this issue, but no solution was found.

My code:

model = Sequential()

model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])

X_train_shape = X_train.reshape(len(X_train), 1)
Y_train_shape = Y_train.reshape(len(Y_train), 1)
model.fit(X_train, Y_train, nb_epoch=5, batch_size=32)


And I get an error; it occurs somewhat randomly, and sometimes one or two epochs complete first:

Epoch 1/5 4352/17500 [======>.......................]

--------------------------------------------------------------------------- ValueError Traceback (most recent call last) in () 2 # of 32 samples 3 #sleep(0.1) ----> 4 model.fit(X_train, Y_train, nb_epoch=5, batch_size=32) 5 #sleep(0.1)

C:\Anaconda3\envs\py27\lib\site-packages\keras\models.pyc in fit(self, x, y, batch_size, nb_epoch, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight, **kwargs) 395 shuffle=shuffle, 396 class_weight=class_weight, --> 397 sample_weight=sample_weight) 398 399 def evaluate(self, x, y, batch_size=32, verbose=1,

C:\Anaconda3\envs\py27\lib\site-packages\keras\engine\training.pyc in fit(self, x, y, batch_size, nb_epoch, verbose, callbacks, validation_split, validation_data, shuffle, class_weight, sample_weight) 1009 verbose=verbose, callbacks=callbacks, 1010
val_f=val_f, val_ins=val_ins, shuffle=shuffle, -> 1011 callback_metrics=callback_metrics) 1012 1013 def evaluate(self, x, y, batch_size=32, verbose=1, sample_weight=None):

C:\Anaconda3\envs\py27\lib\site-packages\keras\engine\training.pyc in _fit_loop(self, f, ins, out_labels, batch_size, nb_epoch, verbose, callbacks, val_f, val_ins, shuffle, callback_metrics) 753 batch_logs[l] = o 754 --> 755 callbacks.on_batch_end(batch_index, batch_logs) 756 757 epoch_logs = {}

C:\Anaconda3\envs\py27\lib\site-packages\keras\callbacks.pyc in on_batch_end(self, batch, logs) 58 t_before_callbacks = time.time() 59 for callback in self.callbacks: ---> 60 callback.on_batch_end(batch, logs) 61 self._delta_ts_batch_end.append(time.time() - t_before_callbacks) 62 delta_t_median = np.median(self._delta_ts_batch_end)

C:\Anaconda3\envs\py27\lib\site-packages\keras\callbacks.pyc in on_batch_end(self, batch, logs) 187 # will be handled by on_epoch_end 188 if self.verbose and self.seen < self.params['nb_sample']: --> 189 self.progbar.update(self.seen, self.log_values) 190 191 def on_epoch_end(self, epoch, logs={}):

C:\Anaconda3\envs\py27\lib\site-packages\keras\utils\generic_utils.pyc in update(self, current, values) 110 info += ((prev_total_width - self.total_width) * " ") 111 --> 112 sys.stdout.write(info) 113 sys.stdout.flush() 114

C:\Anaconda3\envs\py27\lib\site-packages\ipykernel\iostream.pyc in write(self, string) 315 316 is_child = (not self._is_master_process()) --> 317 self._buffer.write(string) 318 if is_child: 319 # newlines imply flush in subprocesses

ValueError: I/O operation on closed file

### hubertf's NetBSD blog

#### Bootstrap pkgsrc under 'bash on Windows'

Much brouhaha was made recently about Windows running a Linux userland. Leaving aside the fact that NetBSD has been emulating other operating systems for ages, there is one real challenge that every Linux user faces once the operating system is set up: getting software installed easily. And of course there is only one truly portable answer to that question: use pkgsrc, of course!

The process is pretty much straightforward, and Ryo ONODERA has verified the required Windows versions and Linux packages, and has sent instructions on how to bootstrap pkgsrc on Windows 10. Now who's the first to post a screenshot with the output of pkgsrc/misc/cowsay running "cowsay hello pkgsrc"? :-)

### CompsciOverflow

#### Maximum set packing and minimum set cover duality

I read that the maximum set packing and the minimum set cover problems are dual of each other when formulated as linear programming problems. By the strong duality theorem, the optimal solution to the primal and dual LP problems should have the same value.

However, consider a universe $U = \{1, 2, 3, 4, 5\}$ and a collection of sets: $S = \{ \{1, 2, 3\}, \{3, 4, 5\}, \{1\}, \{2\}, \{3\}\}$. From what I understand, a minimum set cover would consist of the first 2 sets in set $S$, while the maximum set packing consists of the last 3 sets. These solutions aren't in accordance with the statement of the strong duality theorem.

Given that, I don't understand how the 2 problems can be dual of each other. What am I missing?

Thank you very much.
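For reference, here is the fractional set packing LP written out together with its formal LP dual (standard LP duality; $U$ is the universe, $\mathcal{S}$ the collection of sets, and there is one primal variable $y_S$ per set $S$):

$$\max \sum_{S \in \mathcal{S}} y_S \quad \text{s.t.} \quad \sum_{S \ni e} y_S \le 1 \;\; \forall e \in U, \qquad y \ge 0$$

$$\min \sum_{e \in U} z_e \quad \text{s.t.} \quad \sum_{e \in S} z_e \ge 1 \;\; \forall S \in \mathcal{S}, \qquad z \ge 0$$

Note that the dual has one variable per element (a fractional hitting set), whereas the set cover LP has one variable per set; comparing the two formulations against the example may help locate where the claimed duality breaks down.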

### StackOverflow

#### Converting maths equation to Java

In this paper, section 2.1, they provide an approach for deriving new, reasoned probabilities from a set of classifier results. I understand the concepts, but am having difficulty completing the implementation in Java. Here is my approach so far:

public double[] sftmx(double[] ovaProbabilities) {
    double[] newProbs = new double[ovaProbabilities.length];

    double[] zArray = generateZArray(ovaProbabilities);
    double zSum = 0;
    for (int j = 0; j < zArray.length; j++) {
        zSum += zArray[j];
    }

    for (int i = 0; i < zArray.length; i++) {
        newProbs[i] = zArray[i] / zSum;
    }

    return newProbs;
}

private double[] generateZArray(double[] ovaPrbs) {
    double[] zArray = new double[ovaPrbs.length];
    for (int i = 0; i < ovaPrbs.length; i++) {
        double wk = 0;
        double wko = 0;

        // working out wk, wko and C values and equation minimization

        zArray[i] = Math.exp((wk * ovaPrbs[i]) + wko);
    }

    return zArray;
}


What I mainly don't see is why or how an equation needs minimising, and what relevance that has. I hope the code is readable; I've been trying to get it working, but this is as far as I've got!

### CompsciOverflow

#### Decidability with an exponential number of solutions

I am trying to understand this: if a problem has an exponential number of candidate solutions, such as $2^{2^n}$, is it decidable? To my understanding, as long as each candidate is verifiable, the problem is decidable no matter how many candidate solutions there are.

Thanks for any clarification.

#### Why should I go for logistic regression?

I am a student working on a database management project with a bit of Python coding involved. The project is about review analysis; basically, I am trying to read a review and determine how good or bad it is.

I have a file of good and bad words with scores, like: good = 2, better = 3, best = 4.

From this, I have written code for sentiment analysis of a review, which gives me the score of the review considering every single word.

Since the range varied from $-\infty$ to $+\infty$, I had to scale it down to the range 0 to 1. A graphical representation must be done on this data, i.e. the score of the review, whether it runs from $-\infty$ to $+\infty$ or from 0 to 1.

I was advised by my guide to go for a logistic regression plot. Why logistic regression? Is there a better way?
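On the scaling point: the standard way to squash a score from $(-\infty, +\infty)$ into $(0, 1)$ is the logistic (sigmoid) function, which is also the link function behind logistic regression, presumably why it was suggested. A minimal sketch:

```python
import math

def squash(score):
    # Logistic (sigmoid) function: maps (-inf, +inf) onto (0, 1),
    # with 0 mapping to 0.5 (a neutral review).
    return 1.0 / (1.0 + math.exp(-score))

squash(0.0)   # 0.5  -> neutral
squash(6.0)   # close to 1 -> strongly positive
squash(-6.0)  # close to 0 -> strongly negative
```

Logistic regression goes one step further: instead of hand-assigned word scores, it learns the word weights from labeled reviews so that the squashed output approximates the probability the review is positive.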

### StackOverflow

#### What does foldr do in Haskell?

So I came across the foldr function in Haskell which, from what I can gather, you can use to calculate the product and sum of a list:

foldr f x xs

foldr (*) 1 [1..5] = 120
foldr (+) 0 [1..5] = 15


And changing the x argument adds onto the overall sum, or multiplies onto the final product.

What does foldr actually do, and why would anyone use it instead of the built-in functions sum or product, etc.?
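For intuition, foldr itself is just a small recursive definition. A rough Python transliteration (unlike Haskell's lazy foldr, this one is strict and only works on finite lists):

```python
from operator import add, mul

def foldr(f, z, xs):
    # foldr f z [x1, ..., xn] == f(x1, f(x2, ... f(xn, z)))
    # The list is rebuilt with every cons replaced by f and [] replaced by z.
    if not xs:
        return z
    return f(xs[0], foldr(f, z, xs[1:]))

foldr(mul, 1, [1, 2, 3, 4, 5])  # 120
foldr(add, 0, [1, 2, 3, 4, 5])  # 15
```

sum and product are just the two specializations shown here; foldr is the general scheme, so the same one function also expresses map, filter, length, and so on by picking different f and z.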

### TheoryOverflow

#### Consequences of bipartite perfect matching not in NL?

Are any significant consequences known of $\text{BPM} \not\in \textsf{NL}$?

I'm interested in the status of the following well-studied decision problem, in particular whether it is known to be in $\textsf{NL}$:

Bipartite Perfect Matching (BPM)
Input: bipartite $2n$-vertex graph $G$
Question: does $G$ contain a matching with $n$ edges?

Maximum Bipartite Matching is the version where the required size of a matching is given as part of the input. By Chandra-Stockmeyer-Vishkin, this is equivalent to BPM via $\textsf{AC}^0$-reductions, and these problems are hard for $\textsf{NL}$ via $\textsf{AC}^0$-reductions.

Recall the following sequence of inclusions (see the Zoo): $$\textsf{L} \subseteq \textsf{UL} \subseteq \textsf{NL} \subseteq \textsf{NC}^2 \subseteq \textsf{RNC}^2.$$

Looking at a more general problem, maximum matching for general graphs is in $\textsf{RNC}^2$, by Mulmuley-Vazirani-Vazirani (preprint), and according to Allender-Reinhardt-Zhou this problem was (in 1999) not known to be in NL. The latter paper also shows that BPM is in a nonuniform version of $\textsf{SPL}$, but $\textsf{SPL}$ is not known to be comparable to $\textsf{NL}$. (The Zoo also claims that BPM is in $\textsf{coRNC}$ by Karloff, although it is not obvious to me how this follows from the results in Karloff's paper.)

All of the usual approaches to finding a maximum bipartite matching are polynomial-time. Moreover, these algorithms have polynomial time bounds of quite low degree. In particular, an $O(n^{2.5})$ time upper bound follows from the reduction of Hopcroft-Karp and Karzanov to maximum flow. So it is not impossible that BPM (and perhaps even maximum matching) is in $\textsf{NL}$.

However, all the approaches I've looked at seem to be rather space-intensive. The usual algorithms essentially keep track of subsets of the vertices, and therefore seem to require something like $\Omega(n)$ bits (although proving such a bound rigorously seems likely to be difficult).

Looking at a more specific problem, if the input graphs are further restricted to be planar as well as bipartite, then Planar BPM is in $\textsf{UL}$ and hence $\textsf{NL}$, by Datta-Gopalan-Kulkarni-Tewari.

So BPM is $\textsf{NL}$-hard, but is in some complexity classes "not much larger than" $\textsf{NL}$, and is in $\textsf{NL}$ for restricted classes of bipartite graphs. It therefore seems reasonable to ask: if $\text{BPM} \not\in\textsf{NL}$, would anything interesting follow?

### StackOverflow

#### newff and train functions of python's neurolab is giving inconsistent results for same code and input

While the input is the same and the code is the same, I get two different results when run multiple time. There are only two unique outputs though. I do not know what part of the code is randomized and I'm having a hard time figuring out where the error is. Is this a known bug in neurolab by any chance?

I've attached the complete code below. Please run it some 9-10 times to see the two different outputs. I have also attached the output from five runs of the same code; the error output takes two different values across the five runs. Please help.

Code: --------

import neurolab as nl
import numpy as np

# Create train samples

N = 200;

## DATA
x1 = [0]*(N+1);
for ii in range(-N/2, N/2+1, 1):
    x1[ii+N/2] = ii;

x1_arr = np.array(x1);
y1 = -2 + 3*x1_arr;
y = [0]*len(y1);
for ii in range(len(y1)):
    if (y1[ii] > 15):
        y[ii] = 1;

l = len(y);
x0 = [1]*l;
x0_arr = np.array(x0);
x_arr = np.concatenate(([x0_arr], [x1_arr]), axis=0)
x = x1_arr;
y_arr = np.array(y);
size = l;
inp = x.reshape(size, 1)
tar = y_arr.reshape(size, 1)

# Create network with 2 layers and random initialized
net = nl.net.newff([[-N/2, N/2]], [1, 1])
net.trainf = nl.train.train_gd;

# Train network
error = net.train(inp, tar, epochs=100, show=100, goal=0.02, lr=0.001)

# Simulate network
out = net.sim(inp);


Output ---------

>>>
========= RESTART: D:/Python_scripts/ML/nn_neurolab/num_detection.py =========
Epoch: 100; Error: 2.49617137968;
The maximum number of train epochs is reached
>>>
========= RESTART: D:/Python_scripts/ML/nn_neurolab/num_detection.py =========
Epoch: 100; Error: 2.49617137968;
The maximum number of train epochs is reached
>>>
========= RESTART: D:/Python_scripts/ML/nn_neurolab/num_detection.py =========
Epoch: 100; Error: 2.66289633422;
The maximum number of train epochs is reached
>>>
========= RESTART: D:/Python_scripts/ML/nn_neurolab/num_detection.py =========
Epoch: 100; Error: 2.49617137968;
The maximum number of train epochs is reached
>>>
========= RESTART: D:/Python_scripts/ML/nn_neurolab/num_detection.py =========
Epoch: 100; Error: 2.66289633422;
The maximum number of train epochs is reached


Thanks and Cheers!
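A footnote on the behavior described above: `nl.net.newff` initializes the layer weights randomly, which is the usual source of run-to-run differences. Assuming (this is an assumption, not verified against neurolab's source) that it draws from NumPy's global RNG, seeding that RNG before constructing the network should make the runs repeat:

```python
import numpy as np

np.random.seed(42)               # fix the global RNG before building the net
w_run1 = np.random.randn(1, 1)   # stand-in for one layer's initial weights

np.random.seed(42)               # same seed ...
w_run2 = np.random.randn(1, 1)   # ... same "initial weights"
```

With that, a `net = nl.net.newff(...)` built right after the seed call would start from identical weights on every run, and the two distinct error values in the log above would collapse to one.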

#### Why can I not get a correct answer on HackerRank? Python 3 [on hold]

I am writing a program to solve the problem shown in Test Case #1, but I always get a wrong answer on Test Case #2, because I never know what Test Case #2 is!

Why am I getting wrong in the second case?

Here is the problem proposed and my answer:

Task Complete the code in the editor below. The variables i, d, and s are already declared and initialized for you. You must declare 3 variables: one of type int, one of type double, and one of type String. Then you must read 3 lines of input from stdin and initialize your 3 variables. Finally, you must use the + operator to perform the following operations:

Print the sum of i plus your int variable on a new line. Print the sum of d plus your double variable to a scale of one decimal place on a new line. Concatenate s with the string you read as input and print the result on a new line.

Test Case #1 - Output expected:

16

8.0

HackerRank is the best place to learn and practice coding!

Test Case #2 - Output expected:

7

6.8

is my favorite plataform!

Here's my code:

def test_case_1():
    ii = int()
    dd = float()
    ss = str()
    i = 4
    d = 4.0
    s = 'HackerRank '
    ii = 12
    dd = 4.0
    ss = "is the best place to learn and practice coding!"
    total_int = i + ii
    print(total_int)
    total_double = d + dd
    print(total_double)
    print(s + ss)
    memory = open('memory.txt', 'w')
    memory.write("1")
    memory.close()

def test_case_2():
    ii = int()
    dd = float()
    ss = str()
    i = 4
    d = 4.0
    s = 'HackerRank '
    ii = 3
    dd = 2.8
    ss = "is my favorite plataform!"
    total_int = i + ii
    print(total_int)
    total_double = d + dd
    print(total_double)
    print(s + ss)
    memory = open('memory.txt', 'w')
    memory.write("0")
    memory.close()

def starting():
    memory = open('memory.txt', 'r')
    ted = []
    for line in memory:
        ted.append(line)
    memory.close()
    return ted

def test(ted, main):
    if ted == ['0']:
        test_case_1()
    elif ted == ['1']:
        test_case_2()

def main():
    try:
        memory = open('memory.txt', 'r')
        memory.close()
    except:
        memory = open('memory.txt', 'w')
        memory.write("0")
        memory.close()
    pronto = starting()
    test(pronto, main)

main()
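For what it's worth, the exercise doesn't require persisting state in a file: HackerRank feeds each test case's three values on stdin, so reading them fresh each run handles every hidden test case. A minimal sketch (the function name and calling convention are mine, not HackerRank's):

```python
def data_types(i, d, s, lines):
    """i, d, s: the pre-declared values; lines: the three stdin lines."""
    ii = int(lines[0])            # your int variable
    dd = float(lines[1])          # your double variable
    ss = lines[2]                 # your string variable
    return [str(i + ii), "{:.1f}".format(d + dd), s + ss]

# Test Case #1 from the question (i=4, d=4.0, s='HackerRank '):
out = data_types(4, 4.0, 'HackerRank ',
                 ['12', '4.0',
                  'is the best place to learn and practice coding!'])
```

Because nothing is cached between runs, the same function produces Test Case #2's output when given that case's (unknown to you) input lines.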

### Planet Theory

#### Some more bits from the Gathering for Gardner

I posted about the Gathering for Gardner conference and about some of the talks I saw here. Today I continue with a few more talks.

Playing Penney's game with Roulette by Robert Vallen. Penney's game is the following: let k be fixed. Alice and Bob pick different elements of {H,T}^k. They flip a coin until one of their sequences shows up, and that person wins. Which sequences have the best probability of winning?
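The winning probabilities in Penney's game can be computed exactly with Conway's "leading numbers" algorithm; a small sketch (a standard result, not from the talk):

```python
def correlation(x, y):
    """Conway's leading number L(x, y): sum of 2**(k-1) over each k for
    which the last k characters of x equal the first k characters of y."""
    return sum(2 ** (k - 1)
               for k in range(1, min(len(x), len(y)) + 1)
               if x[-k:] == y[:k])

def p_first_wins(a, b):
    """Probability that pattern a shows up before pattern b in fair flips.
    Conway's formula: odds(a):odds(b) = (L(b,b)-L(b,a)) : (L(a,a)-L(a,b))."""
    wins_a = correlation(b, b) - correlation(b, a)
    wins_b = correlation(a, a) - correlation(a, b)
    return wins_a / (wins_a + wins_b)
```

For instance, `p_first_wins('HH', 'TH')` comes out to 0.25: TH beats HH three times out of four, which is the nontransitive surprise the game is built on.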

New Polyhedral dice by Robert Fathauer, Henry Segerman, Robert Bosch. This is a good example of how my mentality (and possibly yours) differs from others'. When I hear "60-sided dice" I think "p1,...,p60, all between 0 and 1 and adding up to 1". I also thought that only the Platonic solids could be used to form fair dice (so only 4-sided, 6-sided, 8-sided, 12-sided, and 20-sided dice can be made). NOT so. These authors actually MAKE real dice, and they do not have to be Platonic solids. Here is their website.

Numerically balanced dice by Robert Bosch (paper is here). Why do the opposite sides of dice sum to the same thing? Read the paper to find out!

Secret messages in juggling and card shuffling by Erik Demaine. Erik Demaine was one of about 4 theoretical computer scientists I met at the conference, though Erik is so well rounded that calling him a theoretical computer scientist doesn't seem quite right. I had never met him before, which surprised me. In this talk he showed us some new fonts, one using juggling. See here for an example of juggling fonts, co-authored with his father Martin.

Fibonacci Lemonade by Andrea Johanna Hawksley. Put in the lemon and sugar in Fibonacci-number increments. Here is their website. In my first post I said the talks were on a variety of topics and then presented mostly math talks. This talk is an example of that variety. There were other talks involving the Fibonacci numbers. I was surprised by this since they aren't that special (see here).

Penis Covers and Puzzles: Brain Injuries and Brain Health by Gini Wingard-Phillips. She recounted having various brain injuries and how working on mathematical puzzles, of the type Martin Gardner popularized, HELPED HER RECOVER! As for the title: people with brain injuries sometimes have a hard time finding the words for things, so they use other words. In this case she wanted her husband to buy some condoms but couldn't think of the word, so she said Penis Covers instead.

Loop- Pool on an Ellipse by Alex Bellos. Similar in my mind to the Polyhedral dice talk (you'll see why). We all know that if you build an elliptical pool table with a hole at one focus, then a ball placed at the other focus and hit hard enough WILL go into the hole. But Alex Bellos actually MAKES these pool tables (see here if you want to buy one for $20,000). He told us the history: someone else tried to make one in 1962 but nobody bought them (I wonder if anyone is going to buy his), and Alex had problems with friction, as you may recall that the property only holds on a frictionless surface. So his game does require some skill. The similarity to dice is that I (and you?) are used to thinking about dice and ellipses abstractly, not as objects people actually build.

This post is getting long so I'll stop here and report more in a later post. Why so many posts? Six minute talks that I can actually understand and am delighted to tell you about!

### QuantOverflow

#### Calculation of Bond Carry from Synthetic future prices

I have only government bond yields with different maturities. How can I obtain synthetic future prices on bonds? After obtaining the future prices, I am supposed to compute the return and carry returns.

#### On the reflection of a stochastic integral

Let ${(I_t)}_{t\geq 0}$ be a stochastic integral defined by
$$I_t=\int_{0}^{t}\theta_s\,dW_s,$$
where $W$ is a standard Brownian motion defined on $(\Omega,\mathcal{F},{(\mathcal{F}_t)}_{t\geq 0},\mathbb{P})$ and $\theta$ is a stochastic process adapted to $\mathcal{F}_t$ satisfying the following integrability condition:
$$E\left(\int_{0}^{t}\theta_s^2\, ds\right)<\infty \quad \forall t> 0.$$
We define the first passage time at $a$ for the Brownian motion $W$ by the random variable
$$\tau_a = \inf\{t\geq 0 : W_t\geq a\},$$
where $a>0$. It is possible to show that $\tau_a$ is a stopping time.
Moreover, by virtue of the reflection principle, we know that the process
\begin{equation*} Z_t = \begin{cases} W_t & \text{if } 0 \leq t \leq \tau_a \\ 2a-W_t & \text{if } t > \tau_a \end{cases} \end{equation*}
also follows a standard Brownian motion under $\mathbb{P}$.

My question is as follows: is it possible to rewrite the process $I$ in relation to the process $Z$? I would like your opinion on this issue; thank you in advance.

### CompsciOverflow

#### What are the Correct Conditions for the Akra-Bazzi Master Theorem? (Cross-Post) [on hold]

NOTE: this question is cross-posted.

The Akra-Bazzi method solves recurrences of the form
$$T(n) = g(n) + \sum\limits_{i=1}^k a_iT(b_in + h_i(n)).$$
In the Wikipedia article about the topic, it says that the condition on $g(n)$ is
$$g(x) \in O(x^c)$$
for some constant $c$, where $O$ is Big Oh notation. However, elsewhere it says that the condition on $g(\cdot)$ is "polynomial growth" (for example here):

We say that $g(n)$ satisfies the polynomial-growth condition if there exist positive constants $c_1,c_2$ such that for all $n \ge 1$, for all $1 \le i \le k$, and for all $u \in [b_i n, n]$,
$$c_1 g(n) \le g(u) \le c_2 g(n).$$

My question is this: are these two conditions equivalent? Does one imply the other? Or is Wikipedia's statement of the theorem simply incorrect?

EDIT: What confuses me about how the two conditions relate to each other is that the Big Oh condition compares the growth of the function to a polynomial, while the "polynomial growth condition" compares the growth of the function to itself, not to any polynomial. So why is the "polynomial growth condition" even called that? If it is equivalent to the Big Oh condition, why don't all references state it like that? Isn't the first condition much easier to understand?
I checked the Wikipedia references and googled Akra-Bazzi, but all of the other papers I found gave the "polynomial growth condition", so I have no idea where the condition on Wikipedia came from. The Wikipedia condition worked correctly for my homework (using $g(n)=n^2 \log n$), so I suspect it is sufficient but not necessary for the "polynomial growth condition"; I'm not sure, though, because I can't find any sources corroborating that and I keep getting stuck when I attempt the argument myself.

#### Complexity of Hamiltonian path and clique problem

I came across this question: if we want to check whether a graph contains both a Hamiltonian path and a clique, would this problem be NP-complete? I know that a clique contains a Hamiltonian path and that both problems are NP-complete, but I am uncertain whether something would be different if we check both at the same time.

### Planet Theory

#### "Largely just men doing sums": My review of the excellent Ramanujan film

[Warning: This movie review contains spoilers, as well as a continued fraction expansion.]

These days, it takes an extraordinary occasion for me and Dana to arrange the complicated, rocket-launch-like babysitting logistics involved in going out for a night at the movies. One such occasion was an opening-weekend screening of The Man Who Knew Infinity, the new movie about Srinivasa Ramanujan and his relationship with G. H. Hardy, followed by a Q&A with Matthew Brown (who wrote and directed the film), Robert Kanigel (who wrote the biography on which the film was based), and Fields Medalist Manjul Bhargava (who consulted on the film).

I read Kanigel's The Man Who Knew Infinity in the early nineties; it was a major influence on my life.
There were equations in that book to stop a nerdy 13-year-old's pulse, like
$$1+9\left( \frac{1}{4}\right) ^{4}+17\left( \frac{1\cdot5}{4\cdot8}\right) ^{4}+25\left( \frac{1\cdot5\cdot9}{4\cdot8\cdot12}\right) ^{4}+\cdots =\frac{2^{3/2}}{\pi^{1/2}\Gamma\left( 3/4\right) ^{2}}$$
$$\frac{1}{1+\frac{e^{-2\pi}}{1+\frac{e^{-4\pi}}{1+\frac{e^{-6\pi}}{1+\cdots}}}}=\left( \sqrt{\frac{5+\sqrt{5}}{2}}-\frac{\sqrt{5}+1}{2}\right) \sqrt[5]{e^{2\pi}}$$

A thousand pages of exposition about Ramanujan's mysterious self-taught mathematical style, the effect his work had on Hardy and Littlewood, his impact on the later development of analysis, etc., could never replace the experience of just staring at these things! Popularizers are constantly trying to "explain" mathematical beauty by comparing it to art, music, or poetry, but I can best understand art, music, and poetry if I assume other people experience them like the above identities. Across all the years and cultures and continents, can't you feel Ramanujan himself leaping off your screen, still trying to make you see this bizarre aspect of the architecture of reality that the goddess Namagiri showed him in a dream?

Reading Kanigel's book, I was also entranced by the culture of early-twentieth-century Cambridge mathematics: the Tripos, Wranglers, High Table. I asked: why was I here and not there? And even though I was (and remain) at most $1729^{-1729}$ of a Ramanujan, I could strongly identify with his story, because I knew that I, too, was about to embark on the journey from total scientific nobody to someone whom the experts might at least take seriously enough to try to prove wrong.

Anyway, a couple years after reading Kanigel's biography, I went to the wonderful Canada/USA MathCamp, and there met Richard K. Guy, who'd actually known Hardy. I couldn't have been more impressed had Guy visited Platonic heaven and met π and e there. To put it mildly, no one in my high school had known G. H. Hardy.
I often fantasized (this was the nineties) about writing the screenplay myself for a Ramanujan movie, so that millions of moviegoers could experience the story as I did. Incidentally, I also fantasized about writing screenplays for Alan Turing and John Nash movies. I do have a few mathematical biopic ideas that haven't yet been taken, and for which any potential buyers should get in touch with me:

• Radical: The Story of Évariste Galois
• Give Me a Place to Stand: Archimedes' Final Days
• Mathématicienne: Sophie Germain In Her Prime
• The Prime Power of Ludwig Sylow (OK, this last one would be more of a limited-market release)

But enough digressions; how was the Ramanujan movie?

Just as Ramanujan himself wasn't an infallible oracle (many of his claims, e.g. his formula for the prime counting function, turned out to be wrong), so The Man Who Knew Infinity isn't a perfect movie. Even so, there's no question that this is one of the best and truest movies ever made about mathematics and mathematicians, if not the best and truest. If you're the kind of person who reads this blog, go see it now. Don't wait! As they stressed at the Q&A, the number of tickets sold in the first couple weeks is what determines whether or not the movie will see a wider release.

More than A Beautiful Mind or Good Will Hunting or The Imitation Game, or the play Proof, or the TV series NUMB3RS, the Ramanujan movie seems to me to respect math as a thing-in-itself, rather than just a tool or symbol for something else that interests the director much more. The background to the opening credits (and what better choice could there be?) is just page after page from Ramanujan's notebooks. Later in the film, there's a correct explanation of what the partition function P(n) is, and of one of Ramanujan's and Hardy's central achievements, which was to give an asymptotic formula for P(n), namely
$$P(n) \approx \frac{e^{\pi \sqrt{2n/3}}}{4\sqrt{3}\,n},$$
and to prove the formula's correctness.
The film also makes crystal-clear that pure mathematicians do what they do not because of applications to physics or anything else, but simply because they feel compelled to: for the devout Ramanujan, math was literally about writing down "the thoughts of God," while for the atheist Hardy, math was a religion-substitute. Notably, the movie explores the tension between Ramanujan's untrained intuition and Hardy's demands for rigor in a way that does them both justice, resisting the Hollywood urge to make intuition 100% victorious and rigor just a stodgy punching bag to be defeated.

For my taste, the movie could've gone even further in the direction of "letting the math speak": for example, it could've explained just one of Ramanujan's infinite series. Audiences might even have liked some more T&A (theorems and asymptotic bounds). During the Q&A that I attended, I was impressed to see moviegoers repeatedly pressing a somewhat-coy Manjul Bhargava to explain Ramanujan's actual mathematics (e.g., what exactly were the discoveries in his first letter to Hardy? what was in Ramanujan's Lost Notebook that turned out to be so important?). Then again, this was Cambridge, MA, so the possibility should at least be entertained that what I witnessed was unrepresentative of American ticket-buyers.

From what I've read, the movie is also true to South Indian dress, music, religion, and culture. Yes, the Indian characters speak to each other in English rather than Tamil, but Brown explained that as a necessary compromise (not only for the audience's sake, but also because Dev Patel and the other Indian actors didn't speak Tamil).

Some reviews have mentioned issues with casting and characterization. For example, Hardy is portrayed by Jeremy Irons, who's superb but also decades older than Hardy was at the time he knew Ramanujan. Meanwhile Ramanujan's wife, Janaki, is played by a fully-grown Devika Bhise; the real Janaki was nine (!)
when she married Ramanujan, and fourteen when Ramanujan left for England. J. E. Littlewood is played as almost a comic-relief buffoon, so much so that it feels incongruous when, near the end of the film, Irons-as-Hardy utters the following real-life line:

"I still say to myself when I am depressed and find myself forced to listen to pompous and tiresome people, 'Well, I have done one thing you could never have done, and that is to have collaborated with Littlewood and Ramanujan on something like equal terms.'"

Finally, a young, mustachioed Bertrand Russell is a recurring character. Russell and Hardy really were friends and fellow WWI pacifists, but Hardy seeking out Bertie's advice about each Ramanujan-related development seems almost certainly just an irresistible plot device.

But none of that matters. What bothered me more were the dramatizations of the prejudice Ramanujan endured in England. Ramanujan is shown getting knocked to the ground, punched, and kicked by British soldiers barking anti-Indian slurs at him; he then shows up for his next meeting with Hardy covered in bruises, which Hardy (being aloof) neglects to ask about. Ramanujan is also depicted getting shoved, screamed at, and told never to return by a math professor whom he humiliates during a lecture.

I understand why Brown made these cinematic choices: there's no question that Ramanujan experienced prejudice and snobbery in Cambridge, and that he often felt lonely and unwelcome there. And it's surely easier to show Ramanujan literally getting beaten up by racist bigots than to depict his alienation from Cambridge society as the subtler matter that it most likely was. To me, though, that's precisely why the latter choice would've been even more impressive, had the film managed to pull it off.
Similarly, during World War I, the film shows not only Trinity College converted into a military hospital, and many promising students marched off to their deaths (all true), but also a shell exploding on campus near Ramanujan, after which Ramanujan gazes in horror at the bleeding dead bodies. Like, isn't the truth here dramatic enough?

One other thing: the movie leaves you with the impression that Ramanujan died of tuberculosis. More recent analysis concluded that it was probably hepatic amoebiasis that he brought with him from India, something that could've been cured with the medicine of the time, had anyone correctly diagnosed it. (Incidentally, the film completely omits Ramanujan's final year, back in India, when he suffered a relapse of his illness and slowly withered away, yet, with Janaki by his side, continued to do world-class research and exchanged letters with Hardy until the very last days. Everyone I read commented that this was "the right dramatic choice," but... I dunno, I would've shown it!)

But enough! I fear that to harp on these defects is to hold the film to impossibly-high, Platonic standards, rather than standards that engage with the reality of Hollywood. An anecdote that Brown related at the end of the Q&A session brought this point home for me. Apparently, Brown struggled for an entire decade to attract funding for a film about a turn-of-the-century South Indian mathematician visiting Trinity College, Cambridge, whose work had no commercial or military value whatsoever. At one point, Brown was actually told that he could get the movie funded if he'd agree to make Ramanujan fall in love with a white nurse, so that a British starlet who would sell tickets could be cast as his love interest. One can only imagine what a battle it must have been to get a correct explanation of the partition function onto the screen.
In the end, though, nothing made me appreciate The Man Who Knew Infinity more than reading negative reviews of it, like this one by Olly Richards:

"Watching someone balancing algorithms or messing about with multivariate polynomials just isn't conducive to urgently shovelling popcorn into your face. Difficult to dislike, given its unwavering affection for its subject, The Man Who Knew Infinity is nevertheless hamstrung by the dryness of its subject ... Sturdy performances and lovely scenery abound, but it's still largely just men doing sums; important sums as it turns out, but that isn't conveyed to the audience until the coda [which mentions black holes] tells us of the major scientific advances they aided."

On behalf of mathematics, on behalf of my childhood self, I'm grateful that Brown fought this fight, and that he won as much as he did. Whether you walk, run, board a steamship, or take taxi #1729, go see this film.

Addendum: See also this review by Peter Woit, and this one in the Notices of the AMS by Ramanujan expert George Andrews.

### CompsciOverflow

#### Implication of polynomial solution to subset sum problem

Hi, I am trying to get the concept right: if subset sum, an NP-complete problem, yields a polynomial-time solution, does that mean P=NP? Thanks for clarity.

### Halfbakery

#### Extra Smartphone Ports (0.5)

### CompsciOverflow

#### How to write the turing machine processing operations?

I have this Turing machine example given in my book, for the language $0^n1^n$. I understand how it works because it's very similar to a Finite State Machine. But what I want to know is the following: what is the $\vdash$ ("|-") symbol for, and what are the states $q_0$, $q_1$ doing in the middle of everything? Unfortunately my textbook doesn't explain any of it.

### Planet Emacsen

#### Pragmatic Emacs: Using the zenburn theme

A few people have asked about the theme I use in Emacs, as seen in the screenshots and code snippets I use.
It is zenburn, and I find it very easy on the eye; it also has good contrast and a nice colour palette for syntax highlighting. Install and activate the theme by adding the following to your emacs config file:

(use-package zenburn-theme
  :ensure t
  :config
  (load-theme 'zenburn t))

### CompsciOverflow

#### Prove Undecidability Without Using Rice's Theorem

Show that checking whether a TM accepts some input string of length greater than some constant $k$ is undecidable. Here the constant $k$ is publicly known.

I tried solving this problem by reducing from the Acceptance Problem, i.e. the problem where we have to check whether a TM $M$ accepts a string $w$ or not. One idea I had was to modify the input $M$ to $M'$ such that $M'$ simply increases the length of the string to become greater than $k$, but that is wrong, since by "input" we strictly mean what was provided, and anything that the TM does cannot be considered part of the input. How can I solve this?

### Halfbakery

#### Binary Sudoku (-1.5)

### QuantOverflow

#### What does it mean for an option strategy to be leveraged

Probably a newbie question, but what do traders mean when they say that an option strategy is leveraged? And when can we say that this is the case?

#### reference for elementary mortgage math

I have a student doing a project on default rates & prepayment rates for mortgages. She would like to include a section on how these quantities affect pricing, and so would like to reference a formula that gives the value of a mortgage in terms of them, say for constant values of them. I think it used to be possible to find such formulas in one or both of Fabozzi's books (the fixed income handbook and the mortgage handbook), but the editions she found in the library did not in fact have them. So the QUESTION is: does someone know a readily available source that has a formula for the value of a mortgage in terms of these quantities, or basically the same thing (PSA, SMM, etc.)?
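Pending a textbook reference for the mortgage question above, the basic relationship is easy to state: project each month's level payment plus prepayments and discount them back. A minimal sketch with a constant SMM and no defaults (all parameter names are illustrative, not from any of the cited handbooks):

```python
def mortgage_value(balance, note_rate, n_months, smm, disc_rate):
    """PV of a level-payment mortgage with a constant per-month prepayment
    rate (SMM), no defaults and no servicing: project each month's payment
    plus prepayment and discount at disc_rate."""
    r = note_rate / 12.0
    d = disc_rate / 12.0
    bal = float(balance)
    pv = 0.0
    for t in range(1, n_months + 1):
        # level payment on the current balance over the remaining term
        pmt = bal * r / (1 - (1 + r) ** -(n_months - t + 1))
        interest = bal * r
        sched_prin = pmt - interest
        prepay = (bal - sched_prin) * smm  # prepayment at par
        pv += (pmt + prepay) / (1 + d) ** t
        bal -= sched_prin + prepay
    return pv
```

A handy sanity check: when the discount rate equals the note rate, the value is par regardless of the SMM, since prepayments occur at par; the pricing effect of prepayment appears only once the discount rate differs from the coupon.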
### CompsciOverflow

#### What is wrong with this reasoning that finding the genus of a degree 3 bipartite graph is NP-complete?

Finding the genus of a bipartite graph is $NP$-complete, and finding the genus of a degree $3$ graph is $NP$-complete, so finding the genus of a degree $3$ bipartite graph is $NP$-complete. Though the conclusion could be right, is there any harm in reasoning this way about $NP$-completeness?

### StackOverflow

#### Tensorflow conv2d_transpose size error "Number of rows of out_backprop doesn't match computed"

I am creating a convolutional autoencoder in tensorflow. I got this exact error:

tensorflow.python.framework.errors.InvalidArgumentError: Conv2DBackpropInput: Number of rows of out_backprop doesn't match computed: actual = 8, computed = 12 [[Node: conv2d_transpose = Conv2DBackpropInput[T=DT_FLOAT, data_format="NHWC", padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/cpu:0"](conv2d_transpose/output_shape, Variable_1/read, MaxPool_1)]]

Relevant code:

l1d = tf.nn.relu(tf.nn.conv2d_transpose(l1da, w2, [10, 12, 12, 32],
                                        strides=[1, 1, 1, 1], padding='SAME'))

where

w2 = tf.Variable(tf.random_normal([5, 5, 32, 64], stddev=0.01))

I checked the shape of the input to conv2d_transpose, i.e. l1da, and it is correct (10x8x8x64). The batch size is 10, the input to this layer is in the form of 8x8x64, and the output is supposed to be 12x12x32. What am I missing?

### CompsciOverflow

#### Problem with external keyboard and NumLock [on hold]

I've got some problems with my laptop's numeric keys. I bought an external keyboard for my laptop because the laptop doesn't have a separate numeric keypad; the numeric keys are embedded in the main keyboard, for example J is 3. The problem is that when I turn NumLock on, the external keyboard works well but I can't use my laptop keyboard fully, because some keys type numbers. And when I turn NumLock off, the external keyboard doesn't work. Thanks in advance. (Apologies for my English.)
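For the conv2d_transpose question above, the arithmetic is the giveaway: with padding='SAME' the spatial output of a transposed convolution is input_size * stride, so stride 1 on an 8x8 input can only yield 8x8, never 12x12, while 'VALID' padding with a 5x5 kernel gives 8 + (5 - 1) = 12. A TensorFlow-free sketch of the shape rule (mirroring, to my understanding, TF's conv2d_transpose semantics):

```python
def conv2d_transpose_out_size(in_size, stride, kernel, padding):
    """Spatial output size of a 2-D transposed convolution.

    'SAME' inverts out = ceil(in / stride); 'VALID' inverts
    out = ceil((in - kernel + 1) / stride) of the forward convolution.
    """
    if padding == 'SAME':
        return in_size * stride
    if padding == 'VALID':
        return in_size * stride + max(kernel - stride, 0)
    raise ValueError("unknown padding: %r" % padding)
```

So with the shapes in the question (8x8 input, 5x5 kernel, stride 1), switching to padding='VALID' is one way to make the requested 12x12 output shape consistent.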
### QuantOverflow

#### Differences between editions of Security Analysis by Graham and Dodd?

Where can I find a comparison of the contents, a list of everything that changed, or the differences among the different editions of the book Security Analysis by Benjamin Graham & David Dodd? There are six editions of the book: 1934, 1940, 1951, 1962, 1988, and 2008. Do I have to read all of them and compare them myself, or did someone already do that?

Edit: I have already read The Intelligent Investor and the sixth edition of Security Analysis. This question is about the differences between the different editions. The sixth edition, if I understand correctly, is based on the 1940 edition.

### UnixOverflow

#### A Question of Two Avahi

I have two boxen with avahi 0.6.31 installed: one runs Ubuntu 14.04, and I am in the process of setting up the other with FreeBSD 10.3. I also have a Mac running OS X 10.4. The Ubuntu and FreeBSD machines can resolve each other's addresses, as can the Ubuntu and the Mac. The Mac and the FreeBSD box, however, cannot resolve each other. Can anyone tell me what I should look at to remedy this problem?

### CompsciOverflow

#### Machine learning algorithm for predicting binary decisions on a large, underrepresented dataset

I would like to create a classifier which works on a relatively large dataset (about 30k samples) with circa 20 attributes and a binary decision, but which contains a relatively small number of samples with, say, a "Yes" decision. That is, the data for building the classifier seems underrepresented. My question is: are there any algorithms which work well with these kinds of datasets? So far I have tried C4.5 (J48, actually), some basic SVM algorithms, Naive Bayes, and an MLP; however, each method failed to learn the dependencies in the data well (accuracy was at the level of about 90%, and the underrepresented decision was... unrepresented by the classifier too). I'm using Weka, if this makes any difference.
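One standard remedy for the class imbalance described above, independent of the particular learner, is to resample the training set so the rare "Yes" class carries more weight; a minimal oversampling sketch in Python (Weka's Resample filter and the SMOTE package play a similar role):

```python
import random

def oversample_minority(rows, labels, seed=0):
    """Resample minority-class rows (with replacement) until both classes
    have equal counts; a crude, illustrative stand-in for Weka's
    Resample/SMOTE filters."""
    rng = random.Random(seed)
    pos = [i for i, lab in enumerate(labels) if lab == 1]
    neg = [i for i, lab in enumerate(labels) if lab == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    idx = pos + neg + extra
    rng.shuffle(idx)
    return [rows[i] for i in idx], [labels[i] for i in idx]
```

Because raw accuracy is misleading at 90%+ imbalance, it also helps to evaluate with per-class recall or F1 rather than accuracy; the resampling above only changes what the learner sees, not how it is scored.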
### Christian Neukirchen

#### 01may2016

### StackOverflow

#### Scikit-Learn: manually specifying max_features in RFECV() - how many features get ranked?

I have followed this Scikit-Learn example in Python to obtain .feature_importances_ from a forest estimator. In that example, ExtraTreesClassifier() was used with its default hyperparameter settings; this means max_features='auto'. The output of this example is a plot of importances for 10 features.

Question 1: When I re-run this example with max_features=2, the plot still shows feature importances for all 10 features. Should it only show the importances for 2 features?

Question 2: Now I would like to use ExtraTreesClassifier(max_features=2) with RFECV(). From the RFECV() docs, RFECV() assigns the best features a rank of 1; we can see this in the .ranking_ attribute of RFECV(). However, if I specify the estimator to be ExtraTreesClassifier(max_features=2), does RFECV() use 2 features in its estimator and only return ranks for 2 features? Or does it ignore max_features and return ranks for all the features?

### CompsciOverflow

#### What are some real world applications of the Battleship puzzle?

Aside from data reconstruction, what would be some real-world applications similar to solving the Battleship puzzle?

### QuantOverflow

#### Scraping options data and returning ticker symbols of companies meeting certain criteria [on hold]

This is my first post on this site and I look forward to becoming more involved as my coding skills increase. My first question involves scraping options (calls and puts) data from the web using Yahoo or Google Finance. I am interested in running code, in R, which will look at the asking price for a call/put on a given stock.
Specifically, I hope to run code which examines all optionable stocks (a CSV file containing all symbols can be found at https://www.cboe.com/tradtool/symbols/symboldirectory.aspx) and returns a list of stock symbols corresponding to those stocks which have an asking price for a given call/put at or below a certain price, with a strike price within a given percentage of the current trading price.

For example, assume that I am looking at the hypothetical company ABC, which is currently trading at 100 dollars a share. If there exists a put or call option on this stock for a 1 dollar premium or less, with a strike price between 90 and 110 dollars (e.g. within 10% of the current price), I would be interested in examining options on this stock. Expanding this, I would be interested in generating an algorithm which would search all optionable stocks and return a list of stock symbols corresponding to those stocks meeting this criterion.

I have extensively searched through the available resources on this site and have found some insight into methods of scraping options data, specifically using the script presented below (gathering data for Apple) and described in https://mktstk.com/2014/12/29/start-trading-like-a-quant-download-option-chains-from-google-finance-in-r/.

library(RCurl)
library(jsonlite)

getOptionQuote <- function(symbol){
  output = list()
  url = paste('http://www.google.com/finance/option_chain?q=', symbol, '&output=json', sep = "")
  x = getURL(url)
  fix = fixJSON(x)
  json = fromJSON(fix)
  numExp = dim(json$expirations)[1]
for(i in 1:numExp){
y = json$expirations[i,]$y
m = json$expirations[i,]$m
d = json$expirations[i,]$d
expName = paste(y, m, d, sep = "_")
if (i > 1){
url = paste('http://www.google.com/finance/option_chain?q=', symbol, '&output=json&expy=', y, '&expm=', m, '&expd=', d, sep = "")
json = fromJSON(fixJSON(getURL(url)))
}
output[[paste(expName, "calls", sep = "_")]] = json$calls output[[paste(expName, "puts", sep = "_")]] = json$puts
}
return(output)


}

  fixJSON <- function(json_str){
stuff = c('cid','cp','s','cs','vol','expiry','underlying_id','underlying_price',
'p','c','oi','e','b','strike','a','name','puts','calls','expirations',
'y','m','d')

for(i in 1:length(stuff)){
replacement1 = paste(',"', stuff[i], '":', sep = "")
replacement2 = paste('\\{"', stuff[i], '":', sep = "")
regex1 = paste(',', stuff[i], ':', sep = "")
regex2 = paste('\\{', stuff[i], ':', sep = "")
json_str = gsub(regex1, replacement1, json_str)
json_str = gsub(regex2, replacement2, json_str)
}
return(json_str)


}

aapl_opt = getOptionQuote("AAPL")


However, this code does not support examining multiple stocks at once and thus has not been successful for my application. Any help would be greatly appreciated!

Thanks

### StackOverflow

#### Tensorflow LSTM model testing

I'm new to LSTM and Tensorflow, and I'm trying to use an LSTM model to learn and then classify some huge data set that I have. (I'm not worried about the accuracy my intention is to learn). I tried to implement the model in a similar way as in the PTB word prediction tutorial that uses LSTM. The code in the tutorial (https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/rnn/ptb/ptb_word_lm.py) uses the below line to run the session using the model

    cost, state, _ = session.run([m.cost, m.final_state, eval_op],
                                 {m.input_data: x,
                                  m.targets: y,
                                  m.initial_state: state})


I modified this for my example as below (to get the logits and work with it):

    cost, state, _, output, logits = session.run(
        [m.cost, m.final_state, eval_op, m.output, m.logits],
        {m.input_data: x,
         m.targets: y,
         m.initial_state: state})


So my questions if someone could help are as below:

• How can the model built during training be used for testing? What exactly is happening when the tutorial uses three models, one each for training, validation, and testing?
• What about the targets during testing (if I don't know them, say in a classification problem)? What changes can be made in run_epoch() so as to use the model built during training?
• Just another question: it's difficult to debug TensorFlow graphs (and I found it difficult to understand the TensorBoard visualizer too), and I didn't find good resources for learning TensorFlow (the website seems to be lacking structure/documentation). What other resources/debugging methods are there?

Thanks.

#### Is there any python library or API like deeplearning4j?

I need to do some deep learning work in Python, mainly image-processing based work. Does Python have any standard library or API that works like deeplearning4j?

### StackOverflow

#### Difference between Probabilistic kNN and Naive Bayes

I'm trying to modify a standard kNN algorithm to obtain the probability of belonging to a class instead of just the usual classification. I haven't found much information about probabilistic kNN, but as far as I understand, it works similarly to kNN, with the difference that it calculates the percentage of examples of every class inside the given radius.

So I wonder, what's the difference then between Naive Bayes and probabilistic kNN? The only difference I can spot is that Naive Bayes takes the prior probability into consideration, while PkNN does not. Am I getting it wrong?

### CompsciOverflow

#### Proving a binary heap has $\lceil n/2 \rceil$ leaves

I'm trying to prove that a binary heap with $n$ nodes has exactly $\left\lceil \frac{n}{2} \right\rceil$ leaves, given that the heap is built in the following way:

Each new node is inserted via percolate up. This means that each new node must be created at the next available child. What I mean by this is that children are filled level-down, and left to right. For example, the following heap:

      0
     / \
    1   2


would have to have been built in this order: 0, 1, 2. (The numbers are just indexes, they give no indication of the actual data held in that node.)

This has two important implications:

1. There can exist no node on level $k+1$ without level $k$ being completely filled

2. Because children are built left to right, there can be no "empty spaces" between the nodes on level $k+1$, or situations like the following:

        0
       / \
      1   2
     / \   \
    3   4   6


(This would be an illegal heap by my definition.) Thus, a good way to think of this heap is as an array implementation of a heap, where there can't be any "jumps" in the indices of the array.

So, I was thinking induction would probably be a good way to do this... perhaps something dealing with even and odd cases for $n$. For example, some induction using the fact that heaps built in this fashion must have exactly one internal node with a single child when $n$ is even, and no such node when $n$ is odd. Ideas?
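As a sanity check before attempting the induction (numerical evidence, not a proof): in the array view described above, with 1-based indexing, node $i$ has children $2i$ and $2i+1$, so $i$ is a leaf exactly when $2i > n$, which gives $n - \lfloor n/2 \rfloor = \lceil n/2 \rceil$ leaves. A quick Python sketch:

```python
import math

def num_leaves(n):
    # In a 1-indexed array heap, node i has children 2i and 2i+1,
    # so node i is a leaf iff 2i > n; those are i = n//2 + 1, ..., n.
    return n - n // 2

def num_leaves_brute(n):
    # cross-check directly against the child-index definition
    return sum(1 for i in range(1, n + 1) if 2 * i > n)

for n in range(1, 200):
    assert num_leaves(n) == num_leaves_brute(n) == math.ceil(n / 2)
```

This also suggests the shape of the inductive step: going from $n$ to $n+1$ adds a new leaf, and turns an existing leaf into an internal node exactly when $n+1$ is even.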

#### Analysis of Weighted Quick Union with Path Compression

I have searched the internet for an analysis of why WQUPC (weighted quick-union with path compression) is amortized $O(m \alpha(n))$ for $m$ operations on $n$ nodes ($\alpha(n)$ is the inverse Ackermann function).

I understand why it is $O(m \log^{\ast}(n))$, as proven on Wikipedia.

So the question is: why is WQUPC $O(m \alpha(n))$? I would prefer a rigorous analysis, but an intuitive explanation would also be useful.
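For reference, here is a minimal sketch of the structure under discussion (weighted quick-union by size, with two-pass path compression); the class and method names are my own, not from any particular textbook:

```python
class DisjointSet:
    """Weighted quick-union (union by size) with path compression."""

    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, x):
        # first pass: locate the root
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # second pass: path compression, pointing every node on the path at the root
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        # weighted union: hang the smaller tree under the larger
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
```

On the question itself: the $O(m\,\alpha(n))$ bound is due to Tarjan's amortized analysis; Seidel and Sharir's later "top-down analysis of path compression" is often cited as a more accessible rigorous proof.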

#### What is the difference between centralised, decentralised, distributed, fully-distributed?

What is the difference between centralised, decentralised, distributed, fully-distributed, partially-centralised and partially-decentralised systems? Which type of topology does the system in the picture below have? For instance, A is a master node (coordinator) of A1, A2, A3, which makes this part a centralised system. A, B, C and D are all connected to each other, which makes this part a decentralised one. But then, what are fully-distributed, partially-centralised and partially-decentralised systems?

### StackOverflow

#### Binary Search Tree insert function in OCaml

Suppose we want to insert a smaller value. It will go to Node (insert x left, k, right). I don't understand how we can have insert x left when the function insert is declared as taking only one argument, the key. How can left also be passed to the insert function?

    type 'a bst_t =
      | Leaf
      | Node of 'a bst_t * 'a * 'a bst_t

    let rec insert x = function
      | Leaf -> Node (Leaf, x, Leaf)
      | Node (left, k, right) ->
        if x < k then Node (insert x left, k, right)
        else Node (left, k, insert x right)


### TheoryOverflow

#### Quantum algorithms for QED computations related to the fine structure constants

My question is about quantum algorithms for QED (quantum electrodynamics) computations related to the fine structure constant. Such computations (as explained to me) amount to computing Taylor-like series $$\sum c_k\alpha^k,$$ where $\alpha$ is the fine structure constant (around 1/137) and $c_k$ is the contribution of Feynman diagrams with $k$ loops.

This question was motivated by Peter Shor's comment (about QED and the fine structure constant) in a discussion regarding quantum computers on my blog. For some background, here is a relevant Wikipedia article.

It is known that: a) the first few terms of this computation give very accurate estimates for relations between experimental outcomes, in excellent agreement with experiments; b) the computations are very heavy, and computing more terms is beyond our computational power; c) at some point the computation will explode; in other words, the radius of convergence of this power series is zero.

My question is very simple: can these computations be carried out efficiently on a quantum computer?

3) (Even weaker) Is it at least feasible to compute the estimates given by these QED computations for as long as they are relevant? (Namely, for those terms in the series that give a good approximation to the physics.)

A similar question applies to QCD computations for computing properties of the proton or neutron. (Aram Harrow made a related comment on my blog on QCD computations, and the comments by Alexander Vlasov are also relevant.) I would be happy to learn the situation for QCD computations as well.

Following Peter Shor's comment:

In other words

Can we efficiently approximate the answers for the actual physical quantities?

Another way to ask it:

Can we compute using quantum computers more and more digits of the fine structure constant, just like we can compute with a digital computer more and more digits of e and $\pi$?

(Ohh, I wish I was a believer :) )

### More background

The hope that computations in quantum field theory can be carried out efficiently with quantum computers was (perhaps) one of Feynman’s motivations for QC. Important progress towards quantum algorithms for computations in quantum field theories was achieved in this paper: Stephen Jordan, Keith Lee, and John Preskill, Quantum Algorithms for Quantum Field Theories. I don't know if the work by Jordan, Lee, and Preskill (or some subsequent work) implies an affirmative answer to my question (at least in its weaker forms).

### A related question on the physics side

I am also curious whether there are estimates for how many terms of the expansion we get before we witness the explosion. (To put it on more formal ground: are there estimates for the minimum $k$ for which $\alpha c_k/c_{k+1} > 1/5$, say?) And what quality of approximation can we expect when we use these terms? In other words, how much better can the results of these QED computations get with unlimited computational power?

Here are two related questions on the physics sister site. QED and QCD with unlimited computational power - how precise they are going to be?; The fine structure constant - can it genuinely be a random variable?

### infra-talk

#### Terraform Modules – My Sharing Wishlist

I’ve been writing a few Terraform modules recently with the aim of sharing them among a few different teams and there are a couple of things missing that I think would make reusable modules much more powerful.

The first and more generic issue is the inability to use more complex data structures. After you’ve spent a while using Terraform with AWS resources you’ll develop the urge to just create a hash of tags and use it nearly everywhere, hopefully with the ability to override a key/value or two when actually using the hash. If your teams are using tags, and you really should be, it’s very hard to write a reusable module if the tag names in use by each team are not identical. Because you can (currently) only pass strings around, and you’re unable to use a variable as a tag name, you’re stuck with requiring everyone to use exactly the same tag names or not providing any at all. There’s no middle ground available.

    tags {
      "${var.foo}" = "Baz"  # creates a Tag called literally '${var.foo}'
    }


My second current pain point, and the one I’m more likely to have missed a solution to, is the ability to conditionally add or remove resource attributes. The most recent time this has bitten me is when trying to generalise a module that uses Elastic Load Balancers. Sometimes you’ll want an ELB with a cert and sometimes you won’t. Using the current module system there’s no way to handle this case.

If I was to do the same kind of thing in CloudFormation I’d use the AWS::NoValue pseudo parameter.

    "DBSnapshotIdentifier" : {
      "Fn::If" : [
        "UseDBSnapshot",
        { "Ref" : "DBSnapshotName" },
        { "Ref" : "AWS::NoValue" }
      ]
    }


If DBSnapshotName has a value the DBSnapshotIdentifier property is present and set to that value. If it’s not defined then the property is not set on the resource.

As an aside, after chatting with @andrewoutloud, it’s probably worth noting that you can make entire resources optional using a count and setting it to 0 when you don’t want the resource to be included. While this is handy and worth having in your Terraform toolkit it doesn’t cover my use case.

    variable "include_rds" {
      default     = 0
      description = "Should we include a aws_db_instance? Set to 1 to include it"
    }

    resource "aws_db_instance" "default" {
      count = "${var.include_rds}"  # this serves as an if
      # ... snip ...
    }

I’m sure these annoyances will be ironed out in time, but it’s worth considering them and how they’ll impact the reusability of any modules you’d like to write or third-party code you’d want to import. At the moment it’s a hard choice between rewriting everything for my own use and getting all the things I need, or vendoring everything in and maintaining a branch with things like my own tagging scheme and required properties.

### CompsciOverflow

#### How do I decompose this relation from 3NF to BCNF?

Given a relation with attributes { A, X, Y, Z } and functional dependencies $AX\rightarrow Y, Y\rightarrow A, Z\rightarrow B$, I would think that this would be in 3NF because there are no partial or transitive dependencies. What I don't understand is how to decompose the relation to BCNF by ensuring no relation has a determinant that is not a candidate key.

#### Recurrence for number of ways to write n as the sum [migrated]

I'm trying to find the recurrence for this problem: given $m$ (odd), let $O(n,m)$ be the number of ways to write $n$ as the sum of odd positive integers at most $m$ (distinctness not required). Objective: come up with a recurrence for $O(n,m)$.

Here's my attempt:

    ways(m,n) = 0 if m > n
    ways(m,n) = 1 if m = n
    ways(m,n) = ways(m+1,n) + ways(m, n - m) otherwise

However, I don't think this is correct, as it doesn't account for $n$ having to be the sum of odd positive integers (at most $m$). It doesn't account for oddness.

### Halfbakery

#### Squeaky Toy Edition (0.5)

### CompsciOverflow

#### Building an undecidable T-Grammar

I am asked, "Show that these T-Grammars constitute a set of languages that are undecidable. Do this by building a T-Grammar for a Turing machine description. For a starting point you might think about machine configurations." I am not sure how configurations are supposed to help, but here is the grammar that I built: $$abc \to def;\ d \to a;\ e \to b;\ f \to c$$ Is this undecidable?
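On the recurrence question above (counting ways to write $n$ as a sum of odd positive integers at most $m$): one way to keep the parts odd is to step the largest allowed part down by 2, so it stays odd, rather than by 1. A hedged sketch of that idea (my formulation, not the asker's):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def ways(n, m):
    """Ways to write n as a sum of odd positive integers, each at most m (m odd)."""
    if n == 0:
        return 1  # the empty sum
    if n < 0 or m < 1:
        return 0
    # Either no part equals m (largest allowed odd part drops to m - 2),
    # or at least one part equals m (remove one copy of m and recurse).
    return ways(n, m - 2) + ways(n - m, m)
```

For example, ways(4, 3) counts 1+1+1+1 and 3+1, giving 2.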
### QuantOverflow

#### Black Scholes Constant Implied Volatility

I hope someone can clarify my ideas about the constant implied volatility in the classical Black-Scholes framework. As is well known, market practitioners quote the prices of vanilla call and put options in terms of implied volatilities. For inputs $K$, $S$, $r$, $T$ and the price of the option $V$, one can determine the implied volatility $\sigma$ such that $V=BS(K,S,r,T,\sigma)$ (1).

When the market-quoted implied volatilities are plotted against different strike prices for a fixed maturity $T$, the graph typically exhibits a 'smile' shape, hence the name volatility smile. Theory says that this implies a deficiency in the Black-Scholes model, since it assumes a constant volatility parameter, depending on neither $K$ nor $T$; hence the volatility smile should be flat.

Here my ideas get confused. Assuming that $S$, $r$ and $T$ remain constant, for a fixed market price $V$ of a vanilla option the implied volatility will vary depending on the value of the strike $K$ under the Black-Scholes model (1). Hence, if the implied volatility is plotted against different strikes for a fixed $V$, it will indeed show a smile behaviour, which is in contrast to what the theory states.

Furthermore, do the market-quoted implied volatilities that form the volatility smile correspond, according to the theory, to a fixed vanilla option price $V$ with varying $K$? I think I am making a mistake in my reasoning but I do not understand where. I would be glad if someone could point me in the right direction. Thanks in advance.

### CompsciOverflow

#### Convert non-integer decimal to octal

I follow a course in Computer Architecture and I'm doing exercises on number conversions. One of the questions asks me to convert 251.5625 to octal and hexadecimal base. No further info is given. Does this mean that I have to convert it and just ignore the point? Or what's the convention on something like this?
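On the conversion question above: the usual convention is to keep the radix point, converting the integer part by repeated division by the target base and the fractional part by repeated multiplication. A sketch (the helper name is mine):

```python
def to_base(x, base, max_frac_digits=8):
    """Convert a non-negative decimal number to the given base, keeping the radix point."""
    digits = "0123456789abcdef"
    int_part = int(x)
    frac = x - int_part
    # integer part: repeated division by the base, collecting remainders
    ip = ""
    while int_part:
        int_part, r = divmod(int_part, base)
        ip = digits[r] + ip
    ip = ip or "0"
    # fractional part: repeated multiplication by the base, collecting integer parts
    fp = ""
    while frac and len(fp) < max_frac_digits:
        frac *= base
        d = int(frac)
        fp += digits[d]
        frac -= d
    return ip + ("." + fp if fp else "")
```

Here 0.5625 = 4/8 + 4/64 = 9/16, so 251.5625 becomes 373.44 in octal and fb.9 in hexadecimal.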
### QuantOverflow

#### Kolmogorov-Smirnov test for Generalized Pareto Distribution

I've fitted my data to a generalized Pareto distribution so as to model the returns in the tails more accurately. The interior is fitted with kernel distributions. I would now like to test whether the original returns conform to the hypothesized distribution (i.e. the generalized Pareto distribution). Can I do this with the Kolmogorov-Smirnov test? I've already made QQ-plots; however, I would like to conduct a statistical significance test on top. Can someone help? I'm kind of struggling with implementing it in Matlab. Best

### CompsciOverflow

#### Turing Machine that always returns a blank tape

Is it possible to construct a Turing Machine such that, given any finite input on a tape $s$, it clears the tape in a finite amount of time? I have used such a TM as an intermediate step to show a reduction from the State Entry Problem to the "is $w \in L(M)$" problem, but I don't know if it is feasible to construct one. Even if we assume that the head of the TM always starts at the leftmost character on the tape and keeps moving right, clearing each symbol it encounters: if the tape is infinite, how will we know when to stop moving right?

### StackOverflow

#### AssertionError: Mismatch between dataset size and units in output layer

I want to do a NN classification using scikit-neuralnetwork. I have 5 classes, so in the output layer I have units=5, but I am getting this error: Mismatch between dataset size and units in output layer. I reshaped my y_train and applied the "Sigmoid" function to the output layer according to the documentation: http://scikit-neuralnetwork.readthedocs.io/en/latest/guide_model.html#classification

If you want to do multi-label classification, simply fit using a y array of integers that has multiple dimensions, e.g. shape (N, 3) for three different classes. Then, make sure the last layer is Sigmoid instead.
y_train shape is: (2115, 5), X_train shape is: (2115, 343). This is the code:

    import sknn.mlp as mlp
    from sknn.mlp import Classifier

    ip_layer = mlp.Layer('Sigmoid', units=1)
    hidden_layer = mlp.Layer('Tanh', units=100)
    op_layer = mlp.Layer('Sigmoid', units=5)

    nn = Classifier([ip_layer, hidden_layer, op_layer], n_iter=10000)
    nn.fit(X_train, y_train)

### Fefe

#### Pro tip: if you appear in porn or work as a sex ...

Pro tip: if you appear in porn or work as a sex worker, and this is embarrassing to you in front of the people around you, or you keep it secret from them, then don't upload photos of yourself to social media sites. On the other hand, one can of course ask why there needs to be a public face-recognition API with an app in the first place.

#### Iran's new parliament has more women than mullahs. However ...

Iran's new parliament has more women than mullahs. However clerical numbers have steadily fallen since 1980 with 153 elected in the second parliament, 85 in the 3rd, 67 in the 4th and 52 in the 5th. The outgoing legislature had only 27 men of the cloth. Of the 16 who will enter parliament next month 13 have conservative political leanings and 3 are reformists.

#### Helpful graphic on the progress of digitalization ...

#### You know the health-care system is in bad shape ...

You know the health-care system is in bad shape when hospital staff go on strike not for more money but for more nursing staff, because the wards are so massively overloaded.

#### Old and busted: shell companies in Panama. New hotness: ...

Old and busted: shell companies in Panama. New hotness: shell companies on Mauritius. With the seal of approval of Deutsche Bank! The branch office has existed since 1996, and currently no fewer than 200 employees work there. Oh really? 200 employees on Mauritius? It will certainly be fun to ask what they officially do there!
Deutsche Bank needs them there to offer its services to large pension funds, insurers and private-equity firms that invest in Asia and Africa. They deal with presumably unexciting matters such as account management and custody services; in addition, internal processes and administrative tasks are handled there for the bank, according to Frankfurt. Oh, I see, THAT's what you do there!1!! Yeah, sure.

### Planet Theory

#### TR16-071 | Unprovability of circuit upper bounds in Cook's theory PV | Jan Krajicek, Igor Carboni Oliveira

We establish unconditionally that for every integer $k \geq 1$ there is a language $L \in P$ such that it is consistent with Cook's theory PV that $L \notin SIZE(n^k)$. Our argument is non-constructive and does not provide an explicit description of this language.

### UnixOverflow

#### Restart/reload IPFW remotely via ssh without losing connection

Is it possible to restart IPFW or reload its script remotely via an ssh connection without losing the current connection?

### Lobsters

#### Neural Networks to Upscale & Stylize Pixel Art

A small 550-line program is used to upscale Minecraft textures.

### StackOverflow

#### Image classification using Convolutional neural network

I'm trying to classify hotel image data using a convolutional neural network. Below are some highlights:

1. Image preprocessing:
• converting to gray-scale
• resizing all images to the same resolution
• normalizing image data
• finding PCA components

2. Convolutional neural network:
• input: 32*32
• convolution: 16 filters, 3*3 filter size
• pooling: 2*2 filter size
• dropout: dropping with 0.5 probability
• fully connected: 256 units
• dropout: dropping with 0.5 probability
• output: 8 classes

3. Libraries used:
• Lasagne
• nolearn

But I'm getting low accuracy on the test data, only around 28%. Any possible reasons for such low accuracy? Any suggested improvements? Thanks in advance.
### CompsciOverflow

#### Rigorous Books on Algorithms

I thoroughly enjoyed my algorithms class, but I felt that it lacked rigor. Most of the time I was able to intuitively understand why the algorithms presented worked and why they had the time complexity that was presented, but I'd like to be able to prove such things. As such, I'd like a book that goes over lots of common algorithms and focuses on proving their correctness and time complexity. Any good recommendations?

### Lobsters

#### Emacs Autotetris Mode

### TheoryOverflow

#### What do you do when you cannot make progress on the problem you have been working on?

I am a 2nd-year graduate student in theory. I have been working on a problem for the last year (in graph theory/algorithms). Until yesterday I thought I was doing well (I was extending a theorem from a paper). Today I realized that I had made a simple mistake, and that it will be much harder than I thought to do what I intended to do. I feel so disappointed that I am thinking about leaving grad school.

Is it a common situation that a researcher notices that her idea is not going to work after a considerable amount of work? What do you do when you realize that an approach you had in mind is not going to work and the problem seems too difficult to solve? What advice would you give to a student in my situation?

### CompsciOverflow

#### Modeling asset value by automata

I want to model asset values and their relationship. I model one asset's value like this:

state A: when the asset value decreases by one unit
state B: when the asset value increases by one unit

My problem is to show the relationship between two assets' values. Imagine we have two assets, each with an asset-value model like the image above. When asset S decreases in value, another asset related to it, like Q, decreases in value simultaneously. How can I show this simultaneous transition?
### TheoryOverflow

#### How to generate Extended Finite State Machines randomly with some properties?

This is related to my academic project. An extended finite state machine is a tuple $SM=(I,S,T)$ (simplified):

• $I$ is the set of identifiers, divided into two sets, inputs and outputs; for simplification we will just consider boolean variables. Let $\sigma$ be the evaluation function $\sigma : I \to \{0,1\}$ which associates to every identifier its value, and let $\Sigma$ be the set of all evaluations.
• $S$ is the set of states.
• $T\subseteq S\times S \times G\times A$ is the set of transitions. A transition $t=(s_1,s_2,g,a)$ consists of:
  • $s_1$, the source state;
  • $s_2$, the target state;
  • $g$, the guard, usually expressed in the guard language; here we can just consider its semantics $g:\Sigma \to \{0,1\}$, where $g(\sigma)$ is an element of $\{0,1\}$;
  • $a$, usually expressed in the action language, which we will here identify with its semantics $a:\Sigma \to \Sigma$, where $a(\sigma)$ is another evaluation function.

Now a configuration of the state machine $SM$ is a tuple $(s_i,\sigma)$ where $s_i$ is the current state and $\sigma$ the current evaluation function. A run of the machine is
$$(\sigma_0,s_0)\to(\sigma_1,s_{1})\to (\sigma_2,s_2)\to \cdots (\sigma_k,s_k)\to(\sigma_{k+1},s_{k+1})\to \cdots$$
where for every $k$ we have that $(s_k,s_{k+1},g,a)$ is a transition, $g(\sigma_k)=1$ (meaning true), and $a(\sigma_k)=\sigma_{k+1}$.

Example of an action: we denote an action by $Rep(a)=[e_1,\cdots,e_n]$, where $e_i$ is a boolean expression over the variables in $I=[o_1,\cdots,o_n,i_1,\cdots ,i_k]$ and the $i_j$ are the inputs of the system, determined by the environment. From this representation of the action we have $a(\sigma)(x)=e_j(\sigma)$ if $x=o_j$, and $a(\sigma)(x)=i_j$ if $x=i_j$. The guards are represented by expressions over the set of variables $I$.

What I am able to do: I can generate boolean expressions that are valid or satisfiable, using either well-known examples or the threshold density of boolean formulas. Hence I am able to generate the guards and the actions (which are random and have properties such as being satisfiable or valid).

Question: how could I generate a random extended state machine with the property that every state is reachable and every transition can be executed? You can use the fact that I am able to generate a formula that evaluates to $1$ or $0$, and that I can generate a satisfiable or unsatisfiable formula. Any other ideas for tests with extended state machines? Thanks

Edit 29/04/2016: As my question is either not very clear or difficult, I have an alternative question:

Question 2: what are some well-known random extended state machines for which every state is reachable? The purpose of my project is to test the efficiency of an algorithm.

### QuantOverflow

#### Overpricing Bermudan swaption using Shifted LMM

I am trying to model a callable range accrual note linked to the EUR CMS spread, 20Y-10Y, with cap and floor. The note is Bermudan, callable starting in year 3, every 3 years until maturity at 30 years. We plan on using a shifted LMM for the EUR rate. We plan to calibrate libor correlations to CMS 20-10 spread options of 1Y maturity, because those are the liquid ones, and vols to vanillas. A colleague told me that we will still overprice the trade, and I don't understand why. I understand that Bermudans will trade at a discount to Europeans, but I don't understand why the modeling will generally overprice it. Any help or links to papers or books would be greatly appreciated; links to books and papers that explain how to remedy the issue would also be great. Thanks!
### CompsciOverflow

#### Halting problem reducing to the blank tape halting problem

I was going through my book of proofs and I find its definition very confusing, so I would like someone to help me understand this.

• The blank tape problem takes a machine and an empty tape and tells whether the machine halts or not.
• We prove it is unsolvable by reducing the halting problem to it.

Whenever I read online, I read that we write the input on the tape and run this on the halting problem.

• How do we construct the reduction?
• Why do we write the input on the tape?
• Isn't this all about having an empty tape?

Update: I think my misunderstanding is in the definition of input and tape.

### QuantOverflow

#### Short selling with collateral accounting

I don't know how the accounting works for short selling with collateral. For example, suppose a stock is \$10 a share and turns out to be \$15 a share a week later. At time 0, you borrow and sell 10 shares and get total proceeds of \$100.

If the collateral requirement is 50%, you have to keep \$50 in the bank, and any potential losses are deducted from there first. A week later, your position is worth 10 * \$15 = \$150. If you close, you have a net loss of \$100 - \$150 = -\$50.

This basically wipes out your collateral account entirely: \$50 - \$50 = 0.

So you still take home the \$50 which you got from the original sale but weren't required to put into the collateral account. Doesn't this show you still take home a positive amount of money? Whereas if there were no collateral accounting, you very clearly sold for \$100 and bought for \$150, so your loss is very clear. For shorting with collateral it appears you still have \$50... very confused.
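Walking through the arithmetic of the example above may help locate the confusion: the collateral only determines where the loss is debited first; it does not change the economic result, because after buying back at \$150 you have spent \$50 more than the \$100 you originally received. A sketch with the question's numbers:

```python
shares = 10
sale_price = 10.0            # price when the short is opened
buyback_price = 15.0         # price a week later

proceeds = shares * sale_price           # 100.0 received at time 0
collateral = 0.5 * proceeds              # 50.0 locked up as margin
free_cash = proceeds - collateral        # 50.0 not locked up

cost_to_close = shares * buyback_price   # 150.0 paid to buy the shares back
pnl = proceeds - cost_to_close           # -50.0, the true economic result

# The buy-back consumes the free cash, the collateral, and $50 more besides,
# so nothing is "taken home": ending cash is proceeds minus the buy-back cost.
ending_cash = free_cash + collateral - cost_to_close
```

So the short seller ends \$50 poorer either way; the collateral account being wiped out is simply the first \$50 of that loss being debited.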

#### How can I compute zero coupon bond prices from dirty/clean prices of coupon bonds?

I am having problems with computing zero-coupon bond prices. The question is the following:

Today is $t$=14.4.2016 and I know dirty and clean prices of coupon bonds expiring at maturities: 4.7.2016, 4.7.2017, 4.7.2018,4.7.2019, 4.7.2020,4.7.2021. Coupons are paid annually on the date of maturity.

How can we determine the term structure of zero-coupon bond prices?

My idea is simply to apply the formula:

$P_{dirty}(t) = \sum_{i=1}^{n} c_i P(t,T_i)$

Starting from the bond A expiring on the 4th July 2016, we should have

$P_{dirty}^A (t)=c^A P(t,4.7.2016)$

from which we can compute the first discount factor. Then, considering the other bonds, recursively, we should be able to compute them all.

Finally, employing these results and the method of least squares (assuming a parametric form of the term structure) we should be able to estimate the term structure.

The issue is that the discount factors $P(t,T_i)$ turn out to be bigger than $1$, which is impossible. Can you help me?

Is the formula above correct?

I can provide you with all the numerical values, if you need them.

Thank you very much for any help you can provide!!
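As a sanity check on the recursion itself (the numbers below are made up, since the actual quotes aren't reproduced here): build dirty prices from known discount factors and verify that the bootstrap recovers them. If real data yield discount factors above 1, the usual culprits are using clean instead of dirty prices, or misaligned coupon/accrual dates. A sketch assuming all bonds share the same annual coupon dates and a face value of 100:

```python
def bootstrap(dirty_prices, coupons, face=100.0):
    # dirty_prices[k]: dirty price of the bond maturing at date T_{k+1},
    # paying coupons[k] at T_1, ..., T_{k+1} plus the face at T_{k+1}.
    discounts = []
    for price, c in zip(dirty_prices, coupons):
        pv_earlier_coupons = c * sum(discounts)      # coupons at already-solved dates
        d = (price - pv_earlier_coupons) / (face + c)
        discounts.append(d)
    return discounts

# synthetic check: prices built from known discount factors
true_d = [0.99, 0.97, 0.94]
c = 5.0
prices = [
    (100 + c) * true_d[0],
    c * true_d[0] + (100 + c) * true_d[1],
    c * true_d[0] + c * true_d[1] + (100 + c) * true_d[2],
]
recovered = bootstrap(prices, [c, c, c])
```

With consistent dirty prices this recursion reproduces the input discount factors exactly, so factors above 1 on real data point to an input problem rather than to the formula.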

#### The relation between exchange rate SDE and respective interest rates

The exchange rate between a domestic currency money market and a foreign currency money market can be expressed as $$dQ(t) = (r_d - r_f)Q(t)dt + \sigma Q(t)d\tilde{W}(t)$$ where $r_d$ is the interest rate for the domestic market, and $r_f$ for foreign.

In my head, I believe that the exchange rate should decrease when the domestic interest rate goes up, indicating the domestic currency is strengthening. For example if the Fed were to increase rates, then $EUR/USD$ should decrease, given that the ECB doesn't do much. So, if $EUR/USD$ was 1.14 yesterday, it should be below 1.14 today.

To my understanding, the SDE for $Q(t)$ doesn't seem to reflect this fact. It seems that $Q(t)$ would increase if $r_d$ were to go up. I would like to resolve this contradiction, so any help would be appreciated.
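One way to see what the SDE actually asserts: under the domestic risk-neutral measure, $E[Q(T)] = Q(0)e^{(r_d-r_f)T}$, which is exactly the covered-interest-parity forward rate, not a forecast that the domestic currency will weaken. The intuition that a Fed hike strengthens the dollar describes how the spot rate jumps when expectations change, while the drift describes the expected path given rates that are already known. A small Euler sketch with $\sigma=0$ to isolate the drift (parameter values are made up):

```python
import math

def euler_Q(q0, rd, rf, T, steps):
    # Deterministic Euler scheme for dQ = (rd - rf) Q dt (sigma = 0),
    # which approximates Q(T) = q0 * exp((rd - rf) * T).
    dt = T / steps
    q = q0
    for _ in range(steps):
        q += (rd - rf) * q * dt
    return q

q_up = euler_Q(1.14, 0.02, 0.00, 1.0, 1000)   # rd > rf: forward above spot
q_dn = euler_Q(1.14, 0.00, 0.02, 1.0, 1000)   # rd < rf: forward below spot
```

With $r_d > r_f$ the expected (forward) rate sits above spot, matching the parity formula; the apparent contradiction dissolves once the drift is read as a forward, not a prediction of where spot will jump on a rate announcement.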

### Fred Wilson

#### Trends

I like to look at Google Trends from time to time to see what it can tell me about things. I realize that search keyword activity is only one data point in a complex system and that with the move to mobile, it is less important than it was in the web only era. And people search for things when they want them. Once they have them, the search volume goes down. But I still think Google Trends can reveal some interesting things.

Here are some queries I ran today:

Facebook and Google are battling it out for video supremacy, but this query really doesn’t tell us very much about where that battle is going and how it will end. It is interesting to note that YouTube has been a mature but stable business for a long time now.

Twitter and the smartphone seem to have risen with a similar curve and are now in decline, with Twitter falling a bit faster than smartphones.

We see a similar shaped curve with Facebook, but the order of magnitude is quite different which is why I did not combine it with the previous chart.

December 2013 sure seems like the high water mark for the mobile social sector.

But not all boats go out with the receding tide.

Here are Snapchat and Instagram, with Twitter thrown in for scale comparison.

It will be interesting to see when Instagram and Snapchat start flattening off. My gut tells me Instagram may already be there but we just don’t see it in the data yet.

Moving on from the past to the future, here are some of the sectors that entrepreneurs and VCs are betting on as the next big thing:

If you take out the VR term and look at the other three, you see something that looks like the NCAA football rankings over the course of a season. Each team/term has had a moment at the top but it remains unclear who is going to prevail.

If we look at one of the most interesting coming battles in tech, the voice interface race, the data is less clear.

I think we haven’t really gotten going on this one. But it is an important one as Chris Dixon explained in a really good blog post last week.

My semi-regular Google Trends session today confirms what I've known for a while and have written here before. We are largely moving on from mobile and social in terms of big megatrends, video is being played out now, and it's not yet clear what is going to emerge as the next big thing. Google is betting on AI and I tend to agree with them on that. Voice interfaces may be a good proxy for that trend.

### StackOverflow

#### How many labels are acceptable before using regression over classification

I have a problem where I'm trying to use supervised learning in Python. I have a series of x,y coordinates which I know belong to a label in one data set. In the other I have only the x,y coordinates. I am going to use one set to train the other; my approach is supervised learning with a classification algorithm (linear discriminant analysis), since the labels are discrete. Although they are discrete, they are large in number (n=~80,000). My question: at what number of labels should I consider regression over classification, given that regression is better suited to continuous labels? I'm using scikit-learn as my machine learning package and astroML.org's excellent tutorial as a guide.

### DragonFly BSD Digest

#### Lazy Reading for 2016/05/01

Cinco De Mayo is coming up.

Your unrelated link of the week: What was the weirdest 911 call ever received?  (via)

### StackOverflow

#### Python keras how to transform a dense layer into a convolutional layer

I have a problem finding the correct mapping of the weights in order to transform a dense layer into a convolutional layer.

This is an excerpt of a ConvNet that I'm working on:

model.add(Convolution2D(512, 3, 3, activation='relu'))
model.add(MaxPooling2D((2, 2), strides=(2, 2)))
model.add(Flatten())
model.add(Dense(4096, activation='relu'))


After the MaxPooling, the input is of shape (512,7,7). I would like to transform the dense layer into a convolutional layer to make it look like this:

model.add(Convolution2D(512, 3, 3, activation='relu'))
model.add(Convolution2D(4096, 7, 7, activation='relu'))


However, I don't know how to reshape the weights in order to correctly map the flattened weights to the (4096,512,7,7) structure needed for the convolutional layer. Right now, the weights of the dense layer are of dimension (25088,4096). I need to somehow map these 25088 elements to a dimension of (512,7,7) while preserving the correct mapping of the weights to the neurons. So far, I have tried multiple ways of reshaping and then transposing, but I haven't been able to find the correct mapping.

An example of what I have been trying would be this:

weights[0] = np.transpose(np.reshape(weights[0],(512,7,7,4096)),(3,0,1,2))


but it doesn't map the weights correctly. I verified whether the mapping is correct by comparing the output for both models. If done correctly, I expect the output should be the same.
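A pure-NumPy sanity check (my own sketch, independent of Keras) of the mapping being attempted: a dense layer applied to a row-major flattened (channels, rows, cols) input is equivalent to a full-size "valid" cross-correlation whose kernels are the dense weights reshaped to (C, H, W, F) and transposed to (F, C, H, W). The shapes below are small stand-ins for (512, 7, 7) and 4096.

```python
import numpy as np

C, H, W, F = 4, 3, 3, 5  # small stand-ins for (512, 7, 7) and 4096
rng = np.random.RandomState(0)
x = rng.randn(C, H, W)
dense_w = rng.randn(C * H * W, F)  # dense weights, (25088, 4096) in the question

# Dense output on the row-major flattened input (channel, row, col order)
dense_out = x.reshape(-1) @ dense_w

# Reshape dense weights into conv kernels of shape (F, C, H, W)
conv_w = dense_w.reshape(C, H, W, F).transpose(3, 0, 1, 2)

# A kernel-sized "valid" cross-correlation collapses each kernel over the input
conv_out = np.einsum('chw,fchw->f', x, conv_w)

assert np.allclose(dense_out, conv_out)
```

Note that if the backend performs a true convolution (Theano flips kernels spatially), the reshaped kernels additionally need a spatial flip, e.g. `conv_w[:, :, ::-1, ::-1]`.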

### QuantOverflow

#### Consequence of negative mean reversion of hull white one factor model

I tried to calibrate the data for hull-white one-factor model. Sometimes, I get negative estimate of mean reversion factor after the calibration process. When I plug the negative mean reversion factor into the hull-white one factor model, the interest rate tree cannot be generated.

I wonder about the theoretical consequences for the Hull-White one-factor model. Can anyone explain the meaning of negative mean reversion in the Hull-White one-factor model? If the mean reversion factor is negative, can the model be implemented properly?
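As an illustrative sketch (parameters invented, and a constant mean level standing in for the time-dependent one), a simple Euler simulation of dr = a(b - r)dt + σ dW shows what goes wrong: with a < 0 the drift pushes r away from b, so deviations are amplified instead of pulled back.

```python
import numpy as np

# Simplified Hull-White-style short rate: dr = a*(b - r)*dt + sigma*dW,
# with constant b standing in for theta(t)/a. Negative a reverses the pull.
def simulate(a, b=0.03, r0=0.03, sigma=0.01, dt=1/252, n=2520, seed=0):
    rng = np.random.RandomState(seed)
    r = np.empty(n + 1)
    r[0] = r0
    for t in range(n):
        r[t + 1] = r[t] + a * (b - r[t]) * dt + sigma * np.sqrt(dt) * rng.randn()
    return r

stable = simulate(a=0.5)      # positive mean reversion: stays near b
explosive = simulate(a=-0.5)  # negative "mean reversion": diverges
```

This is one way to see why a tree built around a mean level fails when a < 0: the process is no longer mean-reverting at all, so the recombining construction breaks down.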

Thanks.

### CompsciOverflow

#### What's the time complexity of Monte Carlo Tree Search?

I'm trying to find the time complexity of Monte Carlo Tree Search (MCTS). Googling doesn't help, so I'm trying to see how far I get calculating it myself.

It does four steps for n iterations, or before the time runs out. So we'll have

O(n*(selection+expansion+simulation+backpropagation))

Expansion just adds a child to the currently selected node. Assuming you're not using a singly linked list or something like that to store tree children, this can happen in constant time, so we can exclude it:

O(n*(selection+simulation+backpropagation))

Given the branching factor b, and d as the depth of our tree, I'm assuming the selection phase runs in O(b*d), because at each level, the selection phase goes to all the children of the previous node.

So our time complexity becomes

O(n*(b*d+simulation+backpropagation))

Backpropagation takes time proportional to the depth of the tree as well, so that becomes:

O(n*(b*d+simulation+d))

But now I'm not sure how to add the simulation phase to this. It'll be proportional to the depth of the tree, but for each iteration, the running time depends heavily on the implementation as far as I know.

### What is Monte Carlo Tree Search?

Monte Carlo Tree Search (MCTS) is a tree search algorithm that tries to find the best path down a decision tree, mostly used for game playing. In games with a high branching factor, it can often go deeper than algorithms like Minimax, even with Alpha-Beta pruning, because it only looks into nodes that look promising. For a certain number of iterations (or a certain amount of time), the algorithm goes through these four phases:

Selection: Starting at the root node, go down the tree until you find a node that is not fully expanded, or a leaf node. At each level, select the child that looks most promising. There are many ways of making this selection, but most implementations (including mine) use UCB1.

Expansion: Expand the selected node by looking at one of its unexplored children.

Simulation: Simulate a "rollout" from the newly discovered child. Again, there are many ways to do this, but many implementations just take a random child until an end node is reached. Take note of whether the simulation resulted in a success or a failure (a win or loss in the case of a game).

Backpropagation: Update the information about this node, and all nodes along the path back to the root. Most implementations store two values at each node: the number of times it was part of a selection path that led to a success, and the total number of times it was part of a selection path. These values are used during the selection phase.

After you've run this for as long as you can, the child node of the root that looks most promising is the action to take next.
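The four phases above can be sketched end-to-end. The toy implementation below (my own, using UCB1 and a made-up take-1-or-2 Nim game where taking the last stone wins) is illustrative rather than canonical.

```python
import math
import random

class Node:
    def __init__(self, stones, parent=None, move=None):
        self.stones = stones          # stones left; opponent of the player
        self.parent = parent          # who moved here is now to move
        self.move = move
        self.children = []
        self.wins = 0                 # playout wins for the player who moved here
        self.visits = 0
        self.untried = [m for m in (1, 2) if m <= stones]

def ucb1(node, c=1.4):
    return (node.wins / node.visits
            + c * math.sqrt(math.log(node.parent.visits) / node.visits))

def rollout(stones):
    # Random playout: returns 1 if the player to move from `stones` wins.
    turn = 0
    while stones > 0:
        stones -= random.choice([m for m in (1, 2) if m <= stones])
        if stones == 0:
            return 1 if turn == 0 else 0
        turn ^= 1
    return 0  # nothing left to take: the player to move already lost

def backprop(node, result):
    # `result` is from the viewpoint of the player to move at `node`.
    while node is not None:
        node.visits += 1
        node.wins += 1 - result       # the mover's perspective flips each level
        result = 1 - result
        node = node.parent

def mcts(stones, iters=2000):
    root = Node(stones)
    for _ in range(iters):
        node = root
        # Selection: descend through fully expanded nodes via UCB1
        while not node.untried and node.children:
            node = max(node.children, key=ucb1)
        # Expansion: try one unexplored move
        if node.untried:
            m = node.untried.pop()
            child = Node(node.stones - m, node, m)
            node.children.append(child)
            node = child
        # Simulation + Backpropagation
        backprop(node, rollout(node.stones))
    return max(root.children, key=lambda ch: ch.visits).move

random.seed(0)
print(mcts(4))  # from 4 stones, taking 1 leaves the opponent in a lost position
```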

#### Are there any implementations of the spineless tagless G-machine other than in GHC?

From Simon Peyton Jones (recently elected a Fellow of the Royal Society) we have the paper: Implementing lazy functional languages on stock hardware: the Spineless Tagless G-Machine.

Now this paper is part of how they implemented Haskell as a lazy language, and how they solved some problems they had at the time.

The only comparable paper seems to be: Compiling Lazy Functional Programs Based on the Spineless Tagless G-machine for the Java Virtual Machine but there doesn't appear to be an implementation available.

One tangentially related is: Compiling Haskell to Java. However in this approach they leave the implementation of the Spineless Tagless G-Machine in GHC and just read the output.

My question is: Are there any implementations of the spineless tagless G-machine other than in GHC?

### StackOverflow

#### Machine Learning encoding the name of people

I am going to predict the box office of a movie using sklearn. The question is how to encode the names of actors, screenwriters, and directors to fit the model?
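One common approach (a sketch, not the only answer) is to treat each name as a categorical feature: scikit-learn's `DictVectorizer` one-hot encodes string values, and `FeatureHasher` does the same into a fixed-width vector when the vocabulary of names gets huge. The movie data below is invented.

```python
from sklearn.feature_extraction import DictVectorizer

movies = [
    {"actor": "Tom Hanks", "director": "Steven Spielberg"},
    {"actor": "Tom Hanks", "director": "Robert Zemeckis"},
]

# One binary column per distinct "field=name" pair seen in the data
vec = DictVectorizer(sparse=False)
X = vec.fit_transform(movies)
print(vec.feature_names_)
print(X)
```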

### Planet Emacsen

#### Grant Rettke: Emacs Keyboard Design 31 and 32: Give Emacs More Logical Modifiers

• Impossible to design and fabricate a custom keyboard at a reasonable price in time and money
• Use a XKE-128 instead

• Rubber dome switches and caps
• Disassembled a Dell keyboard
• Found it had rubber dome switches (obviously, spongy)
• Good to see and know
• Been using them for years, and they were fine
• Mechanical switch probably isn’t required by me
• N-Key rollover
• You could quickly hit
• Control, Meta, Super, Hyper, Shift, j
• If you designed the keyboard out to make it easy
• 6 NKRO is probably fine
• You must choose a keycap style
• DSA makes it easy to try different layouts so use that
• Cost is a big topic
• Maybe a grab bag is a good option?
• If you want a lot of rows and columns then you need a microcontroller with a lot of connections like the Teensy++
• You must choose available key sizes
• The design tools let you make a keycap any size which helps exploration
• At build time you need fabricated keycaps in that size
• Easier to use pre-made caps
• 3d printing caps is another option, but I don’t want to do that
• Reality is that doing a custom build
• Will require 3x iterations
• Will cost 3x as much
• Reviewed the Ergodox EZ and it’s not for me
• Thoughtful ideas about OS-Hyper key
• Might be best to use the XKE-128 instead
• Zero fabrication costs
• Well-built body
• Rubber-dome is OK
• Cherry MX compatible stems
• No way I could build it for less
• Hobby-ish
• Converting to XKE-128 follows
• Make power keys 2 wide because they are available from PI or SP
• Left align QAZ
• Make PgUp PgDn 1w
• Moved arrows to bottom right
• Added CapsLock back under right super
• Added Ultra* so had to move PgUp to each side of Enter row by shift
• Added a space down the middle to occupy 8×16
• Decided that it would be nice to have a space and return that went from C to M so expanded that
• Move Alt and Gui up to middle because
• They are important modifiers
• Alt-Tab is always two-handed, that is OK
• Their importance doesn’t overlap with Emacs modifiers so you use them in a cognitively different place
• PgUp PgDn go all the way left
• Didn’t add back ScrollLk and Break, can add later if needed
• Swap Super and Shift
• Muscle memory makes Shift happier as expected location
• Makes super-shift easy negating opportunity for Super*
• Every Emacs modifier with * appended includes shift
• Wherever it isn’t easy to do by hand, and free keys
• Add Hyper* to left of hyper making it one key
• This placement of hyper makes sense if you recall the feel of the layout of a typical laptop keyboard after you made CapsLock super. Using your thumb to go to C, M, super with your pinky, and H with your thumb again are natural
• C-s and M-s are natural
• H-s is even natural and H*-s is doable
• Ultra shift is easy now, so U* can go away
• Added Xtrm key for Emacs
• C-M-s
• Ultra below it
• You can go “all out” with Emacs modifiers if you like
• H* still makes sense

### QuantOverflow

#### How to fit model implied forward curve with market forward curve for Ornstein-Uhlenbeck?

I have a spread option model of 2 correlated Ornstein-Uhlenbeck commodity prices that I estimate the parameters of with Maximum Likelihood. What is the formula for introducing the additional requirement that the (spot market) model implied forward curves are equal to the observed forward curves?
Thank you!

### StackOverflow

#### How is MSE defined for Image comparison?

I am building a convolutional autoencoder that uses MSE as its error function. How is MSE defined for images? If the image is represented in simple matrix form, is MSE simply the square of the difference of the individual determinants? Or is it the square of the determinant of the difference of the matrices?
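As commonly defined for images, MSE is neither of those: it is the mean of the squared element-wise differences of the two pixel matrices; determinants are not involved. A minimal NumPy sketch:

```python
import numpy as np

# MSE between two images: mean of squared element-wise differences.
def mse(a, b):
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return np.mean((a - b) ** 2)

x = np.array([[1, 2], [3, 4]])
y = np.array([[1, 2], [3, 6]])
print(mse(x, y))  # (0 + 0 + 0 + 4) / 4 = 1.0
```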

#### Maximum Depth for a Random Tree

I'm trying to find the best classifier for a data set in Weka, and I'm studying different maximum depths for the Random Tree algorithm. But I don't understand the results I get: with a maximumDepth between 3 and 10 I get a far better accuracy rate than with a maximumDepth > 10. Can anyone help me figure out why? Shouldn't deeper trees give better accuracy?

#### What does the function returned by a partially applied Haskell function look like?

For the past two hours I have been reading about currying in Haskell, and all the resources present how functions with multiple parameters actually return other functions, but not what their definitions look like, so this is what the question is about.

Let us define the function:

myFunc :: (Num a) => a -> a -> a
myFunc x y = x * 2 + x * y


:t (myFunc 2) prints Num a => a -> a, i.e. a function that takes a number and also outputs a number. However, what does the definition of the function returned by (myFunc 2) look like? Does the compiler substitute x in the definition and the new function becomes something like myFunc' y = 2 * 2 + 2 * y?
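A rough intuition, sketched here in Python because closures make the mechanics visible: partial application does not substitute x into a new source-level definition; it yields a function value (a closure) that remembers x. GHC's (myFunc 2) is conceptually the same kind of value, not a rewritten myFunc'.

```python
# Hypothetical Python analogue of myFunc x y = x * 2 + x * y
def my_func(x):
    def partial(y):
        return x * 2 + x * y   # x is captured, not textually substituted
    return partial

g = my_func(2)   # closure capturing x = 2, analogous to (myFunc 2)
print(g(3))      # 2*2 + 2*3 = 10
```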

How does recursion handle currying? If I define the function

replicate' :: (Integral i, Ord i) => i -> a -> [a]
replicate' n x
| n <= 0    = []
| otherwise = x : replicate' (n - 1) x


what is the function returned by (replicate' 3) in the context (replicate' 3) 'a'?

### TheoryOverflow

#### Intermediate problems between L and NL

It is well-known that directed st-connectivity is $NL$-complete. Reingold's breakthrough result showed that undirected st-connectivity is in $L$. Planar directed st-connectivity is known to be in $UL \cap coUL$. Cho and Huynh defined a parametrized knapsack problem and exhibited a hierarchy of problems between $L$ and $NL$.

I am looking for more problems that are intermediate between $L$ and $NL$, i.e., problems that are:

• known to be in $NL$ but not known (or unlikely) to be $NL$-complete and
• known to be $L$-hard but not known to be in $L$.

### CompsciOverflow

#### What is the average-case running time of Fun-sort?

I read this paper: http://www.sciencedirect.com/science/article/pii/S0166218X04001131?np=y (you can check the PDF online for free), and I translated section 4's Fun-sort algorithm (correct me if I'm wrong):

binary-search(A,x):
    A[0] = -∞
    A[n+1] = ∞        // |A| = n
    l = 0
    h = n+1
    m = ⌊(l+h)/2⌋
    while (h != l+1):
        if (x <= A[m]):
            h = m
        else:
            l = m
        m = ⌊(l+h)/2⌋
    return h          // success if A[h] = x

fun-sort(A):
    for i = 1 to n:
        success = false
        while (!success):
            success = true
            h = binary-search(A, A[i])
            if (A[h] != A[i]):
                success = false
                if (i < h-1):
                    swap(i, h-1)
                elif (i > h):
                    swap(i, h)


As you can see, this sorting algorithm uses a binary search on a not necessarily sorted array. I realized that to get the average-case running time for Fun-sort, I needed the average number of iterations of its while loop, which I defined as a random variable X. X would be geometrically distributed, so E[X]=1/p, where p is the probability of success of the binary-search procedure. I am stuck trying to get this probability of success.
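For experimenting with that success probability, here is a direct (unofficial) Python translation of the pseudocode above, with sentinels emulating the 1-based indexing and A[0] = -∞, A[n+1] = +∞:

```python
import math
import random

# Unofficial Python translation of Fun-sort as transcribed above.
def fun_sort(a):
    n = len(a)
    A = [-math.inf] + list(a) + [math.inf]   # A[1..n] holds the data
    for i in range(1, n + 1):
        while True:
            l, h = 0, n + 1                  # binary search for A[i]
            while h != l + 1:
                m = (l + h) // 2
                if A[i] <= A[m]:
                    h = m
                else:
                    l = m
            if A[h] == A[i]:
                break                        # "success": move on to next i
            if i < h - 1:
                A[i], A[h - 1] = A[h - 1], A[i]
            elif i > h:
                A[i], A[h] = A[h], A[i]
    return A[1:-1]

random.seed(1)
data = random.sample(range(1000), 50)        # distinct values
print(fun_sort(data) == sorted(data))
```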

So... any help would be very much appreciated =)

EDIT: I've been considering that binary-search will definitely fail when i < m and A[i]<=A[m]. In fact, every time the elements being compared in binary-search are an inversion, it will fail (thanks to Jorge M.). In other words, binary-search will end up in success if (x,A[m]) is not an inversion in each of the log n iterations. Now: what is the probability of (x,A[m]) being an inversion, in general?

EDIT2: I posted this on stackoverflow before I realized cs.stackexchange was a better option.

#### Is this decidable language a subset of this undecidable language?

I think I understand the theoretical definition of decidable and undecidable languages but I am struggling with their examples.

A(DFA) = {(M, w): M is a deterministic finite automaton that accepts the string w}

A(TM) = {(M, w): M is a Turing machine that accepts the string w}

I know that A(DFA) is decidable and A(TM) is not. But, is A(DFA) a subset of A(TM)?

### CompsciOverflow

#### Reduce set partition search to decision?

I'm a little lost and don't know how to approach this problem.

Show that the partition search problem can be poly-time reduced to the partition decision problem. The partition decision problem takes an input set of numbers and returns true if there is a subset of the initial set that sums to half the total sum of the initial set.

With problems like ham-path search, clique search and SAT search, the key was to build the solution one piece at a time using the results from the decision "oracle". But I need to know how to approach this problem.

Initially, I thought about removing elements from the set while verifying if there is a partition in the remaining set, which led me nowhere. Now I'm wondering if adding elements to the initial set would have any results. I noticed if the initial set has a partition, adding elements to the set would then only have a partition if the added element is even, but I don't see how this can generate a subset of the original set that satisfies partition search. Am I going off the wrong track? Any pointers would be appreciated.
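For comparison with the ham-path/clique/SAT pattern, here is a sketch of one classic self-reduction (an assumed approach, not the only one): pick two items a and b; if the oracle says a partition survives when they are replaced by a+b, they can share a side; otherwise every partition separates them, and replacing them by |a-b| preserves partitionability. Signed index lists remember each original item's side. The brute-force has_partition below merely stands in for the decision oracle.

```python
def has_partition(vals):
    # Brute-force stand-in for the partition decision oracle.
    total = sum(vals)
    if total % 2:
        return False
    reach = {0}
    for v in vals:
        reach |= {r + v for r in reach}
    return total // 2 in reach

def find_partition(nums):
    if not has_partition(nums):
        return None
    items = [(v, [(i, +1)]) for i, v in enumerate(nums)]
    while len(items) > 1:
        (a, la), (b, lb) = items.pop(), items.pop()
        rest = [v for v, _ in items]
        if has_partition(rest + [a + b]):
            items.append((a + b, la + lb))          # a, b on the same side
        else:
            # every partition separates a and b; |a-b| preserves one
            (big, lbig), (small, lsmall) = ((a, la), (b, lb)) if a >= b else ((b, lb), (a, la))
            items.append((big - small, lbig + [(i, -s) for i, s in lsmall]))
    _, signs = items[0]                             # remaining value is 0
    return [nums[i] for i, s in signs if s > 0]

side = find_partition([3, 1, 1, 2, 2, 1])
print(side, sum(side))  # one side of a valid partition; sums to 5
```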

#### What Measure of Disorder to use when Analysing Quicksort

I'm trying to understand why quicksort using Lomuto partition and a fixed pivot is performing erratically, but overall poorly, on randomly generated inputs. I'm thinking that even though the inputs are randomly generated, there may be a lot of order to the sequences, but I'm not sure how to measure the level of disorder in the sequences. I thought about using the number of inversions, but I saw from this other question I asked that that's not really a good measure in this case.

The reason I suspect that my random sequences have a lot of "order" to them is that randomizing the pivot fixes the performance problem. But theoretically there shouldn't be any performance problem on these supposedly "random" input sequences.
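If you do want to count inversions despite the caveat, the standard merge-sort-based counter runs in O(n log n); a small sketch:

```python
# Count pairs i < j with a[i] > a[j] via merge sort; a standard
# measure of presortedness.
def count_inversions(a):
    if len(a) <= 1:
        return list(a), 0
    mid = len(a) // 2
    left, linv = count_inversions(a[:mid])
    right, rinv = count_inversions(a[mid:])
    merged, inv = [], linv + rinv
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            inv += len(left) - i   # every remaining left element exceeds right[j]
            merged.append(right[j]); j += 1
    merged += left[i:] + right[j:]
    return merged, inv

print(count_inversions([3, 1, 2])[1])  # pairs (3,1) and (3,2) → 2
```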

### UnixOverflow

#### RAID: ZFS or Btrfs?

I've built my own NAS with Arch Linux on an old HDD. I want to add three 4 TB disks to have real storage capabilities, and I would like to use a RAID 5 setup with these 3 disks.

I've read a lot about ZFS RAID-Z and it's exactly what I want to do. But I've heard about Btrfs, and it seems Btrfs is also able to handle software RAID 5 like ZFS. But I wonder if Btrfs RAID works as well as ZFS. I also couldn't find complete information on how to create and manage the RAID. So my questions are:

• Is Btrfs able to handle a software RAID with the same protection as ZFS (no «write hole» error, self-healing, etc.)?
• Is Btrfs as reliable as ZFS RAID-Z, or is it still an experimental feature?
• If the answer to my first 2 questions is «yes», where can I find full information about how to set up, repair and clean a Btrfs RAID?

Thanks for your help :)

### StackOverflow

#### How to predict more than one class with random forest in python?

How can I predict more than one class with random forest in python?

I am familiar with getting the probabilities, but how can I know which class each probability belongs to?
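The mapping asked about is exposed by scikit-learn itself: the columns of predict_proba are ordered like clf.classes_. A toy sketch (data invented):

```python
from sklearn.ensemble import RandomForestClassifier

# Tiny made-up data set with three classes.
X = [[0, 0], [1, 1], [2, 2], [3, 3]]
y = ["a", "b", "c", "a"]
clf = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)

# Column i of predict_proba corresponds to clf.classes_[i].
proba = clf.predict_proba([[1, 1]])[0]
for label, p in zip(clf.classes_, proba):
    print(label, p)
```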

### Fefe

#### Merkel and her henchmen for the first time question the unconditional ...

Merkel and her henchmen are for the first time calling the unconditional friendship with Israel into question. Well, not with Israel per se, but with the Netanyahu junta. Netanyahu seems to have pushed things a bit too far; Merkel feels instrumentalized.

### CompsciOverflow

#### Imagine a red-black tree. Is there always a sequence of insertions and deletions that creates it?

Let's assume the following definition of a red-black tree:

1. It is a binary search tree.
2. Each node is colored either red or black. The root is black.
3. Two nodes connected by an edge cannot be red at the same time.
4. Here should be a good definition of a NIL leaf, like on wiki. The NIL leaf is colored black.
5. A path from the root to any NIL leaf contains the same number of black nodes.
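Properties 2-5 above can be checked mechanically; a small sketch (my own encoding: a node is a (color, left, right) tuple, None is the black NIL leaf; the BST key ordering of property 1 is omitted):

```python
# Returns the black-height if properties 3-5 hold below `node`, else None.
def black_height(node):
    if node is None:
        return 1                        # NIL leaves count as black
    color, left, right = node
    for child in (left, right):
        if color == 'R' and child is not None and child[0] == 'R':
            return None                 # red node with a red child
    lh, rh = black_height(left), black_height(right)
    if lh is None or rh is None or lh != rh:
        return None                     # violation below, or unequal heights
    return lh + (1 if color == 'B' else 0)

def is_red_black(root):
    return (root is None or root[0] == 'B') and black_height(root) is not None

print(is_red_black(('B', ('R', None, None), ('R', None, None))))  # True
print(is_red_black(('R', None, None)))                            # False: red root
```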

### Question

Suppose that you have implemented the insert and delete operations for the red-black tree. Now, if you are given a valid red-black tree, is there always a sequence of insert and delete operations that constructs it?

### Motivation

This question is motivated by this question and by the discussion from this question.

Personally, I do believe that if you imagine a valid red-black tree consisting only of black nodes (which implies that you are imagining a perfectly balanced tree), there is a sequence of insert and delete operations that constructs it. However,

1. I do not know how to accurately prove that
2. I am also interested in the more general case

#### Is this special case of a scheduling problem solvable in linear time?

Alice, a student, has a lot of homework over the next weeks. Each item of homework takes her exactly one day. Each item also has a deadline, and a negative impact on her grades (assume a real number, bonus points for only assuming comparability), if she misses the deadline.

Write a function that, given a list of (deadline, grade impact) pairs, figures out a schedule for which homework to do on which day that minimizes the total negative impact on her grades.

All homework has to be done eventually, but if she misses a deadline for an item, it doesn't matter how late she turns it in.

In an alternative formulation:

ACME corp wants to supply water to customers. They all live along one uphill street. ACME has several wells distributed along the street. Each well bears enough water for one customer. Customers bid different amounts of money to be supplied. The water only flows downhill. Maximize the revenue by choosing which customers to supply.

We can sort the deadlines using bucket sort (or just assume we have already sorted by deadline).

We can solve the problem easily with a greedy algorithm, if we sort by descending grade impact first. That solution will be no better than O(n log n).
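The greedy mentioned above can be sketched as follows (my own illustration; the linear slot scan keeps it simple, while a disjoint-set "latest free day" structure removes that factor, leaving the sort as the bottleneck):

```python
# Greedy: process items by descending impact and give each the latest
# still-free day at or before its deadline (1-based days, one item per day).
# Items that fit nowhere are done late and incur their impact.
def min_missed_impact(items):
    """items: list of (deadline, impact); returns total impact of missed items."""
    max_day = max(d for d, _ in items)
    free = [True] * (max_day + 1)          # free[1..max_day]; index 0 unused
    missed = 0
    for deadline, impact in sorted(items, key=lambda t: -t[1]):
        day = deadline
        while day > 0 and not free[day]:
            day -= 1                       # latest free slot <= deadline
        if day > 0:
            free[day] = False              # scheduled on time
        else:
            missed += impact               # every earlier day taken: late
    return missed

# Two items due day 1 and one due day 2: only the impact-1 item must be late.
print(min_missed_impact([(1, 3), (1, 1), (2, 2)]))  # 1
```

Note the greedy only ever compares impacts (in the sort), which fits the comparison-only model mentioned below.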

Inspired by the Median of Medians and randomized linear minimum spanning tree algorithms, I suspect that we can solve my simple scheduling / flow problem in (randomized?) linear time as well.

I am looking for:

• a (potentially randomized) linear time algorithm
• or alternatively an argument that linear time is not possible

As a stepping stone:

• I have already proven that just knowing which items can be done before their deadline is enough to reconstruct the complete schedule in linear time. (That insight underlies the second formulation, where I am only asking about the certificate.)
• A simple (integral!) linear program can model this problem.
• Using duality of this program, one can check a candidate proposed solution in linear time for optimality, if one is also given the solution to the dual program. (Both solutions can be represented in a linear number of bits.)

Ideally, I want to solve this problem in a model that only uses comparison between grade impacts, and does not assume numbers there.

I have two approaches to this problem---one based on treaps using deadline and impact, the other QuickSelect-like based on choosing random pivot elements and partitioning the items by impact. Both have worst cases that force O(n log n) or worse performance, but I haven't been able to construct a simple special case that degrades the performance of both.

### StackOverflow

#### Implementing Mean Squared Error for matrices in TensorFlow

I am making a convolutional autoencoder for images, and want to use MSE as the loss. Does an inbuilt implementation of MSE exist for matrices?

### CompsciOverflow

#### Deletion in B+ Tree

The B+ tree deletion algorithm given in CLRS is as follows:

1. If the key $k$ is in node $x$ and $x$ is a leaf, delete the key $k$ from $x$.
2. If the key $k$ is in node $x$ and $x$ is an internal node, do the following:

a. If the child $y$ that precedes $k$ in node $x$ has at least $t$ keys, then find the predecessor $k'$ of $k$ in the subtree rooted at $y$. Recursively delete $k'$, and replace $k$ by $k'$ in $x$. (We can find $k'$ and delete it in a single downward pass.)

b. If $y$ has fewer than $t$ keys, then, symmetrically, examine the child $z$ that follows $k$ in node $x$. If $z$ has at least $t$ keys, then find the successor $k'$ of $k$ in the subtree rooted at $z$. Recursively delete $k'$, and replace $k$ by $k'$ in $x$. (We can find $k'$ and delete it in a single downward pass.)

c. Otherwise, if both $y$ and $z$ have only $t-1$ keys, merge $k$ and all of $z$ into $y$, so that $x$ loses both $k$ and the pointer to $z$, and $y$ now contains $2t-1$ keys. Then free $z$ and recursively delete $k$ from $y$.

3. If the key $k$ is not present in internal node $x$, determine the root $x.c_i$ of the appropriate subtree that must contain $k$, if $k$ is in the tree at all. If $x.c_i$ has only $t-1$ keys, execute step 3a or 3b as necessary to guarantee that we descend to a node containing at least $t$ keys. Then finish by recursing on the appropriate child of $x$.

a. If $x.c_i$ has only $t-1$ keys but has an immediate sibling with at least $t$ keys, give $x.c_i$ an extra key by moving a key from $x$ down into $x.c_i$, moving a key from $x.c_i$'s immediate left or right sibling up into $x$, and moving the appropriate child pointer from the sibling into $x.c_i$.

b. If $x.c_i$ and both of $x.c_i$'s immediate siblings have $t-1$ keys, merge $x.c_i$ with one sibling, which involves moving a key from $x$ down into the new merged node to become the median key for that node.

I have a doubt about step 2.a (and 2.b). It says "recursively delete $k'$", where $k'$ is the predecessor of $k$. As I understand it, $k'$ will always be in a leaf node, in which case we apply step 1 and simply delete it from the leaf. What does the author mean by the word "recursively", since there will only ever be one call to delete the predecessor (or the successor, if you consider step 2.b)?

The example given in CLRS also shows the predecessor in a leaf, as can be seen below in the deletion of $M$:

### QuantOverflow

#### Trading strategy problem - initial capital x buys S over time [0,T] at the constant rate of x/T euros per unit of time

I am looking for clarification to the trading strategy problem where the number of stocks is depending on time.

In a market with zero safe rate and stock dynamics defined as $$\frac{dS_t}{S_t}=\mu_t dt + \sigma_t dW_t \quad \quad (1)$$ an investor with initial capital $x$ buys stock over the time interval $[0,T]$ at the constant rate of $x/T$ euros per unit of time.

I am calculating the number of shares at time t, first by defining the change rate

$$d \theta_t=\frac{x}{T} \frac{1}{S_t} dt \quad \quad (2)$$

and then getting the function for $\theta$ by integrating (2)

$$\theta_t=\frac{x}{T} \int_0^t \frac{1}{S_s} ds \quad \quad (3)$$

is this approach correct?

Further I want to show that the payoff $V_T$ at T equals $\frac{x}{T} \int_0^T R_{t,T} dt$ where $R_{t,T}$ is the simple return rate between t and T.

Solution manual says that this should be calculated as $$V_T= \int_0^T \theta_t dS_t \quad \quad (4)$$ and integration by parts yields $$V_T= \theta_T S_T - \int_0^T S_t d\theta_t = \int_0^T (S_T-S_t) d\theta_t = \int_0^T (\frac{S_T}{S_t}-1)S_t d\theta_t = \frac{x}{T} \int_0^T R_{t,T} dt \quad (5)$$

The way the payoffs are derived here is unclear to me. My understanding is that the number of shares is different at each time point but only two prices were used here $S_T$ and $S_t$ while integration is with respect to $\theta$. However the price changes over the time interval as well.

Can anybody explain the reasoning used for the payoffs?
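A discrete-time Monte Carlo sanity check of the identity (my own sketch, with invented GBM parameters): buying at the constant cash rate $x/T$ and summing the gains of the position reproduces $\frac{x}{T}\int_0^T R_{t,T}\,dt$ up to O(dt) discretization error.

```python
import numpy as np

rng = np.random.RandomState(42)
x, T, N, mu, sigma = 1.0, 1.0, 5000, 0.05, 0.2
dt = T / N

# Simulate a GBM price path S_0..S_N
S = np.empty(N + 1)
S[0] = 100.0
for k in range(N):
    S[k + 1] = S[k] * np.exp((mu - 0.5 * sigma**2) * dt
                             + sigma * np.sqrt(dt) * rng.randn())

dtheta = (x / T) * dt / S[:-1]                           # shares bought each step
theta = np.concatenate([[0.0], np.cumsum(dtheta)])[:-1]  # holdings entering each step
V = np.sum(theta * np.diff(S))                           # terminal gains, ∫ θ dS
rhs = (x / T) * np.sum((S[-1] / S[:-1] - 1.0) * dt)      # (x/T) ∫ R_{t,T} dt
print(V, rhs)  # agree up to a small discretization error
```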

### Fefe

#### Good news from the surveillance state: the FBI ...

Good news from the surveillance state: the FBI issued 48,642 National Security Letters in 2015. And the secret FISA Court waved through every single government request for surveillance without rejecting one.

### StackOverflow

#### Intent from a sentence

I need to build a system, or use an online service, that can help me find the intent in a sentence, for example:

"Where would you like to have coffee?" Here the intent is location, because of the "where".

"When would you like to have coffee?" Here the intent is time.

"What coffee would you like to have?" Here the intent is the type of coffee.

How can I build such a system? Can someone guide me on this? Is it an RNN, and if so, how would an RNN do it?

### CompsciOverflow

#### EBNF Grammar and Phrase

Grammar G2 is defined by the following productions in EBNF:

P = 1R | 0QR
Q = 1 | Q0
R = 0 | 0Q | R1

(i) Which of these is a sentence in G2: 10, 11, 00, 011?
(ii) Which of these is a sentential form in G2: 0Q1, 1R1, 110, 0QQ0?
(iii) Which of these is a phrase of the non-terminal Q in G2: R1, 1000, 0R1, R00?

### UnixOverflow

#### Installing Avahi on FreeBSD - Daemon Doesn't Start

I've just installed package avahi-app-0.6.31_5 on a fairly-clean FreeBSD 10.3, but the service isn't starting on its own. I've consulted the documentation and discovered that there isn't any.

Can anyone fill me in on what I've overlooked?
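A plausible first thing to check (an assumption on my part, based on how FreeBSD package services normally behave, not on avahi-app documentation): services installed from packages stay disabled until enabled in /etc/rc.conf, and Avahi depends on D-Bus being up.

```shell
# Enable and start D-Bus and the Avahi daemon (rc.conf config fragment).
sysrc dbus_enable="YES"
sysrc avahi_daemon_enable="YES"
service dbus start
service avahi-daemon start
```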

### StackOverflow

#### How to reorder my list of lists?

I have a list like:

[[a,b],[c,d]]


and I want to have:

[[a,c],[b,d]]


using only functional programming.
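What's being asked for is a transpose; in a functional style this is a one-liner with zip (sketched in Python, since the question's language isn't stated):

```python
# Transpose a list of lists: zip(*...) pairs up the i-th elements.
pairs = [["a", "b"], ["c", "d"]]
transposed = [list(t) for t in zip(*pairs)]
print(transposed)  # [['a', 'c'], ['b', 'd']]
```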

### StackOverflow

#### Using my own dataset for classification

I am building an ANN module to conduct classification in Python. The demo I have imports the ClassificationDataSet module

from pybrain.datasets import ClassificationDataSet
alldata = ClassificationDataSet(2, 1, nb_classes=3)


and I am wondering how I can use my own data. My data is of list type. Is there any processing I need to do?

### CompsciOverflow

#### Recursion Type in Grammar Productions

The grammar G0 is defined by the productions P = xP | y. Which type of recursion is this:
left, central, right, or indirect recursion?

### CompsciOverflow

#### TSP problem with a benchmark data

I've got test Travelling Salesman Problem data with known optimal solutions. It's in the form of a set of 2D points; specifically, the TSPLIB format. Sources are here and here.

I started a simple test with the "Western Sahara - 29 Cities" instance (wi29) and found that my algorithm rapidly found a few particular solutions better than the proposed optimum.

I checked one of them manually and didn't find an error. So, I guess, there are three options:

1. I did a mistake.
2. Wrong optimum.
3. Different problems were solved.

1 and 2. My solution tour is:

17>18>19>15>22>23>21>29>28>26>20>25>27> 24>16>14>13>9>7>3>4>8>12>10>11>6>2>1>5

(will list my checking calculations if requested)

Rounded length: 26040.76. Optimal reference value: 27603.

3. I can't find the particular task descriptions, and especially the rounding policy, for the TSPLIB optima. This is important because the reference values look rounded or discretized in some manner, but simple rounding of the result doesn't seem to reproduce them.

### StackOverflow

#### [PHP] How to log in using the Facebook SDK?

I need your help: what's the method for logging in to Facebook using PHP?

### CompsciOverflow

#### Why CPU needs to be cooled? [on hold]

CPUs are made of silicon, and the conductivity of silicon increases with temperature, so why is there a need to cool CPUs?

#### Is there a formal CS definition of VCS and file versions?

I don't know whether it was a joke, but I once read what was referred to as a formal definition of a file in a versioning system such as git, hg or svn. It described a mathematical object, something like a homeomorphism. Was that a joke, or is there really computer science theory about versioning systems and the mathematics of VCS?

#### How to improve the accuracy of automatic conflict resolution?

When a court reporter strokes 2 different words with the same keys, this creates a conflict. Normally, the reporter will fix the error later, but sometimes there is a way for the court reporting software to fix the error for you. This can be referred to as automatic conflict resolution.

My court reporting software's system for accomplishing this is by recording the parts-of-speech before and after the conflicting words.

So for example, if my two conflicting words are Tallahassee and shake and I type the following sentences it will look something like this:

I eat at Tallahassee / shake all the time. (Prep - Determiner)

I eat a Tallahassee / shake all the time. (Article - Determiner)

At first it will make me choose between the two, but after I choose, it will automatically add the defined part of speech to a database, so that if I type something like this...

I eat in Tallahassee every afternoon. (Prep - Determiner)

my computer should correctly pick "Tallahassee", since I already told it that "Tallahassee" occurs after a Prep and before a Determiner. The rule for this is simply pos word pos.
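The rule can be pictured as a lookup table keyed on the surrounding parts of speech. A minimal sketch of that idea (all names here are mine, not the actual software's):

```python
# Map a part-of-speech context to the word previously chosen in it.
resolutions = {}  # (pos_before, pos_after) -> chosen word

def resolve(conflict_words, pos_before, pos_after, ask_user):
    """Return the stored choice for this context, or ask the reporter
    and remember the answer for next time."""
    context = (pos_before, pos_after)
    if context not in resolutions:
        resolutions[context] = ask_user(conflict_words)
    return resolutions[context]

# First time, the reporter chooses manually...
resolve(["Tallahassee", "shake"], "Prep", "Determiner", lambda ws: ws[0])
# ...afterwards the same (Prep, Determiner) context resolves automatically.
print(resolve(["Tallahassee", "shake"], "Prep", "Determiner", lambda ws: ws[1]))
# prints "Tallahassee"
```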

I tested the practicality of this conflict-resolution system with 79 random conflicts using parts of ANC's pos-tagged corpus and Excel VBA.

• As the data shows, only 10 out of the 79 conflicts showed up with 0 collisions in total. This means that, for these 10 conflicts, no occurrence anywhere in the corpus had a part-of-speech context that would have caused an error.

• 36/79 conflicts showed up with 5 or fewer collisions.

• 32/79 had 10 or more collisions

• Each collision represents 1 guaranteed error of real-time translation given the pos word pos rule (per the parsed text from ANC, which was about 4.5 million words long)

These results aren't very good for the kind of real-time accuracy I hope to achieve. It would be much better if I could get at least 30/79 (as opposed to only 10) to have 0 "guaranteed errors."

How can I improve this system so that I will have fewer real-time translation errors?

My best thought is to change the rule from pos word pos to pos pos word pos in the case of a collision, but that's all I've got. I'm not very experienced on this subject, so I'm not necessarily opposed to the idea of starting over with something fresh.

### StackOverflow

#### Value train not a member of object NaiveBayes

I am new to Spark and am trying to use the MLlib NaiveBayes example from the documentation. I tried to import NaiveBayes, but I get the error below saying it doesn't have a train method. I am not sure how to proceed; any input would be helpful.

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.mllib.classification.NaiveBayes

object NaiveBayes {

  def main(args: Array[String]) {

    val conf = new SparkConf().setMaster("local[1]").setAppName("NaiveBayesExample")
    val sc = new SparkContext(conf)

    val data = sc.textFile("/Users/Desktop/Studies/sample_naive_bayes_data.txt")
    val parsedData = data.map { line =>
      val parts = line.split(',')
      LabeledPoint(parts(0).toDouble, Vectors.dense(parts(1).split(' ').map(_.toDouble)))
    }

    // Split data into training (60%) and test (40%).
    val splits = parsedData.randomSplit(Array(0.6, 0.4), seed = 11L)
    val training = splits(0)
    val test = splits(1)

    val model = NaiveBayes.train(training, lambda = 1.0)

    val predictionAndLabel = test.map(p => (model.predict(p.features), p.label))
    val accuracy = 1.0 * predictionAndLabel.filter(x => x._1 == x._2).count() / test.count()

    println("Accuracy = " + accuracy * 100 + "%")

  }
}


ERROR:

 Error:(26, 28) value train is not a member of object NaiveBayes
val model = NaiveBayes.train(training, lambda = 1.0)
^
Error:(29, 59) value _1 is not a member of Nothing
val accuracy = 1.0 * predictionAndLabel.filter(x => x._1 == x._2).count() / test.count()
^


#### What do Train loss, Valid loss, and Train/Val mean in NNs?

I'm currently learning about convolutional neural networks by studying examples like the MNIST ones. During the training of a neural network, I often see output like:

 Epoch  |  Train loss  |  Valid loss  |  Train / Val
--------|--------------|--------------|---------------
50  |    0.004756  |    0.007043  |     0.675330
100  |    0.004440  |    0.005321  |     0.834432
250  |    0.003974  |    0.003928  |     1.011598
500  |    0.002574  |    0.002347  |     1.096366
1000  |    0.001861  |    0.001613  |     1.153796
1500  |    0.001558  |    0.001372  |     1.135849
2000  |    0.001409  |    0.001230  |     1.144821
2500  |    0.001295  |    0.001146  |     1.130188
3000  |    0.001195  |    0.001087  |     1.099271


Besides the epochs, can someone explain what exactly each column represents and what the values mean? I see a lot of tutorials on basic CNNs, but I haven't run into one that explains this in detail.

#### Wrong Output for prediction using SVR

What could be the possible reason for an SVR to predict wrong outputs for an untrained (test) dataset even though it predicts the right answers for the trained dataset?

I have implemented a grid search for the best C and gamma:

svr_rbf = GridSearchCV(SVR(kernel='rbf', gamma=0.1), cv=5,
                       param_grid={"C": [1e0, 1e1, 1e2, 1e3],
                                   "gamma": np.logspace(-2, 2, 5)})
y_rbf = svr_rbf.fit(train_data, train_trans_time).predict(test_data)


Do I have to train on more data, or is there some other issue?

### CompsciOverflow

#### Should Microsoft have banned installing Windows on Macs back when Macs first used Intel processors?

It seems that Macs had very little adoption until they started using Intel processors and could therefore run Windows. Many consumers were initially scared of getting something with an incompatible OS, but now have the security of knowing they can always install their favorite OS. But after trying out OS X, they realized that it wasn't so bad and never bothered to install Windows, thereby permanently depriving Microsoft of a loyal customer. So what claiming "it runs Windows!" does is let the purchaser try out OS X.

So my logic is: by banning Windows from Macs, although Microsoft would lose a bit of revenue, they would have prevented being phased out by Apple in the high end computer market.

Would the plan have worked? Or did most prospective Mac buyers not care about running Windows anyway?

(Of course, this is ignoring the fact that this would be a very underhanded and anti-competitive move lol)

### QuantOverflow

#### testing for the equality of proportions between groups and categories

I have a large sample of males and females grouped into 6 categories of different sizes. I would like to know whether the proportion of females is the same across all 6 categories. Could you please tell me what statistical test I should use?

Thank you!
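For what it's worth, the usual choice here is a chi-squared test of homogeneity on the 2×6 contingency table of counts (sex × category). A sketch with made-up counts; scipy.stats.chi2_contingency performs the same computation and also returns the p-value:

```python
import numpy as np

# Made-up counts: rows = (female, male), columns = the 6 categories.
counts = np.array([
    [30, 45, 60, 20,  80, 25],   # females
    [70, 55, 90, 40, 120, 35],   # males
])

row = counts.sum(axis=1, keepdims=True)
col = counts.sum(axis=0, keepdims=True)
expected = row * col / counts.sum()           # expected counts under H0
chi2 = ((counts - expected) ** 2 / expected).sum()
dof = (counts.shape[0] - 1) * (counts.shape[1] - 1)
print(f"chi2 = {chi2:.2f} on {dof} degrees of freedom")
```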

### CompsciOverflow

#### What is an addressable cell size?

This question started with a quiz question from my university: Consider a big-endian computer system with an addressable cell size of one byte. The values in memory cells 372 to 375 are shown in the table below. What 16-bit two's complement value (expressed as a decimal number) is stored at address 374?

Address    Value
372        0xC5
373        0x5E
374        0x7F
375        0x23


One thing I'm not clear about is what exactly does "addressable cell size of one byte" mean?
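"An addressable cell size of one byte" means that each memory address refers to a single byte, so the 16-bit value "stored at address 374" occupies the bytes at addresses 374 and 375; on a big-endian machine, the byte at the lower address is the most significant. The decoding can be checked in Python:

```python
# Bytes at addresses 374 and 375, most significant byte first (big-endian).
raw = bytes([0x7F, 0x23])
value = int.from_bytes(raw, byteorder="big", signed=True)
print(value)  # 0x7F23 = 32547 (sign bit is 0, so the value is positive)
```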

### Lambda the Ultimate Forum

#### PL's hotness challenge

Blog post on HN. Only the intro is related to PL:

I’m trying to get into neural networks. There have been a couple big breakthroughs in the field in recent years and suddenly my side project of messing around with programming languages seemed short sighted. It almost seems like we’ll have real AI soon and I want to be working on that. While making my first couple steps into the field it’s hard to keep that enthusiasm. A lot of the field is still kinda handwavy where when you want to find out why something is used the way it’s used, the only answer you can get is “because it works like this and it doesn’t work if we change it.”

Putting my ear to the ground, competition from ML has become more and more common, not just in PL but also in systems. I know many researchers who are repurposing their skills right now, while it has become more difficult to find grad students and interns.

So, is this something we should be worried about? What could happen to make PL more competitive in terms of exciting results/mindshare?

### TheoryOverflow

#### How many subsets of even cardinality does an n-element set have? [on hold]

How many subsets of even cardinality does an n-element set have?

### Planet Theory

#### The shape of the Kresge Auditorium

The image below is a study of the geometry of MIT's Kresge Auditorium.

I found an article by Ivars Petersen claiming that this building's floor plan is "close to the geometry of a Reuleaux triangle" and I wanted to determine whether that was true. Other sources such as a 50-year retrospective published by MIT state that the roof of the building has the shape of an eighth of a sphere (a spherical right equilateral triangle); see this link on making a 3d model of the shape for an amusingly-captioned visualization of its construction.

So, the floor plan is the projection of an eighth-sphere; what is this shape? The edges of the roof are great circle arcs in 3d, so they project to ellipses in 2d. By my calculation, the aspect ratio of these ellipses is √3:1. To see this, let the sphere be the unit sphere in 3d, with the three corners of the roof at (1,0,0), (0,1,0), and (0,0,1), and project it onto the plane x+y+z=0. Then the semimajor axis of the ellipse is the radius of the sphere, 1, while the semiminor axis is the distance from the origin of the projected midpoint of an arc. The midpoint is √2(1/2,1/2,0), its projection is √2(1/6,1/6,-1/3), and the distance is 1/√3. So I drew three ellipses with that aspect ratio, rotated by a third of a circle around their common center, and to complete the illusion of being three-dimensional (though really it's just a 2d drawing) I added another circle, with radius equal to the semimajor axis of the ellipses. Those are the grey and black parts of the figure. The shape of the auditorium floor plan is the central triangle outlined by black arcs.
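The projected-midpoint computation above can be checked numerically; this sketch just reproduces the calculation with numpy:

```python
import numpy as np

# Midpoint of the great-circle arc from (1,0,0) to (0,1,0) on the unit sphere.
mid = np.array([1.0, 1.0, 0.0]) / np.sqrt(2)

# Project onto the plane x + y + z = 0 by removing the component along (1,1,1).
n = np.ones(3) / np.sqrt(3)
proj = mid - mid.dot(n) * n

semiminor = np.linalg.norm(proj)   # distance of the projected midpoint from the origin
semimajor = 1.0                    # the radius of the sphere
print(semimajor / semiminor)       # aspect ratio: sqrt(3) ≈ 1.7320508
```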

The red circles in the drawing are centered at the corners of this triangle, and pass through the other two corners. Their intersection forms a Reuleaux triangle, overlaid on the other curved triangle formed by the projected roof. As you can see, the floor plan is not actually a Reuleaux triangle. It differs from Reuleaux in two significant ways: It has slightly less area, and it has elliptical arcs for sides (with variable curvature, bendier near the corners and flatter near the centers of each side) rather than circular arcs. On the other hand, as Petersen stated, it is very close.

So, to state the obvious, not all curvy triangles are alike! Another example of this same phenomenon is given by the rotor of the Wankel rotary engine: also a curved triangle with sharper angles than the Reuleaux, but with another kind of curve for its sides (the envelope of an epitrochoid). I'm pretty sure this envelope is not an ellipse, even though I don't know how to draw it. And the angles are definitely different. So the Wankel would be yet another kind of curved equilateral triangle that differs from the first two.

### CompsciOverflow

#### Greedy algorithm for submodular optimization

In these notes, https://courses.engr.illinois.edu/cs598csc/sp2011/Lectures/lecture_3.pdf (Section 4.2.1, Exercise 1), the following argument works if $f$ takes values in the integers, but I don't know how to adapt it if $f$ can take values in the reals.

Problem: Given a monotone submodular function $f$ (whose value is computed by an oracle) on $N = \{1, 2, \dots, m\}$, find the smallest set $S \subseteq N$ such that $f(S) = f(N)$.

A greedy algorithm for this problem is as follows:

1. $S \leftarrow \emptyset$
2. while $f(S) \neq f(N)$ {
3. $\quad$ find $i$ to maximize $f(S+i)-f(S)$
4. $\quad$ set $S \leftarrow S\cup \{i\}$
5. }
6. return $S$
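A direct Python transcription of the loop, with $f$ supplied as a value oracle; the coverage function used to exercise it is just an illustrative choice (coverage functions are monotone and submodular):

```python
def greedy_cover(f, ground):
    """Greedy algorithm from the notes: repeatedly add the element with
    the largest marginal gain until f(S) reaches f(N)."""
    target = f(ground)
    S = set()
    while f(S) != target:
        i = max(ground - S, key=lambda j: f(S | {j}) - f(S))
        S.add(i)
    return S

# Toy oracle: f(S) = size of the union of the sets indexed by S.
sets = {1: {1, 2, 3}, 2: {3, 4}, 3: {4, 5, 6}, 4: {1, 6}}
def f(S):
    return len(set().union(*(sets[i] for i in S))) if S else 0

S = greedy_cover(f, set(sets))
print(sorted(S), f(S))
```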

Question: Show that this is a $1+\ln f(N)$ approximation algorithm.

An argument is as follows: If $O$ is an optimal set, we can show that for the $i$ chosen in line 3 of the algorithm,

$$f(S+i) - f(S) \geq \frac{f(N)-f(S)}{|O\setminus S|} \geq \frac{f(N)-f(S)}{|O|}$$

Now letting $S_k$ denote the set $S$ after the $k$'th iteration (so $S_0=\emptyset$), and $z_k = f(N)-f(S_k)$ i.e., $z_k$ the "amount left" after the k'th iteration (so $z_0=f(N)$), the above inequality implies

$$z_k \leq z_{k-1} - \frac{z_{k-1}}{|O|} = \left(1-\frac{1}{|O|}\right)z_{k-1}$$

And therefore $$z_k \leq \left(1-\frac{1}{|O|}\right)^{k}f(N) \leq f(N)\exp(-k/|O|).$$ Setting $k=|O|(1+\ln f(N))$, we have $z_k<1$, and because $f$ takes only integral values, it must be that $z_k=0$.

But the question does not stipulate $f$ take only integral values. How can you deal with an $f$ that takes non-negative reals?

### StackOverflow

#### How can I improve numpy's broadcast

I'm trying to implement k-NN with the Mahalanobis distance in Python with numpy. However, the code below runs very slowly when I use broadcasting. Please tell me how I can improve the numpy speed or implement this better.

from __future__ import division
from sklearn.utils import shuffle
from sklearn.metrics import f1_score
from sklearn.datasets import fetch_mldata
from sklearn.cross_validation import train_test_split

import numpy as np
import matplotlib.pyplot as plt

mnist = fetch_mldata('MNIST original')
mnist_X, mnist_y = shuffle(mnist.data, mnist.target.astype('int32'))

mnist_X = mnist_X/255.0

train_X, test_X, train_y, test_y = train_test_split(mnist_X, mnist_y, test_size=0.2)

k = 2
def data_gen(n):
    return train_X[train_y == n]

train_X_num = [data_gen(i) for i in range(10)]
# Making regularized inverse covariance matrices
inv_cov = [np.linalg.inv(np.cov(train_X_num[i], rowvar=0) + np.eye(784) * 0.00001)
           for i in range(10)]
d = [None] * 10
for i in range(10):
    ivec = train_X_num[i]  # ivec size is (number of 'i' data, 784)
    ivec = ivec - test_X[:, np.newaxis, :]  # this line is very slow and uses huge memory
    iinv_cov = inv_cov[i]
    # Calculate x.T inverse(sigma) x, and keep the k+1 smallest distances
    d[i] = np.sort(np.add.reduce(np.dot(ivec, iinv_cov) * ivec, axis=2), axis=1)[:, :k + 1]
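One way to avoid materialising the huge (n_test, n_class, 784) intermediate array is to expand the quadratic form, $(x-c)^T A (x-c) = x^T A x - 2\,x^T A c + c^T A c$, so that everything reduces to a couple of matrix products. A sketch of that idea (the function and variable names are mine):

```python
import numpy as np

def mahalanobis_sq(test_X, class_X, A):
    """Squared Mahalanobis distance from every row of test_X to every
    row of class_X, where A is the inverse covariance matrix."""
    tA = test_X @ A                                      # (n_test, d)
    xAx = np.einsum('ij,ij->i', tA, test_X)              # x^T A x, (n_test,)
    cAc = np.einsum('ij,ij->i', class_X @ A, class_X)    # c^T A c, (n_class,)
    cross = tA @ class_X.T                               # x^T A c, (n_test, n_class)
    return xAx[:, None] - 2 * cross + cAc[None, :]

# Cross-check against the direct broadcasting version on a small example.
rng = np.random.default_rng(0)
X, C = rng.normal(size=(5, 4)), rng.normal(size=(7, 4))
A = np.eye(4)                       # identity => squared Euclidean distance
diff = X[:, None, :] - C
ref = np.add.reduce((diff @ A) * diff, axis=2)
assert np.allclose(mahalanobis_sq(X, C, A), ref)
```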


### CompsciOverflow

#### Construct a grammar for $a^mb^n$ s.t. $0 \leq n \leq m \leq 3n$

So this is a homework question. It goes as follows:

Construct a grammar over $\{a, b\}$ whose language is $$L = \{a^mb^n \mid 0 \leq n \leq m \leq 3n\}$$

The work I have done is:

$$S \rightarrow aaaSb|aaSb|aSb|\epsilon$$

The intuition behind the solution is that since the number of a's is at least the number of b's but at most 3 times the number of b's, each production of the grammar can take the form $$aaaSb \text{ (3 a's to 1 b), or } aaSb \text{ (2 a's to 1 b), or } aSb \text{ (equal number of a's and b's)}$$

Is this the solution, or am I missing something?

### StackOverflow

#### Length of the longest subsequence in a list of integers

I want to get good at functional programming, so I set myself some tasks.

I want to determine the length of the longest subsequence in a list of integers in which each next element is incremented by 1.

So the result should be

incsubseq [] ~?= 0,
incsubseq [5] ~?= 1,
incsubseq [1,2,3,5,6] ~?= 3,
incsubseq [5,6,1,2,3] ~?= 3,
incsubseq [5,6,1,4,3] ~?= 2,
incsubseq [6,5,4,3,2,1] ~?= 1


incsubseq :: [Int] -> Int
incsubseq [] = 0
incsubseq [_] = 1
incsubseq (a:b)
    | a == ((head b)-1) = 1 + (incsubseq b)
    | a /= ((head b)-1) = (incsubseq b)


But of course this only works for lists that don't have more than one increasing run, e.g. [1,2,3,42] = 3, but not for lists like [1,2,100,101,102], which should be 3 but is NOT (it's 2)!

I would really, really appreciate your help, since this problem is driving me crazy, coming from OO programming.

### CompsciOverflow

#### Proof with closure properties for regular languages [duplicate]

This question already has an answer here:

Hi, how do I prove, using closure properties for regular languages, that $L = \{w \mid w \in \{a,b\}^* \wedge |w|_b = 2|w|_a \}$ is not regular? Thanks.

### QuantOverflow

#### Why do we assume quadratic utility in portfolio theory?

In my text (Investments by BKM), the investor's mean-variance utility (given as $U = E[R] - \frac12A\sigma^2$) is stated to be the objective function we wish to maximize. Upon further digging, it seems that this stems from the assumption of quadratic utility functions ($U = aW - bW^2$). This bothers me, since I see two unrealistic properties of quadratic utility functions: (1) they exhibit increasing absolute risk aversion, and (2) they reach a satiation point, beyond which money/return begins to have negative value.

So why do we assume quadratic utility? Are there no other simple, more realistic functional forms for utility that would still lead to a reasonably clean portfolio optimization theory? Or are the issues I cited about the quadratic just negligible in practice?

### StackOverflow

#### Appropriate Deep Learning Structure for multi-class classification

I have the following data

         feat_1    feat_2 ... feat_n   label
gene_1   100.33     10.2  ... 90.23    great
gene_2   13.32      87.9  ... 77.18    soso
....
gene_m   213.32     63.2  ... 12.23    quitegood


M is large, ~30K rows, and N is much smaller, ~10 columns. My question is: what is an appropriate deep learning structure to learn from and test on data like the above?

At the end of the day, the user will give a vector of genes with expression.

gene_1   989.00
gene_2   77.10
...
gene_N   100.10


And the system will decide which label applies to each gene, e.g. great or soso, etc...

By structure I mean one of these:

• Convolutional Neural Network (CNN)
• Autoencoder
• Deep Belief Network (DBN)
• Restricted Boltzmann Machine

## April 30, 2016

### QuantOverflow

#### Extrapolating SVI

In his paper, Gatheral presents the following parametrization of the total implied variance $w(k,T) = \sigma_{BS}(k,T)^2T$:

$$w(k) = a + b\{\rho (k-m) + \sqrt{(k-m)^2 + \sigma^2} \}.$$

Assume that we only have a few market prices, e.g. 6 or 7, which are close to at-the-money. I wanted to know if there are any common techniques to extrapolate the implied volatility for strikes that are far out-of-the-money.
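One observation that bears on extrapolation: the parametrization above is asymptotically linear in $k$ in both wings, with slopes $b(\rho \pm 1)$, so evaluating the fitted SVI itself far from the money already yields linear total-variance wings. A sketch of the formula (the parameter values are made up):

```python
import numpy as np

def svi_total_variance(k, a, b, rho, m, sigma):
    """Gatheral's (raw) SVI total implied variance w(k)."""
    return a + b * (rho * (k - m) + np.sqrt((k - m) ** 2 + sigma ** 2))

k = np.linspace(-2.0, 2.0, 9)
w = svi_total_variance(k, a=0.04, b=0.4, rho=-0.4, m=0.0, sigma=0.2)
print(np.round(w, 4))
```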

### CompsciOverflow

#### Rotating a set of points around two fixed points

I have a set of points that are used to draw a shape. I want to rotate this shape without moving its start and end points. I have tried to illustrate what I want in the image below (original shape on the left, interpolated on the right). Are there any algorithms to do that? I researched Hermite splines and Bézier curves, but I don't think they are applicable to my problem.

My goal is achieving this action with only 2D transformations:

#### p2k16 Hackathon Report: espie@ on proot

Our very first p2k16 hackathon report comes from none other than Marc Espie, who writes:

Lots of thanks to Gilles Chehade, Epitech Nantes, and Aymeric Fouchault for the organization. It was top-notch. The only complaint I might have is that the food was so good that I might have eaten too much.

### CompsciOverflow

#### Maximum set of equalities, subject to some inequalities

I have $n$ variables $x_1,\dots,x_n$. I'm given a set $E$ of equalities (each of the form $x_i=x_j$ for some $i,j$) and a set $I$ of inequalities (each of the form $x_i \ne x_j$ for some $i,j$). I want to find a maximum-size subset $E' \subseteq E$ such that $E'$ is compatible with $I$, i.e., such that there is an assignment to the $n$ variables that satisfies every inequality in $I$ and every equality in $E'$.

Is there an efficient algorithm for this?

I can see that the greedy algorithm (keep adding equalities to $E'$ as long as doing so doesn't imply the negation of some inequality) doesn't yield an optimal solution. I have no idea what other approaches to try.

Equivalently, the problem can be formulated in graph-theoretic terms. I'm given an undirected graph $G=(V,E)$ and a set $I \subseteq V \times V$. I want to find a maximum-cardinality subset $E' \subseteq E$ of edges, such that when I decompose the graph $G'=(V,E')$ into connected components, $v,w$ are in different connected components for all $(v,w) \in I$.

### TheoryOverflow

#### Subtyping rules for extension of System $F_\omega$ with subtyping and kind-level variance tracking

I need an extension of System $F_\omega$ with subtyping, and where the variance of type constructors is reflected in their kind. Unfortunately, System $F^\omega_{<:}$, as defined in chapter 31 of Pierce's Types and Programming Languages, doesn't address the latter requirement, so I decided to roll my own.

Here is the list of additions to $F_\omega$ I've made so far:

## Polarities

• Ground forms: $+$, $-$.

• Inversion: $+^\dagger = -$ and $-^\dagger = +$.

## Variances

• Ground forms: sets of polarities.

• Inclusion: set inclusion.

• Inversion: memberwise.

## Kinds

• Ground forms: $\Omega/V$ and $K \rightarrow K'$.

• Inclusion:

• Given $V_1 \subseteq V_2$, we can derive $\Omega/V_1 \subseteq \Omega/V_2$.

• Given $K_2 \subseteq K_1$ and $K_1' \subseteq K_2'$, we can derive $K_1 \rightarrow K_1' \subseteq K_2 \rightarrow K_2'$.

## Types

• Kinding:

• $\Gamma \vdash \top, \bot : \Omega/V$.

• $\Gamma \vdash (\rightarrow) : \Omega/V^\dagger \rightarrow \Omega/V \rightarrow \Omega/V$.

• The remaining rules are as one would expect.

• Inclusion:

• Given $\Gamma \vdash T : \Omega/V$ and $\{+\} \subseteq V$, we can derive $\Gamma \vdash \bot \subseteq T \subseteq \top : \Omega/V$.

• Given $\Gamma \vdash T : \Omega/V$ and $\{-\} \subseteq V$, we can derive $\Gamma \vdash \top \subseteq T \subseteq \bot : \Omega/V$.

• Given $\Gamma \vdash T_1, T_2 : \Omega/\{+,-\}$, we can derive $\Gamma \vdash T_1 \subseteq T_2 : \Omega/\{+,-\}$.

• The remaining rules are as one would expect.

And here I ran out of imagination. Now I have the following questions:

• Is what I've sketched so far sound? What sanity checks can I use to make sure I'm not doing something wrong? Perhaps something akin to automated testing in computer programming?

• A type system with polymorphism and subtyping must have bounded quantification. How hard should it be to add?

• A very important desideratum in type systems with subtyping is that the inhabitants of each kind form a lattice. How hard should it be to make sure that each kind is a lattice?

• What's the most convenient tool for mechanizing formal systems like the one I sketched? Preferably, I'd like a library or framework that already does the “boring stuff”, like implementing variable substitution and handling contexts.

### StackOverflow

#### Iterate over parameters without a loop

I need to iterate over two lists of numbers which form the inputs to a function. I'd like to do this in a functional way. Currently I'm doing:

results = []
for i in params_list1:
    for j in params_list2:
        results.append(myfunction(i, j))


where myfunction() returns a number. I'm pretty sure there is a way to take the product of params_list1 and params_list2 (maybe using numpy broadcasting?) and map it over myfunction(), but I'm not able to figure it out. Any tips?
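For what it's worth, the nested loop has a direct functional equivalent in the standard library: itertools.product yields the Cartesian product of the two parameter lists, so the whole thing collapses to one comprehension (myfunction below is just a stand-in):

```python
from itertools import product

def myfunction(i, j):          # stand-in for the real function
    return i * 10 + j

params_list1 = [1, 2, 3]
params_list2 = [4, 5]

results = [myfunction(i, j) for i, j in product(params_list1, params_list2)]
print(results)  # [14, 15, 24, 25, 34, 35]
```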

### Lobsters

#### Circuit Classics

I really appreciate seeing a schematic printed on a circuit board next to its circuit. It reminds me that before Open Hardware, hardware was open. — Andrew (bunnie) Huang

I think this is a really important point about expecting a schematic for a circuit you are building or using.

### QuantOverflow

#### "Hedging" a put option, question on exercise

I have a question on the following exercise from S. Shreve: Stochastic Calculus for Finance, I:

Exercise 4.2. In Example 4.2.1, we computed the time-zero value of the American put with strike price $5$ to be $1.36$. Consider an agent who borrows $1.36$ at time zero and buys the put. Explain how this agent can generate sufficient funds to pay off his loan (which grows by $25 \%$ each period) by trading in the stock and money markets and optimally exercising the put.

The model from Example 4.2.1 he refers to is the following: \begin{align*} S_0 & = 4 \\ S_1(H) & = 8, S_1(T) = 2 \\ S_2(HH) & = 16, S_2(HT) = S_2(TH) = 4, S_2(TT) = 1 \end{align*} with $r = 1/4$ and risk-neutral probabilities $\hat p = \hat q = 1/2$.

So now to my question: I am not sure how to solve this. I guess the agent wants to hedge in some way so that he can always pay off the loan by utilising the option. First, how should I think about it: should he pay back his credit at time step $2$, or earlier if possible by exercising the option, or should he stay liquid until the end? And what does optimal mean here, hedging with minimal investment?

Okay, I solved it by considering two scenarios: exercising the option at time step $1$, and exercising it at time step $2$. By exercising at time step one, I found that he has to invest an additional $0.16$, buy $1/2$ of the share, and accordingly borrow $0.16 - 1/2 \cdot 4$ from the bank/money market. Then at the first time step, as the option was exercised and the value of the portfolio equals $(1 + r)1.36$, he could just invest everything risklessly, i.e. readjust his portfolio by buying no shares of stock and investing $(1+r)1.36$ in the riskless asset; in this way, at time step $2$ he could pay $(1+r)^2 1.36$.

In the second scenario, i.e. exercising the option at time step $2$, I found that he has to invest an additional $1.36$ and buy no shares at the initial step, and then readjust his portfolio in the next step so as to buy $1/12$ of the share if it goes up and $1.06$ if it goes down. By exercising the option, if it goes up, after paying his debt $(1+r)^2 1.36$ his portfolio has the value $1.3$, meaning he still has money; or $-0.3$ if it goes down, meaning he still has some debt (this point I do not fully understand...).

So can someone help me understand and solve this exercise (in case my approach is wrong...)?

### StackOverflow

#### How to make prediction on an image took from my phone

I have a small NN trained to predict handwritten digits; it uses 20*20 images as input, so the input layer size is 400. If I take a photo of a number on my mobile, convert it to grayscale, resize it to 20*20, convert it to a matrix, and then reshape it to 1*400, can I make a prediction on that?
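The pipeline described can be sketched as below; the random array stands in for the actual photo (in practice one would load it with e.g. PIL's Image.open and resize with Image.resize rather than the crude block-averaging used here). Whether the prediction is then any good also depends on matching the training data's preprocessing (scaling, centering, foreground/background polarity):

```python
import numpy as np

# Stand-in for a grayscale phone photo (random pixels, 480x640).
photo = np.random.randint(0, 256, (480, 640)).astype(np.float64)

# Crude resize to 20x20 by averaging 24x32 blocks.
small = photo.reshape(20, 24, 20, 32).mean(axis=(1, 3))

# Flatten to a 1x400 row vector and scale like the training data (e.g. to [0, 1]).
x = small.reshape(1, 400) / 255.0
assert x.shape == (1, 400)
# prediction = model.predict(x)   # with whatever API the trained NN exposes
```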

### QuantOverflow

#### Problems with a Black-Scholes modified equation

I hadn't really studied much financial mathematics until about 2 months ago, so I'm quite new to this stuff; I'm sorry if this is a trivial question. At the moment I'm trying to work out what the terms of a modified Black-Scholes equation are equal to, so if someone could help me out I'd appreciate it. The equation is as follows:

$u(0,S_{0}) = \mathbb{E}^{\mathbb{Q}_{BS}}(\Phi) + \frac{\lambda}{2}\left(\mathbb{E}^{\mathbb{Q}_{BS}}((\Phi^*)^2) - (\mathbb{E}^{\mathbb{Q}_{BS}}(\Phi^*))^2\right)$

where $\Phi^* = s\partial_s\Phi - \Phi$, $\Phi$ is the payoff, and $\mathbb{Q}_{BS}$ is the risk-neutral probability for the BS equation.

My question is: how do I find out what $\Phi^*$ is equal to? In particular, what is $\mathbb{E}^{\mathbb{Q}_{BS}}(\Phi^*)$ equal to? There is a lot of material on calculating $\mathbb{E}^{\mathbb{Q}_{BS}}(\Phi)$, so I know what that is equal to, but I have no idea how to find what the other term is equal to ($\lambda$ is just an arbitrary value, so I don't need to worry about that).

### StackOverflow

#### Random Number generator fail

How do I get the program to loop back to the beginning if the incorrect number is picked? I'm not sure what I'm doing wrong. I have tried if, do-while, while, and if-else:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace ArrayProblms
{
    class Program
    {
        public static void Main(string[] args)
        {
            Console.WriteLine("Guess a number between 1 and 10: ");
            RandomNumberGenerator();
        }

        public static void RandomNumberGenerator()
        {
            Random rand = new Random();
            int userValue = int.Parse(Console.ReadLine());
            int randValue = rand.Next(1, 11);
            int attempts = 0;

            if (userValue == randValue)
            {
                Console.WriteLine("You have guessed correctly!");
            }
            while (userValue != randValue)
            {
                Console.WriteLine("You have guessed incorrectly");
                attempts++;
                Console.WriteLine("You have made {0} incorrect guesses", attempts);
                break;
            }
        }
    }
}


### StackOverflow

#### Best way to get intersection of keys of two objects?

I have two object literals like so:

var firstObject = {
    x: 0,
    y: 1,
    z: 2,

    a: 10,
    b: 20,
    e: 30
};

var secondObject = {
    x: 0,
    y: 1,
    z: 2,

    a: 10,
    c: 20,
    d: 30
};


I want to get the intersection of the keys these two object literals have like so:

var intersectionKeys  = ['x', 'y', 'z', 'a']


I can obviously do a loop and see if a key with the same name exists in the other object, but I am wondering if this would be a good case for some functional programming and map / filter / reduce usage? I haven't done much functional programming myself, but I have a feeling there could be a clean and clever solution to this problem.

#### Weka does not have NominalToNumeric Filter [on hold]

In my dataset, there are 3 nominal attributes that I want to convert to numeric for the purposes of the k-means clustering algorithm. In Weka, the only filter I found is NominalToBinary, and when I use it, it creates new attributes corresponding to the number of nominal values. Is that normal? Why is there no NominalToNumeric in Weka?

Thank you.

#### Sentiment Analysis classifier using Machine Learning

How can we make a working classifier for sentiment analysis, given that we need to train the classifier on huge data sets?

I have a huge data set to train on, but the classifier object (here using Python) gives a memory error when using 3000 words, and I need to train on more than 100K words.

What I thought of was dividing the huge data set into smaller parts, making a classifier object for each, storing each in a pickle file, and using all of them. But it seems that using all the classifier objects for testing is not possible, as only one of the objects is used during testing.

The solution that comes to mind is either to combine all the saved classifier objects stored in the pickle files (which is just not happening) or to keep appending new training data to the same object (but again, it is being overwritten, not appended).

I don't know why, but I could not find any solution to this problem, even though it is fundamental to machine learning. Every machine learning project needs to be trained on a huge data set, and the object size for training on those data sets will always give a memory error.

So, how do I solve this problem? I am open to any solution, but would like to hear what people who do real machine learning projects follow.

Code Snippet :

documents = [(list(movie_reviews.words(fileid)), category)
             for category in movie_reviews.categories()
             for fileid in movie_reviews.fileids(category)]

all_words = []
for w in movie_reviews.words():
    all_words.append(w.lower())
all_words = nltk.FreqDist(all_words)
word_features = list(all_words.keys())[:3000]

def find_features(document):
    words = set(document)
    features = {}
    for w in word_features:
        features[w] = (w in words)
    return features

featuresets = [(find_features(rev), category) for (rev, category) in documents]
numtrain = int(len(documents) * 90 / 100)
training_set = featuresets[:numtrain]
testing_set = featuresets[numtrain:]

classifier = nltk.NaiveBayesClassifier.train(training_set)


PS: I am using the NLTK toolkit with NaiveBayes. My training dataset is opened and stored in documents.
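One common workaround (outside NLTK) is to use an estimator that supports incremental training, such as scikit-learn's MultinomialNB with partial_fit, which updates a single classifier chunk by chunk instead of pickling several separate ones. A sketch on synthetic count features (the chunking and sizes below are made up):

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB

clf = MultinomialNB()
classes = np.array([0, 1])               # all labels must be declared up front
rng = np.random.default_rng(0)

for _ in range(10):                      # 10 chunks standing in for batches of files
    X = rng.integers(0, 5, size=(100, 3000))   # 3000-word count features
    y = rng.integers(0, 2, size=100)
    clf.partial_fit(X, y, classes=classes)     # updates the counts in place

print(clf.predict(X[:3]))
```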

### CompsciOverflow

#### Smallest real root of a polynomial in a range

I'm trying to numerically find the smallest real root of a polynomial in a given range. My initial plan was to shift the polynomial so that the bottom of the range was 0, expand the resulting expression to find the new coefficients, and then use the Jenkins-Traub algorithm until it finds the first (smallest) root or moves out of the range. However, the Wikipedia page says that it only generally finds the roots in increasing order, so it sounds like that's not guaranteed. Is there a way to guarantee finding the first root? If not, what is the most efficient way to solve the problem?

Finding all the roots and then sorting is possible, but becomes inefficient as the degree of the polynomial gets larger and is, I hope, unnecessary. The bisection method is another common algorithm, but consider the polynomial $-5x^2+5x-1$ with the range $(0,2)$: the polynomial is negative at both endpoints, so bisection fails to bracket either root. The best answer is one that guarantees success with the fastest time for a polynomial of reasonable degree (less than, say, 10).
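For degrees this small, one robust route (a sketch, not a claim about the asymptotically best method) is to take all companion-matrix eigenvalue roots, keep the real ones inside the range, and return the minimum; this sidesteps the bracketing failure above:

```python
import numpy as np

def smallest_real_root(coeffs, lo, hi, tol=1e-9):
    """Smallest real root of the polynomial (coefficients highest-degree
    first) strictly inside (lo, hi), or None if there is none.
    np.roots computes the eigenvalues of the companion matrix."""
    roots = np.roots(coeffs)
    real = roots[np.abs(roots.imag) < tol].real   # keep numerically real roots
    in_range = real[(real > lo) & (real < hi)]
    return in_range.min() if in_range.size else None
```

For example, `smallest_real_root([-5, 5, -1], 0, 2)` returns the root $(5-\sqrt5)/10 \approx 0.276$ that endpoint-sign bisection on $(0,2)$ misses.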

### QuantOverflow

#### Interpolation of forward zeros-coupons bonds simulations for missing maturities (ESG data)

I have a set of economic scenarios simulated with the Barrie & Hibbert ESG. The stochastic model used for interest rates is the shifted LIBOR Market Model. I am facing a problem with zero-coupon prices.

Indeed, for each maturity I have 2,000 simulated trajectories of forward zero-coupon prices (from 1 year to 30 years). I have the maturities (1; 2; 3; 5; 8; 10; 15; 20; 30; 40; 50; 60) for each forward price, but I want the maturities from 1 to 30 at an annual pace.

I cannot regenerate the scenarios: I have to work with these simulations, and I don't have the swaption prices used to calibrate the model. So I have to interpolate the missing maturities across the 2,000 trajectories. Considering that I have to project in the risk-neutral world, zero-coupon prices are martingales seen from $t = 0$: $E[B(t,T)D_t|\mathcal{F}_0]=B(0,t;T)$ with:

• $B(t,T)$: the price, seen at $t$, of the zero-coupon bond with maturity $T$.
• $D_t$: the deflator used to calculate the present value of a cash flow at $t$.
• $\mathcal{F}_0$: the filtration (the information at $t=0$).

So I cannot do a cubic interpolation without taking into account that the interpolated price must remain a martingale in expectation (the mean) across all trajectories.

What can you propose to interpolate the zero-coupon prices for the missing maturities so that the interpolated prices are still martingales?
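One pragmatic sketch (my suggestion, not Barrie & Hibbert methodology): interpolate each trajectory in log-price, then rescale each new maturity multiplicatively so that the Monte-Carlo mean of the deflated prices matches the known $t=0$ curve, which restores the martingale condition in the sample mean by construction:

```python
import numpy as np

def interpolate_zc(deflated, mats, new_mats, target_curve):
    """deflated: (n_sims, len(mats)) simulated deflated ZC prices B(t,T)*D_t.
    target_curve: the t=0 prices B(0,T) at new_mats (known exactly from the
    initial curve). Interpolation is linear in log-price per trajectory
    (swap in a cubic spline if preferred); the multiplicative rescale then
    forces the sample mean onto the target curve."""
    logp = np.log(deflated)
    interp = np.exp(np.vstack([np.interp(new_mats, mats, row) for row in logp]))
    interp *= target_curve / interp.mean(axis=0)  # enforce E[B(t,T) D_t] = B(0,T)
    return interp
```

The rescaling factor is the same across trajectories for a given maturity, so the cross-sectional shape of the simulated distribution is preserved while the expectation is pinned to the curve.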

### CompsciOverflow

#### If graph isomorphism yields a polynomial-time algorithm

Greetings! I'm studying computability theory and am trying to grasp the concept of complexity classes.

If graph isomorphism (suspected to be NP-intermediate) turns out to have a polynomial-time solution, what possible implications does that have, or what can we deduce?

Thanks for any explanation.

#### Find all shortest paths in a graph where path has even number of edges and greater than 6

Let $G=(V,E)$ be a directed graph with non-negative weights ($w:E\to\mathbb{R}^+$). Describe an algorithm that finds all shortest paths in the graph from a source vertex $s\in V$ such that each path has an even number of edges and the number of edges is greater than or equal to $6$.

So I know I need to use Dijkstra's algorithm on a modified graph. Somehow I need to "count" the number of edges. I think I need to add some vertices for each vertex which will do the "counting", but I can't figure it out completely.

I'd be glad for help.

Thanks.
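The standard construction can be sketched as follows: run Dijkstra on a layered product graph whose second coordinate tracks the edge count, capped at 6 while preserving parity, so only $8|V|$ states are needed; the answer for each vertex $v$ is the distance to state $(v, 6)$:

```python
import heapq

def shortest_even_ge6(adj, s):
    """adj: {u: [(v, w), ...]} directed graph with non-negative weights.
    Returns {v: cost} of the shortest s->v path whose number of edges is
    even and >= 6. State (v, c): c = edges used so far, with counts >= 6
    collapsed to 6 (even) or 7 (odd)."""
    INF = float("inf")
    dist = {}
    pq = [(0, s, 0)]
    while pq:
        d, v, c = heapq.heappop(pq)
        if (v, c) in dist:
            continue
        dist[(v, c)] = d
        for u, w in adj.get(v, []):
            nc = min(c + 1, 6 + (c + 1) % 2)  # cap at 6, keep parity
            if (u, nc) not in dist:
                heapq.heappush(pq, (d + w, u, nc))
    return {v: dist.get((v, 6), INF) for v in adj}
```

A path must therefore reach the "even and at least 6" layer; a 6-edge chain gives cost 6 directly, while a triangle hanging off the source forces padding loops until the parity and count constraints are met.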

#### Show that language generated by grammar is regular

We have a grammar with nonterminals $X_1,\dots,X_n$, terminals $V_t$, and rewriting rules of the form:

$X_i \rightarrow a \in V_t$

$X_i \rightarrow X_jX_k, \ i \ge j , \ i > k$

How can I show that the language generated by this grammar is regular?

I don't think this is a duplicate question, because:

• I don't have a concrete language, only a set of nonterminals and a set of terminals.
• I have to show that, for every possible combination of terminals and nonterminals, the resulting language is regular.

If I had been given a particular language, I could prove it by giving a DFA, showing that the rules are only of linear type, etc.

#### How to Modify SAT Solvers to Produce Resolution Refutations for Unsatisfiable Instances?

In recent SAT competitions, there is a Certified UNSAT track. The problem instances are all unsatisfiable, and the solvers are asked to produce certificates of unsatisfiability. One way is to produce a resolution refutation for the set of clauses in a problem instance. How can a SAT solver, say MiniSat, be modified to do this? Can this be done for every solver? I searched the Internet and found little information; a reference would be fine.

### Wes Felter

#### "I have news for you: this is not the middle of the PS4’s lifespan, and there is no PS5."

“I have news for you: this is not the middle of the PS4’s lifespan, and there is no PS5.”

- David Galindo

### CompsciOverflow

#### Origin of the concept of types

While surveying the state of the art in type theory, I have these questions, all related to each other:

1. Where did the idea of a type come from? (It seems it all started when Russell and Whitehead proposed a way to avoid the contradiction we know today as Russell's Paradox; am I right?)
2. Before the type concept was considered, was there something similar? (Maybe a refinement of a set, but I don't find a reference other than Russell.)
3. Who was the first person to put it in formal terms? (Was it Russell, with his paper of 1908, or ...?)

#### Toads and frogs game algorithm

I am looking for an algorithm (or a hint where to start) for the Toads and Frogs game. What I am interested in is not how to solve the game (that's NP-hard), but how to plan one player's moves, i.e. how to design a computer player (AI) which could win against another player (another program or a human). I was looking for some clues, but with no luck so far; there's not much about it on the Web.

Link to game description on Wikipedia.

And here you can play the game. Please bear in mind that starting positions may not be that straightforward; they may be mixed up from the very start.
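Positions reachable in actual play are small enough that memoized game-tree search is a reasonable starting point for a computer player. A sketch, assuming the usual conventions ('T' toads slide/jump right, 'F' frogs slide/jump left, jumps only over a single opposing piece, and a player who cannot move loses):

```python
from functools import lru_cache

def swap(board, i, j):
    b = list(board)
    b[i], b[j] = b[j], b[i]
    return ''.join(b)

def moves(board, player):
    """All positions after one legal move for `player`."""
    out = []
    step = 1 if player == 'T' else -1
    opp = 'F' if player == 'T' else 'T'
    n = len(board)
    for i, c in enumerate(board):
        if c != player:
            continue
        j, k = i + step, i + 2 * step
        if 0 <= j < n and board[j] == '_':                  # slide into blank
            out.append(swap(board, i, j))
        if 0 <= k < n and board[k] == '_' and board[j] == opp:  # jump opponent
            out.append(swap(board, i, k))
    return out

@lru_cache(maxsize=None)
def wins(board, player):
    """True iff the player to move wins under normal play
    (the opponent is eventually left without a move)."""
    other = 'F' if player == 'T' else 'T'
    return any(not wins(b, other) for b in moves(board, player))
```

A player then simply picks any move `b` with `not wins(b, other)`; for longer boards the same search can be depth-limited with a heuristic evaluation, or replaced by the combinatorial-game-theory values studied for this game.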

### QuantOverflow

#### Volatility of investment (/w currency hedging) [on hold]

I've been trying to compute the volatility of an investment with currency hedging, and I have a question. Let's take this example: we have our money in a fund tracking the S&P 500 index, which has 16% volatility, and we also know that the current volatility of the dollar against our currency is 5%. We want to know the volatility of the whole investment.

Can I compute it with the following formula? If so, what is the reason for adding the two variances instead of multiplying them, considering that the volatility of an index and a currency are mutually independent (if any correlation exists, it is undefined, at least to my knowledge)?

$$\sigma=\sqrt{16^2+5^2}$$
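For independent (zero-correlation) returns that formula is the right one: variances add, not standard deviations, and the general two-factor formula carries a correlation cross term that vanishes at $\rho=0$. A quick check with the question's numbers:

```python
import math

def combined_vol(sigma_asset, sigma_fx, rho=0.0):
    """Volatility of the combined position: variances (not deviations) add,
    plus a cross term 2*rho*s1*s2 that vanishes when correlation is zero."""
    return math.sqrt(sigma_asset**2 + sigma_fx**2
                     + 2 * rho * sigma_asset * sigma_fx)

total = combined_vol(0.16, 0.05)  # about 0.168, i.e. roughly 16.8%
```

Note that with perfect correlation ($\rho=1$) the formula collapses to the plain sum $16\%+5\%=21\%$, which is the intuition behind adding variances rather than multiplying anything.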

### Lobsters

#### Dirty Hands - Cheating in Professional Bridge

There’s some interesting examples here of opsec and adhoc ciphers.

### StackOverflow

#### Tensorflow conv2d_transpose output_shape parameter format

I have recently started with TensorFlow and am building a convolutional autoencoder. I wanted to know the format in which the output_shape parameter (a 1-D tensor) is required by the tf.nn.conv2d_transpose function. What are the indices of the output image's height, width, and number of channels?

The documentation for this function only states that output_shape is a 1-D tensor.
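For the default NHWC data format, output_shape is [batch, height, width, out_channels], i.e. height at index 1, width at index 2, channels at index 3 (NCHW puts channels at index 1). The spatial sizes follow the usual transposed-convolution arithmetic, sketched here without TensorFlow so the indices are explicit:

```python
def deconv_output_shape(batch, in_h, in_w, out_channels,
                        kernel, stride, padding="SAME"):
    """output_shape for tf.nn.conv2d_transpose with NHWC data:
    [batch, height, width, out_channels].
    SAME:  out = in * stride
    VALID: out = (in - 1) * stride + kernel"""
    if padding == "SAME":
        out_h, out_w = in_h * stride, in_w * stride
    else:  # VALID
        out_h = (in_h - 1) * stride + kernel
        out_w = (in_w - 1) * stride + kernel
    return [batch, out_h, out_w, out_channels]
```

So for a 4x4 feature map upsampled with a 3x3 kernel at stride 2, you would pass [1, 8, 8, C] under SAME padding or [1, 9, 9, C] under VALID.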

### Lobsters

#### How To Correct Someone

Not exactly programming, but there are some universal truths to training.

### CompsciOverflow

#### Which languages (apart from the list below) can do Curry-Howard type checking at compile-time?

The Curry-Howard correspondence is enormously powerful. I'd like to use it, but I'd like a choice other than Haskell and Scala.

The languages that I can see that support the Curry-Howard correspondence (with type checking at compile time) are:

• Agda
• Coq
• Scala
• Shen
• Qi II

My question is: which languages, apart from those listed above, can do Curry-Howard type checking at compile time?

#### Is function application actually a memory manipulation algorithm?

I thought about how, in the lambda calculus (and many implementations of functional programming languages), function (lambda) application and the lambda itself, as constructs, are "primitive things", usually implemented somehow by an interpreter. Then I thought: can you "boil down" these two things to more primitive stuff? For instance, take the following expression (apply is usually implicit in the syntax conventions, but whatever):

(apply (\x.\y.x) (a b))

The interpreter:
1. constructs a new environment in which the arguments are bound to the lambda's parameters, i.e. new_env = this_env.append({"x": a, "y": b, "body": "x"});
2. rewrites the whole application term with the lambda's body, i.e. new_env["body"].

Given only the environment-manipulation "primitives", like "construct", "append", and "get", doesn't that make the whole lambda calculus just a clever trick to hide memory ("tape") mutations? Now, I know that Turing machines and the lambda calculus are equivalent, but is there something more to LC than what I've described? What have I missed?
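That intuition can be made concrete: in a minimal environment-passing evaluator (a sketch, with terms as tagged tuples), "apply" is exactly "extend the environment, then evaluate the body", and the only memory operations are dictionary construction and lookup:

```python
def evaluate(term, env):
    """Terms: ('sym', s) | ('var', name) | ('lam', name, body) | ('app', fn, arg).
    A lambda evaluates to a closure capturing the current environment;
    application extends that captured environment with one new binding."""
    tag = term[0]
    if tag == 'sym':                       # free constant: evaluates to itself
        return term[1]
    if tag == 'var':
        return env[term[1]]                # "get"
    if tag == 'lam':
        return ('closure', term[1], term[2], env)
    if tag == 'app':
        _, name, body, cenv = evaluate(term[1], env)
        arg = evaluate(term[2], env)
        return evaluate(body, {**cenv, name: arg})  # "append", then run body

# (\x.\y.x) a b  -->  a
K = ('lam', 'x', ('lam', 'y', ('var', 'x')))
expr = ('app', ('app', K, ('sym', 'a')), ('sym', 'b'))
```

Note that the environments here are immutable (each application builds a fresh extended dict), which is one answer to the question: the calculus itself never mutates anything; mutation is an implementation strategy, not part of the model.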

### StackOverflow

#### Weka API giving ArrayIndexOutOfBoundsException?

I am trying to do prediction using the Weka API. I have saved a model from the Weka Explorer, and now I want to predict a value for a single instance.

Here is the Java code:

import weka.classifiers.Classifier;
import weka.core.Attribute;
import weka.core.DenseInstance;
import weka.core.Instance;
import weka.core.Instances;
import weka.core.SerializationHelper;
import java.util.ArrayList;

public class topcoder {

    public static void main(String[] args) throws Exception {

        Classifier classifier = (Classifier) SerializationHelper.read("C:\\Users\\abc\\Desktop\\one.model");

        Attribute attr1 = new Attribute("customer_id");
        Attribute attr2 = new Attribute("month");
        Attribute attr3 = new Attribute("call_exp");
        Attribute attr4 = new Attribute("data_exp");
        Attribute attr5 = new Attribute("sms_exp");
        Attribute attr6 = new Attribute("total_exp");
        Attribute attr7 = new Attribute("real_exp");

        ArrayList<Attribute> attributes = new ArrayList<Attribute>();

        Instances testing = new Instances("test", attributes, 0);
        System.out.println(testing.numAttributes());

        Instance inst_co = new DenseInstance(testing.numAttributes());
        inst_co.setValue(attr1, 10000);
        inst_co.setValue(attr2, 12);
        inst_co.setValue(attr3, 300);
        inst_co.setValue(attr4, 50);
        inst_co.setValue(attr5, 10);
        inst_co.setValue(attr6, 360);
        inst_co.setMissing(attr7);
        System.out.println(inst_co.toString());

        double res = classifier.classifyInstance(inst_co);   // the line that throws
        System.out.println(res + "  io");
    }
}


one.model is the model I saved from the Weka GUI, and attr7 (real_exp) is what I am trying to predict.

Here is the stack trace:

java.lang.ArrayIndexOutOfBoundsException: 8
at weka.core.DenseInstance.value(DenseInstance.java:347)
at weka.filters.supervised.attribute.NominalToBinary.convertInstanceNumeric(NominalToBinary.java:680)
at weka.filters.supervised.attribute.NominalToBinary.convertInstance(NominalToBinary.java:462)
at weka.filters.supervised.attribute.NominalToBinary.input(NominalToBinary.java:225)
at weka.classifiers.functions.LinearRegression.classifyInstance(LinearRegression.java:352)
at topcoder.main(topcoder.java:91)


I am not sure what I am doing wrong. Please help.

### Planet Emacsen

#### Grant Rettke: Emacs Keyboard Design 30: Smaller Modifiers

• After 3 years using single-key modifiers on a MBP and HP laptop I can do the same thing here.
• USB spec defines all F keys so use them and see what happens
• Get rid of right side of board in the process

• Add F13 to F24
• Make bottom 1 wide:
• Because
• They are flow interrupting actions
• It is OK to find home with 1 key
• Keys
• Alt
• Meta
• Ctrl
• Gui
• Ultra
• Shift
• Leave middle modifiers alone because they are pinky used; need to be large
• Make spacebar and enter 4 wide
• 3 is too small
• 4 is perfect
• Move arrows below right hyper
• Move pgup pgdn below left hyper
• Make them 1.5w
• Move home above left arrow and end above right arrow
• Move caps lock above escape making it 1w
• Move insert and delete above backspace left
• This leaves PrSc, ScrollLock, and Pause hanging
• Replace CapsLock with PrSc and del caps lock
• Delete Scroll Lock and Pause
• Make delete and insert wide
• Made F4 and F7 homing keys if it isn’t obvious

### Lambda the Ultimate

#### Simon Peyton Jones elected into the Royal Society Fellowship

Simon Peyton Jones has been elected as a Fellow of the Royal Society. The Royal Society biography reads:

Simon's main research interest is in functional programming languages, their implementation, and their application. He was a key contributor to the design of the now-standard functional language Haskell, and is the lead designer of the widely-used Glasgow Haskell Compiler (GHC). He has written two textbooks about the implementation of functional languages.

More generally, Simon is interested in language design, rich type systems, compiler technology, code generation, runtime systems, virtual machines, and garbage collection. He is particularly motivated by direct use of principled theory to practical language design and implementation -- that is one reason he loves functional programming so much.

Simon is also chair of Computing at School, the grass-roots organisation that was at the epicentre of the 2014 reform of the English computing curriculum.

Congratulations SPJ!

### StackOverflow

#### Neural Network for function approximation

I have created a feedforward neural network in MATLAB which is supposed to approximate sine-like functions. I have tested the network for:

1. Various numbers of neurons in the hidden layer (around 10 produces the best result)
2. The best transfer function in the hidden layer (logsig gives the best results)
3. The best training function (trainlm gives the best results)

This network, with the settings specified above, produces really good approximations, but I don't know why:

1. logsig is better than tansig or hardlim
2. trainlm is better than the other training functions available in MATLAB

In general, why are sigmoidal transfer functions better for approximation tasks, and why does Levenberg-Marquardt backpropagation suit the given problem best?

### CompsciOverflow

#### Total number of calls during insertion into binary tree

The problem:

Find a formula for the total number of calls occurring during the insertion of n elements into an initially empty set. Assume that the insertion process fills up the binary search tree level-by-level. Leave your answer in the form of a sum.

code for INSERT function:

procedure INSERT(x: elementtype; var A: SET);
begin
    if A = nil then begin
        new(A);
        A^.element := x;
        A^.leftchild := nil;
        A^.rightchild := nil
    end
    else if x < A^.element then
        INSERT(x, A^.leftchild)
    else if x > A^.element then
        INSERT(x, A^.rightchild)
end;


The main confusion for me here is with leaving my answer in the form of a sum. I'm not all that great with sums (I haven't taken Calc 2 yet), so I don't really know how to set them up or extract information from them. Any help here would be greatly appreciated.

For clarity: This is a review problem where the answer is:

Let $2^k \leq n < 2^{k+1}$. Then $k = \lfloor \log_2 n \rfloor$ and the number of calls equals

$$\sum_{i = 0}^{k - 1} (i + 1)2^i + (k + 1)(n - 2^k + 1)$$

I'd like to know the process behind getting this answer. Thank you.
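The idea behind the answer: in a level-by-level fill, the $j$-th inserted element lands at depth $\lfloor\log_2 j\rfloor$, and inserting it makes one call per node on its path plus one for the nil slot, i.e. $\lfloor\log_2 j\rfloor + 1$ calls. The full levels $0,\dots,k-1$ contribute the sum term, and the $n-2^k+1$ elements on the last level contribute $k+1$ calls each. A short check that the closed form matches the direct count:

```python
def calls_closed_form(n):
    # sum_{i=0}^{k-1} (i+1) 2^i + (k+1)(n - 2^k + 1), with 2^k <= n < 2^{k+1}
    k = n.bit_length() - 1
    full_levels = sum((i + 1) * 2**i for i in range(k))
    return full_levels + (k + 1) * (n - 2**k + 1)

def calls_direct(n):
    # the j-th element inserted level-by-level sits at depth floor(log2 j),
    # costing depth + 1 calls -- which is exactly j.bit_length()
    return sum(j.bit_length() for j in range(1, n + 1))
```

For example, $n=3$ gives $1 + 2\cdot2 = 5$ calls both ways.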

#### complexity of incrementally creating a graph [duplicate]

What is the complexity of the two functions f() and g()? They both build a graph G(V, E) incrementally; I think they are both O(|V|)?

f(n):

1. Start with an initial graph K3 (a triangle);
2. for (i = 1, n-3) { add a vertex v to V and connect it to two randomly chosen vertices from V; }

g(n):

1. Start with an initial graph G = K3 (a triangle);
2. While the number of nodes of the graph G is < n: { replace an edge (e1, e2) randomly selected from G by a graph f(m); m <= n - |V| }.

/* The replacement is performed as follows: choose a random edge (e1, e2) from G; choose two nodes {e'1, e'2} from a graph H(V', E') created by f(m); V = V ∪ V' \ {e1, e2}; E = E ∪ E' \ {(e1, e2)}. */

Thank you.

### Planet Emacsen

#### Irreal: What to do When Emacs Hangs or Crashes

Jisang Yoo has a very nice post on recovering from Emacs hangs or crashes. He considers three topics

1. What to do when Emacs hangs
2. How to enable debugging
3. What to do when Emacs crashes

His advice on Emacs hanging seems mostly aimed at Windows users and doesn't mention my preferred method of

pkill -SIGUSR2 Emacs


That will usually unstick Emacs enough that you can save your files and quit. Sometimes you can even keep going but I've found it's generally better to save your files and restart Emacs. If you aren't on a Mac, you will want to use

pkill -SIGUSR2 emacs


The problem with a crash is that you can lose unsaved work. What I've always done in that case is to use recover-file to get the file from disk and fold in the information in the autosave file. Yoo suggests using recover-session instead. This has the advantage of recovering all files from the session that crashed. That's something I didn't know but that I'll use from now on.

Yoo's post is fairly short and one well worth reading. Knowing the things he talks about doesn't make crashes/hangs painless but it does take a lot of the sting out of them.

### CompsciOverflow

#### Numerical Stability of Halley's Recurrence for Integer $n^{\mathrm{th}}$-Root

tl;dr? See last paragraph.

If I use the initial value $2^{\left(\big\lfloor\lfloor\log_2 x \rfloor/n\big\rfloor + 1\right)}$ with Halley's recurrence in the compact form

$x_{k+1} = \frac{x_k\Big[A\left(n+1\right) + x_k^n\left(n-1\right) \Big] }{A\left(n-1\right) + x_k^n\left(n+1\right)}$

to evaluate $\lfloor x^{1/n}\rfloor$ with $x,n \in \mathbb{N}$, $x \gg 1$ and $n > 2$ it seems (empirical tests only!) to work. Slowly.

As is the case with all of these methods: the closer the initial value $x_0$ is to the actual root, the fewer iterations are needed. Many papers have been written about this, although not many for the integer versions, but a simple refinement can be implemented by observing that the root lies between $2^{\lfloor \lfloor\log_2 x \rfloor / n \rfloor + 1}$ and $2^{\lfloor \lfloor\log_2 x \rfloor / n \rfloor}$, so a simple arithmetic average of these limits should give a significant decrease in the number of iterations needed and, lo and behold, it does. Small problem: the algorithm is unstable with that seed. This was visible in a plot (not reproduced here) with a heavily abused y-axis (#iterations, bit size of the root, and absolute error) and the index along the x-axis. The radicand used was $5987^{797}$.

The range of indices where the error occurs is outside the range where I would use Halley's recurrence before changing to bisection. The cut-off point I have chosen is the intersection between the cost of bisection, which behaves like $ax^{-1}$, and the approximately linear initial part of the Halley iteration count, $bx$, which puts the intersection at $\sqrt{a/b}$. Some runs with up to $3\,321\,960$ bits (ca. one million decimal digits) showed $\Big\lfloor\sqrt{A_b\left(\left\lfloor\log_2\left(\left\lfloor\log_2 \left(A_b\right) \right\rfloor\right) \right\rfloor + 1\right)}\Big\rfloor$ with $A_b = \left\lfloor\log_2 \left(A\right) \right\rfloor$ to be a good estimate for the big-integer library in use.

Hence my question: is Halley's recurrence, implemented as described above, numerically stable in the range $(3,\Big\lfloor\sqrt{A_b\left(\left\lfloor\log_2\left(\left\lfloor\log_2 \left(A_b\right) \right\rfloor\right) \right\rfloor + 1\right)}\Big\rfloor)$ with the initial value the arithmetic average of $2^{\lfloor \lfloor\log_2 x \rfloor / n \rfloor + 1}$ and $2^{\lfloor \lfloor\log_2 x \rfloor / n \rfloor}$ or not and, much more interesting, why?
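For reference, the iteration in question can be sketched with arbitrary-precision integers as follows (the power-of-two seed from the question, plus a final correction loop that settles on the exact floor no matter where the iteration stops, which also masks small instabilities of the seed):

```python
def inthroot(A, n):
    """Floor of the integer n-th root via the compact Halley form
    x_{k+1} = x_k * (A(n+1) + x_k^n (n-1)) // (A(n-1) + x_k^n (n+1)),
    seeded with 2^(floor(floor(log2 A)/n) + 1), an upper bound on the root."""
    if A < 2:
        return A
    x = 1 << ((A.bit_length() - 1) // n + 1)
    while True:
        xn = x ** n
        nxt = x * (A * (n + 1) + xn * (n - 1)) // (A * (n - 1) + xn * (n + 1))
        if nxt >= x:          # no further progress: stop before oscillating
            break
        x = nxt
    while x ** n > A:          # correct to the exact floor
        x -= 1
    while (x + 1) ** n <= A:
        x += 1
    return x
```

With exact rational/integer arithmetic the "numerical stability" question reduces to whether the floor divisions can drive the iterate away from the root faster than the cubic convergence pulls it back; the correction loop above makes the result exact either way, at the cost of a few extra multiplications.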

#### Distributed systems according to Lamport?

I need some clarification over the following quotation of Leslie Lamport:

A distributed system can be described as a particular sequential state machine that is implemented with a network of processors. The ability to totally order the input requests leads immediately to an algorithm to implement an arbitrary state machine by a network of processors, and hence to implement any distributed system.

I see how a distributed system can be modeled as the product of several particular sequential state machines.

I understand that by totally ordering input requests, one can duplicate a given sequential state machine over several computers, and make sure that all input requests are processed in the same order at every computer, and thus ensuring that the individual local states are coherent.

What I don't understand is how this mechanism is sufficient to implement ANY distributed system. At least it's sufficient to duplicate a sequential state machine, but isn't that a very particular case of a distributed system?

### QuantOverflow

#### Tradable information from BS Implied volatility

These are two follow up questions to:

Implied volatility as price transform

1. I understand that the BS model is used as a 'black box' that takes a market price and maps it in a one-to-one fashion to a 'BS implied volatility'. What I don't understand is how there is any actionable information in that IV number, given that everybody knows that most assumptions of BS do not hold. Yes, I know it is well understood, and a bad model is better than no model. But let's say in a different universe somebody had come up with a different model $\phi$ that shares the characteristics that make BS so popular. Now $\phi$ gives different IVs and hence also different actionable information. So how come traders actually use that information in trading if it stems from an 'arbitrary' black box?
2. The second question is based on the following slide of a talk given by P. Staneski, an MD quant at Credit Suisse:

In the world of Black-Scholes, implied volatility is the expected value of future realized volatility because they are both constants!

• Even if we allow for stochastic volatility, given the other assumptions implied volatility is the expected value of future realized volatility.

In the real world, none of the assumptions are true (some being more false than others, particularly the constancy of vol and GBM).

• The market gives us the price of an option, which “embeds” all the imperfections traders must deal with.

• There is only one degree of freedom in the B-S model, namely, the volatility input, which must be “forced” to match the model to the market.

Implied volatility is thus a lot more than expected volatility!

If, as he claims, IV is a lot more than expected volatility, how can a trader sensibly use that information in trading? Very often I saw things like (in summary): "If the implied vol is 20 and you think volatility is going to realise at 18, you just go short vol by selling a call/put and delta-hedging it, thereby making money if you are right." Well, what if 5 points of my 20 implied vol actually stem from the inaccuracy of the BS model and are not 'priced-in expected realised variance'?
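Whatever its interpretation, the price-to-IV mapping itself is just a one-dimensional root-find: the Black-Scholes price is strictly increasing in volatility, so a quoted price pins down a unique IV. A minimal sketch (European call, no dividends, bisection for robustness):

```python
from math import log, sqrt, exp, erf

def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(S, K, T, r, sigma):
    """Black-Scholes price of a European call, no dividends."""
    d1 = (log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)

def implied_vol(price, S, K, T, r, lo=1e-6, hi=5.0):
    # the price is monotone increasing in sigma, so bisection converges
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call(S, K, T, r, mid) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

This makes the "black box" point concrete: the number is well-defined as a quoting convention whether or not the model's assumptions hold, which is precisely why the question of how much of it is "expected realised variance" remains open.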

### Planet Emacsen

#### emacspeak: Emacspeak 44.0 (SteadyDog) Unleashed

For Immediate Release:

San Jose, Calif., (May 1, 2016)
Emacspeak: Redefining Accessibility In The Era Of (Real)Intelligent Computing
–Zero cost of Ownership makes priceless software Universally affordable!

Emacspeak Inc (NASDOG: ESPK) --http://emacspeak.sf.net– announces the
immediate world-wide availability of Emacspeak 44.0 (SteadyDog) –a
powerful audio desktop for leveraging today's evolving data, social
and service-oriented Internet cloud.

## 1 Investors Note:

With several prominent tweeters expanding coverage of
#emacspeak, NASDOG: ESPK has now been consistently trading over
the social net at levels close to that once attained by DogCom
high-fliers—and as of May 2016 is trading at levels close to
that achieved by once better known stocks in the tech sector.

## 2 What Is It?

Emacspeak is a fully functional audio desktop that provides complete
eyes-free access to all major 32 and 64 bit operating environments. By
seamlessly blending live access to all aspects of the Internet such as
Web-surfing, blogging, social computing and electronic messaging into
the audio desktop, Emacspeak enables speech access to local and remote
information with a consistent and well-integrated user interface. A
rich suite of task-oriented tools provides efficient speech-enabled
access to the evolving service-oriented social Internet cloud.

## 3 Major Enhancements:

• Enable playing multiple media streams using mplayer. 🔊
• Smart Ladspa effects in mplayer, including panning. 🕪
• Sound theme chimes has been spatialized to create theme pan-chimes. 🕭-
• Package elpy has been speech-enabled. 🐍
• Emacspeak now implements automatic soundscapes. 🏙
• Speech-enables package helm.𝍎
• Emacs EWW: Consume Web content efficiently. 🕷
• Updated Info manual 🕮
• emacspeak-url-templates: Smart Web access. ♅
• emacspeak-websearch.el Find things fast. ♁
• And a lot more than will fit in this margin. …

## 4 Establishing Liberty, Equality And Freedom:

Never a toy system, Emacspeak is voluntarily bundled with all
major Linux distributions. Though designed to be modular,
distributors have freely chosen to bundle the fully integrated
system without any undue pressure—a documented success for
the integrated innovation embodied by Emacspeak. As the system
evolves, both upgrades and downgrades continue to be available at
the same zero-cost to all users. The integrity of the Emacspeak
codebase is ensured by the reliable and secure Linux platform
used to develop and distribute the software.

Extensive studies have shown that thanks to these features, users
consider Emacspeak to be absolutely priceless. Thanks to this
wide-spread user demand, the present version remains priceless
as ever—it is being made available at the same zero-cost as
previous releases.

At the same time, Emacspeak continues to innovate in the area of
eyes-free social interaction and carries forward the
well-established Open Source tradition of introducing user
interface features that eventually show up in luser environments.

On this theme, when once challenged by a proponent of a crash-prone
but well-marketed mousetrap with the assertion "Emacs is a system from
the 70's", the creator of Emacspeak evinced surprise at the unusual
candor manifest in the assertion that it would take popular
idiot-proven interfaces until the year 2070 to catch up to where the
Emacspeak audio desktop is today. Industry experts welcomed this
refreshing breath of Courage Certainty and Clarity (CCC) at a time
when users are reeling from the Fear Uncertainty and Doubt (FUD)
unleashed by complex software systems backed by even more convoluted
press releases.

## 5 Independent Test Results:

Independent test results have proven that unlike some modern (and
not so modern) software, Emacspeak can be safely uninstalled without
adversely affecting the continued performance of the computer. These
same tests also revealed that once uninstalled, the user stopped
functioning altogether. Speaking with Aster Labrador, the creator of
Emacspeak once pointed out that these results re-emphasize the
user-centric design of Emacspeak; "It is the user –and not the
computer– that stops functioning when Emacspeak is uninstalled!".

### 5.1 Note from Aster,Bubbles and Tilden:

UnDoctored Videos Inc. is looking for volunteers to star in a
video demonstrating such complete user failure.

## 6 Obtaining Emacspeak:

Emacspeak can be downloaded from GitHub –see
https://github.com/tvraman/emacspeak you can visit Emacspeak on the
WWW at http://emacspeak.sf.net. You can subscribe to the emacspeak
mailing list — emacspeak@cs.vassar.edu — by sending mail to the
list request address emacspeak-request@cs.vassar.edu. The Emacspeak
Blog
is a good source for news about recent enhancements and how to
use them.

The latest development snapshot of Emacspeak is always available via
Git from GitHub at
Emacspeak GitHub .

## 7 History:

• Emacspeak 44.0 continues the steady pace of innovation on the
audio desktop.
• Emacspeak 43.0 brings even more end-user efficiency by leveraging the
ability to spatially place multiple audio streams to provide timely
auditory feedback.
• Emacspeak 42.0 while moving to GitHub from Google Code continues to
innovate in the areas of auditory user interfaces and efficient,
light-weight Internet access.
• Emacspeak 41.0 continues to improve
on the desire to provide not just equal, but superior access —
technology when correctly implemented can significantly enhance the
human ability.
• Emacspeak 40.0 goes back to Web basics by enabling
• Emacspeak 39.0 continues the Emacspeak tradition of increasing the breadth of
user tasks that are covered without introducing unnecessary
bloatware.
• Emacspeak 38.0 is the latest in a series of award-winning
releases from Emacspeak Inc.
• Emacspeak 37.0 continues the tradition of
delivering robust software as reflected by its code-name.
• Emacspeak 36.0 enhances the audio desktop with many new tools including full
EPub support — hence the name EPubDog.
• Emacspeak 35.0 is all about
teaching a new dog old tricks — and is aptly code-named HeadDog in
honor of our new Press/Analyst contact.
• Emacspeak 34.0 (AKA Bubbles)
established a new beach-head with respect to rapid task completion in
an eyes-free environment.
• Emacspeak-33.0 AKA StarDog brings
unparalleled cloud access to the audio desktop.
• Emacspeak 32.0 AKA
LuckyDog continues to innovate via open technologies for better
access.
• Emacspeak 31.0 AKA TweetDog — adds tweeting to the Emacspeak
desktop.
• Emacspeak 30.0 AKA SocialDog brings the Social Web to the
audio desktop—you can't but be social if you speak!
• Emacspeak 29.0—AKA AbleDog—is a testament to the resilience and innovation
embodied by Open Source software—it would not exist without the
thriving Emacs community that continues to ensure that Emacs remains
one of the premier user environments despite perhaps also being one of
the oldest.
• Emacspeak 28.0—AKA PuppyDog—exemplifies the rapid pace of
development evinced by Open Source software.
• Emacspeak 27.0—AKA
FastDog—is the latest in a sequence of upgrades that make previous
releases obsolete and downgrades unnecessary.
• Emacspeak 26—AKA
LeadDog—continues the tradition of introducing innovative access
solutions that are unfettered by the constraints inherent in …
• Emacspeak 25—AKA ActiveDog—… information.
• Emacspeak-Alive —AKA LiveDog —enlivens open, unfettered
information access with a series of live updates that once again
demonstrate the power and agility of open source software
development.
• Emacspeak 23.0 – AKA Retriever—went the extra mile in
fetching full access.
• Emacspeak 22.0 —AKA GuideDog —helps users
navigate the Web more effectively than ever before.
• Emacspeak 21.0
—AKA PlayDog —continued the
Emacspeak tradition of relying on enhanced
productivity to liberate users.
• Emacspeak-20.0 —AKA LeapDog —continues
the long established GNU/Emacs tradition of integrated innovation to
create a pleasurable computing environment for eyes-free
interaction.
• emacspeak-19.0 –AKA WorkDog– is designed to enhance
user productivity at work and leisure.
• Emacspeak-18.0 –code named
GoodDog– continued the Emacspeak tradition of enhancing user
productivity and thereby reducing total cost of
ownership.
• Emacspeak-17.0 –code named HappyDog– enhances user
productivity by exploiting today's evolving WWW
standards.
• Emacspeak-16.0 –code named CleverDog– the follow-up to
SmartDog– continued the tradition of working better, faster,
smarter.
• Emacspeak-15.0 –code named SmartDog–followed up on TopDog
as the next in a continuing series of award-winning audio desktop
releases from Emacspeak Inc.
• Emacspeak-14.0 –code named TopDog–was

the first release of this millennium.

• Emacspeak-13.0 –codenamed
YellowLab– was the closing release of the
20th. century.
• Emacspeak-12.0 –code named GoldenDog– began
leveraging the evolving semantic WWW to provide task-oriented speech
• Emacspeak-11.0 –code named Aster– went the
final step in making Linux a zero-cost Internet access solution for
blind and visually impaired users.
• Emacspeak-10.0 –(AKA
Emacspeak-2000) code named WonderDog– continued the tradition of
award-winning software releases designed to make eyes-free computing a
productive and pleasurable experience.
• Emacspeak-9.0 –(AKA
Emacspeak 99) code named BlackLab– continued to innovate in the areas
of speech interaction and interactive accessibility.
• Emacspeak-8.0 –(AKA Emacspeak-98++) code named BlackDog– was a major upgrade to
the speech output extension to Emacs.
• Emacspeak-95 (code named Illinois) was released as OpenSource on
the Internet in May 1995 as the first complete speech interface
to UNIX workstations. The subsequent release, Emacspeak-96 (code
named Egypt) made available in May 1996 provided significant
enhancements to the interface. Emacspeak-97 (Tennessee) went
further in providing a true audio desktop. Emacspeak-98
integrated Internetworking into all aspects of the audio desktop
to provide the first fully interactive speech-enabled WebTop.

## 8 About Emacspeak:

Originally based at Cornell (NY) —
http://www.cs.cornell.edu/home/raman — home to Auditory User
Interfaces (AUI) on the WWW, Emacspeak is now maintained on GitHub at
https://github.com/tvraman/emacspeak. The system is mirrored
world-wide by an international network of software archives and
bundled voluntarily with all major Linux distributions. On Monday,
April 12, 1999, Emacspeak became part of the Smithsonian's Permanent
Research Collection
on Information Technology at the Smithsonian's
National Museum of American History.

The Emacspeak mailing list is archived at Vassar –the home of the
Emacspeak mailing list– thanks to Greg Priest-Dorman, and provides a
valuable knowledge base for new users.

## 9 Press/Analyst Contact: Tilden Labrador

Going forward, Tilden acknowledges his exclusive monopoly on
setting the direction of the Emacspeak Audio Desktop, and
promises to exercise this freedom to innovate and the resulting
power responsibly (as before) in the interest of all dogs.

Windows-Free (WF) is a favorite battle-cry of The League Against
Forced Fenestration (LAFF). –see
http://www.usdoj.gov/atr/cases/f3800/msjudgex.htm for details on
the ill-effects of Forced Fenestration.

CopyWrite )C( Aster, Hubbell and Tilden Labrador. All Writes Reserved.
HeadDog (DM), LiveDog (DM), GoldenDog (DM), BlackDog (DM) etc., are Registered
Dogmarks of Aster, Hubbell and Tilden Labrador. All other dogs belong to
their respective owners.

### CompsciOverflow

#### A coding question

We are given $n, m$ with $n - m > 1$. Let $S$ be the set of all $n$-bit words. Form $2^{n-m}$ disjoint subsets of $S$ of size $2^m$, denote a typical one of them by $A$, and let $B = S \setminus A$. With $H(a,b)$ denoting the Hamming distance of elements $a \in A$ and $b \in B$, let $G(a) = \max_{b \in B} H(a,b)$ and $F(A) = \min_{a \in A} G(a)$. How could one construct the $A$s such that the $F(A)$ values are as small as possible?

#### number of balanced binary trees [duplicate]

This question already has an answer here:

How can you find the number of balanced binary trees knowing only the number of nodes? Is there a method better than generating all possible balanced trees, and if not, how can I generate those trees based only on the number of nodes?
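One way to avoid generating the trees is to count them with dynamic programming over (nodes, height). A sketch of mine (not from the question), assuming "balanced" means the AVL condition, i.e. at every node the subtree heights differ by at most 1; other definitions of balanced would need a different recurrence:

```python
from functools import lru_cache

# trees(n, h) counts the tree shapes with exactly n nodes and height h,
# where the empty tree has height -1 and every node is height-balanced.

@lru_cache(maxsize=None)
def trees(n, h):
    if n == 0:
        return 1 if h == -1 else 0
    if h < 0:
        return 0
    total = 0
    for left in range(n):                      # nodes in the left subtree
        right = n - 1 - left
        # the taller child has height h - 1; child heights differ by <= 1
        for hl, hr in ((h - 1, h - 1), (h - 1, h - 2), (h - 2, h - 1)):
            total += trees(left, hl) * trees(right, hr)
    return total

def balanced_count(n):
    """Number of balanced binary trees with n nodes (any height)."""
    return sum(trees(n, h) for h in range((-1), n + 1))

print([balanced_count(n) for n in range(6)])   # -> [1, 1, 2, 1, 4, 6]
```

The memoized recursion runs in polynomial time, since both arguments are bounded by the node count.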

### StackOverflow

#### Meaning of Error Term e

I was reading the book "Introduction to Statistical Learning". The book says that:

More generally, suppose that we observe a quantitative response Y and a set of predictor variables X1, X2, .... Xn.

We assume that there is some relationship between Y and X (X1, X2, ... Xn) which can be written in the very general form as:

Y = f(X) + e

Here, f is some fixed but unknown function of X and e is a random error term which is independent of X and has mean zero.

I want to know: what does it mean for the error term to have zero mean?
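A quick simulation (mine, not from the book) makes the zero-mean assumption concrete: the noise averages out, so on average Y equals f(X).

```python
import random

random.seed(0)

def f(x):
    return 2.0 * x + 1.0          # some fixed (in practice unknown) f

# e ~ N(0, 1): random noise with mean zero, independent of X
n = 100_000
errors = [random.gauss(0.0, 1.0) for _ in range(n)]
ys = [f(0.5) + e for e in errors]             # Y = f(X) + e at X = 0.5

mean_error = sum(errors) / n
mean_y = sum(ys) / n

# The sample mean of e is close to 0, so the average Y is close to f(X).
print(round(mean_error, 3), round(mean_y, 3))
```

In other words, E[e] = 0 says the noise has no systematic direction: f(X) captures the entire predictable part of Y, and e only adds fluctuations around it.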

### TheoryOverflow

#### How to check if the language represented by a DFA is finite [on hold]

I am studying regular languages and DFAs. I have implemented a DFA in Java. I have to write a function that tells whether the language represented by a DFA is finite. I need a method or algorithm to do so. What I have already figured out is that if the DFA has loops in it, then it can possibly recognize infinitely many words.
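The loop intuition can be made precise: the language is infinite iff some cycle lies on a path from the start state to an accepting state; loops elsewhere are harmless. A Python sketch (my own encoding of a DFA as a transition dict, not the asker's Java):

```python
# The language is infinite iff some "useful" state -- reachable from the
# start AND able to reach an accepting state -- lies on a cycle.

def reachable(sources, edges):
    """All states reachable from `sources` via `edges` (state -> set)."""
    seen, stack = set(sources), list(sources)
    while stack:
        for nxt in edges.get(stack.pop(), set()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def language_is_finite(delta, start, accepting):
    """delta maps (state, symbol) -> state. True iff the language is finite."""
    fwd, rev = {}, {}
    for (q, _sym), r in delta.items():
        fwd.setdefault(q, set()).add(r)
        rev.setdefault(r, set()).add(q)
    useful = reachable({start}, fwd) & reachable(set(accepting), rev)
    # Iterative DFS over useful states, looking for a back edge (a cycle).
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {q: WHITE for q in useful}
    for root in useful:
        if color[root] != WHITE:
            continue
        color[root] = GRAY
        stack = [(root, iter(sorted(fwd.get(root, set()) & useful)))]
        while stack:
            node, it = stack[-1]
            pushed = False
            for nxt in it:
                if color[nxt] == GRAY:
                    return False            # back edge: useful cycle found
                if color[nxt] == WHITE:
                    color[nxt] = GRAY
                    stack.append((nxt, iter(sorted(fwd.get(nxt, set()) & useful))))
                    pushed = True
                    break
            if not pushed:
                color[node] = BLACK
                stack.pop()
    return True

# A self-loop on a useful state: L = a^n for n >= 0, which is infinite.
print(language_is_finite({("s", "a"): "s"}, "s", {"s"}))   # -> False
```

Note that a loop on a dead (non-useful) state, such as a trap state, does not make the language infinite, which is why the restriction to useful states matters.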

### QuantOverflow

#### Thomson Reuters TRBC and GICS

I can retrieve the components, but I would like to retrieve the index RIC by the sector scheme code for both GICS and TRBC. I would really like to get my hands on the components for the Dow Jones sectors / ICB. I have them for the NYSE but would like the other exchanges. :) This finds the components:

=RSearch("EQUITY",A2:B5,"NBROWS:2000",,C2)
RCSIssuerCountryGenealogy   'M:A9/G:6J'
PriceClose  >5.00
GicsName    'Financials'
IsPrimaryRIC    Yes
Row 1262


This finds the sector scheme and code for the groupings:

=TR(OFFSET($C2,0,0,B6,1),"TR.GICSSector; TR.GICSSectorCode; TR.GICSIndustryGroup; TR.GICSIndustryGroupCode; TR.GICSIndustry; TR.GICSIndustryCode; TR.GICSSubIndustry; TR.GICSSubIndustryCode ","CH=Fd RH=IN",E1)

The problem is how do you retrieve the index RIC for GICSIndustryGroupCode or GICSIndustryGroup. I found a list of the GICS symbols broken down by Sectors, Industry Groups, Industries and Sub-Industries, but they are EOD symbols :(

### Lobsters

#### .note.GNU-stack (2010)

### Undeadly

#### proot: dpb meets chroot

With the p2k16 hackathon just coming to a close, Marc Espie has revealed one of the new things he worked on.

I've been using dpb(1) chroot'd for a long time, using my own methods. This is a first try at making things "simple." Basically, proot -B /build should more or less do something sane, and then you can build ports in that chroot.

Read more...

### CompsciOverflow

#### What concepts are required to be conceptually solid in computer science?

What concepts are required to be conceptually solid in computer science and to work on the higher end of a project?

#### How can we avoid mistakes in an LL(1) parse table?

I'm learning about LL(1) parsing. We need to find the FIRST and FOLLOW sets in order to construct an LL(1) parse table. Each and every time I find FIRST and FOLLOW, I miss one terminal or another as we backtrack over each production, which makes me end up with a wrong LL(1) parse table. Is there any standardized way to check the FIRST and FOLLOW table?

### Fefe

#### The fat years are apparently over: Exxon posts ...

### CompsciOverflow

#### Raspberry Pi 2 connection problem [on hold]

When I try to connect my Raspberry Pi to my laptop via Ethernet, it does not assign any dynamic IP to my laptop (Windows 10), even though my laptop is set to obtain a dynamic IP. I didn't even make the IP static for the Pi when it was working. What should I do?
### StackOverflow

#### Gradient Temporal Difference Lambda without Function Approximation

Every formalism of GTD(λ) seems to define it in terms of function approximation, using θ and some weight vector w. I understand that the need for gradient methods largely came from their convergence properties for linear function approximators, but I would like to make use of GTD for the importance sampling. Is it possible to take advantage of GTD without function approximation? If so, how are the update equations formalized?

### CompsciOverflow

#### About randomness and minimax algorithm with alpha-beta pruning

Will choosing the children of a node randomly in the alpha-beta algorithm give a better chance of a cut-off than visiting them in order? Here's the pseudocode, with my addition marked with ***.

function alphabeta(node, depth, α, β, maximizingPlayer)
    if depth = 0 or node is a terminal node
        return the heuristic value of node
    arrange children of node randomly   ***
    if maximizingPlayer
        v := -∞
        for each child of node
            v := max(v, alphabeta(child, depth - 1, α, β, FALSE))
            α := max(α, v)
            if β ≤ α
                break (* β cut-off *)
        return v
    else
        v := ∞
        for each child of node
            v := min(v, alphabeta(child, depth - 1, α, β, TRUE))
            β := min(β, v)
            if β ≤ α
                break (* α cut-off *)
        return v

I ran a small sample with this on a Connect Four game and it does seem to run a little faster, but when I actually count the cutoffs with and without randomness, there are more cutoffs without randomness. That's a little odd. Is it possible to prove that it's faster (or slower)?

### Lobsters

#### Makefile Assignments are Turing-Complete

#### An Elegant Fizzbuzz

#### The Magic of Folds

#### Practical testing in Haskell

#### An Optimization Exercise

### CompsciOverflow

#### If the strings of a language can be enumerated in lexicographic order, is it recursive?
If the strings of a language L can be effectively enumerated in lexicographic order, then is the statement "L is recursive but not necessarily context-free" true?

### Lobsters

#### How to pronounce SQL

#### The current state of HTML forms

### Fefe

#### FPÖ leader Strache pays 9,000 euros to refugee aid ...

FPÖ leader Strache pays 9,000 euros to refugee aid. What's this? However, it apparently did not happen voluntarily. The payment was made as part of an out-of-court settlement with a photographer, an FPÖ spokesman announced. MWAHAHAHAHA, very nice!

#### Have you ever heard of Alfredo Stroessner? ...

Have you ever heard of Alfredo Stroessner? He was a dictator of German descent who seized power in Paraguay in a 1954 coup. Gaby Weber has made a feature about him for Deutschlandfunk, 45 minutes of audio. Worth it! Stroessner started out by organizing manhunts against the Indians who were in the way of cattle ranching. The ranchers at the time cleared rainforest in order to raise cattle on the land, decimating the hunting grounds of the Indians who lived there. As a result, the Indians occasionally found too little to eat and would then take a steer from the pasture. The ranchers responded by hunting the Indians down and hauling them off to reservations, where they died of malnutrition and of diseases such as influenza. The German Foreign Office declared at the time that there could be no talk of any persecution of Indians.

#### Do you know which ministry in Germany is responsible for ...

Do you know which ministry in Germany is responsible for digital infrastructure? That is a punchline in itself, but when that ministry also has Heartbleed, in April 2016, words rather fail me. The BEST of the BEST of the BEST, Sir!

#### One of the main problems of our time is that we ...
One of the main problems of our time is that we have no efficient energy storage. When electricity is needed, we have to generate it. We have smaller buffering technologies, but no large stores into which you could deposit, say, a week of wind power and then draw it back out when the wind dies down. Here is an interesting buffering concept in which a heavy train is driven up a mountain by an electric motor. When electricity is needed, the train is driven back down and the motors are run as generators. It doesn't sound like you could store very much energy this way, but the prototype they currently want to build has 50 MW of power and 12.5 MWh of capacity. And they want to scale up to 1 GW.

Update: For comparison: the maximum storage capacity of all Austrian (pumped-)storage power plants is currently about 3 TWh; no figures are available for pumped-storage plants alone. So this railway scheme, too, is really just a buffer. And the question of what to do when there is no mountain nearby is still open as well. There are approaches using giant flywheels for that.

Update: Here is another cool energy-storage concept using concrete spheres in Lake Constance, and people are also thinking about salt as heat storage. According to Technology Review, the concrete-sphere design has an efficiency of 85%, which would be 10 percentage points more than pumped storage.

### StackOverflow

#### Combining matching and classification

I am working on a matching problem between an exposed group and a control group. I had an idea about using binary classification to solve it. I would assign all observations in the exposed group to one class and create the other class artificially. For example, suppose I have an exposed group of 1,000 people and a control group of 4,000.
There are 25 binary profiles: Gender, Age (18 to 34), Age (34 to 55), Age (over 55), income (less than 50,000), income (50,000 to 100,000), income (greater than 100,000), outdoor enthusiast, video game player, fast food diner, cost conscious shopper, ..., pet enthusiast.

Suppose all of the exposed are male, age 18 to 34, video game players, and fast food diners, and vary in the remaining 21 categories. Let a1, a2, ..., a1000 be my exposed group. I would put all of my exposed group a1, a2, ..., a1000 in one class, and for the remaining class I would take the opposite binary choice for each exposed observation. So if a1 has the following profile:

Gender - Male
Age (18 to 34) - Yes
Age (34 to 55) - No
Age (over 55) - No
income (less than 50,000) - Yes
income (50,000 to 100,000) - No
income (greater than 100,000) - No
outdoor enthusiast - No
video game player - Yes
fast food diner - Yes
cost conscious shopper - Yes
...
pet enthusiast - No

I would then create a new observation for the other class by taking the opposite choice for each category, with random selection among the remaining choices for age and income:

Gender - Female
Age (18 to 34) - No
Age (34 to 55) - Yes
Age (over 55) - No
income (less than 50,000) - No
income (50,000 to 100,000) - Yes
income (greater than 100,000) - No
outdoor enthusiast - Yes
video game player - No
fast food diner - No
cost conscious shopper - No
...
pet enthusiast - Yes

I do this for all 1,000 exposed observations. Then I train a binary classifier on a portion of this data. I applied this to my data (much larger than 1,000, close to 1,000,000 exposed) and it resulted in a 100% classification rate on my constructed data set, and when I used the predict method, it classified all members in the control group to the class of exposed observations (highly unlikely in real life). My questions are:

1. Does this approach make sense?

2. If it makes sense, why did all the control members get assigned to the exposed class?
What is a good method to construct the other class of observations in the training set?

3. If it doesn't make sense, what is a good way to match categorical features? I don't like the idea of converting them to numeric and using a clustering method.

### Lobsters

#### darcs 2.12.0 release

### CompsciOverflow

#### How to Convert a Directed Graph to an Undirected Graph (Adjacency Matrix) [on hold]

Given an adjacency matrix, what is an algorithm/pseudocode to convert a directed graph to an undirected graph without adding additional vertices (it does not have to be reversible)?

similar question here

### DragonFly BSD Digest

#### In Other BSDs for 2016/04/30

I think I managed to link at least one story for every BSD type this week, or close to it.

### StackOverflow

#### Election Algorithms - A ring algorithm

I have been reading about election algorithms in distributed systems. I read about the Bully Algorithm and understood it. I came across the Ring Algorithm and understood how it conducts the election, but I could not understand how it handles a situation when two processes 2 and 5 simultaneously discover that the coordinator 7 is not functioning. Each of these builds an ELECTION message and starts circulating it. Eventually, both messages will go all the way around, and both 2 and 5 will convert them to COORDINATOR messages, with exactly the same number and in the same order. Who is going to be the coordinator (2 or 5), and why, according to this algorithm?

### CompsciOverflow

#### TSP Edge Removal

Are there any papers/algorithms for finding edges in a graph that can be removed without affecting the graph's optimal TSP tour length? For instance, in a Euclidean TSP instance, many edges could be ruled out for being too long (i.e., jumping between very far apart nodes) given a decent upper bound on the optimal tour length. Is there some efficient heuristic algorithm for eliminating edges like this?
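For the directed-to-undirected adjacency-matrix question above, one common approach (my sketch, not from the post) is simply to symmetrize the matrix: an undirected edge {i, j} exists whenever either directed edge (i, j) or (j, i) does.

```python
# Symmetrize a 0/1 adjacency matrix: undirected[i][j] = 1 iff the directed
# graph has edge i->j or j->i. No vertices are added, and the mapping is
# intentionally lossy (edge directions cannot be recovered).

def to_undirected(adj):
    n = len(adj)
    return [[1 if (adj[i][j] or adj[j][i]) else 0 for j in range(n)]
            for i in range(n)]

directed = [
    [0, 1, 0],
    [0, 0, 0],
    [1, 0, 0],
]
undirected = to_undirected(directed)
# undirected == [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
```

The result is always a symmetric matrix, which is exactly the adjacency-matrix characterization of an undirected graph.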
#### Proving a certain superset of the halting language is not recursive

Let $\Sigma = \{0, 1\}$. Let $val : \Sigma^* \rightarrow \mathbb{N}$ be a function that, given a string, returns its decimal value, and let $L_{halt} = \{\langle M\rangle \langle w\rangle \mid M \text{ halts on } w\}$. I define the following language: $L = \{x \mid x \neq x^R \wedge (x \in L_1 \vee x \in L_2 \vee x \in L_3)\}$, where:

• $L_1 = \{x \mid x \in L_{halt} \wedge x^R \in L_{halt} \wedge val(x) < val(x^R)\}$
• $L_2 = \{x \mid x \notin L_{halt} \wedge x^R \notin L_{halt} \wedge val(x) < val(x^R)\}$
• $L_3 = \{x \mid x \in L_{halt} \wedge x^R \notin L_{halt}\}$ (in this language there is no requirement on the $val$ function)

That is, a string in $L$ has 3 options for how to look. Intuitively I can feel that $L \notin R$, but I do not know how to approach a formal proof of this. I tried using a reduction $L_{halt} \leq L$, and then since $L_{halt} \notin R$ I can deduce that $L \notin R$, but I could not find a reduction that will work. The closest thing I could come up with is using an intermediate reduction: first defining $L_{halt}' = \{1x0 \mid x \in L_{halt}\}$, and then easily $L_{halt} \leq L_{halt}'$, and all I have to do is $L_{halt}' \leq L$. My idea for the reduction was: given $x = 1y0$, then $f(x) = y$ (the reduction returns $y$). But this reduction has a problem: if $y \notin L_{halt}$ and $y^R \notin L_{halt}$, then it might be possible that $val(y) < val(y^R)$, and then $f(x) = y \in L$ but $y \notin L_{halt}'$... I'm kind of stuck on how my reduction should work. I'm also assuming that any string from $\Sigma^*$ is a possible encoding of some $\langle M\rangle \langle w\rangle$; that is, I cannot assume anything about the encoding itself, and every word is a possible encoding of $\langle M\rangle \langle w\rangle$...

#### Winning strategy of Nim game when picking from multiple piles is allowed

I am studying game theory right now. In the Nim game, in any turn a player can remove any number of stones from any one pile.
I am wondering what the winning strategy of the first player might be if, in any move, a player can pick any number of stones from one or more piles. In this case, how will the xor = 0 rule work? Both are playing optimally.

#### How to determine agreement between two sentences?

A common Natural Language Processing (NLP) task is to determine semantic similarity between two sentences. Has the question of agreement/disagreement between two sentences been covered in NLP or other literature? I tried searching on Google Scholar but didn't get any relevant results.

#### CFG for the language "number of a's = number of b's + 2"

How can I construct a context-free grammar for the following language? $$L = \{ w \in \{a,b\}^* : \#_a(w) = \#_b(w) + 2 \}.$$ Please help me out with this. I am not sure how to approach this question. I would also appreciate general tips on constructing context-free grammars; I find it highly difficult.

### QuantOverflow

#### Why is the value of an adapted stochastic process known at time t?

I am having a hard time understanding the concept of an adapted stochastic process. Using an analogy to finance, I have been told we can think of adaptedness of a stock price process as having access to a Bloomberg terminal and being able to check the price of the stock at time $t$, i.e. at each point in time the price of the stock is known. I have also learned that a stochastic process is nothing but a collection of random variables and can thus be interpreted as a function-valued random variable. Stochastic processes in general need not be adapted, but as e.g. Shreve (Stochastic Calculus for Finance vol. 2, page 53, 2004) notes, it is often safe to assume that finance-related stochastic processes are adapted. Now let us assume that we are dealing with an adapted stochastic process $X$ and fix $t$. To me it seems that by doing this we will (at this arbitrary point in time) obtain a random variable $X(\omega; t \text{ fixed})$ by the definition of a stochastic process.
But wait a minute, the value of a random variable should not be known, right? On the contrary, it should be random! How is this seeming puzzle reconciled? To me it is not clear how the definition of an adapted process implies that the value of $X(\omega; t \text{ fixed})$ is known at time $t$. Rather, it just states that at the fixed $t$, $X(\omega; t \text{ fixed})$ is $\mathcal{F}_{t}$-measurable, which is not enough. Just imagine the case of a single random variable (just one point in time) $Y$ on $(\Omega, \mathcal{F})$ (i.e. $Y$ is an $\mathcal{F}$-measurable function). Obviously the value of $Y$ is not known but random. I have found some earlier related questions (e.g. this) but these have not clarified the matter for me. Thank you in advance for the help!

### CompsciOverflow

#### Why is IS only in NP, and not in NL?

All we need to do is guess $k$ vertices. We look at vertex $v_1$, and make sure $v_1$ is not connected to $v_2, \dots, v_k$. Then we "throw away" $v_1$ and look at $v_2$. We do this for all vertices. Meaning that we only need to guess $k$ vertices, and in our working memory (which determines NL or NP) we only need to keep 2 vertices. Where is my mistake?

### StackOverflow

#### Error in row.names<-.data.frame(*tmp*, value = c(NA_real_, NA_real_

I am trying to build a model using the tweets and polarity, but in the middle I get this weird error. At this line:

analytics <- create_analytics(container, MAXENT_CLASSIFY)

I get this:

Error in row.names<-.data.frame(*tmp*, value = c(NA_real_, NA_real_, : duplicate 'row.names' are not allowed
In addition: Warning messages:
1: In cbind(labels, BEST_LABEL = as.numeric(best_labels), BEST_PROB = best_probs, :
NAs introduced by coercion
2: In create_documentSummary(container, score_summary) :
NAs introduced by coercion
3: In cbind(MANUAL_CODE = testing_codes, CONSENSUS_CODE = scores$BEST_LABEL,  :
NAs introduced by coercion
4: In create_topicSummary(container, score_summary) :
NAs introduced by coercion
5: In cbind(TOPIC_CODE = as.numeric(as.vector(topic_codes)), NUM_MANUALLY_CODED = manually_coded,  :
NAs introduced by coercion
6: In cbind(labels, BEST_LABEL = as.numeric(best_labels), BEST_PROB = best_probs,  :
NAs introduced by coercion
7: non-unique values when setting 'row.names':


My CSV file looks like:

text, polarity
Hello I forget the password of my credit card need to know how I can make my statement, neutral
can provide the swift code thanks, neutral
thanks just one more doubt has this card commissions with these characteristics, neutral
Thanks, neutral
are arriving mail scam, negative
can you help me I need to pay an online purchase and ask me for a terminal my debit which is, neutral
if I do not win anything this time I change banks, negative
you can be the next winner of the million that circumvents account award date January, neutral
account and see my accounts so I can have the, negative
thanks i just send the greetings consultation, neutral
may someday enable office not sick people, negative
hello is running payments through the online banking no, negative
thanks hope they do, neutral
should pay attention to many happened to us that your system flushed insufficient balance or had no money in the accounts, negative
yesterday someone had the dignity to answer the telephone banking and verify that the system is crap, negative
and tried but apparently the problem is just to pay movistar services, neutral
good morning was trying to pay for services through the website but get error retry in minutes, negative
if no system agent is non clients or customers also, positive


The code I am using is:

library(RTextTools)

pgT <- as.factor(pg$text)
pgP <- as.factor(pg$polarity)

doc_matrix <- create_matrix(pgT, language="spanish", removeNumbers=TRUE, stemWords=TRUE, removeSparseTerms=.998)

dim(doc_matrix)

container <- create_container(doc_matrix, pgP, trainSize=1:275, testSize=276:375, virgin=FALSE)

MAXENT <- train_model(container,"MAXENT")

MAXENT_CLASSIFY <- classify_model(container, MAXENT)

analytics <- create_analytics(container, MAXENT_CLASSIFY)

summary(analytics)


#### Classify a data set with the StringToWordVector filter in Weka

I'm new to Weka.

I have a data set (Twitter data) about a specific company. The filter I used is StringToWordVector, and I changed the option wordsToKeep to 100 to improve the accuracy. Then I applied classifiers: KStar 55%, RandomForest 57%, SMO 58%. These are not very good results.

Is there any idea that could help me improve them?

### CompsciOverflow

#### Using consensus for atomic commits

I read that consensus algorithms allow you to ensure atomic commits. That is, in order to save a large amount of data on the disk as if it were a single object, the article advises a consensus protocol.

I could not understand what consensus has to do with writing to disk, where you seem to have only a single node writing a contiguous piece of information (I asked this question at dba but they blocked my account, saying that it is a violation). But anyway, let's take a distributed DB. I read that consensus allows me to agree upon a single value (without resorting to any locks). But atomicity means that you should agree upon many values at the same time. How do you exploit consensus algorithms for updating multiple values atomically?

I see in the Computerphile video about Paxos that you can use consensus to take locks. Do you use consensus just to take the locks and then commit a transaction under them?

### Lobsters

#### Supreme Court moves to expand FBI’s hacking authority

“Simply, it will allow an FBI agent sitting in Virginia to hack into a computer or network in Nevada — or anywhere in the world.”

### StackOverflow

#### How to update element inside List with ImmutableJS?

Here is what official docs said

updateIn(keyPath: Array<any>, updater: (value: any) => any): List<T>
updateIn(keyPath: Array<any>, notSetValue: any, updater: (value: any) => any): List<T>
updateIn(keyPath: Iterable<any, any>, updater: (value: any) => any): List<T>
updateIn(keyPath: Iterable<any, any>, notSetValue: any, updater: (value: any) => any): List<T>


No normal web developer (i.e., not a functional programmer) would understand that!

I have a pretty simple (for a non-functional approach) case.

var arr = [];
arr.push({id: 1, name: "first", count: 2});
arr.push({id: 2, name: "second", count: 1});
arr.push({id: 3, name: "third", count: 2});
arr.push({id: 4, name: "fourth", count: 1});
var list = Immutable.List.of(arr);


How can I update the list so that the element with name "third" has its count set to 4?

### CompsciOverflow

#### How to calculate virtual address space from page size, virtual address size and page table entry size?

I am trying to solve an exercise, unfortunately without any success yet.

From the following given information, the virtual address space should be calculated:

• Page size is 16 KB
• Logical address size is 47 bit
• 3 levels of page tables; all have the same size
• Page table entry size is 8 byte

My idea is to find out how many pages can be addressed. The page count should then lead to the size of the virtual address space, right?

Virtual address space = page size * page count

As far as I understand, the page count is defined by the logical address size. The logical address is split up into 3 levels of page tables plus the offset. Since the page table entry size is 8 bytes (2^6 = 64 bits), 6 bits of the logical address are used for each stage to address it. The offset will have a size of 30 bits.

Each page stage can address 64 bit plus the 30 bits offset. So does this result in the page count?

Page count = 64 * 3 + 30 = 222

With the page size and the page count I get the virtual address space of 3552 KB.

I think this is wrong. It should be much larger. What is not correct? Any help is appreciated!
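For comparison, here is a hedged worked calculation (my sketch, not an official solution). The 8-byte entry size determines how many entries fit in one page-sized table, not how many bits one entry addresses, and the virtual address space is simply 2 raised to the number of address bits:

```python
# Worked numbers for the given exercise, assuming each page table
# occupies exactly one page (the standard multi-level layout).
page_size = 16 * 1024          # 16 KB
entry_size = 8                 # bytes per page-table entry
address_bits = 47

offset_bits = page_size.bit_length() - 1             # log2(16 KB) = 14
entries_per_table = page_size // entry_size          # 16 KB / 8 B = 2048
bits_per_level = entries_per_table.bit_length() - 1  # log2(2048) = 11

# 3 levels * 11 bits + 14 offset bits = 47 bits, consistent with the givens
assert 3 * bits_per_level + offset_bits == address_bits

virtual_space = 2 ** address_bits                    # bytes
print(virtual_space // 2 ** 40, "TB")                # -> 128 TB
```

So on these numbers the three levels and the offset account for exactly the 47 address bits, and the virtual address space is 2^47 bytes = 128 TB.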

### StackOverflow

#### Python lambda function to sort list of tuples

Initially, a list of tuples is sorted by their second value and their third value.

tlist = [('a', 1, 14), ('a', 1, 16), ('b', 1, 22),
('a', 2, 1), ('c', 2, 9), ('d', 2, 11), ('d', 2, 12)]


I am trying to sort this list of tuples by their second value and, in reverse, by their third value; that is, I want the output to be sorted like so:

tlist= [('b', 1, 22), ('a', 1, 16), ('a', 1, 14),
('d', 2, 12), ('d', 2, 11), ('c', 2, 9), ('a', 2, 1)]


This is what I have tried so far as seen in this answer:

tlist = sorted(tlist, key=lambda t: return (t[0], t[1], -t[2]))


but it does not work; it gives a "return outside of function" error.

Any ideas of how to get the desired output?
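For what it's worth, a lambda body must be a single expression, so `return` is a syntax error there; and the desired ordering only involves the second and third fields. A sketch:

```python
tlist = [('a', 1, 14), ('a', 1, 16), ('b', 1, 22),
         ('a', 2, 1), ('c', 2, 9), ('d', 2, 11), ('d', 2, 12)]

# Sort ascending on the second field and descending on the third:
# negating the numeric third field flips its order within each group.
result = sorted(tlist, key=lambda t: (t[1], -t[2]))
# result == [('b', 1, 22), ('a', 1, 16), ('a', 1, 14),
#            ('d', 2, 12), ('d', 2, 11), ('c', 2, 9), ('a', 2, 1)]
```

Negation only works for numeric fields; for mixed directions on non-numeric keys, two stable sorts (innermost first) achieve the same result.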

EDIT

This question provides very good resources, and perhaps all someone would need to go ahead and tackle my specific question. However, given my low level of experience and my specific instance of sorting with different sorting criteria (as described above), I think this is not a duplicate.

#### How to decide the size of layers in Keras' Dense method?

Below is the simple example of multi-class classification task with IRIS data.

import seaborn as sns
import numpy as np
from sklearn.cross_validation import train_test_split
from keras.models import Sequential
from keras.layers.core import Dense, Activation, Dropout
from keras.regularizers import l2
from keras.utils import np_utils

#np.random.seed(1335)

# Prepare data (the original snippet omitted loading the iris data set)
iris = sns.load_dataset("iris")
X = iris.values[:, 0:4]
y = iris.values[:, 4]

# Make test and train set
train_X, test_X, train_y, test_y = train_test_split(X, y, train_size=0.5, random_state=0)

################################
# Evaluate Keras Neural Network
################################

# Make ONE-HOT
def one_hot_encode_object_array(arr):
'''One hot encode a numpy array of objects (e.g. strings)'''
uniques, ids = np.unique(arr, return_inverse=True)
return np_utils.to_categorical(ids, len(uniques))

train_y_ohe = one_hot_encode_object_array(train_y)
test_y_ohe = one_hot_encode_object_array(test_y)

model = Sequential()
model.add(Dense(16, input_shape=(4,),
          activation="tanh",
          W_regularizer=l2(0.001)))
model.add(Dense(3, activation="softmax"))
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])

# Actual modelling
# If you increase the epoch the accuracy will increase until it drop at
# certain point. Epoch 50 accuracy 0.99, and after that drop to 0.977, with
# epoch 70
hist = model.fit(train_X, train_y_ohe, verbose=0,   nb_epoch=100,  batch_size=1)

score, accuracy = model.evaluate(test_X, test_y_ohe, batch_size=16, verbose=0)
print("Test fraction correct (NN-Score) = {:.2f}".format(score))
print("Test fraction correct (NN-Accuracy) = {:.2f}".format(accuracy))


My question is how do people usually decide the size of layers? For example based on code above we have:

model.add(Dense(16, input_shape=(4,),
activation="tanh",
W_regularizer=l2(0.001)))


Where the first Dense layer's size is 16 and the second's is 3.

• Why do the two layers use two different values for Dense?
• How do we choose the best value for Dense?
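A way to see which of those sizes are actually free choices (my summary, not from the post): the input dimension (4) is fixed by the number of features, the last layer's size (3) by the number of classes, and only the hidden width (16) is a tunable hyperparameter. A quick parameter count for the 4 → 16 → 3 architecture:

```python
# A Dense layer has inputs * units weights plus one bias per unit.
def dense_params(n_in, n_out):
    return n_in * n_out + n_out

hidden = dense_params(4, 16)    # 4 features -> 16 hidden units: 80 params
output = dense_params(16, 3)    # 16 hidden  -> 3 classes:       51 params
print(hidden, output, hidden + output)   # -> 80 51 131
```

The hidden width trades capacity against overfitting, which is why it is usually chosen by validation-set experiments rather than by any formula.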

### Fred Wilson

#### Video Of The Week: The Nitty Gritty Podcast

Bond Street, a startup company that makes small business loans, has started a podcast to tell stories about small business entrepreneurs and the companies they create and run. They call it the Nitty Gritty Podcast.

The first episode features an entrepreneur who is also a friend of ours, Gabe Stulman.

Gabe is a restaurant operator in the West Village of Manhattan, where we live. We started our relationship with Gabe as regulars at his first restaurant, and we have gone on to be investors in all of his current restaurants, as well as good friends with him.

Here is Gabe’s story. It’s a good one.

### UnixOverflow

#### Why can't I enable Transmission on FreeNAS 9.2? I keep getting an error

I have installed the plugin correctly, but every time I try to start it, either using service transmission start or by flipping the on switch from the plugins page, I get this error:

I have also tried creating a new standard jail and using portsnap to get Transmission, but I get the same error.

### QuantOverflow

#### How to do performance attribution for a few characteristics?

Let's say the characteristics that I am interested in are

1. FX
2. Country
3. Security selection

I have the benchmark weights and returns, the FX returns, and the portfolio weights and returns. Can someone give a few pointers on how I would be able to get the performance attributions for 1,2,3?
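One common starting point (my suggestion, not something from the question) is Brinson-style attribution: per country, the allocation effect is $(w_p - w_b) \cdot r_b$ and the selection effect is $w_b \cdot (r_p - r_b)$, with the FX effect obtained by decomposing each return into a local-currency part and a currency part. A minimal sketch of the two non-FX effects:

```python
def brinson(w_p, w_b, r_p, r_b):
    """Per-bucket allocation and selection effects (basic Brinson model;
    the Brinson-Fachler refinement and FX decomposition are omitted)."""
    allocation = {k: (w_p[k] - w_b[k]) * r_b[k] for k in w_b}
    selection = {k: w_b[k] * (r_p[k] - r_b[k]) for k in w_b}
    return allocation, selection

# Hypothetical two-country example: weights and returns are made up.
w_b = {"US": 0.6, "JP": 0.4}
w_p = {"US": 0.7, "JP": 0.3}
r_b = {"US": 0.02, "JP": 0.01}
r_p = {"US": 0.03, "JP": 0.01}
alloc, sel = brinson(w_p, w_b, r_p, r_b)
```

Summing the per-country effects gives the country and selection contributions; the residual between total active return and these sums is then attributed to FX and interaction terms.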

### StackOverflow

#### Which languages (apart from the list below) can do Curry-Howard type checking at compile-time?

The Curry-Howard correspondence is enormously powerful. I'd like to use it, but I'd like a choice other than Haskell and Scala.

The languages that I can see that support the Curry-Howard correspondence (with type-checking at compile-time) are:

• Agda
• Coq
• Haskell
• Scala
• Shen
• Qi II

My question is: which languages, apart from those listed above, can do Curry-Howard type checking at compile-time?

#### Scala recursive API calls to get all the results

I'm pretty new to Scala, and what I'm trying to achieve is to make enough calls to an API with different offset until I get all the results.

Here's a simplified version of what I have, and I was wondering if there is a more idiomatic Scala way to do it. (The code sample might not be 100% accurate, it's just something I put up as an example)

def getProducts(
    limit: Int = 50,
    offset: Int = 0,
    otherProducts: Seq[Product] = Seq()): Future[Seq[Product]] = {

  val eventualResponse: Future[ApiResponse] = apiService.getProducts(limit, offset)

  val results: Future[Seq[Product]] = eventualResponse.flatMap { response =>
    if (response.isComplete) {
      logger.info("Got every product!")
      Future.successful(response.products ++ otherProducts)
    } else {
      logger.info("Need to fetch more data...")
      val newOffset = offset + limit
      getProducts(limit, newOffset, otherProducts ++ response.products)
    }
  }

  results
}


Passing the otherProducts parameter just doesn't feel right :P

Thanks in advance for any suggestions :)

#### MATLAB: The determinant of a covariance matrix is either 0 or inf

I have a 1500x1500 covariance matrix whose determinant I am trying to calculate for the EM-ML method. The covariance matrix is obtained by finding the SIGMA matrix and then passing it into the nearestSPD library (Link) to make the matrix positive definite. In this case the matrix is always singular. Another method I tried was manually generating a positive definite matrix using the A'*A technique (A was taken as a 1600x1500 matrix). This always gives me an infinite determinant. Any idea how I can get a positive definite matrix with a finite determinant?
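Worth noting: for matrices this large, the determinant routinely under- or overflows floating point even when the matrix is well conditioned, so EM-type likelihoods are usually computed with the log-determinant instead (in MATLAB, 2*sum(log(diag(chol(Sigma))))). A minimal pure-Python sketch of the same idea, via a Cholesky factorization:

```python
import math

def cholesky(A):
    """Return lower-triangular L with A = L * L^T (A must be SPD)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def log_det(A):
    """log(det(A)) for an SPD matrix, using det(A) = prod(diag(L))^2."""
    L = cholesky(A)
    return 2.0 * sum(math.log(L[i][i]) for i in range(len(A)))

# A diagonal matrix with tiny entries: the raw determinant underflows for
# large n, but the log-determinant stays perfectly finite.
n = 4
A = [[1e-3 if i == j else 0.0 for j in range(n)] for i in range(n)]
print(log_det(A))  # n * log(1e-3)
```

The point is that a "0 or inf" determinant is often a floating-point artifact rather than genuine singularity, and the log-determinant sidesteps it.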

### CompsciOverflow

#### Training accuracy in Tensorflow suddenly drops

I was training a ConvNet on CIFAR10 based on the code presented in the "Deep MNIST for Experts" tutorial by TensorFlow. After a few thousand epochs, the training accuracy suddenly drops from 90% to 4%. Further inspection shows that, together with the drop in accuracy, the cross entropy and predicted scores also turn into NaN. Any ideas why this happens?

### QuantOverflow

#### Calibrating and simulating returns from a t-distribution

A slight twist (I hope) on the familiar problem of simulating log returns from a t distribution. My two questions concern calibration to sample data. First, one can infer the degrees of freedom in the t distribution, v, by equating the kurtosis of a sample of log returns with the kurtosis of the t distribution, which is 3(v-2)/(v-4). Alternatively, one could do the same thing with the variance, which is given by v/(v-2). In general, the two procedures will not yield the same value for v. Which is better? Or should one take a GMM approach?

My second concern involves scaling. It is often suggested that when simulating stock prices using a t-distribution, one should scale the sample volatility by the square root of (v-2)/v. I have found that this scaling can produce a density which (although fatter-tailed) is more peaked than the normally distributed returns for the same sample. This seems wrong.
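For concreteness, the kurtosis-matching step described above can be made explicit: inverting $k = 3(v-2)/(v-4)$ gives $v = (4k-6)/(k-3)$, and the scaling factor mentioned is $\sqrt{(v-2)/v}$, which undoes the fact that a standard t variate has variance $v/(v-2)$. A sketch of just this moment-matching calibration (not a full simulation):

```python
import math

def dof_from_kurtosis(k):
    """Invert k = 3(v-2)/(v-4) for the t degrees of freedom v (needs k > 3)."""
    if k <= 3:
        raise ValueError("sample kurtosis must exceed 3 for a finite-v t fit")
    return (4.0 * k - 6.0) / (k - 3.0)

def t_scale_factor(v):
    """Variance of a standard t is v/(v-2); scale by sqrt((v-2)/v) to undo it."""
    return math.sqrt((v - 2.0) / v)

k = 4.0                    # sample kurtosis of the log returns (illustrative)
v = dof_from_kurtosis(k)   # gives v = 10 for k = 4
print(v, t_scale_factor(v))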

### StackOverflow

#### From text to K-Means Vectors input

I've just started diving into Machine Learning, specifically Clustering. (I'm using Python, but this is irrelevant.) My goal, starting from a collection of 100K tweets about the fashion world, is to perform KMeans over their text.

So far I've filtered the texts, removing stopwords, useless terms and punctuation, and done lemmatization (exploiting Part Of Speech tagging for better results).

I show the user the most frequent terms, hashtags, bigrams, trigrams, ..., 9-grams so that they can refine the preprocessing by adding words to the useless terms.

My initial idea was to use the top n (1K) terms as features, creating for each tweet a vector of fixed size n (1K), with a cell set to a value if the corresponding top term appears in the tweet (perhaps computing the cell's value with TF-IDF).

Am I missing something (will the 0 values be considered)? Can I exploit n-grams in some way?

This scikit article is pretty general and I'm not understanding the whole thing.

(Is LSA dimensionality reduction useful, or is it better to reduce the number of features (and so the vector dimension) manually?)
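The fixed-size TF-IDF vector idea described above can be sketched in pure Python; a real pipeline would normally use scikit-learn's TfidfVectorizer followed by KMeans, so treat the code below as an illustration of what those do, not a replacement:

```python
import math
from collections import Counter

def tfidf_vectors(docs, n_features):
    """Build one fixed-size vector per tokenized document over the top-n terms."""
    # Pick the n most frequent terms across the corpus as the vocabulary.
    all_terms = Counter(t for doc in docs for t in doc)
    vocab = [t for t, _ in all_terms.most_common(n_features)]
    # Document frequency for each vocabulary term.
    df = {t: sum(1 for doc in docs if t in doc) for t in vocab}
    N = len(docs)
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Cell = tf * idf for terms present, 0 otherwise. The zeros are real
        # coordinates: KMeans treats absence of a term as part of the geometry.
        vectors.append([tf[t] * math.log(N / df[t]) if t in tf else 0.0
                        for t in vocab])
    return vocab, vectors

# Tiny illustrative corpus (already tokenized/lemmatized).
docs = [["fashion", "week", "paris"],
        ["paris", "style"],
        ["fashion", "style", "style"]]
vocab, vecs = tfidf_vectors(docs, n_features=4)
```

n-grams fit the same scheme: treat each frequent bigram/trigram as an extra vocabulary entry, so each adds one more dimension to the vector.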

### CompsciOverflow

#### 2-thread consensus impl. with single FIFO queue and atomic read/write registers

I need to implement a 2-thread consensus with a single, initially empty FIFO queue and atomic read/write registers.

If deq() is called when the queue is empty, then a special EMPTY token is returned.

Any ideas anyone?

Thanks

Note: the queue can't be initialized with elements inside it; the algorithm starts when the queue is empty, so the standard algorithm for 2-thread consensus using a queue cannot be applied here.

#### Hough transform: difference in cartesian to polar equation

According to Wikipedia and most other resources on the internet, the relation between cartesian coordinate and polar coordinate parameters are described by the equation $x\cos{\theta}+y\sin{\theta}=d$.

However, the computer vision course in Udacity used $x\cos{\theta}-y\sin{\theta}=d$ instead, as shown in this short video https://youtu.be/2oGYGXJfjzw?t=4s.

What is the difference?

### StackOverflow

#### How to write java program that simulates a "pseudo" MACHINE LEARNING program? [on hold]

I have to do this project where I have to solve a problem using my own "simple pseudo" machine learning algorithm. I will not be using any complex algorithms; I only have to use the technique in which the program learns a task from prior experience.

I really have no idea how to start this, since we have never even learned about Machine Learning, and I don't get how I can write code that will learn from prior experiences by itself.

My directions state:

You will be training a self-driving Uber car (the machine) how to drive to a specific location. Using the attached Uber street map with various numbers that represent streets and houses which represent the target location, the Uber must begin at the start and learn how to drive to the target house. Build a user interface that asks the user to type in a location (house) for which they want to be driven to. I created the map with the houses/locations being represented by EVEN numbers. So the user must type in an EVEN number. Then let the Uber randomly select a ROUTE with a series of TURNs. The ROUTE would be similar to the GAME class that I gave you in the Machine Learning PowerPoint example and the TURN would be similar to the MOVE class that I gave you in the Machine Learning PowerPoint example. After the Uber arrives at a destination randomly the first time around, if it is NOT the correct destination, then the Uber has to go back to START and try it again. During the next ROUTE attempt and in the successive ROUTE attempts, the Uber will be referring back to its’ previous ROUTE TURN attempts and see what it learned as to whether the TURN was a good one or not (if it wasn’t a good turn, then the Uber is not going to be stupid enough to do it again). Ideally, after a few attempts, the Uber will ultimately learn how to get to the correct destination. The ROUTE attempts are done when the Uber finally arrives at the correct destination. MAKE SURE TO ADD COMMENTS IN YOUR CODE!!

### QuantOverflow

#### Industry factors without GICS

I'm working through the Quantitative Equity Portfolio Management book by Chincarini and Kim.

I'd like to build a basic industry-based fundamental factor model. As this is a pet project for pedagogical purposes, I don't have the money to spend on Barra's GICS classifications. I also understand that other industry classifications (SIC and NAICS) are fairly useless for factor models.

Is there a reasonable open-source or homemade alternative (using, say, k-means clustering or non-negative matrix factorization) to create my own industry factors for US equities?

### Planet Theory

#### They are perfect for each other

Now that Ted Cruz has chosen his running mate, everybody is wondering: who is going to be Trump’s pick for vice-president?

It would make sense if, to mitigate his negatives, Trump chose a person of color and someone who has a history of speaking out against income inequality.

He or she would have to be someone who is media-savvy and with some experience running a campaign, but definitely not a career politician. And of course he or she should be someone who endorsed Trump early on, like, say, in January.

I can think of only one person: Jimmy McMillan!

### TheoryOverflow

#### What is the job of the Network Layer under the OSI reference model? [on hold]

What is the job of the Network Layer under the OSI reference model?

How does a network topology affect your decision in setting up a network?

### CompsciOverflow

#### How is the Theory of Computation actually used in our computer systems, and how does it interact with them, directly or indirectly? [on hold]

I am doing engineering in Information Science, and I am studying Theory of Computation and Machine Instruction in parallel.

Thank you

#### How to efficiently create balanced KD-Trees from a static set of points

From Wikipedia, KD-Trees:

Alternative algorithms for building a balanced k-d tree presort the data prior to building the tree. They then maintain the order of the presort during tree construction and hence eliminate the costly step of finding the median at each level of subdivision. [..] This algorithm presorts n points in each of k dimensions

I fail to understand how that is actually done. Consider this example:

Let's say I have 5 points in an array.

0 = (4, 7)
1 = (2, 9)
2 = (5, 4)
3 = (3, 6)
4 = (2, 1)


I suppose we now want to create two sorted arrays, one by X and one by Y?

Sort them by X     Sort them by Y
1 = (2, 9)         4 = (2, 1)
4 = (2, 1)         2 = (5, 4)
3 = (3, 6)         3 = (3, 6)
0 = (4, 7)         0 = (4, 7)
2 = (5, 4)         1 = (2, 9)


We start creating the KD-Tree. Let's split by the X axis at the first step. We select the median of the points, 3 = (3, 6), so the left and right trees will look like this:

Left  (first half)
1 = (2, 9)
4 = (2, 1)

Right  (second half)
0 = (4, 7)
2 = (5, 4)


Now we want to sort the left and right trees by the Y axis. How are we supposed to make use of the points that we sorted previously by the Y axis?

The only solution in my eyes is to re-sort the left and right trees by the Y axis. What is Wikipedia talking about?
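For what it's worth, my reading of the Wikipedia passage is that no re-sort is needed: once the X-median splits the points into left and right sets, one stable pass over the existing Y-sorted array separates it into a Y-sorted left array and a Y-sorted right array in O(n), because a stable partition preserves the relative Y order. A sketch using the example points above:

```python
# The example points, keyed by their ids from the question.
points = {0: (4, 7), 1: (2, 9), 2: (5, 4), 3: (3, 6), 4: (2, 1)}

# Ids in Y order, from the presort (as listed in the question).
y_sorted = [4, 2, 3, 0, 1]

# Splitting on the X-median, id 3 = (3, 6): ids {1, 4} go left, {0, 2} right.
left_ids, right_ids = {1, 4}, {0, 2}

# One stable pass over the Y-sorted array: each output list is
# automatically still sorted by Y, with no additional sorting.
left_y = [i for i in y_sorted if i in left_ids]
right_y = [i for i in y_sorted if i in right_ids]

print(left_y, right_y)  # [4, 1] [2, 0]
```

Repeating this partition at every level keeps both presorted orders available throughout the construction, which is what eliminates the per-level median-finding cost.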

#### Where Can I Find DFA Practice?

Where can I find a website or book where I can practice drawing DFAs for a language like "{w | w has at least three a's and at least two b's}"? It will be important to have access to the answers so I can check myself. I need the practice.

### StackOverflow

#### Best data-mining/classification techniques [on hold]

Do there exist some powerful techniques for data analysis which are common and suitable for various kinds of data in various situations?

For example, I need to classify new data, for part of which I already have the classification. I plan to try (my thought examples are below):

1. apply PCA, then random forest decisions;
2. find the most significant columns via method X (using lib A in Python), then apply a Kohonen network to all the data that way;
3. try SVM with Markov chains (see this example in R, and this one in Mathematica), and improve the result with the K-nearest method;
4. use this toolkit to find data anomalies, and try a usual backpropagation NN (like here) or recurrent neural networks like here;
5. combine genetic algorithms (like this) with linear classifiers (I mean this).

I feel like an astronaut/cosmonaut diving into infinity given the current number of data-mining tools and algorithms.

### CompsciOverflow

#### Vertex Disjoint Path Covers of Hypercube-Like Graphs

This is a follow-up question relating to an older question I posted, namely: Decomposing the n-cube into vertex-disjoint paths.

Given a graph $G = (V, E)$ and sets of distinct vertices $S = \{s_1, \ldots s_k \}$ and $T = \{t_1, \ldots t_k\}$, I am interested in finding a vertex disjoint path cover $P = \{P_1, \ldots P_k\}$ of $G$ such that each $P_i$ begins with $s_i$ and ends with $t_i$. Moreover, for $P$ to be a path cover of $G$, every vertex $v \in V$ must be part of a unique path $P_i \in P$.

In my previous question, I was interested in the case $G = \mathcal{Q}_n$, where $\mathcal{Q}_n$ denotes the $n$-dimensional hypercube graph. It was shown by Gregor and Dvorak that, when $P$ is required to be balanced (in the sense that the paths together contain the same number of vertices from each bipartition class of the $n$-cube), such a cover exists whenever $2k-e < n$, where $e$ is the number of pairs $(s_i, t_i)$ that form edges in $\mathcal{Q}_n$.

Now I am interested in the same problem for a graph $G = \mathcal{Q}_n - \mathcal{Q}_d$ (i.e. a single copy of $\mathcal{Q}_d$ is deleted from $\mathcal{Q}_n$), for $1 \leq d \leq n$. Results were shown for the existence of Hamiltonian cycles in graphs $\mathcal{Q}_n - G$ when $G$ is an isometric tree or cycle (http://davpe.net/download/mff/bc_revised.pdf), but nothing for the problem of path covers in the desired graph class $G = \mathcal{Q}_n - \mathcal{Q}_d$. Is anyone familiar with such results?

Edit: I should add I am primarily interested in results for $1 \leq k \leq 2$, though I would appreciate results for any $k$.

### StackOverflow

#### Feature weightage from Azure Machine Learning Deployed Web Service

I am trying to predict from my past data, which has around 20 attribute columns and a label. Out of those 20, only 4 are significant for prediction. But I also want to know, when a row falls into one of the classified categories, which other important correlated columns there are apart from those 4, and what their weights are. I want to get that result from my deployed web service on Azure.

### CompsciOverflow

#### get time slots available in any of workshop

I have a list of workshops, each having open_days (Monday, Tuesday, etc.) and an open_time and close_time (which are the same for each day). Based on the current time, I need to find the available slots for the next 7 days. A slot is 2 hours long and is available if any of the workshops is open during that 2-hour window. The first slot of each day starts at the open_time, rounded up to the nearest hour (e.g. if open_time is 09:23:54 then the first slot's start time will be 10:00:00), of the workshop that opens first on that day.

How to find all the available slots?

Input:

workshops = [{"id":1,"open_days":[1,2,3,4,5,6],"open_time":"09:30:00","close_time":"19:30:00"},{"id":2,"open_days":[2,3,4,5,6,7],"open_time":"08:00:00","close_time":"16:30:00"}]

current_time = "2016-04-29 14:00:00"


Note: here open_days is a list of days, where Monday is 1 and Sunday is 7.

Output:

slots = {"2016-04-29":[{"start_time":"14:00:00","end_time":"16:00:00"},{"start_time":"16:00:00","end_time":"18:00:00"}],"2016-04-30":[{"start_time":"08:00:00","end_time":"10:00:00"},{"start_time":"10:00:00","end_time":"12:00:00"},{"start_time":"12:00:00","end_time":"14:00:00"},
{"start_time":"14:00:00","end_time":"16:00:00"},
{"start_time":"16:00:00","end_time":"18:00:00"}],...}


### CompsciOverflow

#### Can any finite problem be in NP-Complete?

My lecturer made the statement

Any finite problem cannot be NP-Complete

He was talking about Sudokus at the time, saying something along the lines that for an 8x8 Sudoku there is a finite set of solutions, but I can't remember exactly what he said. I wrote down the note that I've quoted but still don't really understand it.

Sudokus are NP-complete if I'm not mistaken. The clique problem is also NP-complete, and if I had a 4-clique problem, is this not a finite problem that is NP-complete?

### TheoryOverflow

#### Where Can I Find DFA Practice? [migrated]

Where can I find a website or book where I can practice drawing DFAs for a language like "{w | w has at least three a's and at least two b's}"? It will be important to have access to the answers so I can check myself. I need the practice.

#### Computing the approximate population of a bloom filter

Given a Bloom filter of size N bits and K hash functions, of which M bits (where M <= N) are set:

Is it possible to approximate the number of elements inserted into the Bloom filter?

Simple Example

I've been mulling over the following example, assuming a BF of 100 bits and 5 hash functions, where 10 bits are set...

Best-case scenario: assuming the hash functions are perfect and each of some X values maps to its own unique set of bits, then given that 10 bits have been set we can say that only 2 elements have been inserted into the BF.

Worst-case scenario: assuming the hash functions are bad and consistently map to the same bit (yet unique amongst each other), then we can say 10 elements have been inserted into the BF.

The range seems to be [2, 10]; whereabouts in this range is probably determined by the false-positive probability of the filter. I'm stuck at this point.
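There is in fact a standard estimator for this (not stated in the question, but it follows from the uniform-hashing model): the expected number of set bits after n insertions is M = N(1 - (1 - 1/N)^(Kn)), which inverts approximately to n ≈ -(N/K) ln(1 - M/N). A sketch:

```python
import math

def bloom_population(n_bits, k_hashes, m_set):
    """Approximate the number of inserted elements from the fill level.

    Inverts E[M] = N * (1 - (1 - 1/N)^(K*n)) using ln(1 - 1/N) ~ -1/N,
    giving n ~ -(N/K) * ln(1 - M/N).
    """
    return -(n_bits / k_hashes) * math.log(1.0 - m_set / n_bits)

# The question's example: 100 bits, 5 hashes, 10 bits set.
print(bloom_population(100, 5, 10))  # ~2.1, close to the best case of 2
```

The estimate lands near the best case because with only 10% fill, hash collisions are still rare; as M/N grows, the logarithm pulls the estimate well above M/K.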

### CompsciOverflow

#### Algorithm to convert rendered number back into symbolic form

If you have a number such as $3.14626437$ and you need to know what symbols create it, as far as I know there are two tools:

1- ISC

and the answer is $\sqrt2+\sqrt3$

I am wondering what algorithm these websites use, and what is their complexity?
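As far as I understand (this is background, not from the question), inverse symbolic calculators are typically built on integer-relation algorithms such as PSLQ or LLL, run against large tables of known constants. The core idea can be illustrated very naively by exhaustive search instead of PSLQ: look for small integer coefficients a, b, c with a·√2 + b·√3 + c ≈ x:

```python
import math
from itertools import product

def identify(x, bound=3, tol=1e-7):
    """Naive stand-in for PSLQ: search small integers a, b, c such that
    a*sqrt(2) + b*sqrt(3) + c is within tol of x."""
    rng = range(-bound, bound + 1)
    for a, b, c in product(rng, rng, rng):
        if abs(a * math.sqrt(2) + b * math.sqrt(3) + c - x) < tol:
            return a, b, c
    return None

print(identify(3.14626437))  # (1, 1, 0), i.e. sqrt(2) + sqrt(3)
```

Real PSLQ/LLL implementations find such relations in polynomial time in the vector dimension and precision, rather than the exponential cost of this brute-force sketch.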

### hubertf's NetBSD blog

#### OpenHUB's NetBSD Project Statistics

This flew by on Twitter (thanks ajcc @6LR61!), and I think it's neat so I point to it here: BlackDuck's OpenHUB has a number of automatically generated NetBSD project statistics. Stats include activity and vulnerability reports, languages, lines-of-code statistics (with comment and blank lines), 30-day and 12-month activity reports with commit and contributor numbers, the number of contributors per month since 1993, and more. In a nutshell, NetBSD consists of 5902 years of effort. Have a look!

### CompsciOverflow

#### Using Headpose Vector and 2D Points to Compute Distances

I have a frame taken from a video. The frame contains a face and I have the (x, y) locations of the features (corners of lips, edge of eyebrows, etc.) and the headpose vector (pitch, yaw, roll), which shows the direction that the face is looking in degrees ((0, 0, 0) would be at the camera).

I need to calculate distances between specific points in real (3D) space. How can I map the feature locations to 3D space?

#### How does a recurrent connection in a neural network work?

I am reading a very interesting paper on genetic algorithms which define neural networks. I am familiar with how a feedforward neural network operates, but then I came across this:

Where node #4 connects back to node #5. I was wondering how this is handled. Does the state of node 4 get kept from the last timestep and applied to node 5 when it is time to calculate its activation?
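That intuition is essentially right for discrete-time recurrent networks: activations from the previous timestep are held and fed through the recurrent edges at the next step. A toy sketch (the node numbering and single 4→5 edge follow the question; the weights are made up):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def step(x, prev_node4):
    """One timestep: node 5 sees node 4's activation from the PREVIOUS step."""
    node4 = sigmoid(0.8 * x)                     # feedforward input to node 4
    node5 = sigmoid(0.5 * x + 1.2 * prev_node4)  # recurrent edge 4 -> 5
    return node4, node5

# Unroll two timesteps; node 4's state persists across steps.
n4, n5 = step(1.0, prev_node4=0.0)  # t = 0: no history yet, so 0 is used
n4, n5 = step(1.0, prev_node4=n4)   # t = 1: uses node 4's t=0 activation
```

At t = 0 the recurrent input is some initial value (commonly zero), which is why unrolled RNN diagrams show the loop "cut" at the first step.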

### Lobsters

#### Lively Linear Lisp -- 'Look Ma, No Garbage!' (1992)

tl;dr: Lisp sans the garbage collector

### QuantOverflow

#### What does a negative stock amount mean in a single-period, binomial market model?

Consider a single-period, binomial market model with a $r > 0$ interest rate (in USD per period) and a portfolio $(x, y)$ consisting of two assets: a savings/lendings account and a stock, both measured in USD.

Now, both $x$ and $y$ may be positive or negative. If $x$ is positive, the savings account holds $x$ USD; if $x$ is negative, the account holder owes the bank $|x|$ USD. If $y$ is positive, the account holder has $y$ stocks in his/her possession.

What does it mean for $y$ to be negative?

The only idea I have is that a negative $y$ corresponds to short selling $|y|$ stocks. However, in the real world, short selling a stock is accompanied by setting up an interest-accruing margin account with the broker and possibly depositing additional collateral, and this is not reflected in the model.

I understand that the model is a simplification of the real world, but I don't think ignoring an interest accruing debt is an acceptable simplification, and, in support of this I bring the following quote from Investopedia:

Most of the time, you can hold a short for as long as you want, although interest is charged on margin accounts, so keeping a short sale open for a long time will cost more.

It also simply doesn't make any economical sense, of the sort that exists even in the most simplified models of economic interactions, that one can borrow something of value without having to pay for it.

So what does it mean for $y$ to be negative? How can I wrap my mind around it?

EDIT: Here's my proposal for an answer, let me know what you think.

The difficulty arises from the fact that the word "stock" is a misnomer: the security referred to as a "stock" does not model a real-life stock, not even in simplified form. Rather, it models some other financial instrument that has no counterpart in real life, which, together with additional financial instruments, can be used to synthesize a simplified model of a real life stock.

A much better conceptualization of what the model "stock" means is that it is a variation on a savings/lending account: whereas the value of a regular savings/lending account increments deterministically with time, the value of the so-called "stock" increments randomly.

From this follows the following conclusion: instead of referring to the model "stock" as such, it would be better to call it "a random savings/lending account", as opposed to "a deterministic savings/lending account", and save the term "stock" to a different financial instrument that actually models a real-life stock, and that can be synthesized from a combination of a deterministic and a random savings/lending accounts and possibly some other financial instruments.

### TheoryOverflow

#### Observational Equivalence of open terms in PCF

The notion of observational equivalence is rather intuitive, but formally I'm having some doubts in the particular case of open terms.

Let's consider the simple case where the terms M and N are free variables, M = x and N = y. Choosing the program context C[-] = (λxy.[-]) v w, we have
C[M] = (λxy.x) v w →* v
C[N] = (λxy.y) v w →* w
So by definition, M and N would not be observationally equivalent.

However, consider any adequate denotational semantics of PCF. Then by definition of adequacy, ⟦M⟧ = ⟦N⟧ implies that M and N are observationally equivalent. The denotations of both (open) terms are
⟦M⟧ = ⟦x ⊢ x⟧ = id
⟦N⟧ = ⟦y ⊢ y⟧ = id
From what we conclude M and N to be observationally equivalent.

So, surely I must be going wrong somewhere, and I suspect it to be some trivial mistake. But where?

### StackOverflow

#### Sort a generator using generators?

I have a lot of data stored in generators, and I would like to sort them without using lists, so as not to run out of memory in the process. Is it possible to sort the generators this way? I have spent some hours thinking about this and I can't find a way to do it without saving the seen values somewhere (or is there a way to save them "partially"?). I have read on Google about lazy sorting; is that a nice approach? Thanks for the answers!

EDIT: My final objective is to write all the sorted data to a file.

PS: sorry about my bad english ><
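What the "lazy sorting" searches usually point at is an external merge sort, which fits the write-to-file goal well: consume the generator in fixed-size chunks, sort each chunk, spill it to a temporary file, then lazily merge the sorted runs with heapq.merge, which itself yields a generator. A sketch (the chunk size and line-per-int file format are arbitrary choices here):

```python
import heapq
import tempfile

def _spill(sorted_chunk):
    """Write one sorted run to a temp file and rewind it for reading."""
    f = tempfile.TemporaryFile(mode="w+")
    f.writelines(f"{x}\n" for x in sorted_chunk)
    f.seek(0)
    return f

def external_sort(gen, chunk_size=1000):
    """Sort a long generator of ints using only O(chunk_size) memory."""
    runs = []
    chunk = []
    for item in gen:
        chunk.append(item)
        if len(chunk) >= chunk_size:
            runs.append(_spill(sorted(chunk)))
            chunk = []
    if chunk:
        runs.append(_spill(sorted(chunk)))
    # heapq.merge lazily merges the sorted runs -- the result is a generator,
    # so it can be streamed straight into the output file.
    return heapq.merge(*[(int(line) for line in f) for f in runs])

data = (x * 37 % 101 for x in range(100))  # a "big" unsorted generator
result = list(external_sort(data, chunk_size=16))
```

Only one chunk plus one line per run is ever held in memory, so the peak footprint is set by chunk_size rather than by the total data size.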

### CompsciOverflow

#### Global relabeling heuristic: Push-relabel maxflow

I have a correct, working implementation of the preflow-push-relabel maxflow algorithm [2]. I am trying to implement the global relabeling update heuristic [3], but have run into some issues.

I have a specific instance of the problem here to illustrate my questions:

a) For this problem instance, the "current preflow" is a state that is reached by my implemented max-flow algorithm [[ even without the global relabeling heuristic we reach this state ]]. Without applying any heuristics, the algorithm proceeds to completion from this state, giving the correct result.

b) The distance labels (in green) at this state represent a valid labeling, as for every $(u, v) \in E_{residual}$, $d_u \le d_v + 1$.

Q1: The distance label (or height) is supposed to be a lower bound on the distance from the sink. However, for several nodes in the residual graph (e.g. 6, 7) this is not true. E.g. node 7 has a distance label of 14... but clearly has a distance of 1 from the sink in the residual graph.

Q2: On running the global relabel at this stage, we get the labeling seen at the extreme right. From this point on (depending on how frequently you do the global update), the algorithm can get stuck [[ deadlock ]] -- e.g. nodes 6 and 3 keep circulating flow between them as they each get relabeled to a greater height. If you run the global update before they reach the final height, you reset the heights and the process repeats.

I am reasonably sure I am making a very trivial error but am unable to put my finger on it. Can someone help me with this issue? I am happy to provide a code snippet if that would help.

A couple more points that may be relevant:
- I am implementing a single-phase version of the push-relabel algorithm (i.e. do not stop when you have a min cut, but continue till you obtain a valid flow).
- I am processing active vertices in FIFO order in the current problem instance. But I have a highest-label-first [3] implementation as well. I do not think this affects the two questions I have.

Thank You!

[[ This question has also been posted on programmers stackexchange -- I was unsure which forum is better for this question ]]

[2] A new approach to the maximum flow problem; A.Goldberg, R.Tarjan; JACM, Vol 35. Iss 4, 1988

[3] On implementing the push-relabel method for the maximum flow problem; B.V.Cherkassky, A.Goldberg

### StackOverflow

#### true negative is 0% whereas true positive is 100% correctly classified

I used Naive Bayes from Spark's MlLib to train a model and test it on the data (in the form of an RDD). The results were confusing.

The data and results are as follows:

The problem is a binary classification one. The outcome should be either a label with '0' or '1'.

total number of labels with '0' in the testing dataset - 11774

total number of labels with '1' in the testing dataset - 246

Code for reference:

from pyspark.mllib.classification import LogisticRegressionWithLBFGS, LogisticRegressionModel
from pyspark.mllib.regression import LabeledPoint
from pyspark.mllib.util import MLUtils
from pyspark.mllib.evaluation import MulticlassMetrics

def parsePoint(line):
    values = [float(x) for x in line]
    return LabeledPoint(values[-1], values[0:-1])

data = myRDD.map(parsePoint)

# Split data approximately into training (60%) and test (40%)
training, test = data.randomSplit([0.6, 0.4], seed=0)

# Train a naive Bayes model.
model = LogisticRegressionWithLBFGS.train(training, 1.0)

# labelsAndPreds = test.map(lambda p: (p.label, model.predict(p.features)))
predictionAndLabels = test.map(lambda lp: (float(model.predict(lp.features)), lp.label))
accuracy = 1.0 * predictionAndLabels.filter(lambda (v, p): v == p).count() / test.count()
accuracy


after applying the model and obtaining the predictions :

True Positives - 11774

False Positives - 0

False Negatives - 246

True Negatives - 0

All my '0' labels are correctly classified, whereas all the '1' labels are incorrectly classified!

Now, this is a part of my project and I'm not sure if the results are fine to be submitted.

The code I wrote using Spark's Python API does this: it gets the data from a file and builds the RDD. I just fed this RDD into the example from Spark MLlib's documentation on the website, and the result is as above.

Can someone please tell me if this result is normal?
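One thing worth checking (a diagnosis suggestion, not something stated in the question): with 11774 labels of one class and only 246 of the other, a classifier that always predicts the majority class already scores about 98% accuracy, so the reported confusion matrix is exactly what a model that learned nothing beyond the class prior would produce:

```python
# Class counts from the question's test set.
n_zero, n_one = 11774, 246

# Accuracy of the trivial always-predict-'0' classifier: a model must beat
# this baseline for its accuracy number to mean anything on imbalanced data.
baseline = n_zero / (n_zero + n_one)
print(round(baseline, 4))  # 0.9795
```

This is why, on imbalanced data, precision/recall or AUC on the minority class are more informative than raw accuracy, and why class weighting or resampling is usually applied before training.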

## April 29, 2016

### Lobsters

#### Go bindings to Rust's regex library

This tests out the new C API I added to Rust’s regex library: https://github.com/rust-lang-nursery/regex/tree/master/regex-capi

### DragonFly BSD Digest

#### garbage[24]: It’s all fun and games until someone’s domain gets hurt

The garbage podcast for this week is up, with discussion of OpenBSD and TRIM, and, well, a very wide range of topics, going by the summary.

### UnixOverflow

#### FreeBSD ports collection under PC-BSD?

I'm toying with the idea of installing FreeBSD or PC-BSD... as it looks now, I'll probably go for PC-BSD.

However, one thing I really would like is the FreeBSD ports collection. So I was wondering: can it be installed on PC-BSD? Can it be used with PC-BSD without too much conflict with PC-BSD's own package installation and package repository (i.e. AppCafe)? (E.g. building and installing some packages from the ports collection, then removing them with PC-BSD's GUI package manager... Or building and installing a package from the ports collection, and then adding a package that depends on the first one with PC-BSD's GUI package manager...)

And finally, how can I install it? Can I install the ports collection directly by installing some package (which?) from the PC-BSD repository? Or must I download it separately from FreeBSD (where?) and install it "manually" (how?) in PC-BSD?

### QuantOverflow

#### Regression model when samples are small and not correlated

I received this question during an onsite interview for a quant job and I'm still scratching my head on how to solve this problem. Any help would be appreciated.

Mr Quant thinks that there is a linear relationship between past and future intraday returns, so he would like to test this idea. For convenience, he decided to parameterize returns in his data set using a regular time grid $(d, t)$, where $d = 0, \ldots, D-1$ labels the date and $t = 0, \ldots, T-1$ the intraday time period. For example, if we split the day into 10-minute intervals then $T = 1440 / 10$. His model written on this time grid has the following form:

$y_{d,t} = \beta_t \, x_{d,t} + \epsilon_{d,t}$

where $y_{d,t}$ is the return over the time interval $(t,t+1)$ and $x_{d,t}$ is the return over the previous time interval, $(t-1,t)$, on a given day $d$. In other words, he thinks that the previous 10-minute return predicts the next 10-minute return, but the coefficient between them might change intraday.

Of course, to fit $\beta_t$ he could use $T$ ordinary least squares regressions, one for each "$t$", but:

(a) his data set is fairly small $D$=300, $T$=100;

(b) he thinks that signal is very small, at best it has correlation with the target of 5%.

He hopes that some machine learning method that can combine regressions from nearby intraday times can help.

How would you solve this problem? Data provided is an $x$ matrix of predictors of size $300\times100$ and a $y$ matrix of targets of size $300\times100$.
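One standard way to combine regressions from nearby intraday times (my suggestion, not part of the question) is kernel-weighted least squares: estimate each $\beta_t$ from all columns, weighting observations at intraday time $s$ by a Gaussian kernel in $|s - t|$. With bandwidth tending to 0 this recovers the $T$ separate OLS fits; a large bandwidth pools everything into one regression, so the bandwidth directly trades bias against the small-sample variance. A sketch on synthetic noise-free data:

```python
import math

def kernel_beta(X, Y, t, bandwidth=5.0):
    """Estimate beta_t by weighted least squares pooled across nearby times.

    X, Y are D x T lists of lists; the column at time s receives weight
    exp(-(s - t)^2 / (2 * bandwidth^2)) in the single-regressor WLS formula
    beta = sum(w * x * y) / sum(w * x * x).
    """
    num = den = 0.0
    T = len(X[0])
    for s in range(T):
        w = math.exp(-((s - t) ** 2) / (2.0 * bandwidth ** 2))
        for d in range(len(X)):
            num += w * X[d][s] * Y[d][s]
            den += w * X[d][s] * X[d][s]
    return num / den

# Synthetic sanity check: beta_t = 0.5 at every t, no noise.
D, T = 30, 20
X = [[((d * 7 + s * 3) % 11) - 5.0 for s in range(T)] for d in range(D)]
Y = [[0.5 * x for x in row] for row in X]
print(kernel_beta(X, Y, t=10))  # ~0.5
```

The bandwidth would be chosen by cross-validation over days; ridge shrinkage of the $\beta_t$ toward a common value is a natural alternative with a similar effect.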

### StackOverflow

#### Does IBM Watson API learn from my data?

I'm testing couple of IBM Watson APIs like the following:

Does Watson get smarter and learn more about my data the more I use it?

I read that Watson gets smarter the more data it learns from and processes. I'm not sure if this is only done behind the scenes by the IBM Watson team, or if these APIs also allow an instance of Watson to get smarter with the specific application I'm developing.

### StackOverflow

#### Lists of Functions and their execution Erlang

Is it possible to create and send a list of functions as an argument to another function, and then have some functions within this list call other functions in this list?
For example, I want a function that works on a list passed as an argument and then performs the first function in the list of functions on this list of numbers; if that function makes calls to other functions within the list, they can be retrieved and used.

e.g.:

deviation(Numbers, Functions) -> % Functions = [fun master/1, fun avg/1, fun meanroots/1]

master calls avg, then passes that result into meanroots, etc., but at the end of the call chain master is the one that returns a value.

I'd like to know if this is functionally possible, whether within OTP alone or using NIFs, and whether there are sample implementations to look at.

### CompsciOverflow

#### Minimal Steiner Tree in unweighted directed graph

I have an unweighted directed graph $(V, E)$ and a subset $T \subseteq V$ of its vertices. I want to find the minimum tree $(V',E')$ that contains all the vertices of $T$ (minimizing the number of nodes $|V'|$).

In my problem all vertices $t \in T$ have no leaving edges: $$\left\{(t,w) \in E \mid t \in T\right\}=\emptyset$$

This is a specific case of the Directed Steiner Tree problem.

What's the best algorithm, and its complexity, for finding the exact solution to the Directed Steiner Tree problem? (Or, if one exists, a better solution for this specific case.)

What are the most used approximations for this problem?
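As a baseline, the classic shortest-path heuristic can be sketched for the directed case: pick a candidate root, union the BFS shortest paths from it to every terminal, and keep the best root. This is only a heuristic sketch (the adjacency-dict format and the tiny example graph are made up for illustration), not the best-known approximation:

```python
from collections import deque

def bfs_parents(adj, root):
    """BFS over a directed adjacency dict; returns a parent map
    covering every vertex reachable from root."""
    parent = {root: None}
    q = deque([root])
    while q:
        u = q.popleft()
        for v in adj.get(u, ()):
            if v not in parent:
                parent[v] = u
                q.append(v)
    return parent

def steiner_heuristic(adj, terminals):
    """Shortest-path heuristic: for each candidate root, union the BFS
    shortest paths to every terminal and keep the smallest vertex set.
    The union is a tree because each vertex has a single BFS parent."""
    nodes = set(adj) | {v for vs in adj.values() for v in vs}
    best = None
    for r in nodes:
        parent = bfs_parents(adj, r)
        if not all(t in parent for t in terminals):
            continue  # this root cannot reach every terminal
        tree = set()
        for t in terminals:
            while t is not None:
                tree.add(t)
                t = parent[t]
        if best is None or len(tree) < len(best):
            best = tree
    return best

# tiny hypothetical instance: terminals t1, t2 have no outgoing edges
adj = {'a': ['b', 'c'], 'b': ['t1'], 'c': ['t2'], 'd': ['t1', 't2']}
tree = steiner_heuristic(adj, ['t1', 't2'])
```

Here the heuristic prefers root d (3 vertices) over root a (5 vertices). For exact solutions the standard route is dynamic programming over subsets of terminals, exponential in $|T|$.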

### StackOverflow

#### Implementing custom kernel in Matlab

I would like to use the PUK kernel for SVM in Matlab. The PUK kernel looks as follows:

In Matlab, I have to create a function which implements the kernel.

function G = myPUKKernel(U,V)

end


What would this function look like?

Second, what parameter ranges are reasonable to search for omega and sigma?
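For reference, the Pearson VII Universal Kernel (PUK) of Üstün et al. is $K(x_i,x_j) = 1 \big/ \left[1 + \left(2\sqrt{\lVert x_i-x_j\rVert^2}\sqrt{2^{1/\omega}-1}\,/\,\sigma\right)^2\right]^{\omega}$. Here is a sketch of the Gram-matrix computation in Python/numpy rather than Matlab, with hypothetical defaults $\omega=\sigma=1$:

```python
import numpy as np

def puk_kernel(U, V, omega=1.0, sigma=1.0):
    """Pearson VII Universal Kernel (PUK) Gram matrix between the rows
    of U (m x d) and V (n x d); omega (shape) and sigma (width) are the
    two kernel parameters."""
    # squared Euclidean distances between all row pairs
    d2 = (np.sum(U**2, axis=1)[:, None]
          + np.sum(V**2, axis=1)[None, :]
          - 2.0 * U @ V.T)
    d2 = np.maximum(d2, 0.0)  # guard against tiny negative round-off
    c = 2.0 * np.sqrt(2.0 ** (1.0 / omega) - 1.0) / sigma
    return 1.0 / (1.0 + (c * np.sqrt(d2)) ** 2) ** omega

X = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]])
G = puk_kernel(X, X)
```

For the parameter search, a log-spaced grid (e.g. $10^{-2}$ to $10^{2}$) for both omega and sigma is a common starting point, though the sensible sigma range depends on how the features are scaled.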

### Jeff Atwood

#### They Have To Be Monsters

Since I started working on Discourse, I spend a lot of time thinking about how software can encourage and nudge people to be more empathetic online. That's why it's troubling to read articles like this one:

My brother’s 32nd birthday is today. It’s an especially emotional day for his family because he’s not alive for it.

He died of a heroin overdose last February. This year is even harder than the last. I started weeping at midnight and eventually cried myself to sleep. Today’s symptoms include explosions of sporadic sobbing and an insurmountable feeling of emptiness. My mom posted a gut-wrenching comment on my brother’s Facebook page about the unfairness of it all. Her baby should be here, not gone. “Where is the God that is making us all so sad?” she asked.

In response, someone — a stranger/(I assume) another human being — commented with one word: “Junkie.”

The interaction may seem a bit strange and out of context until you realize that this is the Facebook page of a person who was somewhat famous, who produced the excellent show Parks and Recreation. Not that this forgives the behavior in any way, of course, but it does explain why strangers would wander by and make observations.

There is deep truth in the old idea that people are able to say these things because they are looking at a screen full of words, not directly at the face of the person they're about to say a terrible thing to. That one level of abstraction the Internet allows, typing, which is so immensely powerful in so many other contexts …

… has some crippling emotional consequences.

As an exercise in empathy, try to imagine saying some of the terrible things people typed to each other online to a real person sitting directly in front of you. Or don't imagine, and just watch this video.

I challenge you to watch the entirety of that video. I couldn't do it. This is the second time I've tried, and I had to turn it off not even 2 minutes in because I couldn't take it any more.

It's no coincidence that these are comments directed at women. Over the last few years I have come to understand how, as a straight white man, I have the privilege of being immune from most of this kind of treatment. But others are not so fortunate. The Guardian analyzed 70 million comments and found that online abuse is heaped disproportionately on women, people of color, and people of different sexual orientation.

And avalanches happen easily online. Anonymity disinhibits people, making some of them more likely to be abusive. Mobs can form quickly: once one abusive comment is posted, others will often pile in, competing to see who can be the most cruel. This abuse can move across platforms at great speed – from Twitter, to Facebook, to blogposts – and it can be viewed on multiple devices – the desktop at work, the mobile phone at home. To the person targeted, it can feel like the perpetrator is everywhere: at home, in the office, on the bus, in the street.

I've only had a little taste of this treatment, once. The sense of being "under siege" – a constant barrage of vitriol and judgment pouring your way every day, every hour – was palpable. It was not pleasant. It absolutely affected my state of mind. Someone remarked in the comments that ultimately it did not matter, because as a white man I could walk away from the whole situation any time. And they were right. I began to appreciate what it would feel like when you can't walk away, when this harassment follows you around everywhere you go online, and you never really know when the next incident will occur, or exactly what shape it will take.

Imagine the feeling of being constantly on edge like that, every day. What happens to your state of mind when walking away isn't an option? It gave me great pause.

I admired the way Stephanie Wittels Wachs actually engaged with the person who left that awful comment. This is a man who has two children of his own, and should be no stranger to the kind of pain involved in a child's death. And yet he felt the need to post the word "Junkie" in reply to a mother's anguish over losing her child to drug addiction.

Isn’t this what empathy is? Putting myself in someone else’s shoes with the knowledge and awareness that I, too, am human and, therefore, susceptible to this tragedy or any number of tragedies along the way?

Most would simply delete the comment, block the user, and walk away. Totally defensible. But she didn't. She takes the time and effort to attempt to understand this person who is abusing her mother, to reach them, to connect, to demonstrate the very empathy this man appears incapable of.

Consider the related story of Lenny Pozner, who lost a child at Sandy Hook, and became the target of groups who believe the event was a hoax, and similarly selflessly devotes much of his time to refuting and countering these bizarre claims.

Tracy’s alleged harassment was hardly the first, Pozner said. There’s a whole network of people who believe the media reported a mass shooting that never happened, he said, that the tragedy was an elaborate hoax designed to increase support for gun control. Pozner said he gets ugly comments often on social media, such as, “Eventually you’ll be tried for your crimes of treason against the people,” “… I won’t be satisfied until the caskets are opened…” and “How much money did you get for faking all of this?”

It's easy to practice empathy when you limit it to people that are easy to empathize with – the downtrodden, the undeserving victims. But it is another matter entirely to empathize with those that hate, harangue, and intentionally make other people's lives miserable. If you can do this, you are a far better person than me. I struggle with it. But my hat is off to you. There's no better way to teach empathy than to practice it, in the most difficult situations.

In individual cases, reaching out and really trying to empathize with people you disagree with or dislike can work, even people who happen to be lifelong members of hate organizations, as in the remarkable story of Megan Phelps-Roper:

As a member of the Westboro Baptist Church, in Topeka, Kansas, Phelps-Roper believed that AIDS was a curse sent by God. She believed that all manner of other tragedies—war, natural disaster, mass shootings—were warnings from God to a doomed nation, and that it was her duty to spread the news of His righteous judgments. To protest the increasing acceptance of homosexuality in America, the Westboro Baptist Church picketed the funerals of gay men who died of AIDS and of soldiers killed in Iraq and Afghanistan. Members held signs with slogans like “GOD HATES FAGS” and “THANK GOD FOR DEAD SOLDIERS,” and the outrage that their efforts attracted had turned the small church, which had fewer than a hundred members, into a global symbol of hatred.

Perhaps one of the greatest failings of the Internet is the breakdown in cost of emotional labor.

First we’ll reframe the problem: the real issue is not Problem Child’s opinions – he can have whatever opinions he wants. The issue is that he’s doing zero emotional labor – he’s not thinking about his audience or his effect on people at all. (Possibly, he’s just really bad at modeling other people’s responses – the outcome is the same whether he lacks the will or lacks the skill.) But to be a good community member, he needs to consider his audience.

True empathy means reaching out and engaging in a loving way with everyone, even those that are hurtful, hateful, or spiteful. But on the Internet, can you do it every day, multiple times a day, across hundreds of people? Is this a reasonable thing to ask of someone? Is it even possible, short of sainthood?

The question remains: why would people post such hateful things in the first place? Why reply "Junkie" to a mother's anguish? Why ask the father of a murdered child to publicly prove his child's death was not a hoax? Why tweet "Thank God for AIDS!"

Unfortunately, I think I know the answer to this question, and you're not going to like it.

I don't like it. I don't want it. But I know.

I have laid some heavy stuff on you in this post, and for that, I apologize. I think the weight of what I'm trying to communicate here requires it. I have to warn you that the next article I'm about to link is far heavier than anything I have posted above, maybe the heaviest thing I've ever posted. It's about the legal quandary presented in the tragic cases of children who died because their parents accidentally left them strapped into carseats, and it won a much-deserved Pulitzer. It is also one of the most harrowing things I have ever read.

Ed Hickling believes he knows why. Hickling is a clinical psychologist from Albany, N.Y., who has studied the effects of fatal auto accidents on the drivers who survive them. He says these people are often judged with disproportionate harshness by the public, even when it was clearly an accident, and even when it was indisputably not their fault.

Humans, Hickling said, have a fundamental need to create and maintain a narrative for their lives in which the universe is not implacable and heartless, that terrible things do not happen at random, and that catastrophe can be avoided if you are vigilant and responsible.

In hyperthermia cases, he believes, the parents are demonized for much the same reasons. “We are vulnerable, but we don’t want to be reminded of that. We want to believe that the world is understandable and controllable and unthreatening, that if we follow the rules, we’ll be okay. So, when this kind of thing happens to other people, we need to put them in a different category from us. We don’t want to resemble them, and the fact that we might is too terrifying to deal with. So, they have to be monsters.

This man left the junkie comment because he is afraid. He is afraid his own children could become drug addicts. He is afraid his children, through no fault of his, through no fault of anyone at all, could die at 30. When presented with real, tangible evidence of the pain and grief a mother feels at the drug related death of her own child, and the reality that it could happen to anyone, it became so overwhelming that it was too much for him to bear.

Those "Sandy Hook Truthers" harass the father of a victim because they are afraid. They are afraid their own children could be viciously gunned down in cold blood any day of the week, bullets tearing their way through the bodies of the teachers standing in front of them, desperately trying to protect them from being murdered. They can't do anything to protect their children from this, and in fact there's nothing any of us can do to protect our children from being murdered at random, at school any day of the week, at the whim of any mentally unstable individual with access to an assault rifle. That's the harsh reality.

When faced with the abyss of pain and grief that parents feel over the loss of their children, due to utter random chance in a world they can't control, they could never control, maybe none of us can ever control, the overwhelming sense of existential dread is simply too much to bear. So they have to be monsters. They must be.

And we will fight these monsters, tooth and nail, raging in our hatred, so we can forget our pain, at least for a while.

After Lyn Balfour’s acquittal, this comment appeared on the Charlottesville News Web site:

“If she had too many things on her mind then she should have kept her legs closed and not had any kids. They should lock her in a car during a hot day and see what happens.”

I imagine the suffering that these parents are already going through, reading these words that another human being typed to them, just typed, and something breaks inside me. I can't process it. But rather than pitting ourselves against each other out of fear, recognize that the monster who posted this terrible thing is me. It's you. It's all of us.

The weight of seeing through the fear and beyond the monster to simply discover yourself is often too terrible for many people to bear. In a world of hard things, it's the hardest there is.


### StackOverflow

#### How to plot a sigmoid probability curve in scikit-learn?

I'm trying to recreate this image using Python given 2 classes and their associated predicted probability from a classifier.

I want to see something like this:

It's not working though, as I get a mostly linear line. **NOTE: I know the data shown is currently suspect and/or bad. I need to tune the input & model, but I wanted to look at the plot first.**

Basically, I thought I'd "correct" the predict_proba() output so the values are all with respect to the "0" class (meaning if it predicted the "1" class, the probability that it's the "0" class is 1 − P(class 1), so that a 95% prediction of class "1" becomes a 5% chance of class "0"). Then I sort by my corrected prediction value and should end up with something sigmoid-ish.

Unfortunately, I end up with this:

Here's a chunk of my python where I'm trying (unsuccessfully) to plot the probability sigmoid:

import matplotlib.pyplot as plt

testPred = clf.predict(X_test)          # class predictions (was missing in the snippet)
testPredProbas = clf.predict_proba(X_test)
testPredProbas = testPredProbas[:, 1]

# Sort in order to produce sigmoid curve
from numpy import array, rec
c = rec.fromarrays([array(testPredProbas), array(y_test)])
c.sort()
cosorted = [[], []]
cosorted[0] = c.f0   # fromarrays adds field names beginning with f0 automatically
cosorted[1] = c.f1

sigmoid = cosorted[0]
classValue = cosorted[1]
xAxis = range(0, len(cosorted[0]))
numInstances = len(cosorted[0])

# Fix probabilities so they are all with respect to the 0 class
sigmoidPoints = []
for idx, val in enumerate(testPredProbas):
    if testPred[idx] == 0:
        sigmoidPoints.append(1.0 - val)
    elif testPred[idx] == 1:
        sigmoidPoints.append(val)
    else:
        sigmoidPoints.append(-1.0)

plt.plot(xAxis, sigmoid, lw=1, label='sigmoid plot')
plt.scatter(xAxis, classValue, marker='o', color='r', label='actual outcome')
plt.xlim([-0.05, numInstances + 0.05])
plt.ylim([-0.05, 1.05])
plt.xlabel('x')
plt.ylabel('Prediction Probability')
plt.title('Sigmoid Curve')
plt.legend(bbox_to_anchor=(0., 1.02, 1., .102), loc=3)  # the call was left unclosed
plt.show()


For reference, below is the plot in Matlab that I'm trying to replicate in my Python model.

%Build the model
mdl = fitglm(X, Y, 'distr', 'binomial', 'link', 'logit')
%Build the sigmoid model
B = mdl.Coefficients{:, 1};
Z = mdl.Fitted.LinearPredictor
yhat = glmval(B, X, 'logit');
figure, scatter(Z, yhat), hold on,
gscatter(Z, zeros(length(X),1)-0.1, Y) % plot original classes
hold off, xlabel('\bf Z'),  grid on,  ylim([-0.2 1.05])
title('\bf Predicted Probability of each record')
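For what it's worth, the "correct and sort" step can be done directly on the class-1 probabilities with numpy and no per-class correction at all: sorting the test points by $P(\text{class }1)$ already yields the curve. A sketch with synthetic probabilities standing in for the clf.predict_proba output:

```python
import numpy as np

rng = np.random.RandomState(0)
# hypothetical stand-ins for clf.predict_proba(X_test)[:, 1] and y_test
p_class1 = rng.uniform(size=50)
y_test = (p_class1 + 0.2 * rng.normal(size=50) > 0.5).astype(int)

# sort every test point by its predicted P(class 1); plotting the sorted
# probabilities against their rank gives the sigmoid-ish curve directly
order = np.argsort(p_class1)
sigmoid = p_class1[order]
outcome = y_test[order]
x_axis = np.arange(len(sigmoid))

# then, as in the question:
# plt.plot(x_axis, sigmoid); plt.scatter(x_axis, outcome); plt.show()
```

Note the curve only looks sigmoid when the probabilities actually concentrate near 0 and 1; a mostly linear line usually means the model's probabilities are close to uniform.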


#### Parameter Optimization and k-fold Cross-Validation

I was wondering about the correct data splitting for parameter optimization and later cross-validation.

In general, the data are split into a training and a test set. Additionally, I split the training set into a smaller training set and a validation set to tune the SVM parameters (also by CV); these are then merged again into the final training set.

Everything is fine until now; I run into theoretical problems when I want to carry out 10-fold cross-validation at the end. The test set was left out so that it stays unbiased with respect to the tuned parameters, but in my case just 1 out of 9 is unbiased, as the other test folds have been part of the training set during the tuning.

Are my assumptions correct and can you maybe help me out with a better splitting strategy?
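One standard resolution is nested cross-validation: an inner CV loop, run only on the outer training folds, picks the parameters, and the outer loop measures performance, so every outer test fold stays unseen during tuning. A pure-Python sketch of the index bookkeeping (the model fitting and scoring are stubbed out as placeholders):

```python
def k_folds(n, k):
    """Split indices 0..n-1 into k contiguous folds."""
    size, rem = divmod(n, k)
    folds, start = [], 0
    for i in range(k):
        stop = start + size + (1 if i < rem else 0)
        folds.append(list(range(start, stop)))
        start = stop
    return folds

def nested_cv(n, outer_k=10, inner_k=5, params=(0.1, 1.0, 10.0)):
    """Yield (train_idx, test_idx, best_param) per outer fold.
    Parameters are tuned with an inner CV over the outer training
    indices only, so the outer test fold is never seen during tuning."""
    for test_idx in k_folds(n, outer_k):
        held_out = set(test_idx)
        train_idx = [i for i in range(n) if i not in held_out]
        best_param, best_score = None, float('-inf')
        for p in params:
            score = 0.0
            for val_idx in k_folds(len(train_idx), inner_k):
                # fit on train_idx minus the validation chunk and score
                # on the chunk -- stubbed out here, no real model
                score += 0.0
            if score > best_score:
                best_param, best_score = p, score
        yield train_idx, test_idx, best_param

splits = list(nested_cv(100))
```

The price is that each outer fold may select different parameters; the outer score then estimates the performance of the whole tune-and-fit procedure rather than of one fixed parameter setting.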

#### Development of the real-time recommendation system using Spark

I've just started learning Spark and machine learning, and I'm going to develop a real-time recommendation system.

Assume there is an HBase instance with the following tables: User(id, age, gender, occupation, ZIP code), Movie(id, title, release date, IMDB_link, category), data(user_id, movie_id, rating (1-5 scale), timestamp). The system should make recommendations to a user based on the user's previous ratings of films.

Obviously, one should train a model and then show the user recommendations. But I do not understand how to organize retraining of the model after the user has watched one or more films. Should I keep my trained model in memory (like a static member)? Should the model be retrained after every new rating of a film?

UPD

Here I will try to describe how I understand things should work:

1) There is a history of ratings which the user assigned to different films.

2) When the user clicks "See recommendation", a request is sent to the server.

3) The server trains a model based on the user's previous ratings.

4) The user watches a new movie and assigns a rating to it.

5) After step 4, our model should be retrained to be more accurate.

Step 5 could take a lot of time, so the user would have to wait all this time to get a new recommendation. So the questions are: 1) Do I understand correctly how things should work? 2) Is there any way to avoid making the user wait after each new rating before obtaining a new recommendation?

#### Loading other images into tensorflow, apart from MNIST

I am interested in applying convolutional neural networks using TensorFlow. However, the only tutorial I have seen loads the MNIST dataset. I have tried replicating the procedure there and reading chunks of tutorials around the web, but it's not working. Here is my code so far:

import tensorflow as tf
import os
import numpy as np

filename = os.getcwd() + '/sample_images/*.png'
filename_queue = tf.train.string_input_producer(tf.train.match_filenames_once(filename))

# a reader is needed to actually produce file contents from the queue
reader = tf.WholeFileReader()
_, image_file = reader.read(filename_queue)

image = tf.image.decode_png(image_file, 3)
image = tf.image.rgb_to_grayscale(image, name=None)
image = tf.image.resize_images(image, 28, 28, method=0, align_corners=False)

data = []
with tf.Session() as sess:
    tf.initialize_all_variables().run()

    # queue runners must be started, otherwise sess.run([image]) blocks forever
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)

    image_tensor = sess.run([image])
    data.append(image_tensor)
    #print(image_tensor)
    coord.request_stop()
    coord.join(threads)

xx = np.asarray(data)
print xx[0].shape


Basically, I want to do the following:

• load images from the folder with their names

• resize each image to 28×28

• change it to greyscale

• turn it into a tensor and add it to the training set

• create its target (from its label) and add it to a numpy array

• repeat for all images in the folder

When I am done, pass the dataset and targets to a TensorFlow RNN.

All help will be highly appreciated

### QuantOverflow

#### Best way to store hourly/daily options data for research purposes

There are quite a few discussions here about storage, but I can't find quite what I'm looking for.

I need to design a database to store (mostly) options data (strikes, bid/ask premiums, etc.). The problem I see with an RDBMS is that, given the big number of strikes, tables will be enormously long and hence slow to process. While I'm reluctant to use MongoDB or a similar NoSQL solution, for now it seems a very good alternative (quick, flexible, scalable).

1. There is no need for tick data; it will be hourly and daily closing prices & whatever other parameters I'd want to add. So there is no need for frequent updates, and writing speed is not that important.

2. The main performance requirement is in using it for data mining, stats and research, so it should be as quick as possible (and preferably easy) to pull and aggregate data from it. I.e., think of a 10-year backtest which performs ~100 transactions weekly over various types of options, or calculating a volatility swap over some extended period of time. So the quicker the better.

3. There is lots of existing historical data which will be transferred into the database, and it will be updated on a daily basis. I'm not sure exactly how much space it will take, but AFAIK storage should not be a constraint at all.

4. Support by popular programming languages & packages (C++, Java, Python, R) is very preferable, but would not be a deal breaker.

Any suggestions?
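At hourly/daily granularity a plain relational layout often holds up fine if the indexes match the dominant query. A minimal sketch using Python's built-in sqlite3 (all table and column names hypothetical; any serious RDBMS would follow the same shape):

```python
import sqlite3

# minimal relational sketch; table layout and column names are hypothetical
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE option_quote (
    quote_time  TEXT NOT NULL,          -- hourly/daily timestamp, ISO 8601
    underlying  TEXT NOT NULL,
    expiry      TEXT NOT NULL,
    strike      REAL NOT NULL,
    call_put    TEXT NOT NULL CHECK (call_put IN ('C','P')),
    bid         REAL,
    ask         REAL,
    PRIMARY KEY (quote_time, underlying, expiry, strike, call_put)
);
-- index shaped for the research query "all quotes of one underlying
-- over a date range", i.e. the backtest access pattern described above
CREATE INDEX idx_und_time ON option_quote (underlying, quote_time);
""")
conn.execute("INSERT INTO option_quote VALUES "
             "('2016-05-02', 'SPX', '2016-06-17', 2050.0, 'C', 12.3, 12.8)")
rows = conn.execute("SELECT bid, ask FROM option_quote "
                    "WHERE underlying = 'SPX'").fetchall()
```

A long-and-narrow table like this with a composite index is usually fast enough at this volume; the NoSQL question only really bites at tick frequency.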

### StackOverflow

#### Generic conversational dataset? [on hold]

I'm looking for a strong conversational dataset. I've considered the public streaming Twitter API, but that didn't work out (retweets, slang, poor grammar, etc.).

I'm looking for something generic, more along the lines of:

Text: Hello

Response: Hi

Text: How are you?

Response: Doing well (etc..)

### CompsciOverflow

#### Merge k sorted arrays, where each one's length is double the previous one's

I've seen many answers about merging identically-sized arrays, but haven't seen an answer to this question yet.

Given $A_1, A_2, ..., A_k$ sorted arrays where $|A_i| = 2^i$, what is the best comparison-based way to merge them all?
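Because the sizes double, merging the arrays two at a time in increasing order of size is already asymptotically optimal: the $i$-th two-way merge touches $O(2^i)$ elements, so the total work is $\sum_i O(2^i) = O(N)$ for total length $N$, beating the generic $O(N \log k)$ heap-based $k$-way merge. A sketch:

```python
import heapq

def merge_two(a, b):
    """Standard two-way merge; at most len(a)+len(b)-1 comparisons."""
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i]); i += 1
        else:
            out.append(b[j]); j += 1
    out.extend(a[i:]); out.extend(b[j:])
    return out

def merge_doubling(arrays):
    """Merge sorted arrays with |A_i| = 2^i, smallest first.
    The i-th merge touches O(2^i) elements, so the whole
    procedure is O(N) in the total length N."""
    acc = []
    for a in sorted(arrays, key=len):
        acc = merge_two(acc, a)
    return acc

# heapq.merge is the general-purpose lazy O(N log k) alternative
arrays = [[3, 7], [1, 4, 6, 9], [0, 2, 5, 8, 10, 11, 12, 13]]
merged = merge_doubling(arrays)
```

Since any merge must look at all $N$ elements, $O(N)$ cannot be beaten asymptotically; the doubling sizes are exactly what makes the naive smallest-first sequential merge optimal here.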

### StackOverflow

#### Predict function of neuralnet gives odd results

I am relatively new to both machine learning techniques and programming in R, and at the moment I am trying to fit a neural network to some data that I have. However, the resulting predictions of the neural network do not make sense to me. I have looked through StackOverflow but could not find a solution to this problem.

My data (this is a part of the test set, the training set is of the same format)

    target monday tuesday wednesday thursday friday saturday independent
428    277      1       0         0        0      0        0        3317
429    204      0       1         0        0      0        0        1942
430    309      0       0         1        0      0        0        2346
431    487      0       0         0        1      0        0        2394
432    289      0       0         0        0      1        0        2023
433    411      0       0         0        0      0        1        1886
434    182      0       0         0        0      0        0        1750
435    296      1       0         0        0      0        0        1749
436    212      0       1         0        0      0        0        1810
437    308      0       0         1        0      0        0        2021
438    378      0       0         0        1      0        0        2494
439    329      0       0         0        0      1        0        2110
440    349      0       0         0        0      0        1        1933


My code

resultsnn <- neuralnet(target~monday+tuesday+wednesday+thursday+friday+saturday+independent,data=training,hidden=3,threshold=0.01,linear.output = TRUE)
compute(resultsnn,test[,2:8])$net.result

My results (the predicted value is the same for ALL test set cases):

            [,1]
428 508.4962231
429 508.4962231
430 508.4962231
431 508.4962231
432 508.4962231
433 508.4962231
434 508.4962231
435 508.4962231
436 508.4962231
437 508.4962231
438 508.4962231
439 508.4962231
440 508.4962231

What else have I tried? I have tried versions without the dummies (only including the independent variable); this does not change the type of results. I have also created some synthetic data and used it as input for the same code, and this does work properly:

#building training set
input_train <- as.data.frame(c(1:100))
output_train <- as.data.frame(c(sqrt((1:100)+1)))
train <- cbind.data.frame(output_train,input_train)
colnames(train) <- c("output","input")
#building test set
input_test <- as.data.frame(c(101:150))
output_test <- as.data.frame(c(sqrt((101:150)+1)))
test <- cbind.data.frame(output_test,input_test)
colnames(test) <- c("output","input")
#NEURALNET PACKAGE
#neural network 3 neurons
res.train <- neuralnet(output~input,data=train,hidden=3,threshold=0.01) #train nn
compute(res.train,test[,2])$net.result #predict using nn on test set


I have also tried other packages (e.g., nnet and RSNNS), but these packages already fail to provide correct predictions when using the synthetic data.

Some additional information on the data types:

str(test)
'data.frame':  82 obs. of  8 variables:
 $ target     : int  277 204 309 487 289 411 182 296 212 308 ...
 $ monday     : int  1 0 0 0 0 0 0 1 0 0 ...
 $ tuesday    : int  0 1 0 0 0 0 0 0 1 0 ...
 $ wednesday  : int  0 0 1 0 0 0 0 0 0 1 ...
 $ thursday   : int  0 0 0 1 0 0 0 0 0 0 ...
 $ friday     : int  0 0 0 0 1 0 0 0 0 0 ...
 $ saturday   : int  0 0 0 0 0 1 0 0 0 0 ...
 $ independent: int  3317 1942 2346 2394 2023 1886 1750 1749 1810 2021 ...

str(training)
'data.frame':  397 obs. of  8 variables:
 $ target     : int  1079 1164 1069 1038 629 412 873 790 904 898 ...
 $ monday     : int  0 0 0 0 0 0 1 0 0 0 ...
 $ tuesday    : int  1 0 0 0 0 0 0 1 0 0 ...
 $ wednesday  : int  0 1 0 0 0 0 0 0 1 0 ...
 $ thursday   : int  0 0 1 0 0 0 0 0 0 1 ...
 $ friday     : int  0 0 0 1 0 0 0 0 0 0 ...
 $ saturday   : int  0 0 0 0 1 0 0 0 0 0 ...
 $ independent: int  2249 2381 4185 2899 2387 2145 2933 2617 2378 3569 ...


Please let me know if you need any additional information! Thanks for the help guys (:

### CompsciOverflow

#### How to implement the regret matching algorithm?

My question is the following: How to calculate the regret in practice?

I am trying to implement the regret matching algorithm but I do not understand how to do it.

• First, I have $n$ players with the joint action space $\mathcal{A}=\{a_0, a_1,\cdots,a_m\}^n.$
• Then, I fix some period $T$. The action profile $A^t\in\mathcal{A}$ is the profile of actions chosen by the players at time $t$. After the period $T$ (every player has chosen an action at each step), I get the utilities $u_i(A^t)$.
• Now the regret of player $i$ for not playing action $a_i$ in the past is (here $A^t\oplus a_i$ denotes the action profile obtained if player $i$ changed its action from $A^t_i$ to $a_i$): $$\max\limits_{a_i\in A_i}\left\{\dfrac{1}{T}\sum_{t\leqslant T}\left(u_i(A^t\oplus a_i )-u_i(A^t)\right)\right\}.$$ I do not understand how to calculate this summation. Why is there a max over the actions $a_i\in A_i$? Should I calculate the regret of every action in $A_i$ and take the maximum? Also, in Hart's paper, the maximum is $\max\{R, 0\}$. Why is there such a difference?

I mean, if the regret were $\dfrac{1}{T}\sum_{t\leqslant T}\left(u_i(A^t\oplus a_i )-u_i(A^t)\right),$ the calculation would be easy for me.

The regret is defined in the following two papers [1] (see page 4, equation (2.1c)) and [2] (see page 3, section I, subsection B).

1. A simple adaptive procedure leading to correlated equilibrium by S. Hart et al (2000)
2. Distributed algorithms for approximating wireless network capacity by Michael Dinitz (2010)

I would like to get some help from you. Any step-by-step suggestions on how to implement such an algorithm?
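To make the pieces concrete, here is a small sketch (the matching-pennies payoff matrix is a made-up example). You compute the inner average once per action $a_i \in A_i$; the max over actions is only a summary statistic of how much regret the player has overall, while the matching rule itself uses the positive part $\max\{R_a, 0\}$ of each action's regret to form the next mixed strategy, which is why Hart's paper writes it differently:

```python
# Sketch of the regret computation from Hart & Mas-Colell's procedure.
# utils[a][b] = payoff to us for playing a when the opponent plays b;
# this matching-pennies matrix is a hypothetical example.
utils = [[1, -1], [-1, 1]]

def regrets(my_history, opp_history, utils):
    """Average regret for each action a: the inner sum of the formula,
    computed once for every a in A_i."""
    T = len(my_history)
    R = []
    for a in range(len(utils)):
        diff = sum(utils[a][b] - utils[j][b]
                   for j, b in zip(my_history, opp_history))
        R.append(diff / T)
    return R

def strategy(R):
    """Next mixed strategy: proportional to positive regrets max(R_a, 0),
    uniform if no regret is positive."""
    pos = [max(r, 0.0) for r in R]
    s = sum(pos)
    return [p / s for p in pos] if s > 0 else [1.0 / len(R)] * len(R)

# we always played action 0 while the opponent always played 1,
# so all the regret falls on action 1
R = regrets([0, 0, 0], [1, 1, 1], utils)
```

Here `R == [0.0, 2.0]`: the overall regret (the max) is 2, and the next strategy plays action 1 with probability 1.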

### StackOverflow

#### Recur not at tail position

How can I use something similar to recur not at a tail position?

Take a look at my code:

(defn -main [& args]

(println "Hi! Type a file name...")

(let [rdr (reader fileName)]
(if-not (.exists rdr)
((println "Sorry, this file doesn't exists. Type a valid file name...")
(recur)))
(defn list '())
(doseq [line (line-seq rdr)]
(if-not (= "" line)
(concat list '(line)))
(list))))

...
...)


I know I can't use recur here... but I don't know how else to do it in Clojure.

I'm a newbie in Clojure and I'm coming from an OOP context. So...

Is there a way to use recursion in this case? What would be an alternative?

### StackOverflow

#### Loading a saved Spark MLlib model from a Java app

I'm training a model in Spark with MLlib and saving it:

val model = SVMWithSGD.train(training, numIterations)

model.save(sc, "~/model")


but I'm having trouble loading it from a java app without spark to make real time predictions.

SparkConf sconf = new SparkConf().setAppName("Application").setMaster("local");
SparkContext sc = new SparkContext(sconf);
SVMModel model = SVMModel.load(sc, "/model");


I'm getting:

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/SparkConf

### StackOverflow

#### Writing a neural network using python + numpy

I am trying to program a neural network without an ML library for a class. The graph of my cost function decreases smoothly when training the network with a small sample (~50). However, when I use a larger sample (100+), the graph, instead of decreasing, looks more like a cardiogram. Can someone point out what I am doing wrong? I have tried different learning rates, but nothing is working. Here is my Python code:

import numpy as np
import matplotlib.pyplot as plt

n_samples = 500
n_features = 10
n_outputs = 4
n_hidden0 = 10
n_hidden1 = 11

#Genereting training data
x = np.random.normal(size=(n_samples, n_features))
y = np.random.multinomial(1, [1.0/n_outputs]*n_outputs, size=n_samples).astype(np.float64)

#Generating the weights
w0 = np.random.uniform(-0.1, +0.1, size=(n_hidden0, x.shape[1]))
b0 = np.zeros((1, n_hidden0))

w1 = np.random.uniform(-0.1, +0.1, size=(n_hidden1, n_hidden0))
b1 = np.zeros((1, n_hidden1))

w2 = np.random.uniform(-0.1, +0.1, size=(y.shape[1], n_hidden1))
b2 = np.zeros((1, y.shape[1]))

def softmax(z):
    m = z.max(1)[:, None]  # improves numerical stability of softmax
    e = np.exp(z - m)
    return e / e.sum(1)[:, None]

def softmax_loss(w0, b0, w1, b1, w2, b2, x, y):
    _, _, _, out = forward_pass(w0, b0, w1, b1, w2, b2, x)
    err = -np.sum(y * np.log(out + 1e-16)) #/ x.shape[0]
    return err

def forward_pass(w0, b0, w1, b1, w2, b2, x):
    z = np.dot(x, w0.T) + b0
    h0 = np.tanh(z)

    z = np.dot(h0, w1.T) + b1
    h1 = np.tanh(z)

    z = np.dot(h1, w2.T) + b2
    o = softmax(z)
    return [x, h0, h1, o]

def backward_pass(activations, w0, w1, w2, y):
    x, h0, h1, o = activations
    delta = o - y
    dw2 = np.dot(delta.T, h1)
    db2 = delta.sum(0)

    delta = np.dot(delta, w2)
    delta *= (1 - h1*h1)
    dw1 = np.dot(delta.T, h0)
    db1 = delta.sum(0)

    delta = np.dot(delta, w1)
    delta *= (1 - h0*h0)
    dw0 = np.dot(delta.T, x)
    db0 = delta.sum(0)
    return [dw0, db0, dw1, db1, dw2, db2]

#updating the weights
eta = 0.03
epochs = 200
softmax_cost = []
for i in range(epochs):
    activations = forward_pass(w0, b0, w1, b1, w2, b2, x)
    dw0, db0, dw1, db1, dw2, db2 = backward_pass(activations, w0, w1, w2, y)

    w0 -= eta * dw0
    b0 -= eta * db0

    w1 -= eta * dw1
    b1 -= eta * db1

    w2 -= eta * dw2
    b2 -= eta * db2

    cost = softmax_loss(w0, b0, w1, b1, w2, b2, x, y)
    softmax_cost.append(cost)

plt.plot(softmax_cost)
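One thing worth checking (a common cause of this kind of oscillation, though not the only possible one): the gradients above are summed over all samples, so the effective step size grows with the sample size, which is exactly what the commented-out `/ x.shape[0]` in `softmax_loss` hints at. A tiny self-contained softmax-regression demo with the batch-averaged update:

```python
import numpy as np

# tiny softmax-regression demo: dividing the gradient by the batch size
# keeps the effective learning rate independent of how many samples we use
rng = np.random.RandomState(0)
n, d, k = 200, 5, 3
x = rng.normal(size=(n, d))
labels = np.argmax(x[:, :k], axis=1)   # made-up labels that depend on x
y = np.eye(k)[labels]                  # one-hot targets
w = np.zeros((d, k))

def softmax(z):
    e = np.exp(z - z.max(1, keepdims=True))
    return e / e.sum(1, keepdims=True)

losses = []
for _ in range(100):
    p = softmax(x.dot(w))
    losses.append(-np.mean(np.sum(y * np.log(p + 1e-16), axis=1)))
    grad = x.T.dot(p - y) / n          # <-- averaged, not summed
    w -= 0.5 * grad
```

With the summed gradient, the same learning rate that works at n=50 can overshoot at n=500, producing exactly the cardiogram-like loss curve described; averaging (or shrinking eta in proportion to n) is the usual remedy.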