# Planet Primates

## September 05, 2015

### TheoryOverflow

#### Space requirements for solving True Quantified Boolean Formulas problem

I came across this section on the Wikipedia page for the TQBF problem, and just can't wrap my head around the fact that the space requirement is linear. Moreover, the article does not provide any reference.

For a given QBF $$Q_1x_1Q_2x_2...Q_nx_n\phi(x_1,x_2,...,x_n)$$

for each quantifier, the algorithm presented evaluates

$$A = Q_2x_2...Q_nx_n\phi(0,x_2,...,x_n)$$ $$B = Q_2x_2...Q_nx_n\phi(1,x_2,...,x_n)$$

and returns $A\wedge B$ for $Q_1 = \forall$ and $A\lor B$ for $Q_1 = \exists$. The way I work it out, the space should be $O(2^n)$ instead of $O(n + \log(n)) = O(n)$. When I solve a QBF on paper, I realize it is not exponential, so something must be terribly wrong with my analysis.

At each step, the new $A$ and $B$ have one quantifier less, but there are twice as many expressions. I assumed that would build up as $$1: n$$ $$2 : (n-1) + (n-1)$$ $$3 : (n-2) + (n-2) + (n-2) + (n-2)$$ $$\ldots$$ $$k : 2^k (n-k)$$

which goes all the way up to $2^{n-1}$ before decreasing to zero. Can anyone provide a pointer on what I am doing wrong, or even better a good reference?
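For concreteness, here is a sketch (in Python, my own transcription, not from the Wikipedia article) of the algorithm described above. The key point for the space analysis is that $B$ is evaluated only after the evaluation of $A$ has finished and its space has been reclaimed, so at any moment only one root-to-leaf chain of at most $n$ stack frames is live, even though $2^n$ subproblems are visited over time.

```python
def eval_qbf(quantifiers, phi, assignment=()):
    """Evaluate Q1 x1 ... Qn xn . phi recursively.

    phi is a function taking a tuple of n booleans.  The two recursive
    calls run one after the other, so at any moment only one chain of
    at most n stack frames is alive: O(n) space, not O(2^n).
    """
    if not quantifiers:
        return phi(assignment)
    q, rest = quantifiers[0], quantifiers[1:]
    a = eval_qbf(rest, phi, assignment + (False,))  # A: current variable = 0
    b = eval_qbf(rest, phi, assignment + (True,))   # B: current variable = 1
    return (a and b) if q == 'forall' else (a or b)

# Example: forall x exists y. x != y  -- true
print(eval_qbf(['forall', 'exists'], lambda v: v[0] != v[1]))  # True
```

The time is still $O(2^n)$; only the space collapses, because the $2^k$ subproblems at depth $k$ are never held simultaneously.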

Thanks

### CompsciOverflow

#### In a DFA, does every state have a transition on every symbol of the alphabet?

If not, then what does it mean when for some state $q$ and some symbol $a$, $\delta(q, a)$ does not exist?
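When $\delta(q, a)$ is left undefined, the automaton is a partial DFA; the standard fix is to complete it by routing every missing transition to a fresh dead (trap) state that rejects everything. A minimal sketch (names illustrative):

```python
# A partial DFA over {'a','b'}: delta is a dict; missing entries mean
# "no transition".  Completing it routes every missing (state, symbol)
# pair to a trap (dead) state -- the standard construction.
def complete(states, alphabet, delta):
    total = dict(delta)
    trap = object()  # fresh dead state, distinct from all others
    for q in list(states) + [trap]:
        for a in alphabet:
            total.setdefault((q, a), trap)
    return total, trap

delta = {('q0', 'a'): 'q1'}          # delta('q0','b') undefined
total, trap = complete({'q0', 'q1'}, {'a', 'b'}, delta)
print(total[('q0', 'b')] is trap)    # True: undefined moves now reject
```

The completed automaton accepts exactly the same language, which is why many textbooks allow partial transition functions without loss of generality.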

### StackOverflow

#### How to avoid the use of mutable datastructure and use more functional approach?

I'm building a database in Scheme on top of the WiredTiger key/value store.

To query a given table one needs a cursor over that table. The library recommends re-using cursors. The general behavior can be described by the following pseudo-code:

    with db.cursor() as cursor:
        cursor.get(key)
        ...
        do_something(db)
        ...


During the extent of the with statement, the cursor can only be used in the current context. If do_something(db) needs a cursor, it must create or retrieve another cursor, even if it's to query the same table. Otherwise the cursor loses its position, and the rest of do_something(db) finds the cursor in a state it doesn't expect.

You can work around this by always resetting the cursor, but that's wasteful. Instead, it's preferable to keep a set of cursors ready to be used: a request via db.cursor() removes a cursor from the available set and returns it, and once the "context" operation is finished, the cursor is put back.

The way I solve this in Python is by using a list. db.cursor() looks like:

    @contextmanager  # from contextlib
    def cursor(self):
        cursor = self.cursors.pop()
        try:
            yield cursor
        finally:
            self.cursors.append(cursor)


Which means: retrieve a cursor, hand it to the current context, and once the context is finished, put it back on the list of available cursors.

How can I avoid the mutation and use a more functional approach?
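One functional alternative, sketched in Python rather than Scheme (names are illustrative): treat the pool itself as a value and thread it through explicitly, so acquiring and releasing build new pools instead of mutating one.

```python
# The pool is an immutable tuple of cursors.  acquire returns
# (cursor, smaller_pool); release returns the grown pool.  Nothing
# is mutated -- callers thread the pool value through their code.
def acquire(pool):
    cursor, rest = pool[0], pool[1:]
    return cursor, rest

def release(pool, cursor):
    return (cursor,) + pool

def with_cursor(pool, body):
    cursor, rest = acquire(pool)
    result = body(cursor)            # run the "context" with the cursor
    return result, release(rest, cursor)

pool = ('c1', 'c2')
result, pool = with_cursor(pool, lambda cur: 'used ' + cur)
print(result, pool)  # used c1 ('c1', 'c2')
```

The cost is that every function touching the database now takes and returns a pool value; in Scheme the same idea is usually expressed by passing the pool as an extra argument and returning it alongside the result.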

### TheoryOverflow

#### Is algebraic dependency decidable?

A set of numbers $S=\{x_1,...,x_n\}$ is said to be algebraically dependent if there exists a nonzero (multivariate) polynomial $p$ with coefficients in $\mathbb Q$ such that $p(x_1,x_2,...,x_n)=0$ (or such a vanishing relation holds for some subset of $S$).

Given numbers $S=\{x_1,...,x_n\}$, is it decidable whether they are algebraically dependent?

Now, there is obviously an encoding issue here, since the numbers need to be transcendental, so describing them is problematic. Any solution would do, but specifically, I am interested in the case where all the numbers are of the form $e^{a_i}$, where $a_i$ is rational (so we can describe the numbers by $a_i$).

The motivation for this question is understanding whether a (positive) solution to Schanuel's conjecture would automatically lead to a constructive solution.
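As a concrete instance of the special case above (a worked example of my own, not from the question): take $a_1 = 1/2$ and $a_2 = 1/3$, so $x_1 = e^{1/2}$ and $x_2 = e^{1/3}$. The dependence is witnessed by an explicit polynomial:

$$p(u,v) = u^2 - v^3, \qquad p\big(e^{1/2},\, e^{1/3}\big) = e - e = 0,$$

so $\{e^{1/2}, e^{1/3}\}$ is algebraically dependent; clearing denominators of the rational exponents produces such a relation for any pair $e^{a_1}, e^{a_2}$ with $a_i \in \mathbb Q$.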

### CompsciOverflow

#### Can an algorithm that yields $O(n^2)$ answers run in $O(n)$ time?

My question may actually be more broadly described as: can I use the fact that an algorithm is expected to return $O(f(n))$ answers to show that it may never run better than $O(f(n))$? I would intuitively assume yes, but I am not sure.

Examples of what I'm talking about are algorithms that calculate all the paths passing through a set of points. I can easily calculate an upper bound $O(f(n))$ on how many such paths there are. Will that tell me the algorithm must be no better than $\Omega(f(n))$?
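A simpler instance of the same reasoning (my own illustration): listing all unordered pairs of $n$ items produces $n(n-1)/2$ answers, so any algorithm that writes out the full list must perform at least that many output steps, giving an $\Omega(n^2)$ lower bound from output size alone.

```python
# Output-size lower bound, illustrated: the answer list itself has
# n*(n-1)/2 entries, so merely emitting it costs that many steps.
def all_pairs(items):
    return [(items[i], items[j])
            for i in range(len(items))
            for j in range(i + 1, len(items))]

n = 10
assert len(all_pairs(range(n))) == n * (n - 1) // 2  # 45 answers
```

Note the caveat: this argument bounds algorithms that *emit* all answers; an algorithm that only counts them, or returns them implicitly, is not bound by it.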

Thanks

### Fred Wilson

#### Video Of The Week: A Discussion Of Accelerators

The economist Ian Hathaway did a public video hangout with Brad Feld this week and posted it on YouTube. They cover a lot of interesting territory in and around the startup accelerator phenomenon.

### TheoryOverflow

#### What is the status of intermediate problems if P is not NP in the worst way imaginable?

Assume $P\neq BPP\neq NP$, with the caveat that there is a deterministic algorithm for every $NP$-complete problem with input size $n$ bits running in $2^{(\log n)^{1+f(n)}}$ arithmetic operations on $\log n$-sized words, where $f(n)$ grows slower than any definable function.

Where does the complexity of intermediate problems stand in this case, and where does the utility of randomness stand?

What is the scenario if $P= BPP\neq NP$ or $P\neq BPP=NP$ holds?

### DragonFly BSD Digest

#### In Other BSDs for 2015/09/05

Will I need to add a NextBSD tag?  Time will tell.

### QuantOverflow

#### Exercise 2.2 from the book "The concept and practice of Mathematical Finance"

I am a newbie. Please help me understand how to solve Exercise 2.2 from the book "The Concept and Practice of Mathematical Finance". The solution from the book says that our super-replicating portfolio will be $\alpha$ shares and $\beta$ bonds. It must dominate at zero; this implies that $\beta \ge 0$. First of all, what does "it must dominate at zero" mean? Secondly, why, if it dominates at zero, is $\beta \ge 0$? Thanks so much for your help!
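A hedged reading of the phrase (my interpretation, assuming the exercise concerns super-replicating a call with payoff $(S_T-K)^+$): "dominate" means the portfolio's terminal value must be at least the option payoff in every state, and "at zero" singles out the state $S_T = 0$. In that state the shares are worthless, so only the bonds contribute:

$$\alpha S_T + \beta \;\ge\; (S_T-K)^+ \quad\text{at } S_T=0 \quad\Longrightarrow\quad \alpha\cdot 0 + \beta \;\ge\; (0-K)^+ = 0.$$

So domination in the zero state alone already forces $\beta \ge 0$.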


### UnixOverflow

#### How to configure which user fcgiwrap runs as on FreeBSD?

I have Redmine/Git/nginx/fcgiwrap running in a jail on FreeBSD 9.3 for (potentially) authenticated Git commits over HTTP/S. Everything works until I restart the jail.

In order for a commit to work I need to manually change /var/run/fcgiwrap/fcgiwrap.sock from srwxr-xr-x root:wheel to srwxrwxr-x root:www.

It seems like there should be a better way to do this so that it's persistent across reboots. My feeling is that there should be some way of telling fcgiwrap which user to run as, but I can't work out where this is specified on FreeBSD.

The man page says:

Most probably you will want to launch fcgiwrap by spawn-fcgi using a configuration like this:

    FCGI_SOCKET=/var/run/fcgiwrap.sock
    FCGI_PROGRAM=/usr/sbin/fcgiwrap
    FCGI_USER=nginx
    FCGI_GROUP=www
    FCGI_EXTRA_OPTIONS="-M 0700"
    ALLOWED_ENV="PATH"


Based on this question I have looked in /usr/local/etc/rc.d for spawn-fcgi, but it's not there, which I assume means it's not installed. It also seems overkill to install spawn-fcgi just to manage which user fcgiwrap runs as.

I've found in /usr/local/etc/rc.d/fcgiwrap it says:

    # fcgiwrap rc.d script supports multiple profiles (a-la rc.d/nginx)
    # When profiles are specified, the non-profile specific parameters become defaults.
    # You need to make sure that no two profiles have the same socket parameter.

What is a profile and how would I go about creating one for this rc.d script? Or am I going about this all the wrong way?
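As a sketch of what a profile might look like (hypothetical values; the exact variable names are defined by the rc script itself, so verify them against the header of /usr/local/etc/rc.d/fcgiwrap before relying on these):

```shell
# /etc/rc.conf -- illustrative fcgiwrap profile.  Names follow the usual
# FreeBSD "profiles" convention (fcgiwrap_<profile>_<option>), as hinted
# by the rc script comment quoted above; treat them as assumptions.
fcgiwrap_enable="YES"
fcgiwrap_profiles="redmine"
fcgiwrap_redmine_socket="unix:/var/run/fcgiwrap/fcgiwrap.sock"
fcgiwrap_redmine_user="www"
```

With a per-profile user of www, the socket should be created with an owner that nginx can talk to, removing the need to chmod it by hand after each reboot.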

### QuantOverflow

#### Why QuantLib computes the fixed-leg swap rate by this formula?

I'm trying to understand how QuantLib creates (bootstraps) a yield curve from a vanilla swap at the source level. I have the following test code:

    void testYieldFromIRSwap()
    {
        Settings::instance().evaluationDate() = Date(1, Jan, 2015);

        auto dc = Actual360();

        auto h1 = boost::shared_ptr<RateHelper>(new DepositRateHelper(
            0.03, Period(1, Years), 0.0, UnitedStates(), Following, false, Actual360()));
        auto h2 = boost::shared_ptr<RateHelper>(new DepositRateHelper(
            0.04, Period(2, Years), 0.0, UnitedStates(), Following, false, Actual360()));

        auto index = boost::shared_ptr<IborIndex>(new EURLibor1Y());
        auto h3 = boost::shared_ptr<RateHelper>(new SwapRateHelper(
            0.05, Period(3, Years), UnitedStates(), Annual,
            Following, Actual360(), index));

        std::vector<boost::shared_ptr<RateHelper>> helpers;
        helpers.push_back(h1);
        helpers.push_back(h2);
        helpers.push_back(h3);

        boost::shared_ptr<YieldTermStructure> yield(
            new PiecewiseYieldCurve<Discount, LogLinear>(
                Settings::instance().evaluationDate(), helpers,
                Actual360()));

        const auto t1 = dc.yearFraction(Date(1, Jan, 2015), Date(1, Jan, 2016)); // 1.014
        const auto t2 = dc.yearFraction(Date(1, Jan, 2015), Date(1, Jan, 2017)); // 2.031
        const auto t3 = dc.yearFraction(Date(1, Jan, 2015), Date(1, Jan, 2018)); // 3.044

        std::cout << yield->discount(0)  << std::endl; // Must be 1
        std::cout << yield->discount(t1) << std::endl; // 1/((1+0.03) ^ 1.014) = 0.9704721
        std::cout << yield->discount(t2) << std::endl; // 1/(1+0.04) ^ 2.031   = 0.9234328
        std::cout << yield->discount(t3) << std::endl;
    }


This is obviously a toy example. I have two deposits that give the discount rates for the first two years. My swap is an annual 3y swap; I expect the swap to give me the discount rate for the third year. I do get the value, but I don't understand how it's computed.

The code for bootstrapping a discount factor from a swap is:

    Real SwapRateHelper::impliedQuote() const {
        QL_REQUIRE(termStructure_ != 0, "term structure not set");
        // we didn't register as observers - force calculation
        swap_->recalculate();
        // weak implementation... to be improved
        static const Spread basisPoint = 1.0e-4;
        Real floatingLegNPV = swap_->floatingLegNPV();
        Spread spread = spread_.empty() ? 0.0 : spread_->value();
        Real spreadNPV = swap_->floatingLegBPS()/basisPoint*spread;
        Real totNPV = - (floatingLegNPV+spreadNPV);
        Real result = totNPV/(swap_->fixedLegBPS()/basisPoint);
        return result;
    }


Every time QuantLib guesses a new discount factor (i.e., while iteratively bootstrapping), this function computes a new NPV for the floating leg. I'd expect this NPV to be set equal to the NPV of the fixed leg, from which a new swap rate can be computed.

What I don't understand is this particular line:

    Real result = totNPV/(swap_->fixedLegBPS()/basisPoint);


I think BPS is the sum of the accrued amounts over the coupons, computed by:

    bps += cp->nominal() * cp->accrualPeriod() * df;


Question:

Why do we need to divide the NPV by BPS and basisPoint? Why is basisPoint set to 1.0e-4 (1 basis point)? What's the point of doing it?
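A hedged explanation (standard swap algebra, my own reading of the code rather than anything from the QuantLib docs): fixedLegBPS is the change in the fixed leg's NPV when the fixed rate moves by one basis point, so for coupons with nominal $N$, accrual fractions $\tau_i$, and discount factors $P(0,t_i)$,

$$\text{fixedLegBPS} = 10^{-4}\sum_i N\,\tau_i\,P(0,t_i), \qquad \frac{\text{fixedLegBPS}}{10^{-4}} = \sum_i N\,\tau_i\,P(0,t_i) \;=\; A,$$

the annuity $A$, i.e. the value of receiving one unit of fixed rate on the fixed leg's schedule. A par swap satisfies $R\cdot A = \text{floatingLegNPV}$, so the implied quote is $R = \text{floatingLegNPV}/A = \text{totNPV}\,/\,(\text{fixedLegBPS}/\text{basisPoint})$, up to QuantLib's sign conventions for the payer/receiver legs. Dividing by basisPoint simply undoes the $10^{-4}$ scaling baked into the BPS number.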

### CompsciOverflow

#### Expressing iterative version of fold in terms of recursive version

1st Definition. Recursive definition of fold

fold$_{recur}$ (c,h) nil = c

fold$_{recur}$ (c,h) (cons(a, x)) = h(a, fold$_{recur}$ (c,h) x)

2nd definition of fold. Iterative definition

fold$_{itr}$ (c, h) nil = c

fold$_{itr}$ (c, h) (cons (a, x) ) = fold$_{itr}$ (h (c, a), h) x

How do I express fold$_{itr}$ in terms of fold$_{recur}$?
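For reference, here are the two definitions above transcribed directly into Python (my own transcription; Python lists stand in for cons cells, with xs[0] the head and xs[1:] the tail). On a non-commutative operator they give different results, which shows the two folds really differ:

```python
def fold_recur(c, h, xs):
    # fold_recur (c,h) (cons(a,x)) = h(a, fold_recur (c,h) x)
    if not xs:
        return c
    return h(xs[0], fold_recur(c, h, xs[1:]))

def fold_itr(c, h, xs):
    # fold_itr (c,h) (cons(a,x)) = fold_itr (h(c,a), h) x
    if not xs:
        return c
    return fold_itr(h(c, xs[0]), h, xs[1:])

import operator
print(fold_recur(0, operator.add, [1, 2, 3]))  # 6
print(fold_itr(0, operator.add, [1, 2, 3]))    # 6
print(fold_recur(0, operator.sub, [1, 2, 3]))  # 1-(2-(3-0)) = 2
print(fold_itr(0, operator.sub, [1, 2, 3]))    # ((0-1)-2)-3 = -6
```

Note the argument orders also differ: fold$_{recur}$ applies $h(a, \text{rec})$ while fold$_{itr}$ applies $h(c, a)$, exactly as in the definitions above.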

### QuantOverflow

#### convert three months interbank rate into monthly rate

I have a monthly time series of the three-month interbank rate, and I assume the rate is quoted on an annualized basis.

I need these interbank rates on a monthly basis because I want to use them as a proxy for the risk-free rate, so I can subtract them from my monthly returns to get excess returns.

Let's consider an example:

Assume the monthly return of stock x in April is 6%, and the interbank rate at the end of April is 3% annually. Can I simply subtract 0.25% (3%/12) from 6%, resulting in an excess return of 5.75%?

Any help would be much appreciated.
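The arithmetic from the example above, with both common de-annualization conventions shown side by side (my own illustration; for rates this small the two nearly coincide):

```python
# Turn an annualized 3% into a monthly figure two ways, then compute
# the excess return from the example above.
annual = 0.03
simple_monthly = annual / 12                     # 0.25%, the simple pro-rata convention
compound_monthly = (1 + annual) ** (1 / 12) - 1  # ~0.2466%, geometric de-compounding

monthly_return = 0.06
print(monthly_return - simple_monthly)    # 0.0575, i.e. 5.75% as in the question
print(monthly_return - compound_monthly)
```

Which convention is right depends on how the 3% is quoted (simple vs. compounded); for excess-return regressions the difference of a few tenths of a basis point rarely matters.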

### /r/emacs

#### xah-fly-keys, an efficient keyset for emacs

xah-fly-keys is now on MELPA. http://melpa.org/#/xah-fly-keys

xah-fly-keys is an efficient keyset for emacs. It is a modal mode, like vi, but key choices are based on statistics of command call frequency and an easy-to-press score for each key position.

It does not bind any Control or Meta keys (except three, which you can turn off).

Learning xah-fly-keys is like learning vi for the first time: everything is new. You'll need about a month to adapt. (But you can keep using emacs as-is in the meantime, because no Control or Meta bindings are taken; just leave it in insertion mode.)

xah-fly-keys is currently optimized for the Dvorak layout only. If you touch-type QWERTY or another layout, you will need to rebind keys.

Give it a shot. Bugs can be reported on GitHub; questions can go in the comments on the home page, or here. Thanks. Edit: corrected a word typo.

submitted by xah

### /r/scala

#### null safety

I've never tried Kotlin and I don't think I will because it doesn't have a type system interesting enough for me, but its idea of null safety might be worth exploring.

The only problem I can see is that the compiler must be smart enough to know when a pointer is not null and can be dereferenced. Safety requires that the compiler be conservative: "When in doubt, give a compilation error!".

Option[T] is an imperfect solution because it doesn't enforce safety. If one forgets to wrap a reference in Option, or mistakenly assumes that a function can't return null and therefore doesn't use Option, an NPE may be thrown at runtime.

Do you think it'd be useful to have something similar to Kotlin's null safety feature in Scala or not?

submitted by Kiuhnm

### Planet Theory

#### TR15-145 | Arithmetic circuit classes over Zm | Eric Allender, Asa Goodwillie

We continue the study of the complexity classes VP(Zm) and ΛP(Zm) which was initiated in [AGM15]. We distinguish between "strict" and "lax" versions of these classes and prove some new equalities and inclusions between these arithmetic circuit classes and various subclasses of ACC^1.

### Fefe

#### Old and busted: John McAfee's location leaked ...

Old and busted: John McAfee's location leaked via metadata in photos from Vice.

#### Have you heard it too? We're taking in sooo many refugees ...

Have you heard it too? We're taking in sooo many refugees? It's totally overloading our infrastructure and resources? Really?

Well then, take a look at Jordan.

One and a half million refugees in a country with around six million inhabitants; for Germany, proportionally, that would mean taking in twenty million refugees. And Jordan is a resource-poor developing country: its per-capita income is only a quarter of Germany's.

That there have been no riots here, or even protests, borders on a miracle.

### StackOverflow

#### Functional programming and dependency inversion: how to abstract storage?

I'm trying to create a solution that has a lower-level library that will know that it needs to save and load data when certain commands are called, but the implementation of the save and load functions will be provided in a platform-specific project which references the lower-level library.

I have some models, such as:

    type User = { UserID: UserID
                  Situations: SituationID list }

    type Situation = { SituationID: SituationID }


And what I want to do is be able to define and call functions such as:

    do saveUser ()
    let user = loadUser (UserID 57)


Is there any way to define this cleanly in the functional idiom, preferably while avoiding mutable state (which shouldn't be necessary anyway)?

One way to do it might look something like this:

    type IStorage = {
        saveUser: User -> unit
        loadUser: UserID -> User }

    module Storage =
        // initialize save/load functions to "not yet implemented"
        // (wrapped in lambdas so the failwith fires only when called)
        let mutable storage = {
            saveUser = (fun _ -> failwith "nyi")
            loadUser = (fun _ -> failwith "nyi") }

    // ....elsewhere (note <-, assignment, not =, comparison):
    do Storage.storage <- { a real implementation of IStorage }
    do Storage.storage.saveUser ()
    let user = Storage.storage.loadUser (UserID 57)


And there are variations on this, but all the ones I can think of involve some kind of uninitialized state. (In Xamarin, there's also DependencyService, but that is itself a dependency I would like to avoid.)

Is there any way to write code that calls a storage function, which hasn't been implemented yet, and then implement it, WITHOUT using mutable state?

(Note: this question is not about storage itself -- that's just the example I'm using. It's about how to inject functions without using unnecessary mutable state.)
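One mutation-free pattern for this (sketched in Python rather than F#, with illustrative names): instead of writing the implementation into a mutable slot, the lower-level code takes the storage functions as an explicit parameter, and the platform-specific project supplies them at the call site.

```python
# Dependency inversion without mutation: business_logic never sees a
# concrete store, only a record of functions passed in by the caller.
from typing import Callable, NamedTuple

class Storage(NamedTuple):
    save_user: Callable[[dict], None]
    load_user: Callable[[int], dict]

def business_logic(storage: Storage, user_id: int) -> dict:
    user = storage.load_user(user_id)   # lower-level code uses the injected functions
    storage.save_user(user)
    return user

# The platform-specific project supplies the implementation:
memory = {57: {"user_id": 57, "situations": []}}
storage = Storage(save_user=lambda u: memory.__setitem__(u["user_id"], u),
                  load_user=lambda uid: memory[uid])
print(business_logic(storage, 57)["user_id"])  # 57
```

In F# the same shape is a record of functions (or an interface) passed as an argument, or partially applied at composition time, so no field is ever left in an uninitialized "nyi" state.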

### UnixOverflow

#### FreeBSD - No internet connection inside jail

I am currently trying to set up a server on FreeBSD. I want to create jails with ezjail according to the Handbook. I am following the example of installing BIND inside a jail, but I am stuck at the installation step (make -C /usr/ports/dns/bind99 install clean).

At first I thought I had a DNS problem (bad /etc/resolv.conf) but it seems that I have simply no internet inside the jail.

On the host: (8.8.178.110 is www.freebsd.org)

    root@varda:~ # nc -z -w 2 8.8.178.110 80
    Connection to 8.8.178.110 80 port [tcp/http] succeeded!


Inside the jail:

    root@dns:~ # nc -z -w 2 8.8.178.110 80; echo $?
    1

Any idea what is going on?

    root@varda:~ # ezjail-admin list
    STA JID  IP              Hostname                       Root Directory
    --- ---- --------------- ------------------------------ ------------------------
    DR  2    192.168.4.1     www                            /home/jails/www
        2    lo1|127.0.1.1
    DR  3    192.168.3.1     dns                            /home/jails/dns
        3    lo2|127.0.2.1

(www is another jail in which I have the same problem)

    root@varda:~ # ifconfig
    em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC,VLAN_HWTSO>
        ether 00:22:4d:ad:be:2a
        inet ???.???.???.??? netmask 0xffffff00 broadcast ???.???.???.???
        inet6 fe80::222:4dff:fead:be2a%em0 prefixlen 64 scopeid 0x1
        inet6 2001:41d0:a:f231::1 prefixlen 128
        inet 192.168.4.1 netmask 0xffffffff broadcast 192.168.4.1
        inet 192.168.3.1 netmask 0xffffffff broadcast 192.168.3.1
        nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>
        media: Ethernet autoselect (100baseTX <full-duplex>)
        status: active
    lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x2
        inet 127.0.0.1 netmask 0xff000000
        nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>
    lo1: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
        inet 127.0.1.1 netmask 0xffffffff
        nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>
    lo2: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
        inet 127.0.2.1 netmask 0xffffffff
        nd6 options=23<PERFORMNUD,ACCEPT_RTADV,AUTO_LINKLOCAL>

    root@dns:~ # ifconfig
    em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=4219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC,VLAN_HWTSO>
        ether 00:22:4d:ad:be:2a
        inet 192.168.3.1 netmask 0xffffffff broadcast 192.168.3.1
        media: Ethernet autoselect
        (100baseTX <full-duplex>)
        status: active
    lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
    lo1: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
    lo2: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=600003<RXCSUM,TXCSUM,RXCSUM_IPV6,TXCSUM_IPV6>
        inet 127.0.2.1 netmask 0xffffffff

### StackOverflow

#### What is difference between function oriented and procedure oriented? [duplicate]

This question already has an answer here:

I am currently using functional programming (F#). I also know C, which is a procedure-oriented language. I am getting confused between functional programming and procedure-oriented programming: the execution seems the same, except for functions versus procedures. F# treats every statement as a function. Please clarify the difference between functional and procedure-oriented languages.

#### What are the most efficient representations of numbers without native types?

In Haskell, most numeric types are native: Int, Float, Word32, etc. There is also a popular representation of unary natural numbers with ADTs only, the Peano encoding:

    data Nat = Succ Nat | Zero

That datatype, while elegant, is not very efficient. Multiplication, exponentiation, and division with unary numbers are impractical. My question is: if we didn't have native types to count on, what would be the most efficient representation of numbers (nats, ints, fracs, complex, etc.) in a pure functional language such as Haskell? What would the datatypes and the respective algorithms look like?

#### How to compose two different State Monads?

When I learn the State Monad, I'm not sure how to compose two functions with different State return types.
The State Monad definition:

    case class State[S, A](runState: S => (S, A)) {
      def flatMap[B](f: A => State[S, B]): State[S, B] = {
        State(s => {
          val (s1, a) = runState(s)
          val (s2, b) = f(a).runState(s1)
          (s2, b)
        })
      }

      def map[B](f: A => B): State[S, B] = {
        flatMap(a => {
          State(s => (s, f(a)))
        })
      }
    }

Two different State types:

    type AppendBang[A] = State[Int, A]
    type AddOne[A] = State[String, A]

Two methods with different State return types:

    def addOne(n: Int): AddOne[Int] = State(s => (s + ".", n + 1))
    def appendBang(str: String): AppendBang[String] = State(s => (s + 1, str + " !!!"))

Define a function that uses the two functions above:

    def myAction(n: Int) = for {
      a <- addOne(n)
      b <- appendBang(a.toString)
    } yield (a, b)

And I hope to use it like this:

    println(myAction(1))

The problem is that myAction does not compile; it reports an error like this:

    Error:(14, 7) type mismatch;
     found   : state_monad.State[Int,(Int, String)]
     required: state_monad.State[String,?]
          b <- appendBang(a.toString)
               ^

How can I fix it? Do I have to define some monad transformers?

Update: Since the question may not be clear, let me give an example. Say I want to define another function which uses addOne and appendBang internally. Since they both need existing states, I have to pass some in:

    def myAction(n: Int)(addOneState: String, appendBangState: Int): ((String, Int), String) = {
      val (addOneState2, n2) = addOne(n).runState(addOneState)
      val (appendBangState2, n3) = appendBang(n2.toString).runState(appendBangState)
      ((addOneState2, appendBangState2), n3)
    }

I have to run addOne and appendBang one by one, passing and getting the states and results manually.
Although I found it can return another State, the code is not improved much:

    def myAction(n: Int): State[(String, Int), String] =
      State { case (addOneState: String, appendBangState: Int) =>
        val (addOneState2, n2) = addOne(n).runState(addOneState)
        val (appendBangState2, n3) = appendBang(n2.toString).runState(appendBangState)
        ((addOneState2, appendBangState2), n3)
      }

Since I'm not quite familiar with these, I'm just wondering whether there is any way to improve it. My best hope is to use a for comprehension, but I'm not sure if that's possible.

### CompsciOverflow

#### How are alphabetic characters programmed into a computer?

I'm no CS student, I'm a programmer. I have a couple of questions and a few assumptions that I will make here (correct me if I'm wrong, please). My understanding is that all the sequences of 1's and 0's that computers execute are just representations of actual data, or instructions that tell other hardware on a system what to do, like telling a graphics card to change a pixel's color on the monitor. 01100001 is the representation of the letter "a", and this sequence of bits was chosen by the fathers of computing/ASCII to represent that letter; it could just as well have been some other sequence, right? All computer systems have agreed that those bits mean the letter "a" so that all of the computers around the globe are interoperable; if there were no standards in place for this, the internet would be a mess.

What I want to know is: where is the information stored on a computer that tells it that these sequences of bits mean/represent the character "a"? Is it in the OS or directly on the motherboard, or am I just completely wrong?

#### Why are PTAS reductions used to establish APX-hardness?

What's the intuition for PTAS reductions instead of approximation-preserving reductions or something of that ilk?
I'd expect APX-hardness to mean something like "finding a constant-factor approximation for this problem is as hard as finding a constant-factor approximation to any problem in APX", but having just a PTAS reduction from A to B isn't strong enough to imply that a constant-factor approximation to B can be converted into a constant-factor approximation of A.

### Lobsters

#### Containers steal the show at VMworld

### CompsciOverflow

#### What are the clear differences in MIPS, PPC and ARM architectures? [on hold]

I would like to know the clear differences between MIPS, PPC and ARM architecture processors. I would also like to know the main focus of each architecture: instruction set, power consumption, processing speed, processor design, and which sectors of industry each is used in.

#### What is the difference (if any) between transition systems and finite automata?

Is there any difference between transition systems and finite automata? Is it that transition systems comprise both NFA (nondeterministic finite automata) and DFA (deterministic finite automata)?

### /r/netsec

#### A Tour of Bootloading: How to interact with the software loading process

### CompsciOverflow

#### Is this language LL(1) parseable?

I tried to find a simple example of a language that is not parseable with an LL(1) parser. I finally found this language:

$$L=\{a^nb^m \mid n,m\in\mathbb N \land n\ge m\}$$

Is my hypothesis true, or is this language parseable with an LL(1) parser? One can use this simple grammar to describe $L$ (of course, this grammar isn't LL(1) parseable):

    S -> ε
    S -> A
    A -> aA
    A -> aAb
    A -> a
    A -> ab

### /r/compsci

#### Need help thinking of modern uses for older memory management systems.

I have a HW problem I'm having trouble with. I need to think of present-day uses for Single User, Fixed, Dynamic and Relocatable Dynamic allocation schemes. But for whatever reason I can't think of any. I understand that Single User only loads one job at a time.
Fixed loads multiple jobs into preset sections of memory and is good if the jobs are all of similar size; it helps if you know all the job sizes in advance. Dynamic loads jobs into memory wherever they fit, but memory can get fragmented after the initial jobs are loaded. Etc. But for the life of me I can't think of uses for them. Does anyone have some reading I could do to get some ideas?

submitted by Leering

### CompsciOverflow

#### Why are computable numbers (in Turing's sense) enumerable?

Why are computable numbers (in Turing's sense) enumerable? It must be very obvious, but I'm currently just not seeing it.

#### FPT algorithm for point line cover

In the "Covering Things with Things" paper by Langerman and Morin, they mention BST-Dim-Set-Cover, an FPT algorithm for point-line cover, on page 666. The algorithm chooses each point p from a set S and then subdivides it into k lines. My question concerns what is written in the paper on page 666 regarding where p is taken from: at some point the text says "Otherwise, the algorithm chooses an element p ∈ S'". However, further down in the algorithm it says "choose p ∈ S" (without the prime symbol). Is p taken from S or S' (S prime)? Thanks!

### QuantOverflow

#### Source of market or security attribute information?

There are many securities and exchanges on platforms like Bloomberg and Quandl, but many securities are described with the relevant pit close and open times, exchanges, related futures, and settlement procedures. Is there a reliable source where people go to find these things out? Kind of like a one-stop shop.

#### Performance of Open Source Time Series Database for Financial Market Data

We would like to store financial tick data in a database (potentially billions of rows) and then create aggregated (open-high-low-close) bar data from it (e.g. 1-min or 5-min bars). It was mentioned to us that a NoSQL or time series database might be a good choice for this.
Can anybody give any advice on which open source product might fit this requirement best? Note: query performance is very important for us. In our research we came across the following products (maybe there are more):

- InfluxDB
- OpenTSDB
- KairosDB
- MonetDB

We did run a test with InfluxDB with around 10 million ticks. Unfortunately, the creation of 1-min bars was 3-5 times slower than with a relational database (i.e. MySQL). We are aware that KDB now offers a free 32-bit version, but unfortunately 32-bit will not be enough for our use case. Any advice is appreciated.

### StackOverflow

#### Read a single character or byte from stdIn without waiting for newline in SML

I encountered a problem while working with the TextIO structure: every input waits for a newline character and for the buffer to be full. How can I work with BinIO and stdIn to solve that problem? Any helpful input is appreciated. BTW: I am using MLton, so there is nothing more than the usual standard libs.

### CompsciOverflow

#### I am having trouble understanding (and implementing) logistic regression for classifying into three classes

(For reference, I am using Kevin P. Murphy's book "Machine Learning: A Probabilistic Perspective" and implementing with MATLAB, without any toolboxes.)

I have a dataset with 392 samples (rows); each sample has 8 features (columns), one of which defines the class (i.e. column 1 of the features is divided into three equal bins which define the three classes: low, medium, and high). I am having a really hard time understanding how to create a logistic regression model to classify a sample into one of these three classes. I just finished learning about and building a linear regression model, where I learned about both the Ordinary Least Squares (closed-form) solution for the weight vector and the Gradient Descent (open-form) solution. But I never implemented gradient descent because my data was fitted perfectly fine with the OLS solution for the weight vector.
I am extremely confused about how to create a weight vector for logistic regression. I understand that it requires gradient descent because there is no closed-form solution. I also read about Newton's method for calculating the weights, but I don't understand it at all. And after you use these methods to calculate weights, how do you apply the weights to the sample data? In linear regression it was simple: you multiplied the weights by the features (and higher-order features for higher-order linear regression); is it the same in logistic regression? Moreover, my understanding so far is that this model only works for binary classification, so how would I do it for three classes? Basically my question boils down to this: how exactly do you find the weight vector for logistic regression (using either gradient descent or Newton's method, whichever is easier), and how do you apply the weight vector to a sample to get a classification out of it for 3 classes (not just two)?

### /r/compsci

#### Computer science: Enchantress of abstraction : Nature : Nature Publishing Group

### /r/netsec

#### Enpublic Apps: Security Threats Using iOS Enterprise and Developer Certificates by Min Zheng, Hui Xue, Yulong Zhang, Tao Wei, and John C.S. Lui [PDF]

### TheoryOverflow

#### LPs with "sparse solutions"

Consider a graph optimization problem on a graph with $n$ vertices and $m$ edges that can be written as an LP (say, bipartite matching). By the duality with vertex cover, we know that there's a sparse dual solution consisting of the elements of the cover. It's sparse in the sense that its size is linear in $n$, rather than in $m$.

Consider the problem of finding the minimum enclosing ball of a set of $n$ points in $d$ dimensions. This is an LP-type problem with (again) a sparse dual solution: a set of $d+1$ points that define the optimal ball.
Here, "sparse" means that the size depends on the (smaller) parameter $d$ rather than $n$.

In general, suppose I have a problem that can be expressed via an LP with $n$ variables and $m$ constraints, where $m \gg n$. Are there classes of such problems for which any optimal solution to the problem can be expressed using the intersection of only $O(n)$ constraints (or, dually, with only $O(n)$ nonzero dual variables)?

### CompsciOverflow

#### Global minimum when noisy estimates of function available

I am interested in knowing about algorithms for finding a global minimum when only noisy estimates of the function are available.

### TheoryOverflow

#### Construction of a graph which has regular subgraphs at each iteration of a recursive process

I am studying Graph Isomorphism and also trying to figure out the complexity of a certain class of graphs. The graph I am studying at the moment is described below.

Description: $G$ is an $r$-regular, $k$-connected graph (not a complete or cycle graph). A vertex of $G$ is $x_1$. All vertices which are not adjacent to $x_1$ form a subgraph $C_1$; all vertices adjacent to $x_1$ form a subgraph $D_1$. A vertex of $D_1$ is $x_2$. By the same method, based on adjacency to $x_2$, $D_1$ can be divided: all vertices not adjacent to $x_2$ form a subgraph $C_2$, and all vertices adjacent to $x_2$ form a subgraph $D_2$. In general, $D_{y-1}$ is a graph and can be divided/partitioned into two subgraphs $C_y, D_y$. There are two restrictions:

1. $C_y, D_y$ are $s_y$- and $t_y$-regular graphs respectively, with $s_y, t_y > 0$ and $s_y \neq t_y$, for every iteration $y$.
2. $C_y, D_y$ cannot be a complete bipartite graph (utility graph), a complete graph, or a disjoint union of complete graphs.

$G$ can be divided/partitioned at most $\log_2(|G|)$ times using this dividing process recursively. If $G$ is a 3-clique-free graph, this condition implies $t_y \leq |D_y|/2$.

Questions: Does there exist such a graph in the current literature as described above (following restriction 1, restriction 2, or both)?
Motivation: An algorithm runs faster than existing results for this kind of graph. Related papers: 1. V. A. Taskinov. Regular subgraphs of regular graphs. Sov. Math. Dokl. 26 (1982), 37-38. In this paper he proved the following: let $0<k<r$ be positive odd integers. Then every $r$-regular graph contains a $k$-regular subgraph (here, the graph need not be simple). 2. N. Alon, S. Friedland, G. Kalai. Regular subgraphs of almost regular graphs. Journal of Combinatorial Theory, Series B 37 (1984), 79-91. Any advice will help. ### Halfbakery #### Better Searching of Mispellings (0.5) ### Lobsters #### Freer monads, more extensible effects ### StackOverflow #### Covariance Not Working Properly in R I'm trying to run the following code in R  X6=matrix(c(1,18.95,2,19,3,17.95,3,15.54,4,14,5,12.95,6,8.94,8,7.49,9,6,11,3.99), 10, 2, byrow=TRUE) (9/10)*cov(X6, method="pearson")  This gives me the matrix  [,1] [,2] [1,] 9.5600 -15.93920 [2,] -15.9392 27.76893  But it should be giving me something close to  [,1] [,2] [1,] 3.09 -15.94 [2,] -15.94 5.27  What's the deal here? ### /r/compsci #### 16. Uk. Want to make a company when im older, need advice. • How do I build a multi-million-pound business? Here's my plan: 1. Go to Oxbridge or any other uni 2. Within 1 to 2 years use knowledge from the degree, e.g. natural language processing, to make an app/program/startup 3. Someeehow get funding 4. ??? Profit Reddit, is this realistic? If not, how can I reach my dream of being, I don't know, the next Mark Zuckerberg or whatever. Is it possible? If not, could you guys give me a sense of direction/advice? P.s. please don't kill my dream, thanks! submitted by Brand248 [link] [1 comment] ### CompsciOverflow #### A bot to clean up a map We're given a robot that's able to sense its immediate surroundings to the north, south, east and west, has space to store its state (0-100), and a room for the robot to clean up, with obstacles in it.
The robot has a fixed set of instructions that tell it which way to go and which state to go to based on the current state and surroundings. I'd like to find a way to completely clean up any room that is one connected space, no matter where the robot starts in the room. It's OK if the algorithm isn't efficient. As soon as the bot walks onto a cell, the cell is cleaned. ### Lobsters #### The mystery of the fifteen-millisecond breakpoint instruction ### /r/compsci #### Need help designing a data structure I have to design a data structure that's essentially a half table, not unlike Unity's collision matrix. I have a set of strings; two strings would get passed to the structure as the indexes of an object. The problem I'm having is that something like a 2D array leaves me with a lot of duplicates (e.g. ["something","nothing"] and ["nothing","something"] would be 2 different objects). I don't want the order of the indexes to matter. Ignoring order is easy with something like integers because I can just multiply them and use a 1D array, since the order wouldn't matter. I know it's obvious with integer indexes because it's so much easier to OR two integers and get the bit flags (just as the Unity collision matrix is really just setting bit flags). I've spent a good couple of days nailing my fingers to the keyboard trying to figure this out and research it, but I quite frankly can't even come up with what the name of such a table would be! So a side question is: does anyone here know what such a table (see the Unity collision matrix) would be called, where the rows and columns use the same indexing, but the column is in the opposite order of the rows so there aren't duplicate crosses? I know I'm probably making very little sense here, I've been up way too many hours (workaholic, love my job). Edit: Thank you /u/program_the_world and /u/asthasr for pointing out where I went wrong in approaching this, I can't believe I didn't think about hashing.
submitted by Destects [link] [16 comments] ### Lobsters #### Happy Birthday Monkey Island ### Fefe #### Mozilla's bug tracker has been hacked, apparently ... Mozilla's bug tracker has been hacked, apparently for several months. Well, you might be thinking: so what, it's all public anyway. No, it is not. Security bugs are not public. And those are exactly what has now been carried out the door. That's what happens when you don't fix security issues immediately but instead think you can sit on them for a few more weeks or months. Merkel-style politics is always crap, in politics as in software projects. Sitting things out doesn't work. ### QuantOverflow #### What is the correlation of stock options? I want to calculate the VaR of two correlated option positions, and I know the correlation between the stock price returns. I want to separately calculate $VaR_1$, $VaR_2$ for options 1 and 2, and then use $$\sqrt{(w_1 VaR_1)^2+(w_2 VaR_2)^2+2\,corr\cdot w_1 w_2\, VaR_1 VaR_2}$$ to calculate the portfolio VaR, but I don't know whether the option correlation is the same as the correlation of the underlying stock returns. #### Sources of index data (MSCI, FTSE, S&P etc.)? Who are the major suppliers of index data covering multiple index providers, e.g. MSCI, FTSE, S&P etc.? There are a huge number of people sourcing e.g. equity data, but index data is much harder to come by. I am only looking for very high-level information at low frequency - the EOD number for each index would almost be enough. Basic information on the constituents and their weights would also be useful, but in-depth analysis of the indexes and their constituents isn't required. Sites like Yahoo Finance just seem to cover a small number of major indexes, whereas I'm interested in sector indexes and the like that aren't commonly covered, e.g. MSCI World Telecommunication Services. I am looking for a provider with historical data (going back e.g.
10 years) who will also provide data day by day going forward. Currently I work with a data aggregator who, before they supply particular data, requires us to negotiate agreements directly with the individual index providers, e.g. MSCI or FTSE, rather than handling this on our behalf. They also provide the data in a highly non-standardized form, e.g. data from MSCI will come in a different form from data from FTSE. And they only sell the data in very inflexible bundles, e.g. rather than just buying basic EOD numbers one has to buy a hugely expensive package of in-depth analysis of which these numbers are a tiny part. It's not that I don't expect to pay for this data - but it would be nicer, if possible, to pay for just the basic data needed and to have the negotiation with the actual index providers handled behind the scenes by the data provider themselves. Apologies if you feel this, as a rather commercial question rather than something interesting on e.g. option pricing, isn't directly appropriate to this Stack Exchange. ### TheoryOverflow #### Why is the least fixed point (lfp) important in program analysis? I am trying to get a big picture of the importance of the least fixed point (lfp) in program analysis. For instance, abstract interpretation seems to rely on the existence of the lfp. Many research papers on program analysis also focus heavily on finding the least fixed point. More specifically, this Wikipedia article on the Knaster-Tarski theorem mentions that lfps are used to define program semantics. Why is it important? Any simple example would help me. (I am trying to get a big picture). ### HN Daily #### Daily Hacker News for 2015-09-04 ## September 04, 2015 ### QuantOverflow #### Gamma Imbalance Explanation Can someone please give me an explanation as to what put-call gamma imbalance specifically refers to (an imbalance of what?), why it may exacerbate volatility from a market perspective, and why the risk increases on option expiry day?
My guess is that these imbalances force option sellers/dealers to delta-hedge their positions more aggressively, but I would like to have a better grasp on what is really going on. Some confusion comes from the fact that I thought put and call gamma must be equal, else put/call parity would be violated, based on put delta + call delta = 1 for a given option strike. ### CompsciOverflow #### Why is ${n \choose 2}$ a $\Theta(n^2)$ computation? In Cormen's Algorithms textbook, the author casually mentions that ${n \choose 2}$ is $\Theta(n^2)$. Why is this so? In the calculation $${n \choose 2} = \frac{n!}{(n-2)!2!}$$ there are $n$ multiplication operations in the numerator and $n$ multiplications in the denominator. That means there are at most $2n+1$ computations to make here (one for the division step), no? ### QuantOverflow #### Can a large OpenInt of calls cause a stock to go down? I read a forum post from another site, which stated... XYZ has 60k+ CALL contracts are in the money for this Friday vs 5k PUTS. That will be extra 5,500,000 supply this Friday. Big drop! Chances of XYZ being green tomorrow = 0% 4 mln shares supply (20% daily volume) from OPEX imbalance will accelerate profit taking run.  My question is... is this a technically valid statement? If so, why? What happens to the underlying when someone closes or exercises an in-the-money call, causing an addition to the daily volume of a stock? Thanks! ### CompsciOverflow #### Is there a formal CS definition of VCS and file versions? I don't know whether it was a joke, but I once read what was referred to as a formal definition of a file in a versioning system such as git, hg or svn. It was something like a mathematical object such as a homeomorphism. Was that a joke, or is there really computer science theory about versioning systems and the mathematics of VCS? #### Quantum computing roadmap I have to create a roadmap for quantum computing technology.
Looking around I found the timeline on Wikipedia, which is pretty wide but neither highlights the key events in quantum computing research nor sets out possible future directions for research. Could somebody help me define the key events in the quantum computing field? When (and why) did we start to explore this technology, which milestones have we passed through, and (maybe) what can we set as future milestones? ### QuantOverflow #### Utility Theory - Certainty equivalent approximation formula derivation I have a question on an exercise from chapter 9 of D. Luenberger, Investment Science, International Edition, where I suspect there may be a typo. Exercise 8 (Certainty approximation) There is a useful approximation to the certainty equivalent that is easy to derive. A second-order expansion near $\bar x=E(x)$ gives $$U(x)\approx U(\bar x)+U'(\bar x)(x-\bar x)+\frac12 U''(\bar x)(x-\bar x)^2$$ Hence, $$E[U(x)]\approx U(\bar x)+\frac 12 U''(\bar x)\operatorname{var}(x)$$ On the other hand, if we let $c$ denote the certainty equivalent and assume it is close to $\bar x$, we can use the first-order expansion $$U(c)\approx U(\bar x)+U'(\bar x)(c-\bar x)$$ Using these approximations, show that $$c \approx \bar x+\frac{U''(\bar x)}{U'(\bar x)}\operatorname{var}(x)$$ Now, I used general algebra along with the fact that $E[U(x)] = U(c)$ to show directly that $$c \approx \bar x+\frac 12 \frac{U''(\bar x)}{U'(\bar x)}\operatorname{var}(x)$$ as follows: take the third equation and transform it into $$c\approx \bar x + \frac{U(c)-U(\bar x)}{U'(\bar x)}$$ Now all I have to do is show that the numerator of the fraction is $\frac 12 U''(\bar x)\operatorname{var}(x)$, which is done by substituting $E[U(x)] = U(c)$ into the second formula; the result is immediately there. On top of this work, I wrote out an example of an investment with the log utility function and showed that my approximation for $c$ worked, whereas the book's formula without the "2" didn't.
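For what it's worth, a numeric check of the kind described can be reproduced in a few lines (a sketch; the 50/50 gamble paying 100 or 200 and the variable names are invented for illustration, not taken from the book):

```python
from math import log, exp

# Hypothetical gamble: x = 100 or 200, each with probability 1/2.
outcomes, probs = [100.0, 200.0], [0.5, 0.5]
xbar = sum(p * x for p, x in zip(probs, outcomes))               # E[x] = 150
var = sum(p * (x - xbar) ** 2 for p, x in zip(probs, outcomes))  # var(x) = 2500

# Exact certainty equivalent for U(x) = ln(x): U(c) = E[U(x)], so c = exp(E[ln x]).
c_exact = exp(sum(p * log(x) for p, x in zip(probs, outcomes)))  # sqrt(100*200)

# For U = ln: U''(xbar)/U'(xbar) = (-1/xbar^2)/(1/xbar) = -1/xbar.
ratio = -1.0 / xbar
c_with_half = xbar + 0.5 * ratio * var  # approximation with the factor 1/2
c_book = xbar + ratio * var             # the book's formula, without the 1/2

print(c_exact, c_with_half, c_book)
```

With these numbers the exact certainty equivalent is about 141.42; the version with the factor $\frac12$ gives about 141.67, while the book's formula gives about 133.33, consistent with the suspected typo.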
However, I would like to post all this here just to verify that this is a typo in the book and not some misunderstanding on my part. Thanks in advance for any feedback. ### /r/netsec #### HTTP Evader - Automate Firewall Evasion Tests ### Fefe #### Anyone who gets run over in China should expect ... ### /r/compsci #### What is the opposite of screen-scraping? I'm familiar with the idea of screen-scraping - getting data from websites where there's no nice feed to use. Is there a name for the opposite, i.e. when your program simulates a human inputting data to a site? submitted by tpedar50 [link] [11 comments] ### CompsciOverflow #### Segment Tree, with queries and range update [on hold] How do I handle queries and updates for the LCM of elements in an array, given that the values can be large, so the answer should be returned modulo 10^9+7? The queries are of the form: 0 L R -- find the LCM of the elements from L to R (both inclusive), returning ans % (10^9+7); 1 P V -- update the element at index P with value V (1-based indexing). For example: INPUT 6 (no. of elements) 1 3 2 5 15 3 6 (no. of queries) 0 1 6 1 3 5 0 2 3 1 5 7 0 3 5 0 1 6 OUTPUT 30 15 35 105 ### TheoryOverflow #### Program reasoning about own source code The inspiration for this question is the following (vague) question: What are the programming-language/logical foundations for an AI that could reason about its own source code and modify it? This isn't at all rigorous, so here is my attempt to extract a concrete question out of it. There are two things I'm interested in: (A) A programming language P that can represent and manipulate its own programs as a datatype Program (e.g., as an AST). (If desired, an object of type Program can be converted into a String, which is the text of a valid program in that language. This would be the opposite of what a compiler does.) (B) A method to reason about what a program in language P does. Here are two levels I'm thinking about: 1.
Another language Q (with theorem-proving capabilities) which models what a P-program does. It should be able to express and prove statements like "the outcome of running Program p is foo." 2. A way to reason about what a program p:Program does in the language P itself. (So we are taking P=Q above.) To what degree has something like this been implemented, and what progress has been made in this direction? What are the practical obstacles? In light of the original intention of the question, what is the best way to formalize the problem? * As the answers show (thanks!), both (A) and (B1) can be done separately, though it seems doing them together is more of a research question. Here were some of my first thoughts on the question (warning: rather vague). See also my comments on Martin Berger's answer. I'm interested in the programming language modeling the same programming language, rather than a simpler one (so P=Q above). This would be a "proof of concept" of a program being able to "reason about its own source code." Dependently typed programming languages can give guarantees about the outputs of their functions, but this doesn't count as "reasoning about its own source code" any more than a "Hello world!" counts as a quine in a language that automatically prints out a naked string---there needs to be some kind of quoting/self-reference. The analogue here is having a datatype representing Program. It seems like a rather large project - the simpler the language, the harder it is to express everything within it; the more complicated the language, the more work has to be done to model the language. In the spirit of the Recursion Theorem, a program can then "get" its own source code and modify it (i.e., output a modified version of itself). (B2) then tells us that the program should be able to express a guarantee about the modified program (this should be able to recurse, i.e., it should be able to express something about all future modifications-?).
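Regarding (A), mainstream languages already get partway there. A minimal sketch in Python (assuming Python 3.9+ for ast.unparse; the function names are invented for illustration):

```python
import ast

# A tiny P-program, first as a String, then as a value of the "Program" datatype (an AST).
source = "def double(x):\n    return x + x\n"
tree = ast.parse(source)  # String -> Program

# Manipulate the program as data: rename the function it defines.
tree.body[0].name = "twice"

# Program -> String: the direction opposite to what a compiler does.
new_source = ast.unparse(tree)

# Execute the modified program and observe its behaviour.
namespace = {}
exec(compile(tree, "<ast>", "exec"), namespace)
print(new_source)
print(namespace["twice"](21))  # 42
```

Of course this only covers representation and manipulation; the hard part, (B) - reasoning inside the same language about what the manipulated Program does - is exactly what remains a research question.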
### /r/emacs #### Compile with F5 I have been trying to switch from vim to emacs + evil for the past week, so I'm very new to all this stuff. I'm trying to bind F5 to compile a cpp file with a specific shell command: (defun c-save-compile () (interactive) (save-buffer) (defvar user-foo) (setq user-foo (concat "g++ " (buffer-name) " -g -pg -std=c++0x -o a.out" )) (shell-command user-foo) ) (add-hook 'c++-mode-hook (lambda () (define-key c++-mode-map (kbd "<f5>") 'c-save-compile)))  This is not working: when I press F5 in a cpp buffer I only get <f5> is undefined. Can anyone explain to me what's going on? Thanks! submitted by Ursao [link] [13 comments] ### UnixOverflow #### Logout from jail I'm trying to use FreeNAS at my office along with Syncthing, and some things are giving me a headache. FreeNAS installed Syncthing as a jail, and I used the jexec command to log into the jail, but now I can't figure out a way to log back into my server. I'm at root@syncthing instead of root@freenas. How do I undo the jexec on the shell? ### CompsciOverflow #### Using an older version of data structures and algorithms in Java [on hold] Have data structures and algorithms in Java changed during the past 13 years? Would I be able to learn the concepts from a textbook published in 2002, Data Structures and Algorithms in Java by Robert Lafore? #### How to determine the number of instances for resources in a resource allocation graph? [duplicate] I am struggling to determine the number of instances a resource will have. I have performed extensive research on this, but unfortunately have come up short. I am tasked with drawing a resource allocation graph, and I have been provided with the number of processes, which is 3, and the number of resources, which is 5. I have concluded that when drawing the resource allocation graph, I will need to know the number of instances of a resource.
Could someone please help me work out how to determine the number of instances a resource will have? #### What is the best average-case scheduling policy for a multiprogrammed batch system? Out of the various scheduling policies, which would be the best recommended scheduling policy for optimizing a multiprogrammed batch system? I'm looking for a general-case policy, if we are not given information on the jobs or needs. For instance, if there was a general-purpose library implementing different policies, what would it make the most sense to default to? ### QuantOverflow #### Historical volatility from close prices (Haug pg 166) I have implemented a function for calculating historical volatility using the close-to-close method as described by Haug on page 166. When I implemented the formula given by Haug, it resulted in some negative values for the variance. The data I am using is not suspect, so the fault must lie in either: • my implementation or • the formula itself. Here is a rough sketch of my implementation function (in Python):

# close_prices is a time-sorted list of pricedata objects
# returns a list of calculated volatilities
def calc_volatility(close_prices, period):
    startpos, stoppos, raw_vols = (0, period, [])
    while True:
        subset = close_prices[startpos:stoppos+1]
        period_returns = [ subset[i].close_price/subset[i-1].close_price for i in range(1,len(subset)) ]
        log_returns = [ math.log(x) for x in period_returns ]
        log_squared_returns = [ math.log(pow(x,2)) for x in period_returns ]
        sum_squares_1 = sum( log_squared_returns ) / (period-1)
        sum_squares_2 = pow( sum( log_returns ), 2) / (period * (period -1))
        variance = sum_squares_1 - sum_squares_2
        print "ss1: {0} - ss2: {1}. Variance: {2}".format(sum_squares_1, sum_squares_2, variance)
        volatility = math.sqrt(variance)
        raw_vols.append(volatility)
        startpos += period
        stoppos += period
        if stoppos >= len(close_prices):
            break
    return raw_vols

Is there something wrong in my implementation, or is the formula I am using incorrect? ### CompsciOverflow #### For every 'evil' regex, does there exist a non-evil alternative, or is the devil in the grammar? Apparently, ReDoS attacks exploit characteristics of some (otherwise useful) regular expressions ... essentially causing an explosion of possible paths through the graph defined by the NFA. Is it possible to avoid such problems by writing an equivalent 'non-evil' regex? If not (that is, if the grammar can't be handled in practical space/time by an NFA), what parsing approaches would be better? Why? ### /r/netsec #### Secrets and Source Control: A Maturity Model ### Lobsters #### Beginning with Regular Expressions ### AWS #### AWS Webinars – September, 2015 I have always advised my family, friends, colleagues, and audiences to plan to spend some time learning something new every day. If you don't take responsibility for your own continued education, you can soon find that you have fallen far behind your peers! If you are interested in staying current with AWS, I have a couple of recommendations. You can read this blog, follow me (and @awscloud) on Twitter, check in with the AWS What's New page from time to time, and make use of our self-paced labs. When you are ready for a more in-depth look at a certain topic or service, I would strongly advise you to attend one or more of our webinars! Every month my colleagues assemble a lineup of sessions that are hand-selected to provide you with a detailed look at a subject that they believe will be of value to you. The webinars are presented by senior members of the team including evangelists, product managers, and solution architects.
With that said, I would like to introduce you to our September 2015 webinars (all times are Pacific). • September 15: • September 16: • September 17: There's no charge and no obligation, of course! Please feel free to sign up, and to share this post with your friends and colleagues. Jeff; PS – Please feel free to suggest new topics by leaving a comment on this post. ### DragonFly BSD Digest #### CDBUG: Patrick Muldoon on DNS, 2015/09/08 CDBUG is having a presentation on DNS, given by Patrick Muldoon, on Sept. 8th. That's next Tuesday. If you are anywhere near Albany, go visit. ### CompsciOverflow #### Is it possible to prove thread safety? Given a program consisting of variables and instructions which modify these variables, and a synchronization primitive (a monitor, mutex, Java's synchronized or C#'s lock), is it possible to prove that such a program is thread-safe? Is there even a formal model for describing things like thread safety or race conditions? #### Minimising sum of consecutive points' distances in the Manhattan metric I have a set of points (two dimensions, $X, Y$). Coordinates are floating-point numbers. The objective is to sort them in such a way that the sum of distances between consecutive sorted points is minimal. I tried to solve this problem by localisation (dividing the plane into smaller grids of sorted points) and then connecting clusters - but this is a greedy and approximate approach. The problem here is in setting the grid size, which is not guaranteed to give any bounds on how close to optimal the solution is, and it is rather time-consuming (without a well-defined objective function). Another attempt was to sort them by the X coordinate, then find the minimum, but this approach fails. Normal sorting by two coordinates fails and is highly unstable. The problem resembles the TSP, but the "path" is not closed and the metric used is L1 (Manhattan, taxicab). The solution I am looking for will probably be approximate.
Finding the minimal and second-minimal distance from every point and connecting them gives a good, but certainly not minimal, sum. The problem I am having is with defining the objective function. ### TheoryOverflow #### Quantum algorithms based on transforms other than Fourier transforms In Quantum Computation and Quantum Information, Nielsen and Chuang say that many of the algorithms based on quantum Fourier transforms rely on the coset invariance property of Fourier transforms, and they suggest that invariance properties of other transforms might yield new algorithms. Has there been any fruitful research on other transforms? ### Lobsters #### Paradigms of Computer Programming – Fundamentals ### /r/freebsd #### xterm-256 only supports 144 colors? Is there a way to get it to support all 256? Or another terminal to use? I'm working on building a happy dev box with FreeBSD, but can't get my vim color schemes to load. submitted by desnudopenguino [link] [3 comments] ### Daniel Lemire #### Revisiting Vernor Vinge's "predictions" for 2025 Vernor Vinge is a retired mathematics professor who became famous through his science-fiction novels. He is also famous as one of the first to contemplate the idea of a "technological singularity". There is debate as to what the technological singularity is, but the general idea goes as follows. At some point in the near future (maybe between 2025 and 2050), we shall get to a point technologically where most human beings cannot make sense of the world anymore. We get a form of technological "hyperinflation": even if you can make sense of the technology we had last year, you are hopelessly lost when looking at this year's technology. Historically, we have almost always experienced accelerating technological progress… there are more new inventions and innovations in 2015 than in any other year in our history… At some point, the rate of change might get too fast for most of us.
Vinge chose the term "singularity" to describe such an event, not because we become infinitely wealthy, or machines become infinitely intelligent, but because, as with a black hole, it is an event that you cannot "see" even as you get closer… Your models of reality break down at that point. His 2006 novel, Rainbows End, paints a world going through such a singularity. It is in our near future too (2025). In this world, information technology has made such progress that anyone who hasn't kept up to date for the last ten years is hopelessly obsolete… as if, say, you tried to live in 2015 through the lens of someone from 1970. Any intellectual asleep from 1970 to today would need to go back to basic training… learn how to use a computer, maybe starting from a PC. This may require high-school-level retraining. Now imagine that the next ten years (hypothetically) bring about as much change as we underwent in the last 35 years. Most people under 20 today would adapt, though an increasing number would fall by the wayside. However, most people over 50 today might end up being pushed aside, as the rate of change becomes faster than they can cope with. Writing near-future science fiction is very difficult as it becomes almost instantly obsolete. And you will find very little near-future science fiction for this very reason. But Vinge was no doubt well aware of this challenge, and he is a smart fellow, so maybe his "predictions" for the year 2025 are going to hold out better than, say, mine would. Let me review some of his predictions: • In his novel, many people earn a small income through informal part-time work with affiliate networks, doing random work. Today you can earn a small income through Amazon's Mechanical Turk and there are many Uber-like services whereby individuals can earn small sums by doing various services. So this prediction is almost certainly coming true. • In his novel, we have ubiquitous high-bandwidth networking and zero privacy in 2025.
That is, everyone is connected all the time, and hackers can track individuals easily. Given that your smartphone is tracking your every move already, this does not sound improbable. • In Rainbows End, we have ubiquitous augmented reality. You can "almost" appear to be at a meeting by projecting your image at a remote location so that people who are equipped with the proper gear can see you and get the impression that you are present. You can virtually repaint your house every day, and people with the right gear can see it as such. This is shown to have applications in gaming, education and medicine. Sadly, we have nothing close to that right now, and it is probably the most disruptive technology in the novel. We had Google Glass, but the whole field of virtual/augmented reality has not been making much progress. Can we expect a breakthrough in the next ten years? Maybe. My understanding is that augmented reality is much harder to achieve than it appears. On a related note, hardly anyone has a PC in 2025. I have been (wrongly) predicting the death of the PC for many years now… with no luck… but if we get ubiquitous augmented reality, then a PC would really look old-school. In any case, why would ubiquitous augmented reality be so disruptive? Because it draws a sharp line between those who are in it and those who are not. If businesses were to begin relying on augmented reality, employees would have to pick it up quickly or be left behind. • In Vinge's 2025, people can send silent messages without talking or (openly) typing. It does feel like the kind of application that could take off. • Vinge predicts self-driving cars that can take you somewhere, drop you off and go on to serve other needs. I think Google has shown that it is possible, soon. • There is no newspaper, not even the online equivalent. People get news somehow though. • It looks like neurodegenerative diseases have mostly been cured. The main character suffered from Alzheimer's and came back.
It is, of course, impossible to tell whether such a medical feat could happen in the next ten years… there are certainly many smart scientists trying to make it happen. However, obesity is still a problem and you can still die from a stroke. Many things are related in this respect. Better technology can help us better understand our body and our brain, but a better understanding of our brain could also help us build better technology (better AI). But we have to remain humble: we have known about Alzheimer's disease for a century and we still have nothing that even looks like a cure from a distance. Naturally, these problems tend to be binary: one day you have no effective treatment, the next you do. There are more surprises in the novel, but I do not want to include spoilers. It is perhaps interesting to note that while 2006 is not that far in the past, it is also before the first iPhone ever made it to the market. The first iPad came in 2010. Android did not exist in 2006. I think that Vinge had to make a real effort to imagine 2025, and it looks like he did well. Vinge has publicly predicted the singularity for 2030. The problem, of course, is that with such an ill-defined concept, we could say, if we wanted to, that we are going through the singularity right now. I am far more interested to know when augmented reality will be ubiquitous. ### /r/compsci #### What is the meaning of "coadd pixel"? I keep coming across "co-added" or "coadd" pixels, e.g. "A set of coadded and stored pixels obtained at a specific time is referred to as a cadence, and the total amount of time over which the data in a cadence is coadded is the cadence period. The two cadence periods in use are Long Cadence (LC) and Short Cadence (SC). Each cadence consists of a series of frames that each include a 6.02 s exposure time and a 0.52-s readout time. For Long Cadence, 270 frames are coadded, for a total of 1766 s = 0.49 h. For Short Cadence, 9 frames are coadded, for a total of 58.85 s.
" https://archive.stsci.edu/kepler/manuals/Data_Characteristics.pdf What does this mean? Is this something only used in astronomy? submitted by Zeekawla99ii [link] [17 comments] ### AWS #### New AWS Quick Start – Magento for E-Commerce Magento is a very popular open-source content management system for e-commerce sites. Sellers and developers appreciate its open architecture, flexibility, extensibility (hundreds of extensions), and back-end workflows that can be tailored to fit the unique needs of each customer. Magento Community Edition (Magento CE) and Magento Enterprise Edition (Magento EE) are popular among our customers. Some of these deployments have been launched on AWS by way of partners such as Anchor Hosting, Elastera, Tenzing, Razorfish, and Optaros. Others have been launched via the AWS Marketplace (a search for Magento returns more than 20 listings). And still others have been launched in do-it-yourself form. Today we are publishing a new Magento Quick Start Reference Deployment. This 29-page document will show you how to build an AWS cluster that runs version 1.9.2 of Magento Community Edition. It walks you through best practices, provides cost estimates, and outlines the recommended set of AWS components. Using the AWS CloudFormation template referenced in the Quick Start, you can launch Magento into a new or existing Virtual Private Cloud (Amazon VPC). The template will create (if requested) the VPC, along with the necessary EC2 instances (auto-scaled instances for the web server and a NAT instance for SSH connectivity), an Amazon Relational Database Service (RDS) instance running MySQL, and Elastic Load Balancing. It will also create the requisite IAM roles and security groups and configure Auto Scaling to add more EC2 instances when traffic rises and remove them when it subsides. Here's what the completed system looks like: The Quick Start also includes a pointer to some sample data that you can download from the Magento site!
Jeff;

### CompsciOverflow

#### Binary code with constraint

Suppose I have an alphabet of $n$ symbols. I can efficiently encode them with $\lceil \log_2 n\rceil$-bit strings. For instance, if $n=8$:

```
A: 0 0 0
B: 0 0 1
C: 0 1 0
D: 0 1 1
E: 1 0 0
F: 1 0 1
G: 1 1 0
H: 1 1 1
```

Now I have the additional constraint that each column must contain at most $p$ bits set to 1. For instance, for $p=2$ (and $n=8$), a possible solution is:

```
A: 0 0 0 0 0
B: 0 0 0 0 1
C: 0 0 1 0 0
D: 0 0 1 1 0
E: 0 1 0 0 0
F: 0 1 0 1 0
G: 1 0 0 0 0
H: 1 0 0 0 1
```

Given $n$ and $p$, does an algorithm exist to find an optimal (shortest-length) encoding? (And can it be proved that it computes an optimal solution?)

EDIT

Two approaches have been proposed so far to estimate a lower bound on the number of bits $m$. The goal of this section is to assess which of the approaches provides the best lower bound.

Yuval's approach is based on entropy and provides a very nice lower bound: $\frac{\log n}{h(p/n)}$, where $h(x) = -x\log x - (1-x)\log(1-x)$ is the binary entropy function.

Alex's approach is based on combinatorics. If we develop his reasoning a bit more, it is also possible to compute a very good lower bound. Given $m \geq \lceil \log_2 n\rceil$ bits, there exists a unique $k$ such that

$$1+\binom{m}{1} + \dots +\binom{m}{k} \lt n \leq 1+\binom{m}{1} + \dots + \binom{m}{k}+\binom{m}{k+1}.$$

One can convince oneself that an optimal solution will use the codeword with all bits low, then the codewords with 1 bit high, 2 bits high, ..., $k$ bits high. For the $n-1-\binom{m}{1}-\dots-\binom{m}{k}$ remaining symbols to encode, it is not at all clear which codewords are optimal to use, but certainly the weights $w_i$ of the columns will be bigger than they would be if we could use only codewords with $k+1$ bits high and have $|w_i - w_j| \leq 1$ for all $i, j$. Therefore one can lower bound $p=\max_i(w_i)$ with

$$p_m = 0 + 1 + \binom{m-1}{2} + \dots + \binom{m-1}{k-1} + \left\lceil \frac{\left(n-1-\binom{m}{1}-\dots-\binom{m}{k}\right)(k+1)}{m} \right\rceil.$$

Now, given $n$ and $p$, try to estimate $m$.
We know that $p_m \leq p$, so if $p \lt p_{m'}$ then $m' \lt m$. This gives the lower bound for $m$: first compute the $p_m$, then find the biggest $m'$ such that $p \lt p_{m'}$.

This is what we obtain if we plot, for $n=1000$, the two lower bounds together (the lower bound based on entropy in green, the one based on the combinatorics reasoning above in blue): both look very similar. However, if we plot the difference between the two lower bounds, it is clear that the lower bound based on the combinatorics reasoning is better overall, especially for small values of $p$. I believe that the problem comes from the fact that the inequality $H(X) \leq \sum H(X_i)$ is weaker when $p$ gets smaller, because the individual coordinates become correlated for small $p$. However, it is still a very good lower bound when $p=\Omega(n)$.

Here is the script (Python 3) that was used to compute the lower bounds:

```python
from scipy.misc import comb
from math import log, ceil, floor
from matplotlib.pyplot import plot, show, legend, xlabel, ylabel

# compute p_m
def lowerp(n, m):
    acc = 1
    k = 0
    while acc + comb(m, k+1) < n:
        acc += comb(m, k+1)
        k += 1
    pm = 0
    for i in range(k):
        pm += comb(m-1, i)
    return pm + ceil((n-acc)*(k+1)/m)

if __name__ == '__main__':
    n = 100

    # compute lower bound based on combinatorics
    pm = [lowerp(n, m) for m in range(ceil(log(n)/log(2)), n)]
    mp = []
    p = 1
    i = len(pm) - 1
    while i >= 0:
        while i >= 0 and pm[i] <= p:
            i -= 1
        mp.append(i + ceil(log(n)/log(2)))
        p += 1
    plot(range(1, p), mp)

    # compute lower bound based on entropy
    lb = [ceil(log(n)/(p/n*log(n/p)+(n-p)/n*log(n/(n-p)))) for p in range(1, p)]
    plot(range(1, p), lb)
    xlabel('p')
    ylabel('m')
    show()

    # plot diff
    plot(range(1, p), [a-b for a, b in zip(mp, lb)])
    xlabel('p')
    ylabel('m')
    show()
```

### Lobsters

#### Thoughts on Entitlement and Pricing

### /r/netsec

#### An attacker was able to use a compromised account to access security-sensitive information about Firefox in Bugzilla

### Lobsters

#### Call me Maybe: MariaDB Galera Cluster

### /r/compsci

#### Computer Architecture help?

I'm currently taking Computer Architecture II at my university. I took CompArch I at community college about three years ago, and most of the knowledge from that class has been forgotten. We did a review the first day: went over a few gates (e.g. AND, NAND, OR, etc...) and truth tables. No problem. Then we started to go over adders, multiplexers, ALUs, D latches, D flip-flops... hrm, confused. We just had a quiz and I completely bombed it. I've been looking through the book, but it's just not catching on. My professor reads directly off slides, which come from the book, so it's personally difficult to learn from that. Now we are on processors and I don't want to fall too far behind! Any books, YouTube videos, or other help would be appreciated. Thanks.

submitted by odu_mark
[link] [8 comments]

#### Neural Networks, Types, and Functional Programming

### /r/netsec

#### Researchers make advances in database security "arms race"

### CompsciOverflow

#### Zelda Chaos not working properly [on hold]

How come whenever I try to create http://www.jaytheham.com/zcw/Talk:Majora's_Mask_Miscellaneous_Glitches_-_Day_Reset, it won't submit when I click "Save page"? I went to that page later to see if the change got saved without me seeing it, but it didn't. I can't contact Zelda Chaos directly about the problem because I don't see a way to contact that site.

### Lobsters

#### Evolving the Google Identity

#### W^X policy violation affecting all Windows drivers compiled in Visual Studio 2013 and previous

### StackOverflow

#### Using assertion library outside of testing for list manipulations

In my current project I do a lot of list manipulation on the front end, and I realised that using something like Chai or another assertion library could save me a lot of time and make the code quite readable. Though while Chaijs is cool and all, its API is not exactly functional and currying friendly.
My questions are: Is it actually a good idea to look for an assertion library to use outside of a testing environment for array manipulation/filtering? Or am I smoking crack? And if yes, does anyone have a success story with any library?

Some simplified examples of what I'm trying to do:

```javascript
const people = [
  {name: "David Smith", age: 25, livedIn: ["London", "Paris"]},
  {name: "John Smith", age: 30, livedIn: ["New York"]},
  {name: "John Gates", age: 25, livedIn: ["London", "New York"]}
];
```

Currently we are using plain closures, and when the checks are easy enough, it's all good and shiny:

```javascript
people
  // Let's find 25 year olds
  .filter((person) => person.age === 25)
  // And see who among them is a John
  .filter((person) => person.name.indexOf("John") === 0);
```

Now that's all plain and easy, but then we need to start testing deep properties: person.names.firstName, or person.cities[<index>].postCode, and this is what an assertion library does great on paper. Chai and Jasmine have methods to do just that; an example from the Chaijs documentation:

```javascript
expect(tea).to.have.property('flavors').with.length(3);
```

For example, with an assertion library I can find people who lived in London:

```javascript
people
  .filter((person) => {
    return expect(person.livedIn).to.includes("London");
  });
```

Or who did not live in London:

```javascript
people
  .filter((person) => {
    return expect(person.livedIn).not.to.includes("London");
  });
```

And of course ranges:

```javascript
expect({price: 50}).to.be.within(40, 60);
```

Now, while both Chaijs and Jasmine do functionally what I need, their API is not functional. I would love to have an API close to Ramda, i.e. composable functions, with rules first and the iteratee last. Something along the lines of:

```javascript
people
  .filter(within('age', 20, 40))                   // Filter age property, to be 20 < age < 40
  .filter(contains('livedIn', 'London', 'Boston')) // Further filter people, by places they lived in.
```

Of course I don't want or need exactly that API. It can differ in any way as long as I can curry it.
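For what it's worth, the "rules first, iteratee last" shape is easy to build by hand. Here is a Python sketch of `within`/`contains` factories in that spirit; the names mirror the wished-for API above and are hypothetical, not from any real library:

```python
def within(key, lo, hi):
    """Curried range rule: returns a predicate testing lo < item[key] < hi."""
    return lambda item: lo < item[key] < hi

def contains(key, *values):
    """Curried membership rule: item[key] must contain at least one of values."""
    return lambda item: any(v in item[key] for v in values)

people = [
    {"name": "David Smith", "age": 25, "livedIn": ["London", "Paris"]},
    {"name": "John Smith", "age": 30, "livedIn": ["New York"]},
    {"name": "John Gates", "age": 25, "livedIn": ["London", "New York"]},
]

# Rules are built first, then applied to each item, so they compose and curry freely.
hits = [p["name"] for p in people
        if within("age", 20, 40)(p) and contains("livedIn", "London", "Boston")(p)]
print(hits)  # ['David Smith', 'John Gates']
```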
#### functional programming - call function with all parameters

I'm currently experimenting with functional programming in PHP. I would like some clarification about certain function calls. As an example:

```php
function addition($num1) {
    return function ($num2) use ($num1) {
        return $num1 + $num2;
    };
}

$add_2_to = addition(2);
echo $add_2_to(3);
echo $add_2_to(4);
```

Is there a way to call the addition function with all the parameters? I tried this way, with no luck:

```php
echo addition(2)(3);
```

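As an aside, PHP 7's uniform variable syntax allows `addition(2)(3)` directly, while on PHP 5 that is a parse error and `call_user_func(addition(2), 3)` is the workaround. The same curried shape, written here in Python purely for comparison, chains naturally:

```python
def addition(num1):
    """Curried addition: returns a closure that adds num1 to its argument."""
    def inner(num2):
        return num1 + num2
    return inner

add_2_to = addition(2)
print(add_2_to(3))     # 5
print(addition(2)(3))  # calling with all parameters at once also prints 5
```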
### CompsciOverflow

#### Find maximum element in sorted arrays in logarithmic time

Been stuck on this for a while, would really appreciate some help:

Suppose you are given an array A[1...n] of sorted integers that has been circularly shifted k positions to the right. For example, [35,42,5,15,27,29] is a sorted array that has been circularly shifted k = 2 positions, while [27,29,35,42,5,15] has been shifted k = 4 positions. Give an algorithm for finding the maximum element in A that runs in O(log n) time.

The elements in A are distinct.

I understand that to achieve O(log n) time I'll probably have to search through the list by starting at the middle, and then going left or right, then splitting the list in half over and over, but I'm not sure how to attack it beyond that.
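One standard way to carry out that split-in-half idea, sketched in Python under the stated assumption of distinct elements: binary-search for the rotation point (the index of the minimum) by comparing the middle element against the last one; the maximum then sits immediately to the minimum's left (with index -1 wrapping to the last element when the array is not rotated at all).

```python
def find_max(a):
    """Maximum of a sorted array that was circularly shifted, in O(log n).

    Assumes distinct elements. Finds the index of the minimum (the rotation
    point); the maximum is the element just before it.
    """
    lo, hi = 0, len(a) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if a[mid] > a[hi]:
            lo = mid + 1   # minimum lies strictly to the right of mid
        else:
            hi = mid       # minimum is at mid or to its left
    return a[lo - 1]       # a[-1] handles the unrotated case

print(find_max([35, 42, 5, 15, 27, 29]))   # 42
print(find_max([27, 29, 35, 42, 5, 15]))   # 42
```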

### StackOverflow

#### Stateful computation with different types of short-circuit (Maybe, Either)

I am trying to find the most elegant way of converting the following stateful imperative piece of code to a pure functional representation (preferably in Haskell, to use the abstraction that its Monad implementations offer). However, I am not yet good at combining different monads using transformers and the like. It seems to me that analyzing others' takes on such tasks helps best when learning how to do it myself. The imperative code:

```
while (true) {
    while (x = get()) { // Think of this as returning Maybe something
        put1(x)         // may exit and present some failure representation
    }
    put2()              // may exit and present some success representation
}
```


When get returns Nothing we need the execution to continue with put2; when get returns Just x we want x passed to put1, short-circuiting only if put1 fails, and looping otherwise. Basically, put1 and put2 may terminate the whole thing or move on to the following statement, changing the underlying state somehow. get can either succeed (invoking put1 and looping) or fail (continuing to put2).

My idea was something along these lines:

### /r/scala

#### Scala XML attribute replacement results in appending modified node as a child

All information here

I've narrowed it down to the replace method and tried changing the current.map to current.child.map, current.distinct.map, etc., with none of my desired results showing up as anticipated :/

edit: worst case scenario I may just migrate this project to scales xml

submitted by Brompton_Cocktail
[link] [comment]

### Planet Theory

#### Open Problems That Might Be Easy

A speculation on the length of proofs of open problems

 Broad Institute source

Nick Patterson is one of the smartest people I have ever known.

Today I would like to talk about something he once said to me and how it relates to solving open problems.

Nick now works at the Broad Institute on genomic problems—especially large data sets. For decades, he worked as a cryptographer for the British, and then the U.S. He also spent many years with Renaissance Technologies, Jim Simons’s investment hedge fund.

Years ago I consulted regularly at IDA—a think tank for the NSA that was based in Princeton. I cannot tell you what we did, nor what we did not. But I can say we worked on interesting problems in the area of communication. One of the top scientists there was Nick. He was a quite strong chess player, and at tea would often play the other top player at speed chess. I once asked Nick how he would fare against the then world champion Anatoly Karpov. He answered that it would be:

Sack, sack, mate.

I never really believed this. I always thought that he was strong enough to hold out a bit, but perhaps Nick was right. Perhaps it would be over quickly. That Karpov would find a simple, short, demonstration that would wipe Nick out.

But given how smart Nick was I often wondered if he was right when we translate his statement to mathematics:

Can open problems have simple solutions?

Could some open problems fall to: “We know […] and by induction […] we are done”?

## Some Short Solutions

Leonhard Euler thought long and hard about possible generalizations of Fermat’s Last Theorem. He was the first to fashion a nearly-correct proof that no cube could be a nontrivial sum of fewer than three cubes. He conjectured that no 4th-power could be a nontrivial sum of fewer than four 4th-powers, no 5th-power a sum of fewer than five 5th-powers, and so on. These cases were open for nearly two centuries, but in 1966, Leon Lander and Thomas Parkin used a computer to find

$\displaystyle 27^5 + 84^5 + 110^5 + 133^5 = 144^5.$

It still took 20 more years for the other shoe to drop on the ${n=4}$ case:

$\displaystyle 2682440^4 + 15365639^4 + 18796760^4 = 20615673^4.$

This was the smallest member of an infinite family of solutions found by Noam Elkies—himself a master chess player—though not the smallest overall, which is

$\displaystyle {95800^4 + 217519^4 + 414560^4 = 422481^4}$.

The shortest solution to a problem in complexity theory that was important and open for several years was:

$\displaystyle c = a b a^{-1} b^{-1}.$

Well this needs a little context. If either ${a}$ or ${b}$ is the identity, then this group commutator ${c}$ is the identity. Else, in a large enough permutation group like ${S_5}$, you can rig it so that ${c}$ is never the identity. Thus if you have a piece of code ${P}$ such that ${P(x) = a}$ if some property ${A(x)}$ holds and ${P(x) = 1}$ otherwise, and ${Q}$ likewise gives ${Q(x) = b}$ iff ${B(x)}$ holds, then the code ${PQP^{-1}Q^{-1}}$ computes the AND gate ${C(x) = A(x) \wedge B(x)}$.

Since it is easy to code ${\neg A(x)}$ by the code ${a^{-1} P}$, and since AND and NOT form a complete basis, the upshot is that all ${O(\log n)}$-depth Boolean formulas can be computed by codes of size ${n^{O(1)}}$. Permutations of 5 elements allow codes in the form of width-5 branching programs, so what tumbles out is the celebrated theorem of David Barrington that the complexity class ${\mathsf{NC^1}}$ has constant-width branching programs. GLL’s old post on it played up the simplicity, but like many mate-in-n chess problems it was not simple to see in advance.
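The commutator behavior described above is easy to sanity-check by hand (my own illustration, not from the post): with the identity in one slot the commutator collapses, while a suitable non-identity pair in $S_5$ gives a non-identity commutator.

```python
def compose(p, q):
    """Permutation composition: (p o q)[i] = p[q[i]]; permutations as tuples."""
    return tuple(p[q[i]] for i in range(len(p)))

def inverse(p):
    inv = [0] * len(p)
    for i, v in enumerate(p):
        inv[v] = i
    return tuple(inv)

def commutator(a, b):
    """c = a b a^{-1} b^{-1}."""
    return compose(compose(a, b), compose(inverse(a), inverse(b)))

identity = (0, 1, 2, 3, 4)
a = (1, 2, 3, 4, 0)   # a 5-cycle
b = (1, 0, 2, 3, 4)   # a transposition

print(commutator(a, identity))  # (0, 1, 2, 3, 4): collapses to the identity
print(commutator(a, b))         # (2, 0, 1, 3, 4): a non-identity permutation
```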

More recently, a seemingly complicated conjecture by Richard Stanley and Herbert Wilf about permutations was proved in an 8-page paper by Adam Marcus and Gábor Tardos. For any sequence of ${k}$ distinct integers define its pattern to be the unique permutation of ${1,\dots,k}$ that has the same relative ordering. For instance, the pattern of ${(110,27,133,144,84)}$ is ${(3,1,4,5,2)}$. Given any length-${k}$ pattern ${p}$ and ${n > k}$ define ${R_n}$ to be the set of permutations of ${1,\dots,n}$ for which no length-${k}$ subsequence (not necessarily consecutive) has pattern ${p}$. Whereas ${|S_n|}$ grows as ${2^{\Theta(n\log n)}}$, Stanley and Wilf conjectured that the maximum growth for ${|R_n|}$ is ${2^{O(n)}}$, for any fixed ${p}$. Marcus and Tardos polished off a stronger conjecture by Zoltán Füredi and Péter Hajnal about matrices in basically 1.5 pages whose short lemmas read like sack, sack, mate.
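The pattern of a sequence of distinct integers is simple to compute: replace each entry by its rank in sorted order. A quick sketch reproducing the example from the post:

```python
def pattern(seq):
    """Relative-order pattern: each entry replaced by its 1-based rank."""
    ranks = {v: i + 1 for i, v in enumerate(sorted(seq))}
    return tuple(ranks[v] for v in seq)

print(pattern((110, 27, 133, 144, 84)))  # (3, 1, 4, 5, 2)
```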

We found the last example highlighted in a StackExchange thread on open problems solved with short proofs. Another we have seen mentioned elsewhere and covered in a post is Roger Apéry’s proof that ${\zeta(3)}$ is irrational, which centered on deriving and then analyzing the new identity

$\displaystyle \zeta(3) = \frac{5}{2}\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n^3 {2n \choose n}}.$
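The rapid convergence of this series is easy to check numerically (a quick sketch, not part of the post): the central binomial coefficient grows like $4^n$, so a few dozen terms already agree with $\zeta(3) = 1.2020569\ldots$ to machine precision.

```python
from math import comb

# Partial sum of Apery's series for zeta(3); terms shrink roughly like 4^{-n}.
s = 2.5 * sum((-1) ** (n - 1) / (n ** 3 * comb(2 * n, n)) for n in range(1, 40))
print(s)  # approximately 1.2020569031595942
```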

The Euler examples could be ascribed mostly to “computer-guided good fortune” but the other three clearly involve a new idea attached to a crisp new object: an equation or group or matrix construction. We wonder whether and when the possibility of such objects can be “sniffed out.”

## Problems Ripe For Short Solutions?

I sometimes wonder if we have missed a simple proof—no, short is a better term—a short proof that would resolve one of our favorite open problems. Here are a couple that Ken and I suggest may fall into this category.

${\bullet}$ Péter Frankl’s conjecture. Surely it just needs an equation or formula for the right kind of “potential function”? We noted connections to other forms in a followup post.

${\bullet}$ Freeman Dyson’s conjecture. It stands out even among number-theory statements that are “overwhelmingly probable” in random models, and there ought to be some principle for boxing away potential counterexamples along lines of how certain numbers with fast approximations can be proved to be transcendental. (Then again ${\zeta(3)}$ hasn’t yet been proved transcendental.)

${\bullet}$ Problems involving deterministic finite automata (DFAs) are often easy to solve or show to be decidable. But here is one from Jeff Shallit that is not: Given a DFA, does it accept a string that is the binary representation of a prime number—is this decidable?

${\bullet}$ Even simpler is the question, given binary strings ${x}$ and ${y}$ of length ${n}$, of finding the smallest DFA ${M}$ that accepts ${x}$ but not ${y}$ (or vice-versa). Can we put tight bounds on the size of ${M}$ in terms of ${n}$?

${\bullet}$ Can a Boolean circuit ${C}$ of size ${s}$ computing a function ${f(x_1,\dots,x_n)}$ be converted into a circuit ${C'}$ of size ${O(s)}$ with ${n}$ gates ${g_i}$ each computing the XOR of ${f(x_1,\dots,x_i,\dots,x_n)}$ with ${f(x_1,\dots,\neg x_i,\dots,x_n)}$? This would be a Boolean analogue of the famous Derivative Lemma of Walter Baur and Volker Strassen, which we have discussed here and here.

## Harder Ones?

By contrast, the 3n+1 conjecture of Lothar Collatz does not strike us as a candidate. John Conway proved that related forms are undecidable, and we suspect that it will need deep new knowledge of how intuitively multiplicative properties of integers duck-and-weave under the additive structure.

Regarding the Skolem problem we are on the fence: in a sense it involves DFAs which should be easy, but even for small ${n}$ the connections to deep approximation problems in number theory become clear.

Here are four others—what do you think?:

${\bullet}$ Can polynomials modulo ${6}$ compute the majority function? Of course we mean polynomials of low degree and we allow some flex on the meaning of “compute” provided it relates to polynomial-sized constant-depth circuits.

${\bullet}$ Can we at least prove that SAT cannot be computed in linear time on a Turing Machine? Note that this does not follow from the proof of ${\mathsf{NLIN \neq DLIN}}$ as we discussed here. For all those of you who are sure that not only does ${\mathsf{P \neq NP}}$, but also that ETH is true, how can we not prove this “trivial lower bound?”

${\bullet}$ Can whether an ${n}$-node graph has a triangle be decided in ${O(n^2)}$ time? Or at least better than matrix-multiplication time? Ken knew a visitor to Buffalo who admitted spending a year on this with little to show for it.

${\bullet}$ Can we solve the lonely runner problem? Or at least improve the number of runners that it is known to be true for?
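On the triangle question above, the baseline being chased is concrete: each triangle contributes six closed walks of length 3, so $\mathrm{trace}(A^3) > 0$ iff the graph has a triangle, computable in matrix-multiplication time. The open question is beating that. A sketch:

```python
import numpy as np

def has_triangle(adj):
    """Triangle detection via trace(A^3); each triangle is counted 6 times."""
    a = np.array(adj)
    return np.trace(a @ a @ a) > 0

triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]  # K3: has a triangle
path     = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]  # path on 3 nodes: no triangle
print(has_triangle(triangle), has_triangle(path))  # True False
```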

## Open Problems

How can we possibly forecast how open an open problem is?

### QuantOverflow

#### residential mortgage prepayment modelling

I'm trying to develop a model for predicting prepayments, after reading several articles about it on the net. The model should use market data and be a behavioral model (i.e. regression/survival analysis).

For now I was thinking of following "Modelling prepayment risk: Multinomial logit approach..." (NIBC paper).

My goal is for the model to be used in ALM. My understanding was that every month I'll run the model on each loan, and the model will calculate a full SMM(t) curve, using the interest rate curve and other variables.

To be able to build the SMM(t) curve, which will predict the SMM for each month until the loan ends, I should build a model that predicts the probability that a loan will prepay.

So I should run a logit model.

My question is: when running the logit model, how should my data look, supposing I have 10 years of data? Meaning, should I insert each loan many times (once for each month), as in a panel data model? Won't autocorrelation "kill" it? Should I do something else?

Thank you for your help!

#### Ledoit-Wolf Shrinkage estimator not giving positive definite covariance matrix

I used ten years of daily data for 407 stocks and computed the daily and monthly covariance matrices. Since I have more variables than observations for the monthly matrix, I wasn't surprised to find that matrix not invertible (and hence useless for portfolio optimization). I was surprised to see the daily covariance matrix not invertible as well. I then tried to shrink the matrix with the Ledoit-Wolf shrinkage estimator, using the package tawny. It didn't help: it makes the covariance matrix really, really small, but still not invertible.

Does anyone have any suggestions what could be the problem? How could I improve the covariance matrix?
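For intuition on why shrinkage should restore invertibility, here is a toy NumPy sketch of shrinkage toward a scaled identity target (my own simplified version with a fixed intensity, not tawny's implementation, which estimates the intensity from the data): any positive weight on the identity pushes the smallest eigenvalue strictly above zero, even when variables outnumber observations.

```python
import numpy as np

rng = np.random.default_rng(0)
T, N = 120, 407                 # fewer observations than variables, as in the question
X = rng.standard_normal((T, N))

S = np.cov(X, rowvar=False)     # sample covariance: rank <= T-1 < N, so singular
mu = np.trace(S) / N            # scaled-identity shrinkage target
delta = 0.2                     # fixed shrinkage intensity (Ledoit-Wolf estimates this)
S_shrunk = (1 - delta) * S + delta * mu * np.eye(N)

# Smallest eigenvalue of S_shrunk is at least delta*mu plus (1-delta)*lambda_min(S),
# hence strictly positive: the shrunk matrix is positive definite and invertible.
print(np.linalg.eigvalsh(S_shrunk)[0] > 0)
```

If tawny's output is still not positive definite, the shrinkage intensity it estimated and the numerical scaling of the returns are worth checking before blaming the estimator itself.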

### /r/emacs

#### Help using shell-quote-argument correctly

I'm writing some code where I want to prompt the user for a directory and then run a shell command on that directory and capture the output. My understanding is that it is wise to use shell-quote-argument to sanitise the user input before running the shell command (e.g. to properly escape spaces and other special characters).

However, I can't seem to use it correctly. For example, this simple code works

```
(setq dir "~/")
(shell-command-to-string (concat "ls " dir))
```

and returns the contents of my home directory, but this does not:

```
(setq dir "~/")
(shell-command-to-string (concat "ls " (shell-quote-argument dir)))
```

since shell-quote-argument turns dir into "\\~/" which the shell doesn't recognise as the home directory.

Could anyone tell me what I am doing wrong?
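The same trap exists outside Elisp; Python's shlex.quote behaves like shell-quote-argument and will likewise stop the shell from expanding the tilde. The usual fix is to expand the path in the program before quoting (in Emacs, expand-file-name plays the role of expanduser here). A sketch, assuming the expanded home path itself needs no quoting:

```python
import os
import shlex

d = "~/"
print(shlex.quote(d))  # '~/'  -- quoted, so the shell will NOT expand the tilde

# Expand first, then quote: the program resolves ~ itself, so nothing is lost.
expanded = os.path.expanduser(d)
print(shlex.quote(expanded))  # an absolute path, safe to splice into a command
```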

submitted by benmaughan
[link] [2 comments]

### UnixOverflow

#### How to dual boot PC-BSD 10.3 (with zfs file system) and debian 7 (crunchbang) using grub2 boot loader in MBR?

I want to dual boot PC-BSD 10.3, with ZFS as the root file system (the only file system), and Debian 7 (CrunchBang Linux) on ext4, using the grub2 boot loader installed in the MBR, with grub managed by Debian.

All the documents deal with dual booting PC-BSD/FreeBSD using the UFS2 file system and Debian (How do I add PC BSD / FreeBSD to Grub 2 boot loader?), so I'm asking the question here.

From my Debian grub configuration:

```
$ cat /etc/grub.d/40_custom
#!/bin/sh
exec tail -n +3 $0
# This file provides an easy way to add custom menu entries.  Simply type the
# menu entries you want to add after this comment.  Be careful not to change
# the 'exec tail' line above.
menuentry "PC-BSD" {
    insmod zfs
    set root=(hd0,2)
    chainloader +1
}
```

This entry is detected by grub and shows up on the boot screen, but when I select PC-BSD, it shows the error "UFS not found". I think this is because PC-BSD 10.3 is using ZFS instead of UFS2. Please provide me with a guide to boot PC-BSD with ZFS and Debian using grub2 managed by Debian.

### AWS

#### Welcome to our New Colleagues at Elemental

Earlier today we announced that we had reached an agreement to acquire Elemental Technologies of Portland, Oregon. Elemental has pioneered a number of software-based solutions for multiscreen content delivery and powers many of the world’s most innovative app-delivered video offerings and new services like 4K TV. Elemental customers use its software to process and deliver video streams that are customized for a wide variety of different formats, displays, devices, data rates, and environments. Here’s a recent photo of the Elemental team:

We have been working with Elemental to address shared customers in the Media & Entertainment (M&E) business for the past four years. During this time we have been impressed by their penchant for moving fast and their long-term vision for software-defined video. We quickly realized that we could work together to create solutions that spanned the entire video pipeline. Today’s announcement will allow us to work even more closely together to provide media companies with a family of on-premises, hybrid, and cloud-based solutions for Internet-based video delivery.

Of course, I’ll do my best to learn about and then share information about any and all new offerings as they become available. Perhaps I’ll even make a trip to Portland to interview the Elemental folks for an AWS Podcast episode.
On behalf of the entire AWS team I would like to extend a warm welcome to our new colleagues!

Jeff;

### Planet Theory

#### Whiplashed

I recently watched the movie Whiplash, about a college jazz band director, Fletcher (played by J.K. Simmons), who torments his musicians to force them to be their best. The movie focuses on a drummer, Andrew, which makes for a great audio/video feast, but in its essentials Whiplash is a story of a professor and his student. I can imagine playing the role: “Do you think your proof is correct? Yes or No? Are you applying Toda’s theorem correctly, or are you using the same crazy logic your dad used when he left your mom?” OK, maybe not.

Nevertheless Fletcher has a point. Too often I’m seeing graduate students doing just enough to get a paper into a conference instead of pushing themselves, trying to do great work and still not being satisfied. Fletcher says the two most dangerous words in the English language are “good job”. While that might be a little cruel, we do need to push our students and ourselves to take risks in research and be okay with failing. To roughly quote John Shedd and Grace Murray Hopper, "the safest place for a ship is in the harbor, but that’s not what ships are for."

Whiplash had a different kind of scene that definitely hit home. Andrew could not impress his family with the fact that he was lead drummer in the top college jazz band in the nation. I’ve been there, trying to get my mother excited by the fact that I had a STOC paper early in my career. "That's nice, dear."

### Fefe

#### DE-CIX is now taking active action against net neutrality ...
### DataTau

#### Diagnosing diabetic retinopathy with deep learning

### Lobsters

#### What Comes Next for the Web Platform - Alex Russell

### /r/netsec

#### IIS Level SQL Injection Prevention

### Lobsters

#### Neural Networks, Types, and Functional Programming

### CompsciOverflow

#### Give potential function - binary heap - extract-min in amortized constant time and insert in amortized log time

Consider an ordinary binary min-heap data structure with $n$ elements supporting the instructions INSERT and EXTRACT-MIN in $O(\lg n)$ worst-case time. Give a potential function $\Phi$ such that the amortized cost of INSERT is $O(\lg n)$ and the amortized cost of EXTRACT-MIN is $O(1)$.

I tried to solve it, but I got stuck:

Attempt

Let $c_i$ denote the real cost of the $i$-th operation, and $a_i$ the amortized cost. Let $\Phi(D_i) = \text{number of elements after } i \text{ operations}$.

INSERT: $$a_i=c_i+ \Phi(D_i) - \Phi(D_{i-1}) \le \log(i) + i\log(i) - (i-1)\log(i-1) \le 3\log(i+1) = O(\log(i))$$

EXTRACT-MIN: $$a_i=c_i + \Phi(D_i) - \Phi(D_{i-1}) \le \log(i) + i\log(i) - (i+1)\log(i+1) = \log(i) + i\log(i) - i\log(i+1) - \log(i+1) < 0$$

As you can see, my problem is that I get a number lower than $0$. Can you help me?

### /r/compsci

#### A New Design for Cryptography’s Black Box

### Lobsters

#### Smart Machines Will Take Us With Them

### /r/compsci

#### Designing a mechanical push-down automaton

I want to create a physical, mechanical push-down automaton accepting $a^nb^n$. I'd prefer it if the design were made of simple construction toys (LEGO, K'nex, Lincoln Logs, that kind of thing). I ask because I want something I can demonstrate to high school students that easily demonstrates that style of machine.

submitted by thephotoman
[link] [comment]

### UnixOverflow

#### zfs dataset constantly unmounting itself after some time, others still available

I have a Linux box that I putty into, which is my backup system. It has a ZFS linear span array with three 2 TB drives.
Every night I use FreeFileSync (great program) to sync files to this mount, which is mapped as a network drive.

```
$ zfs list
NAME                                USED  AVAIL  REFER  MOUNTPOINT
san                                3.31T  2.04T  2.87M  /san
san/vault                          3.31T  2.04T   136K  /san/vault
san/vault/falcon                    171G  2.04T   100K  /san/vault/falcon
san/vault/falcon/snapshots          171G  2.04T   171G  /san/vault/falcon/snapshots
san/vault/falcon/version            160K  2.04T    96K  /san/vault/falcon/version
san/vault/gyrfalcon                 564K  2.04T   132K  /san/vault/gyrfalcon
san/vault/gyrfalcon/snapshots       184K  2.04T   120K  /san/vault/gyrfalcon/snapshots
san/vault/gyrfalcon/version         184K  2.04T   120K  /san/vault/gyrfalcon/version
san/vault/osprey                    170G  2.04T   170G  /san/vault/osprey
san/vault/osprey/snapshots         24.2M  2.04T  24.2M  /san/vault/osprey/snapshots
san/vault/osprey/version            120K  2.04T   120K  /san/vault/osprey/version
san/vault/redtail                  2.98T  2.04T  17.2M  /san/vault/redtail
san/vault/redtail/c                 777M  2.04T  72.9M  /san/vault/redtail/c
san/vault/redtail/c/AMD            4.44M  2.04T  4.24M  /san/vault/redtail/c/AMD
san/vault/redtail/c/Users           699M  2.04T   694M  /san/vault/redtail/c/Users
san/vault/redtail/d                1.59T  2.04T   124K  /san/vault/redtail/d
san/vault/redtail/d/UserFiles      1.59T  2.04T  1.59T  /san/vault/redtail/d/UserFiles
san/vault/redtail/d/archive         283M  2.04T   283M  /san/vault/redtail/d/archive
san/vault/redtail/e                1.34T  2.04T   124K  /san/vault/redtail/e
san/vault/redtail/e/PublicArchive  1.34T  2.04T  1.34T  /san/vault/redtail/e/PublicArchive
san/vault/redtail/e/archive         283M  2.04T   283M  /san/vault/redtail/e/archive
san/vault/redtail/snapshots         184K  2.04T   120K  /san/vault/redtail/snapshots
san/vault/redtail/version          44.3G  2.04T  43.9G  /san/vault/redtail/version
```

When looking at the linux, via putty, the mounts are there one minute, then a few minutes later they are gone(zfs list always shows them, you have to traverse into them to see if they have been unmounted, they will be empty, or not in the parent dir at all). These are datasets that are losing their mounts.

san/vault/redtail is empty almost every time when i come back the next morning, or a couple hours later, just after freefilesync starts its synch, but before i can start moving files.

I have tried to export and import , same problem happens still. My data is still in tact...

this command will fix all of them momentarily again(for an unknown duration).. zfs mount -a

This all started (coulpe weeks ago) after i made quite a few child datasets inside the parnt san, where before it was only the san dataset parent(i NEVER had any problems), no child datasets existed when all the data was layed in there. I have since moved out the data, made the datases, and moved back int he data.

My machine that keeps network traversing it to synchronize, the mount gets lost in the duration and my backups have yet to finish in over a month now almost. Before it can tally the files (say 15 minutes to an hour) the fricken mount is gone again, and it hangs. Back ups are in limbo.

I may have fubar'd something when i crated the datasets. I had to delete a few after because i wasnt happy with them. The data wasnt either visible or what, but once i had all said and done, things looked perfect after several reboots and double checks!!

After this, things seemed good, but I looked into nto one of its child data sets san/vault/redtail/c and that fricker is empty too, i think it came unmounted.

before i destroy shit. i need to know whats going on. This data is a duplicate backup of a current healthy system, but this backup is the only, therefore is at the mercy of the health of the source drives.. So i cant afford it to be offline at all..

One more note, on snapshots: I made a snapshot, for the first time, just prior to the apparent breakage. Could that have caused it? Can I possibly use this snapshot to fix something? Is this coincidence, or symptomatic?

See this post, Bulk remove a large directory on a ZFS without traversing it recursively, if you want background on why I created a bunch of datasets with children.

Edit: The question has just been rewritten, due to lack of attention on this post; please read it again.

### StackOverflow

#### Javascript reduce on array of objects

Say I want to sum a.x for each element in arr.

arr = [{x:1},{x:2},{x:4}]
arr.reduce(function(a,b){return a.x + b.x})
>> NaN


I have cause to believe that a.x is undefined at some point.

The following works fine

arr = [1,2,4]
arr.reduce(function(a,b){return a + b})
>> 7


What am I doing wrong in the first example?
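The same pitfall, and the usual fix, can be sketched with Python's functools.reduce (used here as a stand-in for the JavaScript version, since the semantics match): without an initial value, the accumulator starts as the first element itself, an object rather than a number.

```python
from functools import reduce

arr = [{"x": 1}, {"x": 2}, {"x": 4}]

# Without an initial value, the accumulator starts as the first *element*
# (a dict), so an "a.x + b.x" style reducer mixes objects and numbers.
# Supplying 0 as the initial value keeps the accumulator a plain number.
total = reduce(lambda acc, item: acc + item["x"], arr, 0)
print(total)  # 7
```

In JavaScript the analogous fix is passing `0` as the second argument to `reduce` and summing `a + b.x`.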

### QuantOverflow

#### "Hedging" a put option, question on exercise

I have a question on the following exercise from S. Shreve: Stochastic Calculus for Finance, I:

Exercise 4.2. In Example 4.2.1, we computed the time-zero value of the American put with strike price $5$ to be $1.36$. Consider an agent who borrows $1.36$ at time zero and buys the put. Explain how this agent can generate sufficient funds to pay off his loan (which grows by $25 \%$ each period) by trading in the stock and money markets and optimally exercising the put.

The model from Example 4.2.1 he refers to is the following: \begin{align*} S_0 & = 4 \\ S_1(H) & = 8, S_1(T) = 2 \\ S_2(HH) & = 16, S_2(HT) = S_2(TH) = 4, S_2(TT) = 1 \end{align*} with $r = 1/4$ and risk-neutral probabilities $\hat p = \hat q = 1/2$.

So now my question: I am not sure how to solve this. I guess the agent wants in some way to hedge so that he can always pay off the loan by utilising the option. First, how should I think about it: should he pay back his loan at time step $2$, or earlier if possible by exercising the option, or should he stay liquid until the end? And what does optimal mean here, hedging with minimal investment?

Okay, I solved it by considering two scenarios: first exercising the option at time step $1$, and then at time step $2$. By exercising at time step one I found that he has to invest an additional $0.16$, buy $1/2$ of the share, and accordingly borrow $0.16 - 1/2 \cdot 4$ from the bank/money market. Then at the first time step, as the option was exercised and the value of the portfolio equals $(1 + r)1.36$, he could just invest everything risklessly, i.e. readjust his portfolio by not buying any shares of stock and investing $(1+r)1.36$ in the riskless asset; in this way at time step $2$ he could pay $(1+r)^2 1.36$.

In the second scenario, i.e. exercising the option at time step $2$, I found that he has to invest an additional $1.36$ and buy no shares at the initial step, and then readjust his portfolio in the next step to buy $1/12$ of the share if it goes up and $1.06$ if it goes down. By exercising the option, after paying his debt $(1+r)^2 1.36$, his portfolio has the value $1.3$ if it goes up, meaning he still has money, or $-0.3$ if it goes down, meaning he still has some debt (this point I do not understand fully...).

So can someone help me understand and solve this exercise (in case my approach is wrong)?
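As a cross-check of the numbers in the exercise, the time-zero value $1.36$ can be reproduced by backward induction; a minimal sketch, assuming only the model data quoted above:

```python
# Verify the time-zero American put value from Example 4.2.1 by backward
# induction under risk-neutral probabilities p = q = 1/2, with r = 1/4, K = 5.
K, r, p = 5.0, 0.25, 0.5
S2 = {'HH': 16, 'HT': 4, 'TH': 4, 'TT': 1}
payoff = lambda s: max(K - s, 0.0)

V2 = {w: payoff(s) for w, s in S2.items()}
V1 = {
    # at each node: max(exercise now, discounted continuation value)
    'H': max(payoff(8), (p * V2['HH'] + (1 - p) * V2['HT']) / (1 + r)),
    'T': max(payoff(2), (p * V2['TH'] + (1 - p) * V2['TT']) / (1 + r)),
}
V0 = max(payoff(4), (p * V1['H'] + (1 - p) * V1['T']) / (1 + r))
print(round(V0, 2))  # 1.36
```

Note that $V_1(T) = 3$ comes from immediate exercise, which is why early exercise at time $1$ in the tail scenario matters for the hedge.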

### CompsciOverflow

#### Why do we need to read memory on a write miss?

I noticed that in write-back caches, when the CPU wants to write to a block, it first fetches the block from memory and then updates it. If the block is going to be overwritten and changed by the processor anyway, why do we need to read it from memory first?

### /r/compsci

#### What is the best algorithm to use to determine likelihood of a certain party taking action on something?

I am attempting to write an algorithm that will determine the likelihood that a party will bid on an auction, given past behavior, available capital, how much they have or need, etc. I will be writing a shell that runs the algorithm and returns the result.

What algorithm would be best for solving this? Also, what scripting language would accomplish this best?

I am very new to creating/ using algorithms so any and all feedback is greatly appreciated.
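One common starting point for "probability that a party acts" problems is logistic regression over hand-picked features; a sketch, where the feature names (past-bid rate, capital ratio) and the toy data are purely illustrative assumptions:

```python
# Minimal gradient-descent logistic regression; no external libraries.
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, lr=0.1, epochs=2000):
    """Fit bias + weights by per-sample gradient ascent on log-likelihood."""
    w = [0.0] * (len(X[0]) + 1)  # w[0] is the bias term
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = yi - p
            w[0] += lr * err
            for j, xj in enumerate(xi):
                w[j + 1] += lr * err * xj
    return w

def predict(w, x):
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

# Toy features: [past_bid_rate, capital_ratio]; label 1 = placed a bid.
X = [[0.9, 0.8], [0.8, 0.9], [0.1, 0.2], [0.2, 0.1]]
y = [1, 1, 0, 0]
w = train_logistic(X, y)
print(predict(w, [0.85, 0.85]))  # a probability close to 1
```

Real data would need many more features and proper train/test evaluation; in practice a library such as scikit-learn would replace the hand-rolled training loop.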

submitted by EmmonsVentures
[link] [4 comments]

### Planet Theory

#### One Lecture Down....

CS125, the "new", "honors-ish" Algorithms and Complexity course, got off to a good start today.  The room was full with not enough seats for people, the students asked good questions and responded well to questions asked, and we got through the amount of material I expected.  It's year 2 of the class, which is easier than year 1 in some ways (lots of material prepared), and possibly harder in others (some thinking about what needs to be tweaked or fixed from year 1).  We'll see how things shake out next week, but I'm expecting we'll be in the 30-40 student range, like last year.  I can never tell if I managed to scare students off or make them want to take it.  (The challenge is that I want to do both;  scare off people without sufficient background, but interest students who do but might not realize it and might not even be Computer Science majors.)  Pleasantly, I felt very excited during and after the lecture, and will try to hold on to that positive energy.

In other, much much stranger news, Harvard's CS50 appears to have a "backlash" movement, as described in today's Crimson article.  Apparently, according to some, there's intense "social pressure" to take CS50, and students need to be told that it's OK not to take the class.  I find this quite odd and, from my vantage point, misguided.  (Of course, I'm not a college freshman.)  I can't recall any such organized movement against Economics 10 at Harvard, which has been for decades now the most popular class at Harvard, although even when I was a student there was something one could potentially call cult-like about it.  (Cult of Wall Street....)  But that didn't mean people complained;  if you didn't want to take the class, you didn't, not really a thing.  Sure, the CS department here has been actively trying to attract students for decades -- CS50 was a good-sized class even before David Malan took over -- and David has just been very successful at it, with a combination of organization, interesting material, vision, and, yes, good marketing.  Naturally, here in CS, we believe that in our idealized world nearly all undergraduates would have a CS course as part of their liberal arts education, and we provide several other entry courses besides CS50.  I was initially thinking the movement described in the article was just a joke, and maybe I'm being April Fooled, but I'm not sure where those responsible are coming from.

And speaking of bringing in students to CS, Ben Golub and Yaron Singer are doing a new Econ-CS class at Harvard (counts for either;  also good for Applied Math majors) simply entitled Networks.  I'm a bit jealous -- this is a class I've thought about teaching also, but was busy and happy teaching algorithms -- but hopefully now that they've started it up it means I'll get a chance to teach it some year(s) down the line.

More insight into whether our enrollment numbers are still rising (is that still possible?) next week...

### Lambda the Ultimate Forum

#### Implementing "Elements of Programming" in Actor Script

In an effort to understand what ActorScript is like to program with, and because I think Stepanov's "Elements of Programming" represents the best 'core' for any generic programming library, I want to try to translate the Elements into ActorScript. As there is no environment for actually testing code, I thought I would post it here for comments from those who can tell whether it is correct or not. The idea is that each top-level post should be a block of Elements code in ActorScript, open for comments. I will revise the top-level code according to the corrections and suggestions in the comments. I don't plan on using Unicode, but I will try to get the ASCII representations correct too.

### QuantOverflow

#### Does the low CAD positively or negatively impact Canadian Investors? [on hold]

I'm interested in getting into investing, but I have a limited amount of business experience. I plan on putting a small amount of money on the market just to see how it goes.

I don't quite understand how the low CAD affects my investments in American stocks and international currency, though. For a long-term investment, will the eventual rise of the CAD benefit my investments or not?

### /r/netsec

#### August HardenedBSD Status Report

submitted by [deleted]
[link] [comment]

### CompsciOverflow

#### Practical application of Finite State Machines

I am looking for practical applications of Finite State Machines like DFA, NFA, Moore, Mealy machines...

It would be helpful if someone could point to examples from the Linux kernel. I know that DFAs are used in string matching, as in the KMP algorithm.

What is the significance of NFA, Moore and Mealy machines?
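As a sketch of the distinction the question asks about (the machines below are my own toy examples, not drawn from the Linux kernel): a DFA accepts or rejects a whole string, while a Mealy machine emits an output on every transition.

```python
# DFA: binary strings whose value is divisible by 3 (state = value mod 3).
DFA = {(s, b): (2 * s + b) % 3 for s in range(3) for b in (0, 1)}

def accepts(bits):
    state = 0
    for b in bits:
        state = DFA[(state, b)]
    return state == 0  # accepting state: remainder zero

# Mealy machine: rising-edge detector, outputs 1 exactly when input goes 0->1.
def rising_edges(bits):
    state, out = 0, []
    for b in bits:
        out.append(1 if (state == 0 and b == 1) else 0)
        state = b
    return out

print(accepts([1, 1, 0]))             # 110 = 6, divisible by 3 -> True
print(rising_edges([0, 1, 1, 0, 1]))  # [0, 1, 0, 0, 1]
```

A Moore machine would attach the output to the state instead of the transition; NFAs matter mostly as a convenient specification that is then compiled to a DFA (as regex engines and lexers do).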

### Dave Winer

#### Baseball humor

I love going to baseball games because, in certain cases, it's okay to be incredibly rude to people, and it's funny.

For example, we were waiting on the train home from CitiField last night, reveling in a solid Mets win that included an inside-the-park home run, when a group of Phillies fans comes on.

I said something like "Sorry we humiliated you guys again." There is some honor in being a Phillies fan after all.

Then I noticed one of them was wearing a Yankees shirt. "What are you doing here?" A few seconds pass: "I guess they'll let anyone in these days."

PS: This started as a Facebook thread.

### High Scalability

#### How Agari Uses Airbnb's Airflow as a Smarter Cron

This is a guest repost by Siddharth Anand, Data Architect at Agari, on Airbnb's open source project Airflow, a workflow scheduler for data pipelines. Some think Airflow has a superior approach.

Workflow schedulers are systems that are responsible for the periodic execution of workflows in a reliable and scalable manner. Workflow schedulers are pervasive - for instance, any company that has a data warehouse, a specialized database typically used for reporting, uses a workflow scheduler to coordinate nightly data loads into the data warehouse. Of more interest to companies like Agari is the use of workflow schedulers to reliably execute complex and business-critical "big" data science workloads! Agari, an email security company that tackles the problem of phishing, is increasingly leveraging data science, machine learning, and big data practices typically seen in data-driven companies like LinkedIn, Google, and Facebook in order to meet the demands of burgeoning data and dynamism around modeling.

In a previous post, I described how we leverage AWS to build a scalable data pipeline at Agari. In this post, I discuss our need for a workflow scheduler to improve the reliability of our data pipelines, using the previous post's pipeline as a working example.

Scheduling Workflows @ Agari - A Smarter Cron

### StackOverflow

#### EclipseFP is not working

I am trying to install EclipseFP on Eclipse version: Mars Release (4.5.0) on Mac OS Yosemite 10.10.5.

While the Haskell perspective appears, nothing Haskell-specific is working (syntax highlighting, Hoogle search, etc.).

Here is my Eclipse Preferences:

And it seems that the problem is that not all necessary packages could be compiled (though I am not sure this is the root cause).

For example, when I press the "Install from Hackage" button in the Haskell helper executables, it tries to compile but fails because of ghc-pkg-lib-0.3:

Resolving dependencies...
Notice: installing into a sandbox located at
/Applications/Eclipse.app/Contents/MacOS/.eclipsefp/sandbox
Configuring ghc-pkg-lib-0.3...
Building ghc-pkg-lib-0.3...
Failed to install ghc-pkg-lib-0.3
Build log ( /Applications/Eclipse.app/Contents/MacOS/.eclipsefp/sandbox/logs/ghc-pkg-lib-0.3.log ):
Configuring ghc-pkg-lib-0.3...
Building ghc-pkg-lib-0.3...
Preprocessing library ghc-pkg-lib-0.3...
[1 of 1] Compiling Language.Haskell.Packages ( src/Language/Haskell/Packages.hs, dist/dist-sandbox-b2e886dd/build/Language/Haskell/Packages.o )

src/Language/Haskell/Packages.hs:170:13:
Couldn't match type ‘[Char]’
with ‘Distribution.ModuleName.ModuleName’
Expected type: InstalledPackageInfo_
Distribution.ModuleName.ModuleName
Actual type: InstalledPackageInfoString
In the expression: pkgconf
In the expression:
pkgconf {exposedModules = convert e, hiddenModules = convert h}

src/Language/Haskell/Packages.hs:170:47:
Couldn't match type ‘ExposedModule’ with ‘[Char]’
Expected type: [String]
Actual type: [ExposedModule]
In the first argument of ‘convert’, namely ‘e’
In the ‘exposedModules’ field of a record

src/Language/Haskell/Packages.hs:171:39:
Couldn't match type ‘ExposedModule’
with ‘Distribution.ModuleName.ModuleName’
Expected type: [Distribution.ModuleName.ModuleName]
Actual type: [ExposedModule]
In the ‘hiddenModules’ field of a record
In the expression:
pkgconf {exposedModules = convert e, hiddenModules = convert h}
cabal.real: Error: some packages failed to install:
buildwrapper-0.9.1 depends on ghc-pkg-lib-0.3 which failed to install.
ghc-pkg-lib-0.3 failed during the building phase. The exception was:
ExitFailure 1


Could somebody please help me?

### StackOverflow

#### Show progress of Haskell program

I have a list in Haskell with some objects, and I need to find out whether any of these objects satisfies a certain condition. So I wrote the following:

any (\x -> check x) xs


But the problem is that the check operation is very expensive and the list is quite big. I want to see the current progress at runtime, for example 50% (1000/2000 checked).
How can I do this?
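The idea can be sketched in Python, assuming the Haskell any/check setup translates directly: wrap the list in a generator that reports progress as elements are consumed, while keeping any()'s short-circuiting. (In Haskell the analogous trick requires IO, or Debug.Trace, since printing progress is a side effect.)

```python
def with_progress(xs, every=1000):
    """Yield items while printing a progress line every `every` items."""
    total = len(xs)
    for i, x in enumerate(xs, 1):
        if i % every == 0:
            print(f"{100 * i // total}% ({i}/{total} checked)")
        yield x

def expensive_check(x):
    return x % 7919 == 0  # stand-in for the real, costly predicate

xs = list(range(1, 20001))
found = any(expensive_check(x) for x in with_progress(xs, every=5000))
print(found)  # True (stops at the first match, x = 7919)
```

Because the generator is consumed lazily, no further elements are checked after the first match, just as with Haskell's `any`.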

### QuantOverflow

#### Numerical computation of the Heston model integral: Simpson rule or Gauss-Legendre method

I want to price a call option using the Heston model for a given set of parameters.

The integral equation (18) needs to be approximated.

I'd like to either use the Simpson Rule or Gauss-Legendre method for it.

Does anybody have MATLAB or Scilab code for it?
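Not Heston-specific code, but a sketch of the composite Simpson rule that could be applied to the integral in equation (18) once the characteristic-function integrand is implemented; it is checked here against a known integral instead.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson rule on [a, b]; n must be even."""
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3

# Sanity check on a known value: integral of sin over [0, pi] is 2.
print(round(simpson(math.sin, 0, math.pi), 6))  # 2.0
```

For the Heston integrand, which decays as the integration variable grows, one truncates the semi-infinite integral at a finite upper limit and applies the same rule; Gauss-Legendre replaces the evenly spaced nodes with optimally placed ones.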

### /r/compsci

#### Comp Sci +(Computational Linguistics/Neuroscience?)

Hi guys,

I'm back again with a second career advice post. As you may know from my first post, I'm starting college in a few days and will be majoring in computer science. However, I am also interested in two other subjects, which I think will intertwine with computer science; computational linguistics and computational neuroscience. What do you guys think is the best option for me right now, as a freshman, and what career path should I take in the (near) future?

Thanks in Advance, Shaheer

submitted by shaheer0702
[link] [comment]

### QuantOverflow

#### Are low oil prices and low shipping costs really a leading indicator of a shrinking economy?

A recent article in Bloomberg says that lowered shipping costs, in the form of the Baltic Dry Index, and lowered oil prices are in some way a concern for a growing global economy:

http://www.bloombergview.com/articles/2015-08-31/maybe-this-global-slowdown-is-different

This is not the only recent financial article I have read expressing concern about the price of oil and implying that low oil prices are in any way a bad thing for the rest of us.

Now, I'm not going to disagree with part of the article's idea that lowered commodity prices are at the very least, not a good sign for growth. But the rest of the concerns ultimately having to do with lower shipping costs would in my mind have the opposite effect on the global economy than the article implies. Decreased shipping costs should increase trade, and therefore wealth.

Furthermore, it seems to me that it is supply, not demand, that is causing oil prices to fall. Increased engine efficiency should also be decreasing demand. None of this has anything to do with global economic growth.

What valid concerns does this article (et al) have in terms of the implication that lowered fuel costs are a leading indicator of decreased future global economic growth?

Sincerely, Clickbait Victim

### /r/emacs

#### Modal editors, any tips for navigation and inserting in quick succession?

I tried switching to god-mode and evil-mode a couple times each and I've been frustrated by my inability to effectively switch between navigating and editing rapidly.

Maybe it's just the way I approach editing text, but I find that I fall into the following pattern often: I'll type a word or two, see a typo at the beginning of the first word, navigate to it, fix it with some nav and edits, and then nav forward again. Doing this with modifiers just feels more natural to me. With modal editing I find the need to switch modes twice to move the cursor back or forward one char and then edit to be really slow and annoying.

Am I missing some shortcuts or patterns (probably!)? Does it just get better with time? I'd love to hear some anecdotes too. I've been getting pretty bad RSI in my right hand especially, haven't been able to curb it just by switching keyboards.

Edit: typo in title. On phone can't seem to edit it. My bad.

submitted by ganglygorilla
[link] [12 comments]

### /r/compsci

#### Why the disparity of pi?

A colleague of mine recently pointed out to me that commonly in programming, pi is stored as a double with the following value:

3.1415926535897931 

However, when storing pi as a float, it is stored with the following value:

3.14159274 

It's understandable that the approximation would be slightly off, and that a simple rounding may not do the trick, but why is it rounded so oddly? Furthermore, these both seem to be used interchangeably within codebases and projects, with no apparent reasoning as to why.

Any answers would be appreciated. We're interested in the why of the value as well as why you'd interchange them so much. Anyone have any experience development-side with this?
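A sketch of where the second value comes from: 3.14159274 is simply pi correctly rounded to the nearest IEEE-754 single-precision (32-bit) value, which carries only about 7 decimal digits of precision versus about 16 for a double.

```python
import math
import struct

# Round-trip pi through a 32-bit float ('f' format) and back to a double.
as_float32 = struct.unpack('f', struct.pack('f', math.pi))[0]

print(repr(as_float32))     # 3.1415927410125732 (nearest float32 to pi)
print(f"{as_float32:.8f}")  # 3.14159274 (the commonly quoted float value)
print(repr(math.pi))        # 3.141592653589793  (the double value)
```

So the "odd" rounding is just the nearest representable binary32 value printed to 8 decimals; which one a codebase uses should follow the precision of the variables it feeds, rather than being interchanged freely.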

Edit: Thanks for all the replies, they were incredibly useful and helped us understand the various factors that come into play when dealing with this.

submitted by ZedsTed
[link] [34 comments]

### QuantOverflow

#### Research topics - neural networks and market liquidity

I am a master's student looking for some direction on using neural networks on market depth data to help predict market liquidity and bid-ask spreads. Can some of the more experienced people give me some research guidance and point me to some papers I should be reading? I have tried Google Scholar but can't find anything meaningful to kick off my research. Thanks

### QuantOverflow

#### How do I show that there is no tangency portfolio?

Question: Suppose that the risk-free return is equal to the expected return of the global minimum variance portfolio. Show that there is no tangency portfolio.

A hint for the question states: Show there is no $\delta$ and $\lambda$ satisfying

$$\delta\Sigma^{-1}(\mu-R_f\iota)= \lambda\pi_\mu + (1-\lambda)\pi\iota$$

but I'm not sure what to make of it. Any help is appreciated.
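One way to unpack the hint, as a sketch in standard mean-variance notation (candidate tangency weights proportional to $\Sigma^{-1}(\mu - R_f\iota)$, portfolio weights required to sum to one):

```latex
% Sum the components of the candidate tangency weights:
\iota^\top \Sigma^{-1}(\mu - R_f \iota)
  = \iota^\top \Sigma^{-1}\mu - R_f\, \iota^\top \Sigma^{-1}\iota .
% The global minimum variance portfolio has expected return
R_{\mathrm{gmv}} = \frac{\iota^\top \Sigma^{-1}\mu}{\iota^\top \Sigma^{-1}\iota},
% so when R_f = R_{\mathrm{gmv}} the sum above is zero: the weights
% \Sigma^{-1}(\mu - R_f \iota) cannot be rescaled to satisfy
% \iota^\top \pi = 1, hence no tangency portfolio exists.
```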

### UnixOverflow

#### Compiling FreeBSD on OSX

I currently have the source for the FreeBSD kernel. I want to compile it and create an ISO image on my OS X 10.9 computer. Can this be done? Do I need any other tools?

### CompsciOverflow

#### Finite State Machine [on hold]

The number of states required by a finite state machine to simulate the behaviour of a computer with a memory capable of storing $m$ words, with each word being $n$ bits long, is

### Dave Winer

#### Update on Slack-like code

A brief note: The new version of nodeStorage, v0.78, is out.

This is the first version with Slack-like webhooks.

I have been able to use my existing Slack webhook code with this; both run unmodified with this server. So to that extent this is a clone, but I'm not claiming that at this time. They work like Slack, supporting a subset of Slack's capabilities, and are not complete.

You can see the results of my testing in this liveblog post.

### QuantOverflow

#### calculate annualised tracking errors

I have 36 months of relative returns and I need to calculate the annualised tracking error. Using those 36 months of returns, is it simply as below?

st dev(36 months of returns) * sqrt(12)


Why the sqrt(12)?
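The sqrt(12) comes from the standard assumption that monthly active returns are independent, so variance scales linearly in time and standard deviation with the square root of the number of periods per year; a sketch with illustrative numbers:

```python
import math
import statistics

# Illustrative monthly active (portfolio minus benchmark) returns.
monthly_active_returns = [0.010, -0.004, 0.007, 0.002, -0.009, 0.005,
                          0.003, -0.002, 0.008, -0.006, 0.001, 0.004]

monthly_te = statistics.stdev(monthly_active_returns)  # sample st dev
annual_te = monthly_te * math.sqrt(12)                 # 12 months per year
print(round(annual_te, 4))
```

With 36 months of data you would compute the sample standard deviation over all 36 observations and still multiply by sqrt(12), since the observation frequency (monthly) is what sets the scaling factor.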

### Planet Emacsen

#### Irreal: File Local Variables in an Org File

A nice tip from Artur Malabarba: It's often convenient to use file local variables in your Emacs files. They provide a way of setting certain Emacs variables (mode, fill-column, comment-column, etc.) on a per-file basis.

There's a problem, though, when using them in an Org file. Org considers anything after a heading to be part of that heading's subtree. Thus, if you move the node that the file local variables are in, they get dragged along with the node. The answer, Malabarba says, is to put the file local variables in their own COMMENT subtree. Those subtrees don't get exported so it's a perfect solution when using Org to write text intended for export to HTML, LaTeX, Word, and so forth.

Malabarba also gives us a bit of Elisp to make navigating such a buffer easier. If you often use file local variables in Org files, you will probably want to add his code to your Emacs configuration.

### QuantOverflow

#### Interpretation of the CAPM model under Stochastic Portfolio Theory framework

The CAPM under the Modern Portfolio Theory approach is given as:

$$R_i = \beta_i R_\pi$$ where $R_\pi$ is the portfolio's expected excess return.

Under the stochastic portfolio theory approach: $$r_i = \beta_i r_\pi + \frac{\sigma_\pi^2}{2}(\beta_i -1)$$ where $r_i$ is the continuous excess return ($r_i = \log(1+R_i)$) and $\sigma^2_\pi$ is the variance of returns.

I am not sure how to interpret this from a finance perspective (the mathematical derivation is straightforward). The SPT CAPM, in contrast to the MPT CAPM, is said to price a stock's non-systematic risk to the extent it differs from its systematic risk; however, I thought the MPT CAPM does this too? Any economic intuition would be greatly appreciated.

### TheoryOverflow

#### Generate Mesh of Shadow Volume

Given a mesh, I want to generate another mesh that is effectively a shadow volume of this mesh.

I am looking for an algorithm/solution/implementation/reference that can do this under the following constraints:

1. I have a single, point light source. Thus:
• Not a polygon;
• I want hard edges/faces (not soft shadows)
2. The light source is at infinity; in fact it can be assumed to be directly above the mesh. Thus:
• The projection is actually orthographic and depends only on a single direction.
3. I want an exact solution.
• The result mesh is not used for display so cannot be based on resolution dependent rasterization as many algorithms are.

A naive algorithm is to extrude all mesh triangles in the direction away from the light, e.g. downward, and do a CSG union of all these extruded "prisms".
The drawbacks here are:

1. It does this for all triangles and not just the visible ones (how to do exact visibility?);
2. It requires a massive number of CSG operations, which are expensive.

I'm looking for a more efficient algorithm.
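One standard ingredient, sketched here with illustrative vertex data: with the light at infinity directly above (direction (0, 0, -1)), a triangle is lit iff its normal has positive z. Silhouette edges are those shared by one lit and one unlit triangle, and extruding only those, instead of every triangle, avoids most of the naive per-triangle CSG work. This assumes a closed, consistently wound mesh and is only a building block, not the full exact algorithm.

```python
def facing_up(tri):
    # z-component of the cross product (b - a) x (c - a); only x, y matter.
    (ax, ay, _), (bx, by, _), (cx, cy, _) = tri
    return (bx - ax) * (cy - ay) - (by - ay) * (cx - ax) > 0

def silhouette_edges(tris):
    facing = [facing_up(t) for t in tris]
    edge_faces = {}
    for i, t in enumerate(tris):
        for k in range(3):
            e = frozenset((t[k], t[(k + 1) % 3]))
            edge_faces.setdefault(e, []).append(i)
    # Silhouette: manifold edges whose two triangles face opposite ways.
    return [e for e, fs in edge_faces.items()
            if len(fs) == 2 and facing[fs[0]] != facing[fs[1]]]

# Tetrahedron with apex above the base: the base faces down, the three
# sides face up, so the silhouette is exactly the three base edges.
p0, p1, p2, p3 = (0, 0, 0), (1, 0, 0), (0, 1, 0), (0.3, 0.3, 1)
tris = [(p0, p2, p1), (p0, p1, p3), (p1, p2, p3), (p2, p0, p3)]
print(len(silhouette_edges(tris)))  # 3
```

The exact volume is then the downward extrusion of the silhouette loop capped by the lit faces, which needs no rasterization and far fewer Boolean operations than the naive approach.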

### DragonFly BSD Digest

#### BSDNow 105: Virginia BSD Assembly

BSDNow 105 is up, and has all the recent news, plus an interview with Scott Courtney about the in-about-a-month vBSDCon 2015.

### StackOverflow

#### How to denormalize a nested Map in Scala, with the denormalized string containing only values?

I am new to Scala and FP in general and wanted to know how a nested collection can be denormalized in Scala. For example, suppose we had the data of an exam question paper as:

Map("subject" -> "science", "questionCount" -> 10, "questions" ->
Map("1" ->
Map("ques" -> "What is sun?", "answers" ->
Map("1" -> "Star", "2" -> "Giant bulb", "3" -> "planet")
), "2" ->
Map("ques" -> "What is moon?", "answers" ->
Map("1" -> "Giant rock", "2" -> "White lamp", "3" -> "Planet")
)
)
)


When denormalized into strings, using only the values, it can be written as:

science,2,1,What is sun?,Star
science,2,1,What is sun?,Giant bulb
science,2,1,What is sun?,planet
science,2,2,What is moon?,Giant rock
science,2,2,What is moon?,White lamp
science,2,2,What is moon?,Planet


I understand that when I use map to process each item in a collection, it returns exactly the same number of items. And while flatMap can be used to process each item into multiple items and return more items than are in the collection, I am unable to understand how I can do the denormalization using it.

Also, is it possible to do a partial denormalization like this?

science,2,1,What is sun?,[Star,Giant bulb,planet]
science,2,2,What is moon?,[Giant rock,White lamp,Planet]
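A sketch of the same nested flat-mapping transposed to Python (using a question count of 2 to match the sample output above): the inner loop of the comprehension plays the role of flatMap, expanding each question into one row per answer.

```python
paper = {
    "subject": "science", "questionCount": 2, "questions": {
        "1": {"ques": "What is sun?",
              "answers": {"1": "Star", "2": "Giant bulb", "3": "planet"}},
        "2": {"ques": "What is moon?",
              "answers": {"1": "Giant rock", "2": "White lamp", "3": "Planet"}},
    },
}

rows = [
    f'{paper["subject"]},{paper["questionCount"]},{qid},{q["ques"]},{ans}'
    for qid, q in sorted(paper["questions"].items())
    for ans in q["answers"].values()   # the inner "flatMap" step
]
for row in rows:
    print(row)
```

The partial denormalization keeps the inner collection intact: drop the inner loop and format `list(q["answers"].values())` into the row instead. In Scala the same shape is a for-comprehension desugaring to flatMap/map.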


### TheoryOverflow

#### Set partition and exact cover

In my research I found that an algorithm which calculates all solutions of an exact cover is Knuth's Algorithm X.

Example:

Universe = {1, 2, 3, 4, 5, 6, 7}

Sets = [
  {1, 2, 3, 4},  cost 10
  {5, 6, 7},     cost 20
  {5},           cost  5
  {6, 7},        cost 10
]

In this example we have two possible solutions: {1,2,3,4} and {5,6,7} with cost 30, and {1,2,3,4}, {5} and {6,7} with cost 25. My aim is to get the minimum cost together with the associated subsets.

Any suggestions on how to attack this problem?
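For an instance this small, brute force over subsets of the set list already finds the minimum-cost exact cover; a sketch (the general problem is NP-hard, so this does not scale):

```python
from itertools import combinations

universe = frozenset({1, 2, 3, 4, 5, 6, 7})
sets = [(frozenset({1, 2, 3, 4}), 10), (frozenset({5, 6, 7}), 20),
        (frozenset({5}), 5), (frozenset({6, 7}), 10)]

best = None
for r in range(1, len(sets) + 1):
    for combo in combinations(sets, r):
        members = [s for s, _ in combo]
        # Exact cover: the chosen sets are disjoint and cover the universe,
        # i.e. their sizes sum to |universe| and their union equals it.
        if sum(len(s) for s in members) == len(universe) \
                and frozenset().union(*members) == universe:
            cost = sum(c for _, c in combo)
            if best is None or cost < best[0]:
                best = (cost, members)

print(best[0])  # 25
```

For serious sizes one would instead extend Algorithm X / dancing links with a cost bound (branch and bound), pruning any partial cover whose cost already exceeds the best complete one found.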

### QuantOverflow

#### How to pull and/or calculate the implied volatility of an ATM call as it's quoted every day, in Python? [on hold]

Set of tasks:

1. Pull quote for option with terms T = 1, S = K
2. Calculate Implied Volatility of this option
3. Plot datapoints

Thank you folks!
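For task 2, assuming European-style quotes and the Black-Scholes model, implied volatility can be recovered by bisection; a sketch with illustrative ATM inputs (S = K = 100, T = 1, r = 1%):

```python
import math

def norm_cdf(x):
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def bs_call(S, K, T, r, sigma):
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)

def implied_vol(price, S, K, T, r, lo=1e-6, hi=5.0, tol=1e-8):
    # The call price is increasing in sigma, so bisection converges.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if bs_call(S, K, T, r, mid) < price:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

quote = bs_call(100, 100, 1.0, 0.01, 0.25)  # synthetic "market" quote
print(round(implied_vol(quote, 100, 100, 1.0, 0.01), 4))  # 0.25
```

Pulling live quotes (task 1) depends entirely on your data vendor's API; once you have a daily price series, applying `implied_vol` per day gives the series to plot (task 3).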

### Fefe

#### If you're wondering where today's stench is coming from: ...

If you're wondering where today's stench is coming from: the FAZ let Orban pour out his xenophobic agitation in its politics section. Perhaps it's better that Schirrmacher did not have to live to see this. Disgusting.

### UnixOverflow

#### Can I install FreeBSD, OpenBSD, NetBSD and DragonFly on one disc?

Is it possible to install all of FreeBSD, OpenBSD, NetBSD and DragonFly BSD on a single disc?

What would be the proper procedure for success, and what are the caveats?

### Fred Wilson

#### Forgive and Forget

My partner Albert has long been arguing on his blog (which you should be reading) that technology is bringing massive changes to society and we are going to have to change the way we think about things in reaction to these changes. One of those areas is privacy and the “post privacy world” that we are entering.

Last week he wrote this about the Ashley Madison hack and argued that society needs to be accepting of and forgiving of transgressions like using a website to arrange extramarital affairs. When I read that post, I thought “oh my, that’s pretty out there.” Mind you, I don’t disagree with Albert’s point at all, I just thought most people aren’t going to see it that way.

So I was pleasantly surprised this morning to see Farhad Manjoo make essentially the same argument in the New York Times.

Farhad quotes a security expert at the end of his post:

True online security is not just defending against compromise, it’s operating under the assumption that compromise will happen.

And of course that is true for the technologists who work in the teams that strive to keep our systems secure. But if you, like Albert and Farhad, take that point to its logical conclusion, then all of us are in for having our deepest, darkest secrets outed at some point. So let's hope society becomes more forgiving over time. It's going to have to.

### TheoryOverflow

#### exact cover set problem

I am searching for a heuristic algorithm for the weighted exact cover problem described here: http://en.wikipedia.org/wiki/Exact_cover

In my research I only found algorithms which calculate all solutions without any cost function, like http://en.wikipedia.org/wiki/Knuth%27s_Algorithm_X

Do you know some algorithm for that?

My problem has a universe of size 200 and about 500,000 subsets, so it is not possible to calculate all solutions.

--------------EDIT--------------

Example:

Universe = {1, 2, 3, 4, 5, 6, 7}

Sets = [
  {1, 2, 3, 4},  cost 10
  {5, 6, 7},     cost 20
  {5},           cost  5
  {6, 7},        cost 10
]

In this example there are two possible solutions: {1,2,3,4} and {5,6,7} with cost 30, and {1,2,3,4}, {5} and {6,7} with cost 25.
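A hedged sketch of one possible heuristic: greedy selection by cost per newly covered element, restricted to sets that stay disjoint from what has already been chosen. It can dead-end and is not optimal in general; a real solver would add backtracking, restarts, or local search on top.

```python
def greedy_weighted_cover(universe, sets):
    """sets: list of (frozenset, cost). Returns chosen (set, cost) pairs
    forming an exact cover, or None if the greedy choice dead-ends."""
    remaining, chosen = set(universe), []
    while remaining:
        # Only sets lying entirely inside `remaining` keep the cover exact.
        candidates = [(s, c) for s, c in sets if s and s <= remaining]
        if not candidates:
            return None  # dead end; a full solver would backtrack here
        s, c = min(candidates, key=lambda sc: sc[1] / len(sc[0]))
        chosen.append((s, c))
        remaining -= s
    return chosen

universe = {1, 2, 3, 4, 5, 6, 7}
sets = [(frozenset({1, 2, 3, 4}), 10), (frozenset({5, 6, 7}), 20),
        (frozenset({5}), 5), (frozenset({6, 7}), 10)]
sol = greedy_weighted_cover(universe, sets)
print(sum(c for _, c in sol))  # 25 on this instance
```

On the example above the greedy choice happens to find the optimum (cost 25), but that is not guaranteed; with 500,000 subsets each greedy pass is cheap, so running many randomized tie-breaking passes and keeping the best is a common refinement.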

### QuantOverflow

#### Bloomberg implied volatility smile for equities

I was wondering if someone knows how Bloomberg does their computations for the implied volatility smile for equities.

As far as I understand, they use a lognormal mixture to model the stock prices. But I could not find any more documentation about this topic. Thanks in advance.

### CompsciOverflow

#### What does atomicity mean in distributed systems?

In a distributed system, how are atomic operations done when network connections inherently have the Two Generals problem?

Does atomicity have a different meaning in this context?

### QuantOverflow

#### Calculating VaR with Monte Carlo simulation

I would like some help here :) I have a problem calculating VaR with Monte Carlo simulation. I have followed the steps below; is this the right way to calculate VaR, or do I need something more?

1. Generate random numbers

2. Define the correlation matrix

3. Define volatilities, drift and weights

4. Cholesky-decompose the correlation matrix

5. Multiply the random numbers by the Cholesky matrix

6. Multiply the result of step 5 by volatility and drift

7. Take the exponent of the results from step 6

8. Take log returns of the step 7 results

9. Create the weighted portfolio returns

10. Calculate the VaR (use the percentile function at the right confidence interval)

11. Calculate the volatilities of your random numbers

12. Cross-check with analytical VaR

Thank you in advance!
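A minimal two-asset sketch of steps 1 to 10 (all parameter values are illustrative assumptions, and the 2x2 Cholesky factor is written out by hand rather than computed generally):

```python
import math
import random

random.seed(42)

n_sims = 20_000
rho = 0.5                     # illustrative correlation
vols = [0.20, 0.30]           # illustrative annual volatilities
drifts = [0.05, 0.07]         # illustrative annual drifts
weights = [0.6, 0.4]
horizon = 1 / 252             # one trading day

# Step 4: Cholesky factor of the correlation matrix [[1, rho], [rho, 1]].
l21, l22 = rho, math.sqrt(1 - rho ** 2)

port_returns = []
for _ in range(n_sims):
    z1, z2 = random.gauss(0, 1), random.gauss(0, 1)  # step 1
    e1, e2 = z1, l21 * z1 + l22 * z2                 # step 5: correlate
    rets = []
    for mu, sig, e in zip(drifts, vols, (e1, e2)):
        # Steps 6-8: drift/vol scaling, exponentiate to a simple return.
        log_r = (mu - 0.5 * sig ** 2) * horizon + sig * math.sqrt(horizon) * e
        rets.append(math.exp(log_r) - 1)
    port_returns.append(sum(w * r for w, r in zip(weights, rets)))  # step 9

# Step 10: loss at the 1st percentile of the return distribution.
port_returns.sort()
var_99 = -port_returns[int(0.01 * n_sims)]
print(round(var_99, 4))
```

The cross-check (step 12) would compare this against the analytical one-day VaR, roughly 2.33 times the daily portfolio volatility implied by the same weights and correlation.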

### CompsciOverflow

#### Proving relationship between set complement, intersection and union [migrated]

Show that $S_1 \cup S_2 = \overline{\overline{S_1}\cap \overline{S_2}}$.

So we want to show that the union of $S_1$ and $S_2$ is equal to the complement of the intersection of the complements of $S_1$ and $S_2$. I hope that makes sense!

I created some finite sets with some random values. Here's what I have so far but I'm stuck!

\begin{align*} S_1 &= \{1, 2, 3, 4\}\\ S_2 &= \{2, 3, 4, 5, 6\}\\ S_1 \cup S_2 &= \{1, 2, 3, 4, 5, 6\}\\ \overline{S_1} &= \{5, 6\} &&\text{(everything in $S_2$ but not in $S_1$)}\\ \overline{S_2} &= \{1\} &&\text{(everything in $S_1$ but not in $S_2$)} \end{align*} I don't see a way to intersect $\overline{S_1}$ and $\overline{S_2}$; can someone point me in the right direction, please? Am I on the right track, or way off?
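A finite sanity check (not a proof), taking the universe to be $S_1 \cup S_2$ as the example implicitly does: once complements are defined relative to an explicit universe, the identity is exactly De Morgan's law, and the intersection of the two complements is simply whatever both sets are missing.

```python
U = {1, 2, 3, 4, 5, 6}          # the universe the example implies
S1 = {1, 2, 3, 4}
S2 = {2, 3, 4, 5, 6}
comp = lambda s: U - s          # complement relative to U

# Here comp(S1) & comp(S2) is empty, so its complement is all of U.
assert S1 | S2 == comp(comp(S1) & comp(S2))
print(S1 | S2)  # {1, 2, 3, 4, 5, 6}
```

For the actual proof, show each side contains the other: x is in neither complement iff x is in $S_1$ or in $S_2$.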

### UnixOverflow

#### Emacs and gvim display no content on FreeBSD 10.2

I just installed FreeBSD 10.2 with the Mate desktop environment and most things seem to work. I have a problem with emacs and gvim though. Since the problem has the same symptoms with both editors, I suppose the cause is somewhere else.

In both editors, the content of the edit buffer is initially black, even if it contains text. If I click with the mouse on the window top bar and move it slightly, the content is repainted and looks good: fonts, colours, syntax highlighting are as expected. But as soon as I type something in the editor window, the content becomes black again.

In gvim, I also tried the following:

1. I open a file -> content is black
2. I move the editor window -> content is displayed
3. I jump to the end of the buffer with G -> content is black again

In emacs I opened slime and it had similar problems:

1. The buffer is initially black.
2. I move the editor window -> buffer is repainted, fonts and colours are good
3. I type something in the slime buffer -> buffer is black again

I tried to produce screenshots using import but, interestingly, the resulting images show the correct content of the editor windows, whereas the window on screen still has black buffers. It seems that the window contents are rendered correctly but the result is not displayed on the screen.

I have also tried out gedit and it works properly.

Both emacs and vim work properly in a text console.

The only explanation I can think of is that it might be an Xorg problem (missing or wrong repaint event or something like that) but I have no clue as to where I should look now. Any ideas?

EDIT

I just found out that the problem disappears if in

System -> Preferences -> Windows -> General


I check the option "Enable software compositing window manager"

Note that I have this problem on FreeBSD only. On Debian Wheezy with Mate and "software compositing window manager" disabled, emacs and gvim work normally.

### StackOverflow

#### Does flatMap functional signature (input -> output) proposes its doing any flattening?

flatMap signature:

    /* Applies a transformation of the monad "content" by composing
     * this monad with an operation resulting in another monad instance
     * of the same type.
     */
    def flatMap[B](f: A => M[B]): M[B]


Is there any way to tell from its signature alone (input to output), apart from the name flatMap, that it flattens the structure? Or must I read its implementation to understand that? Isn't it good coding practice that I can tell from a function's signature (input to output, even without the function's name) exactly what it does? If so, how does flatMap follow this basic practice, or does it violate it?
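For intuition, here is a hypothetical Python analogue of flatMap for the list monad. The signature is arguably the tell: the function argument maps `A -> list[B]`, yet the result is `list[B]` rather than `list[list[B]]`, and the only way to get there is to flatten one level:

```python
def flat_map(f, xs):
    """Apply f (which returns a list) to each element and flatten one level.

    f: A -> list[B];  flat_map returns list[B], not list[list[B]].
    """
    return [y for x in xs for y in f(x)]

# Compare with plain map, which keeps the nested structure:
nested = [[x, -x] for x in [1, 2, 3]]          # [[1, -1], [2, -2], [3, -3]]
flat = flat_map(lambda x: [x, -x], [1, 2, 3])  # [1, -1, 2, -2, 3, -3]
print(flat)
```

In other words, `A => M[B]` as the argument combined with `M[B]` (not `M[M[B]]`) as the result already forces a flattening step; the name is just a reminder.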

### QuantOverflow

#### Do futures have predictive value?

Futures closely mirror their underlying, as can be seen in the charts below. Eventually, at expiration, they reach the value of the underlying. However, they seem to show no extra information about the underlying's future; i.e. they don't trend any closer to the underlying's value at expiration than the underlying itself does.

Below is an underlying, SPY.

And below is the ES futures Dec 2012 contract.

However, it is possible to calculate how often a given future has been directionally correct. For instance, SPY closed at 142.79 on Dec 21, 2012. To determine the directional predictive power of the ES future, one would run the following pseudocode in a backtesting system:

    $SPY_close = 142.79
    $future_predicted_direction = 0
    $days = 0
    FOR $day (2012-01-01 .. 2012-12-21) {
        $days++
        IF $EOD_quotes["SPY"][$day] > $SPY_close
        THEN $future_predicted_direction += $EOD_quotes["ES"][$day] <= $EOD_quotes["SPY"][$day]
        ELSE $future_predicted_direction += $EOD_quotes["ES"][$day] >= $EOD_quotes["SPY"][$day]
    }

    print "ES futures correctly predicted direction $future_predicted_direction out of $days days, that is, "
        + int($future_predicted_direction / $days * 100) + "% of the time."


Are there any studies on this? Do futures have any predictive power?

### Planet Emacsen

#### Jorgen Schäfer: Elpy 1.9.0 released

I just released version 1.9.0 of Elpy, the Emacs Python Development Environment. This is a feature release.

Elpy is an Emacs package that brings powerful Python editing to Emacs. It combines a number of other packages, written both in Emacs Lisp and in Python.

## Quick Installation

Evaluate this:

    (require 'package)
    (add-to-list 'package-archives
                 '("elpy" .
                   "https://jorgenschaefer.github.io/packages/"))

Then run M-x package-install RET elpy RET.

Finally, run the following (and add them to your .emacs):

    (package-initialize)
    (elpy-enable)

## Changes in 1.9.0

• Elpy now supports the autopep8 library for automatically formatting Python code. All refactoring-related code is now grouped under C-c C-r. Use C-c C-r i to fix up imports using importmagic, C-c C-r p to fix up Python code with autopep8, and C-c C-r r to bring up the old Rope refactoring menu.
• C-c C-b will now select a region containing surrounding lines of the current indentation or more.
• C-c C-z in a Python shell will now switch back to the last Python buffer, allowing you to use the key to cycle back and forth between the Python buffer and the shell.
• The pattern used for C-c C-s is now customizable via elpy-rgrep-file-pattern.
• <C-return> now can be used to send the current statement to the Python shell. Be careful, this can break with nested statements.
• The Elpy minor mode now also works in modes derived from python-mode, not just in the mode itself.

Thanks to ChillarAnand, raylu and Chedi for their contributions!

### CompsciOverflow

#### Where can I copy-out the windows 7 PATH variables on an external drive? [on hold]

I had to change my hard drive because I switched to an SSD, only to find out that my old hard drive actually had bad sectors, which is why I cannot mirror the whole drive. Now the most annoying part is the PATH variables I had entered, and I am not sure I will get them all back together. I found out that they are saved under the following registry paths

System variables:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Environment

User variables:

HKEY_CURRENT_USER\Volatile Environment

The registry hives are saved at %windir%\System32\config, but the question is not how to find them but how to open or read them and copy out only the PATH variables. I am thankful for any help!

### TheoryOverflow

#### Proof for multiplicative dominance of universal probability distribution

I'm looking for a proof of the Leonid Levin theorem stating that the universal prior distribution multiplicatively dominates all other functions of its type. The original article is Levin 1974, but it is in Russian. Thanks!

### CompsciOverflow

#### How can we make $O(N)$ order statistic queries on a Fibonacci heap in $O(N)$?

We have a Fibonacci heap with $N$ unique elements, and we want to make $O(N)$ order statistic queries (e.g., what is element number 7 in this collection if it were sorted?). Moreover, we know that the order statistic queries will be made in order (e.g., for $O(N)$ queries: what are elements number $1, 3, 5, \ldots, N$?).

How can we do it in $O(N)$ time? We can't change the data structure in any way.

My thoughts:

1. The naive way is just to do delete-min $N$ times, but that would take $O(N \log N)$, so it's not good enough.

2. If we could somehow use increase-key (if it's a min-heap; decrease-key if it's a max-heap), we could find the min in constant time and increase its key so that it is larger than everything else in the heap. That would avoid delete-min and its amortized $O(\log N)$ bound entirely, but we are not allowed to modify the data structure.

EDIT: Sorry, I didn't make it clear: the heap doesn't have trees waiting to be melded. Otherwise it's not possible, since the heap could have $N$ trees of degree 0, and we can't sort $N$ items in $O(N)$ time.
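The naive baseline from thought (1) can be sketched with Python's heapq standing in for the Fibonacci heap (heapq is a binary heap, so this only illustrates the $O(N \log N)$ delete-min approach, not the $O(N)$ goal):

```python
import heapq

def order_statistics_by_deletemin(items, queries):
    """Answer order-statistic queries by repeated delete-min: O(N log N).

    `queries` are 1-based ranks asked in increasing order, e.g. [1, 3, 5].
    """
    heap = list(items)
    heapq.heapify(heap)                     # O(N)
    answers, rank = [], 0
    for q in sorted(queries):
        while rank < q:
            smallest = heapq.heappop(heap)  # O(log N) each, O(N log N) total
            rank += 1
        answers.append(smallest)
    return answers

print(order_statistics_by_deletemin([9, 4, 7, 1, 8, 3, 6], [1, 3, 5]))
# [1, 4, 7]
```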

### CompsciOverflow

#### Any Natural Problems shown Easy by Reduction to Horn SAT?

To show that a problem is polynomial-time solvable, an often-successful technique is to reduce it to 2SAT (that is the problem of deciding satisfiability of CNF formulas with every clause containing at most 2 literals).

On the other hand, I never seem to have success with reducing problems to Horn SAT (that is the problem for CNFs in which every clause contains at most 1 positive literal). This might not be too surprising given that the algorithm solving Horn SAT is (in some sense) more basic than the one for 2SAT, and because satisfiable Horn formulas have minimal models while most other problems don't.

Nevertheless, are there some examples of problems that are (most naturally) shown to be easy by reduction to Horn SAT?
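For background, Horn SAT is decided by unit propagation from the positive unit clauses, which also yields the minimal model mentioned above. A small Python sketch (the clause encoding is my own, and this quadratic version omits the watched-literal bookkeeping a genuinely linear-time implementation needs):

```python
def horn_sat(clauses):
    """Decide satisfiability of a Horn formula by unit propagation.

    Each clause is a set of integer literals (positive = variable,
    negative = negated variable) with at most one positive literal.
    Returns the minimal model as a set of true variables, or None if UNSAT.
    """
    true_vars = set()
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            # A clause forces its positive literal once every negated
            # variable in it is already true.
            if all(-lit in true_vars for lit in clause if lit < 0):
                positives = [lit for lit in clause if lit > 0]
                if not positives:          # all-negative clause violated
                    return None
                if positives[0] not in true_vars:
                    true_vars.add(positives[0])
                    changed = True
    return true_vars

# (x1) ∧ (¬x1 ∨ x2) ∧ (¬x2 ∨ ¬x3): minimal model {x1, x2}
print(horn_sat([{1}, {-1, 2}, {-2, -3}]))  # {1, 2}
```

Reducing a problem to Horn SAT therefore hands you this minimal model for free, which is one reason such reductions feel more rigid than reductions to 2SAT.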

### TheoryOverflow

#### Intro CS Theory Question - Finite Sets [on hold]

I am starting an intro to CS theory course and have to answer the following question:

Show that if S1 and S2 are finite sets with |S1| = n and |S2| = m, then |S1 U S2| <= n + m.

I can show this by creating some random finite sets, taking the union and comparing the sizes, but I am a bit confused as to how to express this answer in proper CS theory form, whatever that may be. The question doesn't state whether to use a certain proof technique (proof by contradiction and induction are the only things we've been introduced to so far), so I'm not sure if what I've done is sufficient?

S1 = {1, 2, 3}
S2 = {4, 5, 6}
|S1| = 3
|S2| = 3
S1 U S2 = {1, 2, 3, 4, 5, 6}
|S1 U S2| = 6
3 + 3 <= 6
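For what it's worth, the inequality follows from inclusion-exclusion, $|S_1 \cup S_2| = |S_1| + |S_2| - |S_1 \cap S_2|$, plus the observation that $|S_1 \cap S_2| \ge 0$. A quick randomized spot-check in Python (an illustration only, not a proof):

```python
import random

# Spot-check |S1 ∪ S2| = |S1| + |S2| - |S1 ∩ S2| <= |S1| + |S2|
# on random finite sets. The actual proof just notes |S1 ∩ S2| >= 0.
for _ in range(1000):
    S1 = set(random.sample(range(20), random.randint(0, 10)))
    S2 = set(random.sample(range(20), random.randint(0, 10)))
    assert len(S1 | S2) == len(S1) + len(S2) - len(S1 & S2)
    assert len(S1 | S2) <= len(S1) + len(S2)
print("all checks passed")
```

Note the example with disjoint sets above only exercises the equality case; the intersection term is what makes the bound an inequality in general.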


### Lobsters

#### Garbage collection thoughts

This has already been submitted and is originally from 2012, but I think it is interesting to re-discuss now that Rust has finally been released. How much does it still match the language and problems described in this post?

Comments

### Planet Theory

#### FOCS 2015 and KARPFest80

[Forwarding an announcement by Prasad Raghavendra –Boaz]
FOCS 2015 will be held at Berkeley, California on October 18–20, 2015. Registrations are open at:
The deadline for early registration is Sept 25th.
KARPfest80

On Saturday October 17, the day immediately before FOCS 2015, the Simons Institute for the Theory of Computing will host a celebration of the work of Dick Karp on the occasion of his 80th birthday. This event, which features a dozen speakers from across the spectrum of theoretical Computer Science, will also be held at the DoubleTree Marina Hotel, the same venue as FOCS itself. All participants are asked to register to attend; registration is free, and is done in conjunction with FOCS registration. Visit KARPfest80 for more information.

### Planet Theory

#### Automated drawing and optimization of syntax diagrams

If you've ever written a computer program you probably learned a little bit about context-free grammars, a formal method of describing the syntax of most programming languages that can be turned automatically into code for parsing those languages through the magic of compiler-compilers such as yacc. They're used for most other kinds of formal language, too: see for instance the JSON home page, which includes a grammar describing the syntax of data-description language JSON.

Go look again at the JSON page more carefully, and you'll notice that there are actually two different descriptions of the same context-free language. In a sidebar on the right of the page is a context-free grammar, a textual description of the JSON language. But more prominently, to the left of the grammar, are a bunch of diagrams that look like railroad tracks, called syntax diagrams. You can generate any string in the language by starting at the left terminal of the object track, following any smooth path in the track until you get to the right terminal, copying down the train stations you passed through on your way, and then expanding out the rectangular ones in the same way recursively. This idea of following smooth paths in a railway-like diagram has been floating around in graph drawing under the name of "confluent drawings" for a few years, but it's been used in syntax diagrams for much longer. It's really a generative model rather than a discriminative model, in that it tells you how to form all possible strings in the formal language but it doesn't tell you how to parse a string; nevertheless, it carries the same basic information as the grammar.

Or does it? Look a third time at the JSON page, and you might notice that the syntax diagram does not reflect the exact structure of the grammar. For instance, the grammar says that there should be only two smooth paths through the "object" track: one that passes through the two train stations "{" and "}", and a second one that passes through the "members" station between these two. But the diagram has a looped section of track that allows you to form infinitely many smoothed paths! They both describe the same language, but not in the same way. And this is a clue that the diagram must have been drawn by hand, or possibly drawn from a different grammar, because all the software I can find on the net for drawing syntax diagrams only performs a direct translation of a grammar to a diagram.

So this is where my new preprint, Confluent Orthogonal Drawings of Syntax Diagrams (arXiv:1509.00818, with Michael Bannister and David Brown, to appear at Graph Drawing) comes in. We take the point of view that writing grammars to be parsed automatically (essentially, a discriminative view of formal languages) and drawing syntax diagrams to visualize the strings that belong to the formal language (a generative view) are two different things, and that one can perform optimizations while going from one to the other (as the JSON people did) to make the syntax diagram easier to understand. Our paper envisions a third, intermediate description of a formal language (like the train station between the two brackets) that is abstract enough to allow these sorts of optimizations while also being flexible enough to describe more train tracks than the series-parallel ones that one gets from direct translations of grammars.

In this third description, a formal language is described as a set of nondeterministic finite automata, one for each of the tracks of a diagram or for each of the nonterminals of a grammar. However, unlike the usual form of NFAs that can only recognize regular languages, the ones we use have alphabets that include both terminals and nonterminals. You can use these to generate strings in the formal language in the same way that you would with syntax diagrams: generate a string in the language by following a path from start to accept state of the initial NFA but then recursively do the same thing to expand each nonterminal in the string.

An NFA is just a special kind of graph, and if we apply standard graph drawing techniques to this graph we get something that resembles a syntax diagram. But not quite: what look like vertices in the syntax diagram (its train stations) are actually not vertices, but edges, in the NFA. And the parts of the syntax diagram where the parts of the track come together, that look like confluent junctions of multiple edges in a confluent graph drawing, are actually not junctions, but vertices, in the NFA. So with a little care in the layout and a little transformation of how we draw each kind of object, we can turn our graph into a diagram. For instance this NFA (optimized from a grammar for LISP 1.5 and drawn with some edges overlapping each other so that it looks confluent but isn't really):

turns into this syntax diagram:

With this representation in hand, we can go about optimizing the diagram to improve the visualization. There are a lot of things we could have tried, but in this paper the main ones are local optimizations to individual diagrams (such as pulling together tracks that go to the same place), replacement of tail-recursive calls of a diagram within itself to create loops, and replacement of nonterminals in one diagram by copies of the corresponding diagram, to reduce the number of different components in the visualization at the expense of making individual components more complicated.

The paper also contains some experimental testing to show that these optimizations are effective, but I don't think that's really the point. The real point of the paper is to introduce the idea of algorithmically improving the appearance of syntax diagrams, rather than translating grammars directly to diagrams, and to provide both a software pipeline and an intermediate data structure that make that improvement possible.

Unfortunately, although (unlike many of my papers) we implemented what we wrote about, I don't think our actual code is in good shape for public release. And my co-author who wrote all the code (David Brown, an undergraduate at the time) has graduated and gone on to better things. But now that we've done this research and understand better what we should have been doing in the first place, I think a second system built along the same principles would work much more smoothly and could be a very helpful visualization utility.

### QuantOverflow

#### George Soros models

Mr. Soros in his books talked about principles which are not used by today's financial mathematics, namely the reflexivity of all actions on the market. Simply put: traders' expectations are based on news and historical prices; they trade based on their expectations and hence influence prices, which then influence expectations on the next "turn". I have never come across an implementation of these ideas in a scientific framework. Have you? If yes, please give me a reference.

#### How to create a basket of currency pairs with the lowest correlation in R?

My strategy is designed to buy and sell all assets of a universe and rebalance periodically. It goes either long or short. To limit risk exposure to any single currency I would like the assets in the universe to have little relationship and low price correlation between them (positive or even negative, as the strategy can go short).

For example, if the universe has 3 currency pairs A, B and C, my objective is to have the price correlations c1 (between A and B), c2 (between B and C) and c3 (between A and C) as low as possible. Let's say c0 is the average of c1, c2 and c3; it gives information on the "global" correlation between pairs A, B and C.
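The "global" correlation c0 is just the average of the pairwise correlations in the basket; a dependency-free Python sketch (the price series are made up for illustration):

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Plain Pearson correlation of two equal-length, non-constant series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def mean_pairwise_correlation(series):
    """c0: average correlation over all distinct pairs in the basket."""
    pairs = list(combinations(series.keys(), 2))
    return sum(pearson(series[a], series[b]) for a, b in pairs) / len(pairs)

# Made-up daily closes for three hypothetical pairs A, B, C.
prices = {
    "A": [1.10, 1.12, 1.11, 1.15, 1.14],
    "B": [0.90, 0.89, 0.91, 0.88, 0.92],
    "C": [120.0, 121.5, 121.0, 123.0, 122.0],
}
print(round(mean_pairwise_correlation(prices), 3))
```

Note this averages only the off-diagonal pairs; averaging the full correlation matrix (diagonal included) would bias c0 upward by the 1s on the diagonal.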

Now imagine we have 26 currency pairs (from A to Z) to consider for creating a small universe of 5 of them (3 was for the example).

The method I apply is to create every possible combination of groups of 5 currency pairs, and then calculate the mean correlation c0 (`correl$cor`) of every single group. I also calculate the standard deviation of c1, c2 and c3 (`correl$stddev`), as it will filter out groups with a low c0 but high c1, c2 and c3. Finally I sum the global correlation and the stddev in order to rank groups by a single value (`correl$"cor+sddev"`). In this example 16 pairs are candidates. My dataset is an xts object with historical prices of the 16 currency pairs. I could extend it to ~50 to improve the selection, as I guess the bigger the pool, the better it must be.

This is how I proceed to achieve this:

    # Put all symbol names in a list
    pairs <- names(mydata)
    # Create groups of 5 currency pairs with no duplication
    group <- combn(pairs, 5)
    # Calculate number of groups
    nb_group <- ncol(group)
    # Create empty object to store my result
    result <- NULL
    # For every group
    for (i in 1:nb_group) {
      # Calculate the mean correlation for the group
      correl <- round(mean(cor(mydata[, group[, i]])), 3)
      # Transform into a data frame and give it a name
      correl <- as.data.frame(correl)
      colnames(correl) <- "cor"
      rownames(correl) <- toString(group[, i])
      # Calculate stddev and the sum of correlation and stddev
      correl$stddev <- round(sd(cor(mydata[, group[, i]])), 3)
      correl$"cor+sddev" <- correl$cor + correl$stddev
      # Export data
      result <- rbind(correl, result)
    }
    # Basket of currency pairs with the lowest correlation and stddev
    head(result[order(result[, 3]), ])

This returns something like:

    > head(result[order(result[,3]),])
                                             cor stddev cor+sddev
    GBPUSD, USDCAD, USDRUB, USDTRY, NZDUSD 0.032  0.583     0.615
    GBPUSD, USDCHF, USDCAD, USDRUB, USDTRY 0.048  0.569     0.617
    GBPUSD, USDJPY, USDRUB, USDTRY, NZDUSD 0.052  0.576     0.628
    GBPUSD, USDCAD, EURCHF, USDRUB, USDTRY 0.048  0.582     0.630
    GBPUSD, USDCAD, USDRUB, USDMXN, NZDUSD 0.065  0.566     0.631
    GBPUSD, USDCHF, USDCAD, USDRUB, USDMXN 0.097  0.536     0.633

The result is the same when I average the correlation and stddev (instead of summing them).

Do you think R could help to achieve this, and is there a more efficient approach to create such a basket of currency pairs? I've checked portfolio optimization packages like tawny, PortfolioAnalytics and fPortfolio, but unfortunately I'm not familiar with financial formulas and I got lost.

Thank you, Florent

### CompsciOverflow

#### Is it possible to boost the error probability of a Consensus protocol over dynamic network? [migrated]

Consider the binary consensus problem in a synchronous setting over a dynamic network (thus, there are $n$ nodes, and some of them are connected by edges that may change round to round). Given a randomized $\delta$-error protocol for Consensus in this setting, is it possible to use this protocol as a black box to obtain another Consensus protocol with smaller error?

Note that for problems with a deterministic outcome (such as Sum), this can be done by repeatedly executing the protocol and taking the majority as the answer. However I am unable to figure out how to do this for Consensus, so I wonder whether it is actually impossible.

--- Details ---

Dynamic Network. There are $n$ nodes on a dynamic network in a synchronous setting (thus the protocol will proceed in rounds). Each node has a unique ID of size $\log(n)$. There are edges connecting pairs of nodes, and this set of edges may change from round to round (but remains fixed throughout any particular round). There may be additional restrictions on the dynamic network that allow the problem to be solved by some protocol (such as having a fixed diameter), but that is irrelevant to our discussion. The only information known to each node is its unique ID. Nodes do not know $n$ or the topology of the network. In particular, they do not know who their neighbors are (the nodes adjacent to them by an edge).

How nodes communicate. The only method of communication between nodes is by broadcasting messages under the $CONGEST$ model (D. Peleg, "Distributed Computing, A Locality-Sensitive Approach"). More specifically, in each round, every node may broadcast an $O(\log n)$ sized message. Then, every node will receive ALL messages that were broadcast by all of its neighbors in that particular round, in an arbitrary order. Thus, in particular, nodes may attach their IDs to their messages, since their IDs are of size $O(\log n)$.

Binary Consensus. The binary consensus problem over this network is as follows. Each node has a binary input ($0$ or $1$), and each must decide either $0$ or $1$ under validity, termination, and agreement for consensus with $\delta$ error:

• validity means that for $z \in \{0, 1\}$, if all nodes have $z$ as their initial input, then the only possible decision is $z$.
• termination means that each node must eventually decide.
• agreement means that all nodes must have the same decision.
• The error rate means that under worst case input and worst case network, the probability that the protocol does not satisfy one of the three conditions is at most $\delta$, taken over the coin flips of the protocol.

Question.
For a particular set of dynamic networks, suppose we have a lower bound on the average number of rounds incurred by any randomized $\delta$-error protocol $P$ for solving binary consensus in this setting with $\delta < \frac{1}{3}$, under a worst case dynamic network, a worst case set of node inputs, and on average over the coin flips of the protocol. Does this bound automatically hold for any $\delta'$-error protocol with $\delta' < \delta$?

#### What is the best-case and worst-case time complexity in terms of n, in the form "O(...)", of the following code? [duplicate]

This question already has an answer here:

I've asked this question: What is the best-case and worst-case time complexity, in terms of n, in the form "O(...)", of the following code?

    int i = n, x = 0;
    while (i > 0) {
        i--;
        x += A[i][n-i];
    }

Can anybody answer this in detail?

### /r/netsec

#### Bypass WAF Cookbook

### /r/emacs

#### Gnus - Combined inboxes?

I have just started really using Gnus for email (smtp + msmtp) for three accounts. Currently I am subscribed to the three inboxes and they appear separately in my group buffer. My question is: is there a way to combine incoming mail from all three accounts so that it appears in one group? It would make it easier to keep track of all three. As well, if anyone can point out some tricks or tips or neat things I can do with Gnus, I wouldn't mind!

submitted by Mister_Bubbles [link] [2 comments]

### TheoryOverflow

#### Is the bitonic sort algorithm stable?

I was wondering, is the bitonic sort algorithm stable? I searched the original paper, Wikipedia and some tutorials, and could not find an answer. It seems to me that it should be, as it is composed of merge / sort steps; however, I was unable to find the answer anywhere.

The reason why I'm asking: I was comparing this particular implementation of bitonic sort to the sort implemented in the C++ standard library. For array length 9 it requires 28 comparison / swap operations, while the standard library sort (which is unstable) requires 25.
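One way to probe the stability question empirically is to run a textbook recursive bitonic sort on (key, original-index) pairs, comparing keys only, and check whether equal keys keep their input order. A Python sketch (this is the standard recursive formulation, not the implementation linked above):

```python
def bitonic_sort(up, xs):
    """Textbook recursive bitonic sort; len(xs) must be a power of two.

    Elements are (key, original_index) pairs; comparisons use the key only.
    """
    if len(xs) <= 1:
        return xs
    half = len(xs) // 2
    first = bitonic_sort(True, xs[:half])    # ascending half
    second = bitonic_sort(False, xs[half:])  # descending half -> bitonic
    return bitonic_merge(up, first + second)

def bitonic_merge(up, xs):
    if len(xs) == 1:
        return xs
    half = len(xs) // 2
    for i in range(half):                    # compare/swap at distance n/2
        if (xs[i][0] > xs[i + half][0]) == up:
            xs[i], xs[i + half] = xs[i + half], xs[i]
    return bitonic_merge(up, xs[:half]) + bitonic_merge(up, xs[half:])

data = [(1, 0), (0, 1), (1, 2), (0, 3)]      # duplicate keys, tagged by index
result = bitonic_sort(True, data)
print(result)  # [(0, 1), (0, 3), (1, 2), (1, 0)]
```

Here the two elements with key 1 come out in reverse input order (index 2 before index 0), so this formulation, at least, is not stable.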
The three extra cswaps do not seem enough to make the sort stable.

### arXiv Networking and Internet Architecture

#### Energy Harvesting Transmitters that Heat Up: Throughput Maximization under Temperature Constraints. (arXiv:1509.00836v1 [cs.IT])

Motivated by damage due to heating in sensor operation, we consider the throughput optimal offline data scheduling problem in an energy harvesting transmitter such that the resulting temperature increase remains below a critical level. We model the temperature dynamics of the transmitter as a linear system and determine the optimal transmit power policy under such temperature constraints as well as energy harvesting constraints over an AWGN channel. We first derive the structural properties of the solution for the general case with multiple energy arrivals. We show that the optimal power policy is piecewise monotone decreasing with possible jumps at the energy harvesting instants. We derive analytical expressions for the optimal solution in the single energy arrival case. We show that, in the single energy arrival case, the optimal power is monotone decreasing, the resulting temperature is monotone increasing, and both remain constant after the temperature hits the critical level. We then generalize the solution for the multiple energy arrival case.

#### Better open-world website fingerprinting. (arXiv:1509.00789v1 [cs.CR])

Website fingerprinting enables an attacker to infer the source of a web page when a client is browsing through encrypted or anonymized network connections. We present a new website fingerprinting attack based on fingerprints extracted from random decision forests. Within the context of this attack we provide an analysis of the utility of previously proposed traffic features. Our attack, k-fingerprinting, performs better than current state-of-the-art attacks even against website fingerprinting defenses.
We further show that error rates vary widely between web resources, and thus some patterns of use will be predictably more vulnerable to attack than others.

#### Interference-Assisted Wireless Energy Harvesting in Cognitive Relay Network with Multiple Primary Transceivers. (arXiv:1509.00779v1 [cs.IT])

We consider a spectrum sharing scenario, where a secondary network coexists with a primary network of multiple transceivers. The secondary network consists of an energy-constrained decode-and-forward secondary relay which assists the communication between a secondary transmitter and a destination in the presence of the interference from multiple primary transmitters. The secondary relay harvests energy from the received radio-frequency signals, which include the information signal from the secondary transmitter and the primary interference. The harvested energy is then used to decode the secondary information and forward it to the secondary destination. At the relay, we adopt a time switching policy due to its simplicity that switches between the energy harvesting and information decoding over time. Specifically, we derive a closed-form expression for the secondary outage probability under the primary outage constraint and the peak power constraint at both secondary transmitter and relay. In addition, we investigate the effect of the number of primary transceivers on the optimal energy harvesting duration that minimizes the secondary outage probability. By utilizing the primary interference as a useful energy source in the energy harvesting phase, the secondary network achieves a better outage performance.

#### A Big Data Analyzer for Large Trace Logs. (arXiv:1509.00773v1 [cs.DC])

Current generation of Internet-based services are typically hosted on large data centers that take the form of warehouse-size structures housing tens of thousands of servers.
Continued availability of a modern data center is the result of a complex orchestration among many internal and external actors including computing hardware, multiple layers of intricate software, networking and storage devices, electrical power and cooling plants. During the course of their operation, many of these components produce large amounts of data in the form of event and error logs that are essential not only for identifying and resolving problems but also for improving data center efficiency and management. Most of these activities would benefit significantly from data analytics techniques to exploit hidden statistical patterns and correlations that may be present in the data. The sheer volume of data to be analyzed makes uncovering these correlations and patterns a challenging task. This paper presents BiDAl, a prototype Java tool for log-data analysis that incorporates several Big Data technologies in order to simplify the task of extracting information from data traces produced by large clusters and server farms. BiDAl provides the user with several analysis languages (SQL, R and Hadoop MapReduce) and storage backends (HDFS and SQLite) that can be freely mixed and matched so that a custom tool for a specific task can be easily constructed. BiDAl has a modular architecture so that it can be extended with other backends and analysis languages in the future. In this paper we present the design of BiDAl and describe our experience using it to analyze publicly-available traces from Google data clusters, with the goal of building a realistic model of a complex data center.

#### A Game-theoretic Formulation of the Homogeneous Self-Reconfiguration Problem. (arXiv:1509.00737v1 [cs.MA])

In this paper we formulate the homogeneous two- and three-dimensional self-reconfiguration problem over discrete grids as a constrained potential game.
We develop a game-theoretic learning algorithm based on the Metropolis-Hastings algorithm that solves the self-reconfiguration problem in a globally optimal fashion. Both a centralized and a fully distributed algorithm are presented and we show that the only stochastically stable state is the potential function maximizer, i.e. the desired target configuration. These algorithms compute transition probabilities in such a way that even though each agent acts in a self-interested way, the overall collective goal of self-reconfiguration is achieved. Simulation results confirm the feasibility of our approach and show convergence to desired target configurations.

#### A Multilayer Model of Computer Networks. (arXiv:1509.00721v1 [cs.NI])

The fundamental concept of applying the system methodology to network analysis declares that network architecture should take into account services and applications which this network provides and supports. This work introduces a formal model of computer networks on the basis of the hierarchical multilayer networks. In turn, individual layers are represented as multiplex networks. The concept of layered networks provides conditions of top-down consistency of the model. Next, we determined the necessary set of layers for network architecture representation with regard to applying the system methodology to network analysis.

#### A Fuzzy Clustering Based Approach for Mining Usage Profiles from Web Log Data. (arXiv:1509.00693v1 [cs.DB])

The World Wide Web continues to grow at an amazing rate in both the size and complexity of Web sites and is well on its way to being the main reservoir of information and data. Due to this increase in growth and complexity of WWW, web site publishers are facing increasing difficulty in attracting and retaining users. To design popular and attractive websites publishers must understand their users needs. Therefore analyzing users behaviour is an important part of web page design.
Web Usage Mining (WUM) is the application of data mining techniques to web usage log repositories in order to discover the usage patterns that can be used to analyze the users navigational behavior. WUM contains three main steps: preprocessing, knowledge extraction and results analysis. The goal of the preprocessing stage in Web usage mining is to transform the raw web log data into a set of user profiles. Each such profile captures a sequence or a set of URLs representing a user session.

#### Discovery of Web Usage Profiles Using Various Clustering Techniques. (arXiv:1509.00692v1 [cs.DB])

The explosive growth of World Wide Web (WWW) has necessitated the development of Web personalization systems in order to understand the user preferences to dynamically serve customized content to individual users. To reveal information about user preferences from Web usage data, Web Usage Mining (WUM) techniques are extensively being applied to the Web log data. Clustering techniques are widely used in WUM to capture similar interests and trends among users accessing a Web site. Clustering aims to divide a data set into groups or clusters where inter-cluster similarities are minimized while the intra cluster similarities are maximized. This paper reviews four of the popularly used clustering techniques: k-Means, k-Medoids, Leader and DBSCAN. These techniques are implemented and tested against the Web user navigational data. Performance and validity results of each technique are presented and compared.

#### A Fuzzy Approach for Feature Evaluation and Dimensionality Reduction to Improve the Quality of Web Usage Mining Results. (arXiv:1509.00690v1 [cs.DB])

Web Usage Mining is the application of data mining techniques to web usage log repositories in order to discover the usage patterns that can be used to analyze the users navigational behavior. During the preprocessing stage, raw web log data is transformed into a set of user profiles.
Each user profile captures a set of URLs representing a user session. Clustering can be applied to this sessionized data in order to capture similar interests and trends among users' navigational patterns. Since the sessionized data may contain thousands of user sessions and each user session may consist of hundreds of URL accesses, dimensionality reduction is achieved by eliminating the low-support URLs. Very small sessions are also removed in order to filter out the noise from the data. But direct elimination of low-support URLs and small-sized sessions may result in loss of a significant amount of information, especially when the count of low-support URLs and small sessions is large. We propose a fuzzy solution to deal with this problem by assigning weights to URLs and user sessions based on a fuzzy membership function. After assigning the weights we apply a Fuzzy c-Means clustering algorithm to discover the clusters of user profiles. In this paper, we describe our fuzzy set theoretic approach to perform feature selection (or dimensionality reduction) and session weight assignment. Finally we compare our soft computing based approach to dimensionality reduction with the traditional approach of direct elimination of small sessions and low-support count URLs. Our results show that fuzzy feature evaluation and dimensionality reduction result in better performance and validity indices for the discovered clusters.

#### On Normalized Compression Distance and Large Malware. (arXiv:1509.00689v1 [cs.CR])

Normalized Compression Distance (NCD) is a popular tool that uses compression algorithms to cluster and classify data in a wide range of applications. Existing discussions of NCD's theoretical merit rely on certain theoretical properties of compression algorithms. However, we demonstrate that many popular compression algorithms don't seem to satisfy these theoretical properties.
We explore the relationship between some of these properties and file size, demonstrating that this theoretical problem is actually a practical problem for classifying malware with large file sizes, and we then introduce some variants of NCD that mitigate this problem.

#### A note on strictly positive logics and word rewriting systems. (arXiv:1509.00666v1 [math.LO])

We establish a natural translation from word rewriting systems to strictly positive polymodal logics. Thereby, the latter can be considered as a generalization of the former. As a corollary we obtain examples of undecidable strictly positive normal modal logics. The translation has its counterpart on the level of proofs: we formulate a natural deep inference proof system for strictly positive logics generalizing derivations in word rewriting systems. We also formulate some open questions related to the theory of modal companions of superintuitionistic logics that was initiated by L.L. Maximova and V.V. Rybakov.

#### Termination of rewrite relations on $\lambda$-terms based on Girard's notion of reducibility. (arXiv:1509.00649v1 [cs.LO])

In this paper, we show how to extend the notion of reducibility introduced by Girard for proving the termination of $\beta$-reduction in the polymorphic $\lambda$-calculus, to prove the termination of various kinds of rewrite relations on $\lambda$-terms, including rewriting modulo some equational theory and rewriting with matching modulo $\beta\eta$, by using the notion of computability closure. This provides a powerful termination criterion for various higher-order rewriting frameworks, including Klop's Combinatory Reduction Systems with simple types and Nipkow's Higher-order Rewrite Systems.

#### How to Generate Security Cameras: Towards Defence Generation for Socio-Technical Systems. (arXiv:1509.00643v1 [cs.CR])

Recently, security researchers have started to look into automated generation of attack trees from socio-technical system models.
The obvious next step in this trend of automated risk analysis is automating the selection of security controls to treat the detected threats. However, the existing socio-technical models are too abstract to represent all security controls recommended by practitioners and standards. In this paper we propose an attack-defence model, consisting of a set of attack-defence bundles, to be generated and maintained with the socio-technical model. The attack-defence bundles can be used to synthesise attack-defence trees directly from the model to offer basic attack-defence analysis, but they can also be used to select and maintain the security controls that cannot be handled by the model itself.

#### Correlated Poisson processes and self-decomposable laws. (arXiv:1509.00629v1 [math.PR])

We discuss in detail a procedure to produce two Poisson processes $M(t)$, $N(t)$ associated to positively correlated, self-decomposable, exponential renewals. The main result of this paper is a closed, elementary form for the joint distribution $p_{m,n}(s,t)$ of the pair $(M(s), N(t))$: this turns out to be instrumental to produce explicit algorithms with applications to option pricing, as well as to credit and insurance risk modeling, that will be discussed in a separate paper.

#### Model Checking Epistemic Halpern-Shoham Logic Extended with Regular Expressions. (arXiv:1509.00608v1 [cs.LO])

The Epistemic Halpern-Shoham logic (EHS) is a temporal-epistemic logic that combines the interval operators of the Halpern-Shoham logic with epistemic modalities. The semantics of EHS is based on interpreted systems whose labelling function is defined on the endpoints of intervals. We show that this definition can be generalised by allowing the labelling function to be based on the whole interval by means of regular expressions. We prove that all the positive results known for EHS, notably the attractive complexity of its model checking problem for some of its fragments, still hold for its generalisation.
We also propose the new logic EHSre, which operates on standard Kripke structures and has expressive power equivalent to that of EHS with regular expressions. We compare the expressive power of EHSre with standard temporal logics.

#### Backhaul-Aware Caching Placement for Wireless Networks. (arXiv:1509.00558v1 [cs.IT])

As the capacity demand of mobile applications keeps increasing, the backhaul network is becoming a bottleneck to support high quality of experience (QoE) in next-generation wireless networks. Content caching at base stations (BSs) is a promising approach to alleviate the backhaul burden and reduce user-perceived latency. In this paper, we consider a wireless caching network where all the BSs are connected to a central controller via backhaul links. In such a network, users can obtain the required data from candidate BSs if the data are pre-cached. Otherwise, the user data need to be first retrieved from the central controller to local BSs, which introduces extra delay over the backhaul. In order to reduce the download delay, the caching placement strategy needs to be optimized. We formulate such a design problem as the minimization of the average download delay over user requests, subject to the caching capacity constraint of each BS. Different from existing works, our model takes BS cooperation in the radio access into consideration and is fully aware of the propagation delay on the backhaul links. The design problem is a mixed integer programming problem and is highly complicated, and thus we relax the problem and propose a low-complexity algorithm. Simulation results show that the proposed algorithm can effectively determine a near-optimal caching placement and provide significant performance gains over conventional caching placement strategies.

#### Achieving Energy Efficiency for Altruistic DISH: Three Properties.
(arXiv:1509.00554v1 [cs.NI])

In an altruistic DISH protocol, additional nodes called "altruists" are deployed in a multi-channel ad hoc network to achieve energy efficiency while still maintaining the original throughput-delay performance. The responsibility of altruists is to constantly monitor the control channel and awaken other (normal) nodes when necessary (to perform data transmissions). Altruists never sleep, while other nodes sleep as far as possible. This technical report proves three properties related to this cooperative protocol. The first is the conditions for forming an unsafe pair (UP) in an undirected graph. The second is the necessary and sufficient conditions for full cooperation coverage to achieve the absence of multi-channel coordination (MCC) problems. The last is the NP-hardness of determining the minimum number and locations of altruistic nodes to achieve full cooperation coverage.

#### Disaster-Resilient Control Plane Design and Mapping in Software-Defined Networks. (arXiv:1509.00509v1 [cs.NI])

Communication networks, such as core optical networks, heavily depend on their physical infrastructure, and hence they are vulnerable to man-made disasters, such as Electromagnetic Pulse (EMP) or Weapons of Mass Destruction (WMD) attacks, as well as to natural disasters. Large-scale disasters may cause huge data loss and connectivity disruption in these networks. As our dependence on network services increases, the need for novel survivability methods to mitigate the effects of disasters on communication networks becomes a major concern. Software-Defined Networking (SDN), by centralizing control logic and separating it from physical equipment, facilitates network programmability and opens up new ways to design disaster-resilient networks. On the other hand, to fully exploit the potential of SDN, along with data-plane survivability, we also need to design the control plane to be resilient enough to survive network failures caused by disasters.
Several distributed SDN controller architectures have been proposed to mitigate the risks of overload and failure, but they are optimized for limited faults without addressing the extent of large-scale disaster failures. For disaster resiliency of the control plane, we propose to design it as a virtual network, which can be solved using Virtual Network Mapping techniques. We select an appropriate mapping of the controllers over the physical network such that the connectivity among the controllers (controller-to-controller) and between the switches and the controllers (switch-to-controller) is not compromised by physical infrastructure failures caused by disasters. We formally model this disaster-aware control-plane design and mapping problem, and demonstrate a significant reduction in the disruption of controller-to-controller and switch-to-controller communication channels using our approach.

#### Cooperation under Incomplete Information on the Discount Factors. (arXiv:1012.3117v2 [cs.GT] UPDATED)

In the repeated Prisoner's Dilemma, when every player has a different discount factor, the grim-trigger strategy is an equilibrium if and only if the discount factor of each player is higher than some threshold. What happens if the players have incomplete information regarding the discount factors? In this work we look at repeated games in which each player has incomplete information regarding the other player's discount factor, and ask when a pair of grim-trigger strategies is an equilibrium. We provide necessary and sufficient conditions for such strategies to be an equilibrium. We characterize the states of the world in which the strategies are not triggered, i.e., the players cooperate, in such equilibria (or $\epsilon$-equilibria), and ask whether these "cooperation events" are close to those in the complete information case, when the information is "almost" complete, in several senses.

### /r/emacs

#### Is it possible to edit or search an inactive git branch in emacs?
To what degree is it possible to interact with a git branch that isn't the current one from within Emacs? I have a large software project that takes a long time to build and would like to be able to view, work on, or inspect other branches while the build process is happening. I also sometimes forget what branches I have put things in. submitted by small-wolf

### Halfbakery

#### Face to Facetime (0.5)

### DataTau

#### Celebrating the real-time processing and analytics revival

### /r/compsci

#### ANTLR Help (String Literals)

I hardly know anything about ANTLR or the grammars involved; however, I did find a grammar for the language I am interested in (PeopleCode, a proprietary language used by Oracle for PeopleTools). This grammar is reasonably well done, but I've noticed an issue with the StringLiteral rule for the lexer:

    StringLiteral : '"' ( ~'"' )* '"' ;

This seems to say "starts with a quote, then 0 to many of anything that isn't a quote, and finally ends with a quote". In most languages this would be accurate; however, PeopleCode is bizarre. To "escape" a quote within a string, you use double quotes. Example:

    local string &test = "Hello ""World"""

This would produce the string (Hello "World"), without the parentheses of course. I'm not sure how to adjust this rule to allow for doubled quotes within the string, and was hoping someone better at lexers/ANTLR would be willing and able to help :) Thanks! submitted by tslater2006

### QuantOverflow

#### labeling high frequency signal data

Was curious if anyone has methodologies they can recommend for systematically labeling (discrete) signals generated from intraday tick data for use in classification or detection models?
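One methodology that often comes up for questions like this is fixed-horizon labeling: tag each tick (or bar) with the sign of its forward return over a chosen horizon, zeroing out moves inside a noise threshold. The sketch below is only an illustration; the `horizon` and `threshold` parameters are assumed choices, not anything from the question:

```python
import numpy as np

def fixed_horizon_labels(prices, horizon, threshold):
    """Label each observation by its forward return over `horizon` steps:
    +1 if the return exceeds +threshold, -1 if below -threshold, else 0.
    The final `horizon` observations have no forward window and get NaN."""
    prices = np.asarray(prices, dtype=float)
    fwd_ret = np.full(prices.shape, np.nan)
    fwd_ret[:-horizon] = prices[horizon:] / prices[:-horizon] - 1.0

    labels = np.zeros(prices.shape)
    labels[fwd_ret > threshold] = 1.0     # up move
    labels[fwd_ret < -threshold] = -1.0   # down move
    labels[np.isnan(fwd_ret)] = np.nan    # no label at the end of the series
    return labels

# e.g. fixed_horizon_labels([100, 101, 102, 100, 100], horizon=1, threshold=0.005)
# labels the first three ticks +1, +1, -1 and the fourth 0.
```

Whether such labels are usable downstream depends heavily on how the threshold relates to the bid-ask spread at that frequency; path-dependent schemes such as triple-barrier labeling condition on the path as well as the endpoint.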
### Planet Theory

#### Recognizing Weighted Disk Contact Graphs

Authors: Boris Klemz, Martin Nöllenburg, Roman Prutkin
Download: PDF
Abstract: Disk contact representations realize graphs by mapping vertices bijectively to interior-disjoint disks in the plane such that two disks touch each other if and only if the corresponding vertices are adjacent in the graph. Deciding whether a vertex-weighted planar graph can be realized such that the disks' radii coincide with the vertex weights is known to be NP-hard. In this work, we reduce the gap between hardness and tractability by analyzing the problem for special graph classes. We show that it remains NP-hard for outerplanar graphs with unit weights and for stars with arbitrary weights, strengthening the previous hardness results. On the positive side, we present constructive linear-time recognition algorithms for caterpillars with unit weights and for embedded stars with arbitrary weights.

#### Combinatorial Properties of Triangle-Free Rectangle Arrangements and the Squarability Problem

Authors: Jonathan Klawitter, Martin Nöllenburg, Torsten Ueckerdt
Download: PDF
Abstract: We consider arrangements of axis-aligned rectangles in the plane. A geometric arrangement specifies the coordinates of all rectangles, while a combinatorial arrangement specifies only the respective intersection type in which each pair of rectangles intersects. First, we investigate combinatorial contact arrangements, i.e., arrangements of interior-disjoint rectangles, with a triangle-free intersection graph. We show that such rectangle arrangements are in bijection with the 4-orientations of an underlying planar multigraph and prove that there is a corresponding geometric rectangle contact arrangement. Moreover, we prove that every triangle-free planar graph is the contact graph of such an arrangement. Second, we introduce the question of whether a given rectangle arrangement has a combinatorially equivalent square arrangement.
In addition to some necessary conditions and counterexamples, we show that rectangle arrangements pierced by a horizontal line are squarable under certain sufficient conditions.

#### Gaussian random projections for Euclidean membership problems

Authors: Ky Vu, Pierre-Louis Poirion, Leo Liberti
Download: PDF
Abstract: We discuss the application of random projections to the fundamental problem of deciding whether a given point in a Euclidean space belongs to a given set. We show that, under a number of different assumptions, the feasibility and infeasibility of this problem are preserved with high probability when the problem data is projected to a lower dimensional space. Our results are applicable to any algorithmic setting which needs to solve Euclidean membership problems in a high-dimensional space.

#### Extremal Distances for Subtree Transfer Operations in Binary Trees

Authors: Ross Atkins, Colin McDiarmid
Download: PDF
Abstract: Three standard subtree transfer operations for binary trees, used in particular for phylogenetic trees, are: tree bisection and reconnection (TBR), subtree prune and regraft (SPR) and rooted subtree prune and regraft (rSPR). For a pair of leaf-labelled binary trees with $n$ leaves, the maximum number of such moves required to transform one into the other is $n-\Theta(\sqrt{n})$, extending a result of Ding, Grunewald and Humphries. We show that if the pair is chosen uniformly at random, then the expected number of moves required to transform one into the other is $n-\Theta(n^{2/3})$. These results may be phrased in terms of agreement forests: we also give extensions for more than two binary trees.

#### Variants of Plane Diameter Completion

Authors: Petr A. Golovach, Clément Requilé, Dimitrios M. Thilikos
Download: PDF
Abstract: The Plane Diameter Completion problem asks, given a plane graph $G$ and a positive integer $d$, if it is a spanning subgraph of a plane graph $H$ that has diameter at most $d$.
We examine two variants of this problem where the input comes with another parameter $k$. In the first variant, called BPDC, $k$ upper bounds the total number of edges to be added, and in the second, called BFPDC, $k$ upper bounds the number of additional edges per face. We prove that both problems are NP-complete, the first even for 3-connected graphs of face-degree at most 4 and the second even when $k=1$ on 3-connected graphs of face-degree at most 5. In this paper we give parameterized algorithms for both problems that run in $O(n^{3})+2^{2^{O((kd)^2\log d)}}\cdot n$ steps.

#### A note on Probably Certifiably Correct algorithms

Authors: Afonso S. Bandeira
Download: PDF
Abstract: Many optimization problems of interest are known to be intractable, and while there are often heuristics that are known to work on typical instances, it is usually not easy to determine a posteriori whether the optimal solution was found. In this short note, we discuss algorithms that not only solve the problem on typical instances, but also provide a posteriori certificates of optimality: probably certifiably correct (PCC) algorithms. As an illustrative example, we present a fast PCC algorithm for minimum bisection under the stochastic block model and briefly discuss other examples.

#### Finding Near-Optimal Independent Sets at Scale

Authors: Sebastian Lamm, Peter Sanders, Christian Schulz, Darren Strash, Renato F. Werneck
Download: PDF
Abstract: The independent set problem is NP-hard and particularly difficult to solve in large sparse graphs. In this work, we develop an advanced evolutionary algorithm, which incorporates kernelization techniques to compute large independent sets in huge sparse networks. A recent exact algorithm has shown that large networks can be solved exactly by employing a branch-and-reduce technique that recursively kernelizes the graph and performs branching.
However, one major drawback of their algorithm is that, for huge graphs, branching can still take exponential time. To avoid this problem, we recursively choose vertices that are likely to be in a large independent set (using an evolutionary approach), then further kernelize the graph. We show that identifying and removing vertices likely to be in large independent sets opens up the reduction space---which not only speeds up the computation of large independent sets drastically, but also enables us to compute high-quality independent sets on much larger instances than previously reported in the literature.

#### Energy randomness

Authors: Joseph S. Miller, Jason Rute
Download: PDF
Abstract: Energy randomness is a notion of partial randomness introduced by Diamondstone and Kjos-Hanssen to characterize the sequences that can be elements of a Martin-Löf random closed set (in the sense of Barmpalias, Brodhead, Cenzer, Dashti, and Weber). It has also been applied by Allen, Bienvenu, and Slaman to the characterization of the possible zero times of a Martin-Löf random Brownian motion. In this paper, we show that $X \in 2^\omega$ is $s$-energy random if and only if $\sum_{n\in\omega} 2^{sn - KM(X\upharpoonright n)} < \infty$, providing a characterization of energy randomness via a priori complexity $KM$. This is related to a question of Allen, Bienvenu, and Slaman.

### QuantOverflow

#### Quantitative Finance Programming Language

A couple of weeks ago I started doing my own research on quant finance. Since then I have discovered a lot of material, and with it a lot of questions have come to mind. Many news outlets and economic journals/magazines write about HFT/algorithmic trading. Most of them say that companies or software developers prefer to use C++. In some articles, the writers talk about Java, C#, C or even ASM. I tried to find the reason for C++, but I wasn't successful.
This topic doesn't provide the answers that I need (Why is C++ still a very popular language in quantitative finance?). Here are my questions:

1. Why C++? That some companies might use ASM (and I can only imagine ASM in HFT, where milliseconds play a role), that's fine. But in medium-frequency trading or in algorithms? Is it because of speed? I looked for quant finance libraries for C++, but I couldn't find many. The only ones are QuantLib, MatLib and TA-Lib. But no chart APIs/libs or tutorials. It seems like no one is doing tutorials.

2. Why do some people choose Java? I know Java is a very popular language with a lot of APIs/libs, and the community is growing. But if speed plays a role, then Java can't be the fastest (because it runs in a virtual machine). Or am I wrong?

3. Why is no one using Python for medium-frequency trading or algo trading? Python has a lot of APIs/libs like MatLib, TA-Lib, Pyqtgraph. OK, I have to say, Python is not the fastest.

4. In this discussion (Why is C++ still a very popular language in quantitative finance?), some people claim that C# might be much better for quant finance development. Is it really true? What about libs, APIs, tutorials etc.?

And my final question: what is the most important property when choosing a language for quant finance? I'm not talking about ASM, because it's the fastest language and it's used for very complex calculations that have to be made fast. But what about C++, C#, Python and Java? For me, it's important that there are libs and tutorials/examples. In the beginning I started with Python, but after all I have read, I'm not sure about Python anymore.

### /r/compsci

#### Bloom filter as a service: validate username availability without asking the server.

### CompsciOverflow

#### Understanding the one-writer, many-readers problem?
I need some help understanding the implementation of the one-writer, many-readers problem shown below, as I am new to this concept. I have a rough idea that there will be starvation/deadlock among the readers, as there is one writer that multiple readers are trying to access at once. Can anyone please provide some further insight regarding the problems with the implementation of this situation? I am not looking for any coding; that was just the sample (pseudo)code I am working with. I am looking for the problem with the implementation. Upon further research I have assumed the following, based on the line if readcount == 1 then semWait(wrt);:

• There is one writer and many readers
• The readers have priority over the writers
• The possible problem: the writers cannot write until the readers have finished reading, therefore starvation occurs

However, upon re-evaluation of the code I have also assumed the following, based on the lines:

    semWait(mutex);
    readcount := readcount - 1;
    if readcount == 0 then semSignal(wrt);
    semSignal(mutex);

Could I not also say that only one reader may read at a time, and therefore the other readers will be starved? Would either of these be the correct way of interpreting the problem with the implementation, or would I be wrong?

### Halfbakery

#### Cash for Dads, Cash for Moms, Cash for Teachers (-0.5)

### HN Daily

#### Daily Hacker News for 2015-09-02

The 10 highest-rated articles on Hacker News on September 02, 2015 which have not appeared on any previous Hacker News Daily are:

### Planet Theory

#### L-Drawings of Directed Graphs

Authors: Patrizio Angelini, Giordano Da Lozzo, Marco Di Bartolomeo, Valentino Di Donato, Maurizio Patrignani, Vincenzo Roselli, Ioannis G. Tollis
Download: PDF
Abstract: We introduce L-drawings, a novel paradigm for representing directed graphs aiming at combining the readability features of orthogonal drawings with the expressive power of matrix representations.
In an L-drawing, vertices have exclusive x- and y-coordinates and edges consist of two segments, one exiting the source vertically and one entering the destination horizontally. We study the problem of computing L-drawings using minimum ink. We prove its NP-completeness and provide a heuristic based on a polynomial-time algorithm that adds a vertex to a drawing using the minimum additional ink. We performed an experimental analysis of the heuristic, which confirms its effectiveness.

## September 02, 2015

### CompsciOverflow

#### Converting this DFA to RE using the Brzozowski Algebraic Method [on hold]

Hello people, I'm trying to convert this DFA to an RE using the Brzozowski algebraic method. (DFA created with JFlap)

## My Attempt

### Write down equations

R1 = bR1 + aR2 + ɛ
R2 = bR2 + aR3
R3 = bR2 + aR1

### Eliminate equal Rx's from RHS

R1 = b*aR2 + b*
R2 = b*aR3

### New equations

R1 = b*aR2 + b*
R2 = b*aR3
R3 = bR2 + aR1

### Substitute R2 into R3 and eliminate R3 from RHS

R3 = bb*aR3 + aR1
R3 = (bb*a)*aR1

### Substitute R3 into R2

R2 = b*a(bb*a)*aR1

### New equations

R1 = b*aR2 + b*
R2 = b*a(bb*a)*aR1
R3 = (bb*a)*aR1

### Substitute R2 into R1 and eliminate R1 from RHS

R1 = b*ab*a(bb*a)*aR1 + b*
R1 = (b*ab*a(bb*a)*a)*b*

### Answer

RE = (b*ab*a(bb*a)*a)*b*

### Issue(?)

The past exam paper answer is: b*(ab*a(bb*a)*ab*)*
My calculated answer is: (b*ab*a(bb*a)*a)*b*
Are these two answers equal, or have I done something wrong?

EDIT: The answer is correct even in the form I have provided; the textbook answer has gone a step further, using the shifting rule to achieve its format. Here are some useful rules for simplifying REs:

• Shifting Rule: E(FE)* = (EF)*E
• Denesting Rule: (E*F)*E* = (E+F)*

#### Automated optimization of 0-1 matrix vector multiplication

Question: Is there established procedure or theory for generating code that efficiently applies a matrix-vector multiplication, when the matrix is dense and filled with only zeros and ones?
Ideally, the optimized code would make systematic use of previously computed information to reduce duplicated work. In other words, I have a matrix M and I want to do some precomputation based upon M that will make computing Mv as efficient as possible when I later receive the vector v. M is a rectangular dense binary matrix that is known at "compile time", whereas v is an unknown real vector that is only known at "run time".

Example 1: (sliding window) Let me use an easy small example to illustrate my point. Consider the matrix,

$$M = \begin{bmatrix}1 & 1 & 1 & 1 & 1\\ & 1 & 1 & 1 & 1 & 1 \\ & & 1 & 1 & 1 & 1 & 1\\ & & & 1 & 1 & 1 & 1 & 1\end{bmatrix}.$$

Supposing we apply this matrix to a vector v to get w = Mv, the entries of the result are,

$$\begin{align} w_1 &= v_1 + v_2 + v_3 + v_4 + v_5\\ w_2 &= v_2 + v_3 + v_4 + v_5 + v_6\\ w_3 &= v_3 + v_4 + v_5 + v_6 + v_7\\ w_4 &= v_4 + v_5 + v_6 + v_7 + v_8 \end{align}$$

Doing a standard matrix-vector multiplication will compute exactly this way. However, a lot of this work is redundant. We could do the same matrix computation at less cost by keeping track of a "running total", and adding/subtracting to get the next number:

$$\begin{align} w_1 &= v_1 + v_2 + v_3 + v_4 + v_5\\ w_2 &= w_1 + v_6 - v_1\\ w_3 &= w_2 + v_7 - v_2\\ w_4 &= w_3 + v_8 - v_3 \end{align}$$

Example 2: (hierarchical structure) In the previous example, we could just keep track of a running total. However, usually one would need to create and store a tree of intermediate results. For example, consider

$$M = \begin{bmatrix}1 & 1 & 1 & 1 & 1 & 1 & 1 & 1\\ 1 & 1 & 1 & 1 \\ & & & & 1 & 1 & 1 & 1\\ 1 & 1 \\ & & & & 1 & 1 \\ & & 1 & 1 \\ & & & & & & 1 & 1\end{bmatrix}$$

One could compute w = Mv efficiently using a tree of intermediate results:

1. Compute w_5 and w_7, and add them to get w_3.
2. Compute w_4 and w_6, and add them to get w_2.
3. Add w_2 and w_3 to get w_1.

The structure in the examples above is easy to see, but for the actual matrices I'm interested in, the structure is not so simple.

Example 3: (low rank) To clear up some confusion, the matrices are generally not sparse. Specifically, a method solving this problem needs to be able to find efficient methods to apply matrices where large blocks are filled with ones. For example, consider

$$M = \begin{bmatrix}1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & & \\ 1 & 1 & 1 & 1 & & \\ 1 & 1 & 1 & 1 & & \end{bmatrix}.$$

This matrix can be decomposed as a difference of two rank-1 matrices,

$$M = \begin{bmatrix}1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 & 1\end{bmatrix} - \begin{bmatrix}& & & & & \\ & & & & & \\ & & & & 1 & 1 \\ & & & & 1 & 1 \\ & & & & 1 & 1\end{bmatrix}$$

so its action on a vector w := Mv can be computed efficiently by,

$$\begin{align} w_1 &= v_1 + v_2 + v_3 + v_4 + v_5 + v_6 \\ w_2 &= w_1 \\ w_3 &= w_2 - v_5 - v_6 \\ w_4 &= w_3 \\ w_5 &= w_4. \end{align}$$

Motivation: I'm working on a numerical method for some image processing, and there are several large dense 0-1 matrices with different structures that are fixed for all time. Later these matrices will need to be applied to many unknown vectors v_i that will depend on the user's input. Right now I'm using pencil-and-paper to come up with efficient code on a per-matrix basis, but I'm wondering if the process can be automated.

### QuantOverflow

#### Simulate stock prices following t distribution in matlab [on hold]

Trying to simulate different price paths for a stock following the t distribution. Can anyone provide code?

#### number of trades - flaw in White Reality Check?

I went through White's paper on the reality check for multiple strategy testing. To summarize with a simple example: I have 2 strategies, s1 and s2.
s1 gives 2 signals and therefore 2 returns; s2 gives 10 signals and therefore 10 returns. Here is the reality check procedure: I compare the means of the returns to some benchmark, e.g., mean(returns) = zero. To get p-values for H0 (the strategy returns are zero), I bootstrap both strategies' returns, calculate both means, take the max, and repeat this process a large number of times. I call this array BSvals. From those values I get the p-values of H0(s1) and H0(s2) as the fraction of BSvals entries that is > the observed mean return. However, in this method I see a flaw. Because of the lower number of signals in s1, the variance of the bootstrapped mean of s1 will be much higher than that of s2. Therefore s1 will push the p-values of H0 much higher than if s1 had signaled more often. So the mean of returns as a performance statistic might not be suitable when comparing strategies with different numbers of trades/signals. I'm thinking of using the t-statistic as an unbiased performance measure to compensate for different signal counts, with t = mean(strat returns) * sqrt(number of returns) / stddev. What are your thoughts on this? With strategy returns not being normally distributed, sqrt(number of returns) might not be an appropriate scaling factor, though...

### /r/netsec

#### 420 bytes file that uncompresses to a 141.4 GB 225,000 × 225,000 pixels (50.625 gigapixels) PNG. Upload as your profile picture to some online service, try to crash their image processing scripts. Set as your web site's favicon; try to crash browsers that don't check the size.

### Lobsters

#### Best practices for a new Go developer

#### The Cartographer Who’s Transforming Map Design

### QuantOverflow

#### Forecasting using GARCH in R

I am using the predict and ugarchforecast functions in R. When I fit my models and try to forecast, I get only increasing or only decreasing values for sigma; does anyone know why?
Thank you. Example:

eGARCHfit2 = ugarchspec(variance.model=list(model="eGARCH", garchOrder=c(1,1)),
                        mean.model=list(armaOrder=c(0,0), include.mean=TRUE),
                        distribution.model="norm")
eGARCH2 <- ugarchfit(brentlog2, spec=eGARCHfit2)
ugarchforecast(eGARCH2, data=brentlog2, n.ahead=21)

*------------------------------------*
*       GARCH Model Forecast         *
*------------------------------------*
Model: eGARCH
Horizon: 21
Roll Steps: 0
Out of Sample: 0

0-roll forecast [T0=1976-06-26 01:00:00]:

       Series     Sigma
T+1    0.0002619  0.008350
T+2    0.0002619  0.008387
T+3    0.0002619  0.008423
T+4    0.0002619  0.008459
T+5    0.0002619  0.008496
T+6    0.0002619  0.008532
T+7    0.0002619  0.008569
T+8    0.0002619  0.008605
T+9    0.0002619  0.008642
T+10   0.0002619  0.008678

The first value is the mean, which is always constant, and the second one is sigma, which is always increasing, as you can see. ### Lobsters #### Biggest image in the smallest space #### What are you reading this week? I’ve missed this recurring thread, so I thought I would take the initiative and post it myself. This week I finally finished Peter Matthiessen’s The Snow Leopard. It is especially relevant to me as one of my friends has just moved to Nepal to teach English as a second language, though not in the same region. As a replacement for The Snow Leopard, I’ve made some headway into Herman Melville’s Moby-Dick. Very atmospheric! I regret that I wasn’t forced to read it in grade school. I am also tackling The New Nietzsche, although it’s slow going. I have yet to finish the first essay: “Nietzsche and Metaphysical Language,” by Michel Haar. What I admire most in Nietzsche is how he provides a way out of traditional Western philosophy by completely exploding the systems of his time. I haven’t read enough to judge Nietzsche’s conclusions, but I definitely appreciate his beginnings. The most computer-sciency thing I am reading at the moment is Why Philosophers Should Care About Computational Complexity by Scott Aaronson.
I can’t yet offer any insight on the paper, but I would love to see more work done at the intersection of CS and philosophy, which to me seems under-served. ### /r/scala #### Ant + no IDE. How can I speed up compile times? I don't use an IDE, and I'm currently using Ant/fsc for my builds. I'm looking into speeding up my compile times. I've started by getting rid of implicits, but have not seen a big change from that. What would you recommend as the first "low-hanging fruit" that I should explore? submitted by three-cups [link] [41 comments] ### QuantOverflow #### How fast is QuickFix? In my firm we are beginning a new OMS (Order Management System) project, and there is a debate over whether we should use QuickFix or go for a professional FIX engine, because there is a common doubt that QuickFix is not fast enough, and obviously we will not get any technical support. I heard that it was used at BOVESPA for a while; they are replacing it with a paid one now. Well, that is enough for me: if they used it, I can use it. Should I choose a professional engine over QuickFix? Is it not good enough? #### Isolating single assets standard deviation in a portfolio accounting for correlation I am running a simple Monte Carlo analysis in Excel using the mean return, standard deviation and the =NORMINV(RAND(),mean,std dev) method. I have a correlation matrix that I use to compute the portfolio variance for varying weights of 16 assets. I would like to compute the standard deviation of a single asset after accounting for its correlations with the portfolio. For example, using the data in the attached picture: if I wanted to run a Monte Carlo analysis on asset 1 after accounting for the correlations with assets 2 and 3 in portfolio 1, what would my steps be? The mean return would simply be the weight of the asset in the portfolio times the asset's return, but what would the standard deviation of the asset be: the standard deviation of the portfolio times the weight of the asset?
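For reference, the =NORMINV(RAND(), mean, std dev) cell in the question is just a draw from a normal distribution, so the same simulation can be sketched outside Excel (function and parameter names are illustrative, not from the post):

```python
import random

def simulate_returns(mean, std_dev, n_trials, seed=42):
    """Monte Carlo draws equivalent to Excel's =NORMINV(RAND(), mean, std_dev)."""
    rng = random.Random(seed)  # seeded for reproducibility
    return [rng.gauss(mean, std_dev) for _ in range(n_trials)]

draws = simulate_returns(mean=0.07, std_dev=0.15, n_trials=10_000)
sample_mean = sum(draws) / len(draws)
# With 10,000 draws, the sample mean sits close to the input mean.
assert abs(sample_mean - 0.07) < 0.01
```

Correlated assets would additionally require sampling from the joint distribution (e.g. via a Cholesky factor of the covariance matrix), which is the part the question is really asking about.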
### Lobsters #### On the FreeBSD Code of Conduct ### StackOverflow #### How to convert valueForKeyPath in swift My old function in Objective-C is

+ (NSUInteger)getNumberOfDistinctUsers:(NSArray *)users {
    NSArray *usersAfterPredicate = [users valueForKeyPath:@"@distinctUnionOfObjects.userName"];
    return [usersAfterPredicate count];
}

How do I convert this to Swift? I was trying something like this, but it's crashing with "Could not cast value of type 'Swift.Array' to 'Swift.AnyObject'":

static func getNumberOfDistinctUsers(users: [ICEPKReferenceDataUser]) -> Int {
    var retval: Int = 0
    if let usersAfterPredicate = (users as! AnyObject).valueForKeyPath("@distinctUnionOfObjects.userName") {
        retval = usersAfterPredicate.count
    }
    return retval
}

Can I solve this problem using filter, map or reduce? I am just trying to find the distinct users in the users array using the property userName. Edit: brute-force way

static func getNumberOfDistinctUsers(users: [ICEPKReferenceDataUser]) -> Int {
    var retvalSet: Set<String> = []
    for user in users {
        retvalSet.insert(user.userName)
    }
    return retvalSet.count
}

### Lobsters #### Playscii, an open-source ASCII art and animation program #### Locking the Web Open: A Call for a Distributed Web #### Will there be a Distributed HTTP? ### /r/emacs #### *Typing* diagrams, and get instant feedback in window (C-c C-c). Great for documentation. ### Planet FreeBSD #### New Release Schedule for PC-BSD The PC-BSD team has always been dedicated to bringing you the best graphical BSD desktop possible. We received some great feedback after our last release cycle that made us rethink our release schedule. In the past, we tracked FreeBSD major releases, and also added our own quarterly updates that tended to add in a good bit of code for new features and utilities. Going forward, PC-BSD releases will track FreeBSD releases only, such as 10.2 -> 11.0 -> 11.1.
Once the code base is frozen for a major release, an update can be pushed out to EDGE users who wish to act as advanced users and beta testers for the updates. During that several-week testing period, if something goes wrong we'll count on EDGE users to help report issues so that we can quickly get those bugs fixed during the code freeze. After the several-week testing period, once we are confident that the kinks are worked out and EDGE users are happy, we can release the update for PRODUCTION users. We're also changing the way the EDGE and PRODUCTION branches work a little bit. EDGE packages will now only be built with the 'stable' branch of PC-BSD code, to avoid radical changes that could break functionality in the PC-BSD tool-chain. This also allows us to focus our QA and testing on the new 3rd-party packages themselves. However, all packages that are built on FreeBSD -CURRENT will include PC-BSD's 'master' branch, and are considered bleeding edge. These images will continue to be rolled monthly and are only intended for advanced users or developers who can debug and otherwise help fix issues as they arise. The PRODUCTION package branch will be switching to a monthly update schedule instead of quarterly. The PC-BSD tool-chain will also be based upon "stable", which will not radically change between releases. We are hoping this change will strike a balance between the need for stability and bringing the latest packages / security updates to end users in a timely manner. ### Lobsters #### What if I were 1% charged? ### Fefe #### getdigital has received a cease-and-desist from Getty Images for ... getdigital has received a cease-and-desist from Getty Images over the Socially Awkward Penguin, a well-known meme. They apparently have some kind of trouble magnet in the basement at getdigital; it's not the first time they've been sent takedown demands by various people over alleged copyright problems. Good grief.
The kicker, though, is that Getty offered to settle the matter without attorney's fees, for "only" 785.40 euros. However, the offer only stands if they keep quiet about it. Presumably so that a public warning doesn't spoil Getty's happy hunting grounds. So, once again, a heartfelt thank-you to getdigital, who refuse to play along with this kind of dirty business and would rather make it public and put up with the legal hassle. ### CompsciOverflow #### How do you calculate when effective access time is greater than cache access time? I'm having trouble understanding how to calculate effective access time, hit ratios and cache access times. I'm unfamiliar with the concepts and would love help, or an explanation of how to solve this problem: Consider a memory system with a main memory of 16 MB and a cache memory of 8,192 bytes. The cache cost is 0.02 cents/bit while the memory cost is 0.001 cents/bit. The cache access time is 50 nanoseconds whereas the memory access time is 400 nanoseconds. If the cache hit ratio is 90%, by what percent is the effective access time greater than the cache access time? What is the average cost per byte of memory for the system? Any help is truly appreciated. ### StackOverflow #### Zipping streams using JDK8 with lambda (java.util.stream.Streams.zip) In JDK 8 with lambda b93 there was a class java.util.stream.Streams.zip which could be used to zip streams (this is illustrated in the tutorial Exploring Java8 Lambdas, Part 1 by Dhananjay Nene). This function: Creates a lazy and sequential combined Stream whose elements are the result of combining the elements of two streams. However, in b98 this has disappeared. In fact, the Streams class is not even accessible in java.util.stream in b98. Has this functionality been moved, and if so, how do I zip streams concisely using b98?
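For what it's worth, the zip-with semantics being sought here are just pairwise combination of two sequences under a predicate or combiner. A sketch of the idea in Python (illustrative only; the removed Java b93 API is what the question is actually about):

```python
# The zip-with pattern: walk two sequences in lockstep, combining pairs.
def every(c1, c2, pred):
    """True iff pred holds for every aligned pair of c1 and c2."""
    return all(pred(a, b) for a, b in zip(c1, c2))

def find(c1, c2, pred):
    """First element of c1 whose aligned pair satisfies pred, else None."""
    return next((a for a, b in zip(c1, c2) if pred(a, b)), None)

assert every([1, 2, 3], [1, 2, 3], lambda a, b: a == b)
assert find([1, 2, 3], [3, 2, 1], lambda a, b: a == b) == 2
```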
The application I have in mind is in this java implementation of Shen, where I replaced the zip functionality in the
• static <T> boolean every(Collection<T> c1, Collection<T> c2, BiPredicate<T, T> pred)
• static <T> T find(Collection<T> c1, Collection<T> c2, BiPredicate<T, T> pred)
functions with rather verbose code (which doesn't use functionality from b98). ### AWS #### Amazon S3 Update – CloudTrail Integration You can now use AWS CloudTrail to track bucket-level operations on your Amazon Simple Storage Service (S3) buckets. The tracked operations include creation and deletion of buckets, modifications to access controls, changes to lifecycle policies, and changes to cross-region replication settings. AWS CloudTrail records API activity in your AWS account and delivers the resulting log files to a designated S3 bucket. You can look up API activity related to creating, deleting and modifying your S3 resources using the CloudTrail Console, including access to 7 days of historical data. You can also create Amazon CloudWatch Alarms to look for specific API activities and receive email notifications when they occur. Effective today we are now logging actions on S3 buckets to CloudTrail in all AWS Regions. If you have already enabled CloudTrail, you do not need to take any further action in order to take advantage of this new feature. If you are not using CloudTrail, you can turn it on with a couple of clicks (read my introductory post, AWS CloudTrail – Capture API Activity, to learn more). You can use the log files in many different ways. For example, you can use them as supporting evidence if you need to demonstrate compliance with internal or external policies. Let’s say that you store some important files in an S3 bucket. You can set up a CloudWatch Alarm that will fire if someone else in your organization makes changes to the bucket’s access control policy.
This will allow you to verify that the change is in compliance with your policies and to take immediate corrective action if necessary. You can also monitor creation and deletion of buckets, updates to lifecycle policies, and changes to the cross-region replication settings. Jeff; ### CompsciOverflow #### Global relabeling heuristic: Push-relabel maxflow I have a correct, working implementation of the preflow-push-relabel maxflow algorithm [2]. I am trying to implement the global relabeling update heuristic [3], but have run into some issues. I have a specific instance of the problem here to illustrate my questions: a) For this problem instance, the "current preflow" is a state that is reached by my implemented max-flow algorithm [[ even without the global relabeling heuristic we reach this state ]]. Without applying any heuristics, the algorithm proceeds to completion from this state, giving the correct result. b) The distance labels (in green) at this state represent a valid labeling, as for every $(u, v) \in E_{residual}$ we have $d_u \le d_v + 1$. Q1: The distance label (or height) is supposed to be a lower bound on the distance from the sink. However, for several nodes in the residual graph (e.g. 6, 7) this is not true. E.g. node 7 has a distance label of 14, but clearly has a distance of 1 from the sink in the residual graph. Q2: On running the global relabel at this stage, we get the labeling seen at the extreme right. From this point on (depending on how frequently you do the global update), the algorithm can get stuck [[ deadlock ]] -- e.g. nodes 6 and 3 keep circulating flow between them as they each get relabeled to a greater height. If you run the global update before they reach the final height, you will reset the heights and the process repeats. I am reasonably sure I am making a very trivial error but am unable to put my finger on it. Can someone help me with this issue? I am happy to provide a code snippet if that would help.
A couple more points that may be relevant:
• I am implementing a single-phase version of the push-relabel algorithm (i.e. it does not stop on reaching a min cut, but continues until it obtains a valid flow).
• I am processing active vertices in FIFO order in the current problem instance, but I have a highest-label-first [3] implementation as well. I do not think this affects the two questions I have.
Thank you! [[ This question has also been posted on Programmers Stack Exchange -- I was unsure which forum is better for this question ]]
[2] A new approach to the maximum flow problem; A. Goldberg, R. Tarjan; JACM, Vol. 35, Iss. 4, 1988
[3] On implementing the push-relabel method for the maximum flow problem; B. V. Cherkassky, A. Goldberg
### AWS #### Introducing the AWS SDK for C++ My colleague Jonathan Henson sent me a guest post to introduce a brand-new AWS SDK! — Jeff; After a long effort, we are proud to announce an open-source C++ SDK for scaling your native applications with Amazon Web Services. The AWS SDK for C++ is a modern C++ interface with lightweight dependencies. We designed it to be fully functioning, with both low-level and high-level interfaces. However, we also wanted it to have as few dependencies as possible and to be platform-independent. At the moment, it includes support for Windows, OS X, Linux, and mobile platforms. This SDK has been specifically designed with game developers in mind, but we have also worked hard to maintain an interface that will work for systems engineering tasks, as well as other projects that simply need the efficiency of native code. Features
• Works with the Standard Template Library (STL).
• Custom memory management support.
• C++ 11 features used and supported.
• Builds with CMake and your native compiler toolchain.
• Lightweight dependencies.
• Exception-safe.
• Extensive, configurable logging.
• Default credentials providers.
• Identity management through Amazon Cognito Identity.
• High-level Amazon S3 interface through TransferClient.
• Uses native OS APIs for cryptographic and HTTP support.
Code Samples

Storing a value in an Amazon DynamoDB table:

Aws::DynamoDB::DynamoDBClient dynamoDbClient;
PutItemRequest putItemRequest;
putItemRequest.WithTableName("TestTableName");

AttributeValue hashKeyAttribute;
hashKeyAttribute.SetS("SampleHashKeyValue");
putItemRequest.AddItem("HashKey", hashKeyAttribute);

AttributeValue valueAttribute;
valueAttribute.SetS("SampleValue");
putItemRequest.AddItem("Value", valueAttribute);

auto putItemOutcome = dynamoDbClient.PutItem(putItemRequest);
if (putItemOutcome.IsSuccess())
{
    std::cout << "PutItem Success Using IOPS " << putItemOutcome.GetResult().GetConsumedCapacity();
}
else
{
    std::cout << "PutItem failed with error " << putItemOutcome.GetError().GetMessage();
}

Downloading a file from Amazon Simple Storage Service (S3):

Aws::S3::S3Client s3Client;
GetObjectRequest getObjectRequest;
getObjectRequest.SetBucket("sample_bucket");
getObjectRequest.SetKey("sample_key");
getObjectRequest.SetResponseStreamFactory(
    [](){
        return Aws::New(ALLOCATION_TAG, DOWNLOADED_FILENAME,
                        std::ios_base::out | std::ios_base::in | std::ios_base::trunc);
    });

auto getObjectOutcome = s3Client.GetObject(getObjectRequest);
if (getObjectOutcome.IsSuccess())
{
    std::cout << "File downloaded from S3 to location " << DOWNLOADED_FILENAME;
}
else
{
    std::cout << "File download failed from s3 with error " << getObjectOutcome.GetError().GetMessage();
}

It’s that simple! Download the code from GitHub today and start scaling your C++ applications with the power of Amazon Web Services. Status We are launching the AWS SDK for C++ in its current experimental state while we gather feedback from users and the open source community to harden the APIs. We are also adding support for individual services as we become more confident that the client generator can properly support each protocol. Support for more services will be coming in the near future.
We invite our customers to follow along with our progress and join the development efforts by submitting pull requests and sending us feedback and ideas via GitHub Issues. Jonathan Henson, Software Development Engineer (SDE) ### Lobsters #### Snowsuit Zine // issue 09 - FreeBSD ### CompsciOverflow #### Memory storage capacity of Bienenstock-Cooper-Munro rule I would like to know the memory storage capacity of the BCM learning rule when it is implemented on a Hopfield network. I understand that it will be a function of n, where n is the number of neurons. #### What's Wrong with this LL(1) Grammar? I am trying to build an LL(1) parse table for the following grammar:
S -> L
L -> L : L
L -> id R
R -> ( L )
R -> ( )
R -> Epsilon
There are two problems here. First, the L rules are left-recursive, and the R productions share ( as a common prefix. So I modified the grammar to this one:
S -> L
L -> id R X
X -> : L X
X -> Epsilon
R -> ( P
R -> Epsilon
P -> L )
P -> )
I then checked the grammar with the JFLAP tool, and it says that my grammar is not LL(1). But I just don't see the problem. Can somebody help me out here? regards Alex ### Planet Emacsen #### Irreal: Pages in Emacs Eric James Michael Ritz over at One More Game-Dev and Programming Blog has a nice post on how and why to use pages in Emacs. Pages in Emacs are text separated by form feeds (0xC)1. As Ritz explains, it's easy to move between pages by using 【Ctrl+x [】 (backward-page) and 【Ctrl+x ]】 (forward-page). Most of us don't use pages, but Ritz gives us a good use case. As he points out, you can install a package such as Steve Purcell's page-break-lines package that will make the breaks explicit, and others that make navigation easier. Go read his post and see if you can make pages work for you. ## Footnotes: 1 Although, like everything else in Emacs, that's configurable. It's controlled by the page-delimiter variable.
### CompsciOverflow #### Equivalence in finite sets of turing machines I have this exercise and I really don't know how to complete it: Prove that for every finite set S of Turing machines, it is decidable whether any two given Turing machines M1 and M2 in S recognize the same language, that is, L(M1) = L(M2). I know that the problem of language equivalence is undecidable in general, but the additional hypothesis that the set S of machines is finite does not suggest to me any reason why this case should be different. Can somebody help me? ### TheoryOverflow #### Computing the DAG of a program given source code or AST I've seen many papers on scheduling components or tasks once a DAG for the program is known, either from user input or by domain restriction (e.g. all cross-shaped 5-point stencil codes have a known DAG). Is there an algorithm for taking code (or abstract syntax trees) from a language, say Python, and generating a DAG for it? I've been searching for a paper related to this for a long time and haven't found anything on Google Scholar. Intuitively, I'd say that it's not possible, since some languages (such as Python) can change just about everything at run time, and thus to determine the behavior and dependencies of a given program you would have to run it first for every possible set of inputs. EDIT: I'm using DAG in this sense as the directed acyclic graph of the dependencies (i.e. which parts depend on which other parts) ### StackOverflow #### Maintaining complex state in Haskell Suppose you're building a fairly large simulation in Haskell. There are many different types of entities whose attributes update as the simulation progresses. Let's say, for the sake of example, that your entities are called Monkeys, Elephants, Bears, etc. What is your preferred method for maintaining these entities' states?
The first and most obvious approach I thought of was this:

mainLoop :: [Monkey] -> [Elephant] -> [Bear] -> String
mainLoop monkeys elephants bears =
  let monkeys'   = updateMonkeys monkeys
      elephants' = updateElephants elephants
      bears'     = updateBears bears
  in if shouldExit monkeys elephants bears
       then "Done"
       else mainLoop monkeys' elephants' bears'

It's already ugly having each type of entity explicitly mentioned in the mainLoop function signature. You can imagine how it would get absolutely awful if you had, say, 20 types of entities. (20 is not unreasonable for complex simulations.) So I think this is an unacceptable approach. But its saving grace is that functions like updateMonkeys are very explicit in what they do: They take a list of Monkeys and return a new one. So then the next thought would be to roll everything into one big data structure that holds all state, thus cleaning up the signature of mainLoop:

mainLoop :: GameState -> String
mainLoop gs0 =
  let gs1 = updateMonkeys gs0
      gs2 = updateElephants gs1
      gs3 = updateBears gs2
  in if shouldExit gs0
       then "Done"
       else mainLoop gs3

Some would suggest that we wrap GameState up in a State Monad and call updateMonkeys etc. in a do. That's fine. Some would rather suggest we clean it up with function composition. Also fine, I think. (BTW, I'm a novice with Haskell, so maybe I'm wrong about some of this.) But then the problem is, functions like updateMonkeys don't give you useful information from their type signature. You can't really be sure what they do. Sure, updateMonkeys is a descriptive name, but that's little consolation. When I pass in a god object and say "please update my global state," I feel like we're back in the imperative world. It feels like global variables by another name: You have a function that does something to the global state, you call it, and you hope for the best. (I suppose you still avoid some concurrency problems that would be present with global variables in an imperative program.
But meh, concurrency isn't nearly the only thing wrong with global variables.) A further problem is this: Suppose the objects need to interact. For example, we have a function like this:

stomp :: Elephant -> Monkey -> (Elephant, Monkey)
stomp elephant monkey =
  (elongateEvilGrin elephant, decrementHealth monkey)

Say this gets called in updateElephants, because that's where we check to see if any of the elephants are in stomping range of any monkeys. How do you elegantly propagate the changes to both the monkeys and elephants in this scenario? In our second example, updateElephants takes and returns a god object, so it could effect both changes. But this just muddies the waters further and reinforces my point: With the god object, you're effectively just mutating global variables. And if you're not using the god object, I'm not sure how you'd propagate those types of changes. What to do? Surely many programs need to manage complex state, so I'm guessing there are some well-known approaches to this problem. Just for the sake of comparison, here's how I might solve the problem in the OOP world. There would be Monkey, Elephant, etc. objects. I'd probably have class methods to do lookups in the set of all live animals. Maybe you could look up by location, by ID, whatever. Thanks to the data structures underlying the lookup functions, they'd stay allocated on the heap. (I'm assuming GC or reference counting.) Their member variables would get mutated all the time. Any method of any class would be able to mutate any live animal of any other class. E.g. an Elephant could have a stomp method that would decrement the health of a passed-in Monkey object, and there would be no need to pass that change back up the call stack. Likewise, in an Erlang or other actor-oriented design, you could solve these problems fairly elegantly: Each actor maintains its own loop and thus its own state, so you never need a god object.
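The actor idea sketched here can be illustrated outside Erlang too. A toy, single-threaded version in Python (all names invented for illustration): each entity owns its state and reacts only to messages in its mailbox, so no god object is threaded through the calls.

```python
# Toy actor: state is private to the object; other code can only send messages.
class MonkeyActor:
    def __init__(self, health):
        self.health = health
        self.inbox = []

    def send(self, msg):
        self.inbox.append(msg)

    def step(self):
        # Drain the mailbox, updating local state per message.
        while self.inbox:
            msg = self.inbox.pop(0)
            if msg == "stomped":
                self.health -= 1

m = MonkeyActor(health=10)
m.send("stomped")   # an elephant actor would send this instead of mutating m
m.step()
assert m.health == 9
```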
And message passing allows one object's activities to trigger changes in other objects without passing a bunch of stuff all the way back up the call stack. Yet I have heard it said that actors in Haskell are frowned upon. ### CompsciOverflow #### Bottleneck TSP with MST There is a problem I don't know the answer to: the 3-approximation for the bottleneck TSP that involves first computing the MST. I have not been able to come up with the right "shortcut" method so far. Please help! To be more specific, the problem comes from Cormen's Introduction to Algorithms, 3rd edition. The problem is: "In the bottleneck traveling-salesman problem, we wish to find the hamiltonian cycle that minimizes the cost of the most costly edge in the cycle. Assuming that the cost function satisfies the triangle inequality, show that there exists a polynomial-time approximation algorithm with approximation ratio 3 for this problem. (Hint: Show recursively that we can visit all the nodes in a bottleneck spanning tree, as discussed in Problem 23-3, exactly once by taking a full walk of the tree and skipping nodes, but without skipping more than two consecutive intermediate nodes. Show that the costliest edge in a bottleneck spanning tree has a cost that is at most the cost of the costliest edge in a bottleneck hamiltonian cycle.)" I can prove that the MST is also a bottleneck spanning tree; however, when I search online and on Google Scholar for the step where we walk on the bottleneck spanning tree, all of the articles are on much better approximations, and I can't seem to find any reference to this type of argument. ### TheoryOverflow #### (False?) proof for computability of a function? Consider $f(n)$, a function that returns 1 iff $n$ zeros appear consecutively in $\pi$. Now someone gave me a proof that $f(n)$ is computable: Either for all $n$, $0^n$ appears in $\pi$, or there is an $m$ s.t. $0^m$ appears in $\pi$ and $0^{m+1}$ does not.
For the first possibility, $f(n) := 1$; for the second one, $f(n) := 1$ iff $n \leq m$, and 0 otherwise. The author claims that this proves computability of $f(n)$, as there exists an algorithm to compute it. Is this proof correct? ### /r/compsci #### Top 10 algorithms of the 20th century [Xpost: /r/math] ### CompsciOverflow #### 2 Dimensional Subset Sum: looking for information I do not know if this problem exists under a different name; if it does, I could not find it. The problem is this: Given a set S of n points in \mathbb{Z}^2, is there a subset A\subset S such that the points of A sum up to (0,0)? If we restrict to \mathbb{Z}, this is the classic Subset-Sum problem. Furthermore, this can be generalized to n dimensions if all points are in \mathbb{Z}^n. Since Subset-Sum is NP-complete, so is this problem; thus I am interested in approximating it. (Here there is another issue, even for classic Subset-Sum: since the target is 0, how should the approximation ratio be represented? Maybe using the largest integer or the mean of S?) Subset-Sum has an FPTAS, but I don't think that is the case for 2D Subset-Sum. One idea is to solve only in the x dimension and, from all the admissible solutions, pick the one that minimizes the sum in the y dimension. But this cannot guarantee a ratio for the y sum. Does anyone have any information for me? Even bad news (inapproximability) is good news. The only close problem that I managed to locate is the Weighted Distribution problem (a set of binary vectors; ask if there is a subset that sums to the zero vector) coming from coding theory, proven to be W[2]-complete in this paper, thus not in EPTAS. EDIT 1: Now that I look at it again, I do not think that Weighted Distribution is similar. It is more like the classic Subset-Sum, since every bit vector can just represent a number. Now I am looking at the multidimensional Knapsack. EDIT 2: An idea to transform it to 1D-SS is to take the xx' (and afterwards the yy') projections.
The xx' projection will give you a good x value, but for the y axis we cannot guarantee anything. Another approach, generalizing the classic SS approximation idea (see here and here), is to create rings in \mathbb{Z}^2, but this also does not work well. Maybe if we take multiple projections and choose the best? ### Lobsters #### Breaking and entering: lose the lock while embracing concurrency, Part I ### CompsciOverflow #### Algorithm to group two-dimensional data points I feel like there should be a known algorithm for the following problem, but I am short of ideas on how to construct or search for it. Suppose as input you have a list of two-dimensional data points (xi, yi). Your goal is to "simplify" this data by providing an output that is equivalent to the input, but grouped together using structures like this: ({x1, x2, .., xn}, {y1, y2, .., ym}), which denotes the set of all (xa, yb) where a ∈ {1, 2, .., n} and b ∈ {1, 2, .., m}. A perfect result is to have as few of these structures + remaining data points as possible. Some points to consider: • The output can be a combination of simple data points and such structures. • The structures in the output should be mutually exclusive. • It may well be that there is no way to simplify a given input. Example 1: Input: (1, 1), (1, 2), (2, 1), (2, 2), (4, 5) Output: ({1, 2}, {1, 2}), (4, 5) Example 2: Input: (1, 1), (1, 2), (1, 3), (4, 5) Output: ({1}, {1, 2, 3}), (4, 5) Example 3: Input: (1, 2), (2, 2), (3, 2), (2, 1), (2, 3) Output: ({1, 2, 3}, {2}), ({2}, {1, 3}) ### Fefe #### Google is now swinging the big ban hammer at ... Google is now swinging the big ban hammer at websites that first shove a full-page "please download our app!1!!" ad in your face. Very nice! FINALLY! Now please also downrank all websites that pester you with "like us on facebook", and all YouTube videos that overlay "please subscribe" bullshit! Oh, and this EU cookie crap can go in the bin too.
Rarely has anything backfired as badly as that did. ### Halfbakery #### Clickbait Resort (0.5) ### CompsciOverflow #### Does removal of unit productions from a grammar increase the total number of productions? I have a doubt! The problem is: There are m variables in a grammar. The number of productions after removal of unit productions in the worst case is (assume there are no null productions): (a) O(m) (b) O(m^2) (c) O(k^m) (d) O(2^m). Let me try to explain. The question asks about the number of productions remaining in the minimized grammar. Having no null productions is an advantage: as you know, when we convert a grammar, removal of null productions generally introduces new productions into the resultant grammar (up to the set of all subsets of the maximum number of variables on a right-hand side, so exponential). But in the given grammar you have no null productions, so you are removing only unit productions (i.e. productions of the form A -> B), and I believe the resultant grammar has at most m productions, where m is the number of variables. Example:
S -> A
A -> B
B -> a/b/c
Resultant grammar: S -> a/b/c. So it is O(m). Still, I have a doubt: does it depend on the number of variables at all? I am confused, because it seems it should depend on the number of non-terminals on the right-hand side of the productions. I am clueless. I know removal of null productions can increase the total number of productions. I need an explanation. This is not homework. ### TheoryOverflow #### A probabilistic set with no false positives? So, Bloom filters are pretty cool -- they are sets that support membership checking with no false negatives, but a small chance of a false positive. Recently though, I've been wanting a "Bloom filter" that guarantees the opposite: no false positives, but potentially false negatives. My motivation is simple: given a huge stream of items to process (with duplicates), we'd like to avoid processing items we've seen before. It doesn't hurt to process a duplicate, it is just a waste of time.
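One simple structure with exactly this inverted guarantee is a fixed-size, direct-mapped cache that stores items verbatim and evicts on hash collision: a membership hit compares against the stored item itself, so it can never report a false positive, while an evicted item yields a false negative. A sketch (class name invented for illustration):

```python
class NoFalsePositiveSet:
    """Fixed-size table storing items verbatim. Eviction on collision
    gives possible false negatives but never false positives."""
    def __init__(self, slots=1024):
        self.table = [None] * slots

    def add(self, item):
        self.table[hash(item) % len(self.table)] = item  # may evict a prior item

    def __contains__(self, item):
        # Equality check against the stored item rules out false positives.
        return self.table[hash(item) % len(self.table)] == item

s = NoFalsePositiveSet(slots=8)
s.add("alpha")
assert "alpha" in s            # just-added items are found
assert "never-added" not in s  # a hit always means the item was really added
```

Space overhead is higher than a Bloom filter's, since full items are stored, but the no-false-positive guarantee is unconditional.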
Yet, if we neglected to process an element, it would be catastrophic. With a "reverse Bloom filter", one could store the items seen with little space overhead, and avoid processing duplicates with high probability by testing for membership in the set. Yet I can't seem to find anything of the sort. The closest I've found are "retouched Bloom filters", which allow one to trade selected false positives for a higher false negative rate. I don't know how well their data structure performs when one wants to remove all false positives, however. Anyone seen anything like this? :)

### StackOverflow

#### Wait for latest values from dependent streams in BaconJS?

I have 3 streams. gradingResult and contextId depend on studentResponse. I need to fire one event, and only one event (otherwise this is trivial), when all 3 have their latest values. I've tried #combineTemplate and #sampledBy studentResponse. Unfortunately, I always see the wrong data -- gradingResult and contextId have the old values in the combined template. How can I wait for all streams to have the latest values? The code is shown below:

```javascript
var studentResponse = new Bacon.Bus();
var gradingResult = new Bacon.Bus();
var contextId = new Bacon.Bus();

studentResponse.onValue(function(f) {
  gradingResult.push(f);
  contextId.push(f);
});

Bacon.combineTemplate({
  studentResponse: studentResponse,
  gradingResult: gradingResult,
  contextId: contextId
}).sampledBy(studentResponse)
  .onValue(function(t) {
    console.log(t);
  });

studentResponse.push(1);
studentResponse.push(2);
studentResponse.push(3);
```

Link to jsfiddle: https://jsfiddle.net/3o4c9sm8/1/

UPDATE: this is a contrived example. In the real code, gradingResult is an ajax request. Both gradingResult and contextId have time dependencies on studentResponse.

#### How to implement a map that associates a name with a void function in C?
After reading Structure and Interpretation of Computer Programs (SICP), I decided to find a way to implement some of these functional programming techniques in C. I tried to write a program that makes a pair whose first element is the name of a function and whose second element is any function that takes one argument and returns one value. Using the implementation below I was expecting to see output like:

```
fact(7) = 5040
fib(7) = 13
```

but instead I am getting

```
fact(7) = 5040
fib(7) = 0
```

along with warnings during compilation:

```
cc map.c
map.c: In function ‘main’:
map.c:41:17: warning: assignment from incompatible pointer type [enabled by default]
   maps[0].f_ptr = &fact;
                 ^
map.c:43:17: warning: assignment from incompatible pointer type [enabled by default]
   maps[1].f_ptr = &fib;
                 ^
map.c:47:7: warning: passing argument 1 of ‘maps[i].f_ptr’ makes pointer from integer without a cast [enabled by default]
   ans = (int) maps[i].f_ptr((int) num);
       ^
map.c:47:7: note: expected ‘void *’ but argument is of type ‘int’
map.c:47:13: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
   ans = (int) maps[i].f_ptr((int) num);
             ^
map.c:52:7: warning: passing argument 1 of ‘maps[i].f_ptr’ makes pointer from integer without a cast [enabled by default]
   ans2 = (int) maps[i].f_ptr((int) num);
        ^
map.c:52:7: note: expected ‘void *’ but argument is of type ‘int’
map.c:52:14: warning: cast from pointer to integer of different size [-Wpointer-to-int-cast]
   ans2 = (int) maps[i].f_ptr((int) num);
```

Looking at the code I don't see the problem, but then again I haven't used C in quite some time. Is there a better way to implement such a construct, and why is fib(7) printing 0 instead of 13?
Here's my code:

```c
struct Map {
  char* name;
  void* (*f_ptr)(void*);
};

int fact(int a) {
  if (a == 0) return 0;
  if (a == 1) return 1;
  return a * fact(a-1);
}

int fib(int a) {
  if (a == 0) return 0;
  if (a == 1) return 1;
  return fib(a-1) + fib(a-2);
}

int findFunc(char* str, struct Map map) {
  if (map.name == str)
    return 1;
  return 0;
}

int main() {
  int i = 0;
  int ans = 0;
  int ans2 = 0;
  int num = 7;

  struct Map maps[2];
  maps[0].name = "fact";
  maps[0].f_ptr = &fact;
  maps[1].name = "fib";
  maps[1].f_ptr = &fib;

  for (i; i < (sizeof(maps)/sizeof(maps[0])); i++) {
    if (findFunc("fact", maps[i]))
      ans = (int) maps[i].f_ptr((int) num);
  }

  for (i; i < (sizeof(maps)/sizeof(maps[0])); i++) {
    if (findFunc("fib", maps[i]))
      ans2 = (int) maps[i].f_ptr((int) num);
  }

  printf("fact(%d) = %d\n", num, ans);
  printf("fib(%d) = %d", num, ans2);
  return 0;
}
```

### Lobsters

#### Speaker list for Haskell eXchange conference, 2-day Hackathon, and promo code

Hey everyone! The programme for Haskell eXchange in London is finally getting settled; we'll have 4 keynotes over the two days, from Simon Peyton Jones, Lennart Augustsson, Simon Marlow and Luite Stegeman! To check out the full list of speakers and their topics, go here: https://skillsmatter.com/conferences/7069-haskell-exchange-2015#program Descriptions for some of the keynotes are still missing; hopefully they will be added soon.

We're also announcing a 2-day Hackathon over the following weekend, where the focus will be on improving the Haskell infrastructure. Anyone should feel free to come, and if you're a beginner who wants to start contributing to the Haskell infrastructure, you'll be able to get help from experienced contributors. Read the full description and register here (it's free): https://skillsmatter.com/meetups/7314-haskell-hackathon-2015#overview

Finally, if you want to get 25% off the conference ticket, you can use the promo code HASKELL-EXCHANGE-25; it should be valid until September 12th!
Hope to see you there! :)

### CompsciOverflow

#### Does a reentrant list for a signal queue in a single-threaded environment exist?

I need to handle Unix signals in a single-threaded application with the following goals:

1. Signals are not masked on receipt (thus, the signal handler must be reentrant).
2. I am not allowed to lose signal data (thus, if a new signal arrives before the handler of the previous one returns, it must also be handled correctly).

I have the common multi-threaded primitives (spinlocks, semaphores, etc.). But they don't seem sufficient, because my higher-level data structures (even one as simple as a list) aren't thread-safe. My initial idea was the following:

1. I use a list in which I quickly store the data of the incoming signal,
2. and process it (call the possibly much slower handlers) later, outside the critical section.

The main problem with that is that the list data structure isn't thread-safe. If I lock it, I can't store a second signal anywhere. I can't wait until the previous handler exits, because on the second signal it is essentially suspended in a critical section. I simply don't have any idea how to handle the following scenario:

1. signal1 arrives, the process suspends, and the handler of signal1 starts
2. signal2 arrives, the handler of signal1 suspends, and the handler of signal2 starts
3. The handler of signal2 returns, and execution returns to the handler of signal1
4. The handler of signal1 returns, and execution returns to the main program.

After thinking about it a lot, I have the impression that my problem may be unsolvable. Am I right? How do operating systems handle similar problems (for example, possibly bursting interrupts from hardware)?

### TheoryOverflow

#### Deciding functionality of transducers over infinite words

Given a finite state transducer defining a rational relation over infinite words, it is known to be decidable whether or not the relation is a function, i.e.
whether each infinite input word is related to at most one infinite output word. This is detailed in a paper by Gire: Two Decidability Problems for Infinite Words. Unfortunately, I cannot find the full text of the paper anywhere. The basic idea seems to be to form the composition of the transducer with its inverse, $T \circ T^{-1}$, and check whether the resulting transducer is a restriction of the identity function. Note that the inverse of a transducer $T$ is $T$ with the input and output words swapped on each transition. I am looking for details of the decision procedure. Do you have any references or a short description of the algorithm?

### CompsciOverflow

#### RAPTOR Algorithm: Differences between trip and route

I'm reading Microsoft's paper about an algorithm for public transit named Round-Based Public Transit Routing, RAPTOR for short. In this algorithm, the authors give some definitions that I don't understand well.

About a trip: a trip represents a sequence of stops a specific vehicle (train, bus, subway, ...) visits along a line.

About a route: each route consists of the trips that share the same sequence of stops.

So, if I understand correctly, a trip means a "bus route" and a route is a collection of trips making up a path, right? If my guess is true, why do the authors state: "Typically there are more routes than trips"? And if a route is just a collection of trips, how can I build all routes? Thanks :)

### /r/compsci

#### Video: Looking to the future with quantum chemistry methods for quantum computers.

Crossposted to /r/quantumcomputing & /r/HPC

### /r/scala

#### Fine Tuning Sentiment Analysis with Scala

### Fefe

#### Dear readers, today I have to do something for justice ...

Dear readers, today I have to do something for justice. I received a letter from Cologne complaining that this blog only ever picks on screw-ups from Berlin, and maybe Hamburg now and then. Cologne, the writer says, is right up there with them.
Cologne can also convincingly botch airports, and elections are another thing they can't manage in Cologne :-)

#### The Hamburg district court has once again surveyed the situation ...

The Hamburg district court has once again surveyed the situation and then handed down the ruling that causes the most damage: "DIN standards, too, can be protected by copyright". Bonus: the case was directed against a US website.

Update: Yes, folks, this is old news. That wasn't my point. My point was: this would have been the opportunity to neatly pull the business model out from under the Beuth-Verlag. There is no justification of any kind for anyone earning money from access to standards. Either it is a standard whose purpose is to be a general rule for building things interoperably, in which case it must also be freely accessible to everyone, or it is not a standard, in which case, as far as I'm concerned, we can let the market sort it out. But what we have here is precisely not a market in which anyone else could compete; rather, we have standards that I am required to comply with, but only one party is allowed to sell them to me. Here the Hamburg district court could have said: no, friends, this is not protected by copyright. In character it is like an official document.

#### On the Zalando story, another email came in, ...

On the Zalando story, another email came in explaining that in retail the displayed stock figures are generally correct, but the seller always keeps slightly more of each item in stock than is displayed, so that in the event of race conditions between several buyers it can still deliver. Ordering processes simply consist of several steps, and the systems apparently don't always update their inventories in real time either.

#### The Left Party in Saxony is currently getting "almost daily" ...
### DataTau

#### Civis API: Scale Up Your Data Science

### Lobsters

#### Bitcoin and the Uniform Commercial Code

### /r/scala

#### How do I satisfy IntelliJ IDEA's expectation that my Scala class's companion object's declaration be tested?

```scala
case class Point(x: Double = 0, y: Double = 0, z: Double = 0) extends Point3D(x, y, z) {
  def +(that: Point): Point = new Point(x + that.x, y + that.y, z + that.z)
  def -(that: Point): Point = new Point(x - that.x, y - that.y, z - that.z)
  def *(that: Point): Point = new Point(x * that.x, y * that.y, z * that.z)
  def /(that: Point): Point = new Point(x / that.x, y / that.y, z / that.z)
}

object Point {
  def apply(p3d: Point3D) = new Point(p3d.getX, p3d.getY, p3d.getZ)
}
```

IntelliJ is giving this code 92% coverage because the line `object Point {` is being counted in coverage, but I cannot figure out why, or how to satisfy its expectation. I have tried comparing it to something, tried adding another method and calling that... no such luck, and I am out of ideas. Here is the test code. Any help would be much appreciated!

```scala
import org.scalatest._

class test_Point extends FlatSpec with Matchers {
  val u = Point(100, 200, 300)
  val v = Point(623, -85, 300)

  it should "supply zero-valued defaults" in {
    val p = Point()
    p.x should be (0)
    p.y should be (0)
    p.z should be (0)
  }

  it should "be constructable from base class" in {
    val p = Point(new Point3D(0, 0, 0))
  }

  it should "implement + operator" in {
    val w = Point(u.x + v.x, u.y + v.y, u.z + v.z)
    u + v should be (w)
  }

  it should "implement - operator" in {
    val w = Point(u.x - v.x, u.y - v.y, u.z - v.z)
    u - v should be (w)
  }

  it should "implement * operator" in {
    val w = Point(u.x * v.x, u.y * v.y, u.z * v.z)
    u * v should be (w)
  }

  it should "implement / operator" in {
    val w = Point(u.x / v.x, u.y / v.y, u.z / v.z)
    u / v should be (w)
  }
}
```

submitted by Chronopolitan [link] [8 comments]

### /r/compsci

#### Why is the volume slider still logarithmic?

This makes absolutely no sense to me.
submitted by Syntaximus [link] [1 comment]

### Planet Theory

#### Séminaire N. Bourbaki – Designs Exist (after Peter Keevash) – the paper

In June I gave a lecture in Bourbaki's séminaire devoted to Keevash's breakthrough result on the existence of designs. Here is a draft of the paper: Designs exist (after P. Keevash). Remarks, corrections and suggestions are most welcome!

I would have loved to expand a little on:

1) How designs are connected to statistics
2) The algebraic part of Keevash's proof
3) The "Rödl-style probabilistic part" (which I largely took for granted)
4) The greedy-random method in general
5) Difficulties when you move from graph decomposition to hypergraph decomposition
6) Wilson's proofs of his theorem
7) Teirlinck's proof of his theorem. I knew at some point, in rough detail, both Wilson's proof (I heard 8 lectures about and around it from Wilson himself in 1978) and Teirlinck's (Eran London gave a detailed lecture at our seminar), but I largely forgot; I'd be happy to see a good source.
8) Other cool things about designs that I should mention.
9) The Kuperberg-Lovett-Peled work

(To be realistic, adding something for half of these items would be nice.)

Here is the seminar page (with videotaped lectures), and the home page of the Association des collaborateurs de Nicolas Bourbaki. You can find there cool links to old expositions since 1948, which overall give a very nice picture of modern mathematics and its highlights. Here is the link to my slides. In my case (but probably also for some other Bourbaki speakers), it is not that I had full understanding (or close to it) of the proof and just had to decide how to present it; my presentation largely represents what I know, and the séminaire forced me to learn.
I was lucky that Peter gave a series of lectures (Video 1, Video 2, Video 3, Video 4) about it in the winter at our Midrasha, and that he decided to write a paper "Counting designs" based on the lectures, and even luckier that Jeff Kahn taught some of it in class (based on Peter's lectures and subsequent article) and later explained some core ingredients to me. Here is a link to Keevash's full paper "The existence of designs," and an older post on his work. Curiously, the street was named only after Pierre Curie until the 60s, and near the street sign you can still see the older sign.

### /r/compsci

#### Seriously frustrated about my career

Hi guys, I'm very sad to say that I'm not happy as a programmer, and I'd like to share and discuss my experience, as well as hear your perspective and opinion on it.

I got 'lucky' enough to try out programming when I was 12, and back then I decided that I wanted to do it all my life. I seemed seriously talented at algorithm design, and the idea of making computers do what I want rather than using other people's programs just seemed fun; to this day I still have fun doing my job (relatively speaking -- maybe the first couple of hours). Yet I kind of think I fucked up. I'm pretty negative about it right now. I don't like the kind of people I meet in this field of work, and I hate the environment and the schedule. I think that I had one chance to work a bit harder and get into something 'cool' like medicine or aviation, even if I'm not particularly talented, but instead I chose to sit at an office all day, every day. I think that the 'funky' atmosphere of most software companies is actually there to hide something dark, and that many young programmers are young adults wasting their lives from 9am to 8pm after they wasted their teen years on videogames.

If you are a young guy considering the career, take my perspective into account, but obviously take it with a grain of salt.
And most importantly, do some research on the different options you have instead of going for whatever you happened to be exposed to.

submitted by xHzMU9 [link] [comment]

### infra-talk

#### Transplanting Go packages for fun and profit

A while back I read Coders at Work, a book of interviews with some great computer scientists who earned their stripes, the questions just as thoughtful as the answers. For one thing, it re-ignited my interest in functional programming; for another, I got interested in literate programming; but most of all, it struck me how common a recommendation it was to read other people's code as a means to become a better programmer. (It also has a good section of Brad Fitzpatrick describing his dislike for programming languages, and dreaming about his ideal language. This must have been shortly before Go came about and he became a maintainer.)

I hadn't been doing a good job of reading and studying other code, out of fear that inferior patterns and style would rub off on me. But I soon realized that was an irrational, perhaps slightly absurd excuse. So I made the decision to change. Contrary to my presumption, I found that by reading code that looks bad you can challenge and re-evaluate your mindset, and come out with a more nuanced understanding and awareness of the pros and cons of various approaches. I also realized that if code is proving too hard to get into, or is of too low quality, you can switch to another code base with negligible effort and end up spending almost all of your time reading code that is worthwhile and has plenty to teach. There is a lot of high-quality Go code, easy to find through sites like GitHub or Golang Weekly; just follow your interests and pick a project to start reading.
It gets really interesting, though, once you find bodies of code that are not only a nice learning resource but can also be transplanted into your own code with minimal work to solve a problem you're having -- in a different context than the one the author originally designed the code for. Components often grow and mature in the context of an application without being promoted as reusable libraries, but you can often use them as if they were. I would like to share two such success cases below.

# NSQ's diskqueue code

I've always had an interest in code that manages the same binary data both in memory and on a block device. Think filesystems, databases, etc. There are some interesting concerns, like robustness in light of failures, combined with optimizing for performance (infrequent syncs to disk, maintaining the hot subset of data in memory, etc.) and for various access patterns; this can be a daunting topic to get into. Luckily there's a use case I see all the time in my domain (telemetry systems) that covers just enough of these problems to be interesting and fun, but not enough to be overwhelming. And that is: for each step in a monitoring data pipeline, you want to be able to buffer data if the endpoint goes down -- in memory, and to disk if the amount of data gets to be too much, especially if you're also concerned with your software crashing or the machine power-cycling. This is such a common problem, applying to all metrics agents, relays, etc., that I was longing for a library that just takes care of spooling data to disk for you without really affecting much of the rest of your software. All it needs to do is sequentially write pieces of data to disk and have a sequential reader catching up, reading newer data as it finishes processing the older. NSQ is a messaging platform from bitly, and it has diskqueue code that does exactly that. And it does so oh so elegantly.
I had previously found a beautiful pattern in bitly's Go code that I blogged about, and here again I found a nice and elegant design that builds further on that pattern: concurrent access to data is protected via a single instance of a for loop running a select block, which ensures only one piece of code can make changes to the data at a time (see the bottom of the file), not unlike ioloops in other languages. Method calls such as Put() provide a clean external interface, though their implementation simply hooks into the internal select loop that runs the code doing the bulk of the work. Genius.

```go
func (d *diskQueue) Put(data []byte) error {
	// some details
	d.writeChan <- data
	return <-d.writeResponseChan
}
```

In addition, the package came with extensive tests and benchmarks out of the box. After finding and familiarizing myself with this diskqueue code about a year ago, I had an easy time introducing disk spooling to carbon-relay-ng by transplanting the code into it. The only change I had to make was capitalizing the Diskqueue type to export it outside of the package. It has proven a great fit, enabling a critical feature through the small job of transplanting mature, battle-tested code into a context the original authors probably never thought of. Note also how the data unit here is the []byte: the queue does not deal with the higher-level nsq.Message (!). The authors had the foresight to keep this generic, enabling code reuse, and rightfully shot down a PR of mine that had the side effect of making the queue aware of the Message type. In NSQ you'll find thoughtful and deliberate API design and pretty sound code all around. They also went pretty far in detailing lessons learned and providing concrete advice -- a very interesting read, especially around managing goroutines and synchronizing their exits, and around performance optimizations. At Raintank, we had a need for a messaging solution for metrics, so we will soon be rolling out NSQ as part of the Raintank stack.
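The single-select-loop ownership pattern described above can be sketched in a few lines. This is a toy illustration with invented names, not the actual NSQ diskqueue code: one goroutine owns the backing slice, and Put() merely exchanges messages with it, so no locks are needed.

```go
package main

import "fmt"

// queue serializes all access to q.data through a single goroutine
// running ioLoop. Names here are illustrative only.
type queue struct {
	writeChan         chan []byte
	writeResponseChan chan error
	readChan          chan []byte
	exitChan          chan struct{}
	data              [][]byte
}

func newQueue() *queue {
	q := &queue{
		writeChan:         make(chan []byte),
		writeResponseChan: make(chan error),
		readChan:          make(chan []byte),
		exitChan:          make(chan struct{}),
	}
	go q.ioLoop()
	return q
}

// Put looks like an ordinary method call from outside, but it only
// hands the data to ioLoop and waits for its answer.
func (q *queue) Put(data []byte) error {
	q.writeChan <- data
	return <-q.writeResponseChan
}

// ioLoop is the sole owner of q.data.
func (q *queue) ioLoop() {
	for {
		// Enable the read case only when there is data; sending on a
		// nil channel blocks forever, which disables that select case.
		var readChan chan []byte
		var next []byte
		if len(q.data) > 0 {
			readChan = q.readChan
			next = q.data[0]
		}
		select {
		case d := <-q.writeChan:
			q.data = append(q.data, d)
			q.writeResponseChan <- nil
		case readChan <- next:
			q.data = q.data[1:]
		case <-q.exitChan:
			return
		}
	}
}

func main() {
	q := newQueue()
	q.Put([]byte("hello"))
	q.Put([]byte("world"))
	fmt.Println(string(<-q.readChan)) // hello
	fmt.Println(string(<-q.readChan)) // world
}
```

The nil-channel trick in ioLoop is the standard Go idiom for conditionally disabling a select case; the real diskqueue uses the same loop-owns-the-state idea with file-backed storage behind it.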
This is an interesting case where my past experience with the NSQ code and ideas helped in adopting the full solution.

# Bosun expression package

I'm a fan of the bosun alerting system, which came out of Stack Exchange. It's a full-featured alerting system that solves a few problems like no other tool I've seen does (see my linked post), and, timeseries data storage aside, comes with basically everything built into the one program. I've used it with success. However, for litmus I needed an alerting handler that integrated well into the Grafana backend. I needed the ability to do arbitrarily complex computations; Graphite's API only takes you so far. We also needed (desired) reduction functions, boolean logic, etc. This is where bosun's expression language is really strong. I found the expression package quite interesting: they basically built their own DSL for metrics processing, so it deals with parsing expressions, constructing ASTs, executing them, dealing with types (potentially mixed types in the same expression), and so on. But bosun also has incident management, contacts, escalations, etc. -- stuff that we either already had in place or didn't want to worry about just yet. So we could run bosun standalone and talk to it as a service via its API (which I found too loosely coupled and risky), hook all its code into our binary at once (which seemed overkill), or take the strategy I chose: gradually familiarize ourselves with, and adopt, pieces of bosun on a case-by-case basis, making sure there's a tight fit, and without ever building up so much technical debt that it would become a pain to move away from the transplanted code if it became clear it was not, or no longer, well suited. For the foreseeable future we only need one piece, the expression package. Ultimately we may adopt the entire thing, but without the upfront commitment and investment.
So practically, our code now simply has one line where we create a bosun expression object from a string, and another where we ask bosun to execute the expression for us, which takes care of parsing the expression, querying for the data, evaluating and processing the results, and distilling everything down into a final result. We get all the language features (reduction functions, boolean logic, nested expressions, ...) for free. This transplantation was again probably not something the bosun authors expected, but for us it was tremendously liberating. We got a lot of power for free; the only thing I had to do was spend some time reading code, learning in the process. And I knew the code was well tested, so we had zero issues using it.

Much akin to the NSQ example above, there was another reason the transplantation went so smoothly: the expression package is not tangled into other stuff. It just needs a string expression and a graphite instance -- to be precise, any struct instance that satisfies the graphiteContext interface handily defined in the bosun code. While the bosun design aims to make its various clients (graphite, opentsdb, ...) applicable to other projects, it also happens to let us do the opposite: reuse some of its core code -- the expression package -- and pass in a custom graphite Context, such as our implementation, which has extensive instrumentation. This lets us use the bosun expression package as a "black box" and still inject our own custom logic into the part that queries data from graphite. Of course, once we want to change the logic of anything else inside the black box, we will need to come up with something else, perhaps fork the package, but it doesn't seem like we'll need that any time soon.

# Conclusion

If you want to become a better programmer, I highly recommend you go read some code. There's plenty of good code out there. Pick something that deals with a topic of interest to you and looks mature.
You typically won’t know if code is good before you start reading but you’ll find out really fast, and you might be pleasantly surprised, as was I, several times. You will learn a bunch, possibly pretty fast. However, don’t go for the most advanced, complex code straight away. Pick projects and topics that are out of your comfort zone and do things that are new to you, but nothing too crazy. Once you truly grok those, proceed to other, possibly more advanced stuff. Often you’ll read reusable libraries that are built to be reused, or you might find ways to transplant smaller portions of code into your own projects. Either way is a great way to tinker and learn, and solve real problems. Just make sure the code actually fits in so you don’t end up with the software version of Frankenstein’s monster. It is also helpful to have the authors available to chat if you need help or have issues understanding something, though they might be surprised if you’re using their code in a way they didn’t envision and might not be very inclined to provide support to what they consider internal implementation details. So that could be a hit or miss. Luckily the people behind both nsq and bosun were supportive of my endeavors but I also made sure to try to figure out things by myself before bothering them. Another reason why it’s good to pick mature, documented projects. Part of the original meaning of hacking, extended into open source, is a mindset and practice of seeing how others solve a problem, discussion and building on top of it. We’ve gotten used to - and fairly good at - doing this on a project and library level but forgot about it on the level of code, code patterns and ideas. I want to see these practices come back to life. 
We also apply this at Raintank: not only are we trying to build the best open source monitoring platform by reusing (and often contributing to) existing open source tools and working with different communities, we realize it's vital to work on a more granular level, get to know the people, and practice cross-pollination of ideas and code.

Next up, I want to read, and possibly implement or transplant parts of: dgryski/go-trigram and armon/go-radix, especially as used in the dgryski/carbonmem server to search through Graphite metrics. Other fun stuff by dgryski: an implementation of the ARC caching algorithm, and bloom filters. (You might want to get used to reading Wikipedia pages as well.) And mreiferson/wal, a write-ahead log by one of the nsqd authors, which looks like it will become the successor of the beloved diskqueue code.

Go forth and transplant!

Also posted on the Raintank blog.

### CompsciOverflow

#### How to construct a grammar that generates language L?

I'm in a formal languages class and have a grammar quiz coming up. I'm assuming something like this will appear. Consider the alphabet $\Sigma = \{a, b, c\}$. Construct a grammar that generates the language $L = \{bab^nabc^na^p \mid n \geq 0, p \geq 1\}$. Assume that the start variable is $S$.

### Planet Theory

#### News for August

This month saw more development on testing properties of distributions, and a result with intriguing connections to property testing. For readers who may have missed it, Clément Canonne and Gautam Kamath wrote an engaging survey of some recent work on testing properties of distributions here.

Optimal algorithms and lower bounds for testing closeness of structured distributions, by Ilias Diakonikolas, Daniel M. Kane, and Vladimir Nikishkin (arXiv).
One of the fundamental results in testing properties of distributions is that if we want to estimate the ($$L_1$$) distance between two unknown distributions on a domain of size $$n$$ up to some constant additive factor, we need to draw $$O(n^{2/3})$$ samples from these distributions, and this sample complexity is tight in general. But what if we consider the same problem in the setting where we are promised that the distributions come from some (known) class of distributions? This paper shows that, for many natural classes of distributions, we can obtain much more efficient algorithms for testing closeness of distributions in this setting.

Smooth Boolean functions are easy: efficient algorithms for low-sensitivity functions, by Parikshit Gopalan, Noam Nisan, Rocco A. Servedio, Kunal Talwar, and Avi Wigderson (arXiv).

The (maximum) sensitivity of a Boolean function $$f : \{0,1\}^n \to \{0,1\}$$ is the size of the largest set $$S \subseteq [n]$$ such that there is an input $$x \in \{0,1\}^n$$ for which $$f(x) \neq f(y)$$ for every input $$y$$ obtained by flipping the value of the $$i$$th coordinate of $$x$$ for some $$i \in S$$. One of the results in this paper shows that functions with low sensitivity can be locally self-corrected: given query access to a function $$r : \{0,1\}^n \to \{0,1\}$$ that is close to a function $$f$$ with low sensitivity (think of $$r$$ as a corrupted version of $$f$$), there is an algorithm that, for any input $$x \in \{0,1\}^n$$, queries $$r$$ on a few inputs and outputs the value $$f(x)$$ with high probability. This notion of local self-correction is of course closely related to locally decodable codes; it is also one of the fundamental techniques used to obtain many results in property testing (see for example Chapter 3 of Dana Ron's survey). Can this result, or the techniques used to obtain it, also yield new results in property testing?
### QuantOverflow

#### Full Kelly portfolios having the same weights as tangency portfolios

I'm currently comparing empirically the differences between Markowitz and Kelly portfolios. I calculated the Kelly weights from monthly return observations over 10 years for a sample of 50 stocks from the S&P 500. Without constraints, I obtained a highly levered Kelly portfolio with weights summing to 23, a mean of 85%, and a standard deviation of 91%. I then de-levered the portfolio proportionally and obtained the tangency portfolio, with exactly the same weights as in the mean-variance optimization case. I was surprised by this result and wanted to ask whether anyone knows why this could be the case, or has made similar observations. I'm very grateful for any kind of advice on this subject.

### CompsciOverflow

#### What is the difference between temporal dynamics in RNNs and the NEF?

Both spiking neural networks created with the Neural Engineering Framework (NEF) and Recurrent Neural Networks (RNNs) can be connected recurrently to exhibit neural dynamics. What is the difference between the sets of dynamics that they can approximate and/or exhibit?

### Fefe

#### Not even hands up helps against trigger-happy US cops, ...

Not even hands up helps against trigger-happy US cops, it turns out. Maybe the USA should simply give Texas back to Mexico wholesale.

#### Not only the GPL camp has holy systemd wars, ...

Not only the GPL camp has holy systemd wars; OpenBSD, too, has a festering drama bubble around LibreSSL and stunnel.

#### Iceland's handling of its financial crisis has progressed so far ...

Iceland's handling of its financial crisis has progressed so far that the government now wants to sell its shares in Landsbanki. Back then the bank had to be nationalized in an emergency, and the shares (probably over 70% of the total) were then held by the state rescue fund ISFI.
Formell gibt es zwei Voraussetzungen für den Verkauf: Das Budgetgesetz muss das beinhalten (ist der Fall) und ISFI muss nach einer Prüfung des Marktumfeldes einen formellen Vorschlag vorlegen (ist noch nicht geschehen). Das ist aber auf jeden Fall ein Meilenstein für Island. Es ist übrigens spannend, dass bei einer aktuellen Umfrage nur gut 40% für einen Verkauf waren, und die Mehrheit wollte die Bank lieber staatlich halten. #### Berlin: Senat beschlagnahmt früheres Bankgebäude ... #### Das kennt ihr sicher, klassischer E-Mail-Verteiler-Anfängerfehler. ... Das kennt ihr sicher, klassischer E-Mail-Verteiler-Anfängerfehler. Alle Empfänger nicht im Bcc sondern im To oder Cc. Das kann auch richtig peinlich werden, wenn es eine Klinik für Geschlechtskrankheiten mit HIV-Patienten macht. ### StackOverflow #### Measure purity of a function Is there an existing solution to detect if a function is pure or impure, or the degree of purity using some metrics? The case is, I want to transition my existing app code to more functional-style, and one of the desires is to reduce the impurity of functions to the bare minimum level. I can do it manually, but if can be automated, why not? I know cyclomatic complexity can be measured with tools like ESLint, so I think purity can be too. ### High Scalability #### Building Globally Distributed, Mission Critical Applications: Lessons From the Trenches Part 2 This is Part 2 of a guest post by Kris Beevers, founder and CEO, NSONE, a purveyor of a next-gen intelligent DNS and traffic management platform. Here's Part 1. ## Integration and functional testing is crucial Unit testing is hammered home in every modern software development class. It’s good practice. Whether you’re doing test-driven development or just banging out code, without unit tests you can’t be sure a piece of code will do what it’s supposed to unless you test it carefully, and ensure those tests keep passing as your code evolves. 
In a distributed application, your systems will break even if you have the world's best unit testing coverage. Unit testing is not enough. You need to test the interactions between your subsystems. What if a particular piece of configuration data changes – how does that impact Subsystem A's communication with Subsystem B? What if you changed a message format – do all the subsystems generating and handling those messages continue to talk with each other? Does a particular kind of request that depends on results from four different backend subsystems still result in a correct response after your latest code changes? Unit tests don't answer these questions, but integration tests do. Invest time and energy in your integration testing suite, and put a process in place for integration testing at all stages of your development and deployment process. Ideally, run integration tests on your production systems, all the time.

## There is no such thing as a service-interrupting maintenance

### TheoryOverflow

#### Commonality among greatest evidences for variations of class separations [on hold]

There are many classes which are conjectured to be separate in complexity theory; the famous P versus NP problem, for instance. What are famous examples of such conjecturally separate classes, and what is commonly considered the strongest evidence for and against such conjectures? What is the main theme behind such evidence? Note that this post is not related to known barriers in class separation. Even evidence that supports remote possibilities such as P=NP is welcome. I think a survey of such evidence would be of utility.

### /r/netsec

#### Third party website resources can now be verified with a hash in Chromium version 45.

### Lobsters

#### The Ten Commandments of egoless programming

#### Regretting the Golden Handcuffs: Beware the Costs of Burnout

### /r/scala

#### type erasure and overloading

I read that overloading has some restrictions because of type erasure.
For instance,

def f(s: Seq[Int]) = ...
def f(s: Seq[Double]) = ...

are not both allowed. Why is that? Type erasure is a runtime thing and overloading a compile-time thing. So, couldn't the compiler define two different functions and decide at compile time which one to call? submitted by Kiuhnm [link] [5 comments]

### StackOverflow

#### Why higher order procedures?

So if a language provides higher-order procedures, then I can have a procedure that returns a procedure. Something like:

(define (Proc a b c)
  (lambda (x) ( #| method body here in terms of a b c and x |# )))

To create a new procedure, I would just do something like:

(define ProcA (Proc a1 b1 c1)) ; Would create ProcA that has 1 argument

A similar task could be done in a language which does not support higher-order procedures, by defining Proc to take 4 instead of 3 arguments and calling this procedure to define ProcA, like:

(define (Proc a b c x) ( #| method body -- does not return any procedure |# ))
(define (ProcA x) (Proc a1 b1 c1 x))

So why is there so much fuss about higher-order procedures? Am I missing something?

### Lobsters

#### Compilers for Free (2013)

### /r/netsec

#### RSA is not dead (yet) - Factoring RSA Keys With TLS Perfect Forward Secrecy

### Lobsters

#### In depth guide to stack

### CompsciOverflow

#### Recurrence Relation (with Square root)

I came across a very peculiar recurrence relation:

$$\sqrt{T(n)} = \sqrt{T(n-1)} + 2\sqrt{T(n-2)}$$

with initial values $T(0) = T(1) = 1$. Any help on how to solve it?

#### Differences between SISD, SIMD and MIMD architecture (Flynn classification)

I have a problem with classifying certain CPUs to the proper classes of Flynn's Taxonomy.

# 1.
Zilog Z80

According to this article on Sega Retro, the Z80 has limited abilities to be classified as SIMD:

a limited ability for SIMD (Single Instruction, Multiple Data) with instructions to perform copy, compare, input, and output over contiguous blocks of memory;

From what I understand, the Z80 usually behaves as SISD, but when it comes to performing things like copying or comparing, the Z80 is able to process multiple data using a single instruction. How should we classify the Z80 then? Is the ability to act as a SIMD processor an argument for or against saying that the Z80 implements a SIMD architecture?

# 2. Intel i5 (Dual core)

From what I understand, we classify multicore CPUs as MIMD. Is it as simple as that?

# 3. ARM Cortex-A15 (single core)

I'd classify the architecture of this processor as a SIMD model. Wikipedia says that it has a superscalar pipeline, but as we know from "Why is a superscalar processor SIMD?", multiple pipelines do not imply a MIMD model. Are "modern" single cores usually implementing a SIMD model or not?

### Dave Winer

#### Progress on my Slack clone

I spent the last 1.5 weeks working on a very simple chat app that implements a subset of the Slack API. It works! I baked the server side of the code into a new version of nodeStorage, which is a convenient place to put it because it already has full support for Twitter identity and of course storage, both of which come in handy for this app. I'm not expecting to release the client soon. I may use it as a chat function for Scripting News, but first I want to let the whole thing settle in. However, I will include the server side for webhooks in a new release of nodeStorage. That to me is the really interesting part. Chat programs are commonplace, but such a nice simple API implemented in Node.js seems like it might have value to others. (Update: The new version of nodeStorage has been released.) It's been a really fun project. PS: I wrote about this project last month.
### Lobsters

#### Comments on the Alliance for Open Media

#### Tab Discarding in Chrome: a Memory-Saving Experiment

Lurching towards a process model. Comments

### /r/compsci

#### PL constructs and/or formalisms that work well for iteration through two (or more) collections?

Lately, I have been pondering a question about good ways to specify and/or code iteration. Some background: one thing we have gotten very good at dealing with in recent years is iteration through a collection, both formally and in code. Once upon a time the counter-based for-loop held unquestioned dominance in programming land, but nowadays virtually every PL has some notion of an iterable sequence and some support for higher-order abstractions. We map and filter and zip and fold. We use comprehensions. We compose various of these. Etc. Only in C does the old way still rule.

Ah, but what if we have two collections? Obviously, we can concatenate. We can do a Cartesian product by nesting constructs. And we can iterate in lockstep using a zip or something similar. But there are other things we might want to do. I'm thinking primarily of set-like operations on (usually sorted) sequences. Quintessential examples are std::set_intersection and the rest of them from the C++ Standard Template Library. The stable merge operation that is part of merge sort is another example. In each of these operations, we are given two iterable sequences, and we traverse each just once, in forward sequential order. But we may not handle the two sequences at the same rate.

As an experiment, I tried implementing the set-like operations on sequences in a variety of programming languages. I generally found them at least somewhat annoying to write in all PLs: messy, inconvenient, error-prone. Consistently, the standard constructs of the language were a relatively poor match for the problem I was solving. So, my question: has anyone looked at this problem? Has there been significant progress?
In particular, are there any programming-language constructs, or formalisms, that have been found to be a good match for this multiple-iteration idea? (Apologies for the lousy post title. I could not think of a concise way to explain what I am wondering about. Still can't, actually.) submitted by ggchappell [link] [1 comment]

### StackOverflow

#### How to reduce repetition of prime factors of a number in this method

I tried to solve Project Euler #3 as per the documented solution, but it is resulting in repetition of prime factors. In the above example for the factors of 10, 2 is repeating multiple times (3 times) whereas 5 isn't repeating (I understood the reason behind it), but how do I avoid that and print only the factors of 10, just 2 and 5?

### Lobsters

#### Handling Uncertainty When Estimating Software Projects

### /r/netsec

#### Mainframe (SystemZ) Bind Shell - source code

### TheoryOverflow

#### Consecutive ones property in rows and columns

The clique vertex matrix of interval graphs has the consecutive ones property along the columns. Adjacency matrices of which graphs have the consecutive ones property in both rows and columns?

#### Graphs with consecutive ones property in both rows and columns

The consecutive ones property of the rows of the adjacency matrix of a graph gives a characterisation of interval graphs. Adjacency matrices of which graphs have the consecutive ones property in both rows and columns?

### /r/compsci

#### O(n^2) Sorting Algorithms working on colored lines

### Fefe

#### Nvidia forgets a central DirectX 12 feature in ...

Nvidia forgets a central DirectX 12 feature in their hardware. Now that's a welcome change from cheating in benchmarks and cheating in specifications! One really has to wonder how much tailwind of this kind AMD still needs before something competitive comes out of them.

#### Proposal of the year: Use Al Qaeda Fighters to Beat ...

Proposal of the year: Use Al Qaeda Fighters to Beat ISIS!
It comes from ex-general Petraeus, and he certainly knows his way around bad ideas and the pointless endless wars that follow from them. Warm regards to the photo desk for this piece :-)

### TheoryOverflow

#### Is there a stand-alone statistical ZK argument with concurrent knowledge extraction?

Is there any known construction for an interactive argument of knowledge that

• is stand-alone statistical zero-knowledge, and

• allows concurrent knowledge extraction?

This is a weakening of my previous question from 6 months ago, and is something that would presumably be easier to construct.

#### Quasi-polynomial time algorithm for graph isomorphism

GI is arguably the best known candidate for an NP-intermediate problem. The best known algorithm is a sub-exponential algorithm with run-time $2^{O(\sqrt{n \log n})}$. However, I am not aware of any quasi-polynomial algorithm for it. GI is not NP-complete unless the polynomial hierarchy collapses. Minimum Dominating Set in Tournaments, Group Isomorphism, and Tournament Isomorphism are solvable in quasi-polynomial time ($O(n^{\log n})$). The last two problems are polynomial-time reducible to GI. Is there any complexity-theoretic conjecture that would rule out quasi-polynomial time algorithms? Can we efficiently reduce Minimum Dominating Set in Tournaments to GI? Is there any reason to believe that GI cannot be hard for quasi-polynomial time?

### StackOverflow

#### Reading from screen of laptop/pc/mobile (bluestack) [on hold]

Here is my question. Is there any programming language or software which can read from a specific portion of the screen? Like in this image (think of it as opened from BlueStacks), there is "Inbox" written at the top. So if I give the x and y coordinates enclosing those letters, then the output should be "Inbox is written at those coordinates".
### /r/compsci

#### A Simple Proof Checker for Teaching

### Lamer News

#### Family Law Lawyer Alexandria, VA

### Lobsters

#### I'm a developer, but it's not my passion

### Dave Winer

#### CitiBike goes north

Things are going to get interesting as CitiBike goes north up the west side of Manhattan. The yellow markers in the map below are stations that haven't been installed yet. But some of the new ones have already been installed. Places you'll be able to get to by bike: Lincoln Center, Zabar's, Barney Greengrass, the Museum of Natural History, the Metropolitan Museum, 72nd St stores, the Marina, a bunch of new places on the edges of the park. I only wish they put a station near the fountain in the middle of the park. It would probably have to be huge it would be so well-used. (And it would lead to lots of abuse by bikes on walking paths, which is probably why they didn't do it.)

#### Idea

If you have a restaurant or shop in Manhattan, west Brooklyn or Queens, pay for a CitiBike station outside your business, and let your customers know they can conveniently ride a bike to come eat or shop there. It'll be like having a subway stop in front of your store, soon enough. An investment for the future of the city.

### /r/compsci

#### F*: A Higher-Order Effectful Language Designed for Program Verification

### UnixOverflow

#### What's the name argument to sh for?

Taken from FreeBSD's man page for sh (because it's the most conveniently available online; the target platform is Debian if it matters):

SH(1)          FreeBSD General Commands Manual          SH(1)

NAME
     sh -- command interpreter (shell)

SYNOPSIS
     sh [-/+abCEefhIimnPpTuVvx] [-/+o longname] [script [arg ...]]
     sh [-/+abCEefhIimnPpTuVvx] [-/+o longname] -c string [name [arg ...]]
     sh [-/+abCEefhIimnPpTuVvx] [-/+o longname] -s [arg ...]
     ...
I'm particularly interested in the use case of:

sh [-/+abCEefhIimnPpTuVvx] [-/+o longname] -c string [name [arg ...]]

### Example:

# Normal:
myscript hello world
myscript hello world >/var/log/myscript.combined 2>&1

# Constrained:
# - Arguments 'hello' and 'world' are suffixed by the calling program.
myscript >/var/log/myscript.combined 2>&1 hello world

# Constrained Further:
# - Wrapped in sh to catch system messages, like Segmentation Fault
sh -c 'myscript @' _ >/var/log/myscript.combined 2>&1 hello world

I noticed that the first argument was not passed to myscript, and the documentation alludes to a name parameter that I didn't see a doc section on. In my example I have added a _ in place of the name argument, but:

### What should I fill in for name?

### CompsciOverflow

#### What kind of extra optimizations can a JIT compiler perform to a dynamically typed language?

For dynamically typed languages such as Python, Ruby, etc., the usual approach to maximize runtime performance is just-in-time compilation; PyPy would be a good example. It is possible to achieve dynamic typing with a statically typed language such as C++ by attaching a type code to every object; the functions can then operate on the correct type by runtime branching, that is, by switching on the type code. I once did an experiment in C++ by creating a dynamic type that can hold either an integer, a floating point number, a pointer to an arbitrary-precision integer, or a pointer to a dynamically allocated array. In this class there is an integer representing the type code and a union that holds an int, a double, and a void *. With this, I could directly translate a piece of Python code, leaving all the dynamic properties in place. The speed improvement after this translation was about 20~40% depending on the type of code. But the interesting part is that PyPy often performed unbeatably better than the C++ translation.
Of course a fully optimized C++ code with its usual static type information will be the fastest, but I could see that the PyPy optimizer is applying some special optimizations that a typical static compiler doesn't. I'm no expert in this area, but I'm still interested in what kind of extra optimizations a JIT compiler can perform that a static compiler cannot do easily, especially for dynamically typed languages.

### Fefe

#### Turkey wants to press charges against two journalists from ...

Turkey wants to press charges against two Vice News journalists because encryption software frequently used by ISIS was found on the laptop of their local guide.

### CompsciOverflow

#### Converting domain-specific natural language to database query

I am interested in building something similar to what is shown in this video. For those unable to watch the video: it shows a system that takes natural language queries (which are all related to basketball) and converts them to database queries. Examples from the video:

"lebron james most points"
"Who was the last player to average 24+ ppt in a season"
"lebron ppg by month last season"

My plan was to build a grammar, where different rules would map to parts of a database query. To process a new natural language query, I would then stitch together these different query parts by traversing the tree in-order (or maybe post-order) to construct a query. My feeling is that this is a pretty naive approach, and I am looking for ways to bolster it.

### TheoryOverflow

#### Who introduced nondeterministic computation?

I have two historical questions: Who first described nondeterministic computation? I know that Cook described NP-complete problems, and that Edmonds proposed that P algorithms are "efficient" or "good" algorithms.
I searched this Wikipedia article and skimmed "On the Computational Complexity of Algorithms," but couldn't find any reference to when nondeterministic computation was first discussed. What was the first reference to the class NP? Was it Cook's 1971 paper?

### Dave Winer

#### Launching a Mac app from Node?

Back in 1999, I needed to keep Frontier running on a Mac server, so I wrote a little AppleScript that watched to see if the app was still running, and if not, launched it. I need the same functionality today, but I'd rather use Node than AppleScript, if possible.

#### The question

Is there a way to make system calls to the Mac OS from Node that allow you to: 1. Tell if an app is still running. 2. Launch an app. Any help much appreciated. PS: Cross-posted on Facebook.

#### Update

Ted Howard posted a solution on the Frontier-User list. It took a few tries but I got it working on my Mac. Problem solved.

### Lobsters

#### Propositions as Types with Michael Bernstein

#### ES6 Object Literal Features in Depth

### TheoryOverflow

#### Trying to understand a paper on the K-SVD algorithm (dictionary learning) by Elad, et al.

I am trying to understand a paper titled "K-SVD: An Algorithm for Designing Overcomplete Dictionaries for Sparse Representation" by M. Elad, et al. My take on section IV.C, the detailed description of K-SVD, is that we have an objective function which we wish to minimize with respect to two variables, $D$ (which is called the dictionary) and $X$ (the coefficient matrix):

$$\min_{X,D} \{\|Y - DX\|_F^2\} \quad \textrm{subject to} \quad \forall i,\ \|x_i\|_0 \leq T_0$$

One method of doing this is to divide it into two stages, sparse coding and dictionary update. In the sparse coding stage, we assume $D$ is fixed and solve the above problem w.r.t. $X$ using a pursuit algorithm. Then we must update the dictionary to better fit our data, using the coefficient matrix from the previous stage. In the update stage, we keep everything fixed except one column of $D$ and its corresponding row in $X$. After simplifying the objective function and decomposing the multiplication $DX$ into the sum of $k$ rank-1 matrices, one might suggest using the SVD. But that would ruin the sparsity of $X$, so we need to set constraints. At this point, the authors define some variables and use them to set the constraints. The problem is, I don't understand what is meant by this statement:

examples that use the atom $d_k$, i.e., those where $x_T^{k}(i)$ is nonzero.

Any clarifications would be appreciated.
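For what it's worth, the restriction the quoted statement describes can be sketched in a few lines of NumPy: when updating atom $d_k$, only the columns of the error matrix belonging to signals that actually use $d_k$ (nonzero entries in row $k$ of $X$) are fed to the SVD, and only those coefficients are rewritten, which is exactly what keeps $X$ sparse. This is my own illustrative sketch of that one step under those assumptions, not the paper's reference implementation:

```python
import numpy as np

def ksvd_atom_update(Y, D, X, k):
    # Signals that use atom k: the nonzero entries of row k of X.
    omega = np.nonzero(X[k, :])[0]
    if omega.size == 0:
        return D, X  # atom unused; nothing to update
    # Representation error with atom k's contribution removed.
    E_k = Y - D @ X + np.outer(D[:, k], X[k, :])
    # Restrict to the signals that use atom k (preserves sparsity of X).
    E_k_R = E_k[:, omega]
    U, s, Vt = np.linalg.svd(E_k_R, full_matrices=False)
    D, X = D.copy(), X.copy()
    D[:, k] = U[:, 0]              # new unit-norm atom
    X[k, omega] = s[0] * Vt[0, :]  # updated coefficients, same support
    return D, X
```

By the optimality of the rank-1 SVD truncation, this update can only decrease $\|Y - DX\|_F$, while the zero pattern of row $k$ of $X$ is left untouched.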

### Fred Wilson

#### What Are We Doing?

I’ve been told over the years that the number one reason employees leave a company for another job is that they don’t know where the company is headed and they don’t know what their role in that roadmap is.

Apparently a bad boss and below market comp and long working hours all come in lower than that one.

Which tells me that if people believe in you and your plan and have a role to play in it, they will put up with a lot of other stuff.

I am not advocating for a hostile workplace or below market comp.

What I am advocating for is the value of having a clear and intelligent plan, communicating it clearly and often, and, most importantly, mapping that plan to each team and each person in your organization.

This is pretty easy when you are five people. It gets harder when you are fifty people. And it’s really hard when you are 500 people.

The communication plan is very important particularly as the size of the company grows. And making it matter to everyone is quite challenging.

But it all starts with a plan. I know companies that are great at communicating but don’t really have a coherent plan. All the communicating in the world won’t help them.

So figure out where you are taking your company. Answer the basic question on everyone’s minds, “what are we doing?”, and you will be on a path to building a loyal, hardworking, and motivated team.

### QuantOverflow

#### Likelihood of a caplet ending in the money

With what likelihood would one expect an ATM caplet to end up in the money? Just as a very rough guess, from real-world experience.

When I consider N(d2) from the Black formula, for spot = strike = 4%, vola = 20%, T = 1, tenor = 12m, I get something around 46.*%.

On the other hand, when I think about it qualitatively: Under most market conditions, the forward rate curve is increasing. As a very, very rough argument, we would expect the spot rate 1 year into the future to be similar to today's spot rate rather than today's 1 year forward rate, i.e. to be lower than today's ATM strike. Thus, under the physical measure, the caplet should be less likely to end up in the money than end up out of the money. Does this roughly conform with a number around 46%?
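For reference, the roughly 46% figure follows directly from the Black formula: with spot = strike, the $\log(F/K)$ term vanishes and $N(d_2) = N(-\sigma\sqrt{T}/2)$. A quick Python check (my own sketch of the standard formula):

```python
from math import log, sqrt
from statistics import NormalDist

def black_prob_itm(F, K, sigma, T):
    # Risk-neutral probability that the caplet fixes in the money: N(d2).
    d2 = (log(F / K) - 0.5 * sigma ** 2 * T) / (sigma * sqrt(T))
    return NormalDist().cdf(d2)

p = black_prob_itm(F=0.04, K=0.04, sigma=0.20, T=1.0)
print(round(p, 4))  # 0.4602, i.e. N(-0.1)
```

Note this is a probability under the pricing measure, not the physical measure the qualitative argument above is about.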

For my thesis, I am using a simulation model that tries to create synthetic realizations of market data (in a complicated procedure that would go too far to explain here). In this synthetic world, I get ITM-likelihoods much lower than the above and I am wondering about real world estimates from a practitioner's point of view.

### StackOverflow

#### How Do I Read This Hole Error

I am trying to write an implementation of flatMap. But that is inconsequential; I'm really interested in understanding the error message. For context, so far I have:

flatMap ::
(a -> List b)
-> List a
-> List b
flatMap f xs = foldRight undefined undefined _undefined


This gives the following error for the type hole:

Found hole ‘_undefined’ with type: List a0
Where: ‘a0’ is an ambiguous type variable
Relevant bindings include
xs :: List a (bound at src/Course/List.hs:266:11)
f :: a -> List b (bound at src/Course/List.hs:266:9)
flatMap :: (a -> List b) -> List a -> List b
(bound at src/Course/List.hs:266:1)
In the third argument of ‘foldRight’, namely ‘_undefined’
In the expression: foldRight undefined undefined _undefined
In an equation for ‘flatMap’:
flatMap f xs = foldRight undefined undefined _undefined


I know the right peg for this hole is xs, but I don't understand why the Haskell compiler has given it a type denoted List a0, and how that relates to the type I'm going to use, List a.

I think it has something to do with the fact that the hole could take a superset of the type List a or something? But why then didn't it just call it List b instead of specifically List a0?

Cheers, Jim
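Not an answer to the a0 question, but the intended shape of the call ("the right peg for the hole is xs") can be illustrated in Python, where a right fold over a list corresponds to a reduce over the reversed list. This is my own analogy, not the course code:

```python
from functools import reduce

def fold_right(f, acc, xs):
    # foldRight f acc xs: combine elements from the right,
    # so reduce over the reversed list, flipping argument order.
    return reduce(lambda a, x: f(x, a), reversed(xs), acc)

def flat_map(f, xs):
    # Mirrors: flatMap f xs = foldRight (\x acc -> f x ++ acc) [] xs
    return fold_right(lambda x, a: f(x) + a, [], xs)

print(flat_map(lambda x: [x, x * 10], [1, 2, 3]))  # [1, 10, 2, 20, 3, 30]
```

The list being folded is the third argument, which is why the hole in the Haskell version should be filled by xs.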

### Lambda the Ultimate Forum

#### The most obsolete infrastructure money could buy - my worst job ever

A funny article by Juho Snellman about really existing legacy software engineering and PLT.

For example, on my first day I found that X was running what was supposedly [the] largest VAXcluster remaining in the world, for doing their production builds. Yes, dozens of VAXen running VMS, working as a cross-compile farm, producing x86 code. You might wonder a bit about the viability of the VAX as a computing platform in the year 2005. Especially for something as CPU-bound as compiling. But don't worry, one of my new coworkers had as their current task evaluating whether this should be migrated to VMS/Alpha or to VMS/VAX running under a VAX emulator on x86-64!

Why did this company need to maintain a specific C compiler anyway? Well, they had their own ingenious in-house programming language that you could think of as an imperative Erlang with a Pascal-like syntax that was compiled to C source. I have no real data on how much code was written in that language, but it'd have to be tens of millions of lines at a minimum.

The result of compiling this C code would then be run on an ingenious in-house operating system that was written in, IIRC, the late 80s. This operating system used the 386's segment registers to implement multitasking and message passing. For this, they needed a compiler with much more support for segment registers than normal. Now, you might wonder about the wisdom of relying heavily on segment registers in the year 2005. After all, the use of segment registers had been getting slower and slower with every generation of CPUs, and in x86-64 the segmentation support was essentially removed. But don't worry, there was a project underway to migrate all of this code to run on Solaris instead.

### Planet Theory

#### TR15-144 | Explicit resilient functions matching Ajtai-Linial | Raghu Meka

A Boolean function on n variables is q-resilient if for any subset of at most q variables, the function is very likely to be determined by a uniformly random assignment to the remaining n-q variables; in other words, no coalition of at most q variables has significant influence on the function. Resilient functions have been extensively studied with a variety of applications in cryptography, distributed computing, and pseudorandomness. The best known balanced resilient function on n variables, due to Ajtai and Linial ([AL93]), is Omega(n/(log^2 n))-resilient. However, the construction of Ajtai and Linial is by the probabilistic method and does not give an efficiently computable function. In this work we give an explicit monotone depth-three almost-balanced Boolean function on n bits that is Omega(n/(log^2 n))-resilient, matching the work of Ajtai and Linial. The best previous explicit constructions, due to Meka [Meka09] (which only gives a logarithmic-depth function) and Chattopadhyay and Zuckerman [CZ15], were only n^{1-c}-resilient for any constant c < 1. Our construction and analysis are motivated by (and simplify parts of) the recent breakthrough of [CZ15] giving explicit two-source extractors for polylogarithmic min-entropy; a key ingredient in their result was the construction of explicit constant-depth resilient functions. An important ingredient in our construction is a new randomness-optimal oblivious sampler which preserves moment generating functions of sums of variables and could be useful elsewhere.
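To make the coalition-influence notion concrete, here is a small brute-force check (my own illustration, not from the paper): a coalition's influence is the probability, over a random assignment to the other variables, that the coalition can still decide the outcome.

```python
from itertools import product

def coalition_influence(f, n, coalition):
    """Probability, over a uniform assignment to the variables outside
    `coalition`, that the coalition can still change f's value."""
    rest = [i for i in range(n) if i not in coalition]
    sway = 0
    rest_assignments = list(product((0, 1), repeat=len(rest)))
    for vals in rest_assignments:
        outcomes = set()
        for cvals in product((0, 1), repeat=len(coalition)):
            x = [0] * n
            for i, v in zip(rest, vals):
                x[i] = v
            for i, v in zip(coalition, cvals):
                x[i] = v
            outcomes.add(f(x))
        if len(outcomes) > 1:  # coalition can flip the outcome
            sway += 1
    return sway / len(rest_assignments)

maj3 = lambda x: int(sum(x) >= 2)
parity = lambda x: sum(x) % 2
print(coalition_influence(maj3, 3, (0,)))    # 0.5: one voter sways majority-of-3 half the time
print(coalition_influence(parity, 3, (0,)))  # 1.0: any single variable always decides parity
```

A q-resilient function is one where every coalition of size at most q has small influence in this sense; parity is the maximally non-resilient extreme.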

### Undeadly

#### Native EFI Bootloader Support for OpenBSD

As the j2k15 hackathon comes to a close, OpenBSD gets its very own native EFI bootloader.

On Twitter, Yojiro UO (yuo@-san) posted a list of systems tested and working with the new bootloader.

As you can see, a number of EFI-only systems are now successfully booting OpenBSD.

And, as seen on Twitter, Yasuoka@-san posted a little teaser with the dmesg from an EFI-only MinnowBoard.

### CompsciOverflow

#### starting android [on hold]

I have installed the Android SDK with Android Studio, and the Eclipse platform. The two have been integrated as per the instructions.

Now how to check if the integration was proper?

Basically, how am I supposed to run a hello-world program to check if everything is going well? What are the steps I need to follow?

#### doubt about phones present in the dictionary [on hold]

I am using KALDI speech recognition toolkit to train my system for speech recognition and I am using ILSL12 naming convention to represent various sounds.

I am facing a challenge in dictionary update:

When I break a word into phones (or individual sounds), then for all the complete sounds that are in the middle of the word, I add an 'a' phone to represent the complete sound.

However, when a similar complete sound occurs at the end of the word, I am instructed not to put that 'a' sound at the end.

Is that fair?

For Example:

Suppose the word is nal (the word has a complete 'n' sound and a complete 'l' sound). When breaking it into phones, I am writing

'n a l'

The a after n means that n has a complete sound (as was already given). But l also has the complete 'l' sound; so why are we not writing nal as 'n a l a'?

#### recursive function as circular function

I recently learnt that a recursive function is also referred to as a circular function.

As I see it, a recursive function cannot be a circular function, because even though there is a call to the function itself, it does not happen indefinitely. There is always a base case (or condition) from which the function exits one way or the other (either by displaying some message or returning some value).

So, isn't calling a recursive function a circular function a misnomer?
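The point can be seen in a minimal Python sketch: the self-call always moves toward a base case, so the chain of calls is finite rather than circular.

```python
def factorial(n):
    # Base case: the self-reference bottoms out here, so the call chain
    # factorial(3) -> factorial(2) -> factorial(1) -> factorial(0) is finite.
    if n == 0:
        return 1
    return n * factorial(n - 1)

print(factorial(5))  # 120
```

A genuinely circular definition would be one with no base case (or one whose recursive calls never reach it), and that one really would never terminate.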

### UnixOverflow

#### What is the difference between vim and vim-lite?

I want to install vim on my FreeBSD machine, and this other site suggests installing from FreeBSD ports like this:

# cd /usr/ports/editors/vim-lite/
# make install clean


This is all great, and I intend to install from the FreeBSD ports. However, I notice that there is also this other directory on my machine:

/usr/ports/editors/vim/


What is the difference between vim and vim-lite? And why would I want one versus the other? Obviously, vim-lite is "lighter"; but what am I sacrificing to have a lighter version?

### QuantOverflow

#### How to calculate yield spread?

I came across this multiple-choice question on yield spread, and I can't understand why the reasoning for the selected answer is correct. Can you confirm or clarify?

( emphasis in the text is mine)

The bond's market price of 103.73 can be computed by discounting its cash flows continuously at 4.0% per annum, which is represented by the flat yellow line. Specifically:

$3.00e^{-0.04 \times 0.5} + 3.00e^{-0.04 \times 1.0} + 3.00e^{-0.04 \times 1.5} + 103.00e^{-0.04 \times 2.0} = 103.73$.
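The quoted price is easy to verify numerically (a quick check of the discounting arithmetic, nothing more):

```python
from math import exp

# Semiannual coupons of 3.00 plus final coupon-and-principal of 103.00,
# discounted continuously at a flat 4% per annum.
cash_flows = [(0.5, 3.00), (1.0, 3.00), (1.5, 3.00), (2.0, 103.00)]
price = sum(cf * exp(-0.04 * t) for t, cf in cash_flows)
print(round(price, 2))  # 103.73
```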

The bond's same market price of \$103.73 can also be derived by discounting the same cash flows according to the continuous discount rates given by the steep blue line. The lower steep line, which shows a rate of 0.40% at six months, is actually two nearby curves: a swap rate curve and a nearby spot rate curve.

Both start at 0.40% but, as the spot rate curve is slightly steeper, by year 2.0 the spot rate is 1.61% while the swap rate is 1.60%. For this purpose, we assume both the spot and swap curves are risk-free; e.g., US Treasury.
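
As a sanity check on the arithmetic, the quoted price can be reproduced in a few lines; the cash flows and the 4.0% flat continuous yield are taken from the question:

```python
from math import exp

# Semi-annual coupons of 3.00 plus principal at year 2.0, discounted
# continuously at the flat 4.0% yield from the question.
cash_flows = [(0.5, 3.00), (1.0, 3.00), (1.5, 3.00), (2.0, 103.00)]
price = sum(cf * exp(-0.04 * t) for t, cf in cash_flows)
# price comes out to about 103.73, matching the quoted market price
```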

Each of the following is true about this bond EXCEPT which is false?

a) The bond's yield-to-maturity is 4.0%

b) The yield spread, represented by the solid red vertical arrow, is the difference between 4.0% (yellow line) and 0.40% (spot rate at six months)

c) If the price of the bond decreased due solely to perceived credit risk of bond (without any change in market risk), the upper curves (yellow and blue) would shift up

d) The z-spread, represented by the dashed red vertical arrow, is the difference between the (upper steep) blue line and the (lower steep) spot rate; e.g., 2.42% = 4.03% - 1.61%

The given answer is:

B. The yield spread is the "single-point" difference between the yield to maturity (YTM) and the corresponding riskless spot rate; in this case, yield spread ~= 4.0% - 1.61% ~= 2.39%. As the endpoints at year 2.0 are near to each other, this yield is only slightly less than the z-spread of 2.42%, due primarily to the fact that the upper blue line endpoint must necessarily be slightly greater than the 4.0% YTM. If the corresponding riskless swap rate is necessarily interpolated, then this yield spread is an i-spread

Isn't yield spread the difference in yield to maturity between the corporate bond and the risk free bond? In this question, 1.61 is the 2 year spot and not a yield to maturity. What am I missing?

### QuantOverflow

#### Given cash flows, what is the interest rate of the following contract? [on hold]

I am presented with an investment opportunity where I am given #481,000 on day 1. Thereafter, every 10 days for 100 days, I am required to give back #50,000 (10 * 50,000 = 500,000).

How do I calculate the interest rate I am paying?

I am guessing I have to use the present value of annuity problem to find out the interest rate.

So, my present value is #481,000. My "annuity" is 50000 every 10 days. First payment is due on the 10th day. Last payment is due on day 100.

Plugging the above values into Wolfram Alpha, I get 0.7107% for the interest rate per 10-day period. I divide that by 10 to get the per-day interest rate and multiply by 365 to get 25.94%.

I am surprised to see the above answer. It is a lot more than 14% which is what the rate would be if I were to pay 500000 at the end of 100 days. Is my reasoning incorrect?
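
For what it's worth, the per-period rate can be recovered numerically without Wolfram Alpha. This is a sketch that bisects on the level-annuity present-value formula; the helper names are mine, not from any library:

```python
def pv_annuity(rate, payment, n):
    # Present value of n level payments at per-period rate `rate`
    return payment * (1 - (1 + rate) ** -n) / rate

def solve_rate(pv, payment, n, lo=1e-9, hi=1.0):
    # PV falls as the rate rises, so bisect until the bracket collapses
    for _ in range(100):
        mid = (lo + hi) / 2
        if pv_annuity(mid, payment, n) > pv:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

r = solve_rate(481_000, 50_000, 10)  # rate per 10-day period, ~0.7107%
annualized = r / 10 * 365            # simple annualization, as in the question
```

This reproduces the ~0.71% per 10-day period and hence the ~25.94% simple annualization described above.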

### StackOverflow

#### Javascript Reduce: Initial Value

I would like help in writing my version of the reduce method. I need help with the case when the initial value is omitted. I could use the shift() method, but I believe the testers want something more elegant.

var reduce = function (arr, iterator, start) {
  var result = start;
  arr.forEach(function (element, index) {
    // accumulate with a single call to iterator per element
    result = iterator(result, element, index, arr);
  });
  return result;
};
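
For reference, the shift-free technique (seeding the accumulator from the first element when no initial value is supplied) can be sketched language-neutrally; this is a Python analogue using a sentinel, not the grader's expected JavaScript:

```python
_MISSING = object()  # sentinel, so that None remains a legal seed value

def my_reduce(iterable, func, start=_MISSING):
    it = iter(iterable)
    if start is _MISSING:
        try:
            acc = next(it)  # first element becomes the accumulator
        except StopIteration:
            raise TypeError("reduce of empty sequence with no initial value")
    else:
        acc = start
    for x in it:
        acc = func(acc, x)
    return acc
```

For example, `my_reduce([1, 2, 3, 4], lambda a, b: a + b)` evaluates to 10 without an explicit seed, and the input list is never mutated.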


### UnixOverflow

#### How do I allow user permissions to data on a Couchpotato and Sickbeard Plugin using Freenas

Trying to setup permissions for the plugins sickbeard / couchpotato.

I’ve read almost every thread Google and the forums have, but haven’t really found a solution yet.
I’m assuming it’s the data here: (Jail name)/usr/pbi/xxxxxxxxxx
I tried:

chown -R guest:guest /usr/pbi/sickbeard-amd64/*
chown -R guest:guest /usr/pbi/couchpotato-amd64/*


But I couldn’t get it to work.
Or would I have to chmod 777 the folders?

### /r/emacs

#### How can I load text macros in emacs?

Here's what I want to do.

1. Open emacs.
2. I write a lot of emails during my work day. I want to have it so that I can do something like type "th", use a keystroke, and paste in "Thank you for your message." Alternatively, I type "ed", use a keystroke, and paste in "To do that, you'll want to hit the top right button..." At the end, I'll just save the file, and paste it in to the clipboard (I know how to do that part). You get the idea.
3. I get the feeling I can do this using the init.el file to define functions, and then M-x to run the functions, but I'd like to see examples and a sample function of that type, to fill in those pieces.
submitted by moscowramada

### Wes Felter

#### "The problem of finding opportunities can be speedily solved by recursively applying a FIFO..."

“The problem of finding opportunities can be speedily solved by recursively applying a FIFO queue-based Bellman Ford algorithm to a graph of the negated logarithms of the ratios of the funded tips of all order books to find negative cycles and removing the smallest offer in each cycle on each recursion.”

- Donovan Hide on Exploiting Ripple Transaction Ordering For Fun And Profit
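
The quoted recipe — Bellman-Ford over the negated logarithms of exchange ratios, so that a profitable loop shows up as a negative cycle — can be sketched as follows. The graph shape and the rate numbers in the usage note are illustrative assumptions, not Ripple data:

```python
import math

def find_negative_cycle(rates):
    """rates: dict (src, dst) -> exchange ratio. A profitable cycle is one
    whose product of ratios exceeds 1, i.e. whose sum of -log(ratio)
    edge weights is negative -- exactly what Bellman-Ford detects."""
    nodes = {n for edge in rates for n in edge}
    edges = [(u, v, -math.log(r)) for (u, v), r in rates.items()]
    dist = {n: 0.0 for n in nodes}   # implicit source at distance 0 to all
    pred = {n: None for n in nodes}
    last = None
    for _ in range(len(nodes)):
        last = None
        for u, v, w in edges:
            if dist[u] + w < dist[v] - 1e-12:
                dist[v] = dist[u] + w
                pred[v] = u
                last = v
    if last is None:
        return None  # no relaxation in the final pass: no negative cycle
    for _ in range(len(nodes)):  # step back to guarantee we sit on the cycle
        last = pred[last]
    cycle, node = [last], pred[last]
    while node != last:
        cycle.append(node)
        node = pred[node]
    return cycle[::-1]
```

With hypothetical rates {("A","B"): 2, ("B","C"): 3, ("C","A"): 0.2} the round trip multiplies to 1.2, so a cycle is reported; with no profitable loop the function returns None. The quote's recursion would then remove the smallest offer on the reported cycle and run the search again.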

### arXiv Programming Languages

#### Program Synthesis using Natural Language. (arXiv:1509.00413v1 [cs.PL])

Interacting with computers is a ubiquitous activity for millions of people. Repetitive or specialized tasks often require creation of small, often one-off, programs. End-users struggle with learning and using the myriad of domain-specific languages (DSLs) to effectively accomplish these tasks.

We present a general framework for constructing program synthesizers that take natural language (NL) inputs and produce expressions in a target DSL. The framework takes as input a DSL definition and training data consisting of NL/DSL pairs. From these it constructs a synthesizer by learning optimal weights and classifiers (using NLP features) that rank the outputs of a keyword-programming based translation. We applied our framework to three domains: repetitive text editing, an intelligent tutoring system, and flight information queries. On 1200+ English descriptions, the respective synthesizers rank the desired program in the top-1 and top-3 results for 80% and 90% of descriptions, respectively.

#### Reception Probabilities in 5G Vehicular Communications close to Intersections. (arXiv:1509.00399v1 [cs.SY])

Vehicular networks allow vehicles to constantly share information with each other and their surroundings, and are expected to be an integral part of future intelligent transportation systems (ITS). However, the diversity of ITS applications, combined with the extremely stringent demands posed in particular by safety-critical applications, makes the design of vehicular communication a challenging task. 5G device-to-device communication as well as 802.11p are promising candidates to address this challenge. In order to guide and validate the design process, analytical expressions of key performance metrics such as outage probability and throughput are necessary. In this paper, we focus on the intersection scenario, and present a general procedure to analytically determine the success probability of a selected link as well as system-wide throughput. We provide an overview of the salient properties of vehicular communication systems near intersections and show how the procedure can be used to model signal propagation conditions typical of different environments of practical relevance, for instance rural and urban scenarios. The results indicate that the procedure is sufficiently general and flexible to deal with a variety of scenarios, and can thus serve as a useful design tool for communication system engineers, complementing simulations and experiments.