Planet Primates

October 09, 2015



Style concerning semi-colons and return statements

I'm new and just getting into Scala. I'd like to understand the reasoning behind what seem to be certain style conventions. (I've been looking at a few style guides online to get a feel for generally accepted Scala conventions.)

Scala allows, but doesn't require, semi-colons. The general style convention I've seen is not to use them, but I kind of like having semi-colons at the ends of lines. Can someone help explain the motivation behind this convention?

The same goes for return statements in functions. I like having the code explicitly state exactly what is being returned from the function.

I'll admit freely that I'm new to all of this and would appreciate some input. Thanks in advance for the help.

submitted by newo4110

October 09, 2015 07:58 PM


Binary option greeks formula for american style exercise

I found binary option Greeks formulas in the links below, but if I am not wrong, they all indirectly relate to the European exercise style.





On many finance educational and industry websites I have seen it mentioned that binary options are European style.

So are the formulas mentioned in Links 1-4 enough, or do we need specific formulas for binary option Greeks with American-style exercise if we want to build a system for binary option trading risk calculations?

by purnendumaity at October 09, 2015 07:56 PM

DragonFly BSD Digest

DragonFly and African digital libraries

The Tanzanian Digital Library Initiative is using DragonFly (and FreeBSD) as part of their library setup, and Michael Wilson, the project coordinator, sent a note to users@ describing this. They are looking to spread through the continent, so get in contact if you want to be part of the project.

by Justin Sherrill at October 09, 2015 07:49 PM


Resize XFS home directory in LVM

I am desperately trying to increase the swap space available to my server from 4GB to 16GB, however at present XFS + something is making this impossible.

I have a single SSD hosting the OS (CentOS 7), with a boot partition and a LVM2 partition, the defaults of the installation. Inside the LVM2 partition are three virtual partitions, root, home, and swap, consuming the entire LVM2 partition. Again by the defaults of the installation, these are in the XFS filesystem type.

My simple goal was to shrink the under-utilised home partition, making space for the swap partition. Having now discovered that shrinking XFS partitions is impossible, I am down the rabbit hole of backing up the home drive, deleting the partition, and reinstating it at a smaller size with a restore from the backup.

The problem now is I cannot seem to free the home directory or the partition. Having dropped out of the GUI and logged out and back in as root at the terminal, initially I could not unmount the home directory. Something was in use, though lsof and fuser showed nothing open on it.

I issued a

umount -f home

and this appears to have worked in unmounting home. I can now freely mount and unmount home without the -f flag, which seems suspicious. However, now trying

lvremove /dev/centos/home

the message returned is

Logical volume centos/home in use.

I have looked at many questions and answers here and it is none of the following:

  • NFS service
  • Open files
  • Open files using the major,minor designation
  • the Active flag (have tried lvchange -an -v /dev/centos/home, it also claimed the volume was in use)

Nothing I can find seems to give a hint as to what is using home. If I could get a solution that was not dependent on rebooting the machine, that would be ideal.

by J Collins at October 09, 2015 07:46 PM



Are all minimum spanning trees optimized for fairness?

I know by definition that a minimum spanning tree (MST) of a weighted, connected graph has the lowest total value for the sum of the weights of all edges in a tree that connects all vertices.

I'm curious whether all MSTs are also optimized for the lowest individual edge values possible. (I know that the ones calculated with Kruskal's Algorithm are!)

As an example, is it possible to have two MSTs in a graph (with equal global sums of their edges) but have one of those MSTs contain an edge of higher value than any edge on the other MST in the graph?
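One way to probe this on small graphs is brute force: enumerate every spanning tree, keep the minimum-weight ones, and compare their sorted edge-weight lists. A sketch in Python (the graph below is a made-up toy example):

```python
from itertools import combinations

# Made-up toy graph: (u, v, weight). A triangle of weight-1 edges plus two
# weight-2 edges reaching vertex 3, so several distinct MSTs exist.
edges = [(0, 1, 1), (1, 2, 1), (0, 2, 1), (2, 3, 2), (1, 3, 2)]
n = 4

def connected(tree):
    # union-find over the chosen edges; spanning means one component
    parent = list(range(n))
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v
    comps = n
    for u, v, _ in tree:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            comps -= 1
    return comps == 1

# every spanning tree of n vertices has exactly n - 1 edges
trees = [t for t in combinations(edges, n - 1) if connected(t)]
best = min(sum(w for *_, w in t) for t in trees)
msts = [t for t in trees if sum(w for *_, w in t) == best]

# compare the sorted edge-weight lists of all MSTs
weight_profiles = {tuple(sorted(w for *_, w in t)) for t in msts}
print(weight_profiles)
```

On any graph, all MSTs share the same multiset of edge weights (this is a known theorem), so no MST contains an edge heavier than the heaviest edge of another.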

Thanks for any help you can give!

by Acoustic77 at October 09, 2015 07:35 PM


Complexity of Best Known Algorithms for Hamiltonian Cycle?

I think I saw somewhere that the best known algorithms for the general Hamiltonian cycle problem are $O(2^{n/2})$. Is this right? What is the best known so far? Thanks.
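For comparison, the classic Held-Karp dynamic program decides Hamiltonian cycle in $O(2^n \cdot n^2)$ time and is the usual textbook baseline. A minimal sketch over an adjacency matrix:

```python
# Held-Karp style bitmask DP: dp[S][v] is True if there is a path starting
# at vertex 0, visiting exactly the vertices in bitmask S, and ending at v.
def hamiltonian_cycle_exists(adj):
    n = len(adj)
    if n == 1:
        return True
    dp = [[False] * n for _ in range(1 << n)]
    dp[1][0] = True                       # path {0} ending at 0
    for S in range(1 << n):
        if not (S & 1):                   # vertex 0 is always included
            continue
        for v in range(n):
            if not dp[S][v]:
                continue
            for w in range(n):
                if adj[v][w] and not (S >> w) & 1:
                    dp[S | (1 << w)][w] = True
    full = (1 << n) - 1
    # a cycle needs a full path ending at some v with an edge back to 0
    return any(dp[full][v] and adj[v][0] for v in range(1, n))

# 4-cycle 0-1-2-3-0: a Hamiltonian cycle exists
C4 = [[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]]
print(hamiltonian_cycle_exists(C4))
```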

by R. K. Molnar at October 09, 2015 07:22 PM


Wikileaks has the "Intellectual" "Property" part of ...

Wikileaks has the "Intellectual" "Property" part of TPP. It will look exactly the same in TTIP.

With documents like this I could always laugh myself silly when, as justification for all this crap, they seriously offer something like this:

  • promote innovation and creativity
  • facilitate the diffusion of information, knowledge, technology, culture and the arts
  • foster competition and open and efficient markets
Few things generate more productivity and creativity than having your product seized by customs over alleged patent infringements, at the behest of the MPEG mafia, shortly before the important industry trade fair!

And if anything is good for the diffusion of information, knowledge, technology, culture and the arts, then it's IP rules that shut down The Pirate Bay, where exactly that was being done.

And nothing helps healthy competition like artificial, state-enforced monopolies!

That they even dare to write this crap into this document! But there is a simple explanation for that. The document is TPP CONFIDENTIAL. They assumed that nobody with more than one brain cell would get to see it. That is, nobody outside their circles.

Here is the press release from Wikileaks. Why am I only finding this in the Süddeutsche? Where are Tagesschau, FAZ, Spiegel and co?

The Guardian and RT have something, of course. Man, our press landscape is off-putting once again. And then they wonder when people scold them.

October 09, 2015 07:00 PM





Monte Carlo VaR

How are the factor returns simulated in Monte Carlo VaR? Suppose my portfolio returns are modeled as follows:

r = B_1*X_1 + ... + B_k*X_k + spec risk

and let's suppose the X_i are iid and normally distributed. After I get a sample of correlated random variables, how should I simulate the factor returns?
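One common sketch of the simulation step, with made-up loadings and factor covariance: draw iid standard normals, impose the correlation with a Cholesky factor, then push the simulated factor returns through the model to get a return distribution and read VaR off as a quantile:

```python
import numpy as np

rng = np.random.default_rng(0)
k, n_sims = 3, 100_000

# hypothetical factor loadings and factor covariance (made up for the sketch)
B = np.array([0.5, 0.3, 0.2])
cov = np.array([[0.04, 0.01, 0.00],
                [0.01, 0.09, 0.02],
                [0.00, 0.02, 0.16]])

L = np.linalg.cholesky(cov)                 # cov = L @ L.T
Z = rng.standard_normal((n_sims, k))        # iid N(0, 1) draws
X = Z @ L.T                                 # correlated factor returns
spec = 0.05 * rng.standard_normal(n_sims)   # specific (idiosyncratic) risk

r = X @ B + spec                            # simulated portfolio returns
var_99 = -np.quantile(r, 0.01)              # 99% VaR as the loss quantile
print(var_99)
```

Here `B` and `cov` are hypothetical; in practice they come from the factor-model estimation.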


by user3705076 at October 09, 2015 06:47 PM


How would I simulate a network to explore the percolation threshold of a network connected by the knight's move?

"If we consider the squares of an infinite chess board as nodes of our graph and consider each to be connected to the other eight squares that are a knight's move away from it what is the percolation threshold of this graph?"

Note: One way I have thought about this problem is to use vectors: we can think of a knight's move as a vector of the form $\langle\pm1,\pm2\rangle$ or $\langle\pm2,\pm1\rangle$ for the eight cases.

In answering this question I think I need to create a simulation. How would I create a program to simulate an infinite board, for instance, and remove random nodes to estimate the percolation threshold?
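The usual workaround for the infinite board is a finite-board approximation: occupy each square independently with probability p, connect occupied squares that are a knight's move apart, and watch the largest-cluster fraction jump as p crosses the threshold. A sketch:

```python
import random
from collections import deque

MOVES = [(1, 2), (2, 1), (-1, 2), (-2, 1),
         (1, -2), (2, -1), (-1, -2), (-2, -1)]

def largest_cluster_fraction(size, p, seed=0):
    """Occupy each square of a size x size board with probability p, join
    occupied squares a knight's move apart, and return the largest cluster
    size as a fraction of all squares."""
    rnd = random.Random(seed)
    occupied = {(x, y) for x in range(size) for y in range(size)
                if rnd.random() < p}
    seen, best = set(), 0
    for start in occupied:
        if start in seen:
            continue
        seen.add(start)
        queue, count = deque([start]), 0
        while queue:
            x, y = queue.popleft()
            count += 1
            for dx, dy in MOVES:
                nxt = (x + dx, y + dy)
                if nxt in occupied and nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        best = max(best, count)
    return best / (size * size)

# sweep p and watch the giant-cluster fraction jump near the threshold
for p in (0.1, 0.2, 0.3, 0.4, 0.5):
    print(p, largest_cluster_fraction(100, p))
```

Larger boards and averaging over many seeds sharpen the estimate of where the jump happens.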

by Math Man at October 09, 2015 06:40 PM



What does Big O notation actually specify?

Regarding time complexity I've read conflicting things:

1) That it is the worst case.

2) That it is the average case.

For example, if I want to know the time complexity of inserting at an arbitrary point in a linked list (not the beginning or the end), on average it will take n/2 operations, so we can drop the constant and say it is n.

So would Big O be n, or would Big Omega be n? Or are they the same thing?
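A toy illustration of the distinction: walking to position k in a singly linked list costs k pointer hops, so the worst case is n steps while the average over all positions is about n/2. Both grow linearly, and Big O can describe either case once you have fixed which case you are analysing:

```python
# Made-up illustration: the cost of reaching position k in a singly linked
# list is k pointer hops. Worst case n, average ~n/2, both O(n).

class Node:
    def __init__(self, val, nxt=None):
        self.val, self.nxt = val, nxt

def steps_to_position(head, k):
    steps, cur = 0, head
    while steps < k and cur is not None:
        cur = cur.nxt
        steps += 1
    return steps

n = 100
head = None
for v in reversed(range(n)):
    head = Node(v, head)

costs = [steps_to_position(head, k) for k in range(n)]
print(max(costs), sum(costs) / n)   # worst case vs average: 99 vs 49.5
```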

by cade galt at October 09, 2015 06:32 PM

Jeff Darcy

Winter Running in New England

I still consider myself a bit of a running n00b. Several months ago, I was even more of one - so much so that I kept running through one of the worst winters anyone here seems able to remember. Paradoxically, that n00b decision seems to have left me in the position of knowing more than most about how to run safely in those conditions. Since a couple of friends have expressed curiosity about that exact topic recently, I might as well collect those thoughts here.

First, the good news. It's entirely feasible to keep running all through a harsh New England winter. There are certainly some challenges, which I'll try to address. There are rewards too. However, a little bit of context might help. I'm not that hard core. I'm talking about running in the suburbs, not the city or the country. I'm sure those present their own different challenges, of which I am certainly still ignorant. I'm not talking about extreme conditions, either. Even in the depths of winter I was still running on dry streets, not snow, and only down to about 15ºF. I'm crazy, but not that crazy.

The most important thing about winter running is situational awareness. Sidewalks are likely to be useless, so you'll be out in the road with the cars. Both visibility and mobility are going to be restricted by piled-up snow and other obstacles. This is a dangerous situation, so the first thing you want to do is improve your odds as much as possible. Always know where you'll go if a car comes along, and for heaven's sake don't impair your ability to hear them. Learn when and where the school/work rush hours are going to pose a problem. Learn how long the snowplows remain out after a snowfall (so you can avoid them) and where they dump the big piles (ditto). Learn which roads have too many turns or driveways with poor visibility, and avoid them. Ditto for steep downhills (uphills are actually OK) and places where puddles are likely to form. Some of my favorite summer routes are unusable in winter for one or more of these reasons, but that's life. Knowing a variety of routes in your neighborhood is always good, but these limitations make it even more important in winter.

Another big thing for winter running is knowing the weather. I find that Weather Underground is very accurate ahead of time, but just before I go out I double-check on AccuWeather; their "MinuteCast" is often eerily accurate. While knowing the temperature and likelihood of precipitation might determine when I run, wind speed and direction might determine where. Again, knowing a lot of routes comes in handy. There's nothing quite like coming over a hill or around a bend and getting blasted with a freezing wind. Lack of leaves on the trees might be good for visibility, but it can also make you more exposed.

OK, so let's talk gear. The most important thing is not so much specific items or brands but flexibility. I'll wear different gear if it's 32ºF than if it's 24ºF, and different gear again if it's 16ºF. Wind and humidity are also factors. It's also important to remember how much you warm up while you're running. I warm up a lot, so I dress to be slightly cool at the outset and I still usually end up taking off my cap and gloves before I'm done. Lastly, don't wear cotton. Sweat + cold = death, and cotton just absorbs too much. I'm an all-synthetic guy myself, but others swear by wool and/or silk.

With all that said, and purely by way of example, here are some of the items I have in my own winter-running closet. YMMV.

  • Head: lightweight "beanie" style hat. We're talking no more than a couple of layers of thin poly here. I'm not a big fan of ear warmers, but I do make sure my cap covers most of my ears. You can find any number of these at any running store.

  • Face: I have a convertible hat/mask that I really like, but mostly for snowboarding. I only used it for running on the coldest days; otherwise it was too warm.

  • Trunk/arms: Mostly I'd run in my usual T-shirts plus a lightweight jacket which I just love (mine's red BTW). Breathes well, nice little thumb holes to keep the sleeves from riding up, reflective material, etc. For really cold weather I have a couple of thermal long-sleeved shirts, but even the lighter one would be too warm above 20ºF or so.

  • Hands: Like beanies, lightweight gloves are easy to find. I have two pairs, one Under Armour and one Saucony (I think). The UA ones are very slightly warmer, which I mention because even tiny gradations become very noticeable when you're out there. It's worth it to have multiple hat and glove options.

  • Underwear: I have some New Balance, some Puma, some Champion. I can barely tell the difference. The important thing is that none of them are cotton.

  • Legs: probably my favorite find (narrowly beating out the jacket) was these leggings/tights. They're honestly a bit of a pain to get on and off, but they're absolutely perfect for keeping the wind and splatters off. I'd wear these with shorts over them for a little extra warmth and to look (just slightly) less silly, anywhere from 40ºF on down, and my legs never felt too warm or too cold. Modern technology is awesome.

  • Socks: the biggest decision point for each run. At the warmer end, I could actually get away with the same Balega or Fitsox I wear all year. At the colder end I'd wear insulated socks. Most of the time, in between, I'd wear calf length compression socks or (if I ran out of those) light ski socks.

  • Shoes: there are special winter running shoes, and "micro spikes" for better traction, but to be honest I never had much use for either. I just ran in the same Asics GT-2000 shoes I'd been using already, and tried to avoid puddles. My feet never felt cold, and I never felt that I didn't have enough grip.

  • Other: certain kinds of chafing are more of a problem in winter. I'll just mention Transpore tape and Friction Defense as potential solutions. If you have that even more awkward kind of chafing, I can recommend Chamois Butt'r. If you think it's gross to talk about these things I'm sorry, but if that's what it takes to save someone else some discomfort then it's worthwhile. I wish somebody had clued me in before I had to figure this stuff out on my own.

With all of that gear and preparation and good habits, you should be able to run safely even in that New England winter. It can even be fun. There's a special kind of quiet after a storm, and a special kind of light all the time. There are no cyclists. Places that are hidden behind greenery in summer become visible through bare trees. There's no danger of overheating. This spring, I was worried that I wouldn't even be able to run in anything over 50ºF because I'd gotten so used to it being cooler. I did adjust after all, but I think I still prefer running when it's cooler. You might find that you enjoy it too, no matter how crazy it seems.

October 09, 2015 06:31 PM


Why do CFDs track the underlying?

My understanding of CFDs is that the profit you make on a CFD is the difference between the price at which you bought the CFD and the price at which you sold your CFD minus various charges/commission.

The idea being that the CFD tracks the underlying and therefore you can speculate on the price of the underlying.

My question is, why does the CFD track the underlying? There is nothing about the CFD that forces it to follow the underlying, is there? Or is there?

Edit: To clarify the question... The CFD is priced according to supply and demand. The underlying is priced according to supply and demand. These two are priced independently. Isn't it possible that they move completely independently of each other?

by dan at October 09, 2015 06:13 PM


Normalised Floating Point System

Okay, so I have a floating point number system and a number for which I need to calculate the exact relative error after rounding. The number is clearly an overflow. Does anyone know what I should do?

I've tried converting the number from decimal to the base of the system, then chopping off and rounding the excess and keeping the exponent within the bounds of the system.

For example, I had (.a0a1a2a3a4...) and the exponent was, say, b^4; I chopped off at a3 and rounded, so now I have (.a0a1a2a3), and I just changed the exponent to the bound, which is b^1.

So my new number is (.a0a1a2a3) * b^1. My answer needs to be in base 10, so I convert this number back to decimal and calculate the exact relative error. I got something like 0.99... Is this what I'm supposed to do, or is there something else to be done when it overflows?
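As a sanity check, here is the same kind of computation in a toy system with made-up parameters (base 10, four mantissa digits, maximum exponent 2). Clamping an overflowing number to the largest representable value does give a relative error close to 1:

```python
# Toy normalised system (hypothetical parameters): base b, t mantissa
# digits, exponent at most emax, mantissa in [1/b, 1).
b, t, emax = 10, 4, 2
largest = (1 - b**-t) * b**emax    # biggest representable: 0.9999 * 10^2

x = 12345.0                        # overflows: would need exponent 5 > emax
fl_x = largest                     # clamp to the largest representable value
rel_err = abs(x - fl_x) / abs(x)   # exact relative error
print(rel_err)
```

The bigger the overflow, the closer the relative error gets to 1, which matches the 0.99-ish value in the post.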

by user1804234 at October 09, 2015 06:05 PM


New DoorBot for the Recurse Center

My latest project during my time at the Recurse Center is this: a new robot for opening the door for alumni!


by empty at October 09, 2015 06:04 PM


question about ALUs

I'm not sure if this belongs here or not. If not, can someone redirect me to where I should go?

Okay, well, I'm trying to do a problem and I have no idea where to start. This is the question: Assume we have a 16-bit Arithmetic Logic Unit. List the inputs and outputs in binary for the ALU if we are using it to determine if X = 15_10 < Y = 23_10. Remember: the ALU has three inputs and four outputs. Use 16 bits or 1 bit to represent the inputs and outputs as appropriate. The selection value will be (11)_2 for Set on Less Than.
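To make the inputs and outputs concrete, here is a sketch of such an ALU in Python. The encoding of the operations other than SLT, and the exact four outputs (result, zero, carry-out, overflow), are assumptions for illustration, not the course's definition; only select (11)_2 = Set on Less Than is taken from the question:

```python
# Sketch of a 16-bit ALU: inputs X, Y and a 2-bit select; assumed outputs
# are result, zero flag, carry-out, and overflow.
MASK = 0xFFFF

def alu16(x, y, select):
    carry = overflow = 0
    if select == 0b00:                 # AND (assumed encoding)
        result = x & y
    elif select == 0b01:               # OR (assumed encoding)
        result = x | y
    elif select == 0b10:               # ADD (assumed encoding)
        total = x + y
        carry = (total >> 16) & 1
        result = total & MASK
        sx, sy, sr = x >> 15, y >> 15, result >> 15
        overflow = int(sx == sy and sr != sx)
    else:                              # 0b11: Set on Less Than
        diff = (x - y) & MASK          # two's-complement subtraction
        result = (diff >> 15) & 1      # sign bit of x - y
    zero = int(result == 0)
    return result, zero, carry, overflow

x, y = 15, 23                          # the question's X = 15, Y = 23
result, zero, carry, overflow = alu16(x, y, 0b11)
print(format(x, '016b'), format(y, '016b'), '->', result)   # 15 < 23 -> 1
```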

submitted by _Valentine_

October 09, 2015 06:00 PM





What is an acceptable Sharpe Ratio for a prop desk?

What should be the value of a Sharpe Ratio for an intraday quantitative strategy to be accepted by a bank or hedge fund's prop desk? Let's assume the returns are daily changes in account equity, close to close.

by bushmanov at October 09, 2015 05:37 PM

what's the difference between Peak-Load pricing and price discrimination?

I just don't get it.

Peak-load pricing wiki page gives example:

in public goods such as public urban transportation, where day demand (peak period) is usually much higher than night demand (off-peak period)

Price discrimination wiki page gives example:

For example, schedule-sensitive business passengers who are willing to pay \$300 for a seat from city A to city B cannot purchase a \$150 ticket because the \$150 booking class contains a requirement for a Saturday night stay, or a 15-day advance purchase, or another fare rule that discourages, minimizes, or effectively prevents a sale to business passengers.

But I don't really see the difference.

In the example of urban transportation, people pay more to buy the convenience of "peak period usage"; while in the other example of air tickets, people pay more to buy the convenience of "returning without spending a Saturday night stay" (so that they could enjoy quality time with family) or "without 15-day advance purchase" (so that they need not spend time planning the detailed, maybe even trifle things that might happen in the future).

For example, a company might hire a secretary to book tickets 15 days in advance, but then it has to pay the secretary; or it can just buy tickets when needed, but then it has to pay the premium for convenience.

If that convenience is a real need, why should the government bother?

by athos at October 09, 2015 05:17 PM


How does FreeBSD allocate memory?

I am aware that this is a simplified/generalized explanation, but the top(1) utility divides memory in FreeBSD into six pools: Active, Inactive, Wired, Cache, Buffers and Free. Example from top(1) output:

Mem: 130M Active, 42M Inact, 51M Wired, 14M Cache, 34M Buf, 648K Free
Swap: 512M Total, 512M Free

Active is used by running processes, and Wired is used mainly by the kernel. Inactive is memory from closed processes which is still cached in case it needs to be reused, Cache is cached data, Buffers is disk buffers (I guess it is similar to "cached" in Linux free(1) output?), and Free is completely unused memory. Am I correct that the FreeBSD kernel automatically allocates space from the Inactive, Cache and Buffers pools to Active or Wired if needed?

by Martin at October 09, 2015 05:10 PM



Is it possible to come up with a graph instance that would force Dijkstra to perform a decrease key on every single edge?

From the analysis of Dijkstra there is an $O(m \log n)$ factor that assumes we do a decrease-key for every single edge of the given input graph.

However, I find it hard to come up with an instance that can actually require this. All you have to do is create the edges and then add the weights in a way that would induce a large number of decrease-keys.

Is there any known way of doing that?
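One way to experiment is to instrument Dijkstra and count how often an already-discovered vertex's tentative distance improves, i.e. the would-be decrease-key events. A sketch with a made-up toy graph:

```python
import heapq

def dijkstra_count_decreases(adj, src):
    """Lazy-deletion Dijkstra that counts how many times an already-known
    tentative distance is improved (the would-be decrease-key events)."""
    dist = {src: 0}
    done = set()
    decreases = 0
    pq = [(0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if u in done:
            continue
        done.add(u)
        for v, w in adj.get(u, ()):
            nd = d + w
            if v in dist and v not in done and nd < dist[v]:
                decreases += 1          # an existing key just got smaller
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist, decreases

# made-up toy instance: vertex 1 is first seen at distance 5, then improved
adj = {0: [(1, 5), (2, 2)], 2: [(1, 1)], 1: []}
dist, dec = dijkstra_count_decreases(adj, 0)
print(dist, dec)
```

Candidate adversarial constructions can then be plugged in as `adj` and the counter compared against m.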

by jsguy at October 09, 2015 04:52 PM


Why does Scala support Singleton when it's universally considered an anti-pattern?

In the object-oriented world, Singleton is considered an anti-pattern. It's essentially a glorified global, and the presence of global state brings all sorts of problems. To solve the issue, programmers have created a newer approach (well, not that new anymore) called Dependency Injection, which is a much better solution. And of course, there's also Service Locator, which is an anti-pattern just like Singleton, and thus not recommended.

One may argue that Scala's singleton is different from Java's and C#'s, in that a singleton class itself is actually an object which can be passed to other classes/methods, while in other languages it's just static calls on the class. The sample code below is a typical singleton class in Scala:

object Blah { def sum(l: List[Int]): Int = l.sum } 

Nonetheless, the fact that singleton classes are objects doesn't make singletons much better. Of course, the class is now easier to test compared to Java's static methods (which is a selling point for Scala's singleton implementation over Java's untestable singletons). But it's still global state, the main problem with singleton classes in general, and Scala's singletons cannot address or fix this issue either.

The best way to implement 'singleton' functionality is to use a dependency injection container, in which the singleton nature is enforced by client code explicitly passing the same object to the other objects that depend upon it. If the class enforces singleton behaviour on its own, it will inevitably expose global state and become an anti-pattern no matter what. A class should never care about how it will be created; it's the application that decides how it is created and used.
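The explicit-wiring idea in the paragraph above can be sketched like this (Python used purely for brevity; all names are made up):

```python
# "Singleton-ness" lives in the wiring code, not inside the class.

class Summer:
    def sum(self, xs):
        return sum(xs)

class Report:
    def __init__(self, summer):        # dependency is injected, not global
        self.summer = summer
    def total(self, xs):
        return self.summer.sum(xs)

# application wiring: one shared instance, passed explicitly to both clients
shared = Summer()
r1, r2 = Report(shared), Report(shared)
print(r1.total([1, 2, 3]))             # the class itself holds no global state
```

The classes stay oblivious to how many instances exist; only the wiring code decides that `shared` is the single instance.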

What do you think? Do you agree that Scala's singleton support is a mistake in language design?

submitted by Hall_of_Famer

October 09, 2015 04:51 PM


Is Logic Done on Superpositional Bit Values Useful?

Let's say I have a way to represent $N$ bits such that those bits are in a superposition of the $2^N$ possible states those bits can have and that I can do XOR and AND on those superpositional bits to be able to map $N$ superpositional input bits to $M$ superpositional output bits using any logical circuit (since XOR and AND are supported).

Let's say that I am also able to include bits in those logic circuits which are not superpositional but have specific values of either 0 or 1. This way I can do a NOT by XORing a superpositional bit by 1 for example.

At the end of the logic circuit when I have the superpositional result, I can then decide what values the input bits had, and get a non superpositional result out for those inputs, without evaluating the circuit again. I can interpret the result as many times as I want for the entire permutation of input bits if I want to.

I'm wondering, would a superpositional logic technique like this be useful, and does anyone know if any work has been done in this area?

I know this is similar to how quantum computing works, except for a few key differences like not having interference, nor having probabilities for states, but it does allow cloning of the superpositional result, unlike quantum computing which only allows one question to be asked of the result.

I can think of a couple usage cases, but am wondering if there are others?

  1. Using a superpositional result as a way of letting an untrusted person specify how inputs should map to outputs (basically, "script" some process), without having to sandbox the logic or worry about division by zero, infinite loops, and the like.
  2. I have yet to experiment deeply with this, but if you need to know several values of some function $f(x)$, it may be more computationally efficient to evaluate it superpositionally and then re-interpret the superpositional result several times, versus just calculating the function several times.
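The scheme described above can be modeled classically with plain integers as truth tables (often called bit-slicing): bit j of each value holds that wire's value when the N inputs are in state j, so one pass through the circuit evaluates all $2^N$ input states at once. A sketch:

```python
# Each "superpositional bit" is an integer used as a truth table: bit j
# holds that wire's value in input state j.
N = 3
STATES = 1 << N                       # 2^N = 8 input states
MASK = (1 << STATES) - 1              # all-ones constant, for NOT via XOR

# input bit i over all states: state j contributes bit ((j >> i) & 1)
inputs = []
for i in range(N):
    v = 0
    for j in range(STATES):
        v |= ((j >> i) & 1) << j
    inputs.append(v)

a, b, c = inputs
NOT = lambda x: x ^ MASK              # NOT by XORing with the constant 1

# example circuit built only from AND and XOR: majority(a, b, c)
maj = (a & b) ^ (b & c) ^ (a & c)
nmaj = NOT(maj)

# afterwards, read off the result for any concrete input state j
for j in range(STATES):
    bits = [(j >> i) & 1 for i in range(N)]
    assert ((maj >> j) & 1) == (sum(bits) >= 2)
    assert ((nmaj >> j) & 1) == (sum(bits) < 2)
print(bin(maj))
```

As the post notes, the cost is that the representation grows as $2^N$, so this is exhaustive evaluation in disguise rather than a quantum speed-up.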

by Alan Wolfe at October 09, 2015 04:45 PM


What specifically makes quantum computers useful?

I know that quantum computers are able to process a superposition of all possible states with a single pass through the logic.

That seems to be what people point to as being what makes quantum computers special or useful.

However after you have processed the superpositional inputs, you have a superpositional result, of which you can only ask a single question and it collapses into a single value. I also know that it isn't (currently?) possible to clone the superpositional state, so you are stuck with getting an answer to that one question.

In both cases, it looks like that multi-processing ability really hasn't gained you anything, since it's effectively as if only one state was processed.

Am I misinterpreting things, or does the real usefulness of quantum computing come from something else?

Can anyone explain what that something else is?

by Alan Wolfe at October 09, 2015 04:38 PM

Hypothesis space in Machine Learning

I am a newbie in ML, so I may be asking a silly question: what are completely expressive hypothesis spaces and restricted hypothesis spaces?

I was reading the book Machine Learning by Tom Mitchell, and at the very start of the third chapter I encountered these two terms. I googled both terms but didn't find any details on them.

by Anshu Mishra at October 09, 2015 04:28 PM

How does one find out whether $N = a^b$ for some $b$?

I was trying to find out how to determine whether $N$ is a perfect power, i.e. whether $N = a^b$ for some $a$ and $b$ (so the algorithm should report that $N$ is not a perfect power if it is not expressible in the form $a^b$).

My intuition tells me that for every value of the exponent $b$ we can binary search for $a$ in the set $\{1, \dots, N-1\}$. However, what I am not convinced of yet is that this algorithm won't run forever, i.e. that we won't try new values of $b$ forever. Also, if anyone has comments on the correctness of the algorithm I suggested, it would be greatly appreciated.
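The binary-search idea can be sketched directly. The loop over $b$ terminates because once $2^b > N$ no base $a \geq 2$ can satisfy $a^b = N$, so $b$ only ranges up to $\log_2 N$:

```python
def perfect_power(N):
    """Return (a, b) with a**b == N and b >= 2, or None if no such pair
    exists. For each b up to log2(N), binary search for the base a."""
    if N < 4:
        return None
    b = 2
    while (1 << b) <= N:              # 2^b <= N, i.e. b <= log2(N)
        lo, hi = 2, N
        while lo <= hi:
            a = (lo + hi) // 2
            p = a ** b
            if p == N:
                return (a, b)
            if p < N:
                lo = a + 1
            else:
                hi = a - 1
        b += 1
    return None

print(perfect_power(243))   # (3, 5)
print(perfect_power(97))    # None
```

Each binary search costs $O(\log N)$ exponentiations, and there are at most $\log_2 N$ exponents to try, so the search provably stops.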

by Charlie Parker at October 09, 2015 04:26 PM

Calculating execution time for recursive algorithm [duplicate]

This question already has an answer here:

How would I calculate the execution time, T(n), for this algorithm?

int search(int a[], int l, int r, int x) {
    if (l > r) return -1;              // not found
    int m = (l + r) >> 1;              // midpoint
    if (a[m] == x) return m;           // found
    if (a[m] > x) return search(a, l, m - 1, x);
    return search(a, m + 1, r, x);
}
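One way to see T(n) empirically is to instrument the search and count recursive calls; each call halves the range, so the recurrence T(n) = T(n/2) + c unrolls to O(log n). A sketch (in Python, mirroring the C code):

```python
import math

# Instrumented binary search: return the index and the recursion depth.
def search(a, l, r, x, depth=0):
    if l > r:
        return -1, depth               # not found
    m = (l + r) // 2
    if a[m] == x:
        return m, depth
    if a[m] > x:
        return search(a, l, m - 1, x, depth + 1)
    return search(a, m + 1, r, x, depth + 1)

for n in (16, 1024, 1 << 20):
    a = list(range(n))
    _, depth = search(a, 0, n - 1, -1)  # worst case: element absent
    print(n, depth, math.ceil(math.log2(n)))
```

The measured depth tracks log2(n), matching the solution of the recurrence.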

by FAYNUS at October 09, 2015 04:17 PM


Alternatives to Diffie Hellman

Assume that Discrete logarithms can be solved in linear time over any group (hence factorization is also trivial by a result of Eric Bach), is there any other candidate public key exchange problem that will facilitate computationally secure key exchange?

by Arul at October 09, 2015 04:11 PM


RSA Encryption & Anonymity

Considering plain RSA encryption, assume we have B, who wants to send 10 messages to either A1 or A2. The recipients' public keys, namely the exponents $e_1, e_2$ and moduli $n_1, n_2$ (each a product of two primes) for A1/A2 respectively, are publicly known.

When B sends out, he will encrypt his messages either as $c_i = m_i^{e_1} \bmod n_1$ or as $c_i = m_i^{e_2} \bmod n_2$.

Let's assume eavesdropper E gets hold of the 10 encrypted messages $c_i$. Given that she knows $e_1, e_2, n_1, n_2$: while she can't figure out the messages $m_i$ themselves, can she figure out whether B sent the messages to A1 or A2, and if so, how?

by Marcel at October 09, 2015 04:07 PM

High Scalability

Stuff The Internet Says On Scalability For October 9th, 2015

Hey, it's HighScalability time:

Best selfie ever? All vacation photos taken by Apollo astronauts are now online. Fakes, obvi.

If you like Stuff The Internet Says On Scalability then please consider supporting me on Patreon.
  • millions: # of Facebook users have no idea they’re using the internet; 8%: total of wealth in tax havens; $7.3B: AWS revenues; 11X: YouTube bigger than Facebook; 10: days 6s would last on diesel; 65: years ago the transistor was patented; 80X: reduction in # of new drugs approved per billion US dollars spent since 1950; 37 trillion: cells in the human body; 83%: accuracy of predicting activities from pictures.

  • Quotable Quotes:
    • @Nick_Craver: Stack Overflow HTTP, last 30 days: Bytes 128,095,601,184,645 Hits 5,795,253,218 Pages 1,921,499,030 SQL 19,229,946,858 Redis 11,752,754,019
    • @merv: #reinvent Amazon process for creating new offerings: once decision is made "write the press release and the FAQ you’ll use - then build it."
    • @PaulMiller: @monkchips to @ajassy, “One of your biggest competitors is stupidity.” Quite. Or inertia. #reInvent
    • @DanHarper7: If SpaceX can publish their pricing for going to space, your little SaaS does NOT need "Contact us for pricing" 
    • @etherealmind: If you haven't implemented 10GbE yet, start thinking about 25GbE instead. Cost per port is roughly 1.4x for 2.5x performance.
    • @g2techgroup: Some of the most expensive real estate in the world was being used for data storage...We should not be in the data center business #reinvent
    • The microservices cargo cult: the biggest advantage a microservice architecture brings to the table that is hard to get with other approaches is scalability. Every other benefit can be had by a bit of discipline and a good development process.
    • findjashua: the new 'best practice' is to have a universal app - that renders on the server on first load, and runs as a js app subsequently. This way crawlers and browsers w js disabled still get raw markup.
    • Instagram: Do the simple thing first.
    • erikpukinskis: Generic containers are an awkward mid-way point between special-purpose containers (a Wordpress instance or a rails app on heroku) and an actual machine. You get the hassle of maintaining your own instances, without the flexibility or well-defined performance characteristics of an actual box.
    • @AWSreInvent: Showing off the Amazon Snowball - a 47lb, 50TB device for transporting data to the AWS cloud #reInvent 
    • @merv: #reinvent “There is no compression algorithm for experience” - Andy Jassy. Well said.
    • Alexander von Zitzewitz: I know that about 90% of software systems are suffering from severe architectural erosion, i.e. there is not a lot of the original architectural structure left in them, and coupling and dependencies are totally out of control.
    • Haunted By Data: But information about people retains its power as long as those people are alive, and sometimes as long as their children are alive. No one knows what will become of sites like Twitter in five years or ten. But the data those sites own will retain the power to hurt for decades.

  • Data is valuable, especially if you can turn it into your own private wire. Scandal Erupts in Unregulated World of Fantasy Sports. How many other data archipelagos are being used as private opaque oracles?

Cool idea, using drones as an exponential technology to spread seeds, countering deforestation with industrial scale reforestation. BioCarbon Engineering. It's precision forestry. A mapping drone system is used to generate high quality 3D maps of an area. Then drones follow a predetermined planting pattern derived from the mapping phase to air-fire biodegradable seed pods onto the ground from a height of 1-2 meters. A problem not unlike dropping a Mars rover. Clever pod design shields the seeds from impact while giving them the best chance at germination. This approach recapitulates the batch to real-time transformation that we are seeing everywhere. The current version uses a batch approach with distinct pipelined phases. One can imagine the next version using a swarm of communicating drones to coordinate both the mapping and planting in real-time; perhaps even target selection can be automated to form a continuous reactive system.

  • Birmingham Hippodrome shows how they use Heroku and Facebook's HHVM (HipHop Virtual Machine) to scale their WordPress system and keep it running on a modest budget. Maximum of 4 Standard-1X dynos; Peak requests: ~800/minute; Average memory use per dyno: 130MB; no downtime; Median response time : 5ms; Peak dyno load (so far): ~3.0.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

by Todd Hoff at October 09, 2015 03:59 PM


Variant of (WEAK) PARTITION with 2 distinct solutions

I am interested in the complexity of the following problem:

Input: A list $a_1 \leq \cdots \leq a_n$ of positive integers.

Question: Are there two vectors $x, x'\in\{-1,0,1\}^n$ such that $$\sum_{i=1}^n x_i a_i=0, \quad \sum_{i=1}^n x'_i a_i=0 \quad \text{and} \quad x_i x'_i=0 \ \forall i?$$

WEAK PARTITION is a variant of PARTITION where we are looking for a partition of a subset of the input, and not a partition of all the integers. This is the reason why we have $\{-1,0,1\}^n$ and not $\{-1,1\}^n$.

But I am interested in the problem described above where we are looking for two distinct partitions. For example, with $A=(1,2,3,4,5,7,11,11)$, we have $x=(1,0,1,0,0,1,-1,0)$, $x'=(0,1,0,1,1,0,0,-1)$ and for each component $i$ we have $x_ix'_i=0$. It means we have two disjoint subsets of $A$ where there is a weak partition.
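
The worked example can be checked mechanically; a few lines of Python confirm that both sums vanish and that the supports are disjoint:

```python
# Verify the worked example from the question.
A  = (1, 2, 3, 4, 5, 7, 11, 11)
x  = (1, 0, 1, 0, 0, 1, -1, 0)
xp = (0, 1, 0, 1, 1, 0, 0, -1)

assert sum(xi * ai for xi, ai in zip(x, A)) == 0     # sum_i x_i a_i = 0
assert sum(xi * ai for xi, ai in zip(xp, A)) == 0    # sum_i x'_i a_i = 0
assert all(xi * yi == 0 for xi, yi in zip(x, xp))    # x_i x'_i = 0 for all i
```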

There is clearly a reduction from WEAK PARTITION, but I suspect this problem to be NP-hard in the strong sense. Do you see any reduction from a problem which is NP-hard in the strong sense?

Thank you very much.

by user2370336 at October 09, 2015 03:55 PM




A simple problem whose decidability is not known

I am preparing for a talk aimed at undergraduate math majors, and as part of it, I am considering discussing the concept of decidability. I want to give an example of a problem which we do not currently know to be decidable or undecidable. There are many such problems, but none seem to stand out as nice examples so far.

What is a simple-to-describe problem whose decidability is open?

by Lev Reyzin at October 09, 2015 03:40 PM


Tools in R for estimating time-varying copulas?

Are there libraries in R for estimating time-varying joint distributions via copulas?

Hedibert Lopes has an excellent paper on the topic here. I know there is an existing package called copula but it fits a static copula.

by Quant Guy at October 09, 2015 03:39 PM




Looking for an algorithm to iterate over essentially different solutions

I'll explain my problem with an analogy to Sudoku-grids.

Consider a filled Sudoku-grid. If you exchange labels or rearrange rows/columns within a block, you have another valid Sudoku-grid. However the new grid is essentially equivalent, if that makes sense. One might say that under the operations "swap rows", "exchange labels", etc., both grids are in the same orbit.

I want to iterate over all grids, but in a way that the algorithm ignores other solutions in the same orbit. However, this question is not about Sudoku grids! My grids have a different structure, which is a bit simpler:

I want to fill a 4x6 table with the integers 1 to 8 (each three times) such that each row contains each integer at most once. There are a few other more complex conditions, but I think it's easier to check them later after a solution has been generated.

Just like in the Sudoku analogy, there are a few operations that generate other essentially equivalent solutions:

1) Rearranging entries within a row: if one row is

1 3 5 6 2 8

the solution with that row replaced by

1 2 3 5 6 8

is of course also valid and considered essentially equivalent.

2) Swapping rows.

3) Exchanging labels.

E.g. these solutions are essentially equivalent

1 3 5 6 2 8
4 1 5 7 8 2
4 7 6 1 2 3
3 4 5 6 8 7

First row rearranged:
1 2 3 5 6 8
4 1 5 7 8 2
4 7 6 1 2 3
3 4 5 6 8 7

First two rows swapped:
4 1 5 7 8 2
1 3 5 6 2 8
4 7 6 1 2 3
3 4 5 6 8 7

All 1's and 2's exchanged
2 3 5 6 1 8
4 2 5 7 8 1
4 7 6 2 1 3
3 4 5 6 8 7

My question is what an efficient (both runtime and memory) algorithm to generate these solutions might be.

The naive approach would be to store all previous solutions and check whether an equivalent solution has already been found. However, this is expensive in both runtime and memory!

I managed to improve this by calculating a hash of each solution in such a way that equivalent solutions generate the same hash. Then we can store the solutions in a map, so that we need to check fewer solutions and thus reduce the runtime. However, the memory cost is still huge.

If the hash has the property that different solutions yield different hashes, then I would only need to store the hashes and reduce the required memory by a large factor.

Can you think of a more elegant algorithm that stores neither hashes nor previous solutions?
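
One storage-free alternative to hashing is to define a canonical representative of each orbit and emit a solution only when it equals its own representative. A brute-force sketch in Python (hypothetical code, using only the simplified rules 1)-3) above and ignoring the extra conditions mentioned later):

```python
from itertools import permutations

def canonical(grid, labels=8):
    """Canonical representative of a grid's orbit under (1) reordering
    entries within a row, (2) swapping rows, and (3) relabeling.
    Brute force: take the lexicographic minimum over all relabelings."""
    best = None
    for perm in permutations(range(1, labels + 1)):
        # relabel[old] = new label under this permutation
        relabel = {old: new for new, old in enumerate(perm, start=1)}
        rows = sorted(tuple(sorted(relabel[x] for x in row)) for row in grid)
        key = tuple(rows)
        if best is None or key < best:
            best = key
    return best
```

A backtracking enumerator can then output a solution only when its own key (rows sorted within and between themselves) equals canonical(grid), so neither hashes nor previous solutions need to be stored. For 8 labels this costs 8! = 40320 relabelings per check; symmetry-aware techniques such as orderly generation scale better.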


I simplified my problem too much without realising that I reduced the number of orbits to 1. (Thanks @Klaus Draeger). In fact the numbers 4,6,8 and 3 were made up as an example. I thought I could adapt the algorithm to the more general case later. Actually, my problem is a bit more complex: The number of rows is variable and rows need not have the same length, so rule 2) only applies to rows of equal length. Also the frequency of the numbers is not necessarily constant (e.g numbers 1,2,3,4 might appear four times and 5,6,7,8 three times or whatever), so rule 3) only applies to labels with the same frequency.


And this is the more complex condition that I thought didn't matter for now. Maybe I'm wrong again...

If, in one row, y appears directly to the right of x (or y is in the first column and x in the last), then in a (unique) different row there must be an x to the right of y (above example does not satisfy this). This means that rule 1) only applies to rotations of rows and not any permutation.

I think this can be dealt with at a later point.

by Alex at October 09, 2015 03:13 PM

Daniel Lemire

Secular stagnation: we are trimming down

Economists worry that we have entered in a secular stagnation called the Great Stagnation. To summarize: whereas industrial productivity grew steadily for most of the XXth century, it started to flatten out in the 1970s. We have now entered an era where, on paper, we are not getting very much richer.

Houses are getting a bit larger. We can afford a few more clothes. But the gains from year to year are modest. Viewed from this angle, the stagnation looks evident.

Why is this happening? Economists have various explanations. Some believe that government regulations are to blame. Others point out that we have taken all the good ideas, and that the problems that remain are too hard to solve. Others yet blame inequality.

But there is another explanation that feels a lot more satisfying. We have entered the post-industrial era. We care less and less about producing “more stuff” and we are in a process of trimming down.

Young people today are less likely to own a car. Instead, they pay a few dozen dollars a month for a smartphone. They are not paying for the smartphone itself, they are paying for what it gives them access to.

Let us imagine the future, in 10, 20 or 30 years. What I imagine is that we are going to trim down, in every sense. People will own less stuff. Their houses won’t be much larger. They may even choose not to own cars anymore. They may choose to fly less often. If we are lucky, people will eat less. They may be less likely to be sick, and when sickness strikes, the remedy might be cheaper. They will use less power.

We are moving to a more abstract world. It is a world where it becomes harder to think about “productivity”, a concept that was invented to measure the output of factories. What is the “productivity” of a given Google engineer? The question is much less meaningful than if you had asked about the productivity of the average factory worker from 1950.

Suppose that, tomorrow, scientists discover that they have a cure for cancer. Just eat some kale daily and it will cure any cancer you have (say). This knowledge would greatly improve our lives… we would all be substantially richer. Yet how would economists see this gain? These scientists have just made a discovery that is almost without price… they have produced something of a very great value… how is it reflected in the GDP? Would you see a huge bump? You would not. In fact, you might see a net decrease in the GDP!

We won’t cure cancer next year, at least not by eating kale… but our lives are made better year after year by thousands of small innovations of this sort. In many cases, these cannot be measured by economists. And that’s increasingly what progress will look like.

Measuring progress in a post-industrial world is going to get tricky.

by Daniel Lemire at October 09, 2015 03:03 PM


Equivalence of weighted Minkowski sums

Given $n$ polytopes $P_1, \cdots, P_n$, each $P_i$ is given as the V-representation, i.e., a set of $m$ points as its set of vertices.

Furthermore, consider a variant of the Minkowski sum (a weighted version of it). For $n$ nonnegative numbers $\vec{w}=(w_1, \cdots, w_n)$, we write $\sum_{\vec{w}}P_i=\{\sum_{i=1}^n w_i \vec{x}_i\mid \vec{x}_i\in P_i\}$.

The question is, given two weight vectors $\vec{w}$ and $\vec{v}$, decide whether $\sum_{\vec{w}}P_i = \sum_{\vec{v}}P_i$, where $=$ is interpreted as set equivalence.

It appears that a straightforward approach fails because $\sum_{\vec{w}}P_i$, as a polytope, might contain exponentially many vertices (on the order of $m^n$?), so I can only show that the problem is in coNP. But is the problem in P, or is it coNP-hard as well?

by user35648 at October 09, 2015 03:02 PM



Simpler proof of Rabin's Compression Theorem?

I was doing a presentation on Rabin's Compression Theorem, when someone in the audience brought up a point I have no answer to.

Rabin's Compression Theorem states that every reasonable complexity class has a problem that can be solved optimally in it. The proof is a little involved, but not horribly difficult.

The audience member proposed a much simpler proof. For the given complexity class, calculate the target volume from the input length. Then write that many hashes to the output.

Does this really prove the same result?

Also, I have had a lot of trouble finding Rabin's original paper, does anyone know how to get it?

A formal version of the argument would be: Given a constructible function f, find a program that is optimal for O(f(x)) time, where x is the length of the input.


Target = f(len(x))

For 0 to target Print "1"

The runtime of the algorithm is O(f(x)): the function is constructible, so the first line takes at most O(f(x)) and the loop is exactly O(f(x)).

The algorithm is optimal: any program faster than O(f(x)) will be unable to output exactly f(len(x)) "1"s on every input.

by user833970 at October 09, 2015 02:52 PM


Hidden subgroup problem and public key exchange

Which problems relevant to public key exchange can be considered as a hidden subgroup problem?

Conversely which hidden subgroup problems are relevant to public key exchange?

(We know that for some finite abelian groups relevant to RSA and Diffie Hellman on cyclic groups we have a connection)

by Arul at October 09, 2015 02:44 PM



Relative Volatility Index in Mathematica [migrated]

How does Mathematica calculate Relative Volatility Index?

by Arnold at October 09, 2015 02:37 PM



Gaussian time-varying copula in R [duplicate]

This question already has an answer here:

I want to estimate the parameters of a time-varying Normal copula using R. A bivariate Normal copula is defined as follows:

[image: definition of the bivariate Normal copula]

The dynamic equation of the dependence parameter ρ is: [image: evolution equation for ρ]

So I need to identify the parameters ω, α and β. Is there any package in R?

by Nourhaine at October 09, 2015 02:21 PM



What are the minimum pumping lengths for the following languages? [duplicate]

This question already has an answer here:

What are the minimum pumping lengths for the following languages?

  1. empty string

  2. (01)*

  3. 10(11*0)*0

  4. 1011

  5. 011 U 0*1*

Here are my solutions. Please correct me if I'm wrong.

  1. p = 0 because the language has no pumpable strings

  2. p = 2 because 01 is the shortest string that can be pumped

  3. p = 5 because 10100 is the shortest string that can be pumped

  4. p = 0 because the string can't be pumped

  5. p = 1 because the string 0 can be pumped

I am not sure about my answers, so any help is appreciated. Thanks a lot!

by user40826 at October 09, 2015 02:18 PM


Why can't I enable Transmission on FreeNAS 9.2? I keep getting an error

I have installed the plugin correctly, but every time I try to start it, either using service transmission start or flipping the on switch from the plugins page, I get this error:


I also have tried creating a new standard jail and using portsnap to get transmission, but same error.

by rakeshdas at October 09, 2015 02:11 PM




Patterns in scala [on hold]

Could you please recommend a good book about patterns/functional patterns in Scala?

It would also be very interesting to compare GoF patterns with Scala concepts.

for example:

Singleton -------> Object
Adapter ---------> Implicit 
Strategy --------> High order function 

More examples? articles, books?

by scalaz at October 09, 2015 01:47 PM

Dave Winer

The design of "Moments"

Since the release of Moments I've been making an effort to use it, so I can decide what it is about it that doesn't feel good. I have some conclusions, and an idea for Twitter on how to proceed.

  1. It doesn't feel good.

  2. I think that's because it shares a lot with the spam sites I've been trained to avoid.

  3. The ones with catchy headlines that suck you into lots of clicks, presumably so they can show you ads and get paid for them. Or maybe it's doing something to improve the pagerank of some pages it points to. I don't know why they are like they are, but it's not to make me, a reader happy, or inspired. Inspired people go look something up on Wikipedia, so they just want you bored enough so you'll click the Next button, if you can find it.

  4. However I think the idea is valid. "Surfacing good stuff" on Twitter seems doable, and worthwhile. Every so often I get a gem, like this link from Jay yesterday. It's dense, but packed with good stuff.

  5. Here's a free idea. A design contest! With a nice cash prize. Maybe even a job. Dear designers, here's the raw material. Make something people would like to come back to.

  6. Once you have this, put this on the home page for people who aren't logged in. Have the links on the page take you into the behind-the-scenes context for whatever it is you're looking at. Train new users to go deeper, but if they don't want to -- give them the best experience possible. Make them say "Oh I get Twitter now, it's about news!" At least that much. Let them understand the position. Even better if they come back, again and again.

  7. I think the reason why the current design doesn't achieve this is that it was designed in a corporate environment. The idea needs to be set free and let artists play with it. (Adam Bain, the new COO of Twitter, comments on this item.)

  8. Banksy: "It's not art unless it has the potential to be a disaster."

Update on Medium experiment

Yesterday I did an experiment, using IFTTT to post an item from Scripting News on Medium.

It worked, sort of. One of the problems was a flaw in Fargo's RSS generator. Without thinking, it indented the beginning of an item's <description>. If the content is HTML this doesn't matter, whitespace is irrelevant in HTML. However in Markdown, a tab at the beginning of a line means something. So that was certainly tripping up IFTTT, and in turn causing Medium to display the first line as literal text, not markup. This might have somehow been connected to the missing title on the Medium version of the post.

I fixed the problem in Fargo. If the item is markdown, it doesn't indent the text.

This post will be the second test case.

So far I have not even looked at the API, I haven't had to. Using RSS is fine with me, if it works. It does have a downside, any changes to an article do not get reflected in the Medium version. It would be nice if the RSS interface could accommodate that as well. Maybe Medium wants to directly support RSS, given that it is kind of a standard?


When I first published this piece, it had an image in the margin, and that kind of screwed up the Medium version of this post. Still no title.

October 09, 2015 01:47 PM


Is there a step-by-step guide for calculating portfolio VaR using monte carlo simulations

I am trying to determine a step-by-step algorithm for calculating a portfolio's VaR using monte carlo simulations. It seems to me that the literature for this is extraordinarily opaque for something as common as VaR. To simplify things, I want to initially consider only a portfolio of stocks and at a later stage include derivatives.

Here are the steps I have managed to pickup using different sources:

  1. Estimate the portfolio's current value $P_0$.
  2. Build the portfolio's covariance matrix using stock historical data.
  3. Create the Cholesky decomposition of the covariance matrix.
  4. Generate a vector of $n$ independent standard normal variates.
  5. Multiply the matrix resulting from the Cholesky decomposition with the vector of standard normal variates in order to get a vector of correlated variates.
  6. Calculate the assets' terminal prices using geometric Brownian motion. $$ S_i(T) = S_i(0) \exp\left(\left(\mu-\frac{\sigma^2}{2}\right)T + \sigma\sqrt{T}\epsilon_i\right)$$ where $\epsilon_i$ corresponds to the correlated random variate for asset $i$ obtained from the vector of correlated variates.
  7. Reevaluate the portfolio's value at time $T$, $P_T$, using the stock prices generated in the previous step.
  8. Calculate the portfolio return using $$R_T=\frac{P_T - P_0}{P_0}$$
  9. Repeat steps 4-8 many times (for example $n=10000$ simulations).
  10. Sort the returns in ascending order.
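
A minimal NumPy sketch of steps 1-10 (hypothetical function and argument names, not production code; the last line shows one common way to read a VaR figure off the simulated returns):

```python
import numpy as np

def mc_var(S0, shares, mu, cov, T, n_sims=10000, alpha=0.95, seed=0):
    """Monte Carlo VaR sketch for a stock-only portfolio.

    S0     : current stock prices, shape (n,)
    shares : number of shares held per stock, shape (n,)
    mu     : annualized drifts, shape (n,)
    cov    : annualized covariance matrix of returns, shape (n, n)
    T      : horizon in years (e.g. 10/252 for a ten-day VaR)
    """
    rng = np.random.default_rng(seed)
    P0 = shares @ S0                              # step 1: current portfolio value
    L = np.linalg.cholesky(cov)                   # step 3: Cholesky factor
    sigma = np.sqrt(np.diag(cov))                 # per-asset volatilities
    Z = rng.standard_normal((n_sims, len(S0)))    # step 4: independent normals
    eps = (Z @ L.T) / sigma                       # step 5: correlated, unit variance
    # step 6: GBM terminal prices S_i(T)
    ST = S0 * np.exp((mu - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * eps)
    PT = ST @ shares                              # step 7: terminal portfolio value
    R = (PT - P0) / P0                            # step 8: simulated returns
    # steps 9-10: the VaR at confidence level alpha is (minus) the
    # (1 - alpha)-quantile of the return distribution, in currency units
    return -np.quantile(R, 1.0 - alpha) * P0
```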

I have the following questions:

  1. How do I extract the VaR from the sorted portfolio returns?
  2. How do I define the time horizon T?
  3. I have seen examples where the whole stock path is discretized using a relation of the form: $$ S_{(t+dt)} = S_t + S_t\mu dt + S_t \sigma \sqrt{dt} \epsilon_i $$ Do we need to do that or simply evaluating the stock's terminal price using the formula in point 6 is enough?

by BigLudinski at October 09, 2015 01:46 PM




Why would you introduce the goto statement into a modern language?

I just found out something really quite extraordinary. While looking through Stack Overflow, I came across a question about removing goto from a PHP function. PHP doesn't have goto, I thought, and looked it up on

It turns out I was sort of right. PHP introduced goto in version 5.3, which was released in 2009. PHP didn't start out with goto; it actually introduced it into the language in 2009!

Why on earth, with all the horror stories we have from 40 years of bad programs written with goto in other languages, would PHP actually decide to introduce it?

The website even has this XKCD image suggesting no programmer should ever use goto.



What possible reason from a technical point of view could there be for introducing the goto feature, which I naively thought had been vanquished from modern computer programming languages?

by Toby Allen at October 09, 2015 01:17 PM


Do you have a validation set for Libor Market Model implementation?

I'm trying to calibrate a Libor Market Model (LMM) in Matlab with my user-defined function, not their package.

I already fitted the market volatilities using SABR but failed to simulate the correct market prices.

Here are my questions:

  • Has anyone a validation set of cap surfaces and cap prices?

  • What are the quoting market conventions for pricing using a Libor curve? I've seen many different ways of applying discount on the black formula. What about OIS?

  • Is there any really good implementation guide available, with pseudo or real code to observe?

by Tulio Carnelossi at October 09, 2015 01:12 PM

Which option pricing models agree best with the market, given the asset price is known?

Assuming you can somewhat forecast the underlying asset price movement, and you want to translate this value into the corresponding option price: in practice, which are the better models for this task?

Black-Scholes sometimes gives me unrealistic implied volatility, even for near-the-money cases, and Heston model fitting is not very stable.

I don't have a huge set of option price data to test.
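
As a sanity check for the unrealistic-implied-vol symptom, a self-contained Black-Scholes pricer plus a bisection inverter (hypothetical helper names, call options only) makes it easy to round-trip a known volatility before trusting an implied-vol routine on market data:

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S, K, T, r, sigma):
    """Black-Scholes European call price."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)

def implied_vol(price, S, K, T, r, lo=1e-6, hi=5.0, tol=1e-8):
    """Invert bs_call for sigma by bisection (call price is monotone in sigma).
    Assumes the quoted price is arbitrage-free, else the bracket fails."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bs_call(S, K, T, r, mid) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

If a quote violates static no-arbitrage bounds (e.g. below intrinsic value), no sigma reproduces it, which is one common source of "unrealistic" implied vols.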

by user2188453 at October 09, 2015 01:07 PM


Is Scala faster than Java since both run on the JVM? [duplicate]

This question already has an answer here:

On many websites I have read that Scala is faster than Java. I have written code to test the time difference between these two, but Scala takes more time. I don't know whether I am making any mistake. Please correct me if I am wrong.

Scala code

package com.first

import java.util.ArrayList

object Word extends App {
  val absoluteResult = new ArrayList[Any]()
  val before = System.currentTimeMillis()
  var i = 0
  while (i < 10000) {
    i = i + 1
    val result = List("foo", 23, true).iterator
    while (result.hasNext) {
      absoluteResult.add(foo(result.next()))
    }
  }
  println("Took : " + (System.currentTimeMillis() - before)
    + " ms, number of elements : " + absoluteResult.size)

  def foo(obj: Any) =
    obj match {
      case _: String  => "String"
      case _: Boolean => "Boolean"
      case _: Integer => "Integer"
      case _          => throw new IllegalArgumentException()
    }
}

output Took : 142 ms, number of elements : 30000

Java code

package com.first;

import java.util.ArrayList;
import java.util.List;

public class quicksort {
    public static void main(String[] args) {
        List<Object> absoluteResult = new ArrayList<Object>();
        long before = System.currentTimeMillis();
        for (int i = 0; i < 10000; i++) {
            List<Object> result = new ArrayList<Object>();
            result.add("foo");
            result.add(23);
            result.add(true);
            for (Object y : result) {
                absoluteResult.add(foo(y));
            }
        }
        System.out.println("Took : " + (System.currentTimeMillis() - before)
            + " ms, number of elements : " + absoluteResult.size());
    }

    static String foo(Object s) {
        if (s instanceof String) {
            return "String";
        } else if (s instanceof Boolean) {
            return "Boolean";
        } else if (s instanceof Integer) {
            return "Integer";
        } else {
            throw new IllegalArgumentException();
        }
    }
}

output Took : 30 ms, number of elements : 30000

by user4665655 at October 09, 2015 01:02 PM


Major unsolved problems in theoretical computer science?

Wikipedia only lists two problems under "unsolved problems in computer science":

What are other major problems that should be added to this list?


  1. Only one problem per answer
  2. Provide a brief description and any relevant links

by Shane at October 09, 2015 12:55 PM

Is there a better than linear lower bound for factoring and discrete log?

Are there any references that provide details about circuit lower bounds for specific hard problems arising in cryptography such as integer factoring, prime/composite discrete logarithm problem and its variant over group of points of elliptic curves (and their higher dimensional abelian varieties) and the general hidden subgroup problem?

Specifically do any of these problems have more than a linear complexity lower bound?

by v s at October 09, 2015 12:50 PM


Combining 2 md5 hashes [on hold]

First of all, sorry for my bad English. Here's my question: is there a way to combine two MD5 hashes? We have two strings, "a" and "b". The hashes are, for "a": 0cc175b9c0f1b6a831c399e269772661, and for "b": 92eb5ffee6ae2fec3ad71c777531578f; for the string "ab" the hash is 187ef4436122d1cc2f40dc2b92f0eba0. Is there a way to get the hash for "ab" using only the hashes of "a" and "b"? Thank you.
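
For reference, the three digests quoted above are reproducible with Python's hashlib, so the question is well-posed:

```python
import hashlib

def md5_hex(s: str) -> str:
    """Hex MD5 digest of a string."""
    return hashlib.md5(s.encode()).hexdigest()

# The three digests quoted in the question:
assert md5_hex("a") == "0cc175b9c0f1b6a831c399e269772661"
assert md5_hex("b") == "92eb5ffee6ae2fec3ad71c777531578f"
assert md5_hex("ab") == "187ef4436122d1cc2f40dc2b92f0eba0"
```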

by ToastWithButter at October 09, 2015 12:11 PM

Planet Emacsen

Irreal: Video on Using Org Mode for Your Init File

Daniel Mai has a nice video on how he uses Org mode to organize his Emacs init file. I've written about this before but some of you may prefer, or at least enjoy, watching a video on the subject.

This appears to be the first of several videos on the subject so Mai restricts himself to showing how he organizes the Org file and how he uses the use-package package to install and configure his packages. He promises to go into some of the details in future videos.

If you've reached the point where you're thinking, "I really should refactor my init file," you should consider organizing it as an Org file. Doing that makes it easy to add commentary on why you're doing things a particular way or what a custom function is for. One of the big benefits, I think, is that Org mode's outline paradigm makes it natural to organize your init file in a reasonable way. That, in turn, makes it easy to find things either by just looking at the headings or using one of the excellent Org mode search functions.

In any event, the video is enjoyable and only 12 minutes long so it should be easy to fit it into your schedule.

by jcs at October 09, 2015 11:47 AM


What do you prefer, IDO or Helm?

Currently, two packages are most used for file navigation: Ido and Helm.

I noticed there are many packages for IDO, like IDO clever match and there are many packages for Helm too.

For me personally, I prefer Helm with tight Projectile integration. But when I want to change directories, I prefer ido-find-file with Ido-vertical, because it's much easier to use than Helm.

What do you prefer for which operations, and why?

submitted by ReneFroger
[link] [36 comments]

October 09, 2015 11:21 AM


Teach me, masters! I need help with SBT, local dependencies, multiple-projects without the same root-dir and building a working .jar

First off, let me say that I'm using sbt successfully in two (other, unrelated) projects already. I love Scala and I love sbt, although I must admit I am by no means an expert.

Regarding a project I'm currently working on I'm facing multiple really annoying problems I have not found solutions for yet:

While the actual development of the project is going great, it's kind of embarrassing that the dependency-management does not work out as I would wish.

I'm developing on Linux and in IntelliJ. Currently one library I use (lwjgl3) is still under development, so there is no maven/ivy dependency I can use. You have to manually download the jar and natives from here:

Now I created a libs-folder and put lwjgl in there with all the natives and stuff.

SBT looks for unmanaged libs in lib. One can change this with:

unmanagedBase := baseDirectory.value / "libs"

but then all the libs would need to be in the "root" libs-folder.

That's Problem #0:

Not really what I call organization. I'd prefer if I could specify dependencies within the base and keep a nice and tidy folder-hierarchy.



- lib1
  -- jar
  -- natives
  -- docs
- lib2
- lib3
  -- jar

Problem #1 is:

On each startup of the project I have to manually add back the dependency of the natives, else I can't compile.

Setting the -Djava.library.path=libs/ only helps with the .jar, not with the natives :(

Problem #2 is:

When I export my project from IntelliJ (artefact with all dependencies) on startup it always complains along the lines of:

Missing lwjgl from path

With some exception thrown right in my face.

Problem #3:

Another major game-breaker for me is that I want to develop an engine and a game, separately. (Currently I'm developing the game in the same engine)

It seems that this is not really possible without the same root-directory in sbt (?)!

Basically what I want is to let the core (engine) reside in another base/root-directory than the actual application (game) so:

- /core
  -- build.sbt
- /application
  -- build.sbt

I want to be able to:

  1. modify both projects in the same IntelliJ-window
  2. leave both projects in their respective folder (no wrapper-folder around them!). Core will be used in other applications as well, applications that are no siblings to the current "application", so I do not want them to reside under the same root-folder!

You can have a more detailed look here on what I tried:

So, here I am, wanting to use sbt, but I really can't until these problems are fixed.

Currently I'm the sole developer of these projects, but I want to open source at least the engine. While for me it's "just" annoying, it's an absolute no-go that other developers would have to manually add in module dependencies on each startup just to be able to compile, let alone that I can't get a working .jar exported.

Help, please!

submitted by Teolha
[link] [4 comments]

October 09, 2015 11:19 AM


Is there a faster solution for the Google Code Jam Great Wall Problem

Consider the following Google Code Jam round 1C question:

The Great Wall of China starts out as an infinite line, where the height at all locations is $0$.

Some number of tribes $N$, $N \le 1000$, will attack the wall according to the following parameters - a start day, $D$, a start strength $S$, a start west-coordinate, $W$, and a start east-coordinate, $E$. This first attack occurs on day $D$, on range $[W,E]$, at strength $S$. If there is any portion of the Great Wall within $[W,E]$ that has height $< S$, the attack is successful, and at the end of the day, the wall will be built up such that any segment of it within $[W,E]$ of height $< S$ would then be at height $S$ (or greater, if some other attack that day hit upon the same segment with strength $S' > S$)

Each tribe will perform up to $1000$ attacks before retreating, and each attack will be determined iteratively from the one before it. Every tribe has some $\delta_D$, $\delta_X$, and $\delta_S$ that determine their sequence of attacks: they will wait $\delta_D \ge 1$ days between attacks, they will move their attack range $\delta_X$ units for each attack (negative = west, positive = east), though the size of the range will stay the same, and their strength will also increase/decrease by a constant value after each attack.

The goal of the problem is, given a complete description of the attacking tribes, determine how many of their attacks will be successful.

I managed to code a solution that does work, running in about 20 seconds: I believe the solution I implemented takes $O(A\log A + (A+X)\log X)$ time, where $A =$ the total number of attacks in a simulation (max $1000000$), and $X =$ the total number of unique edge points on attack ranges (max $2000000$).

At a high level, my solution:

  • Reads in all the Tribe information
  • Calculates all the unique $X$-coordinates for attack ranges - $O(A)$
  • Represents the Wall as a lazily-updated binary tree over the $X$ ranges that tracks minimum height values. A leaf is the span of two $X$ coordinates with nothing in-between, and all parent nodes represent the continuous interval covered by their children. - $O(X \log X)$
  • Generates all the Attacks every Tribe will perform, and sorts them by day - $O(A \log A)$
  • For each attack, see if it would be successful ($\log X$ query time). When the day changes, loop through all unprocessed successful attacks and update the wall accordingly ($\log X$ update time for each attack). - $O(A\log X)$
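The lazily-updated minimum tree in steps 3 and 5 can be sketched as follows. This is a minimal Python sketch with hypothetical names, assuming the leaves are indexed 0..n-1 over the compressed coordinate gaps; it supports a range-minimum query and a lazy "raise this range to at least h" update, both in O(log n):

```python
class WallTree:
    """Segment tree: range-min query, lazy 'raise range to at least h' update."""

    def __init__(self, n):
        self.n = n
        self.mn = [0] * (4 * n)  # minimum height within each segment
        self.lz = [0] * (4 * n)  # pending raise not yet pushed to children

    def _push(self, node):
        # Propagate a pending raise to both children, then clear it.
        for child in (2 * node, 2 * node + 1):
            self.lz[child] = max(self.lz[child], self.lz[node])
            self.mn[child] = max(self.mn[child], self.lz[node])
        self.lz[node] = 0

    def raise_to(self, l, r, h, node=1, lo=0, hi=None):
        """Set height[i] = max(height[i], h) for all i in [l, r]."""
        if hi is None:
            hi = self.n - 1
        if r < lo or hi < l:
            return
        if l <= lo and hi <= r:
            self.mn[node] = max(self.mn[node], h)
            self.lz[node] = max(self.lz[node], h)
            return
        self._push(node)
        mid = (lo + hi) // 2
        self.raise_to(l, r, h, 2 * node, lo, mid)
        self.raise_to(l, r, h, 2 * node + 1, mid + 1, hi)
        self.mn[node] = min(self.mn[2 * node], self.mn[2 * node + 1])

    def range_min(self, l, r, node=1, lo=0, hi=None):
        """Return min(height[i]) over i in [l, r]."""
        if hi is None:
            hi = self.n - 1
        if r < lo or hi < l:
            return float('inf')
        if l <= lo and hi <= r:
            return self.mn[node]
        self._push(node)
        mid = (lo + hi) // 2
        return min(self.range_min(l, r, 2 * node, lo, mid),
                   self.range_min(l, r, 2 * node + 1, mid + 1, hi))
```

An attack of strength S on a compressed range [l, r] succeeds iff range_min(l, r) < S, and each successful attack is applied at the end of its day with raise_to(l, r, S).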

My question is this: Is there a way to do better than $O(A\log A + (A+X)\log X)$? Perhaps, is there some strategic way to take advantage of the linear nature of Tribes' successive attacks? 20 seconds feels too long for an intended solution (although Java might be to blame for that).

by torquestomp at October 09, 2015 11:13 AM


Directed multigraphs as minimal automata

Given a regular language $L$ on alphabet $A$, its minimal deterministic automaton can be seen as a directed connected multigraph with constant out-degree $|A|$ and a marked initial state (by forgetting labels of transitions, final states). We keep the initial state because every vertex must be accessible from it.

Is the converse true? I.e., given a directed connected multigraph $G$ with constant out-degree and an initial state from which every vertex is accessible, is there always a language $L$ such that $G$ is the underlying graph of the minimal automaton of $L$?

For instance if $|A|=1$ it's true, since the graph must be a "lasso" with a prefix of size $i$ and a loop of size $j$, and corresponds to the minimal automaton of $L=\{a^{i+nj}~|~n\in\mathbb N\}$.

The motivation comes from a related problem encountered in a decidability reduction, where the solution is easier: starting from a non-oriented simple graph, and with more operations allowed, like adding sinks. But I was wondering if someone had already looked at this more natural question?

The only things remotely connected I could find in the literature are papers like Complexity of Road Coloring with Prescribed Reset Words, where the goal is to color such a multigraph so that the resulting automaton has a synchronizing word. However minimality does not seem to be considered.

Update: Follow-up question after the answer of Klaus Draeger: what is the complexity of deciding whether a graph is of this shape? We can guess the labeling and polynomially verify minimality of the automaton, so it is in NP, but can we say more?

by Denis at October 09, 2015 11:05 AM


Overcoming Scala Type Erasure For Function Argument of Higher-Order Function

Essentially, what I would like to do is write overloaded versions of "map" for a custom class such that each version of map differs only by the type of function passed to it.

This is what I would like to do:

object Test {
  case class Foo(name: String, value: Int)

  implicit class FooUtils(f: Foo) {
    def string() = s"${}: ${f.value}"

    def map(func: Int => Int) = Foo(, func(f.value))
    def map(func: String => String) = Foo(func(, f.value)
  }

  def main(args: Array[String]): Unit = {
    def square(n: Int): Int = n * n
    def rev(s: String): String = s.reverse

    val f = Foo("Test", 3)

    val g =
    val h =
  }
}

Of course, because of type erasure, this won't work. Either version of map will work alone, and they can be named differently and everything works fine. However, it is very important that a user can call the correct map function simply based on the type of the function passed to it.

In my search for how to solve this problem, I came across TypeTags. Here is the code I came up with that I believe is close to correct, but of course doesn't quite work:

import scala.reflect.runtime.universe._

object Test {
  case class Foo(name: String, value: Int)

  implicit class FooUtils(f: Foo) {
    def string() = s"${}: ${f.value}"

    def map[A: TypeTag](func: A => A) =
      typeOf[A] match {
        case i if i =:= typeOf[Int => Int] => f.mapI(func)
        case s if s =:= typeOf[String => String] => f.mapS(func)
      }

    def mapI(func: Int => Int) = Foo(, func(f.value))
    def mapS(func: String => String) = Foo(func(, f.value)
  }

  def main(args: Array[String]): Unit = {
    def square(n: Int): Int = n * n
    def rev(s: String): String = s.reverse

    val f = Foo("Test", 3)

    val g =
    val h =
  }
}

When I attempt to run this code I get the following errors:

[error] /src/main/scala/Test.scala:10: type mismatch;
[error]  found   : A => A
[error]  required: Int => Int
[error]         case i if i =:= typeOf[Int => Int] => f.mapI(func)
[error]                                                      ^
[error] /src/main/scala/Test.scala:11: type mismatch;
[error]  found   : A => A
[error]  required: String => String
[error]         case s if s =:= typeOf[String => String] => f.mapS(func)

It is true that func is of type A => A, so how can I tell the compiler that I'm matching on the correct type at runtime?

Thank you very much.

by Searchin4Sanity at October 09, 2015 11:03 AM


The FBI director is annoyed that the Guardian ...

The FBI director is annoyed that the Guardian and the Washington Post have better data on police killings than the FBI.
James Comey tells crime summit that ‘it’s ridiculous’ Guardian and Washington Post have more information on civilians’ deaths at hands of US police than FBI
Better late than never!

October 09, 2015 11:00 AM


Can Economic Capital cover Regulatory Capital?

If economic capital is set by the institution to cover unexpected loss (given a confidence level) and regulatory capital is set by the regulator, can one "absorb" the other?

For example, if I determine I want my fictional bank to hold 5bn in economic capital as that corresponds to 99.98% confidence of my loss distribution and the local regulator says I need to hold a total of 3bn in regulatory capital based on my regulatory filings, can I say I'm holding 5 already for EC so I'm covered? Or do I have to hold 5 + 3?

by AfterWorkGuinness at October 09, 2015 10:52 AM


Best place/online resource to learn more Theory?

I have been programming in C# since April 2015, making websites with HTML/CSS since 2013, and am currently learning JavaScript; however, my lack of basic theory is starting to hinder my ability to understand more complex algorithms. Does anyone have any resources they could throw at me? I don't have a programming job or anything (as of now; hoping to change that in the future), I am just doing side projects for fun. Not only do I want to learn it so I have a better grasp of things, but it also generally seems interesting and something I want to know more about. Thanks in advance for the help! :)

submitted by sloansta
[link] [4 comments]

October 09, 2015 10:48 AM


Computer Scientist Questions!

My name is Dawid. I'm a 5th-year student at Colaiste an Chraoibhin in Fermoy, Co. Cork. I am interested in becoming a computer scientist later in life, and I would like to ask you a couple of questions as part of my career-investigation project. What is your role as a computer scientist? What is the career like? How long have you been a computer scientist? How long did it take you to achieve your position? What are the average wages like? What are your day-to-day tasks? Is the job hard?

by Dawid at October 09, 2015 10:40 AM


Directed cyclic graph with node rewards and arc costs

The problem I have seems fairly simple and I feel it must have some kind of name. I have a (directed cyclic) graph. Each node has an associated reward for visiting it, and each arc costs a certain amount of time to traverse it. The reward is consumed on visiting once, so a path may visit a node multiple times but receives 0 reward for future visits.

Is there an algorithm for maximising the reward while proceeding from node A to node B (or back to A again) in a given maximum amount of time available?

Note that the times and rewards cannot be combined as a single "weight".

The data is sparse and isn't that big: a few hundred nodes which typically have less than 5 arcs each.

Any tips on modelling and/or algorithms greatly appreciated.
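For instances of this size, one baseline is exhaustive depth-first enumeration of time-feasible walks; the problem is a variant of the orienteering problem, which is NP-hard in general, so exact methods are exponential in the worst case. Below is a hedged Python sketch with hypothetical names, assuming adjacency lists of (neighbor, travel_time) pairs and strictly positive travel times (so the search depth is bounded by the budget):

```python
def best_reward(adj, rewards, start, goal, budget):
    """Maximize collected reward on a walk from start to goal within the
    time budget. Each node's reward is collected at most once; revisits
    are allowed but earn nothing. adj: node -> [(neighbor, travel_time)]."""
    best = 0

    def dfs(node, time_left, collected, gained):
        nonlocal best
        if node == goal:
            best = max(best, gained)  # may keep going and return later
        for nxt, cost in adj.get(node, []):
            if cost <= time_left:
                extra = 0 if nxt in collected else rewards.get(nxt, 0)
                dfs(nxt, time_left - cost, collected | {nxt}, gained + extra)

    dfs(start, budget, {start}, rewards.get(start, 0))
    return best
```

For example, with adj = {"A": [("B", 1), ("C", 2)], "B": [("A", 1)], "C": [("A", 2)]} and rewards = {"A": 0, "B": 5, "C": 3}, a round trip from "A" with budget 4 yields 5 (visit B and return); with budget 8 it yields 8 (visit both B and C). Memoization on (node, time_left, frozenset(collected)) or branch-and-bound pruning would be natural next steps for a few hundred nodes.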

by Geoff Bache at October 09, 2015 10:22 AM


Fastest Turing Machine

Recently I have been reading about Kolmogorov complexity, and I started thinking about the "fastest Turing machine". I am not interested in finding such a machine, only in its time complexity. By googling I found the blog, and the definition of the "fastest Turing machine" there is exactly the same as mine:

Let $L(M) = \{ w \in \Sigma^* | M \text{ accepts } w \}$, $M=$ TM = Turing Machine.

$T_M(n) = \text{ max } \{ t_M(w) | w \in \Sigma^n \}$, where $t_M(w)$ is the computation time of $M$ on $w$.

Fix some universal Turing machine $U$. Then we can define:

$T_U(L,n) := min_M \{ T_M(n) : M \text{ recognizes the language } L \}$ The $min$ is taken with respect to lexicographic ordering of the programs $M$ in the UTM $U$.

This $T_U(L,n)$ measures the time of the fastest Turing machine to recognize words of length $n$ of the language $L$.

Then by definition of $T_U(L,n)$ we have for each TM $M$:

$T_U( L(M), n ) \le T_M(n)$

I was wondering if changing the UTM from $U$ to $V$ will have much impact on $T_U(L,n)$.

I guess that $T_U(L,n) \le c_{UV} \cdot T_V(L,n)$ for some constant $c_{UV}$ which depends solely on $U$ and $V$, not on $L$ and $n$. The intuition behind it is the same as in the Kolmogorov complexity case: first one has an interpreter from $U$ to $V$, then one runs the programs using this interpreter. But how does one make this into a formal proof?

Do you have any idea?

by stackExchangeUser at October 09, 2015 10:14 AM


Act 1: A visiting professor gives a keynote talk at ...

Act 1: A visiting professor gives a keynote talk at Purdue University and briefly shows a Snowden slide somewhere in the middle.

Act 2: Purdue destroys all video recordings and even briefly considers destroying the projector.

As it turns out, Purdue also does research for the Pentagon, for which it received a security clearance, and that clearance came with lengthy military rules on how classified material is to be handled.

Update: Perhaps they took their cue from the education ministry of Mecklenburg-Vorpommern :-)

October 09, 2015 10:01 AM

Good news from Google: Google has noticed ...

Good news from Google: Google has noticed that mobile web pages load terribly slowly and transfer far too much data. They therefore propose a kind of best practices, which they want to codify in the form of a standard called Accelerated Mobile Pages (AMP).

The key points are: CSS yes, JavaScript no, analytics only via a static tracking pixel, ads only via iframe. The target audience for these rules is news sites, where Google speaks of "relatively static content" and ultimately wants to render out static HTML pages and distribute them via CDN.

In other words, they are now where I have been for years :-)

But gloating aside, this concept is on the one hand wonderful, because it would actually make the web better if people stuck to it. On the other hand, tracking pixels can be adblocked beautifully, as can iframes, and the currently rampant anti-adblock solutions insist on JavaScript, which AMP forbids.

October 09, 2015 10:01 AM



Estimate market risk premium?

There are uncountably many factor models to estimate stock returns, such as CAPM, Fama-French, Carhart-Momentum, APT etc.

Which models can estimate the market (index) return?

I found only three models: Cay, Dividends and Average Correlation.

by emcor at October 09, 2015 09:54 AM



Useful code which uses reduce() in python

Does anyone here have any useful code that uses the reduce() function in Python? Is there any code other than the usual + and * that we see in the examples?

Refer Fate of reduce() in Python 3000 by GvR
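A few hedged, self-contained Python examples (with illustrative names) that use reduce() for something other than + and *:

```python
from functools import reduce

# 1. Compose single-argument functions into one pipeline function.
def compose(*funcs):
    return reduce(lambda f, g: lambda x: g(f(x)), funcs)

pipeline = compose(str.strip, str.lower, str.split)
# pipeline("  Hello World ") -> ['hello', 'world']

# 2. Walk a nested dict along a path of keys.
config = {"db": {"primary": {"host": "localhost"}}}
host = reduce(dict.get, ["db", "primary", "host"], config)
# host -> 'localhost'

# 3. Union a list of sets, starting from the empty set.
groups = [{1, 2}, {2, 3}, {4}]
merged = reduce(set.union, groups, set())
# merged -> {1, 2, 3, 4}
```

Each of these folds a sequence into a single value with an operation that isn't simple arithmetic, which is where reduce() still reads more naturally than an explicit loop.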

by cnu at October 09, 2015 09:45 AM

Node.js programming style with Akka

I'm wrapping my head around Akka from a node.js perspective. I write node.js code like the example below, and it's easy to write and follow. I can easily extend it to handle and aggregate multiple services asynchronously, all while holding onto the initial argument state data (a in the example).

function handleAction(a, b, callback) {
    remoteServiceOperation(b, function(err, data) {
        if (!err) {
            // I can reference argument a
            var z = a + data; // Do work with a and service result
        }
    });
}
Below is rough pseudocode for Akka/Scala. My understanding is that ask (?) blocks and should generally be avoided. This is where my knowledge ends: I'm not clear how to hold on to state a, how to aggregate (not shown), or how to structure Akka in a node.js style in general.

receive {
    case HandleAction(a) =>
        remoteService ! new RemoteServiceOperation(b, c)
        // About to leave and we'll lose `a`
    case RemoteServiceOperationResponse(data) =>
        // `a` is afk
}
How can I write Akka more like node?

by fionbio at October 09, 2015 09:28 AM


Analysis of Control flow graph

I have read some papers that describe using static analysis to generate the call graph of an application, where the call graph is used to identify the application's entry points. Given the nature of mobile applications (having multiple entry points), how can I identify the entry points (where actions are triggered) of an application from its call graph or control flow graph? I have used a static analysis tool that generated a control flow graph of an application only as counts of nodes and edges (e.g. Nodes 61 and Edges 94), and I couldn't understand how to identify the entry points from this. Please, can anyone help?

by Ibrahim Salihu Anka at October 09, 2015 09:21 AM


Number of k-expressions of graph (clique Width)

The clique-width of a graph $G$ is the minimum number of labels needed to construct G by means of the following 4 operations. The Construction of a graph $G$ using the four operations is represented by an algebraic expression called $k$-$expression$, Where $k$ is the number of labels used in expression. For example, $K_4$(complete graph with four vertices) can be constructed by $$\rho_{2\rightarrow 1}(\eta_{1,2}(\rho_{2\rightarrow 1}(\eta_{1,2}(\rho_{2\rightarrow 1}(\eta_{1,2}(a(1)\oplus b(2)))\oplus c(2))) \oplus d(2))).$$

Question: How many $k$-expressions does a graph of bounded clique-width have? Are there any particular graph classes for which this question has been studied?

by GOLD at October 09, 2015 08:13 AM


Which software uses the Python 2.0 license?

Hey guys, I am a computer science student from Greece and I have to write an essay on examples of software that use the Python 2.0 license. I would really appreciate it if you knew any.

submitted by feelingwisesometimes
[link] [5 comments]

October 09, 2015 08:12 AM




Floating output/making a non-self stabilizing algorithm self stabilizing

So we're using this book by Shlomi Dolev on self-stabilization in class. Not being from a theoretical CS background, I find the book rather terse. Most of it is fine, but there's this section on "recomputation of floating output" that I just couldn't get, and I would like to know if anyone could help explain the concept in a bit more depth. There's probably a more commonly accepted term for it in the field of distributed systems, so if anyone recognizes the concept, the common name for it would be very helpful as well.

As I understand it, the gist is that you take a non-stabilizing, finite algorithm A and run it forever. With some care in tracking the arguments and outputs of each iteration, you can turn algorithm A into a self-stabilizing one. But that's about the limit of my understanding, and again, the book doesn't do an amazing job explaining it...

by Benjamin Lindqvist at October 09, 2015 07:20 AM

Fred Wilson

Feature Friday: 3D Touch

When we get back from Europe later this month, I’m going to get a new phone. I’ve been on Android since getting back from LA last winter and I think it’s time to try out the latest stuff on iOS, but I’m not 100% there. I do love the tight integration of the google apps into android and miss that on iPhone. So I’m still a bit on the fence.

One thing that might get me there is 3D Touch. I’ve read a number of people saying this additional UI capability is a game changer for them.

So let’s talk about 3D touch today. For those of you who have the new iPhones, how big of a deal is 3D Touch? Game changer or nice feature?

by Fred Wilson at October 09, 2015 07:10 AM



Scala: Free variable inside closure does not bind properly

I have a processor that reads a message off a beanstalk queue. It then instantiates some immutable variables from the message and makes a REST call to a server.

trait ActUponX extends RestClientForX {
  def process(json: JSONObject): Unit = {
    val x: String = if (json.containsKey("hello")) json.getString("hello") else null // we'll assume {"hello": "world"} else null
    val y: String = if (json.containsKey("hola")) json.getString("hola") else null // we'll assume "hola" maps to "mundo" else null
    println(s"x=$x y=$y")
    attemptAPIRequest(RequestObj(performAction, x, y))
  }

  def performAction(x: String, y: String) = {
    println(s"x=$x y=$y")
  }

  process(JSONObject.fromObject("{\"hello\": \"world\", \"hola\": \"mundo\"}"))
}

case class RequestObj(call: (String, String) => Unit, x: String, y: String)

trait RestClientForX {
  def attemptAPIRequest(req: RequestObj): Unit =, req.y)
}

The problem I'm seeing with the above is that the first println in the process method prints "x=world y=mundo", while the println statement in the performAction method prints "x=null y=mundo".

What is throwing me off is the fact that in process, x is the proper value. Given that it is immutable, I don't see how x could have possibly been mutated.

I've also been reading up on the binding of variables within closures and closing over variables, but am still confused by the above behavior. Does it matter where in the code I bind the free variables? Is it possible that a free variable fails to bind if it is nested within too many layers?

I know I should not be using nulls in a Scala app, but for the sake of argument please bear with me.

by sc_1824 at October 09, 2015 06:25 AM


True Stuff: Alexander Graham Bell vs. Western Union

“Be sure you get my sideburns in focus.”

One of my favorite subjects to research is how society’s complaints about “this danged whiz-bang modern world” tend to repeat throughout history.

In previous articles, I’ve shared how Socrates decried the invention of writing; how monks opposed the invention of the printing press; and how electricity and the telephone were seen as demonic instruments.

One takeaway from this sort of story is how these parallels seem to show us that fear of change is an ever-present part of human nature. But these episodes also illuminate another repeating pattern: fear of the new, from those with a vested interest in the old.

After all, the printing press was going to fundamentally alter — and perhaps cheapen — the books that monks dedicated their entire lives to inscribing. Socrates, perhaps, felt that if his words were written down, there would be less interest in his lectures.

And, since we as a thinking species are great at devising “rational” arguments to back up our preconceived notions, the opposition to printing was framed as genuine concern that the reader would no longer feel connected to God through the devotional act of transcribing the Bible.

Socrates’ argument against writing took the form of worrying that the human mind will no longer need to have its own ideas.

As for advances in electricity and the communication breakthroughs of the Industrial Revolution, according to an 1889 editorial in Nature,

It may well be questioned whether, in view of the startling and unforeseen consequences of scientific success which have changed the aspect and economy of the entire globe within the past fifty years, we have not overstepped the moral bounds of science by perverting the knowledge which Man came into possession of surreptitiously when he ate of the “forbidden fruit.”

We have not only experimented with the visible forces of Nature, but, like Saul, have had dealings with the occult. When Benjamin Franklin first called down the lightning from the sky he was accused by the superstitious or reverential with “tempting the Almighty.” Now we handle the subtile element as if it were inert matter, and we impress it into our nurseries as a toy for the children!

About the telephone in particular, the author of this piece declares, “The telephone is the most dangerous of all because it enters into every dwelling. Its interminable network of wires is a perpetual menace to life and property. In its best performance it is only a convenience. It was never a necessity. In a multitude of cities its service is unsatisfactory and is being dispensed with.”

The Providence Press went one step further: “It is indeed difficult, hearing the sounds out of the mysterious box, to wholly resist the notion that the powers of darkness are somehow in league with it.”

Horse Sense

Of course, I’m not alone in deriving enjoyment from looks back at history’s curmudgeons. Articles and thinkpieces about “tech disruption” today are always evoking the follies of small minds of the past, reminding us to adapt or die, lest we share the fate of the buggy-whip makers driven out of business by Henry Ford and the automobile — a metaphor which, by the way, was made famous by Theodore Levitt, writing for Harvard Business Review in 1960:

The article, “Marketing Myopia” (click the image above to see excerpts on Google Books), describes many concepts that have become familiar, even rote knowledge, in the digital age.

The “buggy whip” notion is even robust enough to survive being inverted:

It’s absolutely precious that the above article appears on some kind of blog for LinkedIn, the poster child for the declaration “What?? WE are ABSOLUTELY relevant!!”

The Myth of Western Union’s Blunder

I want to talk more about the telephone. Across the Internet plains, where thinkpieces about tech disruption multiply like dandelions, a common story is recounted.

Alexander Graham Bell, the tale goes, facing financial ruin and unending legal battles over the patent rights to the telephone, offered in 1876 to sell his patent to Western Union, the telegraph company, for $100,000.

An internal committee memo describes the Western Union response:

The Telephone purports to transmit the speaking voice over telegraph wires. We found that the voice is very weak and indistinct, and grows even weaker when long wires are used between the transmitter and receiver. Technically, we do not see that this device will be ever capable of sending recognizable speech over a distance of several miles.

Messer Hubbard and Bell want to install one of their “telephone devices” in every city. The idea is idiotic on the face of it. Furthermore, why would any person want to use this ungainly and impractical device when he can send a messenger to the telegraph office and have a clear written message sent to any large city in the United States?

The electricians of our company have developed all the significant improvements in the telegraph art to date, and we see no reason why a group of outsiders, with extravagant and impractical ideas, should be entertained, when they have not the slightest idea of the true problems involved. Mr. G.G. Hubbard’s fanciful predictions, while they sound rosy, are based on wild-eyed imagination and lack of understanding of the technical and economic facts of the situation, and a posture of ignoring the obvious limitations of his device, which is hardly more than a toy …

In view of these facts, we feel that Mr. G.G. Hubbard’s request for $100,000 of the sale of this patent is utterly unreasonable, since this device is inherently of no use to us. We do not recommend its purchase.

Oh, man. This story is like catnip. The dullard corporation too stuck in the past to recognize innovation, versus the fearless visionary vindicated by history!! Heroic mascot of every they-laughed-at-Einstein-too tinkerer in a basement cluttered to the ceiling with half-finished whatevers!!

I was fascinated by this memo and its sneering tone — “The idea is idiotic on the face of it”, indeed! So I went to look it up in the primary sources.

There’s no mention of it in newspapers from 1876 to 1900. Well, maybe it became public in histories or archives later on… But there’s no mention of it, verbatim, in any documents up through 1950, either.

The memo is often described as having been sent to the president of Western Union, Chauncey M. Depew. But: Chauncey Depew wasn’t a telegraph magnate. He was an attorney and railroad man; in 1876, he was general counsel for the Vanderbilt group of railroad companies.

He was, starting in 1881 (five years after the date of this purported document), on the board of directors of Western Union, but not in an executive capacity.

Chauncey Depew, rocking his house-sittin’ chair.

So, what’s going on here?

On its face, this memo is not factual.

With some more digging, it becomes clear that the text of the memo was fabricated. The events do have some relation to reality — and as far as business lessons go, they are meaningful.

But the message may not be precisely what we thought it was.

For much of the below information, and for arming me with search terms that led me to discover other bits, I am indebted to Phil Lapsley at The History of Phone Phreaking Blog.

Suspicious of the “Depew letter” just as I was, he wrote a series of posts (one, two, three) that shed a lot of light on the background of this famous memo, and the events surrounding it — saving me a lot of original research. I aim to partially summarize Phil’s findings, and also add some information of my own.

The Depew letter, as such, seems to have first seen print as an anecdote in the “Chairman’s Statement”, written by Richard C. Levine, in a 1968 journal of the IEEE (Institute of Electrical and Electronics Engineers).

Levine himself, writing to Lapsley in 2011, explains:

I put it together from two similar but not identical versions. One was widely known among engineers and business people in the late 1950s and early 1960s in the Boston area. I even saw a framed copy of the first version on the office wall of Richard E. Dolbear, an electrical engineer and expert on high voltage power systems […]

The second version of the letter, which is closer to the one I published, was copied from the files of MIT Professor Carlton Tucker. Tucker was an expert on electromechanical telephone switching and taught a survey class on telephone systems at MIT in the 1950s.

Who knows where they got it from. Lapsley also cites (and rehosts a PDF of) a 1976 article on Alexander Graham Bell in IEEE Spectrum magazine, which addresses the memo thus:

Unfortunately, this report is suspect on several counts… How did such a document originate? One can speculate along three lines: It could have been a joke. Or it could have been an honest attempt to recreate years later what such a committee might have reported. Or it might have grown out of a confused reminiscence by someone who was aware that Chauncey Depew was offered — and refused, to his everlasting regret — a share in the Bell patents.

Aha! So there is some truth to the notion that Depew passed, unwisely, on some offer from Bell.

Bell’s Bearded Benefactor

No story about Alexander Graham Bell is complete without mentioning Gardiner G. Hubbard.

G.G. Hubbard was an attorney, investor, public-works advocate, and one of the first champions of Bell’s invention. (Bell would go on to marry Hubbard’s daughter.) He was described as “a man of venerable appearance, with white hair, worn long, and a patriarchal beard.”

“Mabel, bring me my sittin’ beard.”

He was a familiar figure in Washington, and well known among the public men of his day… He saw that this new idea of telephoning must be made familiar in the public mind. He talked telephone by day and by night. Whenever he travelled, he carried a pair of the magical instruments in his valise, and gave demonstrations on trains and in hotels. He buttonholed every influential man who crossed his path. He was a veritable “Ancient Mariner” of the telephone. No possible listener was allowed to escape. (The History of the Telephone, 1910)

Before the telephone could be popularized, however, it needed to be invented.

Hubbard first met Alexander Bell through Mabel Hubbard, his daughter, who was a pupil of Bell’s at a school for the deaf in Boston. Hubbard supported Bell’s tinkering through years of failed experiments, even going so far as to warn him, “If you wish my daughter, you must abandon your foolish telephone.”

Here’s an excerpt from a letter Bell wrote in 1874, relating the details of an early attempt to “create the sensation of sound without the aid of any intermediate apparatus”.

I love the note there, too: “I hope you admire my drawing!!!”

It’s so…good, Alex.

But Bell persisted, and in 1875, with the help of machinist Thomas Watson, began to see some success in transmitting musical sounds over telegraph wire. In March 1876, Bell filed patent No. 174,465 for “An improvement in telegraphy.” It was Bell’s twenty-ninth birthday.

It doesn’t even LOOK like a bell.

On the same day, engineer Elisha Gray, who had been working for some years in the same field, and whom Bell knew, filed a caveat (a provisional patent application) for a very similar invention. There is some controversy about whether Bell knew about Gray’s application, or whose application arrived first at the patent office, or whether Bell knew the workings of Gray’s inventions or came up with his own device independently. Gray would go on to challenge Bell’s patent.

In the years to come Bell would be vindicated in the courts, eventually withstanding six hundred patent challenges from competitors over the span of decades. And in time, of course, the telephone would become one of the most valuable inventions in history.

But at that moment in 1876, he was penniless. And the following year, Gray’s competing patent was purchased by one of the largest corporations in America: Western Union.

The Reigning King of Beeps

Telegraphs, of course, were at the time the best way to transmit long-distance communications. The president of Western Union at the time was William Orton, who knew — and hated — Bell’s father-in-law, G.G. Hubbard.

You just know this guy is the villain of this story:

In the movie version of this article, Orton will be played by John Cleese.

Hubbard, in his capacity as an attorney, had introduced an anti-trust suit against Western Union in 1868. Wary of the telegraph giant’s growing corporate power, he also lobbied Congress to create a federal telegraph company, to be administered by the Postal Service.

Orton had been the head of a small telegraph company that later merged with Western Union — like many other small telegraph companies had been forced to do. When Orton took charge of Western Union in 1867, he was a reformer: he poured money and resources into improving telegraph technology. In 1878, Scientific American wrote that Orton had “possessed a ready appreciation of inventors’ work, and was quick to advocate the adoption and use of new and improved devices calculated to add to the extension and efficiency of the telegraph system.”

He was no dummy. He was up-to-date on Elisha Gray’s inventions; Gray worked for Western Electric, one of Western Union’s suppliers, and in 1874 Orton supported Gray’s efforts at developing a device for transmitting musical tones over wire. Soon, he would hire none other than Thomas Edison to innovate upon (and reverse engineer) Bell’s principle of the “acoustic telegraph”.

Orton, you see, thought he could beat Bell at his own game. A year before Bell filed his first telephone patent, Orton had invited Bell to demonstrate his work, as Gray had demonstrated his own inventions, at Western Union headquarters.

In a March 1875 letter to his parents, Bell described what it felt like to go from his own ramshackle workshop, where he couldn’t even get a battery to work half the time, to Western Union’s elaborate laboratories:

[Orton] told me that the Western Union would be glad to give me every facility in perfecting my instruments, and he gave me a hearty invitation to take my apparatus to New York, and I should have the assistance of their best electricians.

They have a special experimental room, and have at instant command thousands of cells of battery, and thousands of miles of real line wire to test with.

Mr. Orton said further, that he wished me distinctly to understand that the Western Union had no interest in Mr. Gray or his invention.

This is very encouraging. Mr. Orton had previously seen Gray’s apparatus, and yet he came forward to take up mine. [emphasis original]

In 1875, Western Union’s only remaining rival was the Pacific Telegraph Company, whose scientists were making strides in technology that let them lower their transmission prices, making them a new threat to Western Union. Orton was clearly scouting for advantages of his own. “At all events,” wrote Bell, outlining the situation in his letter home, “it is evidently a good time to bring out the invention.”

But, though he put on a kindly face, Orton offered Bell no financial support once he’d looked over his invention. The device did indeed transmit various noises, but practical uses for it seemed unlikely. (At this point, Bell’s first successful vocal transmissions were still a year away.)

G.G. Hubbard was Bell’s chief investor, but he was also now his father-in-law. There is evidence that Hubbard, perhaps motivated by worry for his daughter’s future, offered to sell Bell’s patents — which he had the legal right to do — to Orton, for the proverbial $100,000. The sole source for this figure is a 1915 recollection by Thomas Watson; he said that Western Union had refused the offer “somewhat contemptuously”. That much is plausible — Orton, remember, hated Hubbard for his opposition to Western Union in the courts and in Congress.

Depew’s Proverbial Regret

Back now to Chauncey Depew, who, as you will recall, worked for the Vanderbilt group of railroad companies. As fate would have it, G.G. Hubbard had been a railway mail inspector in the past, and had done business with Depew often. The two were friends.

In late 1876, having now been turned down by Orton and Western Union, Hubbard approached Depew and asked him to invest in Bell’s invention. (Phil Lapsley reprints two descriptions of this meeting in his own post.)

Here’s an excerpt from Depew’s 1924 memoir:

One day [Hubbard] said to me: “My son-in-law, Professor Bell, has made what I think is a wonderful invention. It is a talking telegraph. We need ten thousand dollars, and I will give you one-sixth interest for that amount of money.”

So far, so good. What could go wrong?

I was very much impressed with Mr. Hubbard’s description of the possibilities of Professor Bell’s invention. Before accepting, however, I called upon my friend, Mr. William Orton, president of the Western Union Telegraph Company. Orton had the reputation of being the best-informed and most accomplished electrical expert in the country. He said to me: “There is nothing in this patent whatever, nor is there anything in the scheme itself, except as a toy. If the device has any value, the Western Union owns a prior patent called the Gray’s patent, which makes the Bell device worthless.” [emphasis added]

In a different account in 1926, in an interview with the New York Herald Tribune, Depew added the following details:

On hearing my story, friend Orton laid his hand on my shoulder to make his words the more emphatic, I suppose, and told me in all good faith and complete sincerity to drop the matter at once. He said in the first place the invention was a toy, that Bell could not perfect anything so that it might have commercial possibilities, and that above all, if there was any merit to the thing, the Western Union owned the Gray patents and would simply step in, superseding Bell, and take the whole thing away from him. That cooled me off to an amazing extent. I felt I was out of the deal when I left Orton’s office.

Of course Depew would ask Orton’s opinion, since Orton was known as the expert in the field of commercial telegraphy.

And despite what Orton had told Bell the year before, Orton was indeed — or was now — interested in Gray’s version of the telephone. In fact, Western Union had bought Gray’s patents (and one by another early innovator, Amos Dolbear), and would put their own Thomas Edison-designed telephones into production the following year.

Having heard Orton’s advice, Depew now declined the offer. But Depew reported that Hubbard, who had come by his office to ask after his decision, convinced him once again of the merits of Bell’s telephone. “By the time he got through,” said Depew, “I seemed to be weaned away from Orton and back to Hubbard’s side again…I went home that night actually resolved to risk the $10,000 in Bell’s device on the strength of Hubbard’s arguments.”

That night, Orton showed up at Depew’s house to talk him back down.

[Orton] said, “After you left my office I began to worry for fear you would be foolish enough to let Bell have that $10,000… I want to explain further to you why Bell could not succeed with his device, even if it worked. We would come along and take it away from him, and you would be out of pocket the $10,000.” And so on these lines I chatted it out with Orton — that’s the most expensive chat I have had yet!

Next day I told Hubbard I had decided after all not to invest any money with Bell, and although we argued some more, I stuck to this last decision, and Hubbard went away.

Depew didn’t know at that time about Orton’s strained relationship with Hubbard, and didn’t know that Hubbard had made an earlier offer to Orton — the one that had been “contemptuously” shot down. Orton, after all, was planning to have Edison improve on Bell’s invention, and thus secure new patents that could beat Bell in court.

Here Is the Point

Western Union’s decision to pass on buying Bell’s patent is considered to be, in retrospect, a tactical blunder.

But Orton’s success at making Western Union profitable over the preceding decades stemmed from his realization that the bulk of their day-to-day business was in short commercial messages such as stock trades. He had actively focused the business on that market. And since telegraph technology had improved to the point where they could send up to four messages on a single line, the telephone, with its short range and single line, seemed obviously inferior.

Plus, at the time, Bell’s phone didn’t seem like the best option on the market — not when they could buy up competing patents, and put a genius like Edison on their payroll to build a version for themselves.

Western Union did, apparently, receive and decline an offer from Hubbard. Whatever its details — whether similar in fact to the catty, apocryphal “committee memo” or not — the lesson isn’t that the big company couldn’t pivot, or see the future. They did at least kind of pivot, and invest in where their industry was going. They even beat Bell for a while, ultimately forcing Bell’s hand — by 1878, Bell had no choice but to sue Western Union for patent infringement, a risky move when neither party knew whether the courts would uphold Bell’s patent over Gray’s.

It’s probably true that Orton couldn’t conceptualize the telephone’s potential to transform everyday civic life, in a way that Hubbard seemed to from the start. Western Union did have a hard time competing with Bell, in the long run, and Bell Telephone eventually bought Western Union.

But the missed business opportunity of the apocryphal “Depew letter” wasn’t Western Union’s — it was Depew’s. Depew was a successful railroad executive, a private investor, who came to Orton for advice about an opportunity his friend Hubbard brought to him.

Orton didn’t want to enrich Hubbard, the man who’d tried to dismantle Western Union. Orton wanted Hubbard and Bell to fail. So Orton convinced Depew not to invest in Bell, even coming to Depew’s house to talk him out of funding the competition.

Depew lived long enough to see what he could have been a part of. (Orton didn’t — he died in 1878.) To his credit, Depew writes in his memoir:

I would have netted by to-day at least one hundred million dollars. I have no regrets. I know my make-up, with its love for the social side of life and its good things, and for good times with good fellows. I also know the necessity of activity and work. I am quite sure with this necessity removed and ambition smothered, I should long ago have been in my grave.

That is a very nice way of coming to terms with the missed opportunity.

Western Union didn’t just sneer dismissively at Alex Bell’s newfangled invention, as the story seems to go. That casts the Western Union camp simply as a boardroom of buffoons.

In fact, what Western Union actually did was something far more typical: it put on a show of sneering at the new thing, so it could quietly try to crush it.

by David Malki at October 09, 2015 06:18 AM


Can a regular expression be infinite?

I know that languages which can be defined using regular expressions and those recognisable by DFA/NFA (finite automata) are equivalent. Also, no DFA exists for the language $\{0^n 1^n \mid n \ge 0\}$. But it can still be written using regular expressions (for that matter, any non-regular language can be) as $\{\epsilon\} \cup \{01\} \cup \{0011\} \cup \cdots$. But we know that every language that has a regular expression has a DFA that recognises it (a contradiction to my earlier statement). I know this is a trivial thing, but does the definition of regular expression include the condition that it should be finite?
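For reference, the standard inductive definition makes the finiteness explicit: a regular expression over an alphabet $\Sigma$ is itself a finite string, built by finitely many applications of

```latex
R \;::=\; \emptyset \;\mid\; \varepsilon \;\mid\; a \ (a \in \Sigma) \;\mid\; (R_1 \cup R_2) \;\mid\; (R_1 R_2) \;\mid\; R_1^{*}
```

so the infinite union above is not itself a regular expression; it only describes the language.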

by sasha at October 09, 2015 06:18 AM


How to create multiple agents in parallel (for? or map?) with name and values in CLOJURE?

I'm trying to make a bunch of agents. Individually, one can do:

(def myAgent (agent 3))

But if I want to make a lot of agents, how can I assign both names and values to the agent in an anonymous function? I have this:

(def agents (vec (map agent (range 0 50)) ))

Which makes 50 agents, but none of them have a value. When I try an anonymous function:

(def agents (vec (map (fn [x] (def x (agent 3))) (range 0 50))))

It doesn't work. Any help would be greatly appreciated.

by ProgrammingEqualsSuperpower at October 09, 2015 06:13 AM


When would dedicated portfolios do better than 'immunized' portfolios?

We just learned about cash-matching through dedicated portfolios (using risk-free bonds) in my class on mathematical programming. However, in an aside, one of the notes said:

It should be noted, however, that dedicated portfolios cost typically from 3% to 7% more in dollar terms than do “immunized” portfolios that are constructed based on matching present value, duration, and convexity of the assets and liabilities.

When would dedicated portfolios be better than 'immunized' portfolios, where you would have to manage accordingly? Immunized portfolios are dependent on duration, convexity, and present value of cash flows. The book says,

"Portfolios that are constructed by matching these three factors are immunized against parallel shifts in the yield curve, but there may still be a great deal of exposure and vulnerability to other types of shifts, and they need to be actively managed, which can be costly. By contrast, dedicated portfolios do not need to be managed after they are constructed."

What are these other variables? Also, you might have access to more instruments (possibly)? But other than that, I'm not sure when to use dedicated portfolios or just calculate how to immunize the portfolio. Anyone have insight on this?
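As a sketch of what "matching present value, duration, and convexity" means (my own illustration; the function name and the flat-yield, annual-compounding assumptions are mine, not from the book), the three matched quantities for a stream of cashflows can be computed as:

```python
# Hedged sketch (names and assumptions mine): the three quantities an
# "immunized" portfolio matches, for cashflows given as (t, amount)
# pairs at a flat annual yield y.

def pv_duration_convexity(cashflows, y):
    """Return (present value, Macaulay duration, convexity)."""
    pv = sum(c / (1 + y) ** t for t, c in cashflows)
    dur = sum(t * c / (1 + y) ** t for t, c in cashflows) / pv
    conv = sum(t * (t + 1) * c / (1 + y) ** (t + 2) for t, c in cashflows) / pv
    return pv, dur, conv

# A 2-year 5% annual-coupon bond (face 100) at a 5% yield prices at par.
pv, dur, conv = pv_duration_convexity([(1, 5), (2, 105)], 0.05)
print(round(pv, 2), round(dur, 4))  # 100.0 1.9524
```

Immunization matches these three numbers between assets and liabilities, which protects only against parallel yield-curve shifts; a dedicated portfolio instead matches the cashflows themselves.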

by Supp at October 09, 2015 06:12 AM


A Replace Function in Lisp That Duplicates Mathematica Functionality

What is the easiest way to accomplish the following in a Mathematica clone or in any version of Lisp (any language is probably okay actually, even Haskell)? It doesn't appear that any Lisps have a similar replace function.

  Replace[{f[{x, "[", y, "]"}],
           f@f[{x, "[", y, y2, "]"}]},
          f[{x_, "[", y__, "]"}] :> x[y],
          Infinity]

and a return value of {x[y], f[x[y, y2]]}

It replaces all instances of f[{x_, "[", y__, "]"}] in args where x_ represents a single variable and y__ represents one or more variables.

In Lisp the function and replacement would probably be the equivalent (forgive me, I am not the best with Lisp). I'm looking for a function of the form (replace list search replace).

   (replace '((f (x "[" y "]"))
              (f (f (x "[" y y2 "]"))))
            '(f (x_ "[" y__ "]"))
            '(x y))

and get a return value of ((x y) (f (x y y2))).
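As a hedged sketch in Python rather than Lisp (the representation is mine: an expression like f[{x, "[", y, "]"}] becomes the nested list ["f", ["x", "[", "y", "]"]]), one way to get this behavior is a bottom-up rewriter:

```python
# Hedged sketch (names and representation mine): mimic the rule
# f[{x_, "[", y__, "]"}] :> x[y] on nested Python lists, rewriting
# children before parents so nested matches are handled.

def rewrite(expr):
    """Recursively replace ["f", [x, "[", *ys, "]"]] with [x, *ys]."""
    if not isinstance(expr, list):
        return expr
    expr = [rewrite(e) for e in expr]          # rewrite children first
    if (len(expr) == 2 and expr[0] == "f"
            and isinstance(expr[1], list) and len(expr[1]) >= 4
            and expr[1][1] == "[" and expr[1][-1] == "]"):
        inner = expr[1]
        return [inner[0], *inner[2:-1]]        # x applied to y...
    return expr

args = [["f", ["x", "[", "y", "]"]],
        ["f", ["f", ["x", "[", "y", "y2", "]"]]]]
print([rewrite(a) for a in args])
```

Running it on the two sample expressions yields [['x', 'y'], ['f', ['x', 'y', 'y2']]], matching the expected {x[y], f[x[y, y2]]}.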

by William at October 09, 2015 05:59 AM


Squared Hellinger distance between Binomial(n,p) and Binomial(n+1,p)

I came across a claim in a paper Trace reconstruction revisited (in Lemma 11), which is basically that the squared Hellinger distance between Binomial$(n,p)$ and Binomial$(n+1,p)$ grows as $O(1/n)$. $p$ is a small positive number less than $\frac{1}{2}$.

I've been trying to prove it but have not been able to see this. Is there an easy way to see this?

by Devil at October 09, 2015 05:51 AM


String having r & g separated by 5 characters! (Error: String index out of range)

Given an input string, check whether the string has the characters 'r' and 'g' separated by exactly 5 characters. For the following code, the error is a String index out of range error. I can't figure out what's wrong.

My code for class having function that checks for pattern:

public class classb {
    String s = new String();   // note: s is never assigned from str below

    public int match(String str) {
        int counter = 0;
        int j = 0;
        while (s.charAt(j) != 'r' || s.charAt(j) != 'g') {
            if ((s.charAt(j) == 'r' && s.charAt(j + 6) == 'g')
                    || (s.charAt(j) == 'g' && s.charAt(j + 6) == 'r')) {
                counter = 1;
            }
            j++;
        }
        return counter;
    }
}


Main class:

import java.util.*;

public class classa {
    public static void main(String[] args) {
        String a = new String();
        int count;
        Scanner sc = new Scanner(System.in);
        System.out.println("Enter a string: ");
        a = sc.nextLine();
        classb x = new classb();
        count = x.match(a);
        if (count == 1)
            System.out.println("Pattern found ");
        else if (count == 0)
            System.out.println("Pattern not found ");
    }
}
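For comparison, here is a bounds-checked sketch of the test the question describes (the class and method names are mine, not from the original code). Since 'r' and 'g' separated by exactly 5 characters means their indices differ by 6, it only ever looks at index pairs (i, i + 6) that both exist:

```java
// Hedged sketch (names mine): check every pair of positions exactly
// 6 indices apart, never reading past the end of the string.
public class PatternCheck {

    static boolean hasPattern(String s) {
        for (int i = 0; i + 6 < s.length(); i++) {
            char a = s.charAt(i);
            char b = s.charAt(i + 6);
            if ((a == 'r' && b == 'g') || (a == 'g' && b == 'r')) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasPattern("rabcdeg")); // prints true
        System.out.println(hasPattern("rg"));      // prints false
    }
}
```

The loop condition i + 6 < s.length() is what prevents the StringIndexOutOfBoundsException that unchecked calls to charAt(j + 6) can raise.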


by Aishwarya R at October 09, 2015 04:47 AM

What is one way that recursion disambiguates a programming language or process? Or is there no answer to this

For my programming languages class we talk about BNF, CFGs, and ambiguity. We also talk about FPLs (functional programming languages), which use a lot of recursion. I need one or more reasons why recursion is a tool that disambiguates a programming language or a programming process.

by Daniel Dil at October 09, 2015 04:08 AM



Daniel Lemire

Predicting the near future is a crazy, impossible game

Back in 1903, the Wright brothers flew for the first time, 20 feet above ground, for 12 seconds. Hardly anyone showed up. The event went largely unnoticed. It was not reported in the press. The Wright brothers did not become famous until many years later. Yet a decade later, in 1914, we had war planes used for reconnaissance and dropping (ineffective) bombs. It was not long before we had dogfighting above the battleground.

Lord Kelvin, one of the most reputed and respected scientists of the time, wrote in 1902 that “No balloon and no aeroplane will ever be practically successful.”

If we could not see ten years in the future, back in 1903, what makes us think that we can see ten or twenty years in the future in 2015?

by Daniel Lemire at October 09, 2015 03:30 AM


How does the marking algorithm work for a dominating set?

I am trying to figure out how the marking algorithm works for computing a dominating set.


The algorithm says:

  1. Each node u compiles the set of neighbors N(u)
  2. Each node u transmits N(u), and receives N(v) from all its neighbors.
  3. If node u has two neighbors v,w and w is not in N(v) (and since the graph is undirected v is not in N(w)), then u marks itself being in the set CDS.

I am assuming N(u) is the set of neighbors of node u and N(v) is the message that u receives from v. I didn't quite understand the condition "w is not in N(v)".
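As a hedged illustration (the representation and names are mine), the marking rule in step 3 can be sketched on an adjacency-set view of an undirected graph. A node u marks itself when it has two neighbors v, w with w not in N(v), i.e. when u connects two of its own neighbors that are not adjacent to each other:

```python
# Hedged sketch (names mine): each node u checks all pairs of its
# neighbors; if some pair v, w is not directly connected (w not in N(v)),
# then u is needed to connect them and marks itself for the CDS.

def marked_nodes(adj):
    """adj maps each node to the set of its neighbors N(u)."""
    cds = set()
    for u, neighbors in adj.items():
        for v in neighbors:
            for w in neighbors:
                if v != w and w not in adj[v]:   # "w is not in N(v)"
                    cds.add(u)
    return cds

# Path a - b - c: only b has two mutually non-adjacent neighbors.
print(marked_nodes({"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}))  # {'b'}
```

On a complete graph no node marks itself, since every pair of neighbors is already adjacent.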

by hamza at October 09, 2015 03:03 AM



What does the ':' mean after a register in a register operation?

For example,

R1 <- R1 + R2:2

What does the colon in R2:2 mean?

submitted by SH4NKZ
[link] [comment]

October 09, 2015 02:59 AM


Demo: A fully open source flow for iCE40 FPGAs (using Project IceStorm)

An FPGA development workflow for FPGA programming without any dirty proprietary software has been a long-sought-after goal for lots of us free-software hackers. Clifford Wolf has finally made it happen, just in the last few months! His Project IceStorm uses his reverse-engineering of the Lattice FPGA bitstream format to process the output from arachne-pnr into a valid bitstream, then program that bitstream onto the device.

Apparently he had to write his own Verilog synthesis tools too; dunno if Icarus Verilog wasn’t good enough, or if he just did that for fun.


by kragen at October 09, 2015 02:56 AM



Delay Tolerant Networking

The report is kind of a survey of different protocols that can be used for an interplanetary internet. They performed simulated tests, which was a little disappointing: I was hoping for real transmissions between satellites, etc. The language is easy to understand, if a bit dry. As a non sequitur, LTP to a person with a life-sciences background is Long Term Potentiation, which was very fashionable to study at one point as a possible component of memory.


by kghose at October 09, 2015 02:26 AM

New Horizons Finds Blue Skies and Water Ice on Pluto

This is a bit off-topic for but water on Pluto with some hints of a past thaw are kind of exciting. We should have a post series on about the interplanetary internet and the deep space comm network.


by kghose at October 09, 2015 02:17 AM

Planet Theory

TR15-162 | Graph Isomorphism and Circuit Size | Eric Allender, Joshua Grochow, Cris Moore

We show that the Graph Automorphism problem is ZPP-reducible to MKTP, the problem of minimizing time-bounded Kolmogorov complexity. MKTP has previously been studied in connection with the Minimum Circuit Size Problem (MCSP) and is often viewed as essentially a different encoding of MCSP. All prior reductions to MCSP have applied equally well to MKTP, and vice-versa, and all such reductions have relied on the fact that functions computable in polynomial time can be inverted with high probability relative to MCSP and MKTP. Our reduction uses a different approach, and consequently yields the first example of a problem in ZPP$^{\rm MKTP}$ that is not known to lie in NP $\cap$ coNP. We also show that this approach can be used to provide a reduction of the Graph Isomorphism problem to MKTP.

October 09, 2015 02:02 AM


why are folktale and ramda so different? [on hold]

I'm learning javascript FP by reading DrBoolean's book.

I searched around for functional programming library. I found Ramda and Folktale. Both claim to be functional programming library.

But they are so different:

  • Ramda seems to contain utility functions for dealing with list: map, reduce, filter and pure functions: curry, compose. It doesn't contain anything to deal with monad, functor.

  • Folktale however doesn't contain any utilities for lists or functions. It seems to implement some algebraic structures in javascript, like monads: Maybe, Task...

Actually I found more libraries; they all seem to fall into the two categories. underscore and lodash are very like Ramda. Fantasy-land and pointfree-fantasy are like Folktale.

Can these very different libraries both be called functional, and if so, what makes each one a functional library?
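As a hedged illustration of the split (plain JavaScript, none of the actual Ramda or Folktale code): the first style composes pure utility functions, while the second packages a control structure, here a tiny hand-rolled Maybe, as an object you map over:

```javascript
// Hedged sketch (no real library code): the two styles side by side.

// Ramda-style: pure utility functions combined with compose.
const compose = (f, g) => x => f(g(x));
const double = x => x * 2;
const inc = x => x + 1;
const doubleThenInc = compose(inc, double);
console.log(doubleThenInc(3)); // prints 7

// Folktale-style: an algebraic structure (a minimal Maybe) that
// sequences computations which may produce no value.
const Just = x => ({ map: f => Just(f(x)), getOrElse: () => x });
const Nothing = { map: _ => Nothing, getOrElse: d => d };
const safeHead = xs => (xs.length ? Just(xs[0]) : Nothing);
console.log(safeHead([3]).map(double).getOrElse(0)); // prints 6
console.log(safeHead([]).map(double).getOrElse(0));  // prints 0
```

Both are "functional" in the sense of building programs from pure functions; they simply operate at different levels, utilities over plain data versus implementations of algebraic structures.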

by Aaron Shen at October 09, 2015 01:58 AM

hubertf's NetBSD blog

NetBSD 7.0 is released

The NetBSD Project is pleased to announce NetBSD 7.0, the fifteenth major release of the NetBSD operating system. The last release was quite some time ago, but given a philosophy of "it is ready when it is ready" and the amount of work that went into it, this is a very good reason to celebrate!

See the official release announcement for a long list of highlights and harder-to-notice changes.

Also, there is an article on What to expect in NetBSD 7 which also describes many of the new features, changes and bugfixes in NetBSD 7.

October 09, 2015 01:54 AM


Scala trait that accepts generic n-size tuple and returns m-sized tuple

I'm currently developing a Spark application that involves a series of complex joins between tuples, building out an N-sized tuple based on the target data set.

I come from a Java world, and have been resolving specific fields from each element of a tuple into a new object. I know there has to be a better way to do this functionally.


val obj1Obj2: (Int, (Object1, Object2)) = object1.join(object2)
val obj3Resolve = obj1Obj2
  .map(a => a match { case (k, v) => v })
  .map(a => (a._2.key, new Object3(a._2.key, ...)))

What I would like to do is to have a generic trait that I extend for each specific target object, taking in an arbitrary tuple and returning an arbitrary tuple. I've found that the joins themselves are fairly straightforward; it's the intermediate object declarations that bloat the code, as well as the restructuring of the tuples to rekey them for a future join, and I feel this is too "Java"-like.

Any advice is much appreciated. I'm developing with Spark, so this might not be the correct sub, but to me this is a Scala problem.


Edit: the example is obviously generalized; the actual code is much more complex

submitted by thabarrtender
[link] [5 comments]

October 09, 2015 01:51 AM


I came across a 2-bit paper processor, and now want to build a physical one. How do I do this?

I do math, but have some CS knowledge (computability theory, programming, logic, ...). I read this article on simulating a 2-bit processor on paper, and now I am really excited.

I would like to go all engineer and learn a bit about processors. My end goal is building a physical 2-bit computer as a project. Could you recommend books/articles that would teach me how to?

I just want to understand how this works, and build a working physical processor. So no over-specialized books, plz. :)

submitted by nalkinwor88
[link] [comment]

October 09, 2015 01:50 AM

arXiv Cryptography and Security

Secure State Estimation against Sensor Attacks in the Presence of Noise. (arXiv:1510.02462v1 [math.OC])

We consider the problem of estimating the state of a noisy linear dynamical system when an unknown subset of sensors is arbitrarily corrupted by an adversary. We propose a secure state estimation algorithm, and derive (optimal) bounds on the achievable state estimation error given an upper bound on the number of attacked sensors. The proposed state estimator involves Kalman filters operating over subsets of sensors to search for a sensor subset which is reliable for state estimation. To further improve the subset search time, we propose Satisfiability Modulo Theory based techniques to exploit the combinatorial nature of searching over sensor subsets. Finally, as a result of independent interest, we give a coding theoretic view of attack detection and state estimation against sensor attacks in a noiseless dynamical system.

by Shaunak Mishra, Yasser Shoukry, Nikhil Karamchandani, Suhas Diggavi, Paulo Tabuada at October 09, 2015 01:30 AM

1-Sperner hypergraphs. (arXiv:1510.02438v1 [math.CO])

We introduce a new class of hypergraphs, the class of $1$-Sperner hypergraphs. A hypergraph ${\cal H}$ is said to be $1$-Sperner if every two distinct hyperedges $e,f$ of ${\cal H}$ satisfy $\min\{|e\setminus f|,|f\setminus e|\} = 1$. We prove a decomposition theorem for $1$-Sperner hypergraphs and examine several of its consequences, including bounds on the size of $1$-Sperner hypergraphs and a new, constructive proof of the fact that every $1$-Sperner hypergraph is threshold. We also show that within the class of normal Sperner hypergraphs, the (generally properly nested) classes of $1$-Sperner hypergraphs, of threshold hypergraphs, and of $2$-asummable hypergraphs coincide. This yields new characterizations of the class of threshold graphs.

by Endre Boros, Vladimir Gurvich, Martin Milanič at October 09, 2015 01:30 AM

Exact Inference Techniques for the Dynamic Analysis of Attack Graphs. (arXiv:1510.02427v1 [cs.CR])

Attack graphs are a powerful tool for security risk assessment by analysing network vulnerabilities and the paths attackers can use to compromise valuable network resources. The uncertainty about the attacker's behaviour and capabilities make Bayesian networks suitable to model attack graphs to perform static and dynamic analysis. Previous approaches have focused on the formalization of traditional attack graphs into a Bayesian model rather than proposing mechanisms for their analysis. In this paper we propose to use efficient algorithms to make exact inference in Bayesian attack graphs, enabling the static and dynamic network risk assessments. To support the validity of our proposed approach we have performed an extensive experimental evaluation on synthetic Bayesian attack graphs with different topologies, showing the computational advantages in terms of time and memory use of the proposed techniques when compared to existing approaches.

by Luis Muñoz-González, Daniele Sgandurra, Martín Barrère, Emil Lupu at October 09, 2015 01:30 AM

Effect-Dependent Transformations for Concurrent Programs. (arXiv:1510.02419v1 [cs.PL])

We describe a denotational semantics for an abstract effect system for a higher-order, shared-variable concurrent programming language. We prove the soundness of a number of general effect-based program equivalences, including a parallelization equation that specifies sufficient conditions for replacing sequential composition with parallel composition. Effect annotations are relative to abstract locations specified by contracts rather than physical footprints allowing us in particular to show the soundness of some transformations involving fine-grained concurrent data structures, such as Michael-Scott queues, that allow concurrent access to different parts of mutable data structures.

Our semantics is based on refining a trace-based semantics for first-order programs due to Brookes. By moving from concrete to abstract locations, and adding type refinements that capture the possible side-effects of both expressions and their concurrent environments, we are able to validate many equivalences that do not hold in an unrefined model. The meanings of types are expressed using a game-based logical relation over sets of traces. Two programs $e_1$ and $e_2$ are logically related if one is able to solve a two-player game: for any trace with result value $v_1$ in the semantics of $e_1$ (challenge) that the player presents, the opponent can present an (response) equivalent trace in the semantics of $e_2$ with a logically related result value $v_2$.

by Nick Benton, Martin Hofmann, Vivek Nigam at October 09, 2015 01:30 AM

Literature Review Of Attribute Level And Structure Level Data Linkage Techniques. (arXiv:1510.02395v1 [cs.DB])

Data Linkage is an important step that can provide valuable insights for evidence-based decision making, especially for crucial events. Performing sensible queries across heterogeneous databases containing millions of records is a complex task that requires a complete understanding of each contributing databases schema to define the structure of its information. The key aim is to approximate the structure and content of the induced data into a concise synopsis in order to extract and link meaningful data-driven facts. We identify such problems as four major research issues in Data Linkage: associated costs in pair-wise matching, record matching overheads, semantic flow of information restrictions, and single order classification limitations. In this paper, we give a literature review of research in Data Linkage. The purpose for this review is to establish a basic understanding of Data Linkage, and to discuss the background in the Data Linkage research domain. Particularly, we focus on the literature related to the recent advancements in Approximate Matching algorithms at Attribute Level and Structure Level. Their efficiency, functionality and limitations are critically analysed and open-ended problems have been exposed.

by Mohammed Gollapalli at October 09, 2015 01:30 AM

Security-aware selection of Web Services for Reliable Composition. (arXiv:1510.02391v1 [cs.DC])

Dependability is an important characteristic that a trustworthy computer system should have. It is a measure of Availability, Reliability, Maintainability, Safety and Security. The focus of our research is on security of web services. Web services enable the composition of independent services with complementary functionalities to produce value-added services, which allows organizations to implement their core business only and outsource other service components over the Internet, either pre-selected or on-the-fly. The selected third party web services may have security vulnerabilities. Vulnerable web services are of limited practical use. We propose to use an intrusion-tolerant composite web service for each functionality that should be fulfilled by a third party web service. The third party services employed in this approach should be selected based on their security vulnerabilities in addition to their performance. The security vulnerabilities of the third party services are assessed using a penetration testing tool. In this paper we present our preliminary research work.

by Shahedeh Khani, Cristina Gacek, Peter Popov at October 09, 2015 01:30 AM

Aperiodic Subshifts on Polycyclic Groups. (arXiv:1510.02360v1 [cs.DM])

We prove that every polycyclic group of nonlinear growth admits a strongly aperiodic SFT and has an undecidable domino problem. This answers a question of [4] and generalizes the result of [2].

by Emmanuel Jeandel (CARTE) at October 09, 2015 01:30 AM

On the structure of (banner, odd hole)-free graphs. (arXiv:1510.02324v1 [math.CO])

A hole is a chordless cycle with at least four vertices. A hole is odd if it has an odd number of vertices. A banner is a graph which consists of a hole on four vertices and a single vertex with precisely one neighbor on the hole. We prove that a (banner, odd hole)-free graph is either perfect, or does not contain a stable set on three vertices, or contains a homogeneous set. Using this structure result, we design a polynomial-time algorithm for recognizing (banner, odd hole)-free graphs. We also design polynomial-time algorithms to find, for such a graph, a minimum coloring and largest stable set.

by Chính T. Hoàng at October 09, 2015 01:30 AM

Data Transmission with Reduced Delay for Distributed Acoustic Sensors. (arXiv:1510.02259v1 [cs.DC])

This paper proposes a channel access control scheme fit to dense acoustic sensor nodes in a sensor network. In the considered scenario, multiple acoustic sensor nodes within communication range of a cluster head are grouped into clusters. Acoustic sensor nodes in a cluster detect acoustic signals and convert them into electric signals (packets). Detection by acoustic sensors can be executed periodically or randomly and random detection by acoustic sensors is event driven. As a result, each acoustic sensor generates their packets (50bytes each) periodically or randomly over short time intervals (400ms~4seconds) and transmits directly to a cluster head (coordinator node). Our approach proposes to use a slotted carrier sense multiple access. All acoustic sensor nodes in a cluster are allocated to time slots and the number of allocated sensor nodes to each time slot is uniform. All sensor nodes allocated to a time slot listen for packet transmission from the beginning of the time slot for a duration proportional to their priority. The first node that detect the channel to be free for its whole window is allowed to transmit. The order of packet transmissions with the acoustic sensor nodes in the time slot is autonomously adjusted according to the history of packet transmissions in the time slot. In simulations, performances of the proposed scheme are demonstrated by the comparisons with other low rate wireless channel access schemes.

by <a href="">Hyun-Gyu Ryu</a>, <a href="">Sang-Keum Lee</a>, <a href="">Dongsoo Har</a> at October 09, 2015 01:30 AM

On the Maximal Shortest Path in a Connected Component in V2V. (arXiv:1510.02238v1 [cs.PF])

In this work, a VANET (Vehicular Ad-hoc NETwork) is considered to operate on a simple lane, without infrastructure. The arrivals of vehicles are assumed to be general with any traffic and speed assumptions. The vehicles communicate through the shortest path. In this paper, we study the probability distribution of the number of hops on the maximal shortest path in a connected component of vehicles. The general formulation is given for any assumption of road traffic. Then, it is applied to calculate the z-transform of this distribution for medium and dense networks in the Poisson case. Our model is validated with the Madrid road traces of the Universitat Politècnica de Catalunya. These results may be useful, for example, when evaluating diffusion protocols through the shortest path in a VANET, where not only the mean but also the other moments are needed to derive accurate results.

by <a href="">Michel Marot</a>, <a href="">Adel Mounir Sa&#xef;d</a>, <a href="">Hossam Afifi</a> at October 09, 2015 01:30 AM

Explicit Parallel-in-time Integration of a Linear Acoustic-Advection System. (arXiv:1510.02237v1 [cs.CE])

The applicability of the Parareal parallel-in-time integration scheme for the solution of a linear, two-dimensional hyperbolic acoustic-advection system, which is often used as a test case for integration schemes for numerical weather prediction (NWP), is addressed. Parallel-in-time schemes are a possible way to increase, on the algorithmic level, the amount of parallelism, a requirement arising from the rapidly growing number of CPUs in high performance computer systems. A recently introduced modification of the "parallel implicit time-integration algorithm" could successfully solve hyperbolic problems arising in structural dynamics. It has later been cast into the framework of Parareal. The present paper adapts this modified Parareal and employs it for the solution of a hyperbolic flow problem, where the initial value problem solved in parallel arises from the spatial discretization of a partial differential equation by a finite difference method. It is demonstrated that the modified Parareal is stable and can produce reasonably accurate solutions while allowing for a noticeable reduction of the time-to-solution. The implementation relies on integration schemes already widely used in NWP (RK-3, partially split forward Euler, forward-backward). It is demonstrated that using an explicit partially split scheme for the coarse integrator makes it possible to avoid an implicit scheme while still achieving speedup.

by <a href="">Daniel Ruprecht</a>, <a href="">Rolf Krause</a> at October 09, 2015 01:30 AM

Combining behavioural types with security analysis. (arXiv:1510.02229v1 [cs.PL])

Today's software systems are highly distributed and interconnected, and they increasingly rely on communication to achieve their goals; due to their societal importance, security and trustworthiness are crucial aspects for the correctness of these systems. Behavioural types, which extend data types by describing also the structured behaviour of programs, are a widely studied approach to the enforcement of correctness properties in communicating systems. This paper offers a unified overview of proposals based on behavioural types which are aimed at the analysis of security properties.

by <a href="">Massimo Bartoletti</a>, <a href="">Ilaria Castellani</a>, <a href="">Pierre-Malo Deniélou</a>, <a href="">Mariangiola Dezani-Ciancaglini</a>, <a href="">Silvia Ghilezan</a>, <a href="">Jovanka Pantovic</a>, <a href="">Peter Thiemann</a>, <a href="">Bernardo Toninho</a>, <a href="">Hugo Torres Vieira</a> at October 09, 2015 01:30 AM

Initial Service Provider DevOps concept, capabilities and proposed tools. (arXiv:1510.02220v1 [cs.NI])

This report presents a first sketch of the Service Provider DevOps concept, including four major management processes to support the roles of both service and VNF developers as well as the operator in a more agile manner. The sketch is based on lessons learned from a study of management and operational practices in the industry and recent related work with respect to management of SDN and cloud. Finally, the report identifies requirements for realizing SP-DevOps within a combined cloud and transport network environment as outlined by the UNIFY NFV architecture.

by <a href="">Wolfgang John</a>, <a href="">Catalin Meirosu</a>, <a href="">Pontus Sköldström</a>, <a href="">Felician Nemeth</a>, <a href="">Andras Gulyas</a>, <a href="">Mario Kind</a>, <a href="">Sachin Sharma</a>, <a href="">Ioanna Papafili</a>, <a href="">George Agapiou</a>, <a href="">Guido Marchetto</a>, <a href="">Riccardo Sisto</a>, <a href="">Rebecca Steinert</a>, <a href="">Per Kreuger</a>, <a href="">Henrik Abrahamsson</a>, <a href="">Antonio Manzalini</a>, <a href="">Nadi Sarrar</a> at October 09, 2015 01:30 AM

On Summarizing Graph Streams. (arXiv:1510.02219v1 [cs.DB])

Graph streams, which refer to the graph with edges being updated sequentially in a form of a stream, have wide applications such as cyber security, social networks and transportation networks. This paper studies the problem of summarizing graph streams. Specifically, given a graph stream G, directed or undirected, the objective is to summarize G as S with much smaller (sublinear) space, linear construction time and constant maintenance cost for each edge update, such that S allows many queries over G to be approximately conducted efficiently. Due to the sheer volume and highly dynamic nature of graph streams, summarizing them remains a notoriously hard, if not impossible, problem. The widely used practice of summarizing data streams is to treat each element independently by e.g., hash- or sampling-based method, without keeping track of the connections between elements in a data stream, which gives these summaries limited power in supporting complicated queries over graph streams. This paper discusses a fundamentally different philosophy for summarizing graph streams. We present gLava, a probabilistic graph model that, instead of treating an edge (a stream element) as the operating unit, uses the finer grained node in an element. This will naturally form a new graph sketch where edges capture the connections inside elements, and nodes maintain relationships across elements. We discuss a wide range of supported graph queries and establish theoretical error bounds for basic queries.

by <a href="">Nan Tang</a>, <a href="">Qing Chen</a>, <a href="">Prasenjit Mitra</a> at October 09, 2015 01:30 AM

The F-snapshot Problem. (arXiv:1510.02211v1 [cs.DC])

Aguilera, Gafni and Lamport introduced the signaling problem in [5]. In this problem, two processes numbered 0 and 1 can call two procedures: update and Fscan. A parameter of the problem is a two-variable function $F(x_0,x_1)$. Each process $p_i$ can assign values to variable $x_i$ by calling update(v) with some data value v, and compute the value $F(x_0,x_1)$ by executing an Fscan procedure. The problem is interesting when the domain of $F$ is infinite and the range of $F$ is finite. In this case, some "access restrictions" are imposed that limit the size of the registers that the Fscan procedure can access. Aguilera et al. provided a non-blocking solution and asked whether a wait-free solution exists. A positive answer can be found in [7].

The natural generalization of the two-process signaling problem to an arbitrary number of processes turns out to yield an interesting generalization of the fundamental snapshot problem, which we call the F-snapshot problem. In this problem $n$ processes can write values to an $n$-segment array (each process to its own segment), and can read and obtain the value of an n-variable function $F$ on the array of segments. In case that the range of $F$ is finite, it is required that only bounded registers are accessed when the processes apply the function $F$ to the array, although the data values written to the segments may be taken from an infinite set. We provide here an affirmative answer to the question of Aguilera et al. for an arbitrary number of processes. Our solution employs only single-writer atomic registers, and its time complexity is $O(n \log n)$, which is also the time complexity of the fastest snapshot algorithm that uses only single-writer registers.

by <a href="">Gal Amram</a> at October 09, 2015 01:30 AM

Star-Replaced Networks: A Generalised Class of Dual-Port Server-Centric Data Centre Networks. (arXiv:1510.02181v1 [cs.DC])

We propose a new generic construction for the design of dual-port server-centric data centre networks so that: every server-node is adjacent to exactly one switch-node and exactly one server-node; and every switch-node is adjacent only to server-nodes. Our construction facilitates the transformation of well-studied topologies from interconnection networks, along with their networking properties, into viable server-centric data centre network topologies. As an example, we instantiate this construction with generalized hypercubes as the base graphs so as to obtain the data centre networks GQ*. We empirically compare GQ* with well-established architectures, namely FiConn and DPillar, using a comprehensive set of performance metrics: network throughput, mean distance, load balancing capability, scalability, and fault tolerance. We find that GQ* performs much better than FiConn as regards these metrics and that it seems a competitive alternative to DPillar because it offers similar performance with a much lower number of networking components. Further, we present a routing algorithm for GQ* and show it provides excellent performance in terms of average path-lengths as well as high reliability in the presence of a significant number of link failures.

by <a href="">Alejandro Erickson</a>, <a href="">Iain A. Stewart</a>, <a href="">Javier Navaridas</a>, <a href="">Abbas E. Kiasari</a> at October 09, 2015 01:30 AM

Performance Analysis of an Astrophysical Simulation Code on the Intel Xeon Phi Architecture. (arXiv:1510.02163v1 [cs.DC])

We have developed the astrophysical simulation code XFLAT to study neutrino oscillations in supernovae. XFLAT is designed to utilize multiple levels of parallelism through MPI, OpenMP, and SIMD instructions (vectorization). It can run on both CPU and Xeon Phi co-processors based on the Intel Many Integrated Core Architecture (MIC). We analyze the performance of XFLAT on configurations with CPU only, Xeon Phi only and both CPU and Xeon Phi. We also investigate the impact of I/O and the multi-node performance of XFLAT on the Xeon Phi-equipped Stampede supercomputer at the Texas Advanced Computing Center (TACC).

by <a href="">Vahid Noormofidi</a>, <a href="">Susan R. Atlas</a>, <a href="">Huaiyu Duan</a> at October 09, 2015 01:30 AM

A Scheme for Maximal Resource Utilization in Peer-to-Peer Live Streaming. (arXiv:1510.02138v1 [cs.NI])

Peer-to-Peer streaming technology has become one of the major Internet applications as it offers the opportunity of broadcasting high quality video content to a large number of peers with low costs. It is widely accepted that with the efficient utilization of peers' and the server's upload capacities, peers can enjoy watching a high bit rate video with minimal end-to-end delay. In this paper, we present a practical scheduling algorithm that works in the challenging condition where no spare capacity is available, i.e., it maximally utilizes the resources and broadcasts the maximum streaming rate. Each peer contacts only a small number of neighbours in the overlay network and autonomously subscribes to sub-streams according to a budget model in such a way that the number of peers forwarding exactly one sub-stream is maximized. The hop-count delay is also taken into account to construct trees of short depth. Finally, we show through simulation that peers dynamically converge to an efficient overlay structure with a short hop-count delay. Moreover, the proposed scheme performs well in the homogeneous case and outperforms SplitStream in all simulated scenarios.

by <a href="">Bahaa Aldeen Alghazawy</a>, <a href="">Satoshi Fujita</a> at October 09, 2015 01:30 AM

A Remote Procedure Call Approach for Extreme-scale Services. (arXiv:1510.02135v1 [cs.DC])

When working at exascale, the various constraints imposed by the extreme scale of the system bring new challenges for application users and software/middleware developers. In that context, and to provide best performance, resiliency and energy efficiency, software may be provided through a service-oriented approach, adjusting resource utilization to best meet facility and user requirements. Remote procedure call (RPC) is a technique that originally followed a client/server model and allowed local calls to be transparently executed on remote resources. RPC consists of serializing the local function parameters into a memory buffer and sending that buffer to a remote target that in turn deserializes the parameters and executes the corresponding function call, returning the result back to the caller. Building reusable services requires the definition of a communication model to remotely access these services, and for this purpose RPC can serve as a foundation for accessing them. We introduce the necessary building blocks of this ecosystem for software and middleware developers with an RPC framework called Mercury.

by <a href="">Jerome Soumagne</a>, <a href="">Philip H. Carns</a>, <a href="">Dries Kimpe</a>, <a href="">Quincey Koziol</a>, <a href="">Robert B. Ross</a> at October 09, 2015 01:30 AM

Inherent Diversity in Replicated Architectures. (arXiv:1510.02086v1 [cs.DC])

In this paper, we report our ongoing investigations of the inherent non-determinism in contemporary execution environments that can potentially lead to divergence in state of a multi-channel hardware/software system. Our approach involved setting up experiments to study the execution path variability of a simple program by tracing its execution at the kernel level. In the first of the two experiments, we analyzed the execution path by repeated execution of the program. In the second, we executed in parallel two instances of the same program, each pinned to a separate processor core. Our results show that for a program executing in a contemporary hardware/software platform, there is sufficient path non-determinism in kernel space that can potentially lead to diversity in replicated architectures. We believe the execution non-determinism can impact the activation of residual systematic faults in software. If this is true, then the inherent diversity can be used together with architectural means to protect safety related systems against residual systematic faults in the operating systems.

by <a href="">Peter Okech</a>, <a href="">Nicholas Mc Guire</a>, <a href="">William Okelo-Odongo</a> at October 09, 2015 01:30 AM


CompSci Weekend SuperThread (October 09, 2015)

/r/compsci strives to be the best online community for computer scientists. We moderate posts to keep things on topic.

This Weekend SuperThread provides a discussion area for posts that might be off-topic normally. Anything Goes: post your questions, ideas, requests for help, musings, or whatever comes to mind as comments in this thread.


  • If you're looking to answer questions, sort by new comments.
  • If you're looking for answers, sort by top comment.
  • Upvote a question you've answered for visibility.
  • Downvoting is discouraged. Save it for discourteous content only.


  • It's not truly "Anything Goes". Please follow Reddiquette and use common sense.
  • Homework help questions are discouraged.
submitted by AutoModerator
[link] [comment]

October 09, 2015 01:03 AM

Planet Theory

Small-Area Orthogonal Drawings of 3-Connected Graphs

Authors: Therese Biedl, Jens M. Schmidt
Download: PDF
Abstract: It is well-known that every graph with maximum degree 4 has an orthogonal drawing with area at most $\frac{49}{64} n^2+O(n) \approx 0.76n^2$. In this paper, we show that if the graph is 3-connected, then the area can be reduced even further to $\frac{9}{16}n^2+O(n) \approx 0.56n^2$. The drawing uses the 3-canonical order for (not necessarily planar) 3-connected graphs, which is a special Mondshein sequence and can hence be computed in linear time. To our knowledge, this is the first application of a Mondshein sequence in graph drawing.

October 09, 2015 12:41 AM

An Efficient Data Structure for Fast Mining High Utility Itemsets

Authors: Zhi-Hong Deng, Shulei Ma, He Liu
Download: PDF
Abstract: In this paper, we propose a novel data structure called PUN-list, which maintains both the utility information about an itemset and a utility upper bound for facilitating the processing of mining high utility itemsets. Based on PUN-lists, we present a method, called MIP (Mining high utility Itemset using PUN-Lists), for fast mining high utility itemsets. The efficiency of MIP is achieved with three techniques. First, itemsets are represented by a highly condensed data structure, PUN-list, which avoids costly, repeated utility computation. Second, the utility of an itemset can be efficiently calculated by scanning the PUN-list of the itemset, and the PUN-lists of long itemsets can be constructed quickly from the PUN-lists of short itemsets. Third, by employing the utility upper bound lying in the PUN-lists as the pruning strategy, MIP directly discovers high utility itemsets from the search space, called the set-enumeration tree, without generating numerous candidates. Extensive experiments on various synthetic and real datasets show that PUN-list is very effective since MIP is, on average, at least an order of magnitude faster than recently reported algorithms.

October 09, 2015 12:41 AM


Algorithm for grid with obstacles and movement restriction

We are given an $n \times n$ grid with some of the squares darkened.

Our goal is to move from the bottom-left to the top-right corner with the following constraints:

1) We cannot step on a darkened square.

2) Each move must be up or to the right.

3) We cannot move in the same direction consecutively four times.

Design an algorithm that runs in $O(n^2)$ time and outputs the set of all squares that can be reached from the bottom-left corner.

My thoughts: We can convert the grid to a graph in $O(n^2)$: each square (darkened or not) becomes a node, and adjacent squares correspond to edges; each edge is directed either rightwards or upwards. We can then remove the nodes that correspond to darkened squares. At this point, we can BFS or DFS from the bottom-left node, but we must keep track of condition 3 somehow, with some sort of "cost" array.
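
To make the state-tracking idea concrete, here is a BFS sketch in Python; the state encoding (square, last direction, current run length) is my own illustration, not from the post. Each square carries at most seven states, so the whole search remains $O(n^2)$:

```python
from collections import deque

def reachable(grid):
    """grid[r][c] is True if the square is darkened.
    Returns the set of squares reachable from the bottom-left
    corner (row n-1, col 0), moving only up or right, never
    stepping on a darkened square, and never moving in the same
    direction four times in a row."""
    n = len(grid)
    start = (n - 1, 0)
    if grid[start[0]][start[1]]:
        return set()
    # state: (row, col, last_dir, run_length) with run_length <= 3
    seen = {(start[0], start[1], None, 0)}
    reach = {start}
    q = deque(seen)
    while q:
        r, c, d, run = q.popleft()
        for nd, (nr, nc) in (('U', (r - 1, c)), ('R', (r, c + 1))):
            if not (0 <= nr < n and 0 <= nc < n) or grid[nr][nc]:
                continue                      # off the grid or darkened
            nrun = run + 1 if nd == d else 1  # extend or reset the run
            if nrun > 3:
                continue                      # would be a 4th consecutive move
            state = (nr, nc, nd, nrun)
            if state not in seen:
                seen.add(state)
                reach.add((nr, nc))
                q.append(state)
    return reach
```

On a 5×5 grid whose top four rows are all darkened, for example, square (4,3) is reachable from the bottom-left corner but (4,4) is not, since reaching it would require four consecutive right moves.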

Another idea is to "chop off" (this can be formalized) a triangle from the upper left and lower right, which are "too high" and "too right" respectively. But then, the existence of the darkened squares causes some additional squares to be unreachable.

Any help would be appreciated.

by Pom Pom at October 09, 2015 12:41 AM

Planet Theory

Distributed Estimation of Graph 4-Profiles

Authors: Ethan R. Elenberg, Karthikeyan Shanmugam, Michael Borokhovich, Alexandros G. Dimakis
Download: PDF
Abstract: We present a novel distributed algorithm for counting all four-node induced subgraphs in a big graph. These counts, called the $4$-profile, describe a graph's connectivity properties and have found several uses ranging from bioinformatics to spam detection. We also study the more complicated problem of estimating the local $4$-profiles centered at each vertex of the graph. The local $4$-profile embeds every vertex in an $11$-dimensional space that characterizes the local geometry of its neighborhood: vertices that connect different clusters will have different local $4$-profiles compared to those that are only part of one dense cluster.

Our algorithm is a local, distributed message-passing scheme on the graph and computes all the local $4$-profiles in parallel. We rely on two novel theoretical contributions: we show that local $4$-profiles can be calculated using compressed two-hop information and also establish novel concentration results that show that graphs can be substantially sparsified and still retain good approximation quality for the global $4$-profile.

We empirically evaluate our algorithm using a distributed GraphLab implementation that we scaled up to $640$ cores. We show that our algorithm can compute global and local $4$-profiles of graphs with millions of edges in a few minutes, significantly improving upon the previous state of the art.

October 09, 2015 12:40 AM

A characterization of linearizable instances of the quadratic minimum spanning tree problem

Authors: Ante Ćustić, Abraham P. Punnen
Download: PDF
Abstract: We investigate special cases of the quadratic minimum spanning tree problem (QMSTP) on a graph $G=(V,E)$ that can be solved as a linear minimum spanning tree problem. Characterizations of such problems on graphs with special properties are given. These include complete graphs, complete bipartite graphs, and cacti, among others. Our characterization can be verified in $O(|E|^2)$ time. In the case of complete graphs, and when the cost matrix is given in factored form, we show that our characterization can be verified in $O(|E|)$ time. Related open problems are also indicated.

October 09, 2015 12:40 AM

Structure and automorphisms of primitive coherent configurations

Authors: Xiaorui Sun, John Wilmes
Download: PDF
Abstract: Coherent configurations (CCs) are highly regular colorings of the set of ordered pairs of a "vertex set"; each color represents a "constituent digraph." A CC is primitive (PCC) if all its constituent digraphs are connected.

We address the problem of classifying PCCs with large automorphism groups. This project was started in Babai's 1981 paper in which he showed that only the trivial PCC admits more than $\exp(\tilde{O}(n^{1/2}))$ automorphisms. (Here, $n$ is the number of vertices and the $\tilde{O}$ hides polylogarithmic factors.)

In the present paper we classify all PCCs with more than $\exp(\tilde{O}(n^{1/3}))$ automorphisms, making the first progress on Babai's conjectured classification of all PCCs with more than $\exp(n^{\epsilon})$ automorphisms.

A corollary to Babai's 1981 result solved a then 100-year-old problem on primitive but not doubly transitive permutation groups, giving an $\exp(\tilde{O}(n^{1/2}))$ bound on their order. In a similar vein, our result implies an $\exp(\tilde{O}(n^{1/3}))$ upper bound on the order of such groups, with known exceptions. This improvement of Babai's result was previously known only through the Classification of Finite Simple Groups (Cameron, 1981), while our proof, like Babai's, is elementary and almost purely combinatorial.

Our result also has implications to the complexity of the graph isomorphism problem. PCCs arise naturally as obstacles to combinatorial partitioning approaches to the problem. Our results give an algorithm for deciding isomorphism of PCCs in time $\exp(\tilde{O}(n^{1/3}))$, the first improvement over Babai's $\exp(\tilde{O}(n^{1/2}))$ bound.

October 09, 2015 12:40 AM


EuroBSDCon 2015 OpenBSD Presentations Online

This year's EuroBSDCon in Stockholm, Sweden was a quite successful conference with approximately 250 attendees and a fairly strong showing of OpenBSD developers presenting:

  • Raceless network configuration, Vadim Zhukov (video, slides)
  • Portroach, OpenBSD distfile scanner, Jasper Lievisse Adriaanse (video, slides)
  • softraid(4) boot, Stefan Sperling (video, paper)
  • Cryptography in OpenBSD: Another Overview, Ted Unangst (video, slides)
  • Faster and more secure packages in OpenBSD, Marc Espie (video, slides)
  • mandoc: from scratch to the standard BSD documentation toolkit in 6 years, Ingo Schwarze (video, slides)
  • Why OpenBSD matters in the Healthcare Industry, Brandon Mercer (video, slides)
  • OpenBSD sucks, Henning Brauer (video)
  • config – Rethinking kernel build, Masao Uebayashi (video)

October 09, 2015 12:39 AM

HN Daily

October 08, 2015


Looking for reference proving polynomial-time bounds for A* search under specific conditions

In the textbook "Artificial Intelligence - A Modern Approach" (Russell, Norvig), it mentions that a sufficient criterion for the A* search algorithm to complete in polynomial time is for the heuristic estimate h(n) and the actual cost h*(n) to never differ by more than O(log(h*(n))), i.e.:

|h(n) - h*(n)| <= O(log(h*(n)))

The textbook mentions this result without proof. I am looking for the proof. Does anyone know where I can find this result proven?

Edit: I discovered that the proof is most likely available in Judea Pearl's textbook "Heuristics: Intelligent Search Strategies for Computer Problem Solving" (I should have guessed that this was by Pearl). It appears that the textbook is not available except through paywalls.

I do not want anyone to violate copyright laws and post the content, but if someone can validate that the proof is in this text, I will happily go out and buy the textbook (or get it from the university library).

by Bill Province at October 08, 2015 11:55 PM


How do you maintain local changes to your FreeBSD system?

Since I started using FreeBSD my system has been modified quite a bit. I added some periodic scripts and modified preexisting ones, added/modified some ports in /usr/ports, created a bunch of scripts that live in /usr/local, etc. What these files have in common is that they're all owned by the superuser and they only exist on this particular system (I have only one FreeBSD box).

I wonder what's the best way to manage these files so I won't lose them in case of disaster, and so migrating or recreating the machine would be easy. One host is not enough, in my opinion, to play with Chef, and modifying sources would probably break compatibility with the binary updates. What are my other options?

submitted by AvidAvider
[link] [14 comments]

October 08, 2015 11:46 PM


Data binding seems to mutate in closures

I'm using immutable variables in my function and pass those immutable variables to an anonymous function, but somehow it seems that one of my variables is getting mutated. Anyone ever experience this? Any help would be appreciated. I know I shouldn't be using null in a Scala app, but for the sake of argument, please bear with me. I've also posted a link to my SO question that includes code and more detail. Thanks!

submitted by samchoii
[link] [1 comment]

October 08, 2015 11:26 PM


Black-Scholes formula with deterministic discrete dividend (Musiela approach)

For deterministic discrete dividend, there are two approach

  • Musiela approach, which works when all dividends are paid at the maturity of the option.
  • Hull approach, which works when each dividend is paid immediately after its ex-dividend date.

I spent a day trying to understand the Musiela approach, but I cannot understand his formula. In his book "Martingale Methods in Financial Modelling" (2nd edition), §3.2.2, his first approach first defines the quantities:

  • Timeline $0 < T_1 < T_2 < \dots < T_m < T$ and dividend cash flows $q_1, q_2, \dots, q_m$

  • Value of all post-$t$ dividends compounded to maturity: $$ I_t = \sum^m_{i=1} q_i e^{r(T-T_i)} \mathbf{1}_{[0,T_i]}(t) $$ Note that $I_t$ decreases in time $t$ and is piecewise constant. At each time $T_i$, $I_t$ drops by $q_i$

  • Value of all pre-$t$ dividends compounded to time $t$: $$ D_t = \sum^m_{i=1} q_i e^{r(t-T_i)} \mathbf{1}_{[T_i,T]}(t) $$ Here, $D_t$ increases in time $t$. At each time $T_i$, $D_t$ jumps up by $q_i$
  • He defines the capital gain process $$ G_t = S_t + D_t $$

Note that $$ D_T=I_0 \hspace{1cm} G_0=S_0 \hspace{1cm} G_T=S_T+D_T=S_T+I_0 $$
Since all jumps in the price process $S_t$ are separated out into $D_t$, he can model $G_t$ by a geometric Brownian motion as usual, i.e. under the risk-neutral measure $$ \frac{dG_t}{G_t} = rdt + \sigma dW_t $$ Now he can give the Black-Scholes formula for a European call option at time zero: $$ C_0 = e^{-rT}\mathbb{E}[(S_T-K)^+] = e^{-rT}\mathbb{E}[(G_T-(K+I_0))^+] $$ Since the modelled process is $G_t$, this price at time $0$ is easily found by the usual Black-Scholes calculation routine: $$ C_0 = S_0 \mathcal{N}(d_+) - e^{-rT}(K+I_0) \mathcal{N}(d_{-}) $$ with $$ d\pm = \frac{ \text{ln}\frac{S_0}{K+I_0} + (r\pm\frac{\sigma^2}{2})T } {\sigma \sqrt{T}} $$ This price formula at time $0$ I can understand.

I then tried to compute the price at an arbitrary time $t$: $$ C_t = e^{-r(T-t)}\mathbb{E}[(S_T-K)^+|\mathcal{F}_t] = e^{-r(T-t)}\mathbb{E}[(G_T-(K+I_0))^+|\mathcal{F}_t] $$ Again, the Black-Scholes calculation routine should give $$ C_t = G_t \mathcal{N}(d_+) - e^{-r(T-t)} (K+I_0) \mathcal{N}(d_{-}) $$ where $d\pm$ should be $$ d\pm = \frac{ \text{ln}\frac{G_t}{K+I_0} + (r\pm\frac{\sigma^2}{2})(T-t) } {\sigma \sqrt{T-t}} $$ But in Musiela's book, he gives a different result without a detailed proof. His result is $$ C_t = S_t \mathcal{N}(\hat{d}_+) - e^{-r(T-t)} (K+I_t) \mathcal{N}(\hat{d}_{-}) $$ with $$ \hat{d}\pm = \frac{ \text{ln}\frac{S_t}{K+I_t} + (r\pm\frac{\sigma^2}{2})(T-t) } {\sigma \sqrt{T-t}} $$ So the annoying differences are

  • He has the strike term $K+I_t$; I have $K+I_0$
  • He has the underlying process $S_t$; I have $G_t$

Can anyone help, please? I've spent too much time on this without success.

PS: one more question; maybe there is something I am missing. He uses $D_t$ to model the dividends, but the result is stated in terms of $I_t$, which seems strange.

by ctNGUYEN at October 08, 2015 11:12 PM


It finally came out who was to blame for the VW emissions ...

It finally came out who was to blame for the VW emissions scandal:
Volkswagen's cheating on emissions with the use of software in diesel cars was not a corporate decision, but something that "individuals did," its U.S. chief executive told lawmakers on Thursday.
Nothing but confused lone perpetrators!1!!

But looked at on a truly fundamental level, he is of course completely right. Who else is supposed to have done it, if not some individuals?

October 08, 2015 11:00 PM


How to calculate a forward-starting swap with forward equations?

I have been trying to solve this problem for some time but I cannot get the correct answer. The problem is the following.

Compute the initial value of a forward-starting swap that begins at $t=1$, with maturity $T=10$ and a fixed rate of 4.5%. (The first payment then takes place at $t=2$ and the final payment takes place at $t=11$, as we are assuming, as usual, that payments take place in arrears.) You should assume a swap notional of 1 million and that you receive floating and pay fixed.

We also know that

  • $r_{0,0}=5\%$
  • $u=1.1$
  • $d=0.9$
  • $q=1−q=1/2$

Using forward equations from $t=1$ to $t=9$, I cannot resolve the problem:

Here is what I have done in Excel, with a final result of -31076, but it is not the correct answer:

(Excel screenshot omitted)
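
As a cross-check, the forward-equation (elementary price) computation can be sketched in Python. The conventions below are my reading of the problem, so treat this as a sketch rather than the official solution: the short rate at lattice node $(t,j)$ is $r_{0,0} u^j d^{t-j}$, and the net payment fixed at $t$ and paid in arrears at $t+1$ is worth $(r-0.045)/(1+r)$ at the node where the rate is set.

```python
def forward_swap_value(r00=0.05, u=1.1, d=0.9, fixed=0.045,
                       first_set=1, last_set=10, notional=1_000_000):
    """Value at t=0 of a forward-starting swap via forward equations.
    The short rate at node (t, j) is r00 * u**j * d**(t - j).
    The floating-minus-fixed payment is set at t and paid at t+1,
    so its value at node (t, j) is (r - fixed) / (1 + r)."""
    rate = lambda t, j: r00 * u**j * d**(t - j)
    # Elementary (Arrow-Debreu) prices, built forward from P[0][0] = 1:
    # P(t+1, j) = q*P(t, j-1)/(1+r(t, j-1)) + (1-q)*P(t, j)/(1+r(t, j))
    P = {0: {0: 1.0}}
    for t in range(last_set):
        P[t + 1] = {}
        for j in range(t + 2):
            p = 0.0
            if j - 1 in P[t]:              # reached by an up-move
                p += 0.5 * P[t][j - 1] / (1 + rate(t, j - 1))
            if j in P[t]:                  # reached by a down-move
                p += 0.5 * P[t][j] / (1 + rate(t, j))
            P[t + 1][j] = p
    # Sum the value of each in-arrears payment over all setting nodes.
    value = 0.0
    for t in range(first_set, last_set + 1):
        for j, p in P[t].items():
            r = rate(t, j)
            value += p * (r - fixed) / (1 + r)
    return notional * value
```

With these conventions the value comes out positive (a few tens of thousands on the 1 million notional), which at least suggests the sign of -31076 is worth re-checking: with $q=1/2$ the expected short rate stays at 5%, above the 4.5% fixed rate, and you receive floating.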

by Katherine99 at October 08, 2015 10:31 PM


The Americans claim that the Russians, with four of their ...

The Americans claim that the Russians, with four of their cruise missiles launched from the Caspian Sea, hit Iran rather than Syria. The Russians are saying nothing about it (yet?).

Update: The Russians know that their word doesn't count for much here, so they are letting Iran issue the denial :-)

"No matter how unpleasant and unexpected for our colleagues in the Pentagon and Langley was yesterday's high-precision strike on Islamic State infrastructure in Syria, the fact remains that all missiles launched from our ships have found their targets," ministry's spokesman Maj. Gen. Igor Konashenkov said.

October 08, 2015 10:00 PM


Optimum Image Zoom Levels, Rotation Angles, and Resize Ratios [on hold]

In Photoshop, images appear sharpest at zoom levels of 25%, 33.33%, 50%, 66.66%, 100%, etc.

1) Can this be extrapolated to smaller increments like fifths and eighths? Wouldn't 12.5% (an eighth of 100%) be crisper than 11.87%? Is there a formula to quantify the results? I want to create a chart like this:

  • 1/2 increments (50, 100) = 100% sharpness
  • 1/4 increments (25, 75) = 100% sharpness
  • 1/3 increments (33.33, 66.66) = 90% sharpness
  • 1/5 increments (20, 40, 60, 80) = 80% sharpness
  • 1/6, 1/8, 1/10, 1/12, 1/14, 1/16
  • ...
  • all remaining increments = 60% sharpness

2) Is this also true for rotating images? Is there a formula to quantify the results? I wish to create another chart with this data, as above.


by skibulk at October 08, 2015 09:15 PM



Remember the anti-vaxxer who offered 100,000 euros ...

Remember the anti-vaxxer who had offered 100,000 euros for proof that the measles virus exists? And then someone sent him the specialist literature and demanded the 100k? The district court at Lake Constance has now issued an enforcement warrant against him: swear the debtor's oath of disclosure or hand over the money. Result: he has reportedly paid.

October 08, 2015 09:01 PM



Using the random forest algorithm to predict vectors [duplicate]

I know this might sound like a newbie question, but bear with me.

I have read a paper where researchers use a random forest to predict species distribution, but in their study, they only predict a new set of points given an old set and a map of environmental variables.

I would like to replicate these results, only using a random forest to predict a new set of vectors given a set of vectors (the seal migrations) and some environmental variables in map form (a scalar field).

My plan would be to prepare the training data as each vector paired with the environmental variables (ocean surface temperature, ocean currents, etc.) at the head and tail of that vector.

Would a random forest be able to, given this type of data, predict a new set of vectors if I feed it a new map of environmental variables? Is this even the right way to go about this?

Thanks so much, I really appreciate anyone's feedback and/or criticism.

by jeshaitan at October 08, 2015 08:19 PM


What does calling map with one argument do in Clojure?

In Clojure, what does calling map with one argument, like this:

(map inc) ;=> #object[clojure.core$map$fn__4549 0x1decdb9d "clojure.core$map$fn__4549@1decdb9d"]

return? Because it doesn't do auto-currying as expected, so the following two expressions are not equivalent:

;; E1
((map inc) [100 200 300]) ;=> #object[clojure.core$map$fn__4549$fn__4550 0x1b0c8974 "clojure.core$map$fn__4549$fn__4550@1b0c8974"]

;; E2
((partial map inc) [100 200 300]) ;=> (101 201 301)

...and the documentation says nothing.

So what IS the mysterious function returned by (map inc) and other similar expressions?

by NeuronQ at October 08, 2015 08:18 PM




Which machine learning algorithm is appropriate for predicting a vector?

I have a very large set of animal migration data, consisting of many series of vectors - each series is basically the path of a single animal. The dataset I'm using consists of 244 of these series.

Blue vectors are between pink points, pink points are snapshots of an individual animal's location

I want to train a predictive model so that when it is given a collection of these series and a map of environmental variables, such as a map of ocean surface temperature, it can output a new collection of these series.

My question is which machine learning algorithm would be optimal for this kind of prediction - I was looking into multinomial logistic regression, but that is for categorical data. I want the algorithm to develop a model that thinks 'at a point with an ocean surface temperature of 20˚C, an animal will travel along a vector with a magnitude of 35km and an angle of 34˚'.

I was also looking into using the random forest algorithm, but that seems to be more for classification than forecasting. Does the algorithm I'm thinking of exist? Thanks so much.

EDIT: My particular question is not how to fit this data with a particular ML method, but whether there is an ML algorithm that can predict a new vector (in polar form) given an old vector and an environmental variable at the tail of that vector.
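This is not the paper's method, but one common way to frame it is as multi-output regression: scikit-learn's RandomForestRegressor accepts a target matrix directly, so a single forest can predict (magnitude, angle) jointly. The features, targets, and data below are entirely synthetic stand-ins; encoding the angle as (sin, cos) avoids the 359°-to-0° wraparound:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 500

# Hypothetical per-step features at the tail of each vector:
# [lat, lon, sea-surface temperature, current speed] -- all synthetic here.
X = rng.normal(size=(n, 4))

# Synthetic targets standing in for the next displacement.
magnitude = 35 + 5 * X[:, 2] + rng.normal(scale=2.0, size=n)
angle = 0.8 * X[:, 3] + rng.normal(scale=0.1, size=n)
y = np.column_stack([magnitude, np.sin(angle), np.cos(angle)])

# A single forest fits all three outputs at once.
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

pred = model.predict(X[:3])                      # rows of (magnitude, sin, cos)
pred_angle = np.arctan2(pred[:, 1], pred[:, 2])  # recover an angle if needed
print(pred.shape)  # (3, 3)
```

To generate a whole path, you would apply the model step by step: predict a vector, move the animal, look up the environmental variables at the new position, and repeat.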

by jeshaitan at October 08, 2015 07:53 PM


How to assign a mathematical expression to an attribute in python?

I want to assign a mathematical expression (or a function) to an attribute in Python. But so far I only know how to assign plain values like ints, doubles, or booleans. The code below is an approximation of what I want.

class MyClass:
    def __init__(self, mathExp):
        self.expression = mathExp   # store the callable, e.g. lambda x: 1 + 2*x

    def evalExp(self, param):
        return self.expression(param)

What I want is mathExp to be something like "1+2x" and when I execute the function evalExp I can enter a parameter like 5 so that the return is something like 1+2*5=11.
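If the expression really arrives as a string like "1+2*x" rather than as a function, one minimal sketch is below (ExprClass is an illustrative name, and eval is only acceptable for trusted input):

```python
class ExprClass:
    def __init__(self, expr_str):
        self.expr_str = expr_str       # e.g. "1 + 2*x"

    def evalExp(self, param):
        # eval with an empty builtins namespace; trusted expressions only
        return eval(self.expr_str, {"__builtins__": {}}, {"x": param})

print(ExprClass("1 + 2*x").evalExp(5))  # 11
```

Passing a callable (e.g. `lambda x: 1 + 2*x`) to the class is the safer design when you control the call site.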

Thanks for your help!

by Aldo Pareja at October 08, 2015 07:48 PM



Heuristics for space-efficient storing of Unordered Finite Sets in a DFA

I've got an algorithm I'm working on that generates, stores, and iterates through a large number of finite sets. I'm finding that memory is a bottleneck long before time is.

The finite sets are subsets of the vertices $V$ in my graph, so each finite set is small, there's just a lot of them.

In an effort to save space, I've started representing the finite sets as binary words of length $|V|$, with a 0 indicating the element is not in the set, a 1 indicating that it is. I'm storing the collection of these words as an acyclic deterministic automaton (also known as DAWG, directed acyclic word graph).

However, this requires a fixed ordering of the potential elements, which is fine, but arbitrary. If a different ordering were more likely to produce a smaller output set, I'd be happy to use it.

I'm wondering:

  • Is there a known, efficient algorithm for finding the permutation which gives the smallest DFA representing a set of finite sets?
  • If not, has any research been done on heuristics for orderings which have been shown to often produce smaller DFAs?
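Not an answer to the question itself, but for small instances the effect of the ordering can be measured directly: since all words have equal length, the minimal DFA's states correspond to the distinct nonempty residual (suffix) languages, which can be enumerated. A brute-force sketch (all names mine; the factorial search is only illustrative, not a practical heuristic):

```python
from itertools import permutations

def dfa_size(words):
    """Count the states of the minimal DFA accepting a set of equal-length
    binary words, by enumerating distinct nonempty residual languages."""
    seen = set()

    def explore(suffixes):
        if suffixes in seen:
            return
        seen.add(suffixes)
        for bit in (0, 1):
            nxt = frozenset(w[1:] for w in suffixes if w and w[0] == bit)
            if nxt:
                explore(nxt)

    explore(frozenset(words))
    return len(seen)  # the rejecting dead state is not counted

def encode(sets, order):
    """Encode each subset of V as a 0/1 word under a fixed element order."""
    return frozenset(tuple(int(v in s) for v in order) for s in sets)

# Brute-force search over orderings for a tiny example family.
V = "abcd"
family = [{"a", "b"}, {"a", "c"}, {"b", "d"}, {"c", "d"}]
sizes = {order: dfa_size(encode(family, order)) for order in permutations(V)}
best = min(sizes.values())
print(best, max(sizes.values()))
```

Comparing `min` and `max` over permutations on real instances would at least quantify how much there is to gain from a better ordering.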

by jmite at October 08, 2015 07:14 PM


Does it matter who begins communication in $IP(f(x))$?

Consider $IP(f(x))$, in other words, the class of languages that admit a private coin protocol $(P, V)$ running in $f(x)$ rounds (often in terms of the size of $x$) that satisfies the following conditions:

  • If $x \in L$, then $\Pr[\langle P, V\rangle(x)=1]=1$
  • If $x \notin L$, then for any possible $P^*$ (provers that seem to follow the specified protocol but actually return the wrong values) we have $\Pr[\langle P^*, V\rangle(x)=1]\leq\frac{1}{2}$

Here $\langle P, V\rangle(x)$ is the function returning $1$ if the verifier $V$ thinks that $x \in L$ (the prover $P$ is telling the truth) and $0$ if $V$ thinks that $x \notin L$ ($P$ is lying). Note that we must assume $V$ uses randomness; otherwise the class is uninteresting, since one round is always sufficient: the prover can simply send the entire history of the conversation that is about to occur in a single message. And by private coin I mean that the randomness generated by the verifier cannot be seen by the prover.

Clearly, $P$ will always send the last message: if $V$ sent the last message, it would not be gaining any new information it needed before returning $\langle P, V\rangle(x)$. Also, because $IP$ is the union over all $IP(f(x))$, including both the case where $P$ starts and the case where $V$ starts, it doesn't matter who starts for $IP$ itself.

However, given some function like $f(x)=|x|$, does it matter who begins? I'm not sure this is the case, but if $IP(O(f(x)))$ has the same power regardless of who starts, then I am interested in not using big-oh notation and instead using functions specified directly.

by Danielle Ensign at October 08, 2015 07:10 PM


How to record tick data from Google/Yahoo Finance data streams?

Is there any way to record, or piggyback on, Google Finance's or Yahoo Finance's data stream with an app, code, or Excel?

Ideally, I need tick by tick data, as in every price change of the day.

All the downloaders and codes I found can only get 1-minute intervals in the format of OHLC. I have not found any code snippet for tick by tick or every price change when denoting the interval change for downloading EOD data.

The closest thing I can come up with is: and the idea of Live, streaming web import capabilities in Excel. However I am not sure whether this just queries Google in frequent time periods or whether it would reflect what you see with the real time ticker changes. In fact, going by tick change would give you different EOD results than if you queried every 1m or so.

For example, using an index price like the Dow if the price does not change for three minutes your query would return that figure three times leading you to believe it transacted more than it did at the end of the day.

Is it possible to get tick data from these services?
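Whatever the data source, a polling loop can at least avoid recording repeats of an unchanged price, which mitigates (but cannot fix) the repeated-quote problem described above; anything that happens between two polls is still lost. A sketch, where `fetch_quote` is a stand-in for whatever quote source you end up using:

```python
import time

def record_ticks(fetch_quote, out, interval=1.0, n_polls=10):
    """Poll a quote source and keep only *changes* as pseudo-ticks.
    `fetch_quote` is a stand-in for whatever data source you use;
    polling can never recover trades that occur between polls."""
    last = None
    for _ in range(n_polls):
        price = fetch_quote()
        if price != last:               # drop repeats of an unchanged price
            out.append((time.time(), price))
            last = price
        time.sleep(interval)

# Demo with a fake, pre-scripted quote source.
fake = iter([10, 10, 11, 11, 12])
ticks = []
record_ticks(lambda: next(fake), ticks, interval=0, n_polls=5)
print([p for _, p in ticks])  # [10, 11, 12]
```

For true tick-by-tick data you generally need a feed that pushes every trade (an exchange or broker API), not a quote page you poll.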

by user16803 at October 08, 2015 07:07 PM

DragonFly BSD Digest

BSDNow 110: Firmware Fights

BSDNow 110 is now available.  It’s back to the text summary format, so I can tell you easily that it includes an interview with Benno Rice, about Isilon and their interactions with FreeBSD.

by Justin Sherrill at October 08, 2015 07:04 PM



On definition of IP class

I'm a little bit lost with the actual definition of IP. Some sources define it as an interaction between algorithms starting with the Verifier, while others do not put any restriction on who sends the first message. This simple question turns messy later on, when taking into account the number of messages in the definition of IP(k). Does it matter who starts the interaction?

by Yair at October 08, 2015 06:39 PM


Where can I find a database of ALL ETFs, sorted by age?

I have a portfolio allocation strategy I want to backtest, but I need a large "universe" of ETFs for it to choose from at each time period. I was thinking of starting with a criteria such as "all ETFs available in 2000" or something like that to ensure sufficient data for a backtest.

Any suggestions as to where I can find a database of all the ETFs out there, complete with their inception dates?

by Zach at October 08, 2015 06:31 PM



Comparing different implementations of genetic algorithms

I am looking up some material for my thesis in CS (development of a module to integrate a genetic algorithm in a system developed by other students).

My actual current task is to make a comparative analysis of several libraries for genetic algorithms. After this comparison I will choose one of these libraries and an algorithm.

I want to know if you can give me any tips or best practice to perform such comparison. Which approach to the task would you recommend?

by Tuccio at October 08, 2015 06:10 PM


A little help with the Single Factor model for credit risk

I'm studying the "single factor model" in Malz's text "Financial Risk Management - Models, History and Institutions". He only refers to it as such and gives it no proper name.

The model:

$a_{i} = \beta_{i}m+\sqrt{1-\beta_{i}^2}\,\epsilon_{i}$

$\beta_{i}$ is the correlation of firm $i$ with the state of the economy

$m$ is the state of the economy

I'm a bit confused here. The author first says we can use the model to convert an unconditional probability of default into one conditional on the state of the economy

and then further says "The unconditional probability of a particular loss level (the fraction of the portfolio that defaults) is equal to the probability that the market factor return that leads to that loss level is realized"

We find both probabilities in the same way:

$p(m) = \Phi\!\left( \frac{k_{i}-\beta_{i}m}{\sqrt{1 - \beta_{i}^2}} \right)$

Where $k_{i} = \Phi^{-1}(\pi)$

and $\pi$ = unconditional probability of default in the first usage and probability of realizing the market factor leading to observed the loss level in the second usage.
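For concreteness, the conditional usage of the formula can be sketched in code (names are mine; Python's `statistics.NormalDist` provides the standard normal CDF and its inverse):

```python
from math import sqrt
from statistics import NormalDist

def conditional_pd(pi, beta, m):
    """P(default | market factor = m) in the single-factor model
    a_i = beta*m + sqrt(1 - beta**2)*eps_i, with default when a_i < k."""
    nd = NormalDist()
    k = nd.inv_cdf(pi)                       # threshold with P(a_i < k) = pi
    return nd.cdf((k - beta * m) / sqrt(1 - beta ** 2))

# A downturn (m < 0) with positive beta raises the conditional PD
print(conditional_pd(0.02, 0.4, -2.0) > 0.02)  # True
```

Note that with $\beta = 0$ the conditional PD collapses back to the unconditional $\pi$, which is a useful sanity check on either reading of the formula.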

These sound opposite to me. In one usage we are finding a conditional PD, and in the other, what is described as an unconditional one.

by AfterWorkGuinness at October 08, 2015 06:09 PM


Australia's highest court has declared patents on genes ...

Australia's highest court has declared patents on genes invalid.
Since the information stored in the DNA as a sequence of nucleotides was a product of nature, it did not require human action to bring it into existence, and therefore could not be patented.

October 08, 2015 06:01 PM

Planet Theory

TCS+ talk: Wednesday, October 14th— Ryan O’Donnell, CMU

Our next talk will take place this coming Wednesday, October 14th at 1:00 PM Eastern Time (10:00 AM Pacific Time, 19:00 Central European Time, 17:00 UTC). Ryan O’Donnell from CMU will speak about his work on “How to refute a random CSP”, to appear in the next FOCS (abstract below).

Please make sure you reserve a spot for your group to join us live by signing up on the online form. As usual, for more information about the TCS+ online seminar series and the upcoming talks, please see the website.

Let P be a k-ary predicate and let H be a uniformly random constraint satisfaction problem with n variables and m = m(n) constraints, each of type P applied to k literals. For example, if P  is the 3-ary OR predicate, then H is a classic random instance of 3SAT.

For each P there is a critical constant \alpha such that H will be satisfiable (with high probability) if m < \alpha n and will be unsatisfiable (whp) if m > \alpha n. In the former case, the natural algorithmic task is to find a satisfying assignment to the variables.

In the latter case, the natural algorithmic task is to find a refutation; i.e., a proof of unsatisfiability. This task becomes algorithmically easier the larger m is. As an example, in the case of 3SAT, it is known that efficient refutation algorithms exist provided m \gg n^{3/2}. We will discuss the refutation problem in general, focusing on how the predicate, P, affects the number of constraints, m, required for efficient refutation. We will also describe the applications to hardness of learning.

by plustcs at October 08, 2015 05:40 PM


How do I design this functional code as class based (object oriented)?

I'm a beginner-to-intermediate self-taught Python developer.

In most of the projects I've completed, I can see the following procedure repeating. I don't have any professional coding experience, and I think the code below is not very professional: it is not reusable, and rather than fitting together in a container it is just loosely coupled functions spread across different modules.

def get_query():
    """Return the query string."""

def make_request(query):
    """Build and return the request for the given query."""

def make_api_call(request):
    """Call the API and return the response."""

def process_response(response):
    """Process the response and return the details."""

def populate_database(details):
    """Populate the database with the details; return the status."""

def log_status(status):
    """Log the status so that the developer knows what's happening."""

query = get_query()
request = make_request(query)
response = make_api_call(request)
details = process_response(response)
status = populate_database(details)

How do I design this procedure as a class based design?
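One common shape is to turn the pipeline into a class whose methods are the steps and whose `run()` method is the old driver code. A sketch with stubbed bodies (all names and behaviors here are illustrative, not a prescription):

```python
class ApiIngestionJob:
    """One object per ingestion run; the methods mirror the old functions."""

    def __init__(self, query):
        self.query = query

    def make_request(self):
        # A real job would build an HTTP request here.
        return {"q": self.query}

    def make_api_call(self, request):
        # Stubbed response; a real implementation would hit the network.
        return {"items": [request["q"].upper()]}

    def process_response(self, response):
        return response["items"]

    def populate_database(self, details):
        # Stub: pretend the insert succeeded.
        return f"inserted {len(details)} rows"

    def run(self):
        request = self.make_request()
        response = self.make_api_call(request)
        details = self.process_response(response)
        return self.populate_database(details)

print(ApiIngestionJob("hello").run())  # inserted 1 rows
```

The practical win over module-level functions is that shared state (query, credentials, connections) lives on `self`, and a test can subclass and override `make_api_call` to swap out the network.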

by Marty at October 08, 2015 05:20 PM


Basic knowledge requirement for each of these CS areas? [on hold]

I'm about to graduate and I want to know the basic knowledge that each of these areas requires:

Artificial Intelligence, Machine Learning, Computer Security, Software Engineering, Human-Computer Interaction, Computer Graphics, Computer Networks, Business Intelligence, Distributed Systems, Programming Languages Design.

For example, I know that AI requires a knowledge in Probabilities and Statistics, Mathematical Logic, Algorithms... maybe you can cite more

What about the other areas ?

by reaffer at October 08, 2015 05:20 PM


Proof software for Primitive Recursive Arithmetic

Primitive Recursive Arithmetic is a critical foundational system in mathematics at large, and all the more so in areas studying constructive reasoning and/or computability such as Theoretical Computer Science.

Is there any proof software that deals with Primitive Recursive Arithmetic... or equivalent systems?

by Rex Butler at October 08, 2015 05:17 PM


AWS Mobile Hub – Build, Test, and Monitor Mobile Applications

The new AWS Mobile Hub (Beta) simplifies the process of building, testing, and monitoring mobile applications that make use of one or more AWS services. It helps you skip the heavy lifting of integrating and configuring services by letting you add and configure features to your apps, including user authentication, data storage, backend logic, push notifications, content delivery, and analytics—all from a single, integrated console.

The AWS Mobile Hub helps you at each stage of development: configuring, building, testing, and usage monitoring. The console is feature-oriented; instead of picking individual services you select higher-level features composed of combinations of one or more services, SDKs, and client code. What once took a day to properly choose and configure can now be done in 10 minutes or so.

Diving In
Let’s dive into the console and take a look!

The Mobile Hub outlines (and helps with) each step of the mobile app development process:

I will call my project SuperMegaMobileApp:

Each feature is backed up by one or more AWS services. For example, User Sign-In is powered by Amazon Cognito and Push Notification is powered by Amazon Simple Notification Service (SNS). I simply click on a feature to select and configure it.

I click on Push Notifications and Enable push, then  choose the destination platform(s):

I want to push to Android devices, so I select it. Then I need to enter an API Key and a Sender ID:

I can add logic to my application by connecting it to my AWS Lambda functions:

After I have selected and configured the features that I need, I click on Build to move forward:

The Mobile Hub creates a source package that I can use to get started, and provides me with the links and other information that I need to have at hand in order to get going:

I can download the entire package, open it up in my preferred IDE, and keep going from there:

I can use this code as a starter  app and edit it as desired. I can also copy selected pieces of code and paste them in to my existing mobile app.

I can also make use of the AWS Device Farm for testing and Amazon Mobile Analytics to collect operational metrics.

Visit the Hub
Whether you are creating a brand new mobile app or adding features to an existing app, AWS Mobile Hub lets you take advantage of the features, scalability, reliability, and low cost of AWS in minutes. As you have seen, AWS Mobile Hub walks you through feature selection and configuration. It then automatically provisions the AWS services required to power these features, and generates working quickstart apps for iOS and Android that use your provisioned services.

You can now spend more time adding features to your app and less time taking care of all of the details behind the scenes!

To learn more, visit the Mobile Hub page.



by Jeff Barr at October 08, 2015 05:11 PM


How much bigger can an LR(1) automaton for a language be than the corresponding LR(0) automaton?

In an LR(0) parser, each state consists of a collection of LR(0) items, which are productions annotated with a position. In an LR(1) parser, each state consists of a collection of LR(1) items, which are productions annotated with a position and a lookahead character.

It's known that given a state in an LR(1) automaton, the configurating set formed by dropping the lookahead tokens from each LR(1) item yields a configurating set corresponding to some state in the LR(0) automaton. In that sense, the main difference between an LR(1) automaton and an LR(0) automaton is that the LR(1) automaton has more copies of the states in the LR(0) automaton, each of which is annotated with lookahead information. For this reason, LR(1) automata for a given CFG are typically larger than the corresponding LR(0) parser for that CFG.

My question is how much larger the LR(1) automaton can be. If there are $n$ distinct terminal symbols in the alphabet of the grammar, then in principle we might need to replicate each state in the LR(0) automaton once per subset of those $n$ distinct terminal symbols, potentially leading to an LR(1) automaton that's $2^n$ times larger than the original LR(0) automaton. Given that each individual state in the LR(0) automaton consists of a set of different LR(0) items, we may get an even larger blowup.

That said, I can't seem to find a way to construct a family of grammars for which the LR(1) automaton is significantly larger than the corresponding LR(0) automaton. Everything I've tried has led to a modest increase in size (usually around 2-4x), but I can't seem to find a pattern that leads to a large blowup.

Are there known families of context-free grammars whose LR(1) automata are exponentially larger than the corresponding LR(0) automata? Or is it known that in the worst case, you can't actually get an exponential blowup?


by templatetypedef at October 08, 2015 05:10 PM

Dave Winer

Medium has an API?

I saw an announcement that Medium has an API.

They list a bunch of software developers they worked with, and one of them is IFTTT, so I tried hooking up my RSS feed to Medium using a recipe and maybe it'll just work.

That would be fairly easy.

Let's see what happens!


A few minutes later, the story shows up on Medium. But it's not all there. There were so many steps in the process, it's hard to know where the missing bits got lost.

Also, when I clicked on the headline, I saw HTML markup on the page, and no title, though it looked fine when I saw it in my list of stories.

No matter what, an IFTTT connection will probably be less accurate than a direct software connection. But maybe it'll do to begin with.

I would update the post if we were using the API, to reflect changes.

October 08, 2015 05:09 PM


What is the average time complexity, for a single linked list, for performing an insert?

I thought this would be a very simple O(n), because you can do the insert anywhere within the list.

The longer the list, the longer it will take on average to do the insert.

However, according to bigocheatsheet, it is O(1), i.e. it has no dependency on the size of the list.

What is wrong here?
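Both figures are consistent: O(1) is the cost of the splice itself once you already hold a reference to the node you are inserting after; walking from the head to find that position is what costs O(n). A sketch:

```python
class Node:
    def __init__(self, value, nxt=None):
        self.value = value
        self.next = nxt

def insert_after(node, value):
    # Two pointer assignments, regardless of list length: O(1)
    node.next = Node(value, node.next)

head = Node(1, Node(3))
insert_after(head, 2)          # insert 2 between 1 and 3
out, n = [], head
while n:
    out.append(n.value)
    n = n.next
print(out)  # [1, 2, 3]
```

Cheat sheets quote O(1) for insertion precisely because they assume the position is already in hand; search is listed separately as O(n).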

by cade galt at October 08, 2015 05:07 PM


AWS IoT – Cloud Services for Connected Devices

Have you heard about the Internet of Things (IoT)? Although critics sometimes dismiss it as nothing more than “put a chip in it,” I believe that the concept builds upon some long-term technology trends and that there’s something really interesting and valuable going on.

To me, the most relevant trends are the decreasing cost of mass-produced compute power, the widespread availability of IP connectivity, and the ease with which large amounts of information can be distilled into intelligence using any number of big data tools and techniques:

  • Mass-produced compute power means that it is possible to crank out powerful processors that consume modest amounts of power, occupy very little space, and cost very little. These attributes allow the processors to be unobtrusively embedded in devices of all shapes and sizes.
  • Widespread IP connectivity (wired or wireless) lets these processors talk to each other and  to the cloud. While this connectivity is fairly widespread, it is definitely not ubiquitous.
  • Big data allows us to make sense of the information measured, observed, or collected, by the processors running in these devices.

We could also add advances in battery & sensor technology to the list of enabling technologies for the Internet of Things. Before too long, factory floors, vehicles, health care systems, household appliances, and much more will become connected “things.” Two good introductory posts on the topic are 20 Real World Problems Solved by IoT and Smart IoT: IoT as a Human Agent, Human Extension, and Human Complement. My friend Sudha Jamthe has also written on the topic; her book IoT Disruptions focuses on new jobs and careers that will come about as IoT becomes more common.

Taking all of these trends as givens, it should not come as a surprise that we are working to make sure that AWS is well-equipped to support many different types of IoT devices and applications. Although I have described things as connected devices, they can also take the form of apps running on mobile devices.

Today we are launching AWS IoT (Beta).

This new managed cloud service provides the infrastructure that allows connected cars, factory floors, aircraft engines, sensor grids, and the like (AWS IoT refers to them as “things”) to easily and securely interact with cloud services and with other devices, all at world-scale. The connection to the cloud is fast and lightweight (MQTT or REST), making it a great fit for devices that have limited memory, processing power, or battery life.

Let’s take a look at the components that make up AWS IoT:

  • Things are devices of all types, shapes, and sizes including applications, connected devices, and physical objects. Things measure and/or control something of interest in their local environment. The AWS IoT model is driven by state and state changes. This allows things to work properly even when connectivity is intermittent; applications interact with things by way of cloud-based Thing Shadows. Things have names, attributes, and shadows.
  • Thing Shadows are virtual, cloud-based representations of things. They track the state of each connected device, and allow that state to be tracked even if the thing loses connectivity for an extended period of time.
  • The real-time Rules Engine transforms messages based on expressions that you define, and routes them to AWS endpoints (Amazon DynamoDB, Amazon Simple Storage Service (S3), AWS Lambda, Amazon Simple Notification Service (SNS), Amazon Simple Queue Service (SQS), Amazon Kinesis, and Amazon Kinesis Firehose), all expressed using a SQL-like syntax. Routing is driven by the contents of individual messages and by context. For example, routine readings from a temperature sensor could be tracked in a DynamoDB table; an aberrant reading that exceeds a value stored in the thing shadow can trigger a Lambda function.
  • The Message Broker speaks MQTT (and also HTTP 1.1) so your devices can take advantage of alternative protocols even if your cloud backend does not speak them. The Message Broker can scale to accommodate billions of responsive long-lived connections between things and your cloud applications. Things use a topic-based pub/sub model to communicate with the broker, and can also publish via HTTP request/response. They can publish their state and can also subscribe to incoming messages. The pub/sub model allows a single device to easily and efficiently share its status with any number of other devices (thousands or even millions).
  • Device SDKs are client libraries that are specific to individual types of devices. The functions in the SDK allow code running on the device to communicate with the AWS IoT Message Broker over encrypted connections. The devices identify themselves using X.509 certificates or Amazon Cognito identities. The SDK also supports direct interaction with Thing Shadows.
  • The Thing Registry assigns a unique identity to each thing. It also tracks descriptive metadata such as the attributes and capabilities for each thing.

All of these components can be created, configured, and inspected using the AWS Management Console, the AWS Command Line Interface (CLI), or through the IoT API.

AWS IoT lets billions of things keep responsive connections to the cloud, and lets cloud applications interact with things (via thing shadows, the rules engine, and the real-time functionality). It receives messages from things and filters, records, transforms, augments, or routes them to other parts of AWS or to your own code.

Getting Started with AWS IoT
We have been working with a large group of IoT Partners to create AWS-powered starter kits:

Once you have obtained a kit and connected it to something interesting, you are ready to start building your first IoT application using AWS IoT. You will make use of several different SDKs during this process:

The AWS IoT Console will help you get started. With a few clicks you can create your first thing, and then download the SDK, security credentials, and sample code you will need to connect a device to AWS IoT.

You can also build AWS IoT applications that communicate with an Amazon Echo via the Alexa Skills Kit. AWS IoT can trigger an Alexa Skill via a Lambda function and Alexa Skills can interact with thing shadows. Alexa Skills can also take advantage of AWS IoT’s bidirectional messaging capability (which traverses NAT and firewalls found in home networks) to wake devices with commands from the cloud. Manufacturers can use thing shadows to store responses to application-specific messages.

AWS IoT in the Console
The Console includes an AWS IoT tutorial to get you started:

It also provides complete details on each thing, including the thing’s API endpoint, MQTT topic, and the contents of its shadow:

AWS IoT Topics, Messages, and Rules
All of the infrastructure that I described can be seen as a support system for the message and rule system that forms the heart of AWS IoT. Things disclose their state by publishing messages to named topics. Publishing a message to a topic will create the topic if necessary; you don’t have to create it in advance. The topic namespace is hierarchical (“myfactories/seattle/sensors/door”).

Rules use a SQL-like SELECT statement to filter messages. In the IoT Rules Engine, the FROM clause references an MQTT topic and the WHERE clause references JSON properties in the message. When a rule matches a message, it can invoke one or more of the following actions:

The SELECT statement can use all (*) or specifically chosen fields of the message in the invocation.
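For illustration only (the topic and field names here are hypothetical), a rule in this SQL-like syntax might look like:

```sql
SELECT temperature, deviceId
FROM 'myfactories/seattle/sensors/door'
WHERE temperature > 60
```

A matching message would have only the selected fields forwarded to whichever action (DynamoDB, Lambda, Kinesis, and so on) the rule is configured with.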

The endpoints above can be used to reach the rest of AWS. For example, you can reach Amazon Redshift via Kinesis, and external endpoints via Lambda, SNS, or Kinesis.

Thing Shadows also participate in the message system. Shadows respond to HTTP GET requests with JSON documents (the documents are also accessible via MQTT for environments that don’t support HTTP). Each document contains the thing state, its metadata, and a version number for the state. Each piece of state information is stored both as “reported” (what the device last said) and as “desired” (what the application wants it to be). Each shadow accepts changes to the desired state via HTTP POST, and publishes “delta” and “accepted” messages to topics associated with the thing shadow. The device listens on these topics and changes its state accordingly.

IoT at re:Invent
If you are at re:Invent, be sure to check out our Mobile Developer & IoT track. Here are some of the sessions we have in store:

  • MBL203 – From Drones to Cars: Connecting the Devices in Motion to the Cloud.
  • MBL204 -Connecting the Unconnected – State of the Union – Internet of Things Powered by AWS.
  • MBL303 -Build Mobile Apps for IoT Devices and IoT Apps for Mobile Devices.
  • MBL305 – You Have Data from the Devices, Now What? Getting Value of the IoT.
  • WRK202 – Build a Scalable Mobile App on Serverless, Event-Triggered, Back-End Logic.

More to Come
There’s a lot more to talk about and I have barely scratched the surface with this introductory blog post. Once I recover from AWS re:Invent, I will retire to my home lab and cook up a thing or two of my own and share the project with you. Stay tuned!


PS – Check out the AWS IoT Mega Contest!

by Jeff Barr at October 08, 2015 05:06 PM




Best way to use Mercurial with Emacs?

What's the best way to use Mercurial with Emacs these days? I'm happy with Emacs 24.5's built-in vc-mode for commits but pushing and pulling is awkward.

submitted by michaelhoffman
[link] [8 comments]

October 08, 2015 04:48 PM


AWS Lambda Update – Python, VPC, Increased Function Duration, Scheduling, and More

We launched AWS Lambda at re:Invent 2014 and the reception has been incredible. Developers and system architects quickly figured out that they can easily build serverless systems that need no administration and can scale to handle a very large number of requests. As a recap, Lambda functions can run in response to the following events:

Over the past year we have added lots of new features to Lambda. We launched in three AWS regions (US East (Northern Virginia), US West (Oregon), and Europe (Ireland)) and added support for Asia Pacific (Tokyo) earlier this year. Lambda launched with support for functions written in Node.js; we added support for Java functions earlier this year. As you can see from the list above, we also connected Lambda to many other parts of AWS. Over on the AWS Compute Blog, you can find some great examples of how to put Lambda to use in powerful and creative ways, including (my personal favorite), Microservices Without the Servers.

New Features for re:Invent
Today we are announcing a set of features that will make Lambda even more useful. Here’s a summary of what we just announced from the stage:

  • VPC Support
  • Python functions
  • Increased function duration
  • Function versioning
  • Scheduled functions

As you can see, it is all about the functions! Let’s take a look at each of these new features.

Accessing Resources in a VPC From a Lambda Function
Many AWS customers host microservices within an Amazon Virtual Private Cloud and would like to be able to access them from their Lambda functions. Perhaps they run a MongoDB cluster with lookup data, or want to use Amazon ElastiCache as a stateful store for Lambda functions, but don’t want to expose these resources to the Internet.

You will soon be able to access resources of this type by setting up one or more security groups within the target VPC, configuring them to accept inbound traffic from Lambda, and attaching them to the target VPC subnets. Then you will need to specify the VPC, the subnets, and the security groups when you create your Lambda function (you can also add them to an existing function). You’ll also need to give your function permission (via its IAM role) to access a couple of EC2 functions related to Elastic Networking.

This feature will be available later this year. I’ll have more info (and a walk-through) when we launch it.

Python Functions
You can already write your Lambda functions in Node.js and Java. Today we are adding support for Python 2.7, complete with built-in access to the AWS SDK for Python. Python is easy to learn and easy to use, and you’ll be up and running in minutes. We have received many, many requests for Python support and we are very happy to be able to deliver it. You can start using Python today. Here’s what it looks like in action:
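A minimal sketch of what a Python function can look like (the event fields and the greeting logic are invented for illustration; Lambda invokes the handler with the triggering event and a context object):

```python
import json

def lambda_handler(event, context):
    # Lambda passes the triggering event as a dict; 'context' carries
    # metadata such as the remaining execution time.
    name = event.get("name", "world")
    return {"message": "Hello, " + name + "!"}

# Local invocation for testing; in Lambda the service calls the handler itself.
print(json.dumps(lambda_handler({"name": "re:Invent"}, None)))
# prints {"message": "Hello, re:Invent!"}
```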

Increased Function Duration
Lambda is a great fit for Extract-Transform-Load (ETL) applications. It can easily scale up to ingest and process large volumes of data, without requiring any persistent infrastructure. In order to support this very popular use case, your Lambda functions can now run for up to 5 minutes. As has always been the case, you simply specify the desired timeout when you create the function. Your function can consult the context object to see how much more time it has available.

Here’s how you can access and log that value using Python:

print(" Remaining time (ms): " + str(context.get_remaining_time_in_millis()) + "\n")

Functions that consume all of their time will be terminated, as has always been the case.
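One way to combine the longer duration with the context object is to checkpoint work and exit cleanly before the limit. Here is a hedged sketch: get_remaining_time_in_millis() is the real context method, but FakeContext and the batch-processing logic are invented for local illustration.

```python
class FakeContext(object):
    """Stand-in for the Lambda context object, for local testing only."""
    def __init__(self, ms):
        self.ms = ms
    def get_remaining_time_in_millis(self):
        self.ms -= 4000  # pretend 4 seconds elapse between checks
        return self.ms

def process(batches, context, handle):
    done = []
    for batch in batches:
        # Stop early when fewer than 10 seconds remain instead of being killed;
        # leftover batches can be handled by a follow-up invocation.
        if context.get_remaining_time_in_millis() < 10000:
            break
        done.append(handle(batch))
    return done

print(process([1, 2, 3, 4, 5], FakeContext(20000), lambda b: b * 2))  # [2, 4]
```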

Function Versioning & Aliasing
When you start to build complex systems with Lambda, you will want to evolve them on a controlled basis. We have added a new versioning feature to simplify this important aspect of development & testing.

Each time you upload a fresh copy of the code for a particular function, Lambda will automatically create a new version and assign it a number (1, 2, 3, and so forth). The Amazon Resource Name (ARN) for the function now accepts an optional version qualifier at the end (a “:” and then a version number). An ARN without a qualifier always refers to the newest version of the function for ease of use and backward compatibility. A qualified ARN such as “arn:aws:lambda:us-west-2:123456789012:function:PyFunc1:2” refers to a particular version (2, in this case).

Here are a couple of things to keep in mind as you start to think about this new feature:

  • Each version of a function has its own description and configuration (language / runtime, memory size, timeout, IAM role, and so forth).
  • Each version of a given function generates a unique set of CloudWatch metrics.
  • The CloudWatch Logs for the function will include the function version as part of the stream name.
  • Lambda will store multiple versions for each function. Each Lambda account can store up to 1.5 gigabytes of code and you can delete older versions as needed.

You can also create named aliases and assign them to specific versions of the code for a function. For example, you could initially assign “prod” to version 3, “test” to version 5, and “dev” to version 7 for a function. Then you would use the alias as part of the ARN that you use to invoke the function, like this:

  • Production – “arn:aws:lambda:us-west-2:123456789012:function:PyFunc1:prod”
  • Testing – “arn:aws:lambda:us-west-2:123456789012:function:PyFunc1:test”
  • Development – “arn:aws:lambda:us-west-2:123456789012:function:PyFunc1:dev”

You can use ARNs with versions or aliases (which we like to call qualified ARNs) anywhere you’d use an existing non-versioned or non-aliased ARN. In fact, we recommend using them as a best practice.

This feature makes it easy to promote code between stages or to rollback to earlier versions if a problem arises. For example, you can point your prod alias to version 3 of the code, and then remap it to point to version 5 (effectively promoting it from test to production) without having to make any changes to the client applications or to the event source that triggers invocation of the function.
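The qualified-ARN scheme is plain string composition, so a client can be pointed at a stage just by swapping the qualifier. A tiny sketch (the account ID and function name are the placeholder values from the examples above):

```python
def qualified_arn(region, account, function, qualifier):
    # Build "arn:aws:lambda:REGION:ACCOUNT:function:NAME:QUALIFIER",
    # where the qualifier is a version number or an alias such as "prod".
    return "arn:aws:lambda:%s:%s:function:%s:%s" % (
        region, account, function, qualifier)

print(qualified_arn("us-west-2", "123456789012", "PyFunc1", "prod"))
# arn:aws:lambda:us-west-2:123456789012:function:PyFunc1:prod
```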

Scheduled Functions (Cron)
You can now invoke a Lambda function on a regular, scheduled basis. You can specify a fixed rate (number of minutes, hours, or days between invocations) or you can specify a Cron-like expression:
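For a flavor of the two forms, here are illustrative expressions in the rate/cron syntax that CloudWatch scheduled events support (the specific schedules are made up):

```
rate(5 minutes)              run every five minutes
cron(0 17 ? * MON-FRI *)     run at 17:00 UTC on weekdays
```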

This feature is available now in the console, with API and CLI support in the works.

— Jeff;


by Jeff Barr at October 08, 2015 04:38 PM


Why can't we have superlinear bounds on Boolean circuit size for an explicit function?

I am interested about the minimal size (number of gates) of a family of circuits (with negation) over a complete Boolean basis (with fanin 2) that computes some explicit Boolean function. (In other words, I want results that apply for a specific function, not diagonalization, counting, or non-constructive arguments). I call $n$ the number of inputs.

Section 1.5.2 of Boolean Function Complexity by Stasys Jukna (2011) says that the best such lower bound currently known is $5n - o(n)$, from Iwama and Morizumi in 2002. This is very surprising, because, as Shannon proved in 1949 by a counting argument, most Boolean functions require exponential-size circuits.
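For context, the counting argument can be sketched in one line (this is the standard sketch, not from the question): there are $2^{2^n}$ Boolean functions of $n$ variables, but only about $2^{O(s \log s)}$ distinct circuits with $s$ gates, so representing all of them forces

$$2^{c\, s \log s} \;\ge\; 2^{2^n} \quad\Longrightarrow\quad s \;=\; \Omega\!\left(\frac{2^n}{n}\right)$$

for all but a vanishing fraction of functions.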

Is the $5n - o(n)$ result still the best known? Is there any reason why proving a super-linear lower bound on the circuit size of an explicit function seems out of reach? In particular, do we know that solving this would close long-standing open problems in complexity theory as well?

by a3nm at October 08, 2015 04:35 PM


CloudWatch Dashboards – Create & Use Customized Metrics Views

Amazon CloudWatch monitors your AWS cloud resources and your cloud-powered applications. It tracks metrics so that you can visualize and review them. You can also set alarms that will fire when a metric goes beyond a limit that you specified. CloudWatch gives you visibility into resource utilization, application performance, and operational health.

New CloudWatch Dashboards
Today we are giving you the power to build customized dashboards for your CloudWatch metrics. Each dashboard can display multiple metrics, and can be accessorized with text and images. You can build multiple dashboards if you’d like, each one focusing on providing a distinct view of your environment. You can even pull data from multiple regions into a single dashboard in order to create a global view.

Let’s build one!

Building a Dashboard
I open up the CloudWatch Console and click on Create dashboard to get started. Then I enter a name:

Then I add my first “Widget” (a graph or some text) to my dashboard. I’ll display some metrics using a line graph:

Now I need to choose the metric. This is a two step process. First I choose by category:

I clicked on EC2 Metrics. Now I can choose one or more metrics and create the widget. I sorted the list by Metric Name, selected all of my EC2 instances, and clicked on the Create widget button (not shown in the screenshot):

As I noted earlier, you can access and make use of metrics drawn from multiple AWS regions; this means that you can create a single global status dashboard for your complex, multi-region applications and deployments.

And here’s my dashboard:

I can resize the graph, and I can also interact with it. For example, I can focus on a single instance with a click (this will also highlight the other metrics from that instance on the dashboard):

I can add several widgets. The vertical line allows me to look for correlation between metrics that are shown in different widgets:

The graphs can be linked or independent with respect to zooming (the Actions menu allows me to choose which option I want). I can click and drag on a desired time-frame and all of the graphs will zoom (if they are linked) when I release the mouse button:

The Actions menu allows me to reset the zoom and to initiate many other operations on my dashboards:

I can also add static text and images to my dashboard by using a text widget. The contents of the widget are specified in GitHub Flavored Markdown:

Here’s my stylish new dashboard:

Text widgets can also include buttons and tables. I can link to help pages, troubleshooting guides, internal and external status pages, phone directories, and so forth.

I can create several dashboards and switch between them with a click:

I can also create a link that takes me from one dashboard to another one:

I can also control the time range for the graphs, and I can enable automatic refresh, with fine-grained control of both:

Dashboard Ownership and Access
The individual dashboards are stored at the AWS account level and can be accessed by IAM users within the account. However, in many cases administrators will want to set up dashboards for use across the organization in a controlled fashion.

In order to support this important scenario, IAM permissions on a pair of CloudWatch functions can be used to control the ability to see metrics and to modify dashboards. Here’s how it works:

  • If an IAM user has permission to call PutMetricData, they can create, edit, and delete dashboards.
  • If an IAM user has permission to call GetMetricStatistics, they can view dashboard content.
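As a sketch, a read-only dashboard viewer could be granted a policy like the following (the wildcard resource is illustrative; only the action name comes from the list above):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["cloudwatch:GetMetricStatistics"],
      "Resource": "*"
    }
  ]
}
```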

Available Now
CloudWatch Dashboards are available now and you can start using them today in all AWS regions! You can create up to three dashboards (each with up to 50 metrics) at no charge. After that, each additional dashboard costs $3 per month.

Share Your Dashboards
I am looking forward to seeing examples of this feature in action. Take it for a spin and let me know what you come up with!


by Jeff Barr at October 08, 2015 04:33 PM



EC2 Container Service Update – Container Registry, ECS CLI, AZ-Aware Scheduling, and More

I’m really excited by the Docker-driven, container-based deployment model that is quickly becoming the preferred way to build, run, scale, and quickly update new applications. Since we launched Amazon EC2 Container Service last year, we have seen customers use it to host and run their microservices, web applications, and batch jobs.

Many developers have told me that containers have allowed them to speed up their development, testing, and deployment efforts. They love the fact that individual containers can hold “standardized” application components, each of which can be built using the language, framework, and middleware best suited to the task at hand. The isolation provided by the containers gives them the freedom to innovate at a more granular level instead of putting the entire system at risk due to large-scale changes.

Based on the feedback that we have received from our contingent of Amazon ECS and Docker users, we are announcing some powerful new features – the Amazon EC2 Container Registry and the Amazon EC2 Container Service CLI. We are also making the Amazon ECS scheduler more aware of Availability Zones and adding some new container configuration options.

Let’s dive in!

Amazon EC2 Container Registry
Docker (and hence EC2 Container Service) is built around the concept of an image. When you launch a container, you reference an image, which is pulled from a Docker registry. This registry is a critical part of the deployment process. Based on conversations with our customers, we have learned that they need a registry that is highly available, exceptionally scalable, and globally accessible, with the ability to support deployments that span two or more AWS regions. They also want it to integrate with AWS Identity and Access Management (IAM) to simplify authorization and to provide fine-grained control.

While customers could meet most of these requirements by hosting their own registry, they have told us that this would impose an operational burden that they would strongly prefer to avoid.

Later this year we will make the Amazon EC2 Container Registry (Amazon ECR) available. This fully managed registry will address the issues that I mentioned above by making it easy for you to store, manage, distribute, and collaborate around Docker container images.

Amazon ECR is integrated with ECS and will be easy to integrate into your production workflow. You can use the Docker CLI running on your development machine to push images to Amazon ECR, where Amazon ECS can retrieve them for use in production deployments.

Images are stored durably in S3 and are encrypted at rest and in transit, with access controls via IAM users and roles. You will pay only for the data that you store and for data that you transfer to the Internet.

Here’s a sneak peek at the console interface:

You can visit the signup page to learn more and to sign up for early access. If you are interested in participating in this program, I would encourage you to sign up today.

We are already working with multiple launch partners including Shippable, CloudBees, CodeShip, and Wercker to provide integration with Amazon ECS and Amazon ECR, with a focus on automatically building and deploying Docker images. To learn more, visit our Container Partners page.

Amazon EC2 Container Service CLI
The ECS Command Line Interface (ECS CLI) provides high-level commands for Amazon EC2 Container Service (ECS) that simplify creating, updating, and monitoring clusters and tasks from a local development environment.

The Amazon ECS CLI supports Docker Compose, a popular open-source tool for defining and running multi-container applications. You can use the ECS CLI as part of your everyday development and testing cycle as an alternative to the AWS Management Console.

You can get started with the ECS CLI in a couple of minutes. Download it (read the directions first), install it, and then configure it as follows (you have other choices and options, of course):

$ ecs-cli configure --region us-east-1 --cluster my-cluster

Launch your first cluster like this:

$ ecs-cli up --keypair my-keys --capability-iam --size 2

Docker Compose requires a configuration file. Here’s a simple one to get started (put this in docker-compose.yml):

  web:
    image: amazon/amazon-ecs-sample
    ports:
      - "80:80"

Now run this on the cluster:

$ ecs-cli compose up
INFO[0000] Found task definition TaskDefinition=arn:aws:ecs:us-east-1:980116778723:task-definition/ecscompose-bin:1
INFO[0000] Started task with id:arn:aws:ecs:us-east-1:9801167:task/fd8d5a69-87c5-46a4-80b6-51918092e600

Then take a peek at the running tasks:

$ ecs-cli compose ps
Name                                      State    Ports
fd8d5a69-87c5-46a4-80b6-51918092e600/web  RUNNING>80/tcp

Point your web browser at that address to see the sample app running in the cluster.

The ECS CLI includes lots of other options (run it with --help to see all of them). For example, you can create and manage long-running services. Here’s the list of options:

The ECS CLI is available under an Apache 2.0 license, and we are looking forward to seeing your pull requests.

New Docker Container Configuration Options
A task definition is a description of an application that lets you define the containers that are scheduled together on an EC2 instance. Some of the parameters you can specify in a task definition include which Docker images to use, how much CPU and memory to use with each container, and what (if any) ports from the container are mapped to the host.

Task definitions now support lots of Docker options including Docker labels, working directory, networking disabled, privileged execution, read-only root filesystem, DNS servers, DNS search domains, ulimits, log configuration, extra hosts (hosts to add to /etc/hosts), and security options for Multi-Level Security (MLS) systems such as SELinux.

The Task Definition Editor in the ECS Console has been updated and now accepts the new configuration options:

For more information, read about Task Definition Parameters.

Scheduling is Now Aware of Availability Zones
We introduced the Amazon ECS service scheduler earlier this year as a way to easily schedule containers for long running stateless services and applications. The service scheduler optionally integrates with Elastic Load Balancing. It ensures that the specified number of tasks are constantly running and restarts tasks if they fail. The service scheduler is the primary way customers deploy and run production services with ECS and we want to continue to make it easier to do so.

Today the service scheduler is Availability Zone aware. As new tasks are launched, the service scheduler will spread them to maintain balance across Availability Zones.

Amazon ECS at re:Invent
If you are at AWS re:Invent and want to learn more about how your colleagues (not to mention your competitors) are using container-based computing and Amazon ECS, check out the following sessions:

  • CMP302 – Amazon EC2 Container Service: Distributed Applications at Scale (to be live streamed).
  • CMP406 – Amazon ECS at Coursera.
  • DVO305 – Turbocharge Your Continuous Deployment Pipeline with Containers.
  • DVO308 – Docker & ECS in Production: How We Migrated Our Infrastructure from Heroku to AWS.
  • DVO313 – Building Next-Generation Applications with Amazon ECS.
  • DVO317 – From Local Docker Development to Production Deployments.


by Jeff Barr at October 08, 2015 04:28 PM


Can Eve impersonate Alice or Bob by using a replay attack?

For my computer science study, I have to design a replay attack (if possible) for the following authentication protocols.

I use the standard security protocol notation. In these protocols, $A$ is Alice, $B$ is Bob and $E(A)$ is for example Eve impersonating Alice. $K_{AB}$ is a shared secret key only Alice and Bob know, $K_{AB}\{x\}$ is the data $x$ encrypted with this key, and $N_A$ and $N_B$ are fresh random nonces generated by $A$ and $B$. We assume that Eve cannot simply break the encryption.

$1.\; A \rightarrow B : A, K_{AB}\{N_A\}$
$2.\; B \rightarrow A : B, N_A, K_{AB}\{N_B\}$
$3.\; A \rightarrow B : A, B, N_A, N_B, K_{AB}\{N_A, N_B\}$

For this, I designed the following replay attack:

$1.\; A \rightarrow E(B) : A, K_{AB}\{N_A\}$
$2.\; E(A) \rightarrow B : A, K_{AB}\{N_A\}$
$3.\; B \rightarrow E(A) : B, N_A, K_{AB}\{N_B\}$
$4.\; E(B) \rightarrow A : B, N_A, K_{AB}\{N_B\}$

Will this work?

The next example is this:

$1.\; A \rightarrow B : A, N_A, K_{AB}\{A, N_A\}$
$2.\; B \rightarrow A : B, N_B, K_{AB}\{B, N_A, N_B\}$
$3.\; A \rightarrow B : K_{AB}\{A, B, N_A\}$

I have no idea how to solve this one. Could you please help me with that?

by Peter at October 08, 2015 04:20 PM


EC2 Instance Update – X1 (SAP HANA) & T2.Nano (Websites)

AWS customers love to share their plans and their infrastructure needs with us. We, in turn, love to listen and to do our best to meet those needs. A look at the EC2 Instance History should tell you a lot about our ability to listen to our customers and to respond with an increasingly broad range of instances.

Lately, we have been hearing two types of requests, both driven by some important industry trends:

  • On the high end, many of our enterprise customers are clamoring for instances that have very large amounts of memory. They want to run SAP HANA and other in-memory databases, generate analytics in real time, process giant graphs using Neo4j or Titan, or create enormous caches.
  • On the low end, other customers need a little bit of processing power to host dynamic websites that usually get very modest amounts of traffic, or to run their microservices or monitoring systems.

In order to meet both of these needs, we are planning to launch two new EC2 instances in the coming months. The upcoming X1 instances will have loads of memory; the t2.nano will provide that little bit of processing power, along with bursting capabilities similar to those of its larger siblings.

X1 – Tons of Memory
X1 instances will feature up to 2 TB of memory, a full order of magnitude larger than the current generation of high-memory instances. These instances are designed for demanding enterprise workloads including production installations of SAP HANA, Microsoft SQL Server, Apache Spark, and Presto.

The X1 instances will be powered by up to four Intel® Xeon® E7 processors. The processors have high memory bandwidth and large L3 caches, both designed to support high-performance, memory-bound applications. With over 100 vCPUs, these instances will be able to handle highly concurrent workloads with ease.

We expect to have the X1 available in the first half of 2016. I’ll share pricing and other details at launch time.

T2.Nano – A Little (Burstable) Processing Power
The T2 instances provide a baseline level of processing power, along with the ability to save up unused cycles (“CPU Credits”) and use them when the need arises (read about New Low Cost EC2 Instances with Burstable Performance to learn more). We launched the t2.micro, t2.small, and t2.medium a little over a year ago. The burstable model has proven to be extremely popular with our customers. It turns out that most of them never actually consume all of their CPU Credits and are able to run at full core performance. We extended this model with the introduction of t2.large just a few months ago.

The next step is to go in the other direction. Later this year we will introduce the t2.nano instance. You’ll get 1 vCPU and 512 MB of memory, and the ability to run at full core performance for over an hour on a full credit balance. Each newly launched t2.nano starts out with sufficient CPU Credits to allow you to get started as quickly as possible.

Due to the burstable performance, these instances are going to be a great fit for websites that usually get modest amounts of traffic. During those quiet times, CPU Credits will accumulate, providing a reserve that can be drawn upon when traffic surges.

Again, I’ll share more info as we get closer to the launch!


by Jeff Barr at October 08, 2015 04:17 PM


Coding help -- using abstract types with generic parameters

I'm thinking through modeling code for a simple web application. Here's what I have:

package model

trait Id { def value: String }

trait Persisted[ID <: Id] { def id: ID }

trait Update[ID <: Id] { def id: ID }

trait Free

The idea here is that IDs will be typed, so a "User" ID cannot be confused for an "Item" ID (I put "User" and "Item" in quotes because these are just abstract ideas at this point). Each type will have three manifestations:

1) Persisted – for instances that have been persisted in a data store
2) Update – for updating the persisted instances
3) Free – for creating persisted instances

This is the definition of a User:

case class User(
  id: User.Id,
  name: String,
  email: String) extends Persisted[User.Id]

object User {
  case class Id(value: String) extends model.Id

  case class Update(
    id: Id,
    name: String,
    email: String) extends model.Update[Id]

  case class Free(
    name: String,
    email: String) extends model.Free
}

Now I'll define a generic type to CRUD objects in the data store:

trait Dao {
  type ID <: Id
  type F <: Free
  type P <: Persisted[ID]
  type U <: Update[ID]

  def create(f: F): ID
  def update(u: U): Unit
  def one(id: ID): P
  def delete(id: ID): Unit
}

abstract class UserDao extends Dao {
  type ID = User.Id
  type F = User.Free
  type P = User
  type U = User.Update
}

I'm not filling out all the implementation here, hence the abstract class UserDao. In a real implementation, the Dao trait would do the work of accessing the data store (with help from the extending class for translating from Free to data store params, etc.).

Next, I want a Service trait that adds more functionality (not yet defined) on top of the Dao functionality.

trait Service extends Dao {
  abstract override def create(f: F): ID = {
    try {
      val id = super.create(f)
      id
    } finally {
      // Do something else here
    }
  }
}

Now I want to create a search service that maintains a search index.

trait SearchService extends Service {
  def update(p: P): Unit
  def search(query: String): Seq[P]
}

The above SearchService works fine, but I have to extend Service to tie it with the P abstract type. I'd like to use Service as a member of SearchService like this:

trait SearchService[ID <: Id, P <: Persisted[ID]] {
  def service: Service
  def update(p: P): Unit            // Updates the search index
  def search(query: String): Seq[P] // Queries the search index
}

But the problem is that the P generic type parameter for SearchService is not "tied" to the P abstract type of the member Service. Is it possible to tie these together to let the compiler know that both Ps are the same?

I've done something similar to this using generics throughout, and it worked, but it requires a lot of typing: I have to pass type parameters that are not used, and there is an "explosion" of type parameters.

Thanks for any help or feedback!

submitted by three-cups
[link] [2 comments]

October 08, 2015 04:17 PM


how to create a map in javascript with array of values dynamically

I have this requirement: depending on the number of arguments passed to the function, I need to create that many entries in my map. Say I have a function myfunc1(a,b,c); I need a map with the keys "a", "b" and "c", and each key can have more than one value. But the problem is that I do not know beforehand how many values will come for those keys. As the values come in, I need to add them to the list of values for the matching key in the map. How do I do this in JavaScript? I have found static answers like the one below, but I want to do this dynamically. Can we use the push method?

var map = {};
map["country1"] = ["state1", "state2"];
map["country2"] = ["state1", "state2"];
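The push method does work for this; a common pattern is to create the array lazily on first use and push after that. A minimal sketch (the key and value names are illustrative):

```javascript
var map = {};

function addValue(map, key, value) {
  if (!map[key]) {       // first value for this key: start a new array
    map[key] = [];
  }
  map[key].push(value);  // later values are appended
}

addValue(map, "country1", "state1");
addValue(map, "country1", "state2");
addValue(map, "country2", "state1");

console.log(JSON.stringify(map));
// {"country1":["state1","state2"],"country2":["state1"]}
```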

by Zack at October 08, 2015 04:02 PM



Whoa, the election campaign in Austria is being fought with some hard ...

Whoa, the election campaign in Austria is being fought with the gloves off. A real-estate ad appeared in a newspaper supposedly offering the offices of the small party NEOS for sale, complete with the name and real mobile number of the party's founder. The IP address of whoever placed the ad traced back to the ÖVP.

October 08, 2015 04:01 PM

FIFA has suspended Blatter and his henchmen for 90 days ...

FIFA has suspended Blatter and his henchmen for 90 days. The kicker: corruption at FIFA is SO rampant that this is what happens now:
Issa Hayatou, who heads Africa's football confederation (Caf), will act as Fifa president during Blatter's ban.
Just picture that: the least corrupt potential successor comes from Africa!

Update: Someone who knows more about football than I do adds that teams from Africa are heavily underrepresented in world football. Seen from that angle, too, this is remarkable.

Update: Oh, and Wikipedia reports corruption allegations against him from 2010.

October 08, 2015 04:01 PM

Lamer News


Good Graduate Schools for Compilers?

Hey all; sorry if this doesn't belong here, but a few people directed me to this sub for this question. I'm working on grad school applications for next fall, and I'm trying to find schools with strong compilers programs.

I've found a lot of stuff at the upper end (UIUC comes up a lot, for obvious reasons), but I'd like to get my safety schools lined up as well. Does anyone have advice about lower-ranked universities that still have good work being done in compilers/programming languages?

submitted by FreakyCheeseMan
[link] [11 comments]

October 08, 2015 03:47 PM


What types of virus protection are available, and how suitable are they for a cyber-security company? [on hold]

What types of virus protection are available, and how suitable are they for a cyber-security company? I'm talking about top-notch stuff. Are there any methods or techniques that are rare or unusual? Other than anti-virus software, are there other ways to keep your systems protected from viruses?

by Wulfinite at October 08, 2015 03:36 PM



Dave Winer

Tale of 2015 blogging

I had a paragraph-length idea to blog, so I put it in three places.

  1. On the new blogging software I'm working on (private, sorry).

  2. On Facebook, as a post (a perfect fit).

  3. On Twitter as a tweetstorm using

The paragraph

Google just learned about movies in a new way. I went to see The Martian last Friday. I guess they figured out what I was doing, from my location and the amount of time I stayed there, when it started and ended. They figured out I was at the movies, and that I went to see The Martian. So now they're making movie recommendations in Google Now that say "Because you saw The Martian..."

The response

What's realllly cool is that I got a response from the product lead at Google Now, clarifying. I responded on Twitter.

There's a bunch of extra work, but the result is gratifying. I learned something about the software I'm using from the person who develops it.

October 08, 2015 03:12 PM


At the moment you occasionally read about how terrible ...

At the moment you occasionally read about how terrible self-driving cars will be for municipalities' ticket revenue, and how the insurers will all become obsolete, and so on. Current highlight: Volvo has announced that it will accept liability for accidents involving its self-driving cars.

If I were a car insurer, I would be worried right now.

So what are the insurers doing in the face of this doomsday scenario? They are offering themselves as "investors" in infrastructure, meaning bridges, highways and the like. Of course, an investment only pays off if it generates income in return, and more income than was invested. They naturally leave that aspect out of their self-portrayal, which drips with altruism. But now there is a position paper.

In a joint position paper (PDF), the German Insurance Association (GDV) and the Main Association of the German Construction Industry (HDB) argue that in future a mandatory, objective economic-feasibility study should be carried out before every construction project.
We used to build roads and repair bridges because they were needed. Now they have to be "economically viable". Page 11 discusses founding a federal Autobahn corporation.

So und jetzt wisst ihr plötzlich, wieso dieser sinnlose Maut-Blödsinn so dringend durchgepresst werden musste. Das war die Bedingung dafür, dass "privates Kapital" in den Autobahnbau fließen kann.

October 08, 2015 03:01 PM

I had another submission that actually advances the discussion ...

I had another submission that actually advances the discussion on communication problems between nerds and other nerds (because it's not only nerds and non-nerds who have problems, but also nerds among themselves).
Infrastructure is only ever judged when it fails; otherwise it isn't noticed at all. The Linux kernel is measured by "ReiserFS/btrfs/XFS/ext4 ate my data" and by "the kernel cores when the box wakes up from sleep", not by "oh, the write(2) system call actually wrote my data to disk".

That is a big difference from LibreOffice, which is not infrastructure and gets measured by new best cases (= new features) rather than by its worst cases.

So infrastructure developers use the same languages, tools, unit tests, VCSes and so on, but they work under a completely different value system.

When somebody shows up on the LKML saying "here, shiny new code" (= a new best case), they already know that he comes from a different, wrong universe. Infrastructure developers know that code always has bugs per LoC, and is therefore a liability, not an asset. You can of course try to explain that to such people again, and again, and again, but it costs more energy than it is worth.

You also don't want more people in an infrastructure project. Once you have enough participants for the project to run, more people are not an asset but a liability, and that applies not only to the LKML regulars, kernel committers or glibc developers, but also to the people who maintain your gas lines or power cables, and even to Wikipedia admins (sic!). These are a very particular kind of people, with a particular work ethic that stems entirely from the fact that they do their work well precisely when they are not thanked for it and simply nobody notices them.

Infrastructure projects don't just not want your code, they don't want your collaboration either, because, given sufficient critical mass, they need neither. In particular, they have no use for you if you don't have this infrastructure mindset, because otherwise you won't endure the criticism and the working conditions under a fail metric like the one that rules infrastructure.

If you arrive there as a shiny bling-bling feature developer, or with a "consider me and my concerns important" SJW mentality, then a Sarah Sharp happens.

I don't agree with this in every detail (I also judge the kernel by new features and by performance gains in old features), but that can be defined away with clever wording. I still find the remarks helpful, and I have observed the same thing myself. With statements this long I often don't name the sender, but in this case I'll link to a comparable statement of his, because Kris has been doing this for many, many years and knows what he is talking about. Understanding the different approaches of infrastructure work and feature work is important, and it can only help all of us to remind ourselves of it every now and then.

October 08, 2015 03:01 PM

From the popular category "OUR nuclear power is ...

From the popular category "OUR nuclear power is SAFE", today: the Swiss nuclear power plant Beznau has holes in its reactor pressure vessel.
Close to 1000 holes gape in the steel walls of the reactor pressure vessel of Beznau 1 [...] We are talking about holes with an average diameter of half a centimetre, enclosed within the steel walls.
The operating company had announced the following about this in July:
The operator of Beznau had announced on July 16 that it had "registered indications at some points" in the steel walls of the reactor pressure vessel "that point to minimal irregularities from the manufacturing process".

Update: a comment by mail:

They are not holes. Inside the more than 15 cm thick steel wall there are air bubbles of 5 mm diameter. As a physicist, I am convinced that they pose no risk.

October 08, 2015 03:01 PM


Combine multiple filter() predicates in Underscore / Ramda / Functional languages / Libraries

For a list of data, I want to filter it using several predicates and then perform an operation on each filtered result.

Eg. if my data is:

var people = [
    {name: 'Sachin',    profession: 'doctor',   cases: 12},
    {name: 'Djokovic',  profession: 'lawyer',   cases: 14},
    {name: 'Paes',      profession: 'doctor',   cases: 36},
    {name: 'Jordan',    profession: 'lawyer',   cases: 78},
    {name: 'Williams',  profession: 'doctor',   cases: 30},
    {name: 'Nehwal',    profession: 'lawyer',   cases: 75}
];

I want to convert it to:

var peopleWithoutCases = [
    {name: 'Sachin',    profession: 'doctor',   patients:   12, cases: 12},
    {name: 'Djokovic',  profession: 'lawyer',   courtcases: 14, cases: 14},
    {name: 'Paes',      profession: 'doctor',   patients:   36, cases: 36},
    {name: 'Jordan',    profession: 'lawyer',   courtcases: 78, cases: 78},
    {name: 'Williams',  profession: 'doctor',   patients:   30, cases: 30},
    {name: 'Nehwal',    profession: 'lawyer',   courtcases: 75, cases: 75}
];

Is there an elegant functional approach like this??

people
    .filter (person => person.profession == 'doctor')
    .map    (person => {
                person.patients = person.cases
                return person;
    })
    .filter (person => person.profession == 'lawyer')
    .map    (person => {
                person.courtcases = person.cases
                return person;
    })
The problem is that the first map returns an array containing only doctors, so the second filter returns [].

I know I can do this:

people
        .filter (person => person.profession == 'doctor')
        .map    (person => {
                    person.patients = person.cases
                    return person;
        })
    .concat(people
        .filter (person => person.profession == 'lawyer')
        .map    (person => {
                    person.courtcases = person.cases
                    return person;
        }))

Please correct me if I'm wrong, but this takes a multi-pass approach to the problem, which in my opinion becomes inefficient as the list and the number of predicates grow.

It's quite easy to write this with an imperative approach: a single for loop with multiple if statements. Efficient, but not elegant :)

Please suggest the best approach using Functional javascript like Underscore, LoDash or the Excellent RamdaJS library. How's it done in pure functional languages?


  1. Array order is not important in this case.
  2. Please don't take the example literally and suggest alternate solutions; I want a general solution for filtering and mapping lists for multiple predicates.
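For what it's worth, the general shape of a single-pass solution is to dispatch each element to the first matching (predicate, transform) pair. Here is a minimal language-agnostic sketch of that idea in Python (the rules and field names mirror the example above; `apply_rules` is a name I made up):

```python
# Each rule pairs a predicate with a transform; the first matching rule wins.
rules = [
    (lambda p: p['profession'] == 'doctor',
     lambda p: {**p, 'patients': p['cases']}),
    (lambda p: p['profession'] == 'lawyer',
     lambda p: {**p, 'courtcases': p['cases']}),
]

def apply_rules(items, rules):
    out = []
    for item in items:                 # a single pass over the data
        for pred, transform in rules:  # at most len(rules) checks per element
            if pred(item):
                out.append(transform(item))
                break
        else:
            out.append(item)           # no predicate matched: keep unchanged
    return out

people = [
    {'name': 'Sachin',   'profession': 'doctor', 'cases': 12},
    {'name': 'Djokovic', 'profession': 'lawyer', 'cases': 14},
]
result = apply_rules(people, rules)
```

In Ramda terms, this dispatch is roughly what `R.cond` applied inside a single `R.map` would give you.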

by Gaurav Ramanan at October 08, 2015 03:01 PM




Incorrect program for dividing polynomials in Scheme [on hold]

(define (poly-scale poly n) (map (lambda (x) (* x n)) poly))

(define (poly-quotient poly1 poly2) 
   (let ((scale (/ (leading-coef poly1) (leading-coef poly2))))
    (if (< (degree poly1) (degree poly2)) the-zero-poly 
        (- (degree poly1) (degree poly2)) 
        (poly-quotient (rest-of-poly (p+ (poly-scale poly2 (* scale -1)) poly1)) poly2))))) 

Can you please tell me what's wrong with the code above? The code looks correct to me, but it doesn't work.

by user40786 at October 08, 2015 02:58 PM


QuantLib: Is the StochasticProcess class adapt for a HJM type of modelling?

I would like to use the following model in QuantLib:

$\frac{dF(t,T)}{F(t,T)} = \sigma_se^{-\beta(T-t)}dW_{t}^{1} + \sigma_L\left(1-e^{-\beta(T-t)}\right)dW_{t}^{2}$

This is a reformulation of the Schwartz Smith model (Schwartz-Smith). $F(t,T)$ is the commodity future price and the model is to be calibrated to American option prices (options on futures).

I plan to proceed in the following way:

  1. Derive a class from StochasticProcess for the process.
  2. Implement a PricingEngine for the analytical formula of european options.
  3. Implement a PricingEngine for American options. I will use the Barone-Adesi/Whaley approximation. I have adapted the algorithm to use it with this model; I cannot use the provided implementation in the library, though. My implementation will follow the lines of the one in the library. I just have to plug in 3 things: the analytical formula for the European option, the delta, and the term that multiplies the second derivative wrt $F$ in the pricing PDE (the first two coming from the European pricing engine and the last one coming from the process).
  4. Implement a CalibratedModel.
  5. Implement a CalibrationHelper.

My problem is with point number 1. Is it OK to use the StochasticProcess class, or should I implement a different class, since I am in fact modelling a family of processes, one for each $T$?

Thank you for any help and thoughts.

by Giancarlo Giuffra at October 08, 2015 02:54 PM






What's the time complexity of this algorithm? And Why?

I am stuck by analyzing the time complexity of the following algorithm:

def fun (r, k, d, p):
    if d > p:
        return r
    if d = 0 and p = 0:
        r <- r + k
        return r
    if d > 0:
        fun (r, k + 1, d - 1, p)
    if p > 0:
        fun (r, k - 1, d, p - 1)

The root call will be fun (0, 0, n, n), and n is the size of the problem.

I guess that the recurrence relation is $ T(n, n) = T(n-1, n) + T(n, n-1)$, which is equivalent to $T(2n) = 2T(2n-1) \iff T(m) = 2T(m - 1)$, and so $O(2^m) \iff O(4^n)$.

Is my analysis correct (I know it's not very complete and exact)? If it does have serious flaw, please point it out or show me a correct and complete proof on the time complexity of this algorithm.
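One way to check such a guess empirically is to count the actual number of invocations for small n and compare against the conjectured bound. A small Python sketch (`calls` is a helper I introduce; it counts invocations of the unmemoized recursion, with the memoization only speeding up the counting itself):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def calls(d, p):
    """Number of invocations of fun(r, k, d, p); r and k don't affect control flow."""
    if d > p or (d == 0 and p == 0):
        return 1                      # base cases return immediately
    total = 1
    if d > 0:
        total += calls(d - 1, p)
    if p > 0:
        total += calls(d, p - 1)
    return total

for n in range(1, 10):
    assert calls(n, n) <= 4 ** n      # consistent with the O(4^n) upper bound
```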

by xando at October 08, 2015 02:17 PM

Homomorphism Languages

Let $h$ be a homomorphism and let $L$ be a language. Writing ${}^*$ for Kleene star, I want to show that $(h^{-1}(L))^* \neq h^{-1}(L^*)$. Can I prove this just by showing that there can be an $h$ and $L$ under which no string maps into $L$, and therefore the right side is the empty set? Doing the same to the left side gives $\emptyset^* = \{\epsilon\}\neq\emptyset$.

by POC at October 08, 2015 02:15 PM



non exhaustive pattern in function filtering?

cubes = [ (a,b,c) | a <- [1..30],b <-[1..30],c <- [1..30] ]

filtering (d,f,g)
 | d == f && f == g && d ==g = "cube"

third = filter newfun cubes 

newfun (x,y,z) = (filtering (x,y,z) == "cube")

*Charana> third
[(1,1,1)*** Exception: haskell.hs:(55,1)-(56,37): Non-exhaustive patterns in  function filtering

So when I put this in the terminal it gives me a non-exhaustive pattern error. The functions work fine individually, and the program compiles fine too. Any idea? Thank you.

by Charana at October 08, 2015 02:04 PM


The Federal Constitutional Court has once again issued an interesting ...

The Federal Constitutional Court has once again issued an interesting ruling. The key sentence:
If the police make film recordings of an assembly, they are not automatically entitled to establish the identity of assembly participants who are in turn filming the police officers.
That doesn't mean they may never establish identities, just not without further grounds.
Establishing identity is only permissible when there is a concrete danger to an interest protected by police law.

October 08, 2015 02:01 PM

Here is another position on the Linux debate-culture situation ...

Here is another position on the Linux debate-culture situation, one that hasn't been heard so far: someone who argues that the debate culture a) isn't bad at all, b) has become markedly better over the last 10 years, and c) that these unsubstantiated accusations are now doing more damage than the grievances being alleged.

And here someone has dug out and linked the postings this was actually all about. There you can form your own opinion.

October 08, 2015 02:01 PM

Oh man, a mail just came in that makes me ...

Oh man, a mail just came in that makes me quite nostalgic. A man I was already quarrelling with on Usenet 15 years ago. But we always stayed civil enough that it never came to killfile entries. I still consider it an achievement that we managed to put up with each other even though we held diametrically opposed positions on practically every issue.

But enough reminiscing, here is the mail:

I don't see the "fraud" here:
The task was "pass the unrealistic test", and VW pulled an unrealistic rabbit out of the hat.

Test passed, mission accomplished.

Granted, they were stupid enough to get caught.

You can see what I mean when I talk about substantive differences between us :)
I found VW's gigantic fraud of the mid-eighties worse by orders of magnitude; it still sends me into a rage today. I once submitted the story as a guest post on a friend's site; it's fairly long, the VW fraud is in the last quarter, and it is written to the best of my knowledge and belief.

The scandalous affair went through the daily press at the time, and it is a mystery to me why nobody remembers it.

But then, nobody remembers either that the right to free speech also applies to "Facebook" dimwits.

The confession behind the link is definitely worth reading, and for my part I don't remember ever having heard of the VW catalytic converter story he tells there.

October 08, 2015 02:01 PM



Functional oracles

In the traditional oracle Turing machine, the oracle is specified as a decision problem. Roughly speaking, one puts a string on the oracle tape and asks whether it belongs to the oracle language.

I am wondering whether it makes sense to consider a functional oracle. That is, one puts a string $x$ on the oracle tape, with the guarantee that the returned result $f(x)$ is polynomially bounded in length by the length of $x$, and the oracle returns $f(x)$.

A very natural such oracle is an FNP oracle, and one can define a class $P^{FNP}$.

Any study regarding this? Or is the notion not well defined? Any comments are welcome.

by maomao at October 08, 2015 01:47 PM

Planet FreeBSD

Trademark Usage Terms and Conditions Update

It was recently brought to my attention that when I updated the Foundation's Trademark Usage Terms and Conditions on February 17, 2015, I didn't update the date of the change or post a notification about the change on our website. I apologize for this unintentional oversight. The terms and conditions date has been updated to reflect today's date and this is a formal notification of the change in section 3 which specifically states that it is a violation to incorporate any of our Marks in a username. 

I'd like to remind the community that permission is required to use the Foundation's trademarks. Please refer to the Foundation’s Trademark Usage Terms and Conditions to get information on how to get permission to use the Foundation's trademarks. 

Deb Goodkin
Executive Director
The FreeBSD Foundation

by Deb at October 08, 2015 01:41 PM


Sorting a sequence with $O(n^{\frac{3}{2}})$ inversions

We are given a sequence $a_1,\dots,a_n$ with $O(n^{\frac{3}{2}})$ inversions, and I am thinking about a sorting algorithm for it.

I know a lower bound on the number of comparisons: it is $\Omega(n)$; otherwise there would be a minimum-finding algorithm faster than $O(n)$.

Nevertheless, I have no idea how to sort it in linear time. What do you think?

An inversion is a pair $(i, j)$ such that $i < j$ and $a_i > a_j$.
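For what it's worth, one classical fact fits this setting: insertion sort runs in Θ(n + I) time, where I is the number of inversions, because every element shift removes exactly one inversion. With I = O(n^{3/2}) that already gives an O(n^{3/2}) sort. A small Python sketch that counts the shifts:

```python
def insertion_sort(a):
    """Returns (sorted copy, number of shifts); shifts == number of inversions."""
    a = list(a)
    shifts = 0
    for i in range(1, len(a)):
        x = a[i]
        j = i - 1
        while j >= 0 and a[j] > x:
            a[j + 1] = a[j]   # each shift removes exactly one inversion
            j -= 1
            shifts += 1
        a[j + 1] = x
    return a, shifts

insertion_sort([3, 1, 2])   # -> ([1, 2, 3], 2): the two inversions (3,1), (3,2)
```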

by user40545 at October 08, 2015 01:35 PM



Starting BFS at s and t

INPUT: undirected graph, s, t

OUTPUT: connectivity of s and t

I perform BFS from s AND t, with the two searches taking turns to make one traversal step each.

When a vertex appears in both s's and t's BFS trees, we can conclude they are connected.

When one tree finishes its traversal without meeting the other, s and t are not connected.

Does such an algorithm exist or am I making stuff up?
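The alternating idea described above does exist; it is essentially bidirectional BFS (usually employed to speed up shortest-path search, since for plain connectivity a single BFS already suffices). A hedged Python sketch of the version described here, with made-up function and variable names:

```python
from collections import deque

def connected(adj, s, t):
    """adj: dict mapping vertex -> iterable of neighbours (undirected graph).
    Alternates one expansion step between the search from s and the one from t."""
    if s == t:
        return True
    seen = {s: {s}, t: {t}}
    queues = {s: deque([s]), t: deque([t])}
    while queues[s] or queues[t]:
        for side in (s, t):
            if not queues[side]:
                continue
            v = queues[side].popleft()
            other = t if side == s else s
            for w in adj.get(v, ()):
                if w in seen[other]:     # the two trees met: connected
                    return True
                if w not in seen[side]:
                    seen[side].add(w)
                    queues[side].append(w)
    return False  # one side exhausted its component without meeting the other

adj = {1: [2], 2: [1, 3], 3: [2], 4: [5], 5: [4]}   # two components
```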

by iluvAS at October 08, 2015 01:07 PM

Planet Emacsen

Irreal: Some Org Agenda Tips

Marcin Borkowski (mbork), who's been posting several great articles lately, has a nice offering with A few org-agenda hacks. It's a short list of some of the ways he's tweaked Org's agenda view to better suit his needs.

As Borkowski says, all of this is in the documentation but may be a bit hard to dig out. If you use even one of his tips it will have been worth reading this short but pithy post.

by jcs at October 08, 2015 01:06 PM




Analytical soluton to the Black-Scholes equation with a modified European Call Option

Please consider the following modified European Call Option

[image: definition of the modified option in terms of the parameter $a$]

where $ 0 < a \leq 1$. When $a = 1$ the modified European call option is reduced to the standard European call option.

Transforming the Black-Scholes equation into the standard heat equation and using the Fourier transform, I obtain the following analytical solution for this modified European call option:

[image: the analytical solution]

When $a=1$ the standard Black-Scholes formula for the usual European call option is recovered, namely

[image: the standard Black-Scholes formula]

My questions are:

  1. Do you know another method to derive the solution for the modified European call option?

  2. Do you know real-life cases in which modified European call options are applied?

by Juan Ospina at October 08, 2015 12:30 PM


Does "for every infinite history H of x, an infinite number of method calls are completed" mean that the object x is lock-free?

Is the following statement equivalent to x being lock-free? For every infinite history H of x, an infinite number of method calls are completed.

I believe the answer is yes, but I can't prove it correct:

So first, let me give some definitions:

  • An execution of a concurrent system is modeled by a history H, a finite sequence of method invocation and response events. A subhistory of a history H is a subsequence of the events of H.
  • A method is lock-free if it guarantees that infinitely often some
    method call finishes in a finite number of steps.

From the initial statement we have infinitely many method calls. So suppose that x isn't lock-free; that would mean that all method calls finish only after an infinite number of steps, which contradicts the initial statement that an infinite number of calls are completed.

Is that explanation correct or not? The question relates to multiprocessing algorithms, but I can't create such a tag.

by Rocketq at October 08, 2015 12:24 PM


Efficient algorithms for vertical visibility problem

During thinking on one problem, I realised that I need to create an efficient algorithm solving the following task:

The problem: we are given a two-dimensional square box of side $n$ whose sides are parallel to the axes. We can look into it through the top. However, there are also $m$ horizontal segments. Each segment has an integer $y$-coordinate ($0 \le y \le n$) and $x$-coordinates ($0 \le x_1 < x_2 \le n$) and connects points $(x_1,y)$ and $(x_2,y)$ (look at the picture below).

We would like to know, for each unit segment on the top of the box, how deep can we look vertically inside the box if we look through this segment.

Formally, for $x \in \{0,\dots,n-1\}$, we would like to find $\max_{i:\ [x,x+1]\subseteq[x_{1,i},x_{2,i}]} y_i$.

Example: given $n=9$ and $m=7$ segments located as in the picture below, the result is $(5, 5, 5, 3, 8, 3, 7, 8, 7)$. Look at how deep light can go into the box.

[Figure: seven segments; the shaded part indicates the region which can be reached by light]

Fortunately for us, both $n$ and $m$ are quite small and we can do the computations off-line.

The easiest algorithm solving this problem is brute force: for each segment, traverse the whole array and update it where necessary. However, this gives us a not very impressive $O(mn)$.

A great improvement is to use a segment tree that supports taking the maximum with a given value over a range (one such update per segment) and reading off the final values at the end. I won't describe it further, but we see that the time complexity is $O((m+n) \log n)$.

However, I came up with a faster algorithm:


  1. Sort the segments in decreasing order of $y$-coordinate (linear time using a variation of counting sort). Now note that if any $x$-unit segment has been covered by any segment before, no following segment can bound the light beam going through this $x$-unit segment anymore. Then we will do a line sweep from the top to the bottom of the box.

  2. Now let's introduce some definitions: $x$-unit segment is an imaginary horizontal segment on the sweep whose $x$-coordinates are integers and whose length is 1. Each segment during the sweeping process may be either unmarked (that is, a light beam going from the top of the box can reach this segment) or marked (opposite case). Consider a $x$-unit segment with $x_1=n$, $x_2=n+1$ always unmarked. Let's also introduce sets $S_0=\{0\}, S_1=\{1\}, \dots, S_n=\{n\}$. Each set will contain a whole sequence of consecutive marked $x$-unit segments (if there are any) with a following unmarked segment.

  3. We need a data structure that is able to operate on these segments and sets efficiently. We will use a find-union structure extended by a field holding the maximum $x$-unit segment index (index of the unmarked segment).

  4. Now we can handle the segments efficiently. Let's say we're now considering $i$-th segment in order (call it "query"), which begins in $x_1$ and ends in $x_2$. We need to find all the unmarked $x$-unit segments which are contained inside $i$-th segment (these are exactly the segments on which the light beam will end its way). We will do the following: firstly, we find the first unmarked segment inside the query (Find the representative of the set in which $x_1$ is contained and get the max index of this set, which is the unmarked segment by definition). Then this index $x$ is inside the query, add it to the result (the result for this segment is $y$) and mark this index (Union sets containing $x$ and $x+1$). Then repeat this procedure until we find all unmarked segments, that is, next Find query gives us index $x \ge x_2$.

Note that each find-union operation will be done in only two cases: either we begin considering a segment (which can happen $m$ times) or we've just marked a $x$-unit segment (this can happen $n$ times). Thus overall complexity is $O((n+m)\alpha(n))$ ($\alpha$ is an inverse Ackermann function). If something is not clear, I can elaborate more on this. Maybe I'll be able to add some pictures if I have some time.
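For concreteness, here is a compact Python sketch of steps 1-4 above (segments given as (x1, x2, y) triples; units not covered by any segment get depth 0, which is an assumption since the statement doesn't fix that value, and the find-union structure is the "pointer to the next unmarked unit" trick with path halving):

```python
def visibility(n, segments):
    """res[x] = max y over segments covering [x, x+1] (0 if none covers it)."""
    res = [0] * n
    nxt = list(range(n + 1))   # nxt[x]: leftmost unmarked unit segment >= x

    def find(x):
        while nxt[x] != x:
            nxt[x] = nxt[nxt[x]]   # path halving keeps finds near-constant
            x = nxt[x]
        return x

    for x1, x2, y in sorted(segments, key=lambda s: -s[2]):  # sweep top-down
        x = find(x1)
        while x < x2:
            res[x] = y        # the highest segment covering [x, x+1] wins
            nxt[x] = x + 1    # mark it; later queries skip over it
            x = find(x + 1)
    return res
```

Each unit is marked at most once and each segment triggers at least one find, matching the $O((n+m)\alpha(n))$ bound above.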

Now I reached "the wall". I can't come up with a linear algorithm, though it seems there should be one. So, I have two questions:

  • Is there a linear-time algorithm (that is, $O(n+m)$) solving the horizontal segment visibility problem?
  • If not, what is the proof that the visibility problem is $\omega(n+m)$?

by mnbvmar at October 08, 2015 12:08 PM


Does Orgzly support simple timestamps (for appointments etc) yet?

6 months ago, it didn't:

Has that changed yet?

Having to miss out on events to show up is such a bummer. Not everything is a TODO.

I've read a bit about org-mode and trello here and there. Is it a good alternative to Orgzly? All I really want is Todo and calendar like features in my app, nothing more.

submitted by eu-guy
[link] [10 comments]

October 08, 2015 11:50 AM



Constructing a regular expression from a NFA [on hold]

I am trying to find the regular expression from the following NFA's and would like some feedback on whether or not I am correct and why! I used the following algorithm to come up with my answers,

[image: the conversion algorithm]

For the first NFA I came up with:


For the second NFA I came up with:


by Hailey at October 08, 2015 11:44 AM





Where is the ambiguity in this grammar?

I am trying to understand ambiguous grammar in programming languages. I was given this ruleset and told it was ambiguous. If my understanding is correct, this means that it is possible to create the same sentence using different parse trees. After looking at it for a while, I can't find the sentence/parse trees that show the ambiguity.

s ::= a

a ::= a a | A | B | C
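A hint rather than a spoiler: the rule `a ::= a a` is where to look. A string of three terminals such as `A A A` can be derived with two different tree shapes (parentheses indicate the grouping, both derivations yield the same sentence):

```
s => a => a a => (a a) a => ... => (A A) A    left-heavy parse tree
s => a => a a => a (a a) => ... => A (A A)    right-heavy parse tree
```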

by ayayron at October 08, 2015 11:13 AM

Understanding many-one reductions [duplicate]

In an introductory course on Complexity Theory I am asked to prove

If $A$ is in $\text{P}$, then for every set $B$ that is not empty or equal to $\{0,1\}^*$: $A\leq^p_m B$

Here we are working with finite binary strings (as could be seen from the mentioning of $\{0,1\}^*$). It seems this is easy to prove, but I think I don't understand many-one reductions well enough, since I am not able to come up with a good proof of this.

(I first asked this question on Math.SE, but it seems more suitable here)

by konewka at October 08, 2015 11:13 AM

BIT: Unable to understand update operation in Binary index Tree

I have just read this answer and was very satisfied; it is indeed a fantastic answer. It taught me the workings of a BIT.

But at the end, the second last paragraph is where I am struggling. It says,

Similarly, let's think about how we would do an update step. To do this, we would want to follow the access path back up to the root, updating all nodes where we followed a left link upward. We can do this by essentially doing the above algorithm, but switching all 1's to 0's and 0's to 1's.

But when I try an example, it does not seem to work simply by switching 1's and 0's, as far as I can tell.

E.g., suppose we want to update the value at node 5 = 101. Switching 1s and 0s, we get 010... Now, applying the procedure given earlier, we will end up updating some other node.

I must be getting it wrong. Please correct me.

Thank you in advance.
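For reference, the update step is usually written with the lowest-set-bit trick rather than literal bit flipping; a hedged Python sketch of a standard Fenwick tree (equivalent to the linked answer's description, just a different formulation):

```python
class BIT:
    """1-indexed Fenwick tree for prefix sums."""
    def __init__(self, n):
        self.n = n
        self.tree = [0] * (n + 1)

    def update(self, i, delta):
        while i <= self.n:
            self.tree[i] += delta
            i += i & -i        # climb: 5 (101) -> 6 (110) -> 8 (1000)

    def prefix_sum(self, i):
        s = 0
        while i > 0:
            s += self.tree[i]
            i -= i & -i        # descend: strip the lowest set bit
        return s

bit = BIT(8)
bit.update(5, 10)   # touches tree indices 5, 6, 8, not 2 (= 010)
```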

by Bhavesh Munot at October 08, 2015 11:08 AM


Generating uniform integers in a range from a random generator with another range

Let $p$ and $q$ be two positive integers. I have an oracle that can generate a uniform integer in $\{1, \ldots, p\}$, the integers thus produced being independent across oracle calls. My goal is to generate a random number in $\{1, \ldots, q\}$ with uniform probability, and to do so in a way that minimizes the expected number of oracle calls required.

My question is: What is an optimal algorithm to solve this problem? I expect this to be a standard problem, so I'd be interested in pointers to the name of this problem in the literature. A special case of this problem ($p = 5$ and $q = 7$) is a standard puzzle or interview question (see here), but I was unable to find pointers to a generalization, and I don't see any optimality proof in those answers.

Of course, in general, the problem will not be solvable with a bounded number of oracle calls. Take for instance $p = 2$ and $q = 3$, or more generally, cases where some prime divisor of $q$ is not a divisor of $p$, and it is clear that for any bound $N$, the number of possible outcomes when doing $N$ calls, i.e,. $p^N$, cannot be divided evenly in $q$ buckets for each possible value.

One natural approach outlined here is to see the oracle calls as generating a uniform real number in [0, 1] from its $p$-ary writing, and terminate whenever the first decimal in the $q$-ary writing of that number is certain: think of it as using the oracle to select a subdivision of size $1/p^k$ in the interval $[0, 1]$, and concluding whenever the interval that you fall in is included in one of the $q$ intervals $[i/q, (i+1)/q]$ for $0 \leq i < q$. However, this approach does not seem optimal: for instance, for $p = q+1$, the approach only concludes in one call if the call returns 1, otherwise a second call is necessary. By contrast, a simpler rejection-based approach that calls the oracle, concludes if it falls in $\{1, \ldots, q\}$, and rejects and tries again if it falls in $q+1$, seems to have better performance.

One can try to generalize the rejection-based approach to the following: letting $k$ be minimal such that $p^k \geq q$, use $k$ oracle calls to get a uniform integer $n$ in $\{1, \ldots, p^k\}$. If $n$ falls in $\{1, \ldots, M\}$, for $M > 0$ the smallest multiple of $q$ which is $\leq p^k$, return $n \text{ mod } q$, otherwise try again. However it would seem that the additional information of which excess value in $\{M+1, \ldots, p^k\}$ was obtained could be also used in the next attempt. I can imagine how to do this (see here for a proposed algorithm, which I think is correct), but I have no idea of how to prove the optimality of such a scheme.
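For concreteness, here is a Python sketch of the batched rejection approach from the previous paragraph (the simple variant that throws the excess information away rather than carrying it over; `uniform_q` is my naming, and the oracle defaults to `random.randint` purely for illustration):

```python
import random

def uniform_q(p, q, oracle=None):
    """Uniform sample from {1, ..., q} given an oracle uniform on {1, ..., p}."""
    if oracle is None:
        oracle = lambda: random.randint(1, p)
    k = 1
    while p ** k < q:            # smallest k with p^k >= q
        k += 1
    M = (p ** k // q) * q        # largest multiple of q that is <= p^k
    while True:
        n = 0
        for _ in range(k):       # k oracle calls: n uniform on {0, ..., p^k - 1}
            n = n * p + (oracle() - 1)
        if n < M:                # accept: n mod q is then exactly uniform
            return n % q + 1
        # reject and retry; the refined scheme would reuse n - M here
```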

I am also curious about the natural variation of this problem where you want to generate not a single random number in $\{1, \ldots, q\}$, but a stream of such numbers. What's the expected number of calls per number, and an optimal method in this case?

[Note that an alternative vision of the problem that does not involve randomness at all is to think of the outcome trees: each node is either a leaf with label in $\{1, \ldots, q\}$ corresponding to a decision, or an internal node with $p$ children corresponding to an oracle call. An algorithm is a (generally infinite) such tree where, for each $1 \leq l \leq q$, the probability mass of all the leaves labeled $l$ (i.e., the sum of the mass over leaves, where the mass of each leaf is $1/p^h$, where $h$ is its height) is equal to $1/q$; and its expected performance is the sum, over all leaves, of $h/p^h$ (probability mass of this leaf, times the number of oracle calls performed in that case).]

by a3nm at October 08, 2015 10:45 AM


How many videos could a top-shelf smartphone play at once? (Not sure if this is the right place to ask)

I'm guessing it would depend on the video quality as well as the phone's processing power, etc. What would be an expected number for a Snapchat-quality video and for a standard video filmed on the phone?

submitted by Ryboflavin_B
[link] [1 comment]

October 08, 2015 10:20 AM



Stochastic Volatility CIR estimation

Would anyone have code (pref. Matlab or R) for any type of estimation (QML, GMM), not using option prices, of a stochastic volatility model driven by the CIR process described below?

\begin{equation} dS_t = \mu dt + \sqrt{v_t} dW_t^1 \end{equation} \begin{equation} dv_t = \kappa (\theta - v_t) dt + \sigma \sqrt{v_t} dW^2_t \end{equation} such that $dW_{t}^{1}\,dW_{t}^{2}=\rho dt$

I just need to cross-check my results.
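Not an estimator, but for cross-checking it may help to have an independent simulator of the model. A hedged Python sketch using an Euler scheme with full truncation to keep v nonnegative (`simulate` is my naming, and the parameter values below are arbitrary placeholders):

```python
import math
import random

def simulate(mu, kappa, theta, sigma, rho, s0, v0, T, n_steps, n_paths, seed=0):
    """Euler-Maruyama for dS = mu dt + sqrt(v) dW1 and
    dv = kappa(theta - v) dt + sigma sqrt(v) dW2, with corr(dW1, dW2) = rho.
    Returns a list of (S_T, v_T) pairs."""
    rng = random.Random(seed)
    dt = T / n_steps
    out = []
    for _ in range(n_paths):
        S, v = s0, v0
        for _ in range(n_steps):
            z1 = rng.gauss(0.0, 1.0)
            z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
            vp = max(v, 0.0)                       # full truncation
            S += mu * dt + math.sqrt(vp * dt) * z1
            v += kappa * (theta - vp) * dt + sigma * math.sqrt(vp * dt) * z2
        out.append((S, max(v, 0.0)))
    return out

paths = simulate(0.05, 2.0, 0.04, 0.3, -0.5, 100.0, 0.04, 1.0, 50, 2000)
```

Note that the price equation here is arithmetic (dS, not dS/S), matching the SDE as written in the question.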


by SchnitzelRaver at October 08, 2015 10:06 AM



Fred Wilson


TPP stands for Trans Pacific Partnership, a far reaching free trade deal that US and our Asian trading partners have been working on for years. The TPP was in the news yesterday because Hillary Clinton came out and said that, in its current form, she cannot support TPP. You can read her reasons for taking this stance.

You would think that, as a free-trade-loving, free-market-loving venture capitalist, I would be a huge proponent of TPP. But I am not.

I am very concerned about the copyright provisions in TPP, which feel very much like the old-world model of intellectual property protection and which would make it hard for the US government to evolve copyright law in an era of digital content, more open innovation, and remix culture.

The EFF has a great discussion of these issues on its website so instead of reciting them here, you can read a detailed discussion of the copyright issues in TPP here.

One of the problems with these big multi-national trade negotiations is that it is super hard to get everyone to agree on everything in them. That is why they are negotiated in secret and the end result is then voted yes or no in each country without any amendments.

I realize that perfect is the enemy of the good and you need to have a comprehensive view of a trade bill like this and not focus on one issue. But copyright law is a big deal for the innovation economy and if I were in Congress, I would be seriously thinking about voting no on TPP.

by Fred Wilson at October 08, 2015 09:48 AM




How to perform bottom-up construction of heaps?

What are the steps to perform bottom-up heap construction on a short sequence, like 1, 6, 7, 2, 4?

At this link there are instructions on how to do for a list of size 15, but I can't [necessarily] apply the same process to a list of 5 items (my trouble is that 5 is not enough to provide a complete tree).

by Imray at October 08, 2015 09:13 AM



UpStats - Freelance Job Market Analytics

Hi, I’ve been working on this for a while now. I launched the website today. It’s a free service that offers freelance job market analytics. There is a blog post here describing it in more detail

I’m interested in getting feedback on it. Thank you


by wsdookadr at October 08, 2015 08:23 AM


Why can we not read and write to the same address at the same time?

I was reading Wikipedia about the von Neumann bottleneck.

Surely there is some simple answer to this. Why can we not read and write to the same address at the same time? We can if the addresses are different.

by Jon Mark Perry at October 08, 2015 08:19 AM


FreeBSD and Windows show different times

In a dual-boot system, I usually use FreeBSD 9, but when I boot into Windows 7 the system time in both OSes automatically changes and shows an incorrect time. What is the problem and how can I solve it?

by hesam at October 08, 2015 08:17 AM


How to get list of all CUSIPS/ISIN?

I want a list of all CUSIPs/ISINs. It would be nice if they were also categorized (e.g. bonds/funds etc). Where can I get such data?

by wrick at October 08, 2015 08:12 AM



How to trade a Ratio?

I came across a plot of the ratio of corn and soybean contracts and noticed it is at a historical low. An intuitive question came to my mind: how should I trade this ratio (or relationship)? It is unlike flat or spread prices, which are always linear in the underlyings. It is also interesting to ask how to calculate the PnL if I trade a ratio rather than a linear combination of the underlyings.

by pidig at October 08, 2015 07:55 AM

Where to find pricing formulas for affine stochastic volatility jump-diffusion models?

Does anyone know a reference where I can find the pricing formulas for vanilla calls in the affine stochastic volatility jump diffusion class of models such as SVJ and SVJJ?

I am looking for something analogous to the following formulas which apply to the Heston (square root) affine stochastic volatility model:

\begin{align} c(t) & = \frac{e^{-\alpha\log K}}{\pi}\int_0^\infty dv e^{-i v \log K}\rho(v) \\ \rho(v) & = \frac{e^{-r(T-t)}\phi(v-i(\alpha+1);T)}{\alpha^2+\alpha-v^2 + i(2\alpha+1)v} \\ \phi(u;T) & = \mathbb{E}^{Q_B}_t[e^{i u \log S(T)}], \\ \phi(u;T) & = e^{i u[\log S(t)+(r-\delta)(T-t)]-\frac{1}{\sigma_v^2}\left[\bar{v}\kappa\left(a(T-t) + 2\log\beta\right)+v_0 \gamma \right]} \\ \beta & = \frac{1-ge^{-d (T-t)}}{1-g} \\ \gamma & = \frac{a(1-e^{-d (T-t)})}{1-g e^{-d (T-t)}} \\ d & = \sqrt{(i\rho \sigma_v u - \kappa)^2 + \sigma_v^2(iu + u^2)} \\ g & = a/b \\ a & = i\rho\sigma_v u-\kappa + d \\ b & = i\rho \sigma_v u-\kappa - d \end{align}

by user11881 at October 08, 2015 07:51 AM


Data strucutre for finite graphs on a cylinder

My problem is to find an efficient data structure that allows the encoding of a (non-directed) graph on the surface of a cylinder (without top and bottom discs). Usually I would store a graph as a list of edges, which are pairs $(a,b)$ of vertices. My problem is that I can have two kinds of edges from $a$ to $b$: one taking the shortest path, the other going around the cylinder. These two edges should be considered different.

My set of vertices is fixed ($n$ vertices on the top circle, $n$ vertices on the bottom circle of the cylinder) and I require that, for any vertex, there exists precisely one edge to another vertex. Furthermore, crossings of edges are not allowed (that is why it is important to be able to "go around" the cylinder, taking a "longer" way).

The aim then is to glue two such cylinders together (the bottom of one goes on the top of the other) and compute the connected components of the "glued" graphs, which will produce another graph on a cylinder.

Of course one could just tag the edges, depending whether one goes the shortest or longest way, but maybe someone has a better idea!?

by Christian Lomp at October 08, 2015 07:50 AM


How to treat large (5K-10K) non-positive-definite (particularly near-singular) covariance matrices for Cholesky decomposition?

I have a very large covariance matrix (around 10000x10000) of returns, which is constructed using a sample size of 1000 for 10000 variables. My goal is to perform a (good-looking) Cholesky decomposition of this matrix. However, as expected, this matrix is near-singular with very small ( < 10^-10 ) eigenvalues (around 5K-6K out of 10K).

I tried a naive approach: an eigenvalue decomposition, setting every eigenvalue below 10^-10 to 10^-10, and then reconstructing the matrix with the modified eigenvalue matrix. However, although I can then perform a Cholesky decomposition, it is very unstable.

What is the best way to handle this PD approximation for large matrices?
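One common alternative to clipping eigenvalues is diagonal loading: add a small multiple of the identity ("jitter") to the matrix and retry the factorization, increasing the jitter until Cholesky succeeds. A minimal pure-Python sketch of the idea, on a deliberately rank-deficient toy matrix (the helper names `cholesky` and `cholesky_with_jitter` are illustrative, not from any particular library):

```python
import math

def cholesky(a):
    """Plain Cholesky factorization of a symmetric matrix given as a list
    of lists; raises ValueError if the matrix is not positive definite."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                d = a[i][i] - s
                if d <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(d)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L

def cholesky_with_jitter(a, jitter=1e-10, max_tries=20):
    """Diagonal loading: retry with 10x more jitter on the diagonal
    until the factorization succeeds."""
    n = len(a)
    for _ in range(max_tries):
        try:
            return cholesky([[a[i][j] + (jitter if i == j else 0.0)
                              for j in range(n)] for i in range(n)])
        except ValueError:
            jitter *= 10.0
    raise ValueError("could not stabilize the matrix")

# A rank-deficient 3x3 "covariance" matrix: the third variable is the sum
# of the first two, so plain Cholesky fails but the jittered version works.
cov = [[1.0, 0.0, 1.0],
       [0.0, 1.0, 1.0],
       [1.0, 1.0, 2.0]]
L = cholesky_with_jitter(cov)
```

For a 10000x10000 matrix one would of course use a library routine for the factorization itself; the point here is only the retry-with-larger-jitter loop, which keeps the perturbation as small as the conditioning allows.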

by acmh at October 08, 2015 07:50 AM


Does ln n ∈ Θ(log2 n)? [duplicate]

This question already has an answer here:

Is that statement false or true? I believe it's false because ln(n) = log base e of n. Therefore, log base 2 of n can be a minimum because in 2^x = n, x will always be less than y in e^y = n. However, can it ever be proven that log base 2 of n can be a maximum?
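One standard fact worth recalling here is the change-of-base identity: the two logarithms differ only by a constant factor, which is exactly what $\Theta$ ignores:

\begin{equation} \ln n = \frac{\log_2 n}{\log_2 e} = (\ln 2)\,\log_2 n \approx 0.693\,\log_2 n \end{equation}

so $\ln n$ and $\log_2 n$ bound each other up to the constant $\ln 2$, which is what membership in $\Theta(\log_2 n)$ requires.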

by Jonathan Smit at October 08, 2015 07:48 AM

Maximal Independent Set in Hypergraph [on hold]

H = (V,E) is a Hypergraph H

E = {e1,e2,e3}

e1 = {v1,v2}

e2 = {v2,v3}

e3 = {v3,v4,v5}

We are going to find Maximal Independet Set in H.

MIS = {v1,v4,v5}(vertices labeled with red stars)

This is my question: We don't select any of e2 vertices(v2,v3) for MIS.{v1,v4,v5} is a right MIS? or it is a wrong MIS?

by H.H at October 08, 2015 07:38 AM


How do I wrap a chained stateful computation in State monad?

I have computations in this format: s -> a -> s, where s is the type of some state. The result of such a function is also the state of the next evaluation. For example,

appendInt :: String -> Int -> String
appendInt s i = s ++ (show i)

Then, appendInt "Int: " 1 will give "Int: 1", while (appendInt $ appendInt "Int: " 1) 2 will give "Int: 12". However, I cannot find a way to put this kind of computation in a State monad.

A first guess is s -> (s,s), but then a cannot be passed in. Then, I tried (a -> s) -> (s, a -> s), but again it is impossible to get s without a. s -> (a,s) won't work because a is the input instead of output.

How should I wrap this computation, then? Is the State monad appropriate for this?
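A step of type `s -> a -> s` fits the State monad once you flip it: partially applying the `a` gives an `s -> s` function, which is exactly what `modify` absorbs. A minimal sketch (using `Control.Monad.State` from the mtl package; the helper name `step` is made up for illustration):

```haskell
import Control.Monad.State

appendInt :: String -> Int -> String
appendInt s i = s ++ show i

-- Lift a step of type (s -> a -> s) into a State action
step :: (s -> a -> s) -> a -> State s ()
step f a = modify (\s -> f s a)

main :: IO ()
main = putStrLn (execState (mapM_ (step appendInt) [1, 2, 3]) "Int: ")
-- prints Int: 123
```

The chained applications become a sequence of monadic actions, and `execState` threads the accumulated string through them.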

by Carl Dong at October 08, 2015 07:16 AM



Gauss-Jordan using Visual Basic 2010 with solutions [on hold]

Can I ask how to program Gauss-Jordan elimination in Visual Basic 2010? It's our project next week. I hope that someone could help me. Thanks, God bless.

by ces25 at October 08, 2015 05:42 AM


Is there a javascript library for complete functional programming? [on hold]

I'm learning functional programming concepts by reading doctor Boolean's book:

I've got some basic understanding about functor, monad. So I'm looking for a library to program in pure functional style.

A lot of online tutorials recommend Ramda, Lodash, Underscore ... But I feel these libraries only contain utility functions for arrays and pure functions. What about monads such as Maybe, Future, and IO for dealing with side effects? I don't find them in these libraries.

Is there a js library which implement most functional programming features I can use for web development?

by Aaron Shen at October 08, 2015 05:03 AM


C modifying a pointers address with mod [on hold]

I have a program using shared memory as a circular queue. I have two pointers acting as the front pointer and rear pointer. The queue has 10 ints. How can I code this logic to move the pointers through the queue:

front = front % 10;

C doesn't like the % part. Ideas?
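One way out, as a sketch: the `%` operator is defined on integers, not pointers, so either keep `front` as an array index, or reduce the pointer to an integer offset from the queue's base, wrap the offset, and add it back. The helper names `next_index` and `advance` below are made up for illustration:

```c
#include <stddef.h>

#define QSIZE 10

int queue[QSIZE];

/* Option 1: store front/rear as integer indices; % works on ints. */
int next_index(int i) {
    return (i + 1) % QSIZE;
}

/* Option 2: if front must stay a pointer, wrap its offset from the
 * array base rather than the pointer value itself. */
int *advance(int *p) {
    return queue + (p - queue + 1) % QSIZE;
}
```

So `next_index(9)` wraps to 0, and `advance(&queue[9])` points back at `queue[0]`; in both variants `%` is only ever applied to an integer offset.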

by Pareod at October 08, 2015 04:49 AM

Small research problems to increase chances of getting hired at a big company [on hold]

I am unsure whether the purpose of this site is to answer such questions (if not, please tell me where to redirect it), but this is the one that seemed most appropriate.

I am going to have some small research projects during my master's this year and I would like to try to do some useful ones that will help me get noticed by big companies like Google, Facebook, Microsoft etc. I have done some research but only found run-of-the-mill project ideas.

Could you please tell me what kind of projects you think would help? Thanks in advance!

by SummerCode at October 08, 2015 04:23 AM


Git "write failed, file system is full fatal: unable to write new_index file"

I am running OpenBSD 5.7 and am modifying the kernel for a university assignment. It's running in a virtual machine.

I cloned the repository into '/usr/src' and started to modify 1 file. I went

$ cd /usr/src/sys/conf
$ nano files
$ git add files 
/usr/src: write failed, file system is full
error: unable to create temporary file: No space left on device
error: sys/conf/files: failed to insert into database
error: unable to index file sys/conf/files
fatal: updating files failed
$ df -ih
Filesystem     Size    Used   Avail Capacity iused   ifree  %iused  Mounted on
/dev/wd0a      731M   73.6M    620M    11%    1775  117263     1%   /
/dev/wd0k      7.0G    2.3M    6.6G     0%     284  935138     0%   /home
/dev/wd0d      1.1G   10.0K    1.1G     0%       6  155896     0%   /tmp
/dev/wd0f      1.5G    370M    1.1G    25%   11751  196119     6%   /usr
/dev/wd0g      895M    191M    660M    22%    9183  120735     7%   /usr/X11R6
/dev/wd0h      3.2G    169M    2.9G     5%    5041  436685     1%   /usr/local
/dev/wd0j      1.8G   1000M    710M    58%   26924  232914    10%   /usr/obj
/dev/wd0i      1.2G    1.2G   -8.0K   100%   70321  111565    39%   /usr/src
/dev/wd0e      1.7G    8.1M    1.6G     0%     582  233272     0%   /var

Not sure why this is happening. Any help is appreciated.

by jnd at October 08, 2015 04:02 AM


When does a PDA split?

In the case of an NFA, if the NFA is in a state and reads $\epsilon$ (the empty string), the NFA splits into two, with one copy staying at the current state and the other taking the state along the $\epsilon$ transition. In the case of a PDA, transitions are of the type $a,b \to c$, with $a$ being the input symbol being read, $b$ the stack element being read and popped, and $c$ the stack element being pushed. For an NFA I understood the splitting upon $\epsilon$ as the ability to guess. So I assumed that a PDA in a state $r$ with stack $S$ splits into two PDAs only when the transition is $\epsilon,\epsilon \to \epsilon$ (that is, when $a=b=c=\epsilon$ in the figure below, the PDA splits into two, with one being in state $r$ and the other in $s$ with the same stack $S$). But now I am a bit doubtful about when a PDA splits. I feel I am wrong and am misunderstanding something trivial. (The figure below just shows part of a larger PDA.)

by sasha at October 08, 2015 04:00 AM



Constructing a regular expression from a NFA [on hold]

I am trying to find the regular expression from the following NFA's and would like some feedback on whether or not I am correct and why! I used the following algorithm to come up with my answers,

(algorithm figure not shown)

For the first NFA I came up with:


For the second NFA I came up with:


by Takeln at October 08, 2015 03:30 AM


Please tell me if this high level view is accurate/correct

I've been doing a bit of recent reading since finishing school (BS) 6 years ago, and something hit me and I want to run it by this community for scrutiny.

With respect to the relationship between software engineering and computer science:

We know all turing complete systems are going to have "bugs" beyond programmer error. Specifically, because of both church-turing and Gödel's incompleteness theorem we know that turing machines can be programmed in such a way that given a turing complete language as input, there is an input that will bring the system down.

The way we can avoid contradictions that cause the system to... do whatever it means to bring the system down... is by either programming the machine to NOT accept a turing complete language, but instead a less complex language (context free or regular), or by further restricting input to a finite subset of a turing complete language.

Thanks in advance!

submitted by thedude42
[link] [9 comments]

October 08, 2015 03:19 AM


ubuntu touch emacs

I installed Ubuntu Touch. Of course I installed Emacs. But the default keyboard lacks Control and Meta keys. Does anyone know how to fix that?

submitted by brianqx
[link] [3 comments]

October 08, 2015 02:42 AM


where to get shares trading info [on hold]

I have no idea about finances, trading and other things.

But very interested in passive long term income.

I've read many things about how cheap Microsoft, Google, and Facebook shares were in the past.

And I would like to find more information about this business of share trading.

Thank you.

by Mikhael Djekson at October 08, 2015 02:29 AM


Java 8 Way of Adding in Elements

Is there a more concise, perhaps one-liner, way to write the following:

ArrayList<Integer> myList = new ArrayList<>();
for (int i = 0; i < 100; i++) {
    myList.add(i);
}

Using Java 8 features and functionally inspired approaches. I'm not expecting a Haskell solution like:

ls = [1..100]

But something more elegant than the traditional imperative style.
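For what it's worth, the usual Java 8 idiom for this is an `IntStream` pipeline collected into a list; a sketch (the class name `RangeList` is arbitrary):

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class RangeList {
    public static void main(String[] args) {
        // 1..100 as a List<Integer>, built in one expression
        List<Integer> myList = IntStream.rangeClosed(1, 100)
                                        .boxed()
                                        .collect(Collectors.toList());
        System.out.println(myList.size());
    }
}
```

`rangeClosed(1, 100)` yields 1 through 100 inclusive, and `boxed()` converts the `IntStream` to a `Stream<Integer>` so it can be collected into a `List`.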

by Michael at October 08, 2015 02:20 AM


Scala REST Client

Anyone have any opinions as to a good and relatively simple-to-use REST client in Scala? If so, could you explain why you've decided to use a specific client? I've read a little about Newman, which seemed user-friendly.

Edit: forgot to mention that I'm using a lift application

submitted by samchoii
[link] [16 comments]

October 08, 2015 02:07 AM



How to draw an automaton with this description?

I'm having trouble drawing an automaton with this description: S ← abb|aab|aaabbb|aaaS|Sbbb.

I have tried drawing the abb and aab states but am having trouble combining these into one automaton.

by user40655 at October 08, 2015 01:42 AM

Can I write a regular expression or regular grammar for this language?

Let's say I have a formal language,

$a^n b^n$, which contains ab, aabb, aaabbb, and so on.

Can I write any regular expression or grammar to create a language like this? I am not entirely sure, but I believe no regular expression is possible; however, I have no idea about the grammar part.
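For context, the classical answer: no regular expression or regular grammar can generate $a^n b^n$, because a finite automaton has no way to count the $a$'s (the standard pumping-lemma argument), but a two-production context-free grammar does:

\begin{equation} S \to aSb \mid ab \end{equation}

for example $S \Rightarrow aSb \Rightarrow aaSbb \Rightarrow aaabbb$.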

by Dude at October 08, 2015 01:36 AM


arXiv Computer Science and Game Theory

Asymptotically tight bounds for inefficiency in risk-averse selfish routing. (arXiv:1510.02067v1 [cs.GT])

We consider a nonatomic selfish routing model with independent stochastic travel times, represented by mean and variance latency functions for each edge that depend on their flows. In an effort to decouple the effect of risk-averse player preferences from selfish behavior on the degradation of system performance, Nikolova and Stier-Moses [16] defined the concept of the price of risk aversion as the worst-case ratio of the cost of an equilibrium with risk-averse players and that of an equilibrium with risk-neutral users. For risk-averse users who seek to minimize the mean plus variance of travel time on a path, they proved an upper bound on the price of risk aversion, which is independent of the latency functions, and grows linearly with the size of the graph and players' risk-aversion. In this follow-up paper, we provide a matching lower bound for graphs with number of vertices equal to powers of two, via the construction of a graph family inductively generated from the Braess graph. We also provide conceptually different bounds, which we call functional, that depend on the class of mean latency functions and provide characterizations that are independent of the network topology (first derived, in a more complicated way, by Meir and Parkes [10] in a different context with different techniques). We also supplement the upper bound with a new asymptotically-tight lower bound. Our third contribution is a tight bound on the price of risk aversion for a family of graphs that generalize series-parallel graphs which applies to users minimizing the mean plus standard deviation of a path, a much more complex model of risk-aversion due to the cost of a path being non-additive over edge costs. This is a refinement of previous results in [16] that characterized the price of risk-aversion for series-parallel graphs and for the Braess graph.

by Thanasis Lianeas, Evdokia Nikolova, Nicolas E. Stier-Moses at October 08, 2015 01:30 AM

Solving the Quadratic Assignment Problem on heterogeneous environment (CPUs and GPUs) with the application of Level 2 Reformulation and Linearization Technique. (arXiv:1510.02065v1 [cs.DC])

The Quadratic Assignment Problem, QAP, is a classic combinatorial optimization problem, classified as NP-hard and widely studied. This problem consists in assigning N facilities to N locations obeying a 1-to-1 relation, aiming to minimize the displacement costs between the facilities. The application of the Reformulation and Linearization Technique, RLT, to the QAP leads to a tight linear relaxation that is large and difficult to solve. Previous works based on level 3 RLT needed about 700GB of working memory to process one large instance (N = 30 facilities). We present a modified version of the algorithm proposed by Adams et al. which executes on heterogeneous systems (CPUs and GPUs), based on level 2 RLT. For some instances, our algorithm is up to 140 times faster and occupies 97% less memory than the level 3 RLT version. The proposed algorithm was able to solve two instances for the first time: tai35b and tai40b.

by Alexandre Domingues Gonçalves, Artur Alves Pessoa, Lúcia Maria de Assumpção Drummond, Cristiana Bentes, Ricardo Farias at October 08, 2015 01:30 AM

Budget Constraints in Prediction Markets. (arXiv:1510.02045v1 [cs.GT])

We give a detailed characterization of optimal trades under budget constraints in a prediction market with a cost-function-based automated market maker. We study how the budget constraints of individual traders affect their ability to impact the market price. As a concrete application of our characterization, we give sufficient conditions for a property we call budget additivity: two traders with budgets B and B' and the same beliefs would have a combined impact equal to a single trader with budget B+B'. That way, even if a single trader cannot move the market much, a crowd of like-minded traders can have the same desired effect. When the set of payoff vectors associated with outcomes, with coordinates corresponding to securities, is affinely independent, we obtain that a generalization of the heavily-used logarithmic market scoring rule is budget additive, but the quadratic market scoring rule is not. Our results may be used both descriptively, to understand if a particular market maker is affected by budget constraints or not, and prescriptively, as a recipe to construct markets.

by Nikhil Devanur, Miroslav Dudík, Zhiyi Huang, David M. Pennock at October 08, 2015 01:30 AM

Bitcoin-NG: A Scalable Blockchain Protocol. (arXiv:1510.02037v1 [cs.CR])

Cryptocurrencies, based on and led by Bitcoin, have shown promise as infrastructure for pseudonymous online payments, cheap remittance, trustless digital asset exchange, and smart contracts. However, Bitcoin-derived blockchain protocols have inherent scalability limits that trade-off between throughput and latency and withhold the realization of this potential.

This paper presents Bitcoin-NG, a new blockchain protocol designed to scale. Based on Bitcoin's blockchain protocol, Bitcoin-NG is Byzantine fault tolerant, is robust to extreme churn, and shares the same trust model obviating qualitative changes to the ecosystem.

In addition to Bitcoin-NG, we introduce several novel metrics of interest in quantifying the security and efficiency of Bitcoin-like blockchain protocols. We implement Bitcoin-NG and perform large-scale experiments at 15% the size of the operational Bitcoin system, using unchanged clients of both protocols. These experiments demonstrate that Bitcoin-NG scales optimally, with bandwidth limited only by the capacity of the individual nodes and latency limited only by the propagation time of the network.

by Ittay Eyal, Adem Efe Gencer, Emin Gun Sirer, Robbert van Renesse at October 08, 2015 01:30 AM

Algebraic Structure of Vector Fields in Financial Diffusion Models and its Applications. (arXiv:1510.02013v2 [math.PR] UPDATED)

High-order discretization schemes for SDEs using free Lie algebra valued random variables were introduced by Kusuoka, Lyons-Victoir, Ninomiya-Victoir and Ninomiya-Ninomiya, etc. These schemes are called the KLNV method. They involve solving the flows of vector fields, usually by numerical methods. The authors found a special Lie algebraic structure on the vector fields in financial diffusion models. Using this structure, the flows associated with the vector fields can be solved analytically, enabling high-speed computation.

by Yusuke Morimoto, Makiko Sasada at October 08, 2015 01:30 AM

VERCE delivers a productive e-Science environment for seismology research. (arXiv:1510.01989v1 [cs.DC])

The VERCE project has pioneered an e-Infrastructure to support researchers using established simulation codes on high-performance computers in conjunction with multiple sources of observational data. This is accessed and organised via the VERCE science gateway that makes it convenient for seismologists to use these resources from any location via the Internet. Their data handling is made flexible and scalable by two Python libraries, ObsPy and dispel4py and by data services delivered by ORFEUS and EUDAT. Provenance driven tools enable rapid exploration of results and of the relationships between data, which accelerates understanding and method improvement. These powerful facilities are integrated and draw on many other e-Infrastructures. This paper presents the motivation for building such systems, it reviews how solid-Earth scientists can make significant research progress using them and explains the architecture and mechanisms that make their construction and operation achievable. We conclude with a summary of the achievements to date and identify the crucial steps needed to extend the capabilities for seismologists, for solid-Earth scientists and for similar disciplines.

by Malcolm Atkinson, Michele Carpené, Emanuele Casarotti, Steffen Claus, Rosa Filgueira, Anton Frank, Michelle Galea, Tom Garth, André Gemünd, Heiner Igel, Iraklis Klampanos, Amrey Krause, Lion Krischer, Siew Hoon Leong, Federica Magnoni, Jonas Matser, Alberto Michelini, Andreas Rietbrock, Horst Schwichtenberg, Alessandro Spinuso, Jean-Pierre Vilotte at October 08, 2015 01:30 AM

On the Uniform Computational Content of the Baire Category Theorem. (arXiv:1510.01913v1 [math.LO])

We study the uniform computational content of different versions of the Baire Category Theorem in the Weihrauch lattice. The Baire Category Theorem can be seen as a pigeonhole principle that states that a complete (i.e., "large") metric space cannot be decomposed into countably many nowhere dense (i.e., "small") pieces. The Baire Category Theorem is an illuminating example of a theorem that can be used to demonstrate that one classical theorem can have several different computational interpretations. For one, we distinguish two different logical versions of the theorem, where one can be seen as the contrapositive form of the other one. The first version aims to find an uncovered point in the space, given a sequence of nowhere dense closed sets. The second version aims to find the index of a closed set that is somewhere dense, given a sequence of closed sets that cover the space. Even though the two statements behind these versions are equivalent to each other in classical logic, they are not equivalent in intuitionistic logic and likewise they exhibit different computational behavior in the Weihrauch lattice. Besides this logical distinction, we also consider different ways how the sequence of closed sets is "given". Essentially, we can distinguish between positive and negative information on closed sets. We discuss all the four resulting versions of the Baire Category Theorem. Somewhat surprisingly it turns out that the difference in providing the input information can also be expressed with the jump operation. Finally, we also relate the Baire Category Theorem to notions of genericity and computably comeager sets.

by Vasco Brattka, Matthew Hendtlass, Alexander P. Kreuzer at October 08, 2015 01:30 AM

Approximation and Heuristic Algorithms for Computing Backbones in Asymmetric Ad-Hoc Networks. (arXiv:1510.01866v1 [cs.NI])

We consider the problem of dominating set-based virtual backbone used for routing in asymmetric wireless ad-hoc networks. These networks have non-uniform transmission ranges and are modeled using the well-established disk graphs. The corresponding graph theoretic problem seeks a strongly connected dominating-absorbent set of minimum cardinality in a digraph. A subset of nodes in a digraph is a strongly connected dominating-absorbent set if the subgraph induced by these nodes is strongly connected and each node in the graph is either in the set or has both an in-neighbor and an out-neighbor in it. Distributed algorithms for this problem are of practical significance due to the dynamic nature of ad-hoc networks. We present a first distributed approximation algorithm, with a constant approximation factor and O(Diam) running time, where Diam is the diameter of the graph. Moreover we present a simple heuristic algorithm and conduct an extensive simulation study showing that our heuristic outperforms previously known approaches for the problem.

by Faisal N. Abu-Khzam, Christine Markarian, Friedhelm Meyer auf der Heide, Michael Schubert at October 08, 2015 01:30 AM

Cybernetic modeling of Industrial Control Systems: Towards threat analysis of critical infrastructure. (arXiv:1510.01861v1 [cs.CR])

Industrial Control Systems (ICS) encompassing resources for process automation are subjected to a wide variety of security threats. The threat landscape is arising due to increased adoption of Commercial-of-the-shelf (COTS) products as well as the convergence of Internet and legacy systems. Prevalent security approaches for protection of critical infrastructure are scattered among various subsystems and modules of ICS networks. This demands a new state-of-the-art cybernetic model of ICS networks, which can help in threat analysis by providing a comprehensive view of the relationships and interactions between the subsystems. Towards this direction, the principles of the Viable System Model (VSM) are applied to introduce a conceptual recursive model of secure ICS networks that can drive cyber security decisions.

by Abhinav Biswas, Sukanya Karunakaran at October 08, 2015 01:30 AM

Practical Accounting in Content-Centric Networking (extended version). (arXiv:1510.01852v1 [cs.NI])

Content-Centric Networking (CCN) is a new class of network architectures designed to address some key limitations of the current IP-based Internet. One of its main features is in-network content caching, which allows requests for content to be served by routers. Despite improved bandwidth utilization and lower latency for popular content retrieval, in-network content caching offers producers no means of collecting information about content that is requested and later served from network caches. Such information is often needed for accounting purposes. In this paper, we design some secure accounting schemes that vary in the degree of consumer, router, and producer involvement. Next, we identify and analyze performance and security tradeoffs, and show that specific per-consumer accounting is impossible in the presence of router caches and without application-specific support. We then recommend accounting strategies that entail a few simple requirements for CCN architectures. Finally, our experimental results show that forms of native and secure CCN accounting are both more viable and practical than application-specific approaches with little modification to the existing architecture and protocol.

by Cesar Ghali, Gene Tsudik, Christopher A. Wood, Edmund Yeh at October 08, 2015 01:30 AM

Fast Perfect Simulation of Vervaat Perpetuities. (arXiv:1510.01780v1 [math.PR])

This work presents a faster method of simulating exactly from a distribution known as a Vervaat perpetuity. A parameter of the Vervaat perpetuity is $\beta \in (0,\infty)$. An earlier method for simulating from this distribution ran in time $O((2.23\beta)^{\beta}).$ This earlier method utilized dominated coupling from the past that bounded a stochastic process for perpetuities from above. By extending to non-Markovian update functions, it is possible to create a new method that bounds the perpetuities from both above and below. This new approach is shown to run in $O(\beta \ln(\beta))$ time.

by Alex Cloud, Mark Huber at October 08, 2015 01:30 AM

Doubled patterns are $3$-avoidable. (arXiv:1510.01753v1 [cs.DM])

In combinatorics on words, a word $w$ over an alphabet $\Sigma$ is said to avoid a pattern $p$ over an alphabet $\Delta$ if there is no factor $f$ of $w$ such that $f=h(p)$ where $h:\Delta^*\to\Sigma^*$ is a non-erasing morphism. A pattern $p$ is said to be $k$-avoidable if there exists an infinite word over a $k$-letter alphabet that avoids $p$. A pattern is said to be doubled if no variable occurs only once. Doubled patterns with at most 3 variables and patterns with at least 6 variables are $3$-avoidable. We show that doubled patterns with 4 and 5 variables are also $3$-avoidable.

by Pascal Ochem at October 08, 2015 01:30 AM

Type Reconstruction for the Linear {\pi}-Calculus with Composite Regular Types. (arXiv:1510.01752v1 [cs.PL])

We extend the linear {\pi}-calculus with composite regular types in such a way that data containing linear values can be shared among several processes, if there is no overlapping access to such values. We describe a type reconstruction algorithm for the extended type system and discuss some practical aspects of its implementation.

by Luca Padovani at October 08, 2015 01:30 AM


Is the code of my binary call option pricer (using explicit finite difference, backward scheme) correct? [on hold]

I am using explicit finite difference (backward scheme) to price a binary call option.

Here is my MATLAB code:



% Binary Option


%% Finite Difference Method

% Stability Condition


% Allocation Memory

% Setting The Payoff Function

%Solving The Grid

for k=NTS:-1:0

    for i=(Asset_Steps):-1:2

        % Discretizing The Option Value and Computing The Greeks

        Delta = (V_Old(i + 1) - V_Old(i - 1)) / (2*ds);

        Gamma = (V_Old(i + 1) - 2 * V_Old(i) + V_Old(i - 1)) / (ds*ds);

        Theta = -0.5 * Sigma * Sigma * Stock(i) * Stock(i) * Gamma + ...
            R*(V_Old(i)-(Stock(i) * Delta));  % Black-Scholes PDE Solving

        V_New(i) = V_Old(i) - dt * Theta; % Explicit Scheme

    end

    % Boundary Conditions (applied once per time step, outside the inner loop)

    V_New(1) = V_Old(1) * (1 - R * dt); % Lower Boundary

    V_New(Asset_Steps+1) = 2 * V_New(Asset_Steps) - V_New(Asset_Steps - 1); % Upper Boundary

    % Marching Backwards in T

    V_Old = V_New;

end

% Interpolate the Grid to find the Option Value for the Stock Price

Is the way I coded the boundary conditions correct?
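For a cross-check, the same explicit backward scheme can be sketched self-contained in Python/NumPy; the grid bound, time-step choice, and parameter values below are illustrative assumptions, not values from the question:

```python
import numpy as np

def binary_call_explicit_fd(S0, K, r, sigma, T, asset_steps=200):
    """Price a cash-or-nothing binary call by marching the Black-Scholes
    PDE backwards with an explicit finite-difference scheme."""
    S_max = 2.0 * K                         # illustrative grid upper bound
    ds = S_max / asset_steps
    S = np.arange(asset_steps + 1) * ds
    # Stability condition for the explicit scheme: dt <= ds^2 / (sigma^2 S_max^2)
    dt = 0.9 * ds**2 / (sigma**2 * S_max**2)
    nts = int(T / dt) + 1
    dt = T / nts

    V = (S > K).astype(float)               # binary call payoff at expiry
    for _ in range(nts):
        V_new = np.empty_like(V)
        delta = (V[2:] - V[:-2]) / (2.0 * ds)
        gamma = (V[2:] - 2.0 * V[1:-1] + V[:-2]) / ds**2
        theta = (-0.5 * sigma**2 * S[1:-1]**2 * gamma
                 + r * (V[1:-1] - S[1:-1] * delta))
        V_new[1:-1] = V[1:-1] - dt * theta  # explicit step
        # Boundary conditions, mirroring the MATLAB code above
        V_new[0] = V[0] * (1.0 - r * dt)           # lower: pure discounting
        V_new[-1] = 2.0 * V_new[-2] - V_new[-3]    # upper: zero gamma
        V = V_new
    return float(np.interp(S0, S, V))
```

One sanity check is to compare the result against the closed-form cash-or-nothing price $e^{-rT}N(d_2)$; with these boundary choices the grid value lands close to it.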

by user161976 at October 08, 2015 01:06 AM

Planet Theory

Balanced Islands in Two Colored Point Sets in the Plane

Authors: Oswin Aichholzer, Nieves Atienza, Ruy Fabila-Monroy, Pablo Perez-Lantero, José M. Díaz-Báñez, David Flores-Peñaloza, Birgit Vogtenhuber, Jorge Urrutia
Download: PDF
Abstract: Let $P$ be a set of $n$ points in general position in the plane, $r$ of which are red and $b$ of which are blue. In this paper we prove that there exist: for every $\alpha \in \left [ 0,\frac{1}{2} \right ]$, a convex set containing exactly $\lceil \alpha r\rceil$ red points and exactly $\lceil \alpha b \rceil$ blue points of $P$; a convex set containing exactly $\left \lceil \frac{r+1}{2}\right \rceil$ red points and exactly $\left \lceil \frac{b+1}{2}\right \rceil$ blue points of $P$. Furthermore, we present polynomial time algorithms to find these convex sets. In the first case we provide an $O(n^4)$ time algorithm and an $O(n^2\log n)$ time algorithm in the second case. Finally, if $\lceil \alpha r\rceil+\lceil \alpha b\rceil$ is small, that is, not much larger than $\frac{1}{3}n$, we improve the running time to $O(n \log n)$.

October 08, 2015 12:42 AM

A Box Decomposition Algorithm to Compute the Hypervolume Indicator. (arXiv:1510.01963v2 [cs.DM] UPDATED)

We propose a new approach to the computation of the hypervolume indicator, based on partitioning the dominated region into a set of axis-parallel hyperrectangles or boxes. We present a nonincremental algorithm and an incremental algorithm, which allows insertions of points, whose time complexities are $O(n^{\lfloor \frac{p-1}{2} \rfloor+1})$ and $O(n^{\lfloor \frac{p}{2} \rfloor+1})$, respectively. While the theoretical complexity of such a method is lower bounded by the complexity of the partition, which is, in the worst-case, larger than the best upper bound on the complexity of the hypervolume computation, we show that it is practically efficient. In particular, the nonincremental algorithm competes with the currently most practically efficient algorithms. Finally, we prove an enhanced upper bound of $O(n^{p-1})$ and a lower bound of $\Omega (n^{\lfloor \frac{p}{2}\rfloor} \log n )$ for $p \geq 4$ on the worst-case complexity of the WFG algorithm.

by <a href="">Renaud Lacour</a>, <a href="">Kathrin Klamroth</a>, <a href="">Carlos M. Fonseca</a> at October 08, 2015 12:41 AM

Source Localization in Networks: Trees and Beyond

Authors: Kai Zhu, Lei Ying
Download: PDF
Abstract: Information diffusion in networks can be used to model many real-world phenomena, including rumor spreading on online social networks, epidemics in human beings, and malware on the Internet. Informally speaking, the source localization problem is to identify a node in the network that provides the best explanation of the observed diffusion. Despite significant efforts and successes over last few years, theoretical guarantees of source localization algorithms were established only for tree networks due to the complexity of the problem. This paper presents a new source localization algorithm, called the Short-Fat Tree (SFT) algorithm. Loosely speaking, the algorithm selects the node such that the breadth-first search (BFS) tree from the node has the minimum depth but the maximum number of leaf nodes. Performance guarantees of SFT under the independent cascade (IC) model are established for both tree networks and the Erdos-Renyi (ER) random graph. On tree networks, SFT is the maximum a posterior (MAP) estimator. On the ER random graph, the following fundamental limits have been obtained: $(i)$ when the infection duration $<\frac{2}{3}t_u,$ SFT identifies the source with probability one asymptotically, where $t_u=\left\lceil\frac{\log n}{\log \mu}\right\rceil+2$ and $\mu$ is the average node degree, $(ii)$ when the infection duration $>t_u,$ the probability of identifying the source approaches zero asymptotically under any algorithm; and $(iii)$ when infection duration $<t_u,$ the BFS tree starting from the source is a fat tree. Numerical experiments on tree networks, the ER random graphs and real world networks with different evaluation metrics show that the SFT algorithm outperforms existing algorithms.

October 08, 2015 12:41 AM

Low regret bounds for Bandits with Knapsacks

Authors: Arthur Flajolet, Patrick Jaillet
Download: PDF
Abstract: Achievable regret bounds for Multi-Armed Bandit problems are now well-documented. They can be classified into two categories based on the dependence on the time horizon $T$: (1) small, distribution-dependent, bounds of order of magnitude $\ln(T)$ and (2) robust, distribution-free, bounds of order of magnitude $\sqrt{T}$. The Bandits with Knapsacks theory, an extension to the framework allowing to model resource consumption, lacks this duality. While several algorithms have been shown to yield asymptotically optimal distribution-free bounds on regret, there has been little progress toward the development of small distribution-dependent regret bounds. We partially bridge the gap by designing a general purpose algorithm which we show enjoy asymptotically optimal regret bounds in several cases that encompass many practical applications including dynamic pricing with limited supply and online bidding in ad auctions.

October 08, 2015 12:41 AM

M. Levin's construction of absolutely normal numbers with very low discrepancy

Authors: Nicolás Alvarez, Verónica Becher
Download: PDF
Abstract: Among the currently known constructions of absolutely normal numbers, the one given by Mordechay Levin in 1979 achieves the lowest discrepancy bound. In this work we analyze this construction in terms of computability and computational complexity. We show that, under basic assumptions, it yields a computable real number. The construction does not give the digits of the fractional expansion explicitly, but it gives a sequence of increasing approximations whose limit is the announced absolutely normal number. The $n$-th approximation has an error less than $2^{2^{-n}}$. To obtain the $n$-th approximation the construction requires, in the worst case, a number of mathematical operations that is double exponential in $n$. We consider variants on the construction that reduce the computational complexity at the expense of an increment in discrepancy.

October 08, 2015 12:40 AM


Matlab question. I want to display a row of a matrix based on user input [on hold]

My goal is to prompt the user for a value, and then output the row in my matrix that corresponds to the value they enter. The command window recognizes my matrix but does not output the specific rows, even when I enter the right values.

Here's what I have so far.

prompt='Please enter an alloy code: '; %enter either A2042,A6061 or A7005 x=input(prompt);

A2042=a(1, :); A6061=a(2, :); A7005=a(3, :);

%alloy compositions a=[4.4 1.5 0.6 0 0; 0 1 0 0.6 0; 0 1.4 0 0 4.5; 1.6 2.5 0 0 5.6; 0 0.3 0 7 0];

So when I enter A2042, I want it to display row 1. For some reason, it's not cooperating. Thank you for your help!
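For comparison, the intended lookup is just a map from alloy code to composition row; a Python sketch (rows copied from the question — in MATLAB the analogous fix would include reading the input as a string with `input(prompt, 's')`):

```python
# Composition rows copied from the question; keys are the alloy codes.
alloys = {
    "A2042": [4.4, 1.5, 0.6, 0, 0],
    "A6061": [0, 1, 0, 0.6, 0],
    "A7005": [0, 1.4, 0, 0, 4.5],
}

code = "A2042"            # would come from user input in the real program
row = alloys.get(code)    # None if the code is unknown
print(row)
```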

by Anthony Desmond at October 08, 2015 12:40 AM


Machine Learning vs Regression and/or Why still use the latter?

I come from a different field (Machine learning/AI/data science), but aim to ask a philosophical question with the utmost respect: Why do quantitative financial analysts (analysts/traders/etc.) prefer (or at least seem to prefer) traditional statistical methods (traditional = frequentist/regression/normal correlation methods/ts analysis) over newer AI/machine learning methods? I've read a million models, but it seems biased? Background: I recently joined a 1B AUM (I know it's not a ton) asset management firm. I was asked to build a new model for a sector rotation strategy (basically predicting which S&P 500 sector would do the best over 6 months; I chose to use forward rolling 6-month returns) they employ, and my first inclination was to combine ARIMA (traditional) with random forest (feature selection) and a categorical (based on normal distribution standard deviation) gradient boosted classifier for ETFs in each sector. Not to be rude, but I beat the ValuLine timeliness for each sector. I used the above mentioned returns as my indicator and pretty much threw everything at the wall for predictors initially (basically just combing FRED), then used randomForest to select features. I ended up combining EMA and percent change to create a pretty solid model that, like I said, beat ValuLine.

I've read a lot of literature, and I haven't seen anyone do anything like this. Any help in terms of pointing me in the right direction for literature? Or any answers to the overarching idea of why isn't there more machine learning in equity markets (forgetting social/news analysis)? EDIT: For clarification, I'm really interested in long-term predictions (I think Shiller was right) based on macro predictors.


PS- I've been lurking for a while. Thanks for all the awesome questions, answers, and discussions.

by Kirk Hadley at October 08, 2015 12:08 AM

hubertf's NetBSD blog

Interview with Jeff Rizzo on NetBSD 7.0

The Polish Sektor BSD page has an interview with Jeff Rizzo on NetBSD 7.0 online. With the NetBSD 7.0 release around the corner, this is good timing!

The interview is available in both English and Polish, and starts off a series of NetBSD-related interviews. Have a look!

October 08, 2015 12:08 AM


What does composability mean in context of functional programming?

What do functional programmers mean when they say a certain thing is composable or not composable?

Some of the statements of this sort that I've read are:

  • Control structures are not composable.
  • Threads do not compose.
  • Monadic operations are composable.
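A rough illustration of the first and third bullets in Python: ordinary functions snap together into new functions of the same kind, which is the property the statements above are about, whereas a control structure like a loop does not produce a value you can pass along (a sketch, not taken from any particular answer):

```python
def compose(f, g):
    """Return the function x -> f(g(x)); the result is again an
    ordinary function, so it can be composed further."""
    return lambda x: f(g(x))

inc = lambda x: x + 1
double = lambda x: x * 2

inc_then_double = compose(double, inc)   # double(inc(x))
print(inc_then_double(3))                # double(4) = 8
```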

by Surya at October 08, 2015 12:01 AM

HN Daily

Planet Theory

On the Hardest Problem Formulations for the 0/1 Lasserre Hierarchy

Authors: Adam Kurpisz, Samuli Leppänen, Monaldo Mastrolilli
Download: PDF
Abstract: The Lasserre/Sum-of-Squares (SoS) hierarchy is a systematic procedure for constructing a sequence of increasingly tight semidefinite relaxations. It is known that the hierarchy converges to the 0/1 polytope in n levels and captures the convex relaxations used in the best available approximation algorithms for a wide variety of optimization problems.

In this paper we characterize the set of 0/1 integer linear problems and unconstrained 0/1 polynomial optimization problems that can still have an integrality gap at level n-1. These problems are the hardest for the Lasserre hierarchy in this sense.

October 08, 2015 12:00 AM

October 07, 2015


what data to use to compare the interest rate among different currencies?

Very new to fixed income signals. I am a little confused about which data to use to compare interest rates among different currencies.

For example, I am interested in comparing interest rates in the following currencies (countries): USD, EUR, GBP, JPY, CHF, CAD, DKK, NZD, AUD, SEK and NOK. Before 2013, we could get LIBOR for almost all of them (except NOK, I believe), but now only the first 5 currencies have LIBOR rates updated, and I have no idea where to get the interest rate for some currencies such as AUD and NZD. Some currencies have their own interbank rate, such as STIBOR for SEK and CIBOR for DKK; are these the right interest rates to use? Also, there is more than one IBOR available for some currencies; for example, EUR has both LIBOR and EURIBOR. Which one should I choose?

What about swap rates? Since swaps are closely related to the interbank rate, would swap rates be better data for measuring the monetary policy of each country? If so, what kind of swap data should I use? Thank you very much!

by user6396 at October 07, 2015 11:02 PM


what's "pseudo time" when used in comparison with semaphores

I'm currently listening to Alan Kay's talk "Is it really complex or did we just make it complicated?", where he says that "semaphores were a bad idea and there was something called pseudo time that was superior" (at 51:40 on the linked video). Maybe I misunderstood the term "pseudo time", but do you know anything about it?

by molyss at October 07, 2015 11:02 PM


Is there a good backtesting package in R?

My model exports a vector that has, for each day, b (buy), s (sell), or h (hold). It looks like this:

sig [1] b b s s b b b s s b s b s s b s b s s s s b b s s b b b b b b s b b b b b b b

I want to backtest a strategy that buys or sells all the equity in the portfolio at the end of each day, and does nothing on hold. What is the best way to backtest this strategy, in R or by some other method?
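If R isn't a hard requirement, the all-in/all-out logic is compact in plain Python; the prices and the interpretation of the signals below are illustrative assumptions:

```python
def backtest(prices, signals):
    """All-in/all-out at each close: after a 'b' the portfolio is fully
    invested, after an 's' it is in cash, and 'h' keeps the previous state."""
    equity, invested = 1.0, False
    for i, sig in enumerate(signals):
        if i > 0 and invested:
            equity *= prices[i] / prices[i - 1]   # realize the day's move
        if sig == 'b':
            invested = True
        elif sig == 's':
            invested = False
        # 'h': position unchanged
    return equity

print(backtest([10.0, 11.0, 10.0], ['b', 'h', 's']))
```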


by alonch7 at October 07, 2015 11:00 PM


what algorithm to apply to such a graph?

I'm currently studying the shortest path problem.

I have to apply one of the three following algorithms: Bellman, Bellman-Kalaba, or Dijkstra to the following graph:

graph with ten vertices

I have to search for the shortest path between $x_1$ and $x_{10}$, yet I don't know which to choose. We only studied oriented graphs for these algorithms in class.

I. The problem and my assumptions

I know that for the search for the shortest path between $x_i$ and $x_j$ the necessary condition is that the graph $G_{kj}$ exists and that it may not have absorbing loops (that is to say, cycles of value $<0$ in the case of $min$ and $>0$ in the case of $max$).

I started with Dijkstra's because Bellman's can't be applied when there is a cycle.

(1) $M \leftarrow \{x_1\}$, $d_1=0$, $d_i= \begin{cases} v_{1i}, & \mbox{if } (x_1,x_i)\in U \\ \infty, & \mbox{else} \end{cases}$

(2) Select $i \in X-M$ with $d_i=\min_{j\in X-M} d_j$

$M\leftarrow M\cup \{ x_i \}$

if $x_i=x_t$ then END

if $M = X$ then END

for all $j\in (X-M)\cap \Gamma(x_i)$: $d_j\leftarrow \min\{d_j,\, d_i+v_{ij}\}$

return to (2)

II. What I tried

I wrote that $d_1=0, d_2=5, d_4=10, d_3=14$


(1) Selection of $x_2$: $M=\{x_1,x_2\}$




(1) Selection of $x_5$: $M=\{x_1,x_2, x_5\}$


First, I don't know if I'm applying it the right way; second, I don't know if this is the right algorithm to apply in this case. Any hint appreciated!
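As a sanity check for step II, Dijkstra's algorithm can be run on a small made-up graph (not the one in the figure; the direct-edge values below loosely echo the $d$-values above):

```python
import heapq

def dijkstra(adj, source):
    """Shortest distances from source in a graph with non-negative
    edge values; adj maps node -> list of (neighbor, value) pairs."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                      # stale heap entry, skip
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

graph = {'x1': [('x2', 5), ('x3', 14), ('x4', 10)],
         'x2': [('x3', 2)], 'x3': [('x4', 1)], 'x4': []}
print(dijkstra(graph, 'x1'))
```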

by Marine1 at October 07, 2015 10:49 PM

PageRank toolkit for large graphs [on hold]

I need to compute the PageRank scores for a large graph which cannot be loaded into memory. I need a simple toolkit that can be easily modified, since I need to change its code in my research. Are you aware of any useful and simple toolkit that computes PageRank for large graphs? (The graph is around 40 GB.)
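For reference, the core computation is small enough that a hand-rolled power iteration may be easier to modify than a full toolkit; the sketch below holds the edge list in memory, but for a 40 GB graph the edge stream would be re-read from disk on each pass (everything here is an illustrative assumption):

```python
def pagerank(edges, n, d=0.85, iters=50):
    """Power iteration over an edge list of (src, dst) pairs, nodes 0..n-1.
    For a graph too large for memory, `edges` can be a generator that
    re-reads the edge file on each pass instead of a list."""
    out_deg = [0] * n
    for s, _ in edges:
        out_deg[s] += 1
    rank = [1.0 / n] * n
    for _ in range(iters):
        new = [(1.0 - d) / n] * n
        for s, t in edges:
            new[t] += d * rank[s] / out_deg[s]
        # redistribute the mass of dangling nodes uniformly
        dangling = sum(r for r, deg in zip(rank, out_deg) if deg == 0)
        new = [x + d * dangling / n for x in new]
        rank = new
    return rank
```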


by boomz at October 07, 2015 10:29 PM



Need help with discrete mathematics

As the title says I seriously need help with discrete math. I've been making it from assignment to assignment, barely, and finally had the last straw today when I took my mid-term and could only answer half of the test. Aside from the already existing language barrier between my teacher and myself he's not a very good teacher at all. Basically what I'm asking is does anyone know a very good, in depth, comprehensive guide to discrete math. Whether it is a video series or textbook or whatever I'm fine with it.

submitted by nodnarbiter
[link] [7 comments]

October 07, 2015 09:50 PM




propositional logic valuation in sml

I'm trying to define a propositional logic valuation using an SML structure. A valuation in propositional logic maps named variables (i.e., strings) to Boolean values.

Here is signature :

signature VALUATION =
sig
    type T
    val empty: T
    val set: T -> string -> bool -> T
    val value_of: T -> string -> bool
    val variables:  T -> string list
    val print: T -> unit
end

Then I define a matching structure:

structure Valuation :> VALUATION =
struct
    type T = (string * bool) list
    val empty = []
    fun set C a b = (a, b) :: C
    fun value_of [] x = false
      | value_of ((a,b)::d) x = if x = a then b else value_of d x
    fun variables [] = []
      | variables ((a,b)::d) = a :: (variables d)
    fun print valuation =
        (List.app (fn name => TextIO.print (name ^ " = " ^ Bool.toString (value_of valuation name) ^ "\n"))
                  (variables valuation);
         TextIO.print "\n")
end

So the valuations should look like [("s",true), ("c", false), ("a", false)]

But I can't declare a value of the structure's type directly, e.g. [("s",true)] : Valuation.T. When I try to use the valuation in a function I get errors like:

Can't unify (string * bool) list (*In Basis*) with

Could someone help me? Thanks.

by Cheikh Ahmadou at October 07, 2015 09:21 PM


Is it worth to study economics with a focus on financial engineering (research area)? [on hold]

I have a background in mathematics. I have been working as a software developer for several years. My work environment is hilarious but I feel that the products which we develop are not that meaningful. I am starting to feel a little bit upset.
Recently I started looking for some good PhD studies. I was always interested in economics, mostly in global and macroeconomics. Moreover, I discovered financial engineering, which seems to be a very mathematical part of economics. I am considering a profession change to a more quantitative field like quant developer or quant analyst. Here come my questions to more experienced users. I would really appreciate hearing your thoughts and opinions.
1. Is it worth studying economics with a focus on financial engineering? I have the impression that the Nobel prize in economics is like the Nobel prize in literature: the current trend matters, and people with a Nobel prize in economics are not as respected as Nobel winners in other fields like physics.
2. Are economists wanted as professionals, and respected?
3. How can we tell a good economist?
4. Is expertise in economics useful for a person and for society? Can it bring some added value?
5. Is it worth studying if I am considering changing to a more quantitative job?

by dave at October 07, 2015 09:21 PM


Counting words of length $n$ in an inherently ambiguous CFG?

There is a polynomial-time algorithm for computing the number of words of length $n$ in an unambiguous CFG $G = (V, \Sigma, R, S)$ (via a dynamic programming approach). However, for ambiguous CFGs, the algorithm only computes the number of parse trees resulting in strings of length $n$. Therefore, this result is not the number of words of length $n$ in an ambiguous CFG.

Is there a result (other than testing all possible strings of length $n$) for inherently ambiguous CFGs?
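For concreteness, the dynamic program mentioned above can be sketched for a grammar in Chomsky normal form; run on the ambiguous grammar $S \to SS \mid a$ it returns Catalan numbers, even though only one word of each length exists, which is exactly the tree-vs-word gap described in the question:

```python
from functools import lru_cache

def count_parse_trees(terminals, binaries, start, n):
    """Number of parse trees deriving strings of length n, for a grammar
    in Chomsky normal form.  terminals: unit rules A -> a as (A, a) pairs;
    binaries: rules A -> B C as (A, B, C) triples.  For an unambiguous
    grammar this equals the number of words of length n."""
    @lru_cache(maxsize=None)
    def count(A, m):
        if m == 1:
            return sum(1 for X, _ in terminals if X == A)
        total = 0
        for X, B, C in binaries:
            if X == A:
                for k in range(1, m):           # split point
                    total += count(B, k) * count(C, m - k)
        return total
    return count(start, n)
```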

by Ryan at October 07, 2015 09:17 PM



In mail discussions one more point came up, ...

In mail discussions one more point came up that I should probably clarify publicly. Someone said: "Hey, if that guy has to search for 5 minutes and I could answer it in 10 seconds because I'm deep in the topic, then it's better if he posts it and I answer! In a small team, that is."

No. Well... yes and no. In some fields that may be so. Among admins, for example, who often have only short load spikes and otherwise wait for disasters to arrive. There you can ask.

But for other cases: interruptions are the enemy of productivity. There have been a number of studies on this. With programmers, you can work yourself into a state of heightened concentration, and there you reach maximum productivity. This state is known as "flow" or "the zone". After an interruption, even one of mere seconds, the flow is gone and it takes up to 30 minutes (!) to work your way back in.

That is why penning developers into open-plan offices is absolutely lethal for productivity. It's enough for one person to get a phone call and everyone's productivity is ruined.

Worse still: I have found that I reach my maximum performance when I split my task into small blocks that are each quickly done. Then I have a constant feeling of making progress, which serves as motivation to keep going. If I'm interrupted during one of these segments, it hurts not only that segment but the following ones as well.

The article linked above pins it on the error rate, not on productivity. That is an even scarier notion to me: that I feel productive but then have to patch things up later. And they say there: even a two-second interruption can double the error rate.

Those orders of magnitude of bad consequences make it completely obvious to me that other people's time is not only valuable in the aggregate; even a brief impulse can cause immense damage.

I would apply this less to mailing lists, since you normally filter those off somewhere and read them during breaks. But the general argument that the other person might be able to answer this more easily than I can, and that I'm "only" disturbing 10 people rather than 10,000 — even from that standpoint it is objectively and also morally objectionable. I derive the moral objection from the fact that I can occasionally stay annoyed about a brief incident for a long time. And from the fact that this is a classic case of externalizing costs. Polluters and subsidy fraudsters argue exactly the same way.

Update: Now an objection from an admin is coming in. It's by no means the case that admins don't have to concentrate; they use the time for planning and training and so on. Training I'll grant. But everything else should be done in working groups, not "on the side" by the admins. Admins are there to keep disasters from arising in the first place, and to have a plan B when they do occur. If you distract them with other stuff, you're doing nobody a favor. In my world view (and I also worked as an admin for a few years), you can recognize a good admin by the fact that he clicks through reddit all day, because he has automated all the other crap away. An admin who is always working is probably in firefighting mode and overloaded, and then the whole IT infrastructure is probably in bad shape.

Update: Hmm, that was phrased awkwardly above. This state also exists in other activities, not just programming. Kids playing video games are in the zone; when you read a good book, you're in the zone; and you can get into the zone while cleaning or cooking, too. When you listen to an exceptionally good talk, you can also get into the zone.

Update: A fitting comic strip :-)

October 07, 2015 09:00 PM




Calculating log-returns across multiple securities and time

I've been getting very confused on the topic of calculating returns. To get cumulative returns in time, log-returns are used, but apparently log-returns aren't used across different securities at a fixed time?

I would like to get cumulative returns as a function of time over my portfolio.

I have two securities, A and B. I buy one share of both A and B when the market opens and sell when it closes.

Suppose these are the prices for a specific day:

    open    close
A   9       10
B   10      8

My overall return for that day is (10+8)/(10+9) - 1 ≈ -5.3%. I store that -5.3% for that day. I repeat this for many days. How do I then calculate my cumulative return? If it helps, I'm doing this in python.

Side note: I can use a cumsum() function very easily in python, but that assumes log-returns. I have no issue working with log-returns, but I'm not sure how to go about doing that.

I will add that I always purchase 1 share of whatever security is in my portfolio for that day. The securities in my portfolio change over the course of time.
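Under this setup (one simple portfolio return per day), cumulative returns compound multiplicatively rather than via a plain cumsum; a sketch using the example day above plus two made-up days:

```python
import numpy as np

# Daily simple returns of the whole portfolio: the (10+8)/(10+9) - 1 day
# from the example, followed by two made-up days.
simple = np.array([18.0 / 19.0 - 1.0, 0.02, 0.01])

# Cumulative return compounds multiplicatively:
cum = np.cumprod(1.0 + simple) - 1.0

# Equivalent detour through log-returns: cumsum of log(1 + r), then back.
log_r = np.log1p(simple)
cum_via_logs = np.expm1(np.cumsum(log_r))

assert np.allclose(cum, cum_via_logs)
```

So cumsum is fine, but only after converting each day's simple return to a log-return first, and converting back at the end.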

by David at October 07, 2015 08:50 PM


Basic arithmetic operations in MIX computer

I am not from computer science, but I am reading Knuth's book The Art of Computer Programming, Vol. 1. I got stuck with the examples he gives of basic operations in MIX (add, sub, mul, etc.) on p. 132. Can somebody explain these examples in a pedagogical way? I'm totally confused by them.

From all the examples given there, the first one is quite clear (sum), but the others are confusing. OK, the first mul is clear, but that's only because I know how to do it on paper without considering the byte positions in memory.

Take for example the sub operation.


Why should there be a ? (undefined value) in the last byte of the result? Furthermore, it would appear to me that the operation is carried out byte by byte instead of simply as the operation 1234009-(-20001500). These things are repeated in the examples, and I would like a detailed explanation, since this is the first time I have seen operations on registers in a machine.

by user2820579 at October 07, 2015 08:39 PM


Graph isomorphism problem with invertible adjacency matrices

This question is supplementary to the question asked here.

One of the answers give a class of graphs for which the adjacency matrices are invertible which is defined as follows.

Given a permutation $\pi$ of a finite set $V$, form its cycle graph $G$ as follows: the vertex set is $V$ and the edges are pairs $(v,w)$ for which $\pi(v)=w$. (This is a simple directed graph.) The adjacency matrix will in fact be the permutation matrix corresponding to $\pi$, which is invertible.

Let's consider a class of graph isomorphism problems where it is promised that both of the input graphs will be from the class defined above.

Can the graph isomorphism problems from this class be solved efficiently?
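For this restricted class, isomorphism looks easy: the cycle graph of a permutation is determined up to isomorphism by the permutation's cycle type, so comparing sorted multisets of cycle lengths decides it in linear time. A sketch of that observation (not a claim about the general problem):

```python
def cycle_type(perm):
    """Sorted cycle lengths of a permutation given as a dict v -> perm[v]."""
    seen, lengths = set(), []
    for v in perm:
        if v not in seen:
            length, u = 0, v
            while u not in seen:   # walk the cycle containing v
                seen.add(u)
                u = perm[u]
                length += 1
            lengths.append(length)
    return sorted(lengths)

def cycle_graphs_isomorphic(p1, p2):
    """Two permutation cycle graphs are isomorphic iff the permutations
    have the same cycle type."""
    return cycle_type(p1) == cycle_type(p2)
```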

by Omar Shehab at October 07, 2015 08:10 PM



How to use a #() instead of (fn ...) in (sorted-map-by ...)?

I would like to translate the inner-function call in the following snippet, to the one using the #() macro

(let [m {:a 3, :b 2, :c 4, :x 9, :y 0, :z 5}]
  (into (sorted-map-by (fn [key1 key2]
                         (compare [(get m key2)]
                                  [(get m key1)]))) m))

Yet, I am a little bit confused on how I can accomplish that.

by kaffein at October 07, 2015 08:03 PM

Calculate average of tree nodes

I am trying to calculate the average of a tree in OCaml while traversing the tree only once. I want to return a pair (tuple) whose first element contains the sum of the nodes and whose second element contains the total number of nodes. I know the calculations for both are correct, but I cannot figure out how to return the pair. This is the function:

let rec avgHelp (s : inttree) = 
 match s with
 | Empty -> (0, 0)
 | Node(j,l,r) -> (j + avgHelp l + avgHelp r, 1 + avgHelp l + avgHelp r) 

j contains the integer in the node, and l and r are the left and right subtrees. The error I currently get is under the first call to avgHelp l, stating: "Error: This expression has type int * int but an expression was expected of type int avgHelp: inttree → int × int"
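The underlying fix is to bind each recursive result to a pair before combining the components; the same single-pass shape in Python, with a made-up tuple encoding of trees:

```python
def sum_count(tree):
    """One traversal returning the pair (sum, count); a tree is either
    None (empty) or a tuple (value, left, right)."""
    if tree is None:
        return (0, 0)
    value, left, right = tree
    left_sum, left_count = sum_count(left)      # bind each pair first,
    right_sum, right_count = sum_count(right)   # then combine components
    return (value + left_sum + right_sum, 1 + left_count + right_count)
```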

by B. Smith at October 07, 2015 08:02 PM


Man, I should have left mail switched off. ...

Man, I should have left mail switched off. Now the crybaby mails are coming in: "But I was always treated really meanly on IRC because I always plenk!"

So I will now state something here that I considered so self-evident that the possibility that someone might not grasp it intuitively never even occurred to me.


If someone points out a mistake of yours online, you owe them thanks.

Among nerds this is something like an award. Because if I tell you why you are currently making a fool of yourself in public, and how you can avoid it in the future, I'm doing you a favor. It cost me time. Even having to watch you make a fool of yourself cost me time and nerves.

I don't know about you, but I would rather be surrounded by people who are smarter than me than by the others. If you act like idiots before my eyes, you remind me that I could have done productive things with my time instead of hanging around here watching the misery.

That alone is a family-size pack of bad mood in one pile. If I THEN take the time to explain to you why you just blew it publicly, that is a deposit into my karma account that cannot be rationally justified from my side. I gain nothing from it. Experience shows I even suffer for it! Because the next step will probably be that you snipe at me because you feel insulted. Let's be honest: anyone who shamelessly reaches into the toilet in public once is, 10 minutes later, still well positioned for a repeat or an encore. Basically, one cannot overstate what an unexpected cosmic gesture of bliss it is when a nerd watches you and then even takes the time to explain what you could do better. Because it means the nerd considers you capable of learning.

These days that is, sadly, one of the greatest compliments one can give.


So you should fall on your knees and kiss the nerd's hand when this happens. Or better yet: just keep quiet. Because your hand-kissing wastes even more of the nerd's time.

In case someone doesn't know what plenking is (and is too lazy to google it *GRRRR*): plenking means putting a space before sentence-ending punctuation. Many people do it until someone tells them that it ranks alongside spurious spaces in compound words and the greengrocer's apostrophe. When I see someone plenking, I know I don't need to talk to them, because the rest of their general education probably has big gaps as well.

Online, nobody sees your face when you say something. And whether you admit it to yourselves or not: when a smelly homeless person approaches you in the pedestrian zone, you react differently than when it's a well-groomed suit-wearer. People look for this kind of clue online as well, and so they look at other small things. Plenking is one example.

Update: There is a nice quote about this that someone pointed me to by mail:

"Good spelling, punctuation, and formatting are essentially the on-line equivalent of bathing." -- Elf Sternberg

October 07, 2015 08:00 PM



Consider two nodes A and B directly connected. Node A would like to send 10 kB message to B

I've got a question, if anybody can help: Consider two nodes A and B, directly connected. Node A would like to send a 10 kB message to B. The message is fragmented into 1 kB frames, and an additional 80 bytes is appended to each frame (for the header, etc.). The channel between nodes A and B can reliably transmit 3000 symbols per second. Neglecting any transmission or other delays, choose the modulation for the channel that would allow transmission at a 12 kbit/s bit rate. Draw and name the constellation diagram of the modulation.
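The bit-rate arithmetic can be checked directly: 12 kbit/s over 3000 symbols/s needs 4 bits per symbol, i.e. a 16-point constellation such as 16-QAM:

```python
bit_rate = 12_000       # required bits per second
symbol_rate = 3_000     # symbols per second the channel can carry

bits_per_symbol = bit_rate // symbol_rate    # 4 bits per symbol
constellation_size = 2 ** bits_per_symbol    # 16 points, e.g. 16-QAM

print(bits_per_symbol, constellation_size)
```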

by Udit_1 at October 07, 2015 07:38 PM


Why org-mode?

So I've found a ton of tutorials online and a lot of people say it's invaluable, but I don't really understand why everyone is so excited about org-mode. I can take notes with a simple markdown file. I use Latex to write academic papers. Why do I need org mode on top of these?

submitted by zreeon
[link] [42 comments]

October 07, 2015 07:33 PM






Basic operations in MIX computer [on hold]

I am not from computer science, but I am reading Knuth's book The Art of Computer Programming, Vol. 1. I got stuck on the examples he gives of basic operations in MIX (add, sub, mul, etc.) on p. 132. Can somebody explain these examples in a pedagogical way? I'm totally confused by them.

Of all the examples given there, the first one (the sum) is quite clear, but the others are confusing. OK, the first mul is clear, but that's only because I know how to do it on paper without considering the byte positions in memory.

Take for example the sub operation.


Why should there be a '?' in the last byte of the result? Furthermore, it appears to me that the operation is carried out byte position by byte position rather than simply as 1234009 - (-20001500). These things are repeated in the examples, and I would like a detailed explanation, since this is the first time I have seen operations on registers in a machine.

by user2820579 at October 07, 2015 07:10 PM

Find an algorithm that sorts the nodes of a given graph acc. to distance from source node + value of the node's key

An unweighted connected graph G=(V,E); all nodes of the graph carry a "serial number" between 1 and V (V is an integer).

I am trying to come up with an algorithm that sorts all nodes, firstly by distance (number of edges) from a source node 's', and secondly, if two nodes have the same distance, by their serial numbers.

The required complexity is O(V+E).

I tried to solve this using BFS, obtaining a "bucket" array of distances: for each distance 1, 2, ... all nodes at that distance are in its "bucket". The problem is that sorting two nodes with the same distance by serial number takes at least O(n log n). I also tried using V-1 empty arrays, sending each bucket's nodes to slots according to their index in the graph and printing that, but that takes O(V^2).

This is NOT homework, I really did try my best. Your comments are appreciated.
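One standard trick avoids sorting entirely: run BFS once to get each node's distance, then iterate over the nodes in serial-number order 1..V and append each to the bucket for its distance. Because the outer loop already visits serials in increasing order, every bucket ends up internally sorted, and the whole procedure is O(V+E). A sketch (plain Python; the names are mine):

```python
from collections import deque

def sort_by_distance_then_serial(adj, source):
    """adj: dict mapping serial number (1..V) -> list of neighbour serials."""
    # BFS from the source to compute unweighted distances.
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)

    # One bucket per possible distance (any shortest path has < V edges).
    buckets = [[] for _ in range(len(adj))]
    # Serials are 1..V, so iterating the range visits them in sorted order
    # with no extra sorting cost; each bucket stays sorted automatically.
    for node in range(1, len(adj) + 1):
        buckets[dist[node]].append(node)
    return [node for bucket in buckets for node in bucket]

# Small example: square graph 1-2, 1-3, 2-4, 3-4, BFS from node 2.
order = sort_by_distance_then_serial({1: [2, 3], 2: [1, 4], 3: [1, 4], 4: [2, 3]}, 2)
print(order)
```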

by user118972 at October 07, 2015 07:03 PM


Next the Americans are taking out Toyota. Are they ...

Next the Americans are taking out Toyota. Are they knocking out all the competitors one after another, so that the US manufacturers still stand a chance in the market?

October 07, 2015 07:01 PM




correlated random variables with additional autocorrelation - multi dimensional cholesky?

For my thesis I'm currently generating several time series of random numbers; so far so good. Now I've noticed some autocorrelation in the series as well and don't really know how to cope with it. Can I use the Cholesky factorization to generate random numbers with autocorrelation, and then afterwards use the Cholesky decomposition again to simulate the overall correlation structure between the different time series? I'm uncertain whether that destroys the autocorrelation I previously created.

Or, put differently, I'm currently doing this for n variables:

x_{t,1} = x_{t,0} * exp(mu + std*rv1)
y_{t,1} = y_{t,0} * exp(mu + std*(p*rv1 + (1-p^2)^0.5*rv2))

Now those are correlated just fine, but how do I insert the autocorrelation without harming the cross-series correlation? Or is it unaffected when I change the random variables? Thanks in advance.
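One approach that preserves both structures (a sketch in plain Python; the parameter names are mine, and this is an illustration rather than a recommendation for the thesis) is to build the autocorrelation into each driving noise stream first via an AR(1) recursion, and only then apply the Cholesky-style mixing p*rv1 + (1-p^2)^0.5*rv2. Because the mixing is a fixed linear map applied per time step, it does not disturb the AR(1) autocorrelation, and the AR(1) step is applied before correlating, so the cross-series correlation p survives as well:

```python
import math
import random

def ar1_stream(n, phi, rng):
    """Standard-normal innovations with lag-1 autocorrelation phi,
    scaled so each value is still marginally N(0, 1)."""
    z = rng.gauss(0, 1)
    out = [z]
    for _ in range(n - 1):
        z = phi * z + math.sqrt(1 - phi * phi) * rng.gauss(0, 1)
        out.append(z)
    return out

def correlated_ar1_pair(n, phi, p, seed=0):
    """Two AR(1) noise streams with cross-correlation p between them."""
    rng = random.Random(seed)
    rv1 = ar1_stream(n, phi, rng)   # drives series x
    rv2 = ar1_stream(n, phi, rng)   # independent second stream
    # 2x2 Cholesky-style mixing: corr(rv1, mixed) = p at every lag 0.
    mixed = [p * a + math.sqrt(1 - p * p) * b for a, b in zip(rv1, rv2)]
    return rv1, mixed

rv1, mixed = correlated_ar1_pair(20000, 0.5, 0.7, seed=1)
print(len(rv1), len(mixed))
```

The resulting rv1 and mixed can then be plugged into the exp(mu + std*...) updates in place of the original i.i.d. draws.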

by Orang Africano at October 07, 2015 06:39 PM




Preordered OpenBSD 5.8 CD Sets Arriving

The first CD-set-arrived report to appear on misc@ was the one from M Wheeler, located somewhere in the UK, who wrote:

CD's arrived today UK. Thanks again.

In a followup message, Theo de Raadt (deraadt@) announced that pre-orderers have a special treat in store:


October 07, 2015 06:24 PM


What does "Computers cannot solve a problem for which there is no solution outside the computer" mean? [on hold]

I am on a research team which is investigating the claim that "Computers cannot solve a problem for which there is no solution outside the computer". I have surfed the internet for long enough now and I can't find any useful information.

I would be very grateful if you can help.

by Gideon Appoh at October 07, 2015 06:24 PM


How not to solve P=NP?

There are lots of attempts at proving either $\mathsf{P} = \mathsf{NP} $ or $\mathsf{P} \neq \mathsf{NP}$, and naturally many people think about the question, having ideas for proving either direction.

I know that there are approaches that have been proven not to work, and there are probably more that have a history of failing. There also seem to be so-called barriers that many proof attempts fail to overcome.

We want to avoid investigating into dead-ends, so what are they?

by Raphael at October 07, 2015 06:23 PM


Account with multiple deposits, a withdrawal and a quarterly compounding interest [on hold]

On April 1, 2006 Francine opened a savings account paying 9.2% convertible quarterly with a deposit of $4500. On October 1, 2007, she withdrew $2400. On July 1, 2008, she deposited $3000. What was the account balance on January 1, 2011?

The equation I came up with for a solution involved discounting all of the values back to the original deposit date and solving for X, but this isn't giving me the right answer.
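Rather than discounting, it is usually easier to accumulate each cash flow forward to the valuation date at the quarterly rate. A sketch of that arithmetic (plain Python; the quarter counts are my own, measured from each cash-flow date to January 1, 2011):

```python
# 9.2% convertible quarterly means 9.2% / 4 = 2.3% per quarter.
i = 0.092 / 4

# Quarters elapsed from each cash flow to January 1, 2011:
#   Apr 1, 2006 -> Jan 1, 2011: 19 quarters (deposit 4500)
#   Oct 1, 2007 -> Jan 1, 2011: 13 quarters (withdrawal 2400)
#   Jul 1, 2008 -> Jan 1, 2011: 10 quarters (deposit 3000)
balance = (4500 * (1 + i) ** 19
           - 2400 * (1 + i) ** 13
           + 3000 * (1 + i) ** 10)

print(round(balance, 2))
```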

by user151876 at October 07, 2015 06:20 PM


Can a quantum computer become a perfect chess player?

Can a quantum computer become a perfect chess player?

Can it determine whether (when both players play perfectly) white wins or black wins? (Or is it a draw?)

by porton at October 07, 2015 06:18 PM


computing sell price from ohlc

I'm relatively new to this, so I might be asking something that doesn't make sense. Here is my scenario:

I have intraday data at 1-minute intervals. This data has OHLC values, and I want to compute, for any given interval, what the likely sell price would be. I could just assume the worst case and take the low price, but I'm assuming there is something a little more accurate than that.

I get that there is no way to accurately predict what the sell price would be, since an actual order potentially changes the outcome. I just want to know if there is a best practice for estimating, from historical data, what the sell price would be if I tried to execute an order in a given interval.
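There is no single agreed answer, but one simple compromise between the pessimistic low and the optimistic high is the "typical price" of the bar, optionally haircut toward the low for conservatism. A sketch (plain Python; the typical-price average is a standard indicator, but the haircut parameter is my own invention):

```python
def estimated_sell_price(open_, high, low, close, haircut=0.0):
    """Estimate a plausible sell price for one OHLC bar.

    haircut=0.0 gives the plain typical price (H + L + C) / 3;
    haircut=1.0 collapses to the worst case, the low.
    """
    typical = (high + low + close) / 3.0
    return typical - haircut * (typical - low)

# Example bar: open 100, high 102, low 99, close 101.
print(estimated_sell_price(100, 102, 99, 101))        # typical price
print(estimated_sell_price(100, 102, 99, 101, 1.0))   # worst case: the low
```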

by smacbeth at October 07, 2015 06:15 PM

Planet Emacsen

Ben Simon: More Linux-on-Android success: emacs, racket and git

Shira often cringes when I join her at the craps table. Will I drop some forbidden statement like wow, nobody's rolled a 7 in forever!, thereby guaranteeing the next roll to be a 7? In general, I just don't respect the juju that goes with gambling; that underlying assumption that through external means you can control random events. Turns out, this is a studied phenomenon, known as the Hot Hand Fallacy. When ProgrammingPraxis took up the topic, I knew I had to jump in and implement the exercise.

I was writing this on a fresh Linux system, so the first order of business was to pull in tools. I did something along the lines of:

  apt-get install emacs racket git

The plan was to code the solution in emacs, execute it in racket and publish my work using git. Once the above command finished, I fired up emacs, kicked off a Scheme buffer powered by racket and started coding away. The ProgrammingPraxis exercise called for the implementation of an experiment mentioned in the Wall Street Journal that showed how the Hot Hand fallacy could be experienced. Here's my implementation:


(define (flip)
  (if (= (random 2) 1) 'H 'T))

(define (sample)
  (list (flip) (flip) (flip) (flip)))

(define (hot? s)
  (cond ((null? s) (void))
        ((null? (cdr s)) (void))
        ((eq? 'H (car s)) (eq? 'H (cadr s)))
        (else (void))))

(define (collect fn sample)
  (if (null? sample)
      '()
      (let ((v (fn sample)))
        (if (void? v)
            (collect fn (cdr sample))
            (cons v (collect fn (cdr sample)))))))

(define (only-true items)
  (filter (lambda (t) t) items))

(define (percentify items)
  (if (null? items) 0
      (exact->inexact (/ (length (only-true items))
                         (length items)))))

(define (try count thunk)
  (let loop ((avg (thunk)) (count (- count 1)))
    (if (= 0 count)
        avg
        (loop (/ (+ (thunk) avg) 2) (- count 1)))))

(define (experiment)
  (percentify (collect hot? (sample))))

Each experiment consisted of flipping a coin 4 times and reviewing the outcome. I then kicked off 2,000,000 of these experiments:

 (try 2000000 experiment)

Here's a screenshot of the results:

If you squint, you'll see that I did occasionally get close to the 40% behavior mentioned in the article, though I got plenty of other behavior, too. Chances are, the implementation above is buggy.

The little Android icon in the screenshot above reveals the truly remarkable part of this exercise: the above was all executed on my Galaxy Note5 using Gnuroot. I was amazed at how nearly flawless emacs and racket performed. It was just like I was on any old Linux environment. And when I was finished with my solution, I pushed the code to github using the command line version of git.

I was disappointed to see that Gambit Scheme is no longer available on Google Play. But having access to racket and other standard Linux tools makes up for this and then some.

I did note above that the tools were nearly flawless. There were some gotchas, including:

  • emacs: The arrow keys can become undefined. A fix is shown here
  • emacs: my standard desktop configuration loaded up properly, but I did get a warning message about being "past 95% of memory limit." I don't know what that means, but it sounds scary.
  • gnuroot: I don't have a good way to switch back and forth between a Gnuroot terminal and other Android apps. I can bring up the Gnuroot launcher, but that means having to kick off another terminal session.

But these issues were all minor when compared to how much Gnuroot Just Works.

I see that the next ProgrammingPraxis exercise also covers a gambling related topic. Time to bust out my keyboard and get to work!

by Ben Simon ( at October 07, 2015 06:10 PM



Hey folks,

I've been trying to get nnmaildir to work along with mbsync and I just cannot get it to work with gnus.

I have in the server tab the following:

(nnmaildir "calgary" (directory "~/mail/ucalgary")) 

I can open the connection and it shows all the folders:

K 0: Calendar
U 0: Inbox
K 0: Sent Items

When I open the Inbox I get the following error:

gnus-select-newsgroup: Couldn't request group nnmaildir+calgary:Inbox: No such group: Inbox 

Now if I look in /home/crb/mail/ucalgary/Inbox/cur/ there are a number of read and unread messages for example "1443481160.3793_16.Feddie,U=16:2,S ".

Any advice on how to fix this?

submitted by Mister_Bubbles
[link] [comment]

October 07, 2015 06:09 PM


Amazon Launches Snowball, A Rugged Storage Appliance For Importing Data To AWS By FedEx

Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway.

— Tanenbaum, Andrew S. (1989). Computer Networks. New Jersey: Prentice-Hall. p. 57. ISBN 0-13-166836-6.


by pushcx at October 07, 2015 06:09 PM


Forex P&l Attribution on Physical Forward position

Please validate my unrealized FX P&L calculation on a commodity forward contract. For example, consider that I have bought 1 MT of wheat at 300 EUR, and my company's functional currency is USD. I am using the formula below for the attribution:

(Market Price - Commodity Price) * (Today's FX Rate - Yesterday's FX Rate)

The formula above is used for the daily FX loss and gain on the forward position, but it fails when the market price equals the commodity price.
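For comparison, one common decomposition (a sketch in plain Python; an illustration of one convention, not a validation of any particular accounting treatment) splits the daily change in base-currency value into a price effect at the old FX rate and an FX effect on the full local-currency market value, so the FX leg does not vanish when market price equals cost:

```python
def fx_pnl_attribution(qty, mkt_today, mkt_yday, fx_today, fx_yday):
    """Daily P&L attribution for a physical forward, in base currency (e.g. USD).

    Prices are in local currency (e.g. EUR); fx is base-per-local (USD per EUR).
    """
    # Price effect: the market move, converted at yesterday's FX rate.
    price_effect = qty * (mkt_today - mkt_yday) * fx_yday
    # FX effect: the FX move applied to today's full local market value.
    fx_effect = qty * mkt_today * (fx_today - fx_yday)
    total = qty * (mkt_today * fx_today - mkt_yday * fx_yday)
    return price_effect, fx_effect, total

# 1 MT of wheat, market flat at 300 EUR, but EUR/USD moves from 1.08 to 1.10.
price_fx, fx_fx, total = fx_pnl_attribution(1, 300, 300, 1.10, 1.08)
print(price_fx, fx_fx, total)
```

Note that with this split, the FX effect is nonzero even when the market price equals the purchase price, because the whole position value is still exposed to the currency.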

by user1131338 at October 07, 2015 06:02 PM



Math background for Algorithm design

What math background do I require to understand in depth the book on Algorithm Design and Analysis? I only have high school math background.

I am referring to the books by Jon Kleinberg or Steven Skienna

by Victor at October 07, 2015 05:58 PM


Could we estimate a portfolio's volatility using a GARCH on the portfolio returns?

Estimating the volatility of a portfolio is typically done by first estimating the covariance matrix. This, however, can be difficult to do accurately and predictively. This paper gives a nice summary of the various methods.

But why make it so complicated?

Let's say there are $n$ securities $s_1, s_2 \dots s_n$, which at time $t$ has a price of $p_{i,t}$.

You're interested in the portfolio with weights $w_i$ in security $s_i$.

Why not take the time series of the portfolio value $\sum w_i p_{i,t}$ and do a normal GARCH estimate on that?

This technique seems more straightforward and probably just as accurate.

Am I missing something?

This was also asked here.

Update 10/7: To be clear, I would like to estimate the current volatility of the portfolio.
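The mechanics of the proposed shortcut are simple enough to sketch (plain Python; the GARCH(1,1) parameters below are fixed illustrative values rather than fitted ones — in practice they would be estimated by maximum likelihood):

```python
import random

def portfolio_returns(weights, price_paths):
    """Simple returns of the portfolio value sum_i w_i * p_{i,t}."""
    values = [sum(w * p[t] for w, p in zip(weights, price_paths))
              for t in range(len(price_paths[0]))]
    return [values[t] / values[t - 1] - 1 for t in range(1, len(values))]

def garch11_vol(returns, omega=1e-6, alpha=0.05, beta=0.9):
    """Run the GARCH(1,1) variance recursion
    sigma2_t = omega + alpha * r_{t-1}^2 + beta * sigma2_{t-1}
    over the whole return history and report the current volatility."""
    sigma2 = omega / (1 - alpha - beta)   # start at the unconditional variance
    for r in returns:
        sigma2 = omega + alpha * r * r + beta * sigma2
    return sigma2 ** 0.5                  # one-step-ahead volatility estimate

# Toy example: two simulated price paths, 60/40 portfolio.
rng = random.Random(42)
paths = [[100.0], [50.0]]
for _ in range(500):
    for p in paths:
        p.append(p[-1] * (1 + rng.gauss(0, 0.01)))
rets = portfolio_returns([0.6, 0.4], paths)
print(garch11_vol(rets))
```

This treats the weighted portfolio as a single univariate series, exactly as the question suggests; the usual caveat is that the estimate is tied to the current weights and must be redone if they change.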

by JPN at October 07, 2015 05:41 PM


Proper way to change the value of non-autoloaded defs?

A few days ago I was reading the emacs manual and noticed this:

Directories listed in the variable grep-find-ignored-directories are automatically skipped by M-x rgrep. The default value includes the data directories used by various version control systems.

"Great!" I thought - "Now I can ignore that pesky node_modules folder in my node projects." Normally I use customize, but given that this is a list with a useful default value, I added this line to my init.el:

(setq grep-find-ignored-directories (cons "node_modules" grep-find-ignored-directories)) 

Which worked great, until I had to restart, and received this warning:

Symbol's value as variable is void: grep-find-ignored-directories 

So after digging through grep.el and learning a bit about autoload, I see that grep-find-ignored-directories is not autoloaded, and not available until after grep.el has been loaded. The solution I've come up with is:

(require 'grep)
(setq grep-find-ignored-directories (cons "node_modules" grep-find-ignored-directories)) 

Which works, but frankly seems like overkill. Is there a better way to do this? Should that defcustom be autoloaded?

I should mention, I did find another, more efficient solution by using the default value that grep.el uses:

(setq grep-find-ignored-directories (cons "node_modules" vc-directory-exclusion-list)) 

vc-directory-exclusion-list is loaded on start (vc-hooks.el via loadup.el), so there's no worry about that value not being loaded. While this works, I'm still curious about the answers to my questions above.

submitted by artlogic
[link] [9 comments]

October 07, 2015 05:12 PM


difficult to understand function definition

cube (x,y,z) =
  filter (pcubes x) cubes

cubes = [(a,b,c) | a <- [1..30],b <- [1..30],c <- [1..30]]

pcubes x (b,n,m) = (floor(sqrt(b*n)) == x)

So this code works: cubes makes a list of tuples, and pcubes is used with filter to keep all the tuples for which floor(sqrt(b*n)) == x is satisfied. But the person who modified my code wrote pcubes x in filter (pcubes x) cubes — how does this work? pcubes x makes a function that will take in a tuple and output a Bool, and that Bool is used by the filter function. How does this sort of manipulation happen? How does pcubes x get access to the (b,n,m) part of the function?
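The mechanism at work is currying: in Haskell, pcubes x is a partial application — a new function still waiting for its (b,n,m) argument — and filter supplies each tuple from cubes to it in turn. The same idea can be mimicked in Python with a closure (my own translation for illustration, not the original code):

```python
import math

# pcubes x (b,n,m) = floor(sqrt(b*n)) == x, written as a closure:
# pcubes(x) returns a one-argument predicate that remembers x.
def pcubes(x):
    def check(triple):
        b, n, m = triple
        return math.floor(math.sqrt(b * n)) == x
    return check

cubes = [(a, b, c)
         for a in range(1, 31) for b in range(1, 31) for c in range(1, 31)]

# filter (pcubes x) cubes: pcubes(5) is the partially applied predicate,
# and the list supplies the (b,n,m) tuples one at a time.
matching = [t for t in cubes if pcubes(5)(t)]
print(len(matching))
```

In Haskell no closure machinery is needed: every function is curried by default, so pcubes x is already a function of type (Int,Int,Int) -> Bool.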

by Charana at October 07, 2015 05:07 PM


Amazon Inspector – Automated Security Assessment Service

As systems, configurations, and applications become more and more complex, detecting potential security and compliance issues can be challenging. Agile development methodologies can shorten the time between “code complete” and “code tested and deployed,” but can occasionally allow vulnerabilities to be introduced by accident and overlooked during testing. Also, many organizations do not have enough security personnel on staff to perform time-consuming manual checks on individual servers and other resources.

New Amazon Inspector
Today we are announcing a preview of the new Amazon Inspector. As the name implies, it analyzes the behavior of the applications that you run in AWS and helps you to identify potential security issues.

Inspector works on an application-by-application basis. You start by defining a collection of AWS resources that make up your application:

Then you create and run a security assessment of the application:

The EC2 instances and other AWS resources that make up your application are identified by tags. When you create the assessment, you also define a duration (15 minutes, 1 / 8 / 12 hours, or 1 day).

During the assessment, an Inspector Agent running on each of the EC2 instances that play host to the application monitors network, file system, and process activity. It also collects other information including details of communication with AWS services, use of secure channels, network traffic between instances, and so forth. This information provides Inspector with a complete picture of the application and its potential security or compliance issues.

After the data has been collected, it is correlated, analyzed, and compared to a set of built-in security rules. The rules include checks against best practices, common compliance standards, and vulnerabilities and represent the collective wisdom of the AWS security team. The members of this team are constantly on the lookout for new vulnerabilities and best practices, which they codify into new rules for Inspector.

The initial launch of Inspector will include the following sets of rules:

  • Common Vulnerabilities and Exposures
  • Network Security Best Practices
  • Authentication Best Practices
  • Operating System Security Best Practices
  • Application Security Best Practices
  • PCI DSS 3.0 Assessment

Issues identified by Inspector (we call them “findings”) are gathered together and grouped by severity in a comprehensive report.

You can access the Inspector from the AWS Management Console, AWS Command Line Interface (CLI), or API.

More to Come
I plan to share more information about Inspector shortly after re:Invent wraps up and I have some time to catch my breath, so stay tuned!

— Jeff;

by Jeff Barr at October 07, 2015 05:06 PM


Should stories you submit be included in "Your Threads"?

I tend to think of myself as being part of a thread if I submit a story, even if I haven't commented on it yet. Perhaps this is because I often share things because I want other people's opinions on them. I guess it's possible that people share things without caring about the comments, but that seems off to me. (Although very altruistic, in the right light.) What do you think? Is there a reason why stories you submitted should not be part of “your threads”?

(If people agree with me and think it should be, I volunteer to raise an issue about it.)

by stig at October 07, 2015 05:06 PM


AWS Config Rules – Dynamic Compliance Checking for Cloud Resources

The flexible, dynamic nature of the AWS cloud gives developers and admins the flexibility to launch, configure, use, and terminate processing, storage, networking, and other resources as needed. In any fast-paced agile environment, security guidelines and policies can be overlooked in the race to get a new product to market before the competition.

Imagine that you had the ability to verify that existing and newly launched AWS resources conformed to your organization’s security guidelines and best practices without creating a bureaucracy or spending your time manually inspecting cloud resources.

Last year I announced that you could Track AWS Resource Configurations with AWS Config. In that post I showed you how AWS Config captured the state of your AWS resources and the relationships between them. I also discussed Config’s auditing features, including the ability to select a resource and then view a timeline of configuration changes on a timeline.

New AWS Config Rules
Today we are extending Config with a powerful new rule system. You can use existing rules from AWS and from partners, and you can also define your own custom rules. Rules can be targeted at specific resources (by id), specific types of resources, or at resources tagged in a particular way. Rules are run when those resources are created or changed, and can also be evaluated on a periodic basis (hourly, daily, and so forth).

Rules can look for any desirable or undesirable condition. For example, you could:

  • Ensure that EC2 instances launched in a particular VPC are properly tagged.
  • Make sure that every instance is associated with at least one security group.
  • Check to make sure that port 22 is not open in any production security group.

Each custom rule is simply an AWS Lambda function. When the function is invoked in order to evaluate a resource, it is provided with the resource’s Configuration Item.  The function can inspect the item and can also make calls to other AWS API functions as desired (based on permissions granted via an IAM role, as usual). After the Lambda function makes its decision (compliant or not) it calls the PutEvaluations function to record the decision and returns.

The results of all of these rule invocations (which you can think of as compliance checks) are recorded and tracked on a per-resource basis and then made available to you in the AWS Management Console. You can also access the results in a report-oriented form, or via the Config API.

Let’s take a quick tour of AWS Config Rules, with the proviso that some of what I share with you will undoubtedly change as we progress toward general availability. As usual, we will look forward to your feedback and will use it to shape and prioritize our roadmap.

Using an Existing Rule
Let’s start by using one of the rules that’s included with Config. I open the Config Console and click on Add Rule:

I browse through the rules and decide to start with instances-in-vpc. This rule verifies that an EC2 instance belongs to a VPC, with the option to check that it belongs to a specific VPC. I click on the rule and customize it as needed:

I have a lot of choices here. The Trigger type tells Config to run the rule when the resource is changed, or periodically. The Scope of changes tells Config which resources are of interest. The scope can be specified by resource type (with an optional identifier), by tag name, or by a combination of tag name and value. If I am checking EC2 instances, I can trigger on any of the following:

  • All EC2 instances.
  • Specific EC2 instances, identified by a resource identifier.
  • All resources tagged with the key “Department.”
  • All resources tagged with the key “Stage” and the value “Prod.”

The Rule parameters allows me to pass additional key/value pairs to the Lambda function. The parameter names, and their meaning, will be specific to the function. In this case, supplying a value for the vpcid parameter tells the function to verify that the EC2 instance is running within the specified VPC.

The rule goes in to effect after I click on Save. When I return to the Rules page I can see that my AWS configuration is now noncompliant:

I can investigate the issue by examining the Config timeline for the instance in question:

It turns out that this instance has been sitting around for a while (truth be told I forgot about it). This is a perfect example of how useful the new Config Rules can be!

I can also use the Config Console to look at the compliance status of all instances of a particular type:

Creating a New Rule
I can create a new rule using any language supported by Lambda. The rule receives the Configuration Item and the rule parameters that I mentioned above, and can implement any desired logic.

Let’s look at a couple of excerpts from a sample rule. The rule applies to EC2 instances, so it checks to see if was invoked on one:

function evaluateCompliance(configurationItem, ruleParameters) {
    if (configurationItem.resourceType !== 'AWS::EC2::Instance') {
        return 'NOT_APPLICABLE';
    } else {
        var securityGroups = configurationItem.configuration.securityGroups;
        var expectedSecurityGroupId = ruleParameters.securityGroupId;
        if (hasExpectedSecurityGroup(expectedSecurityGroupId, securityGroups)) {
            return 'COMPLIANT';
        } else {
            return 'NON_COMPLIANT';
        }
    }
}
If the rule was invoked on an EC2 instance, it checks to see whether the expected security group is among the security groups attached to the instance:

function hasExpectedSecurityGroup(expectedSecurityGroupId, securityGroups) {
    for (var i = 0; i < securityGroups.length; i++) {
        var securityGroup = securityGroups[i];
        if (securityGroup.groupId === expectedSecurityGroupId) {
            return true;
        }
    }
    return false;
}

Finally, the rule stores the result of the compliance check  by calling the Config API’s putEvaluations function:

config.putEvaluations(putEvaluationsRequest, function (err, data) {
    if (err) {
        // handle the error from the Config service
    } else {
        // the evaluation was recorded successfully
    }
});

The rule can record results for the item being checked or for any related item. Let’s say you are checking to make sure that an Elastic Load Balancer is attached only to a specific kind of EC2 instance. You could decide to report compliance (or noncompliance) for the ELB or for the instance, depending on what makes the most sense for your organization and your compliance model. You can do this for any resource type that is supported by Config.

Here’s how I create a rule that references my Lambda function:

On the Way
AWS Config Rules are being launched in preview form today and you can sign up now.  Stay tuned for additional information!


PS – re:Invent attendees can attend session SEC 314: Use AWS Config Rules to Improve Governance of Your AWS Resources (5:30 PM on October 8th in Palazzo K).

by Jeff Barr at October 07, 2015 05:03 PM


How many languages exist with input alphabet {0,1} and all strings in the language have length less than or equal to 5?

I'm trying to write a formal proof in automata theory to show a few properties of DFAs but I am having some trouble with this that I am trying to incorporate into my proof. I want to show how many languages $S$ there are such that $S\subseteq\{0,1\}^*$ and $\forall s\in S, |s|\leq 5$ where $s$ is a string.

I got that there are $2^1+2^2+...+2^5 = 62$ different strings such that $|s|\leq 5$ but that is where I am stuck. How many different languages can I create with $62$ strings? Would it simply be $62!$ ?
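For what it's worth, the final step is a subset count, not a permutation count (my reasoning, not part of the original post): a language is a set of strings, so each available string is either in or out, independently.

```latex
\left|\{\, s \in \{0,1\}^* : |s| \le 5 \,\}\right|
  = \sum_{k=0}^{5} 2^k = 63
  \quad\text{(62 if the empty string is excluded)},
\qquad
\#\{\, S : S \subseteq T \,\} = 2^{|T|} = 2^{63} \text{ (or } 2^{62}\text{)},
  \quad\text{not } 62!\,.
```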

by TheSalamander at October 07, 2015 04:42 PM

How does one enter in a boolean expression into an SAT solver?

For example, if we had an extremely large expression, how do we even first get it into the program? I can't imagine entering each clause one by one...
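In practice you rarely type clauses by hand: most solvers (MiniSat, Glucose, CryptoMiniSat, ...) read the DIMACS CNF file format, which is usually generated by a program that encodes your problem. For example, the formula (x1 OR NOT x2) AND (x2 OR x3 OR NOT x1) would be encoded as:

```
c (x1 OR NOT x2) AND (x2 OR x3 OR NOT x1)
p cnf 3 2
1 -2 0
2 3 -1 0
```

Here "p cnf 3 2" declares 3 variables and 2 clauses, each subsequent line is one clause terminated by 0, and a negative integer denotes a negated variable.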

by Bill at October 07, 2015 04:35 PM


Amazon RDS Update – MariaDB is Now Available

We launched the Amazon Relational Database Service (RDS) almost six years ago, in October of 2009. The initial launch gave you the power to launch a MySQL database instance from the command line. From that starting point we have added a multitude of features, along with support for the SQL Server, Oracle Database, PostgreSQL, and Amazon Aurora databases. We have made RDS available in every AWS region, and on a very wide range of database instance types. You can now run RDS in a geographic location that is well-suited to the needs of your user base, on hardware that is equally well-suited to the needs of your application.

Hello, MariaDB
Today we are adding support for the popular MariaDB database, beginning with version 10.0.17. This engine was forked from MySQL in 2009, and has developed at a rapid clip ever since, adding support for two storage engines (XtraDB and Aria) and other leading-edge features. Based on discussions with potential customers, some of the most attractive features include parallel replication and thread pooling.

As is the case with all of the databases supported by RDS, you can launch MariaDB from the Console, AWS Command Line Interface (CLI), AWS Tools for Windows PowerShell, via the RDS API, or from a CloudFormation template.

I started out with the CLI and launched my database instance like this:

$ rds-create-db-instance jeff-mariadb-1 \
  --engine mariadb \
  --db-instance-class db.r3.xlarge \
  --db-subnet-group-name dbsub \
  --allocated-storage 100 \
  --publicly-accessible false \
  --master-username root --master-user-password PASSWORD

Let’s break this down, option by option:

  • Line 1 runs the rds-create-db-instance command and specifies the name (jeff-mariadb-1) that I have chosen for my instance.
  • Line 2 indicates that I want to run the MariaDB engine, and line 3 says that I want to run it on a db.r3.xlarge instance type.
  • Line 4 points to the database subnet group that  I have chosen for the database instance. This group lists the network subnets within my VPC (Virtual Private Cloud) that are suitable for my instance.
  • Line 5 requests 100 gigabytes of storage, and line 6 specifies that I don’t want the database instance to have a publicly accessible IP address.
  • Finally, line 7 provides the name and credentials for the master user of the database.

The command displays the following information to confirm my launch:

DBINSTANCE  jeff-mariadb-1  db.r3.xlarge  mariadb  100  root  creating  1  ****  db-QAYNWOIDPPH6EYEN6RD7GTLJW4  n  10.0.17  general-public-license  n  standard  n
      VPCSECGROUP  sg-ca2071af  active
SUBNETGROUP  dbsub  DB Subnet for Testing  Complete  vpc-7fd2791a
      SUBNET  subnet-b8243890  us-east-1e  Active
      SUBNET  subnet-90af64e7  us-east-1b  Active
      SUBNET  subnet-b3af64c4  us-east-1b  Active
      PARAMGRP  default.mariadb10.0  in-sync
      OPTIONGROUP  default:mariadb-10-0  in-sync

The RDS CLI includes a full set of powerful, high-level commands, all documented here. For example, I can create read replicas (rds-create-db-instance-read-replicas) and take snapshot backups (rds-create-db-snapshot) in minutes.

Here’s how I would launch the same instance using the AWS Management Console:

Get Started Today
You can launch RDS database instances running MariaDB today in all AWS regions. Supported database instance types include M3 (standard), R3 (memory optimized), and T2 (standard).


by Jeff Barr at October 07, 2015 04:34 PM



AWS Import/Export Snowball – Transfer 1 Petabyte Per Week Using Amazon-Owned Storage Appliances

Even though high speed Internet connections (T3 or better) are available in many parts of the world, transferring terabytes or petabytes of data from an existing data center to the cloud remains challenging. Many of our customers find that the data migration aspect of an all-in move to the cloud presents some surprising issues. In many cases, these customers are planning to decommission their existing data centers after they move their apps and their data; in such a situation, upgrading their last-generation networking gear and boosting connection speeds makes little or no sense.

We launched the first-generation AWS Import/Export service way back in 2009. As I wrote at the time, “Hard drives are getting bigger more rapidly than internet connections are getting faster.” I believe that remains the case today. In fact, the rapid rise in Big Data applications, the emergence of global sensor networks, and the “keep it all just in case we can extract more value later” mindset have made the situation even more dire.

The original AWS Import/Export model was built around devices that you had to specify, purchase, maintain, format, package, ship, and track. While many AWS customers have used (and continue to use) this model, some challenges remain. For example, it does not make sense for you to buy multiple expensive devices as part of a one-time migration to AWS. In addition to data encryption requirements and device durability issues, creating the requisite manifest files for each device and each shipment adds additional overhead and leaves room for human error.

New Data Transfer Model with Amazon-Owned Appliances
After gaining significant experience with the original model, we are ready to unveil a new one, formally known as AWS Import/Export Snowball. Built around appliances that we own and maintain, the new model is faster, cleaner, simpler, more efficient, and more secure. You don’t have to buy storage devices or upgrade your network.

Snowball is designed for customers that need to move lots of data (generally 10 terabytes or more) to AWS on a one-time or recurring basis. You simply request one or more appliances from the AWS Management Console and wait a few days for each appliance to be delivered to your site. If you want to import a lot of data, you can order multiple Snowball appliances and run them in parallel.

The new Snowball appliance is purpose-built for efficient data storage and transfer. It is rugged enough to withstand a 6 G jolt, and (at 50 lbs) light enough for one person to carry. It is entirely self-contained, with 110 Volt power and a 10 Gb network connection on the back and an E Ink display/control panel on the front. It is weather-resistant and serves as its own shipping container; it can go from your mail room to your data center and back again with no packing or unpacking hassle to slow things down. In addition to being physically rugged and tamper-resistant, AWS Snowball detects tampering attempts. Here’s what it looks like:

Once you receive a Snowball, you plug it in, connect it to your network, configure the IP address (you can use your own or the device can fetch one from your network using DHCP), and install the AWS Snowball client. Then you return to the Console to download the job manifest and a 25-character unlock code. With all of that info in hand, you start the appliance with one command:

$ snowball start -i DEVICE_IP -m PATH_TO_MANIFEST -u UNLOCK_CODE

At this point you are ready to copy data to the Snowball. The data will be 256-bit encrypted on the host and stored on the appliance in encrypted form. The appliance can be hosted on a private subnet with limited network access.

From there you simply copy up to 50 terabytes of data to the Snowball and disconnect it (a shipping label will automatically appear on the E Ink display), and ship it back to us for ingestion. We’ll decrypt the data and copy it to the S3 bucket(s) that you specified when you made your request. Then we’ll sanitize the appliance in accordance with National Institute of Standards and Technology Special Publication 800-88 (Guidelines for Media Sanitization).

At each step along the way, notifications are sent to an Amazon Simple Notification Service (SNS) topic and email address that you specify. You can use the SNS notifications to integrate the data import process into your own data migration workflow system.

Creating an Import Job
Let’s step through the process of creating an AWS Snowball import job from the AWS Management Console. I create a job by entering my name and address (or choosing an existing one if I have done this before):

Then I give the job a name (mine is import-photos), and select a destination (an AWS region and one or more S3 buckets):

Next, I set up my security (an IAM role and a KMS key to encrypt the data):

I’m almost ready! Now I choose the notification options. I can create a new SNS topic and create an email subscription to it, or I can use an existing topic. I can also choose the status changes that are of interest to me:

After I review and confirm my choices, the job becomes active:

The next step (which I didn’t have time for in the rush to re:Invent) would be to receive the appliance, install it and copy my data over, and ship it back.

In the Works
We are launching AWS Import/Export Snowball with import functionality so that you can move data to the cloud. We are also aware of many interesting use cases that involve moving data the other way, including large-scale data distribution, and plan to address them in the future.

We are also working on other enhancements including continuous, GPS-powered chain-of-custody tracking.

Pricing and Availability
There is a usage charge of $200 per job, plus shipping charges that are based on your destination and the selected shipment method. As part of this charge, you have up to 10 days (starting the day after delivery) to copy your data to the appliance and ship it out. Extra days are $15 each.

You can import data to the US Standard and US West (Oregon) regions, with more on the way.


by Jeff Barr at October 07, 2015 04:23 PM



Amazon Kinesis Firehose – Simple & Highly Scalable Data Ingestion

Two years ago we introduced Amazon Kinesis, which we now call Amazon Kinesis Streams, to allow you to build applications that collect, process, and analyze streaming data with very high throughput. We don’t want you to have to think about building and running a fleet of ingestion servers or worrying about monitoring, scaling, or reliable delivery.

Amazon Kinesis Firehose was purpose-built to make it even easier for you to load streaming data into AWS. You simply create a delivery stream, route it to an Amazon Simple Storage Service (S3) bucket and/or an Amazon Redshift table, and write records (up to 1000 KB each) to the stream. Behind the scenes, Firehose will take care of all of the monitoring, scaling, and data management for you.

Once again (I never tire of saying this), you can spend more time focusing on your application and less time on your infrastructure.

Inside the Firehose
In order to keep things simple, Firehose does not interpret or process the raw data in any way. You simply create a delivery stream and write data records to it. After any requested compression (client-side) and encryption (server-side), the records are written to an S3 bucket that you designate. As my colleague James Hamilton likes to say (in other contexts), “It’s that simple.” You can even control the buffer size and the buffer interval for the stream if necessary.

If your client code isolates individual logical records before sending them to Firehose, it can add a delimiter. Otherwise, you can identify record boundaries later, once the data is in the cloud.
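
As a sketch of that framing idea (the helper names here are illustrative, not part of any Firehose SDK; the newline delimiter mirrors the example later in this post):

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class RecordFraming {
    // Join logical records with a newline delimiter before sending them to the stream.
    static String frame(List<String> records) {
        return records.stream().map(r -> r + "\n").collect(Collectors.joining());
    }

    // Recover the record boundaries later, once the data is in the cloud.
    static List<String> unframe(String blob) {
        return Arrays.asList(blob.split("\n"));
    }
}
```

Any delimiter works as long as it cannot appear inside a record; newline is simply the conventional choice for line-oriented data.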

After your data is stored in S3, you have multiple options for analyzing and processing it. For example, you can attach an AWS Lambda function to the bucket and process the objects as they arrive. Or, you can point your existing Amazon EMR jobs at the bucket and process the freshest data, without having to make any changes to the jobs.

You can also use Firehose to route your data to an Amazon Redshift cluster. After Firehose stores your raw data in S3 objects, it can invoke a Redshift COPY command on each object. This command is very flexible and allows you to import and process data in multiple formats (CSV, JSON, Avro, and so forth), isolate and store only selected columns, and convert data from one type to another.

Firehose From the Console
You can do all of this using the AWS Management Console, the AWS Command Line Interface (CLI), or the Firehose APIs.

Let’s set up a delivery stream using the Firehose Console. I simply open it up and click on Create Delivery Stream. Then I give my stream a name, pick an S3 bucket (or create a new one), and set up an IAM role so that Firehose has permission to write to the bucket:

I can configure the latency and compression for the delivery stream. I can also choose to encrypt the data using one of my AWS Key Management Service (KMS) keys:

Once my stream is created, I can see it from the console.

Publishing to a Delivery Stream
Here is some simple Java code to publish a record (the string “some data”) to my stream:

PutRecordRequest putRecordRequest = new PutRecordRequest();
putRecordRequest.setDeliveryStreamName("incoming-stream");

String data = "some data" + "\n"; // add \n as a record separator
Record record = new Record().withData(java.nio.ByteBuffer.wrap(data.getBytes()));
putRecordRequest.setRecord(record);

firehoseClient.putRecord(putRecordRequest); // firehoseClient: an AmazonKinesisFirehoseClient

And here’s a CLI equivalent:

$ aws firehose put-record --delivery-stream-name incoming-stream --record Data="some data\n"

We also supply an agent that runs on Linux. It can be configured to watch one or more log files and route their contents to Firehose.

Monitoring Kinesis Firehose Delivery Streams
You can monitor the CloudWatch metrics for each of your delivery streams from the Console:

By the Numbers
Individual delivery streams can scale to accommodate multiple gigabytes of data per hour. By default, each stream can support 2500 calls to PutRecord or PutRecordBatch per second and you can have up to 5 streams per AWS account (both of these values are administrative limits that can be raised upon request, so just ask if you need more).
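
A producer that needs to stay under the per-call limits can split its records into chunks before calling PutRecordBatch. Here is a minimal, illustrative helper (the 500-record chunk size is an assumption for the example, not a limit stated in this post; check the current service limits):

```java
import java.util.ArrayList;
import java.util.List;

public class FirehoseBatcher {
    // Split a list of records into chunks, each small enough for one
    // PutRecordBatch call. maxPerCall is the caller-chosen per-call limit.
    static <T> List<List<T>> chunk(List<T> records, int maxPerCall) {
        List<List<T>> out = new ArrayList<>();
        for (int i = 0; i < records.size(); i += maxPerCall) {
            out.add(new ArrayList<>(records.subList(i, Math.min(i + maxPerCall, records.size()))));
        }
        return out;
    }
}
```

Each chunk would then be wrapped in one PutRecordBatch request, which amortizes the per-call overhead compared to calling PutRecord once per record.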

This feature is available now and you can start using it today. Pricing is based on the volume of data ingested via each Firehose.

— Jeff;


by Jeff Barr at October 07, 2015 04:18 PM


Remote workers: do you have kit allowance?

If you work remotely, does your company pay for stuff like a computer / desk / chair / internet access / an office locally where you live? Is there a set allowance, or do you expense it? If you do have a kit allowance, is it “means tested” based on your country of residence? (I seem to remember someone—Trello? StackOverflow?—giving a flat USD 5,000 allowance for kit, but I cannot find a reference for that now.)

I invoiced for my computer, and for travel to/from head office. (Obviously.) But I rent a separate office as I don’t have a suitable office space for full-time work from home (yet) and haven’t invoiced for that yet. (The rented office comes furnished, heated, with kitchen facilities and with fairly good internet connectivity.) When I first started I did not yet have internet connection at home, so renting this office was necessary for me to be able to commence working for them.

Now my home line is probably good enough—we had a fault on the line for a while, but it has been corrected now. However, I think I work better and with fewer distractions from this dedicated rented office than from home. I would like to continue, but it’s hard to justify the personal cost. I’m wondering what other people in similar situations do.

by stig at October 07, 2015 04:18 PM


Amazon QuickSight – Fast & Easy to Use Business Intelligence for Big Data at 1/10th the Cost of Traditional Solutions

Over the last couple of years, the process of collecting, uploading, storing, and processing data on AWS has become faster, simpler, and increasingly comprehensive. We have delivered a broad set of data-centric services that tackle many of the issues faced by our customers. For example:

  • Managing Databases is Painful and Difficult – Amazon Relational Database Service (RDS) addresses many of the pain points and provides many ease-of-use features.
  • SQL Databases do not Work Well at Scale – Amazon DynamoDB provides a fully managed, NoSQL model that has no inherent scalability limits.
  • Hadoop is Difficult to Deploy and Manage – Amazon EMR can launch managed Hadoop clusters in minutes.
  • Data Warehouses are Costly, Complex, and Slow – Amazon Redshift provides a fast, fully-managed petabyte-scale data warehouse at 1/10th the cost of traditional solutions.
  • Commercial Databases are Punitive and Expensive – Amazon Aurora combines the speed and availability of high-end commercial databases with the simplicity and cost-effectiveness of their open source siblings.
  • Streaming Data is Difficult to Capture – Amazon Kinesis facilitates real-time processing of data streams at terabyte scale.

With the services listed above as a base, many customers are ready to take the next step. They are able to collect, upload, process, and store the data. Now they want to analyze and visualize it, and they want to do it the AWS way—easily and cost-effectively at world scale!

In the past, Business Intelligence required an incredible amount of undifferentiated heavy lifting. You had to pay for, set up and run the infrastructure and the software, manage scale (while users fret), and hire consultants at exorbitant rates to model your data. After all that your users were left to struggle with complex user interfaces for data exploration while simultaneously demanding support for their mobile devices. Access to NoSQL and streaming data? Good luck with that!

Introducing QuickSight
Today we are announcing Amazon QuickSight. You get very fast, easy to use business intelligence for your big data needs at 1/10th the cost of traditional on-premises solutions. This cool new product will be available in preview form later this month.

After talking to many customers about their Business Intelligence (BI) needs, we believe that QuickSight will be able to handle many types of data-intensive workloads including ad targeting, customer segmentation, forecasting & planning, marketing & sales analytics, inventory & shipment tracking, IoT device stream management, and clickstream analysis. You’ve got the data and you’ve got the questions. Now you want the insights!

QuickSight lets you get started in minutes. You log in, point to a data source, and begin to visualize your data. As you do so, you’ll benefit from the following features:

Access to Data Sources – QuickSight can access data from many different sources, both on-premises and in the cloud. There’s built-in support for Redshift, RDS, Amazon Aurora, EMR, DynamoDB, Kinesis, S3, MySQL, Oracle, SQL Server, PostgreSQL, and flat files. Connectors allow access to data stored in third-party sources such as Salesforce.

Fast Calculation – QuickSight is built around SPICE (the Super-fast, Parallel, In-memory Calculation Engine). We built it from the ground up to run in the cloud and to deliver a fast, interactive data visualization experience.

Ease of Use – QuickSight auto-discovers your AWS data sources and makes it easy for you to connect to them. As you select tables and fields, it recommends the most appropriate types of graphs and other visualizations. You can share your visualizations with your colleagues and you can assemble several visualizations in order to tell a story with data. You can even embed your reports in applications and websites.

Effortless Scale – QuickSight provides fast analytics and visualization while scaling to hundreds of thousands of users and terabytes of data per organization.

Low Cost – All things considered, QuickSight will provide you with robust Business Intelligence at 1/10th the cost of on-premises solutions from the old guard.

Partner-Ready – QuickSight provides a simple SQL-like interface to enable BI tools from AWS Partners to access data stored in SPICE so that customers can use the BI tools they are familiar with and get even faster performance at scale. We’re already working with several partners including Domo, Qlik, Tableau, and Tibco. I’ll have more news on that front before too long.

Take the QuickSight Tour
Let’s take a tour through QuickSight. As a quick reminder, we’re still putting the finishing touches on the visuals and the images below are subject to change. Each organization will have their own QuickSight link. After the first user from an organization logs in, they have the ability to invite their coworkers.

After I log in, QuickSight discovers available data sources and lets me connect to the one I want with a couple of clicks:

After that I select a table from the data source:

And then the field(s) of interest:

I select the product category and sales amount in order to view sales by category:

The Fitness value looks interesting and I want to learn more! I simply click on it and choose to focus:

And that’s what I see:

Now I want to know more about what’s going on, so I drill in to the sub-categories with a click:

And here’s what I see. It looks like weight accessories, treadmills, and fitness monitors are my best sellers:

After I create the visualization, I can save it to a storyboard:


This quick tour has barely scratched the surface of what QuickSight can do, but I do want to keep some surprises in reserve for the official launch (currently scheduled for early 2016). Between now and then I plan to share more details about the mobile apps, the storyboards, and so forth.

QuickSight Pricing
I have alluded to expensive, inflexible, old-school pricing a couple of times already. We want to make QuickSight affordable to organizations of all sizes. There will be two service options, Standard and Enterprise. The Enterprise Edition provides up to twice the throughput & fine-grained access control, supports encryption at rest, integrates with your organization’s Active Directory, and includes a few other goodies as well. Pricing is as follows:

  • Standard Edition:
    • $12 per user per month with no usage commitment.
    • $9 per user per month with a one-year usage commitment.
    • $0.25 / gigabyte / month for SPICE storage (beyond 10 gigabytes).
  • Enterprise Edition:
    • $24 per user per month with no usage commitment.
    • $18 per user per month with a one-year usage commitment.
    • $0.38 / gigabyte / month for SPICE storage (beyond 10 gigabytes).

Coming Soon
If you are interested in evaluating QuickSight for your organization, you can sign up for the preview today. We’ll be opening it up to an initial set of users later this month, and scaling up after that. As usual, we’ll start in the US East (Northern Virginia) region, expand quickly to US West (Oregon) and Europe (Ireland), and then shoot for the remaining AWS regions in time for the full-blown launch in 2016.


by Jeff Barr at October 07, 2015 04:04 PM


Had Sarah Sharp stayed silent, she would have remained a philosopher ...

Had Sarah Sharp stayed silent, she would have remained a philosopher. But now she has written this blog post, and it reads like a who’s who of SJW positions. Let me quote:
Conferences include child care, clearly labeled veggie and non-veggie foods, and a clear event policy
Wait, what?
Alcoholic drinks policy encourages participants to have fun, rather than get smashed
Code of conduct explicitly protects diverse developers, acknowledging the spectrum of privilege
Sorry, but this puts her criticism of the Linux kernel mailing list in a whole new light for me. I do wonder whether this approach (treating everyone else as infantile and setting up a thought police) has its roots in the police and prison culture of the USA. And, of course, in concepts like "manifest destiny" and American exceptionalism.

Wild. And nobody noticed before now how she ticks?

Update: As it turns out, people did notice. Back in 2013.

October 07, 2015 04:01 PM

I would like to advance a thesis here, in the interest of truth-finding ...

I would like to advance a thesis here, in the interest of truth-finding.
When Linus behaves nastily and meanly on mailing lists, he is still more social (or less antisocial) than the Twitter/Tumblr outrage crowd. Linus tolerates other people’s opinions and gives them negative feedback, while the Twitter/Tumblr outragers simply filter dissenting opinions away.
I use "social" here in the literal sense: doing things with other people. Berating Linus from the outside is always easy, but who among you could stand it, day in and day out, on a "hostile" mailing list the size of LKML? Surrounded by people who endlessly want things from him and take his rejection personally? No matter what he does, someone will read it as a personal attack? He endlessly predicts "if you do $foo, it will turn out badly", then people do $foo, it turns out badly, he refuses to accept the result, and then HE gets the blame?!

So, as a nerd, I would like to put a few things on the record.

First. Nerds are at a social disadvantage in face-to-face conversation, but for online situations they have developed completely clear algorithms that both sides know in advance. Online media produce mix-ups and misunderstandings (I would like to invite anyone who sees Linus as a frothing lout to watch one of his videos. In person the man is downright charming and likeable! The contrast could hardly be greater!). Nerds know this and react accordingly. The most important tool for avoiding bad situations is negative feedback. Nerds know this and give negative feedback early. Nerds also know that the person on the other side may take their negative feedback as a personal attack, possibly ignore it, and then waste everyone’s time and cause a lot of drama and damage. So when a nerd gets the impression that his negative feedback was ignored, he will follow up. There is a fairly clear escalation structure here. How quickly it escalates depends on the medium. In my inbox, for example: if I tell someone to go away and they write me again, I ask back which part of "go away" they did not understand, and that they should go away, and whoever writes yet again goes into the eternal killfile. That is my escalation strategy. On mailing lists people generally escalate somewhat more slowly, but it runs roughly the same way.

Second. I recently got a mail describing the following situation. Mailing list, incoming question, zero reaction. Hey, that’s rude, I’ll answer. He answered, the answer was not 100% correct, and other people then corrected it, on the mailing list, not by private mail. The sender took that as a personal attack: who do these people think they are, if they knew better they could have answered themselves! But that is not how it works among nerds. Nerds can put themselves in the other person’s shoes here and know: better no answer than a wrong answer. A wrong answer wastes more time. And a wrong answer is also an affront to everyone else on the mailing list, because now they have to correct it. The social construct of a mailing list is that there is an archive somewhere, and before asking you are expected to check the archive to see whether the question has already been asked and answered. That is so because superfluous mails to the list cost all subscribers time, while your searching costs only you. Nerds are techies here and calculate the energy spent. If you search the archive, you get helped, and you cost nobody else any time. If you post a question to the mailing list that is already answered in the archive, we do not answer, because it would have been your job to check the archive first. The energy balance is already negative, because all subscribers had to waste time ignoring your superfluous mail. If you post a question to the mailing list that was already answered in the archive, and someone now answers it incorrectly, that is the worst case, because it has to be corrected; otherwise the next person who plays by the rules and checks the archive first finds the wrong answer instead of the right one, and the whole social construct is broken.

Now consider how much more effort writing a correction mail takes compared to deleting a stupid question. That is how much more bad mood you create with a wrong answer to a stupid question.

But maybe the question was not stupid at all and had never been answered in the archive? That can happen. It is conceivable. In my long career on mailing lists and Usenet I cannot recall a single case where a genuinely new question got no answers. Even if it was only "Now that is an interesting question!".

But now consider how the situation just described looks to outsiders. The mailing list is full of questions from newcomers, and they are all ignored or brushed off. Every now and then someone answers one and immediately gets smacked from all sides. Does that look like a good atmosphere? Of course not! Nobody wants that! But whose fault is it?

From my point of view: the newcomers who did not stick to the social convention. And that is one of the fundamental differences between nerds and other people. Other people feel pity for the newcomers. What kind of community is this, they ask, that treats newcomers so negatively? You want to grow! Be nice to the newbies!

And to that I can only say: no, I do not want to grow at all. The communities I am in are all exactly the right size. Growth is not a value in itself, and if I buy growth at the price of every look at the mailing list consisting of nothing but pain and misery, because nobody sticks to the social norms and tries to help themselves first, then I see growth as more negative than positive.

And it gets really bad when the newcomers do not come because they want to join your community, but because they heard that newcomers are treated badly on your mailing list and they were looking for something to be outraged about.

If you take a typical mailing list and remove all the superfluous questions, and all the answers to superfluous questions, there is generally not much left. Surely you cannot want that, Fefe? Yes, that is exactly what I want. If there is nothing to say, I would like some quiet and would like to relax, instead of having to delete yet another 100 pointless mails.

Oh, and one more. I described the nerd escalation strategy above. Stories of the form "this polite, friendly person came to the mailing list, politely and modestly asked a legitimate question, and got yelled at and mobbed away" are, in my experience, all lies. Every single one of them. I have been on some really hairy mailing lists in my life, and NOT ONCE have I seen someone who asked a legitimate question, politely and modestly, get snapped at. Not a single time.

I have also posted on the Linux kernel mailing list a few times over the years. Half a dozen to ten times, I would estimate. Do you know how often I got snapped at? NOT AT ALL.

In my eyes, people who run filter bubbles on Twitter should, when it comes to other people’s social behavior, kindly keep their mouths shut, because they have forfeited any moral standing. In general I find it very problematic to tell other people how to live their lives. If you do not like the Linux kernel mailing list, just start a kernel-cuddle mailing list where things are more civilized. For over 90% of the content of the real kernel mailing list you do not need Linus’s presence. If that actually works better, the invisible hand of the market will grant your mailing list success. If not, I will be standing up on the bridge shouting "told you so" down at you.

Update: One more thing. Not answering a stupid question is also negative feedback, and in the end it is meant for the asker, not for the others. Hopefully they notice they did something wrong and correct their behavior. Helping newbies who ask stupid questions is not just not negative feedback, it is positive feedback. It transports the message that it is OK to waste other people’s time rather than spend five minutes looking in the archive yourself. That trains people toward the exact opposite of the desired behavior.

Update: Another one. These remarks assume a technical, goal-oriented mailing list. Smart projects therefore run several mailing lists: foo-dev for the people pushing the project forward; foo-announce for the people who only want to know when a new version comes out; and foo-talk as a catch basin for all the blatherers who see mailing lists as a social forum for small talk.

Update: Boiled down to one sentence: your lifetime is not more valuable than mine. That is also why people do not like advertising, spam, preachers, cold calls, sales reps, or Jehovah’s Witnesses at the door. If you do not notice what an asshole move it is to claim my time with your trivial crap, you shift three points toward "asshole" on the asshole-friend scale. And do not come at me with freedom of speech. Then whatever you have to say first has to make up those three minus points in substance before I can rate it positively on balance. By the way, I also consider it a huge impertinence when random people phone me. Journalists in particular consider it completely normal to expect others to spend hours on the phone with them. Er, no. I would not dream of it. My phone is for emergencies. Family member disappeared. Friend missing. In case of doubt, YOU are NOT an emergency.

Update: Because a few individuals actually do not shrink from strawmanning this as "THIS IS JUST HOW NERDS DO IT AND THEREFORE IT IS RIGHT": the point here was not to say that this is the right way. Maybe it is not. The point was to say that this is not mean-spirited nastiness, that somebody thought something through here, and that behavior which from your point of view is completely harmless, even polite, borders, from others’ point of view, on antisocially wrecking other people’s nice things. In my experience it is much better to explain why things are the way they are than to declare that they are this way and you had better adapt. Things are this way because the nerds already tried the other options and they were all even worse. Are you perhaps the genius who will find the third way that makes everything better? Conceivable. But very, very unlikely. Far more likely you are in the middle of making a monumental idiot of yourself by repeating the mistakes other people have already made for you. At other people’s expense. :-)

October 07, 2015 04:01 PM


High Scalability

Zappos's Website Frozen for Two Years as it Integrates with Amazon

Here's an interesting nugget from a wonderfully written and deeply interesting article by Roger Hodge in the New Republic: A radical experiment at Zappos to end the office workplace as we know it:

Zappos's customer-facing web site has been basically frozen for the last few years while the company migrates its backend systems to Amazon's platforms, a multiyear project known as Supercloud.

It's a testament to Zappos that they still sell well with a frozen website while most of the rest of the world has adopted a model of continuous deployment and constant evolution across multiple platforms.

Amazon is requiring the move; otherwise a company like Zappos would probably be sensitive to the Conway’s law implications of such a deep integration. Keep in mind that Facebook is reportedly keeping WhatsApp and Instagram independent. This stop-the-world plan must mean something; unfortunately, I don’t have the strategic insight to understand why. Any thoughts?

The article has more tantalizing details about what's going on with the move:

by Todd Hoff at October 07, 2015 03:56 PM




Java FileInverter for .dat files [on hold]

I have to create a program which reads in a data file (numbers.dat) and prints out the list of numbers in reverse order, followed by the sum of the numbers. For example, if the .dat file says 1 2 3 4 5 6 7 8, then the output should be 8 7 6 5 4 3 2 1 | 36. I am not even sure how to start this program. Please help!
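
One possible starting point, sketched in Java (the file name numbers.dat and the whitespace-separated format come from the question; everything else, including the class and method names, is just one way to do it):

```java
import java.io.IOException;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class FileInverter {

    // Builds the required output: numbers in reverse order, then " | ", then the sum.
    static String invert(List<Integer> numbers) {
        StringBuilder out = new StringBuilder();
        int sum = 0;
        for (int i = numbers.size() - 1; i >= 0; i--) {
            out.append(numbers.get(i)).append(' ');
            sum += numbers.get(i);
        }
        return out.toString().trim() + " | " + sum;
    }

    public static void main(String[] args) throws IOException {
        // Read whitespace-separated integers from numbers.dat.
        List<Integer> numbers = new ArrayList<>();
        try (Scanner in = new Scanner(Paths.get("numbers.dat"))) {
            while (in.hasNextInt()) {
                numbers.add(in.nextInt());
            }
        }
        System.out.println(invert(numbers));
    }
}
```

Separating the reading (main) from the formatting (invert) makes the logic easy to test without a file on disk.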

by Josh M. at October 07, 2015 03:30 PM


Where can I download the history of dividends for Nasdaq?

I want to download the annual dividends (regular, special, and repurchases) for all stocks on Nasdaq for 5 years. Does anyone know where I can find this? Thank you.

by user2933161 at October 07, 2015 03:07 PM

How to interpret regression coefficients with dummy explanatory variables?

I am a bit confused about the interpretation of the regression coefficients in the model:

$$R_t = \beta_0 + \beta_1 R_{mt} + \beta_2 D_t + \epsilon_t$$

where $R_{t}$ is the log return of some stock, defined as $\log(P_t) - \log(P_{t-1})$, $R_{mt}$ is the log return of some market index (e.g., the S&P 500), and $D_t$ is a dummy variable ($D_t=1$ if earnings announcements are published on day $t$ and $D_t = 0$ otherwise).

The results are $\beta_1= 0.024$ and $\beta_2= -0.03$. Is the following interpretation correct?

(1) an increase in the market return of 1% leads to an increase of the stock return of 2.4% or 0.024% (as both variables are in logs and thus $\beta_1$ can be interpreted as elasticity)?

(2) And on days with earnings announcements, the return is -3% or -0.03% lower than the average return of the stock (here we have a log dependent and a non-log independent)?

by jeffrey at October 07, 2015 03:07 PM


ZFS corruption after upgrade from 10.1 to 10.2

(Apologies for the long post.) This is my first larger ZFS configuration. I'm still learning, and I managed to get myself into a bit of trouble: I've lost my zpool after an upgrade. I was hoping to recover it, if at all possible. Seeing this post gives me some hope that it is possible.

I have 3 pools:

  • Mirrored bootpool for mirrored root, configured by the FreeBSD installer

  • Full disk encrypted (geli), mirrored root, configured by the FreeBSD installer.

    • 2x 128GB
  • Full disk encrypted (geli), raidz2 Data pool

    • 8x 1TB

# uname -a
FreeBSD onyx 10.1-RELEASE-p19 FreeBSD 10.1-RELEASE-p19 #0: Sat Aug 22 03:55:09 UTC 2015  amd64

# dmesg | grep ZFS
ZFS filesystem version: 5
ZFS storage pool version: features support (5000)

# zpool status
  pool: bootpool
 state: ONLINE
  scan: scrub repaired 0 in 0h0m with 0 errors on Sun Sep 27 01:14:39 2015
config:

        NAME           STATE     READ WRITE CKSUM
        bootpool       ONLINE       0     0     0
          mirror-0     ONLINE       0     0     0
            gpt/boot0  ONLINE       0     0     0
            gpt/boot1  ONLINE       0     0     0

errors: No known data errors

  pool: zroot
 state: ONLINE
  scan: resilvered 888K in 0h0m with 0 errors on Mon Sep 28 18:05:19 2015
config:

        NAME            STATE     READ WRITE CKSUM
        zroot           ONLINE       0     0     0
          mirror-0      ONLINE       0     0     0
            ada0p4.eli  ONLINE       0     0     0
            ada1p4.eli  ONLINE       0     0     0

errors: No known data errors

I updated from 10.1 to 10.2 using freebsd-update, and my data pool wasn't visible. I noticed that my disks were not all there while unlocking them: one of the disks had become unplugged in the case from a loose cable. I had trouble unlocking the drives. It was at this point that I decided to downgrade back to 10.1.

(I thought there was an issue with my unlocking script introduced with 10.2, but it turns out the missing disk was da0, and all the disk numbers had shifted over by one. This was my bad. I was panicking.)

I reseated the cable and then I was able to unlock the drives. But even after unlocking, I could not import the data pool with zpool import.

# zpool import
   pool: storage
     id: 2165434402974506719
  state: UNAVAIL
 status: One or more devices are missing from the system.
 action: The pool cannot be imported. Attach the missing devices and try again.
    see:
 config:

        storage                   UNAVAIL  insufficient replicas
          raidz2-0                UNAVAIL  insufficient replicas
            da7.eli               ONLINE
            10453896490657840697  UNAVAIL  cannot open
            14039771922452082632  UNAVAIL  cannot open
            5539303168103740339   UNAVAIL  cannot open
            12558221764161796891  UNAVAIL  cannot open
            1351361181721955963   UNAVAIL  cannot open
            8917293899674258178   UNAVAIL  cannot open
            2642373353642314454   UNAVAIL  cannot open

# zpool import -a
cannot import 'storage': no such pool or dataset
        Destroy and re-create the pool from a backup source.

Using zdb, I get 2 different outputs: One with broken labels, and one with lots of pool data.

This is what I get on da0-da6:

# zdb -l /dev/da0.eli
--------------------------------------------
LABEL 0
--------------------------------------------
failed to unpack label 0
--------------------------------------------
LABEL 1
--------------------------------------------
failed to unpack label 1
--------------------------------------------
LABEL 2
--------------------------------------------
failed to unpack label 2
--------------------------------------------
LABEL 3
--------------------------------------------
failed to unpack label 3

# zdb -l /dev/da7.eli
--------------------------------------------
LABEL 0
--------------------------------------------
    version: 5000
    name: 'storage'
    state: 0
    txg: 4145887
    pool_guid: 2165434402974506719
    hostid: 4020756484
    hostname: 'onyx'
    top_guid: 4577084771010321867
    guid: 4306957689091047968
    vdev_children: 1
    vdev_tree:
        type: 'raidz'
        id: 0
        guid: 4577084771010321867
        nparity: 2
        metaslab_array: 34
        metaslab_shift: 36
        ashift: 12
        asize: 8001599569920
        is_log: 0
        create_txg: 4
        children[0]:
            type: 'disk'
            id: 0
            guid: 4306957689091047968
            path: '/dev/da0.eli'
            phys_path: '/dev/da0.eli'
            whole_disk: 1
            DTL: 58
            create_txg: 4
        children[1]:
            type: 'disk'
            id: 1
            guid: 10453896490657840697
            path: '/dev/da1.eli'
            phys_path: '/dev/da1.eli'
            whole_disk: 1
            DTL: 57
            create_txg: 4
        children[2]:
            type: 'disk'
            id: 2
            guid: 14039771922452082632
            path: '/dev/da2.eli'
            phys_path: '/dev/da2.eli'
            whole_disk: 1
            DTL: 56
            create_txg: 4
        children[3]:
            type: 'disk'
            id: 3
            guid: 5539303168103740339
            path: '/dev/da3.eli'
            phys_path: '/dev/da3.eli'
            whole_disk: 1
            DTL: 55
            create_txg: 4
        children[4]:
            type: 'disk'
            id: 4
            guid: 12558221764161796891
            path: '/dev/da4.eli'
            phys_path: '/dev/da4.eli'
            whole_disk: 1
            DTL: 54
            create_txg: 4
        children[5]:
            type: 'disk'
            id: 5
            guid: 1351361181721955963
            path: '/dev/da5.eli'
            phys_path: '/dev/da5.eli'
            whole_disk: 1
            DTL: 53
            create_txg: 4
        children[6]:
            type: 'disk'
            id: 6
            guid: 8917293899674258178
            path: '/dev/da6.eli'
            phys_path: '/dev/da6.eli'
            whole_disk: 1
            DTL: 52
            create_txg: 4
        children[7]:
            type: 'disk'
            id: 7
            guid: 2642373353642314454
            path: '/dev/da7.eli'
            phys_path: '/dev/da7.eli'
            whole_disk: 1
            DTL: 51
            create_txg: 4
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data
[... LABEL 1, 2, 3 ...]

I'm hoping that because I couldn't attach my drives in 10.2, I would've been unable to directly cause damage to the pool. Perhaps that is the denial speaking; however, I'm not sure what my options for recovery are (that is, without going through expensive recovery services).

Any ideas would be appreciated. Thanks!

submitted by ellkae
[link] [8 comments]

October 07, 2015 03:06 PM



Planet Emacsen

Irreal: Emacs, OS X, and PATHS

As I mentioned the other day, I encountered a difficulty in building the documentation for the latest version of SBCL. That's because I recently updated OS X to El Capitan, which has a new feature: it's impossible for anyone (even root) to write into the /usr directory (/usr/local is OK, though). That's a bit annoying, but I suppose it does make some sense from a security standpoint.

In any event, the MacTeX distribution used to set a symbolic link into /usr so that the shell and other applications could find the executables. Those links were removed when El Capitan was installed, so the SBCL makefile couldn't find TeX to build the documentation. It's easy to fix this, and Herbert Schulz has a nice writeup on how to do it. While I was at it I also downloaded and installed the latest version of MacTeX. Once I had it loaded, I fixed my shell PATH variable and the SBCL documentation built without incident.

That left the problem of fixing the PATH variable in Emacs, which always seems much harder than it should be. The reason is that the GUI version of Emacs under OS X does not import the shell's PATH variable, so you have to set it up by hand. Actually, you don't: Steve Purcell has a tremendously useful package called exec-path-from-shell that imports the shell PATH variable into Emacs for you.

That simplifies everything. I just installed the package and it took care of telling Emacs where the TeX executables were. I highly recommend this package if you are an OS X user.

by jcs at October 07, 2015 02:52 PM


At-the-money Call Spread approximation

In a trading manual I got during a course, the value of the ATM call spread is approximated by $CS_{ATM}=\frac{1}{2}\text{StrD}+(F-m)\times\Delta CS$. The lecturer skipped the part where he derived this approximation, and couldn't answer why this formula holds. So does anyone have a clue? Here StrD is the strike difference, $m$ is the midpoint between the strikes of the call spread, and $F$ is the future.

$\Delta CS$ was approximated by $0.33\times\frac{\text{StrD}}{\text{Straddle}}$ (which is a consequence of the normal distribution), where we take the straddle equal to one standard deviation (actually $\sqrt{2/\pi}$ times the straddle would be more precise).

by Cindy88 at October 07, 2015 02:50 PM


Is it possible to build an operating system that could easily work through only a browser?

Does anyone know if it would be possible to build an operating system that, unlike remote desktop clients (too laggy unless connected to the same network), could actually run properly when accessed through a browser?

submitted by simonromano2007
[link] [13 comments]

October 07, 2015 02:48 PM


Complexity of calculating average across distributed network?

What is the communication complexity of the best known algorithm for computing the average of a set of N*D numbers distributed across N nodes with D values per node? Assume you have full control over the network topology and protocol.
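For a concrete baseline: with a spanning tree over the N nodes, each node can forward a single (sum, count) pair summarizing its subtree to its parent, giving N-1 constant-size messages, i.e. O(N) communication independent of D. A sketch (the explicit tree layout here is an assumption for illustration):

```python
def tree_average(values, children, root=0):
    # values: dict node -> list of that node's D local numbers.
    # children: dict node -> list of child nodes in a spanning tree.
    # Each node sends one (sum, count) pair to its parent: N-1 messages
    # of constant size, so O(N) total communication regardless of D.
    def agg(u):
        s, c = sum(values[u]), len(values[u])
        for v in children.get(u, []):
            cs, cc = agg(v)
            s += cs
            c += cc
        return s, c

    s, c = agg(root)
    return s / c
```

Whether O(N) messages is optimal for your setting depends on the model (message size limits, synchrony, fault tolerance), which is where the real question lies.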

by elplatt at October 07, 2015 02:45 PM



Can adding an uncorrelated high vol strategy to a low vol portfolio result in a portfolio with even lower volatility?

Let's say I have fund A with 20% annualized volatility and fund B with 15% annualized volatility. If A and B have zero correlation, can a combination of these funds have volatility below 15%? Are there any papers explaining this?
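The question's own numbers make a quick check possible. With zero correlation, portfolio variance is $w^2\sigma_A^2 + (1-w)^2\sigma_B^2$, minimized at $w = \sigma_B^2/(\sigma_A^2+\sigma_B^2)$; plugging in 20% and 15% gives a combined volatility of 12%, below either fund. A sketch:

```python
import math

def combined_vol(sigma_a, sigma_b, w, rho=0.0):
    # Volatility of a portfolio with weight w in A and (1 - w) in B.
    var = (w * sigma_a) ** 2 + ((1 - w) * sigma_b) ** 2 \
          + 2 * rho * w * (1 - w) * sigma_a * sigma_b
    return math.sqrt(var)

# With zero correlation, the variance-minimizing weight in A is
# sigma_b^2 / (sigma_a^2 + sigma_b^2).
w_star = 0.15 ** 2 / (0.20 ** 2 + 0.15 ** 2)  # = 0.36
```

Here `combined_vol(0.20, 0.15, w_star)` works out to 0.12, i.e. 12%. This is the standard Markowitz diversification result; any textbook treatment of minimum-variance portfolios covers it.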

by siegel at October 07, 2015 02:35 PM


Finding a Regular Expression for an Intersection of Two Regular Expressions

The pair of regular expressions is ((ss*)t*) and ((ss*) + (tt*)).

How do I find a regular expression that represents the intersection of the languages defined by the pair?
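For this particular pair, the tt* branch of the second expression contains no s, while ss*t* requires a leading s; that suggests the intersection is just ss*. A brute-force sanity check over short strings (not a proof, but it supports the product-construction answer):

```python
import re
from itertools import product

def lang(pattern, max_len=6):
    # All strings over {s, t} up to max_len that fully match the pattern.
    return {"".join(w) for n in range(max_len + 1)
            for w in product("st", repeat=n)
            if re.fullmatch(pattern, "".join(w))}

# Strings in both languages, versus the candidate intersection ss*.
both = lang("ss*t*") & lang("ss*|tt*")
```

The general method is the product automaton: build DFAs for both expressions, run them in lockstep on paired states, and convert the product DFA back to a regular expression.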

by Cindy at October 07, 2015 02:33 PM


Dave Winer

Blogging like it's 1999

I get ideas that are paragraph length.

I don't want to try to save them in Facebook.

They don't fit in Twitter.

But each of these systems has a certain gravity, they pull ideas into them.

Can my ideas have an existence outside of Twitter and Facebook?

My blog had more features, worked better for me, in 1999.

In the last ten years I've had to pull features out of my blogging system; instead of making it better, it lost functionality.

So in a way, my "better blogging system" would just be what I already had working 15 years ago.

I keep remembering that, between Google Reader and its limits (items must have titles), Twitter with its limits (only 140 chars, no titles, one link, no styling), and the same with Facebook (no links or styling), my online writing has diminished dramatically, conforming to the contradictory limits of each of these systems.

I keep working on this, still am. Every day.

At some point I will have to try to stop doing it the way they want me to do it.

Let's go Mets!

October 07, 2015 02:14 PM


minimum height in tree

I'm trying to optimize the code:

data Tree = Empty | Node Integer Tree Tree
minHeight Empty = -1
minHeight (Node _ Empty Empty) = 0
minHeight (Node _ l     r    ) = (min (minHeight l) (minHeight r)) + 1

My goal is to not calculate the branches whose depth will be higher than the already known one.
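One standard way to get that behavior is breadth-first traversal: the shallowest node with a missing child fixes the minimum height, so deeper branches are never visited. A sketch of the idea in Python (tuples standing in for the Haskell constructors; note that under the equations above, a Node with one Empty child already has minHeight 0):

```python
from collections import deque

def min_height(tree):
    # tree is None or a (value, left, right) tuple.
    # BFS stops at the shallowest node with a missing child, so branches
    # deeper than the answer are never explored.
    if tree is None:
        return -1
    queue = deque([(tree, 0)])
    while queue:
        (_, left, right), depth = queue.popleft()
        if left is None or right is None:
            return depth
        queue.append((left, depth + 1))
        queue.append((right, depth + 1))
```

In Haskell the same pruning effect can be had with a level-by-level traversal, or by threading a "best so far" depth bound through the recursion.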

by Лилия Сапурина at October 07, 2015 02:13 PM


My mail setup has, for about a day, due to a ...

My mail setup had been rejecting all incoming mail for about a day due to a typo in the spam-defense configuration. It should be working again now.

Then again... that was the most relaxed day in a long time, with no incoming mail at all :-)

October 07, 2015 02:01 PM






Planet Theory

Randomness by Complexity

Let n be the following number
135066410865995223349603216278805969938881475605667027524485143851526510604859533833940287150571909441798207282164471551373680419703964191743046496589274256239341020864383202110372958725762358509643110564073501508187510676594629205563685529475213500852879416377328533906109750544334999811150056977236890927563 (RSA 1024)

What is the probability that the fifth least significant bit of the smallest prime factor of n is a one? This bit is fully defined: the probability is either one or zero. But if you gave me better than 50/50 odds one way or the other, I would take the bet, unless you worked for RSA Labs.

How does this jibe with a Frequentist or Bayesian view of probability? No matter how often you factor the number you will always get the same answer. No matter what randomized process was used to generate the original primes, conditioned on n being the number above, the bit is determined.

Whether we flip a coin, shuffle cards, choose lottery balls or use a PRG, we are not creating truly random bits, just a complex process whose unpredictability is, we hope, indistinguishable from true randomness. We know from Impagliazzo-Wigderson, and its many predecessors and successors, that any sufficiently complex process can be converted to a PRG indistinguishable from true randomness. Kolmogorov complexity tells us we can treat a single string with no short description as a "random string".

That's how we ground randomness in complexity: Randomness is not some distribution over something not determined, just something we cannot determine.

by Lance Fortnow ( at October 07, 2015 12:53 PM




How can we write an optimal function to get alphabetical sequence for a given number?

I want a function like

string getAlphabetEncoding(num) {
    // some logic
    return strForNum;
}
input: 1 output: a, input: 5 output: e, input: 27 output: aa, input: 676 output: yz, input: 677 output: za, ...
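This is bijective base-26 numbering (Excel-style column names). A compact sketch, assuming 1-based input and lowercase output (the examples above mix case):

```python
def get_alphabet_encoding(num):
    # Bijective base-26: 1 -> "a", 26 -> "z", 27 -> "aa", 677 -> "za".
    # Subtracting 1 before each divmod is what makes the system bijective
    # (there is no digit for zero).
    s = ""
    while num > 0:
        num, r = divmod(num - 1, 26)
        s = chr(ord('a') + r) + s
    return s
```

The key difference from ordinary base conversion is the `num - 1`: without it, multiples of 26 map incorrectly.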

by Sam at October 07, 2015 12:11 PM


A Better Approach to Collaborative Problem Solving


It can be difficult to build team consensus on the best way to solve a technical problem. I believe the difficulty often stems from how each of us strives to present our own solutions without really listening to others in a spirit of true team support.

To improve the way we collaborate and overcome team dysfunction, I’m proposing a new working agenda.

When Passion Leads to Unintended Team Dysfunction

At Atomic, we commonly work in agile teams of two to eight designers and developers. When it’s time to create something new, the team comes together to discuss the best solution.

We are passionate makers, and in my experience, several of us would usually enter the discussion with pre-formed ideas, often strongly held but not fully formed. Someone would take the lead and start talking about their idea. I’d listen somewhat impatiently, waiting to hear how an edge case I’d considered wasn’t covered by their solution. As soon as I spotted a weakness, I’d pounce into the conversation, pointing out where the solution failed to meet a requirement and transitioning into promoting my own idea. It wasn’t uncommon to experience meetings where the “presentation token” would be stolen back and forth multiple times between several teammates because they were behaving just like I was.

The overall team dynamic remained collegial throughout projects, but the technical design discussions could become tense. Occasionally, if someone on the team hadn’t fully bought in to an approach, tension would continue into implementation or maintenance of a solution.

I believe it’s common for engineers to seek the failure potential of any solution. Our critical mindset makes us great at ensuring quality. We also take pride in applying our creativity to high-quality solutions. What we needed at Atomic was not less critical thinking, but a better agenda for collaborative, technical design sessions that would play to our strengths without unintentionally leading to a zero-sum competition.

Atomic has already learned important lessons on how to improve collaboration. I propose building on these ideas to create the following four-step agenda for more collaborative technical design sessions.


A More Productive Approach to Problem Solving

1. Identify who has ideas to share.

To ensure that everyone will have an opportunity to present their ideas, start the discussion by asking who has something to share. When you ensure that everyone on the team will have a chance to present, individuals will be less likely to interrupt or jump in to make their idea heard.

Inviting everyone to share ideas might make the session run longer, but the marginal time increase will pay off during implementation if everyone feels heard and leaves the meeting aligned on the outcome.

2. Get aligned on the problem definition.

The foundation of every solution is the problem it solves. If the team can’t agree on what the problem is, they aren’t likely to agree on a solution.

When a team member has ideas to propose, ask them to share the way they frame the problem. Visualize the problem on a whiteboard so others can add to it as additional dimensions of the problem arise. Get the entire team aligned on a shared understanding of sunny-day and rainy-day test cases the solution must handle. Ask all team members to brainstorm edge-case situations.

Once the team is aligned on a shared understanding of the problem, invite team members to share their ideas for a solution.


3. Share and build on ideas for solutions.

Remind team members who came to the meeting with solutions that everyone will get a chance to share their ideas. Ask the team to listen to each idea from a standpoint of support. Use the mindset of “yes, and…” instead of “yes, but…” to help build constructive feedback.

For example, consider these contrasting approaches:

#1. “Yes, but your idea fails in situation X. My solution handles X well because I’ve considered…”
#2. “Yes, and I’d like to help figure out a way to extend your idea to also handle situation X. How might we add logic to…”

Approach #1 discounts all of the good thinking in an idea, highlights a weakness, and uses the weakness to gain control of the conversation to present a competing idea.

Approach #2 validates all of the good thinking in an idea, highlights a design challenge, and opens the conversation up to collaborative design.

Remind team members that it is okay to have strong opinions, but those opinions should be weakly held. That way, everyone is open to supporting others and letting others change and build on their own ideas.

Be sure to visualize solutions on a whiteboard. Seeing both the problem and solution(s) visualized will help all team members identify the part of the solution that needs to be improved instead of discounting the entire idea. It’s easy to circle a subcomponent of a technical solution to bring focus to a specific point you are trying to make. It’s hard to verbally navigate an entire team through their often misaligned mental models.

4. Choose the best path forward.

If your team has taken a constructive and supportive approach, it’s fairly common to come out of the idea sharing step with buy-in on a single solution. If there is more than one viable solution, all options will have been collaboratively defined. At this point, the selection of a solution will be depersonalized, and the decision can be made by considering more objective heuristics like cost, complexity, extensibility, etc.

Does your team make technical decisions collaboratively? If so, I hope you’ll share any useful tips you are using to facilitate constructive discussion. Please also share your feedback if you try this agenda. I’m eager to hear how it works for you.

The post A Better Approach to Collaborative Problem Solving appeared first on Atomic Spin.

by Shawn Crowley at October 07, 2015 12:00 PM

Oleg Kiselyov


Pattern matching for lambdas?

Is it possible to have pattern matching of arguments and casing for an anonymous function? If so, what is the syntax?

by Old Geezer at October 07, 2015 11:53 AM


k-best n-dimensional assignment algorithm [on hold]

Just a simple question/request: I am looking for any open-source code which is able to solve the k-best n-dimensional assignment problem (n > 2).

I will be very happy for any tips.

by michal at October 07, 2015 11:18 AM

How Cellular Automata is related to Automata Theory?

I have read about automata theory, which is the study of abstract machines and automata.

And I know that an abstract machine takes input, processes it, and creates output, just like Conway's Game of Life.

But what formal definition relates it to cellular automata? And what about elementary cellular automata; can they be related as well?

by David Kerry at October 07, 2015 11:08 AM


Consumption Based Asset Pricing

I am working on some consumption based asset pricing models. I am modelling consumption growth in several different ways. An obvious one is to model consumption growth as an AR(1) process:

$g_{t+1} = \phi_0 + \phi_1 g_t +\epsilon_{t+1 }$

Where $g_{t+1}$ is consumption growth. What are the implications of having $\phi_1=0$? What about $\phi_1=1$?

In which case is consumption growth iid?

by volcompt at October 07, 2015 11:03 AM


Decide whether a point is a vertex of a polytope?

Inspired by the question, I would like to ask the following question:

Input: A polytope specified by $\Theta=\{\vec{x}\mid A\vec{x}\leq b\}$, and its affine projection $f(\Theta)= \{(\vec{c}_1\cdot \vec{x}+ d_1, \cdots, \vec{c}_m\cdot \vec{x}+d_m)\mid A\vec{x}\leq b\}$, a point $\vec{y}\in \mathbf{Q}^m$

Decide whether $\vec{y}$ is a vertex of $f(\Theta)$, what's the complexity?

I strongly believe there is a polynomial-time algorithm (one might obtain the many $\vec{x}$'s which project to the given $\vec{y}$ and check whether one of these $\vec{x}$'s is a vertex of $\Theta$?). However, I do not have a clear algorithm.

by maomao at October 07, 2015 10:34 AM


How to calculate volatility on intraday data?

I have several weeks of minute-by-minute stock data (start and end prices, volume). Everything I've read so far leads me to believe there isn't a standard method for volatility, which is leaving me confused on what to use.

I want to calculate the volatility for each minute so I can see how volatility is changing through the day and whether there is a correlation with other variables. What methods would apply here?
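One common choice is rolling realized volatility of log returns over a trailing window. A sketch; the window length and the annualization constants (390 trading minutes per day, 252 trading days per year) are assumptions you would adjust for your market:

```python
import math
from statistics import stdev

def rolling_realized_vol(prices, window=30):
    # Minute-close prices -> rolling sample std of log returns,
    # annualized assuming 390 trading minutes/day, 252 days/year.
    rets = [math.log(b / a) for a, b in zip(prices, prices[1:])]
    scale = math.sqrt(390 * 252)
    return [stdev(rets[i - window:i]) * scale
            for i in range(window, len(rets) + 1)]
```

More sophisticated intraday estimators (Parkinson, Garman-Klass using high/low, or realized-variance measures) exist; this close-to-close version is just the simplest baseline to correlate against other variables.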

by brykm at October 07, 2015 10:30 AM



Finding mean vector and covariance matrix for annual returns given quarterly returns

I am currently trying to calculate a vector of the mean annual returns of 4 different asset classes, along with their 4x4 covariance matrix, in Excel. However, I am having problems since the data I have been supplied is quarterly returns (as calculated from the relevant stock indices). I know that quarterly and annual returns are linked by the following equation $$r_{Annual,x}=\prod_{q=1}^{4}(1+r_{q,x})-1$$ where $r_{q,x}$ is the return for the $q^{th}$ quarter and $x$ denotes that this is the $x^{th}$ year. What I have done is calculate $(1+r_{i,j})^4-1$, where $r_{i,j}$ is the $j^{th}$ quarterly return observation for the $i^{th}$ asset class (I have 4 asset classes).

This results in 4 new vectors where each quarterly return has been converted to an effective annual return. I then apply the mean and covariance functions to these converted values to get the mean vector and covariance matrix.

However, I get the feeling this is wrong. Any feedback would be greatly appreciated, and if you could provide a worked example with the following (quarterly return) data simulated in Excel, that would be great.
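For what it's worth, the two constructions genuinely differ: compounding four actual consecutive quarters (the first formula above) is not the same as annualizing each quarter in isolation via $(1+r)^4-1$, and means and covariances computed from the two will differ accordingly. A small illustration using the first four quarters of asset 1 from the table below:

```python
def annual_returns(quarterly):
    # Compound non-overlapping groups of 4 actual quarters into annual
    # returns; a trailing partial year is dropped.
    out = []
    for i in range(0, len(quarterly) - len(quarterly) % 4, 4):
        r = 1.0
        for q in quarterly[i:i + 4]:
            r *= 1 + q
        out.append(r - 1)
    return out

def annualized_each(quarterly):
    # The shortcut from the question: treat every quarter as its own year.
    return [(1 + q) ** 4 - 1 for q in quarterly]
```

Compounding preserves the year-by-year sample size (one annual observation per 4 quarters), whereas the shortcut inflates it and distorts the covariance structure.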

Asset 1 Asset 2 Asset 3 Asset 4
1.98%   2.99%   1.91%   3.24%
-1.22%  -1.87%  1.18%   2.35%
5.00%   6.10%   1.08%   4.46%
2.45%   -0.10%  1.75%   3.06%
5.70%   3.70%   1.47%   2.10%
8.29%   1.96%   1.51%   0.62%
-3.36%  -1.25%  1.11%   2.72%
-2.01%  0.21%   1.49%   0.90%
-4.99%  2.44%   1.48%   2.02%
4.67%   -0.29%  1.48%   1.51%
0.86%   2.92%   1.54%   2.38%
7.52%   -0.75%  1.43%   3.36%
-1.24%  3.11%   1.20%   1.93%
-4.25%  1.41%   1.35%   0.84%
3.89%   -0.25%  0.86%   1.46%
0.67%   4.13%   1.45%   2.30%
3.93%   1.53%   1.43%   1.35%
-5.00%  -0.63%  1.50%   3.35%
3.47%   2.99%   1.50%   3.06%
7.76%   3.61%   0.98%   3.79%
-8.26%  1.03%   1.18%   2.90%
5.45%   1.57%   1.05%   3.38%
-2.65%  2.25%   1.44%   1.45%
0.76%   7.50%   1.52%   1.79%
4.55%   -2.72%  1.31%   1.82%
1.32%   8.70%   1.36%   1.24%
-3.02%  -1.43%  1.52%   3.92%
2.05%   1.20%   1.94%   2.50%
7.37%   0.29%   1.64%   2.73%
-0.66%  2.36%   1.75%   2.68%
-6.86%  -1.40%  1.35%   2.62%
13.55%  6.03%   1.30%   1.33%
-2.23%  5.26%   1.44%   1.48%
-4.26%  0.45%   1.61%   2.93%
6.75%   1.70%   1.08%   1.26%
-0.84%  -2.16%  0.89%   2.45%
-0.66%  4.03%   1.66%   2.98%
0.15%   1.94%   1.21%   3.13%
-4.74%  0.26%   1.44%   2.35%
1.29%   -0.76%  0.96%   1.52%
13.22%  0.37%   1.74%   4.11%
-2.37%  0.41%   1.25%   2.71%
3.16%   1.27%   1.22%   2.54%
1.56%   4.54%   0.87%   3.55%
0.97%   2.18%   1.52%   1.75%
3.32%   2.41%   1.64%   2.12%
-1.80%  -2.07%  1.12%   0.77%
7.16%   1.87%   1.61%   3.68%
10.65%  4.89%   1.25%   2.47%
5.47%   4.26%   1.33%   1.94%
0.31%   5.53%   1.32%   3.89%
3.59%   4.55%   1.46%   2.27%
7.24%   1.50%   1.67%   1.32%
3.06%   0.87%   1.78%   3.43%
7.43%   3.92%   1.58%   3.05%
11.29%  2.17%   1.47%   2.76%
11.85%  3.64%   1.59%   1.57%
1.68%   -1.25%  1.48%   2.37%
9.93%   -0.53%  1.95%   1.76%
-9.09%  -3.53%  1.32%   3.30%
-3.09%  3.00%   1.41%   2.86%
2.99%   1.49%   1.34%   1.97%
0.28%   3.81%   1.30%   2.27%
9.23%   6.06%   1.17%   2.44%
6.74%   7.60%   1.52%   1.66%
0.48%   5.72%   1.61%   1.25%
-2.28%  0.96%   1.30%   2.69%
4.16%   -1.94%  1.68%   2.95%
7.21%   4.77%   1.59%   1.40%
-3.97%  -0.84%  1.56%   2.11%
2.48%   1.46%   1.22%   3.60%
5.89%   1.65%   1.83%   1.62%
-2.73%  4.73%   1.37%   1.99%
9.12%   -0.52%  1.29%   2.89%
1.29%   3.89%   1.35%   2.47%
4.18%   -1.76%  1.58%   3.85%
-9.78%  2.66%   1.09%   1.63%
10.73%  -1.80%  1.57%   3.11%
3.37%   -0.03%  1.67%   1.57%
11.17%  5.92%   1.26%   2.04%
7.13%   3.02%   1.79%   2.29%
13.12%  -0.75%  1.25%   2.72%
-2.73%  1.45%   1.04%   3.61%
-8.38%  -1.19%  1.16%   2.15%
4.63%   0.53%   1.33%   3.00%
1.93%   4.88%   1.39%   3.65%
-6.75%  0.74%   0.95%   2.40%
5.43%   0.75%   1.33%   2.76%
-4.68%  3.02%   0.90%   2.38%
4.99%   4.99%   1.66%   0.84%
-0.44%  3.68%   1.51%   1.84%
13.18%  3.82%   1.69%   2.88%
8.05%   2.02%   0.79%   2.50%
-3.39%  1.93%   1.15%   2.65%
21.79%  3.30%   1.36%   2.12%
-5.96%  6.49%   1.58%   3.43%
1.43%   -1.17%  0.89%   0.57%
10.21%  4.70%   1.27%   1.69%
-13.53% 7.99%   1.31%   2.62%
-5.25%  -0.46%  1.08%   3.75%

by user135784 at October 07, 2015 09:59 AM

Use of Black-Scholes Model on Guaranteed Fund Investment

I am stuck with a revision question at home on Black-Scholes pricing model.

The question is about a fund manager selling one unit of the fund to a customer for $S(0)$ at time $0$ and then guaranteeing to pay the customer at time $T$ the maximum of $S(0)$ and $e^{-aT} S(T)$, where $a>0$ is the fund fee. The model I read about talks about the maximum of $S(T)-K$ and $0$.

How can one value the customer's payoff at time $0$ or time $t$?
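One way to connect this to the standard call payoff (a sketch, assuming the guarantee settles at time $T$): split the maximum into a certain amount plus a call,

```latex
\max\bigl(S(0),\, e^{-aT} S(T)\bigr)
  = S(0) + e^{-aT}\,\max\bigl(S(T) - e^{aT} S(0),\, 0\bigr)
```

so the customer's claim is a sure payment of $S(0)$ at $T$ plus $e^{-aT}$ European calls with strike $K = e^{aT} S(0)$, each valued with the standard Black-Scholes formula; discounting the sure leg at the risk-free rate then gives the time-0 value.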

by Mulita at October 07, 2015 09:50 AM


How to copy a column of code manipulated into a specific position?

I found myself needing to do the following today and realized I had no good way to do it. I wanted to use the enum types from this:


to turn this:

static const struct RFstring ast_type_strings[] = {
    RF_STRING_STATIC_INIT("root"),
    RF_STRING_STATIC_INIT("block"),
    RF_STRING_STATIC_INIT("variable declaration"),
    RF_STRING_STATIC_INIT("return statement"),
    RF_STRING_STATIC_INIT("type declaration"),
    RF_STRING_STATIC_INIT("type operator"),
    RF_STRING_STATIC_INIT("type leaf")
}

into this:

static const struct RFstring ast_type_strings[] = {
    [AST_ROOT] = RF_STRING_STATIC_INIT("root"),
    [AST_BLOCK] = RF_STRING_STATIC_INIT("block"),
    [AST_VARIABLE_DECLARATION] = RF_STRING_STATIC_INIT("variable declaration"),
    [AST_RETURN_STATEMENT] = RF_STRING_STATIC_INIT("return statement"),
    [AST_TYPE_DECLARATION] = RF_STRING_STATIC_INIT("type declaration"),
    [AST_TYPE_OPERATOR] = RF_STRING_STATIC_INIT("type operator"),
    [AST_TYPE_LEAF] = RF_STRING_STATIC_INIT("type leaf")
}

Unfortunately, in the few seconds of thinking about how to do it the smart way, I came up with nothing, so I did it manually. I even tried to help myself by rectangle-copying the enum values and then yanking them, but that also took in the newlines and did not help.

So the question is how can I do it in a smarter way with emacs?

submitted by LefterisJP
[link] [8 comments]

October 07, 2015 09:36 AM


reduction of maxcut problem

Show that if the MAX CUT decision problem can be solved in polynomial time, so can the MAX CUT optimization problem, by writing an algorithm that solves the optimization problem using an algorithm for the decision problem as a subroutine.


INSTANCE: A graph G = (V,E), a function c giving each edge e ∈ E an integer capacity c(e) and an integer B. QUESTION: (D) Does the graph have a cut of size at least B? (O) Find a cut with the maximum size.


Find the max-cut value k by binary search from 1 to m using the decision subroutine,
     where m is the maximum possible cut value
for each edge e = (u,v) ∈ E
  construct G' by removing edge e from G
  if SUBROUTINE(G', k) = 1   // removing e has no effect on the max cut
    add u and v to the same set, S or S' (use 2-SAT to determine which)
  else
    add u and v to different sets

This algorithm works only if there is a unique max-cut solution in the graph (I have considered only unweighted graphs). In the case of a complete graph this algorithm fails, as more than one max-cut solution is present.

Consider the 4-node complete graph with edges (1,2), (1,3), (1,4), (2,3), (2,4), (3,4). The algorithm fails because when we remove any one edge there is still a max cut of size 4.

Please suggest any other algorithm that works, or any additional condition that would make this approach succeed.

Thanks in advance

by suhastheju at October 07, 2015 09:33 AM


Fred Wilson

Twitter Moments

So the thing I blogged about last week launched yesterday. Twitter is calling it Moments.

I think this is a big deal for first time and casual users of Twitter. It’s an easier way to consume the content in Twitter for people who don’t have the time or inclination to customize Twitter to work for them the way many of us have.

Since the AVC community was fairly negative on this in concept, I’m wondering how all of you are thinking about it now that it is out. Let the conversation commence.

by Fred Wilson at October 07, 2015 08:54 AM



What is the time complexity of checking if a number is prime?

Could someone please explain how to get the time complexity of checking if a number is prime? I'm really confused as to whether it's O(sqrt(n)) or O(n^2).

I iterate from i=2 to sqrt(n), continuously checking that n%i != 0.

However, do we consider the complexity of the square root function also?

If we do, Newton's method for finding square roots has a complexity of n^2.
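For reference, a trial-division checker needs no floating-point square root at all: looping while `i*i <= n` (or using an exact integer square root) avoids Newton iteration entirely, and the loop performs O(sqrt(n)) division steps, counting each division as one operation:

```python
import math

def is_prime(n):
    # Trial division: at most floor(sqrt(n)) - 1 divisions, i.e. O(sqrt(n))
    # arithmetic operations. math.isqrt is an exact integer square root,
    # so no Newton iteration or float rounding is involved.
    if n < 2:
        return False
    for i in range(2, math.isqrt(n) + 1):
        if n % i == 0:
            return False
    return True
```

Note the common subtlety: measured in the bit length of n, sqrt(n) is exponential, which is why "polynomial time" claims for primality refer to the number of digits, not the value of n.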


by user3409705 at October 07, 2015 07:56 AM



Inference rule with two conclusions or rather inverse function application

I want to express a simple correctness theorem for a term-desugaring function $\Delta$. The goal is to express that if the evaluation of a desugared term yields a value, this value is the desugared result of evaluating the original term.

Since the implication operator $\Rightarrow$ is already used in my dynamic semantics, I'd rather use an inference rule to express the implication. However, in the premise I have to introduce a variable $v$ that is the result of applying $\Delta$ to a variable bound in the conclusion. Since there is no inverse of $\Delta$ I am somewhat stuck between two odd variants to express the relation.

The first variant uses two conclusions (like multiple premises the idea is that both always hold):

$$ \frac{\Delta(t) \Downarrow v}{t \Downarrow v^\prime \quad \Delta(v^\prime) = v} $$

This looks odd to me. The alternative would be to see the rule as a scheme subject to some substitution over the meta-variables $t$ and $v$. In that case I could also write:

$$ \frac{\Delta(t) \Downarrow \Delta(v)}{t \Downarrow v} $$

This is obviously cleaner, but I am not quite happy with the implications (pun not intended) to the theorem: In the first case, if $v \neq \Delta(v^\prime)$ the premise still holds, but the conclusion does not, so $\Delta$ is not correct. In the second case the premise would not apply - so correctness simply does not cover the case at all.

Is there a common pattern to deal with such problems? Did I overlook something?

by choeger at October 07, 2015 07:09 AM

Disco Zoo Complexity

A popular mobile game, DiscoZoo, is about "rescuing" animals from a 5x5 grid of cells. Each animal represents a unique pattern (some have 3 cells, some have 4). The objective is, given this 5x5 grid, to find all (or as many as possible) of the given animals within 10 turns, where the pattern of an animal can be at any place within the grid (but cannot be rotated or resized - only translated such that all the coordinates of each animal are valid, and there is only one animal per square). We can only "get" an animal if we can successfully find each of the squares of that animal within the number of turns left; otherwise, we do not.

Here is an example. The sheep has a pattern of 4 horizontal adjacent squares, so clearly the squares (5, 2) (indexing from 1 and as (row, col)) and (5, 5) will be a sheep (and thus get the sheep). For the rabbit, it has a pattern of 4 vertical adjacent squares, so the squares (2, 1) and (4, 1) will be the rabbit. For the cow, it has a pattern of 3 horizontal adjacent squares, and since square (4, 2) was selected (and no animal placed there), the only square left for the cow is (4, 5). This is assuming that we do have attempts left (but there aren't any for this example).

[image: the 5×5 example grid]

More formally, we are given a 2D array $a[1..n][1..n]$, and $m$ animals $A_1,...A_m$ such that $A_i$ has value $p_i$. In the grid are the values of the $p_i$'s (in their respective positions) and all the rest are 0. Each cell either has 0 or one non-zero value (i.e., only one animal). However, none of the values are given to us at the start. Once we "select" a cell, that cell's value is revealed to us. Our objective is to find all or the maximum possible number of non-zero cells in the grid within $g$ turns.

What is the complexity of this problem in general?

Edit: after some thinking about the problem, let's restrict ourselves to having $m$ copies of the same animal $A$, and let that animal be a 2x2 square (i.e., occupies coordinates $(x, y), (x+1, y), (x, y+1), (x+1, y+1)$). What is the complexity of this version of the problem?

Even if we restrict further by looking at 1x1 squares, this is equivalent to finding nonzero elements in the matrix - it is trivially solvable in $\Theta(n^2)$ time, but can we do better?

by Ryan at October 07, 2015 07:09 AM



What are some efficient ways to find the differences between two large corpuses of text that have similar, but differently ordered content?

I have two large files containing paragraphs of English text:

  1. The first text is about 200 pages long and has about 10 paragraphs per page (each paragraph is 5 sentences long).
  2. The second text contains almost precisely the same paragraphs and text as the first. It is also 200 pages long with 10 paragraphs per page. However, the paragraphs are randomized and in a different order when compared to the first text. Also, a large percentage of the paragraphs have small changes in wording compared to similar paragraphs. For example, a paragraph in the first text might have a sentence like Like Jimmy, I wanted to go to the palace while the corresponding sentence in the paragraph of the second text would read Like Jimmy, I really wanted to go to the castle.

I want to be able to capture the changes here like the addition of really and the deletion of palace with replacement of castle. If the paragraphs were roughly aligned, then this would be pretty trivial as there are plenty of ways to diff text. However, since the paragraphs aren't aligned, that isn't the case.

If the files were small (handful of paragraphs), Levenshtein Distance probably would work fine, but because the files are huge, it would be inefficient to compare each paragraph of text 1 to each paragraph of text 2 to find out which paragraphs match.
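For reference, the Levenshtein distance mentioned above is computed with the standard dynamic program (a minimal Python sketch; it costs O(len(a)·len(b)) per pair, which is exactly why all-pairs paragraph comparison gets expensive):

```python
def levenshtein(a, b):
    """Edit distance via the standard DP, O(len(a) * len(b)) time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]
```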

What would be some other approaches to this problem to handle it efficiently?

by vikram7 at October 07, 2015 07:01 AM

Intermediate complexity classes

Given $c \geq 1$, is there an appropriate time complexity regime between each of the following?

polynomial $n^c$ and superpolynomial $2^{\log^c n}$

superpolynomial $2^{\log^c n}$ and subexponential $2^{n^{1/c}}$

subexponential $2^{n^{1/c}}$ and exponential $2^{cn}$

by Arul at October 07, 2015 07:00 AM



What's the equivalent of Clojure's iterate function in Racket

I'm playing with Racket today, and trying to produce an indefinite sequence of numbers based on multiple applications of the same function.

In Clojure I'd use the iterate function for this, but I'm not sure what would be the equivalent in Racket.
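For reference, Clojure's `iterate` produces the lazy sequence x, f(x), f(f(x)), …; a minimal Python generator with the same semantics (not Racket, just pinning down the behavior being asked for):

```python
from itertools import islice

def iterate(f, x):
    """Yield x, f(x), f(f(x)), ... indefinitely, like Clojure's iterate."""
    while True:
        yield x
        x = f(x)

# e.g. the first five powers of two starting from 1
powers = list(islice(iterate(lambda n: 2 * n, 1), 5))
```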

by interstar at October 07, 2015 06:57 AM


Turing machine to programing languages [on hold]

Can you enumerate the different programming techniques for a Turing machine? (Describe one technique briefly.)

by Kenneth Reyes at October 07, 2015 06:56 AM

Recursive algorithm runtime [duplicate]

For the following algorithm, I am looking for a recurrence relation that expresses its runtime/time complexity T(n):

Algorithm ƒ(non-negative integer n)
    ... [ set of operations that take Θ((n log n)^3) time ] ...
    if n < 11 then return random integer
    for i = 1 to 8 do
        x ← x + ƒ(⌊n/2⌋)
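Reading the pseudocode above literally, the recurrence it suggests is (a sketch, assuming the Θ((n log n)^3) work is performed on every call before the base-case test):

$$ T(n) = 8\,T\big(\lfloor n/2 \rfloor\big) + \Theta\big((n \log n)^3\big) \quad \text{for } n \geq 11, \qquad T(n) = \Theta(1) \quad \text{for } n < 11. $$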

by katebeckett at October 07, 2015 06:52 AM




How to implement Konno's Mean-Absolute Deviation Portfolio Optimization Model using LP methods in Excel

Konno proposed a LP method for portfolio optimization using the Mean Absolute Deviation (MAD)
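For context, the standard Konno–Yamazaki linearization turns the MAD objective into a plain LP by introducing one auxiliary variable per scenario (a sketch using assumed notation, not taken from the question: $r_{jt}$ is the return of asset $j$ in period $t$, $\mu_j$ its sample mean, $x_j$ the portfolio weight, and $\rho$ a required expected return):

$$\begin{aligned} \min_{x,\,y}\quad & \frac{1}{T}\sum_{t=1}^{T} y_t \\ \text{s.t.}\quad & y_t \ge \sum_{j=1}^{n}(r_{jt}-\mu_j)\,x_j, \quad y_t \ge -\sum_{j=1}^{n}(r_{jt}-\mu_j)\,x_j, \quad t=1,\dots,T, \\ & \sum_{j=1}^{n}\mu_j x_j \ge \rho, \quad \sum_{j=1}^{n}x_j = 1, \quad x_j \ge 0. \end{aligned}$$

Since every constraint is linear, this model can be set up directly in an Excel worksheet and solved with Solver's Simplex LP engine.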

by purbani at October 07, 2015 05:41 AM


Planet Theory

Tenure-track Faculty at Santa Clara University (apply by December 1 2015)

Our strongest interest is in candidates with research interests in all areas related to data science, but we also welcome applicants with research interests in design and analysis of algorithms. Position available starting in September 2016. Ph.D. or equivalent required by September 2016. Undergraduate teaching only.



by theorycsjobs at October 07, 2015 04:13 AM


Calculating repeat based on criteria

Clip	Calling Date	Product
3665642	9/3/2015 10:49	Others
3829930	9/27/2015 14:51	Others
4870146	9/9/2015 12:15	Others
5025501	9/27/2015 16:59	Others
5025516	9/3/2015 19:10	Others
5025516	9/3/2015 20:13	Others
5025770	9/10/2015 9:22	Others
5025994	9/14/2015 13:33	Others
5035642	9/12/2015 9:30	HS
5035642	9/12/2015 9:30	Others
5035678	9/6/2015 11:23	MS
5035678	9/6/2015 12:35	Others
5035678	9/11/2015 18:02	Others
5035678	9/23/2015 21:21	Others
6225816	9/26/2015 14:20	Others
6358231	9/27/2015 22:10	Others
6358231	9/30/2015 19:48	Others
6362841	9/9/2015 11:02	Others
6384177	9/12/2015 14:42	Others
6396981	9/26/2015 10:06	Others
6442061	9/12/2015 13:59	HS
6525701	9/24/2015 11:17	Others

If the same number (Clip) appears in combination with the same Product within a 3-day period prior to the Calling Date, then it is considered a repeat call.

I've tried the code below but did not get the expected result. It was suggested to me by an acquaintance who is good at programming. Kindly help me correct it, or provide an alternative solution.

Option Compare Database
Option Explicit

Private Sub Command0_Click()

Dim db As Database
Dim rs As DAO.Recordset

Dim sql As String

sql = " SELECT tblPrestige.Clip, tblPrestige.Product, tblPrestige.[Cch Date] " _
        & " From tblPrestige " _
        & " GROUP BY tblPrestige.Clip, tblPrestige.Product, tblPrestige.[Cch Date] " _
        & " ORDER BY tblPrestige.Clip, tblPrestige.Product, tblPrestige.[Cch Date] "

Set db = CurrentDb
Set rs = db.OpenRecordset(sql)

Dim pClip As Double
Dim pProd As String
Dim pDate As Date

Do While rs.EOF = False

' Repeat if this row matches the previous row's Clip and Product within 3 days
If rs.Fields("Clip") = pClip And rs.Fields("Product") = pProd And DateDiff("d", pDate, rs.Fields("Cch Date")) <= 3 Then

WithinSLA rs.Fields("Clip"), rs.Fields("Product"), rs.Fields("Cch Date")

End If

pClip = rs.Fields("Clip")
pProd = rs.Fields("Product")
pDate = rs.Fields("Cch Date")

' Advance the cursor; without this the loop never terminates
rs.MoveNext
Loop

End Sub

Private Sub WithinSLA(clip As Double, product As String, cchDate As Date)

DoCmd.SetWarnings False

Dim sql As String

sql = " Insert into tblPresExceed (Clip,Product,[Cch Date]) " _
    & " Values (" & clip & ", '" & product & "', #" & cchDate & "#) "
'' Execute it
DoCmd.RunSQL sql

DoCmd.SetWarnings True

End Sub

by user3434664 at October 07, 2015 04:12 AM




Making an algorithm

So I'm taking a "Learning Java" class this semester at my university as an Econ major, and we have an assignment due in which we have to write an algorithm for a list of situations, but I have no idea where to begin. This is probably elementary to most of you, but if I get help with this first one, I am confident that I can do the rest of them. Thanks! Sorry the formatting is a bit off; I copy/pasted from my professor's Word doc.

1.) Make an algorithm for a program that will obtain from the user an hourly pay rate and the number of hours worked for the week.

The program will then calculate and output their weekly pay according to the following: - Regular pay is the pay up to 40 hours. - Overtime pay is pay for the hours over 40. Overtime is paid at a rate of 1.5 times the hourly rate. - Gross pay is the sum of the regular pay and the overtime pay. Also test your algorithm using the following sets of values. Show your work.

Test 1: hourly pay rate 10.50, number of hours 20
Test 2: hourly pay rate 9.25, number of hours 45
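The rules described above can be sketched as follows (a minimal Python sketch of the stated pay rules; the names are mine, not from the assignment):

```python
def weekly_pay(hourly_rate, hours_worked):
    """Gross pay = regular pay (up to 40 h) + overtime pay (1.5x beyond 40 h)."""
    regular_hours = min(hours_worked, 40)
    overtime_hours = max(hours_worked - 40, 0)
    regular_pay = regular_hours * hourly_rate
    overtime_pay = overtime_hours * 1.5 * hourly_rate
    return regular_pay + overtime_pay
```

Working the two tests by hand: Test 1 gives 10.50 × 20 = 210.00 (no overtime); Test 2 gives 9.25 × 40 + 5 × 1.5 × 9.25 = 370 + 69.375 = 439.375.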

submitted by MuteIsCool
[link] [comment]

October 07, 2015 03:38 AM




Monad Transformers in C#

I am working on using monad transformers in C#.
I would like to know whether the code I present below shows that I have understood them.
I am fairly new to this, so any feedback/comments are really welcome.
This example just wraps a Maybe monad in a Validation monad.

using System;
using NUnit.Framework;

namespace Monads
{
    public static class MaybeExtensions
    {
        public static IMaybe<T> ToMaybe<T>(this T value)
        {
            if (value == null)
                return new None<T>();

            return new Just<T>(value);
        }
    }

    public interface IMaybe<T>
    {
        IMaybe<U> Select<U>(Func<T, U> f);

        IMaybe<U> SelectMany<U>(Func<T, IMaybe<U>> f);

        U Fold<U>(Func<U> error, Func<T, U> success);
    }

    public class Just<T> : IMaybe<T>
    {
        public Just(T value)
        {
            this.value = value;
        }

        public IMaybe<U> Select<U>(Func<T, U> f)
        {
            return f(value).ToMaybe();
        }

        public IMaybe<U> SelectMany<U>(Func<T, IMaybe<U>> f)
        {
            return f(value);
        }

        public U Fold<U>(Func<U> error, Func<T, U> success)
        {
            return success(value);
        }

        public IValidation<U, T> ToValidationT<U>()
        {
            return new ValidationMaybeT<U, T>(this, default(U));
        }

        private readonly T value;
    }

    public class None<T> : IMaybe<T>
    {
        public IMaybe<U> Select<U>(Func<T, U> f)
        {
            return new None<U>();
        }

        public IMaybe<U> SelectMany<U>(Func<T, IMaybe<U>> f)
        {
            return new None<U>();
        }

        public U Fold<U>(Func<U> error, Func<T, U> success)
        {
            return error();
        }

        public IValidation<U, T> ToValidationT<U>(U exceptionalValue)
        {
            return new ValidationMaybeT<U, T>(this, exceptionalValue);
        }
    }

    public class Customer
    {
        public Customer(string name)
        {
            Name = name;
        }

        public string Name { get; set; }
    }

    public interface IValidation<T, U>
    {
        IValidation<T, V> Select<V>(Func<U, V> f);

        IValidation<T, V> SelectMany<V>(Func<U, IValidation<T, V>> f);
    }

    public class ValidationError<T, U> : IValidation<T, U>
    {
        public ValidationError(T error)
        {
            Error = error;
        }

        public IValidation<T, V> Select<V>(Func<U, V> f)
        {
            return new ValidationError<T, V>(Error);
        }

        public IValidation<T, V> SelectMany<V>(Func<U, IValidation<T, V>> f)
        {
            return new ValidationError<T, V>(Error);
        }

        public T Error { get; private set; }
    }

    public class ValidationSuccess<T, U> : IValidation<T, U>
    {
        public ValidationSuccess(U value)
        {
            Result = value;
        }

        public IValidation<T, V> Select<V>(Func<U, V> f)
        {
            return new ValidationSuccess<T, V>(f(Result));
        }

        public IValidation<T, V> SelectMany<V>(Func<U, IValidation<T, V>> f)
        {
            return f(Result);
        }

        public U Result { get; private set; }
    }

    public class ValidationMaybeT<T, U> : IValidation<T, U>
    {
        public ValidationMaybeT(IMaybe<U> value, T error)
        {
            Value = value;
            Error = error;
        }

        public IValidation<T, V> Select<V>(Func<U, V> f)
        {
            return Value.Fold<IValidation<T, V>>(() => new ValidationError<T, V>(Error), s => new ValidationSuccess<T, V>(f(s)));
        }

        ValidationError<T, V> SelectManyError<V>()
        {
            return new ValidationError<T, V>(Error);
        }

        public IValidation<T, V> SelectMany<V>(Func<U, IValidation<T, V>> f)
        {
            return Value.Fold(() => SelectManyError<V>(), s => f(s));
        }

        public IMaybe<U> Value { get; private set; }

        public T Error { get; private set; }
    }

    public interface ICustomerRepository
    {
        IValidation<Exception, Customer> GetById(int id);
    }

    public class CustomerRepository : ICustomerRepository
    {
        public IValidation<Exception, Customer> GetById(int id)
        {
            if (id < 0)
                return new None<Customer>().ToValidationT<Exception>(new Exception("Customer Id less than zero"));

            return new Just<Customer>(new Customer("Structerre")).ToValidationT<Exception>();
        }
    }

    public interface ICustomerService
    {
        void Delete(int id);
    }

    public class CustomerService : ICustomerService
    {
        public CustomerService(ICustomerRepository customerRepository)
        {
            this.customerRepository = customerRepository;
        }

        public void Delete(int id)
        {
            // assumed receiver (the start of this line was truncated in the post)
            customerRepository.GetById(id)
                .SelectMany(x => SendEmail(x).SelectMany(y => LogResult(y)));
        }

        public IValidation<Exception, Customer> LogResult(Customer c)
        {
            Console.WriteLine("Deleting: " + c.Name);
            return new ValidationSuccess<Exception, Customer>(c);
            //return new ValidationError<Exception, Customer>(new Exception("Unable write log"));
        }

        private IValidation<Exception, Customer> SendEmail(Customer c)
        {
            Console.WriteLine("Emailing: " + c.Name);
            return new ValidationSuccess<Exception, Customer>(c);
        }

        ICustomerRepository customerRepository;
    }

    public class MonadTests
    {
        [Test]
        public void Testing_With_Maybe_Monad()
        {
            new CustomerService(new CustomerRepository()).Delete(-1);
        }
    }
}

A smaller sub-question: if C# had higher-kinded types, could I implement this class (ValidationT) just once and have it work for all other wrapped monads, or is that incorrect?

by Blair Davidson at October 07, 2015 02:32 AM

How to curry a Func and pass it to another Func with fewer arguments inside the constructor?

I have a class that looks like this

public class CacheManger<T> where T : class, new()
{
    public CacheManger(Func<int, T> retriveFunc, Func<List<T>> loadFunc)
    {
        _retriveFunc = retriveFunc;
        _loadFunc = loadFunc;
    }

    public T GetValueByid(int id)
    {
        return _retriveFunc(id);
    }

    // fields implied by the constructor above
    private readonly Func<int, T> _retriveFunc;
    private readonly Func<List<T>> _loadFunc;
}

I have another class which is as follows

public class AccountCache
{
    public AccountCache()
    {
        CacheManager = new CacheManger<Account>(GetAccount, LoadAcc);
        // LoadAcc is another method that returns a List<Account>
    }

    private Account GetAccount(int accID)
    {
        return CacheManager.CacheStore.FirstOrDefault(o => o.ID == accID);
        // CacheStore is the List<T> in the CacheManager (internal datastore)
    }

    public Account GetProdServer(int accID)
    {
        return CacheManager.GetValueByid(accID);
    }
}

As you can see, I can pass GetAccount to the constructor of CacheManager. Now I have another class with a method like this:

public User GetUser(string accountID, int engineID)


How can I pass this function to CacheManager's constructor? I could curry the function, but then how would I pass it as a constructor argument?
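The underlying idea can be sketched as follows (a minimal Python sketch, not the asker's code; `get_user` is a hypothetical stand-in for the two-argument lookup). Fixing the first argument yields exactly the one-argument function the constructor wants:

```python
def curry(f):
    """Turn f(x, y) into a chain of one-argument functions."""
    return lambda x: lambda y: f(x, y)

def get_user(account_id, engine_id):
    # stand-in for the real two-argument lookup
    return f"user:{account_id}:{engine_id}"

# fix account_id; engine_id remains free, so this fits a Func<int, T> slot
retrieve = curry(get_user)("ACME")
```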

What I am doing right now:

private User GetUserInternal(string accountID, int engineID)
{
    /* actual code to get user */
}

private Func<string, Func<int, User>> Curry(Func<string, int, User> function)
{
    return x => y => function(x, y);
}

public User GetAccount(string accountID, int engineID)
{
    _retriveFunc = Curry(GetUserInternal)(accountID);
    CacheManager.RetrivalFunc = _retriveFunc; // I really don't want to do this. I had to add a public property to CacheManager for it.
    return CacheManager.GetValueByid(engineID); // This will call GetUserInternal
}

by Ashley John at October 07, 2015 02:23 AM



Why is foldl defined in a strange way in Racket?

In Haskell, like in many other functional languages, the function foldl is defined such that, for example, foldl (-) 0 [1,2,3,4] = -10.

This is OK, because foldl (-) 0 [1, 2,3,4] is, by definition, ((((0 - 1) - 2) - 3) - 4).

But, in Racket, (foldl - 0 '(1 2 3 4)) is 2, because Racket "intelligently" calculates like this: (4 - (3 - (2 - (1 - 0)))), which indeed is 2.

Of course, if we define auxiliary function flip, like this:

(define (flip bin-fn)
  (lambda (x y)
    (bin-fn y x)))

then we could in Racket achieve the same behavior as in Haskell: instead of (foldl - 0 '(1 2 3 4)) we can write: (foldl (flip -) 0 '(1 2 3 4))

The question is: Why is foldl in Racket defined in such an odd (nonstandard and nonintuitive) way, differently from seemingly every other language?
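To make the two orders concrete, here is a small Python sketch (not Racket; it just pins down the two argument conventions described above):

```python
def foldl_haskell(f, acc, xs):
    """Haskell-style: f(acc, x), left to right -> ((((0-1)-2)-3)-4)."""
    for x in xs:
        acc = f(acc, x)
    return acc

def foldl_racket(f, acc, xs):
    """Racket-style: f(x, acc), left to right -> (4-(3-(2-(1-0))))."""
    for x in xs:
        acc = f(x, acc)
    return acc

sub = lambda a, b: a - b
haskell_result = foldl_haskell(sub, 0, [1, 2, 3, 4])  # -10
racket_result = foldl_racket(sub, 0, [1, 2, 3, 4])    # 2
```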

by Racket Noob at October 07, 2015 01:32 AM

Functional programming: AggregateUntil function

So here's an interesting problem. I'm trying to use a functional approach to solve something that would be really easy in an imperative manner. The goal is to take a sequence and foldl/reduce it to a single value; however, I want to stop and exit early once the accumulated value satisfies a given condition. You might say I want to define IEnumerable<T>.AggregateUntil. Here's how I would write it in an imperative fashion:

public static TAccumulate AggregateUntil<TSource, TAccumulate>(
    this IEnumerable<TSource> source,
    TAccumulate seed,
    Func<TSource, TAccumulate, TAccumulate> accumulate,
    Func<TAccumulate, bool> until)
{
    var result = seed;

    foreach (var s in source)
    {
        result = accumulate(s, result);

        if (until(result))
            break;
    }

    return result;
}

How would you go about writing that in a functional way, without the foreach loop? I'm specifically trying to find a way that doesn't force me to re-implement Aggregate in its entirety for just this one little behavior difference. I also would like to do so without iterating over the collection twice. I'm still working on this and will post an update if I figure it out, but if someone out there wants to help with the challenge, that's welcome as well.

EDIT #1:

Here's a stab at how to implement it without the Until concept, just to get the juices flowing:

private static TAccumulate AggregateUntil<TSource, TAccumulate>(
    IEnumerable<TSource> source,
    TAccumulate seed,
    Func<TSource, TAccumulate, TAccumulate> accumulate)
{
    using (var enumerator = source.GetEnumerator())
    {
        return AggregateUntil(enumerator, seed, accumulate);
    }
}

private static TAccumulate AggregateUntil<TSource, TAccumulate>(
    IEnumerator<TSource> source,
    TAccumulate seed,
    Func<TSource, TAccumulate, TAccumulate> accumulate)
{
    return source.MoveNext()
        ? AggregateUntil(source, accumulate(source.Current, seed), accumulate)
        : seed;
}

EDIT #2:

OK, I've implemented my goal function as far as functionality goes, but I haven't yet figured out how to do it without reimplementing basically all of the foldl/reduce/aggregate logic plus the until condition. I feel like I'm missing a basic trick of FP composability if I can't figure out how to reuse the logic in Aggregate as-is:

private static TAccumulate AggregateUntil<TSource, TAccumulate>(
    IEnumerable<TSource> source,
    TAccumulate seed,
    Func<TSource, TAccumulate, TAccumulate> accumulate,
    Func<TAccumulate, bool> until)
{
    using (var enumerator = source.GetEnumerator())
    {
        return AggregateUntil(enumerator, seed, accumulate, until);
    }
}

private static TAccumulate AggregateUntil<TSource, TAccumulate>(
    IEnumerator<TSource> source,
    TAccumulate seed,
    Func<TSource, TAccumulate, TAccumulate> accumulate,
    Func<TAccumulate, bool> until)
{
    TAccumulate result;

    return source.MoveNext()
        ? until(result = accumulate(source.Current, seed))
            ? result
            : AggregateUntil(source, result, accumulate, until)
        : seed;
}
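For what it's worth, in languages whose standard library exposes a scan (the stream of intermediate fold results), the composable shape is "scan, then stop at the first value satisfying the predicate" rather than re-implementing the fold. A minimal Python sketch of that idea (`itertools.accumulate` plays the role of the scan; the `initial` parameter needs Python 3.8+):

```python
from itertools import accumulate

def aggregate_until(source, seed, accumulate_fn, until):
    """Fold source with accumulate_fn, stopping at the first accumulated
    value that satisfies until; returns the final value otherwise."""
    acc = seed
    for acc in accumulate(source, accumulate_fn, initial=seed):
        if until(acc):
            break
    return acc

# sum 1, 2, 3, ... until the running total exceeds 10
result = aggregate_until(range(1, 100), 0, lambda a, x: a + x, lambda a: a > 10)
```

The early exit comes from abandoning the lazy scan, so the source is only consumed as far as needed and only iterated once.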

by skb at October 07, 2015 01:30 AM

arXiv Discrete Mathematics

Sum of Squares Basis Pursuit with Linear and Second Order Cone Programming. (arXiv:1510.01597v1 [math.OC])

We devise a scheme for solving an iterative sequence of linear programs (LPs) or second order cone programs (SOCPs) to approximate the optimal value of semidefinite and sum of squares (SOS) programs. The first LP and SOCP-based bounds in the sequence come from the recent work of Ahmadi and Majumdar on diagonally dominant sum of squares (DSOS) and scaled diagonally dominant sum of squares (SDSOS) polynomials. We then iteratively improve on these bounds by pursuing better bases in which more relevant SOS polynomials admit a DSOS or SDSOS representation. Different interpretations of the procedure from primal and dual perspectives are given. While the approach is applicable to semidefinite relaxations of general polynomial programs, we apply it to two problems of discrete optimization: the maximum independent set problem and the partition problem. We further show that some completely trivial instances of the partition problem lead to strictly positive polynomials on the boundary of the sum of squares cone and hence make the SOS relaxation fail.

by Amir Ali Ahmadi, Georgina Hall at October 07, 2015 01:30 AM

Splicing Systems from Past to Future: Old and New Challenges. (arXiv:1510.01574v1 [cs.FL])

A splicing system is a formal model of the recombinant behaviour of sets of double-stranded DNA molecules when acted on by restriction enzymes and ligase. In this survey we will concentrate on a specific behaviour of a type of splicing system, introduced by Păun and subsequently developed by many researchers, in both the linear and the circular case of the splicing definition. In particular, we will present recent results on this topic and how they stimulate new challenging investigations.

by Luc Boasson, Paola Bonizzoni, Clelia De Felice, Isabelle Fagnot, Gabriele Fici, Rocco Zaccagnino, Rosalba Zizza at October 07, 2015 01:30 AM

Optimal Fronthaul Compression for Synchronization in the Uplink of Cloud Radio Access Networks. (arXiv:1510.01545v1 [cs.NI])

A key problem in the design of cloud radio access networks (CRANs) is that of devising effective baseband compression strategies for transmission on the fronthaul links connecting a remote radio head (RRH) to the managing central unit (CU). Most theoretical works on the subject implicitly assume that the RRHs, and hence the CU, are able to perfectly recover time synchronization from the baseband signals received in the uplink, and focus on the compression of the data fields. This paper instead does not assume a priori synchronization of RRHs and CU, and considers the problem of fronthaul compression design at the RRHs with the aim of enhancing the performance of time and phase synchronization at the CU. The problem is tackled by analyzing the impact of the synchronization error on the performance of the link and by adopting information- and estimation-theoretic performance metrics such as the rate-distortion function and the Cramér-Rao bound (CRB). The proposed algorithm is based on the Charnes-Cooper transformation and on the Difference of Convex (DC) approach, and is shown via numerical results to outperform conventional solutions.

by Eunhye Heo, Osvaldo Simeone, Hyuncheol Park at October 07, 2015 01:30 AM

High Precision Fault Injections on the Instruction Cache of ARMv7-M Architectures. (arXiv:1510.01537v1 [cs.CR])

Hardware and software of secured embedded systems are prone to physical attacks. In particular, fault injection attacks revealed vulnerabilities in the data and the control flow allowing an attacker to break cryptographic or secured algorithm implementations. While many research studies concentrated on successful attacks on the data flow, only a few target the instruction flow. In this paper, we focus on electromagnetic fault injection (EMFI) on the control flow, especially on the instruction cache. We target the very widespread (smartphones, tablets, set-top boxes, health-industry monitors and sensors, etc.) ARMv7-M architecture. We describe a practical EMFI platform and present a methodology providing high control level and high reproducibility over fault injections. Indeed, we observe that a precise fault model occurs in up to 96% of the cases. We then characterize and exhibit this practical fault model on the cache, which is not yet considered in the literature. We comprehensively describe its effects and show how it can be used to reproduce well-known fault attacks. Finally, we describe how it can benefit attackers mounting powerful new attacks or simplifying existing ones.

by Lionel Rivière, Zakaria Najm, Pablo Rauzy, Jean-Luc Danger, Julien Bringer, Laurent Sauvage at October 07, 2015 01:30 AM

Spectral Properties of Laplacian and Stochastic Diffusion Process for Edge Expansion in Hypergraphs. (arXiv:1510.01520v1 [cs.DM])

There has been recent work [Louis STOC 2015] to analyze the spectral properties of hypergraphs with respect to edge expansion. In particular, a diffusion process is defined on a hypergraph such that within each hyperedge, measure flows from nodes having maximum weighted measure to those having minimum. The diffusion process determines a Laplacian, whose spectral properties are related to the edge expansion properties of the hypergraph.

It is suggested that in the above diffusion process, within each hyperedge, measure should flow uniformly in the complete bipartite graph from nodes with maximum weighted measure to those with minimum. However, we discover that this method has some technical issues. First, the diffusion process would not be well-defined. Second, the resulting Laplacian would not have the claimed spectral properties.

In this paper, we show that the measure flow between the above two sets of nodes must be coordinated carefully over different hyperedges in order for the diffusion process to be well-defined, from which a Laplacian can be uniquely determined. Since the Laplacian is non-linear, we have to exploit other properties of the diffusion process to recover a spectral property concerning the "second eigenvalue" of the resulting Laplacian. Moreover, we show that higher order spectral properties cannot hold in general using the current framework.

by T-H. Hubert Chan, Zhihao Gavin Tang, Chenzi Zhang at October 07, 2015 01:30 AM

Efficient Certificateless Signcryption Tag-KEMs for Resource-constrained Devices. (arXiv:1510.01446v1 [cs.CR])

Efficient certificateless one-pass session key establishment protocols can be constructed from key encapsulation mechanisms (KEMs) by making use of tags and signcryption schemes. The resulting primitives are referred to as Certificateless Signcryption Tag Key Encapsulation Mechanisms (CLSC-TKEMs). In this paper we propose two novel CLSC-TKEM protocols: the first, named LSW-CLSC-TKEM, makes use of the signature scheme of Liu et al.; the second, named DKTUTS-CLSC-TKEM, is based on the direct key transport using a timestamp (DKTUTS) protocol first described by Zheng. In order to achieve greater efficiency, both schemes are instantiated on elliptic curves without making use of pairings and are therefore good candidates for deployment on resource-constrained devices.

by Wenhao Liu, Maurizio Adriano Strangio, Shengbao Wang at October 07, 2015 01:30 AM

Distance-2 MDS codes and latin colorings in the Doob graphs. (arXiv:1510.01429v1 [math.CO])

The maximum independent sets in the Doob graphs D(m,n) are analogs of the distance-2 MDS codes in Hamming graphs and of the latin hypercubes. We prove the characterization of these sets stating that every such set is semilinear or reducible. As related objects, we study vertex sets with maximum cut (edge boundary) in D(m,n) and prove some facts on their structure. We show that the considered two classes (the maximum independent sets and the maximum-cut sets) can be defined as classes of completely regular sets with specified 2-by-2 quotient matrices. It is notable that for a set from the considered classes, the eigenvalues of the quotient matrix are the maximum and the minimum eigenvalues of the graph. For D(m,0), we show the existence of a third, intermediate, class of completely regular sets with the same property.

by Denis Krotov (Sobolev Institute of Mathematics, Novosibirsk, Russia), Evgeny Bespalov (Sobolev Institute of Mathematics, Novosibirsk, Russia) at October 07, 2015 01:30 AM

Haystack: In Situ Mobile Traffic Analysis in User Space. (arXiv:1510.01419v1 [cs.NI])

Despite our growing reliance on mobile phones for a wide range of daily tasks, we remain largely in the dark about the operation and performance of our devices, including how (or whether) they protect the information we entrust to them, and with whom they share it. The absence of easy, device-local access to the traffic of our mobile phones presents a fundamental impediment to improving this state of affairs. To develop detailed visibility, we devise Haystack, a system for unobtrusive and comprehensive monitoring of network communications on mobile phones, entirely from user-space. Haystack correlates disparate contextual information such as app identifiers and radio state with specific traffic flows destined to remote services, even if encrypted. Haystack facilitates user-friendly, large-scale deployment of mobile traffic measurements and services to illuminate mobile app performance, privacy and security. We discuss the design of Haystack and demonstrate its feasibility with an implementation that provides 26-55 Mbps throughput with less than 5% CPU overhead. Our system and results highlight the potential for client-side traffic analysis to help understand the mobile ecosystem at scale.

by Abbas Razaghpanah, Narseo Vallina-Rodriguez, Srikanth Sundaresan, Christian Kreibich, Phillipa Gill, Mark Allman, Vern Paxson at October 07, 2015 01:30 AM

Abstraction/Representation Theory for Heterotic Physical Computing. (arXiv:1510.01391v1 [cs.LO])

We give a rigorous framework for the interaction of physical computing devices with abstract computation. Device and program are mediated by the non-logical 'representation relation'; we give the conditions under which representation and device theory give rise to commuting diagrams between logical and physical domains, and the conditions for computation to occur. We give the interface of this new framework with currently existing formal methods, showing in particular its close relationship to refinement theory, and the implications for questions of meaning and reference in theoretical computer science. The case of hybrid computing is considered in detail, addressing in particular the example of an internet-mediated 'social machine', and the abstraction/representation framework used to provide a formal distinction between heterotic and hybrid computing. This forms the basis for future use of the framework in formal treatments of nonstandard physical computers.

by Dominic C. Horsman at October 07, 2015 01:30 AM


Is spoofing financially risky?

It's alleged that Navinder Singh Sarao contributed to the flash crash by placing huge, fake orders for S&P Minis. Mr. Sarao then cancelled the huge orders before they were filled. The spoofed orders created false impressions in the market, which Mr. Sarao then profited from.

Here's what I don't get: why weren't the fake orders filled, leaving him with a real position? Order execution is typically very fast, so how was he able to show the fake orders to the market and then cancel them before they were executed? It seems like he was putting himself in a very risky position in which the huge fake orders might have been executed.

Note: I realize spoofing is illegal and very risky in that sense. I'm trying to understand the positional financial risk aspect (not counting fines, legal fees, jail, etc.).

by RichAmberale at October 07, 2015 01:26 AM