netNatter


Tuesday, January 11, 2005
The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software

A nice article by Herb Sutter about how processor clock speeds are capping out, which means the era of developers getting automagic perf improvements in their code from ever-faster CPUs is coming to an end. He talks about the trend towards multi-core processors (not just hyperthreading) and the need for developers to understand concurrent programming more than ever before to get the maximum from the CPU (which will predominantly be multi-core).

Rather than programmers getting better at concurrent programming, I would rather have the JITers in the .NET runtime (or the Java runtime) be smarter about generating parallel execution paths, do the necessary magic to get better cache coherency, and have the languages support hints about what is parallelizable that the JITers can act upon. RDBMSs (SQL Server, Oracle, et al.) do a good job of generating parallel execution plans out of SQL; our OO languages need to do the same, and this is where JITed code stands to benefit, imo. Herb also states that most apps are going to be CPU bound soon (rather than IO bound). I guess he also includes cache misses and CPU stalls in this, but to me an app is CPU bound only if it is really crunching on the CPU, as opposed to stalling while code or data is fetched. I think the latter is the bigger problem now and for some time to come.
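No mainstream language offers such hints today, so as a rough illustration only (class and method names are made up for this sketch), here is the kind of transformation a hint-aware JIT might perform automatically: a data-parallel loop split by hand across two threads.

```java
// Hypothetical sketch: manually doing what a hint-aware JIT might do
// automatically -- splitting an independent (parallelizable) loop in two.
public class ParallelSum {
    // Sums data[from..to); each half of the array can be summed
    // independently, which is what makes the loop parallelizable.
    static long sumRange(int[] data, int from, int to) {
        long sum = 0;
        for (int i = from; i < to; i++) sum += data[i];
        return sum;
    }

    static long parallelSum(int[] data) throws InterruptedException {
        int mid = data.length / 2;
        long[] left = new long[1];
        Thread t = new Thread(() -> left[0] = sumRange(data, 0, mid));
        t.start();
        long right = sumRange(data, mid, data.length); // this thread takes the other half
        t.join();
        return left[0] + right;
    }

    public static void main(String[] args) throws InterruptedException {
        int[] data = new int[1_000_000];
        for (int i = 0; i < data.length; i++) data[i] = 1;
        System.out.println(parallelSum(data)); // prints 1000000
    }
}
```

The point is that the split is only legal because the iterations are independent -- exactly the fact a language-level hint would have to convey to the JIT.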



Comments
I agree, but I think Herb's point goes beyond what can be solved by smarter JITers. A smarter compiler can figure out parallel execution paths, but that applies only to a single thread of execution; I would hazard a guess that most contemporary compilers are already very good at this. When it comes to multiple threads, however, there is not much a compiler or a JITer can do. JITers can't make your multithreaded code run faster beyond the optimizations they do already, because the synchronization semantics are understood only at the application level. In this case, the programmer simply has to get better at concurrent programming -- something that is very hard indeed. Most programs need to do disk or network I/O quite often, so programmers will need to do more multi-threaded work at the application level if CPU speeds are not going to keep increasing.
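As a small made-up illustration of that point: nothing below tells a JIT that the increment has to appear atomic; the synchronized keyword is the programmer encoding application-level semantics that no compiler could infer on its own.

```java
public class Counter {
    private int count = 0;

    // "synchronized" is application-level knowledge: only the programmer
    // knows that this read-modify-write must not interleave across threads.
    public synchronized void increment() { count++; }

    public synchronized int get() { return count; }

    public static void main(String[] args) throws InterruptedException {
        Counter c = new Counter();
        Runnable work = () -> { for (int i = 0; i < 100_000; i++) c.increment(); };
        Thread t1 = new Thread(work), t2 = new Thread(work);
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(c.get()); // prints 200000; without synchronized, often less
    }
}
```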

The other alternative (and, I think, something that shows a lot of promise if enough investment is made) is to develop new languages with better and safer concurrent programming models. The academics have been working in this area for a while, but we need more of these things to come into the mainstream. One example is Erlang, a language developed by a research lab within Ericsson. It eschews shared memory in favour of message-passing concurrency; while that sounds limiting, they have been able to write huge concurrent systems safely by making the cost of process creation extremely small.
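For flavour, here is a rough approximation of that message-passing style in Java (my sketch, not Erlang): each "process" keeps its state private, and the only communication happens through mailbox queues.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Rough approximation of Erlang-style message passing, using blocking
// queues as mailboxes; the worker's state is never shared directly.
public class MessagePassing {
    static int run() throws InterruptedException {
        BlockingQueue<Integer> mailbox = new ArrayBlockingQueue<>(16);
        BlockingQueue<Integer> replies = new ArrayBlockingQueue<>(16);

        Thread worker = new Thread(() -> {
            try {
                int runningTotal = 0; // private state of this "process"
                for (int i = 0; i < 3; i++) {
                    runningTotal += mailbox.take(); // receive a message
                    replies.put(runningTotal);      // send a reply
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();

        for (int n : new int[] {1, 2, 3}) mailbox.put(n);
        int last = 0;
        for (int i = 0; i < 3; i++) last = replies.take();
        worker.join();
        return last; // 1 + 2 + 3
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run()); // prints 6
    }
}
```

No locks appear anywhere, because no state is shared -- which is exactly what makes the model safer to reason about.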

Regarding your last point about the meaning of CPU bound, I did not understand which problem ("the latter") you meant as being the bigger one: CPU crunching or cache misses? Stalling for code, or going to main memory on a cache miss, is going to be a problem, I agree -- however, this is something they can improve even with processor speeds capping out, by continuing to increase cache sizes, at least to a certain extent.
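To make the cache-miss point concrete (my own example): both loops below compute the same sum, but the second strides across memory and, on a large enough matrix, will miss the cache on nearly every access.

```java
public class CacheTraversal {
    // Row-major order: consecutive accesses fall in the same cache line,
    // so a fetched line is fully used before the next miss.
    static long sumRowMajor(int[][] m) {
        long sum = 0;
        for (int i = 0; i < m.length; i++)
            for (int j = 0; j < m[i].length; j++)
                sum += m[i][j];
        return sum;
    }

    // Column-major order: each access jumps to a different row object,
    // touching a different cache line (and likely stalling) almost every read.
    static long sumColMajor(int[][] m) {
        long sum = 0;
        for (int j = 0; j < m[0].length; j++)
            for (int i = 0; i < m.length; i++)
                sum += m[i][j];
        return sum;
    }

    public static void main(String[] args) {
        int n = 1024;
        int[][] m = new int[n][n];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                m[i][j] = 1;
        // Same answer either way; only the stall behaviour differs.
        System.out.println(sumRowMajor(m) + " " + sumColMajor(m));
    }
}
```

Bigger caches help the second loop less than the first, which is why this kind of stall is the sort of "CPU bound" the post distinguishes from real crunching.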

In the last few years, the rate at which network speeds have been increasing has exceeded the rate at which CPU speeds have been increasing. A statistic that I read recently said that processor speeds have been increasing by a factor of 100 every decade, while disk capacities have been increasing by a factor of 1000 and optical network bandwidth by a factor of 10000. These are exponentials at work here -- it won't be long before network access is almost as fast as main memory access. In fact, I am pretty sure that network access will be faster than accessing your local disk. This reversal of trends will have a drastic effect on common assumptions and trade-offs that programmers have traditionally made. Such a shift in mindset is difficult to come to grips with. For instance, it is true today that on high-end processors floating-point multiplications are faster than integer multiply operations, yet many old-time programmers (including me) instinctively try to do most arithmetic with integers -- it is just hard to change the mindset. If/when the "CPU cycles are cheap, network access is costly" assumption goes out the window, a lot of things will have to change. For instance, as I mentioned to Fario the other day, codec research may end up being a dead-end.


