Wednesday, April 9, 2008

Parallel Computing: Why the Future Is Synchronous

Synchronous Processing Is Deterministic

I have always maintained (see COSA Software Model) that all elementary processes (operations) in a parallel program should be synchronized to a global virtual clock and that all elementary calculations should have equal durations, equal to one virtual cycle. The main reason for this is that synchronous processing (not to be confused with synchronous messaging) guarantees that the execution order of operations is deterministic. Temporal order determinism goes a long way toward making software stable and reliable. This is because the relative execution order (concurrent or sequential) of a huge number of events in a deterministic system can be easily predicted and the predictions can in turn be used to detect violations, i.e., bugs. Expected events (or event correlations) are like constraints. They can be used to force all additions or modifications to an application under construction to be consistent with the code already in place. The end result is that, in addition to being robust, the application is easier and cheaper to maintain.
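As a rough illustration of the determinism argument, here is a minimal Python sketch of a virtual clock. It is not COSA itself: the names `step`, `run`, and `ops` are hypothetical, and it assumes each operation writes only its own output cell. Every operation reads the values of the previous cycle and writes into a fresh buffer, so the result does not depend on the order in which the operations are evaluated within a cycle.

```python
import random

def step(state, ops, order):
    """Run one virtual cycle. Every op reads `state` (the previous cycle)
    and writes its result into a fresh buffer, never into `state` itself."""
    nxt = dict(state)
    for i in order:
        name, fn = ops[i]
        nxt[name] = fn(state)  # sees only the previous cycle's values
    return nxt

def run(state, ops, cycles, rng=None):
    """Advance the virtual clock `cycles` times.
    If `rng` is given, the evaluation order is shuffled on every cycle."""
    for _ in range(cycles):
        order = list(range(len(ops)))
        if rng is not None:
            rng.shuffle(order)
        state = step(state, ops, order)
    return state

# Hypothetical elementary operations; each one owns a single output cell.
ops = [
    ("a", lambda s: s["a"] + s["b"]),
    ("b", lambda s: s["a"] - s["b"]),
    ("c", lambda s: s["c"] + 1),
]
start = {"a": 1, "b": 2, "c": 0}

baseline = run(start, ops, 5)
shuffled = run(start, ops, 5, random.Random(42))
assert baseline == shuffled  # same outcome regardless of evaluation order
```

Because the outcome is fixed by the cycle count alone, an expected state such as `baseline` can be recorded once and then used as a constraint that later changes to the program must not violate.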

Synchronous Processing Is Easy to Understand

The second most important reason for having a synchronous system has to do with the temporal nature of the human brain. There is a direct causal correlation between the temporal nature of the brain and program comprehensibility. Most of us may not think of the world that we sense as being temporally deterministic and predictable but almost all of it is. If it weren’t, we would have a hard time making sense of it and adapting to it. Note that, here, I am speaking of the macroscopic world of our senses, not the microscopic quantum universe, which is known to be probabilistic. For example, as we scan a landscape with our eyes, the relative motion of the objects in our visual field occurs according to the laws of optics and perspective. Our visual cortex is genetically wired to learn these deterministic temporal correlations. Once the correlations are learned, the newly formed neural structures become fixed and they can then be used to instantly recognize previously learned patterns every time they occur.

The point I am driving at is that the brain is exquisitely programmed to recognize deterministic temporal patterns within an evolving sensory space. Pattern predictability is the key to comprehension and behavioral adaptation. This is the main reason that multithreaded programs are so hard to write and maintain: they are unpredictable. The brain finds it hard to learn and understand unpredictable patterns. It needs stable temporal relationships in order to build the corresponding neural correlations. It is partially for this reason that I claim that, given a synchronous execution environment, the productivity of future parallel programmers will be several orders of magnitude greater than that of their sequential programming predecessors.

Synchronous Processing and Load Balancing

An astute reader wrote to me a few days ago to point out a potential problem with parallel synchronous processing. During any given cycle, the cores will be processing a variety of operations (elementary actions). Not all the operations will last an equal number of real-time clock cycles. An addition might take two or three cycles while a multiplication might take ten cycles. The reader asked, does this mean that a core that finishes first has to stay idle until all the others are finished? The answer is, not at all. And here is why.

Until and unless technology advances to the point where every operator is its own processor (the ultimate parallel system), a multicore processor will almost always have to execute many more operations per parallel cycle than the number of available cores. In other words, most of the time, even in a thousand-core processor, a core will be given dozens if not hundreds of operations to execute within a given parallel cycle. The reason is that the number of cores will never be enough to satisfy our need for faster machines, as we will always find new processor-intensive applications that will push the limits of performance.

The load balancing mechanism of a multicore processor must be able to mix the operations of different durations among the cores so as to achieve a near perfect balance overall. Still, even in cases when the load balance is imperfect, the performance penalty will be insignificant compared to the overall load. Good automatic load balancing must be a priority of multicore research. This is the reason that I am so impressed with Plurality’s load-balancing claim for its Hypercore processor. However, as far as I can tell, Plurality does not use a synchronous software model. They are making a big mistake in this regard, in my opinion.
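To make the load-balancing argument concrete, here is a small Python sketch of one possible way to mix operations of different durations among cores within a single parallel cycle. It uses a greedy longest-first assignment, which is my choice for illustration and not a description of how Plurality's Hypercore or any real processor does it. The function name `balance`, the operation counts, and the cycle costs are all assumptions.

```python
import heapq

def balance(durations, cores):
    """Greedy longest-first assignment: hand each operation to the core
    that is currently least loaded. Returns the busy time of each core."""
    heap = [(0, c) for c in range(cores)]  # (current load, core id)
    loads = [0] * cores
    for d in sorted(durations, reverse=True):
        load, c = heapq.heappop(heap)      # least loaded core so far
        loads[c] = load + d
        heapq.heappush(heap, (loads[c], c))
    return loads

# Assumed workload for one parallel cycle: 300 additions at 3 real
# clock cycles each and 100 multiplications at 10 real clock cycles each.
work = [3] * 300 + [10] * 100
loads = balance(work, 8)

cycle_length = max(loads)                  # the slowest core ends the cycle
idle = cycle_length * len(loads) - sum(loads)
print("per-core load:", loads)
print("idle fraction:", idle / (cycle_length * len(loads)))
```

With this kind of greedy scheme, the gap between the busiest and the least busy core can never exceed the duration of the longest single operation, so the idle time shrinks relative to the total as the number of operations per core grows.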


In conclusion, I will reiterate my conviction that the designers of future parallel systems will have to adopt a synchronous processing model. Synchronous processing is a must, not only for reliability, but for program comprehension and programmer productivity as well. Of course, the adoption of a pure, fine-grain, synchronous software model has direct consequences for the design of future multicore processors. In my next article, I will go over the reasons that the future of parallel computing is necessarily reactive.

[This article is part of my downloadable e-book on the parallel programming crisis.]

See Also:

How to Solve the Parallel Programming Crisis
Nightmare on Core Street
Parallel Computing: The End of the Turing Madness
Parallel Computing: Why the Future Is Non-Algorithmic
Parallel Computing: Why the Future Is Reactive
Why Parallel Programming Is So Hard
Parallel Programming, Math and the Curse of the Algorithm
The COSA Saga

PS. Everyone should read the comments at the end of Parallel Computing: The End of the Turing Madness. Apparently, Peter Wegner and Dina Goldin of Brown University have been ringing the non-algorithmic/reactive bell for quite some time. Without much success, I might add, otherwise there would be no parallel programming crisis to speak of.
