Wednesday, August 29, 2007
Synchronous Processes: COSA vs. Erlang
The Big Switch
Both COSA and Erlang use asynchronous signaling between concurrent modules. That is to say, a message/signal sender does not have to wait for a reply from the receiver. This is good for many reasons, not the least of which is failure localization. In this article, I would like to point out a major difference between Erlang and COSA processes. I fervently hope that this will persuade the computer industry to switch from the hopelessly flawed algorithmic model of computing to a reactive and synchronous model, a model fit for the 21st century. This is absolutely crucial to the future of computing, in my opinion. It is true that getting the entire computer industry to admit that it has been doing it wrong right from the start, and that it needs to retrace its steps and change the way it builds and programs computers, is not an easy task, but someone's got to do it.
The bad news is that, if we don’t switch soon, we are going to be in a much bigger mess than we already are. The reason is that there is an inescapable correlation between complexity and unreliability in any system that is based on the algorithmic model. This much we already know, what with Fred Brooks’s “No Silver Bullet” and all that jazz. It is a much bigger problem than most of us suspect, however. As I have said many times before, unreliability places an upper limit on the complexity of our systems. We could conceivably be riding in self-driving vehicles right now, but concerns over safety, liability and development costs will not allow it. As a result, we will continue to suffer over forty thousand traffic fatalities every year, in the US alone! And that’s just the tip of the unreliability iceberg. The good news is that there is an alternative.
In an Erlang program (and in every system that uses algorithmic threads and processes), the timing of operations in one concurrent module is independent of the timing of operations in another. This is analogous to having objects in different universes interacting with each other. Bad things are bound to happen. The problem is that the timing of events (whether internal or external to a module) becomes unpredictable. Why is this important, you ask? It is because, without rock solid temporal expectations, timing constraints cannot be enforced. Detecting violations of the assumptions made about the timing of actions is the primary method of ensuring reliability in behaving systems. Measuring and comparing temporal intervals is as important to computing as measuring and comparing distances is to architecture or map making.
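To make the point about independent timing concrete, here is a minimal sketch. It is not Erlang: two Python threads stand in for independently timed processes that send asynchronous messages to a shared inbox. The names `run_once`, `worker` and `inbox` are made up for this illustration.

```python
import queue
import random
import threading
import time


def run_once():
    """Run two independently timed senders and return the arrival order."""
    inbox = queue.Queue()

    def worker(name):
        # Each sender proceeds at its own pace; nothing ties it to the other.
        for i in range(3):
            time.sleep(random.uniform(0, 0.002))
            inbox.put((name, i))  # asynchronous send: no reply is awaited

    threads = [threading.Thread(target=worker, args=(n,)) for n in ("A", "B")]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    arrivals = []
    while not inbox.empty():
        arrivals.append(inbox.get())
    return arrivals


if __name__ == "__main__":
    for _ in range(3):
        print(run_once())
```

Each sender's own messages arrive in order, but the interleaving of A and B may differ from one run to the next. An assumption such as "A's second message always arrives before B's second message" cannot be relied on in this setup, so it cannot be enforced either.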
By contrast, in a COSA operating environment, all operations are elementary synchronous processes and they all execute in lockstep. That is to say, they are synchronized to a master clock and they all have equal durations, exactly one system cycle (the actual length of the cycle can vary, but everything must remain synchronized). Allegorically speaking, we might say that they all belong in the same universe and obey the same laws. Additionally, since COSA is reactive, meaning that all objects react to their signals as they happen, there are none of the latency problems normally associated with algorithms. As a result, the timing of events is 100% deterministic. If event A is expected to always occur at the same time as event B, then this is what should happen. Otherwise, something is wrong. This makes it easy to test modifications to a complex system and make sure that they do not introduce bad side effects.
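The lockstep idea can be sketched in a few lines. This is a toy model written for this article's argument, not the actual COSA implementation: the class `LockstepScheduler`, its methods and the cell names are all assumptions. Every signalled cell fires for exactly one cycle, and the signals it emits are delivered at the start of the next cycle.

```python
class LockstepScheduler:
    """Toy master clock: all signalled cells fire together, one cycle each."""

    def __init__(self):
        self.targets = {}     # cell name -> cells it signals when it fires
        self.pending = set()  # cells signalled for the next cycle
        self.cycle = 0
        self.trace = []       # (cycle, cell name) for every firing

    def add_cell(self, name, targets=()):
        self.targets[name] = list(targets)

    def signal(self, name):
        self.pending.add(name)

    def step(self):
        # Swap buffers: signals emitted now are only visible next cycle.
        current, self.pending = sorted(self.pending), set()
        for name in current:
            self.trace.append((self.cycle, name))
            self.pending.update(self.targets[name])
        self.cycle += 1

    def run(self, cycles):
        for _ in range(cycles):
            self.step()
        return self.trace


def demo_trace():
    sched = LockstepScheduler()
    sched.add_cell("A", ["B", "C"])
    sched.add_cell("B", ["D"])
    sched.add_cell("C", ["D"])
    sched.add_cell("D")
    sched.signal("A")
    return sched.run(3)


if __name__ == "__main__":
    print(demo_trace())  # [(0, 'A'), (1, 'B'), (1, 'C'), (2, 'D')]
```

In this sketch B and C always fire in the same cycle, one cycle after A, and every run produces the same trace. That repeatability is what makes an expectation like "B and C are simultaneous" checkable.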
Do not underestimate the power of timing expectations to uncover hidden bugs in a complex program. Everything that takes place on a computer is precisely timed and, in a system that enforces deterministic timing, even the tiniest abnormality is bound to violate the expectations and trigger an alarm. One of the nice things about the COSA model is that the discovery and enforcement of timing constraints can be automated. As an aside, deterministic timing, combined with the inherent reactivity of COSA objects (nothing happens unless a change is detected in the object’s environment), allows us to implement other beneficial mechanisms such as the automatic detection and resolution of data dependencies. This is explained on the COSA System page.
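As a rough illustration of how the discovery and enforcement of timing constraints might be automated, the sketch below learns which events always fire in the same cycle from a reference trace, then flags any later trace that breaks one of those coincidences. The trace format (a list of `(cycle, event)` pairs) and the function names are assumptions made for this example. The COSA System page describes the real mechanism.

```python
from collections import defaultdict
from itertools import combinations


def events_by_cycle(trace):
    """Group a list of (cycle, event) pairs into {cycle: set of events}."""
    cycles = defaultdict(set)
    for cycle, event in trace:
        cycles[cycle].add(event)
    return cycles


def discover_coincidences(trace):
    """Return the pairs of events that always fire in the same cycle."""
    cycles = events_by_cycle(trace)
    events = sorted({event for _, event in trace})
    return {
        (a, b)
        for a, b in combinations(events, 2)
        if all((a in fired) == (b in fired) for fired in cycles.values())
    }


def check(trace, pairs):
    """Return (cycle, a, b) for every cycle where a pair was split apart."""
    violations = []
    for cycle, fired in sorted(events_by_cycle(trace).items()):
        for a, b in sorted(pairs):
            if (a in fired) != (b in fired):
                violations.append((cycle, a, b))
    return violations


if __name__ == "__main__":
    reference = [(0, "A"), (1, "B"), (1, "C"), (2, "D"),
                 (3, "A"), (4, "B"), (4, "C"), (5, "D")]
    modified = [(0, "A"), (1, "B"), (2, "C"), (3, "D")]  # C now fires late

    learned = discover_coincidences(reference)
    print(learned)                   # {('B', 'C')}
    print(check(modified, learned))  # [(1, 'B', 'C'), (2, 'B', 'C')]
```

A check of this kind only means something when the reference trace is repeatable, which is the whole argument for deterministic timing.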
In conclusion, temporal integrity and coherence (including motor conflict detection/resolution) and the automatic resolution of dependencies effectively break the correlation between complexity and unreliability that is a characteristic of the algorithmic model. This is rather counterintuitive but the COSA promise is that the robustness (fitness) of a system will be proportional to its complexity. So, the COSA motto should be, “Keep it complex, stupid”.