Abstract
In Part II, I wrote that Brooks's excuse for crappy software is actually a curse because it is much more detrimental to society than is commonly assumed. In this post, I defend the thesis that there is a way to lift the curse and solve the software reliability problem once and for all. Timing is the key.
Brooks's Blunder
In his famous No Silver Bullet paper, computer scientist Frederick Brooks wrote the following:
"The complexity of software is an essential property, not an accidental one."
Brooks's thesis is that the accidental complexity of software programming (e.g., syntax, spelling, infinite loops, null pointers, etc.) can be effectively managed but its essential complexity is so vast that errors cannot be prevented. Brooks stakes his entire thesis on the following assertion:
"From the complexity comes the difficulty of enumerating, much less understanding, all the possible states of the program, and from that comes the unreliability."
In other words, Brooks argues that, in order for a program to be reliable, one must be able to enumerate and understand all the possible states of the program. This may seem obvious at first glance, but is it really true? Well, I hate to disappoint those of you who have believed in this fairy tale for the last twenty years or so, but Brooks's thesis can be easily falsified as follows:
Let's take a simple temperature control program that is designed to read a register T that represents room temperature. The program turns on the heater if T decreases below t1 and turns it off when T increases above t2. Is it necessary to enumerate all the possible states of this program in order to know that it works correctly? Of course not. The program will work correctly if the two conditions (i.e., T < t1 and T > t2) that control its behavior are tested and shown to elicit the right behavior. There is no need to test all the possible states of T in order to know that the program is working properly as designed.
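To make the argument concrete, here is a minimal sketch of such a controller in Python. The class name, the threshold values, and the sample readings are assumptions chosen for illustration; only the two conditions (T < t1 and T > t2) come from the description above.

```python
from dataclasses import dataclass


@dataclass
class Thermostat:
    """Hypothetical controller: heater on below t1, off above t2."""
    t1: float                 # lower threshold
    t2: float                 # upper threshold
    heater_on: bool = False   # current heater state

    def update(self, T: float) -> bool:
        """Apply one temperature reading and return the heater state."""
        if T < self.t1:
            self.heater_on = True
        elif T > self.t2:
            self.heater_on = False
        # Between t1 and t2 the previous state is kept unchanged.
        return self.heater_on


# Illustrative thresholds and readings (assumed values, in degrees C).
ctl = Thermostat(t1=18.0, t2=22.0)
readings = [20.0, 17.5, 19.0, 22.5, 21.0]
states = [ctl.update(T) for T in readings]
print(states)  # [False, True, True, False, False]
```

Exercising the controller takes one reading below t1, one above t2, and one in between. It does not take a reading for every representable value of T.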
Note: Many years ago, speaking about Brooks's 'No Silver Bullet' paper, I wrote that "no other paper in the annals of software engineering has had a more detrimental effect on humanity's efforts to find a solution to the software reliability crisis." I still stand by that statement.
Thinking of Everything: It's All in the Timing
It is common knowledge that the human mind is prone to errors and often overlooks important aspects of a complex design. My claim is that there is a way to solve this problem. In my thesis on the correct approach to programming, timing is an essential and fundamental aspect of software, not an afterthought. This means that a program should be synchronous and reactive and should consist entirely of sensors (comparison operators or event detectors) and effectors (normal data operators). It also means that the relative timing of every elementary operation can be easily discovered via the use of simultaneity and sequence detectors. During testing, a special tool called the temporal discovery tool (TDT) can automatically discover the normal relative temporal order of all the sensors and effectors used in the program. This knowledge can be used to automatically generate alert detectors for every possible temporal anomaly. What would constitute an anomaly? Suppose the TDT discovers that signals A, B, C, and D always occur sequentially in that order. It will then create anomaly detectors that fire if there is a break somewhere in the sequence.
In practice, the TDT will check for both simultaneity and sequentiality in order to cover every possible eventuality. An application must not be deployed unless every discovered anomaly (alert signal) is adequately handled. This way, it is no longer necessary for the programmer to think of every possible anomalous condition in the program since that is the job of the TDT.
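The following is a rough sketch of the idea, not the actual TDT. It assumes a recorded trace is a list of clock ticks, each tick being the set of signals that fired on it. The function names discover and find_anomalies are hypothetical. The sketch learns which signals always fire together and which signal always follows another on the next tick, then flags any break in those rules.

```python
def discover(traces):
    """Learn simultaneity and next-tick sequence rules from test traces."""
    ticks = [tick for trace in traces for tick in trace]
    steps = [(trace[i], trace[i + 1])
             for trace in traces for i in range(len(trace) - 1)]
    signals = sorted({s for tick in ticks for s in tick})
    together, followed_by = set(), set()
    for a in signals:
        for b in signals:
            if a == b:
                continue
            # a and b are simultaneous if each fires exactly when the other does.
            if a < b and all((a in t) == (b in t) for t in ticks):
                together.add((a, b))
            # b follows a if b fired on the tick after every observed a.
            nexts = [nxt for cur, nxt in steps if a in cur]
            if nexts and all(b in nxt for nxt in nexts):
                followed_by.add((a, b))
    return together, followed_by


def find_anomalies(trace, together, followed_by):
    """Return (tick index, message) for every learned rule that is broken."""
    alerts = []
    for i, tick in enumerate(trace):
        for a, b in sorted(together):
            if (a in tick) != (b in tick):
                alerts.append((i, f"{a} and {b} not simultaneous"))
        if i + 1 < len(trace):
            for a, b in sorted(followed_by):
                if a in tick and b not in trace[i + 1]:
                    alerts.append((i + 1, f"{b} did not follow {a}"))
    return alerts


# Training run: A, B, C, D always occur in that order (assumed data).
training = [[{"A"}, {"B"}, {"C"}, {"D"}, {"A"}, {"B"}, {"C"}, {"D"}]]
together, followed_by = discover(training)

# A later run in which C goes missing.
observed = [{"A"}, {"B"}, {"D"}]
alerts = find_anomalies(observed, together, followed_by)
print(alerts)  # [(2, 'C did not follow B')]
```

A real tool would need far richer rules than these two. The point of the sketch is only that the rules come from observed behavior and are not written by hand.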
An Example
A simple program like the temperature control program I described above would suffice in most situations but what if some anomaly occurred such as an external hardware malfunction? Given enough temperature sensors, the TDT can discover many things about the way temperature changes in the room. For example, the TDT can easily discover that the temperature cannot increase above X twice in a row without first decreasing below X in between.
In other words, it discovers that an increase above X normally alternates with a decrease below X. So it creates an alert sensor that sends a signal if the rule is broken. This sort of thing can occur if the temperature sensor undergoes a malfunction for whatever reason. The discovery process is fully automatic and thorough and guarantees that every possible contingency is covered.
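A minimal sketch of such an alert sensor might look like this. It assumes the temperature sensors have already been reduced to a stream of two events, "above" (T rose above X) and "below" (T fell below X). The function name and the sample stream are made up for illustration.

```python
def alternation_alerts(events):
    """Return the indices at which the same crossing event repeats."""
    alerts = []
    last = None
    for i, event in enumerate(events):
        if event not in ("above", "below"):
            raise ValueError(f"unknown event: {event!r}")
        # Two identical crossings in a row break the alternation rule.
        if event == last:
            alerts.append(i)
        last = event
    return alerts


# Assumed event stream: the fourth event repeats "above" with no "below" between.
stream = ["above", "below", "above", "above", "below"]
alerts = alternation_alerts(stream)
print(alerts)  # [3]
```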
Dear Toyota Software Managers
Please take note. If Toyota had a tool like the TDT, it would not be in the predicament that it finds itself in. The TDT would have automatically discovered that pressing the brake pedal at the same time as the gas pedal is an anomaly. Of course, Toyota's engineers knew that but, with the TDT, they would also know that everything is guaranteed to be in working order and, as a result, they would not be shy about adding as much complexity to the control program as necessary, or as desired.
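As a rough illustration of that kind of discovery, the sketch below learns which pairs of signals never fire on the same tick during testing and then raises an alert when they do. The signal names "gas" and "brake", the trace format (one set of signals per clock tick), and both function names are assumptions for illustration. Nothing here describes Toyota's actual software.

```python
from itertools import combinations


def never_together(traces):
    """Find signal pairs that never fired on the same tick during testing."""
    ticks = [tick for trace in traces for tick in trace]
    signals = sorted({s for tick in ticks for s in tick})
    return {
        (a, b)
        for a, b in combinations(signals, 2)
        if not any(a in tick and b in tick for tick in ticks)
    }


def exclusion_alerts(trace, excluded):
    """Return (tick index, a, b) whenever an excluded pair fires together."""
    return [
        (i, a, b)
        for i, tick in enumerate(trace)
        for a, b in sorted(excluded)
        if a in tick and b in tick
    ]


# Assumed test drive: the two pedals are never pressed on the same tick.
normal = [[{"gas"}, {"gas"}, set(), {"brake"}, set(), {"gas"}]]
excluded = never_together(normal)

# Later drive in which both pedals are pressed at once on tick 1.
drive = [{"gas"}, {"gas", "brake"}, {"brake"}]
alerts = exclusion_alerts(drive, excluded)
print(alerts)  # [(1, 'brake', 'gas')]
```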
Conclusion
The TDT described above can only work in a deterministic software environment. What this means is that the system must make it possible to determine the temporal relationship (simultaneous or sequential) of any two events in a program. The beauty of the COSA software model is that it enables temporally deterministic programs. Non-deterministic and/or abnormal behavior can be fully accounted for during development. COSA does not construct your code for you. It simply makes sure that whatever you create is rock-solid and complete with no stone left unturned. If only the not-so-wise programming managers at Toyota, or wherever safety-critical code is developed, would take notice.
Next: How to Construct 100% Bug-Free Software
See Also:
Parallel Computing: Why the Future Is Synchronous
Parallel Computing: Why the Future Is Reactive
How to Solve the Parallel Programming Crisis
Parallel Computing: The End of the Turing Madness
Why Software Is Bad and What We Can Do to Fix It
Project COSA
The COSA Operating System


In a non-algorithmic program, by contrast, there is no limit to the number of predecessors or successors that an element can have. A non-algorithmic program is inherently parallel. As seen below, signal flow is multi-dimensional and any number of elements can be processed at the same time.
Note the similarity to a
Adding more cores to the processor does not affect existing non-algorithmic programs; they should automatically run faster, that is, depending on the number of objects to be processed in parallel. Indeed the application developer should not have to think about cores at all, other than as a way to increase performance. Using the non-algorithmic software model, it is possible to design an auto-scalable,
Computer geeks often write to argue that it is easier and faster to write keywords like ‘while’, ‘+’, ‘-’ and ‘=’ than it is to click and drag an icon. To that I say, phooey! The real beauty of event-driven reactive programming is that it makes it easy to create and use plug-compatible components. Once you’ve built a comprehensive collection of low-level components, there is no longer a need to create new ones. Programming will quickly become entirely high-level and all programs will be built entirely from existing components. Just drag ’em and drop ’em. This is the reason that I have been saying that
Please read the paragraph on 