Wednesday, February 24, 2010

Unreliable Software, Part II: Brooks's Curse

Part I, II, III

Abstract

In Part I, I described what I call Brooks's excuse (named after computer scientist Fred Brooks), the main reason given by programmers and computer scientists to explain why complex programs are not reliable. The excuse essentially amounts to asserting that, when it comes to programming complex software, nobody can think of everything. In this post, I explain why Brooks's excuse turned out to be a very costly curse.

Brooks's Curse

Many articles have been written on the cost of the software reliability crisis, but I think they only scratch the surface. Society is paying a much higher price than is commonly perceived. Brooks's excuse has become a self-fulfilling prophecy of sorts because programmers are now convinced of its veracity. Their reasoning is that, since programs cannot be guaranteed to be bug-free, there is no point in trying to find a solution. The best that can be done is to alleviate the problem with the use of formal methods and the application of so-called software metrics. These methods are not only costly and time-consuming, they just don't work. The Toyota debacle is proof of that. The end result is that it is impossible to develop and deploy truly sophisticated software for use in safety-critical applications.

Brooks's curse is the reason that we do not have self-driving cars, a technology that would prevent over 40,000 yearly fatalities on US roads alone (Wheels of Death). It's the reason that humans are still required to operate trains, buses, aircraft and other dangerous vehicles and equipment. It's the reason that we still have human air traffic controllers. But it is going to get much worse because the industry has settled on multithreading as the software model for its multicore processors. If you think single-threading is unreliable, wait till multithreading becomes widespread. About the only hope we have is that multithreading is such a pain in the ass to work with that it will seriously hurt the computer industry's bottom line. This might convince the big dogs in the business of the seriousness of the problem and put pressure on them to do something about it, something that has not been tried before. We can only hope.

Superstition

In Part III, I will explain why Brooks's curse is based on nothing more than superstition. There is a way to guarantee bug-free code, and there is a way to think of everything even if humans normally can't.

See Also:

Why Software Is Bad and What We can Do to Fix It
