Sunday, May 4, 2008

Parallel Computing: Why the Future Is Non-Algorithmic

Single Threading Considered Harmful

There has been a lot of talk lately about how a growing number of experts consider the use of multiple concurrent threads harmful. I think the problem is much deeper than that. What many fail to realize is that multithreading is the direct evolutionary outcome of single threading. Whether running singly or concurrently with other threads, a thread is still a thread. In my writings on the software crisis, I argue that the thread concept is the root cause of every ill that ails computing, from the chronic problems of unreliability and low productivity to the current parallel programming crisis. Obviously, if a single thread is bad, multiple concurrent threads will make things worse. Fortunately, there is a way to design and program computers that does not involve the use of threads at all.

Algorithmic vs. Non-Algorithmic Computing Model

A thread is an algorithm, i.e., a one-dimensional sequence of operations to be executed one at a time. Even though the execution order of the operations is implicitly specified by their position in the sequence, it pays to view a program as a collection of communicating elements or objects. Immediately after performing its operation, an object sends a signal to its successor in the sequence saying, "I am done; now it's your turn." As seen in the figure below, an element in a thread can have only one predecessor and one successor. As a result, only one element can be executed at a time. The arrow represents the direction of signal flow.


In a non-algorithmic program, by contrast, there is no limit to the number of predecessors or successors that an element can have. A non-algorithmic program is inherently parallel. As seen below, signal flow is multi-dimensional and any number of elements can be processed at the same time.

Note the similarity to a neural network. The interactive nature of a neural network is obviously non-algorithmic since sensory (i.e., non-algorithmically obtained) signals can be inserted into the program while it is running. In other words, a non-algorithmic program is a reactive system.
Note also that all the elements (operations) in a stable non-algorithmic software system must have equal durations based on a virtual system-wide clock; otherwise signal timing would quickly get out of step and result in failure. Deterministic execution order, also known as synchronous processing, is absolutely essential to reliability. The figure below is a graphical example of a small parallel program composed using COSA objects. The fact that a non-algorithmic program looks like a logic circuit is no accident, since logic circuits are essentially systems that behave non-algorithmically.
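To make the element-and-signal picture concrete, here is a minimal sketch in Python. The names (Cell, connect) are made up for illustration only; this is not COSA notation or a real implementation, just the structure described above:

# A hypothetical sketch of a signal graph: each element (cell) performs one
# elementary operation when it receives a signal and then signals all of its
# successors. This only mirrors the structure described above.

class Cell:
    def __init__(self, name, operation):
        self.name = name
        self.operation = operation      # the elementary operation to perform
        self.successors = []            # any number of successors is allowed

def connect(source, *targets):
    """Add a signal path from source to each target."""
    source.successors.extend(targets)

# Algorithmic (thread-like) chain: one predecessor and one successor per cell,
# so only one cell can fire per cycle.
a, b, c = (Cell(n, lambda n=n: print("op", n)) for n in "abc")
connect(a, b)
connect(b, c)

# Non-algorithmic graph: x signals both y and z, which can therefore be
# processed during the same cycle; both then feed w.
w, x, y, z = (Cell(n, lambda n=n: print("op", n)) for n in "wxyz")
connect(x, y, z)
connect(y, w)
connect(z, w)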

No Two Ways About It

The non-algorithmic model of computing that I propose is inherently parallel, synchronous and reactive. I have argued in the past and I continue to argue that it is the solution to all the major problems that currently afflict the computer industry. There is only one way to implement this model in a Von Neumann computer. As I have said repeatedly elsewhere, it is not rocket science. Essentially, it requires a collection of linked elements (or objects), two buffers and a loop mechanism. While the objects in one buffer are being processed, the other buffer is filled with objects to be processed during the next cycle. Two buffers are used in order to prevent signal race conditions. Programmers have been using this technique to simulate parallelism for ages. They use it in such well-known applications as neural networks, cellular automata, simulations, video games, and VHDL simulators. And it is all done without threads, mind you.

What is needed in order to turn this technique into a parallel programming model is to apply it at the instruction level. However, doing so in software would be too slow. This is the reason that the two buffers and the loop mechanism should ideally reside within the processor and be managed by on-chip circuitry. The underlying process should be transparent to the programmer, who should not have to care whether the processor is single-core or multicore. Below is a block diagram for a single-core non-algorithmic processor.
Adding more cores to the processor does not affect existing non-algorithmic programs; they should automatically run faster, depending on the number of objects that can be processed in parallel. Indeed, the application developer should not have to think about cores at all, other than as a way to increase performance. Using the non-algorithmic software model, it is possible to design an auto-scalable, self-balancing multicore processor that implements fine-grained deterministic parallelism and can handle anything you can throw at it. There is no reason to have one type of processor for graphics and another for general-purpose programs. One processor should do everything with equal ease. For a more detailed description of the non-algorithmic software model, take a look at Project COSA.
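To make the two-buffer loop concrete, here is a minimal sketch in Python that reuses the hypothetical Cell objects from the earlier sketch. It is only a software simulation of what should ideally be done by on-chip circuitry, not the COSA processor itself:

def run(initial_cells, cycles):
    # Process every cell in the current buffer and collect the cells they
    # signal into the other buffer for the next cycle. Two buffers ensure
    # that every cell in a cycle sees the same virtual clock tick and that
    # signals cannot race ahead within a cycle.
    current = list(initial_cells)       # buffer being processed this cycle
    for tick in range(cycles):
        next_buffer = []                # buffer being filled for the next cycle
        for cell in current:            # order within a cycle does not matter
            cell.operation()            # every operation lasts one virtual tick
            next_buffer.extend(cell.successors)
        current = next_buffer
        if not current:                 # no more signals to deliver
            break

# Start the non-algorithmic graph by signaling x. Cycle 1 processes x;
# cycle 2 processes y and z together; cycle 3 processes w twice (once per
# incoming signal) unless the cell is made to wait for both predecessors,
# a synchronization detail this sketch deliberately omits.
run([x], cycles=3)

Because the cells in the current buffer are independent of one another within a cycle, an on-chip implementation could divide that buffer among however many cores are available without changing the program, which is the scaling behavior described above.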

Don’t Trust Your Dog to Guard Your Lunch

The recent flurry of activity among the big players in the multicore processor industry underscores the general feeling that parallel computing has hit a major snag. Several parallel computing research labs are being privately funded at major universities. What the industry fails to understand is that it is the academic community that got them into this mess in the first place. British mathematician Charles Babbage introduced algorithmic computing to the world with his design of the Analytical Engine more than 150 years ago. Sure, Babbage was a genius, but parallel programming was the furthest thing from his mind. One would think that after all this time, computer academics would have realized that there is something fundamentally wrong with basing software construction on the algorithm. On the contrary, the algorithm became the backbone of a new religion, with Alan Turing as the godhead and the Turing machine as the quintessential algorithmic computer. The problem is now firmly institutionalized, and computer academics will not suffer an outsider such as myself to come onto their turf and teach them the correct way to do things. That's too bad. The fact remains that throwing money at academia in the hope of finding a solution to the parallel programming problem is like trusting your dog to guard your lunch. Bad idea. Sooner or later, something will have to give.

Conclusion

The computer industry is facing an acute crisis. In the past, revenue growth has always been tied to performance increases. Unless the industry finds a quick solution to the parallel programming problem, performance increases will slow down to a crawl and so will revenue. However, parallel programming is just one symptom of a deeper malady. The real cancer is the thread. Get rid of the thread by adopting a non-algorithmic, synchronous, reactive computing model and all the other symptoms (unreliability and low productivity) will disappear as well. In an upcoming post, I will go over the reasons that the future of parallel programming is necessarily graphical.

[This article is part of my downloadable e-book on the parallel programming crisis.]

See Also:

How to Solve the Parallel Programming Crisis
Parallel Computing: The End of the Turing Madness
Parallel Computing: Why the Future Is Synchronous
Parallel Computing: Why the Future Is Reactive
Why Parallel Programming Is So Hard
Why I Hate All Computer Programming Languages
Parallel Programming, Math and the Curse of the Algorithm
The COSA Saga

PS. Everyone should read the comments at the end of Parallel Computing: The End of the Turing Madness. Apparently, Peter Wegner and Dina Goldin of Brown University have been ringing the non-algorithmic/reactive bell for quite some time. Without much success, I might add; otherwise there would be no parallel programming crisis to speak of.

30 comments:

dashxdr said...

Louis,

I think what you're describing is similar to stream programming; I once skimmed a chapter on the subject. With the stream approach, a processor has data coming in, works on it, and sends it out. You can gang processors together in a chain and the streams are all taken care of for you.

I tend to agree with you that threads programming is a disaster. It becomes unmaintainable. The dangers of semaphore lockups grow exponentially as the complexity increases. It becomes impossible to even visualize what's going on.

I wrote a very simple system that was using threads, and I realized it just couldn't keep growing -- no other programmer could remember all the gotchas in the code, and a year or two after I stopped working on it I'd have forgotten them also.

I heard recently that Knuth has come out against parallel programming. What a jackass; I've never thought he was worth listening to. But the problem isn't going parallel; the problem is, as you say, the underlying paradigm. Parallel is the future, there is no question whatsoever. But the right kind of parallel is critical.

Hang in there!

-Dave

Dave said...

I just discovered your blog today while searching for parallel computing.

Are you for real? This blog seems far too elaborate to have been created for the sole purpose of parody or goading CS nerds to argue with you.

Maybe the reason your paradigm isn't catching on is that its proponents spend a large portion of their time ranting about how stupid everybody else is rather than building something substantial to prove their point.

Say what you will about 'Ivory Tower' academic elitists, at least they produce something more than rhetoric.

Louis Savain said...

Dave wrote: I think what you're describing is similar to stream programming,

Nope. I am describing a signal-based (not data-based) synchronous reactive model. Not the same thing.

PS. BTW Dave, forget trying to get me to participate in your AI forum. I'm not interested.

Louis Savain said...

Dave #2 wrote: Are you for real?

If you don't like what I write, don't read it. It's not meant for you. Sorry.

tbcpp said...

That being said, could we get a complex COSA example, please? I'd like to see something like a "pseudo-code" raytracer. For instance, how would we do this in COSA?

def fireray(ray):
    for each obj in objects:
        if ray intersects obj:
            return obj.color

def main():
    for each pixel in screen:
        setpixel(pixel, fireray(ray))

Or perhaps a more complex example of a string compare? I'm having a bit of an issue understanding how COSA would work for a larger task. Examples are key to understanding any programming concept. So, can we get more?

Thanks

tbcpp said...

and yeah, blogger removed the tabs from my last post. Oh well, you get the idea.

Louis Savain said...

tbcpp,

Check out the Project COSA pages. There is a QuickSort example that you may find interesting. If you are really interested, you should read up on the COSA concept first, especially the COSA Operating System and Software Composition pages.

Dave said...

tbcpp:

Don't even bother. The reason he is not providing any substantial examples is that even if he managed to create one, the COSA diagram would look so complex that it would instantly show how absurd the idea of 'Drag-N-Drop' programming is.

Browse the blog comments and look at how many people well versed in CS have trouble reading the COSA quick-sort diagram -- let alone something non-trivial.

The idea of 'drag-N-drop' programming is nothing new -- it has been shown over and over again that even non-technical users are more productive with a good Domain Specific Language than with these supposedly 'universal' graphical representations of programs.

He is right about one thing: This blog is not for me and this will be my last post.

Louis Savain said...

Hey Dave,

Does your boss at GTSI know how clueless you really are? No wonder you're posting anonymously. ahahaha...

tbcpp said...

Still, my comment stands: can we get even one more example besides the quicksort?

I'm interested in COSA, but all the demo program has is some basic logic (AND, OR, etc.).

Examples, please?

Hiro said...

I'd like to bring up two points.

But first, let's drop this "algorithmic vs. non-algorithmic" language. It's a false dichotomy. Obviously your graphical programs implement algorithms.

Instead, I'll use the phrases "sequential" and "graphical" (where "graphical" means "relating to a graph," not "visual").

Here goes:

I. Fully Specified Flow Graphs

There is an idea called a Fully Specified Flow Graph (FSFG), which captures the same sort of concept you outline here. What you have essentially given here is a double-buffering method for implementing FSFGs, which you can extend to an arbitrary number of processor cores.

This is a nice idea. I'm sure it's not new, but I happen to agree with the spirit of a quote, I think from John Carmack, to the effect that if you independently reinvent the wheel you are just as much an inventor of the wheel as the first guy. (It's a little bit of an overstatement, but what pithy quotes aren't?)

One problematic aspect of double buffering is that it requires memory twice as large as your state (you need to store both the past state and the state you're currently computing). By scheduling operations appropriately, you can avoid this.

Still, since this is a difference of only a small constant factor (two), your method is in the same memory complexity class as the "better" algorithm, which represents the state in the same way but reorders computations cleverly to avoid the additional storage.

These are issues that people in the digital signal processing community have worked out nicely.
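(To make the contrast concrete, here is a small sketch, assuming a toy one-dimensional update in which each cell depends on itself and its two neighbours; the function names are made up:)

# Double buffering: compute the next state into a second, equally large buffer.
def step_double_buffered(state):
    n = len(state)
    nxt = [0.0] * n                                  # second buffer: 2x memory
    for i in range(n):
        left = state[i - 1] if i > 0 else 0.0
        right = state[i + 1] if i < n - 1 else 0.0
        nxt[i] = 0.5 * state[i] + 0.25 * (left + right)
    return nxt

# Same update in place: a left-to-right sweep only needs to remember the one
# old value it has just overwritten, so the extra storage is constant.
def step_in_place(state):
    n = len(state)
    prev_old = 0.0                                   # old value of state[i - 1]
    for i in range(n):
        cur_old = state[i]
        right = state[i + 1] if i < n - 1 else 0.0   # not yet overwritten
        state[i] = 0.5 * cur_old + 0.25 * (prev_old + right)
        prev_old = cur_old
    return state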

II. Hardware

You can think of a digital electronics engineer working at the logic-gate level as a "COSA programmer." He is limited to boolean signals and has a restricted set of operators/blocks (NAND, NOT), but he can write any program. His "processors" are the logic gates themselves.

You can take a step up the abstraction ladder and look at an engineer writing VHDL or Verilog to combine existing "IP" (the electronics industry's way of referring to black boxes other people built). He too is writing COSA, in a way.

At the top of the abstraction ladder, there is the programmer who uses the cores of a multicore processor, rather than logic gates or smaller I.P. blocks.

So the question is: Where is the line between programming and hardware design? You imply that current processors are not suited to COSA programs. Isn't the "ideal" programmable "COSA processor" really an FPGA?

It seems to me that what we're looking at is a tradeoff between abstraction and parallelism -- a tradeoff that places you somewhere between microelectronics design and software engineering, depending on where you decide to strike your balance.

John L said...

Jaron Lanier and Marvin Minsky are also advocating a complete re-think of our computing paradigms. They say 2D / boolean computing is a dead end.

Are you and they thinking along the same lines? Lanier is advocating for "phenotropic" or biology-like computing. Minsky is thinking more along "emotional machines" seeking to emulate more of how human beings process data.

Patrick H. Madden said...

Wow; saw your post over on the Intel blog, and followed you over here. I agree with you that Intel is in deep trouble with multi-core, but the COSA model doesn't seem to me to be a step in the right direction.

I'll agree with a lot of the other commenters here that a working prototype would go a long way to convincing people, and an FPGA seems like an obvious platform. There's an ocean of people offering fixes for the end of clock scaling (most of them offering ones that have failed a few times already). To get the funding support, you're going to have to show that it works. Life's not fair, but that's the way it is.

Louis Savain said...

Patrick,

I am not sure why you decided to leave a comment on my blog since you must realize that I am strongly biased against the computer academic community. You people are Turing worshippers, which makes you brain-damaged in my opinion.

Now, to the idea that I must have a working prototype of a COSA processor in order to get funding, I say fooey! You people in academia are getting all sorts of funding to do research in parallel computing. Where is your working solution that we may all marvel at? You got absolutely nothing. You got less than nothing; a bunch of failed (should I say, stupid?) ideas is what you got. And you've been at it for decades, an eternity in this business. It's about time for someone else to get a turn at the wheel. Your driving sucks and you have no sense of direction.

What I'm proposing is not rocket science. If you academics can't understand it, it is because you are either stupid or you are brain-damaged from years of indoctrination in your Turing cult. I always say it like I see it.

Will said...

I like the cut of your jib, Savain.

Can't honestly say I give a shit about AI or reliable software, but I'll give your COSA material a sincere twice-over.

Louis Savain said...

All right. I decided to block free commenting because this blog is a dictatorship. If you don't like what I write, go read somebody else's blog. After all, nobody is twisting your arm, right? As far as my usage of the word 'algorithm' is concerned, let me just say that I refuse to use the bastardized definition adopted by the Turing cult. I use algorithm as it was originally defined: a sequence of instructions or steps. If you don't like it, too bad. To the rest of you, thanks for your support.

Louis Savain said...

Can't honestly say I give a shit about AI or reliable software, but I'll give your COSA material a sincere twice-over.

You're mistaken about AI and reliable software. Solving the parallel programming problem automatically solves the software reliability problem. It also does wonders for AI because intelligent systems, with their millions or billions of connections, are quintessentially parallel.

Raw Think Tank said...

If someone figures out how to create a virtual processor that uses as many cores as are thrown at it, then we don't even have to debate the need for anything new.

Louis Savain said...

If someone figures out how to create a virtual processor that uses as many cores as are thrown at it, then we don't even have to debate the need for anything new.

I take it that you are proposing a virtual processor that can run current sequential programs on a multicore processor. Don't hold your breath too long on that one. Having said that, let me add that it would not be an easy thing to write, but a virtual COSA processor is certainly possible on an existing multicore processor. But why use a virtual processor when you can use the programming model as a blueprint with which to design and build a real processor that is orders of magnitude faster and more energy efficient than a virtual one?

Tom said...

I don't get your take on an "algorithm". As I understand it, it is a sequence of instructions to perform a certain task. However, I've never read anything that says the instructions must be *executed* sequentially; merely that they are *declared* sequentially. Using the example of a recipe as an algorithm, there are many cases where parallelism in a kitchen is desirable. However, almost all recipes are represented as a sequence (likely because it makes them easier to understand).

So, I don't think that CS has stolen the word "algorithm". My gut reaction is that you've deliberately misinterpreted the common definition to give yourself a straw man to attack.

I also take issue with your argument that "graphical" processing is somehow more interactive or reactive. The only significant change is execution behavior; the overall output of the system should be identical; therefore, anything that could be done interactively/reactively with a graphical application could be done just as well with a sequential application.

Nitpicks over, time to check out the COSA links...

Martin said...

"they should automatically run faster, that is, depending on the number of objects to be processed in parallel"

so you're still limited by the serial fraction, like any other parallel model?

Dr Senior said...

There is a wealth of academic research into this subject. An early pioneer was the mathematician Robin Milner, who first exposed the need for languages to support concurrency (in the sixties).

There are many languages capable of expressing concurrent problems, which can be implicitly spread over an arbitrary number of processors. These range from languages based on Pi-Calculus (like ADL) to those based on Petri nets (like YAWL).

Do some research and find out what's out there (and quite mature and well understood). Find out who your natural allies are, and stop being so damn arrogant.

Louis Savain said...

Do some research and find out what's out there (and quite mature and well understood). Find out who your natural allies are, and stop being so damn arrogant.

In other words, you academics knew it all along, right? If so, why is the industry still using processors that are designed for processing algorithms as opposed to reactive parallel software? I mean, you've known it was the way to go since the 60's, right? What's taking you so damn long?

I believe it was a former chairman of General Motors who once said: "First they tell you that you're wrong; then they tell you that you're right but it's not important; finally, they tell you that it's important but they knew it all along."

Talk about arrogance! You, in the computer science community, are the arrogant ones for having formed a cult around the Turing machine and having led the computing world on the wrong path for half a century. Look in the mirror, amigo.

Navid said...

What's going on, Louis? You seem very angry and filled with hatred. I read a bit through the comments, and you insult people on your own blog for disagreeing with you or even when they agree with you. If you can't even think about other opinions, then you won't get very far, no matter how right you are and how wrong they are.

Take it easy. If you really *know* better, then you do not have to get angry, because you are so far above them that you can look at them calmly and without fear. And that's the reason someone gets angry: fear (e.g., of their world falling to pieces).

I thought your idea was quite good. While not perfect, it's a good start, and I came up with similar things myself for parts of game engines.

But your attitude in the comments section just turned my opinion completely over. Ouch... And yeah, I AM reading your blog and I'm not liking parts of it. AND: How am I able to know if I like it before I've read it??

You are - in fact - a perfect example of a self-fulfilling prophecy. Of course they will never accept you if you treat them like shit. I bet you are a very intelligent man. Use your brain. Use psychology and tactics. Be wise. Play with the world. I think you have a fair chance of winning! :D

Good luck! :D

P.S.: You are not allowed to read this line. :P

Louis Savain said...

Navid:

Be wise. Play with the world. I think you have a fair chance of winning!

Nah. I tried playing with the world before. It does not work. When you have a disruptive message, people feel threatened regardless of how gentle and polite you try to be. Besides, I am a rebel at heart and I refuse to kiss the world's ass. Victory on this issue would be sweet but I don't need it that bad.

At any rate, it doesn't matter in the end. I will win anyway. Why? Because I am right on this issue and the parallel programming problem is getting so bad as to be unbearable for the powers that be.

There are a lot of people in the industry who know that I am right, but they refuse to acknowledge my work because they don't like what I stand for. They especially don't like the fact that I am a Christian and not a materialist or an atheist. The latest tactic to push me to the side is to claim they have always known that this was the way to go from the beginning. Go figure!

Louis Savain said...

And, yes, academics have known of this one too since the 60's (most have probably preferred functional programming as it is more concise and elegant--closer to the mathematics). But, industry has not been listening to them either.

Wow! This is a good one. You have got to be kidding me. For your information, industry does nothing but listen to academia. Guess who is in charge of the parallel programming research groups at Stanford, Berkeley, University of Chicago and all the other labs in academia that are being sponsored by industry? You guessed it, computer academics. Guess where all the technology leaders in industry came from? You got it again, straight from academia.

Computer scientists have been pushing functional programming for years. Erlang is already accepted at Ericsson in Sweden as God's gift to humanity. The only reason that functional programming has not taken off like a rocket is that reality keeps kicking it in the ass.

When you get a chance, go take a look at the work being conducted at Stanford's Pervasive Parallelism Lab. Download their PDF doc. What do you see? You see academics talking about tens of thousands of threads, that's what. You see the cream of the crop of computer science cluelessly talking about domain-specific languages (DSLs) as the answer to the parallel programming problem.

So don't give me this shit about the industry not listening to you. This is precisely why the industry is in a crisis. They do nothing but listen to you people.

PS. I am tired of people telling me that I am alienating the very people that I am trying to impress. The last thing I want to do is impress a bunch of ivory tower academics who are convinced that they are the smartest people in the universe and condescendingly talk down to the lay public, the same public who ultimately pays their salaries. The hell with the computer science community. They are the problem, not the solution. They can kiss my ass. How about that? Hey, I like the sound of that. I think I might turn it into a regular post, just for fun.

TheXenocide said...

Uhh... yeah... your perspective is very narrow. Just because you have some good ideas doesn't mean you can assume that everybody should or will think about them the same way you do. Discrete mathematics would indicate that even a signal-based, synchronized workflow with dependent and independent instructions is still an algorithm... see, just because it's a step-by-step set of directions doesn't mean that the steps HAVE to be linear; in fact, you can just change your perspective on the scenario to see a large algorithm with steps that consist of simultaneously occurring sub-algorithms.

Food recipes are one of the first examples of an algorithm anybody learns about... guess what? Sometimes you put the turkey in the oven and, before it's done, you start some potatoes and corn; then maybe you're still cooking the potatoes when you pull the turkey out to make some gravy... it's still a step-by-step recipe for a good Thanksgiving, and it consists of many smaller recipes; it just so happens that it's difficult to serve food fresh if you wait until one thing is done to cook the next, though you still need to let your turkey go for a bit to make a good gravy... two different components of a meal, one dependent on a signal from another, while simultaneously not dependent on other components...

I'm not saying you don't have good ideas, but you're killing the effectiveness of your own ideas by cutting off related explanations and alternative perspectives just because they're not the way you think about it. The most effective way to convey an idea is to let it be accepted and absorbed rather than to shove it down somebody's throat. And if they don't accept your ideas, chances are there are some valid points to be learned from; your ideas can be expanded and improved; nobody gets it perfect the first or even the 50th time around; ideas are forever evolving. So instead of starting wars with people, why don't you consider a new approach where you express your thoughts and then ask what other people think, maybe even getting some constructive conversation going, so everyone can actually benefit from the modern world's amazing ability to exchange ideas rapidly instead of starting wars where peace treaties have to happen before anybody can move forward.

TheXenocide said...

This comment has been removed by a blog administrator.

Paul Nathan said...

FWIW, your model of computation is known as data-flow computation, and it is a pain to write general-purpose programs in. A few summers ago I worked on a compiler for a dataflow language/hardware combo. It is very useful for specific applications, e.g., DSP.

Have a good day.

Louis Savain said...

TheXenocide,

Sorry, you have nothing interesting or constructive to add to the points made by the article. See ya.

Nathan,

If COSA is data-flow, then so is VHDL. Has anybody ever called VHDL or Verilog data-flow? I don't think so, and yet COSA works a lot like both. Can a data-flow program handle fine-grained, instruction-level, deterministic parallelism like a COSA program does? I don't think so. Do all operations in a data-flow program have the same virtual duration as in COSA? Answer: no. Can you determine whether or not two random operations in a data-flow program are concurrent or sequential? I don't think so, but it's an easy thing to do in COSA.

Conclusion: COSA is not data-flow. Please get a clear understanding of the subject matter before you make false assertions about it.