Essentially, Tilera and the other multicore CPU architecture designers (AMD and Intel included) look at multicore parallelism the way one would look at a distributed computer cluster. They do so because they cannot shake the algorithmic model out of their heads. It is part of their training, their indoctrination. They look at parallelism the same way Joe Armstrong, the main creator of Erlang, looks at concurrency in a programming language: just create a bunch of lightweight concurrent algorithmic processes and have them send messages back and forth to each other. Problem is, fine-grain parallelism gets killed in the process. Let’s say you want to implement a well-tuned parallel QuickSort module. Is that possible with lightweight concurrent processes? Not really. You mean, I get 64 processors on a chip and I can’t run fine-grain parallel software? Yep.
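To make the complaint concrete, here is a minimal sketch of what a QuickSort looks like when it is built the message-passing way. It is Python rather than Erlang, it uses OS threads and `queue.Queue` mailboxes as stand-ins for lightweight processes, and the names `sort_worker` and `message_passing_quicksort` are mine, made up purely for illustration. It is not anybody's actual implementation and it is not tuned for anything.

```python
import queue
import threading


def sort_worker(items, reply):
    """Sort `items` and send the result back as one message on `reply`."""
    if len(items) <= 1:
        reply.put(list(items))
        return

    pivot, rest = items[0], items[1:]
    smaller = [x for x in rest if x < pivot]
    larger = [x for x in rest if x >= pivot]

    # One new worker and one mailbox for each of the two partitions.
    left_box, right_box = queue.Queue(), queue.Queue()
    left = threading.Thread(target=sort_worker, args=(smaller, left_box))
    right = threading.Thread(target=sort_worker, args=(larger, right_box))
    left.start()
    right.start()

    # Block until both children have reported back, then forward the result.
    result = left_box.get() + [pivot] + right_box.get()
    left.join()
    right.join()
    reply.put(result)


def message_passing_quicksort(items):
    """Hypothetical entry point: run the root worker and read its reply."""
    box = queue.Queue()
    sort_worker(list(items), box)
    return box.get()


if __name__ == "__main__":
    print(message_passing_quicksort([5, 3, 8, 1, 9, 2]))
```

Notice what the sketch does at every level: each partition of two or more elements spawns two workers and waits for two messages before it can hand anything back. The comparisons themselves are the cheap part.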
All right, so you think I am kidding, eh? Call Tilera’s chief technologist, Anant Agarwal, and ask him if his fancy Tile64 CPU can do fine-grain parallelism using MIMD (multiple instructions, multiple data). Call Armstrong as well, and ask him if there is a parallel QuickSort in Erlang. There is no need to call, though. I already know the answer. Sure, Tilera will mention something about having “tools for SIMD (single instruction, multiple data) intrinsics to enable fine-grain parallelism,” but using SIMD to develop applications is worse than pulling teeth with a crowbar. The truth is that both Tile64 and Erlang were designed by thread monkeys. Fine-grain parallelism was never a part of their design goals. Yes, I understand why, but I don’t have to like it. And I don’t.
All those fast, pretty little cores and all you can do is run a bunch of cheesy threads. This is pure folly, I know. But of course, and I have been saying it for a long time, there is a better way to construct fine-grain parallel applications, and there is a better way to design both single and multicore processors to run those applications using MIMD. It’s the COSA way.
Parallel Programming, Math and the Curse of the Algorithm
The Age of Crappy Concurrency: Erlang, Tilera, Intel, AMD, IBM, Freescale, etc…
Half a Century of Crappy Computing
Parallel Computers and the Algorithm: Square Peg vs. Round Hole