Thursday, September 20, 2007

Why I Think Functional Programming Languages Like Erlang and Haskell are Crap

All Multicore and Parallel Programming Articles
All Erlang Related Articles

Computing Is About Behaving, Not Calculating

The main problem with functional programming is that it was created by mathematicians for mathematicians. In the minds of mathematicians, computation is synonymous with function evaluation. They see the computer strictly as a glorified calculator, a machine for the evaluation of math problems. My position, by contrast, is that computing is not about function evaluation at all but about behavior. In other words, a computer is a behaving machine. Let me explain the difference.

Hidden Variables and Side Effects

Functional programming proponents love to talk about how their style of programming avoids state and state changes. What they fail to tell you is that functions are high-level constructs and that the compiler creates, initializes and manipulates hidden state internally on your behalf. For example, here's a C snippet that calculates the sum of the integers from 1 to 10:

int sum = 0;                  /* explicit state */
for (int i = 1; i <= 10; ++i) /* explicit loop counter */
    sum += i;                 /* explicit update */
And here is how it's done in Haskell:

sum [1..10]

Now, the Haskell fanatics will tell you that there were no variable updates in that expression. Don't you believe it! There is no way the compiler can add a list of integers without updating at least one variable; the language simply hides it. You may say, so what? As long as the programmer does not have to deal with variables, it prevents side effects, right? Wrong. The worst side effects in programming have to do with timing, and unless you can precisely control how and when those hidden variables are changed, you cannot control timing.
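
To see the hidden variable, here is a sketch of the kind of accumulator loop a compiler effectively reduces sum to (the name mySum and the exact code are illustrative; GHC's actual output differs):

-- A sketch only: the accumulator acc is the hidden variable,
-- updated on every step just like sum in the C snippet above.
mySum :: [Int] -> Int
mySum = go 0
  where
    go acc []     = acc             -- done: return the accumulated state
    go acc (x:xs) = go (acc + x) xs -- update the accumulator and continue

main :: IO ()
main = print (mySum [1..10])        -- prints 55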


Temporal Side Effects

Let’s say you encapsulate a simple sum function in an Erlang process. You can then invoke the process by sending it a message. Subsequently, the process will send another message with the result back to the original sender. Never mind, for now, that Erlang stupidly requires the entire list of integers to be copied onto a message queue and that the receiving process creates a new variable, which is copied onto another queue to be returned to the sender (what an unnecessary waste of CPU cycles!). The point I am getting at is that the time between sending and receiving in an Erlang (or any functional language) program is non-deterministic!
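
For concreteness, here is a minimal Haskell sketch of the same send/receive pattern, with Control.Concurrent channels standing in for Erlang mailboxes (the names sumProcess, inbox and outbox are made up for the example):

import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (Chan, newChan, readChan, writeChan)

-- The server "process": wait for a list, reply with its sum.
sumProcess :: Chan [Int] -> Chan Int -> IO ()
sumProcess inbox outbox = do
  xs <- readChan inbox        -- blocks until a message arrives
  writeChan outbox (sum xs)   -- send the result back to the sender

main :: IO ()
main = do
  inbox  <- newChan
  outbox <- newChan
  _ <- forkIO (sumProcess inbox outbox)
  writeChan inbox [1..10]     -- send the list to the "process"
  result <- readChan outbox   -- nothing bounds how long this blocks
  print result                -- prints 55

Note that nothing in the program says when the reply arrives; the delay between writeChan and readChan is entirely up to the runtime scheduler.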

On the Crucial Importance of Temporal Determinacy

This, ladies and gentlemen, is the biggest problem with functional languages: they are temporally non-deterministic. Why is this so important? Well, let's take security as a case in point. It is entirely possible that some malware surreptitiously replaces your little Erlang process with a spying module. The sending process or client has no way of knowing this. The best and most effective way to detect malware is to use a temporally deterministic system. Such a system is necessarily based on fine-grain concurrency and has a stable temporal signature. What this means is that a number of timing watchdogs can be inserted at various places in the system to detect changes in its temporal signature. Any code modification (such as the insertion of a spying module) would change the signature and trigger an alarm (see Impregnable Security in Concurrent Reactive Systems). Granted, not every part of a software system can be deterministic, but every reaction, or even an entire chain reaction, is.
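
As an illustration only (a hypothetical sketch, not COSA's actual mechanism), a timing watchdog can be as simple as a record of the expected cycle count for a reaction chain:

-- A toy watchdog: it stores the temporal signature of a reaction
-- chain as an expected virtual-cycle count and flags any deviation.
data Watchdog = Watchdog
  { name           :: String
  , expectedCycles :: Int
  }

checkSignature :: Watchdog -> Int -> Either String ()
checkSignature (Watchdog n expected) observed
  | observed == expected = Right ()
  | otherwise = Left (n ++ ": temporal signature changed (expected "
                        ++ show expected ++ " cycles, observed "
                        ++ show observed ++ ")")

-- Example: checkSignature (Watchdog "sumChain" 3) 5
--   => Left "sumChain: temporal signature changed (expected 3 cycles, observed 5)"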

Security is one thing. Overall reliability and maintainability are another. It is true that a lot of math problems are not concerned with the precise order of most operations, but a huge number of real-world applications (e.g., simulations, automation, neural networks) certainly are, and the ability to automatically detect changes in event timing at the level of elementary operations is crucial to maintainability.

Faking Purity and the Mark of the Beast

The idea that one can implement real-world software systems using a purely functional style of programming is fantasy at best and an outright lie at worst. Let us take the example of a robot that uses two I/O registers to control the movement of a specific actuator. Let's say that one register is used to start the motion of the actuator and the other to stop it. You can repeat until you turn blue that those registers are not variables, but they are. You can scream "MONADS" at the top of your lungs until you drop. Makes no difference; they are still variables. Jumping up and down and foaming at the mouth is not going to help either. :-D
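
For the skeptics, here is a minimal Haskell sketch with Data.IORef cells standing in for the two memory-mapped registers (the names Actuator, startReg and stopReg are made up for the example):

import Data.IORef (IORef, newIORef, writeIORef)

-- Stand-ins for the two hardware control registers.
data Actuator = Actuator
  { startReg :: IORef Bool
  , stopReg  :: IORef Bool
  }

-- Monads or not, these are destructive updates of shared state.
startMotion, stopMotion :: Actuator -> IO ()
startMotion a = writeIORef (startReg a) True
stopMotion  a = writeIORef (stopReg a)  True

main :: IO ()
main = do
  act <- Actuator <$> newIORef False <*> newIORef False
  startMotion act   -- "pure" Haskell, yet the register just changed
  stopMotion act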

Faking purity and adherence to a mathematical ideal while using impure loopholes to mask the inadequacy of a paradigm is the ultimate exercise in self-deception. Those two control registers are variables that may be used by various behavior subsystems or modules within the robot's brain. Worse, they may be accessible by any number of cells within a given module. There is only one way to properly manipulate them without introducing all sorts of signal racing and conflict problems, and that is to use a temporally deterministic system. Forget threads and all the evil :-) thread monkeys (at Intel, AMD, Sun, IBM, Tilera, Freescale, etc.) who promote them for their crappy multithreaded CPUs. Threads are the absolute work of the devil. Why? Because they are non-deterministic, that's why. There is a better way to build concurrent deterministic systems without threads. We will need non-threaded, fine-grain, auto-scalable, MIMD multicore CPUs. Unfortunately for those of us who refuse to accept the mark of the Beast (the Algorithm), unless you have the mark on your forehead, you can neither buy nor sell in this business. But that will change soon enough. Slowly but surely, the number of rebels is growing and, sooner or later, the time for the uprising will come. Ahahaha…

Fine-Grain Parallelism, Deterministic Timing and Behavior

I wrote above that computing is about behavior. What I mean is that every elementary operation in a software system (additions, subtractions, divisions, multiplications, comparisons, etc…) should be seen as a basic behaving entity that acts on or reacts to an environmental variable. A behaving entity can be either a sensor (comparator) or an effector (operator). A computer should be viewed as a universal behaving machine (UBM), one which consists of a number of concurrent behaving and communicating entities. The only way to build a truly deterministic system is by adopting a convention whereby every behaving entity executes itself in one system cycle immediately upon receiving a signal. This is called synchronous reactive computing (not the same thing as synchronous messaging). A UBM is a radically different type of computer with a new kind of CPU and programming model. The goal of Project COSA is to pave the way for the development of the first UBM and the first true non-algorithmic, concurrent, synchronous, reactive OS. Until that happens, I will continue to bash existing systems and paradigms and gain new converts, one at a time. It’s not easy but, at least, it’s fun. :-D
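
To make the convention concrete, here is a toy Haskell sketch (mine, not Project COSA's actual design) of one tick of such a machine, in which every cell signalled in cycle N reacts exactly once in cycle N+1:

import qualified Data.Map as Map

-- A cell is an elementary behaving entity.
data Cell = Cell
  { react   :: Int -> Int  -- effector: act on the environment value
  , targets :: [String]    -- cells to signal on the next cycle
  }

-- One tick of the virtual clock: apply every signalled cell's
-- reaction to the environment and collect next cycle's signals.
-- (Within a tick, reactions are folded in list order; a real UBM
-- would have to resolve conflicting updates.)
tick :: Map.Map String Cell -> (Int, [String]) -> (Int, [String])
tick cells (env, signalled) = (env', next)
  where
    fire = [ c | n <- signalled, Just c <- [Map.lookup n cells] ]
    env' = foldl (\e c -> react c e) env fire
    next = concatMap targets fire

-- Example: tick cells (0, ["inc"]) advances the machine by one cycle.

Because every reaction is tied to the virtual clock, the order of execution is fully determined by the wiring of the cells, which is exactly the temporal determinacy argued for above.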

See Also:
How to Solve the Parallel Programming Crisis

Next: Why Functional Programming Is Worse Than Crap: State Changes Are Essential to Reliability

12 comments:

Meryn Stol said...

Your post is so refreshing and enlightening it actually makes me want to get back in computer science. Thank you.

Dan said...

Bashing existing paradigms might be fun, as long as that leaves you enough time to develop COSA, else you'll
REMAIN STUCK WITH THOSE PARADIGMS. >:) evil laughter: Muhuhaha!

zbrown said...

Meryn Stol:

You're a damned fool if you believe a word of that. COSA is a pipe dream and he knows it. You're a fool if you think otherwise.

MoMonkey said...

Pure functions are only one (minor) part of "Functional Programming". Using higher-order functions to build succinct, super-composable code is the real power. It should really be called "Function Oriented Programming". I agree that working entirely in a world without state is not very practical. Perhaps you should try a more pragmatic FP language like F#.

jeff said...

Precise timing is important in some cases, but for most software it's not. And the order of execution only matters when there's a temporal dependency, which ideally you'd like to minimize anyway so you can run the operations in parallel.

As for security, temporal determinacy doesn't necessarily help. In fact, it can be a security vulnerability. See, for example, timing attacks.

Louis Savain said...

"Pure functions are only one (minor) part of "Functional Programming". Using higher-order functions to build succinct, super-composable code is the real power. It should really be called "Function Oriented Programming". I agree that working entirely in a world without state is not very practical. Perhaps you should try a more pragmatic FP language like F#."

In my opinion, functional programming should not be promoted as a programming model by its practitioners. Whether the language used is imperative, functional or otherwise has no impact on the actual model used by all current computer languages: the algorithmic software model. It is just as difficult (if not more so) to extract fine-grain parallelism from an Erlang or a Haskell program as it is from a C program. Fine-grain parallelism is the future of computing. Functional programming fanatics can rant all they want about the suitability of FP for parallel programming, but the fact is that their approach to parallelism is limited to coarse-grain parallelism. I have yet to find a fine-grain parallel QuickSort written in FP. From my standpoint, this is a clear indication of the inadequacy of FP as a model of computing.

Having said that, I see an FP compiler simply as a special-purpose tool (used by math geeks) to be grafted on top of an existing software model. There are only two real software models in the world: algorithmic and non-algorithmic. FP can be grafted on top of either. Since the non-algorithmic paradigm is the true future of programming, FP will end up being grafted on top of that. In conclusion, it is confusing at best and disingenuous at worst to speak of FP as a computing model. It is not. Computing is entirely about behavior, and evaluating functions is just one of the many types of behavior that computers can handle. Don't confuse a type of behavior with a computing model, please.

Louis Savain said...

"Precise timing is important in some cases, but for most software it's not. And the order of execution only matters when there's a temporal dependency, which ideally you'd like to minimize anyway so you can run the operations in parallel."

I disagree, of course. When I use the term 'deterministic timing' I am not referring to real time but to a virtual system-wide clock. All concurrent elementary operations that must execute at a particular time must do so within a single cycle of the virtual clock. What is important is not the real-time precision of individual operations (an impossibility in current systems) but their relative order of execution (concurrent or sequential) according to a virtual heartbeat. Furthermore, I disagree that we should try to minimize temporal dependencies. They are an essential characteristic of computing and should be welcomed. I am currently writing a new FP-bashing (LOL) article in which I will explain how temporal dependencies can be used to increase the correctness and reliability of a program using the software model that I am proposing. Indeed, the more dependencies, the merrier. This is in sharp contrast to FP dogma and will ruffle the feathers of FP fanatics.

"As for security, temporal determinacy doesn't necessarily help. In fact, it can be a security vulnerability. See, for example, timing attacks."

Apparently, it's not much of a vulnerability, since cryptography is used worldwide by all sorts of public and private organizations to effectively secure their private data. Again, I was not referring to real-time determinacy but to execution-order determinacy. Still, real-time determinacy is much more attainable and maintainable in a system that enforces execution-order determinacy. I will explain why in my upcoming article.

David Hopwood said...

"In the minds of mathematicians, computation is synonymous with function evaluation. They see the computer strictly as a glorified calculator, a machine for the evaluation of math problems. My position, by contrast, is that computing is not about function evaluation at all but about behavior."

This is a completely silly argument when applied to languages like Haskell and Erlang. Haskell uses monadic I/O (including extensions like transactional memory in GHC Haskell) to specify behaviour; Erlang uses concurrent processes with message passing. The people working on these languages, whether mathematicians or not, are perfectly aware that pure function evaluation alone isn't sufficient for a practical programming language, or even a complete model of computation.

We could argue about the merits of using a functional core language to define processes, but I think your opinions are too polarised for that argument to be productive.

namekuseijin said...

I stopped after the first few paragraphs.

I probably shouldn't feed trolls, but I actually have a few remarks:

"functional programming is that it was created by mathematicians for mathematicians. In the minds of mathematicians, computation is synonymous with function evaluation. They see the computer strictly as a glorified calculator, a machine for the evaluation of math problems."

First: all problems in the world can be reduced to math problems, given the right mathematical model. That's how we can map real-world concepts, images and sounds to numerical representations that can be manipulated by a computer.

Computer Science is a branch of mathematics dealing with computations and computability. And the computer IS a glorified calculator: all images, sounds, UML models, Java classes and whatnot are ultimately fed as numbers to the computer, which is a Turing Machine, i.e. a "function" which receives inputs from an infinite-length list and produces an output list which can be fed back into the input list or even used to change the behaviour of the "function" itself.

Need I also go into the Church-Turing thesis about the equivalence of the Turing Machine and the Lambda Calculus as computation models?

"sum [1..10]
There is no way the compiler can add a list of integers without updating at least one variable."

I don't know about Haskell, but here's a conceptually simple implementation in Scheme:

(define (sum from to)
  (if (>= from to) to
    (+ from (sum (+ 1 from) to))))

you may call it:
(sum 1 10)

There are no hidden variables anywhere other than bindings to formal arguments during function application. Sure, you can argue that the compiler transforms arguments into physical registers in the machine and updates them on the go. But the update is based on the value of a recursive function call, i.e. the result of a computation.

BTW, this implementation is naive: 1) it won't correctly compute (sum n m) for n > m; 2) without tail recursion, the stack will quickly overflow during recursive calls on large ranges. Here's the slightly modified, properly tail-recursive version:

(define (sum from to)
  (define (s f t res)
    (if (>= f t) res
      (s (+ 1 f) t (+ res f))))
  (s from to to))

This doesn't break the stack, ever. The result of the recursive computations is stored in an accumulator parameter (res), which is returned when from >= to. It's not "destructive update" behaviour, but merely parameter passing during successive calls.

(sum 10 10) ;=> 10
(sum 5 1) ;=> 1
(sum 1 5) ;=> 15, as expected

Louis Savain said...

namekuseijin wrote: Need I also go into the Turing-Church thesis about the equivalence of the Turing Machine and Lambda Calculus as computation models?

You may repeat over and over that computers are about mathematics, but it's just plain bullshit. Math is just one of the behaviors that can be performed by a computer. All that Turing Machine crap is beside the point and only has to do with calculation. Computing is not about computation, even if the name 'computer' implies computation. That's just a historical accident. The French seem to have figured this out because they refuse to call them computers. In French, a computer is an 'ordinateur', which means, more or less, an organizer.

Truth is, computing is about behavior. And behavior is about sensing, acting and timing. This means that a computer program is a collection of elementary sensors (comparators), effectors (operators), environment (variable data) and a timing mechanism. That is all. The brain, for example, does not compute. It just behaves. That is, it senses stimuli and responds by activating one or more effectors. No numerical computation is involved.

Two more things. 1) Stop being such a pompous know-it-all. I am not easily impressed, especially when people are showing off. Opinions are a dime a dozen, including yours. You're still a fool, from my vantage point. 2) I can't stand anonymous cowards. Go buy yourself some guts from the nearest meat market and identify yourself. Otherwise, I may just trash this whole interchange just for the hell of it.

namekuseijin said...

Funny you didn't even try to comment on my explanation of how the sum function works.

"You may repeat over and over that computers are about mathematics but it's just plain bullshit."

Computers were first built and given a theoretical basis by mathematicians, and they still work by manipulating numbers alone.

"Math is just one of the behaviors that can be performed by a computer."

Yes, and all the others can be computed by first adequately transforming them into numbers and functions applied to them.

"All that Turing Machine crap is besides the point and only has to do with calculation."

Turing provided the model for any computer built or yet to be built and even proved what they can or cannot compute. Deal with it.

"Computing is not about computation, even if the name 'computer' implies computation. That's just a historical accident."

Oh, yeah! Sure, computers could've been invented by bakers and today we'd have the Bread VM!

"The French seem to have figured this out because they refuse to call them computers. In French, a computer is 'ordinateur' which means, more or less, an organizer."

'Ordenador' in Portuguese. It means sorter, because sorting is and always will be one of the primary and most useful computations performed by a computer. Just ask Google.

"Truth is, computing is about behavior. And behavior is about sensing, acting and timing. This means that a computer program is a collection of elementary sensors (comparators), effectors (operators), environment (variable data) and a timing mechanism. That is all. The brain, for example, does not compute. It just behaves. That is, it senses stimuli and reponds by activating one or more effectors. No numerical computation is involved."

now just look at this:

-elementary sensors (comparators)

That seems to me like a function taking two or more arguments and producing one of them as a result. Or a multiplexer.

-effectors (operators)

It's a function which takes arguments and returns results. In a low-level von Neumann machine, this may mean the result of the computation is put into a register or a set of registers.

-environment (variable data)

function scope.

-timing mechanism

Flow of control: you start with a function and evaluate it step by step.

"Two more things. 1) Stop being such a pompous know-it-all."

Look who's talking...

I can say the same: I am not easily impressed, especially when people are showing off. Opinions are a dime a dozen, including yours. You're still a fool, from my vantage point.

"Go buy yourself some guts from the nearest meat market and identify yourself."

My name's Piccolo Daimao.

RJ Ryan said...

"The main problem with functional programming is that it was created by mathematicians for mathematicians."

You're confused about Erlang.

From Wikipedia:

'[Erlang] was designed by Ericsson to support distributed, fault-tolerant, soft-real-time, non-stop applications.'

'In 1998, the Ericsson AXD301 switch was announced, containing over a million lines of Erlang, and reported to achieve a reliability of nine "9"s.'