Tuesday, January 30, 2007

Real functional programming or "Why the IO-monad is evil"

Edit: This article contains some errors and didn't convey my intention well. So I've created a new version of this article which hopefully contains fewer errors and is easier to read.

What is functional programming? Many people tend to think that having closures in a language makes that language a functional one. But by this definition almost every modern language would qualify as 'functional' - even Java (because Java's anonymous inner classes are closures too). So if this can't really be the qualifying property of a language, what else is it?

Of course it's "referential transparency": A function is a special kind of relation which assigns values of a domain-set to values of a result-set (the 'codomain'). To qualify as a function, this mapping has to be unambiguous: for every element of the domain-set the function always gives the single, same result.

If a 'function' isn't referentially transparent, this property isn't fulfilled, so it's not a function anymore. It's something which is often called a 'procedure': a block of code which produces a result by applying some algorithm. We can argue whether this property really has to be fulfilled rigorously by a language to qualify as 'functional', but I would say 'Yes, it has!'. The reason is that in every language we can use procedures to create functions (just by making sure that there are no side-effects), but those languages still aren't called 'functional' just because the programmer can force himself to use them in a functional way.

If this were enough, we could also call assembler a functional language, because we can use it to write functional code too. The same would be true for every other paradigm: assembler would, for example, be an object-oriented language as well (and this is of course true for every Turing-complete language, not only assembler).

But with this rigorous definition of the notion 'functional', there aren't many functional languages left. OCaml for example is clearly non-functional now - and even Haskell isn't. So should we maybe relax this requirement a bit? I don't think so.

Why isn't Haskell a functional language? The reason is the IO-monad. To do I/O in Haskell there are functions which aren't referentially transparent, because they return a different value at each invocation. If we write

let v = getStr in ...

v is bound to a different value at each invocation. Even if this value is wrapped in the IO monad, it's still a different value, because we can write code which runs differently on each invocation. This creates a kind of avalanche effect which can turn huge parts of the code into plain imperative code - and because of this, we can't call Haskell a functional language. Even if we want to write pure functional code, it's not possible in Haskell the moment we need I/O (and which program doesn't?).

That's the reason why I consider the IO-monad 'evil': Haskell relies on it and thus isn't functional anymore.

But is it possible to do I/O in a really functional way? If it's not possible, then the idea of functional programming would be deeply flawed and should maybe even be abandoned for practical programming. What use is a concept which works only in theory and fails the moment you want to do something as common as I/O?

But fortunately it is possible. The idea is to put all the external input of a program into a 'world'-value and then call our program with it:

result = app(world)

If 'app' is a command-line application, for example, then 'world' would represent all the input the program receives and 'result' all the output the program generates. This is quite a simple concept, but how would it work if we need interaction? The idea is to use lazy evaluation: instead of reading the whole 'world' before calling 'app', 'world' is only evaluated on demand. The same is true for 'result', which is written out the moment data becomes available and not at the end of the program. So our program can give back a partial result (for example an input prompt) before even requiring any real input.
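
As a rough sketch of this idea (the names here are hypothetical, and the 'world' is simplified to a list of input lines):

```haskell
-- A toy command-line 'app' as a pure function from its lazily
-- evaluated input 'world' (a list of input lines) to its 'result'
-- (a list of output lines). Because (:) is lazy, the prompt is
-- available before any input line is demanded.
app :: [String] -> [String]
app world = "Please enter your name: " : greet world
  where
    greet (name:_) = ["Hello, " ++ name ++ "!"]
    greet []       = []

-- Hook it up to the real world; 'interact' feeds stdin in lazily.
main :: IO ()
main = interact (unlines . app . lines)
```

Note that the head of the result list is available without evaluating the input at all - the input is only forced when 'greet' demands it, which is exactly the on-demand behavior described above.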

This would work for GUI programs too. We can for example put all the mouse/keyboard/etc. events into the 'world' value, and 'result' contains all the drawing commands the application issues. This approach would give the program a very distinct structure, because input and output are now clearly separated.

This works well for some problems, but often there is some 'natural' interaction between both. Your simple command-line application may for example open, read and write some data files, based on commands issued by the user. We cannot create a simple function like 'readFile(file_name)', because this 'function' could return different data at each invocation and is thus not a function. So how do we solve this problem?

The answer is to put all 'external data' into the 'world' value: all files, all system resources etc. Of course this also works with lazy evaluation, so the data isn't really read and stored in a value up front - the 'world' value just behaves as if this were the case. And if we now start our application with all this data in the argument, it will again give the same result as long as the input value is the same. If we change only a single character in a file which is contained in the 'world', the value isn't the same anymore and so we may get a different result.

The advantage of this approach is that we can limit our 'world' value to the parts of the system which may be accessed by our 'app'. If we don't want 'app' to be able to read a file, for example, we simply don't put the 'file-system' into the value. This allows very strict security constraints on data access in a very natural way. And our 'app' is of course not one big monolithic function but calls other functions, which in turn call other functions, etc. So we can give those other functions limited subsets of the world-value to limit their access too.
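
A minimal sketch of this access-limiting idea (all names here are hypothetical): instead of handing a helper function the whole world, we pass only the slice it is allowed to see.

```haskell
import Data.Maybe (fromMaybe)

-- The whole world, of which the toy 'file-system' is only one part.
data AppWorld = AppWorld
  { files :: [(FilePath, String)]
  , args  :: [String]
  }

-- This helper only ever receives the file list, so it cannot
-- touch the rest of the world:
readFileIn :: [(FilePath, String)] -> FilePath -> String
readFileIn fs path = fromMaybe "" (lookup path fs)

-- 'app' decides which subset of the world its helpers may access.
app :: AppWorld -> String
app w = readFileIn (files w) "config.txt"
```

The type of 'readFileIn' documents its access rights: it can read files, and nothing else.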

This approach is truly functional and also very natural: An application simply transforms one 'world-state' into another.

And there is an additional advantage of this approach compared to monads: Higher performance.

With the plain IO-monad, the language isn't referentially transparent anymore, so many of the nice and clever optimization techniques for functional programs can't be used. Using the IO-monad forces a certain order of evaluation and also disallows caching of already-calculated values.

But even if you use other monads to simulate state, the result isn't really as fast. The reason is that monads are constructors and can only 'give' some value. But how can we create and modify state if we can only return values and have no 'input'?

With state monads it works this way: instead of transforming input data (which contains the old state) into output data (which contains the new state), a state monad returns a new program. It doesn't transform the data but the program itself! In the end this transformation does nothing else than turning the code into 'transformer-style' code which gets the state as input and produces a new state as output. It does this by putting chunks of code into closures and chaining them together with the 'bind' function. So if you use a state monad in Haskell, it does nothing more than a code transformation and then evaluates this code, which in turn threads a state value through the invocations.
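
To make this concrete, here is a minimal hand-rolled state monad (a sketch in the spirit of Haskell's Control.Monad.State, not the library itself): 'bind' does its work by building closures that thread the state value through at runtime.

```haskell
-- A state-monadic computation is a closure from a state to a
-- result plus a new state.
newtype State s a = State { runState :: s -> (a, s) }

-- 'bind' wraps both computations in a new closure which passes
-- the state coming out of the first into the second.
bind :: State s a -> (a -> State s b) -> State s b
bind m f = State $ \s -> let (a, s') = runState m s
                         in runState (f a) s'

ret :: a -> State s a
ret a = State $ \s -> (a, s)

-- A tiny stateful action: return the counter and increment it.
tick :: State Int Int
tick = State $ \n -> (n, n + 1)

-- Chaining two ticks: the state value is threaded through the
-- closures built by 'bind', invisibly to the caller.
twoTicks :: State Int (Int, Int)
twoTicks = tick `bind` \a -> tick `bind` \b -> ret (a, b)
```

Running 'twoTicks' from an initial state of 0 yields the pair of counter values and the final state, all produced by the chained closures.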

But this transformation and evaluation is done at runtime! Sure, sometimes it can be cached or even unrolled at compile time, but only in rare cases. The reason is that the code to be transformed depends on the input values of a function, which can (for example with an if-then-else construct) create a different kind of code every time. The Haskell compiler can't resolve this in most non-trivial cases, and thus the whole code transformation and evaluation has to be redone over and over again. This costs time and often even kills tail recursion. So by using a state monad we often get much less efficient code than by using the transformer-based approach directly.

A language which does I/O via the transformer approach (though in a different way than proposed here) is Clean, and if you look at the benchmarks at the shootout, it seems clear that there really is a performance advantage here.

And what are the disadvantages? Of course there are some, or else nobody would ever have considered using monads instead.

The main advantage of monads is that they are in principle code transformers. With monads we can create embedded DSLs, similar to Lisp macros (but since monads do this code transformation at runtime, it costs time and memory, while Lisp macros are evaluated by the compiler before the real code is created). Having the ability to create new 'embedded languages' appeals to many people. With the transformer approach this isn't possible - but of course there are still other ways to do it, if it's really wanted.

The second disadvantage is that the transformer approach requires an additional parameter in each invocation. Instead of writing

a <- getStr
b <- getStr
putStr $ a ++ ", " ++ b

we have to write

(a, state) = getStr state
(b, state) = getStr state
state = putStr state (a ++ ", " ++ b)

instead. This looks pretty obfuscated compared to the Haskell approach. But the reason is that Haskell has syntactic support for monads but none for the other approach. If we had to write the first example without the syntactic sugar the 'do' construct provides, it would look even uglier than the second one (just try it - it's also good practice if you are new to monads).
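
For reference, here is roughly what that exercise yields. Since 'getStr' is not a standard Haskell function, this sketch takes the two actions as parameters and works in any monad; with the real IO monad you would call it as 'pairUp getLine putStr'.

```haskell
-- The do-block from above, desugared into explicit '>>=' chains:
--   do a <- get; b <- get; put (a ++ ", " ++ b)
pairUp :: Monad m => m String -> (String -> m ()) -> m ()
pairUp get put =
  get >>= \a ->
  get >>= \b ->
  put (a ++ ", " ++ b)

main :: IO ()
main = pairUp getLine putStr
```

Every '<-' in the do-block becomes a '>>=' with a lambda, which is exactly the closure-chaining that 'bind' performs.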

So why not simply add syntactic support for the transform-approach too? What about this:

with state let
a <- getStr
b <- getStr
putStr a ++ ", " ++ b

This is quite short too, and the 'with' syntax tells the compiler to use the function types to thread 'state' through all calls which take a parameter of the matching type and also return a value of that type.

The last disadvantage is that the transformer approach needs additional semantic support. Why? Because lazy evaluation alone isn't enough. We often need a certain order of evaluation to read and write data in a certain required order. Often this order is 'automatically correct', but sometimes it isn't.

For example (without the above proposed syntax to make the problem more clear):

state = putStr state "Please enter your name: "
(state, name) = getStr state
state = putStr state ("hello '" ++ name ++ "', please enter your age: ")
(state, age) = getStr state

If we evaluate these statements, the program will write

Please enter your name: hello '

and wait for input. Why? Because the 'getStr state' line isn't evaluated before 'name' is actually required. That's the funny thing with general lazy evaluation: sometimes the evaluation order can be quite unexpected.

How to solve this problem? There are multiple solutions. First, we can replace the '++' after "hello '" with a strict variant which requires evaluation of both parameters before it continues. This would force the program into the wanted behavior, but it would also require additional thinking by the programmer.
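
A sketch of such a strict variant (the name 'appendStrict' is hypothetical), using Haskell's 'seq' to demand the spines of both strings before the concatenation is returned:

```haskell
-- Forces both arguments before returning the concatenation, so
-- "hello '" cannot be written out before 'name' - and with it the
-- preceding 'getStr' - has been evaluated.
appendStrict :: String -> String -> String
appendStrict xs ys = length xs `seq` length ys `seq` (xs ++ ys)
```

Used in the example above, "hello '" `appendStrict` name would make the whole output string depend on the user's input being read first, restoring the intended order.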

A better way would be to create an internal dependency between data. For example, 'putStr' and 'getStr' would create and check a dependency on 'state', which would force evaluation of all previous 'putStr's before a 'getStr' can occur (and vice versa). This would only force the evaluation order into a certain form, but each of the functions would remain referentially transparent. There has to be some compiler support for this feature, though.

So I/O is possible in a totally functional way, and with some support from the language even in a relatively simple way. Without monads we lose the possibility to create embedded languages, but I think that this isn't really a big disadvantage. For I/O, however, monads aren't necessary and are even harmful.


Pete said...

I'm relatively new to FP so I admit that what you've written leaves me somewhat baffled.

It seems to me that the IO Monad is basically your "world" set up differently. Please explain (short words, for us simpletons) how this is not so.

As for your final example and suggestion that the execution order is not clearly defined...I don't think that is true. What you are really suggesting in your example is something like:

state' = putStr state "Please enter your name: "
(state'', name) = getStr state'
state''' = putStr state'' ("hello '" ++ name ++ "', please enter your age: ")
(state'''', age) = getStr state'''

Since the value returned from each function invocation must be distinct for this to be an IO-style function (otherwise the first '(state, name) = getStr state' and the second '(state, age) = getStr state' would have to return the same value, since they both take exactly the same input), it isn't valid for the same state object to be re-used - each reference must be a new value.

So this suggests that strictness is not necessary as the sequencing is done through the threading of state, state', etc. Thus it is "state" and its derivatives that force evaluation and not 'name' or 'age'.

Again, if I'm missing something here please explain it.

I don't see how this idea is essentially different from the IO Monad in Haskell. Possibly performance is better. I can only suspect that the Monad (and I definitely don't understand them well enough yet) has significant advantages in structuring, which is why it has been elevated to a higher status in Haskell.

Regards and thanks for the article!

Karsten Wagner said...

The IO monad is not the 'world', because you can't put anything into functions using it. A function like getStr doesn't need any input, it only returns a value. What 'getStr' really is, is a statement in an imperative sub-language created by the IO monad. The 'getStr' function doesn't return the string value, it returns a 'statement' which is executed later - and this statement then returns the string value.

This is as if we wrote a program in Java, used a compiler to translate it into a pure functional program (which is of course possible by using the state-chaining approach) and then said "Java is a functional language", because the code is just an imperative description of our functional program.

But this is obviously nonsense. Java is not a functional language - and the sub-language implemented by the IO monad is not a functional language either. But because we need this imperative sub-language to do output in Haskell, Haskell isn't a functional language anymore, because we can't do anything 'real' in Haskell without using this sub-language.


About the last example: You're right here.

I first had the following code for the example:

addNumbers ins = doIt ([], 0) ins

doIt (out, act) (v:ins) = ("Number (" ++ show act ++ "): " ++ new_out, res)
  where (new_out, res) =
          if v == "." then (out, act)
          else doIt (out, act + (read v) :: Int) ins

If you call 'addNumbers ["1", "x", "10", "."]' you get the output, but with the mentioned problem. But here input and output are separated, and to force the right order I need a strict '++' after "Number (".

I used this example in the text first, but found it too complex, and so I wrote a new one without thinking enough about it first. Thank you for the correction.

Adam Ierymenko said...

What about concurrency? What if an element of world changes *after* it is lazily evaluated and then needs to be lazily evaluated again? Doesn't this still break the functional model?

Absolutely pure FP does indeed seem to have a problem. I suspect this is why FP is not in general use and why functional languages tend to take on non-functional hacks for things like IO.

John said...

pete - the idea is that the "world" is a script of everything that will happen to the program. So by definition, every time you run the program with the script, it will produce the exact same output.

An IO monad is different - it's just a door through which the user shoves information. There's no "guarantee" that the result will be the same.

Having said that, Karsten, this whole article seems hopelessly pedantic, and this approach would cause far more problems than it would solve. There's a reason people don't write programs in lambda calculus - because syntactic sugar is incredibly valuable for getting things done.

I do appreciate, however, the diligent thought, creativity and philosophical passion you used to write this.

nmessenger said...

It may interest you to know that your world-transforming recommendation is pretty much GHC's implementation of IO. No, not just "pretty much" - you have implemented it. Check out the Haskell wiki article "IO inside".

A value of type 'IO a' is a function that takes the World, transforms it, and results in both an 'a' and the transformed World (i.e. type IO a = World -> (a, World)). Try it right now: go into ghci and type ':i IO'. It uses some funny-looking primitive types for performance reasons, but is effectively equivalent. Any claim that Haskell's IO monad is "evil" or not referentially transparent would apply to your proposal too. They are the same!

Also, IO's bind operator produces just the nice syntax you want for your explicit World-passing code, using plain-jane Haskell.

Karsten Wagner said...

@Adam Ierymenko:

> What about concurrency? What if an element of world changes *after* it is lazily evaluated and then needs to be lazily evaluated again? Doesn't this still break the functional model?

No, because the 'world' can't change 'backward in time'. If you have a dynamic 'world', then you wouldn't use a static snapshot but a list of events instead. Or you transform a series of snapshots of the world into some output. Both methods are possible; which is better depends on the problem.

> Absolutely pure FP does indeed seem to have a problem.

Yes, it seems to have a problem, but I think it is solvable and we don't need hacks like the IO-monad. IMO Haskell went astray in this regard, and that's why I wrote the article. I know that there are still 'holes' in the solution proposed here, but to bring functional programming forward it's necessary to think about it.

The solution the Clean language uses to solve the problem is IMO more functional. Again not totally - but with some small changes it probably could be.


The IO-monad represents an imperative sub-language. If you use it, you've simply left 'functional-land'. Having this in a language like Haskell is like a confession that functional programming is unable to handle things like I/O. You may think that this is a pedantic point of view, but I think it's necessary to solve the problem instead of simply falling back into the old habits of imperative programming. And I'm not against 'syntactic sugar' - I even proposed a syntax to ease the threading of state.


It doesn't really matter how it is implemented; what matters is how you use it. Like I wrote in my previous comment: you can create a compiler which translates Java code into pure functional code. But Java is still imperative, even if it's possible to translate it into a pure functional language - and because of this you also lose the advantages of functional programming. With the IO-monad it's the same.

Anonymous said...

Because any turing-complete language can be translated into any other turing-complete language, all turing-complete languages are the same?

The value of getStr :: IO String is the same on every invocation (actually, it doesn't have invocations, since it's a constant). It's always an IO action that will pass the string given by the user at that point in the program to the next function passed to bind. The value of an IO action is not the same as the value of what it passes to the next function, any more than the value of a list is the value at its head.

Try to forget you ever heard the words "imperative sublanguage". IO is a monad because it's a pain to pass the world to every IO function. That is its purpose; that it separates IO and pure functions is a consequence of this purpose and of Haskell's type safety, not a reason by itself to use it for IO.

Anonymous said...

You're quite right that passing a world around like this lets you write a purely functional program that talks to the outside world. It seems you have reinvented the idea yourself, which is pretty impressive, but you haven't quite gotten around to realizing that the IO monad is referentially transparent for the same reason. Basically, IO a is just an abstract synonym for functions taking a world and returning a new world along with a value of type a, and the '>>=' hides the world-passing. It's nicely explained at the beginning of "Tackling the Awkward Squad".

Karsten Wagner said...

> Because any turing-complete language can be translated into any other turing-complete language, all turing-complete languages are the same?

I tried to point out the exact opposite: because we can do this translation with every language, we have to acknowledge that the IO-monad creates a sub-language which is not a functional one, even if it is maybe translated into pure functional code somewhere. So to solve I/O in Haskell, the solution is not a functional one but simply an imperative one. And because of this, Haskell has no solution for the problem of doing I/O in a functional way; it created an imperative DSL to do it.

Monads are sub-languages: the Maybe-monad creates an if-then-else sub-language, the list-monad creates a list-comprehension sub-language, the IO-monad creates an imperative I/O sub-language, etc.
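
As an illustration of the first claim, the Maybe-monad's 'bind' really is just a branch on the previous result (written out by hand here; Haskell's built-in '>>=' for Maybe behaves the same way):

```haskell
-- Each 'bind' is an if-then-else: stop on Nothing, continue on Just.
bindMaybe :: Maybe a -> (a -> Maybe b) -> Maybe b
bindMaybe Nothing  _ = Nothing   -- the 'else' branch: short-circuit
bindMaybe (Just x) f = f x       -- the 'then' branch: next 'statement'

safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)

-- Reads like an imperative sequence with implicit error checks;
-- the division by zero aborts the whole chain.
calc :: Maybe Int
calc = safeDiv 10 2 `bindMaybe` \a ->
       safeDiv a 0  `bindMaybe` \b ->
       Just (a + b)
```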

But this is totally different from the state-chaining approach, where we can always 'break out' of the standard usage pattern. It's like embedding Java into Haskell. This is entirely possible by creating a monad (even if the syntax won't be the same as the original Java, the semantics can be exactly the same). If we now program something in this embedded Java, can we say that we have 'written a program in a functional language'? No, of course not. Even if we used the Haskell parser to parse the code and the Haskell compiler to eventually execute it, in the middle we still have written our code in Java. And by using the IO-monad we're doing the same - not with Java, of course, but with a DSL created for I/O. This language is imperative, and so we're simply not using a functional language anymore.

The IO value doesn't contain the 'world'; it contains an operation on the world. For example, getStr is a constant of type "IO String", which behaves like "world -> (world, String)". So getStr returns a function which takes the 'world' and returns a 'world' and a String. This is the 'statement' I mentioned above. With the 'bind' function those statements are now bound together and maybe even transformed - but still not necessarily executed. So the result of the 'do' block is a structure which describes our imperative program, which is then executed in a way similar to manually threading the 'world' through the calls.
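
A toy model of this description (all types here are hypothetical; GHC's real IO uses an opaque primitive state token instead of an explicit record): getStr is a value of type 'world -> (world, String)', and 'bind' chains such statements together without executing anything.

```haskell
-- A toy 'world' with pending input lines and accumulated output.
data World = World { pending :: [String], written :: [String] }

-- An 'IO a' statement in this model: a world-transformer.
type Action a = World -> (World, a)

getStr :: Action String
getStr w = (w { pending = tail (pending w) }, head (pending w))

putStr' :: String -> Action ()
putStr' s w = (w { written = written w ++ [s] }, ())

-- 'bind' composes statements; nothing runs until a World is supplied.
bind :: Action a -> (a -> Action b) -> Action b
bind m f = \w -> let (w', a) = m w in f a w'

greet :: Action ()
greet = getStr `bind` \name -> putStr' ("hello '" ++ name ++ "'")
```

'greet' is itself just a world-transformer: a description of an imperative program which only 'happens' when someone finally applies it to a World.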

But this is not the same as doing it yourself, because we don't have access to the 'world' anymore. It's contained in the IO value and used by the compiler to do some tricks to compile the program more efficiently.

But without access to this world value, we have no other way of programming I/O than by using this embedded imperative programming. For example, I can't store the 'world' somewhere, do some operations on it and then put it back into the old state if an operation fails.

George Moschovitis said...

From what I understand, the monad is just some syntactic sugar around the 'world' concept. They are the same thing.

bd_ said...

George, syntactic sugar is half of it. The other half is making sure you don't, for example, copy the world and in one thread of the world ask X, read Y, while in the other you ask Y, read Z. Since we can't currently fork() the world, the IO monad takes steps to prevent this kind of copying from happening.

Maximilian Claus said...

I read your article, and it seems that you did not understand monads. Haskell is as pure as Clean (and does not rely on a strange addition to its type system).

In general, monads do nothing else than make the world parameter implicit by abstracting it with a combinator (">>="). At the same time, this guarantees single-threadedness, for which Clean needs its uniqueness typing.