Pages

Tuesday, January 30, 2007

Real functional programming or "Why the IO-monad is evil"

Edit: This article contains some errors and wasn't able to transport my intention correctly. So I've created a new version of this article which hopefully contains less errors and is better to read.




What is functional programming? Many people tend to think that having closures in a language makes that language a functional one. But by this definition almost every modern language would qualify as 'functional'. Even Java (because of Javas anonymous inner classes which are closures too). So if this can't really be the qualifying property of a language, what else is it?

Of course it's the "referential transparency": A 'function' is an special kind of relation which assigns values of a domain-set to values of a result-('codomain') set. To qualify as a function this mapping has to be unambiguous: For every element of the domain-set the function always gives the single, same result.

If a 'function' isn't referential transparent, this property isn't fulfilled, so it's not a function anymore. It's something which is often called 'procedure': A block of code which creates a result by applying some algorithm. We can argue if this property really has to be fulfilled rigorously by a language to qualify as 'functional', but I would say 'Yes, it has!'. The reason is that in every language we can use procedures to create functions (just by making sure that there are no side-effects), but those languages aren't still called 'functional' only because the programmer can force himself to use them in a functional way.

If this would be enough, we could also call assembler a functional language, because we can use it to write functional code too. The same would be true for every other paradigm, Assembler would be for example an object-oriented language too (And this is of course true for every Turing-complete language, not only of Assembler).

But with this rigorous definition of the notion 'functional', there aren't many functional languages anymore. Ocaml for example is clearly non-functional now - and even Haskell isn't. So should we maybe lift this requirement a bit? I don't think so.

Why isn't Haskell a functional language? The reason is the IO-monad. To do I/O in Haskell there are functions which aren't referential transparent, because they return a different value at each invocation. If we write

let v = getStr in ...

v is bound to a different value at each invocation. Even if this value is contained in the IO monad, it's still a different value, because we can write code which runs different on each invocation. This creates some kind of avalanche effect which can turn huge parts of the code into simple imperative code - and because of this, we can't talk of Haskell as a functional language. Even if we want to create pure functional code, it's not possible in Haskell in the moment we need I/O (and which program doesn't?).

That's the reason why I consider the IO-monad as 'evil': Haskell relies on it and thus isn't functional anymore.

But is it possible to do I/O in a really functional way? If it's not possible then the idea of functional programming would be deeply flawed and should maybe even abandoned for practical programming. What use is a concept which works only in theory and fails in the moment you want to do something as common as I/O?

But fortunately it is possible. The idea is to put all the external input of a program into a 'world'-value and then call our program with it:

result = app(world)

If 'app' is a command-line application for example, then 'world' would represent all the input the program receives and 'result' would be all the output the program generates. This is quite an easy concept, but how would it work if we need interaction? The idea is to use lazy evaluation: Instead of reading the whole 'world' before calling 'app', 'world' is only evaluated on demand. The same is true for 'result' which is written out in the moment data becomes available and not at the end of the program. So our program could give back a partial result (for example an input prompt), before even requiring any real input.

This would work for GUI-programs too. We can for example put all the mouse/keyboard/etc-events in the 'world' object and 'result' contains all the drawing-commands the applications issues. This approach would create a very distinct structure of the program, because input and output are now clearly separated.

This works well for some problems, but often there is some 'natural' interaction between both. Your simple command-line application may for example open, read and write some data-files, based on commands issued by the user. We can not create a simple function like 'readFile(file_name)' because this 'function' could give different data at each invocation and is thus no function. So how to solve this problem?

The answer is to put all 'external data' into the 'world' object: All files, all system resources etc. Of course this also works with lazy evaluation so those data is not really read and put into a value. The 'world'-value behave just as this is the case. And if we now start our application with all those data in the argument, it would again give the same result as long as the input value is the same too. If we only change a single character in a file which is contained in the 'world', the value isn't the same anymore and so we could get a different result.

The advantage of this approach is that we can limit out 'world-value' to the parts of the system which can be accessed by our 'app'. If we don't want that 'app' can read a file for example, we simply don't put the 'file-system' into the value. This allows very strict security constraints on data access in a very natural way. And our 'app' is of course not a big monolithic function but calls other functions which in turn call again other functions etc.. So we can give those other functions limited subsets of the world-value to limit their access, too.

This approach is truly functional and also very natural: An application simply transforms one 'world-state' into another.

And there is an additional advantage of this approach compared to monads: Higher performance.

To use the plain IO-monad, the language isn't referential transparent anymore. So many of the nice and clever optimization techniques for functional programs can't be used anymore. Using the IO-monad forces a certain order of evaluation and also disallow caching of already calculated values.

But even if you use other monads to simulate state, the result isn't really as fast anymore. The result is that monads are constructors and can only 'give' some value. But how can we create and modify state, if we can only return values and have no 'input'?

With state-monads it works this way: Instead of transforming input-data (which contains the old state) into output-data (which contains the new state), a state monad returns a new program. It doesn't do transformation on the data but on the program itself! In the end this transformation does nothing else then transforming the code into 'transformer-style' code which gets the state as input and creates a new state as output. It does this by putting chunks of code into closures and chaining them together using the 'bind'-function. So if you use a state monad in Haskell, the state monad does nothing more than a code-transformation and then evaluating this code which in turn chains a state-value thru the invocations.

But this transformation and evaluation is done at runtime! Sure, sometimes it can be cached or even unrolled at compile-time, but only in rare cases. The reason is that the 'to transformed' code is dependent on the input values of a function which can (for example with a if-then-else construct) create a different kind of code every time. The Haskell compiler can't resolve this in most non-trivial cases and thus the whole code-transformation and evaluation has do be redone over and over again. This costs time and also often even kills tail recursion. So by using a state monad we often have much less efficient code than by using the transformer based approach directly

A language which does I/O by the transformer approach (but in a different way as proposed here) is Clean, and if you look the benchmarks at the shootout, it seems clear that there is really a performance advantage here.

And whats the disadvantages? Of course there are some, or else nobody had even considered to use monads instead.

The main advantage of monads is that they are in principle code transformers. With monads we can create embedded DSLs, similar to Lisp-macros (But since monads do this code transformation at runtime, it costs time and memory, while Lisps macros are evaluated by the compiler before creating the real code). Having the ability to create new 'embedded languages' has some appeal on many people. With the transformer approach this isn't possible - but of course there are still other ways to do this, if it's really wanted.

The second disadvantage is that the transformer approach requires an additional parameter in each invocation. Instead of writing

do
a <- getStr
b <- getStr
putStr $ a ++ "," ++ b

we have to write

let
(a, state) = getStr state
(b, state) = getStr state
state = putStr state (a ++ ", " ++ b)
in
state

instead. This look pretty obfuscated compared to the Haskell approach. But the reason is that Haskell has syntactic support for monads but none for the other approach. If we had to write the first example without the syntactic support the 'do' construct provides, it would look even uglier than the second one (just try it, it's also a good practice if you are new to monads).

So why not simply add syntactic support for the transform-approach too? What about this:

with state let
a <- getStr
b <- getStr
putStr a ++ ", " ++ b

This is quite short too, and the 'with' syntax can be used by the compiler to use the function type to chain 'state' it thru all calls which have a parameter with a matching type and also return a value of the same type.

The last disadvantage is that the transformation approach need additional semantic support. Why? Because lazy evaluation alone isn't enough. We often need a certain order of evaluation to read and write data in a certain required order. Often this order is 'automatically correct', but sometimes it isn't.

For example (without the above proposed syntax to make the problem more clear):

let
state = putStr state "Please enter your name: "
(state, name) = getStr state
state = putStr state ("hello '" ++ name ++ "', please enter your age: ")
(state, age) = getStr state

If we evaluate this statements, the program will write

Please enter your name: hello '

and wait for input. Why? Because the 'getStr state' line isn't evaluated before 'name' is actually required. That's the funny thing with general lasy evaluation: Sometimes the evaluation-order can be quite unexpected.

How so solve this problem? There are multiple solutions. First we can replace the '++' after "hello '" by a strict variant which requires evaluation of both parameters before it continues. This would force the program to the wanted behavior, but it would also require additional thinking by the programmer.

A better way would be to create an internal dependency between data. For example 'putStr' and 'getStr' would create and check a dependency on 'state' which would force evaluation of all previous 'putStr' before a 'getStr' can occur (and vice versa). This would only fore evaluation order into a certain form, but each of the functions would remain referential transparent. But there has to be some compiler support to support this feature.

So I/O is possible, in a totally functional way and with some support from the language even in a relatively simple way. Without monads we lose the possibility to create embedded languages, but I think that this isn't really a big disadvantage. But for I/O monads aren't necessary and even harmful.

Saturday, January 06, 2007

What makes a programming language 'more productive'

First: What does 'productive' mean? I would loosely define it in the context of programming as "Given two equally skilled programmers or teams of programmers, the less time they need to create and maintain a program for a given problem, the more productive is the implementation language".

This definition may have some holes in it, but I first want to make clear, that productivity for me is not an abstract thing like "code beauty" or "conciseness", its simply outcome oriented. If you're able to create a program in a consistently shorter time by using a certain language or framework compared to another, I think it's a good idea to consider it as 'more productive'.

What's the secret of productivity? In fact its utterly simple: Code reuse.

That's it? Code reuse? What about powerful abstractions, concise code or clean and powerful language design? Doesn't matter (unless it leads to better code reuse)! Let me explain:

If you have to solve a problem you have to think about it first. This takes a lot of time. You also have to think about it while you're implementing a solution and later while debugging or maintaining your solution. Thinking about the problem generally requires the most part of the time in software development. There's only one way a programming language can solve it: By providing an existing solution. Sure, there are simple cases where you only have to type a simple solution in without thinking about it much. But even then: If you don't have to write the code yourself and reuse existing code it's simply the fastest way to a working result. Existing code is

It doesn't even matter how powerful a language is: If you can build your solution on existing code with only small and easy modification, you will be faster then using the most powerful language. Also existing code is already debugged and tested. And if it's from a 3rd party developer you don't even have to maintain it.

But code reuse isn't always identical to 'using a framework' or 'importing a lib'. Sometimes it's build right into the language: When I switched from C++ to Java I experienced a huge productivity gain. A big contributor to this was garbage collection. Instead of thinking about memory allocation, when to release an object, how to write a fast custom allocator, allocating data on the stack or on the heap etc. I no could allocate objects without thinking much about it. I could simply consider the problem as 'solved' by the language.

Something like this happens on many more occasions: I can create classes in C, but by having the language do it like in C++ or Java, I can consider the problem 'solved'. In assembler I have to create function calls and local variables myself - in a higher level language this problem is again solved. All this is code reuse: Somebody has looked at common programming situations ('patterns') and created a universal solution for it. Sometimes this solution is directly incorporated into the language, more often it's put into a library. But nearly always both ways are possible, so it doesn't make sense to call code reuse which is part of a language a 'abstraction' and code reuse from a library simple 'code reuse'.

So the reason why certain languages are more 'productive' that others is code reuse. C is more productive than assembler because we don't have to think about allocating variables or building stack-frames. By having the problem solved we don't need to think about it, we don't need to implement it again and again and we don't need to search for bugs which result from making mistakes by doing it ourself.

Now we can try to identify the reasons why certain languages are more productive than others, and why sometimes even more powerful looking languages don't deliver.

Lets first look at Java. I mentioned that my productivity increased a lot after switching from C++ to Java. Besides the already mentioned garbage collection, the huge number of available libraries are another part of the reason. A interesting question is: Why are there so many libs for Java? There are other languages which had huge commercial support but never had as many reusable code as Java. Why?

If I want to reuse code, the code has to fit into my existing code. But more important: It has to fit into the code of other vendors. If I create a program which uses 5 different libs and some 1000 lines of my own code, writing my own code in a way that it fits to the libs is possible - but this doesn't solve the problem of fitting those 5 libs together if all those are written independently. To make code 'fit' it has to use the same conventions, similar interfaces and standards. This works much better if the language enforces it.

One example is garbage collection: When I used C++ there where multiple methods of doing memory allocations. Many used reference counting, but since there was no standard implementation, each lib used their own one. But how can you combine two libs which both have their own ref-counting scheme? In Java this was a non-problem because the language has solve the gc problem.

But there are other examples. Like having standard libs for most of the basic stuff. If two libs use their own array implementations you can't simply use an array from one lib in the other. But if the standard libs provide something it's unlikely that every vendors creates it's own solution.

But it continues on a higher level: If two libraries use different methodologies to solve a similar problem you will get an impedance mismatch if you try to use them together. Maybe one is written in a more procedural and the other in a more object oriented way: Now you need lots of boilerplate code to make such code work together which in turn makes reuse harder.

Java tackled this problem by removing abilities from the language to enforce a certain way of programming. If there is only one sensible way to do something (because other ways are artificially made much more difficult) the programmer may curse the language for this in the moment, but at a later time he may be happy about it because it allowed him to reuse the code in a different application. And that possible because the language enforced a certain way of solving things.

And if independent programmer creates libraries without even knowing in the moment who will use the libraries later, it can really help if they are forced to use a certain methodology, even if it may hurt at the moment.

So while Java really has it's downsides, in the regard of code reuse it really made a lot of progress compared to many earlier languages. So if we want to create better languages we always have to consider this lesson we learned from Java: Having a powerful language alone isn't enough to gain productivity if the language neglects code reuse. Even a less powerful language can fly ahead a more powerful one if it encourages and eases the creation of reusable code.

But time has moved ahead and many ask if its possible to reach the goal of better code reuse AND have a more powerful and more concise language than Java? I'm sure it is, but only as long as we have the idea of code reuse in mind. Creating a language with only 'clarity', 'conciseness' or 'power' in mind isn't enough, we always have to think about how it's possible to enforce the creation of reusable code in this language. Yes, we need to enforce it. Not because programmers are stupid or deliberately write un-reusable code, but because only by enforcing we can be sure that two teams of developers who don't know about each other can create code which will fit together later if reused by a third team of developers. We simply need rules to make their work fit together.

But this leads immediately to a conclusion: Multi-paradigm-languages won't work. While it looks as a good idea to give programmers more freedom to express their ideas, this in turn leads do code which is to different to fit together.

(I suspect that this is the prime reason why Lisp never made a real breakthrough - but also why there are some success stories with Lisp. If you don't need to reuse code (for example if you work in a new field where simply no code exists) Lisp can give you a big productivity gain and create a nice success story. But if a language like Java can play out it's code-reuse card than the gains are irrelevant because the Java programmer simply puts some libs together while the Lisp developer is still thinking about which libs there are on the market and if it's possible to integrate them into one design or do a new implementation instead).

But multi-paradigm isn't the only 'no-no'. Making languages to customizable is another one: If a language can be easily customized by macros, reflection, template meta-programming etc., this can also reduce code reuse: If we want to integrate two libraries which both use and rely on lots of those customizations, it's quite probable that those customizations won't fit together. It can work but often it won't.

This is not only true for 'real' meta-programming like macros or reflection, it can also happen with 'to flexible abstractions'. Lets take a short look at Haskell's monads: They are very powerful - but this leads to problems. If you have code which uses a A-monad and want to fit it together with code which runs in a B-monad, you get problems. And if some 'monad-free' code requires to be run into some monad later you maybe have to rewrite it completely, even if only a very small portion need access to the monad. This can be quite annoying if you have to rewrite your own code - but if you have to reuse 3rd party code it can even render it impossible.

The problem here is not the monad concept itself, its the choice you have to use it or not. The choice creates the possibility to go different way - and using the wrong way can lead to lots of rewrites or reimplementations you have to do instead of simply reusing existing code.

So the secret of successful code reuse is to "removing choice". But thats something which seems unswallowable for many programmers, especially those who consider them selfs 'hackers'.

If you are one of those, let me ask you a question: Would you like a game like chess more if there are no fixed rules and you could do everything? I doubt it, the fixed rules are just the reason why chess is interesting: You have to solve problems in the context of a fixed and rather limited set of rules. If you could simply win by hitting your opponent over the head with a club, chess would loose lots of it's appeal, wouldn't it? So just look at the 'choice problem' in a different way: If there is a limited set of ways to solve a problem, can't this not even makes it more interesting to solve it?

And to the language designers: Isn't it an interesting problem to create features which are expressive AND lead the programmer in a certain direction? Simply putting everything into a language is not difficult (think of Homer in the Simpsons Episode where he designed a car). But it's also simply just to remove things: Creating a language based on a single easy concept is simple too. And a dynamically typed language is much more simple to design than a language with a good static type system. If you really like the challenge then design something new, something different, something difficult.

A language which allows for good code reuse don't have to be simple, it has to force the user to solving problems in a certain way without limiting him to much. This sound like a contradiction and yes it is, but thats the difference between theory and practice: We always have to do compromises or we create things which are good in theory but unusable in practice.