I realized two important things the day before yesterday. The first is that I have programmed exclusively on Libaudioverse since the summer started. The second is that I know how to make an interpreter and, quite possibly, a compiler. Coupled with my interests (contributing to projects like Pypy, getting into grad school) and all the people who have said that compilers are a core part of CS, I've bumped it up my list a bit. My college does not give us the opportunity to take a class, so it's time to make my own.

To that end, I decided to finally learn Haskell and spent all of yesterday doing so. This has also been something on my list for a very long while. Basically, I was burned out with Libaudioverse, so it was time to do something else. Haskell is very unusual and also potentially very useful. I was actually quite impressed by how it was able to get at the core of some ideas, but there are some gotchas that prevented the code from being as clean as it could have been. I've got a commented implementation of binary search trees here and want to talk about just how strange Haskell is and why I'm actually very glad I'm learning it.

Please note that I am trying to convert some very deep ideas to English and am losing a lot of the elegance in the process. We can make nice analogies with imperative languages and real-world situations; this is not so with functional languages. Functional languages require working with them to understand them, in much the way that people say you shouldn't learn a foreign language by mapping it to your native tongue. If this post seems confusing or somehow lacking in detail, the reason is that the ideas in Haskell do not map well to English.

Go read my commented example. Disclaimer: this is learning code and not production code. It may have bugs I haven't spotted.

Haskell is a lazy functional programming language.

Let's take the functional part of that first. A functional programming language goes beyond first-class functions. In imperative languages, the emphasis is on loops and mutable data structures and your program executes from top to bottom, left to right. In a functional language, more emphasis is placed on the function as the unit of algorithmic description: loops can be replaced with recursion, and you don't typically worry about overflowing the stack. Haskell is a purely functional language, meaning that it is impossible to fall into the imperative style of programming: there are no for or while loops and if is an expression and not a statement. Furthermore, variables are not typical-they take after their mathematical counterparts in that you can't change them once assigned. We are in fact talking about assign in the mathematical sense; to that end, you can declare them after you use them and everything is fine.

In my opinion, laziness is actually the harder concept in practice, though not in theory. A language supporting laziness allows one to assign variables to expressions. Rather than evaluating the expression immediately, however, a language supporting laziness will only evaluate the expression if its value is actually needed somewhere, and it will wait until just before the value is needed to do it. In Haskell, everything is lazy. In other languages such as Scala, you declare it explicitly.

When you combine these two things together, you actually end up with a very rare thing: Haskell's order of execution is not described by the order of statements in a file, but rather by the dependencies between them. If, in some pseudo-codish language, I have the following three lines:

x = foo()
y = bar()
z = bas()


Then, if this were valid Haskell,the compiler would be free to execute them in any order. It probably wouldn't actually execute any of them at all, unless the program actually went and used x, y, and z. But consider the following:

x = foo()
y = bar(x)
z = bas(y)


Then making use of z requires bas which requires y, requiring bar, etc. But it's more complex-if the functions never use their parameter, it's possible that bas never triggers bar and that bar never triggers foo. Haskell is going to wait as long as possible to actually evaluate anything, so x, y, and z are just records that say how to get their values until the first time they're needed. Furthermore, reordering those lines actually changes nothing abut the order they execute. What we are saying is "Here is how to compute x, y, and z when you have to", but unless the program has to it's not going to bother.

A value is finally evaluated when something wishes to pattern match on it. Think of pattern matching as a very powerful if statement: Haskell actually translates the if/else expression to a pattern matching. It's also somewhat akin to Python's tuple unpacking, except that you can assert that certain fields of the tuple must be certain values. But it's about 5 or 10 times more powerful than that, because it works on any Haskell type. The example is linked to this blog post for a reason, because this is the part that really can only be demonstrated by reading code.

### So, what exactly do I think?

My first impression and one I still consider accurate is that Haskell is potentially absurdly powerful once I finish learning it. Let's address the above topics individually.

Functional means that you describe data structures as what they are rather than how you operate on them, at least in theory. The basic implementation of a binary search tree before I modified it to scale to millions of items actually did in fact recurse as described in most of the qualitative explanations. But the recursion goes further than that-any data structure that is typically described in terms of recursion and then followed up with "but you need to use loops because of stack overflows" is actually implemented with recursion in Haskell, which doesn't have a traditional stack.

The only gotcha I've run into so far was related to laziness. Haskell's lazy evaluation works by recording operations and then evaluating them later, even at runtime. A simple statement like let x = foo + bar + bas may do one of three things: it may be evaluated at compile time, evaluated at the let binding, or evaluated the first time x is needed. Which it is actually a compiler detail: we don't get to assume it's evaluated until we actually use x. This extends to everything, including functions that wish to recursively add: Haskell delays actually running the additions until something looks at the return result of the function, so writing a recursive function to add the numbers between 1 and 1 million may actually accidentally make a list of a million items-essentially a big to-do list of adds. Haskell fixes this by providing a few mechanisms to force strict evaluation. The reason I even ran into this is that all the tutorials don't mention them, assuming that you're not going to do what I did: start playing with trees of 1 million integers just to see what happens. The solution was a minor rewrite using a common Haskell idiom and forcing evaluation-the Haskell maintainers realized this would be a problem too, so we have the tools to deal with it. Specifically, it's the strange '!' before acc in the treeLength function. I'm not going to go into it anymore because it's really a mini-tutorial about a language you probably don't know.

So, let's talk about bigger things.

First is that evaluation order stuff. Haskell is actually always functionally pure. This means that functions cannot have observable side effects unless you explicitly use something called the IO monad. If the functions you are dealing with are all pure, then order of evaluation literally doesn't matter because functions always have to return the same value for the same arguments. Implementing many data structures in terms of pure functions is actually very easy: insert, given a tree and a value, must always do exactly the same thing to the tree.

here comes the hard part of this post. Monads. Monads are what make Haskell so powerful. A monad can do many things, but one of the most common is to guarantee order. Some monads work by establishing the aforementioned data dependencies. If m is a monad, then we can say foo m bar m bas, which might look something like bas(bar(foo())). The point here is that it must evaluate foo then bar then bas, in that order, always. Haskell provides syntactic sugar and a lot of support for monads. While you cant' truly fall into the imperative style, monads basically let you get the same thing. Also, monads are usually implemented in Haskell-they're a consequence of how the language works and a required thing, but not something that Haskell appears to have gone out of its way to add. That is, they fall out naturally from other capabilities of the language already. Some, like IO, do have special status at a lower level. I must confess at this point that I do not fully understand monads yet-they are an intermediate Haskell concept and are what takes one from doing binary trees to doing things like Pandoc (which is Haskell so far as I know). The good news is that monads are the kind of thing that can be used without fully grasping them, but I will soon wish to write my own.

But what do monads actually do? The most basic example is state. In Haskell, you can't modify variables, only return things from functions. But what if every function returned a copy of the program's universe with whatever modifications it wants to make and expected a copy of said universe to also be passed in? This is tedious to write ourselves-we suddenly have to nest parentheses fifteen levels deep and introduce all sorts of temporary bindings. But let the universe be a parameter that the monad moves around for us, and suddenly it's not such a big deal. Alternatively, we can use IO monad and an IORef which provides a way to escape the immutability-but from everything I've read, this is akin to using Python's ctypes to save some memory, template metaprogramming to save 2% of the CPU, etc.

You can go even further with monads, however. The most interesting I have seen is software-transactional memory, which very few languages actually have in any form. The hidden data can be anything at all and the monad exposes a way to operate on it, so why not expose something representing memory and wrap it to be STM? There are monads for all sorts of things, and I've only scratched the surface. While Haskell is conceptually immutable for most intents and purposes, monads provide a way to pretend it's not (or perhaps actually make it not, I'm still mentally digesting the implications of monads).

Note: In Haskell and outside monadic code, changes to data are done by conceptually copying and returning the changed data to our caller. This copy is only conceptual, not actually how the compiler does it. The tree insert function in the example linked at the top of this post is actually returning a new tree that it builds from the old one, but the compiler is smart enough to share nodes automatically when it can, and the code nevertheless works fine for millions of items.

In closing, I expect that I can implement an interpreter in a week so long as I maintain interest. Haskell is absurdly good at transformations of data structures, i.e. abstract syntax trees. The oddest part of the entire process is that everything I've done in an imperative language seems to map without issue, even mutability, one of the first things you learn Haskell doesn't let you use. There is a huge difference between the programming language and machine code; the copying required to write a mutability abstraction therefore doesn't actually start copying multiple megabytes of data at every call.

While I'd not use this for every project, I'm actually quite glad that I've learned it: it seems to be a superset of those features I like in programming languages I use, at least once I understand monads fully.

But the value in it for me comes from my beliefs about programming languages. A programming language is useful only in so far as it allows the programmer to express certain types of thoughts. Imperative languages are good at expressing some things, for example business applications and servers; functional languages are good at data structures, transformations, and similar. I could do a business application in a functional language, or even a game, but it would probably be easier to just use Python for it.

There is one thing I believe is a mistake as a programmer: we should never constrain our thought process to the idioms and structures of one language. And yet I see programmers do so all the time. I am guilty of it myself. To that end, the other project I want to do with Haskell is space invaders. This is a problem that I don't know how to solve functionally. Most people just say to do it imperatively. It therefore follows that doing so will teach me how to think about a much larger number of imperative constructs in the mental model of Haskell, while still giving me the mental model of all the mainstream imperative languages when I have a problem that best fits them. Of course, I will then miss first-class functions when I don't have them even more than I already do, but there's a price for everything.