6. Sequences – Being lazy is allowed
I remember the day I got the mind shift about functional programming. I switched from being a fervent defender of Java and C# to a passionate of LISP dialects. At the time I was convinced that there were nothing better than OOP and curly bracket syntaxes. I was then fighting for conventions and simplicity rather than language expressivity.
- Why do not C and C++ have a module system ?
- Why do not have C and C++ a garbage collector ?
- Why C does not have namespaces ?
- Why C does not have safer string and array primitives ?
- Why aren’t Unicode/UTF-8 strings the default in C and C++ ?
- Why are the C and C++ libraries naming convention so bad ?
- Why is C++ so complex?
As a good ignorant I thought that languages evolved with the time, and that these problems were not addressed because nobody thought about them before. And I discovered LISP the second oldest programming language with none of these problems and even powerful features other languages don’t have such as macros or continuation capture. I got interested in functional programming languages and I learned Haskell, CAML, and finally F#.
I don’t know why but the concept of sequences in functional programming languages amazed me.
How do you perceive infinity?
“How do we perceive infinity?” This is the way Gerald Jay Sussman and Guy L. Steele introduced their presentation about sequences (streams in LISP).
I never thought about it before, since resources on a computer are finite. But interesting question, let say I want to represent the infinite sequence of numbers, what characterizes this infinity and makes it exist? Is it the fact that these numbers are all present in the computer memory at the same time (which is impossible) or the fact that I can get one of them when needed.
How could we possibly define the infinite sequence of natural integers?
Starting from 0 we can recursively define these numbers just by adding one, however this recursion shouldn’t take place until needed, so that the whole number sequence is never in memory. There is no choice that to delay the evaluation of the recursion building the sequence. We could take advantage of the fact that list are pairs, the car of the pair would be a value and the cdr of the pair a delayed evaluation leading to a pair. Delaying a computation, it’s when laziness comes into play.
What is laziness?
Wikipedia: “Laziness (also called indolence) is a disinclination to activity or exertion despite having the ability to do so. It is often used as a pejorative; related terms for a person seen to be lazy include couch potato , slacker , and bludger .”
We are close to what we need !
Although very useful to deal with infinity, as for human laziness is unfortunately not well considered by programming languages. Indeed most programming languages favor eager evaluation over lazy evaluation.
“Never put off until tomorrow what you can do the day after tomorrow.”
Eager evaluation consists in evaluating as soon as possible expressions whereas lazy evaluations consists in evaluating an expression only when it is actually needed.
Below is an example:
In a language like C# this program does nothing valuable except crashing (by exploding the stack). In a language using lazy evaluation this program does nothing. Why? Because DoNothing does nothing with its argument x, so why computing it? However with eager evaluation we start computing immediately the value of x although we don’t need it!
If you think about it you already know some cases of laziness:
Indeed in all languages conditional expressions are lazy. Why? Because if they weren’t, both the consequence and the alternative branches of the condition would be evaluated! However what is expected from a conditional expression is to first evaluate its condition and then depending on the result either evaluate the consequence or the alternative branch. In the same way the and and or short-circuit operators are lazy.
An other way of achieving laziness is to capture expressions into functions. By doing so we are capturing the future of an expression, the value in becoming of this expression, also called its continuation. The problem in languages which doesn’t favor laziness is that you would have to build all these continuations by hand and call them at the right place when needed.
In F# you can get laziness using the lazy keyword or by building sequences. This will ensure you that this expression will be evaluated only when needed.
Let’s do it
Ok we know that we need to build an infinite sequence using laziness. F# has its own sequence type that I am going to present you later, now I want you to look at a simple implementation of infinite sequences (which is really inefficient due to boxing and unboxing needed to escape static typing), the only purpose is to show you how it works:
Here is the result:
Look the infinite integers sequence is a list which starts by 0 and whose next value is not computed yet: “Value is not created”.
With lazyMap, lazyFilter you can operate on integers to create new infinite lists ! If you look at the lazyMap and lazyFilter function definitions you won’t find any break condition, these functions don’t need to stop, since they are working on data with infinite length !
Why do we need lazy sequences?
- Resolve algorithms: Infinite sequences allow us to find results for some recursive algorithms especially on big numbers that we couldn’t resolve with traditional methods. We can for example use the sieve of Eratosthenes to get a list of all primes number without excessive computation.
- Performance: When using functional programming languages it is common to compose map, filter, fold functions to get a result, each time you apply one of these on a list you are browsing the list from start to end. Thus to map and filter a list you will browse the list twice. Through the use of sequences this could be avoided, since the sequence elements will be mapped and filtered at the same time once the computation is triggered.
- Design Patterns: Did you already think about how to model I/O operations? Most languages usually represents them with the concept of streams that can be read from or written to. That’s good but there is nothing revolutionary. What about representing binary files by possibly infinite sequences of bytes and textual files as sequences of characters, words or lines for example? It would mean all your functions working on list could be used naturally
- Avoiding side effects: I already told you that side effects are bad, so how to model change for an object without relying on side effects? Sequences can be use to represent each state of an object through time, so no need for side effects.
What are F# sequences?
F# sequences are created using the seq computation expression (cf. my article to come on Workflows). They are represented by the generic type seq<’a> with ‘a being the type of the elements in the sequence. Under the hood they map the well known .NET type IEnumerable<T>. The Seq module provides for use with sequences most of the functions working with lists. Finally F# arrays and lists are instance of seq<’a>.