April 2010

Volume 25 Number 04

F# Basics - An Introduction to Functional Programming for .NET Developers

By Chris Marinos | April 2010

By now, there’s a good chance you’ve heard about F#, the latest addition to the Microsoft Visual Studio language family. There are many exciting reasons to learn F#—its clean syntax, its powerful multi-threading capabilities and its fluid interoperability with other Microsoft .NET Framework languages. However, F# includes some important new concepts you’ll need to understand before you can take advantage of these features.

A whirlwind tour is a good way to start learning another object-oriented language, or even a dynamic language like Ruby or Python. That’s because you already know most of the vocabulary and you just need to learn new syntax. F# is different, though. F# is a functional programming language, and with that comes more new vocabulary than you may expect. Moreover, functional languages have traditionally been used in academia, so the definitions for these new terms can be difficult to understand.

Fortunately, F# is not designed to be an academic language. Its syntax allows you to use functional techniques to solve problems in new and better ways while still supporting the object-oriented and imperative styles that you’re accustomed to as a .NET developer. Unlike other .NET languages, F# has a multi-paradigm structure that leaves you free to choose the best style of programming for the problem that you’re trying to solve. Functional programming in F# is about writing concise, powerful code to solve practical software problems. It’s about using techniques like higher-order functions and function composition to create powerful, easy-to-understand behaviors. It’s also about making your code easier to understand, test and parallelize by removing hidden complexities.

But in order for you to take advantage of all of these fantastic features of F#, you need to understand the basics. In this article, I’ll explain these concepts using vocabulary that you are already familiar with as a .NET developer. I will also show you some functional programming techniques that you can apply to your existing code and some ways in which you are already programming functionally. By the end of the article, you’ll know enough about functional programming so that you can hit the ground running with F# in Visual Studio 2010.

Functional Programming Basics

For most .NET developers, it’s easier to understand what functional programming is by understanding what it isn’t. Imperative programming is a style of programming that is considered to be the opposite of functional programming. It’s also the style of programming you are probably most familiar with because most mainstream programming languages are imperative.

Functional programming and imperative programming differ on a very fundamental level, and you can see this in even the simplest code:

int number = 0;
number++;

This obviously increments a variable by one. That’s not very exciting, but consider a different way that you could solve the problem:

const int number = 0;
const int result = number + 1;

The number is still incremented by one, but it’s not modified in place. Instead, the result is stored as another constant because the compiler does not allow you to modify the value of a constant. You would say that constants are immutable because you can’t change their values once they are defined. Conversely, the number variable from my first example was mutable because you can modify its value. These two approaches show one of the fundamental differences between imperative programming and functional programming. Imperative programming emphasizes the use of mutable variables whereas functional programming uses immutable values.

Most .NET developers would say that number and result in the previous example are variables, but as a functional programmer you need to be more careful. After all, the idea of a constant variable is confusing at best. Instead, functional programmers say that number and result are values. Make sure you reserve the term variable for objects that are mutable. Note that these terms are not exclusive to functional programming, but they are a lot more important when programming in a functional style.

This distinction may seem small, but it’s the foundation for a lot of the concepts that make functional programming so powerful. Mutable variables are the root cause of a lot of nasty bugs. As you will see below, they lead to implicit dependencies between different parts of your code, which cause many problems, especially related to concurrency. In contrast, immutable values introduce significantly less complexity. They lead to functional techniques like using functions as values and compositional programming, which I’ll also explore in more detail later.

If you’re skeptical of functional programming at this point, don’t worry. That’s natural. Most imperative programmers are trained to believe that you can’t do anything useful with immutable values. However, consider this example:

string stringValue = "world!";
string result = stringValue.Insert(0, "hello ");

The Insert function built the “hello world!” string, but you know that Insert doesn’t modify the source string’s value. That’s because strings are immutable in .NET. The designers of the .NET Framework used a functional approach because it made it easier to write better code with strings. Because strings are among the most widely used data types in the .NET Framework (along with other base types, like integers, DateTimes and so on), there’s a good chance you’ve done more useful functional programming than you realize.

Putting F# to Work

F# comes with Visual Studio 2010, and you can find the latest version at msdn.microsoft.com/vstudio. If you use Visual Studio 2008, you can download an F# add-in from the F# Developer Center at msdn.microsoft.com/fsharp, where you’ll also find installation instructions for Mono.

F# adds a new window to Visual Studio called F# Interactive that, unsurprisingly, allows you to interactively execute F# code. You can think of it as a more powerful version of the Immediate Window that you can access even when you’re not in debug mode. If you’re familiar with Ruby or Python, you’ll recognize that F# Interactive is a Read-Evaluate-Print Loop (REPL), which is a helpful tool to have for learning F# and quickly experimenting with code.

I’ll use F# Interactive in this article to show you what happens when example code is compiled and run. If you highlight code in Visual Studio and press Alt+Enter, you send the code to F# Interactive. To see this, here is the simple addition example in F#:

let number = 0
let result = number + 1

When you run this code in F# Interactive, you get the following:

val number : int = 0
val result : int = 1

You can probably guess by the term val that number and result are both immutable values, not mutable variables. You can see this by using <-, the F# assignment operator:

> number <- 15;;

  number <- 15;;
  ^^^^^^^^^^^^

stdin(3,1): error FS0027: This value is not mutable
>

Because you know that functional programming is based on immutability, this error should make sense. The let keyword is used to create immutable bindings between names and values. In C# terms, everything is const by default in F#. You can make a mutable variable if you want, but you have to explicitly say so. The defaults are just the opposite of what you’re familiar with in imperative languages:

let mutable myVariable = 0
myVariable <- 15

Type Inference and Whitespace Sensitivity

F# lets you declare variables and values without specifying their type, so you might assume that F# is a dynamic language, but that’s not true. It is important to understand that F# is a statically typed language just like C# or C++. However, F# has a powerful type inference system that allows you to avoid specifying the types of objects in many places. This allows for a simple and succinct syntax, while still providing the type safety of statically typed languages.

Although type inference systems like this aren’t really found in imperative languages, type inference isn’t directly related to functional programming. However, type inference is a critical notion to understand if you want to learn F#. Fortunately, if you’re a C# developer, chances are you’re already familiar with basic type inference because of the var keyword:

// Here, the type is explicitly given
Dictionary<string, string> dictionary = 
  new Dictionary<string, string>();

// but here, the type is inferred
var dictionary = new Dictionary<string, string>();

Both lines of C# code create new variables that are statically typed as Dictionary<string, string>, but the var keyword tells the compiler to infer the type of the variable for you. F# takes this concept to the next level. For example, here is an add function in F#:

let add x y =
    x + y
    
let four = add 2 2

There isn’t a single type annotation in the above code, but F# Interactive reveals the static typing:

val add : int -> int -> int
val four : int = 4

I’ll explain the arrows in more detail later, but for now you can interpret this to mean that add is defined to take two int arguments, and that four is an int value. The F# compiler was able to infer this based on the way add and four were defined. The rules the compiler uses to do this are beyond the scope of this article, but you can learn more about them at the F# Developer Center if you’re interested.
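As a rough illustration of how far inference can go, the following sketch shows three small functions whose types are worked out from the way their arguments are used. The names greet and addFloats are made up for this example and do not come from the article.

```fsharp
// With nothing else to go on, + defaults to int, so add is int -> int -> int.
let add x y = x + y
let four = add 2 2

// Concatenating with a string literal is enough for name to be inferred as string.
let greet name = "hello " + name
let greeting = greet "world!"

// One annotation on x fixes the rest: y and the result are inferred as float.
let addFloats (x: float) y = x + y
let total = addFloats 1.5 2.5
```

Sending these definitions to F# Interactive should print a val line for each one, which is a convenient way to check what the compiler inferred.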

Type inference is one way that F# reduces noise in your code, but notice that there are no curly braces or keywords to denote the body or return value of the add function. That’s because F# is a whitespace-sensitive language by default. In F#, you indicate the body of a function by indentation, and you return a value by making sure that it is the last line in the function. Like type inference, whitespace sensitivity has no direct relationship to functional programming, but you need to be familiar with the concept in order to use F#.
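To make the indentation rule concrete, here is a minimal sketch of a function whose body spans several lines. The function name and values are hypothetical and chosen only for illustration.

```fsharp
// Everything indented under the definition belongs to the function body.
let sumOfSquares a b =
    let aSquared = a * a      // intermediate values are bound with let
    let bSquared = b * b
    aSquared + bSquared       // the last expression is the return value

// This line is back at the left margin, so it is outside the function.
let twentyFive = sumOfSquares 3 4
```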

Side Effects

Now you know that functional programming is different from imperative programming because it relies on immutable values instead of mutable variables, but that fact isn’t very useful by itself. The next step is to understand side effects.

In imperative programming, a function’s output depends on its input argument and the current state of the program. In functional programming, functions only depend on their input arguments. In other words, when you call a function more than once with the same input value, you always get the same output value. The reason this isn’t true in imperative programming is because of side effects, as demonstrated in Figure 1.

Figure 1 Side Effects of Mutable Variables

public MemoryStream GetStream() {
  var stream = new MemoryStream();
  var writer = new StreamWriter(stream);
  writer.WriteLine("line one");
  writer.WriteLine("line two");
  writer.WriteLine("line three");
  writer.Flush();
  stream.Position = 0;
  return stream;
}

[TestMethod]
public void CausingASideEffect() {
  using (var reader = new StreamReader(GetStream())) {
    var line1 = reader.ReadLine();
    var line2 = reader.ReadLine();

    Assert.AreNotEqual(line1, line2);
  }
}

On the first call to ReadLine, the stream gets read until a new line is encountered. Then, ReadLine returns all of the text up to the new line. In between those steps, a mutable variable representing the stream’s position gets updated. That’s the side effect. On the second call to ReadLine, the value of the mutable position variable has changed, so ReadLine returns a different value.
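The same contrast can be sketched in F#. The example below is an illustration only: addTen and addTenAndCount are hypothetical names, and the ref cell is simply one way to hold mutable state that a function can update.

```fsharp
// Pure: the result depends only on the argument.
let addTen x = x + 10

// Impure: the result also depends on, and changes, hidden mutable state.
let callCount = ref 0
let addTenAndCount x =
    callCount.Value <- callCount.Value + 1   // the side effect
    x + 10 + callCount.Value

let pureFirst = addTen 5
let pureSecond = addTen 5            // same input, same output
let impureFirst = addTenAndCount 5
let impureSecond = addTenAndCount 5  // same input, different output
```

As with ReadLine, the second call to addTenAndCount returns something different because a value outside the function changed between the calls.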

Now let’s look at one of the most significant consequences of using side effects. First, consider a simple PiggyBank class and some methods to work with it (see Figure 2).

Figure 2 Mutable PiggyBanks

public class PiggyBank{
  public PiggyBank(int coins){
    Coins = coins;
  }

  public int Coins { get; set; }
}

private void DepositCoins(PiggyBank piggyBank){
  piggyBank.Coins += 10;
}

private void BuyCandy(PiggyBank piggyBank){
  if (piggyBank.Coins < 7)
    throw new ArgumentException(
      "Not enough money for candy!", "piggyBank");

  piggyBank.Coins -= 7;
}

If you have a piggy bank with 5 coins in it, you can call DepositCoins before BuyCandy, but reversing the order raises an exception:

// this works fine
var piggyBank = new PiggyBank(5);

DepositCoins(piggyBank);
BuyCandy(piggyBank);

// but this raises an ArgumentException
var piggyBank = new PiggyBank(5);

BuyCandy(piggyBank);
DepositCoins(piggyBank);

The BuyCandy function and the DepositCoins function both update the state of the piggy bank through the use of a side effect. Consequently, the behavior of each function depends on the state of the piggy bank. Because the number of coins is mutable, the order in which these functions execute is significant. In other words, there is an implicit time dependency between these two methods.

Now let’s make the number of coins read only to simulate an immutable data structure. Figure 3 shows that BuyCandy and DepositCoins now return new PiggyBank objects instead of updating an existing PiggyBank.

Figure 3 Immutable PiggyBanks

public class PiggyBank{
  public PiggyBank(int coins){
    Coins = coins;
  }

  public int Coins { get; private set; }
}

private PiggyBank DepositCoins(PiggyBank piggyBank){
  return new PiggyBank(piggyBank.Coins + 10);
}

private PiggyBank BuyCandy(PiggyBank piggyBank){
  if (piggyBank.Coins < 7)
    throw new ArgumentException(
      "Not enough money for candy!", "piggyBank");

  return new PiggyBank(piggyBank.Coins - 7);
}

As before, if you try to call BuyCandy before DepositCoins, you will get an argument exception:

// still raises an ArgumentException
var piggyBank = new PiggyBank(5);

BuyCandy(piggyBank);
DepositCoins(piggyBank);

But now, even if you reverse the order, you’ll get the same result:

// now this raises an ArgumentException, too!
var piggyBank = new PiggyBank(5);

DepositCoins(piggyBank);
BuyCandy(piggyBank);

Here, BuyCandy and DepositCoins only depend on their input argument because the number of coins is immutable. You can execute the functions in any order and the result is the same. The implicit time dependency is gone. However, since you probably want BuyCandy to succeed, you need to make the result of BuyCandy depend on the output of DepositCoins. You need to make the dependency explicit:

var piggyBank = new PiggyBank(5);
BuyCandy(DepositCoins(piggyBank));

This is a subtle difference with far-reaching consequences. Shared mutable state and implicit dependencies are the source of some of the most diabolical bugs in imperative code, and they’re the reason that multi-threading is so difficult in imperative languages. When you have to worry about the order in which functions execute, you need to rely on cumbersome locking mechanisms to keep things straight. Pure functional programs are free of side effects and implicit time dependencies, so the order in which functions execute doesn’t matter. This means you don’t have to worry about locking mechanisms and other error-prone multi-threading techniques.
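For comparison, the immutable piggy bank from Figure 3 might look like this in F#. This is a sketch rather than a translation the article provides: it assumes an F# record type, which is immutable by default, and the lowercase function names are chosen here for illustration.

```fsharp
// An F# record is immutable by default, like the PiggyBank in Figure 3.
type PiggyBank = { Coins: int }

// Returns a new PiggyBank; the one passed in is left untouched.
let depositCoins (piggyBank: PiggyBank) =
    { piggyBank with Coins = piggyBank.Coins + 10 }

let buyCandy (piggyBank: PiggyBank) =
    if piggyBank.Coins < 7 then
        invalidArg "piggyBank" "Not enough money for candy!"
    { piggyBank with Coins = piggyBank.Coins - 7 }

// The dependency is explicit: buyCandy receives the output of depositCoins.
let original = { Coins = 5 }
let afterCandy = buyCandy (depositCoins original)
```

Because original is never modified, calling buyCandy on it directly still raises an ArgumentException no matter how many times depositCoins has been called elsewhere.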

Easier multi-threading is a major reason that functional programming is getting attention lately, but there are many other benefits of programming in a functional style. Side effect-free functions are easier to test because each function relies only on its input arguments. They are easier to maintain because they don’t implicitly rely on logic from other setup functions. Side effect-free functions also tend to be smaller and easier to combine. I’ll cover this last point in more detail shortly.

In F#, you focus on evaluating functions for their result values instead of their side effects. In imperative languages, it is common to call a function to do something; in functional languages, functions are called to yield a result. You can see this in F# by looking at the if statement:

let isEven x =
    if x % 2 = 0 then
        "yes"
    else
        "no"

You know that in F#, the last line of a function is its return value, but in this example, the last line of the function is the if statement. This isn’t a compiler trick. In F#, even if statements are designed to return values:

let isEven2 x =
    let result = 
        if x % 2 = 0 then
            "yes"
        else
            "no"
    result

The result value is of type string, and the value produced by the if expression is bound directly to it. It’s similar to the way the conditional operator works in C#:

string result = x % 2 == 0 ? "yes" : "no";

The conditional operator emphasizes returning a value over causing a side effect. It’s a more functional approach. In contrast, the C# if statement is more imperative because it does not return a result. All it can do is cause side effects.
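Since an F# if yields a value, it can sit anywhere a value is expected, including inside a function call. The following sketch uses a hypothetical describe function to show this; note that both branches have to produce the same type.

```fsharp
// The if/else is used as an argument, much like the C# conditional operator.
let describe x =
    sprintf "%d is %s" x (if x % 2 = 0 then "even" else "odd")

let describedFour = describe 4
let describedSeven = describe 7
```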

Composing Functions

Now that you’ve seen some of the benefits of side-effect-free functions, you’re ready to use functions to their full potential in F#. First, let’s start with some C# code to take the square of the numbers zero through 10:

IList<int> values = 0.Through(10).ToList();

IList<int> squaredValues = new List<int>();

for (int i = 0; i < values.Count; i++) {
  squaredValues.Add(Square(values[i]));
}

Aside from the Through and Square helper methods, this code is fairly standard C#. Good C# developers would probably take umbrage at my use of a for loop instead of a foreach loop, and rightly so. Modern languages like C# offer foreach loops as an abstraction to make walking through enumerations easier by removing the need for explicit indexes. They succeed in this goal, but consider the code in Figure 4.

Figure 4 Using foreach Loops

IList<int> values = 0.Through(10).ToList();

// square a list
IList<int> squaredValues = new List<int>();

foreach (int value in values) {
  squaredValues.Add(Square(value));
}

// keep only the even values in a list
IList<int> evens = new List<int>();

foreach(int value in values) {
  if (IsEven(value)) {
    evens.Add(value);
  }
}

// take the square of the even values
IList<int> results = new List<int>();

foreach (int value in values) {
  if (IsEven(value)) {
    results.Add(Square(value));
  }
}

The foreach loops in this example are similar, but each loop body performs a slightly different operation. Imperative programmers have traditionally been okay with this code duplication because it’s considered to be idiomatic code.

Functional programmers take a different approach. Instead of creating abstractions like foreach loops to help walk lists, they use side effect-free functions:

let numbers = {0..10}
let squaredValues = Seq.map Square numbers

This F# code also squares a sequence of numbers, but it does so using a higher-order function. Higher-order functions are simply functions that accept another function as an input argument. In this case, the function Seq.map accepts the Square function as an argument. It applies this function to each number in the numbers sequence and returns the sequence of squared numbers. Higher-order functions are why many people say functional programming uses functions as data. This just means that functions can be used as parameters or assigned to a value or variable just like an int or a string. In C# terms, it’s very similar to the concepts of delegates and lambda expressions.
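The article does not show F# definitions for Square and IsEven, so the sketch below assumes the obvious ones. It also adds a hand-written higher-order function, applyTwice, which is a hypothetical name used only to show that a function parameter is declared like any other parameter.

```fsharp
// Assumed definitions of the helpers used in the surrounding snippets.
let Square x = x * x
let IsEven x = x % 2 = 0

// A higher-order function: f is an ordinary parameter that happens to be a function.
let applyTwice f x = f (f x)

let sixteen = applyTwice Square 2

// seq { 0 .. 10 } is the explicit spelling of the sequence of numbers zero through 10.
let numbers = seq { 0 .. 10 }
let squaredValues = Seq.map Square numbers |> Seq.toList
```

Seq.toList is used here only so that the result can be inspected as a concrete list.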

Higher-order functions are one of the techniques that make functional programming so powerful. You can use higher-order functions to isolate the duplicated code inside of foreach loops and encapsulate it into standalone, side effect-free functions. These functions each perform one small operation that the code inside a foreach loop would have handled. Because they are side effect-free, you can combine these functions to create more readable, easier-to-maintain code that accomplishes the same thing as foreach loops:

let squareOfEvens = 
    numbers
    |> Seq.filter IsEven
    |> Seq.map Square

The only confusing part about this code may be the |> operator. This operator is used to make code more readable by allowing you to reorder the arguments to a function so that the last argument is the first thing that you read. Its definition is very simple:

let (|>) x f = f x

Without the |> operator, the squareOfEvens code would look like this:

let squareOfEvens2 = 
  Seq.map Square (Seq.filter IsEven numbers)
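One way to convince yourself that the two forms are equivalent is to run both and compare the results. The sketch below assumes simple definitions for Square and IsEven, and it defines a custom operator named |>> (a made-up name, chosen to avoid shadowing the built-in one) using the same one-line definition.

```fsharp
// Assumed helper definitions for this sketch.
let Square x = x * x
let IsEven x = x % 2 = 0
let numbers = seq { 0 .. 10 }

// A home-made pipe operator with the same definition as |>.
let (|>>) x f = f x

let piped =
    numbers
    |> Seq.filter IsEven
    |> Seq.map Square
    |> Seq.toList

let nested = Seq.toList (Seq.map Square (Seq.filter IsEven numbers))

let viaCustomOperator =
    numbers
    |>> Seq.filter IsEven
    |>> Seq.map Square
    |>> Seq.toList
```

All three values should hold the same list, which suggests the operator is a readability aid rather than a new capability.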

If you use LINQ, employing higher-order functions in this way should seem very familiar. That’s because LINQ is deeply rooted in functional programming. In fact, you can easily translate the square of evens problem into C# using methods from LINQ:

var squareOfEvens =
  numbers
  .Where(IsEven)
  .Select(Square);

This translates to the following LINQ query syntax:

var squareOfEvens = from number in numbers
  where IsEven(number)
  select Square(number);

Using LINQ in C# or Visual Basic code allows you to exploit some of the power of functional programming on an everyday basis. It’s a great way to learn functional programming techniques.

When you start to use higher-order functions on a regular basis, you will eventually come across a situation in which you want to create a small, very specific function to pass into a higher-order function. Functional programmers use lambda functions to solve this problem. Lambda functions are simply functions you define without giving them a name. They are normally small and have a very specific use. For example, here is another way that you could square even numbers using a lambda:

let withLambdas =
    numbers
    |> Seq.filter (fun x -> x % 2 = 0)
    |> Seq.map (fun x -> x * x)

The only difference between this and the previous code to square even numbers is that Square and IsEven are defined as lambdas. In F#, you declare a lambda function using the fun keyword. You should only use lambdas to declare single-use functions because they can’t easily be used outside the scope in which they are defined. For this reason, Square and IsEven are poor choices for lambda functions because they are useful in many situations.
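Lambdas can take more than one parameter as well. As a small sketch, the pipeline below extends the lambda version with Seq.fold, a standard library function not otherwise used in this article, to add up the squared even numbers. The lambda passed to it is the kind of single-use function that is a reasonable fit for fun.

```fsharp
let numbers = seq { 0 .. 10 }

let sumOfEvenSquares =
    numbers
    |> Seq.filter (fun x -> x % 2 = 0)
    |> Seq.map (fun x -> x * x)
    // A two-parameter lambda: the running total and the current element.
    |> Seq.fold (fun total x -> total + x) 0
```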

Currying and Partial Application

You now know almost all of the basics you need to start working with F#, but there is one more concept you should be familiar with. In previous examples, the |> operator and the arrows in type signatures from F# Interactive are both tied to a concept known as currying.

Currying means breaking a function with many arguments into a series of functions that each take one argument and ultimately produce the same result as the original function. Currying is probably the most challenging topic in this article for a .NET developer, particularly because it is often confused with partial application. You can see both at work in this example:

let multiply x y =
    x * y
    
let double = multiply 2
let ten = double 5

Right away, you should see behavior that is different from most imperative languages. The second statement creates a new function called double by passing one argument to a function that takes two. The result is a function that accepts one int argument and yields the same output as if you had called multiply with x equal to 2 and y equal to that argument. In terms of behavior, it’s the same as this code:

let double2 z = multiply 2 z

Often, people mistakenly say that multiply is curried to form double. But this is only somewhat true. The multiply function is curried, but that happens when it is defined because functions in F# are curried by default. When the double function is created, it’s more accurate to say that the multiply function is partially applied.
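Partial application becomes especially handy in combination with higher-order functions. The following sketch passes partially applied versions of multiply to List.map; the list literals and the value names doubled and tripled are assumptions made for the example.

```fsharp
let multiply x y = x * y

// Partial application: supplying only x yields a function still waiting for y.
let double = multiply 2

// Partially applied functions can be passed straight to higher-order functions,
// either by name or written inline.
let doubled = [1; 2; 3] |> List.map double
let tripled = [1; 2; 3] |> List.map (multiply 3)
```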

Let’s go over these steps in more detail. Currying breaks a function with many arguments into a series of functions that each take one argument and ultimately produce the same result as the original function. The multiply function has the following type signature according to F# Interactive:

val multiply : int -> int -> int

Up to this point, you interpreted this to mean that multiply is a function that takes two int arguments and returns an int result. Now I’ll explain what really happens. The multiply function is really a series of two functions. The first function takes one int argument and returns another function, effectively binding x to a specific value. The returned function also accepts an int argument that you can think of as the value to bind to y. After calling this second function, x and y are both bound, so the result is the product of x and y as defined in the body of multiply.

To create double, the first function in the chain of multiply functions is evaluated to partially apply multiply. The resulting function is given the name double. When double is evaluated, it uses its argument along with the partially applied value to create the result.
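The chain of functions can be written out by hand with nested lambdas, which may make the int -> int -> int signature easier to read. This is a sketch for illustration: multiplyExplicit and multiplyTupled are hypothetical names, and the tupled version is included only as a contrast.

```fsharp
// The curried shape spelled out: a function of x that returns a function of y.
let multiplyExplicit = fun x -> fun y -> x * y

// Evaluating the first function in the chain binds x and returns the second.
let double = multiplyExplicit 2      // int -> int
let ten = double 5

// For contrast, a function that takes a single tuple is not curried,
// so x cannot be supplied on its own.
let multiplyTupled (x, y) = x * y    // int * int -> int
let alsoTen = multiplyTupled (2, 5)
```

F# Interactive should report the same int -> int -> int signature for multiplyExplicit as it does for multiply.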

Using F# and Functional Programming

Now that you have enough vocabulary to get started with F# and functional programming, you have plenty of options for what to do next.

F# Interactive allows you to explore F# code and quickly build up F# scripts. It is also useful for validating everyday questions about the behavior of .NET library functions without resorting to help files or Web searches.

F# excels at expressing complicated algorithms, so you can encapsulate these portions of your applications into F# libraries that can be called from other .NET languages. This is especially useful in engineering applications or multi-threaded situations.

Finally, you can use functional programming techniques in your everyday .NET development without even writing F# code. Use LINQ instead of for or foreach loops. Try using delegates to create higher-order functions. Limit your use of mutability and side effects in your imperative programming. Once you start writing code in a functional style, you’ll soon find yourself wishing you were writing more F# code.                     


Chris Marinos is a software consultant at SRT Solutions in Ann Arbor, Mich.

Thanks to the following technical experts for reviewing this article: Luke Hoban