October 2009

Volume 24 Number 10

Patterns in Practice - Functional Programming for Everyday .NET Development

By Jeremy Miller | October 2009

What is the most important advance in the .NET ecosystem over the past three or four years? You might be tempted to name a new technology or framework, like the Windows Communication Foundation (WCF) or Windows Presentation Foundation (WPF). For me, however, the powerful additions to the C# and Visual Basic languages over the last two releases of the .NET Framework have had a far more significant impact on my day-to-day development activities. In this article I'd like to examine in particular how the new support for functional programming techniques in .NET 3.5 can help you do the following:

  1. Make your code more declarative.
  2. Reduce errors in code.
  3. Write fewer lines of code for many common tasks.

The Language Integrated Query (LINQ) feature in all of its many incarnations is an obvious and powerful use of functional programming in .NET, but that's just the tip of the iceberg.

To keep with the theme of "everyday development," I've based the majority of my code samples on C# 3.0 with some JavaScript sprinkled in. Please note that some of the other, newer programming languages for the CLR, like IronPython, IronRuby and F#, have substantially stronger support or more advantageous syntax for the functional programming techniques shown in this article. Unfortunately, the current version of Visual Basic does not support multiline Lambda functions, so many of the techniques shown here are not as usable in Visual Basic. However, I would urge Visual Basic developers to consider these techniques in preparation for the next version of the language, shipping in Visual Studio 2010.

First-Class Functions

Some elements of functional programming are possible in C# or Visual Basic because we now have first-class functions that can be passed around between methods, held as variables or even returned from another method. Anonymous delegates from .NET 2.0 and the newer Lambda expressions in .NET 3.5 are how C# and Visual Basic implement first-class functions, but "Lambda expression" means something more specific in computer science. Another common term for a first-class function is "block." For the remainder of this article I will use the term "block" to denote first-class functions, rather than "closure" (a specific type of first-class function I discuss next) or "Lambda," to avoid accidental inaccuracies (and the wrath of real functional programming experts).

A closure is a first-class function that uses variables defined outside the function itself. If you've used the increasingly popular jQuery library for JavaScript development, you've probably used closures quite frequently. Here's an example of using a closure, taken from my current project:

// Expand/contract body functionality          
var expandLink = $(".expandLink", itemElement);
var hideLink = $(".hideLink", itemElement);
var body = $(".expandBody", itemElement);
body.hide();
// The click handler for expandLink will use the
// body, expandLink, and hideLink variables even
// though the declaring function will have long
// since gone out of scope.
expandLink.click(function() {
    body.toggle();
    expandLink.toggle();
    hideLink.toggle();
});

This code is used to set up a pretty typical accordion effect to show or hide content on our Web pages by clicking an <a> element. We define the click handler of the expandLink by passing in a closure function that uses variables created outside the closure. The function that contains both the variables and the click handler will exit long before the expandLink can be clicked by the user, but the click handler will still be able to use the body and hideLink variables.
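The same idea carries over to C#. The following is only a minimal sketch and not code from the project above; the names CounterFactory and CreateCounter are hypothetical. It shows a Lambda that keeps using a local variable after the method that declared it has returned:

```csharp
using System;

public static class CounterFactory
{
    // Returns a block that captures the local variable "count".
    // The variable outlives this method call because the returned
    // Lambda closes over it.
    public static Func<int> CreateCounter()
    {
        int count = 0;
        return () =>
        {
            count++;
            return count;
        };
    }
}
```

Each call to CreateCounter produces an independent counter. Invoking the returned Func<int> repeatedly yields 1, 2, 3 and so on, while a second counter created later starts again at 1.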

Lambdas as Data

In some circumstances, you can use the Lambda syntax to denote an expression in code that can be used as data rather than executed. I didn't particularly understand that statement the first several times I read it, so let's look at an example of treating a Lambda as data from an explicit object/relational mapping using the Fluent NHibernate library:

 

public class AddressMap : DomainMap<Address>
{
    public AddressMap()
    {
        Map(a => a.Address1);
        Map(a => a.Address2);
        Map(a => a.AddressType);
        Map(a => a.City);
        Map(a => a.TimeZone);
        Map(a => a.StateOrProvince);
        Map(a => a.Country);
        Map(a => a.PostalCode);
    }
}

Fluent NHibernate never evaluates the expression a => a.Address1. Instead, it parses the expression to find the name Address1 to use in the underlying NHibernate mapping. This technique has spread widely through many recent open-source projects in the .NET space. Using Lambda expressions just to get at PropertyInfo objects and property names is frequently called static reflection.
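To make static reflection a little more concrete, here is a rough sketch of how a helper might pull a PropertyInfo out of a Lambda by using the standard System.Linq.Expressions types. This is not Fluent NHibernate's actual implementation; ReflectionHelper and GetProperty are hypothetical names, and the small Address class is a stand-in so the sample compiles on its own:

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

// Stand-in domain class, assumed for illustration only
public class Address
{
    public string City { get; set; }
    public int FloorNumber { get; set; }
}

public static class ReflectionHelper
{
    // Finds the property named in the expression body without ever
    // compiling or executing the Lambda.
    public static PropertyInfo GetProperty<T>(Expression<Func<T, object>> expression)
    {
        Expression body = expression.Body;

        // Value-type properties are boxed to object, which wraps the
        // member access in a Convert node that has to be unwrapped first.
        UnaryExpression unary = body as UnaryExpression;
        if (unary != null && unary.NodeType == ExpressionType.Convert)
        {
            body = unary.Operand;
        }

        MemberExpression member = body as MemberExpression;
        if (member == null || !(member.Member is PropertyInfo))
        {
            throw new ArgumentException(
                "Expression must be a simple property access", "expression");
        }

        return (PropertyInfo)member.Member;
    }
}
```

Because the parameter is declared as Expression<Func<T, object>> rather than Func<T, object>, the compiler hands the method an expression tree (data) instead of an executable delegate. Calling ReflectionHelper.GetProperty<Address>(a => a.City) returns the PropertyInfo for City.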

Passing Blocks

One of the best reasons to study functional programming is to learn how first-class functions allow you to reduce duplication in code by providing a more fine-grained mechanism for composition than the class. You will often come across sequences of code that are essentially identical in their basic form except for one step somewhere in the middle of the sequence. With object-oriented programming, you can use inheritance with the template method pattern to try to eliminate the duplication. More and more, I find that passing a block representing the variable step in the middle to another method that implements the basic sequence is a cleaner way to eliminate this duplication.

One of the best ways to make an API easier to use and less prone to error is to reduce repetitive code. For example, consider the common case of an API designed to access a remote service or resource like an ADO.NET IDbConnection object or a socket listener that requires a stateful or persistent connection. You must typically "open" the connection before using the resource. These stateful connections are often expensive or scarce in terms of resources, so it is often important to "close" the connection as soon as you are done to release the resource for other processes or threads.

The following code shows a representative interface for the gateway to a stateful connection of some type:

 

public interface IConnectedResource
{
    void Open();
    void Close();
    // Some methods that manipulate the connected resource
    BigDataOfSomeKind FetchData(RequestObject request);
    void Update(UpdateObject update);
}

Every single time another class consumes this IConnectedResource interface, the Open method has to be called before using any other method, and the Close method should always be called afterward, as shown in Figure 1.

In an earlier article I discussed the idea of essence versus ceremony in our designs. (See msdn.microsoft.com/magazine/dd419655.aspx.) The "essence" of the ConnectedSystemConsumer class's responsibility is simply to use the connected resource to update some information. Unfortunately, most of the code in ConnectedSystemConsumer is concerned with the "ceremony" of connecting to and disconnecting from the IConnectedResource interface and error handling.

Figure 1 Using IConnectedResource

public class ConnectedSystemConsumer
{
    private readonly IConnectedResource _resource;

    public ConnectedSystemConsumer(IConnectedResource resource)
    {
        _resource = resource;
    }

    public void ManipulateConnectedResource()
    {
        try
        {
            // First, you have to open a connection
            _resource.Open();
            // Now, manipulate the connected system
            _resource.Update(buildUpdateMessage());
        }
        finally
        {
            _resource.Close();
        }
    }
}

Worse yet is the fact that the "try/open/do stuff/finally/close" ceremony code has to be duplicated for each use of the IConnectedResource interface. As I've discussed before, one of the best ways to improve your design is to stamp out duplication wherever it creeps into your code. Let's try a different approach to the IConnectedResource API using a block or closure. First, I'm going to apply the Interface Segregation Principle (see objectmentor.com/resources/articles/isp.pdf for more information) to extract an interface strictly for invoking the connected resource without the methods for Open or Close:

 

public interface IResourceInvocation
{
    BigDataOfSomeKind FetchData(RequestObject request);
    void Update(UpdateObject update);
}

Next, I create a second interface that is used strictly to gain access to the connected resource represented by the IResourceInvocation interface:

 

public interface IResource
{
    void Invoke(Action<IResourceInvocation> action);
}

Now, let's rewrite the ConnectedSystemConsumer class to use the newer, functional-style API:

 

public class ConnectedSystemConsumer
{
    private readonly IResource _resource;

    public ConnectedSystemConsumer(IResource resource)
    {
        _resource = resource;
    }

    public void ManipulateConnectedResource()
    {
        _resource.Invoke(x =>
        {
            x.Update(buildUpdateMessage());
        });
    }
}

This new version of ConnectedSystemConsumer no longer has to care about how to set up or tear down the connected resource. In effect, ConnectedSystemConsumer just tells the IResource interface to "go up to the first IResourceInvocation you see and give it these instructions" by passing in a block or closure to the IResource.Invoke method. All that repetitive "try/open/do stuff/finally/close" ceremony code that I was complaining about before is now in the concrete implementation of IResource, as shown in Figure 2.

Figure 2 Concrete Implementation of IResource

public class Resource : IResource
{
    public void Invoke(Action<IResourceInvocation> action)
    {
        IResourceInvocation invocation = null;
        try
        {
            invocation = open();
            // Perform the requested action
            action(invocation);
        }
        finally
        {
            // invocation is still null here if open() threw
            close(invocation);
        }
    }

    private void close(IResourceInvocation invocation)
    {
        // close and teardown the invocation object
    }

    private IResourceInvocation open()
    {
        // acquire or open the connection
        // (placeholder so the class compiles; the real code is not shown)
        throw new NotImplementedException();
    }
}

I would argue that we've improved our design and API usability by putting the responsibility for opening and closing the connection to the external resource into the Resource class. We've also improved the structure of our code by encapsulating the details of infrastructure concerns from the core workflow of the application. The second version of ConnectedSystemConsumer knows far less about the workings of the external connected resource than the first version did. The second design enables you to more easily change how your system interacts with the external connected resource without changing and potentially destabilizing the core workflow code of your system.

The second design also makes your system less error-prone by eliminating the duplication of the "try/open/finally/close" cycle. Every time a developer has to repeat that code, he risks making a coding mistake that could technically function correctly but exhaust resources and harm the scalability characteristics of the application.

Delayed Execution

One of the most important concepts to understand about functional programming is delayed execution. Fortunately, this concept is also relatively simple. All it means is that a block function you've defined inline doesn't necessarily execute immediately. Let's look at a practical use of delayed execution.

In a fairly large WPF application, I use a marker interface called IStartable to denote services that need to be, well, started up as part of the application bootstrapping process.

 

public interface IStartable
{
    void Start();
}

All the services for this particular application are registered and retrieved by the application from an Inversion of Control container (in this case, StructureMap). At application startup, I have the following bit of code to dynamically discover all the services in the application that need to be started and then start them:

 

// Find all the possible services in the main application
// IoC Container that implement an "IStartable" interface
List<IStartable> startables = container.Model.PluginTypes
    .Where(p => p.IsStartable())
    .Select(x => x.ToStartable()).ToList();

// Tell each "IStartable" to Start()
startables.Each(x => x.Start());

There are three Lambda expressions in this code. Suppose you attached the source code of the .NET base class library to the debugger and stepped through this code. Stepping into the Where, Select, or Each calls, you would notice that the Lambda expressions are not the next lines of code to execute. Instead, as these methods iterate over the internal structures of the container.Model.PluginTypes member, each Lambda expression is executed once per item, so it may run many times. Another way to think about delayed execution is that when you invoke the Each method, you're just telling it what to do whenever it comes across an IStartable object.
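You can observe delayed execution without a debugger. The sketch below is an illustration only (DelayedExecutionDemo and its log are hypothetical names). It records the order of events around a LINQ to Objects Where call, which does not invoke its Lambda until the query is enumerated:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DelayedExecutionDemo
{
    // Returns a log of events in the order they actually happened.
    public static List<string> Run()
    {
        var log = new List<string>();
        int[] numbers = { 1, 2, 3 };

        // Defining the query does not run the Lambda.
        IEnumerable<int> evens = numbers.Where(n =>
        {
            log.Add("checking " + n);
            return n % 2 == 0;
        });
        log.Add("query defined");

        // ToList() enumerates the query, and only now is the Lambda
        // invoked, once per element.
        List<int> result = evens.ToList();
        log.Add("found " + result.Count);

        return log;
    }
}
```

The returned log reads "query defined", "checking 1", "checking 2", "checking 3", "found 1", which shows that the block ran after the line that defined it and once for each element.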

Memoization

Memoization is an optimization technique used to avoid executing expensive function calls by reusing the results of the previous execution with the same input. I first came into contact with the term memoization with regard to functional programming with F#, but in the course of researching this article I realized that my team frequently uses memoization in our C# development. Let's say you often need to retrieve some sort of calculated financial data for a given region with a service like this:

 

public interface IFinancialDataService
{
    FinancialData FetchData(string region);
}

IFinancialDataService happens to be extremely slow to execute and the financial data is fairly static, so applying memoization would be very beneficial for application responsiveness. You could create a wrapper implementation of IFinancialDataService that implements memoization for an inner IFinancialDataService class, as shown in Figure 3.

Figure 3 Implementing an Inner IFinancialDataService Class

public class MemoizedFinancialDataService : IFinancialDataService
{
    private readonly Cache<string, FinancialData> _cache;

    // Take in an "inner" IFinancialDataService object that actually
    // fetches the financial data
    public MemoizedFinancialDataService(IFinancialDataService innerService)
    {
        _cache = new Cache<string, FinancialData>(
            region => innerService.FetchData(region));
    }

    public FinancialData FetchData(string region)
    {
        return _cache[region];
    }
}

The Cache<TKey, TValue> class itself is just a wrapper around a Dictionary<TKey, TValue> object. Figure 4 shows part of the Cache class.

Figure 4 The Cache Class

public class Cache<TKey, TValue> : IEnumerable<TValue> where TValue : class
{
    private readonly object _locker = new object();
    private readonly IDictionary<TKey, TValue> _values;

    private Func<TKey, TValue> _onMissing = delegate(TKey key)
    {
        string message = string.Format(
            "Key '{0}' could not be found", key);
        throw new KeyNotFoundException(message);
    };

    public Cache(Func<TKey, TValue> onMissing)
        : this(new Dictionary<TKey, TValue>(), onMissing)
    {
    }

    public Cache(IDictionary<TKey, TValue> dictionary,
        Func<TKey, TValue> onMissing)
        : this(dictionary)
    {
        _onMissing = onMissing;
    }

    public TValue this[TKey key]
    {
        get
        {
            // Check first if the value for the requested key
            // already exists
            if (!_values.ContainsKey(key))
            {
                lock (_locker)
                {
                    if (!_values.ContainsKey(key))
                    {
                        // If the value does not exist, use
                        // the Func<TKey, TValue> block
                        // specified in the constructor to
                        // fetch the value and put it into
                        // the underlying dictionary
                        TValue value = _onMissing(key);
                        _values.Add(key, value);
                    }
                }
            }
            return _values[key];
        }
    }

    // Not shown in this excerpt: the Cache(IDictionary<TKey, TValue>)
    // constructor that assigns _values, and the IEnumerable<TValue> members
}

If you're interested in the internals of the Cache class, you can find a version of it in several open-source software projects, including StructureMap, StoryTeller, FubuMVC and, I believe, Fluent NHibernate.
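If you don't need a full Cache class, the same idea can be reduced to a single higher-order function. The following is a minimal sketch under some loud assumptions: Memoizer and Memoize are hypothetical names, the input must not be null (dictionary keys cannot be null), and unlike the Cache class this version makes no attempt at thread safety:

```csharp
using System;
using System.Collections.Generic;

public static class Memoizer
{
    // Wraps "inner" so that each distinct input is computed only once.
    // Not thread-safe; intended for single-threaded use only.
    public static Func<TInput, TResult> Memoize<TInput, TResult>(
        Func<TInput, TResult> inner)
    {
        // The dictionary is captured by the returned closure, so it
        // lives as long as the memoized function does.
        var results = new Dictionary<TInput, TResult>();

        return input =>
        {
            TResult value;
            if (!results.TryGetValue(input, out value))
            {
                value = inner(input);
                results.Add(input, value);
            }
            return value;
        };
    }
}
```

With this helper, a slow Func<string, FinancialData> could be wrapped once at startup and then called freely. The second request for the same region returns the stored result without invoking the inner function again.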

The Map/Reduce Pattern

It turns out that many common development tasks are simpler with functional programming techniques. In particular, list and set operations in code are far simpler mechanically in languages that support the “map/reduce” pattern. (In LINQ, “map” is “Select” and “reduce” is “Aggregate”.) Think about how you would calculate the sum of an array of integers. In .NET 1.1, you had to iterate over the array something like this:

int[] numbers = new int[]{1,2,3,4,5};
int sum = 0;
for (int i = 0; i < numbers.Length; i++)
{
    sum += numbers[i];
}
Console.WriteLine(sum);

The wave of language enhancements to support LINQ in .NET 3.5 provided the map/reduce capabilities common in functional programming languages. Today, the code above could simply be written as:

int[] numbers = new int[]{1,2,3,4,5};
int sum = numbers.Aggregate((x, y) => x + y);

or more simply as:

int sum = numbers.Sum();
Console.WriteLine(sum);
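The sum example only exercises the "reduce" half of the pattern. As a small illustrative sketch (MapReduceDemo and TotalLength are hypothetical names), here are both halves working together: Select maps each word to its length, and Aggregate reduces the lengths to one total:

```csharp
using System;
using System.Linq;

public static class MapReduceDemo
{
    // "Map" each word to its length with Select, then "reduce" the
    // lengths to a single total with Aggregate.
    public static int TotalLength(string[] words)
    {
        return words
            .Select(word => word.Length)
            // The seed value of 0 means an empty array yields 0
            // instead of throwing InvalidOperationException.
            .Aggregate(0, (total, length) => total + length);
    }
}
```

For the input { "map", "reduce" } the mapped lengths are 3 and 6, so the reduced total is 9.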

Continuations

Roughly put, a continuation in programming is an abstraction of some sort that denotes "what to do next" or the "rest of the computation." Sometimes it is valuable to finish part of a computational process at another time, as in a wizard application in which a user can explicitly allow the next step or cancel the whole process.

Let's jump right into a code sample. Say you are developing a desktop application in WinForms or WPF. You frequently need to initiate some type of long-running process or access a slow external service from a screen action. For the sake of usability, you certainly do not want to lock up the user interface and make it unresponsive while the service call is happening, so you run it in a background thread. When the service call does finally return, you may want to update the user interface with the data coming back from the service, but as any experienced WinForms or WPF developer knows, you can update user interface elements only on the main user interface thread.

You can certainly use the BackgroundWorker class that's in the System.ComponentModel namespace, but I prefer a different approach based on passing Lambda expressions into a CommandExecutor object, represented by this interface:

public interface ICommandExecutor
{
    // Execute an operation on a background thread that
    // does not update the user interface
    void Execute(Action action);
    // Execute an operation on a background thread and
    // update the user interface using the returned Action
    void Execute(Func<Action> function);
}

The first method is simply a statement to perform an activity in a background thread. The second method that takes in a Func<Action> is more interesting. Let's look at how this method would typically be used within application code.

First, assume that you're structuring your WinForms or WPF code with the Supervising Controller form of the Model View Presenter pattern. (See msdn.microsoft.com/magazine/cc188690.aspx for more information on the MVP pattern.) In this model, the Presenter class is going to be responsible for calling into a long-running service method and using the return data to update the view. Your new Presenter class will simply use the ICommandExecutor interface shown earlier to handle all the threading and thread marshalling work, as shown in Figure 5.

Figure 5 The Presenter Class

public class Presenter
{
    private readonly IView _view;
    private readonly IService _service;
    private readonly ICommandExecutor _executor;

    public Presenter(IView view, IService service,
        ICommandExecutor executor)
    {
        _view = view;
        _service = service;
        _executor = executor;
    }

    public void RefreshData()
    {
        _executor.Execute(() =>
        {
            var data = _service.FetchDataFromExtremelySlowServiceCall();
            return () => _view.UpdateDisplay(data);
        });
    }
}

The Presenter class calls ICommandExecutor.Execute by passing in a block that returns another block. The original block invokes the long-running service call to get some data, and returns a Continuation block that can be used to update the user interface (the IView in this scenario). In this particular case, it's important to use the Continuation approach instead of just updating the IView at the same time because the update has to be marshaled back to the user interface thread.

Figure 6 shows the concrete implementation of the ICommandExecutor interface.

Figure 6 Concrete Implementation of ICommandExecutor

public class AsynchronousExecutor : ICommandExecutor
{
    private readonly SynchronizationContext _synchronizationContext;
    private readonly List<BackgroundWorker> _workers =
        new List<BackgroundWorker>();

    public AsynchronousExecutor(
        SynchronizationContext synchronizationContext)
    {
        _synchronizationContext = synchronizationContext;
    }

    public void Execute(Action action)
    {
        // Exception handling is omitted, but this design
        // does allow for centralized exception handling
        ThreadPool.QueueUserWorkItem(o => action());
    }

    public void Execute(Func<Action> function)
    {
        ThreadPool.QueueUserWorkItem(o =>
        {
            Action continuation = function();
            _synchronizationContext.Send(x => continuation(), null);
        });
    }
}

The Execute(Func<Action>) method invokes Func<Action> in a background thread and then takes the Continuation (the Action returned by Func<Action>) and uses a SynchronizationContext object to execute the Continuation in the main user interface thread.
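Hiding the threading behind ICommandExecutor also leaves room for a stand-in that does no threading at all, which could be handy when testing a Presenter. The SynchronousExecutor class below is a hypothetical sketch and is not part of the original design; the interface is repeated only so that the sample compiles on its own:

```csharp
using System;

// Repeated from the article so this sketch is self-contained
public interface ICommandExecutor
{
    void Execute(Action action);
    void Execute(Func<Action> function);
}

// Hypothetical stand-in: runs everything inline on the calling thread
public class SynchronousExecutor : ICommandExecutor
{
    public void Execute(Action action)
    {
        action();
    }

    public void Execute(Func<Action> function)
    {
        // Run the "background" work first, then immediately run the
        // Continuation that it returned.
        Action continuation = function();
        continuation();
    }
}
```

With this stand-in, the outer block and its Continuation run one after the other, in the same order that AsynchronousExecutor guarantees, but without a background thread or a SynchronizationContext.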

I like passing blocks into the ICommandExecutor interface because of how little ceremonial code it takes to invoke the background processing. In an earlier incarnation of this approach, before we had Lambda expressions or anonymous delegates in C#, I had a similar implementation that used little Command pattern classes like the following:

public interface ICommand
{
    ICommand Execute();
}

public interface JeremysOldCommandExecutor
{
    void Execute(ICommand command);
}

The disadvantage of the former approach is that I had to write additional Command classes just to model the background operation and the view-updating code. The extra class declaration and constructor functions are a little more ceremony code we can eliminate with the functional approach, but more important to me is that the functional approach allows me to put all this closely related code in a single place in the Presenter rather than having to spread it out over those little Command classes.

Continuation Passing Style

Building on the Continuation concept, you can use the Continuation Passing style of programming to invoke a method by passing in a Continuation instead of waiting for the return value of the method. The method accepting the Continuation is in charge of deciding whether and when to call the Continuation.
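Before looking at a real-world case, it may help to see the style in its smallest form. The sketch below is purely illustrative (SafeDivider and its parameter names are hypothetical). The method never returns a result; instead, it decides which of the two Continuations it was handed should run:

```csharp
using System;

public static class SafeDivider
{
    // Continuation Passing style: instead of returning the quotient,
    // the method chooses which Continuation to call.
    public static void Divide(int dividend, int divisor,
        Action<int> onSuccess, Action<string> onError)
    {
        if (divisor == 0)
        {
            // The success Continuation is never invoked in this case
            onError("Cannot divide by zero");
            return;
        }

        onSuccess(dividend / divisor);
    }
}
```

A caller supplies both blocks up front, for example SafeDivider.Divide(10, 2, q => Console.WriteLine(q), msg => Console.WriteLine(msg)), and leaves it to Divide to pick the one that applies.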

In my current Web MVC project, there are dozens of controller actions that save updates from user input sent from the client browser via an AJAX call to a domain entity object. Most of these controller actions simply invoke our Repository class to save the changed entity, but other actions use other services to perform the persistence work. (See my article in the April issue of MSDN Magazine at msdn.microsoft.com/magazine/dd569757.aspx for more information about the Repository class.)

The basic workflow of these controller actions is consistent:

  1. Validate the domain entity and record any validation errors.
  2. If there are validation errors, return a response to the client indicating that the operation failed and include the validation errors for display on the client.
  3. If there are no validation errors, persist the domain entity and return a response to the client indicating that the operation succeeded.

What we would like to do is centralize the basic workflow but still allow each individual controller action to vary how the actual persistence is done. Today my team is doing this by inheriting from a CrudController<T> superclass with plenty of template methods that each subclass can override to add or change the basic behavior. This strategy worked out well at first, but it is rapidly breaking down as the variations have increased. Now my team is going to move to using Continuation Passing style code by having our controller actions delegate to something like the following interface:

 

public interface IPersistor
{
    CrudReport Persist<T>(T target, Action<T> saveAction);
    CrudReport Persist<T>(T target);
}

A typical controller action would tell IPersistor to perform the basic CRUD workflow and supply a Continuation that IPersistor uses to actually save the target object. Figure 7 shows a sample controller action that invokes IPersistor but uses a different service than our basic Repository for the actual persistence.

Figure 7 A Sample Controller Action

public class SolutionController
{
    private readonly IPersistor _persistor;
    private readonly IWorkflowService _service;

    public SolutionController(IPersistor persistor,
        IWorkflowService service)
    {
        _persistor = persistor;
        _service = service;
    }

    // UpdateSolutionViewModel is a data bag with the user
    // input from the browser
    public CrudReport Create(UpdateSolutionViewModel update)
    {
        var solution = new Solution();
        // Push the data from the incoming
        // user request into the new Solution
        // object
        update.ToSolution(solution);
        // Persist the new Solution object, if it's valid
        return _persistor.Persist(solution, x => _service.Create(x));
    }
}

I think the important thing to note here is that IPersistor itself is deciding whether and when the Continuation supplied by SolutionController will be called. Figure 8 shows a sample implementation of IPersistor.

Figure 8 An Implementation of IPersistor

public class Persistor : IPersistor
{
    private readonly IValidator _validator;
    private readonly IRepository _repository;

    public Persistor(IValidator validator, IRepository repository)
    {
        _validator = validator;
        _repository = repository;
    }

    public CrudReport Persist<T>(T target, Action<T> saveAction)
    {
        // Validate the "target" object and get a report of all
        // validation errors
        Notification notification = _validator.Validate(target);
        // If the validation fails, do NOT save the object.
        // Instead, return a CrudReport with the validation errors
        // and the "success" flag set to false
        if (!notification.IsValid())
        {
            return new CrudReport()
            {
                Notification = notification,
                success = false
            };
        }
        // Proceed to saving the target using the Continuation supplied
        // by the client of IPersistor
        saveAction(target);
        // return a CrudReport denoting success
        return new CrudReport()
        {
            success = true
        };
    }

    public CrudReport Persist<T>(T target)
    {
        return Persist(target, x => _repository.Save(x));
    }
}

Write Less Code

Frankly, I originally chose this topic because I was interested in learning more about functional programming and how it can be applied even within C# or Visual Basic. In the course of writing this article, I've learned a great deal about just how useful functional programming techniques can be in normal, everyday tasks. The most important conclusion I've reached, and what I've tried to convey here, is that compared with other techniques, functional programming approaches can often lead to writing less code and often more declarative code for some tasks—and that's almost always a good thing.     


Jeremy Miller, a Microsoft MVP for C#, is also the author of the open source StructureMap (structuremap.sourceforge.net) tool for Dependency Injection with .NET and the forthcoming StoryTeller (storyteller.tigris.org) tool for supercharged FIT testing in .NET. Visit his blog, The Shade Tree Developer, at codebetter.com/blogs/jeremy.miller, part of the CodeBetter site.