January 2010
Volume 25 Number 01
Patterns in Practice - Internal Domain Specific Languages
By Jeremy Miller | January 2010
Domain Specific Languages (DSLs) have been a popular topic over the past couple of years and will probably grow in importance in years to come. You might already be following the “Oslo” project (now called SQL Server Modeling) or experimenting with tools such as ANTLR to craft “external” DSLs. A more immediately approachable alternative is to create “internal” DSLs that are written within an existing programming language such as C#.
Internal DSLs may not be quite as expressive and readable to non-developers as external DSLs that can read like English, but the mechanics of creating an internal DSL are simpler because you are not employing compilers or parsers external to your code.
Please note that I am not suggesting that the DSLs in this article are suitable for review by business experts. For this article, I will focus only on how the patterns of internal DSLs can make our jobs as developers easier by crafting APIs that are easier to read and write.
I am pulling a lot of examples out of two open source projects written in C# that I administer and develop. The first is StructureMap, one of the Inversion of Control (IoC) Container tools for the Microsoft .NET Framework. The second is StoryTeller, an acceptance-testing tool. You can download the complete source code for both projects via Subversion at https://structuremap.svn.sourceforge.net/svnroot/structuremap/trunk or storyteller.tigris.org/svn/storyteller/trunk (registration required). I can also suggest the Fluent NHibernate project (www.fluentnhibernate.org) as another source of examples.
Literal Extensions
One of the more important points I want to make in this article is that there are many small tricks you can do to make your code read more cleanly and be more expressive. These small tricks can really add value to your coding efforts by making code easier to write correctly as it becomes more declarative and more intention-revealing.
More and more frequently I use extension methods on basic objects such as strings and numbers to reduce repetitiveness in the core .NET Framework APIs and to increase readability. This pattern of extending value objects is called “literal extensions.”
Let’s start with a simplistic example. My current project involves configurable rules for recurring and scheduled events. We initially attempted to create a small internal DSL for configuring these events (we are in the process of moving to an external DSL approach instead). These rules depend heavily on TimeSpan values for how often an event should recur, when it should start and when it expires. That might look like this snippet:
x.Schedule(schedule =>
{
// These two properties are TimeSpan objects
schedule.RepeatEvery = new TimeSpan(2, 0, 0);
schedule.ExpiresIn = new TimeSpan(100, 0, 0, 0);
});
In particular, pay attention to “new TimeSpan(2, 0, 0)” and “new TimeSpan(100, 0, 0, 0).” As an experienced .NET Framework developer you may parse those two pieces of code to mean “2 hours” and “100 days,” but you had to think about it, didn’t you? Instead, let’s make the TimeSpan definition more readable:
x.Schedule(schedule =>
{
// These two properties are TimeSpan objects
schedule.RepeatEvery = 2.Hours();
schedule.ExpiresIn = 100.Days();
});
All I did in the sample above was use some extension methods on the integer object that return TimeSpan objects:
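public static class DateTimeExtensions
{
// 100.Days() reads better than new TimeSpan(100, 0, 0, 0)
public static TimeSpan Days(this int number)
{
return new TimeSpan(number, 0, 0, 0);
}
// Backs the 2.Hours() call in the previous snippet
public static TimeSpan Hours(this int number)
{
return new TimeSpan(number, 0, 0);
}
public static TimeSpan Seconds(this int number)
{
return new TimeSpan(0, 0, number);
}
}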
In terms of implementation, switching from “new TimeSpan(2, 0, 0, 0)” to “2.Days()” isn’t that big of a change, but which one is easier to read? I know that when I’m translating business rules into code, I’d rather say two days than “a time span consisting of two days, zero hours and zero minutes.” The more readable version of the code is easier to scan for correctness, and that’s enough reason for me to use the literal extension version.
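The same trick carries over to other units, and the results compose with the ordinary TimeSpan operators. The following is only a minimal sketch: the TimeSpanLiterals class and its Weeks and Minutes methods are names assumed for illustration, not code from the project described above.

```csharp
using System;

// Hypothetical literal extensions, written in the same style as the
// Days and Seconds methods shown earlier.
public static class TimeSpanLiterals
{
    public static TimeSpan Weeks(this int number)
    {
        return new TimeSpan(number * 7, 0, 0, 0);
    }

    public static TimeSpan Minutes(this int number)
    {
        return new TimeSpan(0, number, 0);
    }
}

public static class LiteralExtensionDemo
{
    public static void Main()
    {
        // Literal extensions return plain TimeSpan values, so they can
        // be added together like any other TimeSpan
        TimeSpan interval = 1.Weeks() + 90.Minutes();

        // Same value as 7 days, 1 hour, 30 minutes; prints "True"
        Console.WriteLine(interval == new TimeSpan(7, 1, 30, 0));
    }
}
```

Because the extension methods return ordinary TimeSpan objects, nothing downstream has to know that the values were written with the more readable syntax.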
Semantic Model
When I build a new DSL I need to solve two problems. First, my team and I start by asking ourselves how we’d like to express the DSL in a logical and self-describing way that will make it easy to use. As much as possible, I try to do this without regard for how the actual functionality will be structured or built.
For example, the StructureMap Inversion of Control (IoC) container tool allows users to configure the container explicitly inside StructureMap’s “Registry DSL” like this:
var container = new Container(x =>
{
x.For<ISendEmailService>().HttpContextScoped()
.Use<SendEmailService>();
});
If you aren’t already familiar with the usage of an IoC container, all that code is doing is stating that when you ask the container at runtime for an object of type ISendEmailService, you will get an instance of the concrete type SendEmailService. The call to HttpContextScoped directs StructureMap to “scope” the ISendEmailService objects to a single HttpRequest, meaning that if the code is running inside ASP.NET, there will be a single unique instance of ISendEmailService for each individual HTTP request no matter how many times you request an ISendEmailService within a single HTTP request.
Once I have an idea for the desired syntax, I’m left with the crucial question of exactly how I’m going to connect the DSL syntax to code that implements the actual behavior. You could place the behavioral code directly into the DSL code such that runtime actions happen directly in Expression Builder objects, but I would strongly recommend against this in any but the most simplistic cases. The Expression Builder classes can be somewhat difficult to unit test, and debugging by stepping through a fluent interface is not conducive to either productivity or your sanity. You really want to put yourself in a position to be able to unit test (preferably), debug and troubleshoot the runtime behavioral elements of your DSL without having to step through all the code indirection in a typical fluent interface.
I need to build the runtime behavior and I need to craft a DSL that expresses the DSL user’s intent as cleanly as possible. In my experience, it has been extremely helpful to separate the runtime behavior into a “semantic model,” defined by Martin Fowler as “The domain model that’s populated by a DSL” (https://martinfowler.com/dsl.html).
The key point about the previous code snippet is that it doesn’t do any real work. All that little bit of DSL code does is configure the semantic model of the IoC container. You could bypass the fluent interface above and build the semantic model objects yourself like this:
var graph = new PluginGraph();
PluginFamily family = graph.FindFamily(typeof(ISendEmailService));
family.SetScopeTo(new HttpContextLifecycle());
Instance defaultInstance = new SmartInstance<SendEmailService>();
family.AddInstance(defaultInstance);
family.DefaultInstanceKey = defaultInstance.Name;
var container = new Container(graph);
The Registry DSL code and the code directly above are identical in runtime behavior. All the DSL does is create the object graph of the PluginGraph, PluginFamily, Instance and HttpContextLifecycle objects. So the question is, why bother with two separate models?
First of all, as a user I definitely want the DSL version of the two previous code samples because it’s far less code to write, more cleanly expresses my intent and doesn’t require the user to know very much about the internals of StructureMap. As the implementer of StructureMap, I need an easy way to build and test functionality in small units, and that’s relatively hard to do with a fluent interface by itself.
With the semantic model approach, I was able to build and unit test the behavioral classes quite easily. The DSL code itself becomes very simple because all it does is configure the semantic model.
This separation of DSL expression and semantic model has turned out to be very beneficial over time. You will frequently have to iterate somewhat with your DSL syntax to achieve more readable and writeable expressions based on feedback from usage. That iteration will go much more smoothly if you don’t have to worry quite so much about breaking runtime functionality at the same time you’re changing syntax.
On the other hand, by having the DSL as the official API for StructureMap, I’ve been able on several occasions to extend or restructure the internal semantic model without breaking the DSL syntax. This is just one more example of the benefits of the “Separation of Concerns” principle in software design.
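To make the separation concrete outside of StructureMap, here is a small sketch built around the scheduling example from earlier. ScheduleRule, ScheduleExpression and all of their members are hypothetical names chosen for illustration. The point is only that the behavior lives in the semantic model, where it can be tested directly, while the builder does nothing but populate it.

```csharp
using System;

// Semantic model: plain behavior that can be unit tested directly.
// ScheduleRule and its members are hypothetical names.
public class ScheduleRule
{
    public TimeSpan RepeatEvery { get; set; }
    public TimeSpan ExpiresIn { get; set; }

    // True when an event that started at 'start' may still fire at 'now'
    public bool IsActive(DateTime start, DateTime now)
    {
        return now >= start && now - start < ExpiresIn;
    }
}

// Expression builder: holds no behavior of its own and only
// populates the semantic model.
public class ScheduleExpression
{
    private readonly ScheduleRule _rule = new ScheduleRule();

    public ScheduleExpression Every(TimeSpan interval)
    {
        _rule.RepeatEvery = interval;
        return this;
    }

    public ScheduleExpression ExpiringAfter(TimeSpan lifetime)
    {
        _rule.ExpiresIn = lifetime;
        return this;
    }

    public ScheduleRule Build()
    {
        return _rule;
    }
}

public static class SemanticModelDemo
{
    public static void Main()
    {
        ScheduleRule rule = new ScheduleExpression()
            .Every(TimeSpan.FromHours(2))
            .ExpiringAfter(TimeSpan.FromDays(100))
            .Build();

        DateTime start = new DateTime(2010, 1, 1);
        Console.WriteLine(rule.IsActive(start, start.AddDays(99)));  // True
        Console.WriteLine(rule.IsActive(start, start.AddDays(100))); // False
    }
}
```

A unit test for IsActive can construct a ScheduleRule with property setters and never touch ScheduleExpression, which is what keeps syntax changes and behavior changes from interfering with each other.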
Fluent Interfaces and Expression Builders
A fluent interface is a style of API that uses method chaining to create a terse, readable syntax. I believe the most well-known example is probably the increasingly popular jQuery library for JavaScript development. jQuery users will quickly recognize code such as the following:
var link = $('<a></a>').attr("href", "#").appendTo(binDomElement);
$('<span></span>').html(binName).appendTo(link);
A fluent interface lets me “densify” code into a smaller window of text, potentially making the code easier to read. Also, it often helps me guide the user of my APIs to select the proper choices. The simplest and perhaps most common trick in making a fluent interface is to simply make an object return itself from method calls (this is largely how jQuery works).
I have a simple class I use in StoryTeller to generate HTML called “HtmlTag.” I can build up an HtmlTag object quickly with method chaining like this:
var tag = new HtmlTag("div").Text("my text").AddClass("collapsible");
Internally, the HtmlTag object is just returning itself from the calls to Text and AddClass:
public HtmlTag AddClass(string className)
{
if (!_cssClasses.Contains(className))
{
_cssClasses.Add(className);
}
return this;
}
public HtmlTag Text(string text)
{
_innerText = text;
return this;
}
In a more complicated scenario you may separate the fluent interface into two parts: the semantic model that supplies the runtime behavior (described in the previous section) and a series of “Expression Builder” classes that implement the DSL grammars.
I use an example of this pattern in the StoryTeller user interface for defining keyboard shortcuts and dynamic menus. I wanted a quick programmatic way to define a keyboard shortcut for an action in the user interface. Also, because most of us can’t remember every keyboard shortcut for each application we use, I wanted to create a single menu in the UI that exposed all the available shortcuts and the keyboard combinations to run them. Also, as screens are activated in the main tab area of the StoryTeller UI, I wanted to add dynamic menu strip buttons to the UI that were specific to the active screen.
I certainly could have just coded this the idiomatic Windows Presentation Foundation (WPF) way, but this would have meant editing a couple different areas of XAML markup for keyboard gestures, commands, the menu strip objects for each screen and menu items–and then making sure that these were all correctly tied together. Instead, I wanted to make this registration of new shortcuts and menu items as declarative as possible, and I wanted to reduce the surface area of the code to a single point. I of course made a fluent interface that configured all the disparate WPF objects for me behind the scenes.
In usage, I can specify a global shortcut to open the “Execution Queue” screen with the following code:
// Open the "Execution Queue" screen with the
// CTRL - Q shortcut
Action("Open the Test Queue")
.Bind(ModifierKeys.Control, Key.Q)
.ToScreen<QueuePresenter>();
In the screen activation code for an individual screen, I can define temporary keyboard shortcuts and the dynamic menu options in the main application shell with code like this:
screenObjects.Action("Run").Bind(ModifierKeys.Control, Key.D1)
.To(_presenter.RunCommand).Icon = Icon.Run;
screenObjects.Action("Cancel").Bind(ModifierKeys.Control, Key.D2)
.To(_presenter.CancelCommand).Icon = Icon.Stop;
screenObjects.Action("Save").Bind(ModifierKeys.Control, Key.S)
.To(_presenter.SaveCommand).Icon = Icon.Save;
Now, let’s take a look at the implementation of this fluent interface. Underlying it is a semantic model class called ScreenAction that does the actual work of building all the constituent WPF objects. Its public surface is described by the IScreenAction interface:
public interface IScreenAction
{
bool IsPermanent { get; set; }
InputBinding Binding { get; set; }
string Name { get; set; }
Icon Icon { get; set; }
ICommand Command { get; }
bool ShortcutOnly { get; set; }
void BuildButton(ICommandBar bar);
}
This is an important detail. I can build and test the ScreenAction object independently of the fluent interface, and now the fluent interface merely has to configure ScreenAction objects. The actual DSL is implemented on a class called ScreenObjectRegistry that tracks the list of active ScreenAction objects (see Figure 1).
Figure 1 DSL Is Implemented on the ScreenObjectRegistry Class
public class ScreenObjectRegistry : IScreenObjectRegistry
{
private readonly List<ScreenAction> _actions =
new List<ScreenAction>();
private readonly IContainer _container;
private readonly ArrayList _explorerObjects = new ArrayList();
private readonly IApplicationShell _shell;
private readonly Window _window;
public IEnumerable<ScreenAction> Actions {
get { return _actions; } }
public IActionExpression Action(string name)
{
return new BindingExpression(name, this);
}
// Lots of other methods that are not shown here
}
The registration of a new screen action begins with the call to the Action(name) method above and returns a new instance of the BindingExpression class that acts as an Expression Builder to configure the new ScreenAction object, partially shown in Figure 2.
Figure 2 BindingExpression Class Acting as Expression Builder
public class BindingExpression : IBindingExpression, IActionExpression
{
private readonly ScreenObjectRegistry _registry;
private readonly ScreenAction _screenAction = new ScreenAction();
private KeyGesture _gesture;
public BindingExpression(string name, ScreenObjectRegistry registry)
{
_screenAction.Name = name;
_registry = registry;
}
public IBindingExpression Bind(Key key)
{
_gesture = new KeyGesture(key);
return this;
}
public IBindingExpression Bind(ModifierKeys modifiers, Key key)
{
_gesture = new KeyGesture(key, modifiers);
return this;
}
// registers an ICommand that will launch the dialog T
public ScreenAction ToDialog<T>()
{
return buildAction(() => _registry.CommandForDialog<T>());
}
// registers an ICommand that would open the screen T in the
// main tab area of the UI
public ScreenAction ToScreen<T>() where T : IScreen
{
return buildAction(() => _registry.CommandForScreen<T>());
}
public ScreenAction To(ICommand command)
{
return buildAction(() => command);
}
// Merely configures the underlying ScreenAction
private ScreenAction buildAction(Func<ICommand> value)
{
ICommand command = value();
_screenAction.Binding = new KeyBinding(command, _gesture);
_registry.register(_screenAction);
return _screenAction;
}
public BindingExpression Icon(Icon icon)
{
_screenAction.Icon = icon;
return this;
}
}
One of the important factors in many fluent interfaces is trying to guide the user of the API into doing things in a certain order. In the case in Figure 2, I use interfaces on BindingExpression strictly to control the user choices in IntelliSense, even though I am always returning the same BindingExpression object throughout. Think about this. Users of this fluent interface should only specify the action name and the keyboard shortcut keys once. After that, the user shouldn’t have to see those methods in IntelliSense. The DSL expression starts with the call to ScreenObjectRegistry.Action(name), which captures the descriptive name of the shortcut that will appear in menus and returns a new BindingExpression object as this interface:
public interface IActionExpression
{
IBindingExpression Bind(Key key);
IBindingExpression Bind(ModifierKeys modifiers, Key key);
}
By casting BindingExpression to IActionExpression, the only choice the user has is to specify the key combinations for the shortcut, which will return the same BindingExpression object, but cast to the IBindingExpression interface that only allows users to specify a single action:
// The last step that captures the actual
// "action" of the ScreenAction
public interface IBindingExpression
{
ScreenAction ToDialog<T>();
ScreenAction ToScreen<T>() where T : IScreen;
ScreenAction PublishEvent<T>() where T : new();
ScreenAction To(Action action);
ScreenAction To(ICommand command);
}
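Stripped of the WPF details, the technique of returning one builder object through a different interface at each step can be sketched as follows. Every type here (IKeyStep, ITargetStep, Shortcut, ShortcutExpression) is a hypothetical stand-in invented for this sketch, not part of StoryTeller.

```csharp
using System;

// Step 1: after naming the shortcut, only a key can be chosen
public interface IKeyStep
{
    ITargetStep Bind(char key);
}

// Step 2: after choosing the key, only a target can be chosen
public interface ITargetStep
{
    Shortcut To(Action action);
}

// Hypothetical stand-in for a semantic model object
public class Shortcut
{
    public string Name { get; set; }
    public char Key { get; set; }
    public Action Run { get; set; }
}

// One builder object implements every step; callers only ever see it
// through the interface for the current step.
public class ShortcutExpression : IKeyStep, ITargetStep
{
    private readonly Shortcut _shortcut = new Shortcut();

    private ShortcutExpression(string name)
    {
        _shortcut.Name = name;
    }

    public static IKeyStep Named(string name)
    {
        return new ShortcutExpression(name);
    }

    public ITargetStep Bind(char key)
    {
        _shortcut.Key = key;
        return this;
    }

    public Shortcut To(Action action)
    {
        _shortcut.Run = action;
        return _shortcut;
    }
}

public static class StepDemo
{
    public static void Main()
    {
        Shortcut save = ShortcutExpression.Named("Save")
            .Bind('S')
            .To(() => Console.WriteLine("saving"));

        save.Run(); // prints "saving"

        // ShortcutExpression.Named("Save").To(...) would not compile,
        // because IKeyStep exposes only Bind
    }
}
```

The compiler, and therefore IntelliSense, enforces the order of the steps even though only one builder instance is ever created.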
Object Initializers
Now that we’ve introduced method chaining as the mainstay of internal DSL development in C#, let’s start looking at the alternative patterns that can often lead to simpler mechanics for the DSL developer. The first alternative is simply to use the object initializer syntax introduced in C# 3.0, which shipped with the Microsoft .NET Framework 3.5.
I can still remember my very first foray into fluent interfaces. I worked on a system that acted as a message broker between law firms submitting legal invoices electronically and their customers. One of the common use cases for us was to send messages to the customers on behalf of the law firms. To send the messages we invoked an interface like this:
public interface IMessageSender
{
void SendMessage(string text, string sender, string receiver);
}
That’s a very simple API; just pass in three string arguments and it’s good to go. The problem in usage is which argument goes where. Yes, tools such as ReSharper can show you which parameter you’re specifying at any one time, but how about scanning the calls to SendMessage when you’re just reading code? Look at the usage of the following code sample and you’ll understand exactly what I mean about errors from transposing the order of the string arguments:
// Snippet from a class that uses IMessageSender
public void SendMessage(IMessageSender sender)
{
// Is this right?
sender.SendMessage("the message body", "PARTNER001", "PARTNER002");
// or this?
sender.SendMessage("PARTNER001", "the message body", "PARTNER002");
// or this?
sender.SendMessage("PARTNER001", "PARTNER002", "the message body");
}
At the time, I solved the API usability issue by moving to a fluent interface approach that more clearly indicated which argument was which:
public void SendMessageFluently(FluentMessageSender sender)
{
sender
.SendText("the message body")
.From("PARTNER001").To("PARTNER002");
}
I genuinely believed this made for a more usable, less error-prone API, but let’s look at what the underlying implementation of the expression builders might look like in Figure 3.
Figure 3 Implementation of an Expression Builder
public class FluentMessageSender
{
private readonly IMessageSender _messageSender;
public FluentMessageSender(IMessageSender sender)
{
_messageSender = sender;
}
public SendExpression SendText(string text)
{
return new SendExpression(text, _messageSender);
}
public class SendExpression : ToExpression
{
private readonly string _text;
private readonly IMessageSender _messageSender;
private string _sender;
public SendExpression(string text, IMessageSender messageSender)
{
_text = text;
_messageSender = messageSender;
}
public ToExpression From(string sender)
{
_sender = sender;
return this;
}
void ToExpression.To(string receiver)
{
_messageSender.SendMessage(_text, _sender, receiver);
}
}
public interface ToExpression
{
void To(string receiver);
}
}
That’s a lot more code to create the API than was originally required. Fortunately, now we have another alternative with object initializers (or with named arguments, available in C# 4.0 and in VB.NET). Let’s make another version of the message sender that takes in a single object as its parameter:
public class SendMessageRequest
{
public string Text { get; set; }
public string Sender { get; set; }
public string Receiver { get; set; }
}
public class ParameterObjectMessageSender
{
public void Send(SendMessageRequest request)
{
// send the message
}
}
Now, the API usage with an object initializer is:
public void SendMessageAsParameter(ParameterObjectMessageSender sender)
{
sender.Send(new SendMessageRequest()
{
Text = "the message body",
Sender = "PARTNER001",
Receiver = "PARTNER002"
});
}
Arguably, this third incarnation of the API reduces errors in usage with much simpler mechanics than the fluent interface version.
The point here is that fluent interfaces are not the only pattern for creating more readable APIs in the .NET Framework. This approach is much more common in JavaScript, where you can use JavaScript Object Notation (JSON) to completely specify objects in one line of code, and in Ruby, where it is idiomatic to use name/value hashes as arguments to methods.
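The named-argument alternative mentioned above needs no extra types at all. The sketch below reuses the IMessageSender signature from earlier in this section; RecordingMessageSender is a hypothetical implementation added only so the example can run on its own, and the code assumes C# 4.0 or later.

```csharp
using System;

public interface IMessageSender
{
    void SendMessage(string text, string sender, string receiver);
}

// Hypothetical implementation that remembers the last message so the
// effect of the call can be inspected
public class RecordingMessageSender : IMessageSender
{
    public string LastMessage { get; private set; }

    public void SendMessage(string text, string sender, string receiver)
    {
        LastMessage = string.Format("{0} -> {1}: {2}", sender, receiver, text);
    }
}

public static class NamedArgumentDemo
{
    public static void Main()
    {
        var sender = new RecordingMessageSender();

        // Named arguments label each value at the call site and may
        // appear in any order
        sender.SendMessage(
            receiver: "PARTNER002",
            sender: "PARTNER001",
            text: "the message body");

        // prints "PARTNER001 -> PARTNER002: the message body"
        Console.WriteLine(sender.LastMessage);
    }
}
```

As with the object initializer version, a reader scanning the call can see which string plays which role without consulting the method signature.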
Nested Closure
I think that many people assume that fluent interfaces and method chaining are the only possibilities for building DSLs inside C#. I used to believe that too, but I’ve since found other techniques and patterns that are frequently much easier to implement than method chaining. An increasingly popular pattern is the nested closure pattern:
Express statement sub-elements of a function call by putting them into a closure in an argument.
More and more .NET Web development projects are being done with the Model-View-Controller pattern. One of the side effects of this shift is much more need to generate snippets of HTML in code for input elements. Straight-up string manipulation to generate the HTML can get ugly fast. You end up repeating a lot of calls to “sanitize” the HTML to avoid injection attacks, and in many cases we may want to allow multiple classes or methods to have some say in the final HTML representation. I want to express HTML creation by just saying “I want a div tag with this text and this class.” To ease this HTML generation, we model HTML with an “HtmlTag” object that looks something like this in usage
var tag = new HtmlTag("div").Text("my text").AddClass("collapsible");
Debug.WriteLine(tag.ToString());
which generates the following HTML:
<div class="collapsible">my text</div>
The core of this HTML generation model is the HtmlTag object that has methods to programmatically build up an HTML element structure like this:
public interface IHtmlTag
{
HtmlTag Attr(string key, object value);
HtmlTag Add(string tag);
HtmlTag AddStyle(string style);
HtmlTag Text(string text);
HtmlTag SetStyle(string className);
HtmlTag Add(string tag, Action<HtmlTag> action);
}
This model also allows us to add nested HTML tags like this:
[Test]
public void render_multiple_levels_of_nesting()
{
var tag = new HtmlTag("table");
tag.Add("tbody/tr/td").Text("some text");
tag.ToCompacted().ShouldEqual(
"<table><tbody><tr><td>some text</td></tr></tbody></table>"
);
}
In real usage, I frequently find myself wanting to add a fully configured child tag in one step. As I mentioned, I have an open source project called StoryTeller that my team is using to express acceptance tests. Part of the functionality of StoryTeller is to run all of the acceptance tests in our continuous integration build and generate a report of the test results. The test result summary is expressed as a simple table with three columns. The summary table HTML looks like this:
<table>
<thead>
<tr>
<th>Test</th>
<th>Lifecycle</th>
<th>Result</th>
</tr>
</thead>
<tbody>
<!-- rows for each individual test -->
</tbody>
</table>
Using the HtmlTag model I described above, I generate the header structure of the results table with this code:
// _table is an HtmlTag object
// The Add() method accepts a nested closure argument
_table.Add("thead/tr", x =>
{
x.Add("th").Text("Test");
x.Add("th").Text("Lifecycle");
x.Add("th").Text("Result");
});
In the call to _table.Add I pass in a lambda function that completely specifies how to generate the first header row. Using the nested closure pattern allows me to pass in the specification without first having to create another variable for the “tr” tag. You might not like this syntax at first glance, but it makes the code terser. Internally, the Add method that uses the nested closure is simply this:
public HtmlTag Add(string tag, Action<HtmlTag> action)
{
// Creates and adds the new HtmlTag with
// the supplied tagName
var element = Add(tag);
// Uses the nested closure passed into this
// method to configure the new child HtmlTag
action(element);
// returns that child
return element;
}
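For readers who want to experiment with the pattern in isolation, here is a stripped-down sketch. MiniTag is a hypothetical stand-in for HtmlTag: it accepts only single tag names (no "thead/tr" paths), has no attributes or CSS classes, and performs no HTML encoding.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// A deliberately small, hypothetical stand-in for the HtmlTag class
public class MiniTag
{
    private readonly string _name;
    private readonly List<MiniTag> _children = new List<MiniTag>();
    private string _text = string.Empty;

    public MiniTag(string name)
    {
        _name = name;
    }

    public MiniTag Text(string text)
    {
        _text = text;
        return this;
    }

    // Plain overload: create the child and return it
    public MiniTag Add(string name)
    {
        var child = new MiniTag(name);
        _children.Add(child);
        return child;
    }

    // Nested closure overload: the caller configures the new child
    // inside the lambda
    public MiniTag Add(string name, Action<MiniTag> configure)
    {
        MiniTag child = Add(name);
        configure(child);
        return child;
    }

    public override string ToString()
    {
        var html = new StringBuilder();
        html.Append("<").Append(_name).Append(">").Append(_text);
        foreach (MiniTag child in _children)
        {
            html.Append(child.ToString());
        }
        html.Append("</").Append(_name).Append(">");
        return html.ToString();
    }
}

public static class NestedClosureDemo
{
    public static void Main()
    {
        var table = new MiniTag("table");
        table.Add("thead", head =>
        {
            head.Add("tr", row =>
            {
                row.Add("th").Text("Test");
                row.Add("th").Text("Result");
            });
        });

        // <table><thead><tr><th>Test</th><th>Result</th></tr></thead></table>
        Console.WriteLine(table);
    }
}
```

The indentation of the lambdas mirrors the nesting of the generated markup, which is much of the appeal of the pattern for hierarchical structures.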
For another example, the main StructureMap Container class is initialized by passing in a nested closure that represents all of the desired configuration for the container like this:
IContainer container = new Container(r =>
{
r.For<Processor>().Use<Processor>()
.WithCtorArg("name").EqualTo("Jeremy")
.TheArrayOf<IHandler>().Contains(x =>
{
x.OfConcreteType<Handler1>();
x.OfConcreteType<Handler2>();
x.OfConcreteType<Handler3>();
});
});
The signature and body of this constructor function are:
public Container(Action<ConfigurationExpression> action)
{
var expression = new ConfigurationExpression();
action(expression);
// As explained earlier in the article,
// PluginGraph is part of the Semantic Model
// of StructureMap
PluginGraph graph = expression.BuildGraph();
// Take the PluginGraph object graph and
// dynamically emit classes to build the
// configured objects
construct(graph);
}
I used the nested closure pattern in this case for a couple of reasons. The first is that the StructureMap container works by taking the complete configuration in one step, then using Reflection.Emit to dynamically generate “builder” objects before the container can be used. Taking the configuration in through a nested closure allows me to capture the entire configuration at one time and quietly do the emitting right before the container is made available for use. The other reason is to segregate the methods for registering types with the container at configuration time away from the methods that you would use at runtime to retrieve services (this is an example of the Interface Segregation Principle, the “I” in S.O.L.I.D.).
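Both motivations can be seen in a toy container. This is a sketch under stated assumptions, not how StructureMap is implemented: MiniRegistry, MiniContainer, IGreeter and EnglishGreeter are hypothetical names, and simple factory delegates stand in for the Reflection.Emit step.

```csharp
using System;
using System.Collections.Generic;

// Configuration-time API: only registration methods are visible
public class MiniRegistry
{
    internal readonly Dictionary<Type, Func<object>> Builders =
        new Dictionary<Type, Func<object>>();

    public void For<TService>(Func<TService> factory) where TService : class
    {
        Builders[typeof(TService)] = () => factory();
    }
}

// Runtime API: only retrieval methods are visible
public class MiniContainer
{
    private readonly Dictionary<Type, Func<object>> _builders;

    public MiniContainer(Action<MiniRegistry> configure)
    {
        var registry = new MiniRegistry();
        configure(registry);

        // The entire configuration is known at this point, so any
        // expensive one-time preparation could happen here. This
        // sketch simply copies the registrations.
        _builders = new Dictionary<Type, Func<object>>(registry.Builders);
    }

    public TService GetInstance<TService>() where TService : class
    {
        return (TService)_builders[typeof(TService)]();
    }
}

public interface IGreeter
{
    string Greet();
}

public class EnglishGreeter : IGreeter
{
    public string Greet()
    {
        return "hello";
    }
}

public static class ContainerDemo
{
    public static void Main()
    {
        var container = new MiniContainer(r =>
        {
            r.For<IGreeter>(() => new EnglishGreeter());
        });

        Console.WriteLine(container.GetInstance<IGreeter>().Greet()); // hello
    }
}
```

Code that holds a MiniContainer cannot register anything, and code inside the closure cannot resolve anything, which is the segregation of configuration-time and runtime methods described above.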
I have included the nested closure pattern in this article because it’s becoming quite prevalent in .NET Framework open source projects such as Rhino Mocks, Fluent NHibernate and many IoC tools. Also, I have frequently found the nested closure pattern to be significantly easier to implement than using only method chaining. The downside is that many developers are still uncomfortable with lambda expressions. Furthermore, this technique is barely usable in VB.NET versions before Visual Basic 2010 because they don’t support multiline lambda expressions.
IronRuby and Boo
All of my samples in this article are written in C# for mainstream appeal, but if you’re interested in doing DSL development you may want to look at using other CLR languages. In particular, IronRuby is exceptional for creating internal DSLs because of its flexible and relatively clutter-free syntax (optional parentheses, no semicolons and very terse). Stepping farther afield, the Boo language is also popular for DSL development in the CLR.
The design pattern names and definitions are taken from the online draft of Martin Fowler’s forthcoming book on Domain Specific Languages at https://martinfowler.com/dsl.html.
Jeremy Miller, a Microsoft MVP for C#, is also the author of the open source StructureMap (structuremap.sourceforge.net) tool for Dependency Injection with .NET and the forthcoming StoryTeller (storyteller.tigris.org) tool for supercharged FIT testing in .NET. Visit his blog, The Shade Tree Developer, at https://jeremydmiller.com/.
Thanks to the following technical expert for reviewing this article: Glenn Block