January 2011

Volume 26 Number 01

The Working Programmer - Multiparadigmatic .NET, Part 5: Automatic Metaprogramming

By Ted Neward | January 2011

In last month’s piece, objects came under the microscope, and in particular we looked at the “axis” of commonality/variability analysis that inheritance offers us. While inheritance isn’t the only form of commonality/variability available within a modern object-oriented (OO) language such as C# or Visual Basic, it certainly stands at the center of the OO paradigm. And, as also discussed, it doesn’t always provide the best solution to all problems.

To recap, what we’ve discovered so far is that C# and Visual Basic are both procedural and OO languages—but clearly the story doesn’t stop there. Both are also metaprogrammatic languages—each offers the Microsoft .NET Framework developer the opportunity to build programs out of programs, in a variety of different ways: automatic, reflective and generative.

Automatic Metaprogramming

The core idea behind metaprogramming is simple: the traditional constructs of procedural or OO programming haven’t solved quite all of our software design issues, at least not in a way that we find satisfying. For example, to cite a basic flaw, developers frequently found a need for a data structure that maintained an ordered list of some particular type, such that we could insert items into the list in a particular slot and see the items in that exact order. For performance reasons, the list sometimes needed to be a linked list of nodes. In other words, we wanted an ordered linked list, but strongly typed to the type being stored within it.

Developers who came to the .NET Framework from the C++ world know one solution to this problem: parameterized types, also known more informally as generics. But, as developers who came to .NET through Java’s early days know, another solution emerged long before generics (which did, eventually, make it into the Java platform). That solution was to simply write each needed list implementation as necessary, as shown in Figure 1.

Figure 1 An Example of Writing List Implementations as Necessary

Class ListOfInt32
  Class Node
    Public Sub New(ByVal dt As Int32)
      data = dt
    End Sub
    Public data As Int32
    Public nextNode As Node = Nothing
  End Class
  Private head As Node = Nothing
  Public Sub Insert(ByVal newParam As Int32)
    If IsNothing(head) Then
      head = New Node(newParam)
    Else
      ' Walk to the last node and append the new value there
      Dim current As Node = head
      While (Not IsNothing(current.nextNode))
        current = current.nextNode
      End While
      current.nextNode = New Node(newParam)
    End If
  End Sub
  Public Function Retrieve(ByVal index As Int32) As Int32
    Dim current As Node = head
    Dim counter = 0
    ' Stop when the list runs out so that a bad index is caught below
    While (Not IsNothing(current) AndAlso counter < index)
      current = current.nextNode
      counter = counter + 1
    End While
    If (index < 0 OrElse IsNothing(current)) Then
      Throw New Exception("Bad index")
    Else
      Retrieve = current.data
    End If
  End Function
End Class

Now, obviously, this fails the Don’t Repeat Yourself (DRY) test—every time the design calls for a new list of this kind, it will need to be written “by hand,” which will clearly be a problem as time progresses. Although not complicated, it’s still going to be awkward and time-consuming to write each of these, particularly if more features become necessary or desirable.

Of course, nobody ever said developers had to be the ones writing such code. Which brings us around neatly to the solution of code generation, or as it’s sometimes called, automatic metaprogramming. Another program can easily do it, such as a program designed to kick out classes that are customized to each type needed, as shown in Figure 2.

Figure 2 An Example of Automatic Metaprogramming

Imports System
Imports System.IO
Imports Microsoft.VisualBasic

Module VBAuto
  Sub Main(ByVal args As String())
    Dim CRLF As String = Chr(13).ToString + Chr(10).ToString()
    ' {0} marks every place where the name of the stored type is substituted
    Dim template As String =
     "Class ListOf{0}" + CRLF +
     "  Class Node" + CRLF +
     "    Public Sub New(ByVal dt As {0})" + CRLF +
     "      data = dt" + CRLF +
     "    End Sub" + CRLF +
     "    Public data As {0}" + CRLF +
     "    Public nextNode As Node = Nothing" + CRLF +
     "  End Class" + CRLF +
     "  Private head As Node = Nothing" + CRLF +
     "  Public Sub Insert(ByVal newParam As {0})" + CRLF +
     "    If IsNothing(head) Then" + CRLF +
     "      head = New Node(newParam)" + CRLF +
     "    Else" + CRLF +
     "      Dim current As Node = head" + CRLF +
     "      While (Not IsNothing(current.nextNode))" + CRLF +
     "        current = current.nextNode" + CRLF +
     "      End While" + CRLF +
     "      current.nextNode = New Node(newParam)" + CRLF +
     "    End If" + CRLF +
     "  End Sub" + CRLF +
     "  Public Function Retrieve(ByVal index As Int32) As {0}" + CRLF +
     "    Dim current As Node = head" + CRLF +
     "    Dim counter = 0" + CRLF +
     "    While (Not IsNothing(current) AndAlso counter < index)" + CRLF +
     "      current = current.nextNode" + CRLF +
     "      counter = counter + 1" + CRLF +
     "    End While" + CRLF +
     "    If (index < 0 OrElse IsNothing(current)) Then" + CRLF +
     "      Throw New Exception()" + CRLF +
     "    Else" + CRLF +
     "      Retrieve = current.data" + CRLF +
     "    End If" + CRLF +
     "  End Function" + CRLF +
     "End Class"
    If args.Length = 0 Then
      Console.WriteLine("Usage: VBAuto <listType>")
      Console.WriteLine("   where <listType> is a fully-qualified CLR typename")
    Else
      Console.WriteLine("Producing ListOf" + args(0))
      ' The original statement is truncated here; resolving the name with
      ' Type.GetType is an assumption (it returns Nothing for unknown names)
      Dim outType As System.Type = Type.GetType(args(0))
      If outType Is Nothing Then
        Console.WriteLine("Unknown type: " + args(0))
        Return
      End If
      ' The FileStream arguments are truncated too; FileMode.Create is assumed
      Using out As New StreamWriter(
          New FileStream("ListOf" + outType.Name + ".vb", FileMode.Create))
        out.WriteLine(template, outType.Name)
      End Using
    End If
  End Sub
End Module

Then, once the class in question has been created, it needs only to be compiled and either added to the project or else compiled into its own assembly for reuse as a binary.

Of course, the language being generated doesn’t have to be the language in which the code generator is written—in fact, it will often help immensely if it isn’t, because then it will be easier to keep the two more clearly distinct in the developer’s head during debugging.
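As a small illustration of a generator and its output living in different languages, here is a sketch of a Visual Basic generator that emits C# source. The EmitNode function and the NodeOf class name are hypothetical, invented only for this example. Note that the literal C# braces have to be doubled so that String.Format does not mistake them for placeholders.

```vb
Imports System
Imports Microsoft.VisualBasic

Module CSharpEmitter
  ' Hypothetical helper: returns C# source for a node class typed to typeName.
  ' "{{" and "}}" come out as single braces; "{0}" becomes the type name.
  Function EmitNode(ByVal typeName As String) As String
    Dim template As String =
      "using System;" + vbCrLf +
      "class NodeOf{0}" + vbCrLf +
      "{{" + vbCrLf +
      "    public {0} Data;" + vbCrLf +
      "    public NodeOf{0} Next;" + vbCrLf +
      "}}"
    Return String.Format(template, typeName)
  End Function

  Sub Main()
    ' Prints the generated C# source to the console
    Console.WriteLine(EmitNode("Int32"))
  End Sub
End Module
```

With the generator in Visual Basic and the output in C#, a stray brace or semicolon is immediately recognizable as part of the template rather than part of the generator.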

Commonality, Variability and Pros and Cons

In the commonality/variability analysis, automatic metaprogramming occupies an interesting place. In the Figure 2 example, it places structure and behavior (the outline of the class above) into commonality, allowing for variability along data/type lines, that of the type being stored in the generated class. Clearly, we can swap any desired type into the ListOf type.
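To make that split concrete, the following sketch runs one fixed outline over several type names. The outline is deliberately abbreviated to two members, and the EmitLists function is a hypothetical name used only here; the point is that the outline never changes while the type name does.

```vb
Imports System
Imports System.Collections.Generic
Imports Microsoft.VisualBasic

Module VariabilityDemo
  ' The commonality: one fixed outline shared by every generated class
  ' (abbreviated for illustration)
  Private ReadOnly Outline As String =
    "Class ListOf{0}" + vbCrLf +
    "  Public data As {0}" + vbCrLf +
    "End Class"

  ' The variability: only the stored type differs from one class to the next
  Function EmitLists(ByVal typeNames As IEnumerable(Of String)) As List(Of String)
    Dim sources As New List(Of String)
    For Each typeName As String In typeNames
      sources.Add(String.Format(Outline, typeName))
    Next
    Return sources
  End Function

  Sub Main()
    For Each source As String In EmitLists(New String() {"Int32", "String", "DateTime"})
      Console.WriteLine(source)
    Next
  End Sub
End Module
```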

But automatic metaprogramming can reverse that, too, if necessary. Using a rich templating language, such as the Text Template Transformation Toolkit (T4) that ships with Visual Studio, the code generation templates can do source-time decision making, which then allows the template to provide commonality along data/structural lines, and vary by structural and behavioral lines. In fact, if the code template is sufficiently complex (and this isn’t necessarily a good angle to pursue), it’s even possible to eliminate commonality altogether and vary everything (data, structure, behavior and so on). Doing so typically becomes unmanageable quite quickly, however, and in general should be avoided.

That leads to one of the key realizations about automatic metaprogramming: Because it lacks any sort of inherent structural restrictions, you must choose the commonality and variability explicitly, lest the source code template grow out of control trying to be too flexible. For example, in the ListOf example in Figure 2, the commonality is in the structure and behavior, and the variability is in the data type being stored. Any attempt to introduce variability in the structure or behavior should be considered a red flag and a potential slippery slope to chaos.
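Source-time decision making does not require T4; a plain generator can do it too. The sketch below is one hypothetical way it might look, with the EmitList function and its withCount flag invented for illustration. The flag changes the structure of the emitted class, which is exactly the kind of variability that deserves a second look before it is allowed into a template.

```vb
Imports System
Imports System.Text

Module ConditionalEmitter
  ' Hypothetical generator: typeName varies the data, withCount varies the structure
  Function EmitList(ByVal typeName As String, ByVal withCount As Boolean) As String
    Dim source As New StringBuilder()
    source.AppendLine("Class ListOf" + typeName)
    source.AppendLine("  Private head As Node = Nothing")
    If withCount Then
      ' Decided while generating; the emitted class carries no trace of the choice
      source.AppendLine("  Private count As Int32 = 0")
    End If
    source.AppendLine("  ' Node, Insert and Retrieve would follow, as in Figure 2")
    source.AppendLine("End Class")
    Return source.ToString()
  End Function

  Sub Main()
    Console.WriteLine(EmitList("Int32", True))
    Console.WriteLine(EmitList("String", False))
  End Sub
End Module
```

Every such flag doubles the number of distinct classes the template can produce, which is how a template grows out of control.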

Obviously, code generation carries with it some significant risks, particularly around areas of maintenance: Should a bug be discovered (such as the concurrency one in the ListOf… example in Figure 2), fixing it isn’t a simple matter. The template can obviously be fixed, but that doesn’t do anything for the code already generated—each of those source artifacts needs to be regenerated, in turn, and this is something that’s hard to track and ensure automatically. And, what’s more, any handmade changes to those generated files will be intrinsically lost, unless the template-generated code has allowed for customizations. This risk of overwrite can be mitigated by use of partial classes, allowing developers to fill in the “other half” (or not) of the class being generated, and by extension methods, giving developers an opportunity to “add” methods to an existing family of types without having to edit the types. But partial classes must be in place from the beginning within the templates, and extension methods carry some restrictions that prevent them from replacing existing behavior—again leaving neither approach as a good mechanism to carry negative variability.
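The two mitigations can be sketched as follows, under the assumption that the template emits Partial from the start. The IsEmpty property and the First extension method are hypothetical additions made up for this example; in practice the three pieces would live in separate files so that regeneration overwrites only the first.

```vb
Imports System
Imports System.Runtime.CompilerServices

' Generated half: emitted by the template and overwritten on regeneration
Partial Public Class ListOfInt32
  Private Class Node
    Public data As Int32
    Public nextNode As Node = Nothing
    Public Sub New(ByVal dt As Int32)
      data = dt
    End Sub
  End Class
  Private head As Node = Nothing
  Public Sub Insert(ByVal newParam As Int32)
    If head Is Nothing Then
      head = New Node(newParam)
    Else
      Dim current As Node = head
      While current.nextNode IsNot Nothing
        current = current.nextNode
      End While
      current.nextNode = New Node(newParam)
    End If
  End Sub
  Public Function Retrieve(ByVal index As Int32) As Int32
    Dim current As Node = head
    Dim counter As Int32 = 0
    While current IsNot Nothing AndAlso counter < index
      current = current.nextNode
      counter = counter + 1
    End While
    If index < 0 OrElse current Is Nothing Then
      Throw New ArgumentOutOfRangeException("index")
    End If
    Return current.data
  End Function
End Class

' Handwritten half: may use private members and survives regeneration
Partial Public Class ListOfInt32
  Public ReadOnly Property IsEmpty() As Boolean
    Get
      Return head Is Nothing
    End Get
  End Property
End Class

' Extension method: sees only the public surface and cannot replace Retrieve
Public Module ListOfInt32Extensions
  <Extension()>
  Public Function First(ByVal list As ListOfInt32) As Int32
    Return list.Retrieve(0)
  End Function
End Module

Module Demo
  Sub Main()
    Dim numbers As New ListOfInt32()
    Console.WriteLine(numbers.IsEmpty)
    numbers.Insert(42)
    Console.WriteLine(numbers.First())
  End Sub
End Module
```

Both additions only add members. Neither can remove or change what the generated half already defines, which is why they do not help with negative variability.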

Code Generation

Code generation (or automatic metaprogramming) is a technique that’s been a part of programming for many years, ranging all the way from the C preprocessor macro through to the T4 engine, and will likely continue to carry forward, owing to the conceptual simplicity of the idea. However, its principal flaws are the lack of compiler structure and checking during the expansion process (unless, of course, that checking is done by the code generator itself, a task that’s harder than it sounds) and the inability to capture negative variability in any meaningful way. The .NET Framework offers some mechanisms to make code generation easier (in many cases, those mechanisms were introduced to save other Microsoft developers some grief), but they won’t eliminate all the potential pitfalls in code generation, not by a long shot.

And yet, automatic metaprogramming remains one of the more widely used forms of metaprogramming. C and C++ have a macro preprocessor. (Using macros to create “tiny templates” was common before C++ got templates.) C# keeps preprocessor directives such as #if and #define, though without macro expansion. On top of that, using metaprogramming as part of a larger framework or library is common, particularly for inter-process communication scenarios (such as the client and server stubs generated by Windows Communication Foundation). Other toolkits use automatic metaprogramming to provide “scaffolding” to ease the early stages of an application (such as what we see in ASP.NET MVC). In fact, arguably every Visual Studio project begins with automatic metaprogramming, in the form of the “project templates” and “item templates” that most of us use to create new projects or add files to projects. And so on. Like so many other things in computer science, automatic metaprogramming remains a useful and handy tool to have in the designer toolbox, despite its obvious flaws and pitfalls. Fortunately, it’s far from the only meta-tool in the programmer’s toolbox.

Ted Neward is a principal with Neward & Associates, an independent firm specializing in enterprise .NET Framework and Java platform systems. He’s written more than 100 articles, is a C# MVP and INETA speaker, and has authored and coauthored a dozen books, including “Professional F# 2.0” (Wrox, 2010). He also consults and mentors regularly. Reach him at ted@tedneward.com and read his blog at blogs.tedneward.com.