October 2010

Volume 25 Number 10

The Working Programmer - Multiparadigmatic .NET, Part 2

By Ted Neward | October 2010

In my previous article (msdn.microsoft.com/magazine/ff955611), the first of this series, I mentioned that the two languages central to the Microsoft .NET Framework—C# and Visual Basic—are multiparadigm languages, just like C++, their syntactic (in the case of C#) or conceptual (in the case of Visual Basic) predecessor. Using a multiparadigmatic language can be confusing and difficult, particularly when the purposes of the different paradigms aren’t clear.

Commonality and Variability

But before we can start taking apart the different paradigms in these languages, a bigger question comes to mind: What, precisely, are we trying to do when we design a software system? Forget the “end result” goals—modularity, extensibility, simplicity and all that jazz—for a moment, and focus more on the “how” of the language. How, exactly, are we trying to create all those “end result” goals?

James O. Coplien—from his “Multi-Paradigm Design for C++” (Addison-Wesley Professional, 1998)—has an answer for us:

When we think abstractly, we emphasize what is common while suppressing detail. A good software abstraction requires that we understand the problem well enough in all of its breadth to know what is common across related items of interest and to know what details vary from item to item. The items of interest are collectively called a family, and families—rather than individual applications—are the scope of architecture and design. We can use the commonality/variability model regardless of whether family members are modules, classes, functions, processes or types; it works for any paradigm. Commonality and variability are at the heart of most design techniques.

Think about the traditional object paradigm for a moment. As object-oriented developers, we’re taught from early on to “identify the nouns” in the system and look for the things that make up a particular entity—to find all the things that pertain to being a “teacher” within the system, for example, and put them into a class called Teacher. But if several “nouns” have overlapping and related behavior—such as a “student” having some data and operations overlapping with a “person” but with some marked differences—then we’re taught that rather than replicate the common code, we should elevate the commonality into a base class, and relate the types to one another through inheritance. In other words, commonality is gathered together within a class, and variability is captured by extending from that class and introducing the variations. Finding the commonalities and variabilities within a system, and expressing them, forms the heart of design.
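
To make this concrete, consider a minimal sketch, using hypothetical Person, Teacher and Student types (they're illustrative, not taken from any particular system). The commonality is gathered into the base class, and each subclass introduces only its variations:

Public Class Person
  Public Property Name() As String
  Public Property Age() As Integer
End Class

Public Class Teacher
  Inherits Person
  Public Property Subject() As String ' A variation particular to teachers
End Class

Public Class Student
  Inherits Person
  Public Property GradeLevel() As Integer ' A variation particular to students
End Class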

Commonalities are often the parts that are difficult to explicitly identify, not because we don’t recognize them, but because they’re so easily and intuitively recognizable it’s tough to spot them. For example, if I say, “Vehicle,” what image pops into your head? If we do this exercise with a group of people, each will have a different image, yet there will be vast commonality among them all. However, if we start listing the various vehicles imagined, the different kinds of variabilities begin to emerge and categorize themselves (we hope), such that we can still have some set of commonalities among the vehicles.

Positive and Negative Variability

Variability can come in two basic forms, one of which is easy to recognize and the other much more difficult. Positive variability is when the variability occurs in the form of adding to the basic commonality. For example, imagine the abstraction desired is that of a message, such as a SOAP message or e-mail. If we decide that a Message type has a header and body, and leave different kinds of messages to use that as the commonality, then a positive variability on this is a message that carries a particular value in its header, perhaps the date/time it was sent. This is usually easily captured in language constructs—in the object-oriented paradigm, for example, it’s relatively trivial to create a Message subclass that adds the support for date/time sent.
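
In code, positive variability maps cleanly onto inheritance. A minimal sketch, again with hypothetical types:

Public Class Message
  Public Sub New(ByVal headerText As String, ByVal bodyText As String)
    Me.Header = headerText
    Me.Body = bodyText
  End Sub
  Public Property Header() As String
  Public Property Body() As String
End Class

Public Class TimestampedMessage
  Inherits Message
  Public Sub New(ByVal headerText As String, ByVal bodyText As String, ByVal sentAt As DateTime)
    MyBase.New(headerText, bodyText)
    Me.SentAt = sentAt
  End Sub
  Public Property SentAt() As DateTime ' The added (positive) variation
End Class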

Negative variability, however, is much trickier. As might be inferred, a negative variability removes or contradicts some facet of the commonality—a Message that has a header but no body (such as an acknowledgement message used by the messaging infrastructure) is a form of negative variability. And, as you can probably already guess, capturing this in a language construct is problematic—neither C# nor Visual Basic has a facility to remove a member declared in a base class. The best we could do in this case is return null (Nothing, in Visual Basic) from the Body member, which will clearly play havoc with any code that expects a Body to be present, such as verification routines that run a CRC on the Body to ensure it was transmitted correctly.
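
Continuing the hypothetical Message sketch from above, the best the language offers looks something like this:

Public Class AcknowledgementMessage
  Inherits Message
  Public Sub New(ByVal headerText As String)
    ' There's no body to carry; Nothing is the closest the language
    ' gets to "this member doesn't exist here"
    MyBase.New(headerText, Nothing)
  End Sub
End Class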

(Interestingly, XML Schema types offer negative variability in their schema-validation definitions, something that no mainstream programming language yet offers, which is one of the ways that the XML Schema Definition can mismatch against programming languages. Whether this will become a forthcoming feature in some as-yet-unwritten programming language, and whether it would be a Good Thing To Have, is an interesting discussion best had over beer.)

In many systems, negative variability is often handled using explicit code constructs at the client level—meaning, it’s up to the users of the Message type to do some kind of if/else test to see what kind of Message it is before examining the Body, which renders the work put into the Message family all but irrelevant. Too much negative variability escaping the design is usually the underlying cause of calls from developers to “pitch it all and start over.”
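
Here's a sketch of what that looks like in practice, still using the hypothetical Message types from earlier; note that every consumer has to remember to write this test:

Module MessageVerification
  ' A stand-in for a real CRC routine; it assumes a Body is present
  Function ComputeCrc(ByVal text As String) As Integer
    Return text.GetHashCode()
  End Function

  Sub Verify(ByVal msg As Message)
    ' The if/else test forced onto every client by the negative variability
    If TypeOf msg Is AcknowledgementMessage Then
      Return ' No Body, so nothing to verify
    End If
    Dim crc As Integer = ComputeCrc(msg.Body)
    ' ... compare crc against the transmitted checksum ...
  End Sub
End Module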

Binding Commonality and Variability

The actual moment that commonality and variability are set varies with each paradigm, and in general, the closer to run time we can bind those decisions, the more control we give customers and users over the evolution of the system as a whole. When discussing a particular paradigm or technique within a paradigm, it’s important to recognize in which of these four “bind times” the variability kicks in:

  1. Source time. This is the time before the compiler fires up, when the developer (or some other entity) is creating the source files that will eventually be fed into the compiler. Code-generative techniques, such as the T4 template engine—and to a lesser degree the ASP.NET system—operate at a source-time binding.
  2. Compile time. As its name implies, this binding occurs during the compiler’s pass over the source code to render it into compiled bytecode or executable CPU instructions. A great deal of decision making is finalized here, though not all of it, as we’ll see.
  3. Link/load time. At the time the program loads and runs, an additional point of variability kicks in, based on the specific modules (assemblies, in the case of .NET; DLLs, in the case of native Windows code) that are loaded. This is commonly referred to as a plug-in- or add-in-style architecture when it’s applied at the whole-program level.
  4. Run time. During the program’s execution, certain variabilities may be captured based on user input and decision making, and potentially different code executed (or even generated) based on those decisions/input.

In some cases, the design process will want to start from these “bind times” and work backward to figure out what language constructs can support the requirement; for example, a user may want to have the ability to add/remove/modify variability at run time (so that we don’t have to go back through a compile cycle or reload code), which means that whatever paradigm the designer uses, it must support a runtime variability binding.
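
In .NET, link/load-time and run-time binding typically lean on reflection. Here’s a minimal sketch; the assembly, type and method names are hypothetical:

Imports System.Reflection

Module PluginHost
  Sub RunPlugin()
    ' Link/load-time binding: which code runs depends on what's on disk,
    ' not on anything the compiler saw
    Dim asm As Assembly = Assembly.LoadFrom("SomePlugin.dll")
    Dim pluginType As Type = asm.GetType("SomePlugin.Processor")
    Dim plugin As Object = Activator.CreateInstance(pluginType)
    ' Run-time binding: the call is resolved during execution
    pluginType.GetMethod("Execute").Invoke(plugin, Nothing)
  End Sub
End Module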

Challenge

In my previous article, I left readers with a question:

As an exercise, consider this: The .NET Framework 2.0 introduced generics (parameterized types). Why? From a design perspective, what purpose do they serve? (And for the record, answers of “It lets us have type-safe collections” are missing the point—Windows Communication Foundation uses generics extensively, clearly in ways that aren’t just about type-safe collections.)

Taking this a little further, look at the (partial) implementation of a Point class in Figure 1, representing a Cartesian X/Y point, like pixel coordinates on a screen, or a more classical graph.

Figure 1 Partial Implementation of a Point Class

Public Class Point
  Public Sub New(ByVal XX As Integer, ByVal YY As Integer)
    Me.X = XX
    Me.Y = YY
  End Sub
  Public Property X() As Integer
  Public Property Y() As Integer
  Public Function Distance(ByVal other As Point) As Double
    Dim XDiff = Me.X - other.X
    Dim YDiff = Me.Y - other.Y
    Return System.Math.Sqrt((XDiff * XDiff) + (YDiff * YDiff))
  End Function
  Public Overrides Function Equals(ByVal obj As Object) As Boolean
    ' Are these the same type?
    If obj IsNot Nothing AndAlso Me.GetType() = obj.GetType() Then
      Dim other As Point = DirectCast(obj, Point)
      Return other.X = Me.X And other.Y = Me.Y
    End If
    Return False
  End Function
  Public Overrides Function ToString() As String
    Return String.Format("({0},{1})", Me.X, Me.Y)
  End Function
End Class

In and of itself, it’s not really all that exciting. The rest of the implementation is left to the reader’s imagination, because it’s not central to the discussion.

Notice that this Point implementation has made a few assumptions about the way Points are supposed to be utilized. For example, the X and Y elements of the Point are integers, meaning that this Point class can’t represent fractional Points, such as Points at (0.5,0.5). Initially, this may be an acceptable decision, but inevitably, a request will come up asking to be able to represent “fractional Points” (for whatever reason). Now, the developer has an interesting problem: How to represent this new requirement?

Starting from the basics, let’s do the “Oh Lord don’t do that” thing and simply create a new Point class that uses floating-point members instead of integral members, and see what emerges (see Figure 2; note that PointD is short for “Point-Double,” meaning it uses Doubles). As is pretty clear, there’s a lot of conceptual overlap here between the two Point types. According to the commonality/variability theory of design, that means we need to somehow capture the common parts and allow for variability. Classic object-orientation would have us do it through inheritance, elevating the commonality into a base class or interface (Point), then implementing that in subclasses (PointI and PointD, perhaps).

Figure 2 A New Point Class with Floating-Point Members

Public Class PointD
  Public Sub New(ByVal XX As Double, ByVal YY As Double)
    Me.X = XX
    Me.Y = YY
  End Sub
  Public Property X() As Double
  Public Property Y() As Double
  Public Function Distance(ByVal other As PointD) As Double
    Dim XDiff = Me.X - other.X
    Dim YDiff = Me.Y - other.Y
    Return System.Math.Sqrt((XDiff * XDiff) + (YDiff * YDiff))
  End Function
  Public Overrides Function Equals(ByVal obj As Object) As Boolean
    ' Are these the same type?
    If obj IsNot Nothing AndAlso Me.GetType() = obj.GetType() Then
      Dim other As PointD = DirectCast(obj, PointD)
      Return other.X = Me.X And other.Y = Me.Y
    End If
    Return False
  End Function
  Public Overrides Function ToString() As String
    Return String.Format("({0},{1})", Me.X, Me.Y)
  End Function
End Class

Interesting problems emerge from attempting this, however. First, the X and Y properties need a type associated with them, but the variability in the two different subclasses concerns how the X and Y coordinates are stored, and thus represented, to users. The designer could always simply opt for the largest/widest/most comprehensive representation, which in this case would be a Double, but doing so means having a Point that can only have Integer values is now lost as an option, and it undoes all of the work the inheritance was intended to permit. Also, because they’re related by inheritance now, the two Point-inheriting implementations are now supposedly interchangeable, so we should be able to pass a PointD into a PointI Distance method, which may or may not be desirable. And is a PointD of (0.0, 0.0) equivalent (as in Equals) to a PointI of (0,0)? All these issues have to be considered.

Even if these problems are somehow fixed or made tractable, other problems emerge. Later, we might want a Point that accepts values larger than can be held in an Integer. Or only absolute-positive values (meaning the origin is in the lower-left corner) are deemed acceptable. Each of these different requirements will mean new subclasses of Point must be created.

Stepping back for a moment, the original desire was to reuse the commonality of the implementation of Point but allow for variability in the type/representation of the values that make up the Point. Ideally, depending on the kind of graph we’re working with, we should be able to choose the representation at the time the Point is created, with each choice yielding an entirely distinct type, which is precisely what generics do.
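
A first cut at a generic Point, before any constraints, would presumably look something like this sketch; the subtraction inside Distance is exactly where it falls down, as we’re about to see:

Public Class GPoint(Of Rep)
  Public Sub New(ByVal XX As Rep, ByVal YY As Rep)
    Me.X = XX
    Me.Y = YY
  End Sub
  Public Property X() As Rep
  Public Property Y() As Rep
  Public Function Distance(ByVal other As GPoint(Of Rep)) As Double
    Dim XDiff = Me.X - other.X ' Compile error: no "-" operator for Rep
    Dim YDiff = Me.Y - other.Y
    Return System.Math.Sqrt((XDiff * XDiff) + (YDiff * YDiff))
  End Function
End Class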

Doing this, however, presents a problem: the compiler insists that a “Rep” type won’t necessarily have “+” and “-” operators defined for it, because it thinks we want to put any possible type here—Integers, Longs, Strings, Buttons, DatabaseConnections or whatever else comes to mind—and that’s clearly a little too variable. So, once again, we need to express some commonality about the types that can be used here, in the form of a generic constraint on what “Rep” can be (see Figure 3).

Figure 3 A Generic Constraint on Type

Public Class GPoint(Of Rep As {IComparable, IConvertible})
  Public Sub New(ByVal XX As Rep, ByVal YY As Rep)
    Me.X = XX
    Me.Y = YY
  End Sub
  Public Property X() As Rep
  Public Property Y() As Rep
  Public Function Distance(ByVal other As GPoint(Of Rep)) As Double
    Dim XDiff = (Me.X.ToDouble(Nothing)) - (other.X.ToDouble(Nothing))
    Dim YDiff = (Me.Y.ToDouble(Nothing)) - (other.Y.ToDouble(Nothing))
    Return System.Math.Sqrt((XDiff * XDiff) + (YDiff * YDiff))
  End Function
  Public Overrides Function Equals(ByVal obj As Object) As Boolean
    ' Are these the same type?
    If obj IsNot Nothing AndAlso Me.GetType() = obj.GetType() Then
      Dim other As GPoint(Of Rep) = DirectCast(obj, GPoint(Of Rep))
      Return (other.X.CompareTo(Me.X) = 0) And (other.Y.CompareTo(Me.Y) = 0)
    End If
    Return False
  End Function
  Public Overrides Function ToString() As String
    Return String.Format("({0},{1})", Me.X, Me.Y)
  End Function
End Class

In this case, two constraints are imposed: one to ensure that any “Rep” type can be converted to double values (to calculate the distance between the two points), and the other to ensure that the constituent X and Y values can be compared to see if they’re greater-than/equal-to/less-than one another.
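
Usage then reads the way the design intended; a quick sketch (the values are arbitrary):

Module GPointDemo
  Sub Main()
    ' Each choice of Rep yields an entirely distinct Point type
    Dim pixel As New GPoint(Of Integer)(10, 20)
    Dim graph As New GPoint(Of Double)(0.5, 0.5)
    Console.WriteLine(pixel.Distance(New GPoint(Of Integer)(13, 24))) ' 5
    ' pixel.Distance(graph) would be a compile error: GPoint(Of Integer)
    ' and GPoint(Of Double) are unrelated types
  End Sub
End Module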

And now the reason for generics becomes clearer: they support a different “axis” of variability for design, one that’s drastically different from the traditional inheritance-based axis. They allow the designer to render the implementation as the commonality, and the types being operated upon by that implementation as the variability.

Note that this implementation assumes that the variability is occurring at compile time, rather than at link/load time or run time—if the user wants or needs to specify the type of the X/Y members of the Point at run time, then a different solution is needed.
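
One such solution, sketched below, is to close the generic type reflectively at run time, at the cost of losing static typing on the result; the choice of System.Double here is just a stand-in for whatever the user picks:

Module RuntimePointFactory
  Sub Demo()
    ' The representation type could come from user input or configuration
    Dim repType As Type = Type.GetType("System.Double")
    Dim pointType As Type = GetType(GPoint(Of )).MakeGenericType(repType)
    Dim point As Object = Activator.CreateInstance(pointType, 0.5, 0.5)
    Console.WriteLine(point) ' Prints (0.5,0.5)
  End Sub
End Module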

Not Dead (or Done) Yet!

If all of software design is a giant exercise in commonality and variability, then the need to understand multiparadigmatic design becomes clear: Each of the different paradigms offers different ways to achieve this commonality/variability, and mixing the paradigms creates confusion and leads to calls for a complete rewrite. Just as the human brain starts to get confused when we try to map three-dimensional constructs in our head into four and five dimensions, too many dimensions of variability in software cause confusion.

In the next half-dozen or so articles, I’ll be looking at the different ways that each paradigm supported by C# and Visual Basic—the structural, object-oriented, metaprogramming, functional and dynamic paradigms being the principal ones—provides ways to capture commonality and allow for variability. When we’re through all of that, we’ll examine how some of them can be combined in interesting ways to make your designs more modular, extensible, maintainable and all that other stuff.

Happy coding!


Ted Neward is a principal with Neward & Associates, an independent firm specializing in enterprise .NET Framework and Java platform systems. He’s written more than 100 articles, is a C# MVP and INETA speaker and has authored and coauthored a dozen books, including “Professional F# 2.0” (Wrox, 2010). He also consults and mentors regularly. Reach him at ted@tedneward.com with questions or consulting requests, and read his blog at blogs.tedneward.com.

Thanks to the following technical expert for reviewing this article: Anthony Green