June 2010

Volume 25 Number 06

Cutting Edge - C# 4.0, the Dynamic Keyword and COM

By Dino Esposito | June 2010

I grew up as a C/C++ developer and, especially before the advent of the Microsoft .NET Framework, I often chided my colleagues who programmed in Visual Basic for using such a weakly typed language.

There was a time when static typing and strongly typed programming were the obvious way to software happiness. But things change, and today the community of C# developers (to which nearly all former C/C++ developers seem to have migrated) often feels the distinct need for a much more dynamic programming model. Last month, I introduced some features of dynamic programming that Microsoft makes available through C# 4.0 and Visual Studio 2010. This month, I’ll delve deeper into some related scenarios, starting with one of the most compelling reasons for using C# 4.0: easy programming with COM objects within the .NET Framework.

Easy Access to COM Objects

An object is said to be dynamic when its structure and behavior aren’t fully described by a statically defined type that the compiler knows thoroughly. Admittedly, the word dynamic sounds a bit generic in this context, so let’s look at a simple example. In a scripting language such as VBScript, the following code runs successfully:

Set word = CreateObject("Word.Application")

The CreateObject function assumes that the string it gets as an argument is the progID of a registered COM object. It creates an instance of the component and returns its IDispatch automation interface. The details of the IDispatch interface are never visible at the level of the scripting language. What matters is that you can write code such as:

Set word = CreateObject("Word.Application")
word.Visible = True
Set doc = word.Documents.Add()
Set selection = word.Selection
selection.TypeText "Hello, world"
selection.TypeParagraph()

doc.SaveAs(fileName)

In this code, you first create a reference to a component that automates the behavior of the underlying Microsoft Office Word application. Next, you make the Word main window visible, add a new document, write some text into it and then save the document somewhere. The code is clear, reads well and, more importantly, works just fine.

This works, however, because of a particular capability offered by VBScript: late binding. Late binding means that the type of a given object isn’t known until the execution flow hits the object. When this happens, the runtime environment first ensures that the member invoked on the object really exists and then invokes it. No preliminary check whatsoever is made before the code is actually executed.

As you may know, a scripting language such as VBScript doesn’t have a compiler. However, Visual Basic (including the CLR version) for years had a similar feature. I confess I frequently envied my Visual Basic colleagues for their ability to more easily use COM objects—often valuable building blocks of an application you need to interop with, such as Office. In some cases, in fact, my team ended up writing some portions of our interop code in Visual Basic, even when the entire application was in C#. Should this be surprising? Isn’t polyglot programming a new frontier to reach?

In Visual Basic, the CreateObject function exists for (strong) compatibility reasons. The point is that .NET Framework-based languages were designed with early binding in mind. COM interoperability is a scenario addressed by the .NET Framework but never specifically supported by languages with keywords and facilities—not until C# 4.0.
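To make the contrast concrete, here is a minimal sketch of what late binding typically looked like in C# before version 4.0: going through reflection. The class and method names (ReflectionLateBinding, Call) are hypothetical, and a plain string stands in for a COM object so the sketch runs anywhere; Type.InvokeMember is the standard reflection call.

```csharp
using System;
using System.Reflection;

public static class ReflectionLateBinding
{
    // Looks up a parameterless public instance method by name at run time
    // and invokes it. If no such member exists, InvokeMember throws
    // MissingMethodException, and only when this line actually executes.
    public static object Call(object target, string methodName)
    {
        return target.GetType().InvokeMember(
            methodName,
            BindingFlags.InvokeMethod | BindingFlags.Public | BindingFlags.Instance,
            null,      // use the default binder
            target,    // the object to invoke the member on
            null);     // no arguments
    }
}
```

Calling ReflectionLateBinding.Call("hello", "ToUpper") returns the uppercased string, while a misspelled member name compiles fine and fails only at run time. That is the same trade-off VBScript makes, just with far more ceremony.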

C# 4.0 (and Visual Basic) has dynamic lookup capabilities that indicate late binding is now an approved practice for .NET Framework developers. With dynamic lookup, you can code access to methods, properties, indexer properties and fields in a way that bypasses static type checking to be resolved at run time.

C# 4.0 also enables optional parameters by recognizing a default value in a member declaration. This means that when a member with optional parameters is invoked, optional arguments can be omitted. Furthermore, arguments can be passed by name as well as by position. At the end of the day, improved COM binding in C# 4.0 simply means that some common features of scripting languages are now supported by an otherwise static and strongly typed language. Before we look at how you can leverage the new dynamic keyword to operate seamlessly with COM objects, let’s delve a bit deeper into the internal mechanics of dynamic type lookup.
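Optional and named arguments are easiest to see on a plain C# method. The following is a small sketch only; DocumentSaver and Describe are hypothetical names invented for illustration and are not part of any Word or .NET API.

```csharp
using System;

public static class DocumentSaver
{
    // Hypothetical method: 'format' and 'readOnly' are optional because
    // their declarations carry default values.
    public static string Describe(string fileName, string format = "docx", bool readOnly = false)
    {
        return fileName + "|" + format + "|" + readOnly;
    }

    public static string[] Examples()
    {
        return new[]
        {
            Describe("report"),                       // both optional arguments omitted
            Describe("report", "pdf"),                // second argument passed by position
            Describe("report", readOnly: true),       // passed by name, skipping 'format'
            Describe(readOnly: true, fileName: "r2")  // named arguments in any order
        };
    }
}
```

The compiler fills in the omitted arguments from the declared defaults at each call site, which is what lets a long COM signature shrink to the one or two arguments you actually care about.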

Dynamic Language Runtime

When you declare a variable as dynamic in Visual Studio 2010, you have no IntelliSense at all in the default configuration. Interestingly, if you install an additional tool such as ReSharper 5.0 (jetbrains.com/resharper), you can get some partial information through IntelliSense about the dynamic object. Figure 1 shows the code editor with and without ReSharper. The tool just lists the members that appear to be defined on the dynamic type. At the very minimum, the dynamic object is an instance of System.Object.

Figure 1 IntelliSense for a Dynamic Object in Visual Studio 2010, with and Without ReSharper

Let’s see what happens when the compiler encounters the following code (the code is deliberately trivial to simplify understanding the implementation details):

using System;

class Program
{
  static void Main(string[] args) 
  { 
    dynamic x = 1;
    Console.WriteLine(x);
  }
}

In the second line of Main, the compiler doesn’t attempt to resolve the symbol WriteLine, and no warning or error is raised as would happen with a classic static type checker. As far as the dynamic keyword is concerned, C# behaves like an interpreted language here. Consequently, the compiler emits some ad hoc code that interprets the expression where a dynamic variable or argument is involved. The interpreter is based on the Dynamic Language Runtime (DLR), a brand-new component of the .NET Framework machinery. To use more specific terminology, the compiler has to generate an expression tree using the abstract syntax supported by the DLR and pass it to the DLR libraries for processing. Within the DLR, the compiler-provided expression is encapsulated in a dynamically updated site object. A site object is responsible for binding methods to objects on the fly. Figure 2 shows a largely sanitized version of the real code emitted for the trivial program shown earlier.

The code in Figure 2 has been edited and simplified for readability, but it shows the gist of what’s going on. The dynamic variable is mapped to a System.Object instance and then a site is created for the program in the DLR. The site manages a binding between the WriteLine method with its parameters and the target object. The binding holds within the context of the type Program. To invoke the method Console.WriteLine on a dynamic variable, you invoke the site and pass the target object (in this case the Console type) and its parameters (in this case the dynamic variable). Internally, the site will check whether the target object really has a member WriteLine that can accept a parameter like the object currently stored in the variable x. If something goes wrong, the C# runtime binder throws a RuntimeBinderException.

Figure 2 The Real Implementation of a Dynamic Variable

internal class Program
{
  private static void Main(string[] args)
  {
    object x = 1;

    if (MainSiteContainer.site1 == null)
    {
      MainSiteContainer.site1 = CallSite<
        Action<CallSite, Type, object>>
        .Create(Binder.InvokeMember(
          "WriteLine", 
          null, 
          typeof(Program), 
          new CSharpArgumentInfo[] { 
            CSharpArgumentInfo.Create(...) 
          }));
    }
    MainSiteContainer.site1.Target.Invoke(
      MainSiteContainer.site1, typeof(Console), x);
  }

  private static class MainSiteContainer
  {
    public static CallSite<Action<CallSite, Type, object>> site1;
  }
}
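
The run-time check performed by the site is easy to observe from ordinary code. The sketch below assumes a reference to the Microsoft.CSharp assembly, which projects targeting the .NET Framework 4 normally have; DynamicBindingDemo and its methods are hypothetical names used only for illustration.

```csharp
using System;
using Microsoft.CSharp.RuntimeBinder;

public static class DynamicBindingDemo
{
    // Reads Length through a dynamic reference. The member is looked up
    // when the call executes, so this works for a string, an array or any
    // other object that exposes a Length convertible to int.
    public static int LengthOf(dynamic value)
    {
        return value.Length;
    }

    // Returns true if the call can't be bound at run time.
    public static bool FailsToBind(dynamic value)
    {
        try
        {
            value.NoSuchMember();   // compiles, because value is dynamic
            return false;
        }
        catch (RuntimeBinderException)
        {
            return true;            // the binder found no matching member
        }
    }
}
```

LengthOf accepts both "hello" and new int[3] without a shared interface, while FailsToBind(1) returns true because System.Int32 has no member named NoSuchMember. In both cases the decision is made by the binder when the call site is first hit, not by the compiler.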

Working with COM Objects

New C# 4.0 features make working with COM objects from within .NET Framework-based applications considerably easier. Let’s see how to create a Word document in C# and compare the code you need in .NET 3.5 and .NET 4. The sample application creates a new Word document based on a given template, fills it up and saves it to a fixed location. The template contains a couple of bookmarks for common pieces of information. Whether you target the .NET Framework 3.5 or the .NET Framework 4, the very first step on the way to programmatically creating a Word document is adding a reference to the Microsoft Word Object Library (see Figure 3).

Figure 3 Referencing the Word Object Library

Before Visual Studio 2010 and the .NET Framework 4, to accomplish this you needed code such as that in Figure 4.

Figure 4 Creating a New Word Document in C# 3.0

using System;
using System.Reflection;
using Microsoft.Office.Interop.Word;

public static class WordDocument
{
  public const String TemplateName = @"Sample.dotx";
  public const String CurrentDateBookmark = "CurrentDate";
  public const String SignatureBookmark = "Signature";

  public static void Create(String file, DateTime now, String author)
  {
    // Must be an Object because it is passed as a ref
    Object missingValue = Missing.Value;

    // Run Word and make it visible for demo purposes
    var wordApp = new Application { Visible = true };

    // Create a new document
    Object template = TemplateName;
    var doc = wordApp.Documents.Add(ref template,
      ref missingValue, ref missingValue, ref missingValue);
    doc.Activate();

    // Fill up placeholders in the document
    Object bookmark_CurrentDate = CurrentDateBookmark;
    Object bookmark_Signature = SignatureBookmark;
    doc.Bookmarks.get_Item(ref bookmark_CurrentDate).Range.Select();
    wordApp.Selection.TypeText(now.ToString());
    doc.Bookmarks.get_Item(ref bookmark_Signature).Range.Select();
    wordApp.Selection.TypeText(author);

    // Save the document 
    Object documentName = file;
    doc.SaveAs(ref documentName,
      ref missingValue, ref missingValue, ref missingValue, 
      ref missingValue, ref missingValue, ref missingValue, 
      ref missingValue, ref missingValue, ref missingValue, 
      ref missingValue, ref missingValue, ref missingValue,
      ref missingValue, ref missingValue, ref missingValue);

    doc.Close(ref missingValue, 
      ref missingValue, ref missingValue);
    wordApp.Quit(ref missingValue, 
      ref missingValue, ref missingValue);
  }
}

To interact with a COM automation interface, you often need Variant types. When you interact with a COM automation object from within a .NET Framework-based application, you represent Variants as plain objects. The net effect is that you can’t use a string to indicate, say, the name of the template file you intend to base your Word document on, because the Variant parameter must be passed by reference. You have to resort to an Object instead, as shown here:

Object template = TemplateName;
var doc = wordApp.Documents.Add(ref template,
  ref missingValue, ref missingValue, ref missingValue);

A second aspect to consider is that Visual Basic and scripting languages are much more forgiving than C# 3.0. So, for example, they don’t force you to specify all parameters that a method on a COM object declares. The Add method on the Documents collection requires four arguments, and you can’t ignore them unless your language supports optional parameters.

As mentioned earlier, C# 4.0 does support optional parameters. This means that while simply recompiling the code in Figure 4 with C# 4.0 works, you could even rewrite it and drop all ref parameters that carry only a missing value, as shown here:

Object template = TemplateName;
var doc = wordApp.Documents.Add(template);

With the new C# 4.0 “Omit ref” support, the code in Figure 4 becomes even simpler and, more importantly, it becomes easier to read and syntactically similar to scripting code. Figure 5 contains the edited version that compiles well with C# 4.0 and produces the same effect as the code in Figure 4.

Figure 5 Creating a New Word Document in C# 4.0

using System;
using Microsoft.Office.Interop.Word;

public static class WordDocument
{
  public const String TemplateName = @"Sample.dotx";
  public const String CurrentDateBookmark = "CurrentDate";
  public const String SignatureBookmark = "Signature";

  public static void Create(String file, DateTime now, String author)
  {
    // Run Word and make it visible for demo purposes
    dynamic wordApp = new Application { Visible = true };

    // Create a new document based on the template
    var doc = wordApp.Documents.Add(TemplateName);
    doc.Activate();

    // Fill the bookmarks in the document
    doc.Bookmarks[CurrentDateBookmark].Range.Select();
    wordApp.Selection.TypeText(now.ToString());
    doc.Bookmarks[SignatureBookmark].Range.Select();
    wordApp.Selection.TypeText(author);

    // Save the document
    doc.SaveAs(file);

    // Clean up
    doc.Close();
    wordApp.Quit();
  }
}

The code in Figure 5 allows you to use plain .NET Framework types to make the call to the COM object. Plus, optional parameters make it even simpler.

The dynamic keyword and other COM interop features introduced in C# 4.0 don’t necessarily make a piece of code faster, but they enable you to write C# code as if it were script. For COM objects, this achievement is probably as important as a performance gain.
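
To see how close C# 4.0 gets to script, here is a rough C# counterpart of the VBScript sample from the beginning of the article. It is a sketch under stated assumptions: it needs Windows with Word installed to do anything, it binds through the progID instead of the referenced object library, and WordAutomation, FindComType and WriteHello are hypothetical names. Type.GetTypeFromProgID and Activator.CreateInstance are the standard .NET calls; every member invoked on the dynamic references is resolved by Word at run time.

```csharp
using System;

public static class WordAutomation
{
    // Returns the COM type registered for a progID, or null when the progID
    // is unknown or COM activation isn't available on this platform.
    public static Type FindComType(string progId)
    {
        try
        {
            return Type.GetTypeFromProgID(progId);
        }
        catch (PlatformNotSupportedException)
        {
            return null;
        }
    }

    // Mirrors the VBScript sample: start Word, type a line, save and quit.
    // Returns false without doing anything if Word isn't registered.
    public static bool WriteHello(string fileName)
    {
        Type wordType = FindComType("Word.Application");
        if (wordType == null) return false;

        dynamic word = Activator.CreateInstance(wordType);
        try
        {
            word.Visible = true;
            dynamic doc = word.Documents.Add();
            word.Selection.TypeText("Hello, world");
            word.Selection.TypeParagraph();
            doc.SaveAs(fileName);
            doc.Close();
        }
        finally
        {
            word.Quit();   // always shut the Word process down
        }
        return true;
    }
}
```

No interop assembly is involved in this variant, so there’s no IntelliSense and no compile-time check on any Word member. That is precisely the trade-off of the original script.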

No PIA Deployment

Since the beginning of the .NET Framework, you could wrap a COM object into a managed class and use it from a .NET-based application. For this to happen, you need a primary interop assembly (PIA) provided by the vendor of the COM object. PIAs are necessary and must be deployed along with client applications. However, more often than not, PIAs are too big and wrap up an entire COM API, so packing them with the setup may not be a pleasant experience.

Visual Studio 2010 offers the no-PIA option. No-PIA refers to the compiler’s ability to embed required definitions you’d get from a PIA in the current assembly. As a result, only definitions that are really needed are found in the final assembly and there’s no need for you to pack the vendor’s PIAs in your setup. Figure 6 shows the option in the Properties box that enables no-PIA in Visual Studio 2010.

Figure 6 Enabling the No-PIA Option in Visual Studio 2010

No-PIA is based on a feature introduced with C# 4.0 and the .NET Framework 4 known as type equivalence. In brief, type equivalence means that two distinct types can be considered equivalent at run time and used interchangeably. The typical example of type equivalence is two interfaces with the same name defined in different assemblies. They’re different types, but they can be used interchangeably as long as the same methods exist.

In summary, working with COM objects can still be expensive, but the COM interop support in C# 4.0 makes the code you write far simpler. Dealing with COM objects from .NET Framework-based applications connects you to legacy applications and critical business scenarios over which you’d otherwise have little control. COM is a necessary evil in the .NET Framework, but dynamic makes it a bit less so.


Dino Esposito is the author of Programming ASP.NET MVC from Microsoft Press and has coauthored Microsoft .NET: Architecting Applications for the Enterprise (Microsoft Press, 2008). Based in Italy, Esposito is a frequent speaker at industry events worldwide.

Thanks to the following technical expert for reviewing this article: Alex Turner