June 2016

Volume 31 Number 6

[.NET Compiler Platform]

Language-Agnostic Code Generation with Roslyn

By Alessandro Del Sole

The Roslyn code base provides powerful APIs you can leverage to perform rich code analysis over your source code. For instance, analyzers and code refactorings can walk through a piece of source code and replace one or more syntax nodes with new code you generate with the Roslyn APIs. A common way to perform code generation is via the SyntaxFactory class, which exposes factory methods to generate syntax nodes in a way that compilers can understand. The SyntaxFactory class is certainly very powerful because it allows generating any possible syntax element, but there are two different SyntaxFactory implementations: Microsoft.CodeAnalysis.CSharp.SyntaxFactory and Microsoft.CodeAnalysis.VisualBasic.SyntaxFactory. This has an important implication if you want to write an analyzer with a code fix that targets both C# and Visual Basic: you have to write two different analyzers, one for C# and one for Visual Basic, using the two implementations of SyntaxFactory, each with a different approach due to the different way those languages handle some constructs. This likely means spending time writing the analyzer twice, and maintaining both versions becomes more difficult. Fortunately, the Roslyn APIs also provide the Microsoft.CodeAnalysis.Editing.SyntaxGenerator class, which allows for language-agnostic code generation. In other words, with SyntaxGenerator you can write your code-generation logic once and target both C# and Visual Basic. In this article I’ll show you how to perform language-agnostic code generation with SyntaxGenerator, and I’ll give you some hints about the Roslyn Workspaces APIs.

Starting with Code

Let’s start with some source code that will be generated using SyntaxGenerator. Consider the simple Person class that implements the ICloneable interface in C# (Figure 1) and Visual Basic (Figure 2).

Figure 1 A Simple Person Class in C#

public abstract class Person : ICloneable
{
  // Not using auto-props is intentional for demo purposes
  private string _lastName;
  public string LastName
  {
    get
    {
      return _lastName;
    }
    set
    {
      _lastName = value;
    }
  }
  private string _firstName;
  public string FirstName
  {
    get
    {
      return _firstName;
    }
    set
    {
      _firstName = value;
    }
  }
  public Person(string LastName, string FirstName)
  {
    _lastName = LastName;
    _firstName = FirstName;
  }
  public virtual object Clone()
  {
    return MemberwiseClone();
  }
}

Figure 2 A Simple Person Class in Visual Basic

Public MustInherit Class Person
  Implements ICloneable
  'Not using auto-props is intentional for demo purposes
  Private _lastName As String
  Private _firstName As String
  Public Property LastName As String
    Get
      Return _lastName
    End Get
    Set(value As String)
      _lastName = value
    End Set
  End Property
  Public Property FirstName As String
    Get
      Return _firstName
    End Get
    Set(value As String)
      _firstName = value
    End Set
  End Property
  Public Sub New(LastName As String, FirstName As String)
    _lastName = LastName
    _firstName = FirstName
  End Sub
  Public Overridable Function Clone() As Object Implements ICloneable.Clone
    Return MemberwiseClone()
  End Function
End Class

You’d probably argue that declaring auto-implemented properties would have the same effect and would keep code much cleaner in this particular case, but later you’ll see why I’m using the expanded form.

This implementation of the Person class is very simple, but it contains a good number of syntax elements, making it helpful for understanding how to perform code generation with SyntaxGenerator. Let’s generate this class with Roslyn.

Creating a Code Analysis Tool

The first thing to do is create a new project in Visual Studio 2015 with references to the Roslyn libraries. Because of the general purpose of this article, instead of creating an analyzer or refactoring, I’ll choose another project template available in the .NET Compiler Platform SDK, the Stand-Alone Code Analysis Tool, available in the Extensibility node of the New Project dialog (see Figure 3).

Figure 3 The Stand-Alone Code Analysis Tool Project Template

This project template actually generates a console application and automatically adds the proper NuGet packages for the Roslyn APIs, targeting the language of your choice. Because the idea is to target both C# and Visual Basic, the first thing to do is add the NuGet packages for the second language. For instance, if you initially created a C# project, you’ll need to download and install the following Visual Basic libraries from NuGet:

  • Microsoft.CodeAnalysis.VisualBasic.dll
  • Microsoft.CodeAnalysis.VisualBasic.Workspaces.dll
  • Microsoft.CodeAnalysis.VisualBasic.Workspaces.Common.dll

You can just install the Microsoft.CodeAnalysis.VisualBasic.Workspaces package from NuGet, and this will automatically resolve dependencies for the other required libraries. Resolving dependencies is important anytime you plan to use the SyntaxGenerator class, no matter what project template you’re using. Forgetting to do this will result in exceptions at run time.

Meet SyntaxGenerator and the Workspaces APIs

The SyntaxGenerator class exposes a static method called GetGenerator, which returns an instance of SyntaxGenerator. You use the returned instance to perform code generation. GetGenerator has the following three overloads:

public static SyntaxGenerator GetGenerator(Document document)
public static SyntaxGenerator GetGenerator(Project project)
public static SyntaxGenerator GetGenerator(Workspace workspace, string language)

The first two overloads work against a Document and a Project, respectively. The Document class represents a code file in a project, while the Project class represents a Visual Studio project as a whole. These overloads automatically detect the language (C# or Visual Basic) that the Document or Project targets. Document, Project and Solution (an additional class that represents a Visual Studio .sln solution) are part of a Workspace, which provides a managed way to interact with everything that makes up an MSBuild solution: projects, code files, metadata and objects. The Workspaces APIs expose several classes you can use to manage workspaces, such as the MSBuildWorkspace class, which allows working against an .sln solution, or the AdhocWorkspace class, which is useful when you’re not working against an existing MSBuild solution but want an in-memory workspace that represents one. In the case of analyzers and code refactorings, you already have an MSBuild workspace that allows you to work against code files using instances of the Document, Project and Solution classes. In the current sample project, there’s no workspace, so let’s create one and pass it to the third overload of GetGenerator. To get a new empty workspace, you can use the AdhocWorkspace class:

// Get a workspace
var workspace = new AdhocWorkspace();

Now you can get an instance of SyntaxGenerator, passing the workspace instance and the desired language as arguments:

// Get the SyntaxGenerator for the specified language
var generator = SyntaxGenerator.GetGenerator(workspace, LanguageNames.CSharp);

The language name can be CSharp or VisualBasic, both constants from the LanguageNames class. Let’s start with C#; later you’ll see how to change the language name to VisualBasic. You have all the tools you need now and are ready to generate syntax nodes.
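As a minimal sketch of the setup described so far, the following console program creates one in-memory workspace and asks it for a generator per language. The names GeneratorSetupSketch and CreateGenerators are illustrative and not part of the original sample, and the code assumes that both the C# and Visual Basic Workspaces packages are referenced.

```csharp
using System;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.Editing;

// Illustrative helper (not from the article's sample project).
public static class GeneratorSetupSketch
{
  // Returns one SyntaxGenerator per target language for the given workspace.
  public static SyntaxGenerator[] CreateGenerators(Workspace workspace)
  {
    return new[]
    {
      SyntaxGenerator.GetGenerator(workspace, LanguageNames.CSharp),
      SyntaxGenerator.GetGenerator(workspace, LanguageNames.VisualBasic)
    };
  }

  public static void Main()
  {
    // The using statement disposes the workspace when code generation is done.
    using (var workspace = new AdhocWorkspace())
    {
      foreach (var generator in CreateGenerators(workspace))
      {
        // Each instance is a language-specific subclass of SyntaxGenerator.
        Console.WriteLine(generator.GetType().Name);
      }
    }
  }
}
```

If the Visual Basic packages are missing, the second GetGenerator call is the one you would expect to fail at run time, which is a quick way to verify that the references are in place.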

Generating Syntax Nodes

The SyntaxGenerator class exposes instance factory methods that generate proper syntax nodes in a way that’s compliant with the grammar and semantics of both C# and Visual Basic. For example, methods with names ending with the Expression suffix generate expressions; methods with names ending with the Statement suffix generate statements; methods with names ending with the Declaration suffix generate declarations. For each category, there are specialized methods that generate specific syntax nodes. For instance, MethodDeclaration generates a method block, PropertyDeclaration generates a property, FieldDeclaration generates a field and so on (and, as usual, IntelliSense is your best friend). The peculiarity of these methods is that each returns SyntaxNode, instead of a specialized type that derives from SyntaxNode, as happens with the SyntaxFactory class. This provides great flexibility, especially when generating complex nodes.

Based on the sample Person class, the first thing to generate is a using/Imports directive for the System namespace, which exposes the ICloneable interface. This can be accomplished with the NamespaceImportDeclaration method as follows:

// Create using/Imports directives
var usingDirectives = generator.NamespaceImportDeclaration("System");

This method takes a string argument that represents the namespace you want to import. Let’s go ahead and declare two fields, which is accomplished via the FieldDeclaration method:

// Generate two private fields
var lastNameField = generator.FieldDeclaration("_lastName",
  generator.TypeExpression(SpecialType.System_String),
  Accessibility.Private);
var firstNameField = generator.FieldDeclaration("_firstName",
  generator.TypeExpression(SpecialType.System_String),
  Accessibility.Private);

FieldDeclaration takes the field name, the field type, and the accessibility level as arguments. To supply the proper type, you invoke the TypeExpression method, which takes a value from the SpecialType enumeration, in this case System_String (don’t forget to use IntelliSense to discover other values). The accessibility level is set with a value from the Accessibility enumeration. When invoking methods from SyntaxGenerator, it’s very common to nest invocations to other methods from the same class, as in the case of TypeExpression. The next step is generating two properties, which is accomplished by invoking the PropertyDeclaration method, shown in Figure 4.
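To see what a single declaration looks like in each language, you can turn the generated node into text. The following sketch wraps the FieldDeclaration call in a helper and prints the field for both C# and Visual Basic; FieldSketch and GenerateField are hypothetical names introduced only for this illustration.

```csharp
using System;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.Editing;

// Illustrative helper (not from the article's sample project).
public static class FieldSketch
{
  // Builds a private string field for the given language and returns its source text.
  public static string GenerateField(string language, string fieldName)
  {
    using (var workspace = new AdhocWorkspace())
    {
      var generator = SyntaxGenerator.GetGenerator(workspace, language);
      var field = generator.FieldDeclaration(fieldName,
        generator.TypeExpression(SpecialType.System_String),
        Accessibility.Private);
      // NormalizeWhitespace adds conventional spacing before the node is printed.
      return field.NormalizeWhitespace().ToFullString();
    }
  }

  public static void Main()
  {
    foreach (var language in new[] { LanguageNames.CSharp, LanguageNames.VisualBasic })
    {
      Console.WriteLine(GenerateField(language, "_lastName"));
    }
  }
}
```

The generation logic is identical for both languages; only the language name passed to GetGenerator changes.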

Figure 4 Generating Two Properties via the PropertyDeclaration Method

// Generate two properties with explicit get/set
var lastNameProperty = generator.PropertyDeclaration("LastName",
  generator.TypeExpression(SpecialType.System_String),
  Accessibility.Public,
  getAccessorStatements: new SyntaxNode[]
  {
    generator.ReturnStatement(generator.IdentifierName("_lastName"))
  },
  setAccessorStatements: new SyntaxNode[]
  {
    generator.AssignmentStatement(generator.IdentifierName("_lastName"),
      generator.IdentifierName("value"))
  });
var firstNameProperty = generator.PropertyDeclaration("FirstName",
  generator.TypeExpression(SpecialType.System_String),
  Accessibility.Public,
  getAccessorStatements: new SyntaxNode[]
  {
    generator.ReturnStatement(generator.IdentifierName("_firstName"))
  },
  setAccessorStatements: new SyntaxNode[]
  {
    generator.AssignmentStatement(generator.IdentifierName("_firstName"),
      generator.IdentifierName("value"))
  });

As you can see, generating a syntax node for a property is more complex. Here you still pass a string with the property name, then a TypeExpression for the property type, then the accessibility level. With a property you also typically need to provide the Get and Set accessors, especially for those situations in which you need to execute code other than setting or returning the property value (such as raising the PropertyChanged event when implementing the INotifyPropertyChanged interface). The body of each accessor is represented by an array of SyntaxNode objects. In the Get accessor, you typically return the property value, so here the code invokes the ReturnStatement method, which represents the return instruction plus the value or object it returns. In this case, the returned value is a field’s identifier. A syntax node for an identifier is obtained by invoking the IdentifierName method, which takes an argument of type string and also returns a SyntaxNode. The Set accessor, in contrast, stores the property value into a field via an assignment. Assignments are generated with the AssignmentStatement method, which takes two arguments, the left and right sides of the assignment. In the current case, the assignment is between two identifiers, so the code invokes IdentifierName twice, once for the left side of the assignment (the field name) and once for the right side (the property value). Because the property value is represented by the value identifier in both C# and Visual Basic, it can be hardcoded.

The next step is code generation for the Clone method, which is required by the ICloneable interface implementation. Generally speaking, a method consists of the declaration, which includes the signature and block delimiters, and of a number of statements, which make up the method body. In the current example, Clone must also implement the ICloneable.Clone method. For this reason, a convenient approach is dividing the code generation for the method into three smaller syntax nodes. The first syntax node is the method body, which looks like the following:

// Generate the method body for the Clone method
var cloneMethodBody = generator.ReturnStatement(
  generator.InvocationExpression(generator.IdentifierName("MemberwiseClone")));

In this case, the Clone method returns the result of the invocation to the MemberwiseClone method it inherits from System.Object. For this reason, the method body is just an invocation to ReturnStatement, which you met previously. Here, the argument of the ReturnStatement is an invocation of the InvocationExpression method, which represents a method invocation and whose parameter is an identifier representing the name of the invoked method. Because the InvocationExpression argument is of type SyntaxNode, a convenient way to supply the identifier is using the IdentifierName method, passing the string representing the identifier of the method to invoke. If you had a method with a more complex method body, you’d need to generate an array of type SyntaxNode, with each node representing some code in the method body.
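For a body with more than one statement, the array simply holds one node per statement. The sketch below is an assumption-laden variation on Clone, not the article’s code: it stores the result in a local variable named copy (a name chosen here for illustration) before returning it, and MethodBodySketch and GenerateClone are hypothetical names.

```csharp
using System;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.Editing;

// Illustrative helper (not from the article's sample project).
public static class MethodBodySketch
{
  // Generates a Clone method whose body consists of two statements.
  public static string GenerateClone(string language)
  {
    using (var workspace = new AdhocWorkspace())
    {
      var generator = SyntaxGenerator.GetGenerator(workspace, language);
      var objectType = generator.TypeExpression(SpecialType.System_Object);
      var body = new SyntaxNode[]
      {
        // Declares a local of type object initialized with MemberwiseClone()
        generator.LocalDeclarationStatement(objectType, "copy",
          generator.InvocationExpression(generator.IdentifierName("MemberwiseClone"))),
        // Returns the local
        generator.ReturnStatement(generator.IdentifierName("copy"))
      };
      var method = generator.MethodDeclaration("Clone",
        returnType: objectType,
        accessibility: Accessibility.Public,
        statements: body);
      return method.NormalizeWhitespace().ToFullString();
    }
  }

  public static void Main()
  {
    Console.WriteLine(GenerateClone(LanguageNames.CSharp));
    Console.WriteLine(GenerateClone(LanguageNames.VisualBasic));
  }
}
```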

The next step is generating the Clone method declaration, which is accomplished like so:

// Generate the Clone method declaration
var cloneMethoDeclaration = generator.MethodDeclaration("Clone",
  parameters: null,
  typeParameters: null,
  returnType: generator.TypeExpression(SpecialType.System_Object),
  accessibility: Accessibility.Public,
  modifiers: DeclarationModifiers.Virtual,
  statements: new SyntaxNode[] { cloneMethodBody });

You generate a method with the MethodDeclaration method. This takes a number of arguments:

  • the method name, of type String
  • the method parameters, of type IEnumerable<SyntaxNode> (null in this case)
  • the type parameters for generic methods, of type IEnumerable<String> (null in this case)
  • the return type, of type SyntaxNode; in this case a TypeExpression for System.Object, because a null return type produces a void method in C# and a Sub in Visual Basic
  • the accessibility level, with a value from the Accessibility enumeration
  • the declaration modifiers, with one or more values from the DeclarationModifiers type; in this case the modifier is virtual (Overridable in Visual Basic)
  • the statements for the method body, of type IEnumerable<SyntaxNode>; in this case, the array contains one element, which is the return statement defined earlier

You’ll see an example of how to add method parameters with the more specialized ConstructorDeclaration method shortly. The Clone method must implement its counterpart from the ICloneable interface, so this must be handled. What you need now is a syntax node that represents the interface name and that will also be useful when the interface implementation is added to the Person class. This can be accomplished by invoking the IdentifierName method, which returns a proper name from the specified string:

// Generate a SyntaxNode for the interface's name you want to implement
var ICloneableInterfaceType = generator.IdentifierName("ICloneable");

If you wanted to use the fully qualified name, System.ICloneable, you’d use DottedName instead of IdentifierName in order to generate a proper qualified name, but in the current example a NamespaceImportDeclaration for System was already added. At this point, you can put it all together. SyntaxGenerator has the AsPublicInterfaceImplementation and AsPrivateInterfaceImplementation methods that you use to tell the compiler that a method definition is implementing an interface, as in the following:

// Mark Clone as the implementation of ICloneable.Clone
var cloneMethodWithInterfaceType =
  generator.AsPublicInterfaceImplementation(cloneMethoDeclaration,
    ICloneableInterfaceType);

This is particularly important with Visual Basic, which explicitly requires the Implements clause. AsPublicInterfaceImplementation is the equivalent of implicit interface implementation in C#, whereas AsPrivateInterfaceImplementation is the equivalent of explicit interface implementation. Both work against methods, properties and indexers.
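The difference between the two methods is easiest to see side by side. The following sketch generates the same Clone method and applies either AsPublicInterfaceImplementation or AsPrivateInterfaceImplementation, depending on a flag; InterfaceImplementationSketch and its explicitImplementation parameter are hypothetical names used only here.

```csharp
using System;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.Editing;

// Illustrative helper (not from the article's sample project).
public static class InterfaceImplementationSketch
{
  // Generates Clone as a public (implicit) or private (explicit) implementation.
  public static string GenerateClone(string language, bool explicitImplementation)
  {
    using (var workspace = new AdhocWorkspace())
    {
      var generator = SyntaxGenerator.GetGenerator(workspace, language);
      var method = generator.MethodDeclaration("Clone",
        returnType: generator.TypeExpression(SpecialType.System_Object),
        accessibility: Accessibility.Public,
        statements: new SyntaxNode[]
        {
          generator.ReturnStatement(
            generator.InvocationExpression(generator.IdentifierName("MemberwiseClone")))
        });
      var interfaceType = generator.IdentifierName("ICloneable");
      var implemented = explicitImplementation
        ? generator.AsPrivateInterfaceImplementation(method, interfaceType)
        : generator.AsPublicInterfaceImplementation(method, interfaceType);
      return implemented.NormalizeWhitespace().ToFullString();
    }
  }

  public static void Main()
  {
    foreach (var language in new[] { LanguageNames.CSharp, LanguageNames.VisualBasic })
    {
      Console.WriteLine(GenerateClone(language, explicitImplementation: false));
      Console.WriteLine(GenerateClone(language, explicitImplementation: true));
    }
  }
}
```

In C#, only the explicit form should name the interface in the method signature, while in Visual Basic both forms carry an Implements clause.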

The next step is generating the constructor, which is accomplished via the ConstructorDeclaration method. As with the Clone method, the constructor’s definition should be split into smaller pieces for easier understanding and cleaner code. As you’ll recall from Figure 1 and Figure 2, the constructor takes two parameters of type string, which are required for property initialization. So it’s a good idea to generate the syntax nodes for both parameters first:

// Generate parameters for the class' constructor
var constructorParameters = new SyntaxNode[]
{
  generator.ParameterDeclaration("LastName",
    generator.TypeExpression(SpecialType.System_String)),
  generator.ParameterDeclaration("FirstName",
    generator.TypeExpression(SpecialType.System_String))
};

Each parameter is generated with the ParameterDeclaration method, which takes a string representing the parameter name and an expression representing the parameter type. Both parameters are of type String, so the code simply uses the TypeExpression method, as you already learned. The reason for packing both parameters into a SyntaxNode array is that ConstructorDeclaration expects its parameters as an IEnumerable<SyntaxNode>.

Now you need to construct the method body, which takes advantage of the AssignmentStatement method you saw previously, as follows:

// Generate the constructor's method body
var constructorBody = new SyntaxNode[]
{
  generator.AssignmentStatement(generator.IdentifierName("_lastName"),
    generator.IdentifierName("LastName")),
  generator.AssignmentStatement(generator.IdentifierName("_firstName"),
    generator.IdentifierName("FirstName"))
};

In this case there are two statements, both grouped into a SyntaxNode array. Finally, you can generate the constructor, putting together the parameters and the method body:

// Generate the class' constructor
var constructor = generator.ConstructorDeclaration("Person",
  constructorParameters, Accessibility.Public,
  statements:constructorBody);

ConstructorDeclaration is similar to MethodDeclaration, but is specifically designed to generate a constructor in C# and a Sub New method in Visual Basic.

Generating a CompilationUnit

So far you’ve seen how to generate code for every member in the Person class. Now you need to put these members together and generate a proper SyntaxNode for the class. Class members must be supplied as a collection of SyntaxNode objects, and the following demonstrates how to put together all the members previously created:

// An array of SyntaxNode as the class members
var members = new SyntaxNode[] { lastNameField,
  firstNameField, lastNameProperty, firstNameProperty,
  cloneMethodWithInterfaceType, constructor };

Now you can finally generate the Person class, taking advantage of the ClassDeclaration method as follows:

// Generate the class
var classDefinition = generator.ClassDeclaration(
  "Person", typeParameters: null,
  accessibility: Accessibility.Public,
  modifiers: DeclarationModifiers.Abstract,
  baseType: null,
  interfaceTypes: new SyntaxNode[] { ICloneableInterfaceType },
  members: members);

As with other kinds of declarations, this method requires specifying the name, the generic type parameters (null in this case), the accessibility level, the modifiers (Abstract in this case, or MustInherit in Visual Basic), the base type (null in this case) and the implemented interfaces (in this case a SyntaxNode array containing the interface name created previously). You might also want to encapsulate the class into a namespace. SyntaxGenerator includes the NamespaceDeclaration method, which accepts the namespace name and the syntax nodes it contains. You use it like this:

// Declare a namespace
var namespaceDeclaration = generator.NamespaceDeclaration("MyTypes", classDefinition);

Compilers already know how to handle the generated syntax node for the complete namespace and nested members, and how to perform code analysis over syntax, but sometimes you need to return this result in the form of a CompilationUnit, a type that represents a code file. This is typical with analyzers and code refactorings. Here’s the code you write to return a CompilationUnit:

// Get a CompilationUnit (code file) for the generated code
var newNode = generator.CompilationUnit(usingDirectives, namespaceDeclaration)
  .NormalizeWhitespace();

This method accepts one or more SyntaxNode instances as arguments.
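To inspect the result as text, you can call ToFullString on the compilation unit. The following sketch puts the last three steps together for an empty Person class and returns the generated source for a given language; CompilationUnitSketch and GenerateFile are hypothetical names, and the class is left without members to keep the example short.

```csharp
using System;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.Editing;

// Illustrative helper (not from the article's sample project).
public static class CompilationUnitSketch
{
  // Generates a code file with one import, one namespace and one empty abstract class.
  public static string GenerateFile(string language)
  {
    using (var workspace = new AdhocWorkspace())
    {
      var generator = SyntaxGenerator.GetGenerator(workspace, language);
      var import = generator.NamespaceImportDeclaration("System");
      var type = generator.ClassDeclaration("Person",
        accessibility: Accessibility.Public,
        modifiers: DeclarationModifiers.Abstract);
      var ns = generator.NamespaceDeclaration("MyTypes", type);
      return generator.CompilationUnit(import, ns)
        .NormalizeWhitespace()
        .ToFullString();
    }
  }

  public static void Main()
  {
    Console.WriteLine(GenerateFile(LanguageNames.CSharp));
    Console.WriteLine(GenerateFile(LanguageNames.VisualBasic));
  }
}
```

The returned string can be written to the console, as here, or saved to a .cs or .vb file.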

The Output in C# and Visual Basic

After all this work, you’re ready to see the result. Figure 5 shows the generated C# code for the Person class.

Figure 5 The C# Roslyn-Generated Code for the Person Class

Now, simply change the language to VisualBasic in the line of code that gets the SyntaxGenerator instance:

var generator = SyntaxGenerator.GetGenerator(workspace, LanguageNames.VisualBasic);

If you re-run the code, you’ll get a Visual Basic class definition, as shown in Figure 6.

Figure 6 The Visual Basic Roslyn-Generated Code for the Person Class

The key point here is that, with SyntaxGenerator, you wrote code once and were able to generate both C# and Visual Basic code with which the Roslyn analysis APIs can work. When you’re done, don’t forget to invoke the Dispose method on the AdhocWorkspace instance, or simply enclose your code within a using statement. Because nobody is perfect and the generated code might contain errors, you can also check the ContainsDiagnostics property of the generated SyntaxNode to see if any diagnostics exist in the code, and get detailed information about code issues via its GetDiagnostics method.
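A minimal sketch of such a check follows. DiagnosticsSketch and DescribeDiagnostics are hypothetical names; the helper only reports diagnostics attached to the syntax node itself, so it should not be read as a replacement for compiling the generated code.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.Editing;

// Illustrative helper (not from the article's sample project).
public static class DiagnosticsSketch
{
  // Returns a description of each diagnostic attached to the node, if any.
  public static IReadOnlyList<string> DescribeDiagnostics(SyntaxNode node)
  {
    var messages = new List<string>();
    if (node.ContainsDiagnostics)
    {
      foreach (var diagnostic in node.GetDiagnostics())
      {
        messages.Add(diagnostic.ToString());
      }
    }
    return messages;
  }

  public static void Main()
  {
    using (var workspace = new AdhocWorkspace())
    {
      var generator = SyntaxGenerator.GetGenerator(workspace, LanguageNames.CSharp);
      var node = generator.ClassDeclaration("Person",
        accessibility: Accessibility.Public).NormalizeWhitespace();
      foreach (var message in DescribeDiagnostics(node))
      {
        Console.WriteLine(message);
      }
    }
  }
}
```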

Language-Agnostic Analyzers and Refactorings

You can use the Roslyn APIs and the SyntaxGenerator class whenever you need to perform rich analysis over source code, but this approach is also very useful with analyzers and code refactorings. In fact, analyzers, code fixes, and refactorings have the DiagnosticAnalyzer, ExportCodeFixProvider, and ExportCodeRefactoringProvider attributes, respectively, each accepting the primary and secondary supported languages. By using SyntaxGenerator instead of SyntaxFactory, you can target both C# and Visual Basic simultaneously.
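As a rough sketch of what this looks like in practice, the following refactoring is exported once for both languages and uses only language-agnostic SyntaxGenerator calls. The refactoring itself (MakeClassAbstractRefactoring, which adds the abstract or MustInherit modifier to the enclosing class) is a hypothetical example invented for this illustration, not something from the sample project, and it is meant to live in a Visual Studio extension rather than a console application.

```csharp
using System.Composition;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CodeActions;
using Microsoft.CodeAnalysis.CodeRefactorings;
using Microsoft.CodeAnalysis.Editing;

// Hypothetical refactoring: the attribute lists the primary language first,
// followed by the additional supported language.
[ExportCodeRefactoringProvider(LanguageNames.CSharp, LanguageNames.VisualBasic,
  Name = nameof(MakeClassAbstractRefactoring)), Shared]
public sealed class MakeClassAbstractRefactoring : CodeRefactoringProvider
{
  public override async Task ComputeRefactoringsAsync(CodeRefactoringContext context)
  {
    var document = context.Document;
    var root = await document.GetSyntaxRootAsync(context.CancellationToken)
      .ConfigureAwait(false);
    if (root == null) return;

    // This overload detects the document's language automatically.
    var generator = SyntaxGenerator.GetGenerator(document);

    // Walk up from the selection to the nearest class declaration.
    var classNode = root.FindNode(context.Span).AncestorsAndSelf()
      .FirstOrDefault(n => generator.GetDeclarationKind(n) == DeclarationKind.Class);
    if (classNode == null) return;

    context.RegisterRefactoring(CodeAction.Create(
      "Make class abstract",
      ct => MakeAbstractAsync(document, root, classNode, generator)));
  }

  // Adds the abstract modifier without referencing either language's SyntaxFactory.
  public static Task<Document> MakeAbstractAsync(Document document,
    SyntaxNode root, SyntaxNode classNode, SyntaxGenerator generator)
  {
    var modifiers = generator.GetModifiers(classNode) | DeclarationModifiers.Abstract;
    var newClass = generator.WithModifiers(classNode, modifiers);
    return Task.FromResult(
      document.WithSyntaxRoot(root.ReplaceNode(classNode, newClass)));
  }
}
```

Because the class never touches a language-specific syntax type, the same assembly can serve both C# and Visual Basic projects.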

Wrapping Up

The SyntaxGenerator class from the Microsoft.CodeAnalysis.Editing namespace provides a language-agnostic way of generating syntax nodes, targeting both C# and Visual Basic with one code base. With this powerful class you can generate the syntax elements the two languages have in common in a way that’s compliant with both compilers, saving time and improving code maintainability.


Alessandro Del Sole has been a Microsoft MVP since 2008. Awarded MVP of the Year five times, he has authored many books, eBooks, instructional videos and articles about .NET development with Visual Studio. Del Sole works as a solution developer expert for Brain-Sys, focusing on .NET development, training and consulting. You can follow him on Twitter: @progalex.

Thanks to the following Microsoft technical experts for reviewing this article: Anthony D. Green and Matt Warren
Anthony D. Green is the Program Manager for Visual Basic. Anthony also worked on that Roslyn thing for 5 years. He’s from Chicago and you can find him on Twitter @ThatVBGuy.

