2019-10-01

October 2019

Volume 34 Number 10

[C#]

Accessing XML Documentation via Reflection

The .NET languages (C#, F# and Visual Basic) all support XML-formatted comments above types and members in source code. Aside from providing an easily intelligible standard for commenting code, these formatted comments are heavily integrated into Visual Studio and other development environments. They appear in tooltips and autocomplete suggestions, and in views like the Object Browser.

Even with all the benefits XML documentation currently provides, there’s still a lot of untapped potential. You could use XML documentation to track bugs. You could integrate it with code analyzers to provide better recommendations. You could use it to control continuous integration pipelines. You could add documentation generation as a step in your automated pipeline so your public documentation is always up-to-date.

The main issue with using XML documentation for any of these purposes is that there are no methods in .NET to access it directly from code. However, with one loading function and a handful of extension methods, you can easily add the ability to access XML documentation via reflection.

The XML Documentation File

By default, the compiler won’t do anything with XML documentation. You first have to enable the XML Documentation File option in Visual Studio, which is under the Build tab of a project’s settings. Once enabled, the XML documentation will be extracted and placed into the designated file each time the code is built.

It’s worth noting that the XML documentation settings are build-configuration-dependent, so you have to enable the option for each build configuration in which you want the file generated. Also, it’s a good idea to make the name of the output XML file the same as the output assembly (just with an .xml file extension).

Microsoft’s language guides detail all the recommended tags, such as summary, param, typeparam, returns and remarks. However, you can create tags of your own for XML documentation. For example, I like to include my own tags like “citation” and “runtime.” The compiler should extract any properly formatted XML tag from the comments.

The XML file produced at compile time is in a very simple format, with an assembly tag at the top to denote what assembly the file is documenting, and then every XML block from source code is placed into a separate member tag. The name property on the member XML tags represents the type/member of the code that the documentation is for. Figure 1 includes C# source code and Figure 2 shows the XML output from that source code.

Figure 1 The C# Source Code

namespace Example
{
  /// <summary>XML Documentation on ExampleClass.</summary>
  public class ExampleClass
  {
    /// <summary>XML Documentation on ExampleMethod1.</summary>
    public static void ExampleMethod1() { }
    /// <summary>XML Documentation on ExampleNestedGenericClass.</summary>
    /// <typeparam name="A">Generic type A.</typeparam>
    /// <typeparam name="B">Generic type B.</typeparam>
    /// <typeparam name="C">Generic type C.</typeparam>
    public class ExampleNestedGenericClass<A, B, C>
    {
      /// <summary>XML Documentation on ExampleMethod2.</summary>
      /// <typeparam name="D">Generic type D.</typeparam>
      /// <typeparam name="E">Generic type E.</typeparam>
      /// <typeparam name="F">Generic type F.</typeparam>
      /// <param name="a">Parameter a.</param>
      /// <param name="d">Parameter d.</param>
      /// <param name="b">Parameter b.</param>
      /// <param name="e">Parameter e.</param>
      /// <param name="c">Parameter c.</param>
      /// <param name="f">Parameter f.</param>
      public static void ExampleMethod2<D, E, F>(
        A a, D d, B[] b, E[] e, C[,,] c, F[,,] f) { }
    }
  }
}

Figure 2 The XML Output

<?xml version="1.0"?>
<doc>
  <assembly>
    <name>Example</name>
  </assembly>
  <members>
    <member name="T:Example.ExampleClass">
      <summary>XML Documentation on ExampleClass.</summary>
    </member>
    <member name="M:Example.ExampleClass.ExampleMethod1">
      <summary>XML Documentation on ExampleMethod1.</summary>
    </member>
    <member name="T:Example.ExampleClass.ExampleNestedGenericClass`3">
      <summary>XML Documentation on ExampleNestedGenericClass.</summary>
      <typeparam name="A">Generic type A.</typeparam>
      <typeparam name="B">Generic type B.</typeparam>
      <typeparam name="C">Generic type C.</typeparam>
    </member>
    <member name="M:Example.ExampleClass.ExampleNestedGenericClass`3.ExampleMethod2``3(`0,``0,`1[],``1[],`2[0:,0:,0:],``2[0:,0:,0:])">
      <summary>XML Documentation on ExampleMethod2.</summary>
      <typeparam name="D">Generic type D.</typeparam>
      <typeparam name="E">Generic type E.</typeparam>
      <typeparam name="F">Generic type F.</typeparam>
      <param name="a">Parameter a.</param>
      <param name="d">Parameter d.</param>
      <param name="b">Parameter b.</param>
      <param name="e">Parameter e.</param>
      <param name="c">Parameter c.</param>
      <param name="f">Parameter f.</param>
    </member>
  </members>
</doc>

The prefix of the name property determines what kind of code element the documentation is for, as follows: Methods “M:”; Types “T:”; Fields “F:”; Properties “P:”; Constructors “M:”; Events “E:”.

The remaining parts of the name property are the fully qualified type and member names, along with any necessary parameters and/or generic parameters. Generic parameters from types are represented with a single backtick followed by the index (“`X”), while generic parameters from methods are represented with two backticks followed by the index (“``Y”). The same characters followed by a count, as in “`3” and “``3” in Figure 2, record how many generic parameters a type or method declares. Arrays and unsafe pointers have the same syntax as they have in the source code; however, multidimensional arrays include a “0:” string for each dimension, separated by commas. Ref/Out/In parameters are all handled the same, with just an at sign (@) appended to the end of the type. Optional parameters (with default values) don’t have any special formatting.

As you can see, XML name properties can get a little complicated as you factor in various features of the languages, but the important takeaway is that the reflection types in .NET include all the logic necessary to build the “name” properties as they appear in the XML file.
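To see the indices that reflection reports for generic parameters, here is a minimal sketch. The Outer class and the FormatGenericParameter helper are hypothetical names made up for this illustration; the sketch relies on the Type.GenericParameterPosition and Type.DeclaringMethod properties, which is one possible way to obtain the indices and not necessarily the only one.

```csharp
using System;
using System.Reflection;

// Hypothetical type used only for this illustration.
public class Outer<A, B>
{
    public static void Method<D>(A a, B b, D d) { }
}

public static class GenericIndexDemo
{
    // Hypothetical helper: formats a generic parameter the way it
    // appears in the XML file ("`N" for types, "``N" for methods).
    public static string FormatGenericParameter(Type type)
    {
        if (!type.IsGenericParameter)
        {
            throw new ArgumentException("Not a generic parameter.", nameof(type));
        }
        // DeclaringMethod is non-null only when a method declared the parameter.
        string prefix = type.DeclaringMethod != null ? "``" : "`";
        return prefix + type.GenericParameterPosition;
    }

    public static void Main()
    {
        MethodInfo method = typeof(Outer<,>).GetMethod("Method");
        foreach (ParameterInfo parameter in method.GetParameters())
        {
            Console.WriteLine(
                parameter.Name + " -> " + FormatGenericParameter(parameter.ParameterType));
        }
        // Prints:
        // a -> `0
        // b -> `1
        // d -> ``0
    }
}
```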

Enough background—let’s get into the code for accessing XML documentation. You first need to load the XML file into memory, which can be easily done using the XmlReader class. Content can be stored using Dictionary<string, string>. The key for the dictionary will be the name property as it exists in the XML file, and the value will be the content of the XML documentation (the inner XML of the member tag in the XML file). See Figure 3.

Figure 3 XML Loading Function

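// Requires: using System.Collections.Generic; using System.IO; using System.Xml;
internal static Dictionary<string, string> loadedXmlDocumentation =
  new Dictionary<string, string>();
public static void LoadXmlDocumentation(string xmlDocumentation)
{
  using (XmlReader xmlReader = XmlReader.Create(new StringReader(xmlDocumentation)))
  {
    while (!xmlReader.EOF)
    {
      if (xmlReader.NodeType == XmlNodeType.Element && xmlReader.Name == "member")
      {
        string rawName = xmlReader["name"];
        // ReadInnerXml leaves the reader on the node after </member>, so
        // Read must not be called here or an adjacent member could be skipped.
        loadedXmlDocumentation[rawName] = xmlReader.ReadInnerXml();
      }
      else
      {
        xmlReader.Read();
      }
    }
  }
}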
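A quick way to check a loader of this kind is to feed it a small XML string and print the resulting dictionary. The sketch below is a self-contained stand-in, not the article’s code: the XmlLoadingDemo class and Load method are hypothetical names, and the input is deliberately written without whitespace between member tags. Because ReadInnerXml already moves the reader past the closing tag, the loop only calls Read when it did not just consume a member.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class XmlLoadingDemo
{
    // Hypothetical stand-in for a loading function: maps each member's
    // name attribute to the inner XML of its member element.
    public static Dictionary<string, string> Load(string xml)
    {
        var result = new Dictionary<string, string>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "member")
                {
                    string name = reader["name"];
                    // Advances the reader to the node after </member>.
                    result[name] = reader.ReadInnerXml();
                }
                else
                {
                    reader.Read();
                }
            }
        }
        return result;
    }

    public static void Main()
    {
        // No whitespace between the two member elements on purpose.
        string xml =
            "<doc><members>" +
            "<member name=\"T:Example.First\"><summary>First.</summary></member>" +
            "<member name=\"T:Example.Second\"><summary>Second.</summary></member>" +
            "</members></doc>";

        foreach (KeyValuePair<string, string> entry in Load(xml))
        {
            Console.WriteLine(entry.Key + " => " + entry.Value);
        }
    }
}
```

Both members should end up in the dictionary even though nothing separates them in the input.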

Next, let’s explore accessing XML from the dictionary. The reflection types in the System.Reflection namespace represent types and members of compiled code: Type, FieldInfo, MethodInfo, ConstructorInfo, PropertyInfo, EventInfo, MemberInfo and ParameterInfo. You can create extension methods that let you call the methods as if they were instance methods on the reflection types. In the extension methods, you just need to convert the reflection type into the key of the dictionary that holds the loaded XML documentation. I created an extension method called GetDocumentation, as shown in Figure 4.

Figure 4 Format the Key Strings

// Helper method to format the key strings
private static string XmlDocumentationKeyHelper(
  string typeFullNameString,
  string memberNameString)
{
    string key = Regex.Replace(
      typeFullNameString, @"\[.*\]",
      string.Empty).Replace('+', '.');
    if (memberNameString != null)
    {
        key += "." + memberNameString;
    }
    return key;
}
public static string GetDocumentation(this Type type)
{
  string key = "T:" + XmlDocumentationKeyHelper(type.FullName, null);
  loadedXmlDocumentation.TryGetValue(key, out string documentation);
  return documentation;
}
public static string GetDocumentation(this PropertyInfo propertyInfo)
{
  string key = "P:" + XmlDocumentationKeyHelper(
    propertyInfo.DeclaringType.FullName, propertyInfo.Name);
  loadedXmlDocumentation.TryGetValue(key, out string documentation);
  return documentation;
}

The extension methods for EventInfo and FieldInfo should be identical to the PropertyInfo method, just with “E:” and “F:” prefix strings, respectively. Note that the replacement of the “+” symbol with the “.” is to deal with nested types. The replacement of the bracketed string by “Regex” handles assembly information on the FullName of the Type.
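For completeness, the FieldInfo and EventInfo variants might look like the following sketch. It is self-contained so it can run on its own: the Sample class, the Loaded dictionary (a stand-in for the dictionary filled by the loading function) and the KeyHelper name are assumptions made for this illustration, and the dictionary is filled by hand instead of from an XML file.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;
using System.Text.RegularExpressions;

// Hypothetical type used only for this illustration.
public class Sample
{
    public int Count;
    public event EventHandler Changed;
}

public static class FieldEventDocumentation
{
    // Stand-in for the dictionary that the loading function fills.
    public static readonly Dictionary<string, string> Loaded =
        new Dictionary<string, string>();

    // Same idea as the key helper in Figure 4: strip bracketed
    // information and turn the nested-type '+' into '.'.
    private static string KeyHelper(string typeFullName, string memberName)
    {
        string key = Regex.Replace(typeFullName, @"\[.*\]", string.Empty)
            .Replace('+', '.');
        return memberName == null ? key : key + "." + memberName;
    }

    public static string GetDocumentation(this FieldInfo fieldInfo)
    {
        string key = "F:" + KeyHelper(fieldInfo.DeclaringType.FullName, fieldInfo.Name);
        Loaded.TryGetValue(key, out string documentation);
        return documentation;
    }

    public static string GetDocumentation(this EventInfo eventInfo)
    {
        string key = "E:" + KeyHelper(eventInfo.DeclaringType.FullName, eventInfo.Name);
        Loaded.TryGetValue(key, out string documentation);
        return documentation;
    }

    public static void Main()
    {
        // Filled by hand here; normally this comes from the XML file.
        Loaded["F:Sample.Count"] = "<summary>Number of items.</summary>";
        Loaded["E:Sample.Changed"] = "<summary>Raised on change.</summary>";

        Console.WriteLine(typeof(Sample).GetField("Count").GetDocumentation());
        Console.WriteLine(typeof(Sample).GetEvent("Changed").GetDocumentation());
    }
}
```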

The MethodInfo and ConstructorInfo extension methods are a little trickier. Constructors and methods can both have parameters—arrays, pointers, ref/in/out types, generic types and so forth. Methods can even define their own generic type parameters. It’s easiest to start by storing all the generic parameters in dictionaries, like so:

Dictionary<string, int> typeGenericMap = new Dictionary<string, int>();
int tempTypeGeneric = 0;
Array.ForEach(methodInfo.DeclaringType.GetGenericArguments(),
  x => typeGenericMap[x.Name] = tempTypeGeneric++);
Dictionary<string, int> methodGenericMap = new Dictionary<string, int>();
int tempMethodGeneric = 0;
Array.ForEach(methodInfo.GetGenericArguments(),
  x => methodGenericMap.Add(x.Name, tempMethodGeneric++));
ParameterInfo[] parameterInfos = methodInfo.GetParameters();

With the generic parameters stored in dictionaries, you can easily obtain their indices. Remember that the generic parameters appear in the XML documentation as backticks followed by the index. However, generic parameters aren’t the only special type you need to handle. Arrays, reference parameters, and pointers also have unique syntax in the XML file. Figure 5 has some pseudo code for converting the ParameterInfos into strings.

Figure 5 Parameters to Strings Pseudo Code

foreach (var parameterInfo in parameterInfos) {
  if (parameterInfo.ParameterType.HasElementType) {
    // The type is either an array, pointer, or reference
    if (parameterInfo.ParameterType.IsArray) {
      // Append the "[]" array brackets onto the element type
    }
    else if (parameterInfo.ParameterType.IsPointer) {
      // Append the "*" pointer symbol to the element type
    }
    else if (parameterInfo.ParameterType.IsByRef) {
      // Append the "@" symbol to the element type
    }
  }
  else if (parameterInfo.ParameterType.IsGenericParameter) {
    // Look up the index of the generic in the
    // dictionaries built above, appending "`" if
    // the parameter is from a type or "``" if the
    // parameter is from a method
  }
  else {
    // Nothing fancy, just convert the type to a string
  }
}

Figure 5 is heavily simplified, but hopefully it gets the idea across. Check out my project on GitHub to see the full code for the ConstructorInfo and MethodInfo extension methods (github.com/ZacharyPatten/Towel).
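As a rough, runnable approximation of Figure 5 (not the Towel implementation), the sketch below converts parameter types into the strings used in the XML keys. The Holder class and the TypeToKey and ParametersToKey helpers are hypothetical names. The sketch reads generic parameter positions directly from reflection instead of from name-keyed dictionaries, recurses into element types, and deliberately ignores constructed generic types such as List<int>, which need additional formatting.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical type used only for this illustration.
public class Holder<A>
{
    public static void Method<D>(A a, D[] d, int[,] grid, ref string text) { }
}

public static class ParameterKeyDemo
{
    // Hypothetical helper: formats one parameter type for the XML key.
    public static string TypeToKey(Type type)
    {
        if (type.HasElementType)
        {
            // Arrays, pointers and by-ref types wrap an element type.
            string element = TypeToKey(type.GetElementType());
            if (type.IsArray)
            {
                int rank = type.GetArrayRank();
                return rank == 1
                    ? element + "[]"
                    : element + "[" + string.Join(",", Enumerable.Repeat("0:", rank)) + "]";
            }
            if (type.IsPointer)
            {
                return element + "*";
            }
            return element + "@"; // ref, out and in parameters
        }
        if (type.IsGenericParameter)
        {
            // "``" for parameters declared by a method, "`" for a type.
            string prefix = type.DeclaringMethod != null ? "``" : "`";
            return prefix + type.GenericParameterPosition;
        }
        // Plain type: nested types use '+' in FullName but '.' in the XML.
        return type.FullName.Replace('+', '.');
    }

    // Hypothetical helper: builds the parenthesized parameter list.
    public static string ParametersToKey(MethodBase method)
    {
        ParameterInfo[] parameters = method.GetParameters();
        if (parameters.Length == 0)
        {
            return string.Empty;
        }
        return "(" + string.Join(",",
            parameters.Select(p => TypeToKey(p.ParameterType))) + ")";
    }

    public static void Main()
    {
        MethodInfo method = typeof(Holder<>).GetMethod("Method");
        Console.WriteLine(ParametersToKey(method));
        // Prints: (`0,``0[],System.Int32[0:,0:],System.String@)
    }
}
```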

MemberInfo is a base class for the reflection types. You can make an extension method for the MemberInfo type that funnels into the extension methods for the specific types, as shown in Figure 6.

Figure 6 MemberInfo Extension Method

public static string GetDocumentation(this MemberInfo memberInfo)
{
  if (memberInfo.MemberType.HasFlag(MemberTypes.Field)) {
    return ((FieldInfo)memberInfo).GetDocumentation();
  }
  else if (memberInfo.MemberType.HasFlag(MemberTypes.Property)) {
    return ((PropertyInfo)memberInfo).GetDocumentation();
  }
  else if (memberInfo.MemberType.HasFlag(MemberTypes.Event)) {
    return ((EventInfo)memberInfo).GetDocumentation();
  }
  else if (memberInfo.MemberType.HasFlag(MemberTypes.Constructor)) {
    return ((ConstructorInfo)memberInfo).GetDocumentation();
  }
  else if (memberInfo.MemberType.HasFlag(MemberTypes.Method)) {
    return ((MethodInfo)memberInfo).GetDocumentation();
  }
  else if (memberInfo.MemberType.HasFlag(MemberTypes.TypeInfo) ||
    memberInfo.MemberType.HasFlag(MemberTypes.NestedType)) {
    return ((Type)memberInfo).GetDocumentation();
  }
  else {
    return null;
  }
}

Don’t forget ParameterInfo. It has a Member property that returns the MemberInfo that declares the parameter. Just call the MemberInfo GetDocumentation extension method shown in Figure 6 and extract the XML for the specific parameter if it exists. Figure 7 shows how this is done.

Figure 7 ParameterInfo Extension Method

public static string GetDocumentation(this ParameterInfo parameterInfo)
{
  string memberDocumentation = parameterInfo.Member.GetDocumentation();
  if (memberDocumentation != null) {
    string regexPattern =
      Regex.Escape(@"<param name=" + "\"" + parameterInfo.Name + "\"" + @">") +
      ".*?" +
      Regex.Escape(@"</param>");
    // Singleline lets "." match line breaks in multi-line param text
    Match match = Regex.Match(
      memberDocumentation, regexPattern, RegexOptions.Singleline);
    if (match.Success) {
      return match.Value;
    }
  }
  return null;
}

Automatically Loading the XML Files as Needed

Is there a way to bypass the loading function? If you follow the standard of having your output XML files in the same directory and with the same name as your assemblies, it’s easy to automatically look up an XML file so you don’t need to call the loading function.

You can get the assembly from any type with the GetAssembly method or the Assembly property. Then you can get the file location of an assembly through the CodeBase property of the assembly. Finally, just alter the file path to look for the XML file instead of the assembly, and call the loading function shown in Figure 3. You can see how this works in Figure 8.

Figure 8 Loading XML Documentation from Assembly

public static string GetDirectoryPath(this Assembly assembly)
{
  string codeBase = assembly.CodeBase;
  UriBuilder uri = new UriBuilder(codeBase);
  string path = Uri.UnescapeDataString(uri.Path);
  return Path.GetDirectoryName(path);
}
internal static HashSet<Assembly> loadedAssemblies = new HashSet<Assembly>();
internal static void LoadXmlDocumentation(Assembly assembly)
{
  if (loadedAssemblies.Contains(assembly)) {
    return; // Already loaded
  }
  string directoryPath = assembly.GetDirectoryPath();
  string xmlFilePath = Path.Combine(directoryPath, assembly.GetName().Name + ".xml");
  if (File.Exists(xmlFilePath)) {
    LoadXmlDocumentation(File.ReadAllText(xmlFilePath));
    loadedAssemblies.Add(assembly);
  }
}
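On .NET 5 and later, where Assembly.CodeBase is marked obsolete, a similar lookup can be sketched with Assembly.Location instead. The XmlPathDemo class and GetXmlDocumentationPath helper are hypothetical names, and the sketch assumes the XML file sits next to the assembly with the same file name and an .xml extension.

```csharp
using System;
using System.IO;
using System.Reflection;

public static class XmlPathDemo
{
    // Hypothetical helper: guesses where the XML documentation file lives,
    // assuming it sits beside the assembly and shares its file name.
    public static string GetXmlDocumentationPath(Assembly assembly)
    {
        // Location is empty for assemblies that were not loaded from a file.
        string location = assembly.Location;
        if (string.IsNullOrEmpty(location))
        {
            return null;
        }
        return Path.ChangeExtension(location, ".xml");
    }

    public static void Main()
    {
        string path = GetXmlDocumentationPath(typeof(XmlPathDemo).Assembly);
        Console.WriteLine(path ?? "(assembly has no file location)");
        if (path != null)
        {
            Console.WriteLine(File.Exists(path) ? "XML file found" : "XML file not found");
        }
    }
}
```

As with Figure 8, the caller should still check that the file exists before trying to load it.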

Then you can update the extension methods to load the XML documentation of the assembly if it hasn’t already been loaded. The following code shows the System.Type extension method from before with automatic XML file loading:

public static string GetDocumentation(this Type type)
{
  LoadXmlDocumentation(type.Assembly);
  // ... Rest of the code
}

Usage Examples

Now that you have all of the necessary framework code, you just need to call it. Don’t forget to add a using statement at the top of the file so the extension methods are visible. Figure 9 shows usage examples for printing the XML documentation of the current assembly to the console.

Figure 9 Usage Examples

// Optional loading function
LoadXmlDocumentation(File.ReadAllText("PATH/TO/XML/FILE.xml"));
// Write the documentation of all types to the console
foreach (Type type in Assembly.GetExecutingAssembly().GetTypes()) {
  Console.WriteLine(type.GetDocumentation());
}
// Write all the documentation of every member to the console
foreach (Type type in Assembly.GetExecutingAssembly().GetTypes()) {
  foreach (MemberInfo memberInfo in type.GetMembers()) {
    Console.WriteLine(memberInfo.GetDocumentation());
  }
}
// Write all the documentation of every parameter to the console
foreach (Type type in Assembly.GetExecutingAssembly().GetTypes()) {
  foreach (MemberInfo memberInfo in type.GetMembers()) {
    if (memberInfo is MethodBase methodBase) {
      foreach (ParameterInfo parameterInfo in methodBase.GetParameters()) {
        Console.WriteLine(parameterInfo.GetDocumentation());
      }
    }
  }
}

The most obvious use case for this code is documentation generation. You could easily reflect through all the types and members in your assembly, grab the XML documentation from them, and dump that documentation into any format you want.

There are multiple documentation generators out there, but they’re often rather restrictive as to their output formats. What I wanted was to dump all the documentation into a simple HTML tree that I could style myself. So, writing my own documentation generator using this methodology was the easiest solution. Also, none of the documentation generators are able to handle my custom XML tags. Here are examples of custom XML tags to give you some tag ideas:

notes: adding additional documentation outside the standard tags
revision: help document the history of a member
todo: mark members for future development
link: URL link to further documentation and/or videos
critical: mark members that are critical for external projects and shouldn’t be edited
bug: express a known bug in a member until it can be fixed
test: link to the unit tests for a member, or perhaps metadata for the unit testing
runtime: big O-notation runtime complexity
citation: crediting sources for code

If you write a documentation Web site generator that makes use of these kinds of tags and you use continuous integration to deploy the Web site on code commits, then notifying users of bugs could be as simple as adding the “bug” XML tag to one of your members in source code. Nice.

You could also use XML documentation for code analysis. If you use standardized XML tags like “runtime,” it may help you resolve hot spots in code. If one method is documented to have a runtime complexity of “O(n),” while another overload of the method has a runtime complexity of “O(ln(n)),” you’d clearly want to use the latter if possible. Theoretically, this could be added to code analyzers or possibly Visual Studio extensions to offer coding suggestions.

There are already compilation warnings to encourage XML documentation on publicly visible types and members in .NET code, but there aren’t any warnings to encourage people to use the “exception” tag. Being able to grab the XML through reflection may allow you to more easily find members that don’t meet development and documentation standards.

Important Mentions

This solution is not future proof. As new features are added in future versions of languages like C# and Visual Basic, the format of the XML file may change. It’s unlikely that backward compatibility would be broken, but new features with new XML formats would require the extension methods to be modified to deal with them.

XML documentation and attributes are very similar. Both appear above members in code providing metadata about the member. However, XML documentation shouldn’t be considered an alternative to attributes. Attributes are guaranteed to be reachable at runtime, because they’re included in the compiled code. Any metadata necessary for the functionality of code should be implemented as an attribute.

I encourage you to look at the code in my Towel project on GitHub (github.com/ZacharyPatten/Towel). There are also unit tests for the code within the project. I’m sure there’s plenty of room for improvement in the code, and probably some test cases I missed when testing it. You could go a step further and write extension method overloads that take string parameters to extract specific tags from the XML, such as grabbing the “summary” content specifically.
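One way such a tag-specific overload could work is sketched below. The TagExtraction class and GetTag helper are hypothetical, and the sketch takes the documentation string directly, where a real overload would first obtain it from GetDocumentation. Because the stored inner XML can contain several sibling tags, the sketch wraps it in a temporary root element before parsing it with XElement.

```csharp
using System;
using System.Xml.Linq;

public static class TagExtraction
{
    // Hypothetical helper: returns the trimmed text of the first tag with
    // the given name, or null when the tag is absent.
    public static string GetTag(string memberDocumentation, string tagName)
    {
        if (string.IsNullOrWhiteSpace(memberDocumentation))
        {
            return null;
        }
        // The inner XML may hold several top-level tags, so wrap it in a
        // temporary root element to make it a well-formed document.
        XElement root = XElement.Parse("<member>" + memberDocumentation + "</member>");
        XElement tag = root.Element(tagName);
        return tag?.Value.Trim();
    }

    public static void Main()
    {
        string documentation =
            "<summary>Sorts the values.</summary><runtime>O(n log n)</runtime>";

        Console.WriteLine(GetTag(documentation, "summary")); // Sorts the values.
        Console.WriteLine(GetTag(documentation, "runtime")); // O(n log n)
        Console.WriteLine(GetTag(documentation, "bug") ?? "(no bug tag)");
    }
}
```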

I hope it’s now clear that you can indeed access XML documentation easily via reflection. Probably the biggest drawback is having to make sure that the XML exists in the necessary file location so it can be loaded into memory. Just be sure to check that the files exist, and this technique should do the trick until a better methodology comes along.

Zachary Patten is just another programmer who likes coffee, Power Rangers and dogs. He graduated from Kansas State University, and currently has approximately five years of professional programming experience.

Thanks to the following Microsoft technical experts for reviewing this article: Den Delimaschi, Bill Wagner
Bill Wagner is a member of the .NET content team at Microsoft, responsible for the C# documentation on docs.microsoft.com. He is also a member of the ECMA C# Standards committee, and the author of “Effective C#” and “More Effective C#”.

Discuss this article in the MSDN Magazine forum
