CA3009: Review code for XML injection vulnerabilities

Property Value
Rule ID CA3009
Title Review code for XML injection vulnerabilities
Category Security
Fix is breaking or non-breaking Non-breaking
Enabled by default in .NET 8 No

Cause

Potentially untrusted HTTP request input reaches raw XML output.

By default, this rule analyzes the entire codebase, but this is configurable.

Rule description

When working with untrusted input, be mindful of XML injection attacks. An attacker can use XML injection to insert special characters into an XML document, making the document invalid XML. Or, an attacker could maliciously insert XML nodes of their choosing.

This rule attempts to find input from HTTP requests reaching a raw XML write.

Note

This rule can't track data across assemblies. For example, if one assembly reads the HTTP request input and then passes it to another assembly that writes raw XML, this rule won't produce a warning.

Note

There is a configurable limit to how deep this rule will analyze data flow across method calls. See Analyzer Configuration for how to configure the limit in an EditorConfig file.

How to fix violations

To fix a violation, use one of the following techniques:

  • Don't write raw XML. Instead, use methods or properties that XML-encode their input.
  • XML-encode input before writing raw XML.
  • Validate user input by using sanitizers for primitive type conversion and XML encoding.

When to suppress warnings

Don't suppress warnings from this rule.

Pseudo-code examples

Violation

In this example, the input is set to the InnerXml property of the root element. Given input that contains valid XML, a malicious user can then completely alter the document. Notice that alice is no longer an allowed user after the user input is added to the document.

protected void Page_Load(object sender, EventArgs e)
{
    XmlDocument d = new XmlDocument();
    XmlElement root = d.CreateElement("root");
    d.AppendChild(root);

    XmlElement allowedUser = d.CreateElement("allowedUser");
    root.AppendChild(allowedUser);
    allowedUser.InnerXml = "alice";

    string input = Request.Form["in"];
    root.InnerXml = input;
}
Sub Page_Load(sender As Object, e As EventArgs)
    Dim d As XmlDocument = New XmlDocument()
    Dim root As XmlElement = d.CreateElement("root")
    d.AppendChild(root)

    Dim allowedUser As XmlElement = d.CreateElement("allowedUser")
    root.AppendChild(allowedUser)
    allowedUser.InnerXml = "alice"

    Dim input As String = Request.Form("in")
    root.InnerXml = input
End Sub

If an attacker uses this for input: some text<allowedUser>oscar</allowedUser>, then the XML document will be:

<root>some text<allowedUser>oscar</allowedUser>
</root>

Solution

To fix this violation, set the input to the InnerText property of the root element instead of the InnerXml property.

protected void Page_Load(object sender, EventArgs e)
{
    XmlDocument d = new XmlDocument();
    XmlElement root = d.CreateElement("root");
    d.AppendChild(root);

    XmlElement allowedUser = d.CreateElement("allowedUser");
    root.AppendChild(allowedUser);
    allowedUser.InnerText = "alice";

    string input = Request.Form["in"];
    root.InnerText = input;
}
Sub Page_Load(sender As Object, e As EventArgs)
    Dim d As XmlDocument = New XmlDocument()
    Dim root As XmlElement = d.CreateElement("root")
    d.AppendChild(root)

    Dim allowedUser As XmlElement = d.CreateElement("allowedUser")
    root.AppendChild(allowedUser)
    allowedUser.InnerText = "alice"

    Dim input As String = Request.Form("in")
    root.InnerText = input
End Sub

If an attacker uses this for input: some text<allowedUser>oscar</allowedUser>, then the XML document will be:

<root>some text&lt;allowedUser&gt;oscar&lt;/allowedUser&gt;
<allowedUser>alice</allowedUser>
</root>

Configure code to analyze

Use the following options to configure which parts of your codebase to run this rule on.

You can configure these options for just this rule, for all rules it applies to, or for all rules in this category (Security) that it applies to. For more information, see Code quality rule configuration options.

Exclude specific symbols

You can exclude specific symbols, such as types and methods, from analysis. For example, to specify that the rule should not run on any code within types named MyType, add the following key-value pair to an .editorconfig file in your project:

dotnet_code_quality.CAXXXX.excluded_symbol_names = MyType

Allowed symbol name formats in the option value (separated by |):

  • Symbol name only (includes all symbols with the name, regardless of the containing type or namespace).
  • Fully qualified names in the symbol's documentation ID format. Each symbol name requires a symbol-kind prefix, such as M: for methods, T: for types, and N: for namespaces.
  • .ctor for constructors and .cctor for static constructors.

Examples:

Option Value Summary
dotnet_code_quality.CAXXXX.excluded_symbol_names = MyType Matches all symbols named MyType.
dotnet_code_quality.CAXXXX.excluded_symbol_names = MyType1|MyType2 Matches all symbols named either MyType1 or MyType2.
dotnet_code_quality.CAXXXX.excluded_symbol_names = M:NS.MyType.MyMethod(ParamType) Matches specific method MyMethod with the specified fully qualified signature.
dotnet_code_quality.CAXXXX.excluded_symbol_names = M:NS1.MyType1.MyMethod1(ParamType)|M:NS2.MyType2.MyMethod2(ParamType) Matches specific methods MyMethod1 and MyMethod2 with the respective fully qualified signatures.

Exclude specific types and their derived types

You can exclude specific types and their derived types from analysis. For example, to specify that the rule should not run on any methods within types named MyType and their derived types, add the following key-value pair to an .editorconfig file in your project:

dotnet_code_quality.CAXXXX.excluded_type_names_with_derived_types = MyType

Allowed symbol name formats in the option value (separated by |):

  • Type name only (includes all types with the name, regardless of the containing type or namespace).
  • Fully qualified names in the symbol's documentation ID format, with an optional T: prefix.

Examples:

Option Value Summary
dotnet_code_quality.CAXXXX.excluded_type_names_with_derived_types = MyType Matches all types named MyType and all of their derived types.
dotnet_code_quality.CAXXXX.excluded_type_names_with_derived_types = MyType1|MyType2 Matches all types named either MyType1 or MyType2 and all of their derived types.
dotnet_code_quality.CAXXXX.excluded_type_names_with_derived_types = M:NS.MyType Matches specific type MyType with given fully qualified name and all of its derived types.
dotnet_code_quality.CAXXXX.excluded_type_names_with_derived_types = M:NS1.MyType1|M:NS2.MyType2 Matches specific types MyType1 and MyType2 with the respective fully qualified names, and all of their derived types.