Programming with Nodes
LINQ to XML developers who need to write programs such as an XML editor, a transform system, or a report writer often need to write programs that work at a finer level of granularity than elements and attributes. They often need to work at the node level, manipulating text nodes, processing instructions, and comments. This topic provides some details about programming at the node level.
Node Details
There are a number of details of programming that a programmer working at the node level should know.
Parent Property of Children Nodes of XDocument is Set to Null
The Parent property contains the parent XElement, not the parent node. Child nodes of XDocument have no parent XElement. Their parent is the document, so the Parent property for those nodes is set to null.
The following example demonstrates this:
XDocument doc = XDocument.Parse(@"<!-- a comment --><Root/>");
Console.WriteLine(doc.Nodes().OfType<XComment>().First().Parent == null);
Console.WriteLine(doc.Root.Parent == null);
Dim doc As XDocument = XDocument.Parse("<!-- a comment --><Root/>")
Console.WriteLine(doc.Nodes().OfType(Of XComment).First().Parent Is Nothing)
Console.WriteLine(doc.Root.Parent Is Nothing)
This example produces the following output:
True
True
Adjacent Text Nodes are Possible
In a number of XML programming models, adjacent text nodes are always merged. This is sometimes called normalization of text nodes. LINQ to XML does not normalize text nodes. If you add two text nodes to the same element, it will result in adjacent text nodes. However, if you add content specified as a string rather than as an XText node, LINQ to XML might merge the string with an adjacent text node.
The following example demonstrates this:
XElement xmlTree = new XElement("Root", "Content");
Console.WriteLine(xmlTree.Nodes().OfType<XText>().Count());
// this does not add a new text node
xmlTree.Add("new content");
Console.WriteLine(xmlTree.Nodes().OfType<XText>().Count());
// this does add a new, adjacent text node
xmlTree.Add(new XText("more text"));
Console.WriteLine(xmlTree.Nodes().OfType<XText>().Count());
Dim xmlTree As XElement = <Root>Content</Root>
Console.WriteLine(xmlTree.Nodes().OfType(Of XText)().Count())
' This does not add a new text node.
xmlTree.Add("new content")
Console.WriteLine(xmlTree.Nodes().OfType(Of XText)().Count())
'// This does add a new, adjacent text node.
xmlTree.Add(New XText("more text"))
Console.WriteLine(xmlTree.Nodes().OfType(Of XText)().Count())
This example produces the following output:
1
1
2
Empty Text Nodes are Possible
In some XML programming models, text nodes are guaranteed to not contain the empty string. The reasoning is that such a text node has no impact on serialization of the XML. However, for the same reason that adjacent text nodes are possible, if you remove the text from a text node by setting its value to the empty string, the text node itself will not be deleted.
XElement xmlTree = new XElement("Root", "Content");
XText textNode = xmlTree.Nodes().OfType<XText>().First();
// the following line does not cause the removal of the text node.
textNode.Value = "";
XText textNode2 = xmlTree.Nodes().OfType<XText>().First();
Console.WriteLine(">>{0}<<", textNode2);
Dim xmlTree As XElement = <Root>Content</Root>
Dim textNode As XText = xmlTree.Nodes().OfType(Of XText)().First()
' The following line does not cause the removal of the text node.
textNode.Value = ""
Dim textNode2 As XText = xmlTree.Nodes().OfType(Of XText)().First()
Console.WriteLine(">>{0}<<", textNode2)
This example produces the following output:
>><<
An Empty Text Node Impacts Serialization
If an element contains only a child text node that is empty, it is serialized with the long tag syntax: <Child></Child>. If an element contains no child nodes whatsoever, it is serialized with the short tag syntax: <Child />.
XElement child1 = new XElement("Child1",
new XText("")
);
XElement child2 = new XElement("Child2");
Console.WriteLine(child1);
Console.WriteLine(child2);
Dim child1 As XElement = New XElement("Child1", _
New XText("") _
)
Dim child2 As XElement = New XElement("Child2")
Console.WriteLine(child1)
Console.WriteLine(child2)
This example produces the following output:
<Child1></Child1>
<Child2 />
Namespaces are Attributes in the LINQ to XML Tree
Even though namespace declarations have identical syntax to attributes, in some programming interfaces, such as XSLT and XPath, namespace declarations are not considered to be attributes. However, in LINQ to XML, namespaces are stored as XAttribute objects in the XML tree. If you iterate through the attributes for an element that contains a namespace declaration, you will see the namespace declaration as one of the items in the returned collection.
The IsNamespaceDeclaration property indicates whether an attribute is a namespace declaration.
XElement root = XElement.Parse(
@"<Root
xmlns='https://www.adventure-works.com'
xmlns:fc='www.fourthcoffee.com'
AnAttribute='abc'/>");
foreach (XAttribute att in root.Attributes())
Console.WriteLine("{0} IsNamespaceDeclaration:{1}", att, att.IsNamespaceDeclaration);
Dim root As XElement = _
<Root
xmlns='https://www.adventure-works.com'
xmlns:fc='www.fourthcoffee.com'
AnAttribute='abc'/>
For Each att As XAttribute In root.Attributes()
Console.WriteLine("{0} IsNamespaceDeclaration:{1}", att, _
att.IsNamespaceDeclaration)
Next
This example produces the following output:
xmlns="https://www.adventure-works.com" IsNamespaceDeclaration:True
xmlns:fc="www.fourthcoffee.com" IsNamespaceDeclaration:True
AnAttribute="abc" IsNamespaceDeclaration:False
XPath Axis Methods Do Not Return Child White Space of XDocument
LINQ to XML allows for child text nodes of an XDocument, as long as the text nodes contain only white space. However, the XPath object model does not include white space as child nodes of a document, so when you iterate through the children of an XDocument using the Nodes axis, white space text nodes will be returned. However, when you iterate through the children of an XDocument using the XPath axis methods, white space text nodes will not be returned.
// Create a document with some white space child nodes of the document.
XDocument root = XDocument.Parse(
@"<?xml version='1.0' encoding='utf-8' standalone='yes'?>
<Root/>
<!--a comment-->
", LoadOptions.PreserveWhitespace);
// count the white space child nodes using LINQ to XML
Console.WriteLine(root.Nodes().OfType<XText>().Count());
// count the white space child nodes using XPathEvaluate
Console.WriteLine(((IEnumerable)root.XPathEvaluate("text()")).OfType<XText>().Count());
' Create a document with some white space child nodes of the document.
Dim root As XDocument = XDocument.Parse( _
"<?xml version='1.0' encoding='utf-8' standalone='yes'?>" & _
vbNewLine & "<Root/>" & vbNewLine & "<!--a comment-->" & vbNewLine, _
LoadOptions.PreserveWhitespace)
' Count the white space child nodes using LINQ to XML.
Console.WriteLine(root.Nodes().OfType(Of XText)().Count())
' Count the white space child nodes using XPathEvaluate.
Dim nodes As IEnumerable = CType(root.XPathEvaluate("text()"), IEnumerable)
Console.WriteLine(nodes.OfType(Of XText)().Count())
This example produces the following output:
3
0
XDeclaration Objects are not Nodes
When you iterate through the children nodes of an XDocument, you will not see the XML declaration object. It is a property of the document, not a child node of it.
XDocument doc = new XDocument(
new XDeclaration("1.0", "utf-8", "yes"),
new XElement("Root")
);
doc.Save("Temp.xml");
Console.WriteLine(File.ReadAllText("Temp.xml"));
// this shows that there is only one child node of the document
Console.WriteLine(doc.Nodes().Count());
Dim doc As XDocument = _
<?xml version='1.0' encoding='utf-8' standalone='yes'?>
<Root/>
doc.Save("Temp.xml")
Console.WriteLine(File.ReadAllText("Temp.xml"))
' This shows that there is only one child node of the document.
Console.WriteLine(doc.Nodes().Count())
This example produces the following output:
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<Root />
1