Using SelectSingleNode (or SelectNodes) on XML where the default namespace has been set
I've been stumped by this one at least twice over the last couple of years, so I thought it was a good candidate for a write-up here.
I was trying to select a node from some standard XHTML where the default namespace was set. In other words, the XHTML was something like:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "https://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"[]>
<html xmlns="https://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
<title>MSN Search News: Microsoft</title> ...
Note the xmlns attribute on the root <html> node.
Without thinking too hard, I first tried to find the title of the page by going ...
XmlDocument resultsXhtml = new XmlDocument();
resultsXhtml.Load("https://search.msn.com/news/results.aspx?q=Microsoft");
XmlNode metaNode = resultsXhtml.SelectSingleNode("//title");
... which left metaNode as null.
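This took me a little while to figure out. XPath does not apply the document's default namespace to unprefixed names, so //title only matches title elements that are in no namespace at all. Clearly I needed to say in the XPath query that the title tag is in the default namespace, but how could I do that when that namespace has no prefix in the actual XML?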
The solution (reasonably obviously!) is to register a prefix of my own choosing in an XmlNamespaceManager object, and then use that namespace manager when doing the select. Here's some code that works:
XmlDocument resultsXhtml = new XmlDocument();
resultsXhtml.Load("https://search.msn.com/news/results.aspx?q=Microsoft");
XmlNamespaceManager namespaceManager = new XmlNamespaceManager(resultsXhtml.NameTable);
namespaceManager.AddNamespace("myprefix", "https://www.w3.org/1999/xhtml");
XmlNode metaNode = resultsXhtml.SelectSingleNode("//myprefix:title", namespaceManager);
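If you want to reproduce this without hitting the network, here is a minimal sketch that loads a small inline document instead of the live page. The inline markup, the prefix "x" and the variable names are made up for illustration; the namespace URI is simply copied from the sample XHTML above. It also shows the same namespace manager being passed to SelectNodes.

```csharp
using System;
using System.Xml;

// Hypothetical stand-in for the downloaded page: no DOCTYPE, no network access.
const string xhtml =
    "<html xmlns=\"https://www.w3.org/1999/xhtml\" lang=\"en\">" +
    "<head><title>MSN Search News: Microsoft</title></head>" +
    "<body><p>first</p><p>second</p></body>" +
    "</html>";

var doc = new XmlDocument();
doc.LoadXml(xhtml);

// Map a prefix of our own choosing to the document's default namespace URI.
var ns = new XmlNamespaceManager(doc.NameTable);
ns.AddNamespace("x", "https://www.w3.org/1999/xhtml");

// Unprefixed name: only matches elements in no namespace, so nothing is found.
var missing = doc.SelectSingleNode("//title");
Console.WriteLine(missing == null);      // True

// Prefixed name plus the namespace manager: the title element is found.
var title = doc.SelectSingleNode("//x:title", ns);
Console.WriteLine(title?.InnerText);     // MSN Search News: Microsoft

// SelectNodes takes the namespace manager in exactly the same way.
var paragraphs = doc.SelectNodes("//x:p", ns);
Console.WriteLine(paragraphs?.Count);    // 2
```

Every element name in the path needs the prefix, not just the last one, so a longer query would look like /x:html/x:head/x:title.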
I think what's interesting about this problem is the way you have to think about namespaces and XPath queries. The namespace is a logical entity identified by its URI, not by the prefix used in the actual XML. Therefore you can register that URI with any prefix you want in your XPath, which isn't a completely intuitive concept - to me at least!
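To make the "URI, not prefix" point concrete, here is a small sketch under a few assumptions: the one-line document and the prefixes "a", "anything" and "wrong" are invented for the example, and the namespace URI is the one from the sample XHTML. Two different prefixes bound to the same URI select the very same node, while a URI that differs by a single character selects nothing.

```csharp
using System;
using System.Xml;

// Namespace URI as it appears in the sample XHTML's xmlns attribute.
const string xhtmlNamespace = "https://www.w3.org/1999/xhtml";

var doc = new XmlDocument();
doc.LoadXml("<html xmlns=\"" + xhtmlNamespace + "\"><head><title>Example</title></head></html>");

var ns = new XmlNamespaceManager(doc.NameTable);
ns.AddNamespace("a", xhtmlNamespace);              // arbitrary prefix
ns.AddNamespace("anything", xhtmlNamespace);       // another arbitrary prefix, same URI
ns.AddNamespace("wrong", xhtmlNamespace + "/");    // different URI (extra trailing slash)

var viaA = doc.SelectSingleNode("//a:title", ns);
var viaAnything = doc.SelectSingleNode("//anything:title", ns);
var viaWrong = doc.SelectSingleNode("//wrong:title", ns);

// Both prefixes resolve to the same namespace, hence the same DOM node.
Console.WriteLine(object.ReferenceEquals(viaA, viaAnything));   // True

// The URI is compared as a plain string, so the near-miss matches nothing.
Console.WriteLine(viaWrong == null);                            // True

// The node itself carries the URI but no prefix, exactly as in the source XML.
Console.WriteLine(viaA?.NamespaceURI);                          // https://www.w3.org/1999/xhtml
Console.WriteLine(viaA?.Prefix.Length);                         // 0
```

The prefixes registered in the namespace manager only exist for the duration of the query; they never have to appear in the document being searched.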