Robert MacHale, MacHale Information Systems
Paul D. Sheriff, PDSA, Inc
February 2002
Summary: Describes the DOM parser in Microsoft Visual Studio .NET and how to work with XML documents within the Microsoft .NET Framework. This article also explains the differences between the Visual Studio .NET DOM parser and Microsoft Visual Studio 6.0 Microsoft.XMLDOM. (14 printed pages)
Objectives
- Learn about the DOM parser in Microsoft® .NET
- Learn to load an XML document
- Learn to select specific elements with XPath
- Learn to read element content and attributes
- Learn to update element content and attributes
Assumptions
The following should be true for you to get the most out of this document:
- You understand XML and how to put together an XML document
- You have used the Microsoft.XMLDOM object in Microsoft Visual Basic® 6.0
Contents
Introduction
The XML Parser
Loading an XML Document
Loading an XML Document with Microsoft.XMLDOM in Visual Basic 6.0
Loading an XML Document with System.Xml in Visual Basic .NET
XPath
XPath with XMLDOM in Visual Basic 6.0
XPath with System.Xml in Visual Basic .NET
NodeList
NodeList with XMLDOM in Visual Basic 6.0
NodeList with System.Xml in Visual Basic .NET
Reading an Attribute
Reading an Attribute with XMLDOM in Visual Basic 6.0
Reading an Attribute with System.Xml in Visual Basic .NET
Updating an Element
Updating an Element with XMLDOM in Visual Basic 6.0
Updating an Element with System.Xml in Visual Basic .NET
Updating an Attribute
Updating an Attribute with XMLDOM in Visual Basic 6.0
Updating an Attribute with System.Xml in Visual Basic .NET
Summary
About the Authors
Introduction
XML documents have become a common way to pass data from one application to another. Whether you are communicating from one DLL to another, from one EXE to another, or even from one server to another, XML is simple and efficient to pass around. In Microsoft Visual Basic 6.0, you used the Microsoft.XMLDOM object to process XML documents. In Microsoft .NET, the equivalent object is called System.Xml.XmlDocument. In this paper, you will learn how to work with XML documents within the Microsoft .NET Framework. In addition, you will see how these objects differ between Visual Basic 6.0 and Visual Basic .NET.
The XML Parser
A .NET application inputs XML from a variety of sources. A client application transmits parameters to a server via an XML document. An application server parses and processes the content of the XML document. After completion of the server process, the client receives an XML document.
In the past, you have relied on the Microsoft.XMLDOM component to process XML documents. This component is commonly invoked from Active Server Pages, Visual Basic, and Internet Explorer. Fortunately, the .NET Framework provides a high-performance object that you can employ to parse and process your XML documents.
In this paper, you will learn how to load an XML document into .NET and display its content. You will see comparisons between the new System.Xml and the old Microsoft.XMLDOM. These comparisons provide a clear understanding of how XML works in the .NET Framework. Figure 1 shows the sample XML document that you will use throughout this paper.
Figure 1. A sample contact XML document that has several elements and a couple of attributes on one of the elements
Loading an XML Document
When you load and parse an XML document, you instantiate an object library that knows how to load and parse the document. After the object has been instantiated, you can command it to start working. Here are the steps to follow:
- Instantiate the XML parser.
- Load a specific XML document.
- Display the data located in the XML document.
Loading an XML Document with Microsoft.XMLDOM in Visual Basic 6.0
The Microsoft.XMLDOM object can be instantiated from Visual Basic, Active Server Pages, or Internet Explorer 5.x and above. This first example code demonstrates how to instantiate the XMLDOM component in Visual Basic. Once the object is ready, you can load an XML file. This file can then be displayed to the user.
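Private Sub btnLoad_Click()
    Dim MyXMLDOM As Object
    Set MyXMLDOM = CreateObject("Microsoft.XMLDOM")
    ' Load synchronously so the document is fully parsed before it is displayed
    MyXMLDOM.async = False
    MyXMLDOM.Load ("Contact.xml")
    MsgBox MyXMLDOM.xml
End Sub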
First, the example instantiates a Microsoft.XMLDOM object using the CreateObject function. This creates the XMLDOM object and places a reference to this new object in the variable MyXMLDOM. You then use the Load method of the XMLDOM object to load an XML file from disk. Finally, you can display the XML by accessing the XML property of the XMLDOM document object.
Loading an XML Document with System.Xml in Visual Basic .NET
In Visual Basic .NET, you instantiate the XML document class provided with the .NET Framework. The object model you use here is the same for any .NET language, not only Visual Basic .NET. In this example, you instantiate the System.Xml.XmlDocument object.
This object provides the base functionality of loading and parsing the content of the XML document. After the object has been instantiated, you can load the XML document. Once the document is loaded, you can display it to the user.
Private Sub btnLoad_Click( _
        ByVal sender As System.Object, _
        ByVal e As System.EventArgs) Handles btnLoad.Click
    Dim MyXMLDocument As System.Xml.XmlDocument
    MyXMLDocument = New System.Xml.XmlDocument()
    MyXMLDocument.Load("Contact.xml")
    ' Show the markup first, then the text content without the tags
    MessageBox.Show(MyXMLDocument.InnerXml)
    MessageBox.Show(MyXMLDocument.InnerText)
End Sub
One of the syntactical changes in .NET is in the way you reference the content of the document. Rather than reference the XML property, you now reference the InnerXml property, as shown in Figure 2.
Figure 2. The InnerXml property returns the complete XML document
To display the data without the XML tags, use the InnerText property as shown in Figure 3.
Figure 3. The InnerText property returns the document content without the XML tags
The use of the InnerXml and InnerText properties resembles the DHTML object model in Internet Explorer. In DHTML, you access the HTML content of an element via the innerHTML property. The text content of an element is referenced through the innerText property.
One way you can simplify your code in .NET is by importing a namespace. By adding an Imports statement for the System.Xml namespace at the top of your code file, you can reference its classes, such as XmlDocument, directly. You are then not required to fully qualify each reference to the object. This makes your code easier to type as well as easier to read and maintain.
For more information on namespaces in .NET, refer to the MSDN Online documentation.
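As a minimal sketch of the namespace shortcut described above, the following console module imports System.Xml and then refers to XmlDocument by its short name. The module and function names are hypothetical, and the XML is parsed from an inline string with LoadXml (rather than loaded from Contact.xml) only so that the sketch runs on its own.

```vbnet
Imports System
Imports System.Xml

Module NamespaceSketch
    ' Hypothetical helper: parses a small inline document and returns its text content.
    Function ReadFirstName() As String
        ' XmlDocument resolves to System.Xml.XmlDocument because of the Imports statement.
        Dim doc As New XmlDocument()
        doc.LoadXml("<contact><name><first>John</first></name></contact>")
        Return doc.InnerText
    End Function

    Sub Main()
        Console.WriteLine(ReadFirstName())
    End Sub
End Module
```

Without the Imports statement, the declaration would have to be written as Dim doc As New System.Xml.XmlDocument(), as in the listings elsewhere in this paper.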
XPath
XPath provides a query language for XML documents, just as Structured Query Language (SQL) provides a query language for your relational database. The XPath vocabulary allows you to select specific nodes that meet specific criteria. When you open an XML document in .NET, you frequently execute an XPath query to retrieve a subset of the nodes contained in the document. Consider a document that contains a client contact, as shown in the next listing.
<contact>
    <name>
        <first>John</first>
        <last>Smith</last>
    </name>
    <address city="PleasantVille" state="CA" zip="92222">
        <street>1 Main Street</street>
        <street>Apt. #5</street>
    </address>
    <website>www.johnsmith.com</website>
    <email>john@smith.com</email>
    <email>John.Smith@smith.com</email>
</contact>
To retrieve the data within the address node, you can specify the following XPath query.
/contact/address
An XPath query is more intuitive than locating the element by its ordinal position. It also keeps your code flexible and manageable as the XML document structure evolves over the lifetime of the application. For example, you would not have to modify your code if a new element is added to the document ahead of the address.
The step in your code that employs XPath fits in after you load the XML document. You can then apply XPath to retrieve a subset of elements in the document. These elements can be processed by your application. The major steps are listed here.
- Instantiate the XML parser.
- Load a specific XML document.
- Define the XPath.
- Select the element that matches the XPath.
- Process the matching element.
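The steps above, and the claim that an XPath query keeps working when the document structure grows, can be sketched in Visual Basic .NET as follows. This is an illustration only: the module name, the FirstStreet helper, and the phone element are hypothetical, and the documents are inline strings rather than Contact.xml. SelectSingleNode is covered in detail in the sections that follow.

```vbnet
Imports System
Imports System.Xml

Module XPathStepsSketch
    ' Hypothetical helper that follows the steps above: instantiate the parser,
    ' load a document, define the XPath, select the match, and process it.
    Function FirstStreet(ByVal xml As String) As String
        Dim doc As New XmlDocument()
        doc.LoadXml(xml)
        Dim xpath As String = "/contact/address/street"
        Dim node As XmlNode = doc.SelectSingleNode(xpath)
        Return node.InnerText
    End Function

    Sub Main()
        Dim original As String = _
            "<contact><name><first>John</first></name>" & _
            "<address><street>1 Main Street</street></address></contact>"
        ' Same data, with a hypothetical <phone> element added ahead of the address.
        Dim extended As String = _
            "<contact><name><first>John</first></name><phone>555-0100</phone>" & _
            "<address><street>1 Main Street</street></address></contact>"

        ' Both calls select the same street, even though the address element
        ' sits at a different ordinal position in the second document.
        Console.WriteLine(FirstStreet(original))
        Console.WriteLine(FirstStreet(extended))
    End Sub
End Module
```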
XPath with XMLDOM in Visual Basic 6.0
After you load the XML document, you can apply XPath to retrieve a subset of elements contained within the document. This involves executing the SelectSingleNode method of the XMLDOM object. This method returns the single node that matches your criteria.
Private Sub btnSelect_Click()
    Set MyXMLDOM = CreateObject("Microsoft.XMLDOM")
    ' Load synchronously so the document is ready before it is queried
    MyXMLDOM.async = False
    MyXMLDOM.Load ("Contact.xml")
    MyXpath = "/contact/name/first"
    Set MyNode = MyXMLDOM.selectSingleNode(MyXpath)
    MsgBox MyNode.xml
End Sub
If you were to run this application, the MsgBox statement would display the XML fragment for the first name.
<first>John</first>
Notice that the content of the XML property of the Node object is the first name node alone. The parent nodes are not included. If this element contained children, these children would be contained within the Node object.
XPath with System.Xml in Visual Basic .NET
The process of applying XPath in Visual Basic .NET is consistent with your experience in Visual Basic 6.0. Once your XML document has been loaded, you can retrieve a specific child element with an XPath expression.
Private Sub btnSelectSingleNode_Click(ByVal sender As System.Object, _
        ByVal e As System.EventArgs) Handles btnSelectSingleNode.Click
    Dim MyXMLDocument As System.Xml.XmlDocument
    MyXMLDocument = New System.Xml.XmlDocument()
    MyXMLDocument.Load("Contact.xml")
    Dim MyNode As System.Xml.XmlNode
    MyNode = MyXMLDocument.SelectSingleNode("/contact/name/first")
    MessageBox.Show(MyNode.OuterXml)
    MessageBox.Show(MyNode.InnerXml)
End Sub
The difference between the XML property in XMLDOM and the InnerXml property in System.Xml becomes apparent when you select a single node with XPath. The content of InnerXml contains the element content without the element tag name.
John
If you are looking for both the content and the tag name, you will reference the OuterXml property.
<first>John</first>
If you have worked with DHTML in Internet Explorer, this syntax is going to be familiar. You will face the same issue when building a DHTML page where you will need to choose whether to reference the element alone or the element as well as its content.
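One detail the listings above do not show is that SelectSingleNode returns Nothing when the XPath matches no node, so reading InnerXml directly would raise an exception. The following sketch, with a hypothetical module and helper name and an inline document, guards against that case and prints InnerXml and OuterXml side by side for comparison.

```vbnet
Imports System
Imports System.Xml

Module SelectSingleNodeSketch
    ' Hypothetical helper: returns the InnerXml of the first node matching xpath,
    ' or an empty string when nothing matches.
    Function InnerXmlOrEmpty(ByVal doc As XmlDocument, ByVal xpath As String) As String
        Dim node As XmlNode = doc.SelectSingleNode(xpath)
        If node Is Nothing Then
            Return String.Empty
        End If
        Return node.InnerXml
    End Function

    Sub Main()
        Dim doc As New XmlDocument()
        doc.LoadXml("<contact><name><first>John</first><last>Smith</last></name></contact>")

        ' Content only, without the <first> tag.
        Console.WriteLine(InnerXmlOrEmpty(doc, "/contact/name/first"))

        ' Content plus the enclosing tag.
        Console.WriteLine(doc.SelectSingleNode("/contact/name/first").OuterXml)

        ' No <phone> element exists, so the helper returns an empty string.
        Console.WriteLine(InnerXmlOrEmpty(doc, "/contact/phone").Length)
    End Sub
End Module
```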
NodeList
In the preceding examples, you retrieved just one node using an XPath query. When the XPath may match multiple nodes, call the SelectNodes() method instead of the SelectSingleNode() method. The return value of the SelectNodes() method is a NodeList, which is an ordered, zero-based collection of nodes.
Processing a NodeList requires one further change from the previous code: the resulting NodeList must be traversed with a For loop. Here are the steps you will follow to process a NodeList:
- Instantiate the XML parser.
- Load a specific XML document.
- Define the XPath.
- Select the elements that match the XPath.
- Iterate through all matching nodes.
NodeList with XMLDOM in Visual Basic 6.0
Here is the code that you use to retrieve a NodeList using Visual Basic 6.0 and the XMLDOM object.
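Private Sub btnSelectNodes_Click()
    Set MyXMLDOM = CreateObject("Microsoft.XMLDOM")
    ' Load synchronously so the document is ready before it is queried
    MyXMLDOM.async = False
    MyXMLDOM.Load ("Contact.xml")
    MyXpath = "/contact/email"
    Set MyNodeList = MyXMLDOM.selectNodes(MyXpath)
    MsgBox MyNodeList.length
    ' The list is zero-based, so the last index is length - 1
    For x = 0 To MyNodeList.length - 1
        MsgBox MyNodeList.Item(x).xml
    Next
End Sub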
The Contact.xml XML document contains more than one email address. When you select these email address nodes, you receive a NodeList containing each address. The Length property of the NodeList indicates how many nodes were returned from the XPath statement. Use this value to determine the upper bounds of the FOR loop. As you iterate through each member of the NodeList, you process each email address node separately.
NodeList with System.Xml in Visual Basic .NET
Working with a NodeList in System.Xml is similar to XMLDOM. The method calls are the same. The object names are the same. But the property you use to determine the quantity of matching elements is called Count.
Private Sub btnSelectNodes_Click(ByVal sender As System.Object, _
        ByVal e As System.EventArgs) Handles btnSelectNodes.Click
    Dim MyXMLDocument As System.Xml.XmlDocument
    MyXMLDocument = New System.Xml.XmlDocument()
    MyXMLDocument.Load("Contact.xml")
    Dim MyXpath As String
    MyXpath = "/contact/email"
    Dim MyNodeList As System.Xml.XmlNodeList
    MyNodeList = MyXMLDocument.SelectNodes(MyXpath)
    ' Convert the count explicitly so the code also compiles with Option Strict On
    MessageBox.Show(MyNodeList.Count.ToString())
    Dim x As Integer
    For x = 0 To MyNodeList.Count - 1
        MessageBox.Show(MyNodeList.Item(x).InnerXml)
    Next
End Sub
The Count property tells you the number of matching nodes in the NodeList. Remember that the NodeList item index starts at position zero. This means that the highest valid index is the count minus one.
When you run this example, you will see a message containing the quantity of matching nodes. Next, the FOR loop iterates through each Node and displays the content. In this way, your applications process the matching Nodes of the XPath.
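As an alternative to the index-based loop, XmlNodeList can also be traversed with For Each, which avoids the count-minus-one arithmetic. The sketch below is an illustration under a few assumptions: the module and function names are hypothetical, the document is an inline string, and the generic List(Of String) requires a version of the .NET Framework later than the one this paper was written for.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Xml

Module NodeListSketch
    ' Hypothetical helper: collects the text of every email element.
    Function EmailAddresses(ByVal doc As XmlDocument) As List(Of String)
        Dim result As New List(Of String)()
        ' For Each visits the matching nodes in document order.
        For Each node As XmlNode In doc.SelectNodes("/contact/email")
            result.Add(node.InnerText)
        Next
        Return result
    End Function

    Sub Main()
        Dim doc As New XmlDocument()
        doc.LoadXml("<contact><email>john@smith.com</email>" & _
                    "<email>John.Smith@smith.com</email></contact>")

        For Each address As String In EmailAddresses(doc)
            Console.WriteLine(address)
        Next
    End Sub
End Module
```

When the XPath matches nothing, SelectNodes returns an empty list rather than Nothing, so the loop body simply does not run.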
Reading an Attribute
XML documents structure data into elements and attributes. In Figure 1, the city, state, and zip are included as attributes of the Address element. In this section you will learn how to read these attributes programmatically.
Reading an Attribute with XMLDOM in Visual Basic 6.0
The structure of your document may contain a mixture of elements and attributes. In the Contact.xml document, there are attributes within the address element, namely city, state, and zip. Up to this point, you have seen how to access elements and their content. Here is what happens when you need to access the content of an attribute of an element.
Private Sub btnReadAttribute_Click()
    Set MyXMLDOM = CreateObject("Microsoft.XMLDOM")
    MyXMLDOM.async = False
    MyXMLDOM.Load ("c:\DotNetJumpstart\Contact.xml")
    ' The @ prefix selects an attribute rather than an element
    MyXpath = "/contact/address/@city"
    Set MyNode = MyXMLDOM.selectSingleNode(MyXpath)
    MsgBox (MyNode.Text)
End Sub
The XPath vocabulary distinguishes attributes from elements by prefixing the node name with the @ character. To reference the city attribute of the address node, your XPath will look like this:
/contact/address/@city
In this way you can access the content of an attribute as easily as you access the text content of an element.
Reading an Attribute with System.Xml in Visual Basic .NET
You can reference attributes in .NET the same way you did in Visual Basic 6.0. Simply supply an XPath query that references the attribute of an element, which includes prefixing the attribute name with an @ character in the XPath.
Private Sub btnReadAttribute_Click(ByVal sender As System.Object, _
        ByVal e As System.EventArgs) Handles btnReadAttribute.Click
    Dim MyXMLDocument As System.Xml.XmlDocument
    MyXMLDocument = New System.Xml.XmlDocument()
    MyXMLDocument.Load("c:\DotNetJumpstart\Contact.xml")
    Dim MyXpath As String
    MyXpath = "/contact/address/@city"
    Dim MyNode As System.Xml.XmlNode
    MyNode = MyXMLDocument.SelectSingleNode(MyXpath)
    ' OuterXml shows the name and value; InnerXml shows the value alone
    MessageBox.Show(MyNode.OuterXml)
    MessageBox.Show(MyNode.InnerXml)
End Sub
To access an attribute from XPath, you prefix the attribute name with the @ sign to indicate that you are looking for an attribute name. When you select this node, you can view the InnerXml to see the value of the attribute.
If you access the OuterXml property, you receive the attribute name and attribute value together. Generally, you will not need this combination of information during processing. Most frequently, you will use the InnerXml property of a node that contains an attribute.
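Besides selecting the attribute node with XPath, System.Xml also lets you select the owning element and ask it for the attribute by name. The sketch below shows both routes side by side. The module and function names are hypothetical, the document is an inline string, and TryCast assumes a current Visual Basic compiler.

```vbnet
Imports System
Imports System.Xml

Module ReadAttributeSketch
    ' Route 1: select the attribute node itself and read its Value.
    Function CityViaXPath(ByVal doc As XmlDocument) As String
        Dim attributeNode As XmlNode = doc.SelectSingleNode("/contact/address/@city")
        If attributeNode Is Nothing Then
            Return String.Empty
        End If
        Return attributeNode.Value
    End Function

    ' Route 2: select the element and call GetAttribute, which returns
    ' an empty string when the attribute is not present.
    Function CityViaElement(ByVal doc As XmlDocument) As String
        Dim address As XmlElement = _
            TryCast(doc.SelectSingleNode("/contact/address"), XmlElement)
        If address Is Nothing Then
            Return String.Empty
        End If
        Return address.GetAttribute("city")
    End Function

    Sub Main()
        Dim doc As New XmlDocument()
        doc.LoadXml("<contact><address city=""PleasantVille"" state=""CA"" zip=""92222"" /></contact>")
        Console.WriteLine(CityViaXPath(doc))
        Console.WriteLine(CityViaElement(doc))
    End Sub
End Module
```

The element route is convenient when you need several attributes of the same element, because the element is selected only once.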
Updating an Element
XML documents are updateable and the .NET application is able to modify the content of an XML document. The new information may come from a user interface form or other device. In this section you will learn how to update the value of an element.
Updating an Element with XMLDOM in Visual Basic 6.0
Sometimes you need to update the data within an XML document. The code shown below demonstrates how to accomplish this in Visual Basic 6.0.
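Private Sub btnUpdateElement_Click()
    Set MyXMLDOM = CreateObject("Microsoft.XMLDOM")
    ' Load synchronously so the document is ready before it is displayed
    MyXMLDOM.async = False
    MyXMLDOM.Load ("Contact.xml")
    MsgBox MyXMLDOM.xml
    MyXpath = "/contact/name/first"
    Set MyNode = MyXMLDOM.selectSingleNode(MyXpath)
    MsgBox MyNode.xml
    ' Changing the node also changes the document it belongs to
    MyNode.Text = "Roberto"
    MsgBox MyNode.xml
    MsgBox MyXMLDOM.xml
End Sub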
To modify the content of a single element, you first need to select that element. Once you have a node object that points to this element, you can modify its content. The Text property can be both read and written.
Notice the content of each of the four message boxes. First, you see the XML document as it exists in the file. After selecting the single node that contains the first name, this single element is displayed to the user. After changing this node, the new value is displayed. To confirm that the node object is connected “live” to the original document, the XML document is displayed.
Updating an Element with System.Xml in Visual Basic .NET
The process of modifying an element with System.Xml is similar to XMLDOM, but the property names are different. Rather than referencing the Text property, you will reference the InnerText property.
Private Sub btnUpdateElement_Click(ByVal sender As System.Object, _
        ByVal e As System.EventArgs) Handles btnUpdateElement.Click
    Dim MyXMLDocument As System.Xml.XmlDocument
    MyXMLDocument = New System.Xml.XmlDocument()
    MyXMLDocument.Load("Contact.xml")
    MessageBox.Show(MyXMLDocument.InnerXml)
    Dim MyXpath As String
    MyXpath = "/contact/name/first"
    Dim MyNode As System.Xml.XmlNode
    MyNode = MyXMLDocument.SelectSingleNode(MyXpath)
    MessageBox.Show(MyNode.OuterXml)
    ' Changing the node also changes the document it belongs to
    MyNode.InnerText = "Roberto"
    MessageBox.Show(MyNode.OuterXml)
    MessageBox.Show(MyXMLDocument.InnerXml)
End Sub
In this way, you can modify the content of any element on the document. An application may receive an XML document as input. A process may modify one or more elements. The updated document may then be returned to the client or passed on to another server.
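The receive, modify, and return flow described above can be sketched as a single function that takes an XML string and hands back the updated XML string. This is an illustration under stated assumptions: the module name, the RenameFirst function, and the inline input document are all hypothetical.

```vbnet
Imports System
Imports System.Xml

Module UpdateElementSketch
    ' Hypothetical helper: parses the incoming XML, replaces the text of the
    ' first name element if it exists, and returns the updated markup.
    Function RenameFirst(ByVal xml As String, ByVal newName As String) As String
        Dim doc As New XmlDocument()
        doc.LoadXml(xml)
        Dim node As XmlNode = doc.SelectSingleNode("/contact/name/first")
        If Not node Is Nothing Then
            node.InnerText = newName
        End If
        ' OuterXml on the document returns the whole document as a string.
        Return doc.OuterXml
    End Function

    Sub Main()
        Dim source As String = "<contact><name><first>John</first></name></contact>"
        Console.WriteLine(RenameFirst(source, "Roberto"))
    End Sub
End Module
```

To write the updated document to a file instead of returning it as a string, XmlDocument also provides a Save method.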
Updating an Attribute
XML documents structure data in elements and attributes. You have already learned how to update element content. In this section, you will learn how to update the content of an attribute.
Updating an Attribute with XMLDOM in Visual Basic 6.0
Just as your application updates element content, attributes are updated as well. To update the value of an attribute, you must first select the attribute. From this node object, you can modify its value.
Private Sub btnUpdateAttribute_Click()
    Set MyXMLDOM = CreateObject("Microsoft.XMLDOM")
    MyXMLDOM.async = False
    MyXMLDOM.Load ("c:\DotNetJumpstart\Contact.xml")
    MsgBox (MyXMLDOM.xml)
    MyXpath = "/contact/address/@city"
    Set MyNode = MyXMLDOM.selectSingleNode(MyXpath)
    MsgBox (MyNode.Text)
    ' Changing the attribute node also changes the document it belongs to
    MyNode.Text = "Lake Forest"
    MsgBox (MyNode.Text)
    MsgBox (MyXMLDOM.xml)
End Sub
Notice that the attribute is changed on the node. Because the node is connected “live” to the document, you will see this update reflected in the parent document. This example displays four message box windows to monitor the progress of the update.
Updating an Attribute with System.Xml in Visual Basic .NET
In .NET you can update the content of an attribute in virtually the same way you did in Visual Basic 6.0. First select the attribute using XPath as demonstrated above. After selecting the node you can supply its content. The main difference between .NET and Visual Basic 6.0 is that you will reference the InnerText property in .NET rather than the Text property as in Visual Basic 6.0.
Private Sub btnUpdateAttribute_Click(ByVal sender As System.Object, _
        ByVal e As System.EventArgs) Handles btnUpdateAttribute.Click
    Dim MyXMLDocument As System.Xml.XmlDocument
    MyXMLDocument = New System.Xml.XmlDocument()
    MyXMLDocument.Load("Contact.xml")
    MessageBox.Show(MyXMLDocument.InnerXml)
    Dim MyXpath As String
    MyXpath = "/contact/address/@city"
    Dim MyNode As System.Xml.XmlNode
    MyNode = MyXMLDocument.SelectSingleNode(MyXpath)
    MessageBox.Show(MyNode.OuterXml)
    ' Changing the attribute node also changes the document it belongs to
    MyNode.InnerText = "Lake Forest"
    MessageBox.Show(MyNode.OuterXml)
    MessageBox.Show(MyXMLDocument.InnerXml)
End Sub
Notice that like the element, the attribute is managed as a node. Once you locate the attribute with XPath, you can display it or modify its value. To confirm the correct behavior of this example, you will see four message box windows displaying the progress each step of the way.
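An attribute can also be updated through its owning element rather than through the attribute node. The sketch below uses the SetAttribute method of XmlElement, which changes the attribute if it exists and creates it otherwise. The module name, the ChangeCity function, and the inline document are hypothetical, and TryCast assumes a current Visual Basic compiler.

```vbnet
Imports System
Imports System.Xml

Module UpdateAttributeSketch
    ' Hypothetical helper: sets the city attribute on the address element
    ' and returns the updated markup.
    Function ChangeCity(ByVal xml As String, ByVal newCity As String) As String
        Dim doc As New XmlDocument()
        doc.LoadXml(xml)
        Dim address As XmlElement = _
            TryCast(doc.SelectSingleNode("/contact/address"), XmlElement)
        If Not address Is Nothing Then
            address.SetAttribute("city", newCity)
        End If
        Return doc.OuterXml
    End Function

    Sub Main()
        Dim source As String = _
            "<contact><address city=""PleasantVille"" state=""CA"" /></contact>"
        Console.WriteLine(ChangeCity(source, "Lake Forest"))
    End Sub
End Module
```

Selecting the attribute node with XPath, as in the listing above, only works when the attribute already exists. SetAttribute covers the case where it may be missing.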
What’s Different in Visual Basic .NET From Visual Basic 6.0?
The XML concepts in .NET are very similar to the concepts in Visual Basic 6.0 and the System.Xml object embraces the standards set forth by the W3C specifications. Additionally, Microsoft has extended this base functionality with subtle refinements over the Visual Basic 6.0 implementation.
Summary
Your Visual Basic 6.0 and .NET applications both process XML documents. These documents may be processed on a client or processed on a server. In either case, you need to know how to manage XML documents.
The .NET Framework provides native XML parsing through the System.Xml assembly. You can load, select, and modify the content of an XML document. This latest version of the XML parser includes refinements that improve performance, and it extends functionality through some properties that are outside those defined in the DOM specification (that is, ones Microsoft added).
Understanding these subtle differences will help you transition to the new .NET platform. The gains in performance and flexibility are worth the transition effort and you will enjoy working in this powerful platform.
About the Authors
Paul D. Sheriff is the owner of PDSA, Inc. (www.pdsa.com), a custom software development and consulting company in Southern California. Paul is the MSDN Regional Director for Southern California, is the author of a book on Visual Basic 6 called Paul Sheriff Teaches Visual Basic, and has produced over 72 videos on Visual Basic, SQL Server, .NET and Web Development for Keystone Learning Systems. Paul has co-authored a book entitled ASP.NET Jumpstart. Visit the PDSA, Inc. Web site for more information.
Robert MacHale owns MacHale Information Systems and has over nine years of experience developing online systems. He has taught courses on using the Internet, online services, and communications software. Robert designs Internet Web sites, database systems, and computer networks. He has consulted for the computer, financial, and non-profit industries, including developing diagnostic software for a PC modem manufacturer.
About Informant Communications Group
Informant Communications Group, Inc. (www.informant.com) is a diversified media company focused on the information technology sector. Specializing in software development publications, conferences, catalog publishing and Web sites, ICG was founded in 1990. With offices in the United States and the United Kingdom, ICG has served as a respected media and marketing content integrator, satisfying the burgeoning appetite of IT professionals for quality technical information.
Copyright © 2002 Informant Communications Group and Microsoft Corporation
Technical editing: PDSA, Inc., or KNG Consulting