Deborah Kurata is the cofounder of InStep Technologies Inc., a consulting company specializing in the design and development of Web and Windows applications (www.insteptech.com). She's the author of Doing Objects in Visual Basic 6 (Sams), which focuses on a pragmatic approach to object-oriented design and development. Reach her by e-mail at deborahk@insteptech.com.

This article may contain URLs that were valid when originally published, but now link to sites or pages that no longer exist. To maintain the flow of the article, we've left these URLs in the text, but disabled the links.

May 2001

Programming With Class

Simplify XML Document Access

Develop a standard class to wrap the DOM and make it easier to access your XML documents.

by Deborah Kurata

Reprinted with permission from Visual Basic Programmer's Journal, May 2001, Volume 11, Issue 5, Copyright 2001, Fawcette Technical Publications, Palo Alto, CA, USA. To subscribe, call 1-800-848-5523, 650-833-7100, visit www.vbpj.com, or visit The Development Exchange.

An Extensible Markup Language (XML) document is one of the best vehicles for passing data between the components of an application or between applications. This is true for any type of application, including Windows and Web-based applications, and for any component language, including Visual Basic, C++, C#, and Java. In this column, you'll design and build a VB component that uses Microsoft's XML parser (msxml3.dll) to wrap the XML Document Object Model (DOM).

XML, the tag format for defining data structures, is great as a data-transfer mechanism because it's platform-independent and you can tailor it to your application's data-structure needs. For example, you can use this XML document to pass your favorite teams' standings from your application's data-access component to a user-interface component:

<Teams>
   <Team TeamName="Packers"
      TeamGames="12" 
      TeamWins="11"/>
   <Team TeamName="Forty-Niners" 
      TeamGames="12"
      TeamWins="10"/>
   <Team TeamName="Raiders"
      TeamGames="12" 
      TeamWins="6"/>
</Teams>

Because an XML document is simply a string, you can write code to parse the string and read the data from the XML document. However, that limits your application's flexibility and potentially ties your code to a particular XML document structure. A better approach is to use an XML parser. As its name implies, an XML parser parses the XML document and provides access to data within that document through an application programming interface (API).

The World Wide Web Consortium (W3C) defines standards for XML, XML-parser APIs, and related technologies. One of the XML parser API standards is the XML DOM. The DOM defines the set of standard properties and methods required in an XML parser. If everyone plays by the rules and follows the standards, any vendor's parser can parse an XML file the same way. The XML DOM is object-oriented and accessible from a programming language such as VB. The DOM uses a tree-manipulation technique to read the contents of an XML document and to provide access to any portion of it by accessing nodes in that tree structure.

To try this out, create the preceding sample XML document and name the document Team.xml (you can download the sample code). Then, start a new VB project and set a reference to Microsoft XML version 3.0. On the default form, add a listbox named lstNames. This code first parses the XML file into the DOM tree structure using the Load method, then displays the team names from the XML document in the listbox:

Private xmlDoc As DOMDocument
Private Sub Form_Load()
Dim xmlChildNode As IXMLDOMNode
   Set xmlDoc = New DOMDocument
   ' Load synchronously so the tree is
   ' complete before the loop reads it
   xmlDoc.async = False
   xmlDoc.Load (App.Path & "\Team.xml")
   ' Add each team's name to the listbox
   For Each xmlChildNode In _
      xmlDoc.documentElement.childNodes
      lstNames.AddItem _
         xmlChildNode.Attributes. _
         getNamedItem("TeamName").Text
   Next
End Sub

The DOM defines each element of the XML as a node in the tree structure. You can access all the elements by looping through all the nodes in the tree structure, as this code example shows. You can then retrieve the value of an element's attribute by using the Attributes collection.
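
The same Attributes collection can return several values for each element. As a minimal sketch, assuming the Team.xml sample and the lstNames listbox described above, a routine like this one lists each team's record; the ShowStandings name is hypothetical and not part of the sample code:

```vb
' Sketch only: ShowStandings is an illustrative name.
' Assumes Team.xml sits in the application folder and the
' form has a listbox named lstNames.
Private Sub ShowStandings()
Dim xmlDoc As DOMDocument
Dim xmlNode As IXMLDOMNode
Dim sLine As String
   Set xmlDoc = New DOMDocument
   xmlDoc.async = False   ' wait for the load to finish
   If Not xmlDoc.Load(App.Path & "\Team.xml") Then
      ' Load returns False when the parser reports a problem
      MsgBox xmlDoc.parseError.reason
      Exit Sub
   End If
   For Each xmlNode In xmlDoc.documentElement.childNodes
      ' Skip comments and other non-element nodes
      If xmlNode.nodeType = NODE_ELEMENT Then
         With xmlNode.Attributes
            sLine = .getNamedItem("TeamName").Text & ": " & _
               .getNamedItem("TeamWins").Text & " wins in " & _
               .getNamedItem("TeamGames").Text & " games"
         End With
         lstNames.AddItem sLine
      End If
   Next
End Sub
```

The nodeType test is a precaution: a comment placed between Team elements would otherwise reach the loop with no Attributes collection to read.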

Design the Wrapper Class
One of the best things about the DOM is its flexibility: The DOM offers many ways to access portions of the XML document. One of the worst things about the DOM is its flexibility: There are too many ways to access portions of an XML document!

When you're building a particular application, you'll find you use the same DOM-access techniques again and again. Instead of having to write all the code to access the DOM every time, you can build a class module that wraps the DOM and provides an application-unique way of referencing the DOM to access your XML documents.

Building this wrapper class for the DOM has other benefits as well. You can ensure that all your team's members access the DOM in a standardized and consistent fashion. If all the team members use your standard DOM wrapper class, they don't even need to know they're using the DOM. In fact, they don't even need to know about XML, yet they can still access data from the XML document using your wrapper class. The wrapper class also allows you to change how you access the XML without changing the rest of your code that uses this component. For example, when Visual Studio.NET comes out, you could replace the code within this component to use the new XmlReader and XmlWriter features instead of the DOM, without impacting the rest of your code.

Begin building a wrapper class by first defining which of the DOM features you need to have in your application. Ask yourself whether the application needs to add nodes to or remove nodes from the tree structure. Also, decide whether the application needs to retrieve attribute information for all nodes, or just one or more nodes, of the tree structure. Finally, evaluate whether you could use Extensible Stylesheet Language (XSL) instead of a wrapper. XSL might be an easier choice if you simply need to format, filter, or sort the XML.

After you answer these questions, you'll have a better idea of the parts of the DOM that you need to include in your wrapper class. For the purposes of this column, assume your application needs to retrieve only attribute information. However, the application might want the attribute for only one element or for a set of elements.

You need to decide about naming conventions next. You should select the style of names you'll use for your wrapper-class properties and methods. Because you're working with data, and like many developers you're probably used to working with records and fields, you can use a data-style naming convention. For example, you can call the method to retrieve a particular attribute from an XML document GetFieldByName. Alternatively, you can standardize on the XML-style naming convention and name your method GetAttributeByName. Or you can select the DOM-style naming convention. In that case, you can use the GetNamedItem method name just as the DOM does.

One of the main reasons for using a DOM wrapper class is to make development easier for the development team. With that goal in mind, this example uses the data-style naming convention. That way, the other developers don't need to know the XML- or DOM-style names.

Build the Wrapper Class
Begin building a wrapper class by adding a class module (CDOMWrapper) to the project you started earlier. Remove the code that accesses the DOM from the form.

You must decide how to load the DOM within your wrapper. You have three choices. First, you can define a Load method in your wrapper. This allows the developer using the wrapper to select when to load the XML file into the DOM. However, this requires that all developers know to call Load before using any other DOM wrapper property or method.

Second, you can load the XML file onto the DOM within each wrapper-class property or method. This performs the load automatically when the developer using the wrapper calls the property or method. The downside to this approach is the inefficiency of reloading the DOM every time the developer uses a property or method.

Third, you can make the wrapper-class properties and methods smart enough to load the DOM only if it isn't loaded already. This example uses that approach. You then build the wrappers you need for your particular application. This example wrapper begins with a private method to load the DOM:

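Private m_xmlDoc As DOMDocument
Private Function Load() As String
   Set m_xmlDoc = New DOMDocument
   ' Load synchronously so the document
   ' is ready when this function returns
   m_xmlDoc.async = False
   m_xmlDoc.Load _
      (App.Path & "\team.xml")
   ' Return the parse error text
   ' (empty when the load succeeds)
   Load = m_xmlDoc.parseError.reason
End Function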

The other wrapper functions call this method if the m_xmlDoc object variable is not yet set.
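
The Load function returns the parser's error text, but a caller still has to inspect it. One possible way to do so, sketched here as an assumption and not as part of the sample code, is a small helper that the public methods call first. The EnsureLoaded name and the error number are hypothetical:

```vb
' Hypothetical helper for CDOMWrapper: load on first use and
' raise a trappable error if the parser reports a problem.
Private Sub EnsureLoaded()
Dim sReason As String
   If m_xmlDoc Is Nothing Then
      sReason = Load()
      If Len(sReason) > 0 Then
         ' Discard the failed document so a later call retries
         Set m_xmlDoc = Nothing
         ' vbObjectError + 1000 is an arbitrary, illustrative number
         Err.Raise vbObjectError + 1000, "CDOMWrapper", sReason
      End If
   End If
End Sub
```

With a helper like this, a missing or malformed file surfaces as one descriptive error at the point of the call, not as an "Object variable not set" error further down.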

Next, create a routine to retrieve attributes by name. The developer using the wrapper must, of course, pass the desired attribute name to this routine. The routine then finds the first instance of that attribute and returns the associated value:

Public Function _
   GetFieldByName(sFieldName As _
   String) As String
Dim sPattern As String
Dim oNode As IXMLDOMNode
   ' Load the document on first use
   If m_xmlDoc Is Nothing Then
      Load
   End If
   ' Match the first element that
   ' has the requested attribute
   sPattern = _
      "//*[@" & sFieldName & "]"
   Set oNode = _
      m_xmlDoc.documentElement. _
      selectSingleNode(sPattern)
   ' Return "" when nothing matches
   If Not oNode Is Nothing Then
      GetFieldByName = oNode.Attributes. _
         getNamedItem(sFieldName). _
         nodeValue
   End If
End Function

The pattern string in this routine provides a glimpse into the DOM's power. You can use XPath, the language for addressing parts of an XML document, to define many types of patterns (see Table 1). This XML document is the sample used within Table 1:

<Teams>
   <Team Name="Packers"
      TeamGames="12" 
      TeamWins="11">
      <Player Name="Smith"/>
      <Player Name="Jones"/>
      <Player Name="Thompson"/>
   </Team>
   <Team Name="49ers" 
      TeamGames="12"
      TeamWins="10"/>
   <Team Name="Raiders"
      TeamGames="12" 
      TeamWins="6"/>
</Teams>
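
To experiment with patterns against this document, a short routine such as the following may help. It is only a sketch: the TryPatterns name and the TeamPlayers.xml file name are assumptions, and the patterns are a few common XPath forms, not a reproduction of Table 1. Note that MSXML 3.0 uses the older XSLPattern syntax unless you set the SelectionLanguage property.

```vb
' Sketch only: assumes the document above is saved as
' TeamPlayers.xml (hypothetical name) in the application folder.
Private Sub TryPatterns()
Dim xmlDoc As DOMDocument
Dim oNodes As IXMLDOMNodeList
Dim oNode As IXMLDOMNode
   Set xmlDoc = New DOMDocument
   xmlDoc.async = False
   If Not xmlDoc.Load(App.Path & "\TeamPlayers.xml") Then Exit Sub
   ' Switch selectNodes/selectSingleNode to full XPath
   xmlDoc.setProperty "SelectionLanguage", "XPath"

   ' All Team elements directly under the root
   Set oNodes = xmlDoc.selectNodes("/Teams/Team")
   Debug.Print "Teams: " & oNodes.length

   ' The one team whose Name attribute is "Raiders"
   Set oNode = xmlDoc.selectSingleNode( _
      "/Teams/Team[@Name='Raiders']")
   If Not oNode Is Nothing Then
      Debug.Print oNode.Attributes.getNamedItem("TeamWins").Text
   End If

   ' Every Player element, at any depth
   For Each oNode In xmlDoc.selectNodes("//Player")
      Debug.Print oNode.Attributes.getNamedItem("Name").Text
   Next

   ' Teams with more than nine wins (numeric comparison)
   Set oNodes = xmlDoc.selectNodes("/Teams/Team[@TeamWins > 9]")
   Debug.Print "Teams over nine wins: " & oNodes.length
End Sub
```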

The code in the sample form creates the instance of the wrapper class, then calls its methods:

Private m_XML As CDOMWrapper
Private Sub Form_Load()
   Set m_XML = New CDOMWrapper 
   lstNames.AddItem _
      m_XML.GetFieldByName("TeamName")
End Sub

Notice how much easier this code is to create than the code you added to the form previously. This can aid all your team members' productivity significantly.
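
The design called for retrieving an attribute from a set of elements as well as from a single one. A companion method could cover that case. The following is a sketch under assumptions: GetFieldsByName is a hypothetical name, and returning a Collection is one choice among several (an array or a delimited string would also work).

```vb
' --- In CDOMWrapper (hypothetical method) ---
' Returns the named attribute's value from every element
' that carries it, in document order.
Public Function GetFieldsByName(sFieldName As String) _
   As Collection
Dim oNode As IXMLDOMNode
Dim colValues As Collection
   Set colValues = New Collection
   If m_xmlDoc Is Nothing Then
      Load
   End If
   For Each oNode In _
      m_xmlDoc.selectNodes("//*[@" & sFieldName & "]")
      colValues.Add _
         oNode.Attributes.getNamedItem(sFieldName).nodeValue
   Next
   Set GetFieldsByName = colValues
End Function

' --- In the form ---
Private Sub Form_Load()
Dim vName As Variant
   Set m_XML = New CDOMWrapper
   ' Fill the listbox with every team name
   For Each vName In m_XML.GetFieldsByName("TeamName")
      lstNames.AddItem vName
   Next
End Sub
```

The form code still never touches a DOM object, so the wrapper keeps its main benefit.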

This simple example gives you a glimpse into the efficiency you can achieve by developing a DOM wrapper class in your applications for accessing XML documents. Your next step is to add other wrapper methods, such as AddRecord to add another node to the tree, or DeleteRecord to remove a node from the tree. What you add to your DOM wrapper class depends on the functionality of the DOM that your applications need.
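
As a starting point, those two methods might look something like the following. This is a minimal sketch, not a finished design: the parameter lists are assumptions, and both methods assume the Team.xml structure shown at the start of the column.

```vb
' Hypothetical CDOMWrapper methods. The names follow the
' data-style convention; the parameters are assumptions.
Public Sub AddRecord(sName As String, _
   lGames As Long, lWins As Long)
Dim oElem As IXMLDOMElement
   If m_xmlDoc Is Nothing Then
      Load
   End If
   ' Build a new Team element and attach it to the root
   Set oElem = m_xmlDoc.createElement("Team")
   oElem.setAttribute "TeamName", sName
   oElem.setAttribute "TeamGames", lGames
   oElem.setAttribute "TeamWins", lWins
   m_xmlDoc.documentElement.appendChild oElem
End Sub

Public Sub DeleteRecord(sName As String)
Dim oNode As IXMLDOMNode
   If m_xmlDoc Is Nothing Then
      Load
   End If
   ' Find the first Team with a matching name. A name that
   ' contains an apostrophe would break this simple pattern.
   Set oNode = m_xmlDoc.documentElement.selectSingleNode( _
      "Team[@TeamName='" & sName & "']")
   If Not oNode Is Nothing Then
      oNode.parentNode.removeChild oNode
   End If
End Sub
```

Both methods change only the in-memory tree. To persist the changes, the wrapper would also need to call the DOMDocument's save method with a file path.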