Reading XML File With JScript
I am Titus, and I work as an SDET on the JScript team. Some time back I came across a requirement to take an XML file and get a Tree Listing back. The Tree Listing should contain every node in the file with its proper parent/child relationships, as well as a good way to differentiate between nodes with and without values. Let’s call nodes with a value properties. I achieved this using JScript. In this blog you will learn how to read and parse an XML file using Microsoft’s XML DOM and use the result to create the Tree Listing.
Let’s take a sample XML file, say test.xml (it can be a URL or a file on your system), to get a clear picture of the kind of Tree Listing required. Later we will look at the actual code.
The XML file can be viewed as:
Root Node, name is BookList, has 2 child nodes
    childnode0, name is Book and has two properties
        Prop0: Author has a value Paul
        Prop1: Price has a value 10.3
    childnode1, name is Book and has three properties
        Prop0: Author has a value Joe
        Prop1: Price has a value 20.95
        Prop2: Title has a value Web 2.0
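The contents of test.xml are not reproduced in this post. A file matching the description above could look like the sketch below, held here in a JScript string. The variable name sampleXml, the XML declaration, and the indentation are assumptions; only the element names and values come from the description.

```javascript
// Hypothetical contents of test.xml, reconstructed from the description above.
// The XML declaration and whitespace are assumptions, not the original file.
var sampleXml =
    '<?xml version="1.0"?>\n' +
    '<BookList>\n' +
    '  <Book>\n' +
    '    <Author>Paul</Author>\n' +
    '    <Price>10.3</Price>\n' +
    '  </Book>\n' +
    '  <Book>\n' +
    '    <Author>Joe</Author>\n' +
    '    <Price>20.95</Price>\n' +
    '    <Title>Web 2.0</Title>\n' +
    '  </Book>\n' +
    '</BookList>\n';
```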
The required Tree Listing after parsing test.xml is:
nName | NodeName
nValue | NodeValue
cNodes | List of Child Nodes
cProps | List of Child Properties
The ReadXMLFile function in the code listing below returns the Tree Listing as required.
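To make the shape concrete, here is a hand-written sketch of what the listing for the sample file would look like as plain objects. It is not output from the parser: the helpers prop and node are hypothetical shorthand used only for this illustration, and the values are strings because they come from XML text.

```javascript
// Hand-built illustration of the Tree Listing for the sample file.
// prop() and node() are hypothetical helpers, not part of the real code.
function prop(name, value) {
    // a node that carries a value, i.e. a property
    return { nName: name, nValue: value, cNodes: null, cProps: null };
}
function node(name, props, children) {
    // a node without a value; it may own properties and child nodes
    return { nName: name, nValue: null, cNodes: children, cProps: props };
}

var treeListing = [
    node("BookList", null, [
        node("Book", [prop("Author", "Paul"), prop("Price", "10.3")], null),
        node("Book", [prop("Author", "Joe"), prop("Price", "20.95"),
                      prop("Title", "Web 2.0")], null)
    ])
];
```

The outer array holds the root node; each Book sits in the root's cNodes, and each Author, Price, or Title sits in its Book's cProps.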
Many times you already know the contents of the XML file and are interested only in the list for a specific node. Calling ReadXMLFile with the node name as the second argument gives just such a list.
Referring to test.xml, a call to ReadXMLFile("test.xml", "Author") gives a list of the two Author properties (with values Paul and Joe).
A call to ReadXMLFile("test.xml", "Book"), on the other hand, returns a list of the two Book nodes, each carrying its own cProps.
If you look carefully at the Tree Listing, cNodes and cProps are both Arrays, so by using the proper index value one can reach the desired node.
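As a rough sketch of index-based access, the snippet below uses a hand-built stand-in for the list of Book nodes. Because the position of a property can differ from one Book to the next, it also shows a lookup by name. The names makeNode, bookList, and getPropValue are hypothetical and are not part of the code listing.

```javascript
// Hypothetical helper used only to build the stand-in data below.
function makeNode(name, value, props) {
    return { nName: name, nValue: value, cNodes: null, cProps: props };
}

// Stand-in for a list of the two Book nodes (shape inferred, not parser output).
var bookList = [
    makeNode("Book", null, [makeNode("Author", "Paul", null),
                            makeNode("Price", "10.3", null)]),
    makeNode("Book", null, [makeNode("Author", "Joe", null),
                            makeNode("Price", "20.95", null),
                            makeNode("Title", "Web 2.0", null)])
];

// Direct access by index: second book, third property.
var secondBookTitle = bookList[1].cProps[2].nValue;

// Hypothetical helper: look a property up by name instead of by position.
// Returns null when the node has no such property.
function getPropValue(node, propName) {
    var i;
    if (node.cProps == null) return null;
    for (i = 0; i < node.cProps.length; i++) {
        if (node.cProps[i].nName == propName) return node.cProps[i].nValue;
    }
    return null;
}

var firstBookTitle = getPropValue(bookList[0], "Title");   // first book has no Title
var secondBookPrice = getPropValue(bookList[1], "Price");
```

The name lookup is a convenience only; plain indexing works whenever the layout of the file is known in advance.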
Here goes the actual code:
var NODE_ELEMENT = 1;
var NODE_ATTRIBUTE = 2;
var NODE_TEXT = 3;
/**** INTERNALLY USED FUNCTIONS ****/
/*
* Builds up xmlNode list on parentXMLNode
* by iterating over each node in childNodesLst
*/
function getXMLNodeList_1(childNodesLst, parentXMLNode)
{
    var i;
    var curNode;
    var arrLen;
    //traverse nodelist to get nodevalues and all child nodes
    for (i = 0; i < childNodesLst.length; i++) {
        //we will ignore all other node types like
        //NODE_ATTRIBUTE, NODE_CDATA_SECTION, …
        if (childNodesLst[i].nodeType == NODE_ELEMENT
            || childNodesLst[i].nodeType == NODE_TEXT) {
            if (childNodesLst[i].nodeType == NODE_TEXT) {
                //we got the value of the parent node, populate
                //parent node and return back
                parentXMLNode.nValue = childNodesLst[i].nodeValue;
                return;
            }
            //we have a new NODE_ELEMENT node
            curNode = new XMLNode(childNodesLst[i].nodeName, childNodesLst[i].nodeValue);
            //hasChildNodes is a method, so it must be called
            if (childNodesLst[i].hasChildNodes()) {
                getXMLNodeList_1(childNodesLst[i].childNodes, curNode);
                if (curNode.nValue != null) {
                    //we need to add this as a property to the parent node
                    if (parentXMLNode.cProps == null) {
                        parentXMLNode.cProps = new Array();
                        parentXMLNode.hasCProps = true;
                    }
                    arrLen = parentXMLNode.cProps.length;
                    parentXMLNode.cProps[arrLen] = curNode;
                } else {
                    //we need to add this as child node to the parent node
                    if (parentXMLNode.cNodes == null) {
                        parentXMLNode.cNodes = new Array();
                        parentXMLNode.hasCNodes = true;
                    }
                    arrLen = parentXMLNode.cNodes.length;
                    parentXMLNode.cNodes[arrLen] = curNode;
                }
            } else {
                //an element with no children has no value and no
                //child nodes, so it is skipped;
                //clear curNode so it can be garbage collected
                curNode = null;
            }
        }
    }
    return;
}
/*
* Generates appropriate XMLNodeList from nodes
* in childNodes
*/
function getXMLNodeList(childNodes)
{
    var xmlNode = new XMLNode(null, null);
    getXMLNodeList_1(childNodes, xmlNode);
    var xmlNodeList = null;
    if (xmlNode.hasCNodes) {
        xmlNodeList = xmlNode.cNodes;
    } else if (xmlNode.hasCProps) {
        xmlNodeList = xmlNode.cProps;
    }
    return xmlNodeList;
}
/**** INTERNALLY USED FUNCTIONS ****/
/* XMLNode DataStruct */
function XMLNode(ndName, ndVal)
{
    this.nName = ndName;    //XMLNode name
    this.nValue = ndVal;    //the value (if any) associated with XMLNode
                            //As of now only property nodes have associated values
    this.hasCNodes = false; //Bool to mark presence of Child Nodes
    this.cNodes = null;     //List of child nodes (of type XMLNode)
    this.hasCProps = false; //Bool to mark presence of Property Nodes
    this.cProps = null;     //List of property nodes (of type XMLNode)
}
/* Exposed Functions */
function ReadXMLFile(fileName, tagName)
{
    if (arguments.length < 1 || arguments.length > 2)
        return null;
    var xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
    //load the file synchronously
    xmlDoc.async = false;
    try {
        //load() reports failure by returning false, so check
        //its result as well as catching exceptions
        if (!xmlDoc.load(fileName))
            return null;
    } catch(e) {
        //failed to load xml file
        return null;
    }
    //let's get the child nodes
    var childNodes = null;
    if (arguments.length == 2) {
        try {
            childNodes = xmlDoc.getElementsByTagName(tagName);
        } catch(e) {
            return null;
        }
    } else {
        childNodes = xmlDoc.childNodes;
    }
    return (getXMLNodeList(childNodes));
}
var xmlNodes;
xmlNodes = ReadXMLFile("https://www.noweb.com/test.xml");
//For a file on your system
//xmlNodes = ReadXMLFile("C:\\My Documents\\test.xml");
//root node name is
var RootNodeName = xmlNodes[0].nName;
xmlNodes = ReadXMLFile("https://www.noweb.com/test.xml", "Book");
var cntBooks = xmlNodes.length;
xmlNodes = ReadXMLFile("https://www.noweb.com/test.xml", "Author");
var authorName = xmlNodes[0].nValue;
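To inspect a whole listing at once, a small recursive walk can be handy. The sketch below is an illustration, not part of the original listing: dumpTree is a hypothetical helper, and the sample array is hand-built data shaped like a list of Book nodes so that the snippet runs without the XML DOM. It also treats null as an empty list, since ReadXMLFile returns null when loading fails.

```javascript
// Hypothetical helper: render a list of XMLNode-like objects as indented text.
function dumpTree(nodes, indent) {
    var out = "";
    var i;
    var n;
    if (nodes == null) return out;   // null list (e.g. failed load): nothing to show
    for (i = 0; i < nodes.length; i++) {
        n = nodes[i];
        out += indent + n.nName + (n.nValue != null ? " = " + n.nValue : "") + "\n";
        out += dumpTree(n.cProps, indent + "  ");   // properties first
        out += dumpTree(n.cNodes, indent + "  ");   // then child nodes
    }
    return out;
}

// Hand-built stand-in data; not produced by ReadXMLFile.
var sample = [
    { nName: "Book", nValue: null, cNodes: null,
      cProps: [ { nName: "Author", nValue: "Paul", cNodes: null, cProps: null },
                { nName: "Price", nValue: "10.3", cNodes: null, cProps: null } ] }
];
var text = dumpTree(sample, "");
```

Under cscript the resulting string could be shown with WScript.Echo(text), and in practice the first argument would be the list returned by ReadXMLFile.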
Hope you enjoyed the blog!
Thanks,
Titus
Comments
Anonymous
April 01, 2008
Instead of using JavaScript "constants" for the node types, why not implement it properly so that the element returns the proper integer code: http://developer.mozilla.org/en/docs/DOM:element.nodeType
Anonymous
April 02, 2008
How to make it work in firefox?
Anonymous
April 02, 2008
here's an even better idea: http://en.wikipedia.org/wiki/E4X also already part of FF and ECMA4/ AS3. a lot less convoluted and a lot more elegant looking than the solution above.
Anonymous
April 02, 2008
Is there ever a difference between MSXML.DOMDocument and Microsoft.XMLDOM? I checked the registry on my machine and they're both going to CLSID {2933BF90-7B36-11D2-B20E-00C04F983E60}. That goes to "%SystemRoot%\system32\msxml3.dll", and my understanding is that that's version 3 of the library. Why not use "MSXML2.DOMDocument.6.0", which maps to CLSID {88d96a05-f192-11d4-a65f-0040963251e5} (which uses c:\WINDOWS\system32\msxml6.dll) instead? Unless I'm insane or mis-remembering, Version 6 performs some operations a lot faster than 3 (like selectNodes()). Yeah, if a 7 comes out the code will need adjustment, but from what I've seen I don't mind making a minor adjustment to a constant somewhere. See http://blogs.msdn.com/xmlteam/archive/2006/10/23/using-the-right-version-of-msxml-in-internet-explorer.aspx
Anonymous
April 02, 2008
Yawn. This is a great exercise for a CS 101 class, but converting an XML document to its JavaScript-native equivalent representation has been done a thousand times already and is a foundational skill for a web developer (I sometimes use it as an interview question). Google "XML to JSON" and you get the idea. Now, some native browser support for E4X would be nice.
Anonymous
April 02, 2008
What?! 1.) Fix the Node Constants in JScript: [bug 256] http://webbugtrack.blogspot.com/2007/10/bug-256-dom-nodetype-constants-are-not.html 2.) Why on earth are you using ActiveX for this? What part of Web Standards slipped by you? Use XMLHTTPRequest (the "almost" native) one added in IE7, with a fallback to ActiveX only if the user is on a really old version of IE. 3.) Does the term JSON ring a bell? It has been around for ages, and does what you are trying to do a 100 times better.
Anonymous
April 03, 2008
@Gerome: The reason for using ActiveX was that we wanted it to work even from non-browser hosts, especially cscript. @TMO: Thanks for pointing out the extra memory consumption. The intended audience for the blog is mainly novice JScript programmers who want to learn how to parse an XML file using JScript, so we overlooked memory and performance optimizations. Thanks all for your invaluable comments.
Anonymous
June 15, 2008
but but, this will only work in IE, I don't think any developers who do cross-browser app's will actually use this
Anonymous
November 19, 2008
What a nice example telling us clearly why is JSON so much better than XML. Just store your books in an js file like this : var BookList = [ { Author: "Paul", Price : 10.30 } , { Author: "Joe", Price : 20.95, Title: "Web 2.0" } ] ; Why XML ?
Anonymous
November 25, 2009
More vendor lock in with code that ties corporations to IE, I really wish no one used proprietary browser code in this day and age... If you want to provide a serious method for XML then E4X is still waiting to be put into IE. Please oh please by IE9.
Anonymous
April 25, 2010
but but, this will only work in IE, I don't think any developers who do cross-browser app's will actually use this