Best Practices for Securing MSXML Code
This topic suggests best practices for making your MSXML applications more robust and for reducing the threat posed by malicious intruders.
Set the resolveExternals property to False when you create new DOM documents.
When you create a new DOMDocument object, the default value of the resolveExternals property is True for MSXML 3.0 and False for MSXML 6.0. When the property is True, files that contain external definitions are included and resolved as part of the XML document stream at parse time. For example, the following types of external files and resolvable definitions might be resolved and incorporated into your parsed document:
Text files that contain resolvable modularized DTD instructions, such as entity or namespace references
Cascading XSD schema or XSLT style sheet files that contain additional rules or templates
Unless you need or expect this behavior, you should set this property explicitly to False.
Note
Setting the resolveExternals property to False does not prevent your document from being validated upon parsing. Validation is determined by the value of the validateOnParse property.
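As a minimal sketch of this recommendation, the following Visual Basic subroutine sets resolveExternals explicitly before it loads a document. It assumes a project reference to Microsoft XML, v6.0; the subroutine name and the file name books.xml are illustrative only. The validateOnParse property is set on its own line to show that the two settings are independent.

```vb
Sub LoadWithoutExternals()
    ' Assumes a project reference to Microsoft XML, v6.0.
    Dim xmlDoc As New MSXML2.DOMDocument60

    ' Load synchronously so the result is known when Load returns.
    xmlDoc.async = False

    ' Do not resolve external DTDs, entities, or schemas at parse time.
    ' Set explicitly rather than relying on the version-specific default.
    xmlDoc.resolveExternals = False

    ' Validation is controlled separately; set it to True if you need it.
    xmlDoc.validateOnParse = False

    If xmlDoc.Load("books.xml") Then
        MsgBox "Loaded root element: " & xmlDoc.documentElement.nodeName
    Else
        MsgBox "Load failed: " & xmlDoc.parseError.reason
    End If
End Sub
```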
Be careful when handling file input and output.
MSXML provides two DOM methods for working with file input and output:
To read a file from memory or disk as input, you call the load method on a DOMDocument object.
To write a file to disk as output, you call the save method on a DOMDocument object.
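The two methods can be sketched together as follows. This is an illustration under stated assumptions, not a complete file handling design: the subroutine name and both file names are hypothetical, and a project reference to Microsoft XML, v6.0 is assumed. The load method reports failure through its return value and the parseError property, while a failed save raises a run-time error, so the sketch handles each case differently.

```vb
Sub CopyXmlFile()
    ' Assumes a project reference to Microsoft XML, v6.0.
    Dim xmlDoc As New MSXML2.DOMDocument60
    xmlDoc.async = False
    xmlDoc.resolveExternals = False

    ' Load returns False if the file is missing or is not well-formed.
    If Not xmlDoc.Load("books.xml") Then
        MsgBox "Load failed: " & xmlDoc.parseError.reason
        Exit Sub
    End If

    ' Save raises a run-time error on failure (for example, access denied).
    On Error GoTo SaveFailed
    xmlDoc.save "books-copy.xml"
    Exit Sub

SaveFailed:
    MsgBox "Save failed: " & Err.Description
End Sub
```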
Before you write code that deals with file input and output, you should be familiar with the details of how to design file handling code for the APIs you plan to use with these methods. In particular, you should understand the possibilities for loading or working with IStream objects if you reference and use them in your design. Because IStream objects can be marshaled to other processes, the data you store with them could potentially be cloned or shared to other applications, with unintended consequences.
For more information about working with the IStream interface, see "IStream – Compound File Implementation" in the Platform SDK.
Remember that XSLT is code.
XSL Transformations (XSLT) might appear to be a style sheet language, but it is actually a programming language. Therefore, many programs that are typically written in script or in languages such as Visual Basic or C/C++ could potentially be designed and written in XSLT.
To prevent problems, you should test your XSLT files as thoroughly as you would any other script or code module against corrupt or accidental input, such as unanticipated XML document types. Debug as necessary, and design and implement good error handling in your XSLT files.
In particular, safeguard your template designs against the possibility of an infinite recursion loop, in which two templates are written that match and point to each other. The XSLT processor in MSXML does not have a timeout, so when loops occur the application must be manually terminated to stop execution.
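On the calling side, one way to test a style sheet against unexpected input is to trap run-time errors raised by transformNode. The following is a sketch only: TryTransform is a hypothetical helper name, and a project reference to Microsoft XML, v6.0 is assumed. It does not protect against infinite recursion, which still has to be prevented in the template design itself.

```vb
' Hypothetical helper: returns True and places the output in result if the
' transformation succeeds; otherwise returns False and places a message
' describing the failure in result.
Function TryTransform(ByVal xmlDoc As MSXML2.DOMDocument60, _
                      ByVal xslDoc As MSXML2.DOMDocument60, _
                      ByRef result As String) As Boolean
    On Error GoTo TransformFailed
    result = xmlDoc.transformNode(xslDoc)
    TryTransform = True
    Exit Function

TransformFailed:
    result = "Transformation failed: " & Err.Description
    TryTransform = False
End Function
```

A test harness could call this helper once for each sample input document, including deliberately malformed or unexpected document types, and log every case in which it returns False.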
Be aware of inherited security contexts from Internet Explorer and other host applications.
MSXML inherits its first level of security from Internet Explorer, or from another immediate host application running under Windows. If that security is not set or in effect, MSXML imposes security based on the source context of the URL provided to locate a file.
For example, the following are three different contexts for loading a sample XML file, books.xml. The first is a local file system, the second is an intranet site, and the third is an Internet site.
C:\temp\books.xml
http://MyWorkgroupServer/books.xml
http://www.example.com/books.xml
For the first URL, MSXML assumes complete trust of the local file system. Access and control of the file are determined solely by the currently configured Windows file security settings, or by the system defaults.
For the second URL, the file is browseable (read-only), because the source is a local Web server on the same local intranet.
For the third URL, the source is an external Web server located using a DNS domain name on the Internet. In this case, MSXML blocks cross-domain interaction. For example, if example.com was the DNS domain requested in the URL, you would not be able to interact with another domain, such as microsoft.com.
For more information about the Internet Explorer security model, see the following topics in Internet Explorer Help:
"Protecting your computer from unsafe software"
"Understanding security and privacy features"
Check the length of character input and validate against a permitted range of characters.
Many attacks on applications have occurred when string input goes unchecked or a buffer used to store it is overrun. For example, a common case is an intentional attack by a malicious user or application that attempts to overrun a text input control on an application form with a large amount of character data. In the worst case, Windows returns an access violation and the application stops responding.
Note that if more than 32 kilobytes of character or string input is passed to the loadXML method, the MSXML parser fails but does not report the error. Even with this built-in safeguard, you might want to implement additional input checking in your own form validation code.
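A length and character check of this kind might look like the following sketch. The function name IsSafeInput, the permitted character set, and the caller-supplied maximum length are all assumptions chosen for illustration; adjust them to the input your application actually expects.

```vb
' Hypothetical validator: returns True only if inputText is non-empty, is no
' longer than maxLength, and contains only characters from the permitted set.
Function IsSafeInput(ByVal inputText As String, _
                     ByVal maxLength As Long) As Boolean
    ' Assumed permitted range: letters, digits, space, hyphen, underscore, period.
    Const ALLOWED As String = _
        "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789 -_."
    Dim i As Long

    ' The function result defaults to False, so Exit Function rejects the input.
    If Len(inputText) = 0 Or Len(inputText) > maxLength Then Exit Function

    For i = 1 To Len(inputText)
        If InStr(1, ALLOWED, Mid$(inputText, i, 1), vbBinaryCompare) = 0 Then
            Exit Function
        End If
    Next i

    IsSafeInput = True
End Function
```

Because the assumed character set excludes markup characters such as the left angle bracket and the ampersand, input that passes this check can be placed inside an element passed to loadXML without changing the structure of the document.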
Implement parse error handling in your code.
Many simple applications that can be written using MSXML assume that DOM documents load successfully. For example, consider the following Visual Basic code. This code loads two documents, an XML file and an XSLT style sheet, and then performs a transformation using both files.
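Sub LoadButDoNotCheck()
    Dim xmlDoc As New Msxml2.DOMDocument30
    Dim xslDoc As New Msxml2.DOMDocument30
    ' Load synchronously so that each document is parsed before it is used.
    xmlDoc.async = False
    xslDoc.async = False
    ' Neither load is checked for success before the transformation.
    xmlDoc.load "books.xml"
    xslDoc.load "stylesheet.xsl"
    MsgBox xmlDoc.transformNode(xslDoc)
End Sub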
In many cases this code might run without problems. However, it makes two assumptions that might not always be correct:
Both the sample XML file (books.xml) and XSLT style sheet (stylesheet.xsl) are assumed to be available at the same path as the executing VBScript (.vbs) file or compiled Visual Basic application (.exe) file that contains this subroutine.
Both the XML and XSLT documents are assumed to load successfully as well-formed XML before the call to the transformNode method. This method call requires both documents.
If either of these assumptions is false, the subsequent lines of code fail, but in some instances they are unnecessarily executed anyway. You can rewrite this subroutine as follows, so that it handles errors as they occur:
Sub LoadButCheckAndReportParseErrors()
    Dim xmlDoc As New MSXML2.DOMDocument30
    Dim xslDoc As New MSXML2.DOMDocument30
    ' Load synchronously so that parseError is populated when Load returns.
    xmlDoc.async = False
    xslDoc.async = False
    xmlDoc.Load "books.xml"
    If xmlDoc.parseError.errorCode = 0 Then
        ' Load the style sheet only after the XML document has loaded.
        xslDoc.Load "stylesheet.xsl"
        If xslDoc.parseError.errorCode = 0 Then
            MsgBox xmlDoc.transformNode(xslDoc)
        Else
            MsgBox "Stylesheet.xsl did not load. " & _
                xslDoc.parseError.reason
        End If
    Else
        MsgBox "Books.xml did not load. " & _
            xmlDoc.parseError.reason
    End If
End Sub
Whenever possible, you should include this kind of parse error handling in code that loads and works with DOMDocument objects. Robust code takes longer to write, but it is easier and more efficient to maintain.
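To avoid repeating the reporting logic at every load, the details can be collected in one place. The following sketch assumes a project reference to Microsoft XML (version 3.0 or 6.0); DescribeParseError is a hypothetical helper name. It uses the errorCode, line, linepos, url, and reason properties of the parse error object to build a single message.

```vb
' Hypothetical helper: returns an empty string if no parse error occurred;
' otherwise returns a message that includes the location of the error.
Function DescribeParseError(ByVal parseErr As MSXML2.IXMLDOMParseError) As String
    If parseErr.errorCode = 0 Then
        DescribeParseError = ""
    Else
        DescribeParseError = "Error " & parseErr.errorCode & _
            " at line " & parseErr.Line & _
            ", position " & parseErr.linepos & _
            " in " & parseErr.url & ": " & parseErr.reason
    End If
End Function
```

A caller might pass xmlDoc.parseError to this function after each load and display or log the result only when it is not empty.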