XML Task
The XML task is used to work with XML data. Using this task, a package can retrieve XML documents; apply operations to them by using Extensible Stylesheet Language Transformations (XSLT) style sheets and XPath expressions; merge multiple documents; and validate, compare, and save the updated documents to files and variables.
This task enables an Integration Services package to dynamically modify XML documents at run time. You can use the XML task for the following purposes:
Reformat an XML document. For example, the task can access a report that resides in an XML file and dynamically apply an XSLT style sheet to customize the document presentation.
Select sections of an XML document. For example, the task can access a report that resides in an XML file and dynamically apply an XPath expression to select a section of the document. The operation can also get and process values in the document.
Merge documents from many sources. For example, the task can download reports from multiple sources and dynamically merge them into one comprehensive XML document.
You can include XML data in a data flow by using an XML source to extract values from an XML document. For more information, see XML Source.
XML Operations
The first action the XML task performs is to retrieve a specific XML document. This action is built into the XML task and occurs automatically. The retrieved XML document is used as the source of data for the operation that the XML task performs.
The XML operations Diff, Merge, and Patch require two operands. The first operand specifies the source XML document. The second operand also specifies an XML document, the contents of which depend on the requirements of the operation. For example, the Diff operation compares two documents; therefore, the second operand specifies another, similar XML document to which the source XML document is compared.
The XML task can use a variable or a File connection manager as its source, or the XML data can be entered directly in a task property.
If the source is a variable, the specified variable contains the path of the XML document.
If the source is a File connection manager, the specified File connection manager provides the source information. The File connection manager is configured separately from the XML task, and is referenced in the XML task. The connection string of the File connection manager specifies the path of the XML file. For more information, see File Connection Manager.
The XML task can be configured to save the result of the operation to a variable or to a file. If saving to a file, the XML task uses a File connection manager to access the file. You can also save the Diffgram that the Diff operation generates to a file or a variable.
Predefined XML Operations
The XML task includes a predefined set of operations for working with XML documents. The following table describes these operations.
| Operation | Description |
|---|---|
| Diff | Compares two XML documents. Using the source XML document as the base document, the Diff operation compares it to a second XML document, detects their differences, and writes the differences to an XML Diffgram document. This operation includes properties for customizing the comparison. |
| Merge | Merges two XML documents. Using the source XML document as the base document, the Merge operation adds the content of a second document into the base document. The operation can specify a merge location within the base document. |
| Patch | Applies the output from the Diff operation, called a Diffgram document, to an XML document, to create a new parent document that includes content from the Diffgram document. |
| Validate | Validates the XML document against a Document Type Definition (DTD) or XML Schema definition (XSD) schema. |
| XPath | Performs XPath queries and evaluations. |
| XSLT | Performs XSL transformations on XML documents. |
Diff Operation
The Diff operation can be configured to use a different comparison algorithm depending on whether the comparison must be fast or precise. The operation can also be configured to automatically select a fast or precise comparison based on the size of the documents being compared.
The Diff operation includes a set of options that customize the XML comparison. The following table describes the options.
| Option | Description |
|---|---|
| IgnoreComments | A value that specifies whether comment nodes are compared. |
| IgnoreNamespaces | A value that specifies whether the namespace uniform resource identifier (URI) of an element and its attribute names are compared. If this option is set to true, two elements that have the same local name but a different namespace are considered to be identical. |
| IgnorePrefixes | A value that specifies whether prefixes of element and attribute names are compared. If this option is set to true, two elements that have the same local name and namespace URI but a different prefix are considered identical. |
| IgnoreXMLDeclaration | A value that specifies whether the XML declarations are compared. |
| IgnoreOrderOfChildElements | A value that specifies whether the order of child elements is compared. If this option is set to true, child elements that differ only in their position in a list of siblings are considered to be identical. |
| IgnoreWhiteSpaces | A value that specifies whether white spaces are compared. |
| IgnoreProcessingInstructions | A value that specifies whether processing instructions are compared. |
| IgnoreDTD | A value that specifies whether the DTD is ignored. |
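To make the effect of these options concrete, the following Python sketch shows how ignoring white space or child order can change whether two documents count as identical. It is not the algorithm the XML task uses and it does not produce a Diffgram; the `canonical` helper and its parameters are hypothetical names chosen for illustration, and it relies only on the standard `xml.etree.ElementTree` module (which discards comments and processing instructions when parsing).

```python
import xml.etree.ElementTree as ET


def canonical(elem, ignore_whitespace=True, ignore_child_order=False):
    """Reduce an element to a nested tuple that can be compared with ==.

    Hypothetical helper: it only mimics the idea behind IgnoreWhiteSpaces
    and IgnoreOrderOfChildElements; it is not the task's comparison engine.
    """
    text = elem.text or ""
    tail = elem.tail or ""
    if ignore_whitespace:
        # Drop leading and trailing white space around text content.
        text = text.strip()
        tail = tail.strip()
    children = [
        canonical(child, ignore_whitespace, ignore_child_order)
        for child in elem
    ]
    if ignore_child_order:
        # Sorting makes sibling position irrelevant to the comparison.
        children.sort()
    attributes = tuple(sorted(elem.attrib.items()))
    return (elem.tag, attributes, text, tail, tuple(children))


first = ET.fromstring("<r><x>1</x><y>2</y></r>")
second = ET.fromstring("<r>\n  <y>2</y>\n  <x>1</x>\n</r>")

# Child order is compared: the documents differ.
same_strict_order = canonical(first) == canonical(second)

# Child order is ignored: the documents are treated as identical.
same_any_order = (
    canonical(first, ignore_child_order=True)
    == canonical(second, ignore_child_order=True)
)

# White space is compared: the indentation in `second` makes them differ.
same_with_whitespace = (
    canonical(first, ignore_whitespace=False, ignore_child_order=True)
    == canonical(second, ignore_whitespace=False, ignore_child_order=True)
)
```

In this sketch only `same_any_order` is true, because the two documents differ solely in sibling order and indentation.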
Merge Operation
When you use an XPath statement to identify the merge location in the source document, this statement is expected to return a single node. If the statement returns multiple nodes, only the first node is used. The contents of the second document are merged under the first node that the XPath query returns.
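The first-node rule can be illustrated with a short Python sketch. It uses the standard `xml.etree.ElementTree` module, which supports only a subset of XPath, and it assumes that "merging" means appending the second document's root element as a child of the target node; the function name `merge_under_first_match` and the sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET


def merge_under_first_match(base_xml, second_xml, xpath):
    """Append the second document under the first node that xpath selects.

    Hypothetical helper that mirrors the rule described above: if the
    expression matches several nodes, only the first one is used.
    """
    base_root = ET.fromstring(base_xml)
    second_root = ET.fromstring(second_xml)
    matches = base_root.findall(xpath)
    if not matches:
        raise ValueError("XPath expression selected no nodes: " + xpath)
    # Assumption: the whole second document goes under the first match.
    matches[0].append(second_root)
    return ET.tostring(base_root, encoding="unicode")


base = "<reports><region name='east'/><region name='west'/></reports>"
second = "<report id='7'>ok</report>"

# "./region" matches two nodes; only the first (east) receives the content.
merged = merge_under_first_match(base, second, "./region")
```

After the call, the `east` region contains the `report` element and the `west` region is unchanged.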
XPath Operation
The XPath operation can be configured to use different types of XPath functionality.
Select the Evaluation option to evaluate XPath functions such as sum().
Select the Node list option to return the selected nodes as an XML fragment.
Select the Values option to return the inner text value of all the selected nodes, concatenated into a string.
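The difference between the three result types can be sketched in Python. The standard `xml.etree.ElementTree` module does not implement XPath functions such as sum(), so the Evaluation case is emulated here with Python's built-in `sum`; the sample document and variable names are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<orders>"
    "<order><amount>10.5</amount></order>"
    "<order><amount>4.5</amount></order>"
    "</orders>"
)

# The same node selection feeds all three result types.
nodes = doc.findall("./order/amount")

# Evaluation: a single computed value, comparable to sum(/orders/order/amount).
evaluation = sum(float(node.text) for node in nodes)

# Node list: the selected nodes serialized as an XML fragment.
node_list = "".join(ET.tostring(node, encoding="unicode") for node in nodes)

# Values: the inner text of every selected node, concatenated into a string.
values = "".join("".join(node.itertext()) for node in nodes)
```

Here `evaluation` is the number 15.0, `node_list` is the fragment `<amount>10.5</amount><amount>4.5</amount>`, and `values` is the string `10.54.5`.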
Validation Operation
The Validation operation can be configured to use either a Document Type Definition (DTD) or XML Schema definition (XSD) schema.
XML Document Encoding
The XML task supports merging of Unicode documents only. This means the task can apply the Merge operation only to documents that have a Unicode encoding. Use of other encodings will cause the XML task to fail.
Note
The Diff and Patch operations include an option to ignore the XML declaration in the second-operand XML data, making it possible to use documents that have other encodings in these operations.
To verify that the XML document can be used, review the XML declaration. The declaration must explicitly specify UTF-8, which indicates 8-bit Unicode encoding.
The following XML declaration specifies UTF-8 encoding.
<?xml version="1.0" encoding="UTF-8"?>
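A document could be checked for this declaration before it reaches the task. The following Python sketch is one possible pre-check, not part of Integration Services; the function name `declares_utf8` is hypothetical, and the check assumes the file bytes are ASCII-compatible (it will not recognize a UTF-16 encoded declaration).

```python
import codecs
import re

# Matches an XML declaration at the start of the data and captures the
# value of its encoding attribute.
_XML_DECL = re.compile(rb"<\?xml\s[^>]*?encoding\s*=\s*[\"']([^\"']+)[\"']")


def declares_utf8(xml_bytes):
    """Return True only if the XML declaration explicitly names UTF-8."""
    # Skip a UTF-8 byte order mark if one precedes the declaration.
    if xml_bytes.startswith(codecs.BOM_UTF8):
        xml_bytes = xml_bytes[len(codecs.BOM_UTF8):]
    match = _XML_DECL.match(xml_bytes)
    return match is not None and match.group(1).lower() == b"utf-8"


explicit_utf8 = declares_utf8(b'<?xml version="1.0" encoding="UTF-8"?><a/>')
other_encoding = declares_utf8(b'<?xml version="1.0" encoding="ISO-8859-1"?><a/>')
no_encoding = declares_utf8(b'<?xml version="1.0"?><a/>')
```

A declaration without an encoding attribute is reported as not matching, in line with the requirement that UTF-8 be specified explicitly.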
Custom Logging Messages Available on the XML Task
The following table describes the custom log entry for the XML task. For more information, see Integration Services (SSIS) Logging and Custom Messages for Logging.
| Log entry | Description |
|---|---|
| XMLOperation | Provides information about the operation that the task performs. |
Configuration of the XML Task
You can set properties through SSIS Designer or programmatically.
Programmatic Configuration of the XML Task
For more information about programmatically setting these properties, click the following topic:
Related Tasks
Set the Properties of a Task or Container
Related Content
Blog entry, XML Destination Script Component, on agilebi.com
CodePlex sample, Process XML Data Package Sample, on www.codeplex.com