XML Task
Applies to: SQL Server, SSIS Integration Runtime in Azure Data Factory
The XML task is used to work with XML data. Using this task, a package can retrieve XML documents, apply operations to the documents by using Extensible Stylesheet Language Transformations (XSLT) style sheets and XPath expressions, merge multiple documents, or validate, compare, and save the updated documents to files and variables.
This task enables an Integration Services package to dynamically modify XML documents at run time. You can use the XML task for the following purposes:
Reformat an XML document. For example, the task can access a report that resides in an XML file and dynamically apply an XSLT style sheet to customize the document presentation.
Select sections of an XML document. For example, the task can access a report that resides in an XML file and dynamically apply an XPath expression to select a section of the document. The operation can also get and process values in the document.
Merge documents from many sources. For example, the task can download reports from multiple sources and dynamically merge them into one comprehensive XML document.
Validate an XML document and optionally get detailed error output. For more information, see Validate XML with the XML Task.
You can include XML data in a data flow by using an XML source to extract values from an XML document. For more information, see XML Source.
XML Operations
The first action the XML task performs is to retrieve a specific XML document. This action is built into the XML task and occurs automatically. The retrieved XML document is used as the source of data for the operation that the XML task performs.
The XML operations Diff, Merge, and Patch require two operands. The first operand specifies the source XML document. The second operand also specifies an XML document, the contents of which depend on the requirements of the operation. For example, the Diff operation compares two documents; therefore, the second operand specifies another, similar XML document to which the source XML document is compared.
The XML task can use a variable or a File connection manager as its source, or include the XML data in a task property.
If the source is a variable, the specified variable contains the XML document.
If the source is a File connection manager, the specified File connection manager provides the source information. The File connection manager is configured separately from the XML task, and is referenced in the XML task. The connection string of the File connection manager specifies the path of the XML file. For more information, see File Connection Manager.
The XML task can be configured to save the result of the operation to a variable or to a file. If saving to a file, the XML task uses a File connection manager to access the file. You can also save the Diffgram that the Diff operation generates to a file or a variable.
Predefined XML Operations
The XML task includes a predefined set of operations for working with XML documents. The following table describes these operations.
Operation | Description |
---|---|
Diff | Compares two XML documents. Using the source XML document as the base document, the Diff operation compares it to a second XML document, detects their differences, and writes the differences to an XML Diffgram document. This operation includes properties for customizing the comparison. |
Merge | Merges two XML documents. Using the source XML document as the base document, the Merge operation adds the content of a second document into the base document. The operation can specify a merge location within the base document. |
Patch | Applies the output from the Diff operation, called a Diffgram document, to an XML document, to create a new parent document that includes content from the Diffgram document. |
Validate | Validates the XML document against a Document Type Definition (DTD) or XML Schema definition (XSD) schema. |
XPath | Performs XPath queries and evaluations. |
XSLT | Performs XSL transformations on XML documents. |
Diff Operation
The Diff operation can be configured to use a different comparison algorithm depending on whether the comparison must be fast or precise. The operation can also be configured to automatically select a fast or precise comparison based on the size of the documents being compared.
The Diff operation includes a set of options that customize the XML comparison. The following table describes the options.
Option | Description |
---|---|
IgnoreComments | A value that specifies whether comment nodes are compared. |
IgnoreNamespaces | A value that specifies whether the namespace uniform resource identifier (URI) of an element and its attribute names are compared. If this option is set to true, two elements that have the same local name but a different namespace are considered to be identical. |
IgnorePrefixes | A value that specifies whether prefixes of element and attribute names are compared. If this option is set to true, two elements that have the same local name but a different namespace URI and prefix are considered identical. |
IgnoreXMLDeclaration | A value that specifies whether the XML declarations are compared. |
IgnoreOrderOfChildElements | A value that specifies whether the order of child elements is compared. If this option is set to true, child elements that differ only in their position in a list of siblings are considered to be identical. |
IgnoreWhiteSpaces | A value that specifies whether white spaces are compared. |
IgnoreProcessingInstructions | A value that specifies whether processing instructions are compared. |
IgnoreDTD | A value that specifies whether the DTD is ignored. |
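As an illustration of how two of these options change the outcome of a comparison, the following Python sketch normalizes two documents before comparing them. It is not the algorithm that the XML task uses. The names normalize and same_document and the flags ignore_whitespace and ignore_child_order are hypothetical stand-ins for IgnoreWhiteSpaces and IgnoreOrderOfChildElements, and the whitespace handling (stripping leading and trailing white space from text) is a simplifying assumption.

```python
import xml.etree.ElementTree as ET


def normalize(elem, ignore_whitespace=False, ignore_child_order=False):
    """Return a nested tuple that describes elem, adjusted by the two flags."""
    text = elem.text or ""
    tail = elem.tail or ""
    if ignore_whitespace:
        # Assumption: treat leading and trailing white space as insignificant.
        text, tail = text.strip(), tail.strip()
    children = [normalize(c, ignore_whitespace, ignore_child_order) for c in elem]
    if ignore_child_order:
        # Sorting makes siblings that differ only in position compare equal.
        children.sort()
    return (elem.tag, tuple(sorted(elem.attrib.items())), text, tail, tuple(children))


def same_document(xml_a, xml_b, **flags):
    """Compare two XML strings after normalizing both with the same flags."""
    return normalize(ET.fromstring(xml_a), **flags) == normalize(ET.fromstring(xml_b), **flags)


base = "<report><a>1</a><b>2</b></report>"
reordered = "<report><b>2</b><a>1</a></report>"
indented = "<report>\n  <a>1</a>\n  <b>2</b>\n</report>"

print(same_document(base, reordered))                           # order is compared
print(same_document(base, reordered, ignore_child_order=True))  # order is ignored
print(same_document(base, indented, ignore_whitespace=True))    # indentation is ignored
```

In this sketch, the reordered document matches the base document only when child order is ignored, and the indented document matches only when white space is ignored.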
Merge Operation
When you use an XPath statement to identify the merge location in the source document, this statement is expected to return a single node. If the statement returns multiple nodes, only the first node is used. The contents of the second document are merged under the first node that the XPath query returns.
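The first-node rule can be sketched in Python with the standard library. This is an illustration of the behavior described above, not the XML task's implementation. The function name merge_under_first_match and the sample documents are hypothetical, and the path is written relative to the root element because ElementTree supports only a subset of XPath.

```python
import xml.etree.ElementTree as ET


def merge_under_first_match(source_xml, second_xml, xpath):
    """Append the second document under the first node that xpath selects."""
    root = ET.fromstring(source_xml)
    target = root.find(xpath)  # find() returns only the first match, or None
    if target is None:
        raise ValueError(f"XPath {xpath!r} selected no node")
    target.append(ET.fromstring(second_xml))
    return ET.tostring(root, encoding="unicode")


source = "<reports><region name='east'/><region name='west'/></reports>"
second = "<sales total='10'/>"

# "region" matches two nodes; only the first one receives the merged content.
merged = merge_under_first_match(source, second, "region")
print(merged)
```

Here the sales element is added under the east region only, because that is the first node the path returns. The west region is left unchanged.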
XPath Operation
The XPath operation can be configured to use different types of XPath functionality.
Select the Evaluation option to implement XPath functions such as sum().
Select the Node list option to return the selected nodes as an XML fragment.
Select the Values option to return the inner text value of all the selected nodes, concatenated into a string.
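The difference between the three result types can be illustrated with a short Python sketch. It only mimics the shape of each result and is not how the XML task evaluates XPath. The sample document is hypothetical, and because ElementTree does not support XPath functions, the sum is computed in Python as a stand-in for sum().

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<orders><order><qty>2</qty></order><order><qty>3</qty></order></orders>"
)
selected = doc.findall("./order/qty")

# Evaluation: a single computed value (stand-in for the XPath sum() function).
evaluation = sum(float(node.text) for node in selected)

# Node list: the selected nodes serialized as an XML fragment.
node_list = "".join(ET.tostring(node, encoding="unicode") for node in selected)

# Values: the inner text of every selected node, concatenated into one string.
values = "".join("".join(node.itertext()) for node in selected)

print(evaluation)  # 5.0
print(node_list)   # <qty>2</qty><qty>3</qty>
print(values)      # 23
```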
Validation Operation
The Validation operation can be configured to use either a Document Type Definition (DTD) or XML Schema definition (XSD) schema.
Enable ValidationDetails to get detailed error output. For more information, see Validate XML with the XML Task.
XML Document Encoding
The XML task supports merging of Unicode documents only. This means the task can apply the Merge operation only to documents that have a Unicode encoding. Use of other encodings will cause the XML task to fail.
Note
The Diff and Patch operations include an option to ignore the XML declaration in the second-operand XML data, making it possible to use documents that have other encodings in these operations.
To verify that the XML document can be used, review the XML declaration. The declaration must explicitly specify UTF-8, which indicates 8-bit Unicode encoding.
The following XML declaration specifies the 8-bit Unicode encoding.
<?xml version="1.0" encoding="UTF-8"?>
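One way to review the declaration before a package runs is to inspect the first bytes of the document. The following Python sketch is a hedged example of such a check, not a feature of the XML task. The function names declared_encoding and declares_utf8 are hypothetical, and the regular expression is a simplification that assumes the declaration, if present, is at the very start of the data (optionally after a UTF-8 byte order mark).

```python
import re

# Optional UTF-8 byte order mark, then an XML declaration with an encoding attribute.
_DECLARATION = re.compile(
    rb'^(?:\xef\xbb\xbf)?<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def declared_encoding(data: bytes):
    """Return the encoding named in the XML declaration, or None if absent."""
    match = _DECLARATION.match(data)
    return match.group(1).decode("ascii") if match else None


def declares_utf8(data: bytes) -> bool:
    """Return True only when the declaration explicitly names UTF-8."""
    encoding = declared_encoding(data)
    return encoding is not None and encoding.upper() == "UTF-8"


print(declares_utf8(b'<?xml version="1.0" encoding="UTF-8"?><r/>'))
print(declares_utf8(b'<?xml version="1.0" encoding="ISO-8859-1"?><r/>'))
```

A document with no declaration returns False in this sketch, which matches the requirement that UTF-8 be specified explicitly.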
Custom Logging Messages Available on the XML Task
The following table describes the custom log entry for the XML task. For more information, see Integration Services (SSIS) Logging.
Log entry | Description |
---|---|
XMLOperation | Provides information about the operation that the task performs |
Configuration of the XML Task
You can set properties through SSIS Designer or programmatically.
For more information about the properties that you can set in SSIS Designer, see XML Task Editor (General Page).
For more information about how to set properties in SSIS Designer, see Set the Properties of a Task or Container.
Programmatic Configuration of the XML Task
For more information about programmatically setting these properties, see the XMLTask class in the Microsoft.SqlServer.Dts.Tasks.XMLTask namespace.
Related Tasks
Set the Properties of a Task or Container
XML Task Editor (General Page)
Use the General Node of the XML Task Editor dialog box to specify the operation type and configure the operation.
To learn about this task, see Validate XML with the XML Task. For information about working with XML documents and data, see "Employing XML in the .NET Framework" in the MSDN Library.
Static Options
OperationType
Select the operation type from the list. This property has the options listed in the following table.
Value | Description |
---|---|
Validate | Validates the XML document against a Document Type Definition (DTD) or XML Schema definition (XSD) schema. Selecting this option displays the dynamic options described in the OperationType = Validate section. |
XSLT | Performs XSL transformations on XML documents. Selecting this option displays the dynamic options described in the OperationType = XSLT section. |
XPATH | Performs XPath queries and evaluations. Selecting this option displays the dynamic options described in the OperationType = XPATH section. |
Merge | Merges two XML documents. Selecting this option displays the dynamic options described in the OperationType = Merge section. |
Diff | Compares two XML documents. Selecting this option displays the dynamic options described in the OperationType = Diff section. |
Patch | Applies the output from the Diff operation to create a new document. Selecting this option displays the dynamic options described in the OperationType = Patch section. |
SourceType
Select the source type of the XML document. This property has the options listed in the following table.
Value | Description |
---|---|
Direct input | Set the source to an XML document. |
File connection | Select a file that contains the XML document. |
Variable | Set the source to a variable that contains the XML document. |
Source
If SourceType is set to Direct input, provide the XML code or click the ellipsis button (...) and then provide the XML by using the Document Source Editor dialog box.
If SourceType is set to File connection, select a File connection manager, or click <New connection...> to create a new connection manager.
Related Topics: File Connection Manager, File Connection Manager Editor
If SourceType is set to Variable, select an existing variable, or click <New variable...> to create a new variable.
Related Topics: Integration Services (SSIS) Variables, Add Variable.
OperationType Dynamic Options
OperationType = Validate
Specify options for the Validate operation.
SaveOperationResult
Specify whether the XML task saves the output of the Validate operation.
OverwriteDestination
Specify whether to overwrite the destination file or variable.
Destination
Select an existing File connection manager, or click <New connection...> to create a new connection manager.
Related Topics: File Connection Manager, File Connection Manager Editor
DestinationType
Select the destination type of the XML document. This property has the options listed in the following table.
Value | Description |
---|---|
File connection | Save the result to a file that a File connection manager specifies. |
Variable | Save the result to a variable. |
ValidationType
Select the validation type. This property has the options listed in the following table.
Value | Description |
---|---|
DTD | Use a Document Type Definition (DTD). |
XSD | Use an XML Schema definition (XSD) schema. Selecting this option displays the dynamic options described in the ValidationType Dynamic Options section. |
FailOnValidationFail
Specify whether the operation fails if the document fails to validate.
ValidationDetails
Provides detailed error output when the value of this property is true. For more information, see Validate XML with the XML Task.
ValidationType Dynamic Options
ValidationType = XSD
SecondOperandType
Select the source type of the second XML document. This property has the options listed in the following table.
Value | Description |
---|---|
Direct input | Set the source to an XML document. |
File connection | Select a file that contains the XML document. |
Variable | Set the source to a variable that contains the XML document. |
SecondOperand
If SecondOperandType is set to Direct input, provide the XML code or click the ellipsis button (...) and then provide the XML by using the Source Editor dialog box.
If SecondOperandType is set to File connection, select a File connection manager, or click <New connection...> to create a new connection manager.
Related Topics: File Connection Manager, File Connection Manager Editor
If SecondOperandType is set to Variable, select an existing variable, or click <New variable...> to create a new variable.
Related Topics: Integration Services (SSIS) Variables, Add Variable.
OperationType = XSLT
Specify options for the XSLT operation.
SaveOperationResult
Specify whether the XML task saves the output of the XSLT operation.
OverwriteDestination
Specify whether to overwrite the destination file or variable.
Destination
If DestinationType is set to File connection, select a File connection manager, or click <New connection...> to create a new connection manager.
Related Topics: File Connection Manager, File Connection Manager Editor
If DestinationType is set to Variable, select an existing variable, or click <New variable...> to create a new variable.
Related Topics: Integration Services (SSIS) Variables, Add Variable.
DestinationType
Select the destination type of the XML document. This property has the options listed in the following table.
Value | Description |
---|---|
File connection | Save the result to a file that a File connection manager specifies. |
Variable | Save the result to a variable. |
SecondOperandType
Select the source type of the second XML document. This property has the options listed in the following table.
Value | Description |
---|---|
Direct input | Set the source to an XML document. |
File connection | Select a file that contains the XML document. |
Variable | Set the source to a variable that contains the XML document. |
SecondOperand
If SecondOperandType is set to Direct input, provide the XML code or click the ellipsis button (...) and then provide the XML by using the Source Editor dialog box.
If SecondOperandType is set to File connection, select a File connection manager, or click <New connection...> to create a new connection manager.
Related Topics: File Connection Manager, File Connection Manager Editor
If SecondOperandType is set to Variable, select an existing variable, or click <New variable...> to create a new variable.
Related Topics: Integration Services (SSIS) Variables, Add Variable.
OperationType = XPATH
Specify options for the XPath operation.
SaveOperationResult
Specify whether the XML task saves the output of the XPath operation.
OverwriteDestination
Specify whether to overwrite the destination file or variable.
Destination
If DestinationType is set to File connection, select a File connection manager, or click <New connection...> to create a new connection manager.
Related Topics: File Connection Manager, File Connection Manager Editor
If DestinationType is set to Variable, select an existing variable, or click <New variable...> to create a new variable.
Related Topics: Integration Services (SSIS) Variables, Add Variable.
DestinationType
Select the destination type of the XML document. This property has the options listed in the following table.
Value | Description |
---|---|
File connection | Save the result to a file that a File connection manager specifies. |
Variable | Save the result to a variable. |
SecondOperandType
Select the source type of the second XML document. This property has the options listed in the following table.
Value | Description |
---|---|
Direct input | Set the source to an XML document. |
File connection | Select a file that contains the XML document. |
Variable | Set the source to a variable that contains the XML document. |
SecondOperand
If SecondOperandType is set to Direct input, provide the XML code or click the ellipsis button (...) and then provide the XML by using the Source Editor dialog box.
If SecondOperandType is set to File connection, select a File connection manager, or click <New connection...> to create a new connection manager.
Related Topics: File Connection Manager, File Connection Manager Editor
If SecondOperandType is set to Variable, select an existing variable, or click <New variable...> to create a new variable.
Related Topics: Integration Services (SSIS) Variables, Add Variable.
PutResultInOneNode
Specify whether the result is written to a single node.
XPathOperation
Select the XPath result type. This property has the options listed in the following table.
Value | Description |
---|---|
Evaluation | Return the result of an XPath function. |
Node list | Return the selected nodes as an XML fragment. |
Values | Return the inner text value of all selected nodes, concatenated into a string. |
OperationType = Merge
Specify options for the Merge operation.
XPathStringSourceType
Select the source type of the XPath statement that identifies the merge location. This property has the options listed in the following table.
Value | Description |
---|---|
Direct input | Set the source to an XML document. |
File connection | Select a file that contains the XML document. |
Variable | Set the source to a variable that contains the XML document. |
XPathStringSource
If XPathStringSourceType is set to Direct input, provide the XML code or click the ellipsis button (...) and then provide the XML by using the Document Source Editor dialog box.
If XPathStringSourceType is set to File connection, select a File connection manager, or click <New connection...> to create a new connection manager.
Related Topics: File Connection Manager, File Connection Manager Editor
If XPathStringSourceType is set to Variable, select an existing variable, or click <New variable...> to create a new variable.
Related Topics: Integration Services (SSIS) Variables, Add Variable
When you use an XPath statement to identify the merge location in the source document, this statement is expected to return a single node. If the statement returns multiple nodes, only the first node is used. The contents of the second document are merged under the first node that the XPath query returns.
SaveOperationResult
Specify whether the XML task saves the output of the Merge operation.
OverwriteDestination
Specify whether to overwrite the destination file or variable.
Destination
If DestinationType is set to File connection, select a File connection manager, or click <New connection...> to create a new connection manager.
Related Topics: File Connection Manager, File Connection Manager Editor
If DestinationType is set to Variable, select an existing variable, or click <New variable...> to create a new variable.
Related Topics: Integration Services (SSIS) Variables, Add Variable.
DestinationType
Select the destination type of the XML document. This property has the options listed in the following table.
Value | Description |
---|---|
File connection | Save the result to a file that a File connection manager specifies. |
Variable | Save the result to a variable. |
SecondOperandType
Select the source type of the second XML document. This property has the options listed in the following table.
Value | Description |
---|---|
Direct input | Set the source to an XML document. |
File connection | Select a file that contains the XML document. |
Variable | Set the source to a variable that contains the XML document. |
SecondOperand
If SecondOperandType is set to Direct input, provide the XML code or click the ellipsis button (...) and then provide the XML by using the Document Source Editor dialog box.
If SecondOperandType is set to File connection, select a File connection manager, or click <New connection...> to create a new connection manager.
Related Topics: File Connection Manager, File Connection Manager Editor
If SecondOperandType is set to Variable, select an existing variable, or click <New variable...> to create a new variable.
Related Topics: Integration Services (SSIS) Variables, Add Variable
OperationType = Diff
Specify options for the Diff operation.
DiffAlgorithm
Select the Diff algorithm to use when comparing documents. This property has the options listed in the following table.
Value | Description |
---|---|
Auto | Let the XML task determine whether to use the fast or precise algorithm. |
Fast | Use a fast, but less precise Diff algorithm. |
Precise | Use a precise Diff algorithm. |
Diff Options
Set the Diff options to apply to the Diff operation. The options are listed in the following table.
Value | Description |
---|---|
IgnoreXMLDeclaration | Specify whether to compare the XML declaration. |
IgnoreDTD | Specify whether to ignore the document type definition (DTD). |
IgnoreWhiteSpaces | Specify whether to ignore differences in the amount of white space when comparing documents. |
IgnoreNamespaces | Specify whether to compare the namespace uniform resource identifier (URI) of an element and its attribute names. Note: If this option is set to True, two elements that have the same local name but different namespaces are considered identical. |
IgnoreProcessingInstructions | Specify whether to compare processing instructions. |
IgnoreOrderOfChildElements | Specify whether to compare the order of child elements. Note: If this option is set to True, child elements that differ only in their position in a list of siblings are considered identical. |
IgnoreComments | Specify whether to compare comment nodes. |
IgnorePrefixes | Specify whether to compare prefixes of element and attribute names. Note: If you set this option to True, two elements that have the same local name, but different namespace URIs and prefixes, are considered identical. |
FailOnDifference
Specify whether the task fails if the Diff operation detects differences between the documents.
SaveDiffGram
Specify whether to save the comparison result, a DiffGram document.
SaveOperationResult
Specify whether the XML task saves the output of the Diff operation.
OverwriteDestination
Specify whether to overwrite the destination file or variable.
Destination
If DestinationType is set to File connection, select a File connection manager, or click <New connection...> to create a new connection manager.
Related Topics: File Connection Manager, File Connection Manager Editor
If DestinationType is set to Variable, select an existing variable, or click <New variable...> to create a new variable.
Related Topics: Integration Services (SSIS) Variables, Add Variable.
DestinationType
Select the destination type of the XML document. This property has the options listed in the following table.
Value | Description |
---|---|
File connection | Save the result to a file that a File connection manager specifies. |
Variable | Save the result to a variable. |
SecondOperandType
Select the source type of the second XML document. This property has the options listed in the following table.
Value | Description |
---|---|
Direct input | Set the source to an XML document. |
File connection | Select a file that contains the XML document. |
Variable | Set the source to a variable that contains the XML document. |
SecondOperand
If SecondOperandType is set to Direct input, provide the XML code or click the ellipsis button (...) and then provide the XML by using the Document Source Editor dialog box.
If SecondOperandType is set to File connection, select a File connection manager, or click <New connection...> to create a new connection manager.
Related Topics: File Connection Manager, File Connection Manager Editor
If SecondOperandType is set to Variable, select an existing variable, or click <New variable...> to create a new variable.
Related Topics: Integration Services (SSIS) Variables, Add Variable
OperationType = Patch
Specify options for the Patch operation.
SaveOperationResult
Specify whether the XML task saves the output of the Patch operation.
OverwriteDestination
Specify whether to overwrite the destination file or variable.
Destination
If DestinationType is set to File connection, select a File connection manager, or click <New connection...> to create a new connection manager.
Related Topics: File Connection Manager, File Connection Manager Editor
If DestinationType is set to Variable, select an existing variable, or click <New variable...> to create a new variable.
Related Topics: Integration Services (SSIS) Variables, Add Variable.
DestinationType
Select the destination type of the XML document. This property has the options listed in the following table.
Value | Description |
---|---|
File connection | Save the result to a file that a File connection manager specifies. |
Variable | Save the result to a variable. |
SecondOperandType
Select the source type of the second XML document. This property has the options listed in the following table.
Value | Description |
---|---|
Direct input | Set the source to an XML document. |
File connection | Select a file that contains the XML document. |
Variable | Set the source to a variable that contains the XML document. |
SecondOperand
If SecondOperandType is set to Direct input, provide the XML code or click the ellipsis button (...) and then provide the XML by using the Document Source Editor dialog box.
If SecondOperandType is set to File connection, select a File connection manager, or click <New connection...> to create a new connection manager.
Related Topics: File Connection Manager, File Connection Manager Editor
If SecondOperandType is set to Variable, select an existing variable, or click <New variable...> to create a new variable.
Related Topics: Integration Services (SSIS) Variables, Add Variable