1.3 Overview

This document describes the persistence file structure for the Microsoft SQL Server Integration Services (SSIS) product. Integration Services is a tool that is used to orchestrate activities to move and conform data from disparate sources in an enterprise into a convenient location and shape, often for analysis and reporting. The DTSX structure describes these orchestration activities as a set of pathways that are comprised of a set of steps, each of which can involve processing of the data as it flows through the step. Collectively, these steps form the control flow of the package.

The steps, each of which is an executable, can be conditional or iterative. Some executables are containers for other executables, and some of these containers are define looping within the control flow.  Not all of the steps that are described in the document are necessarily executed on any one execution of the control flow. The DTSX structure allows the user to specify a Data Flow task that modifies and/or transforms the data.

The structure also allows the ability to define pathways for the data with conditional elements that can change the transformations through which particular pieces of data pass and the final destinations to which these pieces of data are routed.

A file that conforms to the DTSX structure as defined in this document can be processed by the data transformation processing engine, or it can be opened in a user interface to make modifications to the definition.

The following diagram shows one typical use of Integration Services: to consolidate a data flow from multiple sources, and while consolidating, also transform and assure consistency of all the data.

Integration Services package data flow

Figure 1: Integration Services package data flow

The intermediate step in the data flow can be one of the following:

  • Executable Task: Performs a defined type of work along the control flow. Custom tasks can be written by end users to perform any task that is not included with the Integration Services product. Examples of standard tasks that ship with the Integration Services product are manipulating an external file, processing Microsoft SQL Server Analysis Services objects, running a data mining query, performing file transfers with FTP, performing a database backup, and executing an external executable.

  • Component: Components perform data transformations along the data flow. Examples of components that ship with the Integration Services product are components that: perform aggregations; create columns derived by a specified formula; and perform row counts, sampling, and lookups. End users can also create custom components to perform additional transformations that are not included with the standard product.

The following diagram is a detailed example of a data flow described by the DTSX structure format. The diagram shows the presence of two data sources (an OLE DB Source and a Flat File Source) and three final data destinations (a Raw File Destination, an OLE DB Destination, and a SQL Server Destination). The steps in the data flow, based on conditional processing, determine which destination the data ultimately flows to and how it is transformed and processed on the route to its final destination.

Some of the intermediate steps are conditional. For example, there are two arrows between Lookup and Fuzzy Lookup. One arrow represents the data flow for erroneous data, and the other arrow represents flow for standard output.

pict5af8839d-5a3a-487b-8fe2-089418be909c

Figure 2: Data flow described by DTSX structure