Connectivity Options for Microsoft SQL Server 2008 Integration Services
SQL Server Technical Article
Writer: Martin Ellis
Technical Reviewers: Val Fontama (Microsoft), Ritu Kothari (Microsoft), Chunhui Zhu (Microsoft), Xiaoyan Zhao (Microsoft), Ben Lubetsky (Attunity), Ramakrishnan Krishnan (Microsoft), Nitin Mehrotra (Microsoft), Ricardo Mendes (Microsoft)
Published: December 2008
Applies to: SQL Server 2008
Summary: This white paper provides an overview of the connectivity options that are available for SQL Server 2008 Integration Services.
Introduction
The data to which a company has access is key to its future success, but obtaining meaningful information from data can be far from straightforward. Companies may need to harvest data from multiple geographical locations and it is unlikely that all the data will be stored in a single format. Spreadsheets in the Microsoft Office Excel spreadsheet software, databases in the Microsoft Office Access database software, XML documents, databases in the Microsoft SQL Server data management software, Oracle databases, Teradata data warehouses, and SAP systems are just a few of the data stores that contemporary organizations use. Other issues, such as data ownership and compliance with regulatory requirements, can further complicate matters.
Data consolidation can be time consuming and resource intensive, and batch windows can be hard to find in an increasingly globalized environment. Furthermore, the value of data can depreciate in a relatively short period of time. Consequently, making reliable data available in a timely and efficient manner is a major challenge for the modern data worker.
SQL Server 2005 introduced SQL Server Integration Services, an enterprise-level data integration and workflow solutions platform for performing extract, transform, and load (ETL) operations. Integration Services provides a set of powerful features that enable the merging and consolidation of data from heterogeneous sources, and it includes tools for extracting, cleaning, standardizing, transforming, and loading data. A wide variety of built-in connectors support these operations, enabling Integration Services to interact not just with SQL Server databases, but with many other proprietary and nonproprietary data sources.
The SQL Server 2008 implementation of Integration Services builds upon the strengths of the previous release, and as a result the new release is a robust enterprise ETL platform that is even more productive. For an outline of the enhancements to Integration Services in SQL Server 2008, see the article “What's New (Integration Services)” on the MSDN® Web site at https://msdn.microsoft.com/en-us/library/bb522534.aspx.
Two key areas of development in SQL Server 2008 Integration Services are:
- Improved options for connectivity.
- Significant gains in performance.
Integration Services provides a wide range of data source connectors out of the box, and many add-on connectors are available from Microsoft and from third-party vendors. As a result, Integration Services is able to work with a broader range of sources than ever before.
The new connectivity options have also contributed to improving performance, and SQL Server now has the fastest ETL tool available. In a benchmark test, over 1 terabyte of data was loaded over a network from flat files into a SQL Server 2008 database in under 30 minutes, using an OLE DB connector. That is a world record 15 minutes faster than the next quickest time for the same amount of data from the nearest competitor! A white paper that discusses this test in greater detail is expected soon, but an overview of the test is provided in an article called “ETL World Record!” on the MSDN Blogs Web site at https://blogs.msdn.com/sqlperf/archive/2008/02/27/etl-world-record.aspx.
There are so many connectivity options available now that it can be difficult to know where to start. The aims of this white paper are to assist IT professionals who want to use SQL Server 2008 Integration Services to perform ETL operations between heterogeneous data sources and destinations, and to raise awareness of the new connectors for Oracle, Teradata, and SAP BI that are distributed and supported by Microsoft. The paper begins with an overview of the many connection options for SQL Server 2008 Integration Services, and then it discusses the out-of-the-box connectors that are built-in to Integration Services. It continues with an introduction to add-on connectors for Tier-1 data sources (Oracle, Teradata, and SAP BI), and then it outlines the adapters for SAP R/3 and Siebel that are available with the adapter pack for Microsoft BizTalk® Server. Finally, the paper introduces connectors for Tier-2 data sources including DB2.
Overview of Connectivity Options in SQL Server 2008 Integration Services
You can use Integration Services to create packages that encapsulate a specific business requirement, such as extracting data from an Oracle database, cleaning the data, and then loading it into a SQL Server Analysis Services database. You create Integration Services packages by using Business Intelligence Development Studio, a version of the Microsoft Visual Studio® development system that has a built-in template for Integration Services projects. This familiar interface, which uses drag-and-drop functionality to select and connect tasks, can help to significantly reduce development time.
Packages consist of one or more control flow tasks, where each task feeds into the next. Some control flow tasks, for example the Bulk Insert task and the Execute SQL task, reference data sources. Others, such as the Script task, do not always do so. You can use a special task called the Data Flow task to configure data transformations. The Data Flow task has its own toolbox, which contains data flow source items, data flow destination items, and the various transformation items. Some data flow sources and data flow destinations can reference data sources directly. For example, when you configure a Raw File source, you can specify a file location. However, the majority of data flow sources and destinations reference a connection manager.
A connection manager is a predefined logical representation of a connection. When you create a connection manager by using Business Intelligence Development Studio, you specify the information that is required to build a connection string for your data source. At run time, the connection string is used to create the connection as required by the elements within the package. You can use a connection manager more than once within a package.
When you create a connection manager, you must first specify the type of connection manager that you require in the Add SSIS Connection Manager dialog box (Figure 1). After you have selected the type of connection manager that you require, you can configure the connection to target your data source more specifically. For example, if you select the OLE DB connection manager type, you can specify OLE DB for Oracle, OLE DB for SQL Server, or other sources. These options are discussed in more detail in the next section.
Figure 1:Add SSIS Connection Manager dialog box
There are also add-on connectors that offer connectivity to sources that have no built-in connector, such as Teradata and SAP BI, or that offer improved performance over existing connectors for sources that are already supported, such as Oracle. After you have installed these connectors, they are accessible and configurable through Business Intelligence Development Studio in the same way as the standard connectors are. These connectors are discussed in more detail later in this paper.
The Built-in Connectors
SQL Server Integration Services includes built-in connectors for many of the most commonly used data sources.
OLE DB
OLE DB is a set of Component Object Model (COM) interfaces that facilitate access to a variety of data stores, including relational databases. After you add an OLE DB connection manager to your Integration Services project, you can then specify the nature of the data source that you want to connect to by selecting a provider in the Connection Manager dialog box (Figure 2). The next few paragraphs provide an overview of these OLE DB providers.
Figure 2: Connection Manager dialog box showing OLE DB provider options
Using OLE DB with Microsoft Office Excel and Access
Organizations frequently need to move data between Office Excel and SQL Server and between Office Access and SQL Server. Microsoft OLE DB Provider for Jet version 4.0 enables you to use Office Access databases up to and including Office Access 2003 and Office Excel workbooks up to and including Office Excel 2003 as data sources and destinations. Office Access 2007 and Office Excel 2007 have a different file format, so you must use the Microsoft OLE DB provider for Office 12.0 Access Database Engine for these files instead. After you have chosen the correct provider, you must specify the file location for the database or workbook.
For Office Excel, there is also a dedicated connection manager, which you can select from the Add SSIS Connection Manager dialog box. When you set up an Excel connection by using this option, you must specify the version of Office Excel that you are referencing. When you do this, the correct data provider (either the OLE DB provider for Jet 4.0 or the OLE DB provider for Office 12.0 Access Database Engine) is added automatically.
Using OLE DB with SQL Server
You can access SQL Server databases by using several OLE DB providers. The Microsoft OLE DB Provider for SQL Server provides access via the standard OLE DB driver (SQLOLEDB). There is also the SQL Server Native Client 10.0, which is an updated version of the SQL Server Native Client first introduced in SQL Server 2005. The SQL Server Native Client provides both OLE DB connectivity and ODBC connectivity. It also enables you to utilize the features of SQL Server 2005 that were not present in previous versions of SQL Server, such as the XML data type and common language runtime (CLR) user-defined data types. The SQL Server Native Client 10.0 is the SQL Server 2008 equivalent of the SQL Server Native Client, and it enables you to take advantage of new features in SQL Server 2008, such as sparse columns and the new date and time data types.
The OLE DB option that you choose for a SQL Server connection depends in part upon whether you need to take advantage of these new features. Whichever access method you choose for SQL Server, the configuration of the connection manager is the same. You must provide the name of the host server and the source database, in addition to authentication information.
Using OLE DB with Oracle
You can use the Microsoft OLE DB Provider for Oracle (MSDAORA) to access to Oracle data sources. You configure the provider by supplying the name of the Oracle server and the authentication details for that server. This information is then used to build the connection string. You can use data flow transformations that rely on an OLE DB connection, such as Lookup and Fuzzy Grouping, with this connector. There are some limitations when you use this provider. For more information, see the article "INFO: Limitations of Microsoft Oracle ODBC Driver and OLEDB Provider" on the Microsoft Help and Support Web site at https://support.microsoft.com/kb/244661. Broadly, Oracle features that were introduced after version 8i are not supported, which precludes the use of some data types, for example. (You can use the provider with later versions of Oracle, but these features will not be available.) There is a built-in .NET provider for Oracle that provides access to Oracle features up to and including version 9i. There is also a high-performance connector for Oracle, the Microsoft connector for Oracle by Attunity 1.0, which will be discussed later in this paper.
Using OLE DB with SQL Server Analysis Services
The Microsoft OLE DB Provider for SQL Server Analysis Services 10.0 provides access to dimensional data and cubes hosted on a server running SQL Server 2008 Analysis Services. Using this connector, you can connect to live Analysis Services databases as well as to Analysis Services projects that are still in development in Business Intelligence Development Studio. The Analysis Services Execute DDL, Analysis Services Processing, and Data Mining Query control flow tasks use the OLE DB Provider for Analysis Services 10.0, as do the Data Mining Model Training, Dimension Processing, and Partition Processing destinations in the data flow. The OLE DB Provider for Analysis Services 9.0 is a provider for SQL Server 2005 Analysis Services, and the Microsoft OLE DB Provider for OLAP Services 8.0 enables access to SQL Server 2000 Analysis Services. Additionally, the Microsoft OLE DB Provider for Data Mining Services enables you to use data mining extensions (DMX) statements. For more information, see the article "OLE DB for Data Mining Specification 1.0 Final" on the Microsoft Web site at https://www.microsoft.com/downloads/details.aspx?DisplayLang=en&FamilyID=01005f92-dba1-4fa4-8ba0-af6a19d30217.
Using OLE DB with Other Data Sources
The Microsoft OLE DB Provider for Microsoft Directory Services is used to access Active Directory® Service Interfaces (ADSI). It provides read access to the Active Directory directory service and other lightweight directory access protocol (LDAP) services such as Novell Directory Services.
The OLE DB Simple Provider is intended to give access to data sources that require only basic OLE DB support. You can use it to open hierarchical ActiveX® Data Objects (ADO) Recordsets over XML documents. For more information, see the article "Microsoft OLE DB Simple Provider" on the MSDN Web site at https://msdn.microsoft.com/en-us/library/ms675260(VS.85).aspx. You can also build hierarchical Recordsets by using the MSDataShape provider.
Finally, the Microsoft OLE DB Provider for Indexing Service enables you to perform full-text searches on files other than database files.
Using OLE DB with Other Integration Services Tasks and Transformations
In addition to being used as data sources and destinations, OLE DB connection managers can be consumed by certain data flow transformations, including Lookup, Fuzzy Lookup, Fuzzy Grouping, Slowly Changing Dimensions, Term Extraction, Term Lookup, and OLE DB Command. The Bulk Insert and Execute SQL control flow tasks also reference OLE DB connections.
ADO.NET
SQL Server 2008 Integration Services provides greater support for ADO.NET connectivity than SQL Server 2005 Integration Services. The most visible example of this is the ADO.NET source and destination options that are available in the data flow, which can read and write data directly to and from databases by using ADO.NET. The ADO.NET source effectively replaces the DataReader source, but the DataReader destination is still available alongside the ADO.NET destination. The DataReader destination is useful when you want to provide data for consumption by other applications. For example, you can use it as input for a Reporting Services data source.
When you select an ADO.NET connection manager from the Add SSIS Connection Manager dialog box, you can choose from a list of ADO.NET 2.0 providers, including the SqlClient Data Provider, the OracleClient Data Provider, the ODBC Data Provider, and the .NET OLE DB Data Provider. The latter enables access to the same list of OLE DB providers that were discussed earlier, but with ADO.NET as an extra layer on top of OLE DB. Because of this extra layer, you will generally achieve better performance by accessing the OLE DB providers directly.
The SqlClient Data Provider and OracleClient Data Provider give access to SQL Server databases and Oracle databases respectively, through the ADO.NET interface. For Oracle and SQL Server connections, you can use either an ADO.NET or an OLE DB provider. Factors that might influence your choice of connection manager and provider include:
- Performance. In general, you can expect OLE DB sources to offer slightly better performance, although you should always test your packages in your own specific environment rather than relying on this to always be the case.
- The availability of certain transformation types. For example, Lookup transformations require an OLE DB connection manager.
- 32-bit and 64-bit environment. The use of ADO.NET providers can simplify deployment in mixed 32-bit and 64-bit environments.
- Functionality. For example, the OracleClient Data Provider offers support for some data types that were introduced after Oracle version 8i, whereas the OLE DB Provider for Oracle does not.
For a more in-depth discussion of these issues, see the article "ADO.NET vs OLEDB" on the Microsoft Integration Services development team wiki at http://ssis.wik.is/Understanding_Connectivity_Stacks/ADO.NET_vs_OLEDB.
Open Database Connectivity (ODBC) support is provided through the .NET ODBC Data Provider. This is the only way that you can access ODBC data sources in SQL Server 2008 Integration Services, because there is no ODBC connection manager.
ADO
ADO connection managers enable access to the OLE DB data sources that were discussed earlier through an ADO interface.
FLATFILE and MULTIFLATFILE
Flat files can be a useful intermediary if you want to exchange data between systems that have no way of connecting to one another, or if there is some other reason that direct communication is not desirable. For example, a mainframe system for which there is no connector might generate a .csv file, and the data that it contains can be imported into SQL Server using a FLATFILE connector. You can use the File System and FTP control flow tasks to obtain and manipulate files, and the FLATFILE and MULTIFLATFILE connection managers can then access the data that they contain. Various file formats are supported, including delimited (.csv) and ragged right. Using the MULTIFLATFILE connection manager, you can reference multiple files as long as they all have the same format. When you specify the file locations, you separate each path with a pipe (|).
FILE
The FILE connection manager is not used to load data from flat files. Instead it provides a package with access to a file at run time. For example, an Execute SQL task that uses a FILE connection can extract and run SQL statements that are contained within a file. Alternatively, you can use a FILE connection as the connection manager for an XML task. FILE connection managers, used with the File System task, can also create a file or folder at run time.
FTP
FTP connection managers can access FTP servers to transfer files. The FTP task can use FTP connection managers to both send and receive files. The FTP connection manager does not support Windows® authentication, only basic and anonymous. The passive mode option enables the connection to be initiated by the client as well as the server. You can then process files that are transferred by an FTP task using a FLATFILE or FILE connector, as discussed earlier.
HTTP
The HTTP connection manager is used to access a Web server. The required connection details can be seen in Figure 3. As with FTP connectors, only basic and anonymous connections are supported; Windows authentication is not. You can specify a client certificate for additional security, and the option to use a proxy server is also available.
The Web Service task uses an HTTP connection manager. You can configure a Web Service task to access a Web service method that will recover data from the Web server and deliver it to a file or a variable for further processing. For example, you could access product information from a Web server and load it into your database. The Web Services Description Language (WSDL) file that describes the methods that the Web service offers must be available locally when you configure the Web Service task.
Figure 3: HTTP Connection Manager Editor dialog box
MSMQ
The Message Queue task uses the MSMQ connection manager to enable Integration Services packages that are located on multiple systems throughout your corporate network to communicate and interact with one another. This can be useful if, for example, you have a central database that needs to receive data from several distributed sources and then perform some kind of aggregation on the data. You can create packages on the distributed servers that use Message Queue tasks to send their data to the central server. A package on the central server, which also contains Message Queue tasks, can pick up the data from each server and then, after all of the data has arrived, process it all together. Using Message Queuing (also known as MSMQ), packages communicate with one another at run time, so the central package in this example might have to run for a relatively long time to gather all of the data. Message Queuing is an asynchronous communication mechanism. If a server cannot deliver its data at the first attempt, it attempts to deliver it later, which adds a degree of reliability to the process.
The Message Queue task enables you to send data as a string, a variable, or a file, and you can configure a task to send or receive. If you choose the receive option, you can have the message automatically converted to a variable by using the String message to variable option in the Message Queue Task Editor dialog box (Figure 4). This makes further processing of the data potentially easier.
Figure 4: Message Queue Task Editor dialog box
MSOLAP100
Choosing the MSOLAP100 connector opens the same dialog box as selecting the OLE DB provider for SQL Server Analysis Services 10.0, which was discussed earlier. It is simply another way to configure this provider.
SMOSERVER
SQL Server Management Objects (SMO) connection managers enable Integration Services to perform transfer tasks. Transfer tasks move databases, logins, jobs, error messages, stored procedures from the master database, and SQL Server objects such as tables, views, and schemas between instances of SQL Server, including SQL Server 2000, SQL Server 2005, and SQL Server 2008. These tasks can be useful when you need to migrate database objects to a newer version of SQL Server. When you configure an SMO connection manager, you only need to supply a SQL Server name and authentication details.
SMTP
The Simple Mail Transfer Protocol (SMTP) connection manager enables access to an SMTP server. The Send Mail task uses this connection manager to send e-mail messages, for example, to indicate success or failure of components within a package. The SMTP connection manager has no reliance on MAPI so the Microsoft Office Outlook® messaging and collaboration client is not required for the connector to function.
SQLMOBILE
The SQLMOBILE connector provides connectivity to databases on SQL Server Compact. When you configure the connector, you specify the location of the database file (which has an .sdf extension) and the authentication details. The connection manager can be used by the Execute SQL task and also by the SQL Server Compact destination in the data flow to load data into a SQL Server Compact database.
WMI
The Windows Management Instrumentation (WMI) connection manager provides access to the WMI scripting interface and is used by the WMI Data Reader and WMI Event Watcher tasks. The WMI scripting interface accepts WMI Query Language (WQL) queries that return information about the Windows environment. WQL is syntactically similar to SQL. You can use WQL queries in a WMI Data Reader task to check the amount of available disk space before you start a data load, for example.
The WMI Event Watcher task can respond to WMI events, such as a file being placed into a particular folder. This is useful when you are processing flat files. For example, when a file arrives in a particular folder, the Event Watcher task can respond by initiating the processing of the file by an Integration Services package. When you create a WMI connection manager, you supply the server name, the default WMI namespace, and the authentication details.
XML
Integration Services does not include an XML connection manager, but it does provide extensive XML handling capabilities through the XML task and the XML source in the data flow.
When you create an XML task, you must specify the XML data that the task will process in the Source Type field of the XML Task Editor dialog box. You can choose a FILE connection manager to reference an XML file, a variable to pass the XML data, or you can enter the data into the XML task. You can select the XML modification that you want to perform on the XML data when the package runs in the Operation Type field of the XML Task Editor dialog box(Figure 5). You can validate the XML against an XML Schema Definition (XSD) document, alter the format of the XML by using an Extensible Stylesheet Language for Transformations (XSLT) stylesheet, use an XML Path Language (XPath) query to obtain just a subsection of the XML data, and merge XML documents to form a single XML document. You can use the Diff option to compare two XML documents and output the differences to a separate diffgram file. You can use the Patch option to create a new XML document by combining a diffgram file created by the Diff option with another XML document.
Figure 5: XML Task Editor dialog box
The XML data flow source provides access to XML data in the data flow, in the form of a document or contained in a variable. You can validate the XML data against an XSD schema document.
Add-on Connectors
In addition to the extensive range of built-in connectors, there are many more that you can install as add-ons. Some of these connectors are provided by Microsoft and others by third parties. There are two main reasons that vendors create add-on connectors:
- To facilitate access to a data source that is not supported by any of the built-in connectors
- To provide an improvement in performance over existing connectors
Vendors of database management systems such as IBM and Oracle usually provide generic adapters that can be used with Integration Services. Figure 6 shows the Select Data Providers page of the Integration Services Connections Project Wizard, which displays a list of vendor-based data sources.
Figure 6: Integration Services Connections Project Wizard
You can see from Figure 6 that there is no built in support for DB2 in Integration Services. A list of Web sites (not visible in Figure 6) is provided in the Select Data Providers page of the Integration Services Connections Project Wizard so that you can download a connector for use in your project. In this case, you can choose from an OLE DB connector from IBM or an OLE DB connector from Microsoft. For more information about the OLE DB connector from Microsoft, see “OLE DB” earlier in this white paper.
A list of connectors for Integration Services is available in the article “Data Sources” on the Microsoft Integration Services development team wiki at http://ssis.wik.is/Data_Sources.
Add-on Connectors for Tier-1 Data Sources
Microsoft Connectors for Oracle and Teradata by Attunity (New in SQL Server 2008)
Attunity, a Microsoft OEM partner that has produced several connectors for Integration Services in the past, has created two Tier-1 connectors for use with Integration Services running on SQL Server 2008 Enterprise Edition:
- The Microsoft Connector for Oracle by Attunity
- The Microsoft Connector for Teradata by Attunity
Attunity developed these connectors in partnership with the Microsoft SQL Server Integration Services Development Team, and both connectors are distributed and supported by Microsoft, unlike other Partner provided connectors. The connectors offer a significant performance advantage over other connectors because they access internal buffering application programming interfaces (APIs) directly rather than having to go through a managed interface. The Microsoft connectors by Attunity are designed to look and function as if they are built-in Integration Services connectors such as Excel, or Flat File connection managers. They handle metadata changes in a manner that is consistent with built-in connectors, and they capture and report errors that are generated by the source systems to which they connect. After you install the connectors, you can create connection managers that use them in the usual way (Figure 7).
Figure 7: Add SSIS Connection Manager dialog box showing the Attunity connectors
New data flow sources and data flow destinations, which reference these connection managers, are added to the data flow toolbox in Business Intelligence Development Studio (Figure 8).
Figure 8: Data Flow Sources in the Toolbox showing the Attunity connectors
Figure 9 shows an Oracle Source and Oracle Destination in use as part of the data flow. The Teradata Source and Teradata Destination have a very similar appearance.
Figure 9: Data Flow task in an Integration Services package containing the Oracle source and destination
Microsoft Connector for Oracle by Attunity
The Oracle connector is compatible with Oracle database versions 9.2.0.4 and later, and it supports x86, x64, and Itanium IA-64 platforms. The Oracle data source that it provides can use tables, views, or an SQL statement to access data. The Oracle data destination can move data into Oracle databases by using arrayed load mode, which loads data in batches as a single transaction, and fast load mode, which uses the OCI direct path protocol to load data more quickly, but with a few restrictions.
Microsoft Connector for Teradata by Attunity
The Teradata connector is compatible with Teradata database versions 2R 6.0, 2R 6.1, 2R 6.2, and 12.0. It supports x86 and x64 platforms. The data source that it provides can extract data from tables and views, or by using an SQL statement. The data destination can load data into Teradata databases incrementally by using the TPT Stream operator, or you can implement fast loading, which uses the TPT Load operator.
Both of these connectors are available from the “Microsoft Connectors for Oracle and Teradata by Attunity” Web site at https://www.microsoft.com/downloads/details.aspx?FamilyId=D9CB21FE-32E9-4D34-A381-6F9231D84F1E&displaylang=en. Both connectors are designed to be used with the SQL Server 2008 Enterprise and SQL Server 2008 Developer.
Microsoft Connector 1.0 for SAP BI (New in SQL Server 2008)
The Microsoft Connector 1.0 for SAP BI is included in the SQL Server 2008 Feature Pack, October 2008 release. It provides connectivity between SQL Server 2008 Enterprise and SAP NetWeaver BI 7.0 through Integration Services. It supports both full and delta data extraction modes from SAP BI 7.0 into SQL Server. It also enables you to move data from non-SAP data sources into SAP BI InfoProviders.
With the Microsoft Connector 1.0 for SAP BI, you can extract data from SAP InfoProviders like InfoCubes, Data Store Objects (DSO), and InfoObjects. Data from these objects is fed into downstream SQL Server databases through a SAP BI Open Hub Destination, which you must create on the SAP system. The Open Hub Destination renders the multidimensional data as two dimensional relational data, which is then loaded into SQL Server using an Integration Services package that references a SAP connection manager. Analysis Services can then build its cubes from here, or you can process the data in some other way.
After you install the Microsoft Connector 1.0 for SAP BI, you can create a SAP BI connection manager (Figure 10) in Business Intelligence Development Studio. You insert the connection details that will be used to generate the connection string, including the host and authentication information. You can also test the connection from here.
Figure 10: SAP BI Connection Manager dialog box
SAP BI source and SAP BI destination items are available in the data flow toolbox, and they reference a SAP BI connection manager. You can configure a SAP BI source to trigger the data extraction process itself (P - Trigger Process Chain), allow it to be started by SAP (W - Wait for Notify), or use it for extracting data only (E – Extract Only). The SAP BI destination requires you to specify an InfoPackage that is configured in SAP. The InfoPackage specifies where and how to load the data into SAP BI.
The Microsoft Connector 1.0 for SAP BI is currently going through the SAP certification process. It is anticipated that the connector will achieve certified status by the end of 2008. Additionally, a white paper that discusses the use of the Microsoft Connector 1.0 for SAP BI in more detail will soon be available on the SQL Server Integration Services developer center Web site at https://msdn.microsoft.com/en-us/sqlserver/cc511477.aspx.
Adapters for SAP R/3 and Siebel eBusiness Applications
You can extend Integration Services connectivity to include SAP R/3 and Siebel eBusiness applications by using adapters available in the BizTalk Adapter Pack 1.0. You can obtain a free download of the evaluation version from the Microsoft Web site at https://www.microsoft.com/downloads/details.aspx?familyid=F3EA0659-B0ED-4D5A-891E-53268128A5F0&displaylang=en.
The pack contains the following adapters that you can use with Integration Services:
- BizTalk Adapter 3.0 for Siebel eBusiness Applications
- BizTalk Adapter 3.0 for mySAP Business Suite
The adapter for mySAP Business Suite supports the following versions:
- SAP R/3 4.6c Non-Unicode
- SAP R/3 4.7 Non-Unicode
- SAP R/3 4.7 Unicode
- SAP R/3 5.0 Non-Unicode
- SAP R/3 5.0 Unicode
- SAP R/3 6.0 Unicode
The Microsoft adapter for mySAP Business Suite has achieved SAP Certification status. For more information, see the article "BizTalk Adapter for SAP Certification" on the Microsoft Web site at https://www.microsoft.com/biztalk/evaluation/adapter/adapters/sap/sapcertification.mspx.
The adapter for Siebel eBusiness Applications supports the following versions:
- Siebel 7.5 (7.5.3.15)
- Siebel 7.7 (7.7.2.8)
- Siebel 7.8 (7.8.2.6)
- Siebel 8.0 (8.0.0.1)
Before you install the BizTalk Adapter Pack 1.0, you must install the Windows Communication Foundation (WCF) Line-of-Business (LOB) Adapter software development kit (SDK), which is available from the Microsoft Web site at https://www.microsoft.com/downloads/details.aspx?FamilyID=56278FDE-B708-469C-987E-DED9C6C5E580&displaylang=en.
Despite the name of the adapter pack, you do not need to install BizTalk Server to use these adapters; you can make direct connections between Integration Services and SAP or Siebel through an ADO.NET connection manager that uses the relevant .NET provider (Figure 11).
Figure 11: Connection Manager dialog box showing .NET providers for Siebel and SAP from the BizTalk Adapter Pack 1.0
To access the SAP and Siebel connections in the data flow, you can use the new ADO.NET source, which was discussed earlier. You can then use transformations and data flow destinations in the usual manner.
The Siebel adapter accesses business components and business objects that the Siebel application exposes. A business component is analogous to a view, in that it is a logical grouping of columns from one or more tables. A business object is a set of related business components.
The SAP adapter accesses remote function calls (RFCs), business application programming interfaces (BAPIs), and intermediate documents (IDOCs) exposed by a SAP system.
Detailed technical documentation on the use of these adapters is included in the BizTalk Adapter Pack download.
Add-on Connectors for Tier-2 Data Sources
Microsoft OLE DB Provider for DB2
The Microsoft OLE DB Provider for DB2 facilitates connectivity to IBM DB2 data sources. The connector is available as part of the Microsoft SQL Server 2008 Feature Pack from the Microsoft Web site at https://www.microsoft.com/downloads/details.aspx?FamilyID=C6C3E9EF-BA29-4A43-8D69-A2BED18FE73C&displaylang=en.
The OLE DB Provider for DB2 provides two-way data flow between Integration Services and the following versions of DB2:
- IBM DB2 for Linux, UNIX, and Windows versions 8 and 9
- IBM DB2 for z/OS versions 8 and 9
- IBM DB2 for i/Series V5R4 and higher
You can use the connector with x86, x64, and Itanium IA-64 editions of the Windows operating system. All connections must be over TCP/IP because this version of the connector does not support IBM’s Systems Network Architecture (SNA). If you require SNA support, you can use Microsoft Host Integration Server 2006 to achieve this.
To implement a connection to a DB2 source by using Business Intelligence Development Studio, create an OLE DB connection manager, and then select Microsoft OLE DB Provider for DB2 for the provider type (Figure 12).
Figure 12: The Connection Manager dialog box showing the provider for DB2 selected
When you click Data Links in the Connection Manager dialog box, the Data Link Properties dialog box opens and you can configure connection options including the IP address, authentication method, initial catalog, and default schema (Figure 13).
The Advanced and All tabs of the Data Link Properties dialog box provide further connection options. For example, you can set the Derive Parameters option to TRUE, which causes the provider to derive parameter information specified in the WHERE clause of an INSERT, UPDATE, DELETE, or SELECT statement from the data source as requested by the consumer.
Figure 13: Data Link Properties dialog box
After you have created the OLE DB Provider for DB2 connection manager, you can create OLE DB data sources and destinations in the data flow that reference it just as you can for other OLE DB providers.
Connectors for Other Data Sources
Microsoft offers connectivity to other data sources through its deep Partner ecosystem. However, unlike the connectors and adapters that are discussed in this paper, Partner provided connectors are supported by the vendor only, and not by Microsoft.
Conclusion
SQL Server 2008 Integration Services supports an ever-expanding range of connectivity options for a wide variety of data sources. The addition of new connectors that have an increased emphasis on performance, and of ADO.NET data sources and destinations, contributes to making Integration Services an excellent choice for ETL operations.
For more information:
https://www.microsoft.com/sqlserver/: SQL Server Web site
https://technet.microsoft.com/en-us/sqlserver/: SQL Server TechCenter
https://msdn.microsoft.com/en-us/sqlserver/: SQL Server DevCenter
https://www.microsoft.com/sqlserver/2008/en/us/integration.aspx: SQL Server Integration Services Web site
https://technet.microsoft.com/en-us/sqlserver/cc510302.aspx: SQL Server Integration Services TechCenter
https://msdn.microsoft.com/en-us/sqlserver/cc511477.aspx: SQL Server Integration Services DevCenter
http://ssis.wik.is/: SQL Server Integration Services Connectivity Wiki
https://www.microsoft.com/downloads/details.aspx?FamilyId=D9CB21FE-32E9-4D34-A381-6F9231D84F1E&displaylang=en: Download site for Microsoft connectors for Oracle and Teradata by Attunity
https://www.microsoft.com/downloads/details.aspx?FamilyID=C6C3E9EF-BA29-4A43-8D69-A2BED18FE73C&displaylang=en: Download site for SQL Server 2008 Feature Pack (includes OLE DB Provider for DB2)
https://www.microsoft.com/downloads/details.aspx?familyid=F3EA0659-B0ED-4D5A-891E-53268128A5F0&displaylang=en: Download site for BizTalk Adapter Pack 1.0 (includes BizTalk Adapter 3.0 for Siebel eBusiness Applications and BizTalk Adapter 3.0 for mySAP Business Suite)
https://download.microsoft.com/download/2/7/c/27cd7357-2649-4035-84af-e9c47df4329c/ConnectivitySSIS.doc: White paper: “Connectivity and SQL Server 2005 Integration Services”
Did this paper help you? Please give us your feedback. Tell us on a scale of 1 (poor) to 5 (excellent), how would you rate this paper and why have you given it this rating? For example:
- Are you rating it high due to having good examples, excellent screen shots, clear writing, or another reason?
- Are you rating it low due to poor examples, fuzzy screen shots, or unclear writing?
This feedback will help us improve the quality of white papers we release.