Working with Excel Files with the Script Task
New: 14 April 2006
SQL Server 2005 Integration Services (SSIS) provides the Excel connection manager, Excel source, and Excel destination for working with data stored in spreadsheets in the Microsoft Excel file format. The techniques described in this topic use the Script task to obtain information about available Excel databases (workbook files) and tables (worksheets and named ranges). These samples can easily be modified to work with any of the other file-based data sources supported by the Microsoft Jet OLE DB Provider.
Configuring a Package to Test the Samples
Example1: Check Whether an Excel File Exists
Example 2: Check Whether an Excel Table Exists
Example 3: Get a List of Excel Files in a Folder
Example 4: Get a List of Tables in an Excel File
Displaying the Results of the Samples
Note
If you want to create a task that you can more easily reuse across multiple packages, consider using the code in this Script task sample as the starting point for a custom task. For more information, see Developing a Custom Task.
Configuring a Package to Test the Samples
You can configure a single package to test all the samples in this topic. The samples use many of the same package variables and the same .NET Framework classes.
To configure a package for use with the examples in this topic
Create a new Integration Services project in Business Intelligence Development Studio and open the default package for editing.
Variables. Open the Variables window and define the following variables:
ExcelFile
, of type String. Enter the complete path and filename to an existing Excel workbook.ExcelTable
, of type String. Enter the name of an existing worksheet or named range in the workbook named in the value of theExcelFile
variable. This value is case-sensitive.ExcelFileExists
, of type Boolean.ExcelTableExists
, of type Boolean.ExcelFolder
, of type String. Enter the complete path of a folder that contains at least one Excel workbook.ExcelFiles
, of type Object.ExcelTables
, of type Object.
Imports statements. Most of the code samples require you to import one or both of the following .NET Framework namespaces at the top of your script file:
- System.IO, for file system operations.
- System.Data.OleDb, to open Excel files as data sources.
References. The code samples that read schema information from Excel files require an additional reference in the script project to the System.Xml namespace.
Example 1 Description: Check Whether an Excel File Exists
This example determines whether the Excel workbook file specified in the ExcelFile
variable exists, and then sets the Boolean value of the ExcelFileExists
variable to the result. You can use this Boolean value for branching in the workflow of the package.
To configure this Script Task example
Add a new Script task to the package and change its name to ExcelFileExists.
Open the Script Task Editor.
On the Script tab, add
ExcelFile
to the ReadOnlyVariables list, and addExcelFileExists
to the ReadWriteVariables list.Click Design Script to open the script editor.
Add an Imports statement for the System.IO namespace at the top of the script file.
Replace all the code in Sub Main with the following code.
Example 1 Code
Dim fileToTest As String
fileToTest = Dts.Variables("ExcelFile").Value.ToString
If File.Exists(fileToTest) Then
Dts.Variables("ExcelFileExists").Value = True
Else
Dts.Variables("ExcelFileExists").Value = False
End If
Dts.TaskResult = Dts.Results.Success
Example 2 Description: Check Whether an Excel Table Exists
This example determines whether the Excel worksheet or named range specified in the ExcelTable
variable exists in the Excel workbook file specified in the ExcelFile
variable, and then sets the Boolean value of the ExcelTableExists
variable to the result. You can use this Boolean value for branching in the workflow of the package.
To configure this Script Task example
Add a new Script task to the package and change its name to ExcelTableExists.
Open the Script Task Editor.
On the Script tab, add
ExcelTable
andExcelFile
to the ReadOnlyVariables list, and addExcelTableExists
to the ReadWriteVariables list.Click Design Script to open the script editor.
Add a reference to the System.Xml assembly in the script project.
Add Imports statements for the System.IO and System.Data.OleDb namespaces at the top of the script file.
Replace all the code in Sub Main with the following code.
Example 2 Code
Dim fileToTest As String
Dim tableToTest As String
Dim connectionString As String
Dim excelConnection As OleDbConnection
Dim excelTables As DataTable
Dim excelTable As DataRow
Dim currentTable As String
fileToTest = Dts.Variables("ExcelFile").Value.ToString
tableToTest = Dts.Variables("ExcelTable").Value.ToString
Dts.Variables("ExcelTableExists").Value = False
If File.Exists(fileToTest) Then
connectionString = "Provider=Microsoft.Jet.OLEDB.4.0;" & _
"Data Source=" & fileToTest & _
";Extended Properties=Excel 8.0"
excelConnection = New OleDbConnection(connectionString)
excelConnection.Open()
excelTables = excelConnection.GetSchema("Tables")
For Each excelTable In excelTables.Rows
currentTable = excelTable.Item("TABLE_NAME").ToString
If currentTable = tableToTest Then
Dts.Variables("ExcelTableExists").Value = True
End If
Next
End If
Dts.TaskResult = Dts.Results.Success
Example 3 Description: Get a List of Excel Files in a Folder
This example fills an array with the list of Excel files found in the folder specified in the value of the ExcelFolder
variable, and then copies the array into the ExcelFiles
variable. You can use the Foreach from Variable enumerator to iterate over the files in the array.
To configure this Script Task example
Add a new Script task to the package and change its name to GetExcelFiles.
Open the Script Task Editor.
On the Script tab, add
ExcelFolder
to the ReadOnlyVariables list, and addExcelFiles
to the ReadWriteVariables list.Click Design Script to open the script editor.
Add an Imports statement for the System.IO namespace at the top of the script file.
Replace all the code in Sub Main with the following code.
Example 3 Code
Const FILE_PATTERN As String = "*.xls"
Dim excelFolder As String
Dim excelFiles As String()
excelFolder = Dts.Variables("ExcelFolder").Value.ToString
excelFiles = Directory.GetFiles(excelFolder, FILE_PATTERN)
Dts.Variables("ExcelFiles").Value = excelFiles
Dts.TaskResult = Dts.Results.Success
Alternate Solution
Instead of using a Script task to gather a list of Excel files into an array, you can also use the ForEach File enumerator to iterate over all the Excel files in a folder. For more information, see How to: Loop through Excel Files and Tables.
Example 4 Description: Get a List of Tables in an Excel File
This example fills an array with the list of worksheets and named ranges found in the Excel workbook file specified by the value of the ExcelFile
variable, and then copies the array into the ExcelTables
variable. You can use the Foreach from Variable Enumerator to iterate over the tables in the array.
Note
The list of tables in an Excel workbook includes both worksheets (which have the $ suffix) and named ranges. If you have to filter the list for only worksheets or only named ranges, you may have to add additional code for this purpose.
To configure this Script Task example
Add a new Script task to the package and change its name to GetExcelTables.
Open the Script Task Editor.
On the Script tab, add
ExcelFile
to the ReadOnlyVariables list, and addExcelTables
to the ReadWriteVariables list.Click Design Script to open the script editor.
Add a reference to the System.Xml namespace in the script project.
Add an Imports statement for the System.Data.OleDb namespace at the top of the script file.
Replace all the code in Sub Main with the following code.
Example 4 Code
Dim excelFile As String
Dim connectionString As String
Dim excelConnection As OleDbConnection
Dim tablesInFile As DataTable
Dim tableCount As Integer = 0
Dim tableInFile As DataRow
Dim currentTable As String
Dim tableIndex As Integer = 0
Dim excelTables As String()
excelFile = Dts.Variables("ExcelFile").Value.ToString
connectionString = "Provider=Microsoft.Jet.OLEDB.4.0;" & _
"Data Source=" & excelFile & _
";Extended Properties=Excel 8.0"
excelConnection = New OleDbConnection(connectionString)
excelConnection.Open()
tablesInFile = excelConnection.GetSchema("Tables")
tableCount = tablesInFile.Rows.Count
ReDim excelTables(tableCount - 1)
For Each tableInFile In tablesInFile.Rows
currentTable = tableInFile.Item("TABLE_NAME").ToString
excelTables(tableIndex) = currentTable
tableIndex += 1
Next
Dts.Variables("ExcelTables").Value = excelTables
Dts.TaskResult = Dts.Results.Success
Alternate Solution
Instead of using a Script task to gather a list of Excel tables into an array, you can also use the ForEach ADO.NET Schema Rowset Enumerator to iterate over all the tables (that is, worksheets and named ranges) in an Excel workbook file. For more information, see How to: Loop through Excel Files and Tables.
Displaying the Results of the Samples
If you have configured each of the examples in this topic in the same package, you can connect all the Script tasks to an additional Script task that displays the output of all the examples.
To configure a Script task to display the output of the examples in this topic
Add a new Script task to the package and change its name to DisplayResults.
Connect each of the four example Script tasks to one another, so that each task runs after the preceding task completes successfully, and connect the fourth example task to the DisplayResults task.
Open the DisplayResults task in the Script Task Editor.
On the Script tab, add all seven variables listed above in " Configuring a Package to Test the Samples" to the ReadOnlyVariables list.
Click Design Script to open the script editor.
Add Imports statements for the Microsoft.VisualBasic and System.Windows.Forms namespaces at the top of the script file.
Replace all the code in Sub Main with the following code.
Run the package and examine the results displayed in a message box.
Code to Display the Results
Const EOL As String = ControlChars.CrLf
Dim results As String
Dim filesInFolder As String()
Dim fileInFolder As String
Dim tablesInFile As String()
Dim tableInFile As String
results = _
"Final values of variables:" & EOL & _
"ExcelFile: " & Dts.Variables("ExcelFile").Value.ToString & EOL & _
"ExcelFileExists: " & Dts.Variables("ExcelFileExists").Value.ToString & EOL & _
"ExcelTable: " & Dts.Variables("ExcelTable").Value.ToString & EOL & _
"ExcelTableExists: " & Dts.Variables("ExcelTableExists").Value.ToString & EOL & _
"ExcelFolder: " & Dts.Variables("ExcelFolder").Value.ToString & EOL & _
EOL
results &= "Excel files in folder: " & EOL
filesInFolder = DirectCast(Dts.Variables("ExcelFiles").Value, String())
For Each fileInFolder In filesInFolder
results &= " " & fileInFolder & EOL
Next
results &= EOL
results &= "Excel tables in file: " & EOL
tablesInFile = DirectCast(Dts.Variables("ExcelTables").Value, String())
For Each tableInFile In tablesInFile
results &= " " & tableInFile & EOL
Next
MessageBox.Show(results, "Results", MessageBoxButtons.OK, MessageBoxIcon.Information)
Dts.TaskResult = Dts.Results.Success
See Also
Other Resources
Excel Connection Manager
How to: Loop through Excel Files and Tables