Schedule U-SQL jobs using SQL Server Integration Services (SSIS)
In this document, you learn how to orchestrate and create U-SQL jobs using SQL Server Integration Services (SSIS).
Important
Azure Data Lake Analytics retired on 29 February 2024. Learn more with this announcement.
For data analytics, your organization can use Azure Synapse Analytics or Microsoft Fabric.
Prerequisites
Azure Feature Pack for Integration Services provides the Azure Data Lake Analytics task and the Azure Data Lake Analytics Connection Manager, which help you connect to the Azure Data Lake Analytics service. To use this task, make sure you:
- Download and install SQL Server Data Tools (SSDT) for Visual Studio
- Install Azure Feature Pack for Integration Services (SSIS)
Azure Data Lake Analytics task
The Azure Data Lake Analytics task lets users submit U-SQL jobs to the Azure Data Lake Analytics account.
Learn how to configure Azure Data Lake Analytics task.
You can get the U-SQL script from different places by using SSIS built-in functions and tasks. The following scenarios show how you can configure the U-SQL scripts for different use cases.
Scenario 1-Use inline script to call TVFs and stored procedures
In Azure Data Lake Analytics Task Editor, configure SourceType as DirectInput, and put the U-SQL statements into USQLStatement.
For easy maintenance and code management, put only short U-SQL scripts inline. For example, you can call existing table-valued functions and stored procedures in your U-SQL databases.
Related article: How to pass parameter to stored procedures
Scenario 2-Use U-SQL files in Azure Data Lake Store
You can also use U-SQL files in the Azure Data Lake Store by using the Azure Data Lake Store File System Task in Azure Feature Pack. This approach enables you to use scripts stored in the cloud.
Follow the steps below to set up the connection between the Azure Data Lake Store File System Task and the Azure Data Lake Analytics Task.
Set task control flow
In the SSIS package design view, add an Azure Data Lake Store File System Task and a Foreach Loop Container, and then add an Azure Data Lake Analytics Task inside the Foreach Loop Container. The Azure Data Lake Store File System Task downloads the U-SQL files in your ADLS account to a temporary folder. The Foreach Loop Container and the Azure Data Lake Analytics Task submit every U-SQL file in the temporary folder to the Azure Data Lake Analytics account as a U-SQL job.
Configure Azure Data Lake Store File System Task
- Set Operation to CopyFromADLS.
- Set up AzureDataLakeConnection. Learn more about Azure Data Lake Store Connection Manager.
- Set AzureDataLakeDirectory. Point to the folder that stores your U-SQL scripts. Use a path relative to the Azure Data Lake Store account root folder.
- Set Destination to a folder that caches the downloaded U-SQL scripts. The Foreach Loop Container uses this folder path for U-SQL job submission.
Learn more about Azure Data Lake Store File System Task.
Configure Foreach Loop Container
On the Collection page, set Enumerator to Foreach File Enumerator.
Set Folder under the Enumerator configuration group to the temporary folder that contains the downloaded U-SQL scripts.
Set Files under Enumerator configuration to *.usql so that the loop container only picks up files ending with .usql.
On the Variable Mappings page, add a user-defined variable to get the file name of each U-SQL file. Set the Index to 0 to get the file name. In this example, define a variable called User::FileName. This variable is used to dynamically set the U-SQL script file connection and the U-SQL job name in the Azure Data Lake Analytics Task.
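The file selection that the Foreach File Enumerator performs can be approximated outside SSIS. The following is a minimal Python sketch, not the SSIS implementation: the function name enumerate_usql_files is hypothetical, subfolders are not traversed, and pattern matching follows Python's fnmatch rules, which may differ in detail from SSIS wildcard handling (for example, in case sensitivity).

```python
import fnmatch
import os


def enumerate_usql_files(folder, pattern="*.usql"):
    """Return fully qualified paths of files in `folder` matching `pattern`.

    Approximates the enumerator settings described above:
    Folder = `folder`, Files = `pattern`. This is an illustration only.
    """
    matches = []
    for name in sorted(os.listdir(folder)):
        full_path = os.path.join(folder, name)
        # Only plain files count; the pattern is applied to the file name.
        if os.path.isfile(full_path) and fnmatch.fnmatch(name, pattern):
            matches.append(os.path.abspath(full_path))
    return matches
```

Each returned path corresponds to the value that one loop iteration would place in User::FileName.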
Configure Azure Data Lake Analytics Task
Set SourceType to FileConnection.
Set FileConnection to the file connection that points to the file objects returned from Foreach Loop Container.
To create this file connection:
Choose <New Connection...> in the FileConnection setting.
Set Usage type to Existing file, and set File to the path of any existing file.
In the Connection Managers view, right-click the file connection you created, and choose Properties.
In the Properties window, expand Expressions, and set ConnectionString to the variable defined in the Foreach Loop Container, for example, @[User::FileName].
Set AzureDataLakeAnalyticsConnection to the Azure Data Lake Analytics account that you want to submit jobs to. Learn more about Azure Data Lake Analytics Connection Manager.
Set other job configurations. Learn More.
Use Expressions to dynamically set the U-SQL job name:
On the Expressions page, add a new expression key-value pair for JobName.
Set the value for JobName to the variable defined in the Foreach Loop Container, for example, @[User::FileName].
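To illustrate how one loop variable drives both the file connection and the job name, here is a small Python sketch. It is an analogy under stated assumptions, not how SSIS or the Azure Data Lake Analytics Task is implemented: the function build_jobs and the dictionary keys job_name and script are hypothetical, and no job is actually submitted.

```python
def build_jobs(file_names):
    """Build one job description per U-SQL file path.

    Each `file_name` plays the role of User::FileName: it locates the
    script (like the file connection's ConnectionString) and also serves
    as the job name (like the JobName expression).
    """
    jobs = []
    for file_name in file_names:
        with open(file_name, "r", encoding="utf-8") as handle:
            script_text = handle.read()
        # Hypothetical job record; a real submission is out of scope here.
        jobs.append({"job_name": file_name, "script": script_text})
    return jobs
```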
Scenario 3-Use U-SQL files in Azure Blob Storage
You can use U-SQL files in Azure Blob Storage by using the Azure Blob Download Task in Azure Feature Pack. This approach enables you to use scripts stored in the cloud.
The steps are similar to Scenario 2: Use U-SQL files in Azure Data Lake Store. Change the Azure Data Lake Store File System Task to the Azure Blob Download Task. Learn more about Azure Blob Download Task.
The control flow consists of an Azure Blob Download Task followed by a Foreach Loop Container that holds the Azure Data Lake Analytics Task.
Scenario 4-Use U-SQL files on the local machine
Besides using U-SQL files stored in the cloud, you can also use files on your local machine or files deployed with your SSIS packages.
Right-click Connection Managers in the SSIS project and choose New Connection Manager.
Select the File type and select Add....
Set Usage type to Existing file, and set File to the file on the local machine.
Add an Azure Data Lake Analytics Task and:
- Set SourceType to FileConnection.
- Set FileConnection to the File Connection created.
Finish the other configurations for the Azure Data Lake Analytics Task.
Scenario 5-Use U-SQL statement in SSIS variable
In some cases, you might need to generate the U-SQL statements dynamically. You can use an SSIS variable with an SSIS expression and other SSIS tasks, such as the Script Task, to generate the U-SQL statement.
Open the Variables tool window through the SSIS > Variables top-level menu.
Add an SSIS variable and set the value directly, or use an expression to generate the value.
Add an Azure Data Lake Analytics Task and:
- Set SourceType to Variable.
- Set SourceVariable to the SSIS variable you just created.
Finish the other configurations for the Azure Data Lake Analytics Task.
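As a rough illustration of generating a U-SQL statement dynamically, the following Python sketch builds a script as a string from two paths. Python is used only for illustration; in SSIS the string would typically be produced by an expression or a Script Task and stored in the SSIS variable. The function name, the column list, and the file paths are hypothetical placeholders.

```python
def build_usql_statement(input_path, output_path):
    """Return a U-SQL script that reads a TSV file and writes it as CSV.

    The three-column schema below is an assumption made for this example;
    it must match the actual input file in a real job.
    """
    return (
        "@rows =\n"
        "    EXTRACT UserId int,\n"
        "            Region string,\n"
        "            Query string\n"
        f'    FROM "{input_path}"\n'
        "    USING Extractors.Tsv();\n"
        "\n"
        "OUTPUT @rows\n"
        f'    TO "{output_path}"\n'
        "    USING Outputters.Csv();\n"
    )


# Example with hypothetical paths; the result is the text that the
# SSIS variable referenced by SourceVariable would hold.
statement = build_usql_statement("/input/data.tsv", "/output/data.csv")
print(statement)
```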
Scenario 6-Pass parameters to U-SQL script
In some cases, you might want to dynamically set the value of a U-SQL variable in the U-SQL script. The Parameter Mapping feature in the Azure Data Lake Analytics Task helps with this scenario. There are two typical use cases:
- Set the input and output file path variables dynamically based on current date and time.
- Set the parameter for stored procedures.
Learn more about how to set parameters for the U-SQL script.
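For the first use case, the date-based values have to be computed before they are mapped to U-SQL variables. The sketch below shows one way such values could be derived, using Python for illustration only. The variable names @in and @out, the function name, and the folder layout are assumptions; in SSIS the values would be held in SSIS variables, typically built with expressions, and mapped on the task's Parameter Mapping page.

```python
from datetime import datetime


def build_path_parameters(run_time):
    """Return values for two hypothetical U-SQL variables, @in and @out.

    The /input/yyyy/MM/dd/ and /output/yyyy/MM/dd/ layout is an
    assumption made for this example.
    """
    date_part = run_time.strftime("%Y/%m/%d")
    return {
        "@in": f"/input/{date_part}/data.tsv",
        "@out": f"/output/{date_part}/result.csv",
    }


# Example with a fixed timestamp so the result is reproducible.
params = build_path_parameters(datetime(2024, 1, 15, 8, 30))
# params["@in"]  is "/input/2024/01/15/data.tsv"
# params["@out"] is "/output/2024/01/15/result.csv"
```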