Use PowerShell to create a data factory pipeline to copy data from SQL Server to Azure
This sample PowerShell script creates a pipeline in Azure Data Factory that copies data from a SQL Server database to an Azure Blob Storage.
Note
We recommend that you use the Azure Az PowerShell module to interact with Azure. To get started, see Install Azure PowerShell. To learn how to migrate to the Az PowerShell module, see Migrate Azure PowerShell from AzureRM to Az.
This sample requires Azure PowerShell. Run Get-Module -ListAvailable Az
to find the version.
If you need to install or upgrade, see Install Azure PowerShell module.
Run the Connect-AzAccount cmdlet to connect to Azure.
Prerequisites
- SQL Server. You use a SQL Server database as a source data store in this sample.
- Azure Storage account. You use Azure blob storage as a destination/sink data store in this sample. if you don't have an Azure storage account, see the Create a storage account article for steps to create one.
- Self-hosted integration runtime. Download MSI file from the download center and run it to install a self-hosted integration runtime on your machine.
Create sample database in SQL Server
In the SQL Server database, create a table named emp by using the following SQL script:
CREATE TABLE dbo.emp ( ID int IDENTITY(1,1) NOT NULL, FirstName varchar(50), LastName varchar(50), CONSTRAINT PK_emp PRIMARY KEY (ID) ) GO
Insert some sample data into the table:
INSERT INTO emp VALUES ('John', 'Doe') INSERT INTO emp VALUES ('Jane', 'Doe')
Sample script
Important
This script creates JSON files that define Data Factory entities (linked service, dataset, and pipeline) on your hard drive in the c:\ folder.
$resourceGroupName = "<Resource group name>"
$dataFactoryName = "<Data factory name>" # must be globally unique
$storageAccountName = "<Az.Storage account name>"
$storageAccountKey = "<Az.Storage account key>"
$sqlServerName = "<SQL server name>"
$sqlDatabaseName = "SQL Server database name"
$sqlTableName = "emp" # create the emp table if it does not already exist in your database with ID, FirstName, and LastName columns of type String.
$sqlUserName = "<SQL Authentication - user name>"
$sqlPassword = "<SQL Authentication - user password>"
$blobFolderPath = "<Azure blob container name>/<Azure blob folder name>"
$integrationRuntimeName = "<Self-hosted integration runtime name"
$pipelineName = "SqlServerToBlobPipeline"
$dataFactoryRegion = "East US"
# Create a resource group
New-AzResourceGroup -Name $resourceGroupName -Location $dataFactoryRegion
# create a data factory
$df = Set-AzDataFactory -ResourceGroupName $resourceGroupName -Name $dataFactoryName -Location $dataFactoryRegion
# create a self-hosted integration runtime
Set-AzDataFactoryIntegrationRuntime -Name $integrationRuntimeName -Type SelfHosted -DataFactoryName $dataFactoryName -ResourceGroupName $resourceGroupName
# get the authorization key from the created integration runtime in the cloud
Get-AzDataFactoryIntegrationRuntimeKey -Name $integrationRuntimeName -DataFactoryName $dataFactoryName -ResourceGroupName $resourceGroupName | ConvertTo-Json
# IMPORTANT: Install self-hosted integration runtime on your machine and use one of the keys to register the IR installed on your machine with the cloud service
# create an Az.Storage linked service
## JSON definition of the linked service.
$storageLinkedServiceDefinition = @"
{
"name": "AzureStorageLinkedService",
"properties": {
"type": "AzureStorage",
"typeProperties": {
"connectionString": {
"value": "DefaultEndpointsProtocol=https;AccountName=$storageAccountName;AccountKey=$storageAccountKey",
"type": "SecureString"
}
}
}
}
"@
## IMPORTANT: stores the JSON definition in a file that will be used by the Set-AzDataFactoryLinkedService command.
$storageLinkedServiceDefinition | Out-File c:\AzureStorageLinkedService.json
## Creates a linked service in the data factory
Set-AzDataFactoryLinkedService -DataFactoryName $dataFactoryName -ResourceGroupName $resourceGroupName -Name "AzureStorageLinkedService" -File c:\AzureStorageLinkedService.json
# create an on-premises SQL Server linked service
## JSON definition of the linked service.
$sqlServerLinkedServiceDefinition = @"
{
"properties": {
"type": "SqlServer",
"typeProperties": {
"connectionString": {
"type": "SecureString",
"value": "Server=$sqlServerName;Database=$sqlDatabaseName;User ID=$sqlUserName;Password=$sqlPassword;Timeout=60"
}
},
"connectVia": {
"type": "integrationRuntimeReference",
"referenceName": "$integrationRuntimeName"
}
},
"name": "SqlServerLinkedService"
}
"@
## IMPORTANT: stores the JSON definition in a file that will be used by the Set-AzDataFactoryLinkedService command.
$sqlServerLinkedServiceDefinition | Out-File c:\SqlServerLinkedService.json
## Encrypt SQL Server credentials
New-AzDataFactoryLinkedServiceEncryptCredential -DataFactoryName $dataFactoryName -ResourceGroupName $resourceGroupName -IntegrationRuntimeName $integrationRuntimeName -File "c:\SqlServerLinkedService.json" > c:\EncryptedSqlServerLinkedService.json
# Create a SQL Server linked service
Set-AzDataFactoryLinkedService -DataFactoryName $dataFactoryName -ResourceGroupName $resourceGroupName "EncryptedSqlServerLinkedService" -File "c:\EncryptedSqlServerLinkedService.json"
# Create a source dataset for source SQL Server Database
## JSON definition of the dataset
$sourceSqlServerDatasetDefiniton = @"
{
"properties": {
"type": "SqlServerTable",
"typeProperties": {
"tableName": "$sqlTableName"
},
"structure": [
{
"name": "ID",
"type": "String"
},
{
"name": "FirstName",
"type": "String"
},
{
"name": "LastName",
"type": "String"
}
],
"linkedServiceName": {
"referenceName": "EncryptedSqlServerLinkedService",
"type": "LinkedServiceReference"
}
},
"name": "SqlServerDataset"
}
"@
## IMPORTANT: store the JSON definition in a file that will be used by the Set-AzDataFactoryDataset command.
$sourceSqlServerDatasetDefiniton | Out-File c:\SqlServerDataset.json
# Create an Azure Blob dataset in the data factory
Set-AzDataFactoryDataset -DataFactoryName $dataFactoryName -ResourceGroupName $resourceGroupName -Name "SqlServerDataset" -File "c:\SqlServerDataset.json"
# Create a dataset for sink Azure Blob Storage
## JSON definition of the dataset
$sinkBlobDatasetDefiniton = @"
{
"properties": {
"type": "AzureBlob",
"typeProperties": {
"folderPath": "$blobFolderPath",
"format": {
"type": "TextFormat"
}
},
"linkedServiceName": {
"referenceName": "AzureStorageLinkedService",
"type": "LinkedServiceReference"
}
},
"name": "AzureBlobDataset"
}
"@
## IMPORTANT: store the JSON definition in a file that will be used by the Set-AzDataFactoryDataset command.
$sinkBlobDatasetDefiniton | Out-File c:\AzureBlobDataset.json
## Create the Azure Blob dataset
Set-AzDataFactoryDataset -DataFactoryName $dataFactoryName -ResourceGroupName $resourceGroupName -Name "AzureBlobDataset" -File "c:\AzureBlobDataset.json"
# Create a pipeline in the data factory
## JSON definition of the pipeline
$pipelineDefinition = @"
{
"name": "$pipelineName",
"properties": {
"activities": [
{
"type": "Copy",
"typeProperties": {
"source": {
"type": "SqlSource"
},
"sink": {
"type":"BlobSink"
}
},
"name": "CopySqlServerToAzureBlobActivity",
"inputs": [
{
"referenceName": "SqlServerDataset",
"type": "DatasetReference"
}
],
"outputs": [
{
"referenceName": "AzureBlobDataset",
"type": "DatasetReference"
}
]
}
]
}
}
"@
## IMPORTANT: store the JSON definition in a file that will be used by the Set-AzDataFactoryPipeline command.
$pipelineDefinition | Out-File c:\SqlServerToBlobPipeline.json
## Create a pipeline in the data factory
Set-AzDataFactoryPipeline -DataFactoryName $dataFactoryName -ResourceGroupName $resourceGroupName -Name "$pipelineName" -File "c:\SqlServerToBlobPipeline.json"
# start the pipeline run
$runId = Invoke-AzDataFactoryPipeline -DataFactoryName $dataFactoryName -ResourceGroupName $resourceGroupName -PipelineName $pipelineName
# Check the pipeline run status until it finishes the copy operation
while ($True) {
$result = Get-AzDataFactoryActivityRun -DataFactoryName $dataFactoryName -ResourceGroupName $resourceGroupName -PipelineRunId $runId -RunStartedAfter (Get-Date).AddMinutes(-30) -RunStartedBefore (Get-Date).AddMinutes(30)
if (($result | Where-Object { $_.Status -eq "InProgress" } | Measure-Object).count -ne 0) {
Write-Host "Pipeline run status: In Progress" -foregroundcolor "Yellow"
Start-Sleep -Seconds 30
}
else {
Write-Host "Pipeline $pipelineName run finished. Result:" -foregroundcolor "Yellow"
$result
break
}
}
# Get the activity run details
$result = Get-AzDataFactoryActivityRun -DataFactoryName $dataFactoryName -ResourceGroupName $resourceGroupName `
-PipelineRunId $runId `
-RunStartedAfter (Get-Date).AddMinutes(-10) `
-RunStartedBefore (Get-Date).AddMinutes(10) `
-ErrorAction Stop
$result
if ($result.Status -eq "Succeeded") {`
$result.Output -join "`r`n"`
}`
else {`
$result.Error -join "`r`n"`
}
# To remove the data factory from the resource gorup
# Remove-AzDataFactory -Name $dataFactoryName -ResourceGroupName $resourceGroupName
#
# To remove the whole resource group
# Remove-AzResourceGroup -Name $resourceGroupName
Clean up deployment
After you run the sample script, you can use the following command to remove the resource group and all resources associated with it:
Remove-AzResourceGroup -ResourceGroupName $resourceGroupName
To remove the data factory from the resource group, run the following command:
Remove-AzDataFactoryV2 -Name $dataFactoryName -ResourceGroupName $resourceGroupName
Script explanation
This script uses the following commands:
Command | Notes |
---|---|
New-AzResourceGroup | Creates a resource group in which all resources are stored. |
Set-AzDataFactoryV2 | Create a data factory. |
New-AzDataFactoryV2LinkedServiceEncryptCredential | Encrypts credentials in a linked service and generates a new linked service definition with the encrypted credential. |
Set-AzDataFactoryV2LinkedService | Creates a linked service in the data factory. A linked service links a data store or compute to a data factory. |
Set-AzDataFactoryV2Dataset | Creates a dataset in the data factory. A dataset represents input/output for an activity in a pipeline. |
Set-AzDataFactoryV2Pipeline | Creates a pipeline in the data factory. A pipeline contains one or more activities that performs a certain operation. In this pipeline, a copy activity copies data from one location to another location in an Azure Blob Storage. |
Invoke-AzDataFactoryV2Pipeline | Creates a run for the pipeline. In other words, runs the pipeline. |
Get-AzDataFactoryV2ActivityRun | Gets details about the run of the activity (activity run) in the pipeline. |
Remove-AzResourceGroup | Deletes a resource group including all nested resources. |
Related content
For more information on the Azure PowerShell, see Azure PowerShell documentation.
Additional Azure Data Factory PowerShell script samples can be found in the Azure Data Factory PowerShell samples.