Create a shared self-hosted integration runtime in Azure Data Factory
APPLIES TO: Azure Data Factory Azure Synapse Analytics
This guide shows you how to create a shared self-hosted integration runtime in Azure Data Factory. Then you can use the shared self-hosted integration runtime in another data factory.
Note
As you share your self-hosted integration runtime among more data factories, the increased workload can lead to longer queue times. If queue times become excessive, you can scale up your node or scale out by adding nodes. You can add up to 4 nodes.
Create a shared self-hosted integration runtime in Azure Data Factory
You can reuse an existing self-hosted integration runtime infrastructure that you already set up in a data factory. This reuse lets you create a linked self-hosted integration runtime in a different data factory by referencing an existing shared self-hosted IR.
Terminology
- Shared IR: An original self-hosted IR that runs on physical infrastructure.
- Linked IR: An IR that references another shared IR. The linked IR is a logical IR and uses the infrastructure of another shared self-hosted IR.
Create a shared self-hosted IR using Azure Data Factory UI
To create a shared self-hosted IR using the Azure Data Factory UI, take the following steps:
In the self-hosted IR to be shared, select Grant permission to another Data factory. Then, on the "Integration runtime setup" page, select the data factory in which you want to create the linked IR.
Copy the "Resource ID" of the self-hosted IR to be shared.
In the data factory to which the permissions were granted, create a new self-hosted IR (linked) and enter the resource ID.
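Before you paste the resource ID into the linked data factory, it can help to check that the copied value is complete. The following is a minimal sketch, not an official tool: Test-IntegrationRuntimeResourceId is a hypothetical helper, and the pattern assumes the usual Azure Resource Manager ID layout for a Data Factory integration runtime.

```powershell
# Hypothetical helper (not part of the Az module): checks that a string has the
# general shape of a Data Factory integration runtime resource ID.
function Test-IntegrationRuntimeResourceId {
    param([Parameter(Mandatory)] [string] $ResourceId)

    # Assumed layout:
    # /subscriptions/<id>/resourceGroups/<rg>/providers/Microsoft.DataFactory/factories/<factory>/integrationRuntimes/<ir>
    $pattern = '^/subscriptions/[^/]+/resourceGroups/[^/]+' +
               '/providers/Microsoft\.DataFactory/factories/[^/]+' +
               '/integrationRuntimes/[^/]+$'

    # -match is case-insensitive by default, which suits Azure resource IDs.
    return $ResourceId -match $pattern
}

# Example call with a placeholder value; replace it with the ID you copied:
# Test-IntegrationRuntimeResourceId -ResourceId '<copied resource ID>'
```

The check only validates the shape of the string. It does not confirm that the integration runtime exists or that permission was granted.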
Create a shared self-hosted IR using Azure PowerShell
To create a shared self-hosted IR using Azure PowerShell, take the following steps:
- Create a data factory.
- Create a self-hosted integration runtime.
- Share the self-hosted integration runtime with other data factories.
- Create a linked integration runtime.
- Revoke the sharing.
Prerequisites
Note
We recommend that you use the Azure Az PowerShell module to interact with Azure. To get started, see Install Azure PowerShell. To learn how to migrate to the Az PowerShell module, see Migrate Azure PowerShell from AzureRM to Az.
Azure subscription. If you don't have an Azure subscription, create a free account before you begin.
Azure PowerShell. Follow the instructions in Install Azure PowerShell on Windows with PowerShellGet. You use PowerShell to run a script to create a self-hosted integration runtime that can be shared with other data factories.
Note
For a list of Azure regions in which Data Factory is currently available, select the regions that interest you on Products available by region.
Create a data factory
Launch the Windows PowerShell Integrated Scripting Environment (ISE).
Create variables. Copy and paste the following script. Replace the placeholder values of the variables, such as $SubscriptionName and $ResourceGroupName, with actual values:
# If input contains a PSH special character, e.g. "$", precede it with the escape character "`" like "`$".
$SubscriptionName = "[Azure subscription name]"
$ResourceGroupName = "[Azure resource group name]"
$DataFactoryLocation = "EastUS"

# Shared Self-hosted integration runtime information. This is a Data Factory compute resource for running any activities
# Data factory name. Must be globally unique
$SharedDataFactoryName = "[Shared Data factory name]"
$SharedIntegrationRuntimeName = "[Shared Integration Runtime Name]"
$SharedIntegrationRuntimeDescription = "[Description for Shared Integration Runtime]"

# Linked integration runtime information. This is a Data Factory compute resource for running any activities
# Data factory name. Must be globally unique
$LinkedDataFactoryName = "[Linked Data factory name]"
$LinkedIntegrationRuntimeName = "[Linked Integration Runtime Name]"
$LinkedIntegrationRuntimeDescription = "[Description for Linked Integration Runtime]"
Sign in and select a subscription. Add the following code to the script to sign in and select your Azure subscription:
Connect-AzAccount
Select-AzSubscription -SubscriptionName $SubscriptionName
Create a resource group and a data factory.
Note
This step is optional. If you already have a data factory, skip this step.
Create an Azure resource group by using the New-AzResourceGroup command. A resource group is a logical container into which Azure resources are deployed and managed as a group. The following example creates a resource group with the name stored in $ResourceGroupName in the location stored in $DataFactoryLocation:
New-AzResourceGroup -Location $DataFactoryLocation -Name $ResourceGroupName
Run the following command to create a data factory:
Set-AzDataFactoryV2 -ResourceGroupName $ResourceGroupName `
    -Location $DataFactoryLocation `
    -Name $SharedDataFactoryName
Create a self-hosted integration runtime
Note
This step is optional. If you already have the self-hosted integration runtime that you want to share with other data factories, skip this step.
Run the following command to create a self-hosted integration runtime:
$SharedIR = Set-AzDataFactoryV2IntegrationRuntime `
-ResourceGroupName $ResourceGroupName `
-DataFactoryName $SharedDataFactoryName `
-Name $SharedIntegrationRuntimeName `
-Type SelfHosted `
-Description $SharedIntegrationRuntimeDescription
Get the integration runtime authentication key and register a node
Run the following command to get the authentication key for the self-hosted integration runtime:
Get-AzDataFactoryV2IntegrationRuntimeKey `
-ResourceGroupName $ResourceGroupName `
-DataFactoryName $SharedDataFactoryName `
-Name $SharedIntegrationRuntimeName
The response contains the authentication key for this self-hosted integration runtime. You use this key when you register the integration runtime node.
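If you want to reuse the key later in the same session, one option is to capture the cmdlet output in a variable rather than only printing it. The sketch below assumes the returned object exposes the keys as AuthKey1 and AuthKey2 properties. The $SharedIRKeys variable name is illustrative only.

```powershell
# Capture the key object so that it can be referenced when registering the node.
$SharedIRKeys = Get-AzDataFactoryV2IntegrationRuntimeKey `
    -ResourceGroupName $ResourceGroupName `
    -DataFactoryName $SharedDataFactoryName `
    -Name $SharedIntegrationRuntimeName

# Assumption: the result object has AuthKey1 and AuthKey2 properties.
# Either key can be pasted into the registration dialog of the installer.
$SharedIRKeys.AuthKey1
```

Treat the key like a credential: avoid writing it to shared logs or committing it to source control.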
Install and register the self-hosted integration runtime
Download the self-hosted integration runtime installer from Azure Data Factory Integration Runtime.
Run the installer to install the self-hosted integration runtime on a local computer.
Register the new self-hosted integration runtime with the authentication key that you retrieved in a previous step.
Share the self-hosted integration runtime with another data factory
Create another data factory
Note
This step is optional. If you already have the data factory that you want to share with, skip this step. However, to add or remove role assignments for another data factory, you must have Microsoft.Authorization/roleAssignments/write and Microsoft.Authorization/roleAssignments/delete permissions, which roles such as User Access Administrator or Owner include.
$factory = Set-AzDataFactoryV2 -ResourceGroupName $ResourceGroupName `
-Location $DataFactoryLocation `
-Name $LinkedDataFactoryName
Grant permission
Grant permission to the data factory that needs to access the self-hosted integration runtime you created and registered.
Important
Do not skip this step!
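# -ObjectId is the managed identity of the data factory with which the IR is shared.
# Keep the comment on its own line: a backtick continues the line only when it is the last character.
New-AzRoleAssignment `
    -ObjectId $factory.Identity.PrincipalId `
    -RoleDefinitionName 'Contributor' `
    -Scope $SharedIR.Id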
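To confirm that the role assignment exists before you create the linked integration runtime, you can list the assignments for the same identity and scope. This is a minimal sketch that assumes $factory and $SharedIR are still defined from the earlier steps. The $assignment variable name is illustrative.

```powershell
# List role assignments for the linked factory's managed identity on the shared IR.
$assignment = Get-AzRoleAssignment `
    -ObjectId $factory.Identity.PrincipalId `
    -Scope $SharedIR.Id

if ($assignment) {
    # Show which roles apply and at which scope they were granted.
    $assignment | Select-Object RoleDefinitionName, Scope
}
else {
    Write-Warning "No role assignment found for the linked data factory on the shared IR."
}
```

A newly created assignment can take a short time to become visible, so an empty result immediately after granting permission is not necessarily an error.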
Create a linked self-hosted integration runtime
Run the following command to create a linked self-hosted integration runtime:
Set-AzDataFactoryV2IntegrationRuntime `
-ResourceGroupName $ResourceGroupName `
-DataFactoryName $LinkedDataFactoryName `
-Name $LinkedIntegrationRuntimeName `
-Type SelfHosted `
-SharedIntegrationRuntimeResourceId $SharedIR.Id `
-Description $LinkedIntegrationRuntimeDescription
Now you can use this linked integration runtime in any linked service. The linked integration runtime uses the shared integration runtime to run activities.
Revoke integration runtime sharing from a data factory
To revoke a data factory's access to the shared integration runtime, run the following command:
Remove-AzRoleAssignment `
-ObjectId $factory.Identity.PrincipalId `
-RoleDefinitionName 'Contributor' `
-Scope $SharedIR.Id
To remove the existing linked integration runtime, run the following command against the shared integration runtime:
Remove-AzDataFactoryV2IntegrationRuntime `
-ResourceGroupName $ResourceGroupName `
-DataFactoryName $SharedDataFactoryName `
-Name $SharedIntegrationRuntimeName `
-LinkedDataFactoryName $LinkedDataFactoryName
Monitoring
Shared IR
Linked IR
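In addition to the monitoring views in the Data Factory UI, you can inspect the state of both the shared and the linked runtime from PowerShell. The following is a minimal sketch, assuming the variables from the earlier script are still defined. It uses the -Status switch of Get-AzDataFactoryV2IntegrationRuntime.

```powershell
# Status of the shared IR, which owns the physical nodes.
Get-AzDataFactoryV2IntegrationRuntime `
    -ResourceGroupName $ResourceGroupName `
    -DataFactoryName $SharedDataFactoryName `
    -Name $SharedIntegrationRuntimeName `
    -Status

# Status of the linked IR, which is a logical reference to the shared IR.
Get-AzDataFactoryV2IntegrationRuntime `
    -ResourceGroupName $ResourceGroupName `
    -DataFactoryName $LinkedDataFactoryName `
    -Name $LinkedIntegrationRuntimeName `
    -Status
```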
Known limitations of self-hosted IR sharing
The data factory in which a linked IR is created must have a managed identity. By default, data factories created in the Azure portal or with PowerShell cmdlets have an implicitly created managed identity. But when a data factory is created through an Azure Resource Manager template or SDK, you must set the Identity property explicitly. This setting ensures that Resource Manager creates a data factory that contains a managed identity.
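One way to check this requirement up front is to look at the Identity property of the factory object before you grant permission. The sketch below is illustrative: Test-FactoryHasManagedIdentity is a hypothetical helper, not an Az cmdlet, and it assumes the factory object has the same Identity.PrincipalId shape that the grant step in this article uses.

```powershell
# Hypothetical helper: returns $true when the factory object carries a managed
# identity with a principal ID, and $false otherwise.
function Test-FactoryHasManagedIdentity {
    param([Parameter(Mandatory)] $Factory)

    return [bool]($Factory.Identity -and $Factory.Identity.PrincipalId)
}

# Example usage against a real factory (requires a signed-in Az session):
# $linkedFactory = Get-AzDataFactoryV2 -ResourceGroupName $ResourceGroupName -Name $LinkedDataFactoryName
# Test-FactoryHasManagedIdentity -Factory $linkedFactory
```

If the check returns $false, the role assignment step has no principal to grant access to, so set the identity on the factory first.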
The Data Factory .NET SDK that supports this feature must be version 1.1.0 or later.
To grant permission, you need the Owner role or the inherited Owner role in the data factory where the shared IR exists.
The sharing feature works only for data factories within the same Microsoft Entra tenant.
For Microsoft Entra ID guest users, the search functionality in the UI, which lists all data factories by using a search keyword, doesn't work. But as long as the guest user is the owner of the data factory, you can share the IR without the search functionality. In the Data Factory UI, enter the managed identity of the data factory that needs to share the IR in the Assign Permission box, and then select Add.
Note
This feature is available only in Data Factory V2.