Managed identity for Azure Data Factory

APPLIES TO: Azure Data Factory Azure Synapse Analytics

Tip

Try out Data Factory in Microsoft Fabric, an all-in-one analytics solution for enterprises. Microsoft Fabric covers everything from data movement to data science, real-time analytics, business intelligence, and reporting. Learn how to start a new trial for free!

This article helps you understand managed identity (formerly known as Managed Service Identity/MSI) and how it works in Azure Data Factory.

Note

We recommend that you use the Azure Az PowerShell module to interact with Azure. To get started, see Install Azure PowerShell. To learn how to migrate to the Az PowerShell module, see Migrate Azure PowerShell from AzureRM to Az.

Overview

Managed identities eliminate the need to manage credentials. Managed identities provide an identity for the service instance when connecting to resources that support Microsoft Entra authentication. For example, the service can use a managed identity to access resources like Azure Key Vault, where data admins can securely store credentials or access storage accounts. The service uses the managed identity to obtain Microsoft Entra tokens.

There are two types of supported managed identities:

  • System-assigned: You can enable a managed identity directly on a service instance. When you allow a system-assigned managed identity during the creation of the service, an identity is created in Microsoft Entra tied to that service instance's lifecycle. By design, only that Azure resource can use this identity to request tokens from Microsoft Entra ID. So when the resource is deleted, Azure automatically deletes the identity for you.
  • User-assigned: You may also create a managed identity as a standalone Azure resource. You can create a user-assigned managed identity and assign it to one or more instances of a data factory. In user-assigned managed identities, the identity is managed separately from the resources that use it.

Note

Trusted bypass cannot utilize user-assigned managed identities. It can only employ system-assigned managed identities for connecting to Azure Storage and Azure Key Vault.

Managed identity provides the below benefits:

  • Store credential in Azure Key Vault, in which case-managed identity is used for Azure Key Vault authentication.
  • Access data stores or computes using managed identity authentication, including Azure Blob storage, Azure Data Explorer, Azure Data Lake Storage Gen1, Azure Data Lake Storage Gen2, Azure SQL Database, Azure SQL Managed Instance, Azure Synapse Analytics, REST, Databricks activity, Web activity, and more. Check the connector and activity articles for details.
  • Managed identity is also used to encrypt/decrypt data and metadata using the customer-managed key stored in Azure Key Vault, providing double encryption.

Required Roles for Managed Identities

To effectively use managed identities in Azure Data Factory, specific roles must be assigned to ensure proper access and functionality. Below are the roles required:

  • System-Assigned Managed Identity

    • Reader Role: This role is necessary to read the metadata of the resources.
    • Contributor Role: This role is required to manage the resources that the managed identity needs to access.
  • User-Assigned Managed Identity

    • Managed Identity Operator Role: This role allows the management of the user-assigned managed identity.
    • Reader Role: This role is necessary to read the metadata of the resources.
    • Contributor Role: This role is required to manage the resources that the managed identity needs to access.

System-assigned managed identity

Note

System-assigned managed identity is also referred to as 'Managed identity' elsewhere in the documentation and in the Data Factory Studio for backward compatibility purpose. We will explicitly mention 'User-assigned managed identity' when referring to it.

Generate system-assigned managed identity

System-assigned managed identity is generated as follows:

  • When creating a data factory through Azure portal or PowerShell, managed identity will always be created automatically.
  • When creating data factory through SDK, managed identity will be created only if you specify "Identity = new FactoryIdentity()" in the factory object for creation." See example in .NET Quickstart - Create data factory.
  • When creating a data factory through REST API, managed identity will be created only if you specify "identity" section in request body. See example in REST quickstart - create data factory.

If you find your service instance doesn't have a managed identity associated following retrieve managed identity instruction, you can explicitly generate one by updating it with identity initiator programmatically:

Note

  • Managed identity cannot be modified. Updating a service instance which already has a managed identity won't have any impact, and the managed identity is kept unchanged.
  • If you update a service instance which already has a managed identity without specifying the "identity" parameter in the factory objects or without specifying "identity" section in REST request body, you will get an error.
  • When you delete a service instance, the associated managed identity will also be deleted.

Generate system-assigned managed identity using PowerShell

Call Set-AzDataFactoryV2 command, then you see "Identity" fields being newly generated:

PS C:\> Set-AzDataFactoryV2 -ResourceGroupName <resourceGroupName> -Name <dataFactoryName> -Location <region>

DataFactoryName   : ADFV2DemoFactory
DataFactoryId     : /subscriptions/<subsID>/resourceGroups/<resourceGroupName>/providers/Microsoft.DataFactory/factories/ADFV2DemoFactory
ResourceGroupName : <resourceGroupName>
Location          : East US
Tags              : {}
Identity          : Microsoft.Azure.Management.DataFactory.Models.FactoryIdentity
ProvisioningState : Succeeded

Generate system-assigned managed identity using REST API

Note

If you attempt to update a service instance that already has a managed identity without either specifying the identity parameter in the factory object or providing an identity section in the REST request body, you will get an error.

Call the API below with the "identity" section in the request body:

PATCH https://management.azure.com/subscriptions/<subsID>/resourceGroups/<resourceGroupName>/providers/Microsoft.DataFactory/factories/<data factory name>?api-version=2018-06-01

Request body: add "identity": { "type": "SystemAssigned" }.

{
    "name": "<dataFactoryName>",
    "location": "<region>",
    "properties": {},
    "identity": {
        "type": "SystemAssigned"
    }
}

Response: managed identity is created automatically, and "identity" section is populated accordingly.

{
    "name": "<dataFactoryName>",
    "tags": {},
    "properties": {
        "provisioningState": "Succeeded",
        "loggingStorageAccountKey": "**********",
        "createTime": "2017-09-26T04:10:01.1135678Z",
        "version": "2018-06-01"
    },
    "identity": {
        "type": "SystemAssigned",
        "principalId": "765ad4ab-XXXX-XXXX-XXXX-51ed985819dc",
        "tenantId": "72f988bf-XXXX-XXXX-XXXX-2d7cd011db47"
    },
    "id": "/subscriptions/<subscriptionId>/resourceGroups/<resourceGroupName>/providers/Microsoft.DataFactory/factories/<dataFactoryName>",
    "type": "Microsoft.DataFactory/factories",
    "location": "<region>"
}

Generate system-assigned managed identity using an Azure Resource Manager template

Template: add "identity": { "type": "SystemAssigned" }.

{
    "contentVersion": "1.0.0.0",
    "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
    "resources": [{
        "name": "<dataFactoryName>",
        "apiVersion": "2018-06-01",
        "type": "Microsoft.DataFactory/factories",
        "location": "<region>",
        "identity": {
            "type": "SystemAssigned"
        }
    }]
}

Generate system-assigned managed identity using SDK

Call the create_or_update function with Identity=new FactoryIdentity(). Sample code using .NET:

Factory dataFactory = new Factory
{
    Location = <region>,
    Identity = new FactoryIdentity()
};
client.Factories.CreateOrUpdate(resourceGroup, dataFactoryName, dataFactory);

Retrieve system-assigned managed identity

You can retrieve the managed identity from Azure portal or programmatically. The following sections show some samples.

Tip

If you don't see the managed identity, generate managed identity by updating your service instance.

Retrieve system-assigned managed identity using Azure portal

You can find the managed identity information from Azure portal -> your data factory -> Properties.

Shows the Azure portal with the system-managed identity object ID and Identity Tenant for an Azure Data Factory.

  • Managed Identity Object ID
  • Managed Identity Tenant

The managed identity information will also show up when you create linked service, which supports managed identity authentication, like Azure Blob, Azure Data Lake Storage, Azure Key Vault, etc.

To grant permissions, follow these steps. For detailed steps, see Assign Azure roles using the Azure portal.

  1. Select Access control (IAM).

  2. Select Add > Add role assignment.

    Screenshot that shows Access control (IAM) page with Add role assignment menu open.

  3. On the Members tab, select Managed identity, and then select Select members.

  4. Select your Azure subscription.

  5. Under System-assigned managed identity, select Data Factory, and then select a data factory. You can also use the object ID or data factory name (as the managed-identity name) to find this identity. To get the managed identity's application ID, use PowerShell.

  6. On the Review + assign tab, select Review + assign to assign the role.

Retrieve system-assigned managed identity using PowerShell

The managed identity principal ID and tenant ID will be returned when you get a specific service instance as follows. Use the PrincipalId to grant access:

PS C:\> (Get-AzDataFactoryV2 -ResourceGroupName <resourceGroupName> -Name <dataFactoryName>).Identity

PrincipalId                          TenantId
-----------                          --------
765ad4ab-XXXX-XXXX-XXXX-51ed985819dc 72f988bf-XXXX-XXXX-XXXX-2d7cd011db47

You can get the application ID by copying above principal ID, then running below Microsoft Entra ID command with principal ID as parameter.

PS C:\> Get-AzADServicePrincipal -ObjectId 765ad4ab-XXXX-XXXX-XXXX-51ed985819dc

ServicePrincipalNames : {76f668b3-XXXX-XXXX-XXXX-1b3348c75e02, https://identity.azure.net/P86P8g6nt1QxfPJx22om8MOooMf/Ag0Qf/nnREppHkU=}
ApplicationId         : 76f668b3-XXXX-XXXX-XXXX-1b3348c75e02
DisplayName           : ADFV2DemoFactory
Id                    : 765ad4ab-XXXX-XXXX-XXXX-51ed985819dc
Type                  : ServicePrincipal

Retrieve managed identity using REST API

The managed identity principal ID and tenant ID will be returned when you get a specific service instance as follows.

Call below API in the request:

GET https://management.azure.com/subscriptions/{subscriptionId}/resourceGroups/{resourceGroupName}/providers/Microsoft.DataFactory/factories/{factoryName}?api-version=2018-06-01

Response: You’ll get response like shown in below example. The "identity" section is populated accordingly.

{
    "name":"<dataFactoryName>",
    "identity":{
        "type":"SystemAssigned",
        "principalId":"554cff9e-XXXX-XXXX-XXXX-90c7d9ff2ead",
        "tenantId":"72f988bf-XXXX-XXXX-XXXX-2d7cd011db47"
    },
    "id":"/subscriptions/<subscriptionId>/resourceGroups/<resourceGroupName>/providers/Microsoft.DataFactory/factories/<dataFactoryName>",
    "type":"Microsoft.DataFactory/factories",
    "properties":{
        "provisioningState":"Succeeded",
        "createTime":"2020-02-12T02:22:50.2384387Z",
        "version":"2018-06-01",
        "factoryStatistics":{
            "totalResourceCount":0,
            "maxAllowedResourceCount":0,
            "factorySizeInGbUnits":0,
            "maxAllowedFactorySizeInGbUnits":0
        }
    },
    "eTag":"\"03006b40-XXXX-XXXX-XXXX-5e43617a0000\"",
    "location":"<region>",
    "tags":{

    }
}

Tip

To retrieve the managed identity from an ARM template, add an outputs section in the ARM JSON:

{
    "outputs":{
        "managedIdentityObjectId":{
            "type":"string",
            "value":"[reference(resourceId('Microsoft.DataFactory/factories', parameters('<dataFactoryName>')), '2018-06-01', 'Full').identity.principalId]"
        }
    }
}

User-assigned managed identity

You can create, delete, manage user-assigned managed identities in Microsoft Entra ID. For more details refer to Create, list, delete, or assign a role to a user-assigned managed identity using the Azure portal.

In order to use a user-assigned managed identity, you must first create credentials in your service instance for the UAMI.

See the following topics that introduce when and how to use managed identity:

See Managed Identities for Azure Resources Overview for more background on managed identities for Azure resources, on which managed identity in Azure Data Factory is based.

See Limitations of managed identities, which also apply to managed identities in Azure Data Factory.