Редактиране

Споделяне чрез


Set up your Azure Data Lake Storage Gen2 connection

This article outlines the steps to create an Azure Date Lake Storage Gen2 connection.

Supported authentication types

The Azure Date Lake Storage Gen2 connector supports the following authentication types for copy and Dataflow Gen2 respectively.

Authentication type Copy Dataflow Gen2
Account key
Organizational account
Service Principal
Shared Access Signature (SAS)
Workspace Identity

Set up your connection in Dataflow Gen2

Data Factory in Microsoft Fabric uses Power Query connectors to connect Dataflow Gen2 to Azure Data Lake Storage Gen2. The following links provide the specific Power Query connector information you need to connect to Azure Data Lake Storage Gen2 in Dataflow Gen2:

In some cases, the Power Query connector article might include advanced options, troubleshooting, known issues and limitations, and other information that could also prove useful.

Set up your connection in a data pipeline

To create a connection in a data pipeline:

  1. From the page header in Data Integration service, select Settings Settings gear icon > Manage connections and gateways

    Screenshot showing how to open manage gateway.

  2. Select New at the top of the ribbon to add a new data source.

    Screenshot showing the new page.

    The New connection pane shows up on the left side of the page.

    Screenshot showing the New connection pane.

Set up your connection in any Fabric item

  1. In any Fabric item, select the Azure Data Lake Storage Gen2 option in the Get Data selection, and then select Connect.

    Screenshot showing the Connect to data source page of a Fabric item for Azure Data Lake Storage Gen2 of a Fabric item, with the URL entered.

  2. In Connect to data source, enter the URL to your Azure Data Lake Storage Gen2 account. Refer to Limitations to determine the URL to use.

  3. Select whether you want to use the file system view or the Common Data Model folder view.

  4. If needed, select the on-premises data gateway in Data gateway (only supported in Dataflows, Dataflows Gen2, Datamarts, and Semantic Models).

  5. Select Sign in to sign into the Azure Data Lake Storage Gen2 account. You'll be redirected to your organization's sign-in page. Follow the prompts to sign in to the account.

  6. After you've successfully signed in, select Next.

Set up connections for trusted workspace access

  1. Configure a workspace identity in the workspace where the connection will be used. For more information, see Workspace identity.

  2. Grant the workspace identity, organizational account, or service principal access to the storage account. For more information, see Create a OneLake shortcut to storage account with trusted workspace access

  3. Configure a resource instance rule. For more information, see Resource instance rule.

  4. Follow steps from Set up your connection to create the connection.

Considerations and limitations

  • Workspace identity, Organizational Account, and Service Principal are the only supported authentication types.
  • Connections for trusted workspace access only work in OneLake shortcuts and data pipelines.
  • Connections for trusted workspace access can't be created from the Manage Gateways and connections experience.
  • Existing connections that work for trusted workspace access can't be modified in the Manage Gateways and connections experience.
  • Connections to firewall-enabled Storage accounts will have the status Offline in Manage connections and gateways.
  • Checking the status of a connection with workspace identity as the authentication method is not supported.

Set up your connection

Step 1: Specify the new connection name, type, server and full path

Screenshot showing how to set a new connection.

In the New connection pane, choose Cloud, and specify the following fields:

  • Connection name: Specify a name for your connection.
  • Connection type: Select a type for your connection.
  • Server: Enter your Azure Data Lake Storage Gen2 server name. For example, https://contosoadlscdm.dfs.core.windows.net. Specify your Azure Data Lake Storage Gen2 server name. Go to your Azure Data Lake Storage Gen2 account interface, browse to the Endpoints section, and get your Azure Data Lake Storage Gen2.
  • Full path: Enter the full path to your Azure Data Lake Storage Gen2 container name.

Step 2: Select and set your authentication

Under Authentication method, select your authentication from the drop-down list and complete the related configuration. The Azure Data Lake Storage Gen2 connector supports the following authentication types:

Screenshot showing the authentication method for Azure Data Lake Storage Gen2.

Key authentication

Account key: Specify your Azure Data Lake Storage Gen2 account key. Go to your Azure Data Lake Storage Gen2 account interface, browse to the Access key section, and get your account key.

Screenshot showing that key authentication method for Azure Data Lake Storage Gen2.

OAuth2 authentication

Screenshot showing that OAuth2 authentication method for Azure Data Lake Storage Gen2.

Open Edit credentials. The sign-in interface opens. Enter your account and password to sign in to your account. After signing in, you'll come back to the New connection page.

Shared access signature authentication

Screenshot showing that shared access signature authentication method for Azure Data Lake Storage Gen2.

SAS token: Specify the shared access signature token for your Azure Data Lake Storage Gen2 container.

If you don’t have a SAS token, switch to Shared access signature in your Azure Data Lake Storage Gen2 account interface. Under Allowed resource types, select Container, and then select Generate SAS and connection string. You can get your SAS token from the generated content that appears. The shared access signature is a URI that encompasses in its query parameters all the information necessary for authenticated access to a storage resource. To access storage resources with the shared access signature, the client only needs to pass in the shared access signature to the appropriate constructor or method. For more information about shared access signatures, go to Shared access signatures: Understand the shared access signature model.

Service principal authentication

Screenshot showing that service principal authentication method for Azure Data Lake Storage Gen2.

  • Tenant Id: Specify the tenant information (domain name or tenant ID) under which your application resides. Retrieve it by hovering over the upper-right corner of the Azure portal.
  • Service principal ID: Specify the application (client) ID.
  • Service principal key: Specify your application's key.

To use service principal authentication, follow these steps:

  1. Register an application entity in Microsoft Entra ID by following Register your application with a Microsoft Entra tenant. Make note of these values, which you use to define the connection:

    • Tenant ID
    • Application ID
    • Application key
  2. Grant the service principal proper permission. For examples of how permission works in Azure Data Lake Storage Gen2, go to Access control lists on files and directories.

    • As source, in Storage Explorer, grant at least Execute permission for all upstream folders and the file system, along with Read permission for the files to copy. Alternatively, in Access control (IAM), grant at least the Storage Blob Data Reader role.
    • As destination, in Storage Explorer, grant at least Execute permission for all upstream folders and the file system, along with Write permission for the destination folder. Alternatively, in Access control (IAM), grant at least the Storage Blob Data Contributor role.

    Note

    If you use a UI to author and the service principal isn't set with the "Storage Blob Data Reader/Contributor" role in IAM, when doing a test connection or browsing/navigating folders, choose Test connection to file path or Browse from specified path, and then specify a path with Read + Execute permission to continue.

Workspace identity authentication

Workspace identity: Select workspace identity from the authentication method dropdown. A Fabric workspace identity is an automatically managed service principal that can be associated with a Fabric workspace. Fabric workspaces with a workspace identity can securely read or write to Azure Data Lake Storage Gen2 accounts through OneLake shortcuts and data pipelines. When selecting this option in the connector, make sure that the workspace has a workspace identity and that the identity has the ability to read or write to the intended Azure Data Lake Storage Gen2 account. For more information, see Workspace identity

Note

Connections with workspace identity will have the status Offline in Manage connections and gateways. Checking the status of a connection with workspace identity isn't supported.

Step 3: Specify the privacy level you want to apply

In the General tab, select the privacy level that you want apply in the Privacy level drop-down list. Three privacy levels are supported. For more information, go to General.

Screenshot showing that Privacy Level of data lake gen2

Step 4: Create your connection

Select Create. Your creation is successfully tested and saved if all the credentials are correct. If not correct, the creation fails with errors.

Screenshot showing connection page.

Table summary

The connector properties in the following table are supported in pipeline copy.

Name Description Required Property Copy
Connection name A name for your connection. Yes
Connection type Select a type for your connection. Yes
Server Enter the name of Azure Data Lake Storage Gen2 server, for example, https://contosoadlscdm.dfs.core.windows.net. Yes
Full path Enter the full path of your Azure Data Lake Storage Gen2 container name. Yes
Authentication Go to Authentication. Yes Go to Authentication.
Privacy Level The privacy level that you want to apply. Allowed values are Organizational, Privacy, and Public. Yes

Authentication

The properties in the following table are the supported authentication types.

Name Description Required Property Copy
Key
- Account key The Azure Data Lake Storage Gen2 account key. Yes
Shared Access Signature (SAS)
- SAS token Specify the shared access signature token for your Azure Data Lake Storage Gen2 container. Yes
Service Principal
- Tenant ID The tenant information (domain name or tenant ID). Yes
- Service Principal ID The application's client ID. Yes
- Service Principal key The application's key. Yes
Workspace identity