Service-to-service authentication with Azure Data Lake Storage Gen1 using Python
In this article, you learn how to use the Python SDK to do service-to-service authentication with Azure Data Lake Storage Gen1. For end-user authentication with Data Lake Storage Gen1 using Python, see End-user authentication with Data Lake Storage Gen1 using Python.
Python. You can download Python from here. This article uses Python 3.6.2.
An Azure subscription. See Get Azure free trial.
Create a Microsoft Entra ID "Web" Application. You must have completed the steps in Service-to-service authentication with Data Lake Storage Gen1 using Microsoft Entra ID.
To work with Data Lake Storage Gen1 using Python, you need to install three modules.
- The azure-mgmt-resource module, which includes Azure modules for Active Directory, etc.
- The azure-mgmt-datalake-store module, which includes the Data Lake Storage Gen1 account management operations. For more information on this module, see Azure Data Lake Storage Gen1 Management module reference.
- The azure-datalake-store module, which includes the Data Lake Storage Gen1 filesystem operations. For more information on this module, see azure-datalake-store Filesystem module reference.
Use the following commands to install the modules.
pip install azure-mgmt-resource
pip install azure-mgmt-datalake-store
pip install azure-datalake-store
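Before moving on, it can help to confirm that the three packages are importable from the interpreter you plan to use. The following is a minimal sketch, not part of the SDK: the helper name missing_modules is hypothetical, and the import names are taken from the import statements used later in this article.

```python
import importlib.util

# Import names that correspond to the three pip packages installed above.
REQUIRED_MODULES = (
    "azure.mgmt.resource",
    "azure.mgmt.datalake.store",
    "azure.datalake.store",
)


def missing_modules(names=REQUIRED_MODULES):
    """Return the import names from `names` that cannot be located."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec raises when a parent package (for example "azure")
            # is itself absent, so treat that as "not installed".
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    absent = missing_modules()
    if absent:
        print("Not installed:", ", ".join(absent))
    else:
        print("All required modules are available.")
```

If any name is reported as not installed, rerun the matching pip install command in the same environment.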
In the IDE of your choice, create a new Python application, for example, mysample.py.
Add the following snippet to import the required modules:
## Use this for Azure AD authentication
from msrestazure.azure_active_directory import AADTokenCredentials

## Required for Data Lake Storage Gen1 account management
from azure.mgmt.datalake.store import DataLakeStoreAccountManagementClient
from azure.mgmt.datalake.store.models import DataLakeStoreAccount

## Required for Data Lake Storage Gen1 filesystem management
from azure.datalake.store import core, lib, multithread

# Common Azure imports
import adal
from azure.mgmt.resource.resources import ResourceManagementClient
from azure.mgmt.resource.resources.models import ResourceGroup

## Use these as needed for your application
import logging, getpass, pprint, uuid, time
Save changes to mysample.py.
Use this snippet to authenticate with Microsoft Entra ID for account management operations on Data Lake Storage Gen1, such as creating or deleting a Data Lake Storage Gen1 account. The snippet authenticates your application non-interactively, using the client secret of an application (service principal) from an existing Microsoft Entra ID "Web App" application.
authority_host_uri = 'https://login.microsoftonline.com'
tenant = '<TENANT>'
authority_uri = authority_host_uri + '/' + tenant
RESOURCE = 'https://management.core.windows.net/'
client_id = '<CLIENT_ID>'
client_secret = '<CLIENT_SECRET>'
context = adal.AuthenticationContext(authority_uri, api_version=None)
mgmt_token = context.acquire_token_with_client_credentials(RESOURCE, client_id, client_secret)
armCreds = AADTokenCredentials(mgmt_token, client_id, resource=RESOURCE)
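Hard-coding the client secret in mysample.py is convenient for a first run, but you may prefer to read it from the environment. The following is a minimal sketch under stated assumptions: the environment variable names (ADLS_TENANT, ADLS_CLIENT_ID, ADLS_CLIENT_SECRET) and the helper names are hypothetical, while the adal and AADTokenCredentials calls mirror the snippet above.

```python
import os
from dataclasses import dataclass

AUTHORITY_HOST_URI = "https://login.microsoftonline.com"
MGMT_RESOURCE = "https://management.core.windows.net/"


@dataclass(frozen=True)
class ServicePrincipal:
    tenant: str
    client_id: str
    client_secret: str

    @property
    def authority_uri(self):
        # Same construction as in the article: authority host + "/" + tenant.
        return AUTHORITY_HOST_URI + "/" + self.tenant


def load_service_principal(env=os.environ):
    """Read service principal settings from a mapping such as os.environ."""
    # Hypothetical variable names; pick whatever fits your deployment.
    names = ("ADLS_TENANT", "ADLS_CLIENT_ID", "ADLS_CLIENT_SECRET")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return ServicePrincipal(*(env[name] for name in names))


def management_credentials(sp):
    """Acquire account management credentials, as in the snippet above."""
    # Imported lazily so the settings loader works without the Azure packages.
    import adal
    from msrestazure.azure_active_directory import AADTokenCredentials

    context = adal.AuthenticationContext(sp.authority_uri, api_version=None)
    token = context.acquire_token_with_client_credentials(
        MGMT_RESOURCE, sp.client_id, sp.client_secret
    )
    return AADTokenCredentials(token, sp.client_id, resource=MGMT_RESOURCE)
```

With the variables set, armCreds = management_credentials(load_service_principal()) yields the same kind of object as the snippet above, without the secret appearing in source control.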
Use the following snippet to authenticate with Microsoft Entra ID for filesystem operations on Data Lake Storage Gen1, such as creating a folder or uploading a file. The snippet authenticates your application non-interactively, using the client secret of an application (service principal). Use it with an existing Microsoft Entra ID "Web App" application.
tenant = '<TENANT>'
RESOURCE = 'https://datalake.azure.net/'
client_id = '<CLIENT_ID>'
client_secret = '<CLIENT_SECRET>'
adlCreds = lib.auth(tenant_id=tenant,
                    client_secret=client_secret,
                    client_id=client_id,
                    resource=RESOURCE)
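A common slip with snippets like the one above is running them with a placeholder such as '<TENANT>' still in place, which only surfaces later as an authentication error. The following is a minimal sketch of a guard around the same lib.auth call. The helper names unfilled_placeholders and filesystem_credentials are hypothetical, and the placeholder pattern assumes the angle-bracket style used in this article.

```python
import re

DATALAKE_RESOURCE = "https://datalake.azure.net/"

# Matches values such as "<TENANT>" or "<CLIENT_SECRET>" left unreplaced.
_PLACEHOLDER = re.compile(r"^<[A-Z_]+>$")


def unfilled_placeholders(**settings):
    """Return the sorted names of settings that are empty or still placeholders."""
    return sorted(
        name
        for name, value in settings.items()
        if not value or _PLACEHOLDER.match(value)
    )


def filesystem_credentials(tenant, client_id, client_secret):
    """Validate the inputs, then call lib.auth as in the snippet above."""
    unfilled = unfilled_placeholders(
        tenant=tenant, client_id=client_id, client_secret=client_secret
    )
    if unfilled:
        raise ValueError("replace placeholder values for: " + ", ".join(unfilled))

    # Imported lazily so the validation can run without the Azure packages.
    from azure.datalake.store import lib

    return lib.auth(
        tenant_id=tenant,
        client_secret=client_secret,
        client_id=client_id,
        resource=DATALAKE_RESOURCE,
    )
```

Calling filesystem_credentials('<TENANT>', '<CLIENT_ID>', '<CLIENT_SECRET>') raises ValueError before any network request is made. With real values it returns the same object as adlCreds above.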
In this article, you learned how to use service-to-service authentication to authenticate with Data Lake Storage Gen1 using Python. From here, you can use Python to work with Data Lake Storage Gen1 account management and filesystem operations.