AzureDataLakeGen2Datastore Class
Represents a datastore that saves connection information to Azure Data Lake Storage Gen2.
To create a datastore that saves connection information to Azure Data Lake Storage, use the register_azure_data_lake_gen2 method of the Datastore class.
To access data from an AzureDataLakeGen2Datastore object, create a Dataset and use one of the methods like from_files for a FileDataset. For more information, see Create Azure Machine Learning datasets.
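The access pattern described above can be sketched as follows. This is a minimal illustration, not the only supported approach: glob_path and load_file_dataset are hypothetical helper names, the folder layout is assumed, and the sketch assumes the datastore is already registered in the workspace and that azureml-core is installed.

```python
def glob_path(datastore, folder, pattern="**"):
    """Build a (datastore, relative_path) tuple of the kind from_files accepts.

    Hypothetical helper: it only normalizes slashes and appends a glob pattern.
    """
    folder = folder.strip("/")
    relative_path = f"{folder}/{pattern}" if folder else pattern
    return (datastore, relative_path)


def load_file_dataset(workspace, datastore_name, folder):
    """Create a FileDataset that points at every file under `folder`."""
    # Imported lazily so glob_path stays usable without azureml-core installed.
    from azureml.core import Dataset, Datastore

    # Look up the already-registered datastore by name.
    datastore = Datastore.get(workspace, datastore_name)
    # Reference the files in place; no data is copied at this point.
    return Dataset.File.from_files(path=glob_path(datastore, folder))
```

A call such as load_file_dataset(ws, "my_adls_gen2", "raw/images") would return a FileDataset that can then be mounted or downloaded, subject to the permissions of the credentials registered with the datastore. Both names in that call are placeholders.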
Also keep in mind:
The AzureDataLakeGen2Datastore class does not provide an upload method. The recommended way to upload data to an AzureDataLakeGen2 datastore is through Dataset upload. For details, see https://docs.microsoft.com/azure/machine-learning/how-to-create-register-datasets
When using a datastore to access data, you must have permission to access the data, which depends on the credentials registered with the datastore.
When using service principal authentication to access storage via AzureDataLakeGen2, the service principal or app registration must be assigned, at minimum, the "Storage Blob Data Reader" role-based access control (RBAC) role. For more information, see Storage built-in roles.
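As a rough sketch of registering a datastore with service principal credentials, the example below reads the credentials from environment variables and passes them to Datastore.register_azure_data_lake_gen2. The environment variable names (ADLS_TENANT_ID and so on) and the helper names are assumptions made for illustration. The service principal is assumed to already hold the required RBAC role on the storage account.

```python
import os

# Hypothetical environment variable names; adjust to your own conventions.
REQUIRED_VARS = {
    "tenant_id": "ADLS_TENANT_ID",
    "client_id": "ADLS_CLIENT_ID",
    "client_secret": "ADLS_CLIENT_SECRET",
}


def service_principal_from_env(env=None):
    """Collect service principal credentials, failing early if any are missing."""
    env = os.environ if env is None else env
    missing = [var for var in REQUIRED_VARS.values() if not env.get(var)]
    if missing:
        raise ValueError("Missing environment variables: " + ", ".join(missing))
    return {arg: env[var] for arg, var in REQUIRED_VARS.items()}


def register_adls_gen2(workspace, datastore_name, filesystem, account_name, env=None):
    """Register an Azure Data Lake Storage Gen2 file system as a datastore."""
    # Imported lazily so the credential helper can be used without azureml-core.
    from azureml.core import Datastore

    return Datastore.register_azure_data_lake_gen2(
        workspace=workspace,
        datastore_name=datastore_name,
        filesystem=filesystem,      # name of the Data Lake Gen2 file system
        account_name=account_name,  # storage account name
        **service_principal_from_env(env),
    )
```

Keeping the secret out of source code and validating it before the registration call makes a misconfigured environment fail with a clear message rather than with an authentication error later on.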
Initialize a new Azure Data Lake Gen2 Datastore.
- Inheritance
-
AzureDataLakeGen2Datastore
Constructor
AzureDataLakeGen2Datastore(workspace, name, container_name, account_name, tenant_id=None, client_id=None, client_secret=None, resource_url=None, authority_url=None, protocol=None, endpoint=None, service_data_access_auth_identity=None)
Parameters
- resource_url
- str
The resource url, which determines what operations will be performed on the Data Lake Store.
- protocol
- str
The protocol to use to connect to the blob container. If None, defaults to https.
- endpoint
- str
The endpoint of the blob container. If None, defaults to core.windows.net.
- service_data_access_auth_identity
- str or <xref:_restclient.models.ServiceDataAccessAuthIdentity>
Indicates which identity to use to authenticate service data access to the customer's storage. Possible values include: 'None', 'WorkspaceSystemAssignedIdentity', 'WorkspaceUserAssignedIdentity'.
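The documented defaults and allowed values for these optional parameters can be expressed as a small validation helper. This is only a sketch of the behavior described above, not the SDK's internal implementation, and resolve_optional_arguments is a hypothetical name.

```python
# Values listed in the parameter description above.
ALLOWED_IDENTITIES = (
    "None",
    "WorkspaceSystemAssignedIdentity",
    "WorkspaceUserAssignedIdentity",
)


def resolve_optional_arguments(protocol=None, endpoint=None,
                               service_data_access_auth_identity=None):
    """Apply the documented defaults and check the identity value."""
    identity = service_data_access_auth_identity
    if identity is not None and identity not in ALLOWED_IDENTITIES:
        raise ValueError(f"Unsupported identity: {identity!r}")
    return {
        "protocol": protocol or "https",            # documented default
        "endpoint": endpoint or "core.windows.net", # documented default
        "service_data_access_auth_identity": identity,
    }
```

Passing an explicit endpoint is mainly relevant when the storage account does not live under the default core.windows.net suffix.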