Create datastores

APPLIES TO: Azure CLI ml extension v2 (current) Python SDK azure-ai-ml v2 (current)

In this article, learn how to connect to Azure data storage services with Azure Machine Learning datastores.

Prerequisites

Note

Azure Machine Learning datastores do not create the underlying storage account resources. Instead, they link an existing storage account for use in Azure Machine Learning. Datastores are not required to access your data: if you have access to the underlying data, you can use storage URIs directly.
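As an illustration of the storage URIs mentioned in the note, the following minimal sketch builds direct URIs for Blob storage and Data Lake Gen2 from an account name and a container or filesystem name. The helper names and the sample values are hypothetical and not part of the azure-ai-ml SDK. The sketch assumes the public Azure cloud endpoints (core.windows.net).

```python
# Hypothetical helpers (not part of azure-ai-ml) that format direct storage URIs.
# Assumption: public Azure cloud endpoints; sovereign clouds use other suffixes.


def blob_uri(account_name: str, container_name: str, path: str = "") -> str:
    """Return a wasbs:// URI for a path inside an Azure Blob container."""
    return (
        f"wasbs://{container_name}@{account_name}.blob.core.windows.net/"
        f"{path.lstrip('/')}"
    )


def adls_gen2_uri(account_name: str, filesystem: str, path: str = "") -> str:
    """Return an abfss:// URI for a path inside a Data Lake Gen2 filesystem."""
    return (
        f"abfss://{filesystem}@{account_name}.dfs.core.windows.net/"
        f"{path.lstrip('/')}"
    )


if __name__ == "__main__":
    # Sample values are placeholders for illustration only.
    print(blob_uri("mystorageaccount", "my-container", "data/train.csv"))
    print(adls_gen2_uri("mystorageaccount", "my-filesystem", "data/train.csv"))
```

A URI built this way points at the storage service itself, so access depends on the permissions you hold on the underlying storage account rather than on a datastore definition.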

Create an Azure Blob datastore

from azure.ai.ml.entities import AzureBlobDatastore
from azure.ai.ml import MLClient

ml_client = MLClient.from_config()

store = AzureBlobDatastore(
    name="",
    description="",
    account_name="",
    container_name=""
)

ml_client.create_or_update(store)
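Once a datastore like the one above exists, data in it is typically referenced through a datastore URI of the form azureml://datastores/<datastore_name>/paths/<path>. The following is a small sketch of how such a URI could be assembled. The function name and the sample datastore name are hypothetical.

```python
# Hypothetical helper (not part of azure-ai-ml) that formats a datastore URI.


def datastore_uri(datastore_name: str, path: str) -> str:
    """Return an azureml:// URI for a path inside a registered datastore."""
    if not datastore_name:
        raise ValueError("datastore_name must not be empty")
    # Strip a leading slash so the result never contains "paths//".
    return f"azureml://datastores/{datastore_name}/paths/{path.lstrip('/')}"


if __name__ == "__main__":
    # "blob_example" is a placeholder datastore name for illustration.
    print(datastore_uri("blob_example", "data/train.csv"))
```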

Create an Azure Data Lake Gen2 datastore

from azure.ai.ml.entities import AzureDataLakeGen2Datastore
from azure.ai.ml import MLClient

ml_client = MLClient.from_config()

store = AzureDataLakeGen2Datastore(
    name="",
    description="",
    account_name="",
    filesystem=""
)

ml_client.create_or_update(store)

Create an Azure Files datastore

from azure.ai.ml.entities import AzureFileDatastore
from azure.ai.ml.entities import AccountKeyConfiguration
from azure.ai.ml import MLClient

ml_client = MLClient.from_config()

store = AzureFileDatastore(
    name="file_example",
    description="Datastore pointing to an Azure File Share.",
    account_name="mytestfilestore",
    file_share_name="my-share",
    credentials=AccountKeyConfiguration(
        account_key="XXXxxxXXXxXXXXxxXXXXXxXXXXXxXxxXxXXXxXXXxXXxxxXXxxXXXxXxXXXxxXxxXXXXxxxxxXXxxxxxxXXXxXXX"
    ),
)

ml_client.create_or_update(store)
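The example above places the account key directly in the source code. One possible alternative, sketched here, is to read the key from an environment variable at run time so that it stays out of scripts and version control. The variable name AZURE_STORAGE_ACCOUNT_KEY and the helper read_account_key are assumptions chosen for illustration. They are not defined by the SDK.

```python
import os


def read_account_key(env_var: str = "AZURE_STORAGE_ACCOUNT_KEY") -> str:
    """Return the storage account key stored in the given environment variable.

    The default variable name is an assumption for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to the storage account key."
        )
    return key


# Possible use with the example above (requires azure-ai-ml):
#     credentials=AccountKeyConfiguration(account_key=read_account_key())
```

Failing early with a clear message when the variable is missing avoids submitting a datastore definition with an empty credential.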

Create an Azure Data Lake Gen1 datastore

from azure.ai.ml.entities import AzureDataLakeGen1Datastore
from azure.ai.ml import MLClient

ml_client = MLClient.from_config()

store = AzureDataLakeGen1Datastore(
    name="",
    store_name="",
    description="",
)

ml_client.create_or_update(store)

Create a OneLake (Microsoft Fabric) datastore (preview)

This section describes the options for creating a OneLake datastore. The OneLake datastore is part of Microsoft Fabric. At this time, Azure Machine Learning supports connecting to Microsoft Fabric Lakehouse artifacts that include folders, files, and Amazon S3 shortcuts. For more information about Lakehouse, see What is a lakehouse in Microsoft Fabric.

To create a OneLake datastore, you need the following information from your Microsoft Fabric instance:

  • Endpoint
  • Fabric workspace name or GUID
  • Artifact name or GUID

The following three screenshots show where to find these values in your Microsoft Fabric instance:

OneLake workspace name

In your Microsoft Fabric instance, you can find the workspace information as shown in this screenshot. You can use either a GUID value, or a "friendly name" to create an Azure Machine Learning OneLake datastore.

Screenshot that shows Fabric Workspace details in Microsoft Fabric UI.

OneLake endpoint

In your Microsoft Fabric instance, you can find the endpoint information as shown in this screenshot:

Screenshot that shows Fabric endpoint details in Microsoft Fabric UI.

OneLake artifact name

In your Microsoft Fabric instance, you can find the artifact information as shown in this screenshot. You can use either a GUID value, or a "friendly name" to create an Azure Machine Learning OneLake datastore.

Screenshot showing how to get Fabric LH artifact details in Microsoft Fabric UI.
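Before creating the datastore, it can help to collect the three values described above in one place and check them. The following sketch, which uses only the Python standard library, rejects empty values and reports whether the workspace and artifact identifiers look like GUIDs or friendly names. The class and function names are hypothetical. The sample values are the ones used in this article.

```python
import uuid
from dataclasses import dataclass


def is_guid(value: str) -> bool:
    """Return True if the value parses as a GUID, False otherwise."""
    try:
        uuid.UUID(value)
    except ValueError:
        return False
    return True


@dataclass(frozen=True)
class OneLakeInputs:
    """Hypothetical container for the values copied from Microsoft Fabric."""

    endpoint: str
    workspace: str
    artifact: str

    def __post_init__(self) -> None:
        # Reject missing values before they reach the datastore definition.
        for field_name in ("endpoint", "workspace", "artifact"):
            if not getattr(self, field_name).strip():
                raise ValueError(f"{field_name} must not be empty")

    def identifier_kinds(self) -> dict:
        """Report whether each identifier is a GUID or a friendly name."""
        return {
            "workspace": "guid" if is_guid(self.workspace) else "name",
            "artifact": "guid" if is_guid(self.artifact) else "name",
        }


if __name__ == "__main__":
    inputs = OneLakeInputs(
        endpoint="msit-onelake.dfs.fabric.microsoft.com",
        workspace="AzureML_Sample_OneLakeWS",
        artifact="AzML_Sample_LH",
    )
    print(inputs.identifier_kinds())
```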

Create a OneLake datastore

from azure.ai.ml.entities import OneLakeDatastore, OneLakeArtifact
from azure.ai.ml import MLClient

ml_client = MLClient.from_config()

store = OneLakeDatastore(
    name="onelake_example_id",
    description="Datastore pointing to a Microsoft Fabric artifact.",
    one_lake_workspace_name="AzureML_Sample_OneLakeWS",
    endpoint="msit-onelake.dfs.fabric.microsoft.com",
    artifact=OneLakeArtifact(
        name="AzML_Sample_LH",
        type="lake_house"
    )
)

ml_client.create_or_update(store)

Next steps