In this article, learn how to connect to data sources located outside of Azure, to make that data available to Azure Machine Learning services. Azure connections serve as key vault proxies, and interactions with connections are direct interactions with an Azure key vault. An Azure Machine Learning connection securely stores username and password data resources, as secrets, in a key vault. The key vault RBAC controls access to these data resources. For this data availability, Azure supports connections to these external sources:
Snowflake DB
Amazon S3
Azure SQL DB
Important
This feature is currently in public preview. This preview version is provided without a service-level agreement, and we don't recommend it for production workloads. Certain features might not be supported or might have constrained capabilities.
An Azure Machine Learning connection securely stores the credentials passed during connection creation in the Workspace Azure Key Vault. A connection references the credentials from the key vault storage location for further use. You don't need to directly deal with the credentials after they are stored in the key vault. You have the option to store the credentials in the YAML file. A CLI command or SDK can override them. We recommend that you avoid credential storage in a YAML file, because a security breach could lead to a credential leak.
Note
For a successful data import, verify that you installed the latest azure-ai-ml package (version 1.5.0 or later) for the SDK, and the ml extension (version 2.15.1 or later) for the CLI.
If you have an older SDK package or CLI extension, please remove the old one and install the new one with the code shown in the tab section. Follow the instructions for SDK and CLI as shown here:
This YAML file creates a Snowflake DB connection. Be sure to update the appropriate values:
YAML
# my_snowflakedb_connection.yaml
$schema: http://azureml/sdk-2-0/Connection.json
type: snowflake
name: my-sf-db-connection # add your datastore name here

target: jdbc:snowflake://<myaccount>.snowflakecomputing.com/?db=<mydb>&warehouse=<mywarehouse>&role=<myrole>
# add the Snowflake account, database, warehouse name and role name here. If no role name provided it will default to PUBLIC
credentials:
    type: username_password
    username: <username> # add the Snowflake database user name here or leave this blank and type in CLI command line
    password: <password> # add the Snowflake database password here or leave this blank and type in CLI command line
Create the Azure Machine Learning connection in the CLI:
Option 1: Use the username and password in the YAML file
Azure CLI
az ml connection create --file my_snowflakedb_connection.yaml
Option 2: Override the username and password at the command line
Azure CLI
az ml connection create --file my_snowflakedb_connection.yaml --set credentials.username="<username>" credentials.password="<password>"
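To keep credentials out of the YAML file, the override shown in Option 2 can be assembled from environment variables. The following is a minimal sketch, not part of the Azure tooling: the helper name `build_connection_create_args` is hypothetical, and the environment variable names `SNOWFLAKEDB_USERNAME` and `SNOWFLAKEDB_PASSWORD` are assumptions. It only builds the argument list and does not run the CLI.

```python
import os


def build_connection_create_args(yaml_file, username, password):
    """Return the argv list for `az ml connection create` with credential overrides.

    Hypothetical helper for illustration; it mirrors the Option 2 command.
    """
    return [
        "az", "ml", "connection", "create",
        "--file", yaml_file,
        "--set",
        f"credentials.username={username}",
        f"credentials.password={password}",
    ]


if __name__ == "__main__":
    # Assumed environment variable names; adjust them to your own setup.
    args = build_connection_create_args(
        "my_snowflakedb_connection.yaml",
        os.environ.get("SNOWFLAKEDB_USERNAME", "<username>"),
        os.environ.get("SNOWFLAKEDB_PASSWORD", "<password>"),
    )
    # Print only the non-secret part of the command.
    print(" ".join(args[:7]))
```

The resulting list could be passed to `subprocess.run(args, check=True)`. Passing a list rather than a single string avoids shell quoting problems when a password contains special characters.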
Option 2: Use WorkspaceConnection() in a Python script
Python
import os
import urllib.parse

from azure.ai.ml import MLClient
from azure.ai.ml.entities import WorkspaceConnection
from azure.ai.ml.entities import UsernamePasswordConfiguration

# If using username/password, the name/password values should be url-encoded
username = urllib.parse.quote(os.environ["SNOWFLAKEDB_USERNAME"], safe="")
password = urllib.parse.quote(os.environ["SNOWFLAKEDB_PASSWORD"], safe="")

# Add the Snowflake account, database, warehouse name and role name here.
# If no role name is provided, it defaults to PUBLIC.
target = "jdbc:snowflake://<myaccount>.snowflakecomputing.com/?db=<mydb>&warehouse=<mywarehouse>&role=<myrole>"
name = "<my_snowflake_connection>"  # name of the connection

wps_connection = WorkspaceConnection(
    name=name,
    type="snowflake",
    target=target,
    credentials=UsernamePasswordConfiguration(username=username, password=password),
)

# ml_client is an authenticated MLClient for your workspace
ml_client.connections.create_or_update(workspace_connection=wps_connection)
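The target string and the URL-encoded credentials used in the script can also be produced by small helpers. This is a sketch under stated assumptions: `build_snowflake_target` and `encode_credential` are hypothetical names, not SDK functions, and the account, database, warehouse, and role values are placeholders.

```python
from urllib.parse import quote, urlencode


def build_snowflake_target(account, database, warehouse, role=None):
    """Assemble a Snowflake JDBC target string (hypothetical helper).

    The role parameter is omitted when not given; the role then defaults to PUBLIC.
    """
    params = {"db": database, "warehouse": warehouse}
    if role:
        params["role"] = role
    return f"jdbc:snowflake://{account}.snowflakecomputing.com/?{urlencode(params)}"


def encode_credential(value):
    """Percent-encode every reserved character, including '/' (safe="")."""
    return quote(value, safe="")


if __name__ == "__main__":
    print(build_snowflake_target("myaccount", "mydb", "mywarehouse", "myrole"))
    print(encode_credential("p@ss/word"))
```

Passing `safe=""` matters because `quote` leaves `/` unescaped by default, so a password that contains a slash would otherwise pass through unencoded.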
Under Assets in the left navigation, select Data. Next, select the Data Connection tab. Then select Create as shown in this screenshot:
In the Create connection pane, fill in the values as shown in the screenshot. Choose Snowflake for the category, and Username password for the Authentication type. Be sure to specify the Target textbox value in this format, filling in your specific values between the < > characters:
Select Save to securely store the credentials in the key vault associated with the relevant workspace. This connection is used when running a data import job.
Create a Snowflake DB connection that uses OAuth
The information in this section describes how to create a Snowflake DB connection that uses OAuth to authenticate.
Important
Before following the steps in this section, you must first Configure Azure to issue OAuth tokens on behalf of the client. This configuration creates a service principal, which is required for the OAuth connection. You need the following information to create the connection:
Client ID: The ID of the service principal
Client Secret: The secret of the service principal
Tenant ID: The ID of the Microsoft Entra ID tenant
This YAML file creates a Snowflake DB connection that uses OAuth. Be sure to update the appropriate values:
YAML
# my_snowflakedb_connection.yaml
name: snowflake_service_principal_connection
type: snowflake
# Add the Snowflake account, database, warehouse name, and role name here. If no role name is provided, it will default to PUBLIC.
target: jdbc:snowflake://<myaccount>.snowflakecomputing.com/?db=<mydb>&warehouse=<mywarehouse>&scope=<scopeForServicePrincipal>
credentials:
    type: service_principal
    client_id: <client-id> # The service principal's client id
    client_secret: <client-secret> # The service principal's client secret
    tenant_id: <tenant-id> # The Microsoft Entra ID tenant id
Create the Azure Machine Learning connection in the CLI:
Azure CLI
az ml connection create --file my_snowflakedb_connection.yaml
You can also override the information in the YAML file at the command line:
Azure CLI
az ml connection create --file my_snowflakedb_connection.yaml --set credentials.client_id="my-client-id" credentials.client_secret="my-client-secret" credentials.tenant_id="my-tenant-id"
With the Python SDK, you can create a connection by loading the connection information stored in the YAML file. You can optionally override the values:
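A minimal sketch of the YAML-based approach follows. It assumes that the azure-ai-ml package is installed and that the installed version exposes `load_workspace_connection` (newer releases may offer this loader under a different name, so check your version). The helper name `create_connection_from_yaml` and the `overrides` parameter are hypothetical and used only for illustration.

```python
def create_connection_from_yaml(ml_client, yaml_path, overrides=None):
    """Load a connection from YAML, optionally override credential fields, and create it.

    Sketch only: assumes azure-ai-ml is installed and that
    load_workspace_connection is available in the installed version.
    """
    # Imported lazily so that defining this helper does not require the SDK.
    from azure.ai.ml import load_workspace_connection

    wps_connection = load_workspace_connection(source=yaml_path)
    for field, value in (overrides or {}).items():
        # For example: client_id, client_secret, tenant_id
        setattr(wps_connection.credentials, field, value)
    return ml_client.connections.create_or_update(workspace_connection=wps_connection)
```

For example, `create_connection_from_yaml(ml_client, "./my_snowflakedb_connection.yaml", {"client_secret": "<client-secret>"})` would replace only the client secret and keep the other values from the file, where `ml_client` is an authenticated `MLClient` for your workspace.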
You can also directly specify the connection information in a Python script without relying on a YAML file:
Python
from azure.ai.ml import MLClient
from azure.ai.ml.entities import WorkspaceConnection
from azure.ai.ml.entities import ServicePrincipalConfiguration

# Add the Snowflake account, database, warehouse name and role name here.
# If no role name is provided, it defaults to PUBLIC.
target = "jdbc:snowflake://<myaccount>.snowflakecomputing.com/?db=<mydb>&warehouse=<mywarehouse>&role=<myrole>"
name = "<my_snowflake_connection>"  # name of the connection

auth = ServicePrincipalConfiguration(
    client_id="<my-client-id>",
    client_secret="<my-client-secret>",
    tenant_id="<my-tenant-id>",
)
wps_connection = WorkspaceConnection(
    name=name,
    type="snowflake",
    target=target,
    credentials=auth,
)

# ml_client is an authenticated MLClient for your workspace
ml_client.connections.create_or_update(workspace_connection=wps_connection)
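Rather than writing the client ID, client secret, and tenant ID as literals in a script, you might read them from environment variables and fail early when one is missing. This is a sketch only: `read_required_env` is a hypothetical helper, and the variable names `AZURE_CLIENT_ID`, `AZURE_CLIENT_SECRET`, and `AZURE_TENANT_ID` are assumptions that you can replace with your own.

```python
import os


def read_required_env(names, environ=None):
    """Return {name: value} for each variable, raising if any is missing or empty."""
    environ = os.environ if environ is None else environ
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {name: environ[name] for name in names}


if __name__ == "__main__":
    # A stand-in mapping is used here so the example runs without real secrets.
    sample = {
        "AZURE_CLIENT_ID": "<my-client-id>",
        "AZURE_CLIENT_SECRET": "<my-client-secret>",
        "AZURE_TENANT_ID": "<my-tenant-id>",
    }
    values = read_required_env(list(sample), environ=sample)
    print(sorted(values))
```

The returned values can then be passed to `ServicePrincipalConfiguration` in place of the placeholder strings.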
Note
Creation of a Snowflake DB connection using a service principal (for OAuth) is only available through the Azure CLI or Python SDK.
This YAML file creates an Azure SQL DB connection. Be sure to update the appropriate values:
YAML
# my_sqldb_connection.yaml
$schema: http://azureml/sdk-2-0/Connection.json

type: azure_sql_db
name: my-sqldb-connection

target: Server=tcp:<myservername>,<port>;Database=<mydatabase>;Trusted_Connection=False;Encrypt=True;Connection Timeout=30
# add the sql servername, port address and database
credentials:
    type: sql_auth
    username: <username> # add the sql database user name here or leave this blank and type in CLI command line
    password: <password> # add the sql database password here or leave this blank and type in CLI command line
Create the Azure Machine Learning connection in the CLI:
Option 1: Use the username and password in the YAML file
Azure CLI
az ml connection create --file my_sqldb_connection.yaml
Option 2: Override the username and password at the command line
Azure CLI
az ml connection create --file my_sqldb_connection.yaml --set credentials.username="<username>" credentials.password="<password>"
Python
import os
import urllib.parse

from azure.ai.ml import MLClient
from azure.ai.ml.entities import WorkspaceConnection
from azure.ai.ml.entities import UsernamePasswordConfiguration

# If using username/password, the name/password values should be url-encoded
username = urllib.parse.quote(os.environ["MYSQL_USERNAME"], safe="")
password = urllib.parse.quote(os.environ["MYSQL_PASSWORD"], safe="")

# Add the SQL server name, port address, and database
target = "Server=tcp:<myservername>,<port>;Database=<mydatabase>;Trusted_Connection=False;Encrypt=True;Connection Timeout=30"
name = "<my_sql_connection>"  # name of the connection

wps_connection = WorkspaceConnection(
    name=name,
    type="azure_sql_db",
    target=target,
    credentials=UsernamePasswordConfiguration(username=username, password=password),
)

# ml_client is an authenticated MLClient for your workspace
ml_client.connections.create_or_update(workspace_connection=wps_connection)
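The Azure SQL DB target is a semicolon-separated list of key-value pairs, so it can be assembled from its parts. The helper below is a hypothetical illustration, not an SDK function; the server name, port, and database in the usage example are placeholder values.

```python
def build_sql_target(server, port, database, timeout_seconds=30):
    """Assemble an Azure SQL DB target string in the format shown in this section."""
    parts = [
        f"Server=tcp:{server},{port}",
        f"Database={database}",
        "Trusted_Connection=False",
        "Encrypt=True",
        f"Connection Timeout={timeout_seconds}",
    ]
    return ";".join(parts)


if __name__ == "__main__":
    # Placeholder values; replace them with your own server, port, and database.
    print(build_sql_target("<myservername>", "<port>", "<mydatabase>"))
```

Building the string from a list keeps each setting on its own line, which makes it easier to spot a missing separator or a misspelled key.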
Under Assets in the left navigation, select Data. Next, select the Data Connection tab. Then select Create as shown in this screenshot:
In the Create connection pane, fill in the values as shown in the screenshot. Choose AzureSqlDb for the category, and Username password for the Authentication type. Be sure to specify the Target textbox value in this format, filling in your specific values between the < > characters:
Option 2: Use WorkspaceConnection() in a Python script
Python
from azure.ai.ml import MLClient
from azure.ai.ml.entities import WorkspaceConnection
from azure.ai.ml.entities import AccessKeyConfiguration

target = "<mybucket>"  # add the s3 bucket details
name = "<my_s3_connection>"  # name of the connection

wps_connection = WorkspaceConnection(
    name=name,
    type="s3",
    target=target,
    credentials=AccessKeyConfiguration(
        access_key_id="XXXJ5kL6mN7oP8qR9sT0uV1wX2yZ3aB4cXXX",
        secret_access_key="C2dE3fH4iJ5kL6mN7oP8qR9sT0uV1w",
    ),
)

# ml_client is an authenticated MLClient for your workspace
ml_client.connections.create_or_update(workspace_connection=wps_connection)
Under Assets in the left navigation, select Data. Next, select the Data Connection tab. Then select Create as shown in this screenshot:
In the Create connection pane, fill in the values as shown in the screenshot. Choose S3 for the category, and Username password for the Authentication type. Be sure to specify the Target textbox value in this format, filling in your specific values between the < > characters:
<target>
Non-data connections
You can use these connection types to connect to Git:
Python feed
Azure Container Registry
a connection that uses an API key
These connections aren't data connections, but are used to connect to external services for use in your code.
Create the Azure Machine Learning connection in the CLI:
Azure CLI
az ml connection create --file connection.yaml
The following example creates an Azure Container Registry connection:
Python
import os
import urllib.parse

from azure.ai.ml.entities import WorkspaceConnection
from azure.ai.ml.entities import UsernamePasswordConfiguration

# If using username/password, the name/password values should be url-encoded
username = os.environ["REGISTRY_USERNAME"]
password = os.environ["REGISTRY_PASSWORD"]

name = "my_acr_conn"
target = "https://iJ5kL6mN7.core.windows.net/mycontainer"

wps_connection = WorkspaceConnection(
    name=name,
    type="container_registry",
    target=target,
    credentials=UsernamePasswordConfiguration(username=username, password=password),
)

# ml_client is an authenticated MLClient for your workspace
ml_client.connections.create_or_update(workspace_connection=wps_connection)
You can't create an Azure Container Registry connection in studio.
API key
The following example creates an API key connection:
Python
from azure.ai.ml.entities import WorkspaceConnection
from azure.ai.ml.entities import ApiKeyConfiguration

name = "my_api_key"
target = "https://L6mN7oP8q.core.windows.net/mycontainer"

wps_connection = WorkspaceConnection(
    name=name,
    type="apikey",
    target=target,
    credentials=ApiKeyConfiguration(key="9sT0uV1wX"),
)

# ml_client is an authenticated MLClient for your workspace
ml_client.connections.create_or_update(workspace_connection=wps_connection)
Generic Container Registry
Using the GenericContainerRegistry workspace connection, you can specify an external registry, such as Nexus or Artifactory, for image builds. Environment images are pushed from the specified registry, and the previous cache is ignored.
Create a connection from a YAML file with your credentials:
Azure CLI
az ml connection create --file connection.yaml --credentials username=<username> password=<password> --resource-group my-resource-group --workspace-name my-workspace
Create environment
Azure CLI
az ml environment create --name my-env --version 1 --file my_env.yml --conda-file conda_dep.yml --image mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04 --resource-group my-resource-group --workspace-name my-workspace
You can verify that the environment was successfully created:
Azure CLI
az ml environment show --name my-env --version 1 --resource-group my-resource-group --workspace-name my-workspace
The following example creates a Generic Container Registry connection:
Python
import os

from azure.ai.ml import MLClient
from azure.ai.ml import command
from azure.ai.ml.entities import Environment
from azure.ai.ml.entities import WorkspaceConnection
from azure.ai.ml.entities import UsernamePasswordConfiguration
from azure.identity import DefaultAzureCredential

username = os.environ["REGISTRY_USERNAME"]
password = os.environ["REGISTRY_PASSWORD"]

# Enter details of AML workspace
subscription_id = "<SUBSCRIPTION_ID>"
resource_group = "<RESOURCE_GROUP>"
workspace = "<AML_WORKSPACE_NAME>"

ml_client = MLClient(DefaultAzureCredential(), subscription_id, resource_group, workspace)
credentials = UsernamePasswordConfiguration(username=username, password=password)

# Create GenericContainerRegistry workspace connection for a generic registry
ws_connection = WorkspaceConnection(
    name="<name>",
    target="<target>",
    type="GenericContainerRegistry",
    credentials=credentials,
)
ml_client.connections.create_or_update(ws_connection)

# Create an environment
env_docker_conda = Environment(
    image="<base image>",
    conda_file="<yml file>",
    name="docker-image-plus-conda-example",
    description="Environment created from a Docker image plus Conda environment.",
)
ml_client.environments.create_or_update(env_docker_conda)

# Submit a job that uses the environment
job = command(
    command="echo 'hello world'",
    environment=env_docker_conda,
    display_name="v2-job-example",
)
returned_job = ml_client.create_or_update(job)