In this article, learn how to connect to data sources located outside of Azure, to make that data available to Azure Machine Learning services. Azure connections serve as key vault proxies, and interactions with connections are direct interactions with an Azure key vault. An Azure Machine Learning connection securely stores username and password data resources, as secrets, in a key vault. The key vault RBAC controls access to these data resources. For this data availability, Azure supports connections to these external sources:
Snowflake DB
Amazon S3
Azure SQL DB
Important
This feature is currently in public preview. This preview version is provided without a service-level agreement, and we don't recommend it for production workloads. Certain features might not be supported or might have constrained capabilities.
An Azure Machine Learning connection securely stores the credentials passed during connection creation in the Workspace Azure Key Vault. A connection references the credentials from the key vault storage location for further use. You don't need to directly deal with the credentials after they are stored in the key vault. You have the option to store the credentials in the YAML file. A CLI command or SDK can override them. We recommend that you avoid credential storage in a YAML file, because a security breach could lead to a credential leak.
Note
For a successful data import, please verify that you installed the latest azure-ai-ml package (version 1.5.0 or later) for SDK, and the ml extension (version 2.15.1 or later).
If you have an older SDK package or CLI extension, please remove the old one and install the new one with the code shown in the tab section. Follow the instructions for SDK and CLI as shown here:
This YAML file creates a Snowflake DB connection. Be sure to update the appropriate values:
# my_snowflakedb_connection.yaml
$schema: http://azureml/sdk-2-0/Connection.json
type: snowflake
name: my-sf-db-connection # add your datastore name here
target: jdbc:snowflake://<myaccount>.snowflakecomputing.com/?db=<mydb>&warehouse=<mywarehouse>&role=<myrole>
# add the Snowflake account, database, warehouse name and role name here. If no role name provided it will default to PUBLIC
credentials:
type: username_password
username: <username> # add the Snowflake database user name here or leave this blank and type in CLI command line
password: <password> # add the Snowflake database password here or leave this blank and type in CLI command line
Create the Azure Machine Learning connection in the CLI:
Option 1: Use the username and password in YAML file
az ml connection create --file my_snowflakedb_connection.yaml
Option 2: Override the username and password at the command line
az ml connection create --file my_snowflakedb_connection.yaml --set credentials.
username="<username>" credentials.
password="<password>"
Option 2: Use WorkspaceConnection() in a Python script
from azure.ai.ml import MLClient
from azure.ai.ml.entities import WorkspaceConnection
from azure.ai.ml.entities import UsernamePasswordConfiguration
# If using username/password, the name/password values should be url-encoded
import urllib.parse
username = urllib.parse.quote(os.environ["SNOWFLAKEDB_USERNAME"], safe="")
password = urllib.parse.quote(os.environ["SNOWFLAKEDB_PASSWORD"], safe="")
target= "jdbc:snowflake://<myaccount>.snowflakecomputing.com/?db=<mydb>&warehouse=<mywarehouse>&role=<myrole>"
# add the Snowflake account, database, warehouse name and role name here. If no role name provided it will default to PUBLIC
name= <my_snowflake_connection> # name of the connection
wps_connection = WorkspaceConnection(name= name,
type="snowflake",
target= target,
credentials= UsernamePasswordConfiguration(username=username, password=password)
)
ml_client.connections.create_or_update(workspace_connection=wps_connection)
Under Assets in the left navigation, select Data. Next, select the Data Connection tab. Then select Create as shown in this screenshot:
In the Create connection pane, fill in the values as shown in the screenshot. Choose Snowflake for the category, and Username password for the Authentication type. Be sure to specify the Target textbox value in this format, filling in your specific values between the < > characters:
Select Save to securely store the credentials in the key vault associated with the relevant workspace. This connection is used when running a data import job.
Create a Snowflake DB connection that uses OAuth
The information in this section describes how to create a Snowflake DB connection that uses OAuth to authenticate.
Important
Before following the steps in this section, you must first Configure Azure to issue OAuth tokens on behalf of the client. This configuration creates a service principal, which is required for the OAuth connection. You need the following information to create the connection:
Client ID: The ID of the service principal
Client Secret: The secret of the service principal
Tenant ID: The ID of the Microsoft Entra ID tenant
This YAML file creates a Snowflake DB connection that uses OAuth. Be sure to update the appropriate values:
# my_snowflakedb_connection.yaml
name: snowflake_service_principal_connection
type: snowflake
# Add the Snowflake account, database, warehouse name, and role name here. If no role name is provided, it will default to PUBLIC.
target: jdbc:snowflake://<myaccount>.snowflakecomputing.com/?db=<mydb>&warehouse=<mywarehouse>&scope=<scopeForServicePrincipal>
credentials:
type: service_principal
client_id: <client-id> # The service principal's client id
client_secret: <client-secret> # The service principal's client secret
tenant_id: <tenant-id> # The Microsoft Entra ID tenant id
Create the Azure Machine Learning connection in the CLI:
az ml connection create --file my_snowflakedb_connection.yaml
You can also override the information in the YAML file at the command line:
az ml connection create --file my_snowflakedb_connection.yaml --set credentials.client_id="my-client-id" credentials.client_secret="my-client-secret" credentials.tenant_id="my-tenant-id"
With the Python SDK, you can create a connection by loading the connection information stored in the YAML file. You can optionally override the values:
You can also directly specify the connection information in a Python script without relying on a YAML file:
from azure.ai.ml import MLClient
from azure.ai.ml.entities import WorkspaceConnection
from azure.ai.ml.entities import ServicePrincipalConfiguration
target= "jdbc:snowflake://<myaccount>.snowflakecomputing.com/?db=<mydb>&warehouse=<mywarehouse>&role=<myrole>"
# add the Snowflake account, database, warehouse name and role name here. If no role name provided it will default to PUBLIC
name= <my_snowflake_connection> # name of the connection
auth = ServicePrincipalConfiguration(client_id="<my-client-id>", client_secret="<my-client-secret>", tenant_id="<my-tenant-id>")
wps_connection = WorkspaceConnection(name= name,
type="snowflake",
target=target,
credentials=auth
)
ml_client.connections.create_or_update(workspace_connection=wps_connection)
Note
Creation of a Snowflake DB connection using a service principal (for OAuth) is only available through the Azure CLI or Python SDK.
This YAML script creates an Azure SQL DB connection. Be sure to update the appropriate values:
# my_sqldb_connection.yaml
$schema: http://azureml/sdk-2-0/Connection.json
type: azure_sql_db
name: my-sqldb-connection
target: Server=tcp:<myservername>,<port>;Database=<mydatabase>;Trusted_Connection=False;Encrypt=True;Connection Timeout=30
# add the sql servername, port addresss and database
credentials:
type: sql_auth
username: <username> # add the sql database user name here or leave this blank and type in CLI command line
password: <password> # add the sql database password here or leave this blank and type in CLI command line
Create the Azure Machine Learning connection in the CLI:
Option 1: Use the username / password from YAML file
az ml connection create --file my_sqldb_connection.yaml
Option 2: Override the username and password in YAML file
az ml connection create --file my_sqldb_connection.yaml --set credentials.
username="<username>" credentials.
password="<password>"
from azure.ai.ml import MLClient
from azure.ai.ml.entities import WorkspaceConnection
from azure.ai.ml.entities import UsernamePasswordConfiguration
# If using username/password, the name/password values should be url-encoded
import urllib.parse
username = urllib.parse.quote(os.environ["MYSQL_USERNAME"], safe="")
password = urllib.parse.quote(os.environ["MYSQL_PASSWORD"], safe="")
target= "Server=tcp:<myservername>,<port>;Database=<mydatabase>;Trusted_Connection=False;Encrypt=True;Connection Timeout=30"
# add the sql servername, port address and database
name= <my_sql_connection> # name of the connection
wps_connection = WorkspaceConnection(name= name,
type="azure_sql_db",
target= target,
credentials= UsernamePasswordConfiguration(username=username, password=password)
)
ml_client.connections.create_or_update(workspace_connection=wps_connection)
Under Assets in the left navigation, select Data. Next, select the Data Connection tab. Then select Create as shown in this screenshot:
In the Create connection pane, fill in the values as shown in the screenshot. Choose AzureSqlDb for the category, and Username password for the Authentication type. Be sure to specify the Target textbox value in this format, filling in your specific values between the < > characters:
Option 2: Use WorkspaceConnection() in a Python script
from azure.ai.ml import MLClient
from azure.ai.ml.entities import WorkspaceConnection
from azure.ai.ml.entities import AccessKeyConfiguration
target=<mybucket> # add the s3 bucket details
name=<my_s3_connection> # name of the connection
wps_connection=WorkspaceConnection(name=name,
type="s3",
target= target,
credentials= AccessKeyConfiguration(access_key_id="XXXJ5kL6mN7oP8qR9sT0uV1wX2yZ3aB4cXXX",acsecret_access_key="C2dE3fH4iJ5kL6mN7oP8qR9sT0uV1w")
)
ml_client.connections.create_or_update(workspace_connection=wps_connection)
Under Assets in the left navigation, select Data. Next, select the Data Connection tab. Then select Create as shown in this screenshot:
In the Create connection pane, fill in the values as shown in the screenshot. Choose S3 for the category, and Username password for the Authentication type. Be sure to specify the Target textbox value in this format, filling in your specific values between the < > characters:
<target>
Non-data connections
You can use these connection types to connect to Git:
Python feed
Azure Container Registry
a connection that uses an API key
These connections aren't data connections, but are used to connect to external services for use in your code.
Create the Azure Machine Learning connection in the CLI:
az ml connection create --file connection.yaml
The following example creates an Azure Container Registry connection. A managed identity authenticates this connection:
from azure.ai.ml.entities import WorkspaceConnection
from azure.ai.ml.entities import UsernamePasswordConfiguration
# If using username/password, the name/password values should be url-encoded
import urllib.parse
username = urllib.parse.quote(os.environ["REGISTRY_USERNAME"], safe="")
password = urllib.parse.quote(os.environ["REGISTRY_PASSWORD"], safe="")
name = "my_acr_conn"
target = "https://iJ5kL6mN7.core.windows.net/mycontainer"
wps_connection = WorkspaceConnection(
name=name,
type="container_registry",
target=target,
credentials=UsernamePasswordConfiguration(username=username, password=password),
)
ml_client.connections.create_or_update(workspace_connection=wps_connection)
You can't create an Azure Container Registry connection in studio.
API key
The following example creates an API key connection:
from azure.ai.ml.entities import WorkspaceConnection
from azure.ai.ml.entities import UsernamePasswordConfiguration, ApiKeyConfiguration
name = "my_api_key"
target = "https://L6mN7oP8q.core.windows.net/mycontainer"
wps_connection = WorkspaceConnection(
name=name,
type="apikey",
target=target,
credentials=ApiKeyConfiguration(key="9sT0uV1wX"),
)
ml_client.connections.create_or_update(workspace_connection=wps_connection)
Generic Container Registry
Using the GenericContainerRegistry workspace connection, you can specify an external registry, such as Nexus or Artifactory, for image builds. Environment images are pushed from the specified registry, and the previous cache is ignored.