CLI (v2) data YAML schema

APPLIES TO: Azure CLI ml extension v2 (current)

You can find the source JSON schema at https://azuremlschemas.azureedge.net/latest/data.schema.json.

Note

The YAML syntax detailed in this document is based on the JSON schema for the latest version of the ML CLI v2 extension. This syntax is guaranteed to work only with the latest version of the ML CLI v2 extension. You can find the schemas for older extension versions at https://azuremlschemasprod.azureedge.net/.

YAML syntax

| Key | Type | Description | Allowed values | Default value |
| --- | --- | --- | --- | --- |
| $schema | string | The YAML schema. If you use the Azure Machine Learning Visual Studio Code extension to author the YAML file, include $schema at the top of your file to invoke schema and resource completions. | | |
| name | string | Required. The data asset name. | | |
| version | string | The data asset version. If omitted, Azure Machine Learning autogenerates a version. | | |
| description | string | The data asset description. | | |
| tags | object | The data asset tag dictionary. | | |
| type | string | The data asset type. Specify uri_file for data that points to a single file source, or uri_folder for data that points to a folder source. | uri_file, uri_folder | uri_folder |
| path | string | Either a local path to the data source file or folder, or the URI of a cloud path to the data source file or folder. Ensure that the source provided here is compatible with the type specified. Supported URI types are azureml, https, wasbs, abfss, and adl. To use the azureml:// URI format, see Core YAML syntax. | | |
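
As a minimal sketch of how these keys fit together, the following file sets every key from the table. The asset name, version number, tag names and values, and local path are hypothetical placeholders chosen for illustration, not values from a real workspace.

```yaml
$schema: https://azuremlschemas.azureedge.net/latest/data.schema.json
name: all-keys-example            # hypothetical asset name
version: "1"                      # quoted, because the table types version as a string
description: Data asset that sets every key in the schema table.
tags:                             # hypothetical key-value pairs
  source: titanic
  stage: raw
type: uri_file
path: sample-data/titanic.csv     # hypothetical local path; uri_file expects a single file
```

Per the table, leaving out version lets Azure Machine Learning autogenerate one, and leaving out type falls back to uri_folder.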

Remarks

Use the az ml data commands to manage Azure Machine Learning data assets.

Examples

More examples are available on GitHub. Several are shown here:

YAML: datastore file

$schema: https://azuremlschemas.azureedge.net/latest/data.schema.json
name: cloud-file-example
description: Data asset created from file in cloud.
type: uri_file
path: azureml://datastores/workspaceblobstore/paths/example-data/titanic.csv

YAML: datastore folder

$schema: https://azuremlschemas.azureedge.net/latest/data.schema.json
name: cloud-folder-example
description: Data asset created from folder in cloud.
type: uri_folder
path: azureml://datastores/workspaceblobstore/paths/example-data/

YAML: https file

$schema: https://azuremlschemas.azureedge.net/latest/data.schema.json
name: cloud-file-https-example
description: Data asset created from a file in cloud using https URL.
type: uri_file
path: https://account-name.blob.core.windows.net/container-name/example-data/titanic.csv

YAML: https folder

$schema: https://azuremlschemas.azureedge.net/latest/data.schema.json
name: cloud-folder-https-example
description: Data asset created from folder in cloud using https URL.
type: uri_folder
path: https://account-name.blob.core.windows.net/container-name/example-data/

YAML: wasbs file

$schema: https://azuremlschemas.azureedge.net/latest/data.schema.json
name: cloud-file-wasbs-example
description: Data asset created from a file in cloud using wasbs URL.
type: uri_file
path: wasbs://container-name@account-name.blob.core.windows.net/example-data/titanic.csv

YAML: wasbs folder

$schema: https://azuremlschemas.azureedge.net/latest/data.schema.json
name: cloud-folder-wasbs-example
description: Data asset created from folder in cloud using wasbs URL.
type: uri_folder
path: wasbs://container-name@account-name.blob.core.windows.net/example-data/
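
YAML: abfss folder

The supported URI types include abfss, which none of the other examples use. The following is a sketch only, assuming an Azure Data Lake Storage Gen2 account. The names file-system-name and account-name are hypothetical placeholders, and the URI follows the general abfss://<file system>@<account>.dfs.core.windows.net/<path> form.

```yaml
$schema: https://azuremlschemas.azureedge.net/latest/data.schema.json
name: cloud-folder-abfss-example   # hypothetical asset name
description: Data asset created from folder in cloud using abfss URL.
type: uri_folder
# file-system-name and account-name are placeholders, not real resources
path: abfss://file-system-name@account-name.dfs.core.windows.net/example-data/
```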

YAML: local file

$schema: https://azuremlschemas.azureedge.net/latest/data.schema.json
name: local-file-example-titanic
description: Data asset created from local file.
type: uri_file
path: sample-data/titanic.csv

YAML: local folder

$schema: https://azuremlschemas.azureedge.net/latest/data.schema.json
name: local-folder-example-titanic
description: Data asset created from local folder.
type: uri_folder
path: sample-data/
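
YAML: local folder with default type

The schema table lists uri_folder as the default value for type, so a folder asset can presumably leave the key out. The following sketch assumes that default applies when type is omitted. The asset name is a hypothetical placeholder.

```yaml
$schema: https://azuremlschemas.azureedge.net/latest/data.schema.json
name: local-folder-default-type-example   # hypothetical asset name
description: Data asset created from local folder, relying on the default type.
# type is omitted on purpose; the table gives uri_folder as the default
path: sample-data/
```

Setting type explicitly, as the other examples do, keeps the intent visible to readers and avoids depending on the default.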

Next steps