CLI (v2) data YAML schema

APPLIES TO: Azure CLI ml extension v2 (current)

The source JSON schema can be found at https://azuremlschemas.azureedge.net/latest/data.schema.json.

Note

The YAML syntax detailed in this document is based on the JSON schema for the latest version of the ML CLI v2 extension. This syntax is guaranteed only to work with the latest version of the ML CLI v2 extension. You can find the schemas for older extension versions at https://azuremlschemasprod.azureedge.net/.

YAML syntax

| Key | Type | Description | Allowed values | Default value |
| --- | --- | --- | --- | --- |
| $schema | string | The YAML schema. If you use the Azure Machine Learning VS Code extension to author the YAML file, including $schema at the top of your file enables you to invoke schema and resource completions. | | |
| name | string | Required. Name of the data asset. | | |
| version | string | Version of the data asset. If omitted, Azure Machine Learning autogenerates a version. | | |
| description | string | Description of the data asset. | | |
| tags | object | Dictionary of tags for the data asset. | | |
| type | string | The data asset type. Specify uri_file for data that points to a single file source, or uri_folder for data that points to a folder source. | uri_file, uri_folder | uri_folder |
| path | string | Either a local path to the data source file or folder, or the URI of a cloud path to the data source file or folder. The source must be compatible with the specified type: a file for uri_file, a folder for uri_folder. | | |

Supported URI types for the path key are azureml, https, wasbs, abfss, and adl. See Core YAML syntax for more information on how to use the azureml:// URI format.
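
As a rough illustration of the path rules above, the following Python sketch classifies a path value as either a local path or a cloud URI with one of the listed schemes. The helper name classify_path and its return values are hypothetical and are not part of the Azure CLI or SDK. The scheme list is copied from this article, and the check only looks at the scheme, not at whether the target exists or matches the declared type.

```python
from urllib.parse import urlparse

# URI schemes this article lists as supported for the `path` key.
SUPPORTED_SCHEMES = {"azureml", "https", "wasbs", "abfss", "adl"}


def classify_path(path: str) -> str:
    """Return "cloud" for a supported URI and "local" for a plain path.

    Hypothetical helper for illustration only. Raises ValueError when the
    value looks like a URI but its scheme is not in SUPPORTED_SCHEMES.
    """
    # Treat the value as a URI only when it contains "://", so that
    # Windows drive paths such as C:\data\file.csv stay classified as local.
    if "://" not in path:
        return "local"
    scheme = urlparse(path).scheme.lower()
    if scheme in SUPPORTED_SCHEMES:
        return "cloud"
    raise ValueError(f"Unsupported URI scheme: {scheme!r}")
```

A check like this could run before submitting a YAML file, to catch a mistyped scheme early. It is not a substitute for the validation that the CLI performs against the JSON schema.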

Remarks

The az ml data commands can be used for managing Azure Machine Learning data assets.
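
As one example of these commands, the sketch below builds the argument list for az ml data create from Python. The function name build_create_command, the file name data.yml, and the resource group and workspace names are placeholders chosen for illustration. The block only prints the command and does not run it, so it makes no assumption about the Azure CLI being installed.

```python
import shlex


def build_create_command(yaml_file: str, resource_group: str, workspace: str) -> list:
    """Build the argument list for `az ml data create` (illustrative helper)."""
    return [
        "az", "ml", "data", "create",
        "--file", yaml_file,                  # data asset YAML file, as described in this article
        "--resource-group", resource_group,   # placeholder resource group name
        "--workspace-name", workspace,        # placeholder workspace name
    ]


# Print the command as a single shell-quoted string for inspection.
command = build_create_command("data.yml", "my-resource-group", "my-workspace")
print(shlex.join(command))
```

To run the command, the list could be passed to subprocess.run(command, check=True) on a machine where the Azure CLI and the ml extension are installed and signed in.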

Examples

Examples are available in the examples GitHub repository. Several are shown below.

YAML: datastore file

$schema: https://azuremlschemas.azureedge.net/latest/data.schema.json
name: cloud-file-example
description: Data asset created from file in cloud.
type: uri_file
path: azureml://datastores/workspaceblobstore/paths/example-data/titanic.csv

YAML: datastore folder

$schema: https://azuremlschemas.azureedge.net/latest/data.schema.json
name: cloud-folder-example
description: Data asset created from folder in cloud.
type: uri_folder
path: azureml://datastores/workspaceblobstore/paths/example-data/

YAML: https file

$schema: https://azuremlschemas.azureedge.net/latest/data.schema.json
name: cloud-file-https-example
description: Data asset created from a file in cloud using https URL.
type: uri_file
path: https://account-name.blob.core.windows.net/container-name/example-data/titanic.csv

YAML: https folder

$schema: https://azuremlschemas.azureedge.net/latest/data.schema.json
name: cloud-folder-https-example
description: Data asset created from folder in cloud using https URL.
type: uri_folder
path: https://account-name.blob.core.windows.net/container-name/example-data/

YAML: wasbs file

$schema: https://azuremlschemas.azureedge.net/latest/data.schema.json
name: cloud-file-wasbs-example
description: Data asset created from a file in cloud using wasbs URL.
type: uri_file
path: wasbs://account-name.blob.core.windows.net/container-name/example-data/titanic.csv

YAML: wasbs folder

$schema: https://azuremlschemas.azureedge.net/latest/data.schema.json
name: cloud-folder-wasbs-example
description: Data asset created from folder in cloud using wasbs URL.
type: uri_folder
path: wasbs://account-name.blob.core.windows.net/container-name/example-data/

YAML: local file

$schema: https://azuremlschemas.azureedge.net/latest/data.schema.json
name: local-file-example-titanic
description: Data asset created from local file.
type: uri_file
path: sample-data/titanic.csv

YAML: local folder

$schema: https://azuremlschemas.azureedge.net/latest/data.schema.json
name: local-folder-example-titanic
description: Data asset created from local folder.
type: uri_folder
path: sample-data/

Next steps