CLI (v2) command job YAML schema

Article
09/01/2024

You can find the source JSON schema at https://azuremlschemas.azureedge.net/latest/commandJob.schema.json.

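Note

The YAML syntax detailed in this document is based on the JSON schema for the latest version of the ML CLI v2 extension. This syntax is guaranteed to work only with the latest version of the ML CLI v2 extension. You can find the schemas for older extension versions at https://azuremlschemasprod.azureedge.net/.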

YAML syntax

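Key	Type	Description	Allowed values	Default value
`$schema`	string	The YAML schema. If you use the Azure Machine Learning VS Code extension to author the YAML file, including `$schema` at the top of the file lets you invoke schema and resource completions.
`type`	const	The type of job.	`command`	`command`
`name`	string	Name of the job. Must be unique across all jobs in the workspace. If omitted, Azure Machine Learning autogenerates a GUID for the name.
`display_name`	string	Display name of the job in the studio UI. Doesn't have to be unique within the workspace. If omitted, Azure Machine Learning autogenerates a human-readable adjective-noun identifier for the display name.
`experiment_name`	string	Experiment name to organize the job under. Each job's run record is organized under the corresponding experiment in the studio's "Experiments" tab. If omitted, Azure Machine Learning defaults to the name of the working directory where the job was created.
`description`	string	Description of the job.
`tags`	object	Dictionary of tags for the job.
`command`	string	The command to execute.
`code`	string	Local path to the source code directory to upload and use for the job.
`environment`	string or object	The environment to use for the job. Can be either a reference to an existing versioned environment in the workspace or an inline environment specification. To reference an existing environment, use the `azureml:<environment_name>:<environment_version>` syntax or `azureml:<environment_name>@latest` (to reference the latest version of an environment). To define an environment inline, follow the environment schema. Exclude the `name` and `version` properties, because inline environments don't support them.
`environment_variables`	object	Dictionary of environment variable key-value pairs to set on the process where the command is executed.
`distribution`	object	The distribution configuration for distributed training scenarios. One of MpiConfiguration, PyTorchConfiguration, or TensorFlowConfiguration.
`compute`	string	Name of the compute target to execute the job on. Can be either a reference to an existing compute in the workspace (using the `azureml:<compute_name>` syntax) or `local` to designate local execution. Note: jobs in a pipeline don't support `local` as `compute`.		`local`
`resources.instance_count`	integer	The number of nodes to use for the job.		`1`
`resources.instance_type`	string	The instance type to use for the job. Applicable for jobs running on Azure Arc-enabled Kubernetes compute (where the compute target specified in the `compute` field is of `type: kubernetes`). If omitted, defaults to the default instance type for the Kubernetes cluster. For more information, see Create and select Kubernetes instance types.
`resources.shm_size`	string	The size of the Docker container's shared memory block. Should be in the format `<number><unit>`, where the number must be greater than 0 and the unit can be one of `b` (bytes), `k` (kilobytes), `m` (megabytes), or `g` (gigabytes).		`2g`
`limits.timeout`	integer	The maximum time in seconds the job is allowed to run. When this limit is reached, the system cancels the job.
`inputs`	object	Dictionary of inputs to the job. The key is a name for the input within the context of the job and the value is the input value. You can reference inputs in the `command` by using the `${{ inputs.<input_name> }}` expression.
`inputs.<input_name>`	number, integer, boolean, string, or object	Either a literal value (of type number, integer, boolean, or string) or an object containing a job input data specification.
`outputs`	object	Dictionary of output configurations of the job. The key is a name for the output within the context of the job and the value is the output configuration. You can reference outputs in the `command` by using the `${{ outputs.<output_name> }}` expression.
`outputs.<output_name>`	object	You can leave the object empty, in which case the output is of type `uri_folder` and Azure Machine Learning generates an output location for it. Files in the output directory are written via read-write mount. To specify a different mode for the output, provide an object containing the job output specification.
`identity`	object	The identity used for data access. It can be UserIdentityConfiguration, ManagedIdentityConfiguration, or None. With UserIdentityConfiguration, the identity of the job submitter is used to access input data and write results to the output folder. Otherwise, the managed identity of the compute target is used.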
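
As a rough illustration of how several of the keys above combine, here is a minimal sketch of a command job that sets a compute target, resource settings, and a timeout. The compute name `cpu-cluster`, the `4g` shared memory size, and the one-hour timeout are placeholder values chosen for illustration, not recommendations.

```yaml
$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: echo "hello world"
environment:
  image: library/python:latest
# Assumed to be an existing compute cluster in the workspace.
compute: azureml:cpu-cluster
resources:
  instance_count: 1   # number of nodes (this is also the default)
  shm_size: 4g        # <number><unit>; placeholder value
limits:
  timeout: 3600       # seconds; the system cancels the job at this limit
```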

Distribution configurations

MpiConfiguration

Key	Type	Description	Allowed values
`type`	const	Required. Distribution type.	`mpi`
`process_count_per_instance`	integer	Required. The number of processes per node to launch for the job.
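
A minimal sketch of the `distribution` section for an MPI job might look like the following fragment. The process and node counts are illustrative assumptions, and the rest of the job (command, environment, compute) is omitted.

```yaml
distribution:
  type: mpi
  process_count_per_instance: 2   # required for MPI; placeholder value
resources:
  instance_count: 2               # placeholder node count
```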

PyTorchConfiguration

Key	Type	Description	Allowed values	Default value
`type`	const	Required. Distribution type.	`pytorch`
`process_count_per_instance`	integer	The number of processes per node to launch for the job.		`1`

TensorFlowConfiguration

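Key	Type	Description	Allowed values	Default value
`type`	const	Required. Distribution type.	`tensorflow`
`worker_count`	integer	The number of workers to launch for the job.		Defaults to `resources.instance_count`.
`parameter_server_count`	integer	The number of parameter servers to launch for the job.		`0`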
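
The following fragment sketches how the TensorFlow keys above might be written when a parameter server is wanted. All counts are placeholder assumptions; whether a particular combination suits a given cluster and training script isn't covered here.

```yaml
resources:
  instance_count: 2             # placeholder node count
distribution:
  type: tensorflow
  worker_count: 2               # defaults to resources.instance_count if omitted
  parameter_server_count: 1     # defaults to 0 if omitted
```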

Job inputs

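Key	Type	Description	Allowed values	Default value
`type`	string	The type of job input. Specify `uri_file` for input data that points to a single file source, or `uri_folder` for input data that points to a folder source.	`uri_file`, `uri_folder`, `mlflow_model`, `custom_model`	`uri_folder`
`path`	string	The path to the data to use as input. Can be specified in a few ways: - A local path to the data source file or folder, for example `path: ./iris.csv`. The data is uploaded during job submission. - A URI of a cloud path to the file or folder to use as the input. Supported URI types are `azureml`, `https`, `wasbs`, `abfss`, and `adl`. For more information on how to use the `azureml://` URI format, see Core yaml syntax. - An existing registered Azure Machine Learning data asset to use as the input. To reference a registered data asset, use the `azureml:<data_name>:<data_version>` syntax or `azureml:<data_name>@latest` (to reference the latest version of that data asset), for example `path: azureml:cifar10-data:1` or `path: azureml:cifar10-data@latest`.
`mode`	string	Mode of how the data is delivered to the compute target. For read-only mount (`ro_mount`), the data is consumed as a mount path. A folder is mounted as a folder and a file is mounted as a file. Azure Machine Learning resolves the input to the mount path. For `download` mode, the data is downloaded to the compute target. Azure Machine Learning resolves the input to the downloaded path. If you only want the URL of the storage location of the data artifacts rather than mounting or downloading the data itself, you can use `direct` mode. This mode passes in the URL of the storage location as the job input. In this case, you're fully responsible for handling credentials to access the storage. The `eval_mount` and `eval_download` modes are unique to MLTable, and either mount the data as a path or download the data to the compute target. For more information on modes, see Access data in a job.	`ro_mount`, `download`, `direct`, `eval_download`, `eval_mount`	`ro_mount`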
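
To make the `path` and `mode` keys concrete, here is a minimal sketch of a job that downloads a registered data asset instead of mounting it. It assumes a data asset named `cifar10-data` is already registered in the workspace (the name is borrowed from the `path` description above), and the `ls` command is only a stand-in for a real workload.

```yaml
$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: ls ${{inputs.cifar}}
environment:
  image: library/python:latest
inputs:
  cifar:
    type: uri_folder
    # Assumes a data asset named cifar10-data is registered in the workspace.
    path: azureml:cifar10-data@latest
    mode: download   # download to the compute target instead of the default ro_mount
```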

Job outputs

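Key	Type	Description	Allowed values	Default value
`type`	string	The type of job output. For the default `uri_folder` type, the output corresponds to a folder.	`uri_folder`, `mlflow_model`, `custom_model`	`uri_folder`
`mode`	string	Mode of how output files are delivered to the destination storage. For read-write mount mode (`rw_mount`), the output directory is a mounted directory. For upload mode, the files written are uploaded at the end of the job.	`rw_mount`, `upload`	`rw_mount`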
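
As a sketch of a non-default output specification, the following job writes a file to a named output that is uploaded at the end of the job rather than written through a mount. The output name `hello_output` is an arbitrary illustrative choice.

```yaml
$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: echo "hello world" > ${{outputs.hello_output}}/helloworld.txt
environment:
  image: library/python:latest
outputs:
  hello_output:
    type: uri_folder   # the default type, stated explicitly
    mode: upload       # upload written files when the job ends
```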

Identity configurations

UserIdentityConfiguration

Key	Type	Description	Allowed values
`type`	const	Required. Identity type.	`user_identity`

ManagedIdentityConfiguration

Key	Type	Description	Allowed values
`type`	const	Required. Identity type.	`managed` or `managed_identity`
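
In a job file, the identity configuration would sit under the top-level `identity` key. The fragment below is a minimal sketch showing the user identity variant, with the managed identity variant left as a commented alternative. The rest of the job is omitted.

```yaml
identity:
  type: user_identity
  # Alternative: use the managed identity of the compute target instead.
  # type: managed_identity
```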

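Remarks

The az ml job command can be used to manage Azure Machine Learning jobs.

Examples

Examples are available in the examples GitHub repository. The following sections show some of them.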

YAML: hello world

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: echo "hello world"
environment:
  image: library/python:latest

YAML: display name, experiment name, description, and tags

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: echo "hello world"
environment:
  image: library/python:latest
tags:
  hello: world
display_name: hello-world-example
experiment_name: hello-world-example
description: |
  # Azure Machine Learning "hello world" job

  This is a "hello world" job running in the cloud via Azure Machine Learning!

  ## Description

  Markdown is supported in the studio for job descriptions! You can edit the description there or via CLI.

YAML: environment variables

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: echo $hello_env_var
environment:
  image: library/python:latest
environment_variables:
  hello_env_var: "hello world"

YAML: source code

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: ls
code: src
environment:
  image: library/python:latest

YAML: literal inputs

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: |
  echo ${{inputs.hello_string}}
  echo ${{inputs.hello_number}}
environment:
  image: library/python:latest
inputs:
  hello_string: "hello world"
  hello_number: 42

YAML: write to default outputs

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: echo "hello world" > ./outputs/helloworld.txt
environment:
  image: library/python:latest

YAML: write to named data output

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: echo "hello world" > ${{outputs.hello_output}}/helloworld.txt
outputs:
  hello_output:
environment:
  image: python

YAML: datastore URI file input

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: |
  echo "--iris-csv: ${{inputs.iris_csv}}"
  python hello-iris.py --iris-csv ${{inputs.iris_csv}}
code: src
inputs:
  iris_csv:
    type: uri_file 
    path: azureml://datastores/workspaceblobstore/paths/example-data/iris.csv
environment: azureml://registries/azureml/environments/sklearn-1.5/labels/latest

YAML: datastore URI folder input

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: |
  ls ${{inputs.data_dir}}
  echo "--iris-csv: ${{inputs.data_dir}}/iris.csv"
  python hello-iris.py --iris-csv ${{inputs.data_dir}}/iris.csv
code: src
inputs:
  data_dir:
    type: uri_folder 
    path: azureml://datastores/workspaceblobstore/paths/example-data/
environment: azureml://registries/azureml/environments/sklearn-1.5/labels/latest

YAML: URI file input

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: |
  echo "--iris-csv: ${{inputs.iris_csv}}"
  python hello-iris.py --iris-csv ${{inputs.iris_csv}}
code: src
inputs:
  iris_csv:
    type: uri_file 
    path: https://azuremlexamples.blob.core.windows.net/datasets/iris.csv
environment: azureml://registries/azureml/environments/sklearn-1.5/labels/latest

YAML: URI folder input

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: |
  ls ${{inputs.data_dir}}
  echo "--iris-csv: ${{inputs.data_dir}}/iris.csv"
  python hello-iris.py --iris-csv ${{inputs.data_dir}}/iris.csv
code: src
inputs:
  data_dir:
    type: uri_folder 
    path: wasbs://datasets@azuremlexamples.blob.core.windows.net/
environment: azureml://registries/azureml/environments/sklearn-1.5/labels/latest

YAML: notebook via papermill

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: |
  pip install ipykernel papermill
  papermill hello-notebook.ipynb outputs/out.ipynb -k python
code: src
environment:
  image: library/python:3.11.6

YAML: basic Python model training

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
code: src
command: >-
  python main.py 
  --iris-csv ${{inputs.iris_csv}}
  --C ${{inputs.C}}
  --kernel ${{inputs.kernel}}
  --coef0 ${{inputs.coef0}}
inputs:
  iris_csv: 
    type: uri_file
    path: wasbs://datasets@azuremlexamples.blob.core.windows.net/iris.csv
  C: 0.8
  kernel: "rbf"
  coef0: 0.1
environment: azureml://registries/azureml/environments/sklearn-1.5/labels/latest
compute: azureml:cpu-cluster
display_name: sklearn-iris-example
experiment_name: sklearn-iris-example
description: Train a scikit-learn SVM on the Iris dataset.

YAML: basic R model training with local Docker build context

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
command: >
  Rscript train.R 
  --data_folder ${{inputs.iris}}
code: src
inputs:
  iris: 
    type: uri_file
    path: https://azuremlexamples.blob.core.windows.net/datasets/iris.csv
environment:
  build:
    path: docker-context
compute: azureml:cpu-cluster
display_name: r-iris-example
experiment_name: r-iris-example
description: Train an R model on the Iris dataset.

YAML: distributed PyTorch

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
code: src
command: >-
  python train.py
  --epochs ${{inputs.epochs}}
  --learning-rate ${{inputs.learning_rate}}
  --data-dir ${{inputs.cifar}}
inputs:
  epochs: 1
  learning_rate: 0.2
  cifar:
    type: uri_folder
    path: azureml:cifar-10-example@latest
environment: azureml:AzureML-acpt-pytorch-2.2-cuda12.1@latest
compute: azureml:gpu-cluster
distribution:
  type: pytorch
  process_count_per_instance: 1
resources:
  instance_count: 2
display_name: pytorch-cifar-distributed-example
experiment_name: pytorch-cifar-distributed-example
description: Train a basic convolutional neural network (CNN) with PyTorch on the CIFAR-10 dataset, distributed via PyTorch.

YAML: distributed TensorFlow

$schema: https://azuremlschemas.azureedge.net/latest/commandJob.schema.json
code: src
command: >-
  python train.py
  --epochs ${{inputs.epochs}}
  --model-dir ${{inputs.model_dir}}
inputs:
  epochs: 1
  model_dir: outputs/keras-model
environment: azureml:AzureML-tensorflow-2.16-cuda12@latest
compute: azureml:gpu-cluster
resources:
  instance_count: 2
distribution:
  type: tensorflow
  worker_count: 2
display_name: tensorflow-mnist-distributed-example
experiment_name: tensorflow-mnist-distributed-example
description: Train a basic neural network with TensorFlow on the MNIST dataset, distributed via TensorFlow.

Next steps

Install and use the CLI (v2)
