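CLI (v2) sweep job YAML schema

Article
07/03/2024

You can find the source JSON schema at https://azuremlschemas.azureedge.net/latest/sweepJob.schema.json.

Note

The YAML syntax detailed in this document is based on the JSON schema for the latest version of the ML CLI v2 extension. This syntax is guaranteed to work only with the latest version of the ML CLI v2 extension. You can find the schemas for older extension versions at https://azuremlschemasprod.azureedge.net/.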

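YAML syntax

Key	Type	Description	Allowed values	Default value
`$schema`	string	The YAML schema. If you use the Azure Machine Learning VS Code extension to author the YAML file, including `$schema` at the top of the file enables schema and resource completions.
`type`	const	Required. The type of job.	`sweep`	`sweep`
`name`	string	Name of the job. Must be unique across all jobs in the workspace. If omitted, Azure Machine Learning autogenerates a GUID for the name.
`display_name`	string	Display name of the job in the studio UI. Can be nonunique within the workspace. If omitted, Azure Machine Learning autogenerates a human-readable adjective-noun identifier for the display name.
`experiment_name`	string	Organizes the job under the experiment name. Each job's run record is organized under the corresponding experiment in the studio's "Experiments" tab. If omitted, Azure Machine Learning defaults `experiment_name` to the name of the working directory where the job was created.
`description`	string	Description of the job.
`tags`	object	Dictionary of tags for the job.
`sampling_algorithm`	object	Required. The hyperparameter sampling algorithm to use over the `search_space`. One of RandomSamplingAlgorithm, GridSamplingAlgorithm, or BayesianSamplingAlgorithm.
`search_space`	object	Required. Dictionary of the hyperparameter search space. The hyperparameter name is the key, and the value is the parameter expression. You can reference hyperparameters in `trial.command` with the `${{ search_space.<hyperparameter> }}` expression.
`search_space.<hyperparameter>`	object	See Parameter expressions for the set of possible expressions to use.
`objective.primary_metric`	string	Required. The name of the primary metric reported by each trial job. The training script must log this metric with `mlflow.log_metric()`, using the same metric name.
`objective.goal`	string	Required. The optimization goal of `objective.primary_metric`.	`maximize`, `minimize`
`early_termination`	object	The early termination policy to use. A trial job is canceled when the criteria of the specified policy are met. If omitted, no early termination policy is applied. One of BanditPolicy, MedianStoppingPolicy, or TruncationSelectionPolicy.
`limits`	object	Limits for the sweep job. See Attributes of the `limits` key.
`compute`	string	Required. Name of the compute target on which to execute the job, with the `azureml:<compute_name>` syntax.
`trial`	object	Required. The job template for each trial. Each trial job receives a different combination of hyperparameter values that the system samples from the `search_space`. See Attributes of the `trial` key.
`inputs`	object	Dictionary of inputs to the job. The key is a name for the input within the context of the job, and the value is the input value. You can reference inputs in the `command` with the `${{ inputs.<input_name> }}` expression.
`inputs.<input_name>`	number, integer, boolean, string, or object	Either a literal value (of type number, integer, boolean, or string) or an object that contains a job input data specification.
`outputs`	object	Dictionary of output configurations of the job. The key is a name for the output within the context of the job, and the value is the output configuration. You can reference outputs in the `command` with the `${{ outputs.<output_name> }}` expression.
`outputs.<output_name>`	object	You can leave the object empty. In that case, the output is of type `uri_folder` by default, and Azure Machine Learning generates an output location for it. All files in the output directory are written through a read-write mount. To specify a different mode for the output, provide an object that contains the job output specification.
`identity`	object	The identity used for data access. It can be a UserIdentityConfiguration, a ManagedIdentityConfiguration, or None. With UserIdentityConfiguration, the identity of the job submitter is used to access the input data and to write results to the output folder. Otherwise, the managed identity of the compute target is used.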

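Sampling algorithms

RandomSamplingAlgorithm

Key	Type	Description	Allowed values	Default value
`type`	const	Required. The type of sampling algorithm.	`random`
`seed`	integer	A random seed used to initialize the random number generation. If omitted, the default seed value is null.
`rule`	string	The type of random sampling to use. The default, `random`, uses simple uniform random sampling, while `sobol` uses the Sobol quasirandom sequence.	`random`, `sobol`	`random`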
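
As an illustration, the object form of the random sampling algorithm might look like the following fragment of a sweep job file. The seed value `42` is an arbitrary choice for this sketch, not a recommended setting.

```yaml
# Fragment of a sweep job file: random sampling written as an object.
sampling_algorithm:
  type: random
  seed: 42      # illustrative value; omit the key to leave the seed as null
  rule: sobol   # Sobol quasirandom sequence instead of simple uniform sampling
```

The complete examples later in this article use the shorthand `sampling_algorithm: random` instead, which leaves `seed` and `rule` at their defaults.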

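GridSamplingAlgorithm

Key	Type	Description	Allowed values
`type`	const	Required. The type of sampling algorithm.	`grid`

BayesianSamplingAlgorithm

Key	Type	Description	Allowed values
`type`	const	Required. The type of sampling algorithm.	`bayesian`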
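
Both of these algorithms take only a `type` key. The following fragment is a minimal sketch of grid sampling. It assumes a search space built only from `choice` expressions, which suits a grid because the combinations of discrete values can be enumerated. The hyperparameter names `kernel` and `degree` are hypothetical.

```yaml
# Fragment of a sweep job file: grid sampling over discrete values.
sampling_algorithm:
  type: grid        # replace with "bayesian" to select Bayesian sampling
search_space:
  kernel:           # hypothetical hyperparameter name
    type: choice
    values: ["rbf", "linear", "poly"]
  degree:           # hypothetical hyperparameter name
    type: choice
    values: [2, 3, 4]
```

With three values for each of the two hyperparameters, this search space contains nine combinations.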

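Early termination policies

BanditPolicy

Key	Type	Description	Allowed values	Default value
`type`	const	Required. The type of policy.	`bandit`
`slack_factor`	number	The ratio used to calculate the allowed distance from the best-performing trial. One of `slack_factor` or `slack_amount` is required.
`slack_amount`	number	The absolute distance allowed from the best-performing trial. One of `slack_factor` or `slack_amount` is required.
`evaluation_interval`	integer	The frequency for applying the policy.		`1`
`delay_evaluation`	integer	The number of intervals for which to delay the first policy evaluation. If specified, the policy is applied at every multiple of `evaluation_interval` that is greater than or equal to `delay_evaluation`.		`0`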
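
The interaction between `evaluation_interval` and `delay_evaluation` is easier to see with concrete numbers. The values in this sketch are illustrative only.

```yaml
# Fragment of a sweep job file: bandit early termination policy.
early_termination:
  type: bandit
  slack_factor: 0.1        # illustrative ratio; use slack_amount for an absolute distance instead
  evaluation_interval: 2   # consider the policy at every second interval
  delay_evaluation: 6      # skip evaluations before interval 6
```

With these settings, the policy is applied at intervals 6, 8, 10, and so on: the multiples of 2 that are greater than or equal to 6.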

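MedianStoppingPolicy

Key	Type	Description	Allowed values	Default value
`type`	const	Required. The type of policy.	`median_stopping`
`evaluation_interval`	integer	The frequency for applying the policy.		`1`
`delay_evaluation`	integer	The number of intervals for which to delay the first policy evaluation. If specified, the policy is applied at every multiple of `evaluation_interval` that is greater than or equal to `delay_evaluation`.		`0`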
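
A minimal sketch of a median stopping policy follows. The delay of 5 intervals is an arbitrary value chosen for illustration.

```yaml
# Fragment of a sweep job file: median stopping early termination policy.
early_termination:
  type: median_stopping
  evaluation_interval: 1   # the default, shown explicitly
  delay_evaluation: 5      # first evaluation happens at interval 5
```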

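TruncationSelectionPolicy

Key	Type	Description	Allowed values	Default value
`type`	const	Required. The type of policy.	`truncation_selection`
`truncation_percentage`	integer	Required. The percentage of trial jobs to cancel at each evaluation interval.
`evaluation_interval`	integer	The frequency for applying the policy.		`1`
`delay_evaluation`	integer	The number of intervals for which to delay the first policy evaluation. If specified, the policy is applied at every multiple of `evaluation_interval` that is greater than or equal to `delay_evaluation`.		`0`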
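
As a sketch with illustrative values, the following policy would cancel 20 percent of the trial jobs each time it is applied.

```yaml
# Fragment of a sweep job file: truncation selection early termination policy.
early_termination:
  type: truncation_selection
  truncation_percentage: 20   # percentage of trial jobs canceled per evaluation
  evaluation_interval: 5
  delay_evaluation: 10        # policy is applied at intervals 10, 15, 20, ...
```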

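Parameter expressions

Choice

Key	Type	Description	Allowed values
`type`	const	Required. The type of expression.	`choice`
`values`	array	Required. The list of discrete values to choose from.

Randint

Key	Type	Description	Allowed values
`type`	const	Required. The type of expression.	`randint`
`upper`	integer	Required. The exclusive upper bound for the range of integers.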
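
The following fragment sketches a `randint` expression. The hyperparameter name `num_layers` is hypothetical. The table lists only the upper bound, so this sketch assumes the range starts at 0.

```yaml
# Fragment of a sweep job file: integer hyperparameter.
search_space:
  num_layers:      # hypothetical hyperparameter name
    type: randint
    upper: 5       # exclusive: under the assumption above, samples are 0 through 4
```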

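Qlognormal, qnormal

Key	Type	Description	Allowed values
`type`	const	Required. The type of expression.	`qlognormal`, `qnormal`
`mu`	number	Required. The mean of the normal distribution.
`sigma`	number	Required. The standard deviation of the normal distribution.
`q`	integer	Required. The smoothing factor.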
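
The table calls `q` a smoothing factor. The sketch below assumes that it acts as a quantization step, so that sampled values land on multiples of `q`. Treat that reading as an assumption rather than a statement of the schema. The hyperparameter name `hidden_units` is hypothetical.

```yaml
# Fragment of a sweep job file: quantized normal distribution.
search_space:
  hidden_units:    # hypothetical hyperparameter name
    type: qnormal
    mu: 128        # mean of the normal distribution
    sigma: 32      # standard deviation of the normal distribution
    q: 16          # assumed quantization step (see the note above)
```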

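Qloguniform, quniform

Key	Type	Description	Allowed values
`type`	const	Required. The type of expression.	`qloguniform`, `quniform`
`min_value`	number	Required. The minimum value in the range (inclusive).
`max_value`	number	Required. The maximum value in the range (inclusive).
`q`	integer	Required. The smoothing factor.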
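
A quantized uniform expression might be written as follows. The hyperparameter name `batch_size` is hypothetical, and the sketch again assumes that `q` behaves as a step size for the sampled values.

```yaml
# Fragment of a sweep job file: quantized uniform distribution.
search_space:
  batch_size:      # hypothetical hyperparameter name
    type: quniform
    min_value: 16  # inclusive
    max_value: 128 # inclusive
    q: 16          # assumed step between candidate values
```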

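Lognormal, normal

Key	Type	Description	Allowed values
`type`	const	Required. The type of expression.	`lognormal`, `normal`
`mu`	number	Required. The mean of the normal distribution.
`sigma`	number	Required. The standard deviation of the normal distribution.

Loguniform

Key	Type	Description	Allowed values
`type`	const	Required. The type of expression.	`loguniform`
`min_value`	number	Required. The minimum value in the range is `exp(min_value)` (inclusive).
`max_value`	number	Required. The maximum value in the range is `exp(max_value)` (inclusive).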
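
Because `loguniform` bounds are exponents, the values written in the file are the natural logarithms of the range you want. The following sketch targets a range of roughly 0.001 to 0.1. The hyperparameter name `learning_rate` is hypothetical.

```yaml
# Fragment of a sweep job file: log-uniform distribution.
search_space:
  learning_rate:     # hypothetical hyperparameter name
    type: loguniform
    min_value: -6.9  # exp(-6.9) is roughly 0.001
    max_value: -2.3  # exp(-2.3) is roughly 0.1
```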

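Uniform

Key	Type	Description	Allowed values
`type`	const	Required. The type of expression.	`uniform`
`min_value`	number	Required. The minimum value in the range (inclusive).
`max_value`	number	Required. The maximum value in the range (inclusive).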

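Attributes of the `limits` key

Key	Type	Description	Default value
`max_total_trials`	integer	The maximum number of trial jobs.	`1000`
`max_concurrent_trials`	integer	The maximum number of trial jobs that can run concurrently.	Defaults to `max_total_trials`.
`timeout`	integer	The maximum time, in seconds, that the entire sweep job is allowed to run. Once this limit is reached, the system cancels the sweep job, including all its trials.	`5184000`
`trial_timeout`	integer	The maximum time, in seconds, that each trial job is allowed to run. Once this limit is reached, the system cancels the trial.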
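
The following fragment sketches all four limits together, with illustrative values. Both timeouts are expressed in seconds.

```yaml
# Fragment of a sweep job file: limits for the sweep and for each trial.
limits:
  max_total_trials: 40
  max_concurrent_trials: 4
  timeout: 14400        # 4 hours for the whole sweep job
  trial_timeout: 1800   # 30 minutes for each trial job
```

For comparison, the default `timeout` of `5184000` seconds corresponds to 60 days.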

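Attributes of the `trial` key

Key	Type	Description	Default value
`command`	string	Required. The command to execute.
`code`	string	Local path to the source code directory to upload and use for the job.
`environment`	string or object	Required. The environment to use for the job. This value can be either a reference to an existing versioned environment in the workspace or an inline environment specification. To reference an existing environment, use the `azureml:<environment-name>:<environment-version>` syntax. To define an environment inline, follow the Environment schema. Exclude the `name` and `version` properties, because inline environments don't support them.
`environment_variables`	object	Dictionary of environment variable name-value pairs to set on the process where the command is executed.
`distribution`	object	The distribution configuration for distributed training scenarios. One of MpiConfiguration, PyTorchConfiguration, or TensorFlowConfiguration.
`resources.instance_count`	integer	The number of nodes to use for the job.	`1`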
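
A trial template that uses these keys might look like the following sketch. The script name `train.py`, the environment name `my-training-env`, the variable `LOG_LEVEL`, and the hyperparameter `lr` are all hypothetical.

```yaml
# Fragment of a sweep job file: the template that every trial job runs.
trial:
  code: src                                  # local directory uploaded with the job
  command: >-
    python train.py
    --lr ${{search_space.lr}}
  environment: azureml:my-training-env:1     # hypothetical registered environment, version 1
  environment_variables:
    LOG_LEVEL: "INFO"                        # hypothetical variable read by the script
```

The `${{search_space.lr}}` reference resolves only if `lr` is also defined under the top-level `search_space` key.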

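Distribution configurations

MpiConfiguration

Key	Type	Description	Allowed values
`type`	const	Required. The distribution type.	`mpi`
`process_count_per_instance`	integer	Required. The number of processes per node to launch for the job.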
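
As a sketch, an MPI configuration that asks for 4 processes on each of 2 nodes could be written as follows. The script and environment names are hypothetical.

```yaml
# Fragment of a sweep job file: MPI distribution for each trial.
trial:
  command: python train.py                 # hypothetical script
  environment: azureml:my-training-env:1   # hypothetical environment
  distribution:
    type: mpi
    process_count_per_instance: 4          # processes launched per node
  resources:
    instance_count: 2                      # nodes used by each trial job
```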

PyTorchConfiguration

Key	Type	Description	Allowed values	Default value
`type`	const	Required. The distribution type.	`pytorch`
`process_count_per_instance`	integer	The number of processes per node to launch for the job.		`1`
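
The PyTorch configuration has the same shape. In this sketch, the process and node counts are illustrative, and only the keys nested under `trial` that relate to distribution are shown.

```yaml
# Fragment of a sweep job file: PyTorch distribution keys under trial.
trial:
  distribution:
    type: pytorch
    process_count_per_instance: 2   # optional; defaults to 1
  resources:
    instance_count: 2
```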

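TensorFlowConfiguration

Key	Type	Description	Allowed values	Default value
`type`	const	Required. The distribution type.	`tensorflow`
`worker_count`	integer	The number of workers to launch for the job.		Defaults to `resources.instance_count`.
`parameter_server_count`	integer	The number of parameter servers to launch for the job.		`0`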
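
A TensorFlow configuration might be sketched as follows. Both counts are written out here for clarity, even though they match the defaults described in the table.

```yaml
# Fragment of a sweep job file: TensorFlow distribution keys under trial.
trial:
  distribution:
    type: tensorflow
    worker_count: 2             # same as resources.instance_count, which is also the default
    parameter_server_count: 0   # the default, shown explicitly
  resources:
    instance_count: 2
```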

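Job inputs

Key	Type	Description	Allowed values	Default value
`type`	string	The type of job input. Specify `uri_file` for input data that points to a single file source, or `uri_folder` for input data that points to a folder source. For more information, see Learn more about data access.	`uri_file`, `uri_folder`, `mltable`, `mlflow_model`	`uri_folder`
`path`	string	The path to the data to use as input. You can specify this value in a few ways: (1) A local path to the data source file or folder, for example `path: ./iris.csv`. The data is uploaded during job submission. (2) A URI of a cloud path to the file or folder to use as the input. Supported URI types are `azureml`, `https`, `wasbs`, `abfss`, and `adl`. For more information about the `azureml://` URI format, see Core yaml syntax. (3) An existing registered Azure Machine Learning data asset to use as the input. To reference a registered data asset, use the `azureml:<data_name>:<data_version>` syntax, or `azureml:<data_name>@latest` to reference the latest version of that data asset. For example, `path: azureml:cifar10-data:1` or `path: azureml:cifar10-data@latest`.
`mode`	string	How the data is delivered to the compute target. For read-only mount (`ro_mount`), the data is consumed as a mount path. A folder is mounted as a folder, and a file is mounted as a file. Azure Machine Learning resolves the input to the mount path. For `download` mode, the data is downloaded to the compute target, and Azure Machine Learning resolves the input to the downloaded path. If you want only the URL of the storage location of the data artifact or artifacts, instead of mounting or downloading the data itself, use `direct` mode. This mode passes the URL of the storage location as the job input. In this case, you're fully responsible for handling the credentials needed to access the storage.	`ro_mount`, `download`, `direct`	`ro_mount`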
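
The following sketch combines a literal input with two data inputs. The input names are hypothetical, and the paths reuse the examples from the table.

```yaml
# Fragment of a sweep job file: one literal input and two data inputs.
inputs:
  epochs: 10                             # literal value (integer)
  iris_csv:                              # hypothetical input name
    type: uri_file
    path: ./iris.csv                     # local file, uploaded at job submission
  training_data:                         # hypothetical input name
    type: uri_folder
    path: azureml:cifar10-data@latest    # latest version of a registered data asset
    mode: download                       # copy to the compute target instead of mounting
```

In the trial command, these would be referenced as `${{inputs.epochs}}`, `${{inputs.iris_csv}}`, and `${{inputs.training_data}}`.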

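Job outputs

Key	Type	Description	Allowed values	Default value
`type`	string	The type of job output. For the default `uri_folder` type, the output corresponds to a folder.	`uri_file`, `uri_folder`, `mltable`, `mlflow_model`	`uri_folder`
`mode`	string	How the output file or files are delivered to the destination storage. For read-write mount mode (`rw_mount`), the output directory is a mounted directory. For `upload` mode, all files written are uploaded at the end of the job.	`rw_mount`, `upload`	`rw_mount`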
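
As a sketch, an output that is uploaded at the end of each job instead of being written through a mount could be declared as follows. The output name `model_dir` is hypothetical.

```yaml
# Fragment of a sweep job file: a folder output delivered in upload mode.
outputs:
  model_dir:           # hypothetical output name
    type: uri_folder   # the default type, shown explicitly
    mode: upload       # files are uploaded when the job ends
```

The trial command can then write to the location that `${{outputs.model_dir}}` resolves to.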

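Identity configurations

UserIdentityConfiguration

Key	Type	Description	Allowed values
`type`	const	Required. The type of identity.	`user_identity`

ManagedIdentityConfiguration

Key	Type	Description	Allowed values
`type`	const	Required. The type of identity.	`managed` or `managed_identity`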
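
Either configuration is a single key under `identity`. This minimal sketch selects the job submitter's identity for data access.

```yaml
# Fragment of a sweep job file: use the submitter's identity for data access.
identity:
  type: user_identity   # or "managed" / "managed_identity" for the compute target's managed identity
```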

Remarks

You can use the `az ml job` command to manage Azure Machine Learning jobs.

Examples

For examples, visit the examples GitHub repository. Several are shown here:

YAML: hello sweep

$schema: https://azuremlschemas.azureedge.net/latest/sweepJob.schema.json
type: sweep
trial:
  command: >-
    python hello-sweep.py
    --A ${{inputs.A}}
    --B ${{search_space.B}}
    --C ${{search_space.C}}
  code: src
  environment: azureml:AzureML-sklearn-1.1@latest
inputs:
  A: 0.5
sampling_algorithm: random
search_space:
  B:
    type: choice
    values: ["hello", "world", "hello_world"]
  C:
    type: uniform
    min_value: 0.1
    max_value: 1.0
objective:
  goal: minimize
  primary_metric: random_metric
limits:
  max_total_trials: 4
  max_concurrent_trials: 2
  timeout: 3600
display_name: hello-sweep-example
experiment_name: hello-sweep-example
description: Hello sweep job example.

YAML: basic Python model hyperparameter tuning

$schema: https://azuremlschemas.azureedge.net/latest/sweepJob.schema.json
type: sweep
trial:
  code: src
  command: >-
    python main.py
    --iris-csv ${{inputs.iris_csv}}
    --C ${{search_space.C}}
    --kernel ${{search_space.kernel}}
    --coef0 ${{search_space.coef0}}
  environment: azureml:AzureML-sklearn-1.1@latest
inputs:
  iris_csv:
    type: uri_file
    path: wasbs://datasets@azuremlexamples.blob.core.windows.net/iris.csv
compute: azureml:cpu-cluster
sampling_algorithm: random
search_space:
  C:
    type: uniform
    min_value: 0.5
    max_value: 0.9
  kernel:
    type: choice
    values: ["rbf", "linear", "poly"]
  coef0:
    type: uniform
    min_value: 0.1
    max_value: 1
objective:
  goal: minimize
  primary_metric: training_f1_score
limits:
  max_total_trials: 20
  max_concurrent_trials: 10
  timeout: 7200
display_name: sklearn-iris-sweep-example
experiment_name: sklearn-iris-sweep-example
description: Sweep hyperparameters for training a scikit-learn SVM on the Iris dataset.

Next steps

Install and use the CLI (v2)
