CLI (v2) Automated ML image multi-label classification job YAML schema

Article
04/04/2023

The source JSON schema can be found at https://azuremlsdk2.blob.core.windows.net/preview/0.0.1/autoMLImageClassificationMultilabelJob.schema.json.

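Note

The YAML syntax detailed in this document is based on the JSON schema for the latest version of the ML CLI v2 extension. This syntax is guaranteed to work only with the latest version of the ML CLI v2 extension. You can find the schemas for older extension versions at https://azuremlschemasprod.azureedge.net/.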

YAML syntax

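For information on all keys in the YAML syntax, see the YAML syntax of the image classification task. This article describes only the keys whose values differ from those specified for the image classification task.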

Key	Type	Description	Allowed values	Default value
`task`	const	Required. The type of AutoML task.	`image_classification_multilabel`	`image_classification_multilabel`
`primary_metric`	string	The metric that AutoML optimizes for model selection.	`iou`	`iou`
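
To make the `iou` (intersection over union) metric concrete: for one image, it is the number of labels present in both the true and the predicted label sets, divided by the number of labels present in either set. The following is a minimal sketch in plain Python. The function names, the sample labels, and the choice to average the per-image scores are assumptions for illustration; AutoML's own computation may differ in detail.

```python
def label_iou(true_labels, predicted_labels):
    """Intersection over union of two label collections for a single image."""
    true_set, predicted_set = set(true_labels), set(predicted_labels)
    union = true_set | predicted_set
    if not union:
        # Assumption: two empty label sets count as a perfect match.
        return 1.0
    return len(true_set & predicted_set) / len(union)


def mean_iou(true_batch, predicted_batch):
    """Average the per-image IoU over a batch (averaging strategy is an assumption)."""
    if not true_batch or len(true_batch) != len(predicted_batch):
        raise ValueError("batches must be non-empty and of equal length")
    scores = [label_iou(t, p) for t, p in zip(true_batch, predicted_batch)]
    return sum(scores) / len(scores)


# Hypothetical labels for two images.
truth = [["can", "carton"], ["water_bottle"]]
predictions = [["can"], ["water_bottle", "milk_bottle"]]
print(mean_iou(truth, predictions))  # prints 0.5
```

Each image above shares one label out of two distinct labels overall, so both per-image scores are 0.5.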

Remarks

You can use the az ml job command to manage Azure Machine Learning jobs.
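
As a rough illustration of how `az ml job` might be driven from a script, the sketch below assembles the argument list for a subcommand without running it. The helper name `az_ml_job_command` and the values `my-rg`, `my-workspace`, and `job.yml` are hypothetical placeholders; the flags `--resource-group`, `--workspace-name`, `--file`, and `--name` are standard options of the `az ml job` command group.

```python
import shlex


def az_ml_job_command(action, resource_group, workspace, *, file=None, name=None):
    """Build the argument list for an `az ml job` subcommand without executing it."""
    command = [
        "az", "ml", "job", action,
        "--resource-group", resource_group,
        "--workspace-name", workspace,
    ]
    if file is not None:
        command += ["--file", file]  # path to the job YAML file
    if name is not None:
        command += ["--name", name]  # name of an existing or new job
    return command


# Hypothetical resource group, workspace, and file names.
create = az_ml_job_command("create", "my-rg", "my-workspace", file="job.yml")
print(shlex.join(create))
```

With the Azure CLI and the ml extension installed, the resulting list could be passed to `subprocess.run(create, check=True)`; the same helper covers actions such as `show`, `stream`, and `cancel` when given `name`.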

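Examples

Examples are available in the examples GitHub repository. The examples relevant to image multi-label classification jobs are shown below.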

YAML: AutoML image multi-label classification job

$schema: https://azuremlsdk2.blob.core.windows.net/preview/0.0.1/autoMLJob.schema.json
type: automl

experiment_name: dpv2-cli-automl-image-classification-multilabel-experiment
description: A multi-label Image classification job using fridge items dataset

compute: azureml:gpu-cluster

task: image_classification_multilabel
log_verbosity: debug
primary_metric: iou

target_column_name: label
training_data:
  # Update the path, if prepare_data.py is using data_path other than "./data"
  path: data/training-mltable-folder
  type: mltable
validation_data:
  # Update the path, if prepare_data.py is using data_path other than "./data"
  path: data/validation-mltable-folder
  type: mltable

limits:
  timeout_minutes: 60
  max_trials: 10
  max_concurrent_trials: 2

training_parameters:
  early_stopping: True
  evaluation_frequency: 1

sweep:
  sampling_algorithm: random
  early_termination:
    type: bandit
    evaluation_interval: 2
    slack_factor: 0.2
    delay_evaluation: 6

search_space:
  - model_name:
      type: choice
      values: [vitb16r224]
    learning_rate:
      type: uniform
      min_value: 0.005
      max_value: 0.05
    number_of_epochs:
      type: choice
      values: [15, 30]
    gradient_accumulation_step:
      type: choice
      values: [1, 2]

  - model_name:
      type: choice
      values: [seresnext]
    learning_rate:
      type: uniform
      min_value: 0.005
      max_value: 0.05
    validation_resize_size:
      type: choice
      values: [288, 320, 352]
    validation_crop_size:
      type: choice
      values: [224, 256]
    training_crop_size:
      type: choice
      values: [224, 256]
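
The `early_termination` settings in the job above can be read as a simple rule. A bandit policy stops a trial whose primary metric falls outside the slack allowed relative to the best trial so far, checks only at multiples of `evaluation_interval`, and does not check before `delay_evaluation` intervals have passed. The sketch below illustrates that rule for a metric that is maximized, such as `iou`. The function name and the sample metric values are hypothetical, and this is an illustration of the policy, not the service's implementation.

```python
def bandit_should_terminate(trial_metric, best_metric, interval, *,
                            slack_factor=0.2, evaluation_interval=2,
                            delay_evaluation=6):
    """Return True if a trial would be stopped at the given interval.

    Assumes the primary metric is maximized (higher is better).
    """
    if interval < delay_evaluation:
        return False  # the policy is not applied yet
    if interval % evaluation_interval != 0:
        return False  # the policy is applied only every evaluation_interval
    threshold = best_metric / (1 + slack_factor)
    return trial_metric < threshold


# With a best metric of 0.8 the threshold is 0.8 / 1.2, roughly 0.667.
print(bandit_should_terminate(0.60, 0.8, interval=6))  # prints True
print(bandit_should_terminate(0.70, 0.8, interval=6))  # prints False
print(bandit_should_terminate(0.60, 0.8, interval=4))  # prints False
```

The defaults mirror the values in the YAML above (`slack_factor: 0.2`, `evaluation_interval: 2`, `delay_evaluation: 6`).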

YAML: AutoML image multi-label classification pipeline job

$schema: https://azuremlschemas.azureedge.net/latest/pipelineJob.schema.json
type: pipeline

description: Pipeline using AutoML Image Multilabel Classification task

display_name: pipeline-with-image-classification-multilabel
experiment_name: pipeline-with-automl

settings:
  default_compute: azureml:gpu-cluster

inputs:
  image_multilabel_classification_training_data:
    type: mltable
    # Update the path, if prepare_data.py is using data_path other than "./data"
    path: data/training-mltable-folder
  image_multilabel_classification_validation_data:
    type: mltable
    # Update the path, if prepare_data.py is using data_path other than "./data"
    path: data/validation-mltable-folder

jobs:
  image_multilabel_classification_node:
    type: automl
    task: image_classification_multilabel
    log_verbosity: info
    primary_metric: iou
    limits:
      timeout_minutes: 180
      max_trials: 10
      max_concurrent_trials: 2
    target_column_name: label
    training_data: ${{parent.inputs.image_multilabel_classification_training_data}}
    validation_data: ${{parent.inputs.image_multilabel_classification_validation_data}}
    training_parameters:
      early_stopping: True
      evaluation_frequency: 1
    sweep:
      sampling_algorithm: random
      early_termination:
        type: bandit
        evaluation_interval: 2
        slack_factor: 0.2
        delay_evaluation: 6
    search_space:
      - model_name:
          type: choice
          values: [vitb16r224]
        learning_rate:
          type: uniform
          min_value: 0.005
          max_value: 0.05
        number_of_epochs:
          type: choice
          values: [15, 30]
        gradient_accumulation_step:
          type: choice
          values: [1, 2]

      - model_name:
          type: choice
          values: [seresnext]
        learning_rate:
          type: uniform
          min_value: 0.005
          max_value: 0.05
        validation_resize_size:
          type: choice
          values: [288, 320, 352]
        validation_crop_size:
          type: choice
          values: [224, 256]
        training_crop_size:
          type: choice
          values: [224, 256]

    # currently need to specify outputs "mlflow_model" explicitly to reference it in following nodes
    outputs:
      best_model:
        type: mlflow_model
  register_model_node:
    type: command
    component: file:./components/component_register_model.yaml
    inputs:
      model_input_path: ${{parent.jobs.image_multilabel_classification_node.outputs.best_model}}
      model_base_name: fridge_items_multilabel_classification_model
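
The `${{parent. ... }}` expressions in the pipeline above connect a node to pipeline-level inputs or to the outputs of another node. To show how such an expression breaks down, the sketch below parses one into its scope and path segments. The regular expression and the `parse_binding` helper are assumptions made for illustration only; they are not part of the CLI and cover only the two forms used in this example.

```python
import re

# Matches expressions such as ${{parent.inputs.x}} or ${{parent.jobs.node.outputs.y}}.
BINDING = re.compile(r"^\$\{\{\s*parent\.(inputs|jobs)\.([\w.]+)\s*\}\}$")


def parse_binding(expression):
    """Split a pipeline binding expression into (scope, path segments)."""
    match = BINDING.match(expression)
    if match is None:
        raise ValueError(f"not a recognized binding expression: {expression!r}")
    scope, path = match.groups()
    return scope, path.split(".")


print(parse_binding("${{parent.inputs.image_multilabel_classification_training_data}}"))
print(parse_binding(
    "${{parent.jobs.image_multilabel_classification_node.outputs.best_model}}"
))
```

The first expression refers to a pipeline input, while the second refers to the `best_model` output of the AutoML node, which is how `register_model_node` receives the trained model.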

Next steps

Install and use the CLI (v2)
