Override job task settings in Databricks Asset Bundles

Article
04/24/2024

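This article describes how to override the settings for Azure Databricks job tasks in Databricks Asset Bundles. See What are Databricks Asset Bundles?

In Azure Databricks bundle configuration files, you can use the task mapping within a job definition to join the job task settings in a top-level resources mapping with the job task settings in a targets mapping, for example (ellipses indicate omitted content, for brevity):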

# ...
resources:
  jobs:
    <some-unique-programmatic-identifier-for-this-job>:
      # ...
      tasks:
        - task_key: <some-unique-programmatic-identifier-for-this-task>
          # Task settings.

targets:
  <some-unique-programmatic-identifier-for-this-target>:
    resources:
      jobs:
        <the-matching-programmatic-identifier-for-this-job>:
          # ...
          tasks:
            - task_key: <the-matching-programmatic-identifier-for-this-key>
              # Any more task settings to join with the settings from the
              # resources mapping for the matching top-level task_key.
          # ...

To join the top-level resources mapping and the targets mapping for the same task, set task_key to the same value in both task mappings.

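If a job task setting is defined both in the top-level resources mapping and in the targets mapping for the same task, the setting in the targets mapping takes precedence over the setting in the top-level resources mapping.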
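The join and precedence rules above can be modeled with a short Python sketch. This is an illustration only, not the Databricks CLI's implementation: the helper names `deep_merge` and `merge_tasks` are hypothetical, and the assumption that nested mappings such as new_cluster are merged key by key is inferred from the validated output shown in this article's examples.

```python
from copy import deepcopy


def deep_merge(base, override):
    """Return a new dict where override's values win and nested dicts merge recursively."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def merge_tasks(base_tasks, target_tasks):
    """Join two task lists on task_key; settings from target_tasks take precedence."""
    merged = {task["task_key"]: deepcopy(task) for task in base_tasks}
    for task in target_tasks:
        key = task["task_key"]
        # A task_key that exists only in the target is kept as-is (assumed behavior).
        merged[key] = deep_merge(merged.get(key, {}), task)
    return list(merged.values())


# Hypothetical inputs: one task in the top-level mapping, overridden by a target.
top_level = [{
    "task_key": "my-task",
    "new_cluster": {
        "spark_version": "13.3.x-scala2.12",
        "node_type_id": "Standard_DS3_v2",
        "num_workers": 1,
    },
}]
target = [{
    "task_key": "my-task",
    "new_cluster": {"spark_version": "12.2.x-scala2.12", "num_workers": 2},
}]

result = merge_tasks(top_level, target)
print(result[0]["new_cluster"])
# {'spark_version': '12.2.x-scala2.12', 'node_type_id': 'Standard_DS3_v2', 'num_workers': 2}
```

In this model, node_type_id survives from the top-level mapping because the target does not set it, while spark_version and num_workers come from the target.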

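Example 1: Job task settings defined in multiple resource mappings and with no settings conflicts

In this example, spark_version in the top-level resources mapping is combined with node_type_id and num_workers in the resources mapping in targets to define the settings for the task_key named my-task (ellipses indicate omitted content, for brevity):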

# ...
resources:
  jobs:
    my-job:
      name: my-job
      tasks:
        - task_key: my-task
          new_cluster:
            spark_version: 13.3.x-scala2.12

targets:
  development:
    resources:
      jobs:
        my-job:
          name: my-job
          tasks:
            - task_key: my-task
              new_cluster:
                node_type_id: Standard_DS3_v2
                num_workers: 1
          # ...

If you run databricks bundle validate for this example, the resulting graph is as follows (ellipses indicate omitted content, for brevity):

{
  "...": "...",
  "resources": {
    "jobs": {
      "my-job": {
        "tasks": [
          {
            "new_cluster": {
              "node_type_id": "Standard_DS3_v2",
              "num_workers": 1,
              "spark_version": "13.3.x-scala2.12"
            },
            "task_key": "my-task"
          }
        ],
        "...": "..."
      }
    }
  }
}

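Example 2: Conflicting job task settings defined in multiple resource mappings

In this example, spark_version and num_workers are defined both in the top-level resources mapping and in the resources mapping in targets. The spark_version and num_workers values in the resources mapping in targets take precedence over the spark_version and num_workers values in the top-level resources mapping. Together, these define the settings for the task_key named my-task (ellipses indicate omitted content, for brevity):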

# ...
resources:
  jobs:
    my-job:
      name: my-job
      tasks:
        - task_key: my-task
          new_cluster:
            spark_version: 13.3.x-scala2.12
            node_type_id: Standard_DS3_v2
            num_workers: 1

targets:
  development:
    resources:
      jobs:
        my-job:
          name: my-job
          tasks:
            - task_key: my-task
              new_cluster:
                spark_version: 12.2.x-scala2.12
                num_workers: 2
          # ...

If you run databricks bundle validate for this example, the resulting graph is as follows (ellipses indicate omitted content, for brevity):

{
  "...": "...",
  "resources": {
    "jobs": {
      "my-job": {
        "tasks": [
          {
            "new_cluster": {
              "node_type_id": "Standard_DS3_v2",
              "num_workers": 2,
              "spark_version": "12.2.x-scala2.12"
            },
            "task_key": "my-task"
          }
        ],
        "...": "..."
      }
    }
  }
}
