This article describes the syntax for the Databricks Asset Bundle configuration file, which defines a Databricks Asset Bundle. See What are Databricks Asset Bundles?.
To create and work with bundles, see Develop Databricks Asset Bundles.
For the bundle configuration reference, see Configuration reference.
databricks.yml
A bundle must contain one (and only one) configuration file named databricks.yml at the root of the bundle project folder.
databricks.yml is the main configuration file that defines a bundle, but it can reference other configuration files, such as resource configuration files, in the include mapping. Bundle configuration is expressed in YAML. For more information about YAML, see the official YAML specification.
The simplest databricks.yml defines the bundle name, within the required top-level bundle mapping, and a target deployment.
bundle:
  name: my_bundle

targets:
  dev:
    default: true
For details about all of the top-level mappings, see Configuration reference.
Tip
Python support for Databricks Asset Bundles allows you to define resources in Python. See Bundle configuration in Python.
Specification
The following YAML specification provides the top-level configuration keys for Databricks Asset Bundles. For the configuration reference, see Configuration reference.
# This is the default bundle configuration if not otherwise overridden in
# the "targets" top-level mapping.
bundle: # Required.
  name: string # Required.
  databricks_cli_version: string
  cluster_id: string
  deployment: Map
  git:
    origin_url: string
    branch: string

# This is the identity to use to run the bundle
run_as:
  - user_name: <user-name>
  - service_principal_name: <service-principal-name>

# These are any additional configuration files to include.
include:
  - '<some-file-or-path-glob-to-include>'
  - '<another-file-or-path-glob-to-include>'

# These are any scripts that can be run.
scripts:
  <some-unique-script-name>:
    content: string

# These are any additional files or paths to include or exclude.
sync:
  include:
    - '<some-file-or-path-glob-to-include>'
    - '<another-file-or-path-glob-to-include>'
  exclude:
    - '<some-file-or-path-glob-to-exclude>'
    - '<another-file-or-path-glob-to-exclude>'
  paths:
    - '<some-file-or-path-to-synchronize>'

# These are the default artifact settings if not otherwise overridden in
# the targets top-level mapping.
artifacts:
  <some-unique-artifact-identifier>:
    build: string
    dynamic_version: boolean
    executable: string
    files:
      - source: string
    path: string
    type: string

# These are for any custom variables for use throughout the bundle.
variables:
  <some-unique-variable-name>:
    description: string
    default: string or complex
    lookup: Map
    type: string # The only valid value is "complex" if the variable is a complex variable, otherwise do not define this key.

# These are the default workspace settings if not otherwise overridden in
# the targets top-level mapping.
workspace:
  artifact_path: string
  auth_type: string
  azure_client_id: string # For Azure Databricks only.
  azure_environment: string # For Azure Databricks only.
  azure_login_app_id: string # For Azure Databricks only. Reserved for future use.
  azure_tenant_id: string # For Azure Databricks only.
  azure_use_msi: true | false # For Azure Databricks only.
  azure_workspace_resource_id: string # For Azure Databricks only.
  client_id: string # For Databricks on AWS only.
  file_path: string
  google_service_account: string # For Databricks on Google Cloud only.
  host: string
  profile: string
  resource_path: string
  root_path: string
  state_path: string

# These are the permissions to apply to resources defined
# in the resources mapping.
permissions:
  - level: <permission-level>
    group_name: <unique-group-name>
  - level: <permission-level>
    user_name: <unique-user-name>
  - level: <permission-level>
    service_principal_name: <unique-principal-name>

# These are the resource settings if not otherwise overridden in
# the targets top-level mapping.
resources:
  apps:
    <unique-app-name>:
      # See the REST API create request payload reference for apps.
  clusters:
    <unique-cluster-name>:
      # See the REST API create request payload reference for clusters.
  dashboards:
    <unique-dashboard-name>:
      # See the REST API create request payload reference for dashboards.
  experiments:
    <unique-experiment-name>:
      # See the REST API create request payload reference for experiments.
  jobs:
    <unique-job-name>:
      # See the REST API create request payload reference for jobs.
  model_serving_endpoints:
    <unique-model-serving-endpoint-name>:
      # See the model serving endpoint request payload reference.
  models:
    <unique-model-name>:
      # See the REST API create request payload reference for models (legacy).
  pipelines:
    <unique-pipeline-name>:
      # See the REST API create request payload reference for pipelines.
  quality_monitors:
    <unique-quality-monitor-name>:
      # See the quality monitor request payload reference.
  registered_models:
    <unique-registered-model-name>:
      # See the registered model request payload reference.
  schemas:
    <unique-schema-name>:
      # See the Unity Catalog schema request payload reference.
  secret_scopes:
    <unique-secret-scope-name>:
      # See the secret scope request payload reference.
  volumes:
    <unique-volume-name>:
      # See the Unity Catalog volume request payload reference.

# These are the targets to use for deployments and workflow runs. One and only one of these
# targets can be set to "default: true".
targets:
  <some-unique-programmatic-identifier-for-this-target>:
    artifacts:
      # See the preceding "artifacts" syntax.
    bundle:
      # See the preceding "bundle" syntax.
    default: boolean
    git: Map
    mode: string
    permissions:
      # See the preceding "permissions" syntax.
    presets:
      <preset>: <value>
    resources:
      # See the preceding "resources" syntax.
    sync:
      # See the preceding "sync" syntax.
    variables:
      <preceding-unique-variable-name>: <non-default-value>
    workspace:
      # See the preceding "workspace" syntax.
    run_as:
      # See the preceding "run_as" syntax.
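For example, a custom variable declared in the variables mapping can be referenced elsewhere in the configuration using ${var.<variable-name>} substitution and overridden per target. The following minimal sketch illustrates this; the bundle name, the variable name my_cluster_id, and the job shown are illustrative placeholders, not part of the specification above:

bundle:
  name: variable-example-bundle

variables:
  my_cluster_id:
    description: The ID of the existing cluster to run the job on.
    default: 1234-567890-abcde123

resources:
  jobs:
    my-job:
      name: my-job
      tasks:
        - task_key: my-task
          # Resolved from the variable's default, or from a target override.
          existing_cluster_id: ${var.my_cluster_id}
          notebook_task:
            notebook_path: ./my_notebook.py

targets:
  dev:
    default: true
  prod:
    variables:
      my_cluster_id: 2345-678901-fabcd456

Deploying with -t prod substitutes the prod value for ${var.my_cluster_id}; deploying to dev uses the default.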
Examples
This section contains some basic examples to help you understand how bundles work and how to structure your configuration.
Note
For configuration examples that demonstrate bundle features and common bundle use cases, see Bundle configuration examples and the bundle examples repository on GitHub.
The following example bundle configuration specifies a local file named hello.py that is located in the same directory as the bundle configuration file databricks.yml. It runs this notebook as a job using the remote cluster with the specified cluster ID. The remote workspace URL and workspace authentication credentials are read from the caller's local configuration profile named DEFAULT.
bundle:
  name: hello-bundle

resources:
  jobs:
    hello-job:
      name: hello-job
      tasks:
        - task_key: hello-task
          existing_cluster_id: 1234-567890-abcde123
          notebook_task:
            notebook_path: ./hello.py

targets:
  dev:
    default: true
The following example adds a target with the name prod that uses a different remote workspace URL and workspace authentication credentials, which are read from the host entry in the caller's .databrickscfg file that matches the specified workspace URL. This job runs the same notebook but uses a different remote cluster with the specified cluster ID.
Note
Databricks recommends that you use the host mapping instead of the default mapping wherever possible, as this makes your bundle configuration files more portable. Setting the host mapping instructs the Databricks CLI to find a matching profile in your .databrickscfg file and then use that profile's fields to determine which Databricks authentication type to use. If multiple profiles with a matching host field exist, you must use the --profile option on bundle commands to specify which profile to use.
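For example, suppose the caller's .databrickscfg file contains an entry whose host matches the prod workspace URL. The profile name prod-admin and the token placeholder below are illustrative only:

[prod-admin]
host  = https://<production-workspace-url>
token = <personal-access-token>

If more than one profile shares that host value, pass the --profile option on the bundle command to pick one explicitly:

databricks bundle deploy -t prod --profile prod-admin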
Note that you do not need to declare the notebook_task mapping within the prod mapping, because it falls back to using the notebook_task mapping within the top-level resources mapping if the notebook_task mapping is not explicitly overridden in the prod target's resources mapping.
bundle:
  name: hello-bundle

resources:
  jobs:
    hello-job:
      name: hello-job
      tasks:
        - task_key: hello-task
          existing_cluster_id: 1234-567890-abcde123
          notebook_task:
            notebook_path: ./hello.py

targets:
  dev:
    default: true
  prod:
    workspace:
      host: https://<production-workspace-url>
    resources:
      jobs:
        hello-job:
          name: hello-job
          tasks:
            - task_key: hello-task
              existing_cluster_id: 2345-678901-fabcd456
Use the following bundle commands to validate, deploy, and run this job within the dev target. For details about the bundle lifecycle, see Develop Databricks Asset Bundles.
# Because the "dev" target is set to "default: true",
# you do not need to specify "-t dev":
databricks bundle validate
databricks bundle deploy
databricks bundle run hello-job
# But you can still explicitly specify it, if you want or need to:
databricks bundle validate
databricks bundle deploy -t dev
databricks bundle run -t dev hello-job
To validate, deploy, and run this job within the prod target instead:
# You must specify "-t prod", because the "dev" target
# is already set to "default: true":
databricks bundle validate
databricks bundle deploy -t prod
databricks bundle run -t prod hello-job
For more modularization and better reuse of definitions and settings across bundles, split your bundle configuration into separate files:
# databricks.yml
bundle:
  name: hello-bundle

include:
  - '*.yml'

# hello-job.yml
resources:
  jobs:
    hello-job:
      name: hello-job
      tasks:
        - task_key: hello-task
          existing_cluster_id: 1234-567890-abcde123
          notebook_task:
            notebook_path: ./hello.py

# targets.yml
targets:
  dev:
    default: true
  prod:
    workspace:
      host: https://<production-workspace-url>
    resources:
      jobs:
        hello-job:
          name: hello-job
          tasks:
            - task_key: hello-task
              existing_cluster_id: 2345-678901-fabcd456
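Because the include glob in databricks.yml pulls hello-job.yml and targets.yml into the bundle, the lifecycle commands shown earlier work unchanged with the split configuration, for example:

databricks bundle validate
databricks bundle deploy -t prod
databricks bundle run -t prod hello-job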