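Declarative Automation Bundles configuration

This article describes the syntax for bundle configuration files, which define Declarative Automation Bundles (formerly known as Databricks Asset Bundles). See What are Declarative Automation Bundles?

For the bundle configuration reference, see Configuration reference.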

databricks.yml

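A bundle must contain one (and only one) configuration file, named databricks.yml, at the root of the bundle project folder. databricks.yml is the main configuration file that defines a bundle, but it can reference other configuration files, such as resource configuration files, in the include mapping. Bundle configuration is expressed in YAML. For more information about YAML, see the official YAML specification.

The simplest databricks.yml defines the bundle name, which sits within the required top-level bundle mapping, and a target deployment.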

bundle:
  name: my_bundle

targets:
  dev:
    default: true

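For details on all top-level mappings, see Configuration reference.

Tip

Python support for Declarative Automation Bundles enables you to define resources in Python. See Bundle configuration in Python.

Specification

The following YAML specification lists the top-level configuration keys for Declarative Automation Bundles. For the complete configuration reference, see Configuration reference and Declarative Automation Bundles resources.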

# This is the default bundle configuration if not otherwise overridden in
# the "targets" top-level mapping.
bundle: # Required.
  name: string # Required.
  databricks_cli_version: string
  cluster_id: string
  deployment: Map
  git:
    origin_url: string
    branch: string

# This is the identity to use to run the bundle
run_as:
  - user_name: <user-name>
  - service_principal_name: <service-principal-name>

# These are any additional configuration files to include.
include:
  - '<some-file-or-path-glob-to-include>'
  - '<another-file-or-path-glob-to-include>'

# These are any scripts that can be run.
scripts:
  <some-unique-script-name>:
    content: string

# These are any additional files or paths to include or exclude.
sync:
  include:
    - '<some-file-or-path-glob-to-include>'
    - '<another-file-or-path-glob-to-include>'
  exclude:
    - '<some-file-or-path-glob-to-exclude>'
    - '<another-file-or-path-glob-to-exclude>'
  paths:
    - '<some-file-or-path-to-synchronize>'

# These are the default artifact settings if not otherwise overridden in
# the targets top-level mapping.
artifacts:
  <some-unique-artifact-identifier>:
    build: string
    dynamic_version: boolean
    executable: string
    files:
      - source: string
    path: string
    type: string

# These are for any custom variables for use throughout the bundle.
variables:
  <some-unique-variable-name>:
    description: string
    default: string or complex
    lookup: Map
    type: string # The only valid value is "complex" if the variable is a complex variable, otherwise do not define this key.

# These are the workspace settings if not otherwise overridden in
# the targets top-level mapping.
workspace:
  artifact_path: string
  host: string
  profile: string
  resource_path: string
  root_path: string
  state_path: string

# These are the permissions to apply to resources defined
# in the resources mapping.
permissions:
  - level: <permission-level>
    group_name: <unique-group-name>
  - level: <permission-level>
    user_name: <unique-user-name>
  - level: <permission-level>
    service_principal_name: <unique-principal-name>

# These are the resource settings if not otherwise overridden in
# the targets top-level mapping.
resources:
  alerts:
    <unique-alert-name>:
      # alert settings
  apps:
    <unique-app-name>:
      # app settings
  catalogs:
    <unique-catalog-name>:
      # catalog settings
  clusters:
    <unique-cluster-name>:
      # cluster settings
  dashboards:
    <unique-dashboard-name>:
      # dashboard settings
  database_catalogs:
    <unique-database-catalog-name>:
      # database catalog settings
  database_instances:
    <unique-database-instance-name>:
      # database instance settings
  experiments:
    <unique-experiment-name>:
      # experiment settings
  jobs:
    <unique-job-name>:
      # job settings
  model_serving_endpoints:
    <unique-model-serving-endpoint-name>:
      # model_serving_endpoint settings
  pipelines:
    <unique-pipeline-name>:
      # pipeline settings
  postgres_branches:
    <unique-postgres-branch-name>:
      # postgres branch settings
  postgres_endpoints:
    <unique-postgres-endpoint-name>:
      # postgres endpoint settings
  postgres_projects:
    <unique-postgres-project-name>:
      # postgres project settings
  quality_monitors:
    <unique-quality-monitor-name>:
      # quality monitor settings
  registered_models:
    <unique-registered-model-name>:
      # registered model settings
  schemas:
    <unique-schema-name>:
      # schema settings
  secret_scopes:
    <unique-secret-scope-name>:
      # secret scopes settings
  sql_warehouses:
    <unique-sql-warehouse-name>:
      # sql warehouse settings
  synced_database_tables:
    <unique-synced-database-table-name>:
      # synced database table settings
  volumes:
    <unique-volume-name>:
      # volumes settings

# These are the targets to use for deployments and workflow runs. One and only one of these
# targets can be set to "default: true".
targets:
  <some-unique-programmatic-identifier-for-this-target>:
    artifacts:
      # artifact build settings for this target
    bundle:
      # bundle settings for this target
    default: boolean
    git: Map
    mode: string
    permissions:
      # permissions for this target
    presets:
      <preset>: <value>
    resources:
      # resource settings for this target
    sync:
      # sync settings for this target
    variables:
      <defined-variable-name>: <non-default-value> # value for this target
    workspace:
      # workspace settings for this target
    run_as:
      # run_as settings for this target
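
As an illustration of how a few of these keys fit together, the following sketch defines a custom variable, references it with ${var.<name>} substitution, and overrides its value for one target. The bundle name, the variable name my_cluster_id, the cluster IDs, and the CAN_VIEW permission level are placeholders chosen for this sketch, not values prescribed by this article.

```yaml
# A minimal sketch; all names and IDs below are hypothetical placeholders.
bundle:
  name: variables-sketch

variables:
  my_cluster_id:
    description: ID of the existing cluster that runs the job.
    default: 1234-567890-abcde123

# Assumed permission level; replace the group placeholder with a real group.
permissions:
  - level: CAN_VIEW
    group_name: <unique-group-name>

resources:
  jobs:
    hello-job:
      name: hello-job
      tasks:
        - task_key: hello-task
          # Resolved from the variable definition or a target override.
          existing_cluster_id: ${var.my_cluster_id}
          notebook_task:
            notebook_path: ./hello.py

targets:
  dev:
    default: true
  prod:
    variables:
      # Non-default value used only when the prod target is selected.
      my_cluster_id: 2345-678901-fabcd456
```

With this layout, the dev target would use the variable's default value, while prod would substitute its own cluster ID without repeating the job definition.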

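Examples

This section contains some basic examples to help you understand how bundles work and how to structure the configuration.

Note

For configuration examples that demonstrate bundle features and common bundle use cases, see Bundle configuration examples and the bundle examples repository in GitHub.

The following example bundle configuration specifies a local file named hello.py that is in the same directory as the bundle configuration file databricks.yml. It runs this notebook as a job using the remote cluster with the specified cluster ID. The remote workspace URL and workspace authentication credentials are read from the caller's local configuration profile named DEFAULT.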

bundle:
  name: hello-bundle

resources:
  jobs:
    hello-job:
      name: hello-job
      tasks:
        - task_key: hello-task
          existing_cluster_id: 1234-567890-abcde123
          notebook_task:
            notebook_path: ./hello.py

targets:
  dev:
    default: true

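The following example adds a target named prod that uses a different remote workspace URL and workspace authentication credentials, which are read from the entry in the caller's .databrickscfg file whose host matches the specified workspace URL. This job runs the same notebook but uses a different remote cluster with the specified cluster ID.

Note

Databricks recommends using the host mapping instead of the default mapping wherever possible, because this makes your bundle configuration files more portable. Setting the host mapping instructs the Databricks CLI to find a matching profile in your .databrickscfg file and then use that profile's fields to determine which Databricks authentication type to use. If multiple profiles with a matching host field exist, you must use the --profile option on bundle commands to specify the profile to use.

Notice that you do not need to declare the notebook_task mapping within the prod mapping. If the notebook_task mapping is not explicitly overridden within the prod mapping, it falls back to the notebook_task mapping within the top-level resources mapping.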

bundle:
  name: hello-bundle

resources:
  jobs:
    hello-job:
      name: hello-job
      tasks:
        - task_key: hello-task
          existing_cluster_id: 1234-567890-abcde123
          notebook_task:
            notebook_path: ./hello.py

targets:
  dev:
    default: true
  prod:
    workspace:
      host: https://<production-workspace-url>
    resources:
      jobs:
        hello-job:
          name: hello-job
          tasks:
            - task_key: hello-task
              existing_cluster_id: 2345-678901-fabcd456
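
To make the fallback behavior concrete, the following sketch shows roughly what the hello-job settings resolve to when the prod target is selected. It assumes that task overrides in a target are joined to the top-level tasks by task_key. It illustrates the merge result and is not output copied from the Databricks CLI.

```yaml
# Approximate effective settings for the prod target (illustrative only).
resources:
  jobs:
    hello-job:
      name: hello-job
      tasks:
        - task_key: hello-task
          # Overridden by the prod target.
          existing_cluster_id: 2345-678901-fabcd456
          # Inherited from the top-level resources mapping.
          notebook_task:
            notebook_path: ./hello.py
```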

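Use the following bundle commands to validate, deploy, and run this job within the dev target. For details about the lifecycle of a bundle, see Develop Declarative Automation Bundles.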

# Because the "dev" target is set to "default: true",
# you do not need to specify "-t dev":
databricks bundle validate
databricks bundle deploy
databricks bundle run hello-job

# But you can still explicitly specify it, if you want or need to:
databricks bundle validate -t dev
databricks bundle deploy -t dev
databricks bundle run -t dev hello-job

To validate, deploy, and run this job within the prod target instead:

# You must specify "-t prod", because the "dev" target
# is already set to "default: true":
databricks bundle validate -t prod
databricks bundle deploy -t prod
databricks bundle run -t prod hello-job
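
If several profiles in .databrickscfg share the same host, one alternative to passing --profile on every command is to set the profile key of the workspace mapping in the target. This is only a sketch: the profile name PROD is a hypothetical entry in the caller's .databrickscfg, and pinning a profile name trades away some of the portability that the host-only form provides.

```yaml
targets:
  prod:
    workspace:
      host: https://<production-workspace-url>
      # Hypothetical profile name; it must exist in the caller's .databrickscfg.
      profile: PROD
```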

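To make the configuration more modular and to better reuse definitions and settings across bundles, split the bundle configuration into separate files: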

# databricks.yml

bundle:
  name: hello-bundle

include:
  - '*.yml'

# hello-job.yml

resources:
  jobs:
    hello-job:
      name: hello-job
      tasks:
        - task_key: hello-task
          existing_cluster_id: 1234-567890-abcde123
          notebook_task:
            notebook_path: ./hello.py

# targets.yml

targets:
  dev:
    default: true
  prod:
    workspace:
      host: https://<production-workspace-url>
    resources:
      jobs:
        hello-job:
          name: hello-job
          tasks:
            - task_key: hello-task
              existing_cluster_id: 2345-678901-fabcd456
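
A possible variation is to keep resource files in a subfolder and narrow the include globs so that only the intended files are picked up. The resources/ folder name is an assumption made for this sketch. This article does not prescribe a folder layout.

```yaml
# databricks.yml (sketch; the resources/ folder name is an assumption)

bundle:
  name: hello-bundle

include:
  # Picks up files such as resources/hello-job.yml.
  - 'resources/*.yml'
  # Target definitions stay next to databricks.yml.
  - 'targets.yml'
```

If hello-job.yml moves into the subfolder, relative paths such as notebook_path may need to be adjusted, on the assumption that they are resolved relative to the file that declares them.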

Last updated on 2026-03-21