pipelines command group

Note

This information applies to Databricks CLI versions 0.205 and above. The Databricks CLI is in Public Preview.

Databricks CLI use is subject to the Databricks License and Databricks Privacy Notice, including any Usage Data provisions.

The pipelines command group within the Databricks CLI contains two sets of functionality. The first set allows you to manage a pipeline project and its workflow. The second set allows you to create, edit, delete, start, and view details about pipeline objects in Databricks.

For information about pipelines, see Lakeflow Spark Declarative Pipelines.

Manage pipeline projects

The following commands allow you to manage pipelines in projects.

databricks pipelines deploy

Deploy pipelines by uploading all files defined in the project to the target workspace, then creating or updating the project's pipelines in that workspace.

databricks pipelines deploy [flags]

Arguments

None

Options

--auto-approve

    Skip interactive approvals that might be required for deployment

--fail-on-active-runs

    Fail if there are running pipelines in the deployment

--force-lock

    Force acquisition of deployment lock

Global flags
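
Examples

The following example deploys the pipelines defined in the current project to the target workspace, skipping interactive approval prompts:

databricks pipelines deploy --auto-approve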

databricks pipelines destroy

Destroy a pipelines project.

databricks pipelines destroy [flags]

Arguments

None

Options

--auto-approve

    Skip interactive approvals for deleting pipelines

--force-lock

    Force acquisition of deployment lock

Global flags
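
Examples

The following example destroys the pipelines project without prompting for confirmation:

databricks pipelines destroy --auto-approve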

databricks pipelines dry-run

Validate the correctness of the graph for the pipeline identified by KEY. This command doesn't materialize or publish any datasets.

databricks pipelines dry-run [flags] [KEY]

Arguments

KEY

    The unique name of the pipeline to dry run, as defined in its YAML file. If there's only one pipeline in the project, KEY is optional and the pipeline is auto-selected.

Options

--no-wait

    Don't wait for the run to complete

--restart

    Restart the run if it's already running

Global flags
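
Examples

The following example validates the graph of a pipeline with the example key my_pipeline without waiting for the run to complete:

databricks pipelines dry-run my_pipeline --no-wait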

databricks pipelines generate

Generate configuration for an existing Spark pipeline.

This command looks for a spark-pipeline.yml or *.spark-pipeline.yml file in the specified directory and generates a new *.pipeline.yml configuration file in the resources folder of the project that defines the pipeline. If multiple spark-pipeline.yml files exist, specify the full path to a specific *.spark-pipeline.yml file.

databricks pipelines generate [flags]

Note

To generate configuration for an existing pipeline in the Databricks workspace, see databricks bundle generate pipeline and Generate configuration for an existing job or pipeline using the Databricks CLI.

Options

--existing-pipeline-dir

    Path to the existing pipeline directory in src (e.g., src/my_pipeline).

--force

    Overwrite existing pipeline configuration file.

Global flags

Examples

The following example looks in the current directory and reads src/my_pipeline/spark-pipeline.yml, then creates a configuration file resources/my_pipeline.pipeline.yml that defines the pipeline:

databricks pipelines generate --existing-pipeline-dir src/my_pipeline

databricks pipelines history

Retrieve past runs for a pipeline identified by KEY.

databricks pipelines history [flags] [KEY]

Arguments

KEY

    The unique name of the pipeline, as defined in its YAML file. If there's only one pipeline in the project, KEY is optional and the pipeline is auto-selected.

Options

--end-time string

    Filter updates before this time (format: 2025-01-15T10:30:00Z)

--start-time string

    Filter updates after this time (format: 2025-01-15T10:30:00Z)

Global flags
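
Examples

The following example lists past runs of a pipeline with the example key my_pipeline that started within a given time window:

databricks pipelines history my_pipeline --start-time 2025-01-01T00:00:00Z --end-time 2025-01-31T23:59:59Z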

databricks pipelines init

Initialize a new pipelines project.

For a tutorial that walks through creating, deploying, and running a pipeline project using the Databricks CLI, see Develop Lakeflow Spark Declarative Pipelines with Databricks Asset Bundles.

databricks pipelines init [flags]

Arguments

None

Options

--config-file string

    JSON file containing key value pairs of input parameters required for template initialization

--output-dir string

    Directory to write the initialized template to

Global flags
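
Examples

The following example initializes a new pipelines project and writes it to a directory named my-pipelines-project (an example path):

databricks pipelines init --output-dir my-pipelines-project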

databricks pipelines logs

Retrieve events for the pipeline identified by KEY. By default, this command shows the events of the pipeline's most recent update.

databricks pipelines logs [flags] [KEY]

Arguments

KEY

    The unique name of the pipeline, as defined in its YAML file. If there's only one pipeline in the project, KEY is optional and the pipeline is auto-selected.

Options

--end-time string

    Filter for events that are before this end time (format: 2025-01-15T10:30:00Z)

--event-type strings

    Filter events by list of event types

--level strings

    Filter events by list of log levels (INFO, WARN, ERROR, METRICS)

-n, --number int

    Number of events to return

--start-time string

    Filter for events that are after this start time (format: 2025-01-15T10:30:00Z)

--update-id string

    Filter events by update ID. If not provided, uses the most recent update ID

Global flags

Examples

databricks pipelines logs pipeline-name --update-id update-1 -n 10
databricks pipelines logs pipeline-name --level ERROR,METRICS --event-type update_progress --start-time 2025-01-15T10:30:00Z

databricks pipelines open

Open a pipeline in the browser, identified by KEY.

databricks pipelines open [flags] [KEY]

Arguments

KEY

    The unique name of the pipeline to open, as defined in its YAML file. If there's only one pipeline in the project, KEY is optional and the pipeline is auto-selected.

Options

--force-pull

    Skip local cache and load the state from the remote workspace

Global flags
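
Examples

The following example opens a pipeline with the example key my_pipeline in the browser, loading its state from the remote workspace instead of the local cache:

databricks pipelines open my_pipeline --force-pull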

databricks pipelines run

Run the pipeline identified by KEY. Refreshes all tables in the pipeline unless otherwise specified.

databricks pipelines run [flags] [KEY]

Arguments

KEY

    The unique name of the pipeline to run, as defined in its YAML file. If there's only one pipeline in the project, KEY is optional and the pipeline is auto-selected.

Options

--full-refresh strings

    List of tables to reset and recompute

--full-refresh-all

    Perform a full graph reset and recompute

--no-wait

    Don't wait for the run to complete

--refresh strings

    List of tables to run

--restart

    Restart the run if it's already running

Global flags
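
Examples

The following examples run a pipeline with the example key my_pipeline. The first refreshes all tables; the second resets and recomputes two example tables:

databricks pipelines run my_pipeline
databricks pipelines run my_pipeline --full-refresh customers,orders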

databricks pipelines stop

Stop the pipeline identified by KEY or PIPELINE_ID if it's running. If there is no active update for the pipeline, this request is a no-op.

databricks pipelines stop [KEY|PIPELINE_ID] [flags]

Arguments

KEY

    The unique name of the pipeline to stop, as defined in its YAML file. If there's only one pipeline in the project, KEY is optional and the pipeline is auto-selected.

PIPELINE_ID

    The UUID of the pipeline to stop.

Options

--no-wait

    Do not wait to reach IDLE state

--timeout duration

    Maximum amount of time to reach IDLE state (default 20m0s)

Global flags
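
Examples

The following example stops a pipeline with the example key my_pipeline and waits up to 10 minutes for it to reach the IDLE state:

databricks pipelines stop my_pipeline --timeout 10m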

Manage pipeline objects

The following commands allow you to manage pipeline objects in Databricks.

databricks pipelines create

Create a new data processing pipeline based on the requested configuration. If successful, this command returns the ID of the new pipeline.

databricks pipelines create [flags]

Arguments

None

Options

--json JSON

    The inline JSON string or the @path to the JSON file with the request body.

Global flags
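
Examples

The following example creates a serverless pipeline from an inline JSON specification. The pipeline name, catalog, schema, and notebook path shown are illustrative; see the Pipelines API for the full set of request body fields:

databricks pipelines create --json '{
  "name": "my-pipeline",
  "catalog": "main",
  "schema": "my_schema",
  "serverless": true,
  "libraries": [
    { "notebook": { "path": "/Users/someone@example.com/my_pipeline_notebook" } }
  ]
}'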

databricks pipelines delete

Delete a pipeline.

databricks pipelines delete PIPELINE_ID [flags]

Arguments

PIPELINE_ID

    The pipeline to delete.

Options

Global flags
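
Examples

The following example deletes a pipeline by its UUID (shown here as a placeholder value):

databricks pipelines delete 1234abcd-5678-90ef-1234-567890abcdef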

databricks pipelines get

Get a pipeline.

databricks pipelines get PIPELINE_ID [flags]

Arguments

PIPELINE_ID

    The pipeline to get.

Options

Global flags

databricks pipelines get-update

Get an update from an active pipeline.

databricks pipelines get-update PIPELINE_ID UPDATE_ID [flags]

Arguments

PIPELINE_ID

    The ID of the pipeline.

UPDATE_ID

    The ID of the update.

Options

Global flags

databricks pipelines list-pipeline-events

Retrieve events for a pipeline.

databricks pipelines list-pipeline-events PIPELINE_ID [flags]

Arguments

PIPELINE_ID

    The pipeline to retrieve events for.

Options

--filter string

    Criteria to select a subset of results, expressed using a SQL-like syntax.

--max-results int

    Max number of entries to return in a single page.

--page-token string

    Page token returned by previous call.

Global flags
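
Examples

The following example retrieves up to 25 events for a pipeline identified by a placeholder UUID:

databricks pipelines list-pipeline-events 1234abcd-5678-90ef-1234-567890abcdef --max-results 25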

databricks pipelines list-pipelines

List pipelines defined in the Delta Live Tables system.

databricks pipelines list-pipelines [flags]

Arguments

None

Options

--filter string

    Select a subset of results based on the specified criteria.

--max-results int

    The maximum number of entries to return in a single page.

--page-token string

    Page token returned by previous call.

Global flags
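
Examples

The following example lists up to 50 pipelines and returns the results as JSON:

databricks pipelines list-pipelines --max-results 50 -o json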

databricks pipelines list-updates

List updates for an active pipeline.

databricks pipelines list-updates PIPELINE_ID [flags]

Arguments

PIPELINE_ID

    The pipeline to return updates for.

Options

--max-results int

    Max number of entries to return in a single page.

--page-token string

    Page token returned by previous call.

--until-update-id string

    If present, returns updates until and including this update_id.

Global flags
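
Examples

The following example lists up to 10 updates for a pipeline identified by a placeholder UUID:

databricks pipelines list-updates 1234abcd-5678-90ef-1234-567890abcdef --max-results 10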

databricks pipelines start-update

Start a new update for the pipeline. If there is already an active update for the pipeline, the request will fail and the active update will remain running.

databricks pipelines start-update PIPELINE_ID [flags]

Arguments

PIPELINE_ID

    The pipeline to start an update for.

Options

--cause StartUpdateCause

    Supported values: [API_CALL, JOB_TASK, RETRY_ON_FAILURE, SCHEMA_CHANGE, SERVICE_UPGRADE, USER_ACTION]

--full-refresh

    If true, this update will reset all tables before running.

--json JSON

    The inline JSON string or the @path to the JSON file with the request body.

--validate-only

    If true, this update only validates the correctness of pipeline source code but does not materialize or publish any datasets.

Global flags
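
Examples

The following examples start updates for a pipeline identified by a placeholder UUID. The first performs a full refresh; the second only validates the pipeline source code:

databricks pipelines start-update 1234abcd-5678-90ef-1234-567890abcdef --full-refresh
databricks pipelines start-update 1234abcd-5678-90ef-1234-567890abcdef --validate-only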

databricks pipelines update

Update a pipeline with the supplied configuration.

databricks pipelines update PIPELINE_ID [flags]

Arguments

PIPELINE_ID

    Unique identifier for this pipeline.

Options

--allow-duplicate-names

    If false, deployment will fail if the name has changed and it conflicts with the name of another pipeline.

--budget-policy-id string

    Budget policy of this pipeline.

--catalog string

    A catalog in Unity Catalog to publish data from this pipeline to.

--channel string

    Lakeflow Spark Declarative Pipelines release channel that specifies which version to use.

--continuous

    Whether the pipeline is continuous or triggered.

--development

    Whether the pipeline is in development mode.

--edition string

    Pipeline product edition.

--expected-last-modified int

    If present, the last-modified time of the pipeline settings before the edit.

--id string

    Unique identifier for this pipeline.

--json JSON

    The inline JSON string or the @path to the JSON file with the request body.

--name string

    Friendly identifier for this pipeline.

--photon

    Whether Photon is enabled for this pipeline.

--pipeline-id string

    Unique identifier for this pipeline.

--schema string

    The default schema (database) where tables are read from or published to.

--serverless

    Whether serverless compute is enabled for this pipeline.

--storage string

    DBFS root directory for storing checkpoints and tables.

--target string

    Target schema (database) to add tables in this pipeline to.

Global flags
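
Examples

The following example renames a pipeline and changes the catalog and schema it publishes to; the pipeline ID and values shown are illustrative:

databricks pipelines update 1234abcd-5678-90ef-1234-567890abcdef --name my-renamed-pipeline --catalog main --schema my_schema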

databricks pipelines get-permission-levels

Get pipeline permission levels.

databricks pipelines get-permission-levels PIPELINE_ID [flags]

Arguments

PIPELINE_ID

    The pipeline for which to get or manage permissions.

Options

Global flags

databricks pipelines get-permissions

Get the permissions of a pipeline. Pipelines can inherit permissions from their root object.

databricks pipelines get-permissions PIPELINE_ID [flags]

Arguments

PIPELINE_ID

    The pipeline for which to get or manage permissions.

Options

Global flags

databricks pipelines set-permissions

Set pipeline permissions.

Sets permissions on an object, replacing existing permissions if they exist. Deletes all direct permissions if none are specified. Objects can inherit permissions from their root object.

databricks pipelines set-permissions PIPELINE_ID [flags]

Arguments

PIPELINE_ID

    The pipeline for which to get or manage permissions.

Options

--json JSON

    The inline JSON string or the @path to the JSON file with the request body.

Global flags
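
Examples

The following example replaces the permissions on a pipeline with a single grant; the pipeline ID, user name, and permission level shown are illustrative:

databricks pipelines set-permissions 1234abcd-5678-90ef-1234-567890abcdef --json '{
  "access_control_list": [
    { "user_name": "someone@example.com", "permission_level": "CAN_RUN" }
  ]
}'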

databricks pipelines update-permissions

Update the permissions on a pipeline. Pipelines can inherit permissions from their root object.

databricks pipelines update-permissions PIPELINE_ID [flags]

Arguments

PIPELINE_ID

    The pipeline for which to get or manage permissions.

Options

--json JSON

    The inline JSON string or the @path to the JSON file with the request body.

Global flags

Global flags

--debug

    Whether to enable debug logging.

-h or --help

    Display help for the Databricks CLI, the related command group, or the related command.

--log-file string

    A string representing the file to write output logs to. If this flag is not specified then the default is to write output logs to stderr.

--log-format format

    The log format type, text or json. The default value is text.

--log-level string

    A string representing the log level. If this flag is not specified, then logging is disabled.

-o, --output type

    The command output type, text or json. The default value is text.

-p, --profile string

    The name of the profile in the ~/.databrickscfg file to use to run the command. If this flag is not specified, the profile named DEFAULT is used, if it exists.

--progress-format format

    The format to display progress logs: default, append, inplace, or json

-t, --target string

    If applicable, the bundle target to use