Enable the default publishing mode in a pipeline

Important

This feature is in Public Preview.

This article describes how to migrate pipelines that use the LIVE virtual schema (the legacy publishing mode) to the default publishing mode.

The default publishing mode allows a single pipeline to write to multiple catalogs and schemas, and includes a simplified syntax for working with tables and views within the pipeline. The legacy publishing mode is considered deprecated, and Databricks recommends migrating all pipelines to the default publishing mode.

Migration affects the metadata of the pipeline, but does not read, move, or write to any datasets.

How to tell if your pipeline uses the legacy publishing mode

Legacy publishing mode pipelines are indicated in the Summary field of the Lakeflow Declarative Pipelines settings UI. You can also confirm that a pipeline uses the legacy publishing mode by checking whether the target field is set in the pipeline's JSON specification.
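
To check this programmatically, you can inspect the pipeline specification JSON (for example, the output of the databricks pipelines get command) for the target field. The helper below is an illustrative sketch, not part of any Databricks API; the spec excerpts are hypothetical:

```python
import json

def uses_legacy_publishing_mode(spec: dict) -> bool:
    """Return True if a pipeline spec uses the legacy (LIVE virtual schema) mode.

    The legacy publishing mode sets the `target` field; the default
    publishing mode uses the `schema` field instead.
    """
    return "target" in spec

# Minimal spec excerpts as they might appear in the JSON settings.
legacy_spec = json.loads('{"catalog": "main", "target": "default"}')
default_spec = json.loads('{"catalog": "main", "schema": "default"}')

print(uses_legacy_publishing_mode(legacy_spec))   # True
print(uses_legacy_publishing_mode(default_spec))  # False
```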

Considerations for migrating to the default publishing mode

The following notes are useful to keep in mind during migration:

  • After a pipeline is migrated to the default publishing mode, it can't be migrated back to using the LIVE virtual schema.
  • You may need to prepare your pipeline for migration by addressing any syntax changes between the legacy and default publishing modes. Most pipelines do not require changes. For details, see Preparing pipelines for migration.
  • The migration affects metadata only. It does not read, move, or write to any datasets.
  • In the default publishing mode, materialized views and streaming tables can't be moved across schemas after they are created.
  • The default publishing mode requires Databricks CLI version v0.230.0 or above. See Install or update the Databricks CLI.
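
To verify the CLI requirement in the last bullet, you can compare the version reported by databricks -v against the minimum. A minimal sketch, assuming you pass the version token (such as v0.230.0) from that output:

```python
def cli_version_at_least(version: str, minimum=(0, 230, 0)) -> bool:
    """Parse a version string like 'v0.230.0' and compare it to a minimum."""
    parts = tuple(int(p) for p in version.lstrip("v").split(".")[:3])
    return parts >= minimum

print(cli_version_at_least("v0.230.0"))  # True
print(cli_version_at_least("v0.229.1"))  # False
```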

Migrate to the default publishing mode

Use the following steps to migrate to the default publishing mode.

  1. Open the pipeline that you want to migrate in the Databricks UI.

  2. Pause updates, and let any currently running pipeline stop.

    At least one update must run prior to completing the migration. If the pipeline is triggered, or was already paused, manually run a single update. If the pipeline is continuous, make sure that it gets to (or is already in) the RUNNING state, and then pause.

  3. Optionally, prepare any code that may need to be migrated.

    The default publishing mode is generally backwards compatible with the legacy publishing mode, but be sure to properly prepare your pipeline for migration so that your pipeline code runs correctly when upgraded. Most pipelines will not need changes.

  4. In the pipeline Settings, add the configuration pipelines.enableDPMForExistingPipeline, set to true.

  5. Start a manual update, and let the update complete.

  6. Go to the pipeline settings JSON page. Find the target field, and replace it with schema (keep the same value). Your configuration should include JSON such as the following:

    ...
    "catalog": "main",
    "configuration": {
      "pipelines.setMigrationHints": "true",
      "pipelines.enableDPMForExistingPipeline": "true"
    },
    "schema": "default",
    "data_sampling": "false",
    ...
    
  7. Save your JSON.

  8. In the pipeline Settings, remove the pipeline configuration for pipelines.setMigrationHints and pipelines.enableDPMForExistingPipeline.

  9. Re-enable pipeline updates as they were before migration.

The default publishing mode is now enabled on the pipeline. If you see issues, use the next section to help troubleshoot. If issues persist, reach out to your Databricks account manager.

Preparing pipelines for migration

The default publishing mode is generally backwards compatible with the legacy publishing mode, but some pipelines may need modifications to run. The following notes can help you prepare your pipelines for migration.

The LIVE keyword

The LIVE keyword in the legacy publishing mode qualifies the catalog and schema of the object with the pipeline defaults. The default publishing mode no longer uses the LIVE keyword to qualify tables or views. The LIVE keyword is ignored, and replaced with the default catalog and schema for the pipeline. Generally this will use the same default catalog and schema as the LIVE keyword in the legacy publishing mode, unless you later add USE CATALOG or USE SCHEMA commands to your pipeline.

In the legacy publishing mode, partially qualified table and view references without the LIVE keyword (such as table1) use the Spark defaults. In the default publishing mode, partially qualified references use the pipeline defaults. If your Spark defaults and pipeline defaults differ, fully qualify the name of any partially qualified table or view before migrating.
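
The resolution rule above can be made concrete with a small name-qualification helper. This is an illustrative sketch of the rule, not pipeline code or a Databricks API:

```python
def fully_qualify(name: str, default_catalog: str, default_schema: str) -> str:
    """Expand a partially qualified table or view name using the given defaults.

    In the default publishing mode, partial references resolve against the
    pipeline defaults; fully qualifying them before migration removes any
    ambiguity with the Spark defaults.
    """
    parts = name.split(".")
    if len(parts) == 3:                       # already catalog.schema.table
        return name
    if len(parts) == 2:                       # schema.table
        return f"{default_catalog}.{name}"
    return f"{default_catalog}.{default_schema}.{name}"  # bare table name

print(fully_qualify("table1", "main", "default"))  # main.default.table1
```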

Note

After migration, you can remove the LIVE keyword from your code. Optionally, you can replace the LIVE keyword with fully qualified table or view names.

Column references with the LIVE keyword

You can't use the LIVE keyword to define columns in the default publishing mode. For example, this code:

CREATE OR REPLACE MATERIALIZED VIEW target AS SELECT LIVE.source.id FROM LIVE.source;

would need to be replaced with the following, prior to migration:

CREATE OR REPLACE MATERIALIZED VIEW target AS SELECT source.id FROM LIVE.source;

This version works in either publishing mode.

Warnings vs errors

Some warnings in the legacy publishing mode have been replaced by errors in the default publishing mode.

Self references

A self-reference (or circular reference) is not allowed in the default publishing mode (and had undefined results in the legacy publishing mode). For example:

CREATE OR REPLACE MATERIALIZED VIEW table1 AS SELECT * FROM target_catalog.target_schema.table1;

would generate a warning in the legacy publishing mode (and have undefined results). In the default publishing mode, it generates an error.

Multipart names

You can't use periods in names (multipart names) in the default publishing mode. For example, the following Python code is valid in the legacy mode, but not in the default mode:

@dlt.view(name="a.b.c")
def transform():
  return ...

Before migrating, rename the view to a name that doesn't include a period character.
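
Before migrating, you can scan your dataset names for periods. A minimal sketch (the example names are hypothetical):

```python
def invalid_dpm_names(names):
    """Return dataset names containing a period (multipart names),
    which the default publishing mode rejects."""
    return [n for n in names if "." in n]

print(invalid_dpm_names(["a.b.c", "clean_view"]))  # ['a.b.c']
```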

Troubleshooting

The following errors can occur when migrating from the legacy publishing mode.

CANNOT_MIGRATE_HMS_PIPELINE

Migration is not supported for Hive metastore pipelines. As an alternative, you might be able to clone the pipeline from Hive metastore to Unity Catalog prior to migration. See Create a Unity Catalog pipeline by cloning a Hive metastore pipeline.

MISSING_EXPECTED_PROPERTY

This error indicates that you did not run a recent update prior to adding the pipelines.enableDPMForExistingPipeline configuration. Remove that configuration, and, if it is missing, add the pipelines.setMigrationHints configuration, set to true. Run an update, and then continue from step 3.

PIPELINE_INCOMPATIBLE_WITH_DPM

This error indicates that your pipeline code is not fully compatible with the default publishing mode. See Preparing pipelines for migration.