Standalone pipelines vs. Lakeflow Spark Declarative Pipelines

Azure Databricks offers two ways to build materialized views and streaming tables: standalone pipelines, or full pipelines created with Lakeflow Spark Declarative Pipelines. Both run on the same declarative engine and produce Unity Catalog managed tables. The difference is how much of the pipeline you author and operate.

  • A standalone materialized view or streaming table is a single dataset defined with SQL syntax. Azure Databricks creates and manages a pipeline behind the scenes to refresh it. You create and refresh standalone datasets from a Databricks SQL warehouse, or from a notebook on serverless general compute using spark.sql(). See Standalone pipelines.
  • A Lakeflow Spark Declarative Pipelines pipeline is a pipeline that you author and operate as a unit. It can contain many datasets, in SQL and Python, with dependency orchestration, lineage, and pipeline-wide operational features. See What are pipelines?.

When you create a standalone materialized view or streaming table, the pipeline that Azure Databricks manages for it appears on the Jobs & Pipelines page with a pipeline type of MV/ST. A pipeline that you author with Lakeflow Spark Declarative Pipelines has a pipeline type of ETL.
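
As a rough illustration of the standalone path, the following sketch creates and then refreshes a single materialized view with spark.sql() from a notebook. The catalog, schema, table, and column names are hypothetical placeholders, and the function assumes it receives the SparkSession that Databricks notebooks predefine as spark.

```python
# Hypothetical names: main.sales.orders (source table) and
# main.sales.daily_order_totals (materialized view) are placeholders.
CREATE_MV = """
CREATE MATERIALIZED VIEW main.sales.daily_order_totals
AS SELECT order_date, SUM(amount) AS total_amount
FROM main.sales.orders
GROUP BY order_date
"""

REFRESH_MV = "REFRESH MATERIALIZED VIEW main.sales.daily_order_totals"


def create_and_refresh(spark):
    """Create one standalone materialized view, then request a refresh.

    `spark` is assumed to be the notebook's SparkSession. Azure Databricks
    creates and manages the backing pipeline; this code never defines one.
    """
    spark.sql(CREATE_MV)
    spark.sql(REFRESH_MV)
```

In a notebook on serverless general compute, you would call create_and_refresh(spark). The same two statements can also be run directly from a Databricks SQL warehouse without the Python wrapper.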

When to use a standalone pipeline

Use standalone materialized views and streaming tables when:

  • You accelerate queries or transform data with a single materialized view or streaming table.
  • You work from a Databricks SQL warehouse, the SQL editor, or a notebook on serverless general compute, and schedule refreshes with SCHEDULE, TRIGGER ON UPDATE, or a SQL task in a job.
  • You don't need sinks, multi-stage orchestration, or other pipeline-only features.
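
The refresh options named above can be sketched as variations on one CREATE statement. This example only builds the SQL text. The helper standalone_mv_ddl and all object names are hypothetical, and the exact clause forms (for example, SCHEDULE EVERY 1 HOUR) are assumptions to check against the SQL reference for your workspace.

```python
def standalone_mv_ddl(name, query, refresh_clause=None):
    """Build a CREATE MATERIALIZED VIEW statement with an optional refresh clause."""
    lines = [f"CREATE MATERIALIZED VIEW {name}"]
    if refresh_clause:
        # The refresh clause sits between the view name and the AS query.
        lines.append(refresh_clause)
    lines.append(f"AS {query}")
    return "\n".join(lines)


# Hypothetical view and source table names.
NAME = "main.sales.daily_order_totals"
QUERY = (
    "SELECT order_date, SUM(amount) AS total_amount "
    "FROM main.sales.orders GROUP BY order_date"
)

# Refresh on a fixed interval.
scheduled = standalone_mv_ddl(NAME, QUERY, "SCHEDULE EVERY 1 HOUR")

# Refresh when upstream data is updated.
triggered = standalone_mv_ddl(NAME, QUERY, "TRIGGER ON UPDATE")

# No clause: refresh manually, or from a SQL task in a job.
manual = standalone_mv_ddl(NAME, QUERY)

print(scheduled)
```

Any of these strings can be passed to spark.sql() in a notebook or pasted into the SQL editor.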

When to use a Lakeflow Spark Declarative Pipelines pipeline

Use a Lakeflow Spark Declarative Pipelines pipeline when:

  • You build a multi-stage pipeline with intermediate datasets, where Azure Databricks manages dependencies and lineage across the datasets. Intermediate datasets can be published to the catalog or kept private to the pipeline.
  • You author tables and flows in Python.
  • You write to external Delta tables or event streaming destinations using sinks (create_sink() or foreach_batch_sink()).
  • You apply change data capture from a database snapshot using create_auto_cdc_from_snapshot_flow().
  • You want triggered or continuous execution across the whole pipeline.
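
The criteria in the two lists above can be summarized as a small decision helper. This is an illustrative sketch, not an Azure Databricks API: the Requirements class, its field names, and the recommend function are all hypothetical, and each field corresponds to one bullet in the pipeline list.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    """Each flag mirrors one reason to author a full pipeline."""
    multi_stage: bool = False              # intermediate datasets with managed dependencies
    python_authoring: bool = False         # tables and flows authored in Python
    sinks: bool = False                    # create_sink() or foreach_batch_sink()
    cdc_from_snapshot: bool = False        # create_auto_cdc_from_snapshot_flow()
    pipeline_wide_execution: bool = False  # triggered or continuous runs across the pipeline


def recommend(req):
    """Return ("pipeline", reasons) if any flag is set, else ("standalone", [])."""
    reasons = [name for name, needed in vars(req).items() if needed]
    kind = "pipeline" if reasons else "standalone"
    return kind, reasons


print(recommend(Requirements()))
# ('standalone', [])
print(recommend(Requirements(sinks=True, python_authoring=True)))
# ('pipeline', ['python_authoring', 'sinks'])
```

The helper treats any single pipeline-only requirement as sufficient, which matches how the lists are phrased. A single dataset with none of these needs stays standalone.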

Comparison

| Property | Standalone streaming table or materialized view | Pipeline streaming table or materialized view |
| --- | --- | --- |
| Authoring interface | SQL syntax, from a Databricks SQL warehouse or with spark.sql() in a notebook on serverless general compute | SQL and Python |
| Scope | One dataset, in a pipeline that Azure Databricks manages for you | Many datasets in one pipeline, with dependency orchestration and lineage |
| Execution | Triggered, with SCHEDULE, TRIGGER ON UPDATE, or a SQL task | Triggered or continuous |
| Pipeline-only features | Not available | Sinks, create_auto_cdc_from_snapshot_flow(), private datasets |
| Pipeline type label | MV/ST | ETL |
| Move between pipelines | Not supported; recreate the table in the target pipeline | Supported |