Cost, Scaling, and Migration Considerations – Spark Structured Streaming (DBR) vs. Delta Live Tables (DLT)

Janice Chi 140 Reputation points
2025-06-25T13:44:11.0633333+00:00

We are currently designing a near real-time streaming pipeline for a healthcare analytics workload and are evaluating Databricks Spark Structured Streaming (using DBR) versus Delta Live Tables (DLT) for implementation.

Project Context:

Ingestion source: Kafka-based CDC stream (~3,000 to 30,000 events/sec)

Target: Azure SQL Hyperscale

Current plan: Initially use standard Spark Structured Streaming in Databricks, without enabling auto-scaling

Cluster: We plan to manually tune min/max workers (e.g., 2 to 10 nodes), possibly using asynchronous auto-scaling (newer feature) to improve scale-in behavior

Concern: We've observed known limitations in Databricks auto-scaling around slow scale-in for streaming clusters, and have read that Databricks itself recommends avoiding auto-scaling for strict SLA-driven streaming workloads


Guidance Requested:

Can we safely start with DBR-based Spark Structured Streaming (without auto-scaling) for a real-time Kafka CDC pipeline, and defer DLT adoption until we observe any actual bottlenecks?

If we later decide to migrate the same Spark streaming logic to DLT, what will be the estimated effort:

• Will this be a complete rewrite (SQL/Python format change)?

• Are there any known **incompatibilities or manual conversion steps** (e.g., `foreachBatch`, streaming joins, window operations, checkpointing)?

• Does DLT support seamless porting of existing notebook logic from DBR?

• Based on recent customer feedback and platform evolution, are there **clear cost benchmarks** or **guidance on which workloads justify DLT** over manually managed Spark streaming clusters?
     

Any clarification or recommended best practices for migrating from DBR to DLT with minimal disruption would be greatly appreciated.


1 answer

  1. Venkat Reddy Navari 3,125 Reputation points Microsoft External Staff Moderator
    2025-06-25T15:26:31.8633333+00:00

    @Janice Chi Yes, absolutely. Starting with Spark Structured Streaming (without enabling autoscaling) is a solid and widely used approach, especially when:

    • You want full control over cluster size and performance tuning
    • You're working with SLA-sensitive streaming (like Kafka CDC)
    • You’re still experimenting and don’t want the overhead of Delta Live Tables (DLT) just yet

    Also, you’re right to be cautious about autoscaling. In practice, Databricks’ autoscaling can lag a bit on scale-in, and it’s something they themselves recommend avoiding in latency-sensitive streaming pipelines. Your plan to manually set a worker range and possibly use asynchronous autoscaling for smoother scale-in is a good move.
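    For illustration, here is a rough sketch of the kind of fixed-size cluster definition this implies, submitted through the Databricks Clusters REST API; the workspace URL, token, node type, DBR version, and worker counts are all placeholders, so treat it as a starting point rather than a recommended configuration:

    ```python
    # Rough sketch: a fixed-size streaming cluster created via the Databricks Clusters REST API.
    # Host, token, node type, DBR version, and worker counts are placeholders.
    import requests

    DATABRICKS_HOST = "https://<your-workspace>.azuredatabricks.net"   # placeholder
    TOKEN = "<personal-access-token>"                                   # placeholder

    cluster_spec = {
        "cluster_name": "cdc-streaming-fixed",
        "spark_version": "14.3.x-scala2.12",     # example DBR LTS version
        "node_type_id": "Standard_D8ds_v5",      # example Azure VM size
        "num_workers": 6,                        # fixed size: predictable latency, no scale-in surprises
        # Alternative: a manually tuned autoscale range instead of a fixed size
        # "autoscale": {"min_workers": 2, "max_workers": 10},
    }

    resp = requests.post(
        f"{DATABRICKS_HOST}/api/2.1/clusters/create",
        headers={"Authorization": f"Bearer {TOKEN}"},
        json=cluster_spec,
    )
    resp.raise_for_status()
    print(resp.json()["cluster_id"])
    ```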

    Migrating from DBR to DLT Later — What’s the Effort?

    Code Format & Rewrite Requirements

    It depends on how your current logic is written, but if you structure things thoughtfully, the transition can be relatively smooth.

    • If your code uses standard DataFrame APIs (SQL or Python) and writes to Delta tables, you can likely reuse most of it in DLT with only minor adjustments.
    • If you're using foreachBatch to push data directly into Azure SQL (common in CDC), that part won't carry over, because DLT doesn't support custom sink logic like foreachBatch. In DLT, you'd typically write to Delta tables first, then sync to Azure SQL via ADF or another downstream process (see the sketch after this list).
    • More complex logic (e.g., streaming joins, window aggregations, manual checkpointing) might also need rework since DLT manages orchestration and state differently.
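    For reference, here is a minimal sketch of the foreachBatch pattern in question, reading the Kafka CDC stream and pushing each micro-batch into Azure SQL over JDBC. The topic name, JDBC connection details, checkpoint path, and target table are hypothetical, and the write shown is a plain append; real CDC upserts would normally land in a staging table and be applied with a MERGE on the SQL side:

    ```python
    # Minimal sketch of the DBR-only pattern: Structured Streaming + foreachBatch into Azure SQL.
    # Kafka topic, JDBC details, checkpoint path, and table names are placeholders.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    jdbc_url = (
        "jdbc:sqlserver://<server>.database.windows.net:1433;"
        "database=<hyperscale-db>;encrypt=true"
    )
    jdbc_props = {
        "user": "<user>",
        "password": "<password>",
        "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
    }

    cdc_raw = (
        spark.readStream
        .format("kafka")
        .option("kafka.bootstrap.servers", "<broker1>:9093")
        .option("subscribe", "cdc-events")          # hypothetical topic
        .option("startingOffsets", "latest")
        .load()
    )

    def write_to_sql(batch_df, batch_id):
        # Plain append for illustration; CDC merges usually land in a staging table
        # and are applied with a MERGE statement on the SQL side.
        (batch_df
         .selectExpr("CAST(key AS STRING) AS key", "CAST(value AS STRING) AS payload")
         .write
         .jdbc(url=jdbc_url, table="dbo.cdc_staging", mode="append", properties=jdbc_props))

    query = (
        cdc_raw.writeStream
        .foreachBatch(write_to_sql)
        .option("checkpointLocation", "abfss://chk@<storage>.dfs.core.windows.net/cdc")  # placeholder
        .trigger(processingTime="30 seconds")
        .start()
    )
    ```

    This is exactly the piece with no direct DLT equivalent, which is why the write-to-Delta-then-sync pattern comes up.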

    Notebook Reuse & Compatibility

    Yes, you can reuse notebooks inside DLT pipelines. Just be mindful of structure:

    • DLT expects your logic to be wrapped in decorators like @dlt.table or @dlt.view (see the Delta Live Tables Python API reference, and the sketch after this list)
    • It organizes your pipeline as a DAG, so breaking your logic into clear steps/modules helps
    • Logging, checkpointing, and retries are handled by DLT automatically, so some parts of your DBR logic may become redundant
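    To make the structural difference concrete, here is a minimal sketch of the same ingestion step expressed with the DLT Python API; the table names, Kafka options, and expectation are hypothetical, and note that there is no explicit checkpoint or sink because DLT writes to managed Delta tables and handles state, retries, and logging itself:

    ```python
    # Minimal sketch of the equivalent logic expressed as a DLT pipeline (Python API).
    # Table names, the Kafka options, and the expectation are placeholders.
    import dlt

    @dlt.table(comment="Raw CDC events ingested from Kafka.")
    def cdc_raw():
        return (
            spark.readStream
            .format("kafka")
            .option("kafka.bootstrap.servers", "<broker1>:9093")
            .option("subscribe", "cdc-events")      # hypothetical topic
            .load()
        )

    @dlt.table(comment="Parsed CDC events; DLT manages checkpoints, retries, and the Delta target.")
    @dlt.expect_or_drop("valid_key", "key IS NOT NULL")   # example data-quality expectation
    def cdc_parsed():
        return (
            dlt.read_stream("cdc_raw")
            .selectExpr("CAST(key AS STRING) AS key",
                        "CAST(value AS STRING) AS payload",
                        "timestamp AS event_ts")
        )
    ```

    The hop into Azure SQL Hyperscale would then happen downstream (for example, via ADF reading the Delta output), since the foreachBatch sink does not carry over.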

    Cost and When to Choose DLT

    There’s no official public cost benchmark doc yet, but based on customer feedback and usage patterns:

    DLT is Ideal When:

    • You want auto-managed recovery, testing, and data quality checks
    • Your pipeline has multiple dependencies across batch and streaming tables
    • You need built-in observability (event logs, quality metrics)
    • You’re managing multiple pipelines and need better governance and automation

    DBR Might Be Better When:

    • You need custom sink logic, like writing directly to Azure SQL
    • You want fine-grained control over cluster resources and scheduling
    • Budget is tight and you want to avoid the premium DBU rates that DLT pipelines are billed at compared with standard job clusters

    For pricing, you can model both options using the Azure Pricing Calculator by comparing Databricks compute usage (standard vs. DLT pipelines) for your projected workloads.


    I hope this information helps. Please do let us know if you have any further queries.

    Kindly consider upvoting the comment if the information provided is helpful. This can assist other community members in resolving similar issues.

