Hi @Janice Chi
how do we guarantee that merge operations in Databricks happen in the correct sequence — insert first, followed by updates — especially when all events for the same key come through Kafka?
To ensure that MERGE operations in Databricks apply CDC events (Insert → Update → Delete) in the correct sequence, even when consuming from Kafka, you should design your architecture around the following principles:
1. Kafka Guarantees Ordering Per Partition
Kafka guarantees the order of messages within a single partition. If your Kafka producer uses the primary key as the message key, all events for a specific key (e.g., an Insert followed by Updates) will be routed to the same partition, and their original order will be preserved.
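For illustration, here is a minimal producer-side sketch using the kafka-python client; the broker address, topic name, and payload fields are assumptions, not part of your setup:

```python
# Sketch: key each CDC message by the record's primary key (kafka-python client).
# Broker address, topic name, and payload fields are illustrative placeholders.
import json
from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="broker:9092",
    key_serializer=lambda k: k.encode("utf-8"),
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

event = {
    "customer_id": "C-1001",                     # primary key in the source table
    "op_type": "U",                              # 'I', 'U', or 'D'
    "event_timestamp": "2024-05-01T10:15:30Z",
    "name": "Alice",
}

# Keying by the primary key routes every event for C-1001 to the same partition,
# so Kafka preserves their relative order.
producer.send("cdc.customers", key=event["customer_id"], value=event)
producer.flush()
```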
2. Include Sequence Metadata
Make sure your CDC payload contains reliable metadata to preserve ordering (an example schema follows this list):
- event_timestamp → time when the change occurred in the source.
- op_type → indicates 'I', 'U', or 'D' (Insert/Update/Delete).
- kafka_offset or sequence_number → for deterministic ordering.
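A minimal PySpark schema covering these fields might look like the sketch below; the field names (customer_id, name, and so on) are assumptions to adapt to your CDC tool's output:

```python
from pyspark.sql.types import (
    StructType, StructField, StringType, TimestampType, LongType
)

# Assumed CDC payload schema; adjust names and types to your producer's format.
cdc_schema = StructType([
    StructField("customer_id", StringType(), False),       # primary key
    StructField("op_type", StringType(), False),           # 'I', 'U', or 'D'
    StructField("event_timestamp", TimestampType(), False),
    StructField("sequence_number", LongType(), True),      # optional deterministic sequence
    StructField("name", StringType(), True),               # business columns...
])
```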
3. Sort Events Before Merging
When using micro-batch processing (foreachBatch), sort CDC events by event_timestamp or sequence_number before applying your MERGE INTO, so that older operations (e.g., an Insert) are applied before newer ones (e.g., Updates or Deletes) for each key.
4. Deduplicate and Apply Row-wise Merge Logic
Within each batch:
- Use Spark window functions like ROW_NUMBER() to get the latest operation per key if needed.
- Group by primary key, then apply merge rules (see the sketch after these rules):
WHEN MATCHED AND op_type = 'U' THEN UPDATE
WHEN MATCHED AND op_type = 'D' THEN DELETE
WHEN NOT MATCHED AND op_type = 'I' THEN INSERT
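Putting points 3 and 4 together, a foreachBatch handler could look like this sketch. The table name target_db.customers, the key column customer_id, and the kafka_offset column (carried over from the Kafka source) are assumptions; note that the not-matched insert also accepts 'U' so a key whose Insert and a later Update land in the same micro-batch is still inserted, which assumes updates carry full rows:

```python
from delta.tables import DeltaTable
from pyspark.sql import functions as F
from pyspark.sql.window import Window

def upsert_to_delta(batch_df, batch_id):
    # Keep only the latest event per key in this micro-batch,
    # ordered by event_timestamp with kafka_offset as a tiebreaker.
    w = Window.partitionBy("customer_id").orderBy(
        F.col("event_timestamp").desc(), F.col("kafka_offset").desc()
    )
    latest = (
        batch_df
        .withColumn("rn", F.row_number().over(w))
        .filter("rn = 1")
        .drop("rn")
    )

    target = DeltaTable.forName(batch_df.sparkSession, "target_db.customers")
    (
        target.alias("t")
        .merge(latest.alias("s"), "t.customer_id = s.customer_id")
        .whenMatchedDelete(condition="s.op_type = 'D'")
        .whenMatchedUpdateAll(condition="s.op_type = 'U'")
        .whenNotMatchedInsertAll(condition="s.op_type IN ('I', 'U')")
        .execute()
    )
```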
5. Ensure Idempotency and Retry Safety
Maintain a control table to track processed Kafka offsets per partition. This helps ensure that retries do not violate operation order and allows you to reprocess only failed partitions without affecting others.
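One possible shape for such a control table, as a sketch: the table name ops.cdc_offset_control and the columns kafka_partition / kafka_offset (taken from the Kafka source's partition and offset fields) are assumptions, and the control table is assumed to already exist with matching columns:

```python
from pyspark.sql import functions as F

def record_progress(spark, batch_df, topic="cdc.customers"):
    # Highest offset processed per Kafka partition in this batch.
    progress = (
        batch_df
        .groupBy("kafka_partition")
        .agg(F.max("kafka_offset").alias("last_processed_offset"))
        .withColumn("topic", F.lit(topic))
        .withColumn("updated_at", F.current_timestamp())
    )

    # Upsert the progress into the Delta control table.
    progress.createOrReplaceTempView("batch_progress")
    spark.sql("""
        MERGE INTO ops.cdc_offset_control c
        USING batch_progress p
        ON c.topic = p.topic AND c.kafka_partition = p.kafka_partition
        WHEN MATCHED THEN UPDATE SET
            c.last_processed_offset = p.last_processed_offset,
            c.updated_at = p.updated_at
        WHEN NOT MATCHED THEN INSERT *
    """)
```

On replay, compare incoming kafka_offset values against last_processed_offset and skip anything already applied.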
By enforcing key-based Kafka partitioning, capturing sequence metadata, and implementing sorted, idempotent merge logic in Databricks, you can guarantee that Insert, Update, and Delete events are applied in the correct order even in high-throughput, distributed CDC pipelines.
Does Kafka preserve ordering of events per key?
Yes, Kafka guarantees per-partition ordering. If your Kafka producer uses a consistent partition key (typically the primary key), then all events for that key (Insert → Update → Delete) will go to the same partition and retain their ingestion order within that partition.
Kafka does not guarantee ordering across partitions, so if events for the same key are distributed across multiple partitions, ordering may be lost.
How should we read and process these messages in Databricks to maintain this order during the merge?
Use Spark Structured Streaming to consume Kafka with the following setup:
- Use the primary key as the Kafka message key in the CDC producer to ensure all events per key land in the same partition.
- In Databricks, process data in partition-aware batches, preserving the Kafka offset order within each partition.
- Include event_timestamp, op_type, and kafka_offset in your schema.
Best practice:
- Use withWatermark to manage late data.
- Order CDC records within the micro-batch by event_timestamp or kafka_offset.
- Reconstruct full rows for UPDATEs if Kafka sends only changed fields (a read-side sketch follows this list).
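A read-side sketch tying these points together; the broker address, topic, checkpoint path, and the cdc_schema / upsert_to_delta names from the earlier sketches are assumptions:

```python
from pyspark.sql import functions as F

raw = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "cdc.customers")
    .option("startingOffsets", "earliest")
    .load()
)

# Parse the JSON payload and carry Kafka's partition/offset along for ordering and auditing.
cdc_events = (
    raw
    .select(
        F.col("partition").alias("kafka_partition"),
        F.col("offset").alias("kafka_offset"),
        F.from_json(F.col("value").cast("string"), cdc_schema).alias("payload"),
    )
    .select("kafka_partition", "kafka_offset", "payload.*")
    .withWatermark("event_timestamp", "10 minutes")   # tolerate some late data
)

query = (
    cdc_events.writeStream
    .foreachBatch(upsert_to_delta)                     # merge logic sketched earlier
    .option("checkpointLocation", "/mnt/checkpoints/cdc_customers")
    .start()
)
```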
Are there any recommended best practices or Spark Structured Streaming configurations for processing I-U-D CDC events in the correct order for each key?
- Primary Key → Kafka Partition: Ensure CDC producer uses primary key as Kafka message key.
- Add Metadata: Include event_timestamp, op_type (I, U, D), and kafka_offset for ordering and auditing.
- Event Sequencing: a) Sort each micro-batch by event_timestamp or kafka_offset before applying MERGE. b) For UPDATEs that carry only changed fields, use coalesce logic to preserve existing values (see the sketch after this list).
- Upsert Logic: a) For MERGE INTO, ensure only one row per primary key using ROW_NUMBER() or deduplication per micro-batch.
- Idempotency: a) Track kafka_offset per partition in a control table to avoid reprocessing.
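For the partial-update case in point b) above, one option is to coalesce each incoming column with the existing target value inside the merge, so fields the CDC feed leaves as NULL are not overwritten. This reuses the target and latest names from the earlier foreachBatch sketch, and the name / email columns are purely illustrative:

```python
# Sketch: keep the target's value whenever the CDC update sends NULL for a column.
(
    target.alias("t")
    .merge(latest.alias("s"), "t.customer_id = s.customer_id")
    .whenMatchedUpdate(
        condition="s.op_type = 'U'",
        set={
            "name":  "coalesce(s.name,  t.name)",
            "email": "coalesce(s.email, t.email)",
        },
    )
    .whenMatchedDelete(condition="s.op_type = 'D'")
    .whenNotMatchedInsertAll(condition="s.op_type = 'I'")
    .execute()
)
```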
If we perform micro-batch based merge in Databricks, how can we ensure ordering across partitions and offset ranges?
You cannot enforce strict global ordering across partitions — only intra-partition ordering is guaranteed.
Approach:
- Process each partition independently if fine-grained control is needed.
- Maintain a control table tracking partition → last_processed_offset for idempotent replays.
- Within each micro-batch, process data sorted by event_timestamp (or kafka_offset if available).
- In large setups, use foreachBatch() in Structured Streaming and handle partitioning logic explicitly within your batch logic, as in the sketch below.
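Combining the earlier sketches, a foreachBatch handler might reuse the hypothetical upsert_to_delta and record_progress helpers like this:

```python
def process_cdc_batch(batch_df, batch_id):
    # Align Spark partitions with Kafka partitions and order events by offset,
    # so any per-row logic sees each key's changes in their original sequence.
    ordered = (
        batch_df
        .repartition("kafka_partition")
        .sortWithinPartitions("kafka_partition", "kafka_offset")
    )

    # Set-based dedup + MERGE (earlier sketch) applies the latest change per key.
    upsert_to_delta(ordered, batch_id)

    # Record progress per Kafka partition for idempotent replays (earlier sketch).
    record_progress(batch_df.sparkSession, batch_df)
```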
I hope this information helps. Please do let us know if you have any further queries.
Kindly consider upvoting the comment if the information provided is helpful. This can assist other community members in resolving similar issues.
Thank you.