Inconsistencies with Azure Event Hubs and Checkpoints

Paul Hernandez 631 Reputation points Microsoft Employee
2024-01-24T16:54:36.8733333+00:00

We are seeing inconsistent behavior in a pipeline that sends events to an event hub and reads them from a Databricks notebook using spark.readStream, with a checkpoint so that only the latest records arriving at the event hub are read. Normally, the count of events sent matches the count of events received. In some seemingly random cases, however, we receive all of the events instead of just the increment, that is, the events sent after the previous checkpoint. Could anyone please point to possible causes of this behavior?
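To make the expected and the observed behavior concrete, here is a minimal toy model in plain Python. It is not Spark or the Event Hubs connector, and the file-based "checkpoint" and the names `read_increment` and `demo` are hypothetical, made up only for illustration. It shows what "read only the increment" means, and that a reader which does not find its stored offset falls back to reading everything. Whether anything like that is happening in our pipeline is exactly what we do not know.

```python
import json
import os
import tempfile


def read_increment(events, checkpoint_path):
    """Return the events after the stored offset, then persist the new offset.

    Toy model only: real Structured Streaming checkpoints are directories
    with their own layout, not a single JSON file.
    """
    offset = 0  # assumption: with no checkpoint, start from the beginning
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            offset = json.load(f)["offset"]
    batch = events[offset:]
    with open(checkpoint_path, "w") as f:
        json.dump({"offset": len(events)}, f)
    return batch


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        ckpt = os.path.join(tmp, "checkpoint.json")
        events = ["e1", "e2", "e3"]
        first = read_increment(events, ckpt)   # no checkpoint yet: all 3 events
        events += ["e4", "e5"]
        second = read_increment(events, ckpt)  # checkpoint found: only e4, e5
        os.remove(ckpt)                        # simulate a missing checkpoint
        third = read_increment(events, ckpt)   # falls back to all 5 events
        return len(first), len(second), len(third)


if __name__ == "__main__":
    print(demo())
```

The second read is the normal case we see most of the time (sent and received counts match); the third read resembles the occasional runs where everything comes back.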

Azure Event Hubs
An Azure real-time data ingestion service.
Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.