Inconsistencies with Azure Event Hubs and Checkpoints
Paul Hernandez
We are observing inconsistencies when sending events to an event hub and reading them from a Databricks notebook with spark.readStream, using a checkpoint so that only the latest records arriving at the Event Hub are read. In most runs the counts of events sent and received match, but in some seemingly random cases we receive all of the events instead of just the increment, that is, the ones sent after the previous checkpoint. Could anyone please point to possible causes of this behavior?
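For reference, here is a minimal sketch of the kind of pipeline described above. It is not the exact notebook code: the helper names `build_eventhubs_conf` and `start_incremental_stream`, the Delta sink, and the path arguments are assumptions made for illustration, and it assumes the azure-event-hubs-spark connector is installed on the cluster.

```python
import json


def build_eventhubs_conf(connection_string):
    """Build reader options for the Event Hubs Spark connector.

    Hypothetical helper. Recent connector versions expect the connection
    string to be encrypted before it is passed in; this sketch assumes the
    caller has already prepared it.
    """
    # Start from the beginning of the stream on a fresh run.
    starting_position = {
        "offset": "-1",
        "seqNo": -1,
        "enqueuedTime": None,
        "isInclusive": True,
    }
    return {
        "eventhubs.connectionString": connection_string,
        "eventhubs.startingPosition": json.dumps(starting_position),
    }


def start_incremental_stream(spark, conf, checkpoint_path, sink_path):
    """Read from Event Hubs and write to a sink, tracking progress in a checkpoint.

    `spark` is the notebook's SparkSession; the Delta sink and both paths
    are assumptions for this sketch.
    """
    events = spark.readStream.format("eventhubs").options(**conf).load()
    return (
        events.writeStream
        .format("delta")
        # Progress is stored here; a restarted query resumes from it.
        .option("checkpointLocation", checkpoint_path)
        .start(sink_path)
    )
```

In a setup like this, the starting position should only matter when the checkpoint location holds no prior progress, so the exact checkpoint path used on each run is worth including when reporting the issue.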