Query about Checkpoint Strategy in HDInsight Spark and Event Hubs Environment
Dear Azure Support Team,
Hello, I am Wonho Kim from Samsung Electronics.
I have a question regarding the checkpoint strategy in the HDInsight Spark and Event Hubs environment.
As far as I know, Spark's checkpoint feature lets an application recover from a failure by resuming from its last saved state.
However, when a failure in a Spark application leaves the offsets in Event Hubs and in the Spark checkpoint out of sync,
is there an Azure-recommended method for managing checkpoints that provides a more reliable exactly-once guarantee?
In our previous service implementation using Flink and Kafka,
we fetched the offsets from Kafka and aligned them with Flink's offsets.
I would like to know if fetching the offset from Event Hubs and aligning it with Spark is a more precise or preferred approach.
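To make clear what I mean by "aligning", here is a rough sketch of the idea in plain Python. It does not use any real Spark or Event Hubs API: the function names are hypothetical, offsets are modeled as plain integers per partition, and resuming from the smaller offset is only my assumption of a safe policy (it replays events rather than skipping them, so the sink would need to be idempotent).

```python
from typing import Dict, Optional, Tuple

# Hypothetical model: partition id -> last known offset (plain integers).
Offsets = Dict[str, int]


def find_offset_mismatches(
    checkpoint: Offsets, source: Offsets
) -> Dict[str, Tuple[Optional[int], Optional[int]]]:
    """Return partitions whose checkpoint offset differs from the source offset.

    A partition missing on one side is reported with None for that side.
    """
    mismatches = {}
    for partition in sorted(set(checkpoint) | set(source)):
        cp = checkpoint.get(partition)
        src = source.get(partition)
        if cp != src:
            mismatches[partition] = (cp, src)
    return mismatches


def aligned_start_offsets(checkpoint: Offsets, source: Offsets) -> Offsets:
    """Pick a restart offset per partition.

    Assumed policy: take the smaller of the known offsets, so no event is
    skipped. Replayed events must then be deduplicated downstream.
    """
    aligned = {}
    for partition in sorted(set(checkpoint) | set(source)):
        known = [
            offset
            for offset in (checkpoint.get(partition), source.get(partition))
            if offset is not None
        ]
        aligned[partition] = min(known)
    return aligned


# Example values are made up for illustration only.
spark_checkpoint = {"0": 120, "1": 300}
event_hubs_side = {"0": 120, "1": 280, "2": 50}

mismatches = find_offset_mismatches(spark_checkpoint, event_hubs_side)
restart_from = aligned_start_offsets(spark_checkpoint, event_hubs_side)
```

In this made-up example, partitions "1" and "2" would be flagged as mismatched, and the job would restart partition "1" from offset 280 instead of 300.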
Looking forward to your response.