When you start the job, the timestamp you specify (Now or a custom time) tells the job to produce output from that timestamp onward. The job then examines your query logic and calculates how much data it must read from your input sources in order to produce output from timestamp X. And yes, even if your query is based on event time instead of arrival time, this is handled automatically.
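As a rough illustration of the look-back arithmetic only (this is not how ASA is implemented internally), the sketch below computes how far back input would need to be read for the 5-minute tumbling window example from the docs. The function name `input_read_start` and the sample start time are hypothetical, chosen just for this example.

```python
from datetime import datetime, timedelta, timezone


def input_read_start(output_start: datetime, lookback: timedelta) -> datetime:
    """Return the earliest input timestamp needed to produce output from output_start.

    Hypothetical helper: it only mirrors the "start time minus look-back" idea
    described above and makes no claim about ASA internals.
    """
    return output_start - lookback


# Hypothetical job start time ("timestamp X" in the answer above).
output_start = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)

# Look-back implied by a 5-minute tumbling window, as in the quoted docs.
window = timedelta(minutes=5)

read_from = input_read_start(output_start, window)
print(read_from.isoformat())  # prints 2024-01-01T11:55:00+00:00
```

A longer window, or a JOIN or LAG that spans a longer period, would push this read point further back by the same subtraction.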
ASA job start options in relation to Event Hub input
Shawn Lam
The start options for an ASA job make this claim:
Now: Makes the starting point of the output event stream the same as when the job is started. If a temporal operator is used (e.g. time window, LAG or JOIN), Azure Stream Analytics will automatically look back at the data in the input source. For instance, if you start a job “Now” and if your query uses a 5-minutes Tumbling Window, Azure Stream Analytics will seek data from 5 minutes ago in the input.
How does that behave in the context of an Event Hub input?
- Does it query Event Hubs directly for the past 5 minutes' worth of events, even though Event Hub consumers usually read from checkpoints within a consumer group?
- What if our query is based on event time rather than arrival time? Would it know how to query Event Hubs data that way?