Snapshot Windows

A snapshot window defines a subset of events that fall within some period of time and over which you can perform some set-based computation such as an aggregation. Snapshot windows divide the timeline along the start and end times of events, and are thus dynamic and event-driven. Together with timestamp modifications, they are very flexible and can be used for a variety of scenarios.

For a general description of event windows and their implementation and use in StreamInsight, see Using Event Windows.

Understanding Snapshot Windows

Snapshot windows are defined according to the start and end times of the events in the stream, instead of a fixed grid along the timeline. The size and time period of each window are defined only by the events in the stream. For each pair of adjacent event endpoints (start times or end times), a snapshot window is created. By this definition, all event start and end times fall on window boundaries, never in between. That is, snapshot windows divide the timeline at every point where the set of active events changes.
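The boundary rule can be modeled outside StreamInsight. The following is a minimal sketch in plain Python, not StreamInsight API: the function name `snapshot_windows`, the tuple layout, and the integer timestamps are assumptions made for illustration, and event lifetimes are treated as half-open intervals [start, end).

```python
def snapshot_windows(events):
    """Return (window_start, window_end, member_names) for each non-empty window.

    events: list of (name, start, end) tuples with half-open lifetimes [start, end).
    """
    # Every distinct start or end time becomes a window boundary.
    boundaries = sorted({t for _, start, end in events for t in (start, end)})
    windows = []
    # Each pair of adjacent boundaries delimits one candidate window.
    for w_start, w_end in zip(boundaries, boundaries[1:]):
        # An event is a member if its lifetime overlaps the window.
        members = [name for name, start, end in events
                   if start < w_end and end > w_start]
        if members:  # a gap that no event covers yields no window
            windows.append((w_start, w_end, members))
    return windows


# Hypothetical timestamps: e1 and e2 overlap, e3 starts after a gap.
events = [("e1", 1, 5), ("e2", 3, 8), ("e3", 10, 12)]
for window in snapshot_windows(events):
    print(window)
# (1, 3, ['e1'])
# (3, 5, ['e1', 'e2'])
# (5, 8, ['e2'])
# (10, 12, ['e3'])
```

In this model no event endpoint ever lies strictly inside a window, which mirrors the definition above.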

The following illustration shows a stream with three events: e1, e2, and e3. The vertical bars show the snapshot window boundaries that are defined by these events. The light blue bars represent the events as they move through time. The orange boxes show the snapshot windows and the events contained in each window. For example, based on its start time and end time, only event e1 is in the first snapshot window. Events e1 and e2 overlap, however, so both are included in the second window.

Snapshot window illustration

After the framework applies the input policy that clips events to the windows (the only input policy currently available), the events appear as shown in the following illustration.

Snapshot window with events clipped to the window.

These are the windows and events that are input into the actual set-based operation. Understanding the clipping behavior is important when applying a user-defined time-sensitive aggregate or operator that is able to inspect the timestamps of input events.
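To see why clipping matters for a time-sensitive aggregate, consider the following sketch. It is a plain Python model rather than StreamInsight code; `clipped_windows` and `total_duration` are hypothetical helpers, and the timestamps are invented.

```python
def clipped_windows(events):
    """Map each snapshot window to its member events, clipped to the window.

    events: list of (name, start, end) tuples with half-open lifetimes.
    """
    boundaries = sorted({t for _, start, end in events for t in (start, end)})
    result = {}
    for w_start, w_end in zip(boundaries, boundaries[1:]):
        clipped = [
            # Clipping: the lifetime is cut down to the window boundaries.
            (name, max(start, w_start), min(end, w_end))
            for name, start, end in events
            if start < w_end and end > w_start
        ]
        if clipped:
            result[(w_start, w_end)] = clipped
    return result


def total_duration(window_events):
    """A time-sensitive aggregate: sums the lifetimes it is handed."""
    return sum(end - start for _, start, end in window_events)


events = [("e1", 1, 5), ("e2", 3, 8)]
windows = clipped_windows(events)
print(windows[(3, 5)])                  # [('e1', 3, 5), ('e2', 3, 5)]
print(total_duration(windows[(3, 5)]))  # 4
```

The unclipped lifetimes of e1 and e2 add up to 9 time units, but the aggregate in this model only sees 4. Because snapshot windows are bounded by event endpoints, every clipped event spans exactly the window that contains it.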

Defining a Snapshot Window

Snapshot windows have no parameters for the window definition. The default window policies clip input and output events to the window size.

// Sum the payload field i over the events in each snapshot window.
var snapshotAgg = from w in inputStream.SnapshotWindow()
                  select new { sum = w.Sum(e => e.i) };

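The example above calls SnapshotWindow() without arguments, so the default window policies apply. Where a window policy is passed explicitly, it is specified through static properties that return instances of the corresponding policy classes. These properties are provided for convenience.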

Snapshot windows are powerful constructs that can be used to implement sliding windows. A sliding window is a window in time that moves with the events, rather than advancing by a fixed period. The advantage of a sliding window is that it adjusts its length according to the input events and hence does not produce any new output if the input did not change. This can be seen as a way to compress the event stream. Such a design is especially useful for an aggregation inside a Group and Apply operator when the input data constitutes a high number of groups. With a hopping window, each group would produce a result for each window, independently of the rate of change of the input. For more information, see Hopping Windows.
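The compression effect can be illustrated by counting results for an input that does not change. The sketch below is a simplified Python model, not StreamInsight code. It assumes that one result is produced for every window that overlaps at least one event, and that hopping windows start at multiples of the hop size from time 0; both function names are hypothetical.

```python
def snapshot_result_count(events):
    """Count non-empty snapshot windows for (start, end) event lifetimes."""
    boundaries = sorted({t for start, end in events for t in (start, end)})
    return sum(
        1
        for w_start, w_end in zip(boundaries, boundaries[1:])
        if any(start < w_end and end > w_start for start, end in events)
    )


def hopping_result_count(events, window_size, hop_size, horizon):
    """Count non-empty hopping windows that start in [0, horizon)."""
    count = 0
    for w_start in range(0, horizon, hop_size):
        w_end = w_start + window_size
        if any(start < w_end and end > w_start for start, end in events):
            count += 1
    return count


# One interval event lasting 60 minutes: the input never changes.
events = [(0, 60)]
print(snapshot_result_count(events))           # 1
print(hopping_result_count(events, 5, 1, 60))  # 60
```

Under these assumptions the snapshot window reports the unchanged input once, while a five-minute hopping window with a one-minute hop reports it sixty times. In a Group and Apply scenario, that difference would be multiplied by the number of groups.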

A sliding window is implemented with the snapshot operator paired with the appropriate temporal modification of the input stream. The timestamp modification (usually an extension of the event duration) first changes the "coverage" of each event over time. Each snapshot window then contains all events whose modified lifetimes overlap it. For example, assume the goal is to compute the sliding average of a point event input stream over the last three minutes. Applying the AlterEventDuration() method to the input stream "stretches" each event along the timeline, as shown in the following illustration.

Snapshot window with modified timestamps.

On that stream, the Snapshot operator is applied and the average on the desired event field is computed over the resulting windows as shown in the following illustration.

Snapshot window with aggregated (average) results.

The resulting events in this illustration describe, at each point in time, the average of all events within the last three minutes. This result is represented by interval events, each lasting as long as the average did not change. For example, the fourth event in that series asserts that the average of all events during the last three minutes was 1.5, measured from any point within that event. The result event starts when the point event with payload 3 has just fallen outside the three-minute window, and it ends right before the point event with payload 1 falls outside the window, as shown in the following illustration.

Snapshot window with point event results.

Using Language Integrated Query (LINQ), this scenario is expressed as follows (assuming that the input event type has a field 'Value').

// Extend each point event to a three-minute lifetime, then average per snapshot window.
var result = from win in inputStream
                 .AlterEventDuration(e => TimeSpan.FromMinutes(3)).SnapshotWindow()
             select new { average = win.Avg(e => e.Value) };
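The two steps of this query (stretch the events, then aggregate per snapshot window) can be traced by hand with a small model. The following sketch is plain Python, not StreamInsight code. The function `sliding_average` is hypothetical, and the input uses timestamps in whole minutes with payloads 3, 1, and 2, chosen as an assumption that is consistent with the description of the result events above.

```python
def sliding_average(points, lookback):
    """Model a sliding average over point events.

    points: list of (timestamp, value) point events.
    Returns (start, end, average) result intervals, one per snapshot window.
    """
    # Step 1: stretch each point event so that it covers [t, t + lookback).
    stretched = [(t, t + lookback, value) for t, value in points]
    # Step 2: snapshot windows are bounded by adjacent event endpoints.
    boundaries = sorted({b for start, end, _ in stretched for b in (start, end)})
    results = []
    for w_start, w_end in zip(boundaries, boundaries[1:]):
        values = [v for start, end, v in stretched
                  if start < w_end and end > w_start]
        if values:
            results.append((w_start, w_end, sum(values) / len(values)))
    return results


# Hypothetical input: payloads 3, 1, 2 arriving one minute apart.
points = [(0, 3), (1, 1), (2, 2)]
for interval in sliding_average(points, 3):
    print(interval)
# (0, 1, 3.0)
# (1, 2, 2.0)
# (2, 3, 2.0)
# (3, 4, 1.5)
# (4, 5, 2.0)
```

In this model the fourth result, with average 1.5, starts at minute 3, when the stretched event with payload 3 ends, and stops at minute 4, when the event with payload 1 ends.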

In general, in order to "look back" a certain amount of time when applying a set-based operation through a snapshot, the event lifetimes must be extended into the future. Other types of event lifetime modification operators can be used to achieve different results. For more information, see Time Stamp Modifications.

See Also

Concepts

Aggregations

TopK

User-defined Aggregates and Operators

Time Stamp Modifications

Count Windows

Hopping Windows

Using Event Windows