Event Structure

 

This topic describes the structure of events that are published by an event source or consumed by an event sink. In the development experience, an event is specified as a single primitive type, a .NET Framework class, or a struct, which defines the data (payload) associated with each event in the event stream.

For general information about events and event streams, see StreamInsight Concepts.

Events

The underlying data represented in a temporal stream is packaged into events. An event is the basic unit of data processed by the StreamInsight server. Each event consists of the following parts:

  • Header. An event header contains metadata that defines the event kind and one or more timestamps that define the time interval for the event. The timestamps are application-based and supplied by the data source rather than a system time supplied by the StreamInsight server. Note that the timestamps use the DateTimeOffset data type, which has time zone awareness and is based on a 24-hour clock. The StreamInsight server normalizes all times to UTC datetime and verifies on input that the UTC flag is set on the timestamp fields.

  • Payload. A .NET data structure that holds the data associated with the event. The fields defined in the payload are user-defined. Their types are based on the .NET type system.
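
As a minimal sketch of the UTC normalization described in the header bullet above, the following plain .NET snippet converts a timestamp with a +02:00 offset to UTC. It does not use the StreamInsight API; the class name, method name, and sample value are illustrative assumptions. The instant stays the same and only the offset changes.

```csharp
using System;

public static class TimestampNormalization
{
    // Returns the same instant expressed with a zero (UTC) offset.
    public static DateTimeOffset NormalizeToUtc(DateTimeOffset applicationTime)
    {
        return applicationTime.ToUniversalTime();
    }

    public static void Main()
    {
        // 09:13:33.317 at UTC+02:00 (illustrative value).
        var local = new DateTimeOffset(2009, 7, 15, 9, 13, 33, 317, TimeSpan.FromHours(2));
        DateTimeOffset utc = NormalizeToUtc(local);

        Console.WriteLine(utc.Offset == TimeSpan.Zero); // True
        Console.WriteLine(utc.Hour);                    // 7
        Console.WriteLine(local == utc);                // True: same instant
    }
}
```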

Events in the stream whose application timestamps correspond to their order of arrival into the query are said to be “in order”. When this is not the case, events are said to arrive “out of order”. The StreamInsight server guarantees that if events arrive out of order, the output of a query is the same as if the events arrived in order, unless the query writer explicitly specifies otherwise. Within a stream, typical event arrival patterns are:

  • A steady rate, such as records from files or tables.

  • An intermittent and random rate, such as data from a retail barcode scanner.

  • An intermittent rate with sudden bursts, such as Web clicks or weather telemetry.

Event Header

The header of an event defines the event kind as well as the temporal properties of the event.

Event Kind

The event kind indicates whether the event is a new event in the stream or an indicator that carries information about the stream. StreamInsight supports two event kinds: INSERT and CTI (current time increment).

The INSERT event kind adds an event with its payload into the event stream. In addition to the payload, the header of the INSERT event identifies the start and end time for the event. The following diagram shows the layout of an INSERT event kind.

| Header | Payload |
| --- | --- |
| Event kind ::= INSERT | Field 1 … Field n as CLR types |
| StartTime ::= DateTimeOffset | |
| EndTime ::= DateTimeOffset | |

The CTI event kind is a special punctuation event that indicates the completeness of the existing events in the stream. The CTI event structure consists of a single field that provides a current timestamp. A CTI event serves two purposes:

  1. First, it enables a query to accept and process events whose application timestamps do not correspond to their order of arrival into the query. When a CTI event is issued, it indicates to the StreamInsight server that no subsequent incoming INSERT events will revise the event history before the CTI timestamp. That is, after a CTI event has been issued, no INSERT event can have a start time earlier than the timestamp of the CTI event. This indication of "completeness" of a stream of events enables the StreamInsight server to release the results of windowing or other aggregating operators that have accumulated state, thus ensuring that events flow efficiently through the system.

  2. The second purpose of CTI events is to maintain the low latency of the query. The more frequently CTIs are issued, the more frequently the query can emit results.
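
The two purposes above can be sketched with a small tracker that remembers the latest CTI timestamp. This is only an illustration in plain .NET: `CtiTracker` and its members are hypothetical names, not StreamInsight types, and the release rule (a window's result can be emitted once a CTI at or after the window's end has been seen) is one reading of the completeness guarantee rather than a description of the engine's implementation.

```csharp
using System;

// Hypothetical helper for illustration; not a StreamInsight type.
public sealed class CtiTracker
{
    private DateTimeOffset? lastCti; // null until the first CTI is seen

    // Records a CTI; the tracked timestamp only moves forward.
    public void OnCti(DateTimeOffset timestamp)
    {
        if (lastCti == null || timestamp > lastCti.Value) lastCti = timestamp;
    }

    // An INSERT is admissible only if it does not start before the latest CTI.
    public bool IsAdmissible(DateTimeOffset startTime)
    {
        return lastCti == null || startTime >= lastCti.Value;
    }

    // Assumed rule: a window ending at windowEnd is complete once a CTI
    // at or after windowEnd has been seen. Without any CTI, nothing is released.
    public bool CanRelease(DateTimeOffset windowEnd)
    {
        return lastCti != null && lastCti.Value >= windowEnd;
    }
}

public static class CtiDemo
{
    public static void Main()
    {
        var t0 = new DateTimeOffset(2009, 7, 15, 9, 0, 0, TimeSpan.Zero);
        var tracker = new CtiTracker();

        Console.WriteLine(tracker.CanRelease(t0.AddMinutes(1)));    // False: no CTI yet
        tracker.OnCti(t0.AddMinutes(1));
        Console.WriteLine(tracker.CanRelease(t0.AddMinutes(1)));    // True
        Console.WriteLine(tracker.IsAdmissible(t0.AddSeconds(30))); // False: starts before the CTI
        Console.WriteLine(tracker.IsAdmissible(t0.AddMinutes(2)));  // True
    }
}
```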

Important


Without the presence of CTI events in the input stream, no output will be generated from the query.

For more information, see Advancing Application Time.

The following diagram shows the layout of a CTI event kind.

| Header |
| --- |
| Event kind ::= CTI |
| StartTime ::= DateTimeOffset |

Event Models

The event model defines the event shape based on its temporal characteristics. StreamInsight supports three event models: interval, point, and edge. Interval events can be seen as the most generic type, of which edge and point are special cases.

Interval

The interval event model represents an event whose payload is valid for a given period of time. The interval event model requires that both the start and end time of the event be provided in the event metadata. Interval events are valid only for this specific time interval. It is important to note that start times are inclusive, whereas end times are exclusive regarding the validity of the event's payload.
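
The inclusive start and exclusive end can be expressed as a half-open interval check. The following sketch uses plain .NET types only; the class and method names are illustrative assumptions, and the timestamps are sample values.

```csharp
using System;

public static class IntervalValidity
{
    // Start is inclusive and end is exclusive: valid when start <= t < end.
    public static bool IsValidAt(DateTimeOffset start, DateTimeOffset end, DateTimeOffset t)
    {
        return start <= t && t < end;
    }

    public static void Main()
    {
        var start = new DateTimeOffset(2009, 7, 15, 9, 13, 33, 317, TimeSpan.Zero);
        var end = new DateTimeOffset(2009, 7, 15, 9, 14, 9, 270, TimeSpan.Zero);

        Console.WriteLine(IsValidAt(start, end, start)); // True: start is inclusive
        Console.WriteLine(IsValidAt(start, end, end));   // False: end is exclusive
    }
}
```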

The following diagram shows the layout of an interval event model.

| Metadata | Payload |
| --- | --- |
| Event kind ::= INSERT | Field 1 … Field n as CLR types |
| StartTime ::= DateTimeOffset | |
| EndTime ::= DateTimeOffset | |

Examples of interval events include the width of an electronic pulse, the duration (validity) of an auction bid, or stock ticker activity in which the bid price for the stock is valid for a specific time period. In a power monitoring scenario, for example, a power meter event stream may be represented with the following interval events.

| Event Kind | Start | End | Payload (Consumption) |
| --- | --- | --- | --- |
| INSERT | 2009-07-15 09:13:33.317 | 2009-07-15 09:14:09.270 | 100 |
| INSERT | 2009-07-15 09:14:09.270 | 2009-07-15 09:14:22.255 | 200 |
| INSERT | 2009-07-15 09:14:22.255 | 2009-07-15 09:15:04.987 | 100 |

Point

A point event model represents an event occurrence at a single point in time. The point event model requires only the start time for the event. The StreamInsight server infers the valid end time by adding a tick (the smallest unit of time in the underlying time data type) to the start time to set the valid time interval for the event. Because event end times are exclusive, point events are valid only for the single instant of their start time.
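
The inferred end time can be reproduced with `DateTimeOffset.AddTicks`. This is a plain .NET sketch; the class and method names are illustrative assumptions and the code does not call the StreamInsight API.

```csharp
using System;

public static class PointLifetime
{
    // End time of a point event: the start time plus one tick.
    public static DateTimeOffset InferEndTime(DateTimeOffset start)
    {
        return start.AddTicks(1);
    }

    public static void Main()
    {
        var start = new DateTimeOffset(2009, 7, 15, 9, 13, 33, 317, TimeSpan.Zero);
        DateTimeOffset end = InferEndTime(start);

        Console.WriteLine((end - start).Ticks);          // 1
        Console.WriteLine(TimeSpan.TicksPerMillisecond); // 10000 (one tick is 100 ns)
    }
}
```

Combined with the exclusive end time, the resulting interval [start, start + 1 tick) contains exactly one representable instant.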

The following diagram shows the layout of a point event model.

| Metadata | Payload |
| --- | --- |
| Event kind ::= INSERT | Field 1 … Field n as CLR types |
| StartTime ::= DateTimeOffset | |

Examples of point events include a meter reading, the arrival of an email, a user Web click, a stock tick, or an entry into the Windows Event Log. In a power monitoring scenario, for example, a power meter event stream may be represented with the following point events. Note that the end time is calculated as the start time plus 1 tick (t).

| Event Kind | Start | End | Payload (Consumption) |
| --- | --- | --- | --- |
| INSERT | 2009-07-15 09:13:33.317 | 2009-07-15 09:13:33.317 + t | 100 |
| INSERT | 2009-07-15 09:14:09.270 | 2009-07-15 09:14:09.270 + t | 200 |
| INSERT | 2009-07-15 09:14:22.255 | 2009-07-15 09:14:22.255 + t | 100 |

Edge

An edge event model represents an event occurrence whose payload is valid for a given interval of time. When a Start edge event is received, the end time is set to the maximum time into the future. When the End event is received, the end time of the event is updated. The edge event model contains two properties: occurrence time and an edge type. Together, these properties define either the start or end point of the edge event.
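
The Start and End behavior described above can be sketched as a small state table. `EdgeLifetimes` is a hypothetical helper written in plain .NET, not a StreamInsight type. The sketch assumes that an End edge repeats the start time and payload of its Start edge so that the two can be matched.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not a StreamInsight type.
public sealed class EdgeLifetimes
{
    // Key: (start time, payload) of a Start edge. Value: current end time.
    private readonly Dictionary<Tuple<DateTimeOffset, string>, DateTimeOffset> endTimes =
        new Dictionary<Tuple<DateTimeOffset, string>, DateTimeOffset>();

    // A Start edge opens the event with the end time set to the maximum future time.
    public void OnStart(DateTimeOffset start, string payload)
    {
        endTimes[Tuple.Create(start, payload)] = DateTimeOffset.MaxValue;
    }

    // An End edge updates the end time of the matching open event.
    // Returns false when no matching Start edge has been seen.
    public bool OnEnd(DateTimeOffset start, DateTimeOffset end, string payload)
    {
        var key = Tuple.Create(start, payload);
        if (!endTimes.ContainsKey(key)) return false;
        endTimes[key] = end;
        return true;
    }

    public DateTimeOffset EndTimeOf(DateTimeOffset start, string payload)
    {
        return endTimes[Tuple.Create(start, payload)];
    }
}

public static class EdgeDemo
{
    public static void Main()
    {
        var t0 = new DateTimeOffset(2009, 7, 15, 9, 0, 0, TimeSpan.Zero);
        var t1 = t0.AddSeconds(30);
        var state = new EdgeLifetimes();

        state.OnStart(t0, "a");
        Console.WriteLine(state.EndTimeOf(t0, "a") == DateTimeOffset.MaxValue); // True

        state.OnEnd(t0, t1, "a");
        Console.WriteLine(state.EndTimeOf(t0, "a") == t1); // True
    }
}
```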

The following diagram shows the layout of an edge event model.

| Metadata | Payload |
| --- | --- |
| Event kind ::= INSERT | Field 1 … Field n as CLR types |
| Edge time ::= DateTimeOffset | |
| Edge type ::= START \| END | |

Examples of edge events are Windows processes, trace events from Event Tracing for Windows (ETW), a Web user session, or quantization of an analog signal. The valid time interval for the payload of an edge event is the difference between the timestamp of the Start event and the timestamp of the End event. In the following table, notice that the event with a payload value of 'c' does not yet have a known end time.

| Event Kind | Edge Type | Start Time | End Time | Payload |
| --- | --- | --- | --- | --- |
| INSERT | Start | t0 | DateTimeOffset.MaxValue | a |
| INSERT | End | t0 | t1 | a |
| INSERT | Start | t1 | DateTimeOffset.MaxValue | b |
| INSERT | End | t1 | t3 | b |
| INSERT | Start | t3 | DateTimeOffset.MaxValue | c |

… and so on

The following illustration shows the quantization of an analog signal using edge events based on the start and end times defined in the table above. Such a continuous signal implies that for every new value, both an END as well as a START edge must be submitted. The described edges in the illustration refer to the event from time t1 to t3.

[Illustration: EdgeEvent]

It is important to choose the right event model for your data. For instance, if you have events that last for a period of time, and your application can determine both the start and end times of the event, it is better to model them as interval events. If you do not know the end time of an event when it arrives, you could consider modeling the event as a point event, altering its lifetime to extend for a period of time, and then using the Clip operation to modify the lifetime when that event’s end is recognized. The alternative is to model these events as edge events.

While edge events are a convenient event model, they have performance implications that you should be aware of. Processing edge events works best when the events arrive fully ordered: all start edges are ordered on start time, all end edges are ordered on end time, and the combined sequence of events is also ordered in time. For example, suppose you have the following sequence of edge events:

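| Event Kind | Edge Type | Start Time | End Time | Payload |
| --- | --- | --- | --- | --- |
| INSERT | Start | 1 | DateTimeOffset.MaxValue | a |
| INSERT | End | 1 | 10 | a |
| INSERT | Start | 3 | DateTimeOffset.MaxValue | b |
| INSERT | End | 3 | 6 | b |
| INSERT | Start | 5 | DateTimeOffset.MaxValue | c |
| INSERT | End | 5 | 20 | c |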

This sequence is unordered on timestamps (1, 10, 3, 6, 5, 20). If the edge events were fully ordered (1, 3, 5, 6, 10, 20), query processing would be more efficient. You can achieve this by splitting the problem into two queries. The first query is an empty (pass-through) query that receives the edge events as input, fully orders them, and outputs the ordered edge events. The second query takes that output as its input and performs the main logic. Note that these should be defined as two separate queries and then joined together using dynamic query composition. For more information, see Composing Queries at Runtime.
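
To make the two orderings concrete, the following plain .NET sketch sorts the edge timestamps from the table. It illustrates only the target ordering and is not a StreamInsight query; in StreamInsight the first query performs the ordering. The class and method names are illustrative assumptions.

```csharp
using System;
using System.Linq;

public static class EdgeOrdering
{
    // Returns the edge timestamps in fully ordered (ascending) sequence.
    public static int[] FullyOrder(int[] edgeTimestamps)
    {
        return edgeTimestamps.OrderBy(t => t).ToArray();
    }

    public static void Main()
    {
        // Edge timestamps in arrival order, taken from the table above.
        int[] arrival = { 1, 10, 3, 6, 5, 20 };

        Console.WriteLine(string.Join(", ", FullyOrder(arrival))); // 1, 3, 5, 6, 10, 20
    }
}
```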

Event Payload

The payload of an event is a .NET data structure that contains the data associated with the event. The fields in the payload are user-defined and their types are based on the .NET type system. Most CLR scalar and elementary types are supported for payload fields, as are nested types composed of those types.

Event Fields

When you design an event structure, you define a .NET class or struct that represents the fixed payload, or use a primitive type if your event payload can be represented by a single field. When using structs or classes, you can use only public fields and properties as payload fields. Private fields and properties, and class methods are ignored and cannot be used in the event type. The following example defines a simple event type that has two payload fields, V1 and V2, of type int.

// Event payload with two int fields, V1 and V2.
public class MyPayload
{
    public int V1 { get; set; }
    public int V2 { get; set; }
}

Another example shows how to use a nested event type:

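// Event payload whose Value field is itself a structured (nested) type.
public class ComplexPayload
{
    public ValueType Value { get; set; }
    public bool Status { get; set; }
}

// Nested type used by ComplexPayload.Value.
public class ValueType
{
    public double Value { get; set; }
    public int Quality { get; set; }
}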

Payload Field Requirements

When defining events, consider the following payload field requirements and functionality.

  • An event structure cannot have an empty payload structure. At least one field is required.

  • Both scalar and elementary .NET Framework types and nested types can be used for the payload fields. See the section 'Supported Data Types' that follows.

  • Fields cannot be modified by using custom attributes.

  • Events in the StreamInsight server are an ordered list of fields, unlike .NET structs, which do not impose an order on their fields.

  • Nullability of the field is inferred. For example, int? will be nullable, but int will not be nullable. String and byte[] types are always nullable.

  • The default size of byte[] is 512.

  • The maximum length of string fields is bounded only by the page size for the entire event, which is 16 KB.
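
The nullability rule in the list above follows the ordinary .NET distinction between value types and reference types. The following sketch shows that distinction with plain reflection; it does not show how the StreamInsight server itself inspects payload types, and the class and method names are illustrative assumptions.

```csharp
using System;

public static class PayloadNullability
{
    // Reference types (such as string and byte[]) can always hold null.
    // Value types are nullable only when wrapped as Nullable<T>, for example int?.
    public static bool IsNullable(Type fieldType)
    {
        return !fieldType.IsValueType || Nullable.GetUnderlyingType(fieldType) != null;
    }

    public static void Main()
    {
        Console.WriteLine(IsNullable(typeof(int)));    // False
        Console.WriteLine(IsNullable(typeof(int?)));   // True
        Console.WriteLine(IsNullable(typeof(string))); // True
        Console.WriteLine(IsNullable(typeof(byte[]))); // True
    }
}
```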

Event Size

There is no explicit limitation on the number of fields that can be defined in the event. The number of fields depends on the type of the individual fields, their size, and nullability.

The event page size in the StreamInsight server is 16 KB. Because an event cannot span multiple pages, the effective maximum event size (including payload and timestamp fields) is 16 KB minus some overhead. In addition to the fixed overhead incurred by the page header, event header, and system fields such as timestamps, nullability adds variable overhead on the order of N/8, rounded up.
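
As a worked example of the N/8 term, the sketch below assumes that N is the number of payload fields, that the overhead is measured in bytes (one bit per field), and that the adjustment to an upper boundary means rounding up to a whole byte. These are readings of the paragraph above, not documented facts, and the method name is hypothetical.

```csharp
using System;

public static class EventSizeEstimate
{
    // Assumed model: one nullability bit per field, rounded up to whole bytes.
    public static int NullabilityOverheadBytes(int fieldCount)
    {
        return (fieldCount + 7) / 8; // integer ceiling of fieldCount / 8
    }

    public static void Main()
    {
        Console.WriteLine(NullabilityOverheadBytes(1)); // 1
        Console.WriteLine(NullabilityOverheadBytes(8)); // 1
        Console.WriteLine(NullabilityOverheadBytes(9)); // 2
    }
}
```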

To maximize event page utilization, we recommend the following guidelines:

  • Avoid nullable fields.

  • Minimize the use of string and byte[] fields.

  • Keep event lifetimes only as long as the semantics of the scenario require, so that the memory needed for event state can be released more efficiently in the engine.

Supported Data Types

In StreamInsight, each event field and expression has a specific data type. StreamInsight supports the following data types. Event payloads can also contain nested types that are composed of these data types.

| Short name | .NET Class | Type | Width in bits | Range |
| --- | --- | --- | --- | --- |
| byte | Byte | Unsigned integer | 8 | 0 to 255 |
| sbyte | SByte | Signed integer | 8 | -128 to 127 |
| byte[] | Byte[] ¹ | byte | | |
| int | Int32 | Signed integer | 32 | -2,147,483,648 to 2,147,483,647 |
| uint | UInt32 | Unsigned integer | 32 | 0 to 4,294,967,295 |
| short | Int16 | Signed integer | 16 | -32,768 to 32,767 |
| ushort | UInt16 | Unsigned integer | 16 | 0 to 65,535 |
| long | Int64 | Signed integer | 64 | -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 |
| ulong | UInt64 | Unsigned integer | 64 | 0 to 18,446,744,073,709,551,615 |
| float | Single | Single-precision floating point type | 32 | -3.4 × 10^38 to +3.4 × 10^38 |
| double | Double | Double-precision floating point type | 64 | ±5.0 × 10^-324 to ±1.7 × 10^308 |
| decimal | Decimal | Precise fractional or integral type that can represent decimal numbers with 29 significant digits | 128 | ±1.0 × 10^-28 to ±7.9 × 10^28 |
| bool | Boolean | Logical Boolean type | 8 | true or false |
| datetime | DateTime | Dates and times with values ranging from 12:00:00 midnight, January 1, 0001 Anno Domini (Common Era) through 11:59:59 P.M., December 31, 9999 A.D. (C.E.) | | |
| timespan | TimeSpan | The number of ticks that equal the represented time interval. A tick is equal to 100 nanoseconds. | | Int64.MinValue ticks to Int64.MaxValue ticks |
| guid | Guid | Globally unique identifier | 128 | |
| char | Char | A Unicode character | 16 | U+0000 to U+FFFF |
| string | String ¹ | A sequence of Unicode characters | | |

¹ Does not include nullable type.

See Also

StreamInsight Concepts