scan operator

2025-05-30

Applies to: ✅ Microsoft Fabric ✅ Azure Data Explorer ✅ Azure Monitor ✅ Microsoft Sentinel

Scans data, matches, and builds sequences based on the predicates.

Matching records are determined according to predicates defined in the operator’s steps. A predicate can depend on the state that is generated by previous steps. The output for the matching record is determined by the input record and assignments defined in the operator's steps.

Syntax

T | scan [ with_match_id = MatchIdColumnName ] [ declare ( ColumnDeclarations ) ] with ( StepDefinitions )

ColumnDeclarations syntax

ColumnName : ColumnType[= DefaultValue ] [, ... ]

StepDefinition syntax

step StepName [ output = all | last | none] : Condition [ => Column = Assignment [, ... ] ] ;

Learn more about syntax conventions.

Parameters

Name	Type	Required	Description
T	`string`	✔️	The input tabular source.
MatchIdColumnName	`string`		The name of a column of type `long` that is appended to the output as part of the scan execution. Indicates the 0-based index of the match for the record.
ColumnDeclarations	`string`		Declares an extension to the schema of T. These columns are assigned values in the steps. If not assigned, the DefaultValue is returned. Unless otherwise specified, DefaultValue is `null`.
StepName	`string`	✔️	Used to reference values in the state of scan for conditions and assignments. The step name must be unique.
Condition	`string`	✔️	An expression that evaluates to `true` or `false` that defines which records from the input match the step. A record matches the step when the condition is `true` with the step’s state or with the previous step’s state.
Assignment	`string`		A scalar expression that is assigned to the corresponding column when a record matches a step.
`output`	`string`		Controls the output logic of the step on repeated matches. `all` outputs all records matching the step, `last` outputs only the last record in a series of repeating matches for the step, and `none` doesn't output records matching the step. The default is `all`.

Returns

A record for each match of a record from the input to a step. The schema of the output is the schema of the source extended with the column in the declare clause.

Scan logic

scan goes over the serialized input data, record by record, comparing each record against each step’s condition while taking into account the current state of each step.

State

The underlying state of the scan operator can be thought of as a table with a row for each step. Each step maintains its own state with the latest values of the columns and declared variables from all of the previous steps and the current step. If relevant, it also holds the match ID for the ongoing sequence.

If a scan operator has n steps named s_1, s_2, ..., s_n then step s_k would have k records in its state corresponding to s_1, s_2, ..., s_k. The StepName.ColumnName format is used to reference a value in the state. For instance, s_2.col1 would reference column col1 that belongs to step s_2 in the state of s_k. For a detailed example, see the scan logic walkthrough.

The state starts empty and updates whenever a scanned input record matches a step. When the state of the current step is nonempty, the step is referred to as having an active sequence.

Matching logic

Each input record is evaluated against all of the steps in reverse order, from the last step to the first. When a record r is evaluated against some step s_k, the following logic is applied:

Check 1: If the state of the previous step (s_k-1) is nonempty, and r meets the Condition of s_k, then a match occurs. The match leads to the following actions:
1. The state of s_k is cleared.
2. The state of s_k-1 is promoted to become the state of s_k.
3. The assignments of s_k are calculated and extend r.
4. The extended r is added to the output and to the state of s_k.
Note

If Check 1 results in a match, Check 2 is disregarded, and r moves on to be evaluated against s_k-1.
Check 2: If the state of s_k has an active sequence or s_k is the first step, and r meets the Condition of s_k, then a match occurs. The match leads to the following actions:
1. The assignments of s_k are calculated and extend r.
2. The values that represent s_k in the state of s_k are replaced with the values of the extended r.
3. If s_k is defined as output=all, the extended r is added to the output.
4. If s_k is the first step, a new sequence begins and the match ID increases by 1. This only affects the output when with_match_id is used.

Once the checks for s_k are complete, r moves on to be evaluated against s_k-1.

For a detailed example of this logic, see the scan logic walkthrough.

Examples

The example in this section shows how to use the syntax to help you get started.

The examples in this article use publicly available tables in the help cluster, such as the StormEvents table in the Samples database.

The examples in this article use publicly available tables, such as the StormEvents table in the Weather analytics sample data.

Cumulative sum

Calculate the cumulative sum for an input column. The result of this example is equivalent to using row_cumsum().

Run the query

range x from 1 to 5 step 1 
| scan declare (cumulative_x:long=0) with 
(
    step s1: true => cumulative_x = x + s1.cumulative_x;
)

Output

x	cumulative_x
1	1
2	3
3	6
4	10
5	15

Cumulative sum on multiple columns with a reset condition

Calculate the cumulative sum for two input columns, reset the sum value to the current record value whenever the cumulative sum reached 10 or more.

Run the query

range x from 1 to 5 step 1
| extend y = 2 * x
| scan declare (cumulative_x:long=0, cumulative_y:long=0) with 
(
    step s1: true => cumulative_x = iff(s1.cumulative_x >= 10, x, x + s1.cumulative_x), 
                     cumulative_y = iff(s1.cumulative_y >= 10, y, y + s1.cumulative_y);
)

Output

x	y	cumulative_x	cumulative_y
1	2	1	2
2	4	3	6
3	6	6	12
4	8	10	8
5	10	5	18

Fill forward a column

Fill forward a string column. Each empty value is assigned the last seen nonempty value.

Run the query

let Events = datatable (Ts: timespan, Event: string) [
    0m, "A",
    1m, "",
    2m, "B",
    3m, "",
    4m, "",
    6m, "C",
    8m, "",
    11m, "D",
    12m, ""
]
;
Events
| sort by Ts asc
| scan declare (Event_filled: string="") with 
(
    step s1: true => Event_filled = iff(isempty(Event), s1.Event_filled, Event);
)

Output

Ts	Event	Event_filled
00:00:00	A	A
00:01:00		A
00:02:00	B	B
00:03:00		B
00:04:00		B
00:06:00	C	C
00:08:00		C
00:11:00	D	D
00:12:00		D

Sessions tagging

Divide the input into sessions: a session ends 30 minutes after the first event of the session, after which a new session starts. Note the use of with_match_id flag, which assigns a unique value for each distinct match (session) of scan. Also note the special use of two steps in this example, inSession has true as condition so it captures and outputs all the records from the input while endSession captures records that happen more than 30m from the sessionStart value for the current match. The endSession step has output=none meaning it doesn't produce output records. The endSession step is used to advance the state of the current match from inSession to endSession, allowing a new match (session) to begin, starting from the current record.

Run the query

let Events = datatable (Ts: timespan, Event: string) [
    0m, "A",
    1m, "A",
    2m, "B",
    3m, "D",
    32m, "B",
    36m, "C",
    38m, "D",
    41m, "E",
    75m, "A"
]
;
Events
| sort by Ts asc
| scan with_match_id=session_id declare (sessionStart: timespan) with 
(
    step inSession: true => sessionStart = iff(isnull(inSession.sessionStart), Ts, inSession.sessionStart);
    step endSession output=none: Ts - inSession.sessionStart > 30m;
)

Output

Ts	Event	sessionStart	session_id
00:00:00	A	00:00:00	0
00:01:00	A	00:00:00	0
00:02:00	B	00:00:00	0
00:03:00	D	00:00:00	0
00:32:00	B	00:32:00	1
00:36:00	C	00:32:00	1
00:38:00	D	00:32:00	1
00:41:00	E	00:32:00	1
01:15:00	A	01:15:00	2

Calculating Session Length per User

Calculate the session start time, end time, and duration for each user's session using the scan operator. A session is defined as a period between a user's login and the subsequent logout. By combining partition and scan with output=none and output=all, this pattern ensures that a single row is returned per session (i.e., per login/logout pair), rather than a row per event.

The logic works by:

In step s1: Capturing the login timestamp using a scan step with output=none
In step s2: Emitting a row only when a matching logout is found using output=all

Run the query

let LogsEvents = datatable(Timestamp:datetime, userID:int, EventType:string)
[
    datetime(2024-05-28 08:15:23), 1, "login",
    datetime(2024-05-28 08:30:15), 2, "login",
    datetime(2024-05-28 09:10:27), 3, "login",
    datetime(2024-05-28 12:30:45), 1, "logout",
    datetime(2024-05-28 11:45:32), 2, "logout",
    datetime(2024-05-28 13:25:19), 3, "logout"
];
LogsEvents
| sort by userID, Timestamp
| partition hint.strategy=native by userID (
    sort by Timestamp asc 
    | scan declare (start: datetime, end: datetime, sessionDuration: timespan) with (
        step s1 output=none: EventType == "login" => start = Timestamp;
        step s2 output=all: EventType == "logout" => start = s1.start, end = Timestamp, sessionDuration = Timestamp - s1.start;
    )
)
| project start, end, userID, sessionDuration

Output

userID	start	end	sessionDuration
1	2024-05-28 08:15:23.0000	2024-05-28 12:30:45.0000	04:15:22
3	2024-05-28 09:10:27.0000	2024-05-28 13:25:19.0000	04:14:52
2	2024-05-28 08:30:15.0000	2024-05-28 11:45:32.0000	03:15:17

Events between Start and Stop

Find all sequences of events between the event Start and the event Stop that occur within 5 minutes. Assign a match ID for each sequence.

Run the query

let Events = datatable (Ts: timespan, Event: string) [
    0m, "A",
    1m, "Start",
    2m, "B",
    3m, "D",
    4m, "Stop",
    6m, "C",
    8m, "Start",
    11m, "E",
    12m, "Stop"
]
;
Events
| sort by Ts asc
| scan with_match_id=m_id with 
(
    step s1: Event == "Start";
    step s2: Event != "Start" and Event != "Stop" and Ts - s1.Ts <= 5m;
    step s3: Event == "Stop" and Ts - s1.Ts <= 5m;
)

Output

Ts	Event	m_id
00:01:00	Start	0
00:02:00	B	0
00:03:00	D	0
00:04:00	Stop	0
00:08:00	Start	1
00:11:00	E	1
00:12:00	Stop	1

Calculate a custom funnel of events

Calculate a funnel completion of the sequence Hail -> Tornado -> Thunderstorm Wind by State with custom thresholds on the times between the events (Tornado within 1h and Thunderstorm Wind within 2h). This example is similar to the funnel_sequence_completion plugin, but allows greater flexibility.

Run the query

StormEvents
| partition hint.strategy=native by State 
    (
    sort by StartTime asc
    | scan with 
    (
        step hail: EventType == "Hail";
        step tornado: EventType == "Tornado" and StartTime - hail.StartTime <= 1h;
        step thunderstormWind: EventType == "Thunderstorm Wind" and StartTime - tornado.StartTime <= 2h;
    )
    )
| summarize dcount(State) by EventType

Output

EventType	dcount_State
Hail	50
Tornado	34
Thunderstorm Wind	32

Scan logic walkthrough

This section demonstrates the scan logic using a step-by-step walkthrough of the Events between start and stop example:

Run the query

let Events = datatable (Ts: timespan, Event: string) [
    0m, "A",
    1m, "Start",
    2m, "B",
    3m, "D",
    4m, "Stop",
    6m, "C",
    8m, "Start",
    11m, "E",
    12m, "Stop"
]
;
Events
| sort by Ts asc
| scan with_match_id=m_id with 
(
    step s1: Event == "Start";
    step s2: Event != "Start" and Event != "Stop" and Ts - s1.Ts <= 5m;
    step s3: Event == "Stop" and Ts - s1.Ts <= 5m;
)

Output

Ts	Event	m_id
00:01:00	Start	0
00:02:00	B	0
00:03:00	D	0
00:04:00	Stop	0
00:08:00	Start	1
00:11:00	E	1
00:12:00	Stop	1

The state

Think of the state of the scan operator as a table with a row for each step, in which each step has its own state. This state contains the latest values of the columns and declared variables from all of the previous steps and the current step. To learn more, see State.

For this example, the state can be represented with the following table:

step	s2.Ts	s2.Event	s3.Ts	s3.Event
s1	X	X	X	X
s2			X	X
s3

The "X" indicates that a specific field is irrelevant for that step.

The matching logic

This section follows the matching logic through each record of the Events table, explaining the transformation of the state and output at each step.

Note

An input record is evaluated against the steps in reverse order, from the last step (s3) to the first step (s1).

Record 1

Ts	Event
0m	"A"

Record evaluation at each step:

s3: Check 1 isn't passed because the state of s2 is empty, and Check 2 isn't passed because s3 lacks an active sequence.
s2: Check 1 isn't passed because the state of s1 is empty, and Check 2 isn't passed because s2 lacks an active sequence.
s1: Check 1 is irrelevant because there's no previous step. Check 2 isn't passed because the record doesn't meet the condition of Event == "Start". Record 1 is discarded without affecting the state or output.

State:

step	s2.Ts	s2.Event	s3.Ts	s3.Event
s1	X	X	X	X
s2			X	X
s3

Record 2

Ts	Event
1m	"Start"

Record evaluation at each step:

s3: Check 1 isn't passed because the state of s2 is empty, and Check 2 isn't passed because s3 lacks an active sequence.
s2: Check 1 isn't passed because the state of s1 is empty, and Check 2 isn't passed because s2 lacks an active sequence.
s1: Check 1 is irrelevant because there's no previous step. Check 2 is passed because the record meets the condition of Event == "Start". This match initiates a new sequence, and the m_id is assigned. Record 2 and its m_id (0) are added to the state and the output.

State:

step	m_id	s1.Ts	s1.Event	s2.Ts	s2.Event	s3.Ts	s3.Event
s1	0	00:01:00	"Start"	X	X	X	X
s2						X	X
s3

Record 3

Ts	Event
2m	"B"

Record evaluation at each step:

s3: Check 1 isn't passed because the state of s2 is empty, and Check 2 isn't passed because s3 lacks an active sequence.
s2: Check 1 is passed because the state of s1 is nonempty and the record meets the condition of Ts - s1.Ts < 5m. This match causes the state of s1 to be cleared and the sequence in s1 to be promoted to s2. Record 3 and its m_id (0) are added to the state and the output.
s1: Check 1 is irrelevant because there's no previous step, and Check 2 isn't passed because the record doesn't meet the condition of Event == "Start".

State:

step	m_id	s1.Ts	s1.Event	s2.Ts	s2.Event	s3.Ts	s3.Event
s1				X	X	X	X
s2	0	00:01:00	"Start"	00:02:00	"B"	X	X
s3

Record 4

Ts	Event
3m	"D"

Record evaluation at each step:

s3: Check 1 isn't passed because the record doesn't meet the condition of Event == "Stop", and Check 2 isn't passed because s3 lacks an active sequence.
s2: Check 1 isn't passed because the state of s1 is empty. it passes Check 2 because it meets the condition of Ts - s1.Ts < 5m. Record 4 and its m_id (0) are added to the state and the output. The values from this record overwrite the previous state values for s2.Ts and s2.Event.
s1: Check 1 is irrelevant because there's no previous step, and Check 2 isn't passed because the record doesn't meet the condition of Event == "Start".

State:

step	m_id	s1.Ts	s1.Event	s2.Ts	s2.Event	s3.Ts	s3.Event
s1				X	X	X	X
s2	0	00:01:00	"Start"	00:03:00	"D"	X	X
s3

Record 5

Ts	Event
4m	"Stop"

Record evaluation at each step:

s3: Check 1 is passed because s2 is nonempty and it meets the s3 condition of Event == "Stop". This match causes the state of s2 to be cleared and the sequence in s2 to be promoted to s3. Record 5 and its m_id (0) are added to the state and the output.
s2: Check 1 isn't passed because the state of s1 is empty, and Check 2 isn't passed because s2 lacks an active sequence.
s1: Check 1 is irrelevant because there's no previous step. Check 2 isn't passed because the record doesn't meet the condition of Event == "Start".

State:

step	m_id	s1.Ts	s1.Event	s2.Ts	s2.Event	s3.Ts	s3.Event
s1				X	X	X	X
s2						X	X
s3	0	00:01:00	"Start"	00:03:00	"D"	00:04:00	"Stop"

Record 6

Ts	Event
6m	"C"

Record evaluation at each step:

s3: Check 1 isn't passed because the state of s2 is empty, and Check 2 isn't passed because s3 doesn't meet the s3 condition of Event == "Stop".
s2: Check 1 isn't passed because the state of s1 is empty, and Check 2 isn't passed because s2 lacks an active sequence.
s1: Check 1 isn't passed because there's no previous step, and Check 2 isn't passed because it doesn't meet the condition of Event == "Start". Record 6 is discarded without affecting the state or output.

State:

step	m_id	s1.Ts	s1.Event	s2.Ts	s2.Event	s3.Ts	s3.Event
s1				X	X	X	X
s2						X	X
s3	0	00:01:00	"Start"	00:03:00	"D"	00:04:00	"Stop"

Record 7

Ts	Event
8m	"Start"

Record evaluation at each step:

s3: Check 1 isn't passed because the state of s2 is empty, and Check 2 isn't passed because it doesn't meet the condition of Event == "Stop".
s2: Check 1 isn't passed because the state of s1 is empty, and Check 2 isn't passed because s2 lacks an active sequence.
s1: Check 1 isn't passed because there's no previous step. it passes Check 2 because it meets the condition of Event == "Start". This match initiates a new sequence in s1 with a new m_id. Record 7 and its m_id (1) are added to the state and the output.

State:

step	m_id	s1.Ts	s1.Event	s2.Ts	s2.Event	s3.Ts	s3.Event
s1	1	00:08:00	"Start"	X	X	X	X
s2						X	X
s3	0	00:01:00	"Start"	00:03:00	"D"	00:04:00	"Stop"

Note

There are now two active sequences in the state.

Record 8

Ts	Event
11m	"E"

Record evaluation at each step:

s3: Check 1 isn't passed because the state of s2 is empty, and Check 2 isn't passed because it doesn't meet the s3 condition of Event == "Stop".
s2: Check 1 is passed because the state of s1 is nonempty and the record meets the condition of Ts - s1.Ts < 5m. This match causes the state of s1 to be cleared and the sequence in s1 to be promoted to s2. Record 8 and its m_id (1) are added to the state and the output.
s1: Check 1 is irrelevant because there's no previous step, and Check 2 isn't passed because the record doesn't meet the condition of Event == "Start".

State:

step	m_id	s1.Ts	s1.Event	s2.Ts	s2.Event	s3.Ts	s3.Event
s1				X	X	X	X
s2	1	00:08:00	"Start"	00:11:00	"E"	X	X
s3	0	00:01:00	"Start"	00:03:00	"D"	00:04:00	"Stop"

Record 9

Ts	Event
12m	"Stop"

Record evaluation at each step:

s3: Check 1 is passed because s2 is nonempty and it meets the s3 condition of Event == "Stop". This match causes the state of s2 to be cleared and the sequence in s2 to be promoted to s3. Record 9 and its m_id (1) are added to the state and the output.
s2: Check 1 isn't passed because the state of s1 is empty, and Check 2 isn't passed because s2 lacks an active sequence.
s1: Check 1 isn't passed because there's no previous step. it passes Check 2 because it meets the condition of Event == "Start". This match initiates a new sequence in s1 with a new m_id.

State:

step	m_id	s1.Ts	s1.Event	s2.Ts	s2.Event	s3.Ts	s3.Event
s1				X	X	X	X
s2						X	X
s3	1	00:08:00	"Start"	00:11:00	"E"	00:12:00	"Stop"

Final output

Ts	Event	m_id
00:01:00	Start	0
00:02:00	B	0
00:03:00	D	0
00:04:00	Stop	0
00:08:00	Start	1
00:11:00	E	1
00:12:00	Stop	1

Share via

scan operator

Syntax

ColumnDeclarations syntax

StepDefinition syntax

Parameters

Returns

Scan logic

State

Matching logic

Examples

Cumulative sum

Cumulative sum on multiple columns with a reset condition

Fill forward a column

Sessions tagging

Calculating Session Length per User

Events between Start and Stop

Calculate a custom funnel of events

Scan logic walkthrough

The state

The matching logic

Record 1

Record 2

Record 3

Record 4

Record 5

Record 6

Record 7

Record 8

Record 9

Final output

Feedback

Additional resources