parse-where operator
Applies to: ✅ Microsoft Fabric ✅ Azure Data Explorer ✅ Azure Monitor ✅ Microsoft Sentinel
Evaluates a string expression, and parses its value into one or more calculated columns. The result is only the successfully parsed strings.
parse-where
parses the strings in the same way as parse, and filters out strings that were not parsed successfully.
See parse operator, which produces nulls for unsuccessfully parsed strings.
Syntax
T | parse-where
[kind=
kind [flags=
regexFlags]] expression with
*
(stringConstant columnName [:
columnType]) *
...
Learn more about syntax conventions.
Parameters
Name | Type | Required | Description |
---|---|---|---|
T | string |
✔️ | The tabular input to parse. |
kind | string |
✔️ | One of the supported kind values. The default value is simple . |
regexFlags | string |
If kind is regex , then you can specify regex flags to be used like U for ungreedy, m for multi-line mode, s for match new line \n , and i for case-insensitive. More flags can be found in Flags. |
|
expression | string |
✔️ | An expression that evaluates to a string. |
stringConstant | string |
✔️ | A string constant for which to search and parse. |
columnName | string |
✔️ | The name of a column to assign a value to, extracted from the string expression. |
columnType | string |
The scalar value that indicates the type to convert the value to. The default is the string . |
Note
- Use project if you also want to drop or rename some columns.
- Use
*
in the pattern to skip junk values. This value can't be used afterstring
column. - The parse pattern may start with ColumnName, in addition to StringConstant.
- If the parsed expression isn't of type
string
, it will be converted to typestring
.
Supported kind values
Text | Description |
---|---|
simple |
This is the default value. stringConstant is a regular string value and the match is strict. All string delimiters should appear in the parsed string, and all extended columns must match the required types. |
regex |
stringConstant may be a regular expression and the match is strict. All string delimiters, which can be a regex for this mode, should appear in the parsed string, and all extended columns must match the required types. |
Regex mode
In regex mode, parse will translate the pattern to a regex and use regular expressions in order to do the matching using numbered captured groups that are handled internally. For example:
parse-where kind=regex Col with * <regex1> var1:string <regex2> var2:long
The regex that will be generated by the parse internally is .*?<regex1>(.*?)<regex2>(\-\d+)
.
*
was translated to.*?
.string
was translated to.*?
.long
was translated to\-\d+
.
Returns
The input table, which is extended according to the list of columns that are provided to the operator.
Note
Only successfully parsed strings will be in the output. Strings that don't match the pattern will be filtered out.
Examples
The parse-where
operator provides a streamlined way to extend
a table by using multiple extract
applications on the same string
expression. This is most useful when the table has a string
column that contains several values that you want to break into individual columns. For example, you can break up a column that was produced by a developer trace ("printf
"/"Console.WriteLine
") statement.
Using parse
In the example below, the column EventText
of table Traces
contains strings of the form Event: NotifySliceRelease (resourceName={0}, totalSlices= {1}, sliceNumber={2}, lockTime={3}, releaseTime={4}, previousLockTime={5})
. The operation below will extend the table with six columns: resourceName
, totalSlices
, sliceNumber
, lockTime
, releaseTime
, previousLockTime
, Month
, and Day
.
A few of the strings don't have a full match.
Using parse
, the calculated columns will have nulls.
let Traces = datatable(EventText: string)
[
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=invalid_number, lockTime=02/17/2016 08:40:01, releaseTime=02/17/2016 08:40:01, previousLockTime=02/17/2016 08:39:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=15, lockTime=02/17/2016 08:40:00, releaseTime=invalid_datetime, previousLockTime=02/17/2016 08:39:00)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=20, lockTime=02/17/2016 08:40:01, releaseTime=02/17/2016 08:40:01, previousLockTime=02/17/2016 08:39:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=22, lockTime=02/17/2016 08:41:01, releaseTime=02/17/2016 08:41:00, previousLockTime=02/17/2016 08:40:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=invalid_number, sliceNumber=16, lockTime=02/17/2016 08:41:00, releaseTime=02/17/2016 08:41:00, previousLockTime=02/17/2016 08:40:00)"
];
Traces
| parse EventText with * "resourceName=" resourceName ", totalSlices=" totalSlices: long * "sliceNumber=" sliceNumber: long * "lockTime=" lockTime ", releaseTime=" releaseTime: date "," * "previousLockTime=" previouLockTime: date ")" *
| project
resourceName,
totalSlices,
sliceNumber,
lockTime,
releaseTime,
previouLockTime
Output
resourceName | totalSlices | sliceNumber | lockTime | releaseTime | previousLockTime |
---|---|---|---|---|---|
PipelineScheduler | 27 | 20 | 02/17/2016 08:40:01 | 2016-02-17 08:40:01.0000000 | 2016-02-17 08:39:01.0000000 |
PipelineScheduler | 27 | 22 | 02/17/2016 08:41:01 | 2016-02-17 08:41:00.0000000 | 2016-02-17 08:40:01.0000000 |
Using parse-where
Using 'parse-where' will filter-out unsuccessfully parsed strings from the result.
let Traces = datatable(EventText: string)
[
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=invalid_number, lockTime=02/17/2016 08:40:01, releaseTime=02/17/2016 08:40:01, previousLockTime=02/17/2016 08:39:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=15, lockTime=02/17/2016 08:40:00, releaseTime=invalid_datetime, previousLockTime=02/17/2016 08:39:00)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=20, lockTime=02/17/2016 08:40:01, releaseTime=02/17/2016 08:40:01, previousLockTime=02/17/2016 08:39:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=22, lockTime=02/17/2016 08:41:01, releaseTime=02/17/2016 08:41:00, previousLockTime=02/17/2016 08:40:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=invalid_number, sliceNumber=16, lockTime=02/17/2016 08:41:00, releaseTime=02/17/2016 08:41:00, previousLockTime=02/17/2016 08:40:00)"
];
Traces
| parse-where EventText with * "resourceName=" resourceName ", totalSlices=" totalSlices: long * "sliceNumber=" sliceNumber: long * "lockTime=" lockTime ", releaseTime=" releaseTime: date "," * "previousLockTime=" previousLockTime: date ")" *
| project
resourceName,
totalSlices,
sliceNumber,
lockTime,
releaseTime,
previousLockTime
Output
resourceName | totalSlices | sliceNumber | lockTime | releaseTime | previousLockTime |
---|---|---|---|---|---|
PipelineScheduler | 27 | 20 | 02/17/2016 08:40:01 | 2016-02-17 08:40:01.0000000 | 2016-02-17 08:39:01.0000000 |
PipelineScheduler | 27 | 22 | 02/17/2016 08:41:01 | 2016-02-17 08:41:00.0000000 | 2016-02-17 08:40:01.0000000 |
Regex mode using regex flags
To get the resourceName and totalSlices, use the following query:
let Traces = datatable(EventText: string)
[
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=non_valid_integer, sliceNumber=11, lockTime=02/17/2016 08:40:01, releaseTime=02/17/2016 08:40:01, previousLockTime=02/17/2016 08:39:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=15, lockTime=02/17/2016 08:40:00, releaseTime=02/17/2016 08:40:00, previousLockTime=02/17/2016 08:39:00)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=non_valid_integer, sliceNumber=44, lockTime=02/17/2016 08:40:01, releaseTime=02/17/2016 08:40:01, previousLockTime=02/17/2016 08:39:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=22, lockTime=02/17/2016 08:41:01, releaseTime=02/17/2016 08:41:00, previousLockTime=02/17/2016 08:40:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=16, lockTime=02/17/2016 08:41:00, releaseTime=02/17/2016 08:41:00, previousLockTime=02/17/2016 08:40:00)"
];
Traces
| parse-where kind = regex EventText with * "RESOURCENAME=" resourceName "," * "totalSlices=" totalSlices: long "," *
| project resourceName, totalSlices
Output
resourceName | totalSlices |
---|---|
parse-where
with case-insensitive regex flag
In the above query, the default mode was case-sensitive, so the strings were parsed successfully. No result was obtained.
To get the required result, run parse-where
with a case-insensitive (i
) regex flag.
Only three strings will be parsed successfully, so the result is three records (some totalSlices hold invalid integers).
let Traces = datatable(EventText: string)
[
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=non_valid_integer, sliceNumber=11, lockTime=02/17/2016 08:40:01, releaseTime=02/17/2016 08:40:01, previousLockTime=02/17/2016 08:39:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=15, lockTime=02/17/2016 08:40:00, releaseTime=02/17/2016 08:40:00, previousLockTime=02/17/2016 08:39:00)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=non_valid_integer, sliceNumber=44, lockTime=02/17/2016 08:40:01, releaseTime=02/17/2016 08:40:01, previousLockTime=02/17/2016 08:39:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=22, lockTime=02/17/2016 08:41:01, releaseTime=02/17/2016 08:41:00, previousLockTime=02/17/2016 08:40:01)",
"Event: NotifySliceRelease (resourceName=PipelineScheduler, totalSlices=27, sliceNumber=16, lockTime=02/17/2016 08:41:00, releaseTime=02/17/2016 08:41:00, previousLockTime=02/17/2016 08:40:00)"
];
Traces
| parse-where kind = regex flags=i EventText with * "RESOURCENAME=" resourceName "," * "totalSlices=" totalSlices: long "," *
| project resourceName, totalSlices
Output
resourceName | totalSlices |
---|---|
PipelineScheduler | 27 |
PipelineScheduler | 27 |
PipelineScheduler | 27 |