What are configuration patterns in the data processor?
Important
Azure IoT Operations Preview – enabled by Azure Arc is currently in PREVIEW. You shouldn't use this preview software in production environments.
You will need to deploy a new Azure IoT Operations installation when a generally available release is made available, you won't be able to upgrade a preview installation.
See the Supplemental Terms of Use for Microsoft Azure Previews for legal terms that apply to Azure features that are in beta, preview, or otherwise not yet released into general availability.
Several types of configuration are common to multiple pipeline stages. This article describes the path, batch, templates, retry, and duration configuration patterns.
Path
Several pipeline stages use a path to identify a location in the message where data should be read from or written to. To define these locations, you use a Path
field that uses jq syntax:
- A path is defined as a string in the UI, and uses the jq syntax.
- A path is defined relative to the root of the data processor message. The path
.
refers to the entire message. - All paths start with
.
. - Each segment of path can be:
.<identifier>
for an alphanumeric object key such as.topic
.."<identifier>"
for an arbitrary object key such as."asset-id"
.["<identifier>"]
for an arbitrary object key such as["asset-id"]
.[<index>]
for an array index such as[2]
.
- Segments can be chained to form complex paths.
Individual path segments must conform to the following regular expressions:
Pattern | Regex | Example |
---|---|---|
.<identifier> |
\.[a-zA-Z_][a-zA-Z0-9_]* |
.topic |
."<identifier>" |
\."(\\"\|[^"])*" |
."asset-id" |
["<identifier>"] |
\["(\\"\|[^"])*"\] |
["asset-id"] |
[<index>] |
\[-?[0-9]+\] |
[2] |
To learn more, see Paths in the jq guide.
Path examples
The following paths are examples based on the standard data processor Message structure:
.systemProperties
returns:{ "partitionKey": "foo", "partitionId": 5, "timestamp": "2023-01-11T10:02:07Z" }
.payload.humidity
returns:10
.userProperties[0]
returns:{ "key": "prop1", "value": "value1" }
.userProperties[0].value
returns:value1
Duration
Several stages require the definition of a timespan duration. For example, windows in an aggregate stage and timeouts in a call out stage. These stages use a Duration
field to represent this value.
Several pipeline stages use a timespan duration. For example, the aggregate stage has a windows property and the call out stages have a timeouts property. To define these timespans, you use a Duration
field:
Duration is a string value with the following format <number><char><number><char>
where `char`` can be:
h
: hourm
: minutes
: secondms
: millisecondus
,µs
: microsecondns
: nanosecond
Duration examples
2h
1h10m30s
3m9ms400ns
Templates
Several stages require you to define a string with a mix of dynamic and static values. These stages use template values.
Data processor templates use Mustache syntax to define dynamic values in strings.
The dynamic system values available for use in templates are:
instanceId
: The data processor instance ID.pipelineId
: The pipeline ID the stage is part of.YYYY
: The current year.MM
: The current month.DD
: The current date.HH
: The current hour.mm
: The current minute.
Currently, you can use templates to define file paths in a destination stage.
Static and dynamic fields
Some stages require the definition of values that can either be static strings or a dynamic value derived from a Path
in a Message. To define these values, you can use static or dynamic fields.
A static or dynamic field is always written as an object with a type
field that has one of two values: static
or dynamic
. The schema varies based on type
.
Static fields
The static definition is a fixed value for type
is static
. The actual value is held in value
.
Field | Type | Description | Required | Default | Example |
---|---|---|---|---|---|
type | const string | The type of the field. | Yes | - | static |
value | any | The static value to use for the configuration (typically a string). | Yes | - | "static" |
The following examples show some static field definitions:
{
"some-field": {
"type": "static",
"value": "some-static-value"
}
}
{
"some-boolean-field": {
"type": "static",
"value": true
}
}
{
"some-complex-field": {
"type": "static",
"value": {
"some": [
"nested",
"data"
]
}
}
}
Dynamic fields
The fixed value for type
is dynamic
. The value is a jq path.
Field | Type | Description | Required | Default | Example |
---|---|---|---|---|---|
type | const string | The type of the field | Yes | - | dynamic |
value | Path | The path in each message where a value for the field can be dynamically retrieved. | Yes | - | .systemProperties.partitionKey |
The following example shows a dynamic field definition:
{
"some-field": {
"type": "dynamic",
"value": ".systemProperties.topic"
}
}
Retry
Stages that call out to external services can use retries to handle temporary failures and improve reliability. You can override the default retry policy when you configure a stage.
There are four possible retry policies:
default
: The default retry policy is to use an exponential backoff retry with three retries and 500 ms initial retry interval.none
: No retries are performed.fixed
: A fixed retry policy retries a fixed number of times with a fixed interval between each retry.exponential
: An exponential retry policy retries a fixed number of times with an exponentially increasing interval between each retry.
If you choose default
or none
, you don't need to provide any more configuration. If you choose fixed
or exponential
, you must provide more configuration:
Fixed
Field | Type | Description | Required? | Default | Example |
---|---|---|---|---|---|
type |
string | The name of the retry type: fixed |
Yes | N/A | fixed |
interval |
Duration | Initial time period between each retry. | No | 500ms |
2s |
maxRetries |
int | The number of retries, from 1 to 20 |
No | 3 |
20 |
Exponential
Field | Type | Description | Required? | Default | Example |
---|---|---|---|---|---|
type |
string | The name of the retry type: exponential |
Yes | N/A | exponential |
interval |
Duration | Initial time period between each retry. | No | 500ms |
2s |
maxRetries |
int | The number of retries, from 1 to 20 |
No | 3 |
20 |
maxInterval |
Duration | Maximum time period between each retry. | No | None | 100s |
Exponential retry times are calculated as follows: If the interval starts with 500ms
, the next few retry intervals are randInt(500ms, 500ms * 2^1)
, randInt(500ms, 500ms * 2^2)
, …, randInt(500ms, 500ms * 2^maxRetries)
.
Batch
Several destinations in the output stage let you batch messages before they write to the destination. These destinations use batch to define the batching parameters.
Currently, you can define batch based on time. To define batching, you need to specify the time interval and a path. Path
defines which portion of the incoming message to include in the batched output.
Field | Type | Description | Required | Default | Example |
---|---|---|---|---|---|
Time | Duration | How long to batch data | No | 60s (In destinations where batching is enforced) |
120s |
Path | Path | The path to value in each message to include in the output. | No | .payload |
.payload.output |