Send data to Azure Data Explorer from a data processor pipeline
Important
Azure IoT Operations Preview – enabled by Azure Arc is currently in PREVIEW. You shouldn't use this preview software in production environments.
You'll need to deploy a new Azure IoT Operations installation when a generally available release becomes available; you won't be able to upgrade a preview installation.
See the Supplemental Terms of Use for Microsoft Azure Previews for legal terms that apply to Azure features that are in beta, preview, or otherwise not yet released into general availability.
Use the Azure Data Explorer destination to write data to a table in Azure Data Explorer from a data processor pipeline. The destination stage batches messages before it sends them to Azure Data Explorer.
Prerequisites
To configure and use an Azure Data Explorer destination pipeline stage, you need:
- A deployed instance of the data processor.
- An Azure Data Explorer cluster.
- A database in your Azure Data Explorer cluster.
Set up Azure Data Explorer
Before you can write to Azure Data Explorer from a data pipeline, you need to grant access to the database from the pipeline. You can use either a service principal or a managed identity to authenticate the pipeline to the database. The advantage of using a managed identity is that you don't need to manage the lifecycle of the service principal. The managed identity is automatically managed by Azure and is tied to the lifecycle of the resource it's assigned to.
To create a service principal with a client secret:
Use the following Azure CLI command to create a service principal.
```
az ad sp create-for-rbac --name <YOUR_SP_NAME>
```
The output of this command includes an `appId`, `displayName`, `password`, and `tenant`. Make a note of these values to use when you configure access to your Azure Data Explorer database, create a secret, and configure a pipeline destination:

```
{
    "appId": "<app-id>",
    "displayName": "<name>",
    "password": "<client-secret>",
    "tenant": "<tenant-id>"
}
```
To grant admin access to your Azure Data Explorer database, run the following command in your database query tab:
```
.add database <DatabaseName> admins ('aadapp=<ApplicationId>') '<Notes>'
```
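To confirm that the service principal now has access, you can list the database principals. The following is a standard Kusto management command; run it in the same database query tab:

```
.show database <DatabaseName> principals
```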
For the destination stage to connect to Azure Data Explorer, it needs access to a secret that contains the authentication details. To create a secret:
Use the following command to add a secret to your Azure Key Vault that contains the client secret you made a note of when you created the service principal:
```
az keyvault secret set --vault-name <your-key-vault-name> --name AccessADXSecret --value <client-secret>
```
Add the secret reference to your Kubernetes cluster by following the steps in Manage secrets for your Azure IoT Operations Preview deployment.
Batching
The data processor writes to Azure Data Explorer in batches. While you batch data in the data processor before sending it, Azure Data Explorer has its own default ingestion batching policy. Therefore, you might not see your data in Azure Data Explorer immediately after the data processor writes it to the Azure Data Explorer destination.
To view data in Azure Data Explorer as soon as the pipeline sends it, you can set the ingestion batching policy's `MaximumNumberOfItems` count to 1. To edit the ingestion batching policy, run the following command in your database query tab:
```
.alter database <your-database-name> policy ingestionbatching
```
```
{
"MaximumBatchingTimeSpan" : "00:00:30",
"MaximumNumberOfItems" : 1,
"MaximumRawDataSizeMB": 1024
}
```
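To check which batching policy is currently in effect, you can run the corresponding `.show` command, which is a standard Kusto management command, in your database query tab:

```
.show database <your-database-name> policy ingestionbatching
```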
Configure the destination stage
The Azure Data Explorer destination stage JSON configuration defines the details of the stage. To author the stage, you can either interact with the form-based UI, or provide the JSON configuration on the Advanced tab:
Field | Type | Description | Required | Default | Example |
---|---|---|---|---|---|
Display name | String | A name to show in the data processor UI. | Yes | - | `Azure Data Explorer output` |
Description | String | A user-friendly description of what the stage does. | No | - | `Write temperature data to Azure Data Explorer` |
Cluster URL | String | The URI of your cluster (this value isn't the data ingestion URI). | Yes | - | - |
Database | String | The database name. | Yes | - | - |
Table | String | The name of the table to write to. | Yes | - | - |
Batch | Batch | How to batch data. | No | `60s` | `10s` |
Retry | Retry | The retry policy to use. | No | `default` | `fixed` |
Authentication¹ | String | The authentication details to connect to Azure Data Explorer: *Service principal* or *Managed identity*. | Yes | *Service principal* | *Service principal* |
Columns > Name | String | The name of the column. | Yes | - | `temperature` |
Columns > Path | Path | The location within each record of the data where the value of the column should be read from. | No | `.{{name}}` | `.temperature` |
¹Authentication: Currently, the destination stage supports service principal based authentication or managed identity when it connects to Azure Data Explorer.
To configure service principal based authentication, provide the following values. You made a note of these values when you created the service principal and added the secret reference to your cluster.
Field | Description | Required |
---|---|---|
TenantId | The tenant ID. | Yes |
ClientId | The app ID you made a note of when you created the service principal that has access to the database. | Yes |
Secret | The secret reference you created in your cluster. | Yes |
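As an illustration, these three values map onto the `authentication` object in the stage's JSON configuration as follows. The field names are taken from the sample configuration later in this article; note that `clientSecret` holds the name of the secret reference in your cluster, not the raw client secret value:

```
"authentication": {
    "type": "servicePrincipal",
    "tenantId": "<tenant-id>",
    "clientId": "<app-id>",
    "clientSecret": "<secret-reference>"
}
```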
Sample configuration
The following JSON example shows a complete Azure Data Explorer destination stage configuration that writes data from each message to the `quickstart` table in the database:
```
{
    "displayName": "Azure data explorer - 71c308",
    "type": "output/dataexplorer@v1",
    "viewOptions": {
        "position": {
            "x": 0,
            "y": 784
        }
    },
    "clusterUrl": "https://clusterurl.region.kusto.windows.net",
    "database": "databaseName",
    "table": "quickstart",
    "authentication": {
        "type": "servicePrincipal",
        "tenantId": "tenantId",
        "clientId": "clientId",
        "clientSecret": "secretReference"
    },
    "batch": {
        "time": "5s",
        "path": ".payload"
    },
    "columns": [
        {
            "name": "Timestamp",
            "path": ".Timestamp"
        },
        {
            "name": "AssetName",
            "path": ".assetName"
        },
        {
            "name": "Customer",
            "path": ".Customer"
        },
        {
            "name": "Batch",
            "path": ".Batch"
        },
        {
            "name": "CurrentTemperature",
            "path": ".CurrentTemperature"
        },
        {
            "name": "LastKnownTemperature",
            "path": ".LastKnownTemperature"
        },
        {
            "name": "Pressure",
            "path": ".Pressure"
        },
        {
            "name": "IsSpare",
            "path": ".IsSpare"
        }
    ],
    "retry": {
        "type": "fixed",
        "interval": "20s",
        "maxRetries": 4
    }
}
```
The configuration defines that:

- Messages are batched for 5 seconds.
- The batch path `.payload` is used to locate the data for the columns.
Example
The following example shows a sample input message to the Azure Data Explorer destination stage:
```
{
    "payload": {
        "Batch": 102,
        "CurrentTemperature": 7109,
        "Customer": "Contoso",
        "Equipment": "Boiler",
        "IsSpare": true,
        "LastKnownTemperature": 7109,
        "Location": "Seattle",
        "Pressure": 7109,
        "Timestamp": "2023-08-10T00:54:58.6572007Z",
        "assetName": "oven"
    },
    "qos": 0,
    "systemProperties": {
        "partitionId": 0,
        "partitionKey": "quickstart",
        "timestamp": "2023-11-06T23:42:51.004Z"
    },
    "topic": "quickstart"
}
```
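Given the sample configuration shown earlier, the stage reads each configured column path from the `.payload` of this message. The row written to the `quickstart` table would contain values like the following (a sketch produced by applying the column paths by hand; fields such as `Equipment` and `Location` are dropped because no column is configured for them):

```
{
    "Timestamp": "2023-08-10T00:54:58.6572007Z",
    "AssetName": "oven",
    "Customer": "Contoso",
    "Batch": 102,
    "CurrentTemperature": 7109,
    "LastKnownTemperature": 7109,
    "Pressure": 7109,
    "IsSpare": true
}
```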