Azure Data Factory (ADF) is a cloud-based data integration service that lets you create data-driven workflows for orchestrating and automating data movement and data transformation. The Azure Data Explorer Command activity in Azure Data Factory enables you to run Azure Data Explorer management commands within an ADF workflow. This article shows you how to create a pipeline with a Lookup activity and a ForEach activity that contains an Azure Data Explorer Command activity.
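Before walking through the designer steps, it can help to see the shape of what you're building. The following Python sketch builds a simplified version of the pipeline definition JSON that results from this article: a Lookup activity feeding a ForEach activity that wraps a command activity. Activity names and properties are illustrative, not the exact JSON ADF generates.

```python
import json

# Simplified sketch of the pipeline this article builds; property names are
# illustrative, not the exact ADF-generated definition.
pipeline = {
    "name": "pipeline-4-docs",
    "properties": {
        "activities": [
            {
                "name": "Lookup1",
                "type": "Lookup",
                "typeProperties": {"source": {"type": "AzureDataExplorerSource"}},
            },
            {
                "name": "ForEach1",
                "type": "ForEach",
                # The line drawn in the canvas becomes this dependency.
                "dependsOn": [{"activity": "Lookup1", "dependencyConditions": ["Succeeded"]}],
                "typeProperties": {
                    # Items points at the Lookup output's value array.
                    "items": {"value": "@activity('Lookup1').output.value", "type": "Expression"},
                    "activities": [
                        {"name": "AzureDataExplorerCommand1", "type": "AzureDataExplorerCommand"}
                    ],
                },
            },
        ]
    },
}

print(json.dumps(pipeline, indent=2))
```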
Select the Author pencil tool.
Create a new pipeline by selecting + and then selecting Pipeline from the drop-down menu.
A Lookup activity can retrieve a dataset from any Azure Data Factory-supported data source. The output of the Lookup activity can be used in a ForEach activity or in other activities.
In the Activities pane, under General, select the Lookup activity. Drag and drop it into the main canvas on the right.
The canvas now contains the Lookup activity you created. Use the tabs below the canvas to change any relevant parameters. In General, rename the activity.
Tip
Select the empty canvas area to view the pipeline properties. Use the General tab to rename the pipeline. In this article, the pipeline is named pipeline-4-docs.
In Settings, select your pre-created Azure Data Explorer Source dataset, or select + New to create a new dataset.
Select the Azure Data Explorer (Kusto) dataset from the New Dataset window, and then select Continue to add the new dataset.
The new Azure Data Explorer dataset parameters are visible in Settings. To update the parameters, select Edit.
The AzureDataExplorerTable new tab opens in the main canvas.
When creating a new linked service, the New Linked Service (Azure Data Explorer) page opens:
Once you've set up a linked service, in AzureDataExplorerTable > Connection, add the Table name. Select Preview data to make sure that the data is presented properly.
Your dataset is now ready, and you can continue editing your pipeline.
In pipeline-4-docs > Settings add a query in Query text box, for example:
ClusterQueries
| where Database !in ("KustoMonitoringPersistentDatabase", "$systemdb")
| summarize count() by Database
Change the Query timeout, No truncation, and First row only properties as needed. In this flow, we keep the default Query timeout and clear both checkboxes.
The For-Each activity is used to iterate over a collection and execute specified activities in a loop.
Now you add a For-Each activity to the pipeline. This activity will process the data returned from the Lookup activity.
In the Activities pane, under Iteration & Conditionals, select the ForEach activity and drag and drop it into the canvas.
Draw a line between the output of the Lookup activity and the input of the ForEach activity in the canvas to connect them.
Select the ForEach activity in the canvas. In the Settings tab below:
Select the Sequential checkbox to process the Lookup results one at a time, or leave it cleared to process them in parallel.
Set Batch count.
In Items, provide the following reference to the output value: @activity('Lookup1').output.value
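The Sequential and Batch count settings control how the ForEach activity consumes the array that the Items expression resolves to. A minimal Python sketch of the two modes, where process_item is a hypothetical stand-in for the inner command activity:

```python
from concurrent.futures import ThreadPoolExecutor

def process_item(item):
    # Hypothetical stand-in for the inner Azure Data Explorer Command activity.
    return item["Database"]

# What @activity('Lookup1').output.value resolves to: the Lookup result rows.
lookup_output = {"value": [{"Database": "Db1"}, {"Database": "Db2"}, {"Database": "Db3"}]}
items = lookup_output["value"]

# Sequential selected: one item at a time, in order.
sequential_results = [process_item(item) for item in items]

# Sequential cleared: up to Batch count items run concurrently.
batch_count = 2
with ThreadPoolExecutor(max_workers=batch_count) as pool:
    parallel_results = list(pool.map(process_item, items))
```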
Double-click the ForEach activity in the canvas to open it in a new canvas to specify the activities within ForEach.
In the Activities pane, under Azure Data Explorer, select the Azure Data Explorer Command activity and drag and drop it into the canvas.
In the Connection tab, select the same Linked Service previously created.
In the Command tab, provide the following command:
.export
async compressed
into csv h"https://<storageName>.blob.core.windows.net/data/ClusterQueries;<storageKey>" with (
sizeLimit=100000,
namePrefix=export
)
<| ClusterQueries | where Database == "@{item().Database}"
The command instructs Azure Data Explorer to export the results of a given query to blob storage in a compressed format. It runs asynchronously (using the async modifier). The query addresses the Database column of each row of the Lookup activity result. The Command timeout can be left unchanged.
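For each item, ADF expands the @{item().Database} expression in the command text before sending it to Azure Data Explorer. The substitution can be sketched in Python as follows; the expand helper is hypothetical and handles only the single property this article uses:

```python
# The command template as entered in the Command tab (placeholders kept as-is).
command_template = (
    '.export async compressed into csv '
    'h"https://<storageName>.blob.core.windows.net/data/ClusterQueries;<storageKey>" '
    '<| ClusterQueries | where Database == "@{item().Database}"'
)

def expand(template, item):
    # Hypothetical helper mimicking ADF's @{item().Database} interpolation.
    return template.replace("@{item().Database}", item["Database"])

item = {"Database": "MyDatabase"}  # one row of the Lookup output
command = expand(command_template, item)
```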
Note
The command activity has the following limits:
Now the pipeline is ready. Select the pipeline name to go back to the main pipeline view.
Select Debug before publishing the pipeline. The pipeline progress can be monitored in the Output tab.
You can Publish All and then Add trigger to run the pipeline.
The structure of the command activity output is detailed below. This output can be used by the next activity in the pipeline.
In a non-async management command, the structure of the returned value is similar to the structure of the Lookup activity result. The count field indicates the number of returned records. A fixed array field, value, contains the list of records.
{
"count": "2",
"value": [
{
"ExtentId": "1b9977fe-e6cf-4cda-84f3-4a7c61f28ecd",
"ExtentSize": 1214.0,
"CompressedSize": 520.0
},
{
"ExtentId": "b897f5a3-62b0-441d-95ca-bf7a88952974",
"ExtentSize": 1114.0,
"CompressedSize": 504.0
}
]
}
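A downstream activity, or any client reading the activity output, can consume this structure directly. In Python terms, using the sample output above:

```python
import json

# The non-async command output sample from above, as passed to the next activity.
activity_output = json.loads("""
{
  "count": "2",
  "value": [
    {"ExtentId": "1b9977fe-e6cf-4cda-84f3-4a7c61f28ecd", "ExtentSize": 1214.0, "CompressedSize": 520.0},
    {"ExtentId": "b897f5a3-62b0-441d-95ca-bf7a88952974", "ExtentSize": 1114.0, "CompressedSize": 504.0}
  ]
}
""")

records = activity_output["value"]
# count is returned as a string and matches the length of the value array.
assert int(activity_output["count"]) == len(records)
compressed_total = sum(r["CompressedSize"] for r in records)
```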
In an async management command, the activity polls the operations table behind the scenes until the async operation completes or times out. Therefore, the returned value contains the result of .show operations OperationId for that given OperationId property. Check the values of the State and Status properties to verify that the operation completed successfully.
{
"count": "1",
"value": [
{
"OperationId": "910deeae-dd79-44a4-a3a2-087a90d4bb42",
"Operation": "TableSetOrAppend",
"NodeId": "",
"StartedOn": "2019-06-23T10:12:44.0371419Z",
"LastUpdatedOn": "2019-06-23T10:12:46.7871468Z",
"Duration": "00:00:02.7500049",
"State": "Completed",
"Status": "",
"RootActivityId": "f7c5aaaf-197b-4593-8ba0-e864c94c3c6f",
"ShouldRetry": false,
"Database": "MyDatabase",
"Principal": "<some principal id>",
"User": "<some User id>"
}
]
}
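Because the activity returns the .show operations record, a downstream consumer can decide success or failure from the State field, with Status typically carrying error detail. A hedged sketch; treating every terminal state other than Completed as a failure is this example's assumption, not documented behavior:

```python
def check_async_result(activity_output):
    """Return (done, ok, detail) for an async command's activity output."""
    # The async output contains a single .show operations record.
    op = activity_output["value"][0]
    state = op.get("State", "")
    if state == "InProgress":
        return (False, False, "still running")
    # Assumption for this sketch: any terminal state other than Completed
    # counts as failure; Status usually carries the error detail then.
    ok = state == "Completed"
    return (True, ok, op.get("Status", ""))

# Minimal sample shaped like the output above (fields trimmed for brevity).
sample = {"count": "1", "value": [{"State": "Completed", "Status": ""}]}
done, ok, detail = check_async_result(sample)
```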