Azure Data Factory (ADF) is a cloud-based data integration service that lets you create data-driven workflows for orchestrating and automating data movement and data transformation. The Azure Data Explorer Command activity in Azure Data Factory enables you to run Azure Data Explorer management commands within an ADF workflow. This article shows you how to create a pipeline with a Lookup activity and a ForEach activity that contains an Azure Data Explorer Command activity.
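Before walking through the designer steps, it can help to see the shape of what you're building. The following Python sketch builds a simplified version of the pipeline definition JSON that results from this article: a Lookup activity feeding a ForEach activity that wraps a command activity. Activity names and properties are illustrative, not the exact JSON ADF generates.

```python
import json

# Simplified sketch of the pipeline this article builds; property names are
# illustrative, not the exact ADF-generated definition.
pipeline = {
    "name": "pipeline-4-docs",
    "properties": {
        "activities": [
            {
                "name": "Lookup1",
                "type": "Lookup",
                "typeProperties": {"source": {"type": "AzureDataExplorerSource"}},
            },
            {
                "name": "ForEach1",
                "type": "ForEach",
                # The line drawn in the canvas becomes this dependency.
                "dependsOn": [{"activity": "Lookup1", "dependencyConditions": ["Succeeded"]}],
                "typeProperties": {
                    # Items points at the Lookup output's value array.
                    "items": {"value": "@activity('Lookup1').output.value", "type": "Expression"},
                    "activities": [
                        {"name": "AzureDataExplorerCommand1", "type": "AzureDataExplorerCommand"}
                    ],
                },
            },
        ]
    },
}

print(json.dumps(pipeline, indent=2))
```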
Select the Author pencil tool.
Create a new pipeline by selecting + and then selecting Pipeline from the drop-down menu.
A Lookup activity can retrieve a dataset from any Azure Data Factory-supported data source. The output of the Lookup activity can be used in a ForEach activity or in other activities.
In the Activities pane, under General, select the Lookup activity. Drag and drop it into the main canvas on the right.
The canvas now contains the Lookup activity you created. Use the tabs below the canvas to change any relevant parameters. In General, rename the activity.
Tip
Select the empty canvas area to view the pipeline properties. Use the General tab to rename the pipeline. In this article, the pipeline is named pipeline-4-docs.
In Settings, select your pre-created Azure Data Explorer Source dataset, or select + New to create a new dataset.
Select the Azure Data Explorer (Kusto) dataset from the New Dataset window, and then select Continue to add the new dataset.
The new Azure Data Explorer dataset parameters are visible in Settings. To update the parameters, select Edit.
The AzureDataExplorerTable new tab opens in the main canvas.
When creating a new linked service, the New Linked Service (Azure Data Explorer) page opens:
Once you've set up a linked service, in AzureDataExplorerTable > Connection, add the Table name. Select Preview data to make sure that the data is presented properly.
Your dataset is now ready, and you can continue editing your pipeline.
In pipeline-4-docs > Settings add a query in Query text box, for example:
ClusterQueries
| where Database !in ("KustoMonitoringPersistentDatabase", "$systemdb")
| summarize count() by Database
Change the Query timeout, No truncation, and First row only properties as needed. In this flow, we keep the default Query timeout and clear both checkboxes.
The For-Each activity is used to iterate over a collection and execute specified activities in a loop.
Now you add a For-Each activity to the pipeline. This activity will process the data returned from the Lookup activity.
In the Activities pane, under Iteration & Conditionals, select the ForEach activity and drag and drop it into the canvas.
Draw a line between the output of the Lookup activity and the input of the ForEach activity in the canvas to connect them.
Select the ForEach activity in the canvas. In the Settings tab below:
Select the Sequential checkbox to process the Lookup results one at a time, or leave it cleared to process them in parallel.
Set Batch count.
In Items, provide the following reference to the output value: @activity('Lookup1').output.value
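The Sequential and Batch count settings control how the ForEach activity consumes the array that the Items expression resolves to. A minimal Python sketch of the two modes, where process_item is a hypothetical stand-in for the inner command activity:

```python
from concurrent.futures import ThreadPoolExecutor

def process_item(item):
    # Hypothetical stand-in for the inner Azure Data Explorer Command activity.
    return item["Database"]

# What @activity('Lookup1').output.value resolves to: the Lookup result rows.
lookup_output = {"value": [{"Database": "Db1"}, {"Database": "Db2"}, {"Database": "Db3"}]}
items = lookup_output["value"]

# Sequential selected: one item at a time, in order.
sequential_results = [process_item(item) for item in items]

# Sequential cleared: up to Batch count items run concurrently.
batch_count = 2
with ThreadPoolExecutor(max_workers=batch_count) as pool:
    parallel_results = list(pool.map(process_item, items))
```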
Double-click the ForEach activity in the canvas to open it in a new canvas to specify the activities within ForEach.
In the Activities pane, under Azure Data Explorer, select the Azure Data Explorer Command activity and drag and drop it into the canvas.
In the Connection tab, select the same Linked Service previously created.
In the Command tab, provide the following command:
.export
async compressed
into csv h"https://<storageName>.blob.core.windows.net/data/ClusterQueries;<storageKey>" with (
sizeLimit=100000,
namePrefix=export
)
<| ClusterQueries | where Database == "@{item().Database}"
The command instructs Azure Data Explorer to export the results of a given query to blob storage in a compressed format. It runs asynchronously (using the async modifier). The query addresses the Database column of each row of the Lookup activity result. The Command timeout can be left unchanged.
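For each item, ADF expands the @{item().Database} expression in the command text before sending it to Azure Data Explorer. The substitution can be sketched in Python as follows; the expand helper is hypothetical and handles only the single property this article uses:

```python
# The command template as entered in the Command tab (placeholders kept as-is).
command_template = (
    '.export async compressed into csv '
    'h"https://<storageName>.blob.core.windows.net/data/ClusterQueries;<storageKey>" '
    '<| ClusterQueries | where Database == "@{item().Database}"'
)

def expand(template, item):
    # Hypothetical helper mimicking ADF's @{item().Database} interpolation.
    return template.replace("@{item().Database}", item["Database"])

item = {"Database": "MyDatabase"}  # one row of the Lookup output
command = expand(command_template, item)
```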
Note
The command activity has the following limits:
Now the pipeline is ready. Select the pipeline name to go back to the main pipeline view.
Select Debug before publishing the pipeline. The pipeline progress can be monitored in the Output tab.
You can Publish All and then Add trigger to run the pipeline.
The structure of the command activity output is detailed below. This output can be used by the next activity in the pipeline.
In a non-async management command, the structure of the returned value is similar to the structure of the Lookup activity result. The count field indicates the number of returned records. A fixed array field, value, contains the list of records.
{
"count": "2",
"value": [
{
"ExtentId": "1b9977fe-e6cf-4cda-84f3-4a7c61f28ecd",
"ExtentSize": 1214.0,
"CompressedSize": 520.0
},
{
"ExtentId": "b897f5a3-62b0-441d-95ca-bf7a88952974",
"ExtentSize": 1114.0,
"CompressedSize": 504.0
}
]
}
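A downstream activity, or any client reading the activity output, can consume this structure directly. In Python terms, using the sample output above:

```python
import json

# The non-async command output sample from above, as passed to the next activity.
activity_output = json.loads("""
{
  "count": "2",
  "value": [
    {"ExtentId": "1b9977fe-e6cf-4cda-84f3-4a7c61f28ecd", "ExtentSize": 1214.0, "CompressedSize": 520.0},
    {"ExtentId": "b897f5a3-62b0-441d-95ca-bf7a88952974", "ExtentSize": 1114.0, "CompressedSize": 504.0}
  ]
}
""")

records = activity_output["value"]
# count is returned as a string and matches the length of the value array.
assert int(activity_output["count"]) == len(records)
compressed_total = sum(r["CompressedSize"] for r in records)
```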
In an async management command, the activity polls the operations table behind the scenes until the async operation completes or times out. Therefore, the returned value contains the result of .show operations OperationId for that given OperationId property. Check the values of the State and Status properties to verify that the operation completed successfully.
{
"count": "1",
"value": [
{
"OperationId": "910deeae-dd79-44a4-a3a2-087a90d4bb42",
"Operation": "TableSetOrAppend",
"NodeId": "",
"StartedOn": "2019-06-23T10:12:44.0371419Z",
"LastUpdatedOn": "2019-06-23T10:12:46.7871468Z",
"Duration": "00:00:02.7500049",
"State": "Completed",
"Status": "",
"RootActivityId": "f7c5aaaf-197b-4593-8ba0-e864c94c3c6f",
"ShouldRetry": false,
"Database": "MyDatabase",
"Principal": "<some principal id>",
"User": "<some User id>"
}
]
}
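Because the activity returns the .show operations record, a downstream consumer can decide success or failure from the State field, with Status typically carrying error detail. A hedged sketch; treating every terminal state other than Completed as a failure is this example's assumption, not documented behavior:

```python
def check_async_result(activity_output):
    """Return (done, ok, detail) for an async command's activity output."""
    # The async output contains a single .show operations record.
    op = activity_output["value"][0]
    state = op.get("State", "")
    if state == "InProgress":
        return (False, False, "still running")
    # Assumption for this sketch: any terminal state other than Completed
    # counts as failure; Status usually carries the error detail then.
    ok = state == "Completed"
    return (True, ok, op.get("Status", ""))

# Minimal sample shaped like the output above (fields trimmed for brevity).
sample = {"count": "1", "value": [{"State": "Completed", "Status": ""}]}
done, ok, detail = check_async_result(sample)
```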