Copy data to or from Azure Data Explorer using Azure Data Factory or Synapse Analytics
Article
APPLIES TO:
Azure Data Factory
Azure Synapse Analytics
Tip
Try out Data Factory in Microsoft Fabric, an all-in-one analytics solution for enterprises. Microsoft Fabric covers everything from data movement to data science, real-time analytics, business intelligence, and reporting. Learn how to start a new trial for free!
This article describes how to use the copy activity in Azure Data Factory and Synapse Analytics pipelines to copy data to or from Azure Data Explorer. It builds on the copy activity overview article, which offers a general overview of copy activity.
① Azure integration runtime ② Self-hosted integration runtime
You can copy data from any supported source data store to Azure Data Explorer. You can also copy data from Azure Data Explorer to any supported sink data store. For a list of data stores that the copy activity supports as sources or sinks, see the Supported data stores table.
Note
Copying data to or from Azure Data Explorer through an on-premises data store by using self-hosted integration runtime is supported in version 3.14 and later.
With the Azure Data Explorer connector, you can do the following:
Copy data by using Microsoft Entra application token authentication with a service principal.
As a source, retrieve data by using a KQL (Kusto) query.
Grant the service principal the correct permissions in Azure Data Explorer. See Manage Azure Data Explorer database permissions for detailed information about roles and permissions and about managing permissions. In general, you must:
As source, grant at least the Database viewer role to your database
As sink, grant at least the Database user role to your database
Note
When you use the UI to author, by default your login user account is used to list Azure Data Explorer clusters, databases, and tables. You can choose to list the objects using the service principal by clicking the dropdown next to the refresh button, or manually enter the name if you don't have permission for these operations.
The following properties are supported for the Azure Data Explorer linked service:
Property
Description
Required
type
The type property must be set to AzureDataExplorer.
Yes
endpoint
Endpoint URL of the Azure Data Explorer cluster, with the format as https://<clusterName>.<regionName>.kusto.windows.net.
Yes
database
Name of database.
Yes
tenant
Specify the tenant information (domain name or tenant ID) under which your application resides. This is known as "Authority ID" in Kusto connection string. Retrieve it by hovering the mouse pointer in the upper-right corner of the Azure portal.
Yes
servicePrincipalId
Specify the application's client ID. This is known as "Microsoft Entra application client ID" in Kusto connection string.
The integration runtime to be used to connect to the data store. You can use the Azure integration runtime or a self-hosted integration runtime if your data store is in a private network. If not specified, the default Azure integration runtime is used.
No
Example: using service principal key authentication
Grant the managed identity the correct permissions in Azure Data Explorer. See Manage Azure Data Explorer database permissions for detailed information about roles and permissions and about managing permissions. In general, you must:
As source, grant the Database viewer role to your database.
As sink, grant the Database ingestor and Database viewer roles to your database.
Note
When you use the UI to author, your login user account is used to list Azure Data Explorer clusters, databases, and tables. Manually enter the name if you don't have permission for these operations.
The following properties are supported for the Azure Data Explorer linked service:
Property
Description
Required
type
The type property must be set to AzureDataExplorer.
Yes
endpoint
Endpoint URL of the Azure Data Explorer cluster, with the format as https://<clusterName>.<regionName>.kusto.windows.net.
Yes
database
Name of database.
Yes
connectVia
The integration runtime to be used to connect to the data store. You can use the Azure integration runtime or a self-hosted integration runtime if your data store is in a private network. If not specified, the default Azure integration runtime is used.
No
Example: using system-assigned managed identity authentication
As source, grant at least the Database viewer role to your database
As sink, grant at least the Database ingestor role to your database
Assign one or multiple user-assigned managed identities to your data factory or Synapse workspace, and create credentials for each user-assigned managed identity.
The following properties are supported for the Azure Data Explorer linked service:
Property
Description
Required
type
The type property must be set to AzureDataExplorer.
Yes
endpoint
Endpoint URL of the Azure Data Explorer cluster, with the format as https://<clusterName>.<regionName>.kusto.windows.net.
Yes
database
Name of database.
Yes
credentials
Specify the user-assigned managed identity as the credential object.
Yes
connectVia
The integration runtime to be used to connect to the data store. You can use the Azure integration runtime or a self-hosted integration runtime if your data store is in a private network. If not specified, the default Azure integration runtime is used.
No
Example: using user-assigned managed identity authentication
For a full list of sections and properties available for defining datasets, see Datasets. This section lists properties that the Azure Data Explorer dataset supports.
To copy data to Azure Data Explorer, set the type property of the dataset to AzureDataExplorerTable.
The following properties are supported:
Property
Description
Required
type
The type property must be set to AzureDataExplorerTable.
Yes
table
The name of the table that the linked service refers to.
Yes for sink; No for source
Dataset properties example:
{
"name": "AzureDataExplorerDataset",
"properties": {
"type": "AzureDataExplorerTable",
"typeProperties": {
"table": "<table name>"
},
"schema": [],
"linkedServiceName": {
"referenceName": "<Azure Data Explorer linked service name>",
"type": "LinkedServiceReference"
}
}
}
Copy activity properties
For a full list of sections and properties available for defining activities, see Pipelines and activities. This section provides a list of properties that Azure Data Explorer sources and sinks support.
Azure Data Explorer as source
To copy data from Azure Data Explorer, set the type property in the Copy activity source to AzureDataExplorerSource. The following properties are supported in the copy activity source section:
Property
Description
Required
type
The type property of the copy activity source must be set to: AzureDataExplorerSource
Yes
query
A read-only request given in a KQL format. Use the custom KQL query as a reference.
Yes
queryTimeout
The wait time before the query request times out. Default value is 10 min (00:10:00); allowed max value is 1 hour (01:00:00).
No
noTruncation
Indicates whether to truncate the returned result set. By default, result is truncated after 500,000 records or 64 megabytes (MB). Truncation is strongly recommended to ensure the correct behavior of the activity.
No
Note
By default, Azure Data Explorer source has a size limit of 500,000 records or 64 MB. To retrieve all the records without truncation, you can specify set notruncation; at the beginning of your query. For more information, see Query limits.
To copy data to Azure Data Explorer, set the type property in the copy activity sink to AzureDataExplorerSink. The following properties are supported in the copy activity sink section:
Property
Description
Required
type
The type property of the copy activity sink must be set to: AzureDataExplorerSink.
Yes
ingestionMappingName
Name of a pre-created mapping on a Kusto table. To map the columns from source to Azure Data Explorer (which applies to all supported source stores and formats, including CSV/JSON/Avro formats), you can use the copy activity column mapping (implicitly by name or explicitly as configured) and/or Azure Data Explorer mappings.
No
additionalProperties
A property bag which can be used for specifying any of the ingestion properties which aren't being set already by the Azure Data Explorer Sink. Specifically, it can be useful for specifying ingestion tags. Learn more from Azure Data Explore data ingestion doc.
No
Example:
"activities":[
{
"name": "CopyToAzureDataExplorer",
"type": "Copy",
"typeProperties": {
"source": {
"type": "<source type>"
},
"sink": {
"type": "AzureDataExplorerSink",
"ingestionMappingName": "<optional Azure Data Explorer mapping name>",
"additionalProperties": {<additional settings for data ingestion>}
}
},
"inputs": [
{
"referenceName": "<input dataset name>",
"type": "DatasetReference"
}
],
"outputs": [
{
"referenceName": "<Azure Data Explorer output dataset name>",
"type": "DatasetReference"
}
]
}
]
Mapping data flow properties
When transforming data in mapping data flow, you can read from and write to tables in Azure Data Explorer. For more information, see the source transformation and sink transformation in mapping data flows. You can choose to use an Azure Data Explorer dataset or an inline dataset as source and sink type.
Source transformation
The below table lists the properties supported by Azure Data Explorer source. You can edit these properties in the Source options tab.
Name
Description
Required
Allowed values
Data flow script property
Table
If you select Table as input, data flow will fetch all the data from the table specified in the Azure Data Explorer dataset or in the source options when using inline dataset.
No
String
(for inline dataset only) tableName
Query
A read-only request given in a KQL format. Use the custom KQL query as a reference.
No
String
query
Timeout
The wait time before the query request times out. Default is '172000' (2 days)
No
Integer
timeout
Azure Data Explorer source script examples
When you use Azure Data Explorer dataset as source type, the associated data flow script is:
The below table lists the properties supported by Azure Data Explorer sink. You can edit these properties in the Settings tab. When using inline dataset, you will see additional settings, which are the same as the properties described in dataset properties section.
Name
Description
Required
Allowed values
Data flow script property
Table action
Determines whether to recreate or remove all rows from the destination table prior to writing. - None: No action will be done to the table. - Recreate: The table will get dropped and recreated. Required if creating a new table dynamically. - Truncate: All rows from the target table will get removed.
No
true or false
recreate truncate
Pre and Post SQL scripts
Specify multiple Kusto control commands scripts that will execute before (pre-processing) and after (post-processing) data is written to your sink database.
No
String
preSQLs; postSQLs
Timeout
The wait time before the query request times out. Default is '172000' (2 days)
No
Integer
timeout
Azure Data Explorer sink script examples
When you use Azure Data Explorer dataset as sink type, the associated data flow script is:
Demonstrate understanding of common data engineering tasks to implement and manage data engineering workloads on Microsoft Azure, using a number of Azure services.