Event Grid ingestion is a pipeline that listens to Azure Storage and prompts Azure Data Explorer to pull data when subscribed events occur. Azure Data Explorer offers continuous ingestion from Azure Storage (Blob storage and ADLSv2) by using an Azure Event Grid subscription for blob created or blob renamed notifications. These notifications are streamed to Azure Data Explorer through Azure Event Hubs.
The Event Grid ingestion pipeline goes through several steps. You create a target table in Azure Data Explorer into which data in a particular format will be ingested. Then you create an Event Grid data connection in Azure Data Explorer. The Event Grid data connection needs event routing information, such as which table to send the data to and the table mapping. You also specify ingestion properties, which describe the data to be ingested, the target table, and the mapping. To test your connection, you can generate sample data and then upload or rename blobs. Delete the blobs after ingestion, because Azure Data Explorer doesn't delete them for you.
You can manage Event Grid ingestion through the Azure portal, with the ingestion wizard, programmatically with C# or Python, or with an Azure Resource Manager template.
For general information about data ingestion in Azure Data Explorer, see Azure Data Explorer data ingestion overview.
Managed Identity based data connection (recommended): Using a managed identity-based data connection is the most secure way to connect to data sources. It provides full control over the ability to fetch data from a data source. Setup of an Event Grid data connection using managed identity requires the following steps:
Caution
Key-based data connection: If managed identity authentication isn't specified for the data connection, the connection defaults to key-based authentication. Key-based connections fetch data by using a resource connection string, such as the Azure Event Hubs connection string. Azure Data Explorer gets the connection string for the specified resource, saves it securely, and then uses it to fetch data from the data source.
Caution
If the key is rotated, the data connection will no longer work and will be unable to fetch data from the data source. To fix the issue, update or recreate the data connection.
The original uncompressed data size should be part of the blob metadata, or else Azure Data Explorer will estimate it. The ingestion uncompressed size limit per file is 6 GB.
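As a minimal sketch of how the uncompressed size could be supplied rather than estimated, the following Python helper streams a gzip payload, counts the decompressed bytes, and returns them as a `rawSizeBytes` metadata entry (the property described in the ingestion properties table in this article). The function names are hypothetical, and treating the 6 GB limit as 6 × 1024³ bytes is an assumption, because the article doesn't say whether the limit is binary or decimal.

```python
import gzip
import io
from typing import Dict

# Assumption: the 6 GB per-file limit is interpreted as binary gigabytes.
MAX_UNCOMPRESSED_BYTES = 6 * 1024**3


def uncompressed_size(gz_bytes: bytes, chunk_size: int = 1 << 20) -> int:
    """Count decompressed bytes by streaming, so the full payload is never held in memory."""
    total = 0
    with gzip.GzipFile(fileobj=io.BytesIO(gz_bytes)) as stream:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total


def raw_size_metadata(gz_bytes: bytes) -> Dict[str, str]:
    """Return a rawSizeBytes metadata entry, or raise if the data exceeds the assumed limit."""
    size = uncompressed_size(gz_bytes)
    if size > MAX_UNCOMPRESSED_BYTES:
        raise ValueError(f"uncompressed size {size} exceeds the per-file ingestion limit")
    # Blob metadata values are strings, so the byte count is stringified.
    return {"rawSizeBytes": str(size)}
```

The returned dictionary is meant to be merged into the blob's metadata before the blob is created.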
Note
Event Grid notification subscription can be set on Azure Storage accounts for BlobStorage, StorageV2, or Data Lake Storage Gen2.
You can specify ingestion properties of the blob ingestion via the blob metadata. You can set the following properties:
Property | Description |
---|---|
rawSizeBytes | Size of the raw (uncompressed) data. For Avro/ORC/Parquet, that is the size before format-specific compression is applied. Provide the original data size by setting this property to the uncompressed data size in bytes. |
kustoDatabase | The case-sensitive name of the target database. By default, data is ingested into the target database associated with the data connection. Use this property to override the default database and send data to a different database. To do so, you must first set up the connection as a multi-database connection. |
kustoTable | The case-sensitive name of the existing target table. Overrides the Table set on the Data Connection pane. |
kustoDataFormat | Data format. Overrides the Data format set on the Data Connection pane. |
kustoIngestionMappingReference | Name of the existing ingestion mapping to be used. Overrides the Column mapping set on the Data Connection pane. |
kustoIgnoreFirstRecord | If set to true, Kusto ignores the first row of the blob. Use in tabular format data (CSV, TSV, or similar) to ignore headers. |
kustoExtentTags | String representing tags that will be attached to the resulting extent. |
kustoCreationTime | Overrides the extent creation time for the blob, formatted as an ISO 8601 string. Use for backfilling. |
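Because blob metadata values are plain strings, it can help to build them from typed inputs. The following Python sketch maps optional arguments onto the property names in the table above, lowercasing the boolean and formatting the creation time as ISO 8601. The function and parameter names are hypothetical. Only the metadata keys come from this article.

```python
from datetime import datetime, timezone
from typing import Dict, Optional


def build_ingestion_metadata(
    *,
    raw_size_bytes: Optional[int] = None,
    database: Optional[str] = None,
    table: Optional[str] = None,
    data_format: Optional[str] = None,
    mapping_reference: Optional[str] = None,
    ignore_first_record: Optional[bool] = None,
    extent_tags: Optional[str] = None,
    creation_time: Optional[datetime] = None,
) -> Dict[str, str]:
    """Build a blob metadata dictionary that contains only the properties that were set."""
    candidates = {
        "rawSizeBytes": None if raw_size_bytes is None else str(raw_size_bytes),
        "kustoDatabase": database,
        "kustoTable": table,
        "kustoDataFormat": data_format,
        "kustoIngestionMappingReference": mapping_reference,
        # Booleans become the lowercase strings "true" or "false".
        "kustoIgnoreFirstRecord": (
            None if ignore_first_record is None else str(ignore_first_record).lower()
        ),
        "kustoExtentTags": extent_tags,
        # datetime.isoformat() produces an ISO 8601 string.
        "kustoCreationTime": None if creation_time is None else creation_time.isoformat(),
    }
    return {key: value for key, value in candidates.items() if value is not None}


metadata = build_ingestion_metadata(
    table="Events",
    data_format="csv",
    ignore_first_record=True,
    creation_time=datetime(2024, 1, 15, tzinfo=timezone.utc),
)
```

With the `azure-storage-blob` package, a dictionary like this could be passed as the `metadata` argument of `BlobClient.upload_blob`, so that the properties are attached when the blob is created.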
When you create a data connection to your cluster, you specify the routing for where to send ingested data. The default routing is to the target table specified in the connection string that is associated with the target database. The default routing for your data is also referred to as static routing. You can specify an alternative routing for your data by using the event data properties.
Routing data to an alternate database is off by default. To send the data to a different database, you must first set the connection as a multi-database connection. You can do this in the Azure portal, C#, Python, or an ARM template. The user, group, service principal, or managed identity used to allow database routing must at least have the contributor role and write permissions on the cluster. For more information, see Create an Event Grid data connection for Azure Data Explorer.
To specify an alternate database, set the kustoDatabase ingestion property.
Warning
Specifying an alternate database without setting the connection as a multi-database data connection will cause the ingestion to fail.
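The routing rule described above can be modeled in a few lines. This Python sketch is only an illustration of the documented behavior, not the service's implementation. The function name, its parameters, and the choice of `ValueError` to represent a failed ingestion are all hypothetical.

```python
from typing import Mapping


def resolve_target_database(
    connection_database: str,
    blob_metadata: Mapping[str, str],
    multi_database: bool = False,
) -> str:
    """Return the database a blob would be routed to, following the rules in this article."""
    requested = blob_metadata.get("kustoDatabase")
    # Static routing: no override, or the override names the connection's own database.
    # Database names are case-sensitive, so the comparison is exact.
    if requested is None or requested == connection_database:
        return connection_database
    # An alternate database is only allowed on a multi-database data connection.
    if not multi_database:
        raise ValueError(
            f"ingestion fails: '{requested}' requested, but the data connection "
            "isn't set as a multi-database connection"
        )
    return requested
```

For example, `resolve_target_database("Main", {"kustoDatabase": "AnotherDB"}, multi_database=True)` yields `"AnotherDB"`, while the same call without `multi_database=True` raises.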
When setting up a blob storage connection to an Azure Data Explorer cluster, specify the target table properties: table name, data format, and mapping.
You can also specify target table properties for each blob, using blob metadata. The data will dynamically route, as specified by ingestion properties.
The following example shows how to set ingestion properties in the blob metadata as part of uploading the blob. The blob is routed to the table named in its metadata.
You can also specify the target database. An Event Grid data connection is created within the context of a specific database, so that database is the data connection's default routing target. To send the data to a different database, set the kustoDatabase ingestion property and set the data connection as a multi-database data connection. Routing data to another database is disabled by default. If you set a database ingestion property that differs from the data connection's database without enabling multi-database routing, the ingestion fails.
For more information, see upload blobs.
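using System;
using System.Collections.Generic;
using System.IO;
using Azure.Storage.Blobs;
using Azure.Storage.Blobs.Models;

var container = new BlobContainerClient("<storageAccountConnectionString>", "<containerName>");
await container.CreateIfNotExistsAsync();
var blob = container.GetBlobClient("<blobName>");
// Blob is dynamically routed to table `Events` in database `AnotherDB`, ingested using `EventsMapping` data mapping
var metadata = new Dictionary<string, string>
{
    { "rawSizeBytes", "4096" }, // the uncompressed size is 4096 bytes
    { "kustoTable", "Events" },
    { "kustoDataFormat", "json" },
    { "kustoIngestionMappingReference", "EventsMapping" },
    { "kustoDatabase", "AnotherDB" }
};
// Pass the metadata in the upload call: metadata can't be set on a blob that
// doesn't exist yet, and it must be in place when the blob created event fires.
await blob.UploadAsync(
    BinaryData.FromString(File.ReadAllText("<filePath>")),
    new BlobUploadOptions { Metadata = metadata });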
You can create a blob from a local file, set ingestion properties to the blob metadata, and upload it. For examples, see Use the Event Grid data connection.
Note
- Use BlockBlob to generate data, as using AppendBlob may result in unexpected behavior.
- When using the Azure Data Lake Gen2 storage SDK, use CreateFile for uploading files and Flush at the end with the close parameter set to true. For a detailed example of Data Lake Gen2 SDK correct usage, see Use the Event Grid data connection.
- Triggering ingestion with a CopyBlob operation is not supported for storage accounts that have the hierarchical namespace feature enabled on them.

When using ADLSv2, you can rename a blob to trigger blob ingestion to Azure Data Explorer. For example, see Rename blobs.
Note
Azure Data Explorer won't delete the blobs after ingestion. Use Azure Blob storage lifecycle to manage your blob deletion. It's recommended to keep the blobs for three to five days.
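One way to follow this recommendation is an Azure Storage lifecycle management rule that deletes blobs a few days after they were last modified. The Python sketch below only generates the policy document. The rule name and the prefix value are placeholders, and the result should be checked against the current lifecycle management policy schema before it's applied to a storage account.

```python
import json


def delete_after_days_policy(days: int, prefix: str) -> dict:
    """Build a lifecycle policy that deletes block blobs under `prefix` after `days` days."""
    if days < 1:
        raise ValueError("days must be at least 1")
    return {
        "rules": [
            {
                "enabled": True,
                "name": f"delete-ingested-blobs-after-{days}-days",  # hypothetical rule name
                "type": "Lifecycle",
                "definition": {
                    "actions": {
                        "baseBlob": {
                            "delete": {"daysAfterModificationGreaterThan": days}
                        }
                    },
                    # prefixMatch starts with the container name.
                    "filters": {"blobTypes": ["blockBlob"], "prefixMatch": [prefix]},
                },
            }
        ]
    }


# Five days sits at the upper end of the retention window recommended above.
policy = delete_after_days_policy(5, "<containerName>/")
print(json.dumps(policy, indent=2))
```

Scoping the rule with a prefix keeps it from deleting unrelated blobs in the same storage account.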
If local authentication is disabled on the Event Hubs namespace that contains the event hub used for streaming notifications, use the following steps to ensure that data flows properly from storage to the event hub using managed identities:
In addition, configure the Event Grid data connection to use managed identity authentication so that Azure Data Explorer can receive notifications from the event hub.
When using Azure Data Explorer to export the files used for Event Grid ingestion, note:
- Event Grid notifications aren't triggered if the connection string provided to the export command is in ADLS Gen2 format (for example, abfss://filesystem@accountname.dfs.core.windows.net) but the storage account isn't enabled for hierarchical namespace.
- If the account isn't enabled for hierarchical namespace, the connection string must use the Blob Storage format (for example, https://accountname.blob.core.windows.net). The export works as expected even when using the ADLS Gen2 connection string, but notifications won't be triggered and Event Grid ingestion won't work.

When using custom components to emulate Azure Storage events, the emulated events must strictly comply with the Azure Blob Storage event schema, because Azure Data Explorer discards events that the Event Grid SDK can't parse.
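 
For emulated events, a local sanity check before publishing can catch obvious schema mistakes. The Python sketch below checks an event against an illustrative subset of the Event Grid schema fields for blob created and blob renamed events. It's an assumption that this subset is useful: the article doesn't say which fields Azure Data Explorer reads, so the check doesn't replace validation against the full Azure Blob Storage event schema. All names and the sample values are hypothetical.

```python
from typing import Any, List, Mapping

# Illustrative subset of top-level Event Grid schema fields.
REQUIRED_TOP_LEVEL = ("id", "topic", "subject", "eventType", "eventTime", "data", "dataVersion")

# The data field that carries the blob location differs per event type.
URL_FIELD_BY_EVENT_TYPE = {
    "Microsoft.Storage.BlobCreated": "url",
    "Microsoft.Storage.BlobRenamed": "destinationUrl",
}


def find_schema_problems(event: Mapping[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = [f"missing top-level field: {name}" for name in REQUIRED_TOP_LEVEL if name not in event]
    event_type = event.get("eventType")
    url_field = URL_FIELD_BY_EVENT_TYPE.get(event_type)
    if event_type is not None and url_field is None:
        problems.append(f"unsupported eventType: {event_type}")
    data = event.get("data")
    if isinstance(data, Mapping) and url_field is not None and url_field not in data:
        problems.append(f"missing data field: {url_field}")
    return problems


SAMPLE_EVENT = {
    "id": "<eventId>",
    "topic": "/subscriptions/<subscriptionId>/resourceGroups/<resourceGroup>"
             "/providers/Microsoft.Storage/storageAccounts/<accountName>",
    "subject": "/blobServices/default/containers/<containerName>/blobs/<blobName>",
    "eventType": "Microsoft.Storage.BlobCreated",
    "eventTime": "2024-01-15T00:00:00Z",
    "data": {
        "api": "PutBlob",
        "contentType": "application/json",
        "contentLength": 4096,
        "blobType": "BlockBlob",
        "url": "https://<accountName>.blob.core.windows.net/<containerName>/<blobName>",
    },
    "dataVersion": "",
    "metadataVersion": "1",
}
```

Calling `find_schema_problems(SAMPLE_EVENT)` returns an empty list, and removing a required field or the blob URL produces a corresponding message.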