Associate metadata with data ingested from Storage into Azure Data Explorer

DG-4811 1 Reputation point
2022-12-05T09:07:23.927+00:00

Hi,

I'm using the Event Grid approach to ingest data from Blob storage into Azure Data Explorer. That works so far.

What I'm now struggling with is associating some metadata (data that is not part of the file in storage) with the files and having it available during ingestion (e.g. in the data mapping). I imagine I can map this metadata to columns and later query the data by these columns.

For example, I have an Azure Pipeline that produces some results (csv files) and saves them to a blob store. Now I want to "attach" some metadata about the pipeline run itself (e.g. timestamp, pipeline job id, commit hash, ...) to the data.
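
To make it concrete, here is roughly what the upload side looks like (a minimal sketch assuming the azure-storage-blob Python package; container, blob names, and metadata values are illustrative). Attaching the run details as blob metadata at upload time is easy, but I don't see how to surface it during ingestion:

```python
import os

from azure.storage.blob import BlobServiceClient

# Connect to the storage account the pipeline writes to.
service = BlobServiceClient.from_connection_string(
    os.environ["AZURE_STORAGE_CONNECTION_STRING"]
)
blob = service.get_blob_client(container="pipeline-results",
                               blob="run-001/results.csv")

# Upload the result file and attach the pipeline-run details as
# blob metadata (hypothetical keys and values).
with open("results.csv", "rb") as data:
    blob.upload_blob(
        data,
        overwrite=True,
        metadata={
            "runTimestamp": "2022-12-05T09:00:00Z",
            "pipelineJobId": "1234",
            "commitHash": "0a1b2c3",
        },
    )
```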

How could this be achieved?

Tags: Azure Blob Storage, Azure Data Explorer, Azure Event Grid

1 answer

  1. MayankBargali-MSFT 68,396 Reputation points
    2022-12-07T06:35:38.913+00:00

    @DG-4811 Thanks for reaching out. Please confirm whether my understanding of your setup is correct:
    Storage --> Event Grid (subscribed to storage events) --> Event Hub (as event handler) --> Azure Data Explorer

    Now you want to add some metadata that is sent along with the storage events. Unfortunately, you cannot add custom data to system events, i.e. events raised by Azure services such as a storage account. When configuring the endpoint on the event subscription you can only set custom delivery properties, and for an Event Hubs endpoint the only supported property is the partition key, which would not help with your use case.

    The workaround is to add your own custom logic that saves a mapping between each blob and its associated metadata, using whatever store you prefer. Since you want the metadata attached before the data reaches Azure Data Explorer, you need a middle system that applies this mapping for you. You can use an Azure Function with an Event Grid trigger and an Event Hubs output binding: the function contains the custom logic that adds the extra properties (metadata) to the event before sending it on to the event hub, as in the sketch below.
    Storage --> Event Grid (subscribed to storage events) --> Azure Function (as event handler) --> Event Hub --> Azure Data Explorer
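
    To illustrate, here is a minimal sketch of such a function. It assumes the Python v1 programming model with a function.json declaring an Event Grid trigger bound to `event` and an Event Hubs output binding bound to `outputEvent`; the metadata lookup is a placeholder for whatever store you use.

    ```python
    import json

    import azure.functions as func


    def lookup_run_metadata(blob_url: str) -> dict:
        # Placeholder: read the metadata your pipeline recorded for this
        # blob from your own store (e.g. an Azure Table keyed by blob URL).
        return {"pipelineJobId": "1234", "commitHash": "0a1b2c3"}


    def main(event: func.EventGridEvent, outputEvent: func.Out[str]) -> None:
        payload = event.get_json()        # BlobCreated event data
        blob_url = payload.get("url", "")

        # Enrich the notification with the pipeline-run metadata before it
        # reaches the event hub that Azure Data Explorer ingests from.
        enriched = {
            "blobUrl": blob_url,
            "eventTime": event.event_time.isoformat(),
            **lookup_run_metadata(blob_url),
        }
        outputEvent.set(json.dumps(enriched))
    ```

    Azure Data Explorer then ingests the enriched JSON records from the event hub, so the extra fields can be mapped to columns with an ordinary JSON ingestion mapping.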

    The above is just one possible solution; any custom application you deploy on Azure can act as the middleware instead.
