Ingesting logs from Azure Blob Storage created by Log Analytics diagnostic settings

Ivan Wilson 121 Reputation points
2023-09-01T04:46:53.32+00:00

I need to pull logs from a Log Analytics Workspace into Azure SQL Server. I've configured the diagnostic settings to output logs to Azure Blob Storage.

I want to use Azure Data Factory to transfer the logs from Azure Blob Storage to an Azure SQL Database. Since the logs will be generated infrequently, I want to use a trigger based on the creation of an item in the Azure Blob Storage.

My concern is that Log Analytics may update the log file after creation. I can't find any details about how frequently the diagnostic setting generates new files.

My guess is that it creates one file for each hour in which a log entry is generated. That file is then updated with any new log events that arrive within the same hour, and at the start of the next hour any new log events are written to a new file.

Does anyone with experience of Log Analytics diagnostic settings know whether this sounds right?

Azure Monitor
Azure Data Factory

Accepted answer
  1. Amira Bedhiafi 15,216 Reputation points
    2023-09-01T11:43:01.83+00:00

    When you set up diagnostic logging to Azure Blob Storage, the logs are written as blobs, and the naming pattern of those blobs gives you an idea of how often they're generated. A typical blob name looks like this: resourceId=/SUBSCRIPTIONS/{subscription_id}/RESOURCEGROUPS/{resource_group_name}/PROVIDERS/MICROSOFT.COMPUTE/VIRTUALMACHINES/{resource_name}/y=2023/m=09/d=01/h=00/m=00/PT1H.json

    From the naming you can see that blobs are partitioned by year (y=), month (m=), day (d=), hour (h=), and minute (m=, which is always 00). PT1H is the ISO 8601 notation for a one-hour duration, so each file covers a one-hour period.
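
    To see this concretely, here is a small Python sketch (illustrative only, with a placeholder path following the pattern above) that derives the one-hour window a given PT1H.json blob covers, which is handy when deciding when it is safe to ingest it:

    ```python
    import re
    from datetime import datetime, timedelta, timezone

    def window_from_blob_path(blob_path: str):
        """Return (window_start, window_end) in UTC for an hourly PT1H.json blob."""
        match = re.search(
            r"y=(\d{4})/m=(\d{2})/d=(\d{2})/h=(\d{2})/m=(\d{2})/PT1H\.json$",
            blob_path,
        )
        if match is None:
            raise ValueError(f"unexpected blob path: {blob_path}")
        year, month, day, hour, minute = (int(g) for g in match.groups())
        start = datetime(year, month, day, hour, minute, tzinfo=timezone.utc)
        return start, start + timedelta(hours=1)

    # Placeholder path following the naming pattern above
    path = ("resourceId=/SUBSCRIPTIONS/{sub}/RESOURCEGROUPS/{rg}/PROVIDERS/"
            "MICROSOFT.COMPUTE/VIRTUALMACHINES/{vm}/y=2023/m=09/d=01/h=00/m=00/PT1H.json")
    print(window_from_blob_path(path))  # window is 2023-09-01 00:00 to 01:00 UTC
    ```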

    Your assumption seems correct: the diagnostic setting typically generates a new blob for each hour in which log entries are produced.

    If new log data for that hour arrives after the blob has been created, it is typically appended to the existing blob for that hour rather than written to a new blob.
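
    If you want to be defensive about those late appends, one option is to check the blob's last-modified time before ingesting and only proceed once it has been quiet for a while. A rough sketch using the azure-storage-blob package (the connection string, container name, and blob path are placeholders you would supply):

    ```python
    from datetime import datetime, timedelta, timezone
    from azure.storage.blob import BlobClient

    def blob_has_settled(conn_str: str, container: str, blob_path: str,
                         quiet_period: timedelta = timedelta(minutes=15)) -> bool:
        """True if the blob has not been modified within the last quiet_period."""
        blob = BlobClient.from_connection_string(conn_str, container, blob_path)
        props = blob.get_blob_properties()
        return datetime.now(timezone.utc) - props.last_modified >= quiet_period
    ```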

    Using a trigger based on the creation of an item in Azure Blob Storage makes sense. However, since the logs might get appended within the hour, it may be a good idea to add some delay to your Data Factory pipeline's trigger to ensure that you capture all the logs for that hour. For instance, if a blob is created for the 1:00 PM - 2:00 PM window at 2:00 PM, you might want your Data Factory pipeline to start ingesting this blob only at 2:15 PM or 2:30 PM to account for potential late-arriving log data.
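
    One practical note: a Blob storage event trigger fires as soon as the blob is created and has no built-in delay, so the buffer usually comes from either a Wait activity at the start of the pipeline or from a tumbling window trigger, which does expose a delay property. As a rough illustration only (all names are placeholders, and the exact model and parameter names assume a recent azure-mgmt-datafactory version), an hourly tumbling window trigger with a 15-minute delay might look like this:

    ```python
    # Sketch only: an hourly tumbling window trigger that starts each run
    # 15 minutes after the window closes. All names below are placeholders.
    from datetime import datetime, timezone
    from azure.identity import DefaultAzureCredential
    from azure.mgmt.datafactory import DataFactoryManagementClient
    from azure.mgmt.datafactory.models import (
        PipelineReference,
        TriggerPipelineReference,
        TriggerResource,
        TumblingWindowTrigger,
    )

    adf = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

    trigger = TumblingWindowTrigger(
        pipeline=TriggerPipelineReference(
            pipeline_reference=PipelineReference(reference_name="CopyLogBlobsToSql"),
            parameters={},
        ),
        frequency="Hour",      # one window per hour, matching the PT1H blobs
        interval=1,
        start_time=datetime(2023, 9, 1, tzinfo=timezone.utc),
        delay="00:15:00",      # wait 15 minutes past each window before running
        max_concurrency=1,
    )

    adf.triggers.create_or_update(
        "<resource-group>", "<data-factory-name>", "HourlyLogIngestTrigger",
        TriggerResource(properties=trigger),
    )
    ```

    If you prefer to keep the event-based trigger, the same 15-minute buffer can instead be a simple Wait activity at the front of the pipeline.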

    If you require real-time ingestion, it might be better to consider using a service like Azure Stream Analytics. But if a slight delay is acceptable, the above approach with Azure Data Factory should work for your needs.

