What are best practices for capturing Synapse or Data Factory Pipeline Activity Outputs in Log Analytics or Azure Monitor?

NPizzuti 56 Reputation points
2024-03-21T14:42:00.16+00:00

Hello,

I have spent some time researching this online but have not reached a concrete conclusion. I am seeking best practices for capturing outputs from Synapse pipeline activities in Log Analytics or Azure Monitor. I have not found a way to capture these with the built-in "Diagnostic Settings" options, as those only capture information about the pipelines or activities themselves, not their individual outputs.

My specific use-case is Apache Spark notebook logging. I have an output from my notebook activity (path is


Accepted answer
  1. Amira Bedhiafi 24,636 Reputation points
    2024-03-21T17:30:13.7566667+00:00

    If the out-of-the-box solutions do not meet your needs, consider sending custom logs to Azure Log Analytics:

    • Modifying your Spark notebooks to write output details to a log file, or directly to Log Analytics using the Log Analytics Data Collector API (see the Python sketch after this list).
    • Using Azure Functions or Automation Runbooks to periodically read these outputs and send them to Log Analytics.
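
    A minimal sketch of the Data Collector API approach, assuming a PySpark notebook cell; the workspace ID, shared key, table name (SparkNotebookOutput) and record fields are placeholders you would replace with your own:

```python
# Hypothetical sketch: push one JSON record from a Synapse Spark notebook to a
# Log Analytics custom table via the HTTP Data Collector API.
import base64
import hashlib
import hmac
import json
from datetime import datetime, timezone

import requests

WORKSPACE_ID = "<log-analytics-workspace-id>"  # assumption: your workspace GUID
SHARED_KEY = "<workspace-primary-key>"         # assumption: workspace primary/secondary key
LOG_TYPE = "SparkNotebookOutput"               # records land in SparkNotebookOutput_CL


def _build_signature(date: str, content_length: int) -> str:
    """Builds the SharedKey authorization header required by the Data Collector API."""
    string_to_sign = (
        f"POST\n{content_length}\napplication/json\nx-ms-date:{date}\n/api/logs"
    )
    digest = hmac.new(
        base64.b64decode(SHARED_KEY), string_to_sign.encode("utf-8"), hashlib.sha256
    ).digest()
    return f"SharedKey {WORKSPACE_ID}:{base64.b64encode(digest).decode()}"


def post_to_log_analytics(record: dict) -> None:
    """Sends a single JSON record to the custom log table."""
    body = json.dumps([record])
    rfc1123_date = datetime.now(timezone.utc).strftime("%a, %d %b %Y %H:%M:%S GMT")
    headers = {
        "Content-Type": "application/json",
        "Log-Type": LOG_TYPE,
        "x-ms-date": rfc1123_date,
        "Authorization": _build_signature(rfc1123_date, len(body)),
    }
    url = f"https://{WORKSPACE_ID}.ods.opinsights.azure.com/api/logs?api-version=2016-04-01"
    requests.post(url, data=body, headers=headers).raise_for_status()


# Example: log whatever output you care about at the end of the notebook activity.
post_to_log_analytics({"pipelineRunId": "<run-id>", "status": "Succeeded", "rowCount": 12345})
```

    Once ingested, the records appear in a custom table named after the Log-Type header (here SparkNotebookOutput_CL) and can be queried with KQL alongside the built-in pipeline diagnostics.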

    For real-time processing and logging, you can use Azure Event Grid to subscribe to events from Azure Data Factory or Synapse Analytics, or, more simply, trigger an Azure Function upon completion of an activity to capture its output and log it to Azure Log Analytics.
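
    One lightweight way to do the latter is an HTTP-triggered Azure Function that the pipeline calls from an Azure Function (or Web) activity when the notebook activity finishes, passing @activity('<NotebookActivityName>').output in the request body. A minimal sketch using the Python v2 programming model; the route, activity name, and the reuse of a helper like post_to_log_analytics above are assumptions:

```python
import json
import logging

import azure.functions as func

app = func.FunctionApp()


@app.route(route="log-activity-output", auth_level=func.AuthLevel.FUNCTION)
def log_activity_output(req: func.HttpRequest) -> func.HttpResponse:
    # The pipeline posts the upstream activity's output here, e.g. with the
    # request body set to @activity('<NotebookActivityName>').output.
    payload = req.get_json()
    logging.info("Activity output received: %s", json.dumps(payload))
    # Forward the payload to Log Analytics, e.g. with a helper such as
    # post_to_log_analytics(payload) from the sketch above.
    return func.HttpResponse("logged", status_code=200)
```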

    For your specific case of capturing outputs from Spark notebook activities:

    • Modify your Spark notebooks to include logging statements that write directly to Azure Log Analytics through the Data Collector API.
    • Alternatively, write the outputs to intermediate storage (such as Azure Blob Storage) with detailed logging information, and then use a scheduled process (Azure Functions or Logic Apps) to ingest those logs into Log Analytics (see the sketch after this list).
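
    A sketch of the intermediate-storage variant, assuming a Synapse Spark notebook; the storage path, container and record fields are placeholders:

```python
# Hypothetical sketch: persist the notebook's own result summary as a small JSON
# log file in ADLS/Blob so a scheduled Azure Function or Logic App can forward
# it to Log Analytics later.
import json
from datetime import datetime, timezone

from notebookutils import mssparkutils  # available on Synapse Spark pools

log_record = {
    "notebook": "my_notebook",                               # assumption: notebook name
    "runTimestampUtc": datetime.now(timezone.utc).isoformat(),
    "status": "Succeeded",
    "rowsWritten": 12345,                                    # assumption: output to capture
}

log_path = (
    "abfss://logs@<storage-account>.dfs.core.windows.net/"   # assumption: your container
    f"notebook-logs/{datetime.now(timezone.utc):%Y/%m/%d}/run.json"
)

# Write the log record; a scheduled downstream process picks these files up.
mssparkutils.fs.put(log_path, json.dumps(log_record), True)  # True = overwrite

# Optionally surface the same record to the pipeline as the activity's exit value.
mssparkutils.notebook.exit(json.dumps(log_record))
```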
