Best Practices for Automating Pipeline Execution Data Collection in Azure Data Factory

Hanna 220 Reputation points
2024-07-16T14:23:15.0633333+00:00

Hello everyone,

I am looking for the best practices to create an automated workflow for collecting execution data from Azure Data Factory (ADF) pipelines, storing this data in Azure Data Lake Storage Gen2, and consolidating it into a single table for simplified analysis and performance monitoring.

Specifically, I would like to know:

  1. What are the recommended methods for extracting pipeline execution data from Azure Data Factory?
  2. What tools or services should be used to automate the data extraction process?
  3. How should the data be structured and stored in Azure Data Lake Storage Gen2 for optimal querying and performance?
  4. What are the best practices for consolidating the extracted data into a single table for analysis?
  5. Are there any examples or templates available that demonstrate this workflow?

Any guidance, tips, or resources would be greatly appreciated!

Thank you.


Accepted answer
Nandan Hegde 31,591 Reputation points MVP
2024-07-16T14:35:42.2966667+00:00

The way to get pipeline execution details is via the REST API:

https://learn.microsoft.com/en-us/rest/api/datafactory/pipeline-runs/get?view=rest-datafactory-2018-06-01&tabs=HTTP
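
For reference, here is a minimal Python sketch of calling that Get Pipeline Run endpoint directly. The subscription, resource group, factory name, and run ID values are placeholders you would supply, and the use of `azure-identity` plus `requests` is just one way to authenticate against ARM; it is not part of the original answer.

```python
# Minimal sketch: fetch one pipeline run's execution details via the ADF REST API.
# Requires: pip install azure-identity requests
import requests
from azure.identity import DefaultAzureCredential

subscription_id = "<subscription-id>"   # placeholder
resource_group = "<resource-group>"     # placeholder
factory_name = "<data-factory-name>"    # placeholder
run_id = "<pipeline-run-id>"            # placeholder

# Acquire an ARM token (works with az login, managed identity, service principal, etc.)
token = DefaultAzureCredential().get_token("https://management.azure.com/.default").token

url = (
    f"https://management.azure.com/subscriptions/{subscription_id}"
    f"/resourceGroups/{resource_group}/providers/Microsoft.DataFactory"
    f"/factories/{factory_name}/pipelineruns/{run_id}"
)

resp = requests.get(
    url,
    params={"api-version": "2018-06-01"},
    headers={"Authorization": f"Bearer {token}"},
)
resp.raise_for_status()
run = resp.json()

# Fields useful for performance monitoring: status, runStart, runEnd, durationInMs, message
print(run["pipelineName"], run["status"], run.get("durationInMs"))
```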

You can use ADF itself to extract the data from the REST API, then leverage an ADF data flow to flatten the JSON into a flat file and upload it to blob storage.

Sample:

https://www.linkedin.com/pulse/get-pipeline-execution-details-using-azure-data-rest-api-narmadha
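
The sample above builds the workflow inside ADF itself. As a rough outside-ADF illustration of the same extract-and-flatten idea, the sketch below queries pipeline runs updated in the last 24 hours via the queryPipelineRuns endpoint and writes selected fields to a flat CSV. The output file name, column list, and time window are assumptions, and pagination via the continuationToken in the response is not handled here.

```python
# Rough illustration (outside ADF) of the extract-and-flatten step:
# query recent pipeline runs and land them as one row per run in a flat file.
import csv
import requests
from datetime import datetime, timedelta, timezone
from azure.identity import DefaultAzureCredential

subscription_id = "<subscription-id>"   # placeholder
resource_group = "<resource-group>"     # placeholder
factory_name = "<data-factory-name>"    # placeholder

token = DefaultAzureCredential().get_token("https://management.azure.com/.default").token
url = (
    f"https://management.azure.com/subscriptions/{subscription_id}"
    f"/resourceGroups/{resource_group}/providers/Microsoft.DataFactory"
    f"/factories/{factory_name}/queryPipelineRuns"
)

# Query runs updated in the last 24 hours (assumed window).
now = datetime.now(timezone.utc)
body = {
    "lastUpdatedAfter": (now - timedelta(days=1)).isoformat(),
    "lastUpdatedBefore": now.isoformat(),
}
resp = requests.post(
    url,
    params={"api-version": "2018-06-01"},
    headers={"Authorization": f"Bearer {token}"},
    json=body,
)
resp.raise_for_status()
runs = resp.json().get("value", [])

# Flatten the JSON run records into one row per run (assumed column list).
columns = ["runId", "pipelineName", "status", "runStart", "runEnd", "durationInMs"]
with open("pipeline_runs.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=columns, extrasaction="ignore")
    writer.writeheader()
    for r in runs:
        writer.writerow({c: r.get(c) for c in columns})
```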
