Best Practices for Automating Pipeline Execution Data Collection in Azure Data Factory

Hanna 220 Reputation points
2024-07-16T14:23:15.0633333+00:00

Hello everyone,

I am looking for the best practices to create an automated workflow for collecting execution data from Azure Data Factory (ADF) pipelines, storing this data in Azure Data Lake Storage Gen2, and consolidating it into a single table for simplified analysis and performance monitoring.

Specifically, I would like to know:

  1. What are the recommended methods for extracting pipeline execution data from Azure Data Factory?
  2. What tools or services should be used to automate the data extraction process?
  3. How should the data be structured and stored in Azure Data Lake Storage Gen2 for optimal querying and performance?
  4. What are the best practices for consolidating the extracted data into a single table for analysis?
  5. Are there any examples or templates available that demonstrate this workflow?

Any guidance, tips, or resources would be greatly appreciated!

Thank you.


Accepted answer
Nandan Hegde 31,591 Reputation points MVP
2024-07-16T14:35:42.2966667+00:00

The way to get pipeline execution details is via the REST API:

https://learn.microsoft.com/en-us/rest/api/datafactory/pipeline-runs/get?view=rest-datafactory-2018-06-01&tabs=HTTP
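
For reference, here is a minimal Python sketch of calling that Get Pipeline Run endpoint directly. The subscription, resource group, factory name, and run ID values are placeholders you would supply, and the use of `azure-identity` plus `requests` is just one way to authenticate against ARM; it is not part of the original answer.

```python
# Minimal sketch: fetch one pipeline run's execution details via the ADF REST API.
# Requires: pip install azure-identity requests
import requests
from azure.identity import DefaultAzureCredential

subscription_id = "<subscription-id>"   # placeholder
resource_group = "<resource-group>"     # placeholder
factory_name = "<data-factory-name>"    # placeholder
run_id = "<pipeline-run-id>"            # placeholder

# Acquire an ARM token (works with az login, managed identity, service principal, etc.)
token = DefaultAzureCredential().get_token("https://management.azure.com/.default").token

url = (
    f"https://management.azure.com/subscriptions/{subscription_id}"
    f"/resourceGroups/{resource_group}/providers/Microsoft.DataFactory"
    f"/factories/{factory_name}/pipelineruns/{run_id}"
)

resp = requests.get(
    url,
    params={"api-version": "2018-06-01"},
    headers={"Authorization": f"Bearer {token}"},
)
resp.raise_for_status()
run = resp.json()

# Fields useful for performance monitoring: status, runStart, runEnd, durationInMs, message
print(run["pipelineName"], run["status"], run.get("durationInMs"))
```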

You can use ADF itself to extract the data from the REST API, then leverage an ADF data flow to flatten the JSON into a flat file and upload it to blob storage.

Sample:

https://www.linkedin.com/pulse/get-pipeline-execution-details-using-azure-data-rest-api-narmadha
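
The sample above builds the workflow inside ADF itself. As a rough outside-ADF illustration of the same extract-and-flatten idea, the sketch below queries pipeline runs updated in the last 24 hours via the queryPipelineRuns endpoint and writes selected fields to a flat CSV. The output file name, column list, and time window are assumptions, and pagination via the continuationToken in the response is not handled here.

```python
# Rough illustration (outside ADF) of the extract-and-flatten step:
# query recent pipeline runs and land them as one row per run in a flat file.
import csv
import requests
from datetime import datetime, timedelta, timezone
from azure.identity import DefaultAzureCredential

subscription_id = "<subscription-id>"   # placeholder
resource_group = "<resource-group>"     # placeholder
factory_name = "<data-factory-name>"    # placeholder

token = DefaultAzureCredential().get_token("https://management.azure.com/.default").token
url = (
    f"https://management.azure.com/subscriptions/{subscription_id}"
    f"/resourceGroups/{resource_group}/providers/Microsoft.DataFactory"
    f"/factories/{factory_name}/queryPipelineRuns"
)

# Query runs updated in the last 24 hours (assumed window).
now = datetime.now(timezone.utc)
body = {
    "lastUpdatedAfter": (now - timedelta(days=1)).isoformat(),
    "lastUpdatedBefore": now.isoformat(),
}
resp = requests.post(
    url,
    params={"api-version": "2018-06-01"},
    headers={"Authorization": f"Bearer {token}"},
    json=body,
)
resp.raise_for_status()
runs = resp.json().get("value", [])

# Flatten the JSON run records into one row per run (assumed column list).
columns = ["runId", "pipelineName", "status", "runStart", "runEnd", "durationInMs"]
with open("pipeline_runs.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=columns, extrasaction="ignore")
    writer.writeheader()
    for r in runs:
        writer.writerow({c: r.get(c) for c in columns})
```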
