Data flow taking too much time to execute; need help on priority.


Hi, in my production pipeline I am using a data flow activity to flatten a JSON file into CSV. It takes too long, around 45 minutes, and even then it sometimes fails and sometimes succeeds.

The same pipeline in the dev environment, with the same input data, takes 4 to 5 minutes to complete. Can anyone help me fix this issue?



I have shared the data flow settings and the data flow skeleton for your understanding.

[1]: /api/attachments/172527-image.png?platform=QnA [3]: /api/attachments/172561-image.png?platform=QnA


2 answers

  1. AnnuKumari-MSFT 30,676 Reputation points Microsoft Employee

    Hi @Karnati,Venkata Suchendra Reddy,IN-Bangalore ,

    Thank you for using the Microsoft Q&A platform and posting your query. Could you please check whether any other pipelines are running at the same time and sharing the same Integration Runtime, which might cause the delay? Also, please compare your dev and production pipeline JSON using an online comparison tool. If everything looks the same but the issue persists, I suggest raising a support ticket with Microsoft so the issue can be analyzed further. For details on how to raise a support ticket, see the article Create an Azure support request.

    If this answers your query, please click Accept Answer and Up-Vote. If you have any further queries, do let us know.


  2. MarkKromer-MSFT 5,186 Reputation points Microsoft Employee

    Look at the top portion of your activity monitoring view, which is cut off and not shown in your screenshot. It will tell you how long it took to acquire the Spark cluster that executes your data flow. If that is in the 3-4 minute range, set a TTL (time to live) on your Azure IR so that the pipeline can use a warm cluster instead of requiring ADF to spin up a new cluster on every data flow activity invocation.
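
To make the TTL suggestion above concrete, here is a minimal sketch in Python that builds an Azure IR definition with a data flow time to live and prints it as JSON. The helper `build_ir_definition`, the runtime name `DataFlowIRWithTTL`, the 8 cores, and the 10-minute TTL are all hypothetical values chosen for illustration. The field layout follows the integration runtime JSON as I understand it, so compare it with the JSON view of your own IR before relying on it.

```python
import json


def build_ir_definition(name: str, core_count: int, ttl_minutes: int) -> dict:
    """Return an Azure IR definition with a data flow time to live (TTL).

    Hypothetical helper for illustration; it only assembles a dictionary.
    """
    return {
        "name": name,
        "properties": {
            "type": "Managed",
            "typeProperties": {
                "computeProperties": {
                    "location": "AutoResolve",
                    "dataFlowProperties": {
                        "computeType": "General",
                        "coreCount": core_count,
                        # Minutes the cluster stays warm after the last data flow run.
                        "timeToLive": ttl_minutes,
                    },
                }
            },
        },
    }


# Assumed name and sizing; adjust both to your environment.
ir_definition = build_ir_definition("DataFlowIRWithTTL", core_count=8, ttl_minutes=10)
print(json.dumps(ir_definition, indent=2))
```

The data flow activity has to reference this integration runtime for the warm cluster to be reused. Keep in mind that the cluster is billed while it stays alive during the TTL window, so a short TTL is usually enough when data flow activities run close together.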