How to reduce the data flow performance timing?

HemanthKumar loka 226 Reputation points
2022-03-16T14:20:42.33+00:00

My pipeline contains a Set Variable activity, a Get Metadata activity, and an If Condition activity, and the If Condition contains a data flow. When I trigger the pipeline, the data flow takes 4 to 5 minutes to complete. How can I reduce this time? Can anyone please help me with this?

Azure Data Factory

Accepted answer
  1. AnnuKumari-MSFT 34,556 Reputation points Microsoft Employee Moderator
    2022-03-17T08:08:04.797+00:00

    Hi @HemanthKumar loka,
    Thank you for using the Microsoft Q&A platform and posting your query.
    A data flow does take a few minutes to spin up its cluster.
    Data flows run on Spark clusters that are managed by ADF. Clusters are created on demand from scratch and destroyed once the job is done. That is why you see 4-5 minutes: the time goes to acquiring compute. Once compute is acquired, the job runs, and the cluster is killed after the job run completes.

    There is a workaround, though: you can set TimeToLive (TTL) on the Azure IR, which keeps the cluster alive for the next job if that job arrives within the TTL period. For example, if you set the TTL to 10 minutes, the cluster waits for another job on the same IR and, if one arrives, runs it and starts the cycle again. If no job arrives within 10 minutes, the cluster is killed.
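
    To make the TTL behaviour concrete, here is a minimal Python sketch of the cycle described above. It is only an illustration, not how ADF is implemented: the function name `cold_starts`, the default 5-minute startup time (taken from the 4-5 minutes mentioned in this thread), and the assumption that jobs on the IR never overlap are all mine.

```python
def cold_starts(trigger_minutes, run_minutes, ttl_minutes, startup_minutes=5):
    """Return one boolean per job: True if that job has to wait for a new cluster.

    Hypothetical model: every job runs for `run_minutes`, jobs do not overlap,
    and a new cluster takes `startup_minutes` to acquire (assumed value).
    """
    results = []
    alive_until = None  # minute until which an idle cluster is kept alive
    for trigger in sorted(trigger_minutes):
        # Cold start if no cluster exists yet or the TTL window has expired.
        cold = alive_until is None or trigger > alive_until
        begin = trigger + (startup_minutes if cold else 0)
        finish = begin + run_minutes
        # The TTL countdown restarts after each completed job.
        alive_until = finish + ttl_minutes
        results.append(cold)
    return results


# Three data flow runs triggered at minutes 0, 12 and 40, each running 2 minutes.
print(cold_starts([0, 12, 40], run_minutes=2, ttl_minutes=10))
print(cold_starts([0, 12, 40], run_minutes=2, ttl_minutes=0))
```

    In this model, with a 10-minute TTL the second run reuses the cluster left by the first, while the third arrives after the TTL has expired and pays for a new cluster. With no TTL, every run pays the startup cost.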

    From the ADF pipeline designer UI, go to Connections > Integration Runtimes > New. Select Azure IR and then open the Data Flow Run Time properties section. There you will see the TimeToLive option, which is the allowed idle time for the data flow compute: it specifies how long the compute stays alive after a data flow run completes if there are no other active jobs.
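
    If you prefer to review the setting as JSON rather than in the UI, the sketch below builds an Azure IR definition with a TTL in Python. The field names (`computeProperties`, `dataFlowProperties`, `computeType`, `coreCount`, `timeToLive`) follow the JSON that, as far as I know, ADF shows for an Azure IR, so please compare them with the definition in your own factory. The IR name `DataFlowIRWithTTL`, the helper name, and the core count are illustrative assumptions.

```python
import json


def azure_ir_with_ttl(name, ttl_minutes, core_count=8, compute_type="General"):
    """Build an Azure IR definition as a plain dict (illustrative helper).

    The structure mirrors my understanding of the ADF integration runtime
    JSON; verify the field names against your own factory before using it.
    """
    return {
        "name": name,
        "properties": {
            "type": "Managed",
            "typeProperties": {
                "computeProperties": {
                    "location": "AutoResolve",
                    "dataFlowProperties": {
                        "computeType": compute_type,
                        "coreCount": core_count,
                        # Idle minutes the cluster stays alive after a run.
                        "timeToLive": ttl_minutes,
                    },
                }
            },
        },
    }


# Hypothetical IR name; a 10-minute TTL as in the example above.
print(json.dumps(azure_ir_with_ttl("DataFlowIRWithTTL", 10), indent=2))
```

    The TTL only applies to data flow activities that run on this IR, so make sure the data flow activity in your pipeline is set to use it instead of the default auto-resolve IR.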

    Please refer to the following blog posts for more information:

    1. https://techcommunity.microsoft.com/t5/azure-data-factory-blog/adf-adds-ttl-to-azure-ir-to-reduce-data-flow-activity-times/ba-p/878380
    2. https://social.msdn.microsoft.com/Forums/en-US/91d388d9-730f-4d53-93b2-2a8697513511/azure-dataflow-execution-behaviour?forum=AzureDataFactory

    Hope this helps. Please let us know if you have any further queries.

