Dynamic load balance

Ryan Abbey 1,186 Reputation points
2023-01-26T23:05:01.87+00:00

We have a Synapse Data Factory process that dynamically kicks off more than 200 extracts. Because Data Factory has a limit of 20 concurrently running pipelines, the remaining extracts are queued behind the initial 20.

However, it appears to queue the extracts evenly across the available processors. Most of the extracts take 1-2 minutes, but a few take 10+ minutes. As a result, most extracts complete quickly, but some only run after a long-running extract queued ahead of them has finished, rather than being picked up by a processor that is already free. When the distribution is bad, we sometimes end up with two 10+ minute extracts running back to back on the same processor.
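The effect described above can be sketched with a small simulation. This is a toy model only, not Data Factory's actual scheduler: the function names, the round-robin assignment, and the workload (200 extracts of 2 minutes, two of which take 10 minutes) are all assumptions chosen for illustration. It compares pre-assigning extracts evenly to 20 slots against letting each extract go to whichever slot frees up first.

```python
import heapq

def static_makespan(durations, workers):
    """Total elapsed time when job i is pre-assigned to slot i % workers
    (an assumed model of 'even queuing') and each slot runs its queue in order."""
    totals = [0] * workers
    for i, d in enumerate(durations):
        totals[i % workers] += d
    return max(totals)

def shared_queue_makespan(durations, workers):
    """Total elapsed time when each job, in submission order,
    is taken by whichever slot becomes free first."""
    free_at = [0] * workers          # time at which each slot is next free
    heapq.heapify(free_at)
    for d in durations:
        start = heapq.heappop(free_at)       # earliest available slot
        heapq.heappush(free_at, start + d)
    return max(free_at)

# Hypothetical workload: 200 extracts of 2 minutes each,
# except two 10-minute extracts that happen to land on the same slot
# under round-robin assignment with 20 slots (indices 0 and 20).
durations = [2] * 200
durations[0] = 10
durations[20] = 10

print("even pre-assignment:", static_makespan(durations, 20), "minutes")
print("shared queue:       ", shared_queue_makespan(durations, 20), "minutes")
```

Under these assumptions the pre-assigned run is held up by the one slot that received both long extracts, while the shared queue lets the short extracts flow around them.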

Is there any way to choose which load-balancing technique is used, or any other way to stop the extracts being distributed so poorly?

Azure Synapse Analytics
Azure Data Factory