Notebook clusters in pipeline parallel processes

Iwan 65 Reputation points
2024-09-19T12:59:09.1533333+00:00

I have a pipeline, triggered daily, that loops through a list of notebooks and runs them in parallel on a cluster using a ForEach activity. At the moment I have seven notebooks running, and I am adding more over time.

What happens when the cluster is asked to run more than it can handle at one time? Will it process x number of notebooks at once and leave the rest in a queue until it has capacity, or will the rest fail to run because it has reached its capacity?

Azure Synapse Analytics

An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.

Answer accepted by question author

Vinodh247-1375 43,021 Reputation points Volunteer Moderator
2024-09-19T14:38:07.2766667+00:00

Hi Iwan,

Thanks for reaching out to Microsoft Q&A.

In Azure Synapse, when you run multiple notebooks in parallel within a pipeline, the capacity of the cluster can become a bottleneck. If the cluster reaches its capacity (e.g., because of limited available cores or memory), Synapse will typically queue the remaining jobs until resources become available. It will not fail the jobs outright unless there is a separate issue such as an out-of-memory error or a configuration problem.
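As a rough illustration of the queuing behaviour described above, the sketch below estimates how many notebook sessions fit in a pool at once. It assumes each session reserves one driver node plus a fixed number of executor nodes, and that the pool has a fixed maximum node count. The function names and the numbers (a 10-node pool, 2 executors per notebook) are hypothetical; real capacity also depends on autoscale, dynamic executor allocation, and workspace quotas.

```python
def concurrent_sessions(pool_max_nodes: int, nodes_per_session: int) -> int:
    """Number of sessions that fit in the pool at once, counting whole nodes."""
    if nodes_per_session <= 0:
        raise ValueError("nodes_per_session must be positive")
    return pool_max_nodes // nodes_per_session


def plan(notebooks: int, pool_max_nodes: int, executors_per_session: int):
    """Return (running, queued) under the simplified model described above."""
    # Assumption: every session needs 1 driver node plus its executor nodes.
    nodes_per_session = 1 + executors_per_session
    running = min(notebooks, concurrent_sessions(pool_max_nodes, nodes_per_session))
    queued = notebooks - running
    return running, queued


if __name__ == "__main__":
    # Hypothetical numbers: 7 notebooks, 10-node pool, 2 executors per notebook.
    running, queued = plan(notebooks=7, pool_max_nodes=10, executors_per_session=2)
    print(f"running now: {running}, waiting in queue: {queued}")
```

With these assumed numbers, three of the seven notebooks would start immediately and the other four would wait until nodes are released.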

To handle this gracefully, you can try the following:

  • Ensure you monitor the cluster’s CPU and memory usage to understand when you might hit capacity limits.
  • Adjust the concurrency of the "ForEach" activity in the pipeline to limit how many notebooks run in parallel. This is controlled by the "Batch count" property (batchCount in the pipeline JSON), which applies when the activity is not set to run sequentially.
  • If you are using a cluster that supports auto-scaling, the cluster will attempt to add more nodes to accommodate additional workload, as long as it is configured and has not hit any maximum limits.
  • Synapse typically queues notebook executions that exceed current resources. If you have many notebooks, it will process them as resources free up. If a notebook fails due to resource constraints, you could add retry policies in your pipeline.
  • For Spark jobs, if you consistently see long queue times, consider increasing your Spark pool size.

Implementing these strategies can help manage resource allocation and prevent failures due to exceeding cluster capacity.
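The batch count and retry suggestions above can be sketched as a pipeline fragment. The Python below only builds the JSON for a ForEach activity and prints it, so the structure is easy to compare with the code view in Synapse Studio. The activity names, the `notebooks` pipeline parameter, and the chosen values (batch count 3, 2 retries, 120 seconds between retries) are assumptions for illustration, and the exact field layout should be checked against your own pipeline's JSON before use.

```python
import json


def build_foreach_activity(batch_count: int = 3,
                           retries: int = 2,
                           retry_interval_s: int = 120) -> dict:
    """Build a ForEach activity definition that caps parallel notebook runs."""
    # The ForEach batch count is documented with an upper limit of 50.
    if not 1 <= batch_count <= 50:
        raise ValueError("batch_count must be between 1 and 50")
    return {
        "name": "RunNotebooks",  # hypothetical activity name
        "type": "ForEach",
        "typeProperties": {
            "isSequential": False,      # parallel mode; batchCount applies here
            "batchCount": batch_count,  # max notebooks started at the same time
            "items": {
                # Assumption: the notebook list is a pipeline parameter.
                "value": "@pipeline().parameters.notebooks",
                "type": "Expression",
            },
            "activities": [
                {
                    "name": "RunNotebook",  # hypothetical activity name
                    "type": "SynapseNotebook",
                    # Retry policy for runs that fail, e.g. on resource limits.
                    "policy": {
                        "retry": retries,
                        "retryIntervalInSeconds": retry_interval_s,
                    },
                    "typeProperties": {
                        "notebook": {
                            "referenceName": {
                                "value": "@item()",
                                "type": "Expression",
                            },
                            "type": "NotebookReference",
                        }
                    },
                }
            ],
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_foreach_activity(), indent=2))
```

Keeping the batch count at or below the number of sessions the pool can host means fewer notebooks sit in the Spark queue, at the cost of the ForEach loop itself taking longer to work through the list.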

Please 'Upvote'(Thumbs-up) and 'Accept' as an answer if the reply was helpful. This will benefit other community members who face the same issue.
