Hi Jain, Shovit,
Thank you for posting your query on the Microsoft Q&A platform.
Azure Data Factory provides a waiting/queueing mechanism for Databricks job clusters. You can use the maxConcurrentRuns parameter in the Databricks activity settings to limit how many runs of a Databricks job execute at the same time. If that limit is reached, any new runs are queued until a job cluster becomes available; queued runs are then executed in the order they were received.
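To make the behaviour concrete, here is a minimal sketch that models the queueing rule described above (a cap plus a first-in-first-out queue). This is an illustration only, not the actual Databricks scheduler; the function and run names are made up for the example.

```python
from collections import deque

# Illustrative model of the described behaviour: up to max_concurrent_runs
# start immediately; any further requests wait in FIFO order.
def schedule(run_requests, max_concurrent_runs):
    running = set()          # runs currently executing
    queued = deque()         # runs waiting for a free slot, FIFO
    started = []             # order in which runs were started
    for run in run_requests:
        if len(running) < max_concurrent_runs:
            running.add(run)
            started.append(run)
        else:
            queued.append(run)   # new runs are queued, not rejected
    return started, list(queued)

started, waiting = schedule(["r1", "r2", "r3", "r4"], max_concurrent_runs=2)
print(started)  # ['r1', 'r2']
print(waiting)  # ['r3', 'r4']
```

As a job cluster frees up, the next run would be taken from the front of the queue, preserving arrival order.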
Here is an example of how to use the maxConcurrentRuns parameter in the Databricks activity settings:
1. In your Azure Data Factory pipeline, add a Databricks activity to execute your Databricks notebook.
2. In the Databricks activity settings, set the maxConcurrentRuns parameter to the maximum number of concurrent runs you want to allow. For example, to allow at most 5 concurrent runs, set maxConcurrentRuns to 5.
3. Save and publish your pipeline.
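If the underlying Databricks job is managed directly (for example via the Databricks Jobs API rather than only through the activity settings), the corresponding field there is max_concurrent_runs in the job settings. The sketch below builds such a settings payload; the job name, notebook path, and cluster values are placeholders I chose for illustration, not values from your pipeline, and the payload would be sent to the Jobs API create/update endpoint in a real setup.

```python
import json

# Hedged sketch: builds a Databricks job-settings payload with a concurrency
# cap. All names and cluster values are hypothetical examples.
def build_job_settings(notebook_path, max_concurrent_runs=5):
    return {
        "name": "adf-triggered-job",                  # hypothetical job name
        "max_concurrent_runs": max_concurrent_runs,   # cap on parallel runs
        "tasks": [
            {
                "task_key": "main",
                "notebook_task": {"notebook_path": notebook_path},
                "new_cluster": {
                    "spark_version": "13.3.x-scala2.12",  # example runtime
                    "node_type_id": "Standard_DS3_v2",    # example node type
                    "num_workers": 2,
                },
            }
        ],
    }

settings = build_job_settings("/Repos/etl/main", max_concurrent_runs=5)
print(json.dumps(settings, indent=2))
```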
With this configuration, once the limit is reached, new runs are queued and then executed in arrival order as job clusters become available, never exceeding the maximum specified by maxConcurrentRuns.
Note that the maxConcurrentRuns parameter is only available for Databricks activities that use job clusters. If your Databricks activity runs on a dedicated cluster, you can instead use the maxWorkers parameter to limit the number of workers in that cluster.
Hope this helps. Please let me know how it goes.