In the Databricks workspace, when you define a job, you can specify the configuration of the new job cluster it will run on. This includes the number of worker nodes, which, together with the worker node type, determines how many cores the job consumes.
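As a rough sketch (the job name, notebook path, and sizes are illustrative, not from your setup), this is the shape of a job definition with an explicit `new_cluster` block as you would pass it to the Databricks Jobs API:

```python
# Sketch of a job with its own job cluster, as posted to the Jobs API
# (POST /api/2.1/jobs/create). Values are placeholders.
job_definition = {
    "name": "nightly-transform",
    "tasks": [
        {
            "task_key": "transform",
            "notebook_task": {"notebook_path": "/Jobs/transform"},
            "new_cluster": {
                "spark_version": "13.3.x-scala2.12",
                "node_type_id": "Standard_DS3_v2",  # 4 cores per worker on this SKU
                "num_workers": 4,                   # 4 workers x 4 cores = 16 worker cores (+ driver)
            },
        }
    ],
}
```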
When setting up the Databricks Linked Service in ADF (#2), you can set the Existing Cluster ID. Every ADF pipeline that uses this linked service will then run on that one existing cluster rather than creating a new job cluster. Overlapping jobs share that cluster's fixed set of cores instead of each spinning up its own cluster, which keeps you from maxing out your core quota.
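A minimal sketch of that linked service, shown as a Python dict mirroring the linked-service JSON; the workspace URL, cluster ID, and Key Vault names are placeholders:

```python
# Sketch of an ADF AzureDatabricks linked service pinned to an existing
# interactive cluster instead of creating a new job cluster per activity.
databricks_linked_service = {
    "name": "AzureDatabricksShared",
    "properties": {
        "type": "AzureDatabricks",
        "typeProperties": {
            "domain": "https://adb-1234567890123456.7.azuredatabricks.net",
            "existingClusterId": "0123-456789-abcde123",  # the shared cluster's ID
            "accessToken": {
                "type": "AzureKeyVaultSecret",  # keep the PAT in Key Vault
                "store": {"referenceName": "MyKeyVault", "type": "LinkedServiceReference"},
                "secretName": "databricks-pat",
            },
        },
    },
}
```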
Databricks also has Pools (#3), which keep a set of idle, ready-to-use instances so clusters attached to the pool start faster. A pool's maximum capacity caps how many instances (and therefore cores) it can hand out at once, which gives you concurrency control at the pool level.
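A sketch of creating such a pool through the Instance Pools REST API; the pool name, sizes, workspace URL, and token are placeholders:

```python
import requests

# Sketch: create a pool with a hard instance cap via the Instance Pools API
# (POST /api/2.0/instance-pools/create).
host = "https://adb-1234567890123456.7.azuredatabricks.net"
token = "<databricks-pat>"

pool_spec = {
    "instance_pool_name": "adf-jobs-pool",
    "node_type_id": "Standard_DS3_v2",
    "min_idle_instances": 2,   # pre-warmed VMs so clusters start faster
    "max_capacity": 10,        # ceiling on instances (and therefore cores) in use
    "idle_instance_autotermination_minutes": 30,
}

resp = requests.post(
    f"{host}/api/2.0/instance-pools/create",
    headers={"Authorization": f"Bearer {token}"},
    json=pool_spec,
)
resp.raise_for_status()
print(resp.json()["instance_pool_id"])
```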
Use ADF's pipeline concurrency setting. Each pipeline has a concurrency property that limits how many runs of that pipeline execute at the same time; additional runs are queued until a slot frees up.
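For example, a pipeline definition with concurrency set to 1, sketched as a Python dict; the pipeline and activity names are made up:

```python
# Sketch of the pipeline-level concurrency setting in the pipeline JSON.
# With concurrency = 1, overlapping triggers queue instead of running in parallel.
pipeline_definition = {
    "name": "TransformPipeline",
    "properties": {
        "concurrency": 1,  # max simultaneous runs of this pipeline
        "activities": [
            {
                "name": "RunTransformNotebook",
                "type": "DatabricksNotebook",
                "linkedServiceName": {
                    "referenceName": "AzureDatabricksShared",
                    "type": "LinkedServiceReference",
                },
                "typeProperties": {"notebookPath": "/Jobs/transform"},
            }
        ],
    },
}
```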
Set up dependency conditions in ADF such that certain activities don't run unless others have completed. This is more manual and requires foresight into which jobs might be running concurrently.
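A sketch of two notebook activities chained with a dependency condition (activity names and notebook paths are illustrative):

```python
# Sketch: the second activity only starts after the first succeeds,
# so the two notebooks never run on clusters at the same time.
activities = [
    {
        "name": "StageData",
        "type": "DatabricksNotebook",
        "typeProperties": {"notebookPath": "/Jobs/stage"},
    },
    {
        "name": "TransformData",
        "type": "DatabricksNotebook",
        "typeProperties": {"notebookPath": "/Jobs/transform"},
        "dependsOn": [
            {
                "activity": "StageData",
                "dependencyConditions": ["Succeeded"],  # also: Failed, Skipped, Completed
            }
        ],
    },
]
```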
Rather than querying the Databricks API directly in every ADF pipeline, you can create an Azure Logic App that manages the execution queue for your Databricks jobs. The Logic App can check if there's available capacity before triggering a new Databricks job via ADF. This abstracts away the check from individual ADF pipelines and centralizes the logic.
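The Logic App itself is assembled in the designer, but the capacity check it performs might look like the sketch below if implemented in Python (for instance in an Azure Function the Logic App calls); the host, token, and `MAX_ACTIVE` threshold are assumptions:

```python
import requests

# Sketch of a capacity check against the Jobs Runs API before letting ADF
# trigger another Databricks job.
HOST = "https://adb-1234567890123456.7.azuredatabricks.net"
TOKEN = "<databricks-pat>"
MAX_ACTIVE = 3  # how many Databricks job runs we allow at once

def has_capacity() -> bool:
    """Return True if fewer than MAX_ACTIVE job runs are currently active."""
    resp = requests.get(
        f"{HOST}/api/2.1/jobs/runs/list",
        headers={"Authorization": f"Bearer {TOKEN}"},
        params={"active_only": "true", "limit": 25},
    )
    resp.raise_for_status()
    active_runs = resp.json().get("runs", [])
    return len(active_runs) < MAX_ACTIVE

if __name__ == "__main__":
    # The Logic App would branch on this result: trigger the ADF pipeline
    # if there is capacity, otherwise wait and re-check.
    print("capacity available" if has_capacity() else "at capacity, requeue")
```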
If many of your transformations share common logic, consider refactoring your jobs so that logic lives in Databricks libraries. This can reduce the number of distinct jobs you need to run and, with it, the number of clusters running concurrently.
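As an illustration only, shared logic might be pulled into a small module like the one below, packaged as a wheel, and attached to the cluster each job uses (module and function names are invented):

```python
# shared_transforms/cleaning.py -- hypothetical shared library module
from pyspark.sql import DataFrame
from pyspark.sql import functions as F

def standardize_timestamps(df: DataFrame, col: str) -> DataFrame:
    """Parse a string column into a proper timestamp column."""
    return df.withColumn(col, F.to_timestamp(F.col(col)))

def drop_duplicate_keys(df: DataFrame, key_cols: list) -> DataFrame:
    """Keep a single row per business key."""
    return df.dropDuplicates(key_cols)
```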
You can combine #2 (pointing the linked service at an existing cluster) and #3 (pools) to manage the concurrency and core usage of the Databricks jobs you trigger from ADF.
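One way to combine them, sketched below with placeholder IDs and versions, is to have the linked service create its job clusters from the capped pool via `instancePoolId`, so startup stays fast and the total instance count can never exceed the pool's ceiling:

```python
# Sketch of an ADF AzureDatabricks linked service whose job clusters draw
# their VMs from the pool created earlier. Property names follow the
# linked-service JSON; IDs and versions are placeholders.
pooled_linked_service = {
    "name": "AzureDatabricksPooled",
    "properties": {
        "type": "AzureDatabricks",
        "typeProperties": {
            "domain": "https://adb-1234567890123456.7.azuredatabricks.net",
            "instancePoolId": "0123-456789-pool0001",  # the capped pool
            "newClusterVersion": "13.3.x-scala2.12",
            "newClusterNumOfWorker": "4",
            "accessToken": {
                "type": "AzureKeyVaultSecret",
                "store": {"referenceName": "MyKeyVault", "type": "LinkedServiceReference"},
                "secretName": "databricks-pat",
            },
        },
    },
}
```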