Azure Databricks - Data Factory - orchestration - Job Cluster

Jain, Shovit 40 Reputation points
2023-07-19T10:12:44.12+00:00

Hello All,

I am using Azure Data Factory to orchestrate Databricks notebook executions. The pipeline frequently fails with the following error message:
Operation on target <ADLS> failed: Databricks execution failed with error state: InternalError, error message: Unexpected failure while waiting for the cluster <cluster_name> to be ready: Cluster <cluster_name> is in unexpected state.

I am using a job cluster to execute the Databricks notebook. The cluster terminates once the job completes, after which a new job can create a new job cluster.

The problem is that I want a queueing mechanism: if the maximum number of job clusters has been reached, my job should wait for a cluster to become available instead of failing.

Is there any waiting/queueing mechanism currently available?

Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

Accepted answer
ShaikMaheer-MSFT 38,546 Reputation points Microsoft Employee Moderator
2023-07-20T16:45:23.6333333+00:00

    Hi Jain, Shovit,

    Thank you for posting query in Microsoft Q&A Platform.

    Azure Data Factory provides a waiting/queueing mechanism for Databricks job clusters. You can use the maxConcurrentRuns parameter in the Databricks activity settings to limit the number of concurrent runs of a Databricks job. This parameter specifies the maximum number of runs that can be executed at the same time.

    If the maximum number of concurrent runs is reached, any new runs will be queued until a job cluster becomes available. Once a job cluster becomes available, the queued runs will be executed in the order they were received.

    Here is an example of how to use the maxConcurrentRuns parameter in the Databricks activity settings:

    1. In your Azure Data Factory pipeline, add a Databricks activity to execute your Databricks notebook.
    2. In the Databricks activity settings, set maxConcurrentRuns to the maximum number of concurrent runs you want to allow (for example, 5).
    3. Save and publish your pipeline.
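    The waiting behavior these steps aim for can be illustrated with a small, self-contained sketch: a client-side analogy using a semaphore, not ADF's actual implementation. All names here are illustrative; at most a fixed number of runs hold a "job cluster" at once, and extra runs wait instead of failing.

```python
import threading
import time

# Analogy for the maxConcurrentRuns limit: only this many runs
# may execute at the same time; the rest block and wait.
MAX_CONCURRENT_RUNS = 2

cluster_slots = threading.Semaphore(MAX_CONCURRENT_RUNS)
completed = []
completed_lock = threading.Lock()

def run_notebook(run_id: int) -> None:
    with cluster_slots:      # waits here if all slots are taken, instead of failing
        time.sleep(0.05)     # simulated notebook execution
        with completed_lock:
            completed.append(run_id)

threads = [threading.Thread(target=run_notebook, args=(i,)) for i in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(completed))  # prints [0, 1, 2, 3, 4]: every run eventually completes
```

    Note that a plain semaphore does not guarantee strict first-come-first-served ordering; a real queue would be needed for that, but the key point here is that no run fails when the limit is reached.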

    With this configuration, new runs queue once the concurrency limit is reached and execute in the order received as job clusters become available.

    Note that the maxConcurrentRuns parameter is only available for Databricks activities that use job clusters. If you are using a Databricks activity that uses a dedicated cluster, you can use the maxWorkers parameter to limit the number of workers in the cluster.

    Hope this helps. Please let me know how it goes.


0 additional answers
