Databricks quota exceed exception

Aswini Barik 0 Reputation points
2023-01-18T20:47:39.5266667+00:00

Hello,

 We have a lot of Databricks jobs configured to run on job clusters at their scheduled times. One day our jobs failed with the error below.

"AZURE_QUOTA_EXCEEDED_EXCEPTION(CLIENT_ERROR)azure_error_code:QuotaExceeded,azure_error_message:Operation could not be completed as it results in exceeding approved standardEDSv4Family Cores quota. Additional details - Deployment Model: Resource Manager,"

Although it is clear that quota usage was exceeded, I have a few questions to which I am looking for to-the-point answers. Any answer except "increase the quota".

1 > How do I find out, at any moment in time, which clusters (job and interactive) are using how many vCPUs, so that the usage at that particular time can be totalled up?

2 > Is there any way to continuously log vCPU usage by clusters? This would tell us how the peak was reached, and which clusters were up when the peak was reached.

Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.

1 answer

  1. Tasadduq Burney 8,956 Reputation points MVP Volunteer Moderator
    2023-01-18T20:54:50.2333333+00:00
    1. To find out at any moment in time which clusters (job and interactive) are using how many vCPUs, you can use the Azure Databricks REST API or the Azure Databricks CLI.
    • Using the Azure Databricks REST API:
      1. Make a GET request to the /api/2.0/clusters/list endpoint to retrieve a list of all clusters.
      2. For each cluster, make a GET request to the /api/2.0/clusters/get?cluster_id=<cluster_id> endpoint to retrieve detailed information about the cluster, including its state, node types, and worker count. The response does not report vCPUs directly; map each node type to its core count via the /api/2.0/clusters/list-node-types endpoint.
    • Using the Azure Databricks CLI:
      1. Use the databricks clusters list command to retrieve a list of all clusters.
      2. For each cluster, use the databricks clusters get --cluster-id <cluster_id> command to retrieve detailed information about the cluster, including its node types and worker count, from which the vCPU usage can be derived.
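The two REST calls above can be combined into a point-in-time vCPU total. A rough sketch: the core-counting logic is a pure function over API-shaped JSON, and the sample payloads below are illustrative stand-ins for what the clusters/list and clusters/list-node-types endpoints return (the `num_cores` field is per the Clusters API node-type listing).

```python
def total_vcpus(clusters, node_types):
    """Sum vCPUs per running cluster.

    clusters: list of dicts shaped like the /api/2.0/clusters/list response.
    node_types: list of dicts shaped like /api/2.0/clusters/list-node-types.
    Returns {cluster_id: vcpus} for clusters in the RUNNING state.
    """
    cores = {nt["node_type_id"]: nt["num_cores"] for nt in node_types}
    usage = {}
    for c in clusters:
        if c.get("state") != "RUNNING":
            continue
        workers = c.get("num_workers", 0)
        # Driver may use a different node type than the workers.
        driver_cores = cores.get(c.get("driver_node_type_id", c["node_type_id"]), 0)
        worker_cores = cores.get(c["node_type_id"], 0)
        usage[c["cluster_id"]] = driver_cores + workers * worker_cores
    return usage

# Illustrative payload fragments (real data comes from the two REST endpoints):
node_types = [
    {"node_type_id": "Standard_E8ds_v4", "num_cores": 8.0},
    {"node_type_id": "Standard_E4ds_v4", "num_cores": 4.0},
]
clusters = [
    {"cluster_id": "a-1", "state": "RUNNING", "node_type_id": "Standard_E8ds_v4",
     "driver_node_type_id": "Standard_E4ds_v4", "num_workers": 4},
    {"cluster_id": "b-2", "state": "TERMINATED", "node_type_id": "Standard_E8ds_v4",
     "num_workers": 10},
]
usage = total_vcpus(clusters, node_types)
print(usage)                 # {'a-1': 36.0}
print(sum(usage.values()))   # 36.0
```

Note this counts only what the workspace sees; the Azure quota also includes any non-Databricks VMs of the same family in the subscription.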
    2. There is no built-in way to continuously log vCPU usage by clusters in Azure Databricks. However, you can use Azure Monitor to collect metrics about the usage of the Databricks cluster and the underlying resources.
    • Using Azure Monitor:
      1. Enable Azure Monitor for your Databricks workspace.
      2. Create a metric alert to trigger when the vCPU usage reaches a certain threshold.
      3. Create a Log Analytics workspace and collect metrics data and events from your Databricks workspace.
      4. Use Log Analytics queries to analyze the metrics data and identify which clusters were up when the peak usage was reached.
