Hello,
We have a lot of Databricks jobs configured to run on job clusters on their own schedules. One day our jobs failed with the error below.
"AZURE_QUOTA_EXCEEDED_EXCEPTION(CLIENT_ERROR)azure_error_code:QuotaExceeded,azure_error_message:Operation could not be completed as it results in exceeding approved standardEDSv4Family Cores quota. Additional details - Deployment Model: Resource Manager,"
It is clear that the quota was exceeded, but I have a few questions and am looking for to-the-point answers. Please suggest anything other than increasing the quota.
1 > How do I find out, at any given moment, how many vCPUs each cluster (job and interactive) is using, so that the per-cluster figures add up to the total usage at that time?
2 > Is there any way to continuously log vCPU usage by cluster? This would tell us how the peak was reached and which clusters were up when it was reached.
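
To make the ask concrete, here is a rough sketch of the per-cluster accounting I have in mind. It is not a working integration: it assumes cluster records shaped like the Clusters API list response (fields such as `state`, `cluster_source`, `node_type_id`, `driver_node_type_id`, `num_workers`), and the `CORES_PER_NODE_TYPE` map, the sample records, and the function names are all hypothetical. Counting `PENDING` clusters toward usage is also my assumption.

```python
# Hypothetical map of VM size -> vCPU count. In practice this would be
# built from the workspace's node-type listing rather than hard-coded.
CORES_PER_NODE_TYPE = {
    "Standard_E4ds_v4": 4,
    "Standard_E8ds_v4": 8,
}

# Assumption: clusters in these states hold (or are acquiring) VMs.
ACTIVE_STATES = {"PENDING", "RUNNING", "RESIZING", "RESTARTING"}


def vcpu_snapshot(clusters, cores_per_node_type, taken_at):
    """Return one row per active cluster with its estimated vCPU usage."""
    rows = []
    for c in clusters:
        if c.get("state") not in ACTIVE_STATES:
            continue
        worker_type = c["node_type_id"]
        # The driver uses the worker VM size unless one is set explicitly.
        driver_type = c.get("driver_node_type_id", worker_type)
        workers = c.get("num_workers", 0)
        vcpus = (cores_per_node_type[driver_type]
                 + workers * cores_per_node_type[worker_type])
        rows.append({
            "taken_at": taken_at,
            "cluster_id": c["cluster_id"],
            "cluster_name": c.get("cluster_name", ""),
            "source": c.get("cluster_source", "UNKNOWN"),  # e.g. JOB or UI
            "vcpus": vcpus,
        })
    return rows


def total_vcpus(rows):
    """Sum the vCPU estimate across all rows of one snapshot."""
    return sum(r["vcpus"] for r in rows)


# Made-up sample records, only to show the expected shape.
SAMPLE_CLUSTERS = [
    {"cluster_id": "job-1", "cluster_name": "nightly-etl", "state": "RUNNING",
     "cluster_source": "JOB", "node_type_id": "Standard_E8ds_v4",
     "num_workers": 2},
    {"cluster_id": "ui-1", "cluster_name": "adhoc", "state": "RUNNING",
     "cluster_source": "UI", "node_type_id": "Standard_E4ds_v4",
     "driver_node_type_id": "Standard_E4ds_v4", "num_workers": 1},
    {"cluster_id": "ui-2", "cluster_name": "old", "state": "TERMINATED",
     "cluster_source": "UI", "node_type_id": "Standard_E8ds_v4",
     "num_workers": 4},
]

if __name__ == "__main__":
    snapshot = vcpu_snapshot(SAMPLE_CLUSTERS, CORES_PER_NODE_TYPE,
                             "2024-01-01T00:00:00Z")
    for row in snapshot:
        print(row)
    print("total vCPUs:", total_vcpus(snapshot))
```

If something like this ran every minute or so and appended its rows to a table, the history would show which clusters were up when the peak was reached. I am not sure whether this is the right approach or whether a built-in option already exists.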