Pools and Cost

Gopinath Rajee 656 Reputation points
2022-04-27T04:21:41.28+00:00

All,

Suppose I create a Pool with Min Idle of 5 and Max Capacity of 10 to run the job clusters inside the pool. When not in use, how do they remain idle only to get assigned very quickly when allocated to a job cluster?

Is it that the nodes keep running in the background, and that is why they get assigned to job clusters very quickly?

Does this mean that we will be billed by Azure since the nodes are running but not by Databricks since there is no DBU consumed? If Azure is billing for the idle instances in the pool, how much are they charging us?

Thanks,
grajee

<<Azure Databricks pools reduce cluster start and auto-scaling times by maintaining a set of idle, ready-to-use instances>>

Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.

Accepted answer
  1. PRADEEPCHEEKATLA 90,251 Reputation points
    2022-04-27T12:00:23.713+00:00

    Hello @Gopinath Rajee ,

    Thanks for the question and for using the MS Q&A platform.

    Suppose I create a Pool with Min Idle of 5 and Max Capacity of 10 to run the job clusters inside the pool. When not in use, how do they remain idle only to get assigned very quickly when allocated to a job cluster?

    When you create a pool, in order to control its size, you can set three parameters: minimum idle instances, maximum capacity, and idle instance auto termination.

    1. Minimum Idle Instances: The minimum number of instances the pool keeps idle. These instances do not terminate, regardless of the setting specified in Idle Instance Auto Termination. If a cluster consumes idle instances from the pool, Azure Databricks provisions additional instances to maintain the minimum.
    2. Maximum Capacity: The maximum number of instances that the pool will provision. If set, this value constrains all instances (idle + used). If a cluster using the pool requests more instances than this number during autoscaling, the request will fail with an INSTANCE_POOL_MAX_CAPACITY_FAILURE error.
    3. Idle Instance Auto Termination: The time in minutes that instances above the value set in Minimum Idle Instances can be idle before being terminated by the pool.
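    The three settings above can be sketched in a few lines of Python. This is only an illustration of the sizing rules, not Databricks code. The field names follow the Instance Pools REST API as I understand it, and the helper names (`PoolConfig`, `can_allocate`, `idle_to_provision`), the pool name, and the 60-minute auto termination value are assumptions made for the example. Capping replenishment at the remaining headroom is also an assumption drawn from "this value constrains all instances (idle + used)".

```python
from dataclasses import dataclass


@dataclass
class PoolConfig:
    """Hypothetical holder for the three pool sizing settings."""
    min_idle_instances: int
    max_capacity: int
    idle_instance_autotermination_minutes: int

    def to_create_payload(self, name: str, node_type_id: str) -> dict:
        # Request body shaped like an Instance Pools "create" call (assumed field names).
        return {
            "instance_pool_name": name,
            "node_type_id": node_type_id,
            "min_idle_instances": self.min_idle_instances,
            "max_capacity": self.max_capacity,
            "idle_instance_autotermination_minutes":
                self.idle_instance_autotermination_minutes,
        }


def can_allocate(config: PoolConfig, used: int, requested: int) -> bool:
    # Max capacity limits idle + used, so a request fits only if the
    # instances already in use plus the new ones stay within the limit.
    return used + requested <= config.max_capacity


def idle_to_provision(config: PoolConfig, used: int, idle: int) -> int:
    # Instances needed to get back to the minimum idle count,
    # capped by the headroom left under max capacity (assumption).
    shortfall = max(0, config.min_idle_instances - idle)
    headroom = max(0, config.max_capacity - used - idle)
    return min(shortfall, headroom)


# Values from the question: Min Idle 5, Max Capacity 10.
# The 60-minute auto termination value is a placeholder.
pool = PoolConfig(min_idle_instances=5, max_capacity=10,
                  idle_instance_autotermination_minutes=60)
payload = pool.to_create_payload("example-pool", "Standard_DS3_v2")
```

    With this pool, a job cluster that takes 3 of the 5 idle instances leaves 2 idle, so `idle_to_provision(pool, used=3, idle=2)` asks for 3 more. A request that would push usage past 10 is the case that fails with INSTANCE_POOL_MAX_CAPACITY_FAILURE.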

    Is it that the nodes keep running in the background, and that is why they get assigned to job clusters very quickly?

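    Yes, the idle instances keep running in the background, which is why a job cluster can pick them up so quickly. The pool never terminates the number of instances set in Minimum Idle Instances. Only idle instances above that minimum are terminated, once they have been idle for the time set in Idle Instance Auto Termination.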
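    To make the termination rule concrete, here is a small model of it, assuming each idle instance is represented only by how many minutes it has been idle. The function name `instances_to_terminate` is hypothetical, and picking the longest-idle instances first is my assumption; the documentation quoted above does not say which instances the pool chooses.

```python
def instances_to_terminate(idle_minutes, min_idle, autotermination_minutes):
    """Return the idle times of instances the pool may terminate.

    Only instances above the minimum idle count are candidates, and only
    once they have been idle for at least the auto termination time.
    """
    surplus = max(0, len(idle_minutes) - min_idle)
    # Assumption: the longest-idle instances are considered first.
    candidates = sorted(idle_minutes, reverse=True)[:surplus]
    return [m for m in candidates if m >= autotermination_minutes]


# Seven idle instances, Min Idle of 5, 60-minute auto termination (placeholder).
expired = instances_to_terminate([90, 70, 65, 61, 10, 5, 3], 5, 60)
```

    Here only two instances are above the minimum of 5, so at most two can be terminated even though four have been idle for more than 60 minutes. With exactly 5 idle instances nothing is ever terminated, however long they sit idle.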

    Does this mean that we will be billed by Azure since the nodes are running but not by Databricks since there is no DBU consumed? If Azure is billing for the idle instances in the pool, how much are they charging us?

    Azure Databricks does not charge DBUs while instances are idle in the pool. Instance provider billing does apply, though: Azure bills for the idle VMs at the rate for the pool's instance type. See Linux Virtual Machines Pricing.

    • Instance types: A pool consists of both idle instances kept ready for new clusters and instances in use by running clusters. All of these instances are of the same instance provider type, selected when creating a pool.

    A pool’s instance type cannot be edited. Clusters attached to a pool use the same instance type for the driver and worker nodes. Different families of instance types fit different use cases, such as memory-intensive or compute-intensive workloads.

    Example: I used Standard_DS3_v2 for the pool, so the idle instances are billed at the Azure VM rate for Standard_DS3_v2. You will be billed based on whichever instance type you select.
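    A rough way to estimate the idle cost is instances x hours x hourly VM rate. The sketch below assumes a flat hourly rate and no DBU charge while idle. The rate of 0.25 per hour is a placeholder, not an actual Azure price; take the real figure for your region and instance type from the Linux Virtual Machines Pricing page. The function name `idle_pool_vm_cost` is hypothetical.

```python
def idle_pool_vm_cost(idle_instances: int, hours: float, vm_hourly_rate: float) -> float:
    """VM cost of keeping instances idle in a pool (no DBUs are charged while idle)."""
    return idle_instances * hours * vm_hourly_rate


# Min Idle of 5 kept warm for a 30-day month at a placeholder rate of 0.25 per hour.
PLACEHOLDER_RATE = 0.25  # NOT a real Azure price
monthly_cost = idle_pool_vm_cost(idle_instances=5, hours=24 * 30,
                                 vm_hourly_rate=PLACEHOLDER_RATE)
```

    With these placeholder numbers the five idle instances cost 900.0 per month in whatever currency the rate is quoted in. Lowering Min Idle reduces that figure, at the price of slower cluster starts.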

    Hope this helps. Please let us know if you have any further queries.

