Hello @Gopinath Rajee,
Thanks for the question and for using the Microsoft Q&A platform.
Suppose I create a Pool with Min Idle of 5 and Max Capacity of 10 to run the job clusters inside the pool. When not in use, how do they remain idle only to get assigned very quickly when allocated to a job cluster?
When you create a pool, in order to control its size, you can set three parameters: minimum idle instances, maximum capacity, and idle instance auto termination.
- Minimum Idle Instances: The minimum number of instances the pool keeps idle. These instances do not terminate, regardless of the setting specified in Idle Instance Auto Termination. If a cluster consumes idle instances from the pool, Azure Databricks provisions additional instances to maintain the minimum.
- Maximum Capacity: The maximum number of instances that the pool will provision. If set, this value constrains all instances (idle + used). If a cluster using the pool requests more instances than this number during autoscaling, the request will fail with an INSTANCE_POOL_MAX_CAPACITY_FAILURE error.
- Idle Instance Auto Termination: The time in minutes that instances above the value set in Minimum Idle Instances can be idle before being terminated by the pool.
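For the pool described in the question (Min Idle of 5, Max Capacity of 10), the three settings could be expressed as a request body along these lines. This is only a sketch: the field names follow the Instance Pools REST API as I understand it, the pool name `job-pool` and the 30-minute timeout are placeholder values I made up, and `build_pool_spec` is a hypothetical helper, not part of any SDK. Nothing is sent to a workspace here; the code only builds and prints the payload.

```python
import json


def build_pool_spec(name, node_type_id, min_idle, max_capacity, idle_minutes):
    """Build a request body for creating an instance pool (hypothetical helper).

    Field names are assumed to match the Instance Pools REST API;
    verify them against the API reference before use.
    """
    if min_idle < 0:
        raise ValueError("min_idle must be zero or greater")
    if max_capacity < 1:
        raise ValueError("max_capacity must be at least 1")
    if min_idle > max_capacity:
        # Max capacity constrains idle + used instances, so the idle
        # minimum cannot be larger than the pool itself.
        raise ValueError("min_idle cannot exceed max_capacity")
    return {
        "instance_pool_name": name,
        "node_type_id": node_type_id,
        "min_idle_instances": min_idle,
        "max_capacity": max_capacity,
        "idle_instance_autotermination_minutes": idle_minutes,
    }


# Values from the question; the name and the 30-minute timeout are placeholders.
spec = build_pool_spec("job-pool", "Standard_DS3_v2", 5, 10, 30)
print(json.dumps(spec, indent=2))
```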
Is it that the nodes keep running in the background and because of which they get assigned to jobclusters very quickly?
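Yes. The pool keeps the number of instances set in Minimum Idle Instances running in the background, which is why they can be assigned to job clusters very quickly. These instances do not terminate, regardless of the Idle Instance Auto Termination setting. Only idle instances above the minimum are terminated, once they have been idle for the configured time.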
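To make the bookkeeping concrete, here is a toy model of how the idle and used counts might move for a pool with Min Idle 5 and Max Capacity 10. It is a simplification I wrote for illustration, not how Azure Databricks is implemented: the `Pool` class and its methods are hypothetical, replenishment is treated as instantaneous (real instances take time to provision), and it assumes the pool tops up idle instances only as far as Max Capacity allows.

```python
from dataclasses import dataclass


class PoolMaxCapacityError(Exception):
    """Stands in for the INSTANCE_POOL_MAX_CAPACITY_FAILURE error."""


@dataclass
class Pool:
    min_idle: int
    max_capacity: int
    idle: int = 0
    used: int = 0

    def __post_init__(self):
        self._replenish()

    def _replenish(self):
        # Top idle back up to the minimum, but never let idle + used
        # exceed max_capacity (assumption of this toy model).
        target = min(self.min_idle, self.max_capacity - self.used)
        if self.idle < target:
            self.idle = target

    def acquire(self, n):
        """A cluster requests n instances; returns how many came from idle."""
        if self.used + n > self.max_capacity:
            raise PoolMaxCapacityError("INSTANCE_POOL_MAX_CAPACITY_FAILURE")
        from_idle = min(n, self.idle)
        self.idle -= from_idle
        self.used += n
        self._replenish()
        return from_idle

    def release(self, n):
        """A cluster terminates; its instances return to the pool as idle."""
        n = min(n, self.used)
        self.used -= n
        self.idle += n

    def expire_idle(self):
        """Auto-termination timeout elapsed: trim idle down to the minimum."""
        self.idle = min(self.idle, self.min_idle)


pool = Pool(min_idle=5, max_capacity=10)
served_from_idle = pool.acquire(3)   # job cluster takes 3 warm instances
print(served_from_idle, pool.idle, pool.used)
```

In this model the job cluster is served entirely from warm instances, and the pool then tops itself back up to the idle minimum.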
Does this mean that we will be billed by Azure since the nodes are running but not by Databricks since there is no DBU consumed? If Azure is billing for the idle instances in the pool, how much are they charging us?
Azure Databricks does not charge DBUs while instances are idle in the pool. Instance provider billing does apply, based on the instance type. See Linux Virtual Machines Pricing.
- Instance types: A pool consists of both idle instances kept ready for new clusters and instances in use by running clusters. All of these instances are of the same instance provider type, selected when creating a pool.
A pool’s instance type cannot be edited. Clusters attached to a pool use the same instance type for the driver and worker nodes. Different families of instance types fit different use cases, such as memory-intensive or compute-intensive workloads.
Example: I used Standard_DS3_v2, so the idle instances in the pool are billed at the Azure VM rate for that instance type.
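As a rough way to estimate the idle cost, you can multiply the number of idle instances by the VM hourly rate and the number of hours. The sketch below assumes the pool sits at its idle minimum of 5 for the whole period. The rate of 0.25 per hour is a made-up placeholder, not the real Standard_DS3_v2 price; take the actual figure for your region from the Linux Virtual Machines Pricing page. `idle_pool_cost` is a hypothetical helper.

```python
def idle_pool_cost(idle_instances, hourly_vm_rate, hours):
    """Estimate the VM charge for idle pool instances (no DBUs while idle)."""
    return idle_instances * hourly_vm_rate * hours


# Placeholder rate, NOT the real price: look up your instance type and region.
PLACEHOLDER_HOURLY_RATE = 0.25

# 5 idle instances kept warm for 24 hours.
daily_cost = idle_pool_cost(5, PLACEHOLDER_HOURLY_RATE, 24)
print(daily_cost)
```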
Hope this helps. Please let us know if you have any further queries.