How to custom configure HDInsight Autoscale

The following configurations can be tuned to customize HDInsight Autoscale behavior to customer needs.

Note

This article is applicable to HDInsight 4.0 and 5.0 stacks.

Configurations

| Configuration | Description | Default value | Applicable cluster/Autoscale type | Remarks |
|---|---|---|---|---|
| yarn.4_0.graceful.decomm.workaround.enable | Enable YARN graceful decommissioning | Load based autoscale – True; Scheduled autoscale – True | Hadoop/Spark | If this config is disabled, YARN puts nodes in decommissioned state directly from running state without waiting for the applications using the node to finish. This can lead to applications getting killed abruptly when nodes are decommissioned. Read more about job resiliency in YARN here |
| yarn.graceful.decomm.timeout | YARN graceful decommissioning timeout in seconds | Hadoop load based – 3600; Spark load based – 86400; Hadoop scheduled – 1; Spark scheduled – 1 | Hadoop/Spark | Graceful decommissioning timeout is best configured according to customer applications. For example, if an application has many mappers and few reducers and can take 4 hours to complete, this configuration needs to be set to more than 4 hours (14,400 seconds) |
| yarn.max.scale.up.increment | Maximum number of nodes to scale up in one go | 200 | Hadoop/Spark/Interactive Query | This value has been tested with 200 nodes. We don't recommend setting it to more than 200. It can be set to less than 200 if the customer wants a less aggressive scale-up |
| yarn.max.scale.down.increment | Maximum number of nodes to scale down in one go | 50 | Hadoop/Spark/Interactive Query | Can be set to up to 100 |
| nodemanager.recommission.enabled | Enable recommissioning of decommissioning node managers before new nodes are added to the cluster | True | Hadoop/Spark load based autoscale | Disabling this feature can cause underutilization of the cluster: nodes in decommissioning state can have no containers to run but still wait for their applications to finish, even if there's more load in the cluster. Note: Applicable for images on 2304280205 or later |
| UnderProvisioningDiagnoser.time.ms | Time in milliseconds for which the cluster needs to be underprovisioned for a scale-up to trigger | 180000 | Hadoop/Spark load based autoscaling | See the illustrative sketch after this table |
| OverProvisioningDiagnoser.time.ms | Time in milliseconds for which the cluster needs to be overprovisioned for a scale-down to trigger | 180000 | Hadoop/Spark load based autoscaling | - |
| hdfs.decommission.enable | Decommission data nodes before decommissioning node managers. HDFS doesn't support a graceful decommission timeout; decommissioning is immediate | True | Hadoop/Spark load based autoscaling | Decommissioning data nodes before node managers ensures that a given data node isn't used for storing shuffle data |
| scaling.recommission.cooldown.ms | Cooldown period after recommission during which no metrics are sampled | 120000 | Hadoop/Spark load based autoscaling | This cooldown period gives the cluster time to redistribute load to the newly recommissioned node managers. Note: Applicable for images on 2304280205 or later |
| scale.down.nodes.with.am | Scale down nodes where an AM is running | false | Hadoop/Spark | Can be turned on if enough reattempts are configured for the AM. Useful when long-running applications (for example, Spark Streaming) can be killed to scale down the cluster once load has reduced. Note: Applicable for images on 2304280205 or later |
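The two diagnoser timings above determine how long a load condition must persist before load-based autoscale reacts. The Python sketch below is only an illustration of that behavior, not HDInsight's actual implementation; the class, function, and variable names are hypothetical, and it assumes cluster metrics are sampled periodically.

```python
import time

# Illustrative model of the two diagnoser windows; not HDInsight's actual implementation.
UNDER_PROVISIONING_TIME_MS = 180_000  # UnderProvisioningDiagnoser.time.ms
OVER_PROVISIONING_TIME_MS = 180_000   # OverProvisioningDiagnoser.time.ms


class ProvisioningDiagnoser:
    """Tracks how long a provisioning condition has held continuously."""

    def __init__(self, window_ms: int):
        self.window_ms = window_ms
        self.condition_since_ms = None  # timestamp when the condition first became true

    def observe(self, condition_holds: bool, now_ms: int) -> bool:
        """Return True once the condition has held for at least window_ms."""
        if not condition_holds:
            self.condition_since_ms = None
            return False
        if self.condition_since_ms is None:
            self.condition_since_ms = now_ms
        return now_ms - self.condition_since_ms >= self.window_ms


# Usage sketch: evaluate each metrics sample and decide when to scale.
under = ProvisioningDiagnoser(UNDER_PROVISIONING_TIME_MS)
over = ProvisioningDiagnoser(OVER_PROVISIONING_TIME_MS)


def on_metrics_sample(pending_demand: int, idle_capacity: int) -> None:
    now_ms = int(time.time() * 1000)
    if under.observe(pending_demand > 0, now_ms):
        print("Underprovisioned for the full window -> request scale-up")
    if over.observe(pending_demand == 0 and idle_capacity > 0, now_ms):
        print("Overprovisioned for the full window -> request scale-down")
```

In this model, raising either window makes autoscale less sensitive to short-lived spikes or lulls in load, while lowering it makes scaling decisions more aggressive.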

Note

  • The above configs can be changed by running this script on the headnodes as a script action; use this readme to understand how to run the script. A hedged example of submitting such a script action from Python follows this list.
  • Customers are advised to test the configurations in lower environments before moving to production.
  • To find the cluster's image version, see How to check image version.
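For reference, a script action targeting the headnodes can also be submitted programmatically. The sketch below uses the azure-mgmt-hdinsight Python SDK and assumes a recent (track 2) SDK version; the script URI, resource names, script-action name, and parameter string are placeholders, so adapt them to the script and readme linked above rather than treating this as a drop-in command.

```python
# Sketch: submit a script action to the headnodes with the azure-mgmt-hdinsight SDK.
# Resource names, the script URI, and the parameter string below are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.hdinsight import HDInsightManagementClient
from azure.mgmt.hdinsight.models import ExecuteScriptActionParameters, RuntimeScriptAction

subscription_id = "<subscription-id>"
resource_group = "<resource-group>"
cluster_name = "<cluster-name>"

client = HDInsightManagementClient(DefaultAzureCredential(), subscription_id)

script_action = RuntimeScriptAction(
    name="autoscale-custom-config",  # friendly name shown in the portal's script action history
    uri="https://<storage-account>.blob.core.windows.net/scripts/<config-script>.sh",  # the script linked above
    roles=["headnode"],  # the configs are applied on the headnodes
    parameters="<arguments documented in the readme>",  # e.g. the config names/values to change
)

poller = client.clusters.begin_execute_script_actions(
    resource_group,
    cluster_name,
    ExecuteScriptActionParameters(
        script_actions=[script_action],
        persist_on_success=True,  # keep the script so it also runs on nodes added later by autoscale
    ),
)
poller.result()  # wait for the script action to finish
```

The same script action can equally be submitted from the Azure portal or with the Azure CLI (az hdinsight script-action execute), whichever fits the existing deployment workflow.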

Next steps

Read the guidelines for scaling clusters manually in Scaling guidelines.