Spark pools: dynamicExecutorAllocation parameter
Hello,
I'm experimenting with Synapse Spark pools and I've noticed that there's a dynamicExecutorAllocation parameter that is available when creating a Spark pool via the REST API but not via the UI.
When I set dynamicExecutorAllocation.enabled to true, Spark's behaviour seems slightly inconsistent. When I check the Spark configuration at job startup, sometimes I see that spark.dynamicAllocation.enabled is set to true and spark.dynamicAllocation.maxExecutors is unset, which I guess is the expected behaviour, but the parameter spark.dynamicAllocation.disableIfMinMaxNotSpecified.enabled=true is also present. The job still executes on the minimum number of executors, and I'm not sure whether that's because it's the optimal configuration or because of spark.dynamicAllocation.disableIfMinMaxNotSpecified.enabled. Is there any information about this?
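(One way to do the startup check described above is to dump the effective dynamic-allocation settings from inside the job; a minimal PySpark sketch, assuming the spark session that Synapse notebooks provide:)

    # Minimal sketch: list the effective spark.dynamicAllocation.* settings
    # for the current session ("spark" is the session object Synapse provides).
    for key, value in spark.sparkContext.getConf().getAll():
        if key.startswith("spark.dynamicAllocation"):
            print(key, "=", value)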
Also, I tried doing the same on a different pool, and when I set dynamicExecutorAllocation.enabled=true there, the Spark configuration at job execution still shows spark.dynamicAllocation.enabled=false. From the Spark pool JSON settings:

    "dynamicExecutorAllocation": {
        "enabled": true
    },

Are there any additional settings? Or am I using the dynamicExecutorAllocation property incorrectly?
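(For reference, later revisions of the Synapse bigDataPools REST schema also let this block carry explicit bounds; a sketch, assuming an api-version that accepts the minExecutors/maxExecutors fields:)

    "dynamicExecutorAllocation": {
        "enabled": true,
        "minExecutors": 2,
        "maxExecutors": 5
    },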
Thanks!
3 answers
Saurabh Sharma (Microsoft Employee)
2021-02-11T00:24:15.357+00:00

@Irene Sorry for the delay. I am still following up internally with the products team on this behavior, but in the meantime, can you please set spark.dynamicExecutionAllocation.enabled in the session start payload even though dynamic executor allocation is enabled at the pool level? You can set it in a notebook via a magic command (doc). If you are using another HTTP client, the payload should follow the Livy protocol: https://github.com/cloudera/livy#request-body. Please let me know how it goes.
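(For anyone following the Livy link: a minimal sketch of what such a session-start payload might look like, per the request-body format documented there; the specific conf keys and values shown are an assumption, not confirmed by this thread:)

    POST /sessions
    {
        "kind": "pyspark",
        "conf": {
            "spark.dynamicAllocation.enabled": "true",
            "spark.dynamicAllocation.minExecutors": "2",
            "spark.dynamicAllocation.maxExecutors": "5"
        }
    }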
Martin B
2021-03-26T09:36:09.423+00:00

Hello,
I noticed the same issue and tried the workaround of overriding the configuration via the first cell in the notebook:

    %%configure -f
    {
        "conf": {
            "spark.dynamicAllocation.disableIfMinMaxNotSpecified.enabled": true,
            "spark.dynamicAllocation.enabled": true,
            "spark.dynamicAllocation.minExecutors": 2,
            "spark.dynamicAllocation.maxExecutors": 5
        }
    }

This works when executing the notebook within a Synapse Studio Develop tab.
However, I noticed that when executing the notebook from a Synapse pipeline, the configuration change seems to have no effect: the Spark UI shows spark.dynamicAllocation.enabled=false in the Environment tab.
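(A quick way to confirm whether the override took effect in a pipeline run is to read the setting back from the session in a later cell; a minimal PySpark sketch, again assuming the Synapse-provided spark session:)

    # Print the effective value; expect "true" if the %%configure override applied.
    print(spark.conf.get("spark.dynamicAllocation.enabled"))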
Sumit Kumar
2021-06-16T13:08:33.917+00:00

@Saurabh Sharma I came upon this question while trying to solve the same problem: I'm unable to dynamically increase the number of executors beyond 2. I tried @Martin B's suggestion. It worked initially, but now it's throwing an error when executing the notebook.
Is there a final answer on how these parameters can be set for a pool?