Hello @Mehulsinh Vaghela ,
Welcome to the MS Q&A platform.
The reason for the memory bottleneck can be any of the following:
- The driver instance type is not optimal for the load executed on the driver.
- There are memory-intensive operations executed on the driver.
- There are many notebooks or jobs running in parallel on the same cluster.
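To confirm which of these applies, one quick check is to scan the driver's stdout log for GC allocation-failure entries. A minimal sketch — the file path and log line below are illustrative stand-ins, not your cluster's actual driver log:

```shell
# Write one sample GC line in the format the JVM prints to the driver's stdout
# (illustrative content only):
printf '[GC (Allocation Failure)  2048M->1024M(8192M), 0.45 secs]\n' > /tmp/driver_stdout_sample.log

# Count allocation-failure events; frequent hits suggest driver memory pressure
grep -c 'Allocation Failure' /tmp/driver_stdout_sample.log
```

On a real cluster you would run the same `grep` against the driver's stdout log instead of a sample file.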
Here’s our recommendation for GC allocation failure issues:
- If a large amount of data is being sent to the driver, increase the driver memory. You can confirm this from the Ganglia metrics and the driver logs (stdout).
- If multiple notebooks are attached to the cluster and all of them send data to the driver at the same time, the above recommendation should be adjusted accordingly; otherwise, avoid the operations that send data to the driver.
- Avoid running batch jobs on a shared interactive cluster.
- Distribute the workloads across different clusters. No matter how big a cluster is, the work done by the Spark driver cannot be distributed within that cluster.
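As a sketch of the first recommendation: driver memory can be raised either by choosing a larger driver instance type, or through the cluster's Spark configuration. The values below are illustrative assumptions, not tuned recommendations for your workload:

```
# Heap available to the driver JVM (illustrative value)
spark.driver.memory 16g
# Upper bound on total result size collected back to the driver (illustrative)
spark.driver.maxResultSize 4g
```

Both are standard Spark properties; on Databricks, picking a larger driver node type is usually the simpler route.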
Hope this helps. Please let us know if you have any further queries.
------------------------------
- Please don't forget to click on the "Accept Answer" or upvote button whenever the information provided helps you. Original posters help the community find answers faster by identifying the correct answer. Here is how
- Want a reminder to come back and check responses? Here is how to subscribe to a notification
- If you are interested in joining the VM program and helping shape the future of Q&A: Here is how you can be part of Q&A Volunteer Moderators