Hello @Mehulsinh Vaghela ,
Welcome to the MS Q&A platform.
The reason for the memory bottleneck can be any of the following:
- The driver instance type is not optimal for the load executed on the driver.
- There are memory-intensive operations executed on the driver.
- There are many notebooks or jobs running in parallel on the same cluster.
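To confirm which of these applies, one quick check is to scan the driver's stdout log for GC allocation-failure entries. A minimal sketch — the file path and log line below are illustrative stand-ins, not your cluster's actual driver log:

```shell
# Write one sample GC line in the format the JVM prints to the driver's stdout
# (illustrative content only):
printf '[GC (Allocation Failure)  2048M->1024M(8192M), 0.45 secs]\n' > /tmp/driver_stdout_sample.log

# Count allocation-failure events; frequent hits suggest driver memory pressure
grep -c 'Allocation Failure' /tmp/driver_stdout_sample.log
```

On a real cluster you would run the same `grep` against the driver's stdout log instead of a sample file.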
Here’s our recommendation for GC allocation failure issues:
- If a large amount of data is being sent to the driver, increase the driver memory. You can confirm this from the Ganglia metrics and the driver logs (stdout).
- If multiple notebooks are attached to the cluster and all of them send data to the driver at the same time, the above recommendation should be adjusted accordingly; otherwise, avoid the operations that send data to the driver.
- Avoid running batch jobs on a shared interactive cluster.
- Distribute the workloads across different clusters. No matter how big a cluster is, the work done by the Spark driver cannot be distributed within that cluster.
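As a sketch of the first recommendation: driver memory can be raised either by choosing a larger driver instance type, or through the cluster's Spark configuration. The values below are illustrative assumptions, not tuned recommendations for your workload:

```
# Heap available to the driver JVM (illustrative value)
spark.driver.memory 16g
# Upper bound on total result size collected back to the driver (illustrative)
spark.driver.maxResultSize 4g
```

Both are standard Spark properties; on Databricks, picking a larger driver node type is usually the simpler route.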
Hope this helps. Please let us know if you have any further queries.
------------------------------
- Please don't forget to click on the "Accept Answer" or upvote button whenever the information provided helps you. Original posters help the community find answers faster by identifying the correct answer. Here is how
- Want a reminder to come back and check responses? Here is how to subscribe to a notification
- If you are interested in joining the VM program and helping shape the future of Q&A: Here is how you can be part of Q&A Volunteer Moderators