I have a similar problem, and the root question does not seem to be getting answered: why is so much memory being used in the first place? I am seeing the exact same situation with a very simple join on a 3x9 PySpark dataframe. As soon as I run anything at all, memory consumption jumps to the levels shown above. Where is all this memory usage coming from? The same code ran fine a week ago.
How to reduce unnecessarily high memory usage in a Databricks cluster?
We are seeing unnecessarily high memory usage even when nothing is running on the cluster. When the cluster first starts, it is fine, but after I run a script and it finishes executing, memory never returns to the idle (initial) state, even hours after the last execution.
Cluster config:
Some settings I tried:
Spark Config:
spark.executor.extraJavaOptions -XX:+UseG1GC -XX:MaxGCPauseMillis=500 -XX:ParallelGCThreads=20 -XX:ConcGCThreads=5 -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:G1HeapRegionSize=8M
spark.driver.extraJavaOptions -XX:+UseG1GC -XX:MaxGCPauseMillis=500 -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=10M -Xloggc:/databricks/driver/logs/gc.log -XX:G1HeapRegionSize=8M -XX:+ExplicitGCInvokesConcurrent
spark.memory.fraction 0.6
spark.memory.storageFraction 0.5
spark.dynamicAllocation.enabled true
spark.dynamicAllocation.minExecutors 1
spark.dynamicAllocation.maxExecutors 10
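To verify which of these values actually take effect at runtime, a sketch like the following can be run in a notebook cell (a minimal sketch, assuming the Databricks-predefined spark session; keys that were not applied fall back to "unset"):

```python
# Minimal sketch: print the memory- and allocation-related settings the
# running SparkSession actually picked up. Assumes a Databricks notebook
# where `spark` is predefined.
keys = [
    "spark.memory.fraction",
    "spark.memory.storageFraction",
    "spark.dynamicAllocation.enabled",
    "spark.dynamicAllocation.minExecutors",
    "spark.dynamicAllocation.maxExecutors",
    "spark.executor.extraJavaOptions",
    "spark.driver.extraJavaOptions",
]
for key in keys:
    print(key, "=", spark.conf.get(key, "unset"))
```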
6 answers
-
Luigi Greselin 5 Reputation points
2024-06-18T07:47:33.34+00:00 I am experiencing the same problem. I even switched from a 28 GB memory cluster to a 56 GB one, and it still gets completely full after the first run.
-
PRADEEPCHEEKATLA-MSFT 90,221 Reputation points Microsoft Employee
2024-05-10T03:29:28.78+00:00 @Senad Hadzikic - If you want to release the cached memory in your Databricks cluster without restarting the cluster itself, you can try the following steps:
- Use the spark.catalog.clearCache() method to clear the cached data in Spark. This removes all cached data from memory and disk. You can run it in a notebook cell (see the combined sketch after this list).
- Use the dbutils.fs.unmount() method to unmount any mounted file systems you no longer need. Mounted file systems can consume memory, so unmounting them can help free some of it. You can run this method in a notebook cell.
- Use the sync command to flush the file system buffers. You can run this command in a notebook shell cell.
- Use echo 3 > /proc/sys/vm/drop_caches to drop the page cache, dentries, and inodes. This can free memory held by the operating system cache, but it requires root access, so you might need to contact your Databricks administrator to run it.
- Consider using a different type of Databricks cluster. For example, you might try a different instance type or a different number of nodes to see whether this improves memory usage.
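A minimal combined sketch of the notebook-side steps above (assuming a Databricks notebook, where spark and dbutils are predefined; the mount filter is illustrative and should be adjusted to your workspace before running):

```python
# Minimal sketch of the cleanup steps above, for a Databricks notebook
# where `spark` and `dbutils` are predefined.

# 1. Remove all cached tables/DataFrames from memory and disk.
spark.catalog.clearCache()

# 2. Unmount file-system mounts that are no longer needed.
#    Illustrative filter: only user mounts under /mnt/; adjust to your workspace.
for mount in dbutils.fs.mounts():
    if mount.mountPoint.startswith("/mnt/"):
        dbutils.fs.unmount(mount.mountPoint)

# 3. The OS-level steps (flush buffers, drop the page cache) run on the
#    driver in a separate shell cell and need root access:
#
#    %sh
#    sync
#    echo 3 > /proc/sys/vm/drop_caches
```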
Note that these steps might not free up all of the memory being used by your Databricks cluster, but they can help free up some of it. If you are still seeing high memory usage after trying them, consider opening a support ticket for further assistance.
Hope this helps. Do let us know if you have any further queries.
-
Alex 0 Reputation points
2024-07-08T00:53:04.0066667+00:00 Same problem here. I did the same as the user in the previous comment and increased the cluster memory, but it still shows almost 100% memory utilization long after the notebook script has completed. Ironically, clearing the cache does not help; the only way to resolve it is to restart the cluster.
-
Alex 0 Reputation points
2024-08-05T19:59:13.4333333+00:00 Hi @PRADEEPCHEEKATLA-MSFT, is anybody at Microsoft looking into this? If not, can you point us to the right Databricks forum or contact? It has been months without a solution.