How to reduce unnecessary high memory usage in a Databricks cluster?

Senad Hadzikic 20 Reputation points
2024-05-08T08:58:46.4433333+00:00

We are seeing unnecessarily high memory usage even when nothing is running on the cluster. When the cluster first starts, memory usage is fine, but after I run a script and it finishes executing, usage never returns to the idle (initial) state, even hours after the last execution.

Screenshot 2024-05-08 at 10.53.08

Cluster config:
Screenshot 2024-05-08 at 10.56.09

Some settings I tried:
Screenshot 2024-05-08 at 10.56.41

Spark Config:
spark.executor.extraJavaOptions -XX:+UseG1GC -XX:MaxGCPauseMillis=500 -XX:ParallelGCThreads=20 -XX:ConcGCThreads=5 -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:G1HeapRegionSize=8M
spark.memory.storageFraction 0.5
spark.dynamicAllocation.maxExecutors 10
spark.driver.extraJavaOptions -XX:+UseG1GC -XX:MaxGCPauseMillis=500 -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=10M -Xloggc:/databricks/driver/logs/gc.log -XX:G1HeapRegionSize=8M -XX:+ExplicitGCInvokesConcurrent
spark.dynamicAllocation.enabled true
spark.memory.fraction 0.6
spark.dynamicAllocation.minExecutors 1

Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.

3 answers

  1. PRADEEPCHEEKATLA-MSFT 83,301 Reputation points Microsoft Employee
    2024-05-10T03:29:28.78+00:00

    @Senad Hadzikic - If you want to release the cached memory in your Databricks cluster without restarting the cluster itself, you can try the following steps:

    • Use the spark.catalog.clearCache() method to clear the cached data in Spark. This method removes all cached tables and DataFrames from Spark's cache. You can run it in a notebook cell.
    • Use the dbutils.fs.unmount() method, passing the mount point (for example, dbutils.fs.unmount("/mnt/<mount-name>")), to unmount any mounted file systems you no longer need. Mounted file systems may hold some memory, so unmounting them can help free it. You can run this method in a notebook cell.
    • Use the sync command to flush the file system buffers to disk, so that the memory they occupy can be reclaimed. You can run this command in a notebook cell with the %sh magic (%sh sync).
    • Use the echo 3 > /proc/sys/vm/drop_caches command to drop the page cache, dentries, and inodes. This command can help free up memory that is being used by the operating system cache. However, it requires root access, so you might need to contact your Databricks administrator to run it.
    • Consider using a different type of Databricks cluster. For example, you might try using a different instance type or a different number of nodes to see if this improves memory usage.
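
As a rough illustration of the first step, the sketch below wraps spark.catalog.clearCache() in a small helper. The helper name release_spark_cache is hypothetical, and spark is assumed to be the SparkSession that Databricks notebooks predefine. The gc.collect() call only affects the Python driver process; it does not force the JVM to return heap to the operating system, so the memory chart may not drop even when the cache is empty.

```python
import gc


def release_spark_cache(spark):
    """Clear Spark's cache, then run Python's garbage collector.

    `spark` is assumed to be an active SparkSession (predefined in
    Databricks notebooks). The function name is hypothetical.
    """
    # Drops every cached table/DataFrame registered with this session.
    spark.catalog.clearCache()
    # Collects unreachable Python objects on the driver only; returns the
    # number of objects found. The JVM heap is not affected by this call.
    return gc.collect()
```

In a notebook cell this would be called as release_spark_cache(spark).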

    Note that these steps might not free up all of the memory that is being used by your Databricks cluster, but they can help free up some memory. If you are still experiencing high memory usage after trying these steps, you might need to consider opening a support ticket for further assistance.
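
To see how much of the remaining "used" memory is operating system cache rather than memory held by processes, one option is to read /proc/meminfo on the driver node. The sketch below is a rough approximation under stated assumptions: the function names parse_meminfo and memory_breakdown are hypothetical, and the split follows the same arithmetic as the Linux free command (cache = Buffers + Cached + SReclaimable). Part of Cached, such as shared memory, is not actually reclaimable, so treat the numbers as an estimate.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field_name: integer_value}.

    Most fields are reported in kB; a few (e.g. HugePages_Total) are counts.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def memory_breakdown(fields):
    """Split total memory into free, cache, and process-held parts (kB)."""
    total = fields["MemTotal"]
    free = fields["MemFree"]
    # Same definition of buffers/cache that the `free` command uses.
    cache = (
        fields.get("Buffers", 0)
        + fields.get("Cached", 0)
        + fields.get("SReclaimable", 0)
    )
    return {
        "total_kb": total,
        "free_kb": free,
        "cache_kb": cache,
        "used_kb": total - free - cache,
    }


# Usage on a Linux node (e.g. in a notebook cell on the driver):
#   with open("/proc/meminfo") as f:
#       print(memory_breakdown(parse_meminfo(f.read())))
```

If cache_kb accounts for most of the gap between the idle state and the current state, the memory is held by the kernel and is normally released on demand; if used_kb is large, a process (typically the JVM) is holding it.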

    Hope this helps. Do let us know if you have any further queries.


  2. Ben Gislason 0 Reputation points
    2024-05-24T12:27:24.49+00:00

    I have a similar problem, and it does not seem like the root question is being answered: why is so much memory in use in the first place? I am experiencing the exact same situation with a very simple join operation on a 3x9 PySpark DataFrame. As soon as I click run, I see the same memory consumption as shown above. Where is all this memory usage coming from? The same code ran fine a week ago.


  3. Luigi Greselin 0 Reputation points
    2024-06-18T07:47:33.34+00:00

    I am experiencing the same problem. I even switched from a 28 GB memory cluster to a 56 GB one. It gets completely full after the first run.
