I think you are missing some steps:
It looks like you're trying to configure caching for an Azure Databricks compute cluster. The settings in your snippet aren't valid Spark configurations, which is likely why it isn't working as expected.
If you're not already working with a cluster, you'll need to create one first.
The Databricks Runtime has built-in support for disk caching, and you can configure it through Spark configurations. Here's an example that sets caching configurations using SparkConf:
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

val conf = new SparkConf()
  .set("spark.databricks.io.cache.enabled", "true")               // turn the disk cache on
  .set("spark.databricks.io.cache.maxDiskUsage", "50g")           // per-node limit for cached data
  .set("spark.databricks.io.cache.maxMetaDataCache", "1g")        // per-node limit for cached metadata (note the capital D)
  .set("spark.databricks.io.cache.compression.enabled", "false")  // store cached data uncompressed

val spark = SparkSession.builder().config(conf).getOrCreate()
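Note that the size-related settings above (maxDiskUsage, maxMetaDataCache) generally need to be set in the cluster's Spark config at cluster creation; on an already-running cluster you can still toggle the cache on and off at runtime. A minimal sketch, assuming a live `spark` session in a Databricks notebook:

```scala
// Enable or disable the disk cache on a running cluster.
// This flag is safe to flip at runtime; the size limits are not.
spark.conf.set("spark.databricks.io.cache.enabled", "true")

// Check the current value.
println(spark.conf.get("spark.databricks.io.cache.enabled"))
```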
You can also cache individual DataFrames in Databricks using the cache() method:
val df = spark.read.parquet("path/to/your/data")
df.cache()
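Keep in mind that cache() is lazy: nothing is stored until an action runs over the data, and you should release the cache when you no longer need it. A sketch of the full lifecycle (the parquet path is a placeholder, not a real location):

```scala
// Read, cache, and materialize the cache with an action.
val df = spark.read.parquet("path/to/your/data")
df.cache()       // marks df for caching; lazy, nothing happens yet
df.count()       // first action materializes the cache

// Subsequent actions read from the cache instead of re-scanning parquet.
df.filter($"amount" > 100).count()

// Free the cached blocks when done.
df.unpersist()
```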
Finally, you can monitor and manage caching from the Databricks UI (and the Spark UI's Storage tab), which shows cache status, storage levels, and more.