Unable to migrate to the latest Databricks Runtime 13.3 LTS because of an Encoders.bean issue in its Spark 3.4.1

Oleksiy Rudenko 5 Reputation points
2023-12-14T13:03:27.83+00:00

Our project currently uses Databricks Runtime 12.2 LTS with Spark 3.3.2, and we decided to upgrade to the latest LTS version, 13.3, which uses Spark 3.4.1.

However, when we tested the upgrade in our test environment, multiple jobs started failing with an error in Encoders.bean:

23/12/14 10:40:57 ERROR Uncaught throwable from user code: java.util.NoSuchElementException: None.get
	at scala.None$.get(Option.scala:529)
	at scala.None$.get(Option.scala:527)
	at org.apache.spark.sql.catalyst.DeserializerBuildHelper$.$anonfun$createDeserializer$8(DeserializerBuildHelper.scala:404)
	at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286)
	at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)
	at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)
	at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:38)
	at scala.collection.TraversableLike.map(TraversableLike.scala:286)
	at scala.collection.TraversableLike.map$(TraversableLike.scala:279)
	at scala.collection.AbstractTraversable.map(Traversable.scala:108)
	at org.apache.spark.sql.catalyst.DeserializerBuildHelper$.createDeserializer(DeserializerBuildHelper.scala:393)
	at org.apache.spark.sql.catalyst.DeserializerBuildHelper$.createDeserializer(DeserializerBuildHelper.scala:228)
	at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$.apply(ExpressionEncoder.scala:61)
	at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$.javaBean(ExpressionEncoder.scala:67)
	at org.apache.spark.sql.Encoders$.bean(Encoders.scala:181)
	at org.apache.spark.sql.Encoders.bean(Encoders.scala)

This issue is registered in the Apache Spark JIRA as SPARK-45081, "Encoders.bean does no longer work with read-only properties".
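
For anyone who wants to check whether their own bean classes might be affected, here is a minimal sketch. It assumes the failure is triggered by properties that have a getter but no setter, as the JIRA title suggests. The class `ReadOnlyPropertyCheck`, the helper `readOnlyProperties`, and the `Order` bean are hypothetical names made up for illustration. It uses only the JDK's `java.beans` introspection, not Spark itself, so it approximates what the bean encoder sees rather than reproducing Spark's logic.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.ArrayList;
import java.util.List;

public class ReadOnlyPropertyCheck {

    // Hypothetical bean for illustration: "total" has a getter but no setter.
    public static class Order {
        private int quantity;

        public int getQuantity() { return quantity; }
        public void setQuantity(int quantity) { this.quantity = quantity; }

        // Derived, read-only property (no matching setTotal).
        public int getTotal() { return quantity * 2; }
    }

    // Returns the names of properties that have a getter but no setter.
    public static List<String> readOnlyProperties(Class<?> beanClass) {
        try {
            // Stop at Object so the built-in "class" property is not reported.
            BeanInfo info = Introspector.getBeanInfo(beanClass, Object.class);
            List<String> result = new ArrayList<>();
            for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
                if (pd.getReadMethod() != null && pd.getWriteMethod() == null) {
                    result.add(pd.getName());
                }
            }
            return result;
        } catch (IntrospectionException e) {
            throw new IllegalStateException("Cannot introspect " + beanClass, e);
        }
    }

    public static void main(String[] args) {
        // Prints [total]
        System.out.println(readOnlyProperties(Order.class));
    }
}
```

Running a check like this over the classes passed to Encoders.bean may help narrow down which jobs are likely to hit the error on 13.3 LTS. Treat it as a rough triage aid only.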

It has been resolved, and the fix is included in Spark 3.4.2 (released on Nov 30, 2023) and 3.5.1.

It appears that the update to Spark 3.4.2 is critical and needs to be applied to Databricks as soon as possible.

Azure Databricks