I'm glad you were able to resolve your issue, and thank you for posting your solution so that others experiencing the same issue can easily reference it! Since the Microsoft Q&A community has a policy that "The question author cannot accept their own answer. They can only accept answers by others," I'll repost your solution in case you'd like to accept the answer.
Ask: The team runs data analytics pipelines in a Synapse workspace with Azure Cosmos DB as the data source (store type: Analytical). The pipeline jobs started failing a few days ago with the error below, and when I checked the Data Flows, I got the same error in the Data Preview blade.
Key Details:
- Error Message:
java.lang.NoSuchMethodError: com.azure.data.cosmos.serialization.hybridrow.RowBuffer.<init>
- Spark Version: 3.3
- Connector Involved: Azure Cosmos DB analytical store (Synapse Link) Spark connector
at Source 'source1': an error occurred during snapshot metadata read phase - Job aborted due to stage failure: Task 0 in stage 16.0 failed 1 times, most recent failure: Lost task 0.0 in stage 16.0 (TID 16) (vm-92f73742 executor 1): java.lang.NoSuchMethodError: com.azure.data.cosmos.serialization.hybridrow.RowBuffer.<init>(Lcosmosdb_shaded/io/netty/buffer/ByteBuf;Lcom/azure/data/cosmos/serialization/hybridrow/HybridRowVersion;Lcom/azure/data/cosmos/serialization/hybridrow/layouts/LayoutResolver;)V
at shaded.msdataflow.com.microsoft.azure.cosmos.analytics.spark.connector.hybridrow.HybridRowObjectMapper.read(HybridRowObjectMapper.java:59)
at shaded.msdataflow.com.microsoft.azure.cosmos.analytics.spark.connector.store.alos.ALoSFileManager.fetchRoot(ALoSFileManager.scala:443)
at shaded.msdataflow.com.microsoft.azure.cosmos.analytics.spark.connector.store.alos.ALoSFileManager.$anonfun$rootSegment$3(ALoSFileManager.scala:175)
at shaded.msdataflow.com.microsoft.azure.cosmos.analytics.spark.connector.store.alos.ALoSFileManager.retryHybridRowDeserialization(ALoSFileManager.scala:456)
at shaded.msdataflow.com.microsoft.azure.cosmos.analytics.spark.connector.store.alos.ALoSFileManager.$anonfun$rootSegment$2(ALoSFileManager.scala:175)
at shaded.msdataflow.com.microsoft.azure.cosmos.analytics.spark.connector.utils.ResourceUtils$.withResources(ResourceUtils.scala:54)
at shaded.msdataflow.com.microsoft.azure.cosmos.analytics.spark.connector.store.alos.ALoSFileManager.rootSegment(ALoSFileManager.scala:174)
at shaded.msdataflow.com.microsoft.azure.cosmos.analytics.spark.connector.store.alos.metadata.MapRootSegmentToFileSegmentsInfo$.processRootSegment(FileSegmentMetadata.scala:327)
at shaded.msdataflow.com.microsoft.azure.cosmos.analytics.spark.connector.store.alos.metadata.MapRootSegmentToFileSegmentsInfo$.$anonfun$run$5(FileSegmentMetadata.scala:188)
at scala.collection.Iterator$$anon$10.next(Iterator.scala:461)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:514)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:491)
at scala.collection.Iterator.foreach(Iterator.scala:943)
at scala.collection.Iterator.foreach$(Iterator.scala:943)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
at scala.collection.generic.Growable.$plus$plus$eq(Growable.scala:62)
at scala.collection.generic.Growable.$plus$plus$eq$(Growable.scala:53)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:105)
at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:49)
at scala.collection.TraversableOnce.to(TraversableOnce.scala:366)
at scala.collection.TraversableOnce.to$(TraversableOnce.scala:364)
at scala.collection.AbstractIterator.to(Iterator.scala:1431)
at scala.collection.TraversableOnce.toBuffer(TraversableOnce.scala:358)
at scala.collection.TraversableOnce.toBuffer$(TraversableOnce.scala:358)
at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1431)
at scala.collection.TraversableOnce.toArray(TraversableOnce.scala:345)
at scala.collection.TraversableOnce.toArray$(TraversableOnce.scala:339)
at scala.collection.AbstractIterator.toArray(Iterator.scala:1431)
at org.apache.spark.rdd.RDD.$anonfun$collect$2(RDD.scala:1028)
at org.apache.spark.SparkContext.$anonfun$runJob$5(SparkContext.scala:2448)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
at org.apache.spark.scheduler.Task.run(Task.scala:136)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:548)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1504)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:551)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
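As a side note for anyone who hits the same error: a NoSuchMethodError at runtime generally indicates a library-version mismatch inside the Data Flow runtime rather than a problem with the pipeline definition itself. One way to check whether the analytical store is readable at all, independently of the Data Flow, is a quick read from a Synapse Spark notebook. The snippet below is only a diagnostic sketch; CosmosDbLinkedService and MyContainer are placeholder names for your own linked service and container.

```scala
// Minimal diagnostic sketch for a Synapse Spark notebook (Spark 3.x pool).
// "CosmosDbLinkedService" and "MyContainer" are placeholders; substitute the
// linked service and container names from your own workspace.
val df = spark.read
  .format("cosmos.olap")                                          // analytical store (Synapse Link) read
  .option("spark.synapse.linkedService", "CosmosDbLinkedService")
  .option("spark.cosmos.container", "MyContainer")
  .load()

df.printSchema()  // verifies that the snapshot metadata can be read
df.show(10)       // forces an actual read of a few rows
```

If the notebook read succeeds while the Data Flow still fails, that points at the Data Flow runtime rather than the analytical store itself.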
Solution: Transient issue
I haven't made any changes on our end, but the data flow started working again on its own, so it appears to have been something transient on the Azure side.
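Since the failure cleared on its own, anyone reproducing this from a Spark notebook while the service recovers could also wrap the read in a small retry loop to ride out similar transient failures. This is a hypothetical helper, not an official API, and the linked service and container names are again placeholders.

```scala
// Hypothetical retry helper (not part of any official SDK): re-runs the block a
// few times with a fixed delay if a transient failure, such as the snapshot
// metadata read error above, is thrown.
def withRetries[T](attempts: Int, delayMs: Long)(op: => T): T =
  try op
  catch {
    case e: Throwable if attempts > 1 =>
      println(s"Read failed (${e.getClass.getSimpleName}); retrying in $delayMs ms, ${attempts - 1} attempt(s) left")
      Thread.sleep(delayMs)
      withRetries(attempts - 1, delayMs)(op)
  }

val rowCount = withRetries(attempts = 3, delayMs = 30000) {
  spark.read
    .format("cosmos.olap")
    .option("spark.synapse.linkedService", "CosmosDbLinkedService") // placeholder
    .option("spark.cosmos.container", "MyContainer")                // placeholder
    .load()
    .count()                                                        // force the read inside the retry block
}
println(s"Analytical store row count: $rowCount")
```

For scheduled pipelines, the activity's built-in retry settings can serve the same purpose without any code.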
If I missed anything, please let me know and I'd be happy to add it to my answer, or feel free to comment below with any additional information.
If you have any other questions, please let me know. Thank you again for your time and patience throughout this issue.
Please don't forget to click "Accept Answer" and "Yes" for "Was this answer helpful?" wherever the information provided helps you, as this can be beneficial to other community members.