An error occurred: job aborted due to stage failure

Santhosh kumar 20 Reputation points
2023-07-03T06:39:48.19+00:00

Hi team,

I am getting the below error. Please help with this. Thanks in advance.

org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 819.0 failed 4 times, most recent failure: Lost task 3.3 in stage 819.0 (TID 12768, 10.236.40.34, executor 14): java.util.NoSuchElementException
    at org.apache.spark.sql.vectorized.ColumnarBatch$1.next(ColumnarBatch.java:69)
    at org.apache.spark.sql.vectorized.ColumnarBatch$1.next(ColumnarBatch.java:58)
    at scala.collection.convert.Wrappers$JIteratorWrapper.next(Wrappers.scala:44)
    at org.apache.spark.sql.execution.arrow.ArrowConverters$$anon$2.next(ArrowConverters.scala:224)
    at org.apache.spark.sql.execution.arrow.ArrowConverters$$anon$2.next(ArrowConverters.scala:205)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:733)
    at org.apache.spark.sql.execution.collect.UnsafeRowBatchUtils$.encodeUnsafeRows(UnsafeRowBatchUtils.scala:80)
    at org.apache.spark.sql.execution.collect.Collector.$anonfun$processFunc$1(Collector.scala:187)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
    at org.apache.spark.scheduler.Task.doRunTask(Task.scala:144)
    at org.apache.spark.scheduler.Task.run(Task.scala:117)
    at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$9(Executor.scala:663)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1592)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:666)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)

Driver stacktrace:
    at org.apache.spark.scheduler.DAGScheduler.failJobAndIndependentStages(DAGScheduler.scala:2560)
    at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2(DAGScheduler.scala:2507)
    at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2$adapted(DAGScheduler.scala:2501)
    at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
    at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:2501)
    at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1(DAGScheduler.scala:1193)
    at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1$adapted(DAGScheduler.scala:1193)
    at scala.Option.foreach(Option.scala:407)
    at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:1193)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2762)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2709)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2697)
    at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
    at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:983)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:2339)
    at org.apache.spark.sql.execution.collect.Collector.runSparkJobs(Collector.scala:298)
    at org.apache.spark.sql.execution.collect.Collector.collect(Collector.scala:308)
    at org.apache.spark.sql.execution.collect.Collector$.collect(Collector.scala:82)
    at org.apache.spark.sql.execution.collect.Collector$.collect(Collector.scala:88)
    at org.apache.spark.sql.execution.ResultCacheManager.getOrComputeResult(ResultCacheManager.scala:508)
    at org.apache.spark.sql.execution.CollectLimitExec.executeCollectResult(limit.scala:58)
    at org.apache.spark.sql.Dataset.collectResult(Dataset.scala:2994)
    at org.apache.spark.sql.Dataset.$anonfun$collectResult$1(Dataset.scala:2985)
    at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3709)
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withCustomExecutionEnv$5


Accepted answer
    PRADEEPCHEEKATLA 90,646 Reputation points Moderator
    2023-07-05T09:11:20.8333333+00:00

    @Santhosh kumar - Thanks for the question and for using the MS Q&A platform.

    It seems that the error is caused by a task failure in stage 819.0. The error message indicates that the task failed four times, the most recent failure being a java.util.NoSuchElementException. This exception is typically thrown when an element is requested from an empty collection or an exhausted iterator, for example by calling next() when no elements remain.
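
    As a minimal illustration in Scala (not the exact code path in your job), the sketch below shows how this exception class arises: calling next() on an iterator with no remaining elements throws java.util.NoSuchElementException, the same exception reported in your stack trace.

        // Minimal sketch: next() on an exhausted iterator throws
        // java.util.NoSuchElementException.
        val it = Iterator(1, 2, 3)
        while (it.hasNext) println(it.next()) // safe: guarded by hasNext
        it.next()                             // throws java.util.NoSuchElementException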

    To resolve this issue, you can try the following options:

    • Check the input data: Ensure that the input data is correct and complete. If there is any missing data, it can cause the task to fail.
    • Increase the resources: You can try increasing the resources allocated to the job. This can be done by increasing the number of nodes in the cluster or by increasing the memory and CPU allocated to each node.
    • Repartition the data: Repartitioning can help avoid data skew. If one partition is much larger than the others, the task processing it may need more memory than its executor has available, causing the task to fail. Repartitioning spreads the rows more evenly so that no single task carries a disproportionate memory load (see the sketch after this list).
    • Check the code: Check the code to ensure that there are no logical errors that could cause the task to fail.
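
    A minimal sketch of the repartitioning option, assuming a hypothetical table my_table and key column customer_id (substitute your own data; the partition count of 200 is only illustrative):

        import org.apache.spark.sql.SparkSession
        import org.apache.spark.sql.functions.spark_partition_id

        val spark = SparkSession.builder().getOrCreate()
        val df = spark.table("my_table") // hypothetical input table

        // Inspect current partition sizes to spot skew: one partition far
        // larger than the rest is a common cause of repeated task failures.
        df.groupBy(spark_partition_id().as("partition")).count().show()

        // Redistribute rows evenly, here by a well-distributed key, so that
        // no single task carries a disproportionate share of the data.
        val balanced = df.repartition(200, df("customer_id"))
        println(balanced.rdd.getNumPartitions) // 200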

    If none of the above steps work, please share the document or the steps you followed that led to the above error message, for further assistance.

    Hope this helps. Do let us know if you have any further queries.


    If this answers your query, do click "Accept Answer" and "Yes" for "Was this answer helpful?". And if you have any further queries, do let us know.

