Job failed due to reason: at Sink 'SinkX': java.lang.ArrayIndexOutOfBoundsException: 0

Francisco Dominguez 386 Reputation points
2022-03-24T08:43:00.857+00:00

Good morning,

We've been using our pipeline for quite a while now. It contains a data flow that parses files matching certain criteria from a folder. Since yesterday, we've been receiving a strange error, even though we haven't changed anything in the pipeline or the data flow. The error is the following:

Operation on target ImportTradeX failed: {"StatusCode":"DFExecutorUserError","Message":"Job failed due to reason: at Sink 'SinkX': java.lang.ArrayIndexOutOfBoundsException: 0","Details":"java.lang.ArrayIndexOutOfBoundsException: 0\n\tat com.microsoft.dataflow.spark.AbortExec.doExecute(AbortExec.scala:30)\n\tat org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:157)\n\tat org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:153)\n\tat org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:181)\n\tat org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)\n\tat org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:178)\n\tat org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:153)\n\tat org.apache.spark.sql.execution.InputAdapter.inputRDDs(WholeStageCodegenExec.scala:391)\n\tat com.microsoft.dataflow.spark.FineMetricExec.inputRDDs(MetricExec.scala:26)\n\tat org.apache.spark.sql.execution.ProjectExec.inputRDDs(basicPhysicalOperators.scala:46)\n\tat com.microsoft.dataflow.spark.FineMetricExec.inputRDDs(MetricExec.scala:26)\n\tat org.apache.spark.sql.execution.WholeStageCodegenEx"}

Our data flow contains a source (an Excel file that is usually empty), a derived column, an assert transformation to check for duplicates, and two sinks. Using the data preview feature, we see that the error comes up in the assert transformation, when we check for duplicates.
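To make the setup more concrete, the duplicate check is conceptually similar to the Python sketch below. This is only an illustration and not the actual data flow expression: the function name `find_duplicate_keys`, the `trade_id` key column, and the sample rows are all hypothetical. It also shows the case we usually have, where the source is empty.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def find_duplicate_keys(
    rows: Iterable[Dict[str, object]], key_columns: List[str]
) -> List[Tuple[object, ...]]:
    """Return the key tuples that occur more than once in rows.

    Hypothetical helper: it mirrors the idea of a duplicate check,
    not the real assert expression used in the data flow.
    """
    # Count how often each combination of key values appears.
    counts = Counter(tuple(row.get(col) for col in key_columns) for row in rows)
    # Keep only the keys seen more than once.
    return [key for key, n in counts.items() if n > 1]


# Empty source (the usual case for us): no rows, so no duplicates.
empty_result = find_duplicate_keys([], ["trade_id"])

# Made-up sample rows with one repeated key.
sample_rows = [
    {"trade_id": 1, "amount": 10.0},
    {"trade_id": 2, "amount": 5.0},
    {"trade_id": 1, "amount": 7.5},
]
sample_result = find_duplicate_keys(sample_rows, ["trade_id"])
```

In this sketch an empty input simply produces an empty list of duplicates, which is the behaviour we expected from the data flow as well.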

I'm happy to share more details to help diagnose this issue, but I'd rather not paste everything here. If you need the full information, I can send it by email on request.

Thanks in advance.

Azure Data Factory
