Data Factory pipeline failed - Failed to execute dataflow with internal server error

Jaime Sendra 6 Reputation points
2022-06-17T07:32:31.073+00:00

In our project we use a data flow to ingest data from a CSV file into a PostgreSQL database. Before inserting, we join the incoming data with the target table itself to detect which items have been updated, so that we can assign an update timestamp to them.
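For readers who want to see the shape of that logic, here is a minimal sketch in plain Python. It is not the actual data flow: the `mark_updates` function, the `id` key column, and the `updated_at` column are hypothetical names chosen for illustration, and leaving brand-new rows without a timestamp is an assumption.

```python
from datetime import datetime, timezone


def mark_updates(incoming, existing, key="id", now=None):
    """Left-join incoming rows to existing rows on `key` and stamp changed rows.

    All names here (key, updated_at) are illustrative assumptions.
    """
    now = now or datetime.now(timezone.utc)
    # Index the existing table by key so each lookup behaves like a join probe.
    existing_by_key = {row[key]: row for row in existing}
    result = []
    for row in incoming:
        old = existing_by_key.get(row[key])
        out = dict(row)
        if old is None:
            # No match in the target table: a new row (assumed to get no timestamp).
            out["updated_at"] = None
        else:
            # Matched row: compare every non-key column to detect a change.
            changed = any(
                old.get(col) != val for col, val in row.items() if col != key
            )
            out["updated_at"] = now if changed else old.get("updated_at")
        result.append(out)
    return result
```

On a table larger than 100 GB, this kind of self-join means reading the whole target table for every load, which may explain why removing the join makes the run succeed.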

The problem occurs in this join: if I remove it, the pipeline runs without problems. The database exceeds 100 GB. The error does not give me much information about what might be happening:

{"StatusCode":"DFInternalServerError","Message":"Job failed due to reason: Failed to execute dataflow with internal server error, please retry later. If issue persists, please contact Microsoft support for further assistance","Details":"org.apache.spark.SparkException: Job aborted.\n\tat org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:202)\n\tat org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:159)\n\tat org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:104)\n\tat org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:102)\n\tat org.apache.spark.sql.execution.command.DataWritingCommandExec.doExecute(commands.scala:122)\n\tat org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:157)\n\tat org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:153)\n\tat org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:181)\n\tat org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)\n\tat org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:1"}

[Image: 212348-datafactoryerror.png]

Has anyone run into the same problem recently?

Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

1 answer

  1. ShaikMaheer-MSFT 38,631 Reputation points Microsoft Employee Moderator
    2022-06-21T09:11:39.993+00:00

    Hi @Jaime Sendra ,

    Thank you for posting your query on the Microsoft Q&A platform.

    Errors of this kind are usually transient and may go away on retry. If that is not the case, the underlying data flow cluster creation may be causing the error.

    Please retry after some time and see if that helps. If it does not, I would encourage you to open a support ticket so the case can be investigated in more depth and resolved. The link below describes the steps to submit a support ticket.
    https://learn.microsoft.com/en-us/azure/azure-portal/supportability/how-to-create-azure-support-request
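
The "retry after some time" advice above can be sketched as a generic retry loop with exponential backoff. This is only an illustration under stated assumptions: `retry_with_backoff` and its parameters are hypothetical names, `operation` stands in for whatever call triggers the pipeline run, and nothing here is a Data Factory API.

```python
import time


def retry_with_backoff(operation, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call `operation` until it succeeds or `attempts` is exhausted.

    Waits base_delay, 2*base_delay, 4*base_delay, ... seconds between tries.
    `operation` is a hypothetical zero-argument callable.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return operation()
        except Exception as exc:  # a transient failure is assumed here
            last_error = exc
            if attempt < attempts - 1:
                # Double the wait after each failed attempt.
                sleep(base_delay * 2 ** attempt)
    # Every attempt failed: surface the last error, which is the point
    # at which opening a support ticket becomes the next step.
    raise last_error
```

Retrying only helps if the failure really is transient. A failure that reproduces on every attempt, as with the join described in the question, points to a different cause.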
