Data Flow Sink Fails with DF-Executor-InternalServerError and Spark Executor Timeout in ADF

Grace Ann Salvame 0 Reputation points
2025-06-25T06:02:51.6933333+00:00

I have a Data Flow activity in Azure Data Factory named DF_StageAccountInfo that was working previously without any issues. However, it suddenly started consistently failing with the following error:


Operation on target DF_StageAccountInfo failed: {"StatusCode":"DF-Executor-InternalServerError","Message":"Job failed due to reason at Sink 'stgAccountInfo': Failed to execute dataflow with internal server error, please retry later. If issue persists, please contact Microsoft support for further assistance","Details":"org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 43.0 failed 1 times, most recent failure: Lost task 0.0 in stage 43.0 (TID 37) (vm-55c14807 executor 2): ExecutorLostFailure (executor 2 exited caused by one of the running tasks) Reason: Executor heartbeat timed out after 171582 ms"} 

This issue occurs at the Sink transformation (stgAccountInfo), even though no recent changes were made to the Data Flow logic or the connected datasets.

Here are the actions I’ve already taken to troubleshoot this:

• Restarted the pipeline multiple times
• Created a new Integration Runtime
• Validated source and sink datasets and connections
• Ran the pipeline in Debug mode
• Changed the compute size to Medium and even to Large

Despite these efforts, the error persists consistently.

I suspect this may be an internal issue with the execution environment or Spark cluster. Please advise on further steps or whether Microsoft backend intervention is required.

Azure Data Factory

1 answer

  1. Krupal Bandari 770 Reputation points Microsoft External Staff Moderator
    2025-06-27T07:07:03.08+00:00

    Hi @Grace Ann Salvame
    Thanks for the detailed information and for the troubleshooting steps you have already tried; you've done a great job narrowing this down.

    The error "Executor heartbeat timed out after 171582 ms" usually indicates a problem within the Spark execution environment: an executor node became unresponsive or timed out during execution. Although the failure surfaces at the Sink (stgAccountInfo), the root cause is likely in the backend infrastructure rather than in any changes to your Data Flow logic.

    To help stabilize execution, first try enabling Sink staging if you are using Azure SQL, Synapse, or SQL Server as your sink. You can do this by opening the Sink transformation in your Data Flow, enabling the “Staged insert” option, and configuring a temporary staging linked service pointing to Azure Blob Storage. This approach buffers large writes and prevents overloading the sink during high-load Spark operations. You can find more details here: Enable Staged Insert – Azure SQL Database.
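    For reference, the staging configuration lives on the Execute Data Flow activity itself rather than inside the Data Flow. A minimal sketch of the activity JSON is below; the linked service name LS_BlobStaging and the folder path are placeholders you would replace with your own staging storage account:

    ```json
    {
      "name": "Run DF_StageAccountInfo",
      "type": "ExecuteDataFlow",
      "typeProperties": {
        "dataFlow": {
          "referenceName": "DF_StageAccountInfo",
          "type": "DataFlowReference"
        },
        "staging": {
          "linkedService": {
            "referenceName": "LS_BlobStaging",
            "type": "LinkedServiceReference"
          },
          "folderPath": "staging/accountinfo"
        }
      }
    }
    ```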

    Next, consider adding a Repartition transformation just before the Sink. Setting this to Round Robin or specifying a partition count such as 8 or 16 helps distribute data evenly across executor nodes, reducing the likelihood of data skew or executor crashes.
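    In the underlying data flow script, partitioning is applied on the transformation via partitionBy. A hedged sketch for the sink is below; previousStream stands in for whatever transformation feeds your sink, and the partition count of 16 is illustrative rather than a recommendation for your data volume:

    ```
    previousStream sink(allowSchemaDrift: true,
        validateSchema: false,
        partitionBy('roundRobin', 16)) ~> stgAccountInfo
    ```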

    Also, if you are currently using a fixed compute size like Medium or Large, try switching to AutoResolveIntegrationRuntime or set the Data Flow compute size to Auto. This enables the environment to scale automatically according to workload demands.
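    If you author the activity JSON directly, compute sizing is controlled by the compute block under typeProperties; omitting an explicit integrationRuntime reference lets the activity fall back to AutoResolveIntegrationRuntime. A minimal sketch, with an illustrative core count:

    ```json
    "typeProperties": {
      "dataFlow": {
        "referenceName": "DF_StageAccountInfo",
        "type": "DataFlowReference"
      },
      "compute": {
        "computeType": "General",
        "coreCount": 8
      }
    }
    ```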

    Finally, review the Spark monitoring view by navigating to the Data Flow activity run and clicking the eyeglass icon. Look for long-running partitions, high retry counts, or uneven data distribution (data skew), as these can indicate which parts of the pipeline may be causing bottlenecks. More information is available here: Troubleshoot Mapping Data Flows – Microsoft Docs.
    If this is helpful, please click Accept Answer and kindly upvote it so that others facing a similar issue can benefit from it. Let me know if you have any further queries.

