ADF Data Flow _temporary/0/ directory

Diego Prieto 36 Reputation points
2022-08-15T08:18:51.387+00:00

When the Data Lake sink is used in a Data Flow, temporary files are created in a directory named _temporary/0/..... (Spark behavior).

This triggers store and pipe events that fail because the files do not exist. Also, when Data Flow renames the temporary files to their final destination, the save event does not fire in these cases.

We have been experiencing this behavior since Friday last week (08-12-2022). We have been working with Data Flow for 3 months and it had never happened before.

Currently all of our pipelines are failing with the same error.

Does anyone know why this is happening?

The error is similar to this one:
https://stackoverflow.com/questions/70393987/filenotfoundexception-on-temporary-0-directory-when-saving-parquet-files
But since we use ADF and Data Flow, we can neither see nor modify the Spark code.

Azure Data Factory

Accepted answer
  1. MarkKromer-MSFT 5,186 Reputation points Microsoft Employee
    2022-08-15T16:16:21.757+00:00

    This is likely due to the recent migration of the backend Spark engine in ADF from Azure Databricks to Synapse Spark. Temporary file handling is slightly different in the new backend, and it is an implementation detail.
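
If temporary file handling is an implementation detail, one possible mitigation is for whatever consumes the storage events to skip paths that sit under a `_temporary` directory. The following is only a minimal sketch of that idea and is not confirmed anywhere in this thread: the helper names `is_spark_temporary_path` and `final_paths` are hypothetical, and the example paths are made up for illustration. It assumes the consumer receives the blob path as a plain string with `/` separators.

```python
from pathlib import PurePosixPath

# Directory name reported in the question; everything else here is illustrative.
TEMP_DIR_NAME = "_temporary"


def is_spark_temporary_path(blob_path: str) -> bool:
    """Return True if any path segment is exactly `_temporary`."""
    return TEMP_DIR_NAME in PurePosixPath(blob_path).parts


def final_paths(blob_paths):
    """Keep only the paths that are not under a `_temporary` directory."""
    return [p for p in blob_paths if not is_spark_temporary_path(p)]


if __name__ == "__main__":
    # Hypothetical paths, for demonstration only.
    sample = [
        "output/_temporary/0/part-00000.parquet",
        "output/part-00000.parquet",
    ]
    for path in final_paths(sample):
        print(path)
```

The check compares whole path segments rather than substrings, so a file such as `my_temporary_file.parquet` is not filtered out by accident. It does not address the separate problem from the question that no event fires when the files are renamed to their final location.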


1 additional answer

  1. Shubham Pawade 46 Reputation points
    2023-01-18T13:11:08.8666667+00:00

    Hi! I am facing the same issue. Were you able to fix this, or do the temporary folders keep getting created?
