ADF Data Flow _temporary/0/ directory

Diego Prieto

When a Data Lake sink is used in a Data Flow, temporary files are created in a directory named _temporary/0/... (standard Spark behaviour).

These temporary files fire our storage event triggers, and the triggered pipelines fail because the temporary files no longer exist by the time they run. Also, when Data Flow renames the temporary files to their final destination, no storage event fires for the final files.
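One common workaround (a minimal sketch, not an ADF built-in: the helper names `is_committed_output` and `handle_storage_event` are hypothetical, e.g. for an Azure Function that receives the storage event before starting downstream work) is to ignore any blob path under a `_temporary/` directory and only act on committed files:

```python
# Sketch: skip Spark's in-progress commit artifacts in a storage-event handler.
# Only the _temporary/ path convention comes from Spark itself; everything
# else here is illustrative.

def is_committed_output(blob_path: str) -> bool:
    """Return False for Spark temporary/commit artifacts, True for final files."""
    parts = blob_path.strip("/").split("/")
    if "_temporary" in parts:  # e.g. sink/_temporary/0/task_.../part-00000
        return False
    name = parts[-1]
    if name.startswith("_") or name.startswith("."):  # _SUCCESS, hidden files
        return False
    return True

def handle_storage_event(blob_path: str) -> str:
    """Decide whether an event for blob_path should start downstream work."""
    return "process" if is_committed_output(blob_path) else "skip"
```

For example, `handle_storage_event("sink/_temporary/0/task_0001/part-00000.parquet")` returns `"skip"`, while a committed `part-*.parquet` path returns `"process"`.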

We have been experiencing this behaviour since Friday of last week (08-12-2022); we have been working with Data Flow for 3 months and it had never happened before.

Currently, all our pipelines are failing with the same error.

Does anyone know why this is happening?

The error is similar to this one:
But since we use ADF and Data Flow, we cannot touch the Spark code, nor even see it.

Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.

Accepted answer
  1. MarkKromer-MSFT (Microsoft Employee)

    This is likely due to the recent migration of ADF's backend Spark runtime from Azure Databricks to Synapse Spark. The handling of temporary files differs slightly between the two, and it is an implementation detail.

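For context on why those directories exist at all: with Spark's classic FileOutputCommitter (algorithm v1), each task writes its output under `<output>/_temporary/...`, and only at job commit are the part files moved up into the final output directory. The sketch below illustrates that path layout only; `commit_path` is a hypothetical helper, not part of Spark or ADF, and the attempt directory name is made up:

```python
# Illustration of the FileOutputCommitter (v1) layout: tasks write under
# <output>/_temporary/<jobAttempt>/_temporary/<taskAttempt>/, and on job
# commit the part files are moved directly into <output>/.
# commit_path is for illustration only.

def commit_path(temp_path: str) -> str:
    """Map a temporary task-attempt path to its final committed path."""
    parts = temp_path.split("/")
    if "_temporary" in parts:
        i = parts.index("_temporary")
        # Keep the output directory and the file name; drop everything
        # from the first _temporary segment through the attempt directory.
        return "/".join(parts[:i] + [parts[-1]])
    return temp_path
```

For example, `commit_path("sink/_temporary/0/_temporary/attempt_0/part-00000.parquet")` yields `"sink/part-00000.parquet"`, which is why downstream triggers see a `_temporary` path first and the final file only after the rename.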

1 additional answer

  1. Shubham Pawade

    Hi! I am facing the same issue and am curious: were you able to fix this, or did the temporary folders keep getting created?
