Synapse Mapping Data Flow Sink creates unexpected empty 0-byte files when writing Parquet to Azure Blob Storage

Question

Shinde, Dushyant 0

Issue Summary :

When using Azure Synapse Analytics Mapping Data Flow to write an updated Parquet file from one Blob Storage container to another, the Sink creates:

1. The expected Parquet file inside the correct folder structure :
- <container>/<customer_no>/<month_folder>/<file>.parquet

2. Unexpected 0-byte blobs at intermediate folder levels :
- <container>/<customer_no>
- <container>/<customer_no>/<month_folder>

In the Azure Portal, each of these paths shows up twice: once as a folder and once as an empty file with the same name as the folder. These extra blobs should not be created.

Environment Details :

Service : Azure Synapse Analytics
Component : Mapping Data Flow (ADF/Synapse)
Compute : AutoResolveIntegrationRuntime (Data Flow runtime)
Source Storage : Azure Blob Storage
Target Storage : Azure Blob Storage
File format : Parquet
Sink settings :
- Sink type : Integration dataset (Parquet)
- File name option : "Output to single file"
- File name : Passed dynamically through parameter (original file name)
- Directory : Dynamically constructed (customer_no / month_folder)
- Partitioning : Single partition

Expected Behavior :

Only the updated Parquet file should be written :

enriched-metric-reports/<customer_no>/<month_folder>/<original_file_name>.parquet

No additional empty blobs or marker files should be created.

Actual Behavior :

Synapse Data Flow writes:

1. Expected file :

Correct Parquet file appears under :

enriched-metric-reports/341/January-2026/<original_file_name>.parquet

2. Unexpected empty files (0 KB) :

These are created automatically :

enriched-metric-reports/341 (0 bytes)

enriched-metric-reports/341/January-2026 (0 bytes)

Azure Portal shows:

Folder 341
File 341 (0 bytes)
Folder January-2026
File January-2026 (0 bytes)

These blobs should not be generated.

We need clarification from the Microsoft Azure Synapse team on whether this behavior is:

- Expected,
- A known limitation, or
- A bug in the Synapse Data Flow Sink for Blob Storage.

Attachments :

User's image

Answer 1

Hi Shinde, Dushyant,
Thank you for reaching out on Microsoft Q&A, and for the detailed screenshots and configuration. What you are observing is expected behavior when Azure Synapse Mapping Data Flow writes a single Parquet file to Azure Blob Storage using a dynamic folder path and an integration dataset.

Azure Blob Storage does not have a true hierarchical file system. The folders shown in the portal are only a visual interpretation of blob names that contain /. When Mapping Data Flow (Spark runtime) writes to Blob with:

- Dynamic directory path (for example: customer_no/month_folder)
- File name option = Output to single file
- Integration dataset sink
- Parquet format

the Spark commit protocol creates 0-byte path marker blobs at each directory level before committing the final file. These blobs have the same names as the folder segments, which is why the portal shows them as both a folder and a 0-byte file.

Example of what Spark writes internally:

enriched-metric-reports/341                      ← marker blob (0 bytes)
enriched-metric-reports/341/January-2026        ← marker blob (0 bytes)
enriched-metric-reports/341/January-2026/file.parquet

This is by design and comes from the Spark FileOutputCommitter behavior when writing to non-hierarchical storage such as Azure Blob. It is not a Synapse defect and does not indicate data corruption.
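To illustrate why the portal shows the same name as both a folder and a file, here is a minimal sketch in Python that mimics a delimiter-based listing over flat blob names. The helper `list_level` and the sample names are illustrative assumptions, not part of any Azure SDK, and the names are written relative to the container.

```python
def list_level(blob_names, prefix=""):
    """Mimic a '/'-delimited listing of flat blob names under a prefix.

    Returns (folders, files) as sorted lists. A "folder" is only inferred
    from names that continue past the next '/'; it is not a stored object.
    """
    folders, files = set(), set()
    for name in blob_names:
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        if not rest:
            continue
        head, sep, _ = rest.partition("/")
        if sep:
            folders.add(head)   # name continues below this level
        else:
            files.add(head)     # name ends at this level: a real blob
    return sorted(folders), sorted(files)


# Hypothetical container contents, matching the example above.
blobs = [
    "341",                            # 0-byte marker blob
    "341/January-2026",               # 0-byte marker blob
    "341/January-2026/file.parquet",  # the actual data file
]

top_folders, top_files = list_level(blobs)
month_folders, month_files = list_level(blobs, "341/")
```

At the top level, `341` is reported both as a folder (inferred from the longer names) and as a file (the marker blob itself), which matches what the portal displays.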

This behavior does not occur when:

- Writing to ADLS Gen2 (true hierarchical namespace)
- Using an inline dataset sink
- Allowing Spark to write multiple partition files instead of a single file

The 0-byte blobs are expected Spark path markers required for committing the file to Azure Blob Storage and can be safely ignored. If you want to avoid these marker blobs entirely, the recommended approach is to use ADLS Gen2 or change the sink configuration (inline dataset or multi-file output).
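If a downstream job lists the container and needs to skip the markers, one possible approach is sketched below in Python. It assumes a marker is a 0-byte blob whose name is also the folder prefix of at least one other blob, as described above. The function name `find_marker_blobs` and the sample data are hypothetical.

```python
def find_marker_blobs(blobs):
    """Return names of 0-byte blobs that also act as a folder prefix.

    `blobs` is an iterable of (name, size_in_bytes) pairs. A genuinely
    empty data file is not reported, because no other blob lives under it.
    """
    blobs = list(blobs)
    names = [name for name, _ in blobs]
    markers = []
    for name, size in blobs:
        if size != 0:
            continue
        child_prefix = name + "/"
        # Assumption: a marker always has at least one blob beneath it.
        if any(other.startswith(child_prefix) for other in names):
            markers.append(name)
    return markers


# Hypothetical listing: two markers, one data file, one empty data file.
listing = [
    ("341", 0),
    ("341/January-2026", 0),
    ("341/January-2026/file.parquet", 52_311),
    ("342/February-2026/empty.parquet", 0),
]

markers = find_marker_blobs(listing)
data_files = [name for name, _ in listing if name not in markers]
```

With the `azure-storage-blob` package, the `(name, size)` pairs could be built from the `name` and `size` attributes of the items returned by `ContainerClient.list_blobs()`. That wiring is left out here so the sketch runs without a storage account.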

Manoj Kumar Boyini 16,640 Reputation points Microsoft External Staff Moderator

2026-02-04T19:51:02.8066667+00:00

Hi Shinde, Dushyant,

I hope you had a chance to review the information shared earlier, and I hope this information has been helpful! If you still have questions, please let us know what is needed in the comments so the question can be answered.
