Which is what we use. It looks like there isn't a way to ignore the Spark partitioning scheme. The suggested Filename[n] pattern implies that I can remove the "n" and thus remove -00001 being added to each file.
Remove '00001' suffix in file name generated by data flow
I'm using an Azure Data Factory data flow to save the incoming data as partitioned *.parquet files (Year/Month). I'm using the pattern setting for the file names, as shown in the screenshot below. ADF automatically appends "00001" to the file name, giving e.g. "Sales Date=2021-08-07-00001". I don't need the suffix because I already use an expression to generate the file name. The Optimize tab is set to the Key partition type.
Is there any way to remove the "00001" suffix from the file name?
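If the suffix cannot be suppressed at write time, one possible workaround is to rename the files in a separate step after the data flow finishes. The snippet below is only a sketch of the name-cleaning logic, not an ADF feature: the helper names `strip_part_suffix` and `plan_renames` are hypothetical, and it assumes the suffix is always a hyphen followed by exactly five digits, optionally followed by a file extension. It only computes the new names; the actual rename against the storage account is left out.

```python
import re

# Assumption: the appended suffix is "-" plus exactly five digits,
# optionally followed by an extension such as ".parquet".
_PART_SUFFIX = re.compile(r"(?P<stem>.+)-\d{5}(?P<ext>\.[^.]+)?")


def strip_part_suffix(name: str) -> str:
    """Return the name without a trailing -NNNNN part suffix, if present."""
    match = _PART_SUFFIX.fullmatch(name)
    if match is None:
        return name  # no suffix found, leave the name unchanged
    return match.group("stem") + (match.group("ext") or "")


def plan_renames(names: list[str]) -> dict[str, str]:
    """Map each original name to its cleaned name.

    Raises ValueError if two files would end up with the same name,
    which happens when one key is written as several part files.
    """
    plan: dict[str, str] = {}
    seen: dict[str, str] = {}
    for name in names:
        cleaned = strip_part_suffix(name)
        if cleaned in seen:
            raise ValueError(
                f"{name!r} and {seen[cleaned]!r} would both become {cleaned!r}"
            )
        seen[cleaned] = name
        plan[name] = cleaned
    return plan


if __name__ == "__main__":
    for old, new in plan_renames(["Sales Date=2021-08-07-00001"]).items():
        print(f"{old} -> {new}")
```

The collision check matters: removing the number is only safe when each key produces a single file, otherwise two part files would be mapped onto the same name.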
Azure Data Factory
2 answers
Teo 121 Reputation points
2021-08-16T22:01:08.497+00:00
How are these Spark partition files generated? I thought they would all have the '00001' suffix, but after staging a large dataset I see that they now have different numbers. What will happen if I rerun the load? Will Spark retain the same numbers? Is there a way to control the size of the partitions?