Merging millions of JSON files from Azure Blob Storage
Hi,
We have an Azure Blob container with millions of small JSON files. I have successfully set up a copy task in Azure Data Factory that merges these files into a single file that is more manageable for further processing, preferably in Data Lake. Right now the destination is a CSV file in another container.
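To make the task concrete, the merge is logically equivalent to the Python sketch below. It is only an illustration, not what Data Factory actually runs: a local directory stands in for the blob container, each file is assumed to contain one flat JSON object, and the function name `merge_json_to_csv` is made up for this post.

```python
import csv
import json
from pathlib import Path


def merge_json_to_csv(source_dir: Path, dest_file: Path) -> int:
    """Merge every *.json file in source_dir into one CSV; return the row count."""
    # Assumption: each file holds a single flat JSON object.
    records = [
        json.loads(path.read_text(encoding="utf-8"))
        for path in sorted(source_dir.glob("*.json"))
    ]
    # The union of keys across all files, in first-seen order, becomes the header.
    fieldnames = list(dict.fromkeys(key for record in records for key in record))
    with dest_file.open("w", newline="", encoding="utf-8") as out:
        writer = csv.DictWriter(out, fieldnames=fieldnames)
        writer.writeheader()
        # Keys missing from a record are written as empty cells.
        writer.writerows(records)
    return len(records)
```

This version loads every record into memory before writing, so it only shows the shape of the job; it is not something I would run against millions of files.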
However, this takes ages, so I need help finding the most performant approach.
What is the most efficient way to do this, both in terms of setup and in choosing the sink and format?
In the end I would like to have the data stored in Azure Data Lake for analysis.