You can achieve this by dataflow. Split the date column in each row into individual parts using a derived tx. Then use the key partition feature on the sink to split this by year/month/day on storage
Load CSV to ADLS by Date
I have a large CSV file that contains a extract of records dating back to year 2016. The requirement is to copy the data into azure data lake but in year/month/day.csv format. What is the best approach to implement this? I am aware of how to achieve the write on the sink (ADLS) but am not sure of how to parse the CSV to obtain the records by day using azure data factory.
-
Kiran-MSFT 691 Reputation points Microsoft Employee
2020-10-24T19:15:47.833+00:00
2 additional answers
Sort by: Most helpful
-
HimanshuSinha-msft 19,476 Reputation points Microsoft Employee
2020-10-27T00:24:51.827+00:00 Hello @Galaxy ,
Just adding to what Kiran said , you need to derive the month,yaar & day from the date colum .
I am assuming that you want the folder structure as Year-> Month -> Day.file . You can use the below expression to create the expression ('\' is requirted for
folder structure )concat( toString(year(Date)),'/',toString(month(Date)),'/',toString(dayOfMonth(Date)))
Also on the derived column option -> select key parition -> select the derived column
On the sink side I have also addded the Setting -> As data in column -> Selected the derived column
The below gif should be helful .
Thanks Himanshu
Please do consider to click on "Accept Answer" and "Up-vote" on the post that helps you, as it can be beneficial to other community members -
Galaxy 21 Reputation points
2020-11-03T23:43:47.467+00:00 Thank You @Kiran-MSFT and @HimanshuSinha-msft for your responses. I wish I could accept both answers.