Hi @Netty ,
Thank you for posting your query on the Microsoft Q&A platform.
I have implemented a sample pipeline for your requirement. Please review the detailed explanation below and apply the same approach to your scenario.
Step 1: Create two pipeline parameters called "client_id" and "dates".
client_id --> Holds the client ID for which you want to run the execution.
dates --> An array of date values for which you want to run the execution.
Step 2: Add a ForEach activity and pass your dates array into it.
Step 3: Inside the ForEach activity, use a Copy activity to copy the source data into your target storage. On the sink side, we want the folder path in the format below, so I used a parameterized dataset as the sink.
my-data-lake/my-container/client_id=<client_id>/date=yyyy-MM-dd/*.json
Expression used for the dynamic path: client_id=@{pipeline().parameters.client_id}/date=@{item()}
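To make the behavior of that dynamic-content expression concrete, here is a minimal Python sketch that simulates how ADF resolves the sink folder path on each ForEach iteration. The parameter values are illustrative only; in the real pipeline they come from the "client_id" and "dates" pipeline parameters.

```python
def resolve_sink_path(client_id: str, date: str) -> str:
    """Mimic the ADF sink expression:
    client_id=@{pipeline().parameters.client_id}/date=@{item()}"""
    return f"client_id={client_id}/date={date}"

# Illustrative parameter values (assumptions, not from the original post's run)
client_id = "123456"                     # pipeline parameter "client_id"
dates = ["2021-07-12", "2021-07-13"]     # pipeline parameter "dates" (array)

# Each ForEach iteration binds @{item()} to one element of the dates array
for d in dates:
    print(resolve_sink_path(client_id, d))
```

Each iteration writes to its own client_id=.../date=... folder under my-data-lake/my-container, so different dates never collide.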
Hope this helps. Thank you.
----------------------------------
- Please accept an answer if correct. Original posters help the community find answers faster by identifying the correct answer. Here is how.
- Want a reminder to come back and check responses? Here is how to subscribe to a notification.
Hi @KranthiPakala-MSFT ,
Thanks so much for the response. Totally hear you on the files overwriting. I am totally OK with overwriting and wiping out old files to replace them with new ones, as long as it pertains to the partitions that are specified for that given run (in my case, the end files are all .json files that should be equally split up based on the size of the data under the partition).
Let's run through the following examples based on how I plan to partition my datasets:
Partition Level 1:
client_id=123456
client_id=234567
Partition Level 2:
date=2021-07-10
date=2021-07-11
date=2021-07-12
date=2021-07-13
date=2021-07-14
date=2021-07-15
date=2021-07-16
date=2021-07-17
date=2021-07-18
(eg: my-data-lake/my-container/client_id=123456/date=2021-07-18/*.json)
Example 1) If I run a job for both client IDs for the last 7 days, then I am perfectly OK with wiping all data and files between 7/12 and 7/18, updating it with the data from the job that runs today, as long as data from 7/10 and 7/11 are not removed or altered.
Example 2) If I run a job just for client 123456, then I would want files to be removed and replaced with updated data only for client 123456, without affecting client 234567.
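The two examples above can be sketched as a small Python helper that enumerates exactly which partitions a given run would replace. This is an assumption-laden illustration of the intended overwrite scope (client list, run date, and 7-day window are hypothetical inputs), not part of the actual pipeline:

```python
from datetime import date, timedelta

def partitions_to_overwrite(client_ids, run_date, days=7):
    """Return the partition paths a run would wipe and replace:
    one path per client per day, covering the `days` most recent
    dates ending at `run_date` (inclusive)."""
    window = [run_date - timedelta(days=i) for i in range(days - 1, -1, -1)]
    return [f"client_id={c}/date={d.isoformat()}"
            for c in client_ids
            for d in window]

# Example 1: both clients, last 7 days ending 2021-07-18
# -> replaces 2021-07-12 through 2021-07-18; 07-10 and 07-11 are untouched
both = partitions_to_overwrite(["123456", "234567"], date(2021, 7, 18))

# Example 2: only client 123456 -> client 234567's partitions are unaffected
one = partitions_to_overwrite(["123456"], date(2021, 7, 18))
```

Only the paths returned for the requested clients and dates are in scope for overwriting; everything else in the data lake stays as-is.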
Let me know if this additional info helps or if you need anything else from my end to help explain further.
Best wishes,
Netty
Hi @Netty ,
Thank you for your response. Please check the answer posted below, and please accept it if it resolves your issue; accepting an answer helps the community as well.