question

SARASWATCHETAN-7930 avatar image
0 Votes"
SARASWATCHETAN-7930 asked SARASWATCHETAN-7930 edited

How to make a folder pattern while transferring data to ADLS Gen 2 using ADF mapping dataflows

I have a blob container in a azure storage account and I want to migrate it to a hierarchical storage account i.e. ADLS Gen 2 using folder structure on below basis.
{deviceid}/--->{date}/--->{time}. This would list the data in folder structure as below
deviceid
|-->year
|--->month
|---->day
|------->hour

Sample from my blob file is as below.

{"messageid":"38E3D1A3-E498-AD44-8A8A-A06DA7BAA15F","sysid":"SEN-CH33F7YZ","deviceid":"abc123","ver":"1.0","protocol":"MQTT","ttl":"5000","durable":"true","correlationid":"00000000-0000-0000-0000-000000000000","type":"TLM","timestamp":"1607485875","simulated":"false","data":{"type":"Telemetry","code":"SENSATE_TLM","attributes":[{"watertemperature":"33.540001","waterusage":"0.707000","usageduration":"15.000000"}]},"EventProcessedUtcTime":"2020-12-09T03:54:39.4274990Z","PartitionId":2,"EventEnqueuedUtcTime":"2020-12-09T03:51:13.6160000Z","IoTHub":{"MessageId":null,"CorrelationId":null,"ConnectionDeviceId":"abc123","ConnectionDeviceGenerationId":"637365983999778069","EnqueuedTime":"2020-12-09T03:51:13.5100000Z","StreamId":null}}
{"messageid":"CFE8A8BE-3234-1ED4-8377-37C16654DFF2","sysid":"SEN-CH33F7YZ","deviceid":"abc111","ver":"1.0","protocol":"MQTT","ttl":"5000","durable":"true","correlationid":"00000000-0000-0000-0000-000000000000","type":"TLM","timestamp":"1607485879","simulated":"false","data":{"type":"Telemetry","code":"SENSATE_TLM","attributes":[{"watertemperature":"19.270000","waterusage":"0.470000","usageduration":"5.000000"}]},"EventProcessedUtcTime":"2020-12-09T03:54:39.9119082Z","PartitionId":8,"EventEnqueuedUtcTime":"2020-12-09T03:51:15.6010000Z","IoTHub":{"MessageId":null,"CorrelationId":null,"ConnectionDeviceId":"set-sio32236kh","ConnectionDeviceGenerationId":"637427342913260815","EnqueuedTime":"2020-12-09T03:51:15.4130000Z","StreamId":null}}
{"messageid":"F1AC3AA6-80E6-72B4-8F84-4DC033D33CDB","sysid":"SEN-CH33F7YZ","deviceid":"abc111","ver":"1.0","protocol":"MQTT","ttl":"5000","durable":"true","correlationid":"00000000-0000-0000-0000-000000000000","type":"TLM","timestamp":"1607485877","simulated":"false","data":{"type":"Telemetry","code":"SENSATE_TLM","attributes":[{"watertemperature":"19.360001","waterusage":"0.302000","usageduration":"4.000000"}]},"EventProcessedUtcTime":"2020-12-09T03:54:39.4118700Z","PartitionId":1,"EventEnqueuedUtcTime":"2020-12-09T03:51:17.2890000Z","IoTHub":{"MessageId":null,"CorrelationId":null,"ConnectionDeviceId":"set-sio32229fm","ConnectionDeviceGenerationId":"637117926528097261","EnqueuedTime":"2020-12-09T03:51:17.2290000Z","StreamId":null}}



I am not able to find the exact sink property or setting which would make this data available.
Using "column name" would make my data filter on basis of device id but not sure how to make further folders on basis of timestamp. Please suggest if possible.



azure-data-factoryazure-data-lake-storage
· 5
5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

Hello @SARASWATCHETAN-7930 ,

We haven’t heard from you on the last response and was just checking back to see if you have a resolution yet .In case if you have any resolution please do share that same with the community as it can be helpful to others . Otherwise, will respond back with the more details and we will try to help .
Thanks
Himanshu

0 Votes 0 ·

Hello @SARASWATCHETAN-7930 ,

Following up as we haven't heard back. Please let us know if your issues is not resolved.

Thanks
Saurabh

0 Votes 0 ·

Below strategy was used to partition the folder structure as suggested by Mark.

84969-image.png


However the folder structure in ADLS is having a prefix attached to each folder name, i.e. its property by which that folder is partitioned. See below snapshots.

84986-image.png


84994-image.png


85004-image.png




Is there a way i can formulate the folders with names only. Not the property of the folder with an "=" sign.

0 Votes 0 ·
image.png (31.3 KiB)
image.png (23.0 KiB)
image.png (1.8 KiB)
image.png (1.2 KiB)

For just setting the folder names, use the "Folder as column data" option in the Sink Settings and set your Optimize back to "Use current partitioning".

0 Votes 0 ·
Show more comments

1 Answer

MarkKromer-MSFT avatar image
1 Vote"
MarkKromer-MSFT answered

In data flows, you will generate hierarchical folder structures from your data using Key partitioning in the Sink transformation when writing to ADLS or Blob. Click on the Sink's optimize tab, select Key partitioning, and pick the data fields for your folder structure: https://youtu.be/7Q-db4Qgc4M?t=400

5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.