Hi Pratisti Satardekar ,
Thankyou for using Microsoft Q&A platform and thanks for posting your query here.
As per my understanding you are trying to perform data transformation and convert tabular data into json using mapping dataflow . The challenge here is id1,id2 columns needs to be converted as single key in json named as 'id'. Please let me know if that is not the correct understanding.
This can't be directly achieved using derived column transformation. Since the column names needs to be coming as same 'id' , we need to perform unpivot first so that both the values 4, and 9 comes under same column and that can be used in derived column transformation to convert as key of json.
- First of all , add source transformation to point to the source file.
- Now, add 'surrogate key transformation' to create a new column ,say 'key' that will generate unique value corresponding to each records. This column will be used to ungroup by in 'unpivot transformation'
- Use 'Unpivot transformation' to modify data from being placed in two columns to single column. To ungroup by , we will use 'key' column. In unpivot key tab, provide the new col name as 'unpivotedcol'. In unpivoted columns tab, select 'id1' and 'id2' as the column names
- Now use derived column transformation and provide column name as 'collections' and in expression , add new subcolumn and provide the expression as
@(id=id1)
- Add an aggregate transformation to convert complex column 'collections' as an array with the help of collect function. Leave the group by tab as blank and go to aggregates tab, select 'collections' column and provide expression as
collect(collections)
- Now, add sink transformation to store the output json data. In settings , select filename option as 'output to single file' and set single partition , provide output filename and run the dataflow. Hope it helps. Kindly accept the answer by clicking on
Accept answer
button. Thankyou