To change the Json file data as per your requirement, you can use dataflow transformation. Below are the detailed steps to do in dataflow.
- Take the source transformation with the dataset containing Json data.
- Then to flatten the Json, take the flatten transformation. Unroll the Json by the array
d.results.to_GtinSalesOrg.results
. Give the mapping as,
Productno = d.results.Productno,
FixedPrice = d.results.FixedPrice,
SalesOrg = d.results.to_GtinSalesOrg.results.SalesOrg,
MainGtin = d.results.to_GtinSalesOrg.results.MainGtin
- Next step is to add the row number for each line item within the
Productno
andfixed price
fields. Take the Window transformation. Aggregate over the fieldsProductno
andfixedPrice
. Sort bySalesOrg
orMainGtin
column. Give the aggregate column asrow_num
and the expression for that column asrowNumber()
over: (Productno,FixedPrice),
sort: (SalesOrg),
Windows column: row_num = rowNumber()
- Then to take all
Salesorg
andMainGtin
values to a single line for eachProductNo
andFixedPrice
, you can aggregate them using aggregate transformation.
groupBy: (Productno, FixedPrice)
Aggregates:
SalesOrg = collect(toString(row_num)+'||'+SalesOrg),
MainGtin = collect(toString(row_num)+'||'+MainGtin)
- Result of aggregate transformation will contain array of values in
SalesOrg
andmainGtin
. To convert them into double pipe symbol separated values , you can take the derived column transformation and give the expression as
SalesOrg = dropRight(dropLeft(replace(replace(toString(SalesOrg),",","||"),'"',''),1),1), MainGtin = dropRight(dropLeft(replace(replace(toString(MainGtin),",","||"),'"',''),1),1)
The above expression will remove the unwanted characters and give as the pipe delimited values.