Data transformation

Khimlal, Tulsi 20 Reputation points
2024-02-29T08:56:58.86+00:00

Hi, I have a table which have data like below. User's image

JSON Format:

{

"data": [

{

"P_name": "ABC",

"P_number": 1,

"2022-23": "$198.00",

"2023-24": "$929.80",

"2024-25": "$9,292.20",

"Jan-22": "$0.00",

"Feb-22": "$0.00",

"Jan-23": "$0.00",

"Feb-23": "$0.00"

},

{

"P_name": "xyz",

"P_number": 2,

"2022-23": "$9,829.00",

"2023-24": "$98.92",

"2024-25": "$89.12",

"Jan-22": "$0.00",

"Feb-22": "$0.00",

"Jan-23": "$0.00",

"Feb-23": "$0.00"

}

]

}

We need to differentiate the data based on the "P_number" field.

The "P_name" and "P_number" columns will be directly mapped, while the remaining columns should follow the structure outlined below. The destination output should be in JSON format and presented in a nested structure.

User's image

The data should be structured as follows in JSON format:

{

"data": [

{

"P_name": "ABC",

"P_number": 1

}

],

"FY_data": [

{

"FY": "2022-23",

"Total": "$396.00",

"jan": "$198.00",

"feb": "$198.00"

},

{

"FY": "2023-24",

"Total": "$9,927.00",

"jan": "$9,829.00",

"feb": "$98.92"

}

]

}

How can I achieve this using ADF.

Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
10,598 questions
0 comments No comments
{count} votes

1 answer

Sort by: Most helpful
  1. Amira Bedhiafi 23,486 Reputation points
    2024-02-29T14:33:51.5466667+00:00

    You need to create a new data flow within ADF and add your source dataset.
    Since your transformation requires different structures for "P_name" and "P_number" versus the financial years and monthly data, you'll likely need to split your data into two streams for processing:

    • Stream 1: directly maps "P_name" and "P_number"
    • Stream 2: handles the transformation of fiscal year and monthly data

    For the second stream, you'll need :

    • Pivot Transformation to transform the fiscal year and monthly columns into rows
    • Aggregate Transformation to calculate totals for each fiscal year. You may need to create expressions to parse the dollar amounts and sum them up
    • Join Transformation to bring back the "P_name" and "P_number" information into this stream, if needed.

    After transforming the data, use the Derived Column transformation to construct the new JSON structure for both streams. You may need to create complex expressions to build the nested JSON objects as per your requirement.

    Finally, configure a sink to output your data. Since you want the output in JSON format, make sure to choose a dataset that supports JSON output (example Azure Blob Storage or Azure Data Lake Store) and configure the output to match the structure you need.


Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.