Process complex json array member having null with Dataflow

Harsh Gandhi 1 Reputation point
2022-04-08T07:07:48.43+00:00

Below is the json data that is received via API call.

[
{
"resultsMetadata": {
"count" :100
"entity" : "test"
},
"results": [
{
"uniqueid": "100ABC",
"fnbr": 1,
"relations" : [
{
"relcode": "A1"
"reldesc": "A1 test"
},
{
"relcode": "A2"
"reldesc": null
},
{
"relcode": "A3"
"reldesc": "A3 test"
}
] # end of relations
},
{
"uniqueid": "101ABC",
"fnbr": 1,
"relations" : [
{
"relcode": "A1"
"reldesc": "A1 test"
},
{
"relcode": "A3"
"reldesc": null
},
{
"relcode": "A4"
"reldesc": "A4 test"
}
] # end of relations
}
] #end of results
}
]

Aim is to flatten this json response and store as parquet file.
Having tried with dataflow (source -> flatten -> sink), data gets flatten and new columns like relcode, reldesc are generated as expected.
But the data inside these new fields for first record "uniqueid": "100ABC" is stored as
relcode reldesc
[1] A1 [1] A1test
[2] A2 [2] A3test
[3] A3
This mean, array members that have null are not stored after flatten.
If there a way where I check this array member value and replace them with empty string ("").
Also, I expect the schema changes and so would like to have a generic and dynamic solution thereby it can check this for each array member that has null.

Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
7,139 questions
{count} votes