Mapping Dataflow Flatten JSON array with no projection
Hi,
I'm currently hitting an issue in with a Mapping Dataflow in Synapse pipelines when attempting a flatten transform without a schema projection. Not sure if its me that's missing something here or if i'm expecting the Mapping Data Flow to do something that's not possible without some additional steps.
I have a file with a structure similar to the attached JSON file (entity1_json.txt)- the difference is the schema of the items array will change with each file.
[
{
"meta": {
"code": 200,
"message": "OK",
"serverTime": "2023-01-12T15:43:56+00:00",
"userTimezone": { "offset": "+00:00", "name": "Europe/London" }
},
"response": {
"count": 4,
"items": [
{
"col1": "value1",
"col2": "value2"
},
{
"col1": "value1",
"col2": "value2",
"col3": "value3"
}
]
}
},
{
"meta": {
"code": 200,
"message": "OK",
"serverTime": "2023-01-12T15:43:56+00:00",
"userTimezone": { "offset": "+00:00", "name": "Europe/London" }
},
"response": {
"count": 4,
"items": [
{
"col1": "value1",
"col2": "value2",
"col3": "value3"
},
{
"col1": "value1",
"col2": "value2",
"col3": "value3",
"col4": "value4"
}
]
}
},
{
"meta": {
"code": 200,
"message": "OK",
"serverTime": "2023-01-12T15:43:56+00:00",
"userTimezone": { "offset": "+00:00", "name": "Europe/London" }
},
"response": { "count": 346, "items": [] }
}
]
With a schema projection everything is great, however this means I need to maintain one Dataflow per entity. What I was hoping to do was maintain one Dataflow and parameterise it with an entity name.
So the typical flow is
And the colum in the schema is response.items
However if I break it all down to the source and flatten, and clear the projection, I cant quite get the same results
- Clear the projection, Schema Drift is enabled in the Schema options
- Alter the flatten to use response.items (first thing I tried, seemed sensible at the time :-))
My assumption at this point is this wont work because the flow is not schema aware, to this point I have tried various routes such as
- Adding a select inbetween the source and flatten to ensure im only dealing with the response
- Adding a derived column to get the items array (all show as NULL in the preview)
- Working on a parse step, but this seems I would still need to specify the schema of the items array, which is not what i'm wanting to do.
I'd appreciate any assistance while I continue to look for a solution.