Mapping Dataflow Flatten JSON array with no projection


Daniel Ambler 0

Hi,

I'm currently hitting an issue with a Mapping Data Flow in Synapse pipelines when attempting a flatten transform without a schema projection. I'm not sure if it's me that's missing something here, or if I'm expecting the Mapping Data Flow to do something that's not possible without some additional steps.

I have a file with a structure similar to the attached JSON file (entity1_json.txt); the difference is that the schema of the items array will change with each file.

[
  {
    "meta": {
      "code": 200,
      "message": "OK",
      "serverTime": "2023-01-12T15:43:56+00:00",
      "userTimezone": { "offset": "+00:00", "name": "Europe/London" }
    },
    "response": {
      "count": 4,
      "items": [
        {
          "col1": "value1",
          "col2": "value2"
        },
        {
          "col1": "value1",
          "col2": "value2",
          "col3": "value3"
        }
      ]
    }
  },
  {
    "meta": {
      "code": 200,
      "message": "OK",
      "serverTime": "2023-01-12T15:43:56+00:00",
      "userTimezone": { "offset": "+00:00", "name": "Europe/London" }
    },
    "response": {
      "count": 4,
      "items": [
        {
          "col1": "value1",
          "col2": "value2",
          "col3": "value3"
        },
        {
          "col1": "value1",
          "col2": "value2",
          "col3": "value3",
          "col4": "value4"
        }
      ]
    }
  },
  {
    "meta": {
      "code": 200,
      "message": "OK",
      "serverTime": "2023-01-12T15:43:56+00:00",
      "userTimezone": { "offset": "+00:00", "name": "Europe/London" }
    },
    "response": { "count": 346, "items": [] }
  }
]
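
For illustration only, the result being sought can be sketched in plain Python, outside Mapping Data Flow. This is a minimal sketch under stated assumptions: the sample is an abbreviated copy of the JSON above (only `response` is kept), and `flatten_items` is a hypothetical helper name, not part of any Synapse API. It emits one row per element of `response.items` and pads each row to the union of keys seen across all items.

```python
import json

# Abbreviated copy of the sample above: the "meta" blocks are omitted.
SAMPLE = json.loads("""
[
  {"response": {"count": 4, "items": [
    {"col1": "value1", "col2": "value2"},
    {"col1": "value1", "col2": "value2", "col3": "value3"}]}},
  {"response": {"count": 4, "items": [
    {"col1": "value1", "col2": "value2", "col3": "value3"},
    {"col1": "value1", "col2": "value2", "col3": "value3", "col4": "value4"}]}},
  {"response": {"count": 346, "items": []}}
]
""")


def flatten_items(documents):
    """Hypothetical helper: one output row per element of response.items."""
    # Unroll every items array into a single list of rows.
    rows = [item for doc in documents for item in doc["response"]["items"]]

    # Collect the union of keys in first-seen order (the "drifted" schema).
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)

    # Pad each row so that missing columns become None.
    padded = [{name: row.get(name) for name in columns} for row in rows]
    return columns, padded


columns, rows = flatten_items(SAMPLE)
print(columns)
for row in rows:
    print(row)
```

With the sample above this yields four rows and four columns (col1 to col4); the document with an empty items array contributes no rows.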

With a schema projection everything works; however, this means I need to maintain one Dataflow per entity. What I was hoping to do was maintain one Dataflow and parameterise it with an entity name.

So the typical flow is:

[image]

And the column in the schema is response.items

However, if I break it all down to just the source and flatten, and clear the projection, I can't quite get the same results.

Clear the projection; Schema Drift is enabled in the Schema options:

[image]

Alter the flatten to use response.items (first thing I tried, seemed sensible at the time :-)):

[image]

My assumption at this point is that this won't work because the flow is not schema-aware. So far I have tried various routes, such as:

Adding a select in between the source and flatten to ensure I'm only dealing with the response
Adding a derived column to get the items array (all values show as NULL in the preview)
Working on a parse step, but it seems I would still need to specify the schema of the items array, which is not what I want to do.

I'd appreciate any assistance while I continue to look for a solution.

MarkKromer-MSFT 5,226 Reputation points Microsoft Employee Moderator

2023-01-16T05:54:24.51+00:00

Would parameterizing the unroll by work for you? You could send in the name of the array entity which you wish to unroll as a parameter from the pipeline and reference it in the Flatten unroll by property.
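
As a rough illustration of the idea only (this is plain Python, not Data Flow expression syntax), passing the array path in as a parameter could look like the sketch below. The function name `unroll_by` and the dotted-path convention are assumptions made for this example.

```python
def unroll_by(record, path):
    """Yield each element of the array found at a dotted path, e.g. "response.items".

    Hypothetical helper: the path is supplied by the caller, much as a pipeline
    parameter would supply the name of the array to unroll.
    """
    node = record
    for key in path.split("."):
        # Walk down one level; treat anything missing or non-dict as empty.
        node = node.get(key, {}) if isinstance(node, dict) else {}
    # A missing path or an empty array produces no rows.
    for element in node or []:
        yield element


record = {"response": {"count": 2, "items": [{"col1": "a"}, {"col1": "b", "col2": "c"}]}}
print(list(unroll_by(record, "response.items")))
```

The same function can then be reused for any entity by changing only the path argument, which is the effect the parameterised flow is aiming for.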
KranthiPakala-MSFT 46,642 Reputation points Microsoft Employee Moderator

2023-02-23T06:28:13.9233333+00:00

Hi there,

Just checking to see if you have had a chance to try the suggestion provided by Mark Kromer? Please do let us know if you are still looking for assistance.

Thank you
Daniel Ambler 0 Reputation points

2023-10-31T10:24:26.0266667+00:00

Hi,

I'm so sorry about the delay on this. The project this was part of got shelved and has only just resurfaced. I've managed to get around this particular issue by having a Derived Column transformation before the flatten, then using a column pattern to output the columns by name.

The issue it left me with is that all the output columns are recognised as strings. It's not a major issue, but it still stops me having one parameterised dataflow to maintain until I figure that bit out, but that's another issue :-).
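
To show what "recognised as strings" implies downstream, here is a minimal Python sketch of recovering types from string values after the fact. It is not a Mapping Data Flow feature; `infer_value` is a hypothetical helper, and the rule it applies (try integer, then float, otherwise keep the string) is an assumption chosen for simplicity.

```python
def infer_value(text):
    """Hypothetical helper: cast a string to int or float when it parses cleanly."""
    if text is None:
        return None
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            # Not parseable as this type; try the next one.
            pass
    return text  # Leave anything else as the original string.


row = {"col1": "200", "col2": "1.5", "col3": "value3", "col4": None}
typed = {name: infer_value(value) for name, value in row.items()}
print(typed)
```

A rule like this is deliberately naive: values such as identifiers with leading zeros would be turned into integers, so a real solution would need per-column rules rather than per-value guessing.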
