How to go from an unknown JSON schema with nested objects to SQL Tables where the nested objects are put into their separate tables

stan rutten 1 Reputation point
2024-03-06T10:22:20.44+00:00

Hello, my use case is as follows: I need to convert unknown JSON schemas (each JSON file can look completely different with different fields, types, etc.), possibly containing nested array objects, to a SQL database. Where, the nested arrays of objects will be stored in separate tables, each with a reference ID to their parent.

An example of how the JSON could look:


[
  {
    "age": 30,
    "city": "New York",
    "grades": [95, 85, 90],
    "is_student": false,
    "name": "John Doe"
  },
  {
    "gender": "Male",
    "age": 94,
    "city": "New State",
    "grades": [95, 85, 90],
    "is_student": false,
    "name": "Foo Bar",
    "scores": [
      {
        "t1": 5,
        "t2": 6,
        "t3": 7,
        "t4": 9
      },
      {
        "t1": 14,
        "t2": 123,
        "t3": 54,
        "t4": 12
      }
    ]
  }
]

My desired outcome is to have two tables created, for example, userTable (with gender, age, city ect.) and scoresTable (with user_id, t1, t2), where scoresTable has a reference ID to userTable. Is it possible to create such a dataflow within Azure Data Factory, and if so, how would one approach this?

Hope anyone has some tips and could help me out.

Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
11,623 questions
{count} votes

1 answer

Sort by: Most helpful
  1. phemanth 15,755 Reputation points Microsoft External Staff Moderator
    2024-03-07T12:06:54.5933333+00:00

    @stan rutten

    Thanks for reaching out to Microsoft Q&A.

     To convert unknown JSON schemas to a SQL database, where the nested arrays of objects will be stored in separate tables, each with a reference ID to their parent.

    please follow the following steps:

    1. Azure Data Factory can be used to convert JSON data to SQL tables. You can create a pipeline in Azure Data Factory that reads the JSON data from a source and writes it to a SQL database sink

    User's image

    User's image

    User's image

    1. To handle the nested arrays of objects, you can use the Flatten transformation in Azure Data Factory. The Flatten transformation can be used to flatten the nested arrays of objects into separate rows.

    User's image

    1. After flattening the data, you can use the Lookup transformation in Azure Data Factory to join the flattened data with the parent data. The Lookup transformation can be used to join the flattened data with the parent data based on a common key.
    2. Finally, you can write the data to separate tables in the SQL database sink using the Copy transformation in Azure Data Factory. The Copy transformation can be used to write the data to separate tables based on the output of the Lookup transformation.

    reference:

    https://techcommunity.microsoft.com/t5/azure-data-factory-blog/transform-complex-json-structures-from-cosmosdb-to-sql-db-with/ba-p/1221106

     Hope this helps. Do let us know if you any further queries.

    0 comments No comments

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.