How to save all values from the table in one JSON object?

Karolina Stuchly 20 Reputation points
2023-12-21T12:14:28.2366667+00:00

Hello,

I'm looking for a solution to save the data from database table into the JSON file in the specific format.

I have database table with 3 columns: employee_id, gender, salary.

User's image

I want to save these values in the JSON in the following format:
{

"Values": [[1, F, 1000], [2, M, 2000], [3, F, 1500], ...]

}

For now I tried to create a following data flow:

  1. Select a table as a source
  2. Add Derived Column specifying new columns "Values" as an expression: array(toString(employee_id), toString(gender), toString(salary))
  3. Add select column: Selecting only "Values" column
  4. Add Derived Column to again close all values in the array
  5. As a sink I've used JSON file

This is the format I've got:

[
  {
    "Values": [
      [
        "1",
        "F",
        "1000.000000"
      ]
    ]
  },
  {
    "Values": [
      [
        "2",
        "M",
        "2000.000000"
      ]
    ]
  },
  {
    "Values": [
      [
        "3",
        "F",
        "1500.000000"
      ]
    ]
  }
]

Could you please help me on how to proceed to get the required JSON format?

{

"Values": [[1, F, 1000], [2, M, 2000], [3, F, 1500], ...]

}

Thank you

Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
11,624 questions
{count} votes

Accepted answer
  1. phemanth 15,755 Reputation points Microsoft External Staff Moderator
    2023-12-22T09:10:45.7+00:00

    @Karolina Stuchly

    Thanks for reaching out to Microsoft Q&A.

    It seems like you’re trying to create a JSON array of arrays, but your current approach is creating an array of objects, each containing a single array. Here’s how you can modify your approach:

    Select a table as a source: This step remains the same. You’re selecting your table with employee_id, gender, and salary columns.

    Add Derived Column: Instead of creating a new column “Values”, you can directly convert your columns into string format. You can use the expression: array(t

    oString(employee_id), toString(gender), toString(salary)). Let’s call this column “tempValues”.

    Add Aggregate Transformation: This is a new step. Here, you’ll group by some constant (you can create a derived column with a constant value if no such column exists) and make a collect() operation on “tempValues”. This will create a new column which is an array of “tempValues” arrays. Let’s call this column “Values”.

    Select Column: Now, you just need to select the “Values” column.

    Sink to JSON file: This step remains the same. You’re sinking your data into a JSON file.

    This should give you a JSON in the following format:

    {
      "Values": [
        ["1", "F", "1000.000000"],
        ["2", "M", "2000.000000"],
        ["3", "F", "1500.000000"],
        ...
      ]
    }
    
    

    Please note that the values will be in string format because of the toString() function. If you want them in their original format, you would need a method to convert these values into their original format (integer for employee_id and salary, string for gender) after reading the JSON file.

    I hope this helps! Let us know if you have any other questions.

    1 person found this answer helpful.

0 additional answers

Sort by: Most helpful

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.