Azure Data Factory working with REST API JSON output.

Gav Cheal 80 Reputation points
2023-01-18T16:51:24.79+00:00

Hi All.

I am using my pipeline to connect to a REST API, and that part is working fine. But now I am having difficulty working out how to handle the resulting JSON.

Unfortunately, the JSON is returned transposed: instead of rows, I am getting the columns as the records.

@string(activity('getReports').output.records.property_name)

This gets me the data for the column property_name, so I've loaded it into a variable of type String:

{
    "name": "vStr_Json_Property_name",
    "value": "[\"A Property Name\",\"B Property Name\",\"C Property Name\",...]"
}

Annoyingly, it comes back with escaped double quotes. I'm not sure if I can do anything about that.

The next stage, I was thinking, is to change this variable from a string to an array. That way I can load all of the columns into variables of type Array, and then either save each of these to its own table in SQL Server (one table per column, with an identity column for each row), or use an ADF ForEach over the arrays to join the rows back together before loading them into a single SQL Server table.
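For the ForEach idea, this untested sketch is roughly what I had in mind (vStr_Json_Property_name is my string variable from above; the other column variables would follow the same pattern). The ForEach items expression iterates over row indexes rather than values:

@range(0, length(json(variables('vStr_Json_Property_name'))))

Inside the loop, item() is then the row index, so each column's value for the current row would be something like:

@json(variables('vStr_Json_Property_name'))[item()]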

I've been on this for about 8 hours now and I am extremely frustrated, so I figured now was the time to see if anyone else has come across the same problem and how they solved it.

Any help would be gratefully received.

Many thanks


Accepted answer
  1. MartinJaffer-MSFT 26,236 Reputation points
    2023-01-18T21:42:37.28+00:00

    @Gav Cheal Actually, the returned result only appears escaped when you look at it in a string variable; without the string conversion it is an array. The json() function can help you convert it back.
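
    For example, a minimal sketch (vArr_Property_name is a hypothetical pipeline variable of type Array): in a Set Variable activity you can skip the string conversion entirely and assign the array directly,

    @activity('getReports').output.records.property_name

    or, if you already have the string variable, convert it back:

    @json(variables('vStr_Json_Property_name'))

    Either expression assigned to vArr_Property_name gives you a proper array to work with.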

    For fixing the transposition, you will need to use a Mapping Data Flow; the Copy activity does not do this kind of transformation.

    mapLoop and Flatten transforms

    Given the dummy data:

    {"theColumns":{
    	"Prop_name":["First prop","second prop","third prop"],
    	"Numbers":[10,20,30],
    	"Flavors":["Savory","Umame","Spicy"]
    }}
    

    I made an implementation (data flow script):

    source(output(
    		theColumns as (Flavors as string[], Numbers as short[], Prop_name as string[])
    	),
    	useSchema: false,
    	allowSchemaDrift: true,
    	validateSchema: false,
    	ignoreNoFilesFound: false,
    	format: 'json',
    	fileSystem: 'data',
    	folderPath: 'input',
    	fileName: 'transpose.json',
    	documentForm: 'singleDocument') ~> source1
    source1 derive(rows = (mapLoop(size(theColumns.Prop_name),
    @(Prop=theColumns.Prop_name[#index],
      Number=theColumns.Numbers[#index],
      Flavor=theColumns.Flavors[#index])
    ))) ~> derivedColumn1
    derivedColumn1 select(mapColumn(
    		rows = rows
    	),
    	skipDuplicateMapInputs: true,
    	skipDuplicateMapOutputs: true) ~> select1
    select1 foldDown(unroll(rows, rows),
    	mapColumn(
    		each(rows,match(true()))
    	),
    	skipDuplicateMapInputs: false,
    	skipDuplicateMapOutputs: false) ~> flatten1
    flatten1 sink(allowSchemaDrift: true,
    	validateSchema: false,
    	format: 'delimited',
    	fileSystem: 'data',
    	folderPath: 'out',
    	columnDelimiter: ',',
    	escapeChar: '\\',
    	quoteChar: '\"',
    	columnNamesAsHeader: true,
    	partitionFileNames:['transpose_results.csv'],
    	umask: 0022,
    	preCommands: [],
    	postCommands: [],
    	skipDuplicateMapInputs: true,
    	skipDuplicateMapOutputs: true,
    	partitionBy('hash', 1)) ~> sink1
    

    This has 3 transforms. First, a Derived Column where I create a single column containing an array of objects. I use mapLoop to get the element at each index from every column.

    (mapLoop(size(theColumns.Prop_name),@(Prop=theColumns.Prop_name[#index], Number=theColumns.Numbers[#index], Flavor=theColumns.Flavors[#index])))

    Next, a Select transform where I discard the original JSON structure and keep only the new rows column.

    Lastly, a Flatten transform where the array is turned into rows, and I use a rule-based mapping to extract each property into its own column.
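
    With the dummy data above, the sink should then produce something like the following (the column order in the real output may differ):

    Prop,Number,Flavor
    First prop,10,Savory
    second prop,20,Umami
    third prop,30,Spicy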


1 additional answer

  1. Eugene Tretiak 0 Reputation points
    2023-01-18T17:13:37.84+00:00

    Maybe you can use a Logic App for this? It has all the same functions as ADF (even more), and with Compose (Data Operations) you can construct your data and then export it as a structured file, for example.
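
    As a rough, untested sketch of a Compose action in the workflow definition (HTTP is a hypothetical name for the action that calls your REST API):

    "Compose": {
        "type": "Compose",
        "inputs": "@body('HTTP')?['records']"
    }

    From there you can reshape the records and write them out with a connector of your choice.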

