How to split json keys and values separately without data flow in azure data factory

Uday Juttu 1 Reputation point
2022-08-10T03:01:58.5+00:00

I have a HubSpot dataset and I am doing a Lookup activity which gives the first row from the dataset

firstRow: {
  "col1": "value1",
  "col2": "value2"
}

I want to get all column names from the output JSON (i.e. col1, col2) so that I can do schema validation for the table. I am wondering how to get dynamic column information.

Note: I cannot use Data Flow or the Get Metadata activity because HubSpot datasets do not support them.


1 answer

Sort by: Most helpful
  1. MartinJaffer-MSFT 26,161 Reputation points
    2022-08-10T16:51:00.513+00:00

    Hello @Uday Juttu ,
    Thanks for the question and using MS Q&A platform.

    So you want to check all the column names for schema validation without using a Data Flow.

    Much to my frustration, there is no function for returning all the keys/properties of a JSON. If there was, I could do so many things.

    However, there is another way. Instead of getting a list of all keys from the Lookup activity, how about providing a list of all expected column names as a pipeline parameter? Then we can ForEach over that list of expected names and test each one with:

    @contains( activity('lookup').output.firstRow , item() )  
    
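    To make the membership check concrete, here is a small Python sketch with hypothetical sample data (the `first_row` object and `expected_columns` list are stand-ins for the Lookup output and the pipeline parameter):

    ```python
    import json

    # Hypothetical sample mirroring the Lookup activity's firstRow output.
    first_row = json.loads('{"col1": "value1", "col2": "value2"}')

    # Expected column names, standing in for the pipeline parameter.
    expected_columns = ["col1", "col2"]

    # Equivalent of evaluating @contains(activity('lookup').output.firstRow, item())
    # for each item() in a ForEach over the expected names.
    missing = [name for name in expected_columns if name not in first_row]
    print(missing)  # → []
    ```

    Any names left in `missing` would be expected columns absent from the row.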

    This verifies that each item you provide does exist, but it doesn't verify that there are no extra columns. For that we need another test. Since JSON key-value pairs always have a : between the key and the value, we can count the pairs by splitting the stringified row on : and subtracting 1 from the length of the resulting array.

    @sub(  
      length(  
        split(  
          string(activity('lookup').output.firstRow)  
        , ':')  
      )  
    ,1)  
    
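    The counting trick can be sketched in Python with a hypothetical stringified row. Note the caveat: it over-counts if any key or value itself contains a colon.

    ```python
    # What string(activity('lookup').output.firstRow) might return (hypothetical).
    row_text = '{"col1": "value1", "col2": "value2"}'

    # Mirror of @sub(length(split(string(firstRow), ':')), 1):
    # splitting on ':' yields one more piece than there are colons.
    count_of_lookup_columns = len(row_text.split(':')) - 1
    print(count_of_lookup_columns)  # → 2
    ```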

    Of course we want to compare that count to the length of your list of expected columns, so wrap it like this (where count_of_lookup_columns stands for the expression above):

    @equals( length(pipeline().parameters.list_to_compare) , count_of_lookup_columns )  
    
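    Putting the two pieces together, a Python sketch of the final comparison (again with hypothetical data in place of the parameter and the Lookup output):

    ```python
    # Stand-in for the list_to_compare pipeline parameter.
    expected_columns = ["col1", "col2"]

    # Stand-in for string(activity('lookup').output.firstRow).
    row_text = '{"col1": "value1", "col2": "value2"}'
    count_of_lookup_columns = len(row_text.split(':')) - 1

    # Mirror of @equals(length(pipeline().parameters.list_to_compare),
    #                   count_of_lookup_columns)
    schema_matches = len(expected_columns) == count_of_lookup_columns
    print(schema_matches)  # → True
    ```

    Combined with the per-name `contains` check, a True here means no columns are missing and none are extra.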

    Please do let me know if you have any queries.

    Thanks
    Martin


