Azure Data Factory: Using data from Script output in subsequent activities

ellishar 10 Reputation points
2023-03-09T18:18:07.3533333+00:00

I have a Data Facrtory pipeline containing two script activities that each return a single column of data (a list of columns for a specified table - the tables are in databases on different unlinked servers so I cannot do this within the database.)

I want to compare the two outputs and find the intersect. (i.e. the set of columns common to both tables), and then use that set to select columns from the tables for further comparison.

Can I do this in ADF using script activites?

The first script is as follows:

SELECT a.[name] colName FROM sys.columns a
		WHERE a.object_id = OBJECT_ID(N'schema.table1')

The second script is identical but points to a different table ('table2')

I have been trying to use a third script to get the column intersect like so:


@{activity('getColNamesA').output.resultSets[0].rows}

INTERSECT

@{activity('getColNamesB').output.resultSets[0].rows}

But that does not work. How do I get the data from the JSON output and make use of it in a subsequent script?

Azure Data Factory
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
10,198 questions
{count} votes

1 answer

Sort by: Most helpful
  1. KranthiPakala-MSFT 46,442 Reputation points Microsoft Employee
    2023-03-11T16:04:29.02+00:00

    Hi @ellishar ,

    Welcome to Microsoft Q&A forum and thanks for reaching out here.

    You can use the @intersection() function in Azure Data Factory pipeline expression to compare two collections and get the intersect values.

    Here is the syntax for the "intersection" function:

    @intersection([<collection1>], [<collection2>], ...)

    The intersection() function returns a collection that has only the common items across the specified collections.

    For your requirement what you can do is, create a array type pipeline variable and after Script activity 1 and script activity 2, then have a subsequent set variable activity in which you will find the intersection values and store them for further processing.

    Below is how the flow looks like:

    User's image

    Below are the outputs of my two script activities which will return the column names of two different tables.
    User's image

    In the set variable activity, I used below expression to find the intersection values of both the collections (script activity outputs)

    @intersection(activity('getColNamesFromEmployeeTable').output.resultSets[0].rows,activity('getColNamesFromEmployeeSalaryTable').output.resultSets[0].rows)
    

    Below is the output of the above expression:

    {
        "name": "varIntersecValues",
        "value": [
            {
                "colName": "EMPLOYEEID"
            },
            {
                "colName": "LASTNAME"
            },
            {
                "colName": "FIRSTNAME"
            }
        ]
    }
    

    Hope this helps.


    Please don’t forget to Accept Answer and Yes for "was this answer helpful" wherever the information provided helps you, this can be beneficial to other community members.

    1 person found this answer helpful.