Breyta

Deila með


How to use parameters, expressions and functions in Azure Data Factory

APPLIES TO: Azure Data Factory Azure Synapse Analytics

Tip

Try out Data Factory in Microsoft Fabric, an all-in-one analytics solution for enterprises. Microsoft Fabric covers everything from data movement to data science, real-time analytics, business intelligence, and reporting. Learn how to start a new trial for free!

In this document, we will primarily focus on learning fundamental concepts with various examples to explore the ability to create parameterized data pipelines within Azure Data Factory. Parameterization and dynamic expressions are such notable additions to ADF because they can save a tremendous amount of time and allow for a much more flexible Extract, Transform, Load (ETL) or Extract, Load, Transform (ELT) solution, which will dramatically reduce the cost of solution maintenance and speed up the implementation of new features into existing pipelines. These gains are because parameterization minimizes the amount of hard coding and increases the number of reusable objects and processes in a solution.

Azure Data Factory UI and parameters

If you are new to Azure Data Factory parameter usage in ADF user interface, please review Data Factory UI for linked services with parameters and Data Factory UI for metadata driven pipeline with parameters for a visual explanation.

Parameter and expression concepts

You can use parameters to pass external values into pipelines, datasets, linked services, and data flows. Once the parameter has been passed into the resource, it cannot be changed. By parameterizing resources, you can reuse them with different values each time. Parameters can be used individually or as a part of expressions. JSON values in the definition can be literal or expressions that are evaluated at runtime.

For example:

"name": "value"

or

"name": "@pipeline().parameters.password"

Expressions can appear anywhere in a JSON string value and always result in another JSON value. Here, password is a pipeline parameter in the expression. If a JSON value is an expression, the body of the expression is extracted by removing the at-sign (@). If a literal string is needed that starts with @, it must be escaped by using @@. The following examples show how expressions are evaluated.

JSON value Result
"parameters" The characters 'parameters' are returned.
"parameters[1]" The characters 'parameters[1]' are returned.
"@@" A 1 character string that contains '@' is returned.
" @" A 2 character string that contains ' @' is returned.

Expressions can also appear inside strings, using a feature called string interpolation where expressions are wrapped in @{ ... }. For example: "name" : "First Name: @{pipeline().parameters.firstName} Last Name: @{pipeline().parameters.lastName}"

Using string interpolation, the result is always a string. Say I have defined myNumber as 42 and myString as foo:

JSON value Result
"@pipeline().parameters.myString" Returns foo as a string.
"@{pipeline().parameters.myString}" Returns foo as a string.
"@pipeline().parameters.myNumber" Returns 42 as a number.
"@{pipeline().parameters.myNumber}" Returns 42 as a string.
"Answer is: @{pipeline().parameters.myNumber}" Returns the string Answer is: 42.
"@concat('Answer is: ', string(pipeline().parameters.myNumber))" Returns the string Answer is: 42
"Answer is: @@{pipeline().parameters.myNumber}" Returns the string Answer is: @{pipeline().parameters.myNumber}.

Examples of using parameters in expressions

Complex expression example

The below example shows a complex example that references a deep sub-field of activity output. To reference a pipeline parameter that evaluates to a sub-field, use [] syntax instead of dot(.) operator (as in case of subfield1 and subfield2)

@activity('*activityName*').output.*subfield1*.*subfield2*[pipeline().parameters.*subfield3*].*subfield4*

Dynamic content editor

Dynamic content editor automatically escapes characters in your content when you finish editing. For example, the following content in content editor is a string interpolation with two expression functions.

{ 
  "type": "@{if(equals(1, 2), 'Blob', 'Table' )}",
  "name": "@{toUpper('myData')}"
}

Dynamic content editor converts above content to expression "{ \n \"type\": \"@{if(equals(1, 2), 'Blob', 'Table' )}\",\n \"name\": \"@{toUpper('myData')}\"\n}". The result of this expression is a JSON format string showed below.

{
  "type": "Table",
  "name": "MYDATA"
}

A dataset with parameters

In the following example, the BlobDataset takes a parameter named path. Its value is used to set a value for the folderPath property by using the expression: dataset().path.

{
    "name": "BlobDataset",
    "properties": {
        "type": "AzureBlob",
        "typeProperties": {
            "folderPath": "@dataset().path"
        },
        "linkedServiceName": {
            "referenceName": "AzureStorageLinkedService",
            "type": "LinkedServiceReference"
        },
        "parameters": {
            "path": {
                "type": "String"
            }
        }
    }
}

A pipeline with parameters

In the following example, the pipeline takes inputPath and outputPath parameters. The path for the parameterized blob dataset is set by using values of these parameters. The syntax used here is: pipeline().parameters.parametername.

{
    "name": "Adfv2QuickStartPipeline",
    "properties": {
        "activities": [
            {
                "name": "CopyFromBlobToBlob",
                "type": "Copy",
                "inputs": [
                    {
                        "referenceName": "BlobDataset",
                        "parameters": {
                            "path": "@pipeline().parameters.inputPath"
                        },
                        "type": "DatasetReference"
                    }
                ],
                "outputs": [
                    {
                        "referenceName": "BlobDataset",
                        "parameters": {
                            "path": "@pipeline().parameters.outputPath"
                        },
                        "type": "DatasetReference"
                    }
                ],
                "typeProperties": {
                    "source": {
                        "type": "BlobSource"
                    },
                    "sink": {
                        "type": "BlobSink"
                    }
                }
            }
        ],
        "parameters": {
            "inputPath": {
                "type": "String"
            },
            "outputPath": {
                "type": "String"
            }
        }
    }
}

Calling functions within expressions

You can call functions within expressions. The following sections provide information about the functions that can be used in an expression.

String functions

To work with strings, you can use these string functions and also some collection functions. String functions work only on strings.

String function Task
concat Combine two or more strings, and return the combined string.
endsWith Check whether a string ends with the specified substring.
guid Generate a globally unique identifier (GUID) as a string.
indexOf Return the starting position for a substring.
lastIndexOf Return the starting position for the last occurrence of a substring.
replace Replace a substring with the specified string, and return the updated string.
split Return an array that contains substrings, separated by commas, from a larger string based on a specified delimiter character in the original string.
startsWith Check whether a string starts with a specific substring.
substring Return characters from a string, starting from the specified position.
toLower Return a string in lowercase format.
toUpper Return a string in uppercase format.
trim Remove leading and trailing whitespace from a string, and return the updated string.

Collection functions

To work with collections, generally arrays, strings, and sometimes, dictionaries, you can use these collection functions.

Collection function Task
contains Check whether a collection has a specific item.
empty Check whether a collection is empty.
first Return the first item from a collection.
intersection Return a collection that has only the common items across the specified collections.
join Return a string that has all the items from an array, separated by the specified character.
last Return the last item from a collection.
length Return the number of items in a string or array.
skip Remove items from the front of a collection, and return all the other items.
take Return items from the front of a collection.
union Return a collection that has all the items from the specified collections.

Logical functions

These functions are useful inside conditions, they can be used to evaluate any type of logic.

Logical comparison function Task
and Check whether all expressions are true.
equals Check whether both values are equivalent.
greater Check whether the first value is greater than the second value.
greaterOrEquals Check whether the first value is greater than or equal to the second value.
if Check whether an expression is true or false. Based on the result, return a specified value.
less Check whether the first value is less than the second value.
lessOrEquals Check whether the first value is less than or equal to the second value.
not Check whether an expression is false.
or Check whether at least one expression is true.

Conversion functions

These functions are used to convert between each of the native types in the language:

  • string
  • integer
  • float
  • boolean
  • arrays
  • dictionaries
Conversion function Task
array Return an array from a single specified input. For multiple inputs, see createArray.
base64 Return the base64-encoded version for a string.
base64ToBinary Return the binary version for a base64-encoded string.
base64ToString Return the string version for a base64-encoded string.
binary Return the binary version for an input value.
bool Return the Boolean version for an input value.
coalesce Return the first non-null value from one or more parameters.
createArray Return an array from multiple inputs.
dataUri Return the data URI for an input value.
dataUriToBinary Return the binary version for a data URI.
dataUriToString Return the string version for a data URI.
decodeBase64 Return the string version for a base64-encoded string.
decodeDataUri Return the binary version for a data URI.
decodeUriComponent Return a string that replaces escape characters with decoded versions.
encodeUriComponent Return a string that replaces URL-unsafe characters with escape characters.
float Return a floating point number for an input value.
int Return the integer version for a string.
json Return the JavaScript Object Notation (JSON) type value or object for a string or XML.
string Return the string version for an input value.
uriComponent Return the URI-encoded version for an input value by replacing URL-unsafe characters with escape characters.
uriComponentToBinary Return the binary version for a URI-encoded string.
uriComponentToString Return the string version for a URI-encoded string.
xml Return the XML version for a string.
xpath Check XML for nodes or values that match an XPath (XML Path Language) expression, and return the matching nodes or values.

Math functions

These functions can be used for either types of numbers: integers and floats.

Math function Task
add Return the result from adding two numbers.
div Return the result from dividing two numbers.
max Return the highest value from a set of numbers or an array.
min Return the lowest value from a set of numbers or an array.
mod Return the remainder from dividing two numbers.
mul Return the product from multiplying two numbers.
rand Return a random integer from a specified range.
range Return an integer array that starts from a specified integer.
sub Return the result from subtracting the second number from the first number.

Date functions

Date or time function Task
addDays Add a number of days to a timestamp.
addHours Add a number of hours to a timestamp.
addMinutes Add a number of minutes to a timestamp.
addSeconds Add a number of seconds to a timestamp.
addToTime Add a number of time units to a timestamp. See also getFutureTime.
convertFromUtc Convert a timestamp from Universal Time Coordinated (UTC) to the target time zone.
convertTimeZone Convert a timestamp from the source time zone to the target time zone.
convertToUtc Convert a timestamp from the source time zone to Universal Time Coordinated (UTC).
dayOfMonth Return the day of the month component from a timestamp.
dayOfWeek Return the day of the week component from a timestamp.
dayOfYear Return the day of the year component from a timestamp.
formatDateTime Return the timestamp as a string in optional format.
getFutureTime Return the current timestamp plus the specified time units. See also addToTime.
getPastTime Return the current timestamp minus the specified time units. See also subtractFromTime.
startOfDay Return the start of the day for a timestamp.
startOfHour Return the start of the hour for a timestamp.
startOfMonth Return the start of the month for a timestamp.
subtractFromTime Subtract a number of time units from a timestamp. See also getPastTime.
ticks Return the ticks property value for a specified timestamp.
utcNow Return the current timestamp as a string.

Detailed examples for practice

Detailed Azure Data Factory copy pipeline with parameters

This Azure Data Factory copy pipeline parameter passing tutorial walks you through how to pass parameters between a pipeline and activity as well as between the activities.

Detailed Mapping data flow pipeline with parameters

Please follow Mapping data flow with parameters for comprehensive example on how to use parameters in data flow.

Detailed Metadata driven pipeline with parameters

Please follow Metadata driven pipeline with parameters to learn more about how to use parameters to design metadata driven pipelines. This is a popular use case for parameters.

For a list of system variables you can use in expressions, see System variables.