Aggregate functions in mapping data flow
APPLIES TO: Azure Data Factory Azure Synapse Analytics
Tip
Try out Data Factory in Microsoft Fabric, an all-in-one analytics solution for enterprises. Microsoft Fabric covers everything from data movement to data science, real-time analytics, business intelligence, and reporting. Learn how to start a new trial for free!
Data flows are available both in Azure Data Factory and Azure Synapse Pipelines. This article applies to mapping data flows. If you are new to transformations, please refer to the introductory article Transform data using a mapping data flow.
The following articles provide details about aggregate functions supported by Azure Data Factory and Azure Synapse Analytics in mapping data flows.
Aggregate function list
The following functions are only available in aggregate, pivot, unpivot, and window transformations.
Aggregate function | Task |
---|---|
approxDistinctCount | Gets the approximate aggregate count of distinct values for a column. The optional second parameter is to control the estimation error. |
avg | Gets the average of values of a column. |
avgIf | Based on a criteria gets the average of values of a column. |
collect | Collects all values of the expression in the aggregated group into an array. Structures can be collected and transformed to alternate structures during this process. The number of items will be equal to the number of rows in that group and can contain null values. The number of collected items should be small. |
collectUnique | Collects all values of the expression in the aggregated group into a unique array. Structures can be collected and transformed to alternate structures during this process.The number of items will be smaller or equal to the number of rows in that group and can contain null values. The number of collected items should be small. |
count | Gets the aggregate count of values. If the optional column(s) is specified, it ignores NULL values in the count. |
countAll | Gets the aggregate count of values including NULLs. |
countDistinct | Gets the aggregate count of distinct values of a set of columns. |
countAllDistinct | Gets the aggregate count of distinct values of a set of columns including NULLs. |
countIf | Based on a criteria gets the aggregate count of values. If the optional column is specified, it ignores NULL values in the count. |
covariancePopulation | Gets the population covariance between two columns. |
covariancePopulationIf | Based on a criteria, gets the population covariance of two columns. |
covarianceSample | Gets the sample covariance of two columns. |
covarianceSampleIf | Based on a criteria, gets the sample covariance of two columns. |
first | Gets the first value of a column group. If the second parameter ignoreNulls is omitted, it is assumed false. |
isDistinct | Finds if a column or set of columns is distinct. It does not count null as a distinct value |
kurtosis | Gets the kurtosis of a column. |
kurtosisIf | Based on a criteria, gets the kurtosis of a column. |
last | Gets the last value of a column group. If the second parameter ignoreNulls is omitted, it is assumed false. |
max | Gets the maximum value of a column. |
maxIf | Based on a criteria, gets the maximum value of a column. |
mean | Gets the mean of values of a column. Same as AVG. |
meanIf | Based on a criteria gets the mean of values of a column. Same as avgIf. |
min | Gets the minimum value of a column. |
minIf | Based on a criteria, gets the minimum value of a column. |
skewness | Gets the skewness of a column. |
skewnessIf | Based on a criteria, gets the skewness of a column. |
stddev | Gets the standard deviation of a column. |
stddevIf | Based on a criteria, gets the standard deviation of a column. |
stddevPopulation | Gets the population standard deviation of a column. |
stddevPopulationIf | Based on a criteria, gets the population standard deviation of a column. |
stddevSample | Gets the sample standard deviation of a column. |
stddevSampleIf | Based on a criteria, gets the sample standard deviation of a column. |
sum | Gets the aggregate sum of a numeric column. |
sumDistinct | Gets the aggregate sum of distinct values of a numeric column. |
sumDistinctIf | Based on criteria gets the aggregate sum of a numeric column. The condition can be based on any column. |
sumIf | Based on criteria gets the aggregate sum of a numeric column. The condition can be based on any column. |
topN | Gets the top N values for this column. |
variance | Gets the variance of a column. |
varianceIf | Based on a criteria, gets the variance of a column. |
variancePopulation | Gets the population variance of a column. |
variancePopulationIf | Based on a criteria, gets the population variance of a column. |
varianceSample | Gets the unbiased variance of a column. |
varianceSampleIf | Based on a criteria, gets the unbiased variance of a column. |
Related content
- List of all array functions.
- List of all cached lookup functions.
- List of all conversion functions.
- List of all date and time functions.
- List of all expression functions.
- List of all map functions.
- List of all metafunctions.
- List of all window functions.
- Usage details of all data transformation expressions.
- Learn how to use Expression Builder.