Edit

Share via


Aggregate functions in mapping data flows

APPLIES TO: Azure Data Factory Azure Synapse Analytics

Tip

Try out Data Factory in Microsoft Fabric, an all-in-one analytics solution for enterprises. Microsoft Fabric covers everything from data movement to data science, real-time analytics, business intelligence, and reporting. Learn how to start a new trial for free!

Data flows are available in both Azure Data Factory pipelines and Azure Synapse Analytics pipelines. This article applies to mapping data flows. If you're new to transformations, refer to the introductory article Transform data using mapping data flows.

This article provides details about aggregate functions supported by Azure Data Factory and Azure Synapse Analytics in mapping data flows.

Aggregate function list

The following functions are available only in aggregate, pivot, unpivot, and window transformations.

Aggregate function Task
approxDistinctCount Gets the approximate aggregate count of distinct values for a column. The optional second parameter is to control the estimation error.
avg Gets the average of values of a column.
avgIf Gets the average of values of a column, based on criteria.
collect Collects all values of the expression in the aggregated group into an array. During this process, you can collect and transform structures to alternate structures. The number of items is equal to the number of rows in that group and can contain null values. The number of collected items should be small.
collectUnique Collects all values of the expression in the aggregated group into a unique array. During this process, you can collect and transform structures to alternate structures. The number of items is smaller or equal to the number of rows in that group and can contain null values. The number of collected items should be small.
count Gets the aggregate count of values. If the optional columns are specified, it ignores NULL values in the count.
countAll Gets the aggregate count of values, including NULL values.
countDistinct Gets the aggregate count of distinct values of a set of columns.
countAllDistinct Gets the aggregate count of distinct values of a set of columns, including NULL values.
countIf Gets the aggregate count of values, based on criteria. If the optional column is specified, it ignores NULL values in the count.
covariancePopulation Gets the population covariance between two columns.
covariancePopulationIf Gets the population covariance of two columns, based on criteria.
covarianceSample Gets the sample covariance of two columns.
covarianceSampleIf Gets the sample covariance of two columns, based on criteria.
first Gets the first value of a column group. If the second parameter ignoreNulls is omitted, false is assumed.
isDistinct Finds if a column or set of columns is distinct. It doesn't count null as a distinct value.
kurtosis Gets the kurtosis of a column.
kurtosisIf Gets the kurtosis of a column, based on criteria.
last Gets the last value of a column group. If the second parameter ignoreNulls is omitted, false is assumed.
max Gets the maximum value of a column.
maxIf Gets the maximum value of a column, based on criteria.
mean Gets the mean of values of a column. Same as AVG.
meanIf Gets the mean of values of a column, based on criteria. Same as avgIf.
min Gets the minimum value of a column.
minIf Gets the minimum value of a column, based on criteria.
skewness Gets the skewness of a column.
skewnessIf Gets the skewness of a column, based on criteria.
stddev Gets the standard deviation of a column.
stddevIf Gets the standard deviation of a column, based on criteria.
stddevPopulation Gets the population standard deviation of a column.
stddevPopulationIf Gets the population standard deviation of a column, based on criteria.
stddevSample Gets the sample standard deviation of a column.
stddevSampleIf Gets the sample standard deviation of a column, based on criteria.
sum Gets the aggregate sum of a numeric column.
sumDistinct Gets the aggregate sum of distinct values of a numeric column.
sumDistinctIf Gets the aggregate sum of a numeric column, based on criteria. The condition can be based on any column.
sumIf Gets the aggregate sum of a numeric column, based on criteria. The condition can be based on any column.
topN Gets the top N values for this column.
variance Gets the variance of a column.
varianceIf Gets the variance of a column, based on criteria.
variancePopulation Gets the population variance of a column.
variancePopulationIf Gets the population variance of a column, based on criteria.
varianceSample Gets the unbiased variance of a column.
varianceSampleIf Gets the unbiased variance of a column, based on criteria.