DataFlow.Summarize Method

Definition

Summarizes data by running aggregate functions over specific columns.

public Microsoft.DataPrep.Common.DataFlow Summarize(System.Collections.Generic.List<Microsoft.DataPrep.Common.SummaryColumnsValue> summaryColumns = null, System.Collections.Generic.List<string> groupByColumns = null, bool joinBack = false, string joinBackColumnsPrefix = null);
member this.Summarize : System.Collections.Generic.List<Microsoft.DataPrep.Common.SummaryColumnsValue> * System.Collections.Generic.List<string> * bool * string -> Microsoft.DataPrep.Common.DataFlow
Public Function Summarize (Optional summaryColumns As List(Of SummaryColumnsValue) = Nothing, Optional groupByColumns As List(Of String) = Nothing, Optional joinBack As Boolean = False, Optional joinBackColumnsPrefix As String = Nothing) As DataFlow

Parameters

summaryColumns
List<SummaryColumnsValue>

List of SummaryColumnsValue, where each value defines the column to summarize, the summary function to use, and the name of the resulting column to add.

groupByColumns
List<String>

Columns to group by.

joinBack
Boolean

Whether to append the subtotals to the current data (true) or replace the current data with them (false). Defaults to false.

joinBackColumnsPrefix
String

Prefix to use for subtotal columns when appending them to current data.

Returns

DataFlow

A DataFlow containing the summarized data.

Remarks

The aggregate functions are independent, so the same column can be aggregated multiple times, provided that each resulting column is given a unique name. Aggregations can be grouped, in which case one record is returned per group, or ungrouped, in which case one record is returned for the whole dataset. The results of the aggregations can either replace the current dataset or augment it by appending the result columns.
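
The three behaviors described above (ungrouped, grouped, and joined back) can be illustrated with a minimal sketch in plain LINQ. This sketch does not call the DataPrep API; it only mimics the semantics on in-memory data. The `SummarizeSketch` class, the sample rows, and the `Sub_` column prefix are assumptions made for illustration, and the way the real method names prefixed columns may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SummarizeSketch
{
    // Hypothetical sample data; these are plain tuples, not DataPrep types.
    public static List<(string Region, double Amount)> SampleRows() =>
        new List<(string Region, double Amount)>
        {
            ("East", 10), ("East", 30), ("West", 5),
        };

    // Ungrouped: one record for the whole dataset. The same column (Amount)
    // is aggregated twice, so each result gets its own unique name.
    public static (double TotalAmount, double MaxAmount) Ungrouped(
        List<(string Region, double Amount)> rows) =>
        (rows.Sum(r => r.Amount), rows.Max(r => r.Amount));

    // Grouped: one record per distinct Region value.
    public static List<(string Region, double TotalAmount, double MaxAmount)> Grouped(
        List<(string Region, double Amount)> rows) =>
        rows.GroupBy(r => r.Region)
            .Select(g => (Region: g.Key,
                          TotalAmount: g.Sum(r => r.Amount),
                          MaxAmount: g.Max(r => r.Amount)))
            .ToList();

    // Join-back: every original row is kept and the aggregates of its group
    // are appended as extra columns. "Sub_" is an assumed, illustrative prefix.
    public static List<(string Region, double Amount, double Sub_TotalAmount, double Sub_MaxAmount)> JoinBack(
        List<(string Region, double Amount)> rows) =>
        rows.Join(Grouped(rows),
                  r => r.Region,
                  s => s.Region,
                  (r, s) => (Region: r.Region,
                             Amount: r.Amount,
                             Sub_TotalAmount: s.TotalAmount,
                             Sub_MaxAmount: s.MaxAmount))
            .ToList();

    public static void Main()
    {
        var rows = SampleRows();

        Console.WriteLine(Ungrouped(rows));                       // a single summary record
        foreach (var g in Grouped(rows)) Console.WriteLine(g);    // one record per region
        foreach (var j in JoinBack(rows)) Console.WriteLine(j);   // original rows plus subtotals
    }
}
```

In this sketch the grouped result has two records (one per region), while the joined-back result keeps all three original rows. This mirrors the difference between replacing the current data with the subtotals and appending the subtotals to it.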

Applies to