GROUP BY (Transact-SQL)
Groups a selected set of rows into a set of summary rows by the values of one or more columns or expressions. One row is returned for each group. Aggregate functions in the SELECT clause <select> list provide information about each group instead of individual rows.
The GROUP BY clause has an ISO-compliant syntax and a non-ISO-compliant syntax. Only one syntax style can be used in a single SELECT statement. Use the ISO compliant syntax for all new work. The non-ISO compliant syntax is provided for backward compatibility.
In this topic, a GROUP BY clause can be described as general or simple:
A general GROUP BY clause includes GROUPING SETS, CUBE, ROLLUP, WITH CUBE, or WITH ROLLUP.
A simple GROUP BY clause does not include GROUPING SETS, CUBE, ROLLUP, WITH CUBE, or WITH ROLLUP. GROUP BY (), grand total, is considered a simple GROUP BY.
Syntax
ISO-Compliant Syntax
GROUP BY <group by spec>
<group by spec> ::=
<group by item> [ ,...n ]
<group by item> ::=
<simple group by item>
| <rollup spec>
| <cube spec>
| <grouping sets spec>
| <grand total>
<simple group by item> ::=
<column_expression>
<rollup spec> ::=
ROLLUP ( <composite element list> )<cube spec> ::=
CUBE ( <composite element list> )<composite element list> ::=
<composite element> [ ,...n ]
<composite element> ::=
<simple group by item>
| ( <simple group by item list> )<simple group by item list> ::=
<simple group by item> [ ,...n ]
<grouping sets spec> ::=
GROUPING SETS ( <grouping set list> )<grouping set list> ::=
<grouping set> [ ,...n ]
<grouping set> ::=
<grand total>
| <grouping set item>
| ( <grouping set item list> )<empty group> ::=()<grouping set item> ::=
<simple group by item>
| <rollup spec>
| <cube spec>
<grouping set item list> ::=
<grouping set item> [ ,...n ]
Non-ISO-Compliant Syntax
[ GROUP BY [ ALL ] group_by_expression [ ,...n ]
[ WITH { CUBE | ROLLUP } ]
]
Arguments
<column_expression>
- Is the expression on which the grouping operation is performed.
ROLLUP ( )
Generates the simple GROUP BY aggregate rows, plus subtotal or super-aggregate rows, and also a grand total row.The number of groupings that is returned equals the number of expressions in the <composite element list> plus one. For example, consider the following statement.
SELECT a, b, c, SUM ( <expression> ) FROM T GROUP BY ROLLUP (a,b,c)
One row with a subtotal is generated for each unique combination of values of (a, b, c), (a, b), and (a). A grand total row is also calculated.
Columns are rolled up from right to left. The column order affects the output groupings of ROLLUP and can affect the number of rows in the result set.
CUBE ( )
Generates simple GROUP BY aggregate rows, the ROLLUP super-aggregate rows, and cross-tabulation rows.CUBE outputs a grouping for all permutations of expressions in the <composite element list>.
The number of groupings that is generated equals (2n), where n = the number of expressions in the <composite element list>. For example, consider the following statement.
SELECT a, b, c, SUM (<expression>) FROM T GROUP BY CUBE (a,b,c)
One row is produced for each unique combination of values of (a, b, c), (a, b), (a, c), (b, c), (a), (b) and (c) with a subtotal for each row and a grand total row.
Column order does not affect the output of CUBE.
GROUPING SETS ( )
Specifies multiple groupings of data in one query. Only the specified groups are aggregated instead of the full set of aggregations that are generated by CUBE or ROLLUP. The results are the equivalent of UNION ALL of the specified groups. GROUPING SETS can contain a single element or a list of elements. GROUPING SETS can specify groupings equivalent to those returned by ROLLUP or CUBE. For examples, see GROUPING SETS Equivalents. The <grouping set item list> can contain ROLLUP or CUBE.()
The empty group generates a total.
Non-ISO Compliant Syntax
ALL
This feature will be removed in a future version of Microsoft SQL Server. Avoid using this feature in new development work, and plan to modify applications that currently use this feature. Includes all groups and result sets, even those that do not have any rows that meet the search condition specified in the WHERE clause. When ALL is specified, null values are returned for the summary columns of groups that do not meet the search condition. You cannot specify ALL with the CUBE or ROLLUP operators.GROUP BY ALL is not supported in queries that access remote tables if there is also a WHERE clause in the query. GROUP BY ALL will fail on columns that have the FILESTREAM attribute.
group_by_expression
Is an expression on which grouping is performed. group_by_expression is also known as a grouping column. group_by expression can be a column or a non-aggregate expression that references a column returned by the FROM clause. A column alias that is defined in the SELECT list cannot be used to specify a grouping column.Note
Columns of type text, ntext, and image cannot be used in group_by_expression.
For GROUP BY clauses that do not contain CUBE or ROLLUP, the number of group_by_expression items is limited by the GROUP BY column sizes, the aggregated columns, and the aggregate values involved in the query. This limit originates from the limit of 8,060 bytes on the intermediate worktable that is needed to hold intermediate query results. A maximum of 12 grouping expressions is permitted when CUBE or ROLLUP is specified.
xml data type methods cannot be specified directly in group_by_expression. Instead, refer to a user-defined function that uses xml data type methods inside it, or refer to a computed column that uses them.
WITH CUBE
This feature will be removed in a future version of Microsoft SQL Server. Avoid using this feature in new development work, and plan to modify applications that currently use this feature. Specifies that in addition to the usual rows provided by GROUP BY, summary rows are introduced into the result set. A GROUP BY summary row is returned for every possible combination of group and subgroup in the result set. Use the GROUPING function to determine whether null values in the result set are GROUP BY summary values.The number of summary rows in the result set is determined by the number of columns included in the GROUP BY clause. Because CUBE returns every possible combination of group and subgroup, the number of rows is the same, regardless of the order in which the grouping columns are specified.
WITH ROLLUP
This feature will be removed in a future version of Microsoft SQL Server. Avoid using this feature in new development work, and plan to modify applications that currently use this feature. Specifies that in addition to the usual rows provided by GROUP BY, summary rows are introduced into the result set. Groups are summarized in a hierarchical order, from the lowest level in the group to the highest. The group hierarchy is determined by the order in which the grouping columns are specified. Changing the order of the grouping columns can affect the number of rows produced in the result set.Important
Distinct aggregates, for example, AVG (DISTINCT column_name), COUNT (DISTINCT column_name), and SUM (DISTINCT column_name), are not supported when you use CUBE or ROLLUP. If these are used, the SQL Server Database Engine returns an error message and cancels the query.
Remarks
Expressions in the GROUP BY clause can contain columns of the tables, derived tables or views in the FROM clause. The columns are not required to appear in the SELECT clause <select> list.
Each table or view column in any nonaggregate expression in the <select> list must be included in the GROUP BY list:
The following statements are allowed:
SELECT ColumnA, ColumnB FROM T GROUP BY ColumnA, ColumnB SELECT ColumnA + ColumnB FROM T GROUP BY ColumnA, ColumnB SELECT ColumnA + ColumnB FROM T GROUP BY ColumnA + ColumnB SELECT ColumnA + ColumnB + constant FROM T GROUP BY ColumnA, ColumnB
The following statements are not allowed:
SELECT ColumnA, ColumnB FROM T GROUP BY ColumnA + ColumnB SELECT ColumnA + constant + ColumnB FROM T GROUP BY ColumnA + ColumnB.
If aggregate functions are included in the SELECT clause <select list>, GROUP BY calculates a summary value for each group. These are known as vector aggregates.
Rows that do not meet the conditions in the WHERE clause are removed before any grouping operation is performed.
The HAVING clause is used with the GROUP BY clause to filter groups in the result set.
The GROUP BY clause does not order the result set. Use the ORDER BY clause to order the result set.
If a grouping column contains null values, all null values are considered equal, and they are put into a single group.
You cannot use GROUP BY with an alias to replace a column name in the AS clause unless the alias replaces a column name in a derived table in the FROM clause.
Duplicate grouping sets in a GROUPING SETS list are not eliminated. Duplicate grouping sets can be generated by specifying a column expression more than one time or by listing a column expression also generated by a CUBE or ROLLUP in the GROUPING SETS list.
Distinct aggregates, for example, AVG (DISTINCT column_name), COUNT (DISTINCT column_name), and SUM (DISTINCT column_name) are supported with ROLLUP, CUBE, and GROUPING SETS.
ROLLUP, CUBE, and GROUPING SETS cannot be specified in an indexed view.
GROUP BY or HAVING cannot be used directly on columns of ntext, text, or image. These columns can be used as arguments in functions that return a value of another data type, such as SUBSTRING() and CAST().
xml data type methods cannot be specified directly in a <column_expression>. Instead, refer to a user-defined function that uses xml data type methods inside it, or refer to a computed column that uses them.
GROUP BY Limitations for GROUPING SETS, ROLLUP, and CUBE
Syntax Limitations
GROUPING SETS are not allowed in the GROUP BY clause unless they are part of a GROUPING SETS list. For example, GROUP BY C1, (C2,..., Cn) is not allowed but GROUP BY GROUPING SETS (C1, (C2, ..., Cn)) is allowed.
GROUPING SETS are not allowed inside GROUPING SETS. For example, GROUP BY GROUPING SETS (C1, GROUPING SETS (C2, C3)) is not allowed.
The non-ISO ALL, WITH CUBE, and WITH ROLLUP keywords are not allowed in a GROUP BY clause with the ROLLUP, CUBE or GROUPING SETS keywords.
Size Limitations
For simple GROUP BY, there is no limit on the number of expressions.
For a GROUP BY clause that uses ROLLUP, CUBE, or GROUPING SETS, the maximum number of expressions is 32, and the maximum number of grouping sets that can be generated is 4096 (212). The following examples fail because the GROUP BY clause is too complex:
The following examples generate 8192 (213) grouping sets.
GROUP BY CUBE (a1, ..., a13) GROUP BY a1, ..., a13 WITH CUBE
The following example generates 4097 (212 + 1) grouping sets.
GROUP BY GROUPING SETS( CUBE(a1, ..., a12), b )
The following example also generates 4097 (212 + 1) grouping sets. Both CUBE () and the () grouping set produce a grand total row and duplicate grouping sets are not eliminated.
GROUP BY GROUPING SETS( CUBE(a1, ..., a12), ())
Support for ISO and ANSI SQL-2006 GROUP BY Features
SQL Server 2008 supports all GROUP BY features that are included in the SQL-2006 standard with the following syntax exceptions:
Grouping sets are not allowed in the GROUP BY clause unless they are part of an explicit GROUPING SETS list. For example, GROUP BY Column1, (Column2, ...ColumnN) is allowed in the standard but not in SQL Server. GROUP BY C1, GROUPING SETS ((Column2, ...ColumnN)) or GROUP BY Column1, Column2, ... ColumnN is allowed. These are semantically equivalent to the previous GROUP BY example. This is to avoid the possibility that GROUP BY Column1, (Column2, ...ColumnN) might be misinterpreted as GROUP BY C1, GROUPING SETS ((Column2, ...ColumnN)). This is not semantically equivalent.
Grouping sets are not allowed inside grouping sets. For example, GROUP BY GROUPING SETS (A1, A2,…An, GROUPING SETS (C1, C2, ...Cn)) is allowed in the SQL-2006 standard but not in SQL Server. SQL Server 2008 allows GROUP BY GROUPING SETS( A1, A2,...An, C1, C2, ...Cn ) or GROUP BY GROUPING SETS( (A1), (A2), ... (An), (C1), (C2), ... (Cn) ). These examples are semantically equivalent to the first GROUP BY example and have a clearer syntax.
GROUP BY [ALL/DISTINCT] is not allowed in a general GROUP BY clause or with the GROUPING SETS, ROLLUP, CUBE, WITH CUBE or WITH ROLLUP constructs. ALL is the default and is implicit.
Comparison of Supported GROUP BY Features
The following table describes the GROUP BY features that are supported based upon the version of SQL Server and the database compatibility level.
Feature |
SQL Server 2005 Integration Services |
SQL Server 2008 compatibility level 100 |
SQL Server 2008 compatibility level 90 or earlier |
---|---|---|---|
DISTINCT aggregates |
Not supported for WITH CUBE or WITH ROLLUP. |
Supported for WITH CUBE, WITH ROLLUP, GROUPING SETS, CUBE, or ROLLUP. |
Same as SQL Server 2008 compatibility level 100. |
User-defined function with CUBE or ROLLUP name in the GROUP BY clause |
User-defined function dbo.cube(arg1,...argN) or dbo.rollup(arg1,...argN) in the GROUP BY clause is allowed. For example:
|
User-defined function dbo.cube (arg1,...argN) or dbo.rollup(arg1,...argN) in the GROUP BY clause is not allowed. For example:
The following error message is returned: "Incorrect syntax near the keyword 'cube'|'rollup'." To avoid this problem, replace dbo.cube with [dbo].[cube] or dbo.rollup with [dbo].[rollup]. The following example is allowed:
|
User-defined function dbo.cube (arg1,...argN) or dbo.rollup(arg1,...argN) in the GROUP BY clause is allowed For example:
|
GROUPING SETS |
Not supported |
Supported |
Supported |
CUBE |
Not supported |
Supported |
Not supported |
ROLLUP |
Not supported |
Supported |
Not supported |
Grand total, such as GROUP BY () |
Not supported |
Supported |
Supported |
GROUPING_ID function |
Not supported |
Supported |
Supported |
GROUPING function |
Supported |
Supported |
Supported |
WITH CUBE |
Supported |
Supported |
Supported |
WITH ROLLUP |
Supported |
Supported |
Supported |
WITH CUBE or WITH ROLLUP "duplicate" grouping removal |
Supported |
Supported |
Supported |
Examples
For examples that use GROUPING SETS, ROLLUP, and CUBE, see Using GROUP BY with ROLLUP, CUBE, and GROUPING SETS.
A. Using a simple GROUP BY clause
The following example retrieves the total for each SalesOrderID from the SalesOrderDetail table.
USE AdventureWorks;
GO
SELECT SalesOrderID, SUM(LineTotal) AS SubTotal
FROM Sales.SalesOrderDetail sod
GROUP BY SalesOrderID
ORDER BY SalesOrderID;
B. Using a GROUP BY clause with multiple tables
The following example retrieves the number of employees for each City from the Address table joined to the EmployeeAddress table.
USE AdventureWorks;
GO
SELECT a.City, COUNT(ea.AddressID) EmployeeCount
FROM HumanResources.EmployeeAddress ea
INNER JOIN Person.Address a
ON ea.AddressID = a.AddressID
GROUP BY a.City
ORDER BY a.City;
C. Using a GROUP BY clause with an expression
The following example retrieves the total sales for each year by using the DATEPART function. The same expression must be present in both the SELECT list and GROUP BY clause.
USE AdventureWorks;
GO
SELECT DATEPART(yyyy,OrderDate) AS N'Year'
,SUM(TotalDue) AS N'Total Order Amount'
FROM Sales.SalesOrderHeader
GROUP BY DATEPART(yyyy,OrderDate)
ORDER BY DATEPART(yyyy,OrderDate);
D. Using a GROUP BY clause with a HAVING clause
The following example uses the HAVING clause to specify which of the groups generated in the GROUP BY clause should be included in the result set.
USE AdventureWorks;
GO
SELECT DATEPART(yyyy,OrderDate) AS N'Year'
,SUM(TotalDue) AS N'Total Order Amount'
FROM Sales.SalesOrderHeader
GROUP BY DATEPART(yyyy,OrderDate)
HAVING DATEPART(yyyy,OrderDate) >= N'2003'
ORDER BY DATEPART(yyyy,OrderDate);