Grouping Rows with GROUP BY

The GROUP BY clause is used to produce aggregate values for each row in the result set. When used without a GROUP BY clause, aggregate functions report only one aggregate value for a SELECT statement.

The following example finds the total of each sales order in the database.

USE AdventureWorks;
GO
SELECT SalesOrderID, SUM(LineTotal) AS SubTotal
FROM Sales.SalesOrderDetail sod
GROUP BY SalesOrderID
ORDER BY SalesOrderID ;
GO

The GROUP BY keywords are followed by a list of columns, known as the grouping columns. The GROUP BY clause restricts the rows of the result set; there is only one row for each distinct value in the grouping column or columns. Each result set row contains summary data related to the specific value in its grouping columns.

There are restrictions on the items that can be specified in the select list when a SELECT statement contains a GROUP BY. Items allowed in the select list are:

  • The grouping columns.
  • Expressions that return only one value for each value in the grouping columns, such as aggregate functions that have a column name as one of their parameters. These are known as vector aggregates.

For example, TableX contains:

ColumnA ColumnB ColumnC

-------

-------

-------

1

abc

5

1

def

4

1

ghi

9

2

jkl

8

2

mno

3

If ColumnA is the grouping column, there will be two rows in the result set, one summarizing the information for the value 1, and the other summarizing the information for value 2.

When ColumnA is the grouping column, the only way ColumnB or ColumnC can be referenced is if they are parameters in an aggregate function that can return a single value for each value in ColumnA. It is legal for the select list to include expressions such as MAX(ColumnB), SUM(ColumnC), or AVG(ColumnC):

SELECT ColumnA,
       MAX(ColumnB) AS MaxB,
       SUM(ColumnC) AS SumC
FROM TableX
GROUP BY ColumnA

This select returns two rows, one for each unique value in ColumnA:

ColumnA     MaxB SumC        
----------- ---- ----------- 
1           ghi  18          
2           mno  11          

(2 row(s) affected)

It is not legal, however, to have just the expression ColumnB in the select list:

SELECT ColumnA,
       ColumnB,
       SUM(ColumnC) AS SumC
FROM TableX
GROUP BY ColumnA

Because the GROUP BY can return only one row with a value of 1 in ColumnA, there is no way to return the three values of ColumnB (abc, def, and ghi) associated with the value 1 in ColumnA.

You cannot use GROUP BY or HAVING on ntext, text, image, or bit columns unless they are in a function that returns a value having another data type. Examples of such functions are SUBSTRING and CAST.

See Also

Other Resources

CAST and CONVERT (Transact-SQL)
SUBSTRING (Transact-SQL)
SELECT (Transact-SQL)

Help and Information

Getting SQL Server 2005 Assistance