SQL - Group By Clause



The SQL GROUP BY Clause

The SQL GROUP BY clause is used in conjunction with the SELECT statement to arrange identical data into groups. This clause follows the WHERE clause in a SELECT statement and precedes the ORDER BY and HAVING clauses (if they exist).

The main purpose of grouping the records of a table based on particular columns is to perform calculations on these groups. Therefore, The GROUP BY clause is typically used with aggregate functions such as SUM(), COUNT(), AVG(), MAX(), or MIN() etc.

For example, if you have a table named SALES_DATA containing the sales data with the columns YEAR, PRODUCT, and SALES. To calculate the total sales in an year, the GROUP BY clause can be used to group the records in this table based on the year and calculate the sum of sales in each group using the SUM() function.

Syntax

Following is the basic syntax of the SQL GROUP BY clause −

SELECT column_name(s)
FROM table_name
GROUP BY column_name(s);

Where, column_name(s) refers to the name of one or more columns in the table that we want to group the data by and the table_name refers to the name of the table that we want to retrieve data from.

GROUP BY Clause with Aggregate Functions

Typically, we group the record of a table to perform calculations on them. Therefore, the SQL GROUP BY clause is often used with the aggregate functions such as SUM(), AVG(), MIN(), MAX(), COUNT(), etc.

Example

Assume we have created a table named CUSTOMERS, which contains the personal details of customers including their name, age, address and salary, using the following query −

CREATE TABLE CUSTOMERS (
   ID INT NOT NULL,
   NAME VARCHAR (20) NOT NULL,
   AGE INT NOT NULL,
   ADDRESS CHAR (25),
   SALARY DECIMAL (18, 2),
   PRIMARY KEY (ID)
);

Now insert values into this table using the INSERT statement as follows −

INSERT INTO CUSTOMERS VALUES
(1, 'Ramesh', 32, 'Ahmedabad', 2000.00),
(2, 'Khilan', 25, 'Delhi', 1500.00),
(3, 'Kaushik', 23, 'Kota', 2000.00),
(4, 'Chaitali', 25, 'Mumbai', 6500.00),
(5, 'Hardik', 27, 'Bhopal', 8500.00),
(6, 'Komal', 22, 'Hyderabad', 4500.00),
(7, 'Muffy', 24, 'Indore', 10000.00);

The table created is as shown below −

ID NAME AGE ADDRESS SALARY
1 Ramesh 32 Ahmedabad 2000.00
2 Khilan 25 Delhi 1500.00
3 Kaushik 23 Kota 2000.00
4 Chaitali 25 Mumbai 6500.00
5 Hardik 27 Bhopal 8500.00
6 Komal 22 Hyderabad 4500.00
7 Muffy 24 Indore 10000.00

The following SQL query groups the CUSTOMERS table based on AGE and counts the number of records in each group −

SELECT AGE, COUNT(Name) FROM CUSTOMERS GROUP BY AGE;

Output

Following is the result produced −

AGE COUNT(Name)
32 1
25 2
23 1
27 1
22 1
24 1

Example

In the following query, we are finding the highest salary for each age −

SELECT AGE, MAX(salary) AS MAX_SALARY 
FROM CUSTOMERS GROUP BY AGE;

Output

Following is the output of the above query −

AGE MAX_SALARY
32 2000.00
25 6500.00
23 2000.00
27 8500.00
22 4500.00
24 10000.00

Similarly we can group the records of the CUSTOMERS table based on the AGE column and calculate the maximum salary, average and sum of the SALARY values in each group using the MIN(), AVG() and SUM() functions respectively.

GROUP BY Clause on Single Columns

When we use the GROUP BY clause with a single column, all the rows in the table that have the same value in that particular column will be merged into a single record.

Example

In the following example we are grouping the above created CUSTOMERS table by the ADDRESS column and calculating the average salary of the customer from each city −

SELECT ADDRESS, AVG(SALARY) as AVG_SALARY 
FROM CUSTOMERS GROUP BY ADDRESS;

Output

This would produce the following result −

ADDRESS AVG_SALARY
Ahmedabad 2000.000000
Delhi 1500.000000
Kota 2000.000000
Mumbai 6500.000000
Bhopal 8500.000000
Hyderabad 4500.000000
Indore 10000.000000

GROUP BY Clause with Multiple Columns

When we use the GROUP BY clause with multiple columns, all the rows in the table that have the same values in all of the specified columns will be merged into a single group.

Example

In the following query we are grouping the records of the CUSTOMERS table based on the columns ADDRESS and AGE and −

SELECT ADDRESS, AGE, SUM(SALARY) AS TOTAL_SALARY 
FROM CUSTOMERS GROUP BY ADDRESS, AGE;

Output

This would produce the following result −

ADDRESS AGE TOTAL_SALARY
Ahmedabad 32 2000.00
Delhi 25 1500.00
Kota 23 2000.00
Mumbai 25 6500.00
Bhopal 27 8500.00
Hyderabad 22 4500.00
Indore 24 10000.00

GROUP BY with ORDER BY Clause

We can use the ORDER BY clause with GROUP BY in SQL to sort the grouped data by one or more columns.

Syntax

Following is the syntax for using ORDER BY clause with GROUP BY clause in SQL −

SELECT column1, column2, ..., aggregate_function(columnX) AS alias
FROM table
GROUP BY column1, column2, ...
ORDER BY column1 [ASC | DESC], column2 [ASC | DESC], ...;

Example

In here, we are finding the highest salary for each age, sorted by high to low −

SELECT AGE, MIN(SALARY) AS MIN_SALARY 
FROM CUSTOMERS 
GROUP BY AGE ORDER BY MIN_SALARY DESC;

Output

Following is the result produced −

AGE MIN_SALARY
24 10000.00
27 8500.00
22 4500.00
32 2000.00
23 2000.00
25 1500.00

GROUP BY with HAVING Clause

We can also use the GROUP BY clause with the HAVING clause filter the grouped data in a table based on specific criteria.

Syntax

Following is the syntax for using ORDER BY clause with HAVING clause in SQL −

SELECT column1, column2, aggregate_function(column)
FROM table_name
GROUP BY column1, column2
HAVING condition;

Example

In the following query, we are grouping the customers by their age and calculating the minimum salary for each group. Using the HAVING clause we are filtering the groups where the age is greater than 24 −

SELECT ADDRESS, AGE, MIN(SALARY) AS MIN_SUM 
FROM CUSTOMERS 
GROUP BY ADDRESS, AGE HAVING AGE>24;

Output

The result produced is as follows −

ADDRESS AGE MIN_SUM
Ahmedabad 32 2000.00
Delhi 25 1500.00
Mumbai 25 6500.00
Bhopal 27 8500.00
Advertisements