GROUP BY & HAVING

1. Definition

GROUP BY is used to group rows that have the same values in specified columns into summary rows.

It is typically used with aggregation functions to compute results per group instead of the whole table.

Example use cases:

  • Count employees per department
  • Calculate average salary per department
  • Total sales per day
  • Total transactions per user

2. Basic Syntax

SELECT column_name, AGG_FUNCTION(column) FROM table_name GROUP BY column_name;

3. Example Table

Employee
---------
id
name
salary
dept_id

Sample data:

id name salary dept_id
1 A 50000 10
2 B 60000 10
3 C 55000 20
4 D 70000 20

4.GROUP BY Example

Find average salary per department.

SELECT dept_id, AVG(salary) FROM employee GROUP BY dept_id;

Result:

dept_id avg_salary
10 55000
20 62500

5. GROUP BY Execution Flow

SQL execution order (simplified):

FROM
WHERE
GROUP BY
HAVING
SELECT
ORDER BY

Steps:

  • Read rows from table
  • Apply WHERE filter
  • Group rows by GROUP BY column
  • Apply aggregation functions
  • Apply HAVING filter
  • Return results

6. Multiple Column GROUP BY

Grouping can be done using multiple columns.

Example:

SELECT dept_id, manager_id, COUNT(*) FROM employee GROUP BY dept_id, manager_id;

This creates groups based on the combination of both columns.


7. HAVING Clause

HAVING is used to filter grouped results.

  • WHERE filters rows before grouping
  • HAVING filters groups after aggregation

Example:

SELECT dept_id, COUNT(*) FROM employee GROUP BY dept_id
HAVING COUNT(*) > 5;

Only departments with more than 5 employees will appear.

8. WHERE vs HAVING

Feature WHERE HAVING
Applied on Rows Groups
Execution stage Before GROUP BY After GROUP BY
Uses aggregation Not allowed Allowed
Performance Faster Slightly slower

Example:

Using WHERE

SELECT dept_id, COUNT(*) FROM employee WHERE salary > 50000 GROUP BY dept_id;

First filters rows → then groups.

Using HAVING

SELECT dept_id, COUNT(*) FROM employee GROUP BY dept_id HAVING COUNT(*) > 5;

Groups first → then filters groups.

9. Selecting Columns with GROUP BY

Rule:

Every column in SELECT must be either:

  • in GROUP BY
  • or used inside an aggregation function

Invalid query:

SELECT dept_id, name, AVG(salary) FROM employee GROUP BY dept_id;

Reason:

  • name is neither aggregated nor grouped.

Correct version:

SELECT dept_id, AVG(salary) FROM employee GROUP BY dept_id;

10. GROUP BY with ORDER BY

Results can be sorted.

Example:

SELECT dept_id, COUNT(*) FROM employee GROUP BY dept_id ORDER BY COUNT(*) DESC;

Departments with the highest employee count appear first.


11 Summary

  • GROUP BY groups rows based on column values.
  • Aggregation functions operate per group.
  • WHERE filters rows before grouping.
  • HAVING filters grouped results.
  • Columns in SELECT must either be grouped or aggregated.

This site uses Just the Docs, a documentation theme for Jekyll.