MySQL Recursive CTE (Common Table Expressions)


MySQL Recursive CTE allows users to write queries that involve recursive operations. Recursive CTE is an expression that is defined recursively. It is useful in hierarchical data, graph traversals, data aggregation, and data reporting. In this article, we will discuss Recursive CTE with its syntax and examples.

Introduction

Common Table Expression (CTE) is a way to give a name to the temporary result sets generated by each query in MySQL. The WITH clause is used to define a CTE, and multiple CTEs can be defined in a single statement using this clause. However, a CTE can only reference other CTEs that are defined earlier in the same WITH clause. The scope of each CTE is limited to the statement in which it is defined.

A recursive CTE is a type of subquery that refers to itself using its own name. To define a recursive CTE, the WITH RECURSIVE clause is used, and it must have a terminating condition. Recursive CTEs are commonly used for generating series and traversing hierarchical or tree-structured data.

Syntax

The syntax for defining a recursive CTE in MySQL is as follows:

WITH RECURSIVE cte_name [(col1, col2, ...)]
AS (subquery)
SELECT col1, col2, ... FROM cte_name;
  • `cte_name`: The name given to the recursive subquery written in the subquery block.

  • `col1, col2, ..., colN`: The names given to the columns generated by the subquery.

  • `subquery`: A MySQL query that refers to itself using `cte_name` as its own name. The column names given in the SELECT statement should match the names provided in the list followed by `cte_name`.

The recursive CTE structure provided in the subquery block

SELECT col1, col2, ..., colN FROM table_name
UNION [ALL, DISTINCT]
SELECT col1, col2, ..., colN FROM cte_name
WHERE clause

Recursive CTE has non-recursive subquery then followed by recursive subquery.

  • First SELECT statement is a non-recursive statement. It provides initial rows for the result set.

  • `UNION [ALL, DISTINCT]` is used to add additional rows to the previous result set. Use of `ALL` and `DISTINCT` keywords is used to add or remove duplicate rows in the last result set.

  • Second SELECT statement is a recursive statement. It produces the result set iteratively until the condition provided in the WHERE clause is true.

  • Result set produced at each iteration takes the result set produced at the previous iteration as the base table.

  • Recursion ends when the recursive SELECT statement does not produce any additional rows.

Example-1

Consider a table named "employees". It has columns "id", "name", and "salary". Find the average salary of employees who have been with the company for at least 2 years. The "employees" table has the following values:

id

name

salary

1

John

50000

2

Jane

60000

3

Bob

70000

4

Alice

80000

5

Michael

90000

6

Sarah

100000

7

David

110000

8

Emily

120000

9

Mark

130000

10

Julia

140000

So, the required query is given below

WITH RECURSIVE employee_tenure AS (
   SELECT id, name, salary, hire_date, 0 AS tenure
   FROM employees
   UNION ALL
   SELECT e.id, e.name, e.salary, e.hire_date, et.tenure + 1
   FROM employees e
   JOIN employee_tenure et ON e.id = et.id
   WHERE et.hire_date < DATE_SUB(NOW(), INTERVAL 2 YEAR)
)
SELECT AVG(salary) AS average_salary
FROM employee_tenure
WHERE tenure >= 2;

In this query, we first define a recursive CTE called "employee_tenure". It computes the tenure of each employee by recursively joining the "employees" table with the CTE itself. Base case of the recursion selects all employees from the "employees" table with a starting tenure of 0. Recursive case joins each employee with the CTE and increments their tenure by 1.

The resulting "employee_tenure" CTE has columns "id", "name", "salary", "hire_date", and "tenure". We then select the average salary of employees whose tenure is at least 2 years. It uses a simple SELECT statement with a WHERE clause that filters out employees with tenure less than 2.

The output of the query will be a single row. It will have the average salary of employees who have been with the company for at least 2 years. The specific value will depend on the random salaries assigned to each employee in the "employees" table.

Example-2

Here is an example of using recursive CTEs in MySQL to generate a series of the first 5 odd numbers:

Query

WITH RECURSIVE 
odd_no (sr_no, n) AS
(
   SELECT 1, 1 
   UNION ALL
   SELECT sr_no+1, n+2 FROM odd_no WHERE sr_no < 5 
)
SELECT * FROM odd_no;  

Output

sr_no

n

1

1

2

3

3

5

4

7

5

9

The above query consists of two parts- non-recursive and recursive.

Non-Recursive part - It will produce initial rows which consist of two columns named as "sr_no" and "n" and a single row.

Query

SELECT 1, 1

Output

sr_no

n

1

1

Recursive part - It will add rows to the previous output until the terminating condition is satisfied, which in this case is when sr_no is less than 5.

SELECT sr_no+1, n+2 FROM odd_no WHERE sr_no < 5 

When `sr_no` becomes 5, the condition becomes false and the recursion is terminated.

Conclusion

MySQL Recursive CTE is an expression that is defined recursively and is useful in hierarchical data, graph traversals, data aggregation, and data reporting. Recursive CTE refers to itself using its own name and must have a terminating condition. The syntax for defining a recursive CTE involves using the WITH RECURSIVE clause and subqueries that are non-recursive and recursive. In this article, we have discussed Recursive CTE with its syntax and examples, including using Recursive CTE to find the average salary of employees who have been with a company for at least 2 years, and generating a series of the first 5 odd numbers. Overall, Recursive CTE is a powerful tool that can help users write complex queries in MySQL.

Updated on: 18-May-2023

2K+ Views

Kickstart Your Career

Get certified by completing the course

Get Started
Advertisements