WITH (Common Table Expressions)
Common Table Expression (CTE) is a temporary result set defined within the scope of a single statement, valid only during the execution of the query. It can self-reference and can be referenced multiple times within the same query. Its purpose is simplifying complex queries, enhance code readability, and improve maintainability. A CTE can be seen as a temporary view that exists only for the query's execution and is not visible to external queries.
Once defined, a CTE can be referenced like a
CREATE VIEW statement.
WITH clause is used to specify Common Table Expressions, and the
WITH clause can include one or more comma-separated clauses. Each clause provides a subquery that generates a result set and associates a name with the subquery.
- CTEs can reuse the same subquery in multiple places, avoiding redundant logic.
- They can simplify recursive queries, such as querying tree-structured data.
- Complex queries can be broken down into smaller parts using CTEs, making the query logic clearer and more understandable.
Common Table Expressions are divided into two types: non-recursive and recursive:
Non-recursive CTE refers to an expression in which the CTE does not reference itself. It is used to build a one-time temporary result set and does not involve recursion.
Recursive CTE refers to an expression in which the CTE references itself. It handles data with recursive structures, such as trees or graphs. A recursive CTE includes a base query (initial condition) in its definition, then performs recursive operations on the result of that base query until a stop condition is met.
WITH <query_name> AS ( <query_definition> ) SELECT ... FROM <query_name>;
<query_name>: Specifies the temporary name assigned to the CTE result set. It can be any valid identifier, similar to a table or column name.
<query_definition>: The query statement defines the CTE result set. It can be any valid
SELECTquery used to create the result set of the CTE.
SELECT ... FROM <query_name>: This is the query executed on the CTE, where you can use the name of the CTE.
WITH RECURSIVE <query_name> AS ( <query_definition> ) SELECT ... FROM <query_name>;
WITH RECURSIVE: Indicates that this is a recursive CTE.
<query_name>: Specifies the temporary name assigned to the result set of the recursive CTE. It can be any valid identifier, similar to a table or column name.
<query_definition>: This consists of two parts in the context of a recursive CTE:
- Initial part: Defines the recursion's initial condition and result set.
- Recursive function: Defines how to recursively generate the next round of the result set from the initial result set.
SELECT ... FROM <query_name>: Use the name of the recursive CTE to query the recursive CTE.
Guidelines for Using Recursive CTEs
Anchor and Recursive Members
A recursive common table expression (CTE) must consist of at least two query definitions: an anchor member and a recursive member. The anchor member should come before the first recursive member, and you can define multiple anchor and recursive members. All CTE query definitions are considered anchor members unless they reference the CTE itself.
Suppose you have a table named
Employee that contains employee information, including fields like
ManagerID, representing the employee's ID, name, and ID of their manager. You can use a recursive CTE to query the hierarchical relationship between employees and subordinates.
Assuming the table data is as follows:
Here's an example of using a recursive CTE to query the hierarchical relationship between employees and their subordinates:
WITH RECURSIVE EmployeeHierarchy AS ( -- Anchor member: Find top-level employees SELECT EmployeeID, Name, ManagerID, 0 AS Level FROM Employee WHERE ManagerID IS NULL UNION ALL -- Recursive member: Recursively query subordinate employees SELECT e.EmployeeID, e.Name, e.ManagerID, eh.Level + 1 FROM Employee AS e JOIN EmployeeHierarchy AS eh ON e.ManagerID = eh.EmployeeID ) SELECT Name, Level FROM EmployeeHierarchy;
In the above example:
- The anchor member selects top-level employees (with
ManagerIDas NULL) and sets their level (
Level) to 0.
- The recursive member queries subordinate employees based on the previous round's results (
EmployeeHierarchy), incrementing the level.
- The final query uses
SELECTto retrieve employee names and levels from the recursive CTE.
Executing this query will provide information about the hierarchical relationship between employees and their subordinates. Both anchor and recursive members together form the structure of a recursive query. On the other hand, a non-recursive CTE is used to create a temporary result set with a single query definition, and you only need to reference this CTE in your query without concerning anchor and recursive members.
Operators and Statement Requirements
Set Operators: Anchor members must be combined using set operators (such as
UNION ALLis allowed between the last anchor member and the first recursive member, as well as when combining multiple recursive members.
Column Matching: The number of columns in anchor and recursive members must be the same.
Data Types: Columns in the recursive member must have the same data types as the corresponding columns in the anchor member.
FROM Clause: The FROM clause of a recursive member can only reference the CTE expression_name once.
Unsupported Features: Certain features are not allowed in the CTE_query_definition of a recursive member, including:
- Using the
SELECT DISTINCTkeyword for distinct queries.
GROUP BYto group results.
HAVINGto filter results after grouping.
- Scalar aggregation applies an aggregate function (like
AVG, etc.) to a set of rows and returns a single value.
- Outer join operations like
INNER JOINis allowed).
- Using the
- Non-recursive CTE example:
CREATE TABLE employees ( id INT PRIMARY KEY, name VARCHAR(50), salary INT ); INSERT INTO employees (id, name, salary) VALUES (1, 'Alice', 50000), (2, 'Bob', 60000), (3, 'Charlie', 75000), (4, 'David', 55000), (5, 'Eve', 80000); -- Query employees whose salary is higher than the average salary mysql> WITH avg_salary AS ( SELECT AVG(salary) AS avg_salary FROM employees) SELECT name, salary FROM employees JOIN avg_salary ON salary > avg_salary.avg_salary; +---------+--------+ | name | salary | +---------+--------+ | Charlie | 75000 | | Eve | 80000 | +---------+--------+ 2 rows in set (0.00 sec)
- Recursive CTE example:
CREATE TABLE employees_hierarchy ( id INT PRIMARY KEY, name VARCHAR(50), manager_id INT ); INSERT INTO employees_hierarchy (id, name, manager_id) VALUES (1, 'Alice', NULL), (2, 'Bob', 1), (3, 'Charlie', 1), (4, 'David', 2), (5, 'Eve', 2), (6, 'Frank', 3); -- Query an employee and all his employees mysql> WITH RECURSIVE employee_hierarchy_cte (id, name, manager_id, level) AS ( SELECT id, name, manager_id, 0 FROM employees_hierarchy WHERE name = 'Alice' UNION ALL SELECT e.id, e.name, e.manager_id, eh.level + 1 FROM employees_hierarchy AS e JOIN employee_hierarchy_cte AS eh ON e.manager_id = eh.id ) SELECT name, level FROM employee_hierarchy_cte; +---------+-------+ | name | level | +---------+-------+ | Alice | 0 | | Bob | 1 | | Charlie | 1 | | David | 2 | | Eve | 2 | | Frank | 2 | +---------+-------+ 6 rows in set (0.00 sec)