In the previous article I wrote about common table expressions in sql. In this article, I’ll go through recursive CTEs.
Recursive CTEs can be used to generate result sets when working with data that represents a hierarchy or network or pattern.
The syntax of a recursive query is similar to any CTE.
WITH cte_name AS
(
...first_query
UNION ALL
...second_query
)
...outer_query_statement
The first_query
should contain the initial state i.e the starting values.
The second_query
selects the values from the cte_name
result set and can reference columns from first_query
.
This query should contain the base case(when to stop).
Example 1: Generate even numbers between 0-10
Below is an example of a recursive query that generates even numbers between 0-10 using a recursive query.
WITH numbers AS
(
SELECT 0 AS n -- first_query
UNION
-- second query
SELECT
n+2
FROM numbers
WHERE n < 9 -- base case
)
SELECT * FROM numbers;
This will generate the numbers 0,2,4,6,8,10
n |
---|
0 |
2 |
… |
10 |
UNION
is a set operator that concatenates the results of two queries into a single result set.
You can use UNION ALL
if you want to include duplicate rows.
Example 2: Generate Dates Between a Date Range
An example of how to apply this would be generating dates between a date range. As follows
WITH dates AS
(
SELECT CAST('2023-01-01') AS d
UNION ALL
SELECT
DATEADD(MONTH, 1, d)
FROM dates
WHERE d < '2024-01-01'
)
SELECT * FROM dates;
Result:
d |
---|
2023-01-01 |
2023-02-01 |
… |
2024-01-01 |
DATEADD
is a function that adds a number to a datepart of an input date.
For this case it adds one month to the column d
A more practical example. Say you want to generate month on month revenue for the year 2023 for a business. Something similar to:
Month | Revenue |
---|---|
Jan | 2,000 |
Feb | 1,000 |
… | … |
Dec | 5,000 |
The query below would be used to achieve this.
WITH dates AS
(
SELECT CAST('2023-01-01') AS d
UNION ALL
SELECT
DATEADD(MONTH, 1, d)
FROM dates
WHERE d < '2024-01-01'
)
SELECT
revenue.sale_date AS date,
SUM(revenue.amount) AS Amount
FROM tbl_revenue AS revenue
JOIN dates ON
DATETRUNC(MONTH, dates.d) = DATETRUNC(MONTH, revenue.sale_date)
GROUP BY revenue.sale_date
ORDER BY Amount DESC
In this query, we first generate dates
, then join the result set with tbl_revenue
.
We then group by the date and get the sum of the amount
column.
DATETRUNC
is a function that was added in MSSQL 2022 that truncates a date based on the specified part.
For example 2023-01-15
after truncating by month becomes 2023-01-01
.
Result:
d | Revenue |
---|---|
2023-01-01 | 2,000 |
2023-02-01 | 1,000 |
… | … |
2024-01-01 | 5,000 |