In the previous article I wrote about common table expressions in sql. In this article, I’ll go through recursive CTEs.
Recursive CTEs can be used to generate result sets when working with data that represents a hierarchy or network or pattern.
The syntax of a recursive query is similar to any CTE.
WITH cte_name AS
(
...first_query
UNION ALL
...second_query
)
...outer_query_statement
The first_query should contain the initial state i.e the starting values.
The second_query selects the values from the cte_name result set and can reference columns from first_query.
This query should contain the base case(when to stop).
Example 1: Generate even numbers between 0-10
Below is an example of a recursive query that generates even numbers between 0-10 using a recursive query.
WITH numbers AS
(
SELECT 0 AS n -- first_query
UNION
-- second query
SELECT
n+2
FROM numbers
WHERE n < 9 -- base case
)
SELECT * FROM numbers;
This will generate the numbers 0,2,4,6,8,10
| n |
|---|
| 0 |
| 2 |
| … |
| 10 |
UNION is a set operator that concatenates the results of two queries into a single result set.
You can use UNION ALL if you want to include duplicate rows.
Example 2: Generate Dates Between a Date Range
An example of how to apply this would be generating dates between a date range. As follows
WITH dates AS
(
SELECT CAST('2023-01-01') AS d
UNION ALL
SELECT
DATEADD(MONTH, 1, d)
FROM dates
WHERE d < '2024-01-01'
)
SELECT * FROM dates;
Result:
| d |
|---|
| 2023-01-01 |
| 2023-02-01 |
| … |
| 2024-01-01 |
DATEADD is a function that adds a number to a datepart of an input date.
For this case it adds one month to the column d
A more practical example. Say you want to generate month on month revenue for the year 2023 for a business. Something similar to:
| Month | Revenue |
|---|---|
| Jan | 2,000 |
| Feb | 1,000 |
| … | … |
| Dec | 5,000 |
The query below would be used to achieve this.
WITH dates AS
(
SELECT CAST('2023-01-01') AS d
UNION ALL
SELECT
DATEADD(MONTH, 1, d)
FROM dates
WHERE d < '2024-01-01'
)
SELECT
revenue.sale_date AS date,
SUM(revenue.amount) AS Amount
FROM tbl_revenue AS revenue
JOIN dates ON
DATETRUNC(MONTH, dates.d) = DATETRUNC(MONTH, revenue.sale_date)
GROUP BY revenue.sale_date
ORDER BY Amount DESC
In this query, we first generate dates, then join the result set with tbl_revenue.
We then group by the date and get the sum of the amount column.
DATETRUNC is a function that was added in MSSQL 2022 that truncates a date based on the specified part.
For example 2023-01-15 after truncating by month becomes 2023-01-01.
Result:
| d | Revenue |
|---|---|
| 2023-01-01 | 2,000 |
| 2023-02-01 | 1,000 |
| … | … |
| 2024-01-01 | 5,000 |