Organize queries with Common Table Expressions
When working with complex queries, you often need to break down logic into manageable pieces or reference the same subquery multiple times. Common Table Expressions (CTEs) provide a way to define temporary named result sets that exist only during a single query, making your code more readable and maintainable.
Understand CTE syntax
A Common Table Expression is defined using the WITH clause, followed by the CTE name, an optional column list, and a query that defines the result set. The CTE can then be referenced in the subsequent SELECT, INSERT, UPDATE, or DELETE statement.
WITH CTE_Name (Column1, Column2)
AS
(
-- CTE query definition
SELECT Column1, Column2
FROM SomeTable
WHERE SomeCondition = 'Value'
)
SELECT * FROM CTE_Name;
CTEs offer several advantages over derived tables and subqueries:
- Improved readability: Complex queries become easier to understand when broken into named logical sections
- Self-referencing capability: Recursive CTEs can reference themselves, enabling hierarchical data traversal
- Multiple references: Reference the same CTE multiple times in the outer query without redefining it
- Modular design: Build complex queries incrementally by defining multiple CTEs
Note
CTEs are temporary result sets that exist only during query execution. Unlike traditional tables, they don't persist beyond the statement that uses them and don't require explicit cleanup.
Create nonrecursive CTEs
Nonrecursive CTEs define a result set based on a straightforward query that doesn't reference itself. This pattern is useful for simplifying complex joins, breaking down multi-step calculations, or improving code organization.
The following example uses a CTE to calculate sales metrics before joining with product information:
WITH SalesSummary AS
(
SELECT
ProductID,
SUM(OrderQty) AS TotalQuantity,
SUM(LineTotal) AS TotalRevenue,
COUNT(DISTINCT SalesOrderID) AS OrderCount
FROM SalesLT.SalesOrderDetail
GROUP BY ProductID
)
SELECT
p.Name AS ProductName,
p.ProductNumber,
p.ListPrice,
ss.TotalQuantity,
ss.TotalRevenue,
ss.OrderCount,
ss.TotalRevenue / NULLIF(ss.TotalQuantity, 0) AS AverageUnitPrice
FROM SalesLT.Product AS p
INNER JOIN SalesSummary AS ss
ON p.ProductID = ss.ProductID
ORDER BY ss.TotalRevenue DESC;
This query first creates a CTE named SalesSummary that aggregates order details by product, calculating total quantity sold, total revenue, and order count. The main query then joins this CTE with the Product table to display product names alongside their sales metrics. The NULLIF function prevents division by zero when calculating the average unit price.
You can define multiple CTEs in a single WITH clause by separating them with commas. Later CTEs can reference earlier ones, enabling progressive data transformation:
WITH CategorySales AS
(
SELECT
p.ProductCategoryID,
SUM(sod.LineTotal) AS CategoryRevenue
FROM SalesLT.Product AS p
INNER JOIN SalesLT.SalesOrderDetail AS sod
ON p.ProductID = sod.ProductID
GROUP BY p.ProductCategoryID
),
RankedCategories AS
(
SELECT
ProductCategoryID,
CategoryRevenue,
RANK() OVER (ORDER BY CategoryRevenue DESC) AS RevenueRank
FROM CategorySales
)
SELECT
pc.Name AS CategoryName,
rc.CategoryRevenue,
rc.RevenueRank
FROM RankedCategories AS rc
INNER JOIN SalesLT.ProductCategory AS pc
ON rc.ProductCategoryID = pc.ProductCategoryID
WHERE rc.RevenueRank <= 5;
Tip
When building queries with multiple CTEs, start with the most granular data transformations and progressively aggregate or filter in subsequent CTEs. This approach makes debugging easier since you can test each CTE independently.
Create recursive CTEs
Recursive CTEs reference themselves to process hierarchical or graph-like data structures. A recursive CTE consists of two parts: an anchor member that provides the initial result set, and a recursive member that references the CTE to build upon previous results.
The general syntax for a recursive CTE is:
WITH RecursiveCTE AS
(
-- Anchor member: starting point
SELECT columns
FROM table
WHERE starting_condition
UNION ALL
-- Recursive member: references the CTE
SELECT columns
FROM table
INNER JOIN RecursiveCTE
ON join_condition
)
SELECT * FROM RecursiveCTE;
A common use case is traversing an organizational hierarchy. Consider an employee table where each employee has a manager:
WITH EmployeeHierarchy AS
(
-- Anchor: Start with top-level managers (no manager)
SELECT
EmployeeID,
FirstName,
LastName,
ManagerID,
0 AS Level,
CAST(FirstName + ' ' + LastName AS NVARCHAR(500)) AS HierarchyPath
FROM HumanResources.Employee
WHERE ManagerID IS NULL
UNION ALL
-- Recursive: Find employees who report to previously found employees
SELECT
e.EmployeeID,
e.FirstName,
e.LastName,
e.ManagerID,
eh.Level + 1,
CAST(eh.HierarchyPath + ' > ' + e.FirstName + ' ' + e.LastName AS NVARCHAR(500))
FROM HumanResources.Employee AS e
INNER JOIN EmployeeHierarchy AS eh
ON e.ManagerID = eh.EmployeeID
)
SELECT
EmployeeID,
FirstName,
LastName,
Level,
HierarchyPath
FROM EmployeeHierarchy
ORDER BY HierarchyPath;
Important
Recursive CTEs can cause infinite loops if the termination condition is never met. SQL Server limits recursion to 100 levels by default. Use OPTION (MAXRECURSION n) to change this limit, where n is the maximum recursion depth (0 for unlimited).
Generate sequences with recursive CTEs
Recursive CTEs excel at generating sequences of numbers or dates without requiring a physical numbers table. This technique is useful for creating date ranges, filling gaps in data, or generating test data:
-- Generate a sequence of dates for the current month
WITH DateSequence AS
(
-- Anchor: Get the first day of the current month
SELECT CAST(DATEADD(MONTH, DATEDIFF(MONTH, 0, GETDATE()), 0) AS DATE) AS DateValue
UNION ALL
-- Recursive: Add one day until we reach the end of the month
SELECT DATEADD(DAY, 1, DateValue)
FROM DateSequence
WHERE DateValue < EOMONTH(GETDATE())
)
-- Output each date with its day name
SELECT DateValue, DATENAME(WEEKDAY, DateValue) AS DayName
FROM DateSequence
OPTION (MAXRECURSION 31); -- Allow up to 31 iterations (max days in a month)
You can combine generated sequences with actual data to identify gaps or create summary reports:
-- Generate a numbers table from 1 to 1000
WITH Numbers AS
(
-- Anchor: Start with 1
SELECT 1 AS n
UNION ALL
-- Recursive: Increment until we reach 1000
SELECT n + 1 FROM Numbers WHERE n < 1000
),
-- Convert numbers to dates for the entire year
DateRange AS
(
SELECT DATEADD(DAY, n - 1, '2024-01-01') AS OrderDate
FROM Numbers
WHERE DATEADD(DAY, n - 1, '2024-01-01') <= '2024-12-31'
)
-- Count orders for each date, showing 0 for dates with no orders
SELECT
dr.OrderDate,
COALESCE(COUNT(soh.SalesOrderID), 0) AS OrderCount
FROM DateRange AS dr
LEFT JOIN SalesLT.SalesOrderHeader AS soh
ON CAST(soh.OrderDate AS DATE) = dr.OrderDate
GROUP BY dr.OrderDate
ORDER BY dr.OrderDate
OPTION (MAXRECURSION 366); -- Allow up to 366 iterations (leap year)
Use CTEs with data modification statements
CTEs can be used with INSERT, UPDATE, and DELETE statements, not just SELECT. This capability is useful when the modification logic requires complex filtering or calculations:
-- Update using a CTE to identify target rows
WITH DiscontinuedProducts AS
(
SELECT ProductID
FROM SalesLT.Product
WHERE SellEndDate < DATEADD(YEAR, -2, GETDATE())
AND ProductID NOT IN (
SELECT DISTINCT ProductID
FROM SalesLT.SalesOrderDetail
WHERE ModifiedDate > DATEADD(YEAR, -1, GETDATE())
)
)
UPDATE SalesLT.Product
SET DiscontinuedDate = GETDATE()
WHERE ProductID IN (SELECT ProductID FROM DiscontinuedProducts);
This query uses a CTE to identify products that should be marked as discontinued. The CTE finds products where the sold end date is more than two years ago and that haven't appeared in any order details modified within the last year. The UPDATE statement then sets the DiscontinuedDate for those products. By separating the selection logic into a CTE, the query becomes easier to read and test independently.
For more information about Common Table Expressions, see WITH common_table_expression (Transact-SQL) and Recursive Queries Using Common Table Expressions.