Recursive queries with SQL Server

 
 
  • Gérald Barré

The purpose of this article is to present recursion with Common Table Expression (CTE). For that we will use the following table:

Or in SQL:

SQL
CREATE TABLE [dbo].[Employee] (
   [Id]        INT            IDENTITY (1, 1) NOT NULL,
   [FirstName] NVARCHAR (100) NOT NULL,
   [LastName]  NVARCHAR (100) NOT NULL,
   [ManagerId] INT            NULL,
   PRIMARY KEY CLUSTERED ([Id] ASC),
   CONSTRAINT [FK_Manager] FOREIGN KEY ([ManagerId]) REFERENCES [dbo].[Employee] ([Id])
);

The goal is to obtain in a single query the complete hierarchy with the indentation:

Here's what MSDN is saying about CTE:

A common table expression (CTE) can be thought of as a temporary result set that is defined within the execution scope of a single SELECT, INSERT, UPDATE, DELETE, or CREATE VIEW statement. A CTE is similar to a derived table in that it is not stored as an object and lasts only for the duration of the query. Unlike a derived table, a CTE can be self-referencing and can be referenced multiple times in the same query.

Here is an example of CTE:

SQL
-- Declare the CTE
WITH EmployeeCTE (Id, FullName)
AS
(
    SELECT Id, FirstName + ' ' + LastName
    FROM Employee
)
-- Use the CTE
SELECT * From EmployeeCTE

Let's see how we can use CTE to query the employee table.

#Get all results with recursion

SQL Server allows the use of recursive CTE. For this you need to create a query in 2 parts:

  • The first returns the data to initiate recursion. In our case the initial request selects the employees who do not have a manager (the root of the hierarchy).
  • The second selects new lines based on the already selected lines. In our case we use the relation between Id and ManagerId to select new lines.

Recursion stops when recursion no longer returns rows.

Here is the query to select all employees recursively:

SQL
WITH EmployeeHierarchy(Id, FirstName, LastName, ManagerId)
AS
(
  -- Root level of the hierarchy
  SELECT Id, FirstName, LastName, ManagerId
  FROM Employee
  WHERE ManagerId IS NULL
  -- Combine result with the result of the recursion
  UNION ALL
  -- Load the next level of the hierarchy
  SELECT e.Id, e.FirstName, e.LastName, e.ManagerId
  FROM Employee e
  INNER JOIN EmployeeHierarchy eh -- self-reference! (recursion)
  ON e.ManagerId = eh.Id
)
SELECT * FROM EmployeeHierarchy

To illustrate how the query works, here are the results obtained after each recursion:

#Show indentation and full name

We now have the desired dataset but it misses the notion of hierarchy ("-" before the name) in the result. We can simply add the notion of depth by adding a new column to the CTE. The value of this column is incremented with each recursion:

SQL
WITH EmployeeHierarchy(Id, FirstName, LastName, ManagerId, level)
AS
(
  SELECT Id, FirstName, LastName, ManagerId, 0
  FROM Employee
  WHERE ManagerId IS NULL
  UNION ALL
  SELECT e.Id, e.FirstName, e.LastName, e.ManagerId, eh.level + 1 -- compute the level
  FROM Employee e
  INNER JOIN EmployeeHierarchy eh ON e.ManagerId = eh.Id
)
SELECT Id, REPLICATE('--', level) + ' ' + FirstName + ' ' + LastName, ManagerId
FROM EmployeeHierarchy

#Sort the rows in the desired order

We have the notion of hierarchy, but there remains the problem of sorting. The columns available with this CTE do not correctly sort the data. The idea is to add a new column containing the full path as a string of a line in the hierarchy. We can sort on this column:

SQL
WITH EmployeeHierarchy(Id, FirstName, LastName, ManagerId, level, fullpath)
AS
(
  SELECT Id, FirstName, LastName, ManagerId, 0, CAST(Id AS NVARCHAR(200))
  FROM Employee
  WHERE ManagerId IS NULL
  UNION ALL
  SELECT e.Id, e.FirstName, e.LastName, e.ManagerId, eh.level + 1, CAST(eh.fullpath + '/' + CAST(e.Id AS NVARCHAR(20)) AS NVARCHAR(200))
  FROM Employee e
  INNER JOIN EmployeeHierarchy eh ON e.ManagerId = eh.Id
)
SELECT REPLICATE('--', level) + ' ' + FirstName + ' ' + LastName, fullpath FROM EmployeeHierarchy
ORDER BY fullpath

CAST AS NVARCHAR(200) allows you to have the same type for the fullpath column throughout the recursion. Indeed the type of a column of a CTE must be constant. If the depth of the hierarchy is important, you must of course adjust the size of the fullpath column to avoid truncation to 200 characters.

This time the result is the one expected:

CTEs are very practical and powerful in solving this kind of problem. Note that there are other ways to model this table to avoid getting out the heavy artillery. For example, SQL Server 2008 introduced the HierarchyId type, but this will be the subject of another article.

Do you have a question or a suggestion about this post? Contact me!

Follow me:
Enjoy this blog?Buy Me A Coffee💖 Sponsor on GitHub