Hierarchical Data in the SQL Server EF Core Provider

Note

This feature was added in EF Core 8.0.

Azure SQL and SQL Server have a special data type called hierarchyid that is used to store hierarchical data. In this case, "hierarchical data" essentially means data that forms a tree structure, where each item can have a parent and/or children. Examples of such data are:

  • An organizational structure
  • A file system
  • A set of tasks in a project
  • A taxonomy of language terms
  • A graph of links between Web pages

The database is then able to run queries against this data using its hierarchical structure. For example, a query can find ancestors and dependents of given items, or find all items at a certain depth in the hierarchy.

Using HierarchyId in .NET and EF Core

At the lowest level, the Microsoft.SqlServer.Types NuGet package includes a type called SqlHierarchyId. While this type supports working hierarchyid values, it is a bit cumbersome to work with in LINQ.

At the next level, a new Microsoft.EntityFrameworkCore.SqlServer.Abstractions package has been introduced, which includes a higher-level HierarchyId type intended for use in entity types.

Tip

The HierarchyId type is more idiomatic to the norms of .NET than SqlHierarchyId, which is instead modeled after how .NET Framework types are hosted inside the SQL Server database engine. HierarchyId is designed to work with EF Core, but it can also be used outside of EF Core in other applications. The Microsoft.EntityFrameworkCore.SqlServer.Abstractions package doesn't reference any other packages, and so has minimal impact on deployed application size and dependencies.

Use of HierarchyId for EF Core functionality such as queries and updates requires the Microsoft.EntityFrameworkCore.SqlServer.HierarchyId package. This package brings in Microsoft.EntityFrameworkCore.SqlServer.Abstractions and Microsoft.SqlServer.Types as transitive dependencies, and so is often the only package needed.

dotnet add package Microsoft.EntityFrameworkCore.SqlServer.HierarchyId

Once the package is installed, use of HierarchyId is enabled by calling UseHierarchyId as part of the application's call to UseSqlServer. For example:

options.UseSqlServer(
    connectionString,
    x => x.UseHierarchyId());

Modeling hierarchies

The HierarchyId type can be used for properties of an entity type. For example, assume we want to model the paternal family tree of some fictional halflings. In the entity type for Halfling, a HierarchyId property can be used to locate each halfling in the family tree.

public class Halfling
{
    public Halfling(HierarchyId pathFromPatriarch, string name, int? yearOfBirth = null)
    {
        PathFromPatriarch = pathFromPatriarch;
        Name = name;
        YearOfBirth = yearOfBirth;
    }

    public int Id { get; private set; }
    public HierarchyId PathFromPatriarch { get; set; }
    public string Name { get; set; }
    public int? YearOfBirth { get; set; }
}

Tip

The code shown here and in the examples below comes from HierarchyIdSample.cs.

Tip

If desired, HierarchyId is suitable for use as a key property type.

In this case, the family tree is rooted with the patriarch of the family. Each halfling can be traced from the patriarch down the tree using its PathFromPatriarch property. SQL Server uses a compact binary format for these paths, but it is common to parse to and from a human-readable string representation when when working with code. In this representation, the position at each level is separated by a / character. For example, consider the family tree in the diagram below:

Halfling family tree

In this tree:

  • Balbo is at the root of the tree, represented by /.
  • Balbo has five children, represented by /1/, /2/, /3/, /4/, and /5/.
  • Balbo's first child, Mungo, also has five children, represented by /1/1/, /1/2/, /1/3/, /1/4/, and /1/5/. Notice that the HierarchyId for Mungo (/1/) is the prefix for all his children.
  • Similarly, Balbo's third child, Ponto, has two children, represented by /3/1/ and /3/2/. Again the each of these children is prefixed by the HierarchyId for Ponto, which is represented as /3/.
  • And so on down the tree...

The following code inserts this family tree into a database using EF Core:

await AddRangeAsync(
    new Halfling(HierarchyId.Parse("/"), "Balbo", 1167),
    new Halfling(HierarchyId.Parse("/1/"), "Mungo", 1207),
    new Halfling(HierarchyId.Parse("/2/"), "Pansy", 1212),
    new Halfling(HierarchyId.Parse("/3/"), "Ponto", 1216),
    new Halfling(HierarchyId.Parse("/4/"), "Largo", 1220),
    new Halfling(HierarchyId.Parse("/5/"), "Lily", 1222),
    new Halfling(HierarchyId.Parse("/1/1/"), "Bungo", 1246),
    new Halfling(HierarchyId.Parse("/1/2/"), "Belba", 1256),
    new Halfling(HierarchyId.Parse("/1/3/"), "Longo", 1260),
    new Halfling(HierarchyId.Parse("/1/4/"), "Linda", 1262),
    new Halfling(HierarchyId.Parse("/1/5/"), "Bingo", 1264),
    new Halfling(HierarchyId.Parse("/3/1/"), "Rosa", 1256),
    new Halfling(HierarchyId.Parse("/3/2/"), "Polo"),
    new Halfling(HierarchyId.Parse("/4/1/"), "Fosco", 1264),
    new Halfling(HierarchyId.Parse("/1/1/1/"), "Bilbo", 1290),
    new Halfling(HierarchyId.Parse("/1/3/1/"), "Otho", 1310),
    new Halfling(HierarchyId.Parse("/1/5/1/"), "Falco", 1303),
    new Halfling(HierarchyId.Parse("/3/2/1/"), "Posco", 1302),
    new Halfling(HierarchyId.Parse("/3/2/2/"), "Prisca", 1306),
    new Halfling(HierarchyId.Parse("/4/1/1/"), "Dora", 1302),
    new Halfling(HierarchyId.Parse("/4/1/2/"), "Drogo", 1308),
    new Halfling(HierarchyId.Parse("/4/1/3/"), "Dudo", 1311),
    new Halfling(HierarchyId.Parse("/1/3/1/1/"), "Lotho", 1310),
    new Halfling(HierarchyId.Parse("/1/5/1/1/"), "Poppy", 1344),
    new Halfling(HierarchyId.Parse("/3/2/1/1/"), "Ponto", 1346),
    new Halfling(HierarchyId.Parse("/3/2/1/2/"), "Porto", 1348),
    new Halfling(HierarchyId.Parse("/3/2/1/3/"), "Peony", 1350),
    new Halfling(HierarchyId.Parse("/4/1/2/1/"), "Frodo", 1368),
    new Halfling(HierarchyId.Parse("/4/1/3/1/"), "Daisy", 1350),
    new Halfling(HierarchyId.Parse("/3/2/1/1/1/"), "Angelica", 1381));

await SaveChangesAsync();

Tip

If needed, decimal values can be used to create new nodes between two existing nodes. For example, /3/2.5/2/ goes between /3/2/2/ and /3/3/2/.

Querying hierarchies

HierarchyId exposes several methods that can be used in LINQ queries.

Method Description
GetAncestor(int n) Gets the node n levels up the hierarchical tree.
GetDescendant(HierarchyId? child1, HierarchyId? child2) Gets the value of a descendant node that is greater than child1 and less than child2.
GetLevel() Gets the level of this node in the hierarchical tree.
GetReparentedValue(HierarchyId? oldRoot, HierarchyId? newRoot) Gets a value representing the location of a new node that has a path from newRoot equal to the path from oldRoot to this, effectively moving this to the new location.
IsDescendantOf(HierarchyId? parent) Gets a value indicating whether this node is a descendant of parent.

In addition, the operators ==, !=, <, <=, > and >= can be used.

The following are examples of using these methods in LINQ queries.

Get entities at a given level in the tree

The following query uses GetLevel to return all halflings at a given level in the family tree:

var generation = await context.Halflings.Where(halfling => halfling.PathFromPatriarch.GetLevel() == level).ToListAsync();

This translates to the following SQL:

SELECT [h].[Id], [h].[Name], [h].[PathFromPatriarch], [h].[YearOfBirth]
FROM [Halflings] AS [h]
WHERE [h].[PathFromPatriarch].GetLevel() = @__level_0

Running this in a loop we can get the halflings for every generation:

Generation 0: Balbo
Generation 1: Mungo, Pansy, Ponto, Largo, Lily
Generation 2: Bungo, Belba, Longo, Linda, Bingo, Rosa, Polo, Fosco
Generation 3: Bilbo, Otho, Falco, Posco, Prisca, Dora, Drogo, Dudo
Generation 4: Lotho, Poppy, Ponto, Porto, Peony, Frodo, Daisy
Generation 5: Angelica

Get the direct ancestor of an entity

The following query uses GetAncestor to find the direct ancestor of a halfling, given that halfling's name:

async Task<Halfling?> FindDirectAncestor(string name)
    => await context.Halflings
        .SingleOrDefaultAsync(
            ancestor => ancestor.PathFromPatriarch == context.Halflings
                .Single(descendent => descendent.Name == name).PathFromPatriarch
                .GetAncestor(1));

This translates to the following SQL:

SELECT TOP(2) [h].[Id], [h].[Name], [h].[PathFromPatriarch], [h].[YearOfBirth]
FROM [Halflings] AS [h]
WHERE [h].[PathFromPatriarch] = (
    SELECT TOP(1) [h0].[PathFromPatriarch]
    FROM [Halflings] AS [h0]
    WHERE [h0].[Name] = @__name_0).GetAncestor(1)

Running this query for the halfling "Bilbo" returns "Bungo".

Get the direct descendants of an entity

The following query also uses GetAncestor, but this time to find the direct descendants of a halfling, given that halfling's name:

IQueryable<Halfling> FindDirectDescendents(string name)
    => context.Halflings.Where(
        descendent => descendent.PathFromPatriarch.GetAncestor(1) == context.Halflings
            .Single(ancestor => ancestor.Name == name).PathFromPatriarch);

This translates to the following SQL:

SELECT [h].[Id], [h].[Name], [h].[PathFromPatriarch], [h].[YearOfBirth]
FROM [Halflings] AS [h]
WHERE [h].[PathFromPatriarch].GetAncestor(1) = (
    SELECT TOP(1) [h0].[PathFromPatriarch]
    FROM [Halflings] AS [h0]
    WHERE [h0].[Name] = @__name_0)

Running this query for the halfling "Mungo" returns "Bungo", "Belba", "Longo", and "Linda".

Get all ancestors of an entity

GetAncestor is useful for searching up or down a single level, or, indeed, a specified number of levels. On the other hand, IsDescendantOf is useful for finding all ancestors or dependents. For example, the following query uses IsDescendantOf to find the all the ancestors of a halfling, given that halfling's name:

IQueryable<Halfling> FindAllAncestors(string name)
    => context.Halflings.Where(
            ancestor => context.Halflings
                .Single(
                    descendent =>
                        descendent.Name == name
                        && ancestor.Id != descendent.Id)
                .PathFromPatriarch.IsDescendantOf(ancestor.PathFromPatriarch))
        .OrderByDescending(ancestor => ancestor.PathFromPatriarch.GetLevel());

Important

IsDescendantOf returns true for itself, which is why it is filtered out in the query above.

This translates to the following SQL:

SELECT [h].[Id], [h].[Name], [h].[PathFromPatriarch], [h].[YearOfBirth]
FROM [Halflings] AS [h]
WHERE (
    SELECT TOP(1) [h0].[PathFromPatriarch]
    FROM [Halflings] AS [h0]
    WHERE [h0].[Name] = @__name_0 AND [h].[Id] <> [h0].[Id]).IsDescendantOf([h].[PathFromPatriarch]) = CAST(1 AS bit)
ORDER BY [h].[PathFromPatriarch].GetLevel() DESC

Running this query for the halfling "Bilbo" returns "Bungo", "Mungo", and "Balbo".

Get all descendants of an entity

The following query also uses IsDescendantOf, but this time to all the descendants of a halfling, given that halfling's name:

IQueryable<Halfling> FindAllDescendents(string name)
    => context.Halflings.Where(
            descendent => descendent.PathFromPatriarch.IsDescendantOf(
                context.Halflings
                    .Single(
                        ancestor =>
                            ancestor.Name == name
                            && descendent.Id != ancestor.Id)
                    .PathFromPatriarch))
        .OrderBy(descendent => descendent.PathFromPatriarch.GetLevel());

This translates to the following SQL:

SELECT [h].[Id], [h].[Name], [h].[PathFromPatriarch], [h].[YearOfBirth]
FROM [Halflings] AS [h]
WHERE [h].[PathFromPatriarch].IsDescendantOf((
    SELECT TOP(1) [h0].[PathFromPatriarch]
    FROM [Halflings] AS [h0]
    WHERE [h0].[Name] = @__name_0 AND [h].[Id] <> [h0].[Id])) = CAST(1 AS bit)
ORDER BY [h].[PathFromPatriarch].GetLevel()

Running this query for the halfling "Mungo" returns "Bungo", "Belba", "Longo", "Linda", "Bingo", "Bilbo", "Otho", "Falco", "Lotho", and "Poppy".

Finding a common ancestor

One of the most common questions asked about this particular family tree is, "who is the common ancestor of Frodo and Bilbo?" We can use IsDescendantOf to write such a query:

async Task<Halfling?> FindCommonAncestor(Halfling first, Halfling second)
    => await context.Halflings
        .Where(
            ancestor => first.PathFromPatriarch.IsDescendantOf(ancestor.PathFromPatriarch)
                        && second.PathFromPatriarch.IsDescendantOf(ancestor.PathFromPatriarch))
        .OrderByDescending(ancestor => ancestor.PathFromPatriarch.GetLevel())
        .FirstOrDefaultAsync();

This translates to the following SQL:

SELECT TOP(1) [h].[Id], [h].[Name], [h].[PathFromPatriarch], [h].[YearOfBirth]
FROM [Halflings] AS [h]
WHERE @__first_PathFromPatriarch_0.IsDescendantOf([h].[PathFromPatriarch]) = CAST(1 AS bit)
  AND @__second_PathFromPatriarch_1.IsDescendantOf([h].[PathFromPatriarch]) = CAST(1 AS bit)
ORDER BY [h].[PathFromPatriarch].GetLevel() DESC

Running this query with "Bilbo" and "Frodo" tells us that their common ancestor is "Balbo".

Updating hierarchies

The normal change tracking and SaveChanges mechanisms can be used to update hierarchyid columns.

Re-parenting a sub-hierarchy

For example, I'm sure we all remember the scandal of SR 1752 (a.k.a. "LongoGate") when DNA testing revealed that Longo was not in fact the son of Mungo, but actually the son of Ponto! One fallout from this scandal was that the family tree needed to be re-written. In particular, Longo and all his descendants needed to be re-parented from Mungo to Ponto. GetReparentedValue can be used to do this. For example, first "Longo" and all his descendants are queried:

var longoAndDescendents = await context.Halflings.Where(
        descendent => descendent.PathFromPatriarch.IsDescendantOf(
            context.Halflings.Single(ancestor => ancestor.Name == "Longo").PathFromPatriarch))
    .ToListAsync();

Then GetReparentedValue is used to update the HierarchyId for Longo and each descendent, followed by a call to SaveChangesAsync:

foreach (var descendent in longoAndDescendents)
{
    descendent.PathFromPatriarch
        = descendent.PathFromPatriarch.GetReparentedValue(
            mungo.PathFromPatriarch, ponto.PathFromPatriarch)!;
}

await context.SaveChangesAsync();

This results in the following database update:

SET NOCOUNT ON;
UPDATE [Halflings] SET [PathFromPatriarch] = @p0
OUTPUT 1
WHERE [Id] = @p1;
UPDATE [Halflings] SET [PathFromPatriarch] = @p2
OUTPUT 1
WHERE [Id] = @p3;
UPDATE [Halflings] SET [PathFromPatriarch] = @p4
OUTPUT 1
WHERE [Id] = @p5;

Using these parameters:

 @p1='9',
 @p0='0x7BC0' (Nullable = false) (Size = 2) (DbType = Object),
 @p3='16',
 @p2='0x7BD6' (Nullable = false) (Size = 2) (DbType = Object),
 @p5='23',
 @p4='0x7BD6B0' (Nullable = false) (Size = 3) (DbType = Object)

Note

The parameters values for HierarchyId properties are sent to the database in their compact, binary format.

Following the update, querying for the descendants of "Mungo" returns "Bungo", "Belba", "Linda", "Bingo", "Bilbo", "Falco", and "Poppy", while querying for the descendants of "Ponto" returns "Longo", "Rosa", "Polo", "Otho", "Posco", "Prisca", "Lotho", "Ponto", "Porto", "Peony", and "Angelica".

Function mappings

.NET SQL
hierarchyId.GetAncestor(n) @hierarchyId.GetAncestor(@n)
hierarchyId.GetDescendant(child) @hierarchyId.GetDescendant(@child, NULL)
hierarchyId.GetDescendant(child1, child2) @hierarchyId.GetDescendant(@child1, @child2)
hierarchyId.GetLevel() @hierarchyId.GetLevel()
hierarchyId.GetReparentedValue(oldRoot, newRoot) @hierarchyId.GetReparentedValue(@oldRoot, @newRoot)
HierarchyId.GetRoot() hierarchyid::GetRoot()
hierarchyId.IsDescendantOf(parent) @hierarchyId.IsDescendantOf(@parent)
HierarchyId.Parse(input) hierarchyid::Parse(@input)
hierarchyId.ToString() @hierarchyId.ToString()

Additional resources