Here's a solution using SQL Server Graph Database Capabilities.
What we're after here is called an Equivalence Relation:
In mathematics, an equivalence relation is a binary relation that is reflexive, symmetric and transitive. The relation "is equal to" is the canonical example of an equivalence relation, where for any objects a, b, and c:
a = a (reflexive property),
if a = b then b = a (symmetric property), and
if a = b and b = c, then a = c (transitive property).As a consequence of the reflexive, symmetric, and transitive properties, any equivalence relation provides a partition of the underlying set into disjoint equivalence classes. Two elements of the given set are equivalent to each other, if and only if they belong to the same equivalence class.
https://en.wikipedia.org/wiki/Equivalence_relation
So we're looking to compute the "equivilence class" that each Employee belongs to.
In SQL Server a Graph edge table stores edges of a Directed Graph, and a directed graph can be used to model a binary relation. It also provides graph query extensions to TSQL, including SHORTEST_PATH which can be used to calculate the transitive closure of the relation modeled by the graph. So to make the graph model an equivalence relation, we just need to add all the inverse edges, and the identity edges.
eg:
--drop table if exists DuplicateOf
--drop table if exists Employee
CREATE TABLE Employee (EmployeeID varchar(50)) AS NODE;
CREATE TABLE DuplicateOf AS EDGE;
insert into Employee(EmployeeID) values ('E1'),('E2'),('E3'),('E4'),('E5'),('E6'),('E7'),('E8')
INSERT INTO DuplicateOf VALUES
(
(SELECT $node_id FROM Employee WHERE EmployeeID = 'E1'), (SELECT $node_id FROM Employee WHERE EmployeeID = 'E2')
),
(
(SELECT $node_id FROM Employee WHERE EmployeeID = 'E2'), (SELECT $node_id FROM Employee WHERE EmployeeID = 'E3')
),
(
(SELECT $node_id FROM Employee WHERE EmployeeID = 'E4'), (SELECT $node_id FROM Employee WHERE EmployeeID = 'E1')
),
(
(SELECT $node_id FROM Employee WHERE EmployeeID = 'E5'), (SELECT $node_id FROM Employee WHERE EmployeeID = 'E6')
);
--add the inverse of each edge to model a symmetric relation
INSERT DuplicateOf($from_id, $to_id)
SELECT $to_id, $from_id
FROM DuplicateOf;
--add add each employeeid to make model a reflexive relation
INSERT DuplicateOf($from_id, $to_id)
SELECT $node_id, $node_id
FROM Employee;
with q as
(
SELECT
e1.EmployeeID,
last_node =
LAST_VALUE(e2.EmployeeID)
WITHIN GROUP (GRAPH PATH)
FROM
dbo.Employee AS e1,
dbo.DuplicateOf FOR PATH AS dup,
dbo.Employee FOR PATH AS e2
WHERE MATCH(SHORTEST_PATH(e1(-(dup)->e2)+))
)
SELECT
q.EmployeeID,
MainEmployeeId = min(q.last_node),
EquivilenceClass =
STRING_AGG(q.last_node, ',')
WITHIN GROUP (ORDER BY q.last_node)
FROM q
GROUP BY q.EmployeeID;
outputs
EmployeeID MainEmployeeId EquivilenceClass
------------- ----------------- -------------------
E1 E1 E1,E2,E3,E4
E2 E1 E1,E2,E3,E4
E3 E1 E1,E2,E3,E4
E4 E1 E1,E2,E3,E4
E5 E5 E5,E6
E6 E5 E5,E6
E7 E7 E7
E8 E8 E8