Use self joins
So far, the joins we've used have involved different tables. There may be scenarios in which you need to retrieve and compare rows from a table with other rows from the same table. For example, in a human resources application, an Employee table might include information about the manager of each employee, and store the manager's ID in the employee's own row. Each manager is also listed as an employee.
EmployeeID
FirstName
ManagerID
1
Dan
NULL
2
Aisha
1
3
Rosie
1
4
Naomi
3
To retrieve the employee information and match it to the related manager, you can use the table twice in your query, joining it to itself for the purposes of the query.
SELECT emp.FirstName AS Employee,
mgr.FirstName AS Manager
FROM HR.Employee AS emp
LEFT OUTER JOIN HR.Employee AS mgr
ON emp.ManagerID = mgr.EmployeeID;
The results of this query include a row for each employee with the name of their manager. The CEO of the company has no manager. To include the CEO in the results, an outer join is used, and the manager name is returned as NULL for rows where the ManagerID field has no matching EmployeeID field.
Employee
Manager
Dan
NULL
Aisha
Dan
Rosie
Dan
Naomi
Rosie
There are other scenarios in which you'll want to compare rows in a table with different rows in the same table. As you've seen, it's fairly easy to compare columns in the same row using T-SQL, but the method to compare values from different rows (such as a row that stores a starting time, and another row in the same table that stores a corresponding stop time) is less obvious. Self-joins are a useful technique for these types of queries.
To accomplish tasks like this, you should consider the following guidelines:
- Define two instances of the same table in the FROM clause, and join them as needed, using inner or outer joins.
- Use table aliases to differentiate the two instances of the same table.
- Use the ON clause to provide a filter comparing columns of one instance of the table with columns from the other instance of the table.