SQL Server; What is sql server hash join, loop join, merge join, nested loop

T.Zacks 3,996 Reputation points
2022-09-17T18:40:59.603+00:00

Need a explanation when sql server uses hash join, loop join, merge join, nested loop ?

please give me few sample sql for which sql server uses hash join, loop join ,merge join, nested loop.

please guide me. thanks

Azure SQL Database
Developer technologies Transact-SQL
0 comments No comments
{count} votes

Accepted answer
  1. Erland Sommarskog 121.4K Reputation points MVP Volunteer Moderator
    2022-09-17T19:20:27.597+00:00

    A full treatise of this topic could fill a book. Here is a very simplified summary: The examples run in the Northjgale database. To build it, first create Northwind, which you find on https://www.sommarskog.se/dynsearch-2008/Northwind.sql.txt. Once you have created it, you can run https://www.sommarskog.se/dynsearch-2008/Northgale.sql.txt.

    Nested Loops is good when the optimizer assume that a smaller number of rows will be hit in the inner table. For instance:

       SELECT O.OrderID, SUM(OD.Quantity * OD.UnitPrice)  
       FROM   Orders O  
       JOIN   "Order Details" OD ON O.OrderID = OD.OrderID  
       WHERE  O.CustomerID = 'VINET'  
       GROUP  BY O.OrderID  
    

    First SQL Server finds the Order rows for this Customer, then for each order looks up the order lines in the details table.

    But nested loops is not good when the condition hits many rows in the inner table. In that case, it is better to scan at least one of the tables and build a hash table for the join. In the query above, replace VINET with ERNTC, a customer that accounts for 10% for the orders in Northgale.

    Merge Join is an alternative to Hash Join, which is useful when the inputs are sorted in the same way. This query results in a Merge Join:

       SELECT O.OrderID, SUM(OD.Quantity * OD.UnitPrice)  
       FROM   Orders O  
       JOIN   "Order Details" OD ON O.OrderID = OD.OrderID  
       WHERE  O.OrderID BETWEEN 11000 AND 30000  
       GROUP  BY O.OrderID  
    

    Because so many orders are retrieved, the optimizer deems a Loop Join to be too expensive. But since both tables have their clustered index on the OrderID, SQL Server can scan through the tables in order, which is cheaper than running a Hash Join. But had the clustered index on Order Details been on the ProductID a Merge Join would not be possible. (Or more precisely, it had a require a Sort operator before the Merge Join, which the optimizer may deem too expensive.)

    2 people found this answer helpful.
    0 comments No comments

0 additional answers

Sort by: Most helpful

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.