SQL Server; What is sql server hash join, loop join, merge join, nested loop

Question

SQL Server; What is sql server hash join, loop join, merge join, nested loop

T.Zacks 3,996

Need a explanation when sql server uses hash join, loop join, merge join, nested loop ?

please give me few sample sql for which sql server uses hash join, loop join ,merge join, nested loop.

please guide me. thanks

Accepted answer

0 additional answers

Your answer

Answer 1

A full treatise of this topic could fill a book. Here is a very simplified summary: The examples run in the Northjgale database. To build it, first create Northwind, which you find on https://www.sommarskog.se/dynsearch-2008/Northwind.sql.txt. Once you have created it, you can run https://www.sommarskog.se/dynsearch-2008/Northgale.sql.txt.

Nested Loops is good when the optimizer assume that a smaller number of rows will be hit in the inner table. For instance:

   SELECT O.OrderID, SUM(OD.Quantity * OD.UnitPrice)  
   FROM   Orders O  
   JOIN   "Order Details" OD ON O.OrderID = OD.OrderID  
   WHERE  O.CustomerID = 'VINET'  
   GROUP  BY O.OrderID

First SQL Server finds the Order rows for this Customer, then for each order looks up the order lines in the details table.

But nested loops is not good when the condition hits many rows in the inner table. In that case, it is better to scan at least one of the tables and build a hash table for the join. In the query above, replace VINET with ERNTC, a customer that accounts for 10% for the orders in Northgale.

Merge Join is an alternative to Hash Join, which is useful when the inputs are sorted in the same way. This query results in a Merge Join:

   SELECT O.OrderID, SUM(OD.Quantity * OD.UnitPrice)  
   FROM   Orders O  
   JOIN   "Order Details" OD ON O.OrderID = OD.OrderID  
   WHERE  O.OrderID BETWEEN 11000 AND 30000  
   GROUP  BY O.OrderID

Because so many orders are retrieved, the optimizer deems a Loop Join to be too expensive. But since both tables have their clustered index on the OrderID, SQL Server can scan through the tables in order, which is cheaper than running a Hash Join. But had the clustered index on Order Details been on the ProductID a Merge Join would not be possible. (Or more precisely, it had a require a Sort operator before the Merge Join, which the optimizer may deem too expensive.)

Share via

SQL Server; What is sql server hash join, loop join, merge join, nested loop

0 additional answers

Your answer