Maximum Row Size and Query Hints

In my last post (yes, that was two months ago), I gave an example of how a query hint could cause a query to fail.  In this post, I'll give another example of how query hints can cause problems.  As with my last post, this post was inspired by a question submitted by a reader.

SQL Server has a documented row size limit of 8060 bytes.  This row size limit applies specifically to the fixed width or in-row portion of variable width columns.  A web search yields plenty of articles and discussions about this limit.  If you try to create a table that exceeds this limit, you will encounter the following error:

CREATE TABLE T (A INT, B CHAR(8000), C CHAR(8000))

Msg 1701, Level 16, State 1, Line 1
Creating or altering table 'T' failed because the minimum row size would be 16011, including 7 bytes of internal overhead. This exceeds the maximum allowable table row size of 8060 bytes.

So, what does this row size limit have to do with query hints?  Let's start with the following simple join:

CREATE TABLE T1 (A INT, B CHAR(8000))
CREATE TABLE T2 (A INT, B CHAR(8000))
CREATE TABLE T3 (A INT, B CHAR(8000))

SELECT *
FROM T1 JOIN T2 ON T1.A = T2.A

  |--Hash Match(Inner Join, HASH:([T2].[A])=([T1].[A]), RESIDUAL:([T2].[A]=[T1].[A]))
       |--Table Scan(OBJECT:([T2]))
       |--Table Scan(OBJECT:([T1]))

So far, so good.  Now let's suppose that we want to sort the results of this query:

SELECT *
FROM T1 JOIN T2 ON T1.A = T2.A
ORDER BY T1.A

  |--Merge Join(Inner Join, MANY-TO-MANY MERGE:([T1].[A])=([T2].[A]), RESIDUAL:([T2].[A]=[T1].[A]))
       |--Sort(ORDER BY:([T1].[A] ASC))
       |    |--Table Scan(OBJECT:([T1]))
       |--Sort(ORDER BY:([T2].[A] ASC))
            |--Table Scan(OBJECT:([T2]))

The optimizer has switched to a merge join.  This plan may seem pretty reasonable.  SQL Server needs to sort the data, so why not before the join?  Recall that merge join is order preserving so SQL Server does not need to sort again after the join.  However, suppose that we really want the hash join after all.  We might be tempted to add a hint forcing it:

SELECT *
FROM T1 JOIN T2 ON T1.A = T2.A
ORDER BY T1.A
OPTION (HASH JOIN)

Msg 8618, Level 16, State 2, Line 1
The query processor could not produce a query plan because a worktable is required, and its minimum row size exceeds the maximum allowable of 8060 bytes. A typical reason why a worktable is required is a GROUP BY or ORDER BY clause in the query. If the query has a GROUP BY or ORDER BY clause, consider reducing the number and/or size of the fields in the clause. Consider using prefix (LEFT()) or hash (CHECKSUM()) of fields for grouping or prefix for ordering. Note however that this will change the behavior of the query.

What happened?  Recall that hash join is not order preserving.  Thus, SQL Server must sort after the join.  The sort requires a work table which must include all columns to be sorted.  In this example, the columns include T1.B and T2.B.  These two CHAR(8000) columns together exceed the 8060 byte row size limit, SQL Server cannot create the work table, and the query fails.  Note that the hash join itself is not a problem (the original query above succeeds with a hash join) but rather it is the sort that causes the failure.

Let's try another example:

SELECT *
FROM (T1 JOIN T2 ON T1.A = T2.A) JOIN T3 ON T1.A = T3.A

  |--Nested Loops(Inner Join, WHERE:([T2].[A]=[T3].[A]))
       |--Hash Match(Inner Join, HASH:([T3].[A])=([T1].[A]), RESIDUAL:([T1].[A]=[T3].[A]))
       |    |--Table Scan(OBJECT:([T3]))
       |    |--Table Scan(OBJECT:([T1]))
       |--Table Scan(OBJECT:([T2]))

Notice that this query uses one hash join and one nested loops join.  Suppose that we really want two hash joins.  We might again be tempted to use a hint:

SELECT *
FROM (T1 JOIN T2 ON T1.A = T2.A) JOIN T3 ON T1.A = T3.A
OPTION (HASH JOIN)

This query fails with the same 8618 error that we saw above.  Why?  Like a sort, a hash join also requires a work table.  In fact, a hash join requires work tables for both the build and probe (i.e., left and right inputs).  No matter which two tables SQL Server joins first, the results of that join, which includes two CHAR(8000) columns, must be joined with the third and final table.  This second join cannot use a hash join as SQL Server cannot build a work table that exceeds the 8060 byte row size limit.

There are many more examples possible where hints may cause a query to fail due to requiring a work table that exceeds the 8060 byte row size limit.  One solution to this problem is, of course, to remove the offending hint.  Another solution is to replace large fixed length CHAR columns with variable length VARCHAR columns.