question

IgorGelin-0063 asked

Unexplainably slow execution of a query

Hi all,

I have two queries which return the same output, except that query 1 runs in less than 1 second and query 2 runs in 26 minutes.

  1. select OIdA into #test from A where A.User = 123456
    SELECT AD.* FROM AD where AD.OIdA in (select OIdA from #test)

  2. SELECT AD.* FROM AD, A WHERE A.OIdA = AD.OIdA AND A.User = 123456

Additional info
1. SQL Server 2016.
2. A and AD are large tables with millions of rows. Some records from tables A and AD are deleted on a daily basis.
3. OIdA is the primary key on A.
4. OIdAD is the primary key on AD.
5. AD has an index on columns OIdA, OIdAD.

Thank you for your help.

sql-server-transact-sql

To be able to answer this question, we would need to see the actual query plans in XML format. You can attach them here (if you change the file extension) or upload them to http://www.brentozar.com/pastetheplan.

ErlandSommarskog answered

It would have helped to see the actual execution plans, and not only the estimated ones. Often in cases like this there are misestimates, and they are revealed by the actual plans.
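If you want to capture the actual plan from T-SQL rather than through SSMS's "Include Actual Execution Plan" button, something like this should work (using query 2 from the question; "User" is quoted since USER is a reserved keyword):

```sql
-- Emit the actual execution plan, including runtime row counts,
-- as an extra XML result set alongside the query's own results.
SET STATISTICS XML ON;

SELECT AD.* FROM AD, A WHERE A.OIdA = AD.OIdA AND A."User" = 123456;

SET STATISTICS XML OFF;
```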

It seems, though, that the optimizer expects the condition on table A to return 2525 rows. The temp table, on the other hand, has an estimate of 255 rows, and I guess that estimate comes from a previous execution with the same value for UserID.

For that few rows, the optimizer thinks that a nested loop join + key lookup is the cheapest solution, but for the tenfold number of rows it thinks it is cheaper to scan the table AD once for a hash join. The optimizer is predisposed against loop joins and key lookups, and I would not have been so surprised if AD had had some 10 million rows, but here we are talking 867 million.

One answer may be found in the total number of rows the optimizer estimates the operation to produce: for the join with the temp table it estimates 868000 rows, and for the direct join a whopping 12 million. 12 million key lookups is not fun.

What is the actual number returned?

As for why the estimates are out of whack, it could be that statistics are out of date, and you could try running UPDATE STATISTICS on the tables. And to get the most accurate statistics you could try

   UPDATE STATISTICS A IX_A_User WITH FULLSCAN
   UPDATE STATISTICS AD IX_A WITH FULLSCAN

But I am not sure how much this is going to help. Also, while the nested loop join may be the best choice for this user id, it may be different for other user ids. Then again, I don't know how you actually use this query. Maybe this particular user is the important one?
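If you want to check first whether the statistics are actually stale, a query along these lines shows when each statistics object on the two tables was last updated, how much was sampled, and how many modifications have happened since (assuming the table names A and AD from the question):

```sql
-- Inspect statistics freshness for tables A and AD.
SELECT s.name, sp.last_updated, sp.rows, sp.rows_sampled, sp.modification_counter
FROM   sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE  s.object_id IN (OBJECT_ID('A'), OBJECT_ID('AD'));
```

A low rows_sampled relative to rows, or a high modification_counter, would support updating with FULLSCAN.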


Erland,

This user is no more important than other users.
The output of the query is about 3K rows.
As a matter of fact, for these particular tables nested loop join + key lookup is the best solution.
Scanning the table AD is an awful choice.
IX_A_User and IX_A are indexes. Are they supposed to be always up to date? Why would I need to update statistics on them?

Thank you for your help.

IgorGelin-0063 answered

I attached the query plans in XML format.




query1.txt (30.6 KiB)
query2.txt (16.9 KiB)

ErlandSommarskog answered

IX_A_User and IX_A are indexes. Are they suppose to be always up to date? Why would I need to update statistics on them?

The indexes are of course always up to date. However, the statistics are not necessarily. SQL Server updates them when enough changes have occurred, and by default it uses sampled statistics, which often is good enough, but not always.

As a matter of fact, for these particular tables nested loop join + key lookup is the best solution.

I am not so sure, given the estimates. For this user there is a gross over-estimate, but that makes me think that there are users with tons of rows. You can try this query (although it will take as long as query 2):

   SELECT A."User", COUNT(*)
   FROM   A
   JOIN   AD ON AD.OIdA = A.OIdA
   GROUP  BY A."User"
   ORDER BY 2 DESC

I suspect you will find some users with very many rows. Then again, these may be system users that you will never query about.

Rather than using the temp table, you can force the loop join by adding the hint

OPTION (LOOP JOIN)

to the end of the query.
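That is, query 2 would become something like the following (a sketch using the table and column names from the question; note that OPTION (LOOP JOIN) constrains every join in the query, not just this one):

```sql
-- Query 2 with a nested loop join forced via a query hint.
SELECT AD.*
FROM   AD
JOIN   A ON A.OIdA = AD.OIdA
WHERE  A."User" = 123456
OPTION (LOOP JOIN);
```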


All users have around 2K-3K rows each. We just have a ton of "Users".
I used your suggestion to update statistics on an index. It helped.

Thank you.
