Unexplainabe slow execution of a query

Igor Gelin 21 Reputation points
2021-11-05T23:54:29.723+00:00

Hi all,

I have two queries which return same output, except that query 1 runs in less than 1 second and query 2 runs in 26 minutes.

  1. select OIdA into #test from A where A.User = 123456
    SELECT AD.* FROM AD where AD.OIdA in (select OIdA from #test)
  2. SELECT AD.* FROM AD, A WHERE A.OIdA = AD.OIdA AND A.User = 123456

Additional info

  1. SQL Server 2016.
  2. A and AD are large tables with millions of rows. Some records from tables A and AD are deleted on daily basis.
  3. OIdA is primary key on A.
  4. OIdAD is primary key on AD.
  5. AD has an index on columns OIdA, OIdAD.

Thank you for your help.

Transact-SQL
Transact-SQL
A Microsoft extension to the ANSI SQL language that includes procedural programming, local variables, and various support functions.
4,536 questions
{count} votes

Accepted answer
  1. Erland Sommarskog 100.1K Reputation points MVP
    2021-11-07T11:24:07.567+00:00

    It would have helped to see the actual execution plans, and not only the estimated ones. Often in cases like this, there are miss-estimates, and they are revealed by the actual plans.

    It seems though, that the optimizer expects that the condition in table A to return 2525. The temp table on the other hand has an estimate of 255 rows, and I guess that estimate comes from a previous execution with the same value for UserID.

    For that few rows, the optimizer thinks that a nested loop join + key lookup is the cheapest solution., but for the tenfold number of rows, the optimizer thinks it's cheaper to scan the table AD once for a hash joins. The optimizer is predisposed against loop joins and key lookups, and I would not have been so surprised, if AD had had some 10 million rows, but now we are talking 867 million. But one answer may be found in the estimate for how many rows the optimizer estimates the operation to produce in total: For the join with the temp table, it estimates 868000 rows, and for the direct join a whopping number of 12 million. 12 million key lookups is not fun.

    What is the actual number returned?

    As for why the estimates are out of whack, it could be that statistics are out of date, and you could try running UPDATE STATISTICS on the tables. And to get the most accurate statistics you could try

    UPDATE STATISTICS A IX_A_User A WITH FULLSCAN
    UPDATE STATISTICS AD IX_A WITH FULLSCAN
    

    But I am not sure how much this is going to help. Also, why the nested loop join may be the best choice for this user id, it may be different for other user ids. Then again, I don't know how you actually use this query. May this particular user is the important one?

    1 person found this answer helpful.

2 additional answers

Sort by: Most helpful
  1. Erland Sommarskog 100.1K Reputation points MVP
    2021-11-07T19:59:55.293+00:00

    IX_A_User and IX_A are indexes. Are they suppose to be always up to date? Why would I need to update statistics on them?

    The indexes are of course always up to date. However, the statistics are necessarily not. SQL Server updates these when enough changes have occurred, and by default it uses sampled statistics. Which often is good enough, but not always.

    As a matter of fact, for these particular tables nested loop join + key lookup is the best solution.

    I am not so sure with the given estimates. For this user, there is a gross over-estimate, but that makes me think that there are user with tons of rows. You can try this query (although it will as long time as query 2):

    SELECT A."User", COUNT(*)
    FROM   A
    JOIN   AD ON AD.FOldA = A.OldA
    GROUP  BY A."User"
    ORDER BY 2 DESC
    

    I suspect you will find some users with very many rows. Then again, these may be system users that you will never query about.

    Rather than using the temp table, you can force the loop join by adding the hint

    OPTION (LOOP JOIN)
    

    to the end of the query.

    1 person found this answer helpful.

  2. Igor Gelin 21 Reputation points
    2021-11-07T01:10:07.68+00:00

    I attached the query plans in XML format.

    146929-query1.txt146930-query2.txt

    0 comments No comments