SQL Server Query Optimizer estimates ~1 row, but eq_rows statistics shows ~12000

PetervdH 61 Reputation points
2021-12-17T19:45:26.623+00:00

I have a (composite) index consisting of columns A, B, C and D. The statics are up to date and show EQ_ROWS value 12000 for value 'x' (that is the RANGE_HI_KEY) for column A. When I do a SELECT * FROM <table> WHERE a = 'x' the Estimated Query Plan shows for all operators (index seek on the mentioned composite index, key lookup and nested loop) an estimated number of rows of ~1. How is this possible? I would have thought it to be 12000 for the index seek for example.

Note that in reality my column names have decent names, but for company security reason I cannot use real names. I hope my explanation is clear enough.

SQL Server | Other
{count} votes

Answer accepted by question author
  1. Erland Sommarskog 128.9K Reputation points MVP Volunteer Moderator
    2021-12-24T14:47:48.963+00:00

    I am afraid that I don't have a good answer for you. I know that there are imperfections in Showplan and sometimes information does not display fully accurately. In this case, I notice that there are a few more filter conditions on the DISPENSEEVENT table than just the CLIENTID. The remain predicates are filtered in the Key Lookup, which is just below the Index Seek. It is possible that the total estimate for the predicates is 1 row.

    In any case, it seems to be that there is all reason to consider adding an index on (CLIENTID, MDEMEDICALGUIDANCESTATEID, DISPENSEEVENTPROCESSORIGINID, PARENTDISPENSEEVENTID, MDEAPPROVALSTATEID) INCLUDE (STATE). I need to add the caveat that I only see this query, and I don't know how dynamic these conditions are. That is, if this query exists in multiple permutations, it may not be wise to add indexes for all of them.

    0 comments No comments

12 additional answers

Sort by: Most helpful
  1. PetervdH 61 Reputation points
    2021-12-23T21:36:46.313+00:00

    Hi @Erland Sommarskog , I encountered the situation again in our production system as I described originally. The statistics are a bit different since they are updated overnight, but again a plan for which the estimated rows is 1. So, the parameter value for CLIENTID used to compile the plan is 10814, it uses the index INDX3 with seek preference CLIENTID, the 10814 has 2070,34 as EQ_ROWS. Then, why is the estimated rows not equal to the EQ_ROWS, but 1? I would like to attach the plan and two spreadsheets with histogram data (I have been able to anonymize it), since you asked for that, but it seems these formats (xlsx and sqlplan) are not allowed, so not sure how to do this. Any suggestions how I can get you the files?


  2. PetervdH 61 Reputation points
    2021-12-24T10:08:57.35+00:00

    Hello ErlandSommarskog, I have made screenshots of the excel files. In the indx3 histogram I have skipped a couple of lines to be able to fit it on one screenshot. It just doesn't accept uploads, whatever suffix I try to change it to. I hope this provides enough information for you.

    160337-histogram-indx3.jpg160361-histogram-table.jpg

    0 comments No comments

  3. PetervdH 61 Reputation points
    2021-12-24T23:18:27.767+00:00

    Hi @Erland Sommarskog , thank you very much for your time. I will have a look at whether I can find out if the predicates in the Key Lookup are the cause of this. This is an interesting suggestion to further research. It may take some time, but if I manage to get to the bottom of this, I will certainly let you know what I have found out.


  4. PetervdH 61 Reputation points
    2022-01-04T19:37:54.723+00:00

    Hi @Erland Sommarskog , I had this exact case popping up again in production, so I was able to test your suggestion (comment out other conditions). There was no change to the query plan, it kept doing the exact 1 row estimate. There was one other thing I noticed a few days ago, namely that some of the predicates of the key lookup had no automatically created statistics object. That seems a bit odd to me, since I would have thought the optimizer would have created them since these columns are used as predicates, but apparently the optimizer didn't think this would be useful at all and therefor didn't create them?


Your answer

Answers can be marked as 'Accepted' by the question author and 'Recommended' by moderators, which helps users know the answer solved the author's problem.