Send a email report of query store regressed/query with high variations

udhayan d 181 Reputation points
2023-02-24T11:40:32.9333333+00:00

Hi ,

I recently faced a P1 situation due to 100% CPU usage in my PRINCIPAL DB.

Root cause is the regressed query plan. The execution plan changed from index seek to a scan and the execution time spiked from <3ms to >30s.

I still dont know the exact reason for the execution plan change because there was no schema change or update stats on that table before the plan change and only difference I notice from query store is the execution count increased from ~600 to 1000. I have forced the known good plan.

1st Question: what are the factors/causes that trigger a plan change automatically in SQL 2017 STD. What could have triggered a query plan change? How to identify that?

2nd Question: How can I avoid this situation in the future?

I am thinking of getting an email alert with the report of regressed/Query with high variations based on the CPU usage. Is it possible to get those details from the dmvs related to query store? Is there already solution in any other forums/blogs

I am using STD edition, so automatic plan correction is not an option for me.

I dont think I will be able to control the Optimizer behavior of changing the query execution plan, but if I were to face a similar situation again, if I have a email alert with the report highlighting queries with high variation I can take a look at it and action before the issue escalates into a P1 bringing down SQL server to its knees.

SQL Server
SQL Server
A family of Microsoft relational database management and analysis systems for e-commerce, line-of-business, and data warehousing solutions.
12,816 questions
{count} votes

Accepted answer
  1. Erland Sommarskog 101.9K Reputation points MVP
    2023-02-24T16:01:35.5666667+00:00

    because there was no schema change or update stats

    How did you verify that the last part? Keep in mind that autostats can kick in often. In my experience, it is uncommon that queries run with multiple plans during the course of a day. This is particularly true with temp tables, since they often trigger recompilation.

    For a list of other recompile reasons, see this query:

    SELECT * FROM sys.trace_subclass_values WHERE trace_event_id = 166 AND trace_column_id = 21
    
    

    As for avoiding this in the future, you need to review your queries very carefully. Yes, that can be quite a daunting task. You first need to understand why this plan changed. The interetsing part is not what triggered the recompile, but why you got such a bad plan. Most likely the query was sniffed for a value that is untypical. So while the plan was good for the sniffed parameter value, it was not good for more commonly used parameter values.

    I'm not sure setting up a mail alert is the way to do. There is a report in SSMS for regressed queries. You can also write your own queries against the Query Store tables.

    Further reading: I have written about parameter sniffing in this article: https://www.sommarskog.se/query-plan-mysteries.html.

    Enrico van der Laar has a series of article on Query Store on http://www.simple-talk.com/sql/database-administration/the-sql-server-2016-query-store-overview-and-architecture/. I think this includes some overview over the QS tables.


0 additional answers

Sort by: Most helpful