Why Restarting SQL Server fixed all the query performance issues and CPU decreased from 100% to 20%

Ashif Shaikh 156 Reputation points
2021-04-06T21:16:38.233+00:00

Hello All,

Just intrigued to get some insights on why all the query performance issue with my SQL Server 2016 SP2 CU5 (Enterprise) was fixed by just a SQL Server restart.

Some background:

We had a network degradation starting from 7:30 AM however it was resolved by 10:30 AM but unfortunately the queries were still behaving badly and became worst once the load (throughput) increased on the SQL Server.

We have a 5 nodes Always on Cluster with 3 nodes in a Data Center and 2 nodes in AWS and the network degradation happened between the physical data center and AWS (Writer node is in the physical data center and 30% of the traffic comes from AWS)

I got the following error during the network issue at 10 AM:

DESCRIPTION: A fatal error occurred while reading the input stream from the network. The session will be terminated (input error: 10060, output error: 0).

Apart from this particular error there were no other errors specifically pointing to any degradation such as memory pressure\disk IO etc.

During the high CPU on the DB all I can see that even the smallest query (querying tables on few rows) is also taking time.

There were no memory pressure alerts\errors and disk was also performing optimally.

IMO it seems to be related to plan cache which eventually got cleared after the restart but how can I prove it??

SQL Server
SQL Server
A family of Microsoft relational database management and analysis systems for e-commerce, line-of-business, and data warehousing solutions.
12,700 questions
SQL Server Migration Assistant
SQL Server Migration Assistant
A Microsoft tool designed to automate database migration to SQL Server from Access, DB2, MySQL, Oracle, and SAP ASE.
494 questions
Transact-SQL
Transact-SQL
A Microsoft extension to the ANSI SQL language that includes procedural programming, local variables, and various support functions.
4,552 questions
{count} votes

2 answers

Sort by: Most helpful
  1. Erland Sommarskog 100.9K Reputation points MVP
    2021-04-06T21:37:02.76+00:00

    First, you are a bit behind on the CUs. The most recent CU is CU17, you can get it here: https://www.microsoft.com/en-us/download/details.aspx?id=56975.

    Yes, it sounds like a plan-cache issue. That is, you have a bad plan in the cache, and by restarting SQL Server you clear everything out.

    To prove/disprove this hypothesis, next time issue DBCC FREEPROCCACHE to clear the plan cache. If this fixes the issue, it was the plan cache. In such case, Query Store is your friend to find the culprit. (If you have not enabled Query Store, do so!)

    If FREEPROCCACHE does not help, it is obviously something else. I would start with looking into the SQL Server errorlog to see what is reported there. And I would definitely slap on CU17, as it could be an issue that has been fixed.

    1 person found this answer helpful.
    0 comments No comments

  2. Ashif Shaikh 156 Reputation points
    2021-04-06T22:01:38.997+00:00

    84997-wait-type.png

    Some more info on the waits during the issue.

    0 comments No comments