Detailed Network Sequence for ASP.NET SQL Transactions

asked 2022-12-30T17:14:37.847+00:00
Nathon Dalton 1 Reputation point

I'm trying to find a resource that will explain the sequence that occurs (in detail) when an ASP.NET application using Entity Framework performs a transaction in SQL Server 2016. Specifically what I'm looking for is what are all of the actions that occur (i.e. .NET events, EF events, packets sent, packets received, protocols used, SQL events, etc.) when a developer updates a record in a table for instance.

I'll explain the situation I'm trying to resolve. I have a large (2+TB) SQL 2016 database running on a virtualized Windows 2012 Server. When SQL host CPU load is high (70%+), we begin to see EF connection counts climb and eventually break the max connection pool limit. There begin to be a large number of sleeping connections, long locks, timeouts, deadlocks and if left long enough... connections break. They keep trying to reconnect, but it's as if the SQL Server refuses to accept any connections. At this point the connection count is above 750 on the SQL Server. When the host CPU load is manageable, everything seems to run fine and connections remain stable.

I found this article that seems to sum up the symptoms I'm seeing pretty well. I'm thinking with the combination of high CPU load and the virtualization layer between the SQL host and SQL Server, there's enough latency that RPC calls are being delayed enough that SQL is slowing enough to start causing transactions to start backing up and causing an ever increasing problem.

However, I'm looking for a good way to see the problem in action, which means I need to understand the entire sequence that is occurring. Are there counters I can watch? Do I need to sniff the network? Any help would be appreciated.

https://sqlperformance.com/2019/07/sql-performance/unintended-side-effects-sleeping-sessions-holding-locks

SQL Server
SQL Server
A family of Microsoft relational database management and analysis systems for e-commerce, line-of-business, and data warehousing solutions.
8,454 questions
ASP.NET MVC
ASP.NET MVC
A Microsoft web application framework that implements the model-view-controller (MVC) design pattern.
574 questions
Windows Server Infrastructure
Windows Server Infrastructure
Windows Server: A family of Microsoft server operating systems that support enterprise-level management, data storage, applications, and communications.Infrastructure: A Microsoft solution area focused on providing organizations with a cloud solution that supports their real-world needs and meets evolving regulatory requirements.
424 questions
No comments
{count} votes

4 answers

Sort by: Most helpful
  1. answered 2022-12-30T19:09:40.987+00:00
    Bruce (SqlWork.com) 27,846 Reputation points

    It sounds like you have too issues

    1. Hitting max performance of Sqlserver. You can optimize your queries and or scale up Frye server
    2. You error handling does not release resources (maybe fails to dispose) and causes worse issues.

    If you are going to run a Sqlserver at max (not necessary bad), the clients need to detect overload and respond correctly, typically with holdbacks, or too busy responses

    You can use sql profiler to monitor the transactions and see dangling connections.

    No comments

  2. answered 2022-12-30T22:05:32.903+00:00
    Nathon Dalton 1 Reputation point

    Thanks Bruce. I did trace the connection string that is growing beyond the connection pool size and the code is closing and disposing of the connection inside of a finally block, so even under exceptions it should be disposing of the connections properly. The SQL Server itself isn't under maximum load though. It's the SQL's VM host that's showing > 70% CPU when this occurs. The SQL Server is showing 40-70% CPU. This is similar to what I saw in the article. The SQL Server is waiting on the commands to close our transactions.

    Your idea for adding code to handle times of max load with client-side messaging is a good idea. I'll look into that.

    Overall, I want to understand why we're running fine when the host's load is manageable, but exceeding the max connection pool size when it's high. The web server is practically idle at all times, even when we're at our peak usage. CPU, memory, IIS, .NET queues, everything is perfectly fine, so that's not a concern.

    We already have 30 CPUs and 768GB RAM on this instance. With this configuration we're running in the neighborhood of 70% CPU a lot of the time with regular peaks to 100% and this SQL Server is the only VM running on the host. We're planning to move off of virtualization soon to see if going to flat hardware will help. I know SQL is very latency sensitive.


  3. answered 2022-12-30T22:11:55.287+00:00
    Erland Sommarskog 67,481 Reputation points Microsoft MVP

    If SQL Server does not accept connections, this could because you have exhausted all worker threads. If you look in sys.dm_os_waiting_tasks, you will see a lot of entries with NULL in waiting_session_id and with a wait type of THREADPOOL.

    The symptoms you describe sounds to me like query timeouts that are not handled correctly. When you submit a batch to SQL Server and this batch starts a transaction, and this batch is later cancelled, the transaction will not rolled back. Most client APIs, and this includes SqlClient, have a default timeout of 30 seconds. If SQL Server does not complete in this time frame, they get tired of waiting and tells SQL Server to cancel execution, and throw the error "timeout expired". On this error (as with any other error for that matter), the client must respond with "IF @@trancount > 0 ROLLBACK TRANSACTION". If the process jogs along with the transaction, yes, you will get a mess.

    One tool you can use to see if you have something like this is my beta_lockinfo which you gives a snapshot of what is going on in SQL Server, including the locks.

    No comments

  4. answered 2023-01-04T16:57:21.783+00:00
    Nathon Dalton 1 Reputation point

    Thank you to everyone for the responses. We're actively hunting down the problem. I'll mark the most appropriate post as the answer or will provide the answer as soon as we're able to nail down the root cause. Thanks again!

    No comments