共用方式為


How It Works: Error 18056 - The client was unable to reuse a session - Part 2

I have had several questions on my blog post: https://blogs.msdn.com/b/psssql/archive/2010/08/03/how-it-works-error-18056-the-client-was-unable-to-reuse-a-session-with-spid-which-had-been-reset-for-connection-pooling.aspx related to SQL Server 2008's honoring of an query cancel (attention) during the processing of the reset connection.  This blog will augment my prior post.

Facts

  • You will not see the sp_reset_connection on the wire when tracing the network packets.   It is only a bit set in the TDS header and not RPC text in the packet.
  • sp_reset_connection is an internal operation and generates RPC events to show its activity.
  • Newer builds of SQL Server added logical disconnect and connect events. https://blogs.msdn.com/b/psssql/archive/2007/03/29/sql-server-2005-sp2-trace-event-change-connection-based-events.aspx
  • An attention from the client (specific cancel or query timeout) records the time it arrives (out-of-band) but the attention event is not produced until the query has ceased execution, honored the attention.   This makes the start time of the attention the received time, the end time the complete honor time and the duration how long it took to interrupt the execution, handle rollback operations if necessary and return control of the session to the client.

The questions normally center around the Error 18056, State 29 and how one can encounter it.   I have outlined the high level flow in the diagram below for producing the error.

The application will reuse a connection from the pool.   When this occurs the client driver will set the reset bit in the TDS header when the next command is executed.  In the diagram I used an ODBC example of SQLExecDirect.

  1. The command is received at the SQL Server, assigned to a worker and begins processing.  If the reset bit is located the sp_reset_connection logic is invoked.
  2. When tracing the RPC:Starting and logical Disconnect events are produced.
  3. The login is redone; checking permissions, making sure password has not expired, database still exists and is online, user has permission in the database and other validations take place.  
  4. Client explicitly cancels (SQLCancel) or query timeout is detected by client drivers and an attention is submitted to the SQL Server.   The attention is read by the SQL Server, starting time captured and the session is notified of the cancel request, STOP!  (Note:  This is often a point of confusion.  The overall query timeout applies to reset login and execution of the query in this scenario.)
  5. During all these checks the logic will also check to see if a query cancellation (attention) has arrived.  If so the Redo Login processing is interrupted the 18056 is reported and processing is stopped.
  6. The attention event is always produced after the completed event.  (When looking at a trace look for the attention event after the completed event to determine if the execution was cancelled.)  This allows the attention event to show the duration required to honor the attention.  For example, if SET_XACT_ABORT is enabled an attention will upgrade to a rollback of the transaction.   If it was a long running transaction the rollback processing could be significant.   Without SET_XACT_ABORT the attention interrupts processing as quickly as possible and leaves the transaction active.   The client is then responsible for the scope of the transaction.

image

The "If Cancelled" used by Redo Login is where the change occurs between SQL 2005 and SQL 2008.   The cancel was not checked as frequently in SQL 2005 so it was not honored until the command execution started.   SQL Server 2008 will honor the attention during the redo login processing. 

Here was an example that I received that will show the behavior.   Notice that the execution (rs.Open) is done asynchronously so control returns to the client as soon as the query is put on the wire to the SQL Server.   The cn.Cancel following the rs.Open will submit the attention for the request that was traveling to the SQL Server.   This will produce the same pattern as shown in the diagram above, interrupting the Redo Login.  If you were not using pooled connections the reset activity would not be taking place and the query itself would be interrupted.

dim cn

dim rs

set cn = CreateObject("ADODB.Connection")

set rs = CreateObject("ADODB.Recordset")

for i = 1 to 1000

                cn.Open "Provider=SQLNCLI10;Integrated Security=SSPI;Data Source=SQL2K8Server; initial catalog =whatever;"

                rs.ActiveConnection = cn

                rs.CursorLocation = 2

                             ‘ 48 = adAsyncExecute + adAsyncFetch

                rs.Open "select * from whatever", cn, 0, 1, 48

                cn.Cancel

                cn.Close

next

Internally an attention is raised as a 3617 error and handled by the SQL Server error handlers to stop execution of the request.   You can see the 3617 errors in the sys.dm_os_ring_buffers.  You can watch them with the trace exception events as well.

<Record id= "1715" type="RING_BUFFER_EXCEPTION" time="12558630"><Exception><Task address= 0x11B4D1B88</Task><Error>3617</Error><Severity>25</Severity><State>23</State><UserDefined>0</UserDefined></Exception><Stack

Bob Dorr - Principal SQL Server Escalation Engineer