SQL Server Performance Analysis Part IV
PART IV: "How To" get down to whatever is bringing SQL to its knees...
At this point, one should know already whether or not they have memory pressure / cpu bottleneck / disk issues (previous blog posts). With that information, we now filter down and look for the cause.
What is truly causing SQL Server's performance to degrade:
SQL Blocker script:
https://support.microsoft.com/kb/298475
https://support.microsoft.com/kb/271509/
Suport WebCast: Microsoft SQL Server: Rapid Blocker Script Analysis
https://support.microsoft.com/kb/324885
This is a great script that provides us with a lot more information than we give it credit for.
Analyzing this script can be tedious - depending on it size - so we recommend using SQL Nexus (explanation of this tool on another blog post)
However should you analyze it using a editor, well then this is what we can get from it:
1. Blocking Information
If SQL Server is suffering from blocking - this will be the data that will help you identify what spids / what commands are involved. Please note that profiler should be used in conjunction with the blocker script in order to get the complete statement / command being executed.
The blocker script is takes a snapshot of sysprocesses every (default) 5 seconds and identifies blocking the following manner:
SPIDs at the head of blocking chains
spid
------
85
92
175
One can then look at see what the following spids where executing and why was this causing blocking. The two most obvious (and common) reasons for this is
1. Un-optimized queries - one can then look into profiler trace (performance events) or even execution plans of the commands
2. Spids "waiting for command" - spids that are waiting for user input - this could be from a commit to rollback to any other command. This spid is basically sitting on locks and waiting for a command to either let go or continue. The open_transactions will equal to 1 in this case.
2. Execution Information:
This script also gives us information in regards to CPU, duration, reads, writes, and other data relevant to the execution of the statement and work that spid has been doing (should he be executing a batch of statements). These counters are also a good way to see if a spid (not involved in locks but possible head blocker) is still doing any work - if the counters are still increasing then the spid is still alive.
Other useful information is the hostname (pinpoint source), dbid (where is all the action happening) and loginname.
NOTE:
DBID is quite handy as you many have a situation in which you don't have blocking but your SQL's performance is slow. Through this script you can see what activity is going on your server and on which database. If you see lots of spids on tempdb, you may have a tempdb contention. This is one example of how useful the information given by this script is.
https://support.microsoft.com/kb/328551 --> section MORE INFORMATION
"From the sysprocesses system table output, the waitresource may show up as "2:1:1" explain dbid, fileid , page num (PFS Page) or "2:1:3" (SGAM Page). Depending on the degree of contention, this may also lead to SQL Server appearing unresponsive for short periods."
Note:
Waitresource gives information in regards to the object involved:
example:
waitresource=KEY: 9:72057594065256448 (ef008dce9cbe)
waitresource=KEY: 6:72057594057457664
dbid=6 objectname=AdventureWorks.dbo.T2
associatedObjectId=72057594057457664
keylock hobt_id=72057594057457664From the Hobt_id we get all information:
SELECT o.name, i.name
FROM sys.partitions p
JOIN sys.objects o ON p.object_id = o.object_id
JOIN sys.indexes i ON p.object_id = i.object_id
AND p.index_id = i.index_id
WHERE p.hobt_id = 72057594065256448example 2:
Resource Format Example
Table DatabaseID:ObjectID TAB: 5:261575970
In this case, database ID 5 is the pubs sample database and object ID 261575970 is the titles table.
Page DatabaseID:FileID:PageID PAG: 5:1:104
In this case, database ID 5 is pubs, file ID 1 is the primary data file, and page 104 is a page belonging to the titles table.
Key DatabaseID:ObjectID:IndexID (Hash value for index key) KEY: 5:261575970:1 (5d0164fb1eac)
In this case, database ID 5 is pubs, object ID 261575970 is the titles table, index ID 1 is the clustered index, and the hash value indicates the index key value for the particular row.https://msdn.microsoft.com/en-us/library/ms178104.aspx
This leads to the third most interesting aspect of this script result:
3. Waittypes | WaitResource | Waittime
The information provided here can tell us quite a lot of what is going on.
Waittypes: |
The value in the waittype field indicates the type of resource that the connection is waiting on. |
WaitResources: |
The waitresource column provides more detailed information about the specific resource that the SPID is waiting on. |
Waittime: |
The waittime column provides the number of milliseconds that the SPID has been waiting with the current waittype. |
SQL Server 2000 Wait Types
https://www.sqldev.net/misc/waittypes.htm
Description of the waittype and lastwaittype columns in the master.dbo.sysprocesses table in SQL Server 2000 and SQL Server 2005
https://support.microsoft.com/kb/822101
Best way to explain how useful this information is, is by providing an example(s):
1. Tempdb Contention
Spid |
Status |
Blocked |
Open_Transactions |
WaitResource |
Waittype |
142 |
Suspended |
152 |
2 |
2:1:6508 |
0x0034 |
144 |
Suspended |
152 |
2 |
2:1:6508 |
0x0034 |
152 |
Runnable |
2 |
2:1:6508 |
0x0000 |
Most of the SPIDS are blocked with waitresource = '2:1:X' (example) and waittype as "PAGELATCH_XX".
This seems to contribute to majority of the blocking. The wait resource the '2:1:X' indicates blocking on SGAM page in the tempdb database.
2. Running out of Threads
SYSPROC FIRST PASS
Spid |
Ecid |
Waittype |
17 |
0 |
0x0046 |
52 |
0 |
0x0046 |
90 |
0 |
0x020B |
165 |
0 |
0x0046 |
190 |
0 |
0x0208 |
...
227 |
24 |
0x0208 |
227 |
25 |
0x0046 |
227 |
26 |
0x0208 |
232 |
55 |
0x0046 |
232 |
56 |
0x0046 |
235 |
0 |
0x0800 |
...
244 |
40 |
0x0208 |
247 |
0 |
0x0046 |
248 |
0 |
0x0046 |
268 |
0 |
0x0046 |
270 |
0 |
0x0208 |
...
270 |
22 |
0x0208 |
270 |
23 |
0x0046 |
270 |
24 |
0x0208 |
UMSTHREAD 0x46 This waittype indicates that a batch has been received from a client application but that there are no worker threads that are available to service the request. If you consistently see 0x0046 waittypes for multiple SPIDs, there is a significant bottleneck elsewhere in the system that is using all the available worker threads. Note that the waittime column is always 0 for the UMSTHREAD waittype, and the lastwaittype column may erroneously show the name of a different waittype instead of UMSTHREAD."
Note:
Execution context identifier (ECID). The execution context ID of a given thread associated with a specific SPID.
ECID = {0,1,2,3, ...n}, where 0 always represents the main or parent thread, and {1,2,3, ...n} represent the subthreads.
In this case, we would need to analyze why we are running out of threads - or better yet - why aren't the other spids letting go of theirs.
3. CXPacket
Spid |
Kpid |
Waittype |
Waittime |
lastwaittype |
<SPID> |
4224 |
0x00BA |
546 |
CXPACKET |
<SPID> |
148 |
0x00BA |
546 |
CXPACKET |
<SPID> |
5764 |
0x00BA |
1234 |
CXPACKET |
<SPID> |
6008 |
0x00BA |
718 |
CXPACKET |
<SPID> |
3164 |
0x00BA |
5093 |
CXPACKET |
<SPID> |
268 |
0x00BA |
5125 |
CXPACKET |
<SPID> |
1760 |
0x00BA |
5125 |
CXPACKET |
<SPID> |
5704 |
0x00BA |
5078 |
CXPACKET |
Note: Many times the CXPacket spid will be in the "suspended" state.
A cxpacket wait basically means that part of the job is done but is waiting on another part of the job to finish. Queries that are involved in these wait types have a cost threshold longer than the (default) 5 second hence they need to be optimized to run quicker (indexes / tuning) and maxdop needs to configured correctly (see previous posts).
Please note that this script should not be used solemnly if you suffer from blocking issues. This script provides us with a look at what is going on with SQL for a duration of time. It tells us who (spid / hostname/ login name / application name), what (commands / reads / writes / duration / cpu), where (dbid), and when (start time / end time) is happening in SQL.
Next Blog Post: Part V – PerfStats ….
Comments
- Anonymous
January 01, 2003
PingBack from http://microsoft.linkedz.info/2009/04/29/sql-server-performance-analysis-part-iv/