Optimizing I/O Performance by Sorting – Part 2
In my last post, I discussed how SQL Server can use sorts to transform random I/Os into sequential I/Os. In this post, I'll demonstrate directly how such a sort can impact performance. For the following experiments, I'll use the same 3 GByte database that I created last week.
The system I'm using to run this test has 8 GBytes of memory. To exaggerate the performance effects and simulate an even larger table that does not fit in main memory, I'm going to adjust the ‘MAX SERVER MEMORY' SP_CONFIGURE option to allow SQL Server to use just 1 GByte of memory. I'm going to use CHECKPOINT to ensure that the newly created database is completely flushed to disk before running any experiments. Finally, I'm going to run DBCC DROPCLEANBUFFERS before each test to ensure that none of the data is cached in the buffer pool between tests.
CHECKPOINT
EXEC SP_CONFIGURE 'SHOW ADVANCED OPTIONS', '1'
RECONFIGURE
EXEC SP_CONFIGURE 'MAX SERVER MEMORY', '1024'
RECONFIGUREDBCC DROPCLEANBUFFERS
Note that you will NOT want to run these statements on a production server.
As I discussed last week, SQL Server can use one of three plans for the following query depending on the value of the constant:
SELECT SUM(Data)
FROM T
WHERE RandKey < constant
To recap, if the constant is small, SQL Server uses a non-clustered index seek and a bookmark lookup. If the constant is large, SQL Server uses a clustered index scan to avoid performing many random I/Os. Finally, if the constant is somewhere in the middle, SQL Server uses the non-clustered index seek but sorts the rows prior to performing the bookmark lookup to reduce the number of random I/Os. You can review last week's post to see examples of each of these plans. I'm going to focus on the third and final plan with the sort.
To demonstrate the benefit of the sort, I need to be able to run the same query with and without the sort. A simple way to make SQL Server remove the sort is to use the following UPDATE STATISTICS statement to trick SQL Server into believing that the table is really small. To ensure that I still get the plan with the non-clustered index seek and the bookmark lookup, I need to add an INDEX hint. I'm also adding a RECOMPILE query hint to ensure that SQL Server generates a new plan after I've altered the statistics.
UPDATE STATISTICS T WITH ROWCOUNT = 1, PAGECOUNT = 1
SELECT SUM(Data)
FROM T WITH (INDEX (IRandKey))
WHERE RandKey < constant
OPTION (RECOMPILE)
I can also reset the statistics using the following statement:
UPDATE STATISTICS T WITH ROWCOUNT = 25600000, PAGECOUNT = 389323
Here is an example of the default plan with the real statistics and with the sort:
|--Compute Scalar(DEFINE:([Expr1003]=CASE WHEN [Expr1010]=(0) THEN NULL ELSE [Expr1011] END))
|--Stream Aggregate(DEFINE:([Expr1010]=COUNT_BIG([T].[Data]), [Expr1011]=SUM([T].[Data])))
|--Nested Loops(Inner Join, OUTER REFERENCES:([T].[PK], [Expr1009]) WITH UNORDERED PREFETCH)
|--Sort(ORDER BY:([T].[PK] ASC))
| |--Index Seek(OBJECT:([T].[IRandKey]), SEEK:([T].[RandKey] < (2000000)) ORDERED FORWARD)
|--Clustered Index Seek(OBJECT:([T].[PK__T__...]), SEEK:([T].[PK]=[T].[PK]) LOOKUP ORDERED FORWARD)
Here is an example of the plan after running UPDATE STATISTICS and without the sort:
|--Compute Scalar(DEFINE:([Expr1003]=CASE WHEN [Expr1009]=(0) THEN NULL ELSE [Expr1010] END))
|--Stream Aggregate(DEFINE:([Expr1009]=COUNT_BIG([T].[Data]), [Expr1010]=SUM([T].[Data])))
|--Nested Loops(Inner Join, OUTER REFERENCES:([T].[PK]))
|--Index Seek(OBJECT:([T].[IRandKey]), SEEK:([T].[RandKey] < (2000000)) ORDERED FORWARD)
|--Clustered Index Seek(OBJECT:([T].[PK__T__...]), SEEK:([T].[PK]=[T].[PK]) LOOKUP ORDERED FORWARD)
Here are my results running this query with two values of the constant both with and without the sort. Keep in mind that these results depend greatly on the specific hardware. If you try this experiment, your results may vary.
|
Execution Time |
% Increase | ||
with Sort |
without Sort | |||
Constant |
2,000,000(0.2% of rows) |
91 seconds |
352 seconds |
286% |
4,000,000(0.4% of rows) |
97 seconds |
654 seconds |
574% | |
% Increase |
100% |
6% |
86% |
|
There are a two points worth noting regarding these results. First, it should be very clear that the plan with the sort is significantly faster (up to 7 times faster) than the plan without the sort. This result clearly shows the benefit of sequential vs. random I/Os. Second, doubling the number of rows touched had hardly any effect on the execution time for the plan with the sort but nearly doubled the execution time for the plan without the sort. Adding additional I/Os to the plan with the sort adds only a small incremental cost since the I/Os are sequential and the disk head will pass over the required data exactly once either way. Adding additional I/Os to the plan without the sort adds additional disk seeks and increases the execution time proportionately to the increase in the number of rows. In fact, if the constant is increased further, the execution time of the plan with the sort will continue to increase only gradually with the execution time of the plan without the sort will continue to increase rapidly.
Comments
Anonymous
March 04, 2009
PingBack from http://www.clickandsolve.com/?p=17808Anonymous
March 18, 2009
In my past two posts, I explained how SQL Server may add a sort to the outer side of a nested loops joinAnonymous
November 01, 2010
Hi, what about the same test on SSD disks? It will be interesting...Anonymous
November 04, 2010
Alas, I do not have access to a machine with SSDs on which to re-run this experiment. However, I agree that sorting is not likely to help with SSDs which have much better random I/O performance compared to compared to rotating disks. (FWIW, if the table is large enough that it does not fit in memory and if a query accesses each page more than once, there is always the chance that sorting will help by ensuring that pages are not discarded from memory before the query is done with them.)Anonymous
September 16, 2012
I do not agree with calling the IO patterns after sorting "sequential". The IO locations are ever-increasing but they are not sequential. Sorting reduces the seek time but a seek still happens. Seeks are not free on SSDs either. SSDs still become 10x slower when going from sequential to random. (Disks get 100x slower.)