SQL Q&A: Predicates, Stored Procedures, and More
Edited by Nancy Michell
The Order of Your Predicates
Q I'm having a problem with SQL Server 2000. I am seeing the optimizer reorder the predicates in a query in a random way. Can you tell me what's happening?
A Predicates are indeed moved around and reordered by the query optimizer for many reasons. One typical example would be:
SELECT x FROM t1 JOIN t2 ON t1.p = t2.f WHERE t1.v = 5 AND t2.u = 10
Depending on various considerations (such as index usage), it is probable that the predicates on each table will be evaluated while the table is scanned before the join. So the order of evaluation of these predicates is not even defined; they might well be evaluated concurrently in different threads.
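If you want to see where the predicates actually landed, you can ask for the estimated plan as text. A minimal sketch, reusing the hypothetical tables from the example above (SET SHOWPLAN_TEXT compiles the batch without executing it):

-- Return the estimated plan as text instead of running the query
SET SHOWPLAN_TEXT ON
GO
SELECT x FROM t1 JOIN t2 ON t1.p = t2.f WHERE t1.v = 5 AND t2.u = 10
GO
SET SHOWPLAN_TEXT OFF
GO

In the output, the single-table predicates (t1.v = 5 and t2.u = 10) typically show up attached to the scan or seek of each table rather than as a filter applied above the join.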
Running Two Versions of SQL Server
Q Can I run both SQL Server™ 2005 and SQL Server 2000 on the same server? Will there be any major performance degradation when using both versions?
A SQL Server 2005 and SQL Server 2000 are supported side by side. Be sure that SQL Server 2000 is installed prior to installing SQL Server 2005; this ensures that if you later uninstall one, the other won't break. As for performance, if both versions run simultaneously and compete for resources such as disk and CPU, you could see some degradation. But there are things you can do to minimize contention for memory and disk resources. You can set a fixed upper limit on memory usage for each instance. For disk usage, you can configure the data files on separate spindles, if available. This assumes, of course, that you plan to run both instances concurrently on an ongoing basis.
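For example, you can cap each instance's memory with sp_configure. This is a minimal sketch; the 2048MB figure is purely illustrative, so pick limits that fit your hardware and workload:

-- Run against each instance; 'max server memory (MB)' is an advanced option
EXEC sp_configure 'show advanced options', 1
RECONFIGURE
GO
EXEC sp_configure 'max server memory (MB)', 2048
RECONFIGURE
GO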
You could also use CPU affinity to assign a given collection of processors to a SQL Server instance, but that's not the best solution. While taking this route allows you to guarantee a certain amount of CPU resource to an instance, it also prevents any single instance from fully utilizing all processors on the box, so you'll never get full use of your hardware.
You should read "SQL Server Consolidation on the 64-Bit Platform". It contains sections on one versus multiple instances, CPU affinity, and Windows® System Resource Manager (WSRM). All of these topics are relevant whenever you consider running more than one SQL Server instance on a box.
Restarting SQL Server
Q I no longer want to restart SQL Server every day. Will I have memory problems if I don't?
A The real question here is why you feel you need to restart every day. The erroneous belief that SQL Server has memory leaks and thus must be restarted frequently is still rampant in the IT world. But leaks really only happen when applications running on top of SQL Server don't follow best practices. Short of running applications that leak, SQL Server simply doesn't need to be restarted on a regular basis.
When you do restart often, you increase your chances of hardware failure (since most problems strike on startup). At the very least, you cause unnecessary downtime, cycle through error logs more quickly, destroy forensic logging information, and reduce an engineer's ability to troubleshoot a real problem.
Executing Stored Procedures
Q I have noticed that at times when there is a heavy load on my server, some stored procedures' execution plans go awry. One of the first things I check when I notice timeouts is whether statistics are current for tables accessed by the stored procedure that times out.
I have also noticed that although a table may have exceeded the threshold that causes auto statistics to update (20 percent of the rows plus 500 have changed), the update wasn't done and therefore I had to do it manually to resolve these timeouts. Could it be that SQL Server did not update the statistics because the server was too busy at the time?
I'm using SQL Server 2000 and the database options for auto statistics update/create are turned on. The database is around 60GB in size and some of the tables involved have roughly 30 million rows.
If I know which stored procedure inserts data into these tables, would it be acceptable to include "update statistics <table name>" in that stored procedure after I insert the rows? When I run the statistics update manually it takes one to two seconds.
A Auto-updating of statistics is not throttled (auto-create is, however). But there are still plenty of reasons why an auto-update might fail. In a heavy-load scenario, the most likely cause is that the auto-update can't acquire the required locks: auto-update never waits for locks, so it fails immediately if it doesn't get the ones it needs.
If you know the statistics entries you want to refresh and if the overhead of one to two seconds is acceptable, there is nothing wrong with a manual statistics refresh inside a stored procedure. However, the syntax you mentioned will refresh all statistics defined on this table, which might be prohibitively expensive.
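For example, UPDATE STATISTICS accepts the name of a single index or statistics object, so you can narrow the refresh. A minimal sketch, with hypothetical table and index names:

-- Refreshes every statistics object on the table (potentially expensive)
UPDATE STATISTICS dbo.Orders
-- Refreshes only the statistics tied to one index
UPDATE STATISTICS dbo.Orders IX_Orders_CustomerId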
Also note that you cannot execute a manual statistics refresh inside an active user transaction. This will fail with error 226: command not allowed within multi-statement transaction.
The Auto Stats trace event only logs successful auto-updates, but trace flag 8721 generates a bit more information: you get an entry in the error log for every update stats attempt, not just the successes. Auto-update stats is set oriented, though, so the trace output tells you how many stats entries on a table were refreshed, not which ones failed. If the number is 0, you know something failed, but not how many entries were involved or which ones.
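A minimal sketch of enabling the trace flag globally (you could instead add -T8721 as a startup parameter):

-- Log every auto-stats attempt to the error log
DBCC TRACEON (8721, -1)
-- ...review the error log, then turn the extra logging back off
DBCC TRACEOFF (8721, -1)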
Why Defragment?
Q I have a production database on disks that are badly fragmented, showing lots of red when graphed. What is the procedure for defragmentation? Right now, the performance counter shows the average disk queue length is 50, with a maximum of 140. The database was configured for autogrow at 10 percent. Is that why this is happening?
Also, how do I determine whether the disks are having performance problems? Right now, I consider the performance counter reading average disk queue length of 50 a hint about what is going on.
Finally, I have heard that fragmentation has less of a performance impact on a Storage Area Network (SAN) than on a physical disk. Is that correct?
A The disk defragmenter shows red for all the space occupied by any file in more than one fragment. A 300GB file in two fragments will create a whole lot of red, but that is completely unrelated to performance.
First, you should dig a little deeper into the amount of fragmentation and its impact on performance. Is this drive used only for this database or is it used for other services as well? If it is just this database, you should be wondering why it's so highly fragmented. If it's used for other things such as files, logs, other databases, and autogrow, it would make sense that it is fragmented at the physical level (see Defrag Best Practices Whitepaper).
If you're using average disk queue length as a measure of performance, it can be misleading. It should be at most 2 to 5 per physical spindle making up the Logical Unit Number (LUN), assuming the disks are dedicated to that LUN. The aggregate number can also be deceptive because it doesn't distinguish reads from writes; look at average read queue and average write queue separately to see where you are experiencing problems.
Autogrow hurts most when databases are small and have a lot of potential growth, which is why you should make your best attempt to size databases properly up front. If you do, the 10 percent default autogrow fires rarely and in larger increments, and the database takes a more appropriate share of disk space.
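For example, you can grow a data file to its projected size in a single operation rather than letting autogrow do it piecemeal. A minimal sketch with hypothetical database and logical file names; take the target size from your own growth projections:

-- Pre-size the data file so autogrow rarely has to fire
ALTER DATABASE Sales MODIFY FILE (NAME = Sales_Data, SIZE = 60GB)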
Fragmentation on a SAN may or may not matter. Some SANs are better at moving sectors around to optimize file access, but fragmentation is still fragmentation. If more than one database and their associated files live on this LUN, you really should have thought about issues like this prior to implementation. Often disks are configured poorly from the beginning, creating problems like this.
Separating Your Drives
Q Is it a recommended practice to use a separate disk drive for tempdb for SQL Server? If so, why should it be separate from the data drive, log drive, or SQL drive?
A If you're referring only to logical drive separation, the benefit really comes from being able to move these components to different physical locations easily later on.
Tempdb is a shared resource on the SQL Server instance; placing it on the same spindles as your data files means database users must endure the I/O overhead of all SQL Server workspace activity (hash joins, cursors, and so on). Putting tempdb on separate physical spindles provides higher throughput because the media serves I/O requests from the database only, instead of handling both database activity and all tempdb operations.
This case is even more evident with log files, which have a different access pattern (sequential versus random). Placing log files with data files means there can be I/O stalls when writing to the log, as the actuator repositions the arm and head over the appropriate section of the spindle depending on whether it is performing database activity or log activity (in other words, it'll be bouncing around a lot). You want to avoid any I/O stalls during log writes, since a transaction will not commit until the data is flushed to stable media. (See "I/O Basics" for more information. For the benefits of designating a separate disk drive for tempdb, see "Optimizing tempdb Performance".)
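If you decide to give tempdb its own drive, the move is done with ALTER DATABASE and takes effect at the next restart. A minimal sketch; tempdev and templog are the default logical file names, and the T: paths are just illustrations:

-- Point the tempdb files at the dedicated drive (applied at next restart)
ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, FILENAME = 'T:\tempdb.mdf')
ALTER DATABASE tempdb MODIFY FILE (NAME = templog, FILENAME = 'T:\templog.ldf')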
Monitoring SQL Server Health
Q I need best practices for SQL Server health monitoring, such as the recommended performance counters, the best way to run trace/profile with minimum impact on SQL Server performance, and so forth. Can you point me in the right direction?
A The whitepaper "Troubleshooting Performance Problems in SQL Server 2005" references many of the performance counters you'll want to monitor and, in some cases, provides threshold values for a counter that indicate a problem (see also "Tips for Using SQL Server Performance Monitor Counters"). The key to running traces in the most efficient manner is to run them locally on the machine running SQL Server and write the output to a local file. See SQL Server Books Online for details on exactly how to do this.
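A minimal sketch of such a server-side trace; the file path, the 50MB maximum file size, and the choice of the SQL:BatchCompleted event (event ID 12) are all just illustrations:

DECLARE @TraceID int
DECLARE @on bit
SET @on = 1

-- Create a trace that writes directly to a local file (50MB maximum)
EXEC sp_trace_create @TraceID OUTPUT, 0, N'C:\Traces\health', 50

-- Capture TextData (1), Duration (13), Reads (16), Writes (17), CPU (18)
EXEC sp_trace_setevent @TraceID, 12, 1, @on
EXEC sp_trace_setevent @TraceID, 12, 13, @on
EXEC sp_trace_setevent @TraceID, 12, 16, @on
EXEC sp_trace_setevent @TraceID, 12, 17, @on
EXEC sp_trace_setevent @TraceID, 12, 18, @on

-- Start the trace (status 0 stops it; status 2 closes and deletes it)
EXEC sp_trace_setstatus @TraceID, 1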
In addition to capturing and analyzing performance counters and profiler information, you may want to identify where the engine is spending its time waiting, which tells you whether you are I/O constrained, CPU constrained, memory constrained, and so on. For more on this see "How do you measure CPU pressure?".
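A minimal sketch of checking accumulated waits in SQL Server 2005:

-- Top wait types by total wait time since the stats were last cleared
SELECT TOP 10 wait_type, waiting_tasks_count, wait_time_ms, signal_wait_time_ms
FROM sys.dm_os_wait_stats
ORDER BY wait_time_ms DESC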
You should also consider implementing Microsoft® Operations Manager (MOM) 2005 as a part of your SQL Server 2005 monitoring plans. The SQL Server 2005 Management Pack is quite useful. Even without it you can get a good understanding of what's going on with your servers just by using the MOM operator console and warehouse and associated reports that expose counters like CPU, Memory, Disk Queuing, and so on.
Stored Procedure Performance
Q I have a server that is running SQL Server 2005. A stored procedure in one of the databases on this server runs very slowly when executed through managed code (20+ second execution times). When the stored procedure is run in Query Analyzer, however, it's very fast (one second or less). Stranger still, if I open the stored procedure and make a small modification to it, it will run fast through my managed code again. Eventually, however, it reverts to being slow. I've determined that SQL Server appears to choose a different plan for the sproc in Query Analyzer because SET ARITHABORT is on there by default, while in my managed code it's set to off.
What I'm trying to determine is why SQL Server gets confused about the stored procedure's execution plan, such that changing the procedure (even by adding a dummy return value) suddenly causes it to speed up again. What can I do to ensure that it maintains a fast execution plan? How does SQL Server choose an execution plan?
A From what you're describing, the procedure has most likely hit upon a poor execution plan and retained it going forward. When SQL Server has an execution plan for a query, it will retain that plan until it's dropped from the cache. The procedure can be dropped from the cache for many reasons, including:
- Aging
- Altering or dropping and re-creating the stored procedure
- Issuing an sp_recompile command against the stored procedure
When you change a single line of code in the stored procedure, you cause the procedure to recompile. The next time you execute it, it has a better execution plan than before because the new plan is based on the parameters you just provided.
If you were to go into Query Analyzer and choose "Include actual execution plan" when it was performing slowly, and then you opened another window, recompiled the procedure, and executed it again, you would most likely see two different execution plans. The new execution plan is optimal for what you're doing now, but the original execution plan was probably optimal for previous executions where the parameters were significantly different.
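If you would rather force a fresh plan than edit the procedure, there are a couple of options. A minimal sketch; the procedure name and parameter are hypothetical:

-- Discard the cached plan; the next execution compiles a new one
EXEC sp_recompile N'dbo.usp_GetOrders'
-- Or compile a one-off plan for a single call without touching the cache
EXEC dbo.usp_GetOrders @CustomerId = 42 WITH RECOMPILE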
To learn more about optimized execution plans, take a look at this SQL Server team blog entry.
Updating Your Database Servers
Q I have more than 20 database servers. Most update operations occur on an online transaction processing (OLTP) database, then propagate to other databases using transactional replication. I am thinking about replacing transactional replication with a distributed transaction and doing the update all at once. How will this affect performance?
A This would be a very bad idea from a performance perspective. You need to ask yourself whether you have a real, legitimate business need to have the updates performed in an all-or-nothing fashion on all of your databases at the same time. In other words, can you justify the use of a distributed transaction?
You should really consider using SQL Service Broker instead. If you need transactional, ordered delivery, it might be a much better approach.
© 2008 Microsoft Corporation and CMP Media, LLC. All rights reserved; reproduction in part or in whole without permission is prohibited.