Resolving Performance Issues
In an ideal world, you would account for SharePoint performance optimization in the planning and design stages with adequately sized and architected servers, support teams, and underlying infrastructure. But in the real world, you'll have trouble predicting user adoption rates. Your budget may be cut or staff downsized. You may inherit a poorly performing SharePoint environment. Even if your infrastructure at first meets performance expectations, growing numbers of documents, groups, lists, and sites may increase page load times and decrease satisfaction.
One of the biggest challenges you'll face in your efforts to optimize SharePoint performance will be navigating through the many configuration options that the underlying IIS, .NET, and SQL Server technologies provide during the planning and design stages, as well as in hands-on operation. The sheer number of options is daunting, not to mention trying to figure out which option is best suited to your needs. For example, SQL Server houses the vast majority of SharePoint configuration data and content, yet the search, content, configuration, and temp databases have very different read/write patterns that require appropriate disk throughput and RAM. To complicate the picture, you also can use caching in IIS or offload indexing to a front-end server to help increase disk throughput.
A second challenge revolves around determining the root causes of performance issues. SharePoint relies not only on the core SQL, IIS, and .NET components, but also on interdependencies such as Active Directory, the network, SharePoint architecture, and physical server hardware. This means a performance issue may have more than one root cause and may require multiple changes to resolve. Operational jobs, backup routines, and third-party tools add more possible root causes to performance issues.
In this column, I present an overview of key SharePoint architecture components, describe how they can lead to common performance issues, and discuss how to resolve and troubleshoot problems.
Before I go into the relationships between IIS and SQL Server design, configuration options, and impact on performance, let's establish the target of performance optimization. Put simply, it is improved user and administrator experience in terms of key indicators such as page load times, search, and crawling. If pages don't load fast for your users, then your effort to optimize performance by eliminating 10 round trips to the SQL Server databases doesn't matter.
When considering how quickly a page appears for a user, be sure to think about initial and subsequent load times. You might have instances in which users load a single page once but, generally, SharePoint use involves people accessing many sites and document libraries repeatedly. That's why focusing on opportunities that yield decreased load times for all page requests is so important. Keep in mind that because of browser caching, the first time a page loads the render time will be different than it is for subsequent page loads.
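As a rough illustration of separating initial from subsequent load times, the following Python sketch times one first load and then several repeat loads. The `load_page` callable is a hypothetical stand-in that you would supply yourself (for example, a function that drives a real browser so that browser caching actually applies). It is an assumption of this sketch, not part of any SharePoint tooling.

```python
import statistics
import time
from typing import Callable, Dict


def cold_and_warm_times(load_page: Callable[[], None], repeats: int = 5) -> Dict[str, float]:
    """Time one initial load, then several repeat loads of the same page.

    load_page is a hypothetical callable supplied by the caller; it should
    block until the page has finished loading.
    """
    if repeats < 1:
        raise ValueError("repeats must be at least 1")

    def timed() -> float:
        start = time.perf_counter()
        load_page()
        return time.perf_counter() - start

    first = timed()  # caches are expected to be cold on this call
    later = [timed() for _ in range(repeats)]  # caches should be warm now
    return {"first_s": first, "repeat_median_s": statistics.median(later)}
```

Reporting the median of the repeat loads, rather than a single repeat, keeps one slow outlier from distorting the comparison.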
In my May 2008 column, "Building Your SharePoint Infrastructure", I covered the SharePoint architecture and explained at a basic level how IIS, SQL Server, and .NET work together to render requested pages. Now let's consider how to configure the core technologies to meet your performance needs. Figure 1 shows the key components that relate to optimization options.
Figure 1 SharePoint Architecture Components Affecting Performance
In the basic request scenario, once the incoming request is authenticated, the ASP.NET page parser services it and renders the result to the browser. The underlying content includes data from the file system and SQL Server content databases, such as list items, binary large objects (BLOBs), graphics, and text. Even submitting content to a blog or news page with a few Web parts requires verifying permissions, having the page parser compile the ASP.NET page, and making multiple trips to SQL Server to read from and write to the temp database, transaction logs, and content databases.
Performance issues can happen at any point in this process. For example, if the page contains many small artifacts, such as images, and your environment uses Windows NT LAN Manager (NTLM) authentication with a remote domain controller (DC), then the full page will load slowly because the HTTP GET requests for those artifacts each require a round trip to the DC. In that case, the NTLM authentication architecture is the constraint, not IIS or SQL Server operations. Similarly, a user might request a page with thousands of list items or import many items from a spreadsheet to a list, affecting the load times for all other users accessing sites housed on the same SQL Server.
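To get a feel for how quickly authentication round trips add up, here is a back-of-the-envelope Python sketch. It is a deliberately simplified model: it assumes every request for a page artifact triggers its own serial round trip to the DC, whereas real browsers reuse connections and issue requests in parallel. The function name and the sample numbers are illustrative assumptions, not measurements.

```python
def auth_overhead_seconds(
    requests_per_page: int,
    dc_round_trip_ms: float,
    auth_round_trips_per_request: int = 1,
) -> float:
    """Estimate the time a page load spends waiting on a remote domain controller.

    Simplified model (an assumption of this sketch): each request pays the
    full DC round-trip latency, one request after another.
    """
    total_ms = requests_per_page * auth_round_trips_per_request * dc_round_trip_ms
    return total_ms / 1000.0


# Hypothetical page: 40 small artifacts, 80 ms round trip to the remote DC.
estimate = auth_overhead_seconds(requests_per_page=40, dc_round_trip_ms=80.0)
print(f"Estimated authentication overhead: {estimate:.1f} s")
```

Even with modest per-request latency, the overhead scales linearly with the number of artifacts on the page, which is why the authentication path can dominate load time while IIS and SQL Server sit idle.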
Operational tasks and background processes, such as resource-intensive nightly backups, also can influence performance. Nightly backups can cause problems for business users in global environments that operate 24 hours every day. Background tasks impact performance because they strain system resources. For example, scheduled timer jobs, database cleanup tasks and, especially, indexing and crawling processes use large amounts of disk I/O, CPU, and RAM on front- and back-end servers.
Resolving Performance Issues
Regardless of the framework and methodology you use in trying to understand your SharePoint performance issues, isolate their root causes, and resolve the problems, you need baseline data that reflects acceptable performance, against which you can compare data that reflects poor performance. You can obtain a baseline set of performance data right after your initial deployment, during periods of acceptable performance, or from best-practice recommendations. If you have no basis for comparison, then you can use Microsoft's published performance recommendations for various measurements, available at technet.microsoft.com/en-us/library/cc262787.aspx.
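The baseline comparison described here can be sketched in a few lines of Python. The metric names, the 25 percent tolerance, and the assumption that a higher value always means worse performance are all illustrative choices for this sketch. Substitute the counters and thresholds that matter in your environment.

```python
from typing import Dict, List, Tuple


def deviations_from_baseline(
    baseline: Dict[str, float],
    current: Dict[str, float],
    tolerance: float = 0.25,
) -> List[Tuple[str, float]]:
    """Return (metric, relative_change) pairs that worsened beyond the tolerance.

    Assumes higher values are worse for every metric (an assumption of this
    sketch). Metrics missing from `current`, or with a non-positive baseline,
    are skipped.
    """
    flagged = []
    for name, base in baseline.items():
        if name not in current or base <= 0:
            continue
        change = (current[name] - base) / base
        if change > tolerance:
            flagged.append((name, change))
    # Largest regression first, so the most likely starting point leads the list.
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Hypothetical measurements taken after deployment and during a slowdown.
baseline = {"page_load_s": 2.0, "sql_disk_queue": 1.0, "cpu_pct": 40.0}
current = {"page_load_s": 3.0, "sql_disk_queue": 1.1, "cpu_pct": 80.0}
for metric, change in deviations_from_baseline(baseline, current):
    print(f"{metric}: {change:+.0%} versus baseline")
```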
As I already mentioned, the most common performance issues happen as a result of interrelated components. This can be troublesome because the underlying indicators are often the same. For example, one common indicator of a performance issue is a spike in resource utilization, such as RAM, CPU, or disk I/O. To resolve the performance issue, you need to look at all available data, understand the sequence of events leading to the issue, and correlate events to determine the underlying root cause. Figure 2 shows common performance issues, possible root causes, and resolution options.
Figure 2 Issue, Cause, and Resolution Summary

| Issue | Possible Root Causes | Possible Resolutions | Additional Resources |
|---|---|---|---|
| High disk I/O activity on SQL Server | Large list operations, timer jobs, SQL maintenance tasks, backup, indexing, inadequate RAM; high-I/O databases (such as temp, transaction log, search, and content) placed on the same disk or on slow disks. | Separate temp and search databases into multiple files across high-I/O disk volumes; increase RAM; use dedicated disks for transaction logs; defragment and re-index databases weekly. | technet.microsoft.com/en-us/library/cc678870.aspx |
| SQL blocking/locking | NIC configuration, large list operations, indexing/crawling jobs. | Do not use SharePoint Team Services Administration (STSADM) for backups; use SQL backup, DPM, LiteSpeed, or SQL 2008 with compression. Ensure fill factor is set to 70% on content databases. Enforce 100GB growth limit. | technet.microsoft.com/en-us/library/cc901593.aspx |
| Overall slow page loads | Compression not enabled. Caching not enabled or not configured. Large pages. Redundant SQL trips. Underlying network issues. | Enable caching and compression; check page load times; examine SQL queries and round trips; check NIC for Broadcom 5708 Chimney issues. | technet.microsoft.com/en-us/library/cc298550.aspx |
| Long time to load full page | Improper SharePoint object handling in custom code, slow links, SQL blocking, timer jobs, Web part caching not enabled. | Resolve back-end bandwidth and response issues; dispose of objects properly; use 64-bit hardware or configure memory pool limits; delay downloading core.js. | code.msdn.microsoft.com/SPDisposeCheck |
| Poor list performance | Large lists (more than 2,000–3,000 items in a level). No indexing on lists. Underlying SQL Server issues. Too many columns. | Index on one or more columns; ensure SQL Server performance; keep fewer than 2,000–3,000 items in a level. | go.microsoft.com/fwlink/?LinkID=105580&clcid=0x409 |
| Long crawl and index times, or indexing causing sluggishness | Large data volumes require long index times; no dedicated index target. | Block with robots.txt; offload crawling/indexing to a dedicated front-end server. | technet.microsoft.com/en-us/library/cc261810.aspx |
| LDAP operations (such as authentication and user operations) causing usage spikes | Low bandwidth, remote domain controller, large profile imports. | Increase bandwidth; use Kerberos; optimize profile importing. | support.microsoft.com/kb/827754 |
| Backup taking too long | Using STSADM; other SQL conditions such as blocking. | Use Microsoft Data Protection Manager (DPM) or SQL 2008 with compression. | technet.microsoft.com/en-us/library/cc901593.aspx |
| IIS out of memory conditions | Application pool and worker process recycling, improper object handling, not enough RAM, poor load balancing architecture. | Use IIS overlapped recycling; use 64-bit hardware. | msdn.microsoft.com/en-us/library/aa720391(VS.71).aspx |
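For the "Overall slow page loads" row, a quick first check is whether responses carry compression and caching headers at all. The Python sketch below inspects a dictionary of HTTP response headers that you have already captured (for example, with Fiddler). The function name and the wording of the findings are assumptions of this sketch, and the rules are intentionally coarse. Note also that a server normally compresses only when the request advertised support through `Accept-Encoding`.

```python
from typing import List, Mapping


def audit_response_headers(headers: Mapping[str, str]) -> List[str]:
    """Return coarse findings about compression and caching for one response."""
    # HTTP header names are case-insensitive, so normalize before checking.
    normalized = {name.lower(): value.lower() for name, value in headers.items()}
    findings = []

    encoding = normalized.get("content-encoding", "")
    if not any(token in encoding for token in ("gzip", "deflate")):
        findings.append("no compression (Content-Encoding is not gzip or deflate)")

    cache_control = normalized.get("cache-control", "")
    has_validator = "etag" in normalized or "last-modified" in normalized
    if "no-store" in cache_control or "no-cache" in cache_control:
        findings.append("caching disabled by Cache-Control")
    elif "max-age" not in cache_control and "expires" not in normalized and not has_validator:
        findings.append("no caching headers (Cache-Control, Expires, ETag, Last-Modified)")

    return findings


# Hypothetical captured response for a static image.
print(audit_response_headers({"Content-Type": "image/png"}))
```

An empty result does not prove that caching is tuned well. It only shows that the response is not missing the basic headers.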
As you narrow down possible causes for performance issues, keep in mind general operations best practices, such as the ones documented in the IT Showcase white paper, "SharePoint Performance Optimization." Applying the latest patches, service packs, and updates for SQL Server, IIS, SharePoint, and Windows Server is especially important. Microsoft has fixed many previous performance issues, such as tempdb allocation contention (see "Concurrency enhancements for the tempdb database") and TokenAndPermUserStore cache growth (see "Queries take a longer time to finish running when the size of the TokenAndPermUserStore cache grows in SQL Server 2005").
You can rely on a diverse range of tools for digging down to the specifics of a SharePoint performance issue and gathering evidence that would help you make a diagnosis, determine the root causes, and resolve the problem.
The following tools are especially helpful in pinpointing causes of performance issues:
- Fiddler PowerToy and neXpert add-on: Used together, these tools provide a solid starting point for page load analysis. They allow you to review caching, compression, and overall HTTP performance. You can get more information about these tools at "Fiddler PowerToy - Part 2: HTTP Performance" and "Microsoft neXpert Performance Analysis Plugin."
- Wireshark: When you need to look into network issues, use Wireshark. It works with many media, and you can capture packets and reconstruct TCP/IP conversations while you reproduce an issue. For more information, see wireshark.org.
- Visual Round Trip Analyzer (VRTA): You can use VRTA to examine the round-trip performance from request to response. VRTA examines the communications protocol, identifies the causes of excessive round trips, and recommends solutions. You can download it from Microsoft downloads, "Visual Round Trip Analyzer."
- SQL Profiler: You can use this tool, installed with SQL Server, for monitoring an instance of SQL Server Database Engine or SQL Server Analysis Services. It enables you to discover issues with queries, deadlocks, timeouts, recompilations, and general errors and exceptions.
- SQLDiag: This tool, also installed with SQL Server, collects valuable information about the configuration of the computer running SQL Server, the operating system, and the information that is reported to the SQL Server error logs.
- SQL Query Analyzer: This is a low-level debug tool for analyzing query performance issues. It also is part of the SQL Server toolset.
- SPTraceView: This is one of my favorite tools because it provides a view of performance issues in real time. It's useful for monitoring diagnostic tracing when working with custom Web parts. For more information, see "SPTraceView – Lightweight Tool for Monitoring the SharePoint Diagnostic Logging in Real-Time."
- WSSDW.exe: This is a performance-testing tool that populates data for testing deployments of Office SharePoint Server 2007. See "Tools for performance and capacity planning (Office SharePoint Server)" and "SharePoint 2007 Test Data Population Tool" for more information.
- Custom tool for client-based URL ping: This is one of the most useful tools because it enables the comparison of statistics before and after implementing configuration changes to the environment. See the appendix on the SharePoint Performance Optimization page.
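The before-and-after comparison that a URL ping tool makes possible can be sketched as follows. This is not the tool from the white paper appendix. It is a minimal Python illustration that assumes you have already collected page load timings (in seconds) for the same URL before and after a configuration change. The function names and the sample timings are hypothetical.

```python
import math
import statistics
from typing import Dict, Sequence


def percentile(samples: Sequence[float], fraction: float) -> float:
    """Nearest-rank percentile, for example fraction=0.95 for the 95th percentile."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(fraction * len(ordered)))
    return ordered[rank - 1]


def compare_runs(before: Sequence[float], after: Sequence[float]) -> Dict[str, float]:
    """Summarize median and 95th-percentile load times and their percent change."""
    if not before or not after:
        raise ValueError("both runs need at least one timing sample")
    summary = {}
    for label, stat in (("median", statistics.median), ("p95", lambda s: percentile(s, 0.95))):
        b, a = stat(before), stat(after)
        summary[f"{label}_before_s"] = b
        summary[f"{label}_after_s"] = a
        summary[f"{label}_change_pct"] = (a - b) / b * 100.0  # negative means faster
    return summary


# Hypothetical timings for one URL, before and after enabling compression.
before = [2.0, 2.2, 2.4, 2.6, 4.0]
after = [1.0, 1.1, 1.2, 1.3, 2.0]
for key, value in compare_runs(before, after).items():
    print(f"{key}: {value:.2f}")
```

Looking at the 95th percentile alongside the median helps confirm that a change improved the slowest requests and not only the typical ones.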
SharePoint performance tuning, like most things SharePoint, is complex. You need to understand the object model, details about the SharePoint architecture, and the interactions between IIS/.NET and SQL Server. You also need to know database administration and troubleshooting best practices. The good news is that if you lack this understanding, you can still do well at optimizing your SharePoint infrastructure by following established best practices, recommendations, and knowledge, and addressing the common issues pointed out here.
Pav Cherny is an IT expert and author specializing in Microsoft technologies for collaboration and unified communication. His publications include white papers, product manuals, and books with a focus on IT operations and system administration. Pav is president of Biblioso Corporation, a company that specializes in managed documentation and localization services.