

Chapter 3: Monitoring SharePoint 2010 (Real World SharePoint 2010)

Summary: Learn how to monitor your Microsoft SharePoint 2010 installation by using tools that help you prevent, diagnose, and assess problematic issues.

Applies to: Business Connectivity Services | SharePoint Foundation 2010 | SharePoint Server 2010 | Visual Studio

This article is an excerpt from Real World SharePoint 2010: Indispensable Experiences from 22 MVPs, edited by Scot Hillier, from Wrox Press (ISBN 978-0-470-59713-2, copyright © 2010 by Wrox, all rights reserved). No part of these chapters may be reproduced, stored in a retrieval system, or transmitted in any form or by any means—electronic, electrostatic, mechanical, photocopying, recording, or otherwise—without the prior written permission of the publisher, except in the case of brief quotations embodied in critical articles or reviews.

Contents

  • Introduction

  • Unified Logging System (ULS)

  • Trace Logs

  • Windows Event Logs

  • Logging Database

  • Health Analyzer

  • Timer Jobs

  • Summary

  • Additional Resources

  • About the Author

 

Introduction

Getting Microsoft SharePoint 2010 up and running is only half the battle. Keeping it up and running is another thing entirely. Once you have SharePoint 2010 installed and configured, and you have end users telling you how great it is (and you are), it's easy to get comfortable and just admire your handiwork. Don't be lulled into a false sense of security. Forces are working against you and your poor, innocent SharePoint farm.

It's your job to keep these problems at bay and keep SharePoint spinning like a top. Using the tools that you read about in this chapter, you can see what SharePoint is doing behind the scenes, and see ways to predict trouble before it happens. After you're finished with this chapter, you'll almost look forward to experiencing problems with SharePoint so that you can put these tools to good use and get to the bottom of the issues.

Unified Logging System (ULS)

The Unified Logging Service (ULS) is the service that is responsible for keeping an eye on SharePoint and reporting what it finds. It can report events to three different locations:

  • SharePoint trace logs

  • Windows Event Log

  • SharePoint logging database

Where the event is logged (and if it's logged at all) depends on the type of event, as well as how SharePoint is configured. The ULS is a passive service, which means that it only watches SharePoint and reports on it; it never acts on what it sees.

Let's take a look at each of the three locations and see how they differ.

Trace Logs

The trace logs are the logs you think of first when discussing SharePoint logs. They are plain old tab-delimited text files that you can open with any tool that can read text files. You learn about methods for consuming them later in this chapter.

By default, trace logs are located in the LOGS directory of the SharePoint root (also called the 14 Hive) at C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\14.

Figure 1 shows how the directory looks. Later in this chapter, you learn how to move these files to a better location.

Figure 1. SharePoint trace logs


 

The naming format for the trace log files is machinename-YYYYMMDD-HHMM.log, in 24-hour time. By default, a new log file is created every 30 minutes. You can change the interval by using Windows PowerShell with the Set-SPDiagnosticConfig command. The following code snippet configures SharePoint to create a new trace log every 60 minutes.

Set-SPDiagnosticConfig -LogCutInterval 60

Note

For more information about using PowerShell with SharePoint 2010, read Chapter 5.

Trace logs existed in previous versions of SharePoint, but they have undergone some improvements in SharePoint 2010. For starters, they take up less space, but still provide better information. It's a classic "eat all you want and still lose weight" situation.

The trace logs are smaller than their SharePoint 2007 counterparts for a couple of reasons. First, a lot of thought has gone into what gets written to the trace logs by default. Anyone who has perused a SharePoint 2007 ULS log or two has seen a lot of repetitive and mostly unhelpful messages. These messages add to the bloat of the file, but do not provide much in return. In SharePoint 2010, many of these messages have been removed from the default settings, which makes the log files smaller.

Also, SharePoint now leverages Windows NT File System (NTFS) file compression for the LOGS folder, which also decreases the amount of space the logs occupy on disk.

Figure 2 shows the compression on a trace log file.

Figure 2. Trace log compression


 

The log file shown in Figure 2 is 9.62 MB, but it is only taking 3.30 MB on disk, thanks to NTFS compression. This allows you to keep more logs, or logs with more information, without as much impact on the drive space of your SharePoint servers.

Finally, in SharePoint 2010 you have much better control over which events are written to the trace logs, and better control over getting things put back after you have customized them. In SharePoint 2007, you had some control over which events were written to the trace logs, but there were two significant drawbacks:

  • You didn't know which events in the interface to monitor more heavily.

  • Once you cranked up one area, there was no way to set it back to its original setting after you had successfully solved a problem.

With SharePoint 2010, there's now good news, because both of those issues have been addressed. You have a very robust event throttling section in Central Administration that enables you to customize your logs to whatever area your issue is in, and then dial it back easily once the problem is solved.

In Central Administration, click Monitoring on the left, and then select Configure Diagnostic Logging under the Reporting section to see a window similar to Figure 3.

Figure 3. Event Throttling


 

Two things should jump out at you. The first is the sheer number of options you have to choose from. The Enterprise SKU of SharePoint Server has 20 different categories, each with subcategories. This means that if you are troubleshooting an error that only has to do with accessing External Data with Excel Services, you can crank up the reporting only on that without adding a lot of other unhelpful events to the logs. The checkbox interface also means that you can change the settings of multiple categories or subcategories at one time. So, the interface makes it easy to change a single area, or a large area.

The second thing that should jump out at you in Figure 3 is that one of those options, Secure Store Service, is bolded. In SharePoint 2007, after you had customized an event's logging level, there was no way to go back and see which levels you had changed. And, if, by some strange twist of fate, you were able to remember which events you had changed, there was no way to know what level to change them back to.

In most cases, one of two things happened. You either left the events alone (in their chatty state), or you found another SharePoint 2007 installation and went through the settings, one by one, to compare them. Neither solution was great.

Fortunately, SharePoint 2010 addresses both of those issues swimmingly. As you can see in Figure 3, the Secure Store Service is bolded in the list of categories. That's SharePoint 2010's way of saying, "Hey! Look over here!" Any category that is not at its default logging level will appear in bold, making it very easy to discover which ones you need to change back. That's only half the battle though.

How do you know what to set it back to? SharePoint 2010 covers that, too. As shown in Figure 4, the Least critical events to report to the trace log drop-down list has a shining star at the top, Reset to default. This little number sets whichever categories are selected back to their default logging settings. This means that you can crank up the logging as much as you want, knowing that you can easily put it back once you are finished. Microsoft has even provided an All Categories checkbox at the top of the category list (see Figure 3) to make it even easier to fix in one fell swoop.

Figure 4. Reset to default setting


 

Setting the event levels that are logged to the trace logs is just one of the settings that you can customize. Probably the most important change you can make to your trace logs is their location. As previously mentioned, the default location for these files is C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\14\LOGS. That location is fine, because the logs get to hang out with the rest of their SharePoint 2010 friends. But space on the C:\ drive is valuable, and if you can safely move something to another drive, you should.

Fortunately, it is very easy to move these files to a new location, where they cannot do any harm to your C:\ drive. To do this, open Central Administration, click Monitoring, and then select Configure Diagnostic Logging. Figure 5 shows the bottom of this page, where you can enter a new location to store your trace logs.

Figure 5. Moving trace logs


 

Change the default location to a location on another drive. An excellent choice is something like D:\Logs\SharePoint or E:\Logs\SharePoint, depending on where your server's hard drives are located.

Warning

Keep in mind that this is a farm-level setting, not server level, so every server in your farm must have that location available. If you try to set a location that is not available on all of the servers in the farm, you'll get an error message. You'll also need to keep this in mind when adding new servers to your farm. Your new server must have this location as well.

Figure 5 shows a couple of other settings you can use to control the footprint your trace logs have on your drives.

The first option allows you to configure the maximum number of days the trace logs are kept. The default is 14 days, which is a good middle-of-the-road setting. Resist any temptation to shorten this time period unless your servers are really starving for drive space. If you ever need to troubleshoot a problem, the more historical data you have, the better off you are. The only downside to keeping lots of information is the amount of time and effort it takes to go through it. You learn more about this later in this chapter.

You can also assign a finite size to the amount of drive space your trace logs can consume, whether or not they have reached their 14-day expiration. The default value is 1 TB, so be sure to change that value if you want to restrict the size in a more meaningful way.
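If you prefer to script these retention settings, the same values can be set with the Set-SPDiagnosticConfig cmdlet covered in the next section. The following is a minimal sketch, assuming the -DaysToKeepLogs, -LogMaxDiskSpaceUsageEnabled, and -LogDiskSpaceUsageGB parameters; the values shown are examples only.

# Keep 14 days of trace logs and cap their total disk usage at 50 GB (example values)
Set-SPDiagnosticConfig -DaysToKeepLogs 14 -LogMaxDiskSpaceUsageEnabled:$true -LogDiskSpaceUsageGB 50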

Configuring Log Settings with PowerShell

Every SharePoint 2010 administrator should get cozy with Windows PowerShell. It is the best way to do repetitive tasks, and manipulating the trace logs is no exception. So far in this chapter, you have learned about using Central Administration to interact with trace logs. In this section, you learn how to use PowerShell to make those same configuration changes.

SPDiagnosticConfig

The first tool in the PowerShell arsenal is the Get-SPDiagnosticConfig cmdlet and its twin brother, Set-SPDiagnosticConfig. The former is used to retrieve the diagnostic settings in your farm; the latter is used to change them. Figure 6 shows the output of Get-SPDiagnosticConfig, and it reflects changing the log cut interval to 60 minutes, as you did previously.

Figure 6. Get-SPDiagnosticConfig output


 

Seeing the settings that Get-SPDiagnosticConfig displays gives an idea of what values its brother Set-SPDiagnosticConfig can set. Using PowerShell's built-in Get-Help cmdlet is also a good way to get ideas on how best to leverage it, especially with the -Examples switch, similar to the following.

Get-Help Set-SPDiagnosticConfig -Examples

This shows a couple of different methods of using Set-SPDiagnosticConfig to change the diagnostic values of your farm. The first method uses the command directly to alter values. The second method assigns a variable to an object that contains each property and its value. You can alter the value of one or more properties, then write the variable back with Set-SPDiagnosticConfig. Either way works fine; it's a matter of personal opinion as to which way you go.
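As a rough sketch of what those two approaches look like (LogCutInterval is simply the example used earlier; if -InputObject is not available in your build, check Get-Help Set-SPDiagnosticConfig -Examples for the exact form):

# Method 1: change a value directly with a parameter
Set-SPDiagnosticConfig -LogCutInterval 60

# Method 2: read the settings object, change one or more properties, and write it back
$config = Get-SPDiagnosticConfig
$config.LogCutInterval = 60
Set-SPDiagnosticConfig -InputObject $config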

Earlier, you learned that it's a good idea to move the location of the ULS trace logs. By default, they are located in the Logs directory of the SharePoint root. While they are fine there, space on the C:\ drive is almost holy ground. If the C:\ drive gets full, then Windows gets unhappy, and everyone (including IIS and SharePoint) is unhappy. To help prevent that, you can move your trace logs to another location, freeing up that precious space. The following PowerShell command moves the log location to E:\logs.

Set-SPDiagnosticConfig -LogLocation e:\Logs

It is important to note that this only changes the location that new log files are written to. It will not move existing log files. You must move them yourself.
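If you want the older files to live alongside the new ones, you can copy or move them yourself with standard PowerShell; the paths below are examples only.

# Move the existing trace logs from the default location to the new one (example paths)
Move-Item "C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\14\LOGS\*.log" "E:\Logs"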

SPLogLevel

You learned previously that you have some flexibility in configuring how different aspects of SharePoint log events. You saw how to look at these settings and change them in Central Administration. You can also use PowerShell to get and set that same information with the SPLogLevel cmdlets.

You can get a list of the cmdlets that deal with the log levels by running the command Get-Command -Noun SPLogLevel in a SharePoint Management Shell. The results should look similar to Figure 7.

Figure 7. SPLogLevel cmdlets


 

Let's start an examination of the available options by taking a look at Get-SPLogLevel, which reports the current logging level in the farm. With no parameters, it lists every category and subcategory, and reports their trace log and event log levels. Using Get-Member, you can see that Get-SPLogLevel also reports information that is not available in Central Administration, such as the default trace and event log levels.

The SPLogLevel objects have a property named Area that corresponds to the top-level categories in Central Administration. Running the PowerShell command Get-SPLogLevel | select -Unique area displays those categories. To get all of the settings from a particular area takes a little work.

The -Identity parameter of Get-SPLogLevel corresponds to the second column (or Name column) of the log levels, which maps to the subcategories in Central Administration. This means that you cannot use Access Services for the Identity parameter, but you could use Administration, which is the first subcategory under Access Services. To get all of the logging levels for Access Services, use a command similar to the following.

Get-SPLogLevel | Where-Object {$_.area.tostring() -eq "Access Services"}

This uses the Area property of the log level, converts it to a string, and then displays the log level objects that match Access Services, as shown in Figure 8.

Figure 8. Access Services event categories


 

Now that you've mastered Get-SPLogLevel, let's look at Set-SPLogLevel, its complementary cmdlet. You can use this one to set a specific log level to the trace or event logs for a category or group of categories.

Suppose that you are having trouble with the Office Web Applications and you want as much debugging information as you can get. Of course, you could go into Central Administration and check the box next to Office Web Apps, but that's no fun. Let's use PowerShell instead.

The following command uses PowerShell to get all of the SPLogLevel objects that are in the Office Web Applications category, then pipes them through Set-SPLogLevel to set their trace logging level to verbose.

Get-SPLogLevel | Where-Object {$_.area.tostring() -eq "Office Web Apps"} | Set-SPLogLevel -TraceSeverity verbose

In one fell swoop, you have set all of the logging levels you need. Now you can reproduce the error, then go through your trace logs, and discover what the problem is. Once you have conquered that Office Web Applications problem, you must return the logging levels back to normal. That's where the third SPLogLevel cmdlet, Clear-SPLogLevel, comes into play.

Much like in Central Administration, there is an easy way to reset all of your logging levels back to the default. The Clear-SPLogLevel cmdlet clears out any changes you have made, and sets the logging levels to their default values for both trace and event logging. If you run it with no parameters, it resets all of the logging levels to their defaults. Like Get-SPLogLevel and Set-SPLogLevel, you can also pass it optional parameters to reset specific categories or areas.
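A quick sketch of both forms follows. The pipeline version assumes Clear-SPLogLevel accepts the same piped identities that Set-SPLogLevel does, as described above.

# Reset every category to its default trace and event log levels
Clear-SPLogLevel

# Reset only the categories changed earlier (assumes the same pipeline pattern as Set-SPLogLevel)
Get-SPLogLevel | Where-Object {$_.Area.ToString() -eq "Office Web Apps"} | Clear-SPLogLevel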

Using Logs to Troubleshoot

Having lots and lots of beautiful log files does you no good unless you can crack them open and use them to hunt down problems. In this section, you learn about some things to look for in the trace logs, and a variety of ways to look at them.

Introducing the Correlation ID

The first time that anyone opens up a SharePoint trace log, he or she feels the same sense of helplessness and being overwhelmed. It's like being dropped into an unfamiliar city in the middle of rush hour; so many things are going on at once all around you, and none of it looks familiar.

Fortunately, SharePoint has provided a bit of a road map for you, the correlation ID. The correlation ID is a globally unique identifier (GUID) that is assigned to each conversation a user or process has with SharePoint. When an error occurs, an administrator can use the correlation ID to track down all of the events in the trace logs that pertain to the conversation where the problem occurred.

This is very helpful in those very, very rare occasions when end users contact the help desk because something is broken and, when asked what they were doing, they reply, "Nothing." Now, with correlation IDs, you can track those conversations with SharePoint and see exactly what was happening when the error occurred. Figure 9 shows a window that an end user might get with the correlation ID in it.

Figure 9. Correlation ID in action


 

In this example, you know that the user was trying to view a Word document with the Office Web Applications and it failed. Once you get the correlation ID, b7162a24-1fa2-4567-80a5-74feda9a768b, and the date and time of the error, 20100801171552, you can figure out why the document would not open.

Because each entry in the trace logs has a correlation ID, you can just open up the trace log in Notepad and look for the lines that reference this conversation. Figure 10 shows what you would find in this example.

Figure 10. Determine the problem by using the correlation ID


 

By following the correlation ID through the trace log, you might stumble across a pretty telling error: "There are no instances of the Word Viewing Service started on any server in this farm. Ensure that at least one instance is started on an application server in the farm using the Services on Server page in Central Administration."

That does sound like a problem. Sure enough, by checking in Central Administration, you see that no servers in the farm are running the Word Viewing service instance. By following that correlation ID through the logs, you can learn all kinds of fun stuff about how SharePoint works. For example, SharePoint looks to see if there is a cached copy of that document before it tries to render it.

As you have seen, the correlation ID is exposed when an error page is displayed, and throughout the trace logs. It is also referenced in events in the Windows Event Log, when it is appropriate. You can also use a correlation ID when doing a SQL trace on SharePoint's SQL traffic. The correlation ID is considered by many administrators to be one of the best new features in SharePoint 2010.

The Developer Dashboard

You aren't always handed the correlation ID all tied up with a bow as you were in Figure 9. Sometimes the page renders, but there are problems with individual web parts. In those cases, you can use the Developer Dashboard to get the correlation ID and track the problem down.

Despite what the name suggests, this dashboard is not just for developers. The Developer Dashboard shows how long it took for a page to load, and which components loaded with it, as shown in Figure 11.

Figure 11. Developer Dashboard


 

This dashboard is loaded at the bottom of your requested web page. As you can see, the dashboard is chock full of information about the page load. You can see how long the page took to load (790.11 ms), as well as who requested it, its correlation ID, and so on. This can be used when the help desk gets those ever popular "SharePoint is slow" calls from users. Now you can quantify exactly what "slow" means, as well as see what led up to the page load being slow.

If web parts were poorly designed and did a lot of database queries, you'd see it here. If they fetched large amounts of SharePoint content, you'd see it here. If you're really curious, you can click the link on the bottom left, Show or hide additional tracing information, to get several pages worth of information about every step that was taken to render that page.

Now that you're sold on the Developer Dashboard, how do you actually use it? As previously mentioned, it is exposed as a dashboard at the bottom of the page when it renders. The user browsing the page must have the AddAndCustomizePages permission (site collection admins and users in the Owner group, by default) to see the Developer Dashboard, and it must be enabled in your farm.

By default, it is shut off, which is one of the three possible states. It can also be on, which means the dashboard is displayed on every page load. Not only is that tedious when you're using SharePoint, but it also has a performance penalty.

Figure 12. Enabling the on-demand Developer Dashboard


 

The third option, on demand, is a more reasonable approach. In on demand mode, the Developer Dashboard is not on, but it's warming up in the on-deck circle, waiting for you to put it into action. When the need arises, a user with the correct permission can turn it on by clicking the icon shown in Figure 12. When you are finished with it, you can put it back on the bench by clicking the same icon.

How do you go about enabling the Developer Dashboard to make this possible? You can use Windows PowerShell, as shown in the following example.

$dash = [Microsoft.SharePoint.Administration.SPWebService]::ContentService.DeveloperDashboardSettings;
$dash.DisplayLevel = 'OnDemand';
# $dash.DisplayLevel = 'Off';
# $dash.DisplayLevel = 'On';
$dash.TraceEnabled = $true;
$dash.Update()

The DisplayLevel can be one of three values: Off, On, or OnDemand. The default value is Off.
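If you want to confirm what the farm is currently set to, you can read the same property back, for example:

# Check the current farm-wide Developer Dashboard setting
$settings = [Microsoft.SharePoint.Administration.SPWebService]::ContentService.DeveloperDashboardSettings
$settings.DisplayLevel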

Notice that at no point do you specify a URL when you're setting this. It is a farm-wide setting. Never fear, though; only users with the AddAndCustomizePages permission will see it, so hopefully it won't scare too many users if you must enable it for troubleshooting. If you have enabled My Sites, each user is the site collection owner of their own site collection, so they will see it.

Methods for Consuming the Trace Logs

So far, you've learned what the trace logs are, and a little bit about how to configure them and their contents. In this section, you learn about some ways to mine them and get out the information that you need.

The Basics

These wonderful trace files, which will help you do all this troubleshooting, are just text files. They might be very confusing, information-packed text files, but they are text files nonetheless. The trace logs have the following columns:

Table 1. Trace log columns

  • Timestamp: The date and time of the event, in local time

  • Process: The name of the process that generated the log entry

  • TID: The thread ID

  • Area: The area the event is from

  • Category: The category of the area that the event is from

  • EventID: An undocumented event ID

  • Level: The severity level of the event

  • Message: The text of the message

  • Correlation: The correlation ID

As you saw in Figure 10, these trace logs can be opened with any program that can open text files, including humble old Notepad.exe. Although Notepad is good (because it exists on every SharePoint server), it can be very unwieldy to work with. If you have a correlation ID, you can just use Ctrl+F to run a search, and then jump right to its location in the log. You will have better luck if you only search on the first group of characters in the GUID (b7162a24 in the previous example), instead of the whole GUID. Not only is it easier to manage, but some screens that display the correlation ID place a space at the end, or do not have hyphens in them — both of which can prevent you from finding the correlation ID in your log files.

Although you can load the logs into Notepad, there are better ways to handle them. Let's look at a couple.

Using Excel 2010

Not only are those beloved trace logs text files, they are tab-delimited text files. This means that Excel 2010 can import them easily and put each column of information into its own column. Once trace logs are in an Excel 2010 spreadsheet, you can use Excel's sort and filtering to locate the events of interest. You can also resize the columns or hide them completely for readability. You can even paste several log files into one spreadsheet to look for trends of errors.

Whereas Notepad gets sluggish with large files, Excel handles them with ease.

Using MSDN ULS Viewer

Though it's frustrating that SharePoint does not come with a log viewer, Microsoft has redeemed itself a bit. It released a free, dedicated (though unsupported) ULS Viewer. Because this utility was built from the ground up to read SharePoint's ULS, it does it quite well.

It allows real-time monitoring of the ULS logs, and will do smart highlighting (where it highlights all events that have the same value of the field you are hovering over). For example, if you hover over the category Taxonomy, it will automatically highlight all categories that match.

It also offers extensive filtering that includes filtering out event levels like Verbose or Medium. You can also filter by any value in any column. Right-clicking any correlation ID allows you to set a highlight for any matching row, or simply only show the rows that match. Figure 13 shows how to filter the logs based on a single correlation ID.

Figure 13. MSDN ULS Viewer


 

The interface has a lot of options, and is laid out very well. Because it's a free tool, it's worth every penny. If you're not comfortable installing it on your production servers, you can install it on your workstation and copy the ULS files over when trouble occurs.

Using SPLogEvent and SPLogFile

The last method to discuss is using the PowerShell cmdlets that deal with consuming the trace logs. The first, Get-SPLogEvent, retrieves log events from the trace logs. As with the other cmdlets you have learned about, using Get-Help with the -Examples parameter provides a good foundation to learn the different ways you can use this cmdlet. Let's take a look at a few examples.

If you just run Get-SPLogEvent with no parameters, it will spit back every record from every trace log it can find on your local machine. Hopefully, you are sitting in a comfortable chair if you do that, because it's going to take a while. Fortunately, you have many ways to limit the number of results you get, making it easier for you to separate the wheat from the chaff.

First, you can use the PowerShell cmdlet Select to pare down the results. The following examples demonstrate getting the first and last events in the logs.

Get-SPLogEvent | Select -First 5

Get-SPLogEvent | Select -Last 5

Get-SPLogEvent | Select -First 20 -Last 10

Depending on how many trace logs you have, it could take a while for the last results to show up. It's still walking through the whole list of events; it's just not displaying them all.

A better way is to use Get-SPLogEvent's -StartTime parameter to limit the events it reads. The following command returns the last ten events in the last five minutes.

Get-SPLogEvent -StartTime (get-date).addminutes(-5) | Select -Last 10

This will return results much more quickly, and likely will give you better results. You can also specify an end time, if you want to narrow down your search. You can also specify which levels to return. The following line returns all of the high-level events in the last minute.

Get-SPLogEvent -MinimumLevel "high" -StartTime (get-date).addminutes(-1)

In most cases, when you use Get-SPLogEvent, it is to get all of the events for a particular correlation ID. This is as easy as piping Get-SPLogEvent through a Where-Object clause and filtering for a specific correlation ID. The following command returns all of the events in the last ten minutes with a blank correlation ID.

Get-SPLogEvent -StartTime (Get-Date).AddMinutes(-10) |
Where-Object {$_.Correlation -eq "00000000-0000-0000-0000-000000000000"}

If you want a real correlation ID to work with, you can get one quickly with the following command.

Get-SPLogEvent -StartTime (Get-Date).addminutes(-1) | select correlation -First 1

You might have to run it a couple times to get a correlation ID that is not all zeros.

Figure 14 shows how this looks with its output. Once you have it, you can paste it into the previous statement to get all of the events that pertain to that correlation ID.

Figure 14. Getting a random correlation ID


 
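Putting those two steps together, the following sketch grabs a recent correlation ID and then pulls every event tied to it; the time windows are arbitrary examples.

# Grab one recent correlation ID (re-run if you get the all-zeros GUID),
# then retrieve all of the events for that conversation
$corr = (Get-SPLogEvent -StartTime (Get-Date).AddMinutes(-1) | Select-Object -First 1).Correlation
Get-SPLogEvent -StartTime (Get-Date).AddMinutes(-10) | Where-Object {$_.Correlation -eq $corr}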

You have some other cmdlets at your disposal for pruning through those trace logs. A good one to use when troubleshooting is New-SPLogFile.

This tells SharePoint to close out the current log file and create a new one. You saw earlier that, by default, SharePoint rolls its logs over every 30 minutes. If you've ever loaded up those logs and tried to look for a specific event, you know it can be quite daunting.

With New-SPLogFile, you can run it before and after an event you are troubleshooting. For example, if the User Profile service instance won't start on a particular server, you could use New-SPLogFile to create a new log file right before reproducing the problem. Then, after you've tried to start the service, you can create another new log file. This will isolate into one file all the events created during your attempt, making it easier to follow.
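In practice, that looks something like the following sketch; the reproduction step in the middle is whatever problem you are chasing.

New-SPLogFile    # start a fresh trace log just before you reproduce the problem
# ... reproduce the problem here, for example, try to start the service instance ...
New-SPLogFile    # roll the log again so the whole attempt is isolated in a single file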

If you have multiple servers in your farm, browsing through trace logs can be daunting, because you must constantly collect them from all of your servers. If only there was a way to merge the logs from all of your servers into one file…

Well, there is! SharePoint 2010 comes with a cmdlet, Merge-SPLogFile, that does exactly that. Merge-SPLogFile merges the trace logs from all of the SharePoint servers in your farm into one, easy-to-consume (or, at least, easier-to-consume) log file. All the tools that you used previously to work with trace files work with the merged log file as well.

By default, Merge-SPLogFile only merges events from the last hour from each machine. Using the same -StartTime and -EndTime parameters that you can use with Get-SPLogEvent, you can customize that window. If the error you are chasing happened in the last ten minutes, you can make it shorter. If you want to archive all the events from your servers from the last three hours, you can make it longer. Figure 15 shows Merge-SPLogFile in action.

Figure 15. Merging log files


 

You can see from Figure 15 that all Merge-SPLogFile needs to run is a path to write the newly created log file to. When you run it, it creates a timer job on all of the machines in the farm. This timer job collects the logs requested, and then copies them over to the server where Merge-SPLogFile is running. That's why you are warned that it may take a long time.

Although Merge-SPLogFile is happy to run with no parameters, you do have the option of trimming down the results, should you choose to. Get-Help Merge-SPLogFile provides a list of parameters that you can use, including (but not limited to) Area, EventID, Level, Message, and Correlation. Figure 16 shows how you can use the last one, the correlation ID, to get a single log file that tracks one correlation ID from across your farm. This can be very handy when chasing down a problem.

Figure 16. Merging events with a common correlation ID


 
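Here are two example invocations: one that widens the default time window, and one that merges only the events for a single correlation ID. The output paths are placeholders, and the GUID is the sample correlation ID used earlier in the chapter.

# Merge the last three hours of events from every server in the farm into one file
Merge-SPLogFile -Path "D:\Logs\Merged-LastThreeHours.log" -StartTime (Get-Date).AddHours(-3)

# Merge only the events for a single correlation ID from across the farm
Merge-SPLogFile -Path "D:\Logs\Merged-OneCorrelation.log" -Correlation "b7162a24-1fa2-4567-80a5-74feda9a768b"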

Because Get-SPLogEvent supports a -Directory parameter, you can point it at a location other than your standard LOGS directory when searching for events. This can be used to speed up your searches if you copy the applicable logs to a different directory and point Get-SPLogEvent there. You can also point it to the directory where you saved a merged log file created by Merge-SPLogFile and use it to filter those results as well.
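For example, the following sketch points Get-SPLogEvent at a folder of copied or merged logs; the path is a placeholder.

# Search a folder of copied or merged logs instead of the live LOGS directory
Get-SPLogEvent -Directory "D:\Logs\Merged" | Select-Object -Last 20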

As previously mentioned, Merge-SPLogFile is good for troubleshooting, but it is also very handy for archiving log events.

Windows Event Logs

In addition to the trace logs, another part of the ULS is Windows Events. These are the events that you are used to viewing in the Windows Event Viewer. While SharePoint 2010 writes to its own trace logs, it writes events here as well.

You can configure how much information SharePoint writes to the Windows Event Logs in the same way that you can control how much it writes to the trace logs. In Central Administration, click Monitoring and then select Configure diagnostic logging to set the threshold of events that are written to the Windows Event Logs, just like you can with the trace log.

You have several levels of events to choose from, including Reset to default, which resets the logging level back to its default. For Windows Events, you have an additional setting, event log throttling. If you enable event log throttling, SharePoint does not repeatedly write the same event to the Windows logs when there is a problem. Instead, it only writes events periodically, telling you that the event is still being throttled. This keeps your Windows Event Logs from being overrun by the same message.

In Central Administration, you can only enable or disable this feature. In Windows PowerShell, using Set-SPDiagnosticConfig, you can enable or disable throttling, as well as change some of the settings. Table 2 shows a list of these settings, a description of what they do, and their default values.

Table 2. Settings for Set-SPDiagnosticConfig

  • Threshold: The number of events allowed in a given time period (TriggerPeriod) before flood protection is enabled for that event. Units: an integer between 0 (disabled) and 100 (maximum). Default value: 5.

  • TriggerPeriod: The timeframe in which the threshold must be exceeded in order to trigger flood protection. Units: minutes. Default value: 2.

  • QuietPeriod: The amount of time that must pass without an event before flood protection is disabled for that event. Units: minutes. Default value: 2.

  • NotifyPeriod: The interval at which SharePoint writes an event notifying you that flood protection is still enabled for a particular event. Units: minutes. Default value: 5.

Earlier in this chapter, Figure 6 showed the Event Log Flood Protection settings as they are displayed with Get-SPDiagnosticConfig. These settings can be changed with the complementary cmdlet, Set-SPDiagnosticConfig.
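A minimal sketch of scripting those values follows. It assumes the EventLogFloodProtection* parameter names reported by Get-SPDiagnosticConfig; verify the exact names with Get-Help Set-SPDiagnosticConfig in your farm before running it.

# Enable flood protection and loosen the trigger slightly (example values; verify parameter names first)
Set-SPDiagnosticConfig -EventLogFloodProtectionEnabled:$true -EventLogFloodProtectionThreshold 10 -EventLogFloodProtectionTriggerPeriod 5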

Logging Database

Microsoft has always made it pretty clear how it feels about anyone touching the SharePoint databases. The answer is always a very clear and concise, "Stop it!" Microsoft didn't support reading from, writing to, or even looking crossly at SharePoint databases. Period. End of story. That became a problem, however, because not all of the information that administrators wanted about their farm or servers was discoverable in the interface, or with the SharePoint object model. This resulted in rogue administrators, with the curtains pulled, quietly querying their databases, hoping to never get caught.

SharePoint 2010 addresses this by introducing a logging database. This database is a farm-wide repository of SharePoint events from every machine in your farm. It aggregates information from many different locations, and writes them all to a single database. This database contains just about everything you could ever want to know about your farm, and that's not even the best part. The best part is that it is completely supported for you to read from and write to this database, if you would like, because the schema is public.

The following list includes some of the information that is logged by default:

  • Search Queries

  • Timer Jobs

  • Feature Usage

  • Content Import Usage

  • Server Farm Health Data

  • SQL blocked queries

  • Site Inventory

  • Search Query statistics

  • Page Requests

  • Site Inventory Usage

  • Rating Usage

  • Content Export Usage

  • NT Events

  • SQL high CPU/IO queries

  • Search Crawl

  • Query click-through

Microsoft had well-intentioned reasons for forbidding access to databases before. Obviously, writing to a SharePoint database potentially puts it in a state where SharePoint can no longer read it and render the content in it. Everyone agrees that this is bad.

What is less obvious, though, is that reading from a database can have the same impact. A seemingly innocent, but poorly written SQL query that only reads values could put a lock on a table or the whole database. This lock would also mean that SharePoint could not render out the content of that database for the duration of the lock. That's also a bad thing.

However, because this logging database is simply a copy of information gathered from other places, and it is not used to satisfy end-user requests, it's safe for you to read from it or write to it. If you destroy the database completely, you can just delete it and let SharePoint re-create it. The freedom is invigorating.

Let's take a look at some details behind this change of heart.

Configuring the Logging Database

How do you use this database and leverage all this information? By default, health data collection is enabled. This builds the logging database. To view the settings, open SharePoint Central Administration and go into the now-familiar Monitoring section. Under the Reporting heading, click Configure usage and health data collection to display the page shown in Figure 17.

Let's start by looking at the settings at the top. The first checkbox on the page determines whether the usage data is collected and stored in the logging database. This is turned on by default, and here is where you would disable it, should you choose to.

The next section enables you to determine which events you want reported in the log. By default, all eight events are logged. If you want to reduce the impact that logging has on your servers, you can disable events for which you don't think you'll want reports. You always have the option to enable events later. You may want to do this if you find yourself wanting to investigate a specific issue. You can turn the logging on during your investigation, and then shut it off after the investigation is finished.

The next section determines where the usage logs will be stored. By default, they are stored in the LOGS directory of the SharePoint root, along with the trace logs. The usage logs follow the same naming convention as the trace logs, but have the suffix .usage. As with the trace logs, it's a good idea to move these logs off of the C:\ drive if possible. You also have the capability to limit the amount of space occupied by the usage logs, with 5 GB being the default.

The next section, Health Data Collection, seems simple enough — just a checkbox and a link. The checkbox determines whether SharePoint will periodically collect health information about the members of the farm. The link takes you to a list of timer jobs that collect that information. When you click the Health Logging Schedule link, you're taken to a page that lists all of the timer jobs that collect this information. You can use this page to disable the timer jobs for any information you don't want to collect. Again, the more logging you do, the greater the impact on performance.

The amount of information SharePoint collects about itself is quite vast. Not only does it monitor SharePoint-related performance (such as the User Profile Service Application Synchronization Job), it also keeps track of the health of non-SharePoint processes (such as SQL Server). It reports SQL blocking queries and Dynamic Management Views (DMV) data. Not only can you disable the timer jobs for information that you don't want to collect, but you can also decrease how frequently they run, to reduce the impact on your servers.

Figure 17. Configuring the logging database


 

The next section of the Configure usage and health data collection page is the Log Collection Schedule. Here you can configure how frequently the logs are collected from the servers in the farm, and how frequently they are processed and written to the logging database. This lets you control the impact the log collection has on your servers. The default setting collects the logs every 30 minutes, but you can increase that interval to reduce the load placed on the servers.

The final section of the page displays the SQL instance and database name of the reporting database itself. The default settings use the same SQL instance as the default Content Database SQL instance, and use the database name WSS_Logging. The page says that it is recommended that you use the default settings. However, there are some pretty good reasons to change its location and settings.

Considering the amount of information that can be written to this database, and how frequently that data can be written, it might make sense to move this database to its own SQL Server instance. Though reading from and writing to the database won't directly impact end-user performance, the amount of usage this database could see might overwhelm SQL Server, or fill up the drives that also contain your other SharePoint databases. If your organization chooses to use the logging database, keep an eye on the disk space that it uses, and the amount of I/O activity it generates. On a test environment with about one month's worth of use by one user, the logging database grew to more than 1 GB. This database can get huge.

If you must alter those settings, you can do so in Windows PowerShell with the Set-SPUsageApplication cmdlet. The following PowerShell code demonstrates how to change the location of the logging database.

Set-SPUsageApplication -DatabaseServer <Database server name> -DatabaseName <Database name> [-DatabaseUsername <User name>] [-DatabasePassword <Password>] [-Verbose]

Specify the name of the SQL Server instance where you would like to host the logging database. You must also specify the database name, even if you want to use the default name, WSS_Logging. If the user running the Set-SPUsageApplication cmdlet is not the owner of the database, provide the username and password of an account that has sufficient permissions. Because this database consists of data aggregated from other locations, you can move it without losing any data. It will simply be repopulated as the collection jobs run.
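For example, a hypothetical invocation that moves the logging database to a dedicated SQL Server instance (the server name is a placeholder):

# Move the logging database; the usage and health collection jobs will repopulate it
Set-SPUsageApplication -DatabaseServer "SQLLOG01" -DatabaseName "WSS_Logging"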

To get the full list of PowerShell cmdlets that deal with the Usage service, use the following command.

get-command -noun spusage*

Consuming the Logging Database

Thus far, you've read a lot about this logging database, what's in it, and how to configure it. But you haven't learned how you can enjoy its handiwork. There are many places to consume the information in the logging database.

The first place to look is Central Administration. Click Monitoring and then select Reporting; there are three reports that use information in the logging database. The first is a link that says View administrative reports. Clicking that link takes you to a document library in Central Administration that contains a few canned administrative reports. Out-of-the-box, there are only search reports, but any type of reports could be put here. Microsoft could provide these reports, or they can be created by SharePoint administrators.

The documents in this library are simply web pages, so you can click any of them to see the information reported in them. These particular reports are very handy for determining the source of search bottlenecks. This enables you to be proactive in scaling out your search infrastructure. You can see how long discrete parts of the search take, and then scale out your infrastructure before end users are affected.

The next reports in Central Administration are the health reports. These reports enable you to isolate the slowest pages in your web application, and the most active users per web application. Like the search reports, these reports enable you to be proactive and diagnose issues in your farm. Running these reports enables you to see details about the pages that take the longest time to render, and then take steps to improve their performance. Figure 18 shows part of the report. To view a report, click the Go button at the top of the page.

Figure 18. Slow Page report


 

The report shows how long each page takes to load, including minimums, maximums, and averages. This provides a very convenient way to find trouble pages. You can also see how many database queries the page makes. This is helpful, because database queries are expensive operations that can slow down a page render. You can drill down to a specific server or web application with this report as well, because the logging database aggregates information from all the servers in the farm.

You can also pick the scope of the report you want, and click the Go button. The reports are generated at run-time, so it might take a few seconds for them to appear. After the results appear, you can click a column heading to sort by those values.

Web Analytics reports in Central Administration are also fed from the logging database. These reports provide usage information about each of the farm's web applications, excluding Central Administration. Click the View Web Analytics reports link to view a summary page that lists the web applications in the farm, along with some high-level metrics like total number of page views and total number of daily unique visitors.

When you click a web application on that summary page, you see a more detailed Summary page for the web application. This includes additional metrics, such as referrers, total number of page views, and the trends for each, as shown in Figure 19.

Figure 19. Web Analytics report


 

The web application Summary report also adds new links on the left. These links enable you to drill further down into each category. Each new report has a graph at the top, with more detailed information at the bottom of the page.

To change the scope of a report, click Analyze in the ribbon. This then shows the options that you have for the report, including the date ranges included. You can choose one of the date ranges provided, or choose custom dates. This provides the flexibility to drill down to the exact date you want. You can also export the report out to a comma-separated value (CSV) file by clicking the Export to Spreadsheet button. Because this is a CSV file, the graph is not included, only the dates and their values. These options are available for any of the reports after you choose a web application.

As previously mentioned, the Web Analytics reports do not include Central Administration. Although it is unlikely that you will need such a report, it is available to you. The Central Administration site is simply a highly specialized site collection in its own web application. Because it is a site collection, usage reports are also available for it. To view them, click Site Actions, and then select Site Settings. Under Site Actions, click Site Web Analytics reports. This brings up the same usage reports that you just saw at the web application level. You also have the same options from the ribbon, with the exception of being able to export to a CSV file.

Because these reports are site-collection Web Analytics reports, they are available in all site collections, not just in Central Administration. This is another way to consume the information in the logging database. To view the usage information for any site collection or web application, open Site Actions and select Site Settings to get the Web Analytics links. You have two similar links: Site Web Analytics reports and Site Collection Web Analytics reports. These are the same sets of reports, but at different scopes. The site-collection level reports are for the entire site collection. The site-level reports provide the same information, but at the site (also called web) level. You have the option to scope the reports at that particular site, or that site and its subsites.

Another option that was not available in the Central Administration Web Analytics reports is the capability to use workflows to schedule alerts or reports. You can use this functionality to have specific reports sent to people at specific intervals, or when specific values are met. This is another way that you can use the logging database and the information it collects to be proactive with a SharePoint farm.

There is one final way to consume the information stored in the logging database, directly from SQL. Although it might feel like you're doing something wrong, you're not. Microsoft said that it is okay. You have several ways to access data in SQL Server databases, but let's take a look at how to do it in SQL Server Management Studio with regular SQL queries.

SQL Server Management Studio enables you to run queries against databases. Normally, it is a very bad thing to touch any of the SharePoint databases, but the logging database is the only exception to that rule. To run queries against the logging database, you open Management Studio and locate the WSS_Logging database.

The database has a large number of tables. Each category of information has 32 tables to partition the data. It is obvious this database was designed to accommodate a lot of growth. Because of the database partitions, it is tough to do SELECT statements against them. Fortunately, the database also includes views that you can use to view the data.

Expand the Views node of the database to see which views are defined for you. In Figure 20, you can see how to get the information from the Usage tables. Right-click the view and click Select Top 1000 Rows.

This figure shows both the query that is used, and the results of that query. You can use this view and the resulting query as a template for any queries you want to design. If you do happen to damage the logging database, you can simply delete it, and SharePoint will re-create it.

Figure 20. Usage request query from logging database


 

Health Analyzer

By now, you've seen that you have a lot of ways to keep an eye on SharePoint. What if there was some way for SharePoint to watch over itself? What if it could use all that fancy monitoring to see when something bad was going to happen to it, and just fix it itself?

Welcome to the future. SharePoint 2010 introduces a feature called the Health Analyzer that does just that. The Health Analyzer utilizes timer jobs to run rules periodically, and to check on system metrics that are based on SharePoint best practices. When a rule fails, SharePoint can alert an administrator in Central Administration, or, in some cases, just fix the problem itself. To access all this in Central Administration, you click Monitoring and then select Health Analyzer.

Reviewing Problems

How do you know when the Health Analyzer has detected a problem? When you open up Central Administration and there's a red or yellow bar running across the top, as shown in Figure 21, that's the Health Analyzer alerting you that there's a problem in the farm. To review the problem, click View these issues on the right side of the notification bar.

Figure 21. Health Analyzer warning


When you click the link, SharePoint 2010 displays the Review problems and solutions page. (If there are no problems, you can also click Monitoring and then select Review problems and solutions in Central Administration to access the page.) This page shows you all the problems that the Health Analyzer found in the farm. Figure 22 shows some problems common with a single-server farm after installation.

Figure 22. Problems with a SharePoint farm


Clicking any of the issues displays the definition of the violated rule and possible remedies for it. Figure 23 shows details about one of the problems.

Figure 23. Problem details


As you can see toward the top of Figure 23, SharePoint provides a summary of the rule. This particular error indicates that one of the application pool accounts is also a local administrator. In most situations, this is a security issue, so SharePoint discourages it. SharePoint categorizes this as having a severity level of 2, being a Warning. It also tells you that this problem is in the Security category.

The next section, Explanation, describes what the problem is and to which application pools and services it pertains. The following section, Remedy, points you to the Central Administration page where you can fix the problem, and provides an external link to a page with more information about this rule. This is a great addition, and gives SharePoint the capability to update the information dynamically.

The next two sections indicate which server is affected by the issue, and which service logged the failure. The final section provides a link to view the settings for this rule. You learn more about the rule definitions later in this chapter.

That's a rather in-depth property page, and it's packed with even more features. Across the top is a small ribbon that gives you some management options.

Starting on the left, the first button is Edit Item. This lets you alter the values shown on the property page. You could use this to change the error level or category of the rule. It isn't recommended that you alter these values, but if you do, you can keep track of the versions with the next button to the right, Version History. The next button, Alert Me, enables you to set an alert if the item changes. You have these options because these rules are simply items in a list, so you have many of the same options you have with regular list items.

There is another button that deserves mention. For each rule, you have the option to Reanalyze Now. This lets you fire off any rule without waiting for its scheduled appearance, which is great for ensuring that a problem is fixed once you have addressed it. You won't have to wait for the next time the rule runs to verify that it has been taken care of.

Some problems are not only reported, but can be fixed in the property page as well. Figure 22 shows another problem that appears under the Configuration category. It notes that one or more of the trace log categories were configured with Verbose trace logging. This configuration issue can contribute to unnecessary disk I/O and drive space usage. The Health Analyzer alerts you when this value is set. This problem is fairly easy to fix. Simply set the trace logging level back to its default. For problems like this, SharePoint offers another option, Repair Automatically, shown at the top of Figure 24.

Figure 24. Repair Automatically button


Click the Repair Automatically button if you want SharePoint 2010 to fix the problem. Then, click the Reanalyze Now button, click Close on the property page, and then reload the problem report page. The trace logging problem should no longer be listed. This is almost bliss for the lazy SharePoint administrator.

Rule Definitions

The real power of the Health Analyzer lies in its impressive set of rules. SharePoint 2010 includes 60 rules. To see the entire list and details about each rule, click Monitoring, select Health Analyzer, and then choose Review rule definitions under Health Analyzer.

The rules are broken down by category: Security, Performance, Configuration, and Availability. The default view shows several pieces of information about each rule, including the Title, the Schedule of how often it runs, whether it's Enabled to run, and whether it will Repair Automatically. Wait, did you just read "Repair Automatically"? You read that right. Some rules can be configured to automatically repair the problems they find.

One example of a rule that repairs the problem it finds is Databases used by SharePoint have fragmented indices. Once a day, SharePoint checks the indices of its databases, and if their fragmentation exceeds a hard-coded threshold, SharePoint automatically defragments them. If the indices are not heavily fragmented, it does nothing. This is a great use of Repair Automatically: it's an easy task to automate, and there's no reason an administrator should have to do it manually.

Some rules, like Drives are running out of free space, are not such good candidates for SharePoint to fix by itself. You don't want it deleting all those copies of your resume or your Grandma's secret chocolate-chip cookie recipe.

If you want to change the settings of any rule (including whether or not it repairs automatically), simply click the rule title, or click the rule's line and select Edit Item in the ribbon. Here, you can enable or disable the rule. In a single-server environment, for example, it might make sense to disable the rule that reports when databases are on the same server as SharePoint; it's nothing you can fix, so getting alerts about it does you no good. You could also change how often a rule runs, but the best practice is to leave a rule's details alone except for whether it is enabled and whether it should automatically correct any problems it finds.

Finally, the rules are simply items in a list, which is what makes the rules list extensible. More rules can be added later by Microsoft or by third parties.
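Because the rules live in an ordinary list in Central Administration, you can even read them from Windows PowerShell through the object model. The sketch below makes one assumption worth calling out: the list title "Health Analyzer Rule Definitions" may differ in your farm, so confirm it before scripting against it.

Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Locate the Central Administration web application
$caWebApp = Get-SPWebApplication -IncludeCentralAdministration |
    Where-Object { $_.IsAdministrationWebApplication }

# Open the Central Administration root web and grab the rules list
# (the list title is an assumption; verify it in your farm)
$caWeb = Get-SPWeb $caWebApp.Url
$rules = $caWeb.Lists["Health Analyzer Rule Definitions"]

# Print the title of every rule definition
$rules.Items | ForEach-Object { $_["Title"] }

$caWeb.Dispose()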

Timer Jobs

Timer jobs are one of the great unsung heroes of SharePoint. They have been around for several versions of SharePoint, and they get better with age.

Timer jobs are the workhorses of SharePoint. At the most basic level, timer jobs are tasks defined in XML files in the configuration database. Those XML files are pushed out to the members of the farm, and are executed by the Windows service, SharePoint 2010 Timer. Most configuration changes are pushed out to the farm members with timer jobs. Recurring tasks like Incoming E-Mail also leverage timer jobs.

In SharePoint 2010, timer jobs get another round of improvements. A lot of the functionality covered in this chapter relies on timer jobs, so you have seen some of those improvements already. This section drills down a little deeper into how timer jobs have improved.

Timer Job Management

When you enter Central Administration, it is not immediately obvious that timer jobs have received such a shiny new coat of paint. There are links to essentially the same two items in SharePoint 2010 that there were in SharePoint 2007: job status and job definitions. In SharePoint 2010, the timer job links are under the Monitoring section, because there is no longer an Operations tab. Figure 25 shows their new home.

Figure 25. Timer Job Definitions

Timer Job Definitions

The timer job definition page is largely unchanged from its SharePoint 2007 counterpart. You get a list of the timer jobs, the web application they will run on, and their schedule. You can also change the jobs that are shown by filtering the list with the View drop-down in the upper right-hand corner.

To really see what's new, click one of the timer job definitions. Hopefully you're sitting down, because otherwise the new timer definition page shown in Figure 26 might knock you over. It includes all of the same information provided in SharePoint 2007, such as the general information from the job definitions screen and the buttons to disable the timer job. However, there are two new, very exciting features.

First, you can change the timer job schedule in this window. In SharePoint 2007, you needed to use code to do this. This provides a lot of flexibility to move timer jobs around if your farm load requires it. That's a great feature, but it's not the best addition.

The best addition to this page (and arguably to timer jobs in SharePoint 2010) is the button on the bottom of the page, Run Now. You now have the capability to run almost any timer job at will. This means no more waiting for the timer job's scheduled interval to elapse before knowing if something you fixed is working. It is also how Health Monitoring (discussed earlier in this chapter) can fix issues and re-analyze problems. You are no longer bound by the chains of timer job schedules. You are free to run timer jobs whenever you want. That alone is worth the cost of admission.
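If you'd rather skip the clicking, the same "run it now" behavior is available from Windows PowerShell. Here's a quick sketch; the job name used in the filter is only an example, so substitute a job that exists in your farm.

Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Find a timer job by display name and run it immediately
# ("Immediate Alerts" is only an example; pick any job from your farm)
$job = Get-SPTimerJob | Where-Object { $_.DisplayName -like "Immediate Alerts*" } |
    Select-Object -First 1
$job | Start-SPTimerJob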

Figure 26. Edit Timer Job page

Edit Timer Job page

Timer Job Status

The other link related to timer jobs that you have in Central Administration is Check job status. This serves the same purpose as its SharePoint 2007 counterpart. However, like the timer job definitions, it has received a new coat of paint. Figure 27 shows the new Timer Job Status page. Like the SharePoint 2007 version, it shows you the timer jobs that have completed, when they ran, and whether they were successful.

SharePoint 2010 takes it a step further. The status is now a hyperlink: whether a timer job failed or succeeded, you can click its status to get more information. You also have the capability to filter the view to show only the failed jobs. That helps with troubleshooting, because you can see all the failures on one page without all those pesky successes getting in the way, and then click any failure to find out why that particular timer job failed.

The Timer Job Status page serves as a dashboard. You've already seen how it shows the timer job history, but it also shows the timer jobs that are scheduled to run, as well as the timer jobs that are currently running. If you want more complete information on any of these sections, you can click the appropriate link on the left under Timer Links. This provides a page dedicated to each section.

Figure 27. Timer Job Status page

Timer Job Status page

Along with listing the timer jobs that are running, the page shows how far along each job is, complete with a progress bar. If you have many jobs running at once, you can click Running Jobs in the left navigation pane to access a page dedicated to the timer jobs that are currently running.
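You can pull similar status information from Windows PowerShell, too. The sketch below assumes the timer job objects expose a LastRunTime property (they do in the SharePoint 2010 object model, but treat the property name as an assumption and adjust it if it differs in your environment).

Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Show when each timer job last ran, newest first
# (LastRunTime is assumed to be available on the job objects)
Get-SPTimerJob |
    Sort-Object LastRunTime -Descending |
    Select-Object DisplayName, LastRunTime -First 20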

Here's one final timer job improvement: SharePoint 2010 introduces the capability to assign a preferred server for the timer jobs running against a specific Content Database. Figure 28 shows how it is configured in Central Administration.

Figure 28. Configuring Preferred Timer Job Server

Configuring Preferred Timer Job Server

This setting is made per Content Database, so you'll find it on the Manage Content Database Settings page (in Central Administration, click Application Management and then select Manage Content Databases). Being able to designate a particular server to run a database's timer jobs serves two purposes.

From a troubleshooting standpoint, you can use it to isolate failures to a single box if you're having trouble with a specific timer job or Content Database. You can also use it to move the burden of timer jobs to a specific server. That server could be one that doesn't service end-user requests, so making it responsible for timer jobs gives you another scaling option.
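You can also script this setting. The following sketch assumes the content database object exposes a PreferredTimerServiceInstance property and that the farm's TimerService lists its running instances; both names come from the SharePoint 2010 object model, but verify them before relying on this. The server name and database name are placeholders.

Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Find the timer service instance running on a specific server
# ("SERVER01" is a placeholder; use one of your own farm members)
$farm = Get-SPFarm
$instance = $farm.TimerService.Instances |
    Where-Object { $_.Server.Name -eq "SERVER01" }

# Make that server the preferred timer job server for a content database
# ("WSS_Content" is a placeholder database name;
#  PreferredTimerServiceInstance is an assumed property name)
$db = Get-SPContentDatabase "WSS_Content"
$db.PreferredTimerServiceInstance = $instance
$db.Update()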

Although you can do a lot to manage timer jobs in Central Administration, you can't forget about Windows PowerShell. SharePoint includes five cmdlets that deal with timer jobs. To discover them, use the following Get-Command cmdlet.

PS C:\> Get-Command -noun SPTimerJob

You can use PowerShell to list all of your timer jobs using Get-SPTimerJob, and then choose to run one with Start-SPTimerJob.
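As a quick sketch of how those cmdlets fit together, here's how you might list the timer jobs scoped to one web application, change a job's schedule, and disable and re-enable it. The web application URL, job name, and schedule string are all placeholders; substitute values from your own farm.

Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# See all of the timer job cmdlets
Get-Command -Noun SPTimerJob

# List the timer jobs for one web application, with their schedules
# (the URL is a placeholder for one of your own web applications)
Get-SPTimerJob -WebApplication "http://sharepoint" |
    Select-Object DisplayName, Schedule

# Change a job's schedule, then disable and re-enable it
# ("job-immediate-alerts" and the schedule string are example values)
$job = Get-SPTimerJob | Where-Object { $_.Name -eq "job-immediate-alerts" }
Set-SPTimerJob -Identity $job -Schedule "every 10 minutes between 0 and 59"
$job | Disable-SPTimerJob
$job | Enable-SPTimerJob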

Summary

This chapter picked up where your installation experience left off. You have a SharePoint farm that is installed and running perfectly. With the tools in this chapter, you have learned how to keep an eye on that farm to ensure that it continues to run well. If there is trouble in your farm, you are now armed with the tools to hunt it down and fix it. You know to check the Health Analyzer first to see whether it has found any problems with your farm, and if nothing turns up there, you know how to use the ULS logs to track the error down. After finishing this chapter, you are a lean, mean, SharePoint monitoring machine.

Additional Resources

For more information, see the following resources:

About the Author

Todd Klindt has been a professional computer nerd for 15 years, and an amateur computer nerd before that. After finding out in college that his desires for food and shelter and his abilities at programming were not compatible, he decided to try being an administrator instead. He received his MCSE in 1997 and spent a lot of time taming Windows Server, Exchange Server, and the unlucky Microsoft SQL Server here and there. In 2002, he was asked to set up a web page for his IT department. He couldn't program, and he couldn't design HTML to save his life. He found SharePoint Team Services on an Office XP CD and decided to give it a shot. It turned out that SharePoint was just what the doctor ordered. As each version of SharePoint was released, Mr. Klindt became more and more enamored with it. In 2005, Mr. Klindt was awarded Microsoft's MVP award for Windows SharePoint Services. Since then, he has contributed to a couple of SharePoint books, written a couple of magazine articles, and had the pleasure of speaking in more places than he can believe. To pay the bills, he is a SharePoint consultant and trainer with SharePoint911. You can find him on Twitter dispensing invaluable SharePoint and relationship advice as @ToddKlindt. He lives in Ames, Iowa, with his lovely wife, Jill, their two daughters, Lily and Penny, and their two feline masters, Louise and Spike.