User Profile Analysis for Exchange Server 2010 (Who needs the Exchange Profile Analyzer?)

NOTE: Thanks to Josh Bryant, this script has now been updated to collect data from Exchange 2007, 2010 and 2013 servers. I have updated the link in this article to point to the new script - thanks Josh!

I recently wrote about gathering user profile data for Exchange Server 2003 and 2007 by using the Exchange Server Profile Analyzer tool.  As a re-cap the EPA tool uses WebDAV to interrogate the mailboxes and generates user profile data, including..

  • messages sent per mailbox per day
  • messages received per mailbox per day
  • average message size

Note: This information is vital for performing good quality Exchange server scaling.

The problem of course is that Exchange Server 2010 does not include WebDAV and so the EPA tool will not work.  This poses an interesting problem, however I am happy to report that we have a solution…

One of the nice things about Exchange 2007 and Exchange 2010 is that we can interrogate the message tracking logs via PowerShell.  This provides us with a nice way to query what the Exchange Server is doing.  Usefully the message tracking logs include sufficient information for us to approximate our user profile data, without needing the EPA.

Gathering the Data

After asking around internally within Microsoft about how to gather EPA data for Exchange Server 2010, it became apparent that PowerShell would be the best way to interrogate the message tracking logs.  I mentioned to a few people that I was going to write something up over the next few weeks, however before I had a chance to even put any significant thought into the task, someone sent me a copy of the following script which I have uploaded here.

Now, I must confess that despite my best efforts I have been unable to track down the original author! The author of this script is Rob Campbell, more about the scripts development can be found here…

The script basically works by parsing the messaging tracking logs of your Exchange Servers and then tabulates the information into a CSV file for analysis in Excel.  To provide some data to parse I configured a loadgen test against 10 mailboxes with a heavy profile, this should approximate to around 80 messages received and 20 sent per user.

The MessageStats script has a single command line parameter which controls how many days back it will look in the tracking logs.  The script only parses a single days worth of data, the value you provide helps you tell the script which day to process, so 1 will process yesterdays logs.

Analysing the Data File

So, now we have our CSV file that we can open in Microsoft Excel, however the data required some work before we can get our EPA values.  The following screenshot shows the raw data open in Excel.

The best way to process the data is to convert it into a table..

  1. Highlight cell A1
  3. Click on the INSERT Menu
  4. Click on the TABLE button
  5. Click on OK
  6. Open the DESIGN Menu
  7. Check the "Total Row" checkbox
  8. Hide columns C,D,E,H,I,J,K,L,M,N,O,R,S,T,U

You should now have a table with the following columns…

  • Date
  • User
  • Received Total
  • Received MB Total
  • Sent Unique Total
  • Sent Unique MB Total

Note: Due to my test lab being very small I have added a filter to remove any non-loadgen accounts from the data analysis.

In the Total row at the bottom of your table add "AVERAGE" subtotals for "Received Total" and "Sent Unique Total".

In the "Received MB Total" column total cell, add in an "AVERAGE" subtotal, then edit the formula in the cell and divide that value by the Total Row average for "Received Total", then multiply the result by 1024 – this will report the average message size in KB.

In the "Sent Unique MB Total" column total cell, add in an "AVERAGE" subtotal, then edit the formula in the cell and divide that value by the Total Row average for "Sent Unique Total", then multiply the result by 1024 – this will report the average message size in KB.

We now have all of the information that we require…

  • Messages Received per Mailbox Per Day = Received Total  = 68)
  • Messages Sent per Mailbox Per Day = Sent Unique Total  = 17)
  • Average Message Size = Average of Received MB Total & Sent Unique MB Total (27.37+28.5)/2 = 27.94KB

So, using this technique we have managed to approximate our user profile to a fair degree of accuracy without needing to logon to any mailboxes!.  I suspect that this method is accurate to around +/- 10% which is totally acceptable in this context.

Obviously there is a caveat here that I have only performed some rudimentary testing in a fairly small lab environment, so if you do run this in production and find that it generates weird results, or that it validates your already proven EPA data, then feel free to drop me a note to let me know