OUTBOX: Understanding and Fixing Slow Exchange Web Services Code (Part 1)
Recently I was working with a customer who was concerned about Exchange Web Services performance. He was testing some EWS code whose purpose was to retrieve all properties of every item in a mailbox. The code was structured like this...
- GetItem | AllProperties - Get the root folder of the mailbox (one call)
- FindFolder | AllProperties - Get the folder and the FolderIds of its sub folders (recursive, one call for each folder)
- FindItem | AllProperties - Find each item in the folder (one call for each folder)
- GetItem | AllProperties - Get the full set of properties on each item (one call for each item)
The first thing to note is that the base shapes of these requests ask for too much information: every one of them uses an AllProperties base shape. A very simple change to improve speed would be to use IdOnly base shapes on every request until the GetItem calls on the items themselves, at which point all properties are desired...
- GetItem | IdOnly - Get the root folder of the mailbox (one call)
- FindFolder | IdOnly - Get the folder and the FolderIds of its sub folders (recursive, one call for each folder)
- FindItem | IdOnly - Find each item in the folder (one call for each folder)
- GetItem | AllProperties - Get the full set of properties on each item (one call for each item)
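To make the base shape change concrete, here is a rough sketch in Python that builds the FindItem element of a request with a configurable base shape. It only builds the XML body; the SOAP envelope, authentication, and the HTTP call are left out. The helper name `find_item_request` and the sample folder id are made up for illustration and are not part of the customer's code.

```python
import xml.etree.ElementTree as ET

# EWS messages and types namespaces.
M = "http://schemas.microsoft.com/exchange/services/2006/messages"
T = "http://schemas.microsoft.com/exchange/services/2006/types"
ET.register_namespace("m", M)
ET.register_namespace("t", T)


def find_item_request(folder_id: str, base_shape: str = "IdOnly") -> str:
    """Build a FindItem element for one folder (hypothetical helper).

    base_shape is the only knob that changes between the slow and the
    fast version: "AllProperties" versus "IdOnly".
    """
    find_item = ET.Element(f"{{{M}}}FindItem", {"Traversal": "Shallow"})
    shape = ET.SubElement(find_item, f"{{{M}}}ItemShape")
    ET.SubElement(shape, f"{{{T}}}BaseShape").text = base_shape
    parents = ET.SubElement(find_item, f"{{{M}}}ParentFolderIds")
    ET.SubElement(parents, f"{{{T}}}FolderId", {"Id": folder_id})
    return ET.tostring(find_item, encoding="unicode")


# "example-folder-id" is a placeholder, not a real FolderId.
print(find_item_request("example-folder-id"))
```

The same pattern applies to the FindFolder request: only the final GetItem calls need the AllProperties shape.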
This code is still very chatty: it makes one FindFolder and one FindItem call for each folder, plus one GetItem call for each item in the folders. That is a lot of requests and responses going back and forth. Fortunately, these calls can be restructured and combined to reduce chattiness. Notice that FindFolder has an option for a deep traversal, which makes it possible to retrieve the FolderIds of all folders underneath the mailbox root in a single request and response. There is no deep traversal option on a FindItem request, and there is no ParentFolderIds property on a GetItem request, but a GetItem request can include multiple ItemIds. So while there still needs to be one FindItem request per folder, the GetItem calls can be batched instead of doing one per item...
- FindFolder | IdOnly, Deep - Get the FolderId for every folder under the mailbox root (one call)
- FindItem | IdOnly - Get the ItemIds for every item in the folder (one call per folder)
- GetItem | AllProperties - Get the full set of properties on each item (approximately one call per folder)
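As a rough illustration of the restructured plan, the Python sketch below builds a FindFolder element with a deep traversal and then compares how many requests each plan makes. The helper names, the choice of the `root` distinguished folder as the starting point, and the sample mailbox sizes are assumptions for illustration only; the request counts are a back-of-the-envelope estimate, not a measurement.

```python
import math
import xml.etree.ElementTree as ET

M = "http://schemas.microsoft.com/exchange/services/2006/messages"
T = "http://schemas.microsoft.com/exchange/services/2006/types"
ET.register_namespace("m", M)
ET.register_namespace("t", T)


def find_folder_deep_request(root: str = "root") -> str:
    """Build a FindFolder element that walks the whole hierarchy at once.

    Assumption: the traversal starts at the "root" distinguished folder;
    adjust this to whichever folder the application treats as its root.
    """
    find_folder = ET.Element(f"{{{M}}}FindFolder", {"Traversal": "Deep"})
    shape = ET.SubElement(find_folder, f"{{{M}}}FolderShape")
    ET.SubElement(shape, f"{{{T}}}BaseShape").text = "IdOnly"
    parents = ET.SubElement(find_folder, f"{{{M}}}ParentFolderIds")
    ET.SubElement(parents, f"{{{T}}}DistinguishedFolderId", {"Id": root})
    return ET.tostring(find_folder, encoding="unicode")


def request_counts(items_per_folder, batch_size):
    """Estimate total requests for the per-item plan and the batched plan."""
    folders = len(items_per_folder)
    items = sum(items_per_folder)
    # Root lookup + FindFolder per folder + FindItem per folder + GetItem per item.
    per_item_plan = 1 + folders + folders + items
    # One deep FindFolder + FindItem per folder + batched GetItem calls.
    batched_plan = 1 + folders + sum(
        math.ceil(n / batch_size) for n in items_per_folder
    )
    return per_item_plan, batched_plan


print(find_folder_deep_request())
# Hypothetical mailbox: three folders holding 1531, 40 and 0 items.
print(request_counts([1531, 40, 0], batch_size=50))
```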
This greatly reduces the number of requests made and in turn improves the performance of the code. Notice that the GetItem call says "approximately one call per folder". Just because an ItemId array of 1500 items can be sent with a GetItem request doesn't mean it is the best idea. Batching the GetItem calls certainly helps performance, but each folder's items should be broken up into a "reasonable" number of items per request, because large requests can take minutes to process and return. The ideal threshold can be determined by taking into account the deployment environment, application requirements, and testing results for each application.
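One way to keep batches to a "reasonable" size is to split each folder's ItemIds into fixed-size chunks and build one GetItem request per chunk. The Python sketch below shows the idea; `batches`, `get_item_request`, the placeholder ids, and the batch size of 25 are all illustrative assumptions, and the right batch size should come from testing in your own environment.

```python
import xml.etree.ElementTree as ET
from typing import Iterator, List, Sequence

M = "http://schemas.microsoft.com/exchange/services/2006/messages"
T = "http://schemas.microsoft.com/exchange/services/2006/types"
ET.register_namespace("m", M)
ET.register_namespace("t", T)


def batches(item_ids: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `size` ItemIds."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(item_ids), size):
        yield list(item_ids[start:start + size])


def get_item_request(item_ids: Sequence[str],
                     base_shape: str = "AllProperties") -> str:
    """Build one GetItem element that asks for several items at once."""
    get_item = ET.Element(f"{{{M}}}GetItem")
    shape = ET.SubElement(get_item, f"{{{M}}}ItemShape")
    ET.SubElement(shape, f"{{{T}}}BaseShape").text = base_shape
    ids = ET.SubElement(get_item, f"{{{M}}}ItemIds")
    for item_id in item_ids:
        ET.SubElement(ids, f"{{{T}}}ItemId", {"Id": item_id})
    return ET.tostring(get_item, encoding="unicode")


# Placeholder ids standing in for the ItemIds returned by FindItem.
folder_item_ids = [f"item-{n}" for n in range(1531)]
requests = [get_item_request(chunk) for chunk in batches(folder_item_ids, 25)]
print(len(requests))
```

The last chunk is simply shorter than the rest, so no item is dropped when the folder size is not a multiple of the batch size.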
...Just to illustrate the point, here is some data from tests I did. In this test I'm simply calling GetItem|AllProperties on every item in a test account's inbox. The folder has about 1500 items of varying sizes; some of the items are pretty large. There is just one Exchange server with all roles installed on it. The time recorded here is the time it takes to get through all the GetItem requests and responses for the items in the inbox. "Items per Request" is the number of ItemIds included in each GetItem request. Remember, these results are specific to my environment and what I'm doing in my code; your mileage may vary...
Items per Request | Total # of Requests | Time (seconds) Test 1 | Time (seconds) Test 2 | Average Time (seconds) | Average Time per Request |
1 | 1531 | 250 | 262 | 256 | 0.17 seconds |
2 | 766 | 193 | 188 | 190.5 | 0.25 seconds |
5 | 307 | 149 | 151 | 150 | 0.49 seconds |
10 | 154 | 137 | 135 | 136 | 0.88 seconds |
25 | 62 | 129 | 129 | 129 | 2.08 seconds |
50 | 31 | 135 | 130 | 132.5 | 4.27 seconds |
100 | 16 | 134 | 131 | 132.5 | 8.28 seconds |
250 | 7 | 132 | 131 | 131.5 | 18.79 seconds |
500 | 4 | 131 | 134 | 132.5 | 33.13 seconds |
1000 | 2 | 133 | 134 | 133.5 | 66.75 seconds |
2000 | 1 | 135 | 131 | 133 | 133 seconds |
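For readers who want to run the same comparison on their own numbers, the derived columns in the table can be reproduced with a few lines of Python. This sketch assumes the inbox held 1531 items (the request count in the first row) and that the last column is the average time divided by the number of requests; only the two measured times per row come from the tests above.

```python
import math
from decimal import Decimal, ROUND_HALF_UP

TOTAL_ITEMS = 1531  # assumption: taken from the 1-item-per-request row

# items per request -> (test 1 seconds, test 2 seconds), copied from the table
MEASURED = {
    1: (250, 262), 2: (193, 188), 5: (149, 151), 10: (137, 135),
    25: (129, 129), 50: (135, 130), 100: (134, 131), 250: (132, 131),
    500: (131, 134), 1000: (133, 134), 2000: (135, 131),
}


def summarize(items_per_request, test1, test2):
    """Return (request count, average seconds, seconds per request)."""
    requests = math.ceil(TOTAL_ITEMS / items_per_request)
    average = Decimal(test1 + test2) / 2
    # Round half up to two places, matching the table's presentation.
    per_request = (average / requests).quantize(
        Decimal("0.01"), rounding=ROUND_HALF_UP
    )
    return requests, average, per_request


for size, (t1, t2) in MEASURED.items():
    requests, average, per_request = summarize(size, t1, t2)
    print(f"{size:>5} | {requests:>5} | {average:>6} | {per_request:>7}")
```

In this data set the total time levels off at roughly 130 seconds from about 25 items per request onward, while the time each individual request takes keeps growing with the batch size.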
... This is part one of a two-part series; click here to read part two.