Things you need to know about “The connection to the Microsoft Exchange server is unavailable. Outlook must be online or connected to complete this action” prompts in an Outlook 2003–Exchange 2010 world

If you are reading this blog, you are most likely troubleshooting the following prompt, which appears in your Outlook 2003 - Exchange 2010 environment when opening additional calendars.

  “The connection to the Microsoft Exchange server is unavailable. Outlook must be online or connected to complete this action”

Outlook 2003 might also show the following prompts for the same issue:

“Unable to display the folder. The information store could not be opened”

“The set of folders could not be opened”

“The information store could not be opened”

This post is specific to the above prompts occurring on Outlook 2003 clients while accessing additional mailboxes that reside on Exchange 2010 servers. It is a somewhat lengthy post, so be prepared to spend some time reading through it.

 

As many of us are aware by now, KB 2299468 covers this issue for Exchange 2010. I wrote the draft for this KB article, and thanks go to the Exchange KB review team, who made some great enhancements to it before it was published. Some background on KB 2299468:

In the first case I worked with these symptoms, the customer was planning to migrate to Outlook 2010 soon (and the issue didn’t occur for Outlook 2007 and later clients), which removed the need for any real troubleshooting, and the case was closed. Then I hit my second case with these symptoms. In this case, the primary and additional mailboxes being accessed were all on Exchange 2010. That made it different from the public KBs that existed on this topic at the time, which were written for Exchange 2007 and Exchange 2003 respectively and covered scenarios where the primary mailbox was on Exchange 2007 or Exchange 2003. In my case the primary mailbox was on Exchange 2010.

I was aware that Outlook 2003 has to establish more connections to Exchange 2010 than its successors do. The LegacyDNs it has to resolve to the target Exchange 2010 server names when opening these additional mailboxes are constructed LegacyDNs, and from the Outlook 2003 client’s point of view each one is a different server and therefore a different connection every time. This eventually leads to more RPC connections per client. In my case I had to resort to data collection on both the server and the client side to connect the dots to this theory. To start with, we were not even sure our symptoms were related to it. Fortunately for me, the customer was very willing to cooperate with the data collection.

I reviewed the Address Book logs, the RPC Client Access logs, and the Outlook ETL logs. I didn’t see any failures at the address book logging level, but I also didn’t see a successful logon to the additional mailboxes in the RPC Client Access logs. The Outlook ETL trace confirmed that the error was returned from the server. On the recommendation of an Outlook EE, I then captured Extrace on RPCClientAccess at the CAS server, and that is where we found a real breakthrough. We saw that these additional mailbox logons failed for the primary mailbox with the error “MaxConnectionsExceeded”, meaning that the maximum number of allowable RPC connections for that primary mailbox had been exceeded. Now I could connect the dots, and KB 2299468 was born. That’s its background.

However, in my case the issue was still not completely resolved. It just got a little more interesting. My customer’s feedback was that the custom throttling policy resolved the issue for some mailboxes but not for others. On further testing, the customer found that for the mailboxes where the custom throttling policy didn’t help, removing the entire list of additional mailboxes and re-adding them resolved the problem. It seemed that at some point his Exchange organization had had Exchange 2000 in it. His observation was that for mailboxes that had been moved from Exchange 2000, the issue was not completely resolved: the user might be able to open all 29 additional calendars now, and within the next few minutes be able to open only a few. For mailboxes newly created on Exchange 2010, the custom throttling policy resolved the issue. With more tests, we confirmed this pattern. On further internal research, I found that this behavior stems from the way Outlook 2003 stores information about shared calendars.

  • It maintains a sort of cache of the shared calendars that a primary mailbox opens. This "cache" takes the form of hidden persistence messages stored in the Common Views of the primary mailbox  
  • The hidden persistence messages contain information about the shared mailbox, such as the user name and the server where the mailbox resides. If the mailbox is moved to another server, Outlook 2003 will not update this information by default, and it will try to make referrals to the old server (which has probably been decommissioned already). Hence the error message that the server is unavailable, which makes the error prompt genuine.

 The best way to force Outlook 2003 to refresh its information about the mailboxes behind the shared calendars is to start Outlook with the /resetnavpane switch, which forces these messages to pick up updated information. It also meant that all existing additional mailboxes were gone for the mailboxes where we launched Outlook with /resetnavpane, and we had to re-add them manually.

               Go to Start > Run and type “outlook.exe /resetnavpane”

We did that and the issue was completely resolved. KB 2299468 focused on server-side RPC throttling values, and we had now hit another cause of the error prompts, this time on the Outlook side. My customer eventually added the /resetnavpane information to his own blog, and I’m sure it’s searchable on the internet.

So, why am I writing this blog now? Well, the issue is very common and should have been posted on an MS blog long ago. I would have missed the current opportunity to write one too if I hadn’t been tempted by another interesting case. Thanks to my colleague who suggested that I write a post on this subject.

In this case again, the Outlook 2003 error prompt was “The connection to the Microsoft Exchange server is unavailable. Outlook must be online or connected to complete this action” while accessing additional calendars. The customer had a mixed Exchange 2010 – Exchange 2003 environment. As I started working with him, I realized he already had a custom throttling policy on the Exchange 2010 side and had already applied the relevant Exchange 2003 DS Proxy hotfix on his Exchange 2003 servers. The customer informed me that he had a two-node CAS array in Windows NLB, and that both CAS servers had static port mapping defined for the RPC Client Access and Address Book services respectively.

As a quick test, I convinced the customer to put a hosts file on one of the Outlook 2003 client machines that resolved the CAS array name to the IP of one of the CAS servers, thus bypassing the NLB. We did that and found that the issue didn’t occur at all. To check whether the problem was with the other CAS server, we shut down Outlook, changed the hosts file so that the CAS array name resolved to the IP of the other CAS server, and started Outlook again; the issue didn’t occur there either. We then removed the hosts file entry and the issue was back. That told us the issue had to do with the load balancing.

I wondered whether the NLB somehow resulted in a high number of RPC connections. To check that, with the CAS entries removed from the hosts file so that Outlook reached the NLB again, I collected Extrace from both CAS servers (perhaps there are other ways to see whether RPC connections were exceeded, but I fancied sticking to my method) and found that the connection limit had NOT been exceeded for the users facing the problem.

That was a nice scenario, something I hadn’t anticipated. What next? I captured simultaneous network traces on the client and the CAS servers, and that shed more light on the actual cause.

Reading the trace, I observed that the server returned port 8500 to the client as the Referral service port:

9 15:29:45.052621 192.168.1.5 192.168.1.135 EPM Map request --> Client requests End Point Mapper for RFR's UUID ( 1544f5e0-613c-11d1-93df-00c04fd7bd09 )

0000 13 00 0d e0 f5 44 15 3c 61 d1 11 93 df 00 c0 4f

0010 d7 bd 09 01 00 02 00 00 00

 10 15:29:45.052807 192.168.1.135 192.168.1.5 EPM Map response --> Response from End Point Mapper that RFR port is 8500

Floor 4 TCP Port:8500
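The hex bytes in the Map request can be checked against the RFR UUID quoted in the annotation. The following is a minimal sketch, assuming the dump is a single DCE/RPC tower floor laid out as a 2-byte length, a 1-byte protocol identifier (0x0d for a UUID), the 16-byte interface UUID in little-endian form, and a 2-byte major version. The function name parse_uuid_floor is hypothetical and only for illustration.

```python
import uuid

# Bytes copied from the EPM Map request dump in frame 9 of the trace.
FLOOR_BYTES = bytes.fromhex(
    "13 00 0d e0 f5 44 15 3c 61 d1 11 93 df 00 c0 4f "
    "d7 bd 09 01 00 02 00 00 00"
)


def parse_uuid_floor(raw):
    """Decode one UUID tower floor (layout is an assumption, see lead-in).

    Returns (lhs_length, protocol_id, interface_uuid, major_version).
    """
    lhs_length = int.from_bytes(raw[0:2], "little")      # 0x0013 = 19 bytes
    protocol_id = raw[2]                                  # 0x0d = UUID floor
    interface_uuid = uuid.UUID(bytes_le=raw[3:19])        # little-endian UUID
    major_version = int.from_bytes(raw[19:21], "little")  # interface version
    return lhs_length, protocol_id, interface_uuid, major_version


if __name__ == "__main__":
    _, _, iface, major = parse_uuid_floor(FLOOR_BYTES)
    print(f"interface {iface} v{major}")
```

Decoding the bytes this way is a quick sanity check that the client really asked the End Point Mapper for the Referral interface and not for the store or directory interface.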

Right after the End Point Mapper response in the trace, the following resets caught my attention. You can see that the resets happened on the Referral service’s port:

 11 15:29:45.053235 192.168.1.5 192.168.1.135 TCP 3300 > 8500 [SYN] Seq=0 Win=65535 Len=0 MSS=1260 --> Client initiates TCP 3 way handshake to port 8500 on the CAS array’s NLB IP

12 15:29:45.053536 192.168.1.135 192.168.1.5 TCP 8500 > 3300 [RST, ACK] Seq=1 Ack=1 Win=0 Len=0 --> Gets a RST on port 8500
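The SYN followed by RST, ACK pattern can be reproduced from a client without a packet capture. Here is a small sketch using only Python's standard socket module; the function name probe_tcp_port and the result labels are my own, not part of any Exchange or Outlook tooling.

```python
import socket


def probe_tcp_port(host, port, timeout=3.0):
    """Attempt a TCP handshake and classify the outcome.

    'open'    : the three-way handshake completed
    'refused' : the peer answered the SYN with a RST
    'timeout' : nothing came back within the timeout
    'error'   : any other socket-level failure
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "refused"
    except socket.timeout:
        return "timeout"
    except OSError:
        return "error"


if __name__ == "__main__":
    # 192.168.1.135 is the CAS array NLB IP seen in the trace above.
    for port in (8500, 59533, 59534):
        print(port, probe_tcp_port("192.168.1.135", port))
```

A result of 'refused' for the port handed out by the End Point Mapper corresponds to the reset seen in frame 12.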

Going back to verify the static port mapping on the customer’s CAS servers, we found the following configuration:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\MSExchangeRpc\ParametersSystem - TCP/IP Port (REG_DWORD), value set to 59533

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\MSExchangeAB\Parameters - RpcTcpPort (REG_DWORD), value set to 59534

From the Wireshark trace, I found that the client was able to establish store connections correctly on port 59533, as defined and desired, but for the Referral service the server returned 8500 as the listening port instead of the defined port 59534. On a closer look, I found that the registry value type for the Address Book service port definition was incorrect: RpcTcpPort should be a REG_SZ value, not a REG_DWORD. Reference - https://msexchangeteam.com/archive/2010/09/01/456094.aspx . We corrected the registry configuration as follows on both CAS servers and rebooted the servers.

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\MSExchangeAB\Parameters - RpcTcpPort (REG_SZ), value set to 59534
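A wrong value type like this is easy to miss by eye, so a scripted check can help. The sketch below is illustrative only: check_ab_static_port and read_ab_static_port are hypothetical helper names, the validation rule is simply the REG_SZ requirement described above, and the reader uses Python's standard winreg module, which exists only on Windows.

```python
# Numeric type codes, identical to winreg.REG_SZ and winreg.REG_DWORD.
REG_SZ = 1
REG_DWORD = 4

AB_KEY_PATH = r"SYSTEM\CurrentControlSet\services\MSExchangeAB\Parameters"


def check_ab_static_port(value, reg_type):
    """Return a list of problems found with an RpcTcpPort registry value."""
    problems = []
    if reg_type != REG_SZ:
        problems.append("RpcTcpPort must be REG_SZ")
    try:
        port = int(value)
    except (TypeError, ValueError):
        problems.append("RpcTcpPort is not a number")
    else:
        if not 1 <= port <= 65535:
            problems.append("RpcTcpPort is outside 1-65535")
    return problems


def read_ab_static_port():
    """Read (value, type) for RpcTcpPort from the local registry (Windows only)."""
    import winreg  # imported here so the checker above works on any OS

    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, AB_KEY_PATH) as key:
        return winreg.QueryValueEx(key, "RpcTcpPort")


if __name__ == "__main__":
    value, reg_type = read_ab_static_port()
    print(check_ab_static_port(value, reg_type) or "RpcTcpPort looks fine")
```

Run on each CAS server, the customer's original REG_DWORD setting would have been flagged, while the corrected REG_SZ value passes.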

After the restart, the clients never hit the error prompts while accessing additional calendars. If you have been paying careful attention, you might be wondering: even if the server returned the Address Book service’s port as 8500 instead of 59534, why weren’t the clients able to connect to port 8500 on the NLB IP? That’s a valid question, and I have an answer for it. In the customer’s NLB setup, the defined port rules allowed only the following ports:

25 

80 

135 

443 

59533-59534 

As you can see, port 8500 is not defined there, and therefore the NLB didn’t allow clients to connect to that port. That explains why the resets happened when clients tried connecting to port 8500. By the way, we don’t recommend WNLB for Exchange 2010 CAS load balancing for the reasons mentioned here. I won’t get into those details, as that’s beyond the scope of this article.
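The port-rule reasoning can be expressed as a tiny check. This is only a model of the customer's rule list as (first port, last port) ranges, not a query against NLB itself; the names NLB_PORT_RULES and is_port_allowed are made up for the example.

```python
# The customer's NLB port rules, written as inclusive (first, last) ranges.
NLB_PORT_RULES = [(25, 25), (80, 80), (135, 135), (443, 443), (59533, 59534)]


def is_port_allowed(port, rules=NLB_PORT_RULES):
    """True if the port falls inside at least one allowed range."""
    return any(first <= port <= last for first, last in rules)


if __name__ == "__main__":
    for port in (135, 8500, 59533, 59534):
        state = "allowed" if is_port_allowed(port) else "blocked"
        print(f"{port}: {state}")
```

With these rules, 135, 59533 and 59534 pass while 8500 does not, which matches what the trace showed: the End Point Mapper was reachable, but the port it handed out was not.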

As I finish this article, I am reminded of another case I worked earlier with the same error prompts. There it was caused by a 0x1c00001a networking issue: their hardware load balancer closed idle TCP connections after 6 minutes. We resolved that case by lowering the TCP KeepAliveTime on their Exchange servers to 5 minutes. So, be wary of networking problems causing these error symptoms.
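The arithmetic behind that fix is worth spelling out. The sketch below assumes that "TCP KeepAliveTime" refers to the Windows Tcpip\Parameters KeepAliveTime registry value, which is expressed in milliseconds and defaults to two hours; the helper names are hypothetical.

```python
# Assumption: KeepAliveTime is the Windows TCP/IP registry value, in
# milliseconds, with a default of 7,200,000 ms (2 hours).
DEFAULT_KEEPALIVE_MS = 7_200_000


def minutes_to_ms(minutes):
    """Convert minutes to the millisecond unit used by KeepAliveTime."""
    return int(minutes * 60 * 1000)


def keepalive_beats_idle_timeout(keepalive_ms, idle_timeout_ms):
    """True if a keep-alive probe is sent before the idle timeout expires."""
    return keepalive_ms < idle_timeout_ms


if __name__ == "__main__":
    lb_idle_ms = minutes_to_ms(6)  # load balancer closes idle sessions at 6 min
    for label, value in (("default", DEFAULT_KEEPALIVE_MS),
                         ("5 minutes", minutes_to_ms(5))):
        ok = keepalive_beats_idle_timeout(value, lb_idle_ms)
        print(f"KeepAliveTime {label} ({value} ms): "
              f"{'survives' if ok else 'dropped by load balancer'}")
```

The point of the comparison is simply that the keep-alive interval has to be shorter than the load balancer's idle timeout, which 5 minutes is and the 2-hour default is not.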

To sum up, we covered the following possible causes of the Outlook 2003 prompts while accessing Exchange 2010 shared calendars:

    • RPC throttling
    • Outlook 2003 shared calendars cache holding stale information
    • Incorrect static port mapping in combination with load balancer port rules
    • TCP session tear down - 0x1c00001a 

 

Thank You

Kovai J