WAS hosted WCF service stops processing MSMQ messages

IIS 7.0 introduces the WAS (Windows Process Activation Service) hosting architecture to support non-HTTP protocols such as named pipes, TCP, and MSMQ. Each non-HTTP protocol has its own Windows service that acts as the protocol listener and adapter. For MSMQ, that Windows service is the Net.Msmq Listener Adapter. After you set up the necessary configuration as described in MSMQ, WCF and IIS: Getting them to play nice (Part 1), (Part 2), your WAS hosted WCF service should read and process each message over the NetMsmqBinding transport once the message arrives in the queue. The queue can be local to the WCF service, installed on a remote standalone machine, or clustered for high availability on a Windows cluster.

The scenario works most of the time. It stops working only after a remote MSMQ service restarts or a clustered MSMQ resource fails over. In either case, the WCF service can no longer read messages off the queue, and incoming messages sit in the queue indefinitely. To restore normal service, you must do one of the following: restart WAS in the Services MMC snap-in, reset IIS, recycle the app pool that hosts the WCF service, or, if the WCF service has HTTP enabled, browse to its .svc file, which recreates the service host. If you have WCF tracing enabled, you may see these exception messages in the service’s WCF trace file:

<TraceIdentifier>https://msdn.microsoft.com/en-US/library/System.ServiceModel.CommunicationObjectOpenFailed.aspx</TraceIdentifier>
<Description>Failed to open System.ServiceModel.ServiceHost</Description>

<TraceIdentifier>https://msdn.microsoft.com/en-US/library/System.ServiceModel.CommunicationObjectFaulted.aspx</TraceIdentifier>
<Description>Faulted System.ServiceModel.ServiceHost</Description>

<TraceIdentifier>https://msdn.microsoft.com/en-US/library/System.ServiceModel.ServiceHostFaulted.aspx</TraceIdentifier>
<Description>ServiceHost faulted.</Description>

The root cause is a fault in the WAS-activated service after MSMQ restarts. When MSMQ restarts, it faults the listener channel and, with it, the service host that WAS activated. The faulted service host is then aborted. Microsoft has confirmed that this is a bug in .NET Framework 4.0 and earlier and has fixed it in .NET Framework 4.5. You cannot reproduce the issue when the queue is local, because the Net.Msmq Listener Adapter service depends on the MSMQ service on the same machine, so restarting MSMQ also restarts the listener adapter.

To assist our customers, we have written an external program. This console app continuously calls MessageQueue.Peek on the queue of interest. When the MSMQ service for the queue stops, it handles the resulting MessageQueueException by recycling the specified app pool or by sending an HTTP request to the service’s .svc file. If you encounter this problem on .NET Framework 4.0 or earlier, we suggest that you first try these two workarounds manually and then follow the samples to automate the one you choose.
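The monitoring loop described above can be sketched in a language-neutral way. The following Python sketch is not the attached sample code: `QueueUnavailable`, `monitor`, `peek`, and `recover` are hypothetical names, with `peek` standing in for a call such as MessageQueue.Peek and `recover` standing in for the app pool recycle or the HTTP request. It also assumes one particular policy, namely running the recovery action once per outage, after the queue becomes reachable again; the attached samples may react at a different point.

```python
import time
from typing import Callable, Optional


class QueueUnavailable(Exception):
    """Hypothetical stand-in for the MessageQueueException raised when the
    MSMQ service that owns the queue is stopped or failing over."""


def monitor(peek: Callable[[], None],
            recover: Callable[[], None],
            poll_seconds: float = 5.0,
            max_cycles: Optional[int] = None,
            sleep: Callable[[float], None] = time.sleep) -> int:
    """Poll the queue and run `recover` once after each outage ends.

    `peek` must raise QueueUnavailable while the queue cannot be reached.
    Returns the number of recoveries performed (useful for testing).
    """
    recoveries = 0
    outage = False
    cycles = 0
    while max_cycles is None or cycles < max_cycles:
        cycles += 1
        try:
            peek()
            if outage:
                # The queue is reachable again: recreate the service host.
                recover()
                recoveries += 1
                outage = False
        except QueueUnavailable:
            # Remember the outage; act only once the queue is back.
            outage = True
        sleep(poll_seconds)
    return recoveries
```

In a .NET implementation, a Peek call with a timeout on an empty queue also raises MessageQueueException (with the IOTimeout error code), so the real `peek` wrapper would need to treat that case as "queue healthy" rather than as an outage.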

You can run the workaround program on a machine other than the server that hosts the WCF service. The only requirements are that the program can access the queue and can either recycle the app pool or send an HTTP request to the service’s .svc file. In production, you should build the program as a Windows service.

Two Visual Studio 2010 sample solutions are attached to the article. No executable is included.

An example command for AppPoolMonitor-recyclePool:

AppPoolMonitor.exe /queuePath="DIRECT=OS:clus1msmq\private$\servicemodelsamples/service.svc"
/appPool="DefaultAppPool"

This recycles “DefaultAppPool” after the MessageQueueException is detected (that is, when MSMQ restarts or fails over). Note that it may take up to 10 minutes after the app pool recycle for messages to be picked up again. This is by design: the net.msmq listener adapter polls the queues every 10 minutes.

An example command for AppPoolMonitor-sendRequest:

AppPoolMonitor.exe /queuePath="DIRECT=OS:clus1msmq\private$\servicemodelsamples/service.svc"
/wcfSvcUrl=https://WCFHostServer/servicemodelsamples/service.svc

This requests the help page “https://WCFHostServer/servicemodelsamples/service.svc” after the MessageQueueException is detected (that is, when MSMQ restarts or fails over). For this to work, the HTTP protocol must be enabled for the WCF service.
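The request itself is an ordinary HTTP GET against the .svc address. As a rough illustration only (not the code in the attached sample), the Python sketch below uses the standard library; `touch_service` is a hypothetical name, and the 30-second default timeout is an assumption chosen for the example.

```python
import urllib.error
import urllib.request
from typing import Optional


def touch_service(url: str, timeout: float = 30.0) -> Optional[int]:
    """Send a GET request to the service's .svc URL.

    Returns the HTTP status code, or None if the server cannot be reached.
    Any HTTP response shows that the request reached the web server.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404 or 500.
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar problems.
        return None
```

A caller could invoke `touch_service("https://WCFHostServer/servicemodelsamples/service.svc")` after an outage is detected and log a warning if the result is `None` or an error status.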

Disclaimer
The workaround code is provided as a sample. The sample code is not supported under any Microsoft standard support program or service. It is provided AS IS without warranty of any kind.

Sample 1: AppPoolMonitor-recyclePool.zip

Sample 2: AppPoolMonitor-sendRequest.zip