APIM: Intermittent ClientConnectionFailure at forward-request

Eric D 11 Reputation points
2022-02-23T17:57:26.957+00:00

Hello all! Starting on 2022-02-10, something appears to have changed behind the scenes of our Azure infrastructure, and we now see hundreds of intermittent "ClientConnectionFailure at forward-request" errors per day. We could use some help tracking it down.

Architecturally, we have an ASP.NET web app calling an APIM instance that uses Function Apps to provide a microservice-ish environment. We use the APIM Consumption pricing tier, so we don't have access to virtual networks (VNets); all network traffic therefore goes over DNS and public FQDNs for all our resources.

Here are some facts:

  1. We hadn't deployed or changed anything for days.
  2. The volume of API calls hasn't changed significantly.
  3. This solution had been solid for months before 2022-02-10.
  4. The vast majority of failures point to a particular function inside a particular Function App
  5. The Function App replies in < 250 ms on average for the function in question
  6. The Function App failure rate is tiny compared to the APIM failure rate. In other words, while APIM shows 140 failures per day, the Function App shows 12, and those 12 failures are only 3% of the overall requests to the particular function.
  7. The Function App errors are network-related: An operation was attempted on a nonexistent network connection. (0x800704CD).
  8. The function's payload is tiny (just a phone number in JSON); this is not a large-upload scenario like most of the community questions about the "nonexistent network" errors.
  9. Started happening on 2022-02-10
  10. Hadn't deployed or changed anything for days
  11. No useful logs in Kudu's EventLog.xml
  12. APIM Pricing tier: Consumption, so very little visibility for troubleshooting
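
To put item 6 in perspective, here is a rough back-of-the-envelope calculation. It is only a sketch: it assumes the 140 APIM failures and the 12 Function App failures cover the same day and the same function, and that every request to the function goes through APIM. The variable names are illustrative.

```python
# Back-of-the-envelope numbers from item 6.
# Assumption: both counts cover the same day and the same function.
function_failures_per_day = 12
function_failure_share = 0.03  # the 12 failures are 3% of requests to the function
apim_failures_per_day = 140

# Implied daily request volume for the function.
requests_per_day = function_failures_per_day / function_failure_share

# APIM failures that never show up as failures in the Function App.
unexplained_failures = apim_failures_per_day - function_failures_per_day

# Share of requests that APIM reports as failed, if all requests pass through APIM.
apim_failure_share = apim_failures_per_day / requests_per_day

print(round(requests_per_day), unexplained_failures, f"{apim_failure_share:.0%}")
```

If those assumptions hold, most of the APIM failures have no matching failure on the Function App side, which is why we suspect the connection between the client and APIM rather than the function itself.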

Here is what we tried:

  1. Redeployed the Web App that consumes the failing APIM API, since ClientConnectionFailure is, after all, a client connection failure.
  2. Scaled up the Web App to change hosts. This is a known practice for hosted serverless systems that misbehave.
  3. Scaled up the Function App to change hosts.
  4. Validated our Function App is using HTTP 1.1. Reference: https://windows-hexerror.linestarve.com/q/so64045286-uploading-to-azure-webapp-throws-connectionresetexception-the-client-has-disconnected
  5. Validated our Function App ignores Client Certificates. Reference: https://hahndorf.eu/blog/dupaspnetrequest.html
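
A further check along these lines, which we have not run yet, would be to call the API from a machine outside the Web App and tally the outcomes, to see whether the failures follow the client. The sketch below is only illustrative: `probe` and `make_sender` are hypothetical helper names, and the URL, header, and payload in the usage comment are placeholders rather than our real values.

```python
import json
import time
import urllib.request
from collections import Counter
from typing import Callable, Dict, Optional


def make_sender(url: str, payload: dict, headers: Optional[Dict[str, str]] = None,
                timeout_s: float = 10.0) -> Callable[[], int]:
    """Build a function that POSTs the JSON payload once and returns the HTTP status."""
    body = json.dumps(payload).encode("utf-8")
    all_headers = {"Content-Type": "application/json", **(headers or {})}

    def send() -> int:
        req = urllib.request.Request(url, data=body, headers=all_headers, method="POST")
        with urllib.request.urlopen(req, timeout=timeout_s) as resp:
            return resp.status

    return send


def probe(send: Callable[[], int], attempts: int, delay_s: float = 0.0) -> Counter:
    """Call send() repeatedly and tally results by status code or exception type."""
    outcomes: Counter = Counter()
    for _ in range(attempts):
        try:
            outcomes[str(send())] += 1
        except Exception as exc:  # HTTP errors, resets, and timeouts all land here
            outcomes[type(exc).__name__] += 1
        time.sleep(delay_s)
    return outcomes


# Example usage (placeholder URL, key, and payload; not executed here):
# sender = make_sender("https://example.azure-api.net/placeholder",
#                      {"phone": "555-0100"},
#                      headers={"Ocp-Apim-Subscription-Key": "<key>"})
# print(probe(sender, attempts=200, delay_s=0.5))
```

Comparing the tally from an outside machine with the failure counts APIM reports for the same window should show whether the errors depend on where the client runs.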

Help!!
