Intermittent SSL failures in Azure Front Door

Joe 0 Reputation points
2024-03-11T15:27:51.8366667+00:00

Hello,

We are seeing intermittent connection problems from requests coming from East Asia, East US & West US regions.

Error: read ECONNRESET Stack: Error: read ECONNRESET at TLSWrap.onStreamRead (node:internal/stream_base_commons:217:20) at TLSWrap.callbackTrampoline (node:internal/async_hooks:128:17)

It seems to be caused by sporadic SSL cert unavailability. SSL certs are managed by Azure Frontdoor, and made available at the edge nodes of the FD network.

Failures can be seen on other common/popular domains so it is localised to ours

CONNECTED(00000003)
write:errno=104
---
no peer certificate available
---
No client certificate CA names sent
---
SSL handshake has read 0 bytes and written 321 bytes
Verification: OK
---
New, (NONE), Cipher is (NONE)
Secure Renegotiation IS NOT supported
Compression: NONE
Expansion: NONE
No ALPN negotiated
Early data was not sent
Verify return code: 0 (ok)
—

If you retry a couple of times you will get a response which sounds similar to an issue on this thread. We see no evidence of the request reaching Azure FrontDoor when looking at standard FrontDoor metrics. We typically see these errors from 3rd party requests outside of Azure but we can replicate the failure from Azure nodes in the affected regions.

We typically see 2 types of responses:

  • Command runs, the certificates are displayed but then it is stuck for 30-60s and we get read:errno=104. If we rerun the same command again multiple times it may complete successfully. This is the most common error. 
  • Command runs but no certificates are found we get error write:errno=104.

When this happens, we see traffic dropping to almost 0 on Application Gateway in the affected region for 10-15 minutes. 

Testing

When the error occurs, we cannot collect HAR traces because the connection is not established. To get HAR we need to reach the application layer and that does not happen.

We do use curl, wget & openssl to test. We run these on Azure AKS nodes in the region so there should be very little or no external traffic involved.

Curl response

curl: (35) OpenSSL SSL_connect: Connection reset by peer in connection

We have some theories as to the cause but no definite idea or possible solutions.

Any assistance of feedback would be greatly appreciated.

Azure Front Door
Azure Front Door
An Azure service that provides a cloud content delivery network with threat protection.
576 questions
{count} votes