Hello @Don Wise,
I understand that you are receiving intermittent 503 errors on Azure Front Door with the code "OriginInvalidResponse".
Intermittent 503 errors with "ErrorInfo: OriginInvalidResponse" are most often caused by a backend whose KeepAlive timeout is less than 90 seconds. When the origin has a lower idle timeout than AFD's, the 503 errors are random and low volume.
They occur when your backend closes a kept-alive HTTP connection at the very moment AFD reuses that connection for a new request.
Let's say your backend has an HTTP keepalive timeout of less than 90 seconds. Azure Front Door reuses connections to improve performance, so when a connection is created to handle one request, that TCP connection is kept open for reuse (HTTP keepalive). AFD has an idle keepalive timeout of 90 seconds. If your origin times out and disconnects sooner than 90 seconds, a race condition can result in this error. Specifically, AFD may reuse a connection and send a new HTTP request right at the moment when the origin times out and sends a TCP FIN. AFD interprets receiving a TCP FIN after sending a new request as an invalid response, hence the error.
Unfortunately, this 90-second idle timeout is not configurable on the AFD side.
To fix this issue, your backend should hold already used connections open for at least 91 seconds, so that AFD can reuse them for subsequent requests (if any). In other words, the keepalive timeout on the backend must be more than 90 seconds.
I asked you to validate your backend configuration and to enable keepalive on your backend with a timeout of more than 90 seconds.
You have a custom Linux VM in your backend, and it is configurable. So, you set the "KeepAliveTimeout" to 92 seconds in the httpd.conf configuration file.
Refer: https://httpd.apache.org/docs/2.4/mod/core.html#keepalive
The httpd.conf file of your Apache server is now configured as below:
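A minimal sketch of the relevant directives, assuming a stock Apache 2.4 httpd.conf. Only the 92-second "KeepAliveTimeout" comes from this thread; "KeepAlive On" and "MaxKeepAliveRequests 100" are Apache's defaults and are shown here only for illustration, so your actual file may differ:

```apache
# Allow persistent (keepalive) connections. This is Apache's default,
# shown explicitly here as an assumption about the server's setup.
KeepAlive On

# Illustrative value (Apache's default), not taken from the original file.
MaxKeepAliveRequests 100

# Hold idle connections open longer than AFD's 90-second idle timeout.
# Apache's default is 5 seconds, which is well below that threshold.
KeepAliveTimeout 92
```

After editing the file, Apache has to be restarted or reloaded for the new timeout to take effect.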
You waited for 2 days to observe the behavior and have confirmed that the issue is now fixed for you.
Kindly let us know if the above helps or you need further assistance on this issue.
Please "Accept the answer" if the information helped you. This will help us and others in the community as well.