Hi,
We have an Application Gateway (WAF v2) in front of two web servers in a backend pool, with a basic health probe. Our traffic is light, so the main reason for running multiple web servers is redundancy in case one of them fails.
But we are not happy that this can still result in error pages for the end user when one of the servers fails. I just tested this in our staging environment by simply deleting the VM of one of the servers. I expected the website to stay up, since the Gateway still has the other web server to send traffic to. Instead, the page was stuck loading for a long time and then showed an ugly gateway error.
The way I see it, the only time a 5xx error or a timeout should ever be visible to the external user is when none of the backend targets can serve a valid response. How can we achieve that? Is that even possible with an Application Gateway?
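To make the behaviour I am after concrete, here is a minimal Python sketch. It is only an illustration of the policy, not something I know Application Gateway to do, and the names `serve`, `deleted_vm` and `healthy_vm` are made up for the example. Each backend is modelled as a function that returns an HTTP status code or raises on a timeout or connection failure.

```python
from typing import Callable, Sequence

# Hypothetical stand-in for a backend: returns an HTTP status code,
# or raises TimeoutError / ConnectionError when it cannot be reached.
Backend = Callable[[], int]


def serve(backends: Sequence[Backend]) -> int:
    """Return the first non-5xx status; return 502 only if every backend fails."""
    for backend in backends:
        try:
            status = backend()
        except (TimeoutError, ConnectionError):
            continue  # this backend is unreachable, try the next one
        if status < 500:
            return status  # a valid response, the user never sees the failure
    return 502  # every backend failed, so only now does the user see an error


def deleted_vm() -> int:
    # Simulates the server whose VM was deleted.
    raise TimeoutError("backend did not answer")


def healthy_vm() -> int:
    # Simulates the remaining, working server.
    return 200


print(serve([deleted_vm, healthy_vm]))
```

With one dead and one healthy backend, the request is still answered by the healthy one. The error status is only returned when the whole pool fails.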
If this is not possible, then how can one ever handle unplanned web server outages with an Application Gateway? Are we just supposed to accept that some people will see an error page until the gateway has seen enough health probe failures to mark the server as unhealthy? That would be insane.
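For a rough sense of how long that window could be, here is a back-of-the-envelope calculation in Python. The 30-second interval and the threshold of 3 failures are assumptions for illustration (check the actual probe settings), and the formula ignores the probe timeout and when exactly the failure happens relative to the last probe.

```python
def exposure_window_seconds(interval_s: int, unhealthy_threshold: int) -> int:
    """Rough time until a dead backend is marked unhealthy.

    Assumes the gateway needs `unhealthy_threshold` consecutive failed probes,
    sent `interval_s` seconds apart. Probe timeout is deliberately ignored.
    """
    return interval_s * unhealthy_threshold


# Assumed example values: probe every 30 s, unhealthy after 3 failures.
print(exposure_window_seconds(30, 3))  # 90
```

Under those assumed settings, roughly half of all requests could fail for about a minute and a half with two servers in the pool, which is the situation I would like to avoid.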