@Anonymous Glad to know the issue was resolved, and thank you for letting me know. I have summarized the troubleshooting steps we followed above in the answer below. It would be helpful if you could mark it as accepted so that anyone in the community facing a similar issue can find it.
Issue Summary: You have two app services with the same custom domain behind Azure Front Door. When you stopped the main app service, Front Door showed a 403 error instead of routing traffic to the second app service.
Troubleshooting steps tried:
- Checked the health probe setup to determine whether there were any issues.
- Enabled diagnostic logging for your Front Door service and checked for health probe issues by running the query below in the Log Analytics workspace.
AzureDiagnostics | where Category == "FrontDoorHealthProbeLog"
By running this query you confirmed that the health probes were getting 302 responses. For Azure Front Door health probes, a 200 OK status code indicates that the backend is healthy. Everything else is considered a failure. If for any reason (including network failure) a valid HTTP response isn't received for a probe, the probe is counted as a failure. If health probes fail for every backend in a backend pool, Front Door considers all backends unhealthy and routes traffic in a round-robin distribution across all of them. This behavior is described in the Azure Front Door health probe documentation. It likely explains the observation in your question: because both backends were failing their probes, Front Door kept sending some requests to the stopped app service, which returns a 403. Can you please check why the backend web apps return a 302 response for your health probe path? It might help to modify the health probe path set above.
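To check by hand what a backend returns on the probe path, a small script can send one request without following redirects. The sketch below is only an illustration using the Python standard library; the function names `probe` and `is_probe_healthy` are hypothetical, and it approximates the 200-only rule described above rather than reproducing Front Door's actual probe.

```python
import http.client
from urllib.parse import urlsplit


def is_probe_healthy(status_code: int) -> bool:
    """Mirror the rule above: only 200 OK counts as healthy."""
    return status_code == 200


def probe(url: str, timeout: float = 5.0):
    """Send one GET without following redirects.

    Returns (status code, Location header or None).
    """
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        # http.client never follows redirects, so a 302 is reported as a 302.
        conn.request("GET", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()
```

For example, calling `probe("https://<your-app>.azurewebsites.net/")` (placeholder hostname) against each backend and passing the status to `is_probe_healthy` shows whether the path redirects, and the `Location` value shows where to.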
As per the best practice documentation, it's usually a good idea to monitor a webpage or location that you specifically design for health monitoring (for example, /healthstatus). Your application logic can consider the status of all of the critical components required to serve production traffic, including application servers, databases, and caches. That way, if any component fails, Front Door can route your traffic to another instance of your service. Setting the probe to a path like /healthstatus gives you more granular control over the health probe response, because you can apply checks for your internal components and shape the probe response as required. Just adding a crude sample /healthstatus method below for reference.
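A minimal sketch of what such a /healthstatus endpoint could look like, assuming a Python app and using only the standard library. The component checks (`check_database`, `check_cache`) are hypothetical placeholders, and returning 503 on failure is one reasonable choice, since any non-200 response is treated as unhealthy.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def check_database() -> bool:
    # Hypothetical placeholder: replace with a real connectivity check.
    return True


def check_cache() -> bool:
    # Hypothetical placeholder: replace with a real cache ping.
    return True


CHECKS = {"database": check_database, "cache": check_cache}


def health_status(checks=CHECKS):
    """Run every check; return (200, results) only if all of them pass."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            # A check that raises counts as a failed component.
            results[name] = False
    code = 200 if all(results.values()) else 503
    return code, results


class HealthHandler(BaseHTTPRequestHandler):
    def _respond(self, send_body: bool):
        if self.path != "/healthstatus":
            self.send_error(404)
            return
        code, results = health_status()
        body = json.dumps(results).encode("utf-8")
        self.send_response(code)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        if send_body:
            self.wfile.write(body)

    def do_GET(self):
        self._respond(send_body=True)

    def do_HEAD(self):
        # Answer HEAD probes with the same status, without a body.
        self._respond(send_body=False)


def serve(port: int = 8000):
    """Start the server; blocks until interrupted."""
    HTTPServer(("", port), HealthHandler).serve_forever()
```

The important detail is that the endpoint answers directly with 200 and never redirects (for example to a login page or to HTTPS), since a 302 would be counted as a probe failure.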
Reference: https://learn.microsoft.com/en-us/azure/architecture/patterns/health-endpoint-monitoring
Solution: After modifying the health probe path, the issue was resolved. Thank you! It would be helpful if you could mark this answer as accepted so that anyone in the community facing a similar issue can find it.