@Jason Hargrave Since you have health checks enabled, it appears the system is detecting issues with some of your instances. To manage and route traffic more effectively during these failures, consider using Azure Traffic Manager or Azure Front Door.
You can deploy Azure Front Door or Azure Traffic Manager to intercept traffic before it reaches your site. Both route and distribute traffic between your instances or regions. If a catastrophic incident takes out one Azure datacenter, your app can keep serving requests from another region. There are additional benefits as well: incoming requests can be routed based on the customer's geography to give the shortest response time, and load is spread across your instances so that no single instance is overloaded.
Learn More
- Controlling Azure App Service traffic with Azure Traffic Manager
- Quickstart: Create a Front Door for a highly available global web application
You can also use Application Initialization, which ensures that your app instances have fully started before they begin serving requests. Application Initialization runs during site restarts, auto scaling, and manual scaling. It is critical when hitting the site's root path is not sufficient to start the application. In that case, create an unauthenticated warm-up path in the app and configure App Init to use that URL path.
Make sure the method behind the warm-up URL exercises the functions of all important routes and returns a response only when warm-up is complete. The site is put into production as soon as that URL returns a response (success or failure), at which point App Initialization assumes “everything is fine with the app”.
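As a rough illustration of that warm-up behavior, here is a minimal Python sketch. The function name `run_warmup` and the step names are hypothetical and not part of any App Service API; the point is only that the handler runs every step before it responds.

```python
from typing import Callable, Dict, Tuple


def run_warmup(steps: Dict[str, Callable[[], None]]) -> Tuple[int, str]:
    """Run every warm-up step in order and respond only after all have run.

    `steps` maps a hypothetical step name (for example "load-cache") to a
    callable that touches the code path behind an important route.
    """
    failures = []
    for name, step in steps.items():
        try:
            step()  # blocks until this part of the app is warm
        except Exception as exc:
            # Keep going so one failing step does not hide the others.
            failures.append(f"{name}: {exc}")

    # As noted above, App Initialization proceeds on any response, so the
    # status code is mainly useful for your own logs and diagnostics.
    if failures:
        return 500, "warm-up failed: " + "; ".join(failures)
    return 200, "warm-up complete"
```

You would call `run_warmup` from the handler of your unauthenticated warm-up route and send back the returned status and body.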
App Initialization can be configured for your app in the web.config file.
For more information, see: Robust Apps for the Cloud.
In response to the question, “Is it the process of one of the service instances becoming bad, and then switching over?”:
I suggest reviewing this article on using Health Check in the Azure portal to monitor App Service instances. Health checks improve your application's availability by rerouting requests away from unhealthy instances and replacing them if they remain unresponsive. The service pings your web application every minute on a path you define.
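To make the health check path concrete, here is a minimal sketch of an endpoint using only the Python standard library. The path `/healthz` and the `database_reachable` check are assumptions for illustration; use whatever path you configured for Health Check and whatever dependencies your app really needs. Health Check treats a response in the 200-299 range as healthy.

```python
from http.server import BaseHTTPRequestHandler
from typing import Callable, Iterable

HEALTH_PATH = "/healthz"  # hypothetical; must match the path set in Health Check


def health_status(checks: Iterable[Callable[[], bool]]) -> int:
    """Return 200 when every dependency check passes, otherwise 503."""
    try:
        return 200 if all(check() for check in checks) else 503
    except Exception:
        # A check that raises is treated the same as a failed check.
        return 503


def database_reachable() -> bool:
    """Placeholder check; replace with a real, fast dependency probe."""
    return True


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        if self.path == HEALTH_PATH:
            status = health_status([database_reachable])
        else:
            status = 404
        self.send_response(status)
        self.end_headers()
```

Keep the checks cheap, since the path is pinged on a regular interval on every instance.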
Also, it seems your application failed to respond to multiple HTTP health checks, which could mean it crashed, exited unexpectedly, or failed to expose the correct TCP port.
Azure App Service pings your container every second to ensure the HTTP server is running. If the container does not respond within the default 230-second startup limit, the start attempt times out.
To resolve this, you can increase the limit (in seconds) using the WEBSITES_CONTAINER_START_TIME_LIMIT app setting. Additionally, enabling application logs will help you identify any exceptions or errors that may provide clues to the underlying issue.
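If the cause turns out to be the port rather than slow startup, a sketch like the following may help. It assumes your platform passes the listening port in a `PORT` environment variable and that console output ends up in the application logs; the fallback value 8000 and the helper names are hypothetical, so check them against your own image and settings.

```python
import logging
import os
import sys
from typing import Mapping, Optional


def resolve_port(env: Optional[Mapping[str, str]] = None, default: int = 8000) -> int:
    """Pick the port to listen on from PORT, falling back to `default`."""
    env = os.environ if env is None else env
    try:
        port = int(env.get("PORT", ""))
    except ValueError:
        return default  # PORT missing or not a number
    return port if 0 < port < 65536 else default


def configure_logging() -> logging.Logger:
    """Send log records to stdout so they can show up in container logs."""
    logging.basicConfig(
        stream=sys.stdout,
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(message)s",
    )
    return logging.getLogger("app")
```

At startup, log the resolved port (for example `configure_logging().info("listening on %d", resolve_port())`) so the logs show which port the app actually bound to.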
For more information, refer to these articles:
- What's the difference between PORT and WEBSITES_PORT?
- Troubleshooting intermittent outbound connection errors in Azure App Service
Please let us know if you have any further questions.