Hi Tricky,
Welcome to the Q&A. Let me address, one at a time, your questions about Azure Front Door's failover and failback behavior when two origins are set up with different priorities.
Considering the following configuration:
- Origin A has a priority of 1 and a weight of 1000.
- Origin B has a priority of 2 and a weight of 1000.
- Health probes start at T = 0s, T = 30s, T = 60s, and so on, under the default 30-second probe interval.
How long does Front Door take to fail over to Origin B if Origin A goes down for some reason?
Azure Front Door monitors the health of origins through health probes, so the time to fail over to Origin B depends on the probe interval and timeout configuration. By default, health probes run every 30 seconds with a timeout of 5 seconds. An origin is marked unhealthy after three consecutive failed probes. With these default settings, failover to Origin B could take up to roughly 90 seconds (three probe intervals).
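The arithmetic behind that estimate can be sketched as follows. This is only a rough model built on the numbers in this answer (30-second interval, 5-second timeout, three consecutive failures); the function name `failover_window` is made up for illustration, and it assumes each failed probe takes the full timeout to fail.

```python
def failover_window(interval_s, timeout_s, failures_needed):
    """Return (best_case, worst_case) seconds from the origin going down
    until it is marked unhealthy, under the assumptions stated above."""
    # Best case: the origin goes down just before a probe is sent, so that
    # probe is the first failure and the last one starts
    # (failures_needed - 1) intervals later.
    best = (failures_needed - 1) * interval_s + timeout_s
    # Worst case: the origin goes down just after a probe succeeded, so
    # almost a full interval passes before the first failing probe.
    worst = failures_needed * interval_s + timeout_s
    return best, worst


best, worst = failover_window(interval_s=30, timeout_s=5, failures_needed=3)
print(f"Marked unhealthy between ~{best}s and ~{worst}s after the origin goes down")
```

Under these assumptions the window is about 65 to 95 seconds, which is in line with the rough 90-second figure. Real timings can differ, so treat this as an estimate to verify with a test.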
How long does Front Door take to fail back to Origin A once it is healthy again?
When Origin A recovers, Front Door detects this through the same health probe mechanism. It requires three consecutive successful probes before marking Origin A healthy and routing traffic back to it. Under the default settings, this might also take approximately 90 seconds.
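To illustrate the "three consecutive probes" rule for both failover and failback, here is a minimal sketch of a health tracker. It is a simplified model of the behavior described in this answer, not Front Door's actual implementation; `OriginHealth` and `replay` are hypothetical names, and the threshold of 3 is the assumption used above.

```python
from dataclasses import dataclass


@dataclass
class OriginHealth:
    threshold: int = 3    # consecutive results needed to flip state (assumed)
    healthy: bool = True  # current state of the origin
    streak: int = 0       # consecutive probes that disagree with the state

    def record(self, probe_ok: bool) -> bool:
        """Record one probe result and return the resulting health state."""
        if probe_ok == self.healthy:
            # The probe agrees with the current state, so reset the streak.
            self.streak = 0
        else:
            self.streak += 1
            if self.streak >= self.threshold:
                # Enough consecutive disagreeing probes: flip the state.
                self.healthy = probe_ok
                self.streak = 0
        return self.healthy


def replay(results, threshold=3):
    """Return the health state after each probe result in `results`."""
    tracker = OriginHealth(threshold=threshold)
    return [tracker.record(ok) for ok in results]


# Three failed probes, then three successful ones.
print(replay([False, False, False, True, True, True]))
```

In this model the origin flips to unhealthy on the third failed probe and back to healthy on the third successful one. With a 30-second interval, that is at least about 60 seconds after recovery, and closer to 90 seconds if the origin recovers just after a probe was sent.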
If failing back to Origin A takes a long time, what happens if Origin B goes down in the meantime? Will there be an outage in the application, or will Origin A pick up the traffic?
If Origin B becomes unhealthy before Origin A has been marked healthy, neither origin is considered healthy and there may be a temporary outage. Once Origin A passes its health probes and is marked healthy again, traffic fails back to it.
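The scenarios above can be expressed as a small priority-selection sketch. This is a simplified model that only considers priority and health state, as described in this answer; it ignores weights and latency, and `pick_origin` is a hypothetical helper, not a Front Door API.

```python
from typing import Optional


def pick_origin(origins: list) -> Optional[str]:
    """Return the name of the healthy origin with the lowest priority
    number, or None if no origin is currently marked healthy."""
    healthy = [o for o in origins if o["healthy"]]
    if not healthy:
        return None  # in this model, no healthy origin means no target
    return min(healthy, key=lambda o: o["priority"])["name"]


# Origin A is down but Origin B is healthy: traffic goes to B.
print(pick_origin([
    {"name": "A", "priority": 1, "healthy": False},
    {"name": "B", "priority": 2, "healthy": True},
]))

# B goes down before A has been marked healthy again: nothing to pick.
print(pick_origin([
    {"name": "A", "priority": 1, "healthy": False},
    {"name": "B", "priority": 2, "healthy": False},
]))
```

The second case is the gap you asked about: in this model it lasts until Origin A has passed enough probes to be marked healthy.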
Here is an additional note from Copilot about the timeout:
- Health probes are initiated strictly at the configured interval (e.g., T = 0s, T = 30s, T = 60s with the default settings), and no additional probes are sent within an interval. After the third consecutive failed probe, the origin is marked unhealthy immediately when that probe's timeout expires (at T = 65s, given the 5-second timeout).
Conclusion: It’s a good idea to run some tests to see how this really works and nail down the actual timings since things like network latency can change the results.
References:
- https://learn.microsoft.com/en-us/azure/frontdoor/health-probes
- https://learn.microsoft.com/en-us/azure/frontdoor/front-door-overview
- https://learn.microsoft.com/en-us/azure/frontdoor/troubleshoot-issues
- Copilot was used for the timeout details.
If the information helped address your question, please Accept the answer.
Luis