How to resolve intermittent 503 errors with high CPU utilization in Azure App Service during peak traffic spikes while maintaining SLA?

Joseph Kuria Njiraini 0 Reputation points
2025-05-30T20:34:11.4666667+00:00

We have a .NET 6 (ASP.NET Core) API hosted on Azure App Service (Premium P3v2 tier) that returns intermittent HTTP 503 errors during unpredictable traffic spikes (10x baseline). Application Insights shows CPU saturation (~95%) coinciding with the errors, but autoscale (configured on CPU > 70%) often lags behind demand.

Current Configuration:

  • Instances: 3 (min), 10 (max)
  • Scale-out: CPU > 70% for 5 minutes, +1 instance
  • ARR Affinity: Disabled
  • Health Check: /status (200 OK endpoint)
  • Database: Azure SQL (DTU 100, no throttling observed)
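To make the scaling lag concrete, here is a rough back-of-the-envelope sketch of the configuration above. It assumes CPU demand grows linearly with traffic, that the baseline CPU per instance is 25% (a made-up figure, not measured from this app), and that the scale-out rule uses a 5-minute cooldown (not stated for the original rule, so also an assumption). The function names are hypothetical, and instance startup time is ignored, so the time estimate is optimistic.

```python
import math


def instances_needed(baseline_instances, baseline_cpu, spike_factor, target_cpu):
    """Instances required to keep average CPU at or below target_cpu.

    Assumes CPU demand scales linearly with traffic (an assumption,
    not something the original post confirms).
    """
    # Total demand expressed in "fully busy instance" equivalents.
    demand = baseline_instances * baseline_cpu * spike_factor
    return math.ceil(demand / target_cpu)


def minutes_to_reach(start, target, step=1, window_min=5, cooldown_min=5):
    """Optimistic time for a '+step instance' rule to grow from start to target.

    The first action fires after one evaluation window; each later action
    is assumed to fire as soon as the cooldown expires.
    """
    if target <= start:
        return 0
    actions = math.ceil((target - start) / step)
    return window_min + (actions - 1) * cooldown_min


if __name__ == "__main__":
    # Hypothetical baseline: 3 instances at 25% CPU, then a 10x spike.
    needed = instances_needed(3, 0.25, 10, 0.70)
    print("instances needed to stay under 70% CPU:", needed)
    print("minutes to scale from 3 to the max of 10:", minutes_to_reach(3, 10))
```

Under these assumed numbers the spike needs more instances than the configured maximum of 10, and a +1 rule takes over half an hour just to reach that maximum, which would be consistent with the 503s appearing before autoscale catches up.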

Attempted Fixes (No Success):

  1. Pre-warmed instances via Always On + startup tasks.
  2. Adjusted scale-out rules to trigger at 60% CPU with 3-minute cooldown (resulted in over-provisioning without eliminating 503s).
  3. Optimized code (reduced EF Core queries, added caching via Redis).

Hard Requirements:

  • Must maintain 99.95% SLA.
  • Cannot use ASE (App Service Environment) due to cost constraints.
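For reference, the 99.95% target translates into a small error budget. The sketch below is plain arithmetic; the 30-day period and the function name are assumptions chosen for illustration, and whether a burst of 503s counts as "downtime" depends on how the SLA is measured.

```python
def allowed_downtime_minutes(sla_percent, period_days=30):
    """Minutes of unavailability permitted by an availability target
    over a period of period_days (30 days assumed here)."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - sla_percent / 100)


if __name__ == "__main__":
    budget = allowed_downtime_minutes(99.95)
    print(f"99.95% over 30 days allows about {budget:.1f} minutes of downtime")
```

With these inputs the budget comes to roughly 21.6 minutes per 30 days, so a single spike that keeps instances saturated for longer than that would use up the whole month's allowance.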

Question:

What’s a deterministic strategy to eliminate 503s under these conditions? Are there hidden Azure quotas (e.g., SNAT, VMSS burst limits) or advanced scaling patterns (predictive, queue-based) that could resolve this? Provide low-latency solutions, not theoretical guidance.

# Intermittent 503s in Azure App Service During Traffic Spikes  

Problem:

- 503 errors during sudden 10x traffic spikes.
- CPU hits ~95% before scaling kicks in (P3v2 tier).
- Auto-scaling delay causes SLA risk.

Constraints:

- No ASE, must stay cost-efficient.
- 99.95% SLA non-negotiable.

Need:

- Actionable fixes (e.g., ARM template tweaks, scale rule hacks).
- Deep Azure infra insights (throttling, SNAT, VMSS quirks).

Azure App Service
