Azure App Service returns 503 The service is unavailable

Kusakabe Si 21 Reputation points
2021-11-06T20:16:05.51+00:00

These are my web apps:

  1. https://dn42usw.azurewebsites.net
  2. https://dn42br.azurewebsites.net
  3. https://dn42uk.azurewebsites.net/

I get these 3 errors randomly:

  1. 503 The service is unavailable.
  2. 503 Service Unavailable.
  3. 502 Web server received an invalid response while acting as a gateway or proxy server.

Everything seems fine in the portal. The app status is Running, and there is no obvious reason for the 503s. I tried restarting the app, but that did not resolve the issue.
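Since the errors appear at random, it can help to quantify how often each status code comes back. The following is a minimal sketch, not part of the original post: it polls the three URLs listed above and tallies the status codes. The function names `fetch_status` and `tally`, the number of rounds, and the delay are illustrative assumptions.

```python
import collections
import time
import urllib.error
import urllib.request

# URLs taken from the question above.
URLS = [
    "https://dn42usw.azurewebsites.net",
    "https://dn42br.azurewebsites.net",
    "https://dn42uk.azurewebsites.net/",
]


def fetch_status(url, timeout=10.0):
    """Return the HTTP status code for url, or None if no HTTP response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx/5xx responses; the code is still useful.
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, timeout, connection reset and similar transport errors.
        return None


def tally(urls, rounds=5, delay=1.0, fetch=fetch_status):
    """Poll every URL `rounds` times and count the status codes seen per URL."""
    counts = {url: collections.Counter() for url in urls}
    for i in range(rounds):
        for url in urls:
            counts[url][fetch(url)] += 1
        if i < rounds - 1:
            time.sleep(delay)
    return counts
```

Calling `tally(URLS)` returns one `Counter` per URL, so a node that mostly answers 503 stands out from one that only fails occasionally. The `fetch` parameter exists only so the polling logic can be exercised without network access.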

503 on SCM

I also tried to use WebSSH to check whether there is any problem, and I get a 503 there too, so I guess my app is not running in the background at all: https://dn42usw.scm.azurewebsites.net/webssh/host --> 503 The service is unavailable.

The service just stopped working suddenly:

147043-image.png

Some logs

Deployment center: 146960-image.png

Log stream: 147024-image.png


This case indicates the reason may be exhaustion of application resources, but I checked several times and I'm pretty sure no resource exceeds its limit.

Quotas are not exceeded 146944-image.png

Total traffic during the last month was 60 MB, which should not exceed the 5 GB/month limit. 147051-image.png

My question is: Why do I get 503 errors on my app?

If the reason is resource exhaustion, where can I find out which resource was exhausted?


====Update====

The web app works now. But may I know why it stopped working during the past 8 hours?

====Update 2====

The service goes down again.

====Update 3====

New discovery

These are 2 identical containers in different locations: 147135-image.png

The load average on the Brazil node is very high and very unstable. I get 503 and 502 errors from it very often.
My guess is that the problem is caused by differing load on the host machines.
The load on the physical machine in Brazil is very high, which is why I see a very high load average and why it crashes so often.

btw: this is the log of us west: 147120-image.png

====Update 4====

Suddenly all the unstable nodes (usw/uk/br) are running again: 147564-image.png

I still don't know the true reason for this problem.

====Update 5==== 2021/11/17

Thank you for all your assistance, I really appreciate your help in resolving the problem.

Now I fully understand that the LinuxFree SKU is not stable enough to run a production application. During our email conversation, you gave me two reasons for my problem:

  1. It occurred due to a normal movement of your site during routine service maintenance from an existing worker.
  2. This indicates that your application crashed, unexpectedly finished or didn't expose/listen to the correct TCP port.
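The second reason mentions an app that does not listen on the correct TCP port. As a rough illustration only, the sketch below shows a container process that reads its port from the environment and binds to all interfaces. It assumes the port arrives in a `PORT` environment variable and uses 8080 as a fallback; both are assumptions, since the thread does not say how these apps are configured. The names `resolve_port`, `Handler`, and `make_server` are hypothetical.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


def resolve_port(env=os.environ, default=8080):
    """Read the listening port from PORT (assumed variable name), else use default."""
    return int(env.get("PORT", default))


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Answer every GET with a small plain-text body so probes get a 200.
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port):
    # Bind to 0.0.0.0 rather than 127.0.0.1 so the server is reachable
    # from outside the container.
    return HTTPServer(("0.0.0.0", port), Handler)
```

Starting it with `make_server(resolve_port()).serve_forever()` keeps the process in the foreground. A server bound only to localhost, or to a port the platform does not expect, would look like a crashed app from the outside even though the process is running.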

But I have doubts about the two reasons:

  1. About this node: https://dn42sg.azurewebsites.net/ The error lasted 10~30 hours, not 10~30 minutes. Does the movement take that long? 150292-image.png
  2. Reason 2 indicates scenario 1, but all the metrics and logs in the Azure portal indicate that the error is scenario 2, not scenario 1: 150301-image.png
metrics:
  1. 150273-image.png
  2. 150254-image.png Data in, data out, and CPU time drop to 0, which could mean my app crashed. But the requests also drop to 0, which means my app didn't receive any requests, and that indicates scenario 2 is the only possible reason.
logs:

150313-image.png 150314-image.png 150219-image.png 150255-image.png

====Update 6==== 2021/11/20

Thanks for your help. Now I know that HttpStatus:503 with HttpSubStatus:65 means "We ran out of workers".
Too many people use that node, so there is no space left for the free SKU. Paid SKUs have higher priority, so they are not affected by this issue.
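To spot this case in log text, a small helper can pull out the status and substatus and look them up. This is a sketch under assumptions: it expects the `HttpStatus:503` / `HttpSubStatus:65` notation used in this update (the real log layout is only visible in the screenshots), and it maps only the one pair this thread confirms. The name `explain` is hypothetical.

```python
import re

# Only the 503/65 pair is confirmed by this thread; other pairs stay unmapped.
KNOWN = {(503, 65): "platform ran out of workers"}

_STATUS = re.compile(r"HttpStatus:\s*(\d+)")
_SUBSTATUS = re.compile(r"HttpSubStatus:\s*(\d+)")


def explain(line):
    """Return a description for the status/substatus pair in line, or None if absent."""
    status = _STATUS.search(line)
    substatus = _SUBSTATUS.search(line)
    if not status or not substatus:
        return None
    key = (int(status.group(1)), int(substatus.group(1)))
    return KNOWN.get(key, "unknown status/substatus pair")
```

For example, `explain("HttpStatus:503 HttpSubStatus:65")` returns the "ran out of workers" description, while a line without both fields returns `None`.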

Azure App Service

Accepted answer
  1. SnehaAgrawal-MSFT 14,966 Reputation points Microsoft Employee
    2021-11-16T08:14:18.22+00:00

    Update: All web apps are up and running fine.

    The web apps' current SKU is LinuxFree, and an Instance Allocations event was detected.

    Consider running a production application on a Standard, Premium, or Isolated App Service Plan for better performance and isolation. These SKUs are best for production workloads.

    Details on App Service plan

    Azure App Service Plan Pricing Information

    It is recommended to follow all the practices described in the article below and to ensure you are scaled to multiple workers, to avoid potential latency from these occurrences.

    https://azure.github.io/AppService/2020/05/15/Robust-Apps-for-the-cloud.html

    More details: Things You Should Know: Web Apps and Linux

    Hope this helps.

