Turns out this is due to a known issue, which reduces security posture and subsequently generates an Azure Advisor security warning of medium severity:
Swap Warmup Takes Same Time as Straight Deployment
This is a continuation of this conversation here:
I seem to be encountering warmup times after a swap that are the same as when I deploy directly to the environment: about 6-7 seconds. The expectation is that if the swap is doing a warmup (directly from the / root path), then the first request should be instant.
Note that no configuration is done with the swap warmup, so the ping path should be the root. This should be sufficient to incur the 6-7 second startup delay before the swap, but it doesn't appear to be doing so.
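For reference, the swap warmup ping is customized through two app settings, WEBSITE_SWAP_WARMUP_PING_PATH and WEBSITE_SWAP_WARMUP_PING_STATUSES. The sketch below only builds the setting values; the helper name and the "/warmup" path are hypothetical and not something used in this thread, and applying the settings to a slot is left out.

```python
def swap_warmup_settings(ping_path="/", statuses=(200,)):
    """Build the app settings that customize the swap warmup ping.

    ping_path: path to ping before the swap; it must begin with a slash.
    statuses:  HTTP status codes that count as a successful warmup.
    """
    if not ping_path.startswith("/"):
        raise ValueError("ping path must start with '/'")
    return {
        "WEBSITE_SWAP_WARMUP_PING_PATH": ping_path,
        # The setting takes a comma-separated list of codes, e.g. "200,202".
        "WEBSITE_SWAP_WARMUP_PING_STATUSES": ",".join(str(s) for s in statuses),
    }


# Hypothetical example: ping a dedicated warmup route and accept 200 or 202.
settings = swap_warmup_settings("/warmup", (200, 202))
```

With neither setting present, the ping goes to the root path, which is the situation described above.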
Tagging @Ryan Hill here... thank you very much for the conversation and investigation so far. It is greatly appreciated!
Hey @Mike-E-angelo , allow me to take a closer look. Please respond to the private comment down below.
Thank you for the continued investigation, @Ryan Hill . I very much appreciate you taking your time in assisting and helping me think about this and get acclimated with this particular problem. I have a million billion of these problems right now so it's been tough to attend to them all.
So thinking about this a little further (thanks to your good line of questioning), I think I have figured out what is going on... the primary (gateway) application is warmed up, but it's also dependent on another application, which is a simple ASP.NET Core application that functions as a SignalR server for all the applications in my system. I call this the "events" application. The events application also gets swapped but does not have a root site/page/location, so the swap does not really impact it at all.
What I think is occurring is that the gateway is actually warmed up and ready to go, but it's establishing a connection to the events server which was sent a warmup ping but doesn't do anything because nothing is at this location.
The next step here is to see if I can get the warmup process to properly initialize the SignalR processes so that when the gateway application makes its call, it's warmed up and ready to go.
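One way to picture that next step is a warmup handler on the gateway that touches each dependency before reporting success. This is only a sketch of the idea, not the approach eventually used in this thread: the function names and the events URL are hypothetical, and it assumes that any HTTP response from a dependency, even an error status, means its process has started.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=30):
    """Return the HTTP status for url; error statuses are returned, not raised."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # The server answered, just with an error code (for example 400).
        return error.code


def warm_up(dependency_urls, fetch=http_status):
    """Return 200 once every dependency has answered, or 503 if one is unreachable."""
    for url in dependency_urls:
        try:
            fetch(url)  # Any response at all counts as "process loaded".
        except OSError:
            # Connection refused, DNS failure, timeout and similar errors.
            return 503
    return 200


# Hypothetical usage inside the gateway's warmup route:
# status = warm_up(["https://events.example.invalid/"])
```

The status returned by warm_up would then be the response to the swap's warmup ping, so the swap only proceeds once the dependency has been reached.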
Wow, this might be due to the warmup ping path never actually getting called because warmup only supports HTTP and not HTTPS? :S
Alright @Ryan Hill I have spent a week wrestling with Azure and deployments to FINALLY get to try this out. So the good news, is that I am able to confirm 200 status codes with the warm up pings, but only if I do them from an MVC (non-Blazor) call. For some reason, when I ping a Blazor route it results in a 503 and an obscure error being thrown regarding FileLoadException on an assembly. That's one of my many problems at the moment. :P
The other note here is that the SignalR connection from my entry/gateway server only occurs when the user is authenticated. If I sign out and do a refresh after swap and/or I load the MVC warmup path, it is about a 2-second delay now. That seems a bit more aligned with expectations, but it still seems if it was warmed up the reply would be instant. I am still not convinced a full warmup is occurring here (but could be wrong).
The other issue is that you were indeed correct about application saturation. My testing environment (where I built from scratch) has 6 App Services now with 2 slots each, for a total of 12. On an S1 it takes about 20 minutes to deploy/swap them all. The CPU is chugging the entire time during this process. Upgrading to an S3 seems much nicer but is $300/mo :) I may settle for an S2 or go back up to P1V2.
Finally, the biggest issue right now is that when I swap my content server with an expected status reply of 200, it fails in the staging slot for the HostnameSyncPinger requests, each with a 503. Unfortunately, no errors are thrown and I am trying to track this down. The logging story for Azure is beyond overwhelming and I simply want to see where things are written out. My Activity Log is so slow and clunky that I despise using it. It takes forever for messages to load, and when they do, they never have any useful information. I was able to find Monitoring -> App Service Logs and that has been the most valuable, but it is still a really clunky experience, as I have to dive through Kudu to find these elusive pieces of information that should be the first thing seen when viewing the App Service.
Anyways, ranting aside, I am getting closer but there is still a ways to go here. I feel like there's another half week left of work to round out something that should be very simple to implement but is not.
Figured this one out. As a first step, what I am doing is deploying an app_offline.htm to all my applications. I should also clarify this is for the most disruptive scenario I can think of: one where data changes must occur to the system. In this case, all applications are stopped with an app_offline.htm applied, the database is updated, and then all the applications are started up again.
This results in a 503 when the pingers do their pinging with the app_offline.htm in place. ~~Although not ideal, I have added 503 as a valid WEBSITE_SWAP_WARMUP_PING_STATUSES. I would definitely appreciate any solutions that are better than this.~~ Never mind, this actually only occurs when the app_offline.htm is in the source slot. This never occurs in my deployments and I was running into this only because I was testing out the swap functionality. I am testing with prior deployments instead. :P
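To make the 503 failure concrete, here is a small model of how a ping response could be checked against WEBSITE_SWAP_WARMUP_PING_STATUSES. It is a sketch of the rule as I understand it from the documentation (a comma-separated list of accepted codes, with every code accepted when the setting is absent), not the platform's actual implementation, and the function names are made up.

```python
def parse_ping_statuses(value):
    """Parse a comma-separated status list such as "200,202" into a set of ints."""
    return {int(part) for part in value.split(",") if part.strip()}


def warmup_ping_accepted(status, statuses_setting=None):
    """Return True if a warmup ping response with this status would pass.

    Assumption: when the setting is not configured, any status passes.
    """
    if statuses_setting is None:
        return True
    return status in parse_ping_statuses(statuses_setting)


# With only 200 expected, the 503 served while app_offline.htm is present fails.
print(warmup_ping_accepted(503, "200"))      # False
print(warmup_ping_accepted(503, "200,503"))  # True
```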
Alright, apologies for the chatter here (but not really :)) as I mentioned this is about a week's worth of work sort of culminating in one day. The problem I am facing now is the Blazor-routed responses. I am seeing both 302s and 503s. Unfortunately, the FileLoadException is no longer occurring so that does not seem to be related here. I have been following the diagnostics prescribed here:
I have been able to get 503s from there, but nothing says exactly what is causing them. I am also seeing those 302s for HostnameSyncPinger in HTTP logs. There is no location provided so not sure where they are redirecting. I will get those logs and attach them to your private message @Ryan Hill .
The good news is that I further explored the MVC routes and they are WICKED fast now and work exactly like I would expect a warmed-up request to behave. It was so fast I thought the swap didn't happen :D But the staging slot took 16 seconds (!) to load, so things work there, but not in Blazor.
Finally, I got my SignalR warmed up as well. So, kicking some doors down, and the last one seems to be this Blazor routing issue.
Hi @Ryan Hill I never did get a reply back from you regarding this. Note that I still see swaps that result in over 10 seconds of startup time. Sometimes it's 4-5 seconds. But I just ran into one that is 11 seconds, and it's beyond impossible to know exactly what is going on with Azure's incredibly limited capabilities in this area. Any assistance in tracking this down would be greatly appreciated.
I'll follow up with the product group on this @Mike-E-angelo and will update you.
Awesome. Thank you for your time and assistance, @Ryan Hill !
Hi @Ryan Hill any update?
Well, I say "known" but no one was really doing anything about it until I created the issue to track. From over a year ago:
That's some nice sleuth detective work @Mike-E-angelo . I'll pass this feedback along to the product group.
Thanks @Ryan Hill funny enough I found this because I was trying to figure out a way to warmup a SignalR server. My SignalR server is just that, it does not have any special configurations for MVC or anything to host content as a webserver. However, I did attach it to the debugger process and while the "/" path returns 400 I do see that the process warms up and loads. So, that got me searching for "400 warmup" which resulted in that post as one of the results.
I've also been considering your feedback regarding # of applications per server and will be spending some time getting some things separated here into their own environments a little bit more, while attending to the warmup scenario as discussed in the article. Busy week ahead. :P
So I dug a little deeper and determined that health check warmup over HTTPS is supported on Windows-hosted machines but not on Linux at the moment; see https://learn.microsoft.com/en-us/azure/app-service/monitor-instances-health-check#are-the-health-check-requests-sent-over-http-or-https. This may or may not be a deal breaker for you, but I am trying to find out when Linux will support this feature.
Thank you for your investigation @Ryan Hill . While the health-check infrastructure may support HTTPS, that does not seem to be actually utilized in the warmup ping as part of the Azure Swap task. This seems confirmed by the GitHub issue above.
The mystery continues here @Ryan Hill . I just did a deployment w/ a swap and upon checking it with a request it was as fast as I had ever seen it! It had to have been under a second. Very interesting. Maybe a transient bug/issue that has been addressed? Anyways I hope this continues and will keep you updated.
14 freaking seconds, @Ryan Hill :P
Another 5-second "warm" swap @Ryan Hill ... sometimes it's 2 seconds, sometimes it's 4 seconds. Most of the time it's 5-6 seconds, but sometimes 14 seconds like 8 days ago.
Another 14 second deployment @Ryan Hill
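For anyone wanting to reproduce timings like the ones reported above, a minimal way to time the first request after a swap might look like this. It is a sketch only: time_call is a hypothetical helper, the URL in the usage comment is a placeholder, and the measurement includes network latency as well as server startup time.

```python
import time


def time_call(action, clock=time.perf_counter):
    """Run action() once and return (result, elapsed_seconds)."""
    start = clock()
    result = action()
    return result, clock() - start


# Hypothetical usage right after a swap completes:
#
#   import urllib.request
#   url = "https://example.invalid/"  # placeholder for the production slot URL
#   status, seconds = time_call(lambda: urllib.request.urlopen(url).status)
#   print(f"first request: HTTP {status} in {seconds:.1f}s")
```

Running it once immediately after the swap and again a few seconds later gives a rough cold versus warm comparison for the same route.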