Automatic scaling in Azure App Service

Note

Automatic scaling is in preview. It's available for Premium V2 (P1V2, P2V2, P3V2) and Premium V3 (P1V3, P2V3, P3V3) pricing tiers, and supported for all app types: Windows, Linux, and Windows container. Automatic scaling is not supported for deployment slot traffic.

Automatic scaling is a new scale out option that automatically handles scaling decisions for your web apps and App Service Plans. It's different from the pre-existing Azure autoscale, which lets you define scaling rules based on schedules and resources. With automatic scaling, you can adjust scaling settings to improve your app's performance and avoid cold start issues. The platform prewarms instances to act as a buffer when scaling out, ensuring smooth performance transitions. You can use Application Insights Live Metrics to check your current instance count, and performanceCounters to see the instance count history. You're charged per second for every instance, including prewarmed instances.

A comparison of scale out and scale in options available on App Service:

  Manual Autoscale Automatic scaling
Available pricing tiers Basic and Up Standard and Up Premium V2 (P1V2, P2V2, P3V2) and Premium V3 (P1V3, P2V3, P3V3)
Rule-based scaling No Yes No, the platform manages the scale out and in based on HTTP traffic.
Schedule-based scaling No Yes No
Always ready instances No, your web app runs on the number of manually scaled instances. No, your web app runs on other instances available during the scale out operation, based on threshold defined for autoscale rules. Yes (minimum 1)
Prewarmed instances No No Yes (default 1)
Per-app maximum No No Yes

How automatic scaling works

You enable automatic scaling for an App Service Plan and configure a range of instances for each of the web apps. As your web app starts receiving HTTP traffic, App Service monitors the load and adds instances. Resources may be shared when multiple web apps within an App Service Plan are required to scale out simultaneously.

Here are a few scenarios where you should scale out automatically:

  • You don't want to set up autoscale rules based on resource metrics.
  • You want your web apps within the same App Service Plan to scale differently and independently of each other.
  • Your web app is connected to a databases or legacy system, which may not scale as fast as the web app. Scaling automatically allows you to set the maximum number of instances your App Service Plan can scale to. This setting helps the web app to not overwhelm the backend.

Enable automatic scaling

Maximum burst is the highest number of instances that your App Service Plan can increase to based on incoming HTTP requests. For Premium v2 & v3 plans, you can set a maximum burst of up to 30 instances. The maximum burst must be equal to or greater than the number of workers specified for the App Service Plan.

Important

Always ON needs to be disabled to use automatic scaling.

To enable automatic scaling, navigate to the web app's left menu and select Scale out (App Service Plan). Select Automatic (preview), update the Maximum burst value, and select the Save button.

Automatic scaling in Azure portal

Set minimum number of web app instances

Always ready instances is an app-level setting to specify the minimum number of instances. If load exceeds what the always ready instances can handle, additional instances are added (up to the specified maximum burst for the App Service Plan).

To set the minimum number of web app instances, navigate to the web app's left menu and select Scale out (App Service Plan). Update the Always ready instances value, and select the Save button.

Screenshot of always ready instances

Set maximum number of web app instances

The maximum scale limit sets the maximum number of instances a web app can scale to. The maximum scale limit helps when a downstream component like a database has limited throughput. The per-app maximum can be between 1 and the maximum burst.

To set the maximum number of web app instances, navigate to the web app's left menu and select Scale out (App Service Plan). Select Enforce scale out limit, update the Maximum scale limit, and select the Save button.

Screenshot of maximum scale limit

Update prewarmed instances

The prewarmed instance setting provides warmed instances as a buffer during HTTP scale and activation events. Prewarmed instances continue to buffer until the maximum scale-out limit is reached. The default prewarmed instance count is 1 and, for most scenarios, this value should remain as 1.

You can't change the prewarmed instance setting in the portal, you must instead use the Azure CLI.

Disable automatic scaling

To disable automatic scaling, navigate to the web app's left menu and select Scale out (App Service Plan). Select Manual, and select the Save button.

Screenshot of manual scaling

Does automatic scaling support Azure Function apps?

No, you can only have Azure App Service web apps in the App Service Plan where you wish to enable automatic scaling. If you have existing Azure Functions apps in the same App Service Plan, or if you create new Azure Functions apps, then automatic scaling is disabled. For Functions, it's recommended to use the Azure Functions Premium plan instead.

More resources