Azure Functions Premium apps not scaling out (multiple apps in the app service plan)

Hallgeir Østerbø 1 Reputation point

We are setting up multiple Azure Function apps on an Elastic Premium app service plan. Each app's "pre-warmed instances" setting is set to 1, and the app service plan's maximum burst is set to 80. We mostly trigger the function apps with HTTP triggers.

But we're struggling to get the apps to actually scale out. To test, I've created a function in one of the function apps that simply does a bunch of (useless) work, maxing out the CPU, for 10 seconds. Then I invoke it multiple times from several threads. It continues to run on a single instance (I'm using Application Insights live metrics to monitor it). I have tried tweaking maxConcurrentRequests and maxOutstandingRequests, and they certainly change the behavior the way the docs say they should, but nothing seems to make our app actually scale out. Here is the host.json file:

{
  "version": "2.0",
  "logging": {
    "applicationInsights": {
      "samplingSettings": {
        "isEnabled": true,
        "excludedTypes": "Request"
      }
    }
  },
  "extensions": {
    "http": {
      "maxConcurrentRequests": 1,
      "maxOutstandingRequests": -1,
      "dynamicThrottlesEnabled": false
    }
  }
}
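
For readers who want to reproduce the test described above (many simultaneous calls to a CPU-bound HTTP function), here is a minimal sketch of a load generator using only the Python standard library. It is not the code used in the question. The names `call_function` and `run_load` are hypothetical, and the endpoint URL is a placeholder you would have to supply yourself.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def call_function(url: str, timeout: float = 120.0) -> int:
    """Send one GET request and return the HTTP status code.

    urllib raises HTTPError for 4xx/5xx responses (for example a 429
    when the host rejects a request), so those surface as exceptions.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


def run_load(invoke: Callable[[], object], total_requests: int, concurrency: int) -> List[float]:
    """Call `invoke` `total_requests` times from `concurrency` threads.

    Returns the wall-clock duration of each call in seconds.
    """
    def timed_call(_: int) -> float:
        start = time.perf_counter()
        invoke()
        return time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(timed_call, range(total_requests)))
```

Assuming `url` holds your function's endpoint, a call such as `run_load(lambda: call_function(url), total_requests=50, concurrency=10)` returns the per-request durations. Watching how the maximum grows while the instance count in live metrics stays at one is a simple way to see whether requests are queueing on a single instance.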

I have tried increasing the number of pre-warmed instances, and then it scales out to that many instances just fine. But this is not a very economical way of running function apps, as it essentially sets the minimum number of instances we always pay for.

So what is the best way to get this to work? Will it ever work, or will I need to have a premium app service plan per function app?

Azure Functions

1 answer

  1. MughundhanRaveendran-MSFT 11,621 Reputation points Microsoft Employee

    @Hallgeir Østerbø

    Scaling is controlled by the scale controller. It considers various factors and conditions when deciding whether to scale out, and these vary with the trigger type. For HTTP-triggered functions, it looks at the incoming requests and at the CPU and memory utilization of the existing instances that are handling them. In your scenario, if the request rate is high but the existing instance is still able to handle the load, the scale controller will not scale out, even if CPU utilization is high. Only when the request rate is high and the existing instances cannot handle the load, so that the function's execution time increases, will the scale controller vote to scale out and add instances. Increasing the number of pre-warmed instances adds instances because that setting is under your direct control.
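
To make the "not able to handle the load" condition more concrete, here is a toy queueing model of the test in the question. It is only a sketch under simplifying assumptions: all requests arrive at the same moment, each takes a fixed `work_seconds`, requests are spread round-robin across instances, and each instance runs at most `max_concurrent` requests at a time (the role `maxConcurrentRequests` plays in host.json). It does not model the scale controller's actual decision logic, and the function name `simulated_latencies` is hypothetical.

```python
from typing import List


def simulated_latencies(requests: int, instances: int,
                        max_concurrent: int, work_seconds: float) -> List[float]:
    """Completion time of each request when all of them arrive at t=0."""
    latencies = []
    for i in range(requests):
        # Request i goes to instance i % instances; this is its
        # 0-based place in that instance's queue.
        position = i // instances
        # Requests on one instance run in batches of max_concurrent.
        batch = position // max_concurrent + 1
        latencies.append(batch * work_seconds)
    return latencies


# 10 simultaneous 10-second requests, one instance, one request at a time:
print(max(simulated_latencies(10, 1, 1, 10.0)))  # 100.0
# The same load spread over 5 instances:
print(max(simulated_latencies(10, 5, 1, 10.0)))  # 20.0
```

In this simplified model the slowest request on a single instance waits ten times the work duration, which is the kind of latency growth the answer describes as a signal that the existing instances cannot keep up.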
