Questions on Flex Consumption Always-on Scaling Algorithm

Roy Zeng 60 Reputation points
2024-10-02T06:37:53.2166667+00:00

Q1: If always-on is set to 1 for an HTTP trigger, does that mean one specific instance will be used all the time? Or could another instance be woken up to replace the current one?
Q2: I have noticed that my function app scaled out several more instances when request frequency was relatively low, instead of scaling out at peak time. Are there any configurations that control this behavior?

Azure Functions
An Azure service that provides an event-driven serverless compute platform.

Accepted answer
  1. Vinodh247 20,396 Reputation points
    2024-10-02T12:44:43.4433333+00:00

    Hi Roy Zeng,

    Thanks for reaching out to Microsoft Q&A.

    If always-on is set to 1 for HTTP Trigger, does it mean one specific instance will be used all the time, or could another instance be woken up to replace the current one?

    When Always On is set to 1, it ensures that at least one instance of your Function App is always running, ready to handle HTTP requests. However, Always On does not guarantee that the same instance will persist indefinitely. Instances could be replaced or recycled due to various factors like platform maintenance, scaling decisions, or the need to optimize resource utilization. So, while one instance is always available, it may not be the same specific instance.
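    The distinction between "one instance is always available" and "the same instance is always used" can be pictured with a toy model. This is only an illustrative sketch, not how the platform is implemented: the class `AlwaysReadyPool`, the `recycle` method, and the instance naming are all hypothetical, and the assumption that a replacement starts before the old instance is retired is mine.

```python
import itertools


class AlwaysReadyPool:
    """Toy model: keeps `always_ready` instances alive, but not necessarily the same ones."""

    def __init__(self, always_ready: int = 1):
        self._ids = itertools.count(1)
        self.always_ready = always_ready
        self.instances = [self._new_instance() for _ in range(always_ready)]

    def _new_instance(self) -> str:
        # Hypothetical instance identifier; real instance IDs look different.
        return f"instance-{next(self._ids)}"

    def recycle(self, instance_id: str) -> str:
        """Replace one instance (for example during maintenance) and return the new ID."""
        replacement = self._new_instance()
        # Assumption: the replacement is added before the old instance is removed,
        # so the pool never drops below the always-ready minimum.
        self.instances.append(replacement)
        self.instances.remove(instance_id)
        return replacement


pool = AlwaysReadyPool(always_ready=1)
first = pool.instances[0]
second = pool.recycle(first)
print(len(pool.instances), first, second)  # 1 instance-1 instance-2
```

    The count stays at 1 throughout, but the identity of the instance changes, so code should not rely on in-memory state surviving on a particular instance.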

    Why did my Function App scale out more instances during low request frequency instead of at peak times, and are there any configurations controlling such behavior?

    Several factors can influence this:

    Cold Start Behavior:

    If the Function App was inactive and experienced a cold start, Azure might preemptively allocate more instances to handle requests, even if the request frequency is low.

    Dynamic Scaling Algorithm:

    Azure Functions uses a dynamic scaling algorithm based on various metrics such as queue length, memory usage, CPU load, and response times. If the system detects potential load or a backlog, it may scale out instances even if the request rate isn't peaking.

    Burst Traffic Handling:

    If Azure anticipates a burst in traffic or sees delayed responses due to backend latency, it could preemptively scale out instances to improve performance.
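    One way to see why instance count need not track request rate is to assume, purely for illustration, that scale-out is driven by the number of in-flight requests rather than by requests per second. In-flight requests are roughly rate multiplied by latency, so a quiet period with slow backend responses can need more instances than a busy period with fast ones. The function name, the per-instance concurrency of 16, and the traffic numbers below are all hypothetical.

```python
import math


def instances_needed(requests_per_second: float,
                     avg_latency_seconds: float,
                     per_instance_concurrency: int) -> int:
    """Estimate instances from in-flight requests (rate x latency).

    Assumption: scaling follows concurrency only. The real platform may
    weigh other signals as well.
    """
    in_flight = requests_per_second * avg_latency_seconds
    return max(1, math.ceil(in_flight / per_instance_concurrency))


# Peak traffic, fast responses: 200 req/s at 50 ms is about 10 requests in flight.
peak = instances_needed(200, 0.05, 16)

# Quiet traffic, slow backend: 20 req/s at 4 s is about 80 requests in flight.
quiet = instances_needed(20, 4.0, 16)

print(peak, quiet)  # 1 5
```

    Under these assumptions the quiet period needs five instances and the peak needs one, which matches the pattern described in the question. Checking response times around the unexpected scale-out is a reasonable first diagnostic step.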

    To control or influence this behavior:

    • Function Timeout Settings: Adjusting timeout durations can prevent the platform from assuming a backlog or resource constraints.
    • Instance Count Limits: You can set scale-out limits on the number of instances to control excessive scaling.
    • Plan Choice: The pricing plan (Consumption vs. Premium) affects scaling behavior, with Premium offering more predictable scaling and control.

    Understanding the underlying resource patterns and adjusting configurations like maximum instance count and timeout durations can help you optimize the scaling behavior.
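    As a rough mental model of how the always-on minimum and a maximum instance count bound scaling together, the sketch below clamps whatever count the scaler would like to run. The function and parameter names are hypothetical, and the real platform's decision logic is not public in this form.

```python
def clamp_instance_count(desired: int, always_ready: int, max_instances: int) -> int:
    """Bound the desired instance count by the configured floor and ceiling."""
    if always_ready > max_instances:
        raise ValueError("always_ready cannot exceed max_instances")
    # Never fewer than the always-ready minimum, never more than the maximum.
    return max(always_ready, min(desired, max_instances))


print(clamp_instance_count(0, always_ready=1, max_instances=10))   # 1
print(clamp_instance_count(4, always_ready=1, max_instances=10))   # 4
print(clamp_instance_count(25, always_ready=1, max_instances=10))  # 10
```

    Lowering the maximum caps how far an unexpected scale-out can go, at the cost of queuing or slower responses if the extra instances were actually needed.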

    Please 'Upvote'(Thumbs-up) and 'Accept' as an answer if the reply was helpful. This will benefit other community members who face the same issue.
