Hi Roy Zeng,
Thanks for reaching out to Microsoft Q&A.
If always-on is set to 1 for HTTP Trigger, does it mean one specific instance will be used all the time, or could another instance be woken up to replace the current one?
Setting Always On to 1 ensures that at least one instance of your Function App is always running and ready to handle HTTP requests. It does not guarantee that the same instance persists indefinitely. The platform can recycle or replace instances for reasons such as platform maintenance, scaling decisions, or resource optimization. So one instance is always available, but it may not be the same specific instance over time.
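If you want to confirm this on your own app, one option is to log which instance serves each request. The sketch below is only an illustration: it assumes the platform exposes the instance ID through the `WEBSITE_INSTANCE_ID` environment variable, and the helper names and the `"unknown"` fallback are hypothetical.

```python
import os


def current_instance_id(env=None):
    """Return the ID of the instance running this code, or "unknown".

    Assumption: the platform sets WEBSITE_INSTANCE_ID on each instance.
    """
    env = os.environ if env is None else env
    return env.get("WEBSITE_INSTANCE_ID", "unknown")


def describe_request(request_name, env=None):
    """Build a log line that tags a request with the serving instance."""
    return f"{request_name} served by instance {current_instance_id(env)}"
```

Calling `describe_request(...)` from your HTTP-triggered function and writing the result to your logs should show the instance ID changing whenever the platform replaces the instance.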
Why did my Function App scale out more instances during low request frequency instead of at peak times, and are there any configurations controlling such behavior?
Several factors can influence this:
Cold Start Behavior:
If the Function App was inactive and experienced a cold start, Azure might preemptively allocate more instances to handle requests, even if the request frequency is low.
Dynamic Scaling Algorithm:
Azure Functions uses a dynamic scaling algorithm based on various metrics such as queue length, memory usage, CPU load, and response times. If the system detects potential load or a backlog, it may scale out instances even if the request rate isn't peaking.
Burst Traffic Handling:
If Azure anticipates a burst in traffic or sees delayed responses due to backend latency, it could preemptively scale out instances to improve performance.
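To see why slow responses can matter more than raw request rate, a rough calculation with Little's law (average requests in flight = arrival rate × average latency) may help. This is only a back-of-the-envelope sketch with made-up numbers, not the platform's actual scaling algorithm.

```python
def in_flight_requests(arrival_rate_per_s, avg_latency_ms):
    """Little's law: average concurrent requests = arrival rate x average latency."""
    return arrival_rate_per_s * avg_latency_ms / 1000


# Hypothetical numbers, for illustration only.
peak_fast = in_flight_requests(50, 100)    # busy period, fast backend
quiet_slow = in_flight_requests(5, 4000)   # quiet period, slow backend

print(f"peak, fast backend:  {peak_fast} requests in flight")
print(f"quiet, slow backend: {quiet_slow} requests in flight")
```

In this example the quiet period holds four times as many requests in flight as the peak, which is one plausible reason for seeing more instances when the request frequency is low.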
To control or influence this behavior:
- Function Timeout Settings: Adjusting timeout durations can prevent the platform from assuming a backlog or resource constraints.
- Instance Count Limits: You can set scale-out limits on the number of instances to control excessive scaling.
- Plan Choice: The pricing plan (Consumption vs. Premium) affects scaling behavior, with Premium offering more predictable scaling and control.
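As a minimal sketch of the timeout setting, the snippet below generates a `host.json` body with a `functionTimeout` value in `hh:mm:ss` form. The helper names and the 5-minute value are illustrative assumptions, so check the limits that apply to your plan. The scale-out cap is normally configured on the app itself rather than in `host.json` (for example through the `functionAppScaleLimit` site property), so it appears here only as a comment.

```python
import json
from datetime import timedelta


def format_timeout(duration):
    """Format a timedelta as hh:mm:ss, the form host.json uses for functionTimeout."""
    total = int(duration.total_seconds())
    hours, rest = divmod(total, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def build_host_json(timeout):
    """Return host.json text with the given function timeout."""
    # The scale-out limit is not set here; it is a property of the app.
    settings = {"version": "2.0", "functionTimeout": format_timeout(timeout)}
    return json.dumps(settings, indent=2)


print(build_host_json(timedelta(minutes=5)))
```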
Understanding the underlying resource patterns and adjusting configurations like maximum instance count and timeout durations can help you optimize the scaling behavior.