How do I get my containerised Azure Function to scale down when not in use?

Samuel Bradshaw 30 Reputation points
2024-07-25T11:35:47.48+00:00

Hi,

I have a containerised function app that is failing to scale down when not in use, resulting in much higher than expected costs.

The container app's scale settings have min replicas set to 0 and max replicas set to 2: Screenshot 2024-07-25 at 12.01.18

However, it seems that the number of replicas is stuck at my maximum replica count of 2: Screenshot 2024-07-25 at 12.02.55

The app is triggered by a queue and I have only 1 scale rule which is the following:
Screenshot 2024-07-25 at 12.03.25

My understanding is that it should scale up to 2 replicas when the queue gets to 5 messages, but it should by default scale back down to 0 replicas when it has finished processing the messages. See the scaling behavior section here https://learn.microsoft.com/en-us/azure/container-apps/scale-app?pivots=azure-resource-manager#scale-behavior.

What do I need to do to make sure my container app scales down to 0 replicas when it is not being used? It is not clear from the docs how I can configure this.

My app is deployed using an ARM template that defines a managed container environment, a storage account containing a queue, and a function app (the container app resource in Azure is created automatically by this deployment). If it is helpful, I can include the relevant parts of the ARM template below.

Azure Functions
An Azure service that provides an event-driven serverless compute platform.
Azure Container Apps
An Azure service that provides a general-purpose, serverless container platform.

1 answer

  1. Samuel Bradshaw 30 Reputation points
    2024-07-26T11:27:52.56+00:00

    I think I have resolved the issue. The issue was related to the way that poison messages were handled.

    My app defines two functions that are triggered by messages arriving on queues: one processes messages on the original queue, and the other processes messages on a '-poison' queue. When a message fails to be processed within the max dequeue count, it is automatically sent to the poison queue (see poison messages section here). If the poison queue does not exist when a message is first sent to it, it is created automatically, so I had not explicitly defined the poison queue in my ARM template. However, the scaling rules that are generated automatically for my app include a rule for the '-poison' queue. When the '-poison' queue does not exist (which is the case when the app is first deployed), this seems to break the autoscaling, so the number of replicas never scales down.

    The solution was to explicitly define the '-poison' queue in my ARM template as well.
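
    As a rough illustration of that fix (not my actual template), the Python sketch below generates ARM resource definitions for a queue and its '-poison' companion, so both exist from the first deployment. The storage account name, queue name, and API version are placeholders I have assumed for the example, and the `dependsOn` entry assumes the storage account is defined in the same template.

```python
import json

# Hypothetical names for illustration only; substitute your own.
STORAGE_ACCOUNT = "mystorageacct"
QUEUE_NAME = "jobs"

# Assumed API version; use whichever version your template already targets.
API_VERSION = "2023-01-01"


def queue_resource(account: str, queue: str) -> dict:
    """Return an ARM resource definition for a single storage queue."""
    return {
        "type": "Microsoft.Storage/storageAccounts/queueServices/queues",
        "apiVersion": API_VERSION,
        # Queues live under the storage account's 'default' queue service.
        "name": f"{account}/default/{queue}",
        # Assumes the storage account is declared in the same template.
        "dependsOn": [
            f"[resourceId('Microsoft.Storage/storageAccounts', '{account}')]"
        ],
        "properties": {},
    }


def queues_with_poison(account: str, queue: str) -> list[dict]:
    """Define the main queue and its '-poison' companion side by side."""
    return [queue_resource(account, name) for name in (queue, f"{queue}-poison")]


if __name__ == "__main__":
    # Print the fragments to paste into the template's "resources" array.
    print(json.dumps(queues_with_poison(STORAGE_ACCOUNT, QUEUE_NAME), indent=2))
```

    The printed fragments would go into the template's `resources` array alongside the existing queue definition. Declaring the poison queue up front means the generated scale rule has a queue to connect to before any message has ever failed.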

    I think this should still be addressed as a bug in the KEDA scaling: if a scaling rule cannot connect to a queue (in this case because the queue did not exist), surely it should scale down by default rather than keep the maximum number of replicas running (and therefore cause additional costs to the customer).
