Azure Function triggered by Service Bus Queue event times out under load

Robert Galante 61 Reputation points
2022-02-01T16:55:38.303+00:00

I have an Azure Function which is triggered by a service bus queue trigger. I am conducting load tests.

Yesterday, I pushed 42 messages on the queue in 5 minutes. These messages were scheduled to run over a 1 hour period. The function ran and all 42 messages were processed (14 every half hour).

Today, I pushed 84 messages on the queue in 5 minutes. These messages were scheduled to run over a 1 hour period also (28 every half hour). Every message failed with "Microsoft.Azure.WebJobs.Host.FunctionTimeoutException". The message is "Timeout value of 00:05:00 was exceeded by the function." I am using the default timeout time of 5 minutes (00:05:00). So that's how long every function invocation ran before it was stopped.

Normally, the function takes between 10 and 40 seconds to execute. I do not need to increase the default timeout time. If it's not done in 40 seconds, something is wrong.
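One way to make a stuck invocation fail fast, rather than waiting for the 5-minute host timeout, is to race the work against a deadline of your own. The following is a minimal sketch, not part of the Azure Functions SDK: `withTimeout` and the 40-second budget are hypothetical names and values chosen to match the numbers above. It only helps if the function body actually starts running, and it does not cancel the underlying work; it just stops waiting for it.

```typescript
// Hypothetical helper (an assumption for illustration, not an Azure API).
// Rejects if `work` has not settled within `limitMs` milliseconds.
async function withTimeout<T>(work: Promise<T>, limitMs: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => {
      const error = new Error(`Work did not finish within ${limitMs} ms`);
      error.name = "WorkTimeoutError";
      reject(error);
    }, limitMs);
  });
  try {
    // Whichever promise settles first decides the outcome.
    return await Promise.race([work, deadline]);
  } finally {
    // Always clear the timer so it cannot keep the process alive.
    clearTimeout(timer);
  }
}

// Budget taken from the question: anything slower than 40 seconds is abnormal.
const WORK_BUDGET_MS = 40_000;
```

Inside the function body, the slow step would be wrapped as `await withTimeout(doWork(), WORK_BUDGET_MS)`, where `doWork` stands for whatever the function normally does. A rejection named `WorkTimeoutError` then shows up in the log long before the host timeout fires.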

I have Application Insights configured. I viewed the log. I see my start message. I do not see any messages indicating that an error or an exception has occurred. I used the New Support Request to discover more details. It only shows the same exception.

Microsoft.Azure.WebJobs.Host.FunctionTimeoutException:
at Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor+<TryHandleTimeoutAsync>d__30.MoveNext (Microsoft.Azure.WebJobs.Host, Version=3.0.31.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35: C:\projects\azure-webjobs-sdk-rqm4t\src\Microsoft.Azure.WebJobs.Host\Executors\FunctionExecutor.cs:646)
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw (System.Private.CoreLib, Version=6.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e)

Almost all the messages remained on the queue. Two were transferred to the dead letter queue. It's almost like my service bus was down this morning. Is there some limit to the number of requests the service bus can receive each second? I am not sure what else to try.

Azure Service Bus
An Azure service that provides cloud messaging as a service and hybrid integration.
Azure Functions
An Azure service that provides an event-driven serverless compute platform.

Accepted answer
  1. MayankBargali-MSFT 70,936 Reputation points Moderator
    2022-02-02T06:17:05.04+00:00

    @Robert Galante Thanks for reaching out. You mentioned that two messages were transferred to the dead-letter queue and the rest remained in the queue. Messages are moved to the dead-letter queue only in specific scenarios, so you can use Service Bus Explorer to check the reason these messages were moved. It looks like the function app was not able to consume the messages within the lock duration, and after multiple retries they were moved to the dead-letter queue with the reason MaxDeliveryCountExceededExceptionMessage.

    This will help you verify why the messages were moved to the dead-letter queue. I also suggest that you peek at some of the messages in Service Bus Explorer and check the DeliveryCount property of each one to see whether it was consumed by your client application at least once.

    The exception itself is a timeout exception. I suggest reviewing your code to check whether any calls other than those to Service Bus could be causing the timeout in the function app.
    To troubleshoot the issue from the client side (i.e., the function app), open the "Diagnose and solve problems" blade in the function app and navigate to "Availability and Performance" to see whether it reports an issue or a full stack trace that helps you troubleshoot further. If you have enabled Application Insights logging, you can also review those logs for the full stack trace.

    If the above doesn't help, please refer to my private comment and share the details so that I can review the logs on my end and assist you further.
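The check described above (dead-letter reason plus DeliveryCount) can also be done in code once some messages have been peeked. The sketch below is only an illustration under stated assumptions: `PeekedMessage` is a simplified local shape that mirrors a few fields of peeked Service Bus messages, and `summarizeDeadLetters` is a hypothetical helper. With the `@azure/service-bus` package, such messages could be obtained from a receiver created with the `subQueueType` option set to `"deadLetter"` and its `peekMessages` method; that wiring is omitted here so the example stays self-contained.

```typescript
// Simplified, assumed shape of a peeked message (not the SDK's full type).
interface PeekedMessage {
  messageId?: string;
  deliveryCount?: number;
  deadLetterReason?: string;
}

interface DeadLetterSummary {
  total: number;
  byReason: Record<string, number>;
  maxDeliveryCount: number;
}

// Hypothetical helper: counts messages per dead-letter reason and
// records the highest delivery count seen.
function summarizeDeadLetters(messages: PeekedMessage[]): DeadLetterSummary {
  const summary: DeadLetterSummary = { total: 0, byReason: {}, maxDeliveryCount: 0 };
  for (const message of messages) {
    const reason = message.deadLetterReason ?? "(no reason recorded)";
    const count = message.deliveryCount ?? 0;
    summary.total += 1;
    summary.byReason[reason] = (summary.byReason[reason] ?? 0) + 1;
    summary.maxDeliveryCount = Math.max(summary.maxDeliveryCount, count);
  }
  return summary;
}

// Made-up sample data, for illustration only.
const sample: PeekedMessage[] = [
  { messageId: "a", deliveryCount: 10, deadLetterReason: "MaxDeliveryCountExceeded" },
  { messageId: "b", deliveryCount: 10, deadLetterReason: "MaxDeliveryCountExceeded" },
];
console.log(summarizeDeadLetters(sample));
```

If every dead-lettered message carries the same reason and a delivery count equal to the queue's maximum delivery count, that would be consistent with repeated failed deliveries rather than with a Service Bus outage.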


1 additional answer

  1. Robert Galante 61 Reputation points
    2022-02-07T10:56:56.43+00:00

    I analyzed the failures to ascertain what is causing the function to experience the function timeout. I reviewed the first five minutes of log messages.

    I see that the function is started. But the first line of the function is supposed to log a message, "FunctionName started - {state info here}". I don't see this message. I don't see any of my function's messages until after the timeout message occurs. This function must be cold-starting for two minutes.

    The app service tried to start 28 instances of my function. It didn't even run the first line to log the start message. And it left my function in the running state for 2 minutes. That expires. Then I see a function timeout message. That is the first message in the log.

    It does a retry. It fails again. And the sequence repeats until things settle. In 1.5 hours the service bus has 956 successful requests and 70 user errors. These 70 user errors are the timeouts.

    The service bus input binding gets unstable under load and takes quite a while to settle. No error message exists in the log which indicates what caused the timeout to occur. It launches my function, and doesn't run for two minutes. It does that 28 times. That's 3360 seconds of wasted, billable time.

    The design I implemented, in which a Service Bus input binding triggers an Azure Function that writes to another Service Bus queue through an output binding, is not scalable. I'm using Puppeteer to create an image, just like the example at this URL.

    https://anthonychu.ca/post/azure-functions-puppeteer-pdf-razor-template/

    It can run 14 instances. It cannot run 28 instances. This is much lower than the 200 instances that the documentation says are possible. I need a better solution than this to handle bursts of activity such as these.
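One possible mitigation for bursts like this, assuming the timeouts come from too many heavy jobs (for example, browser launches) running at once on a single host, which this thread does not confirm, is to cap in-process concurrency. The sketch below is a plain semaphore-style limiter; `createLimiter` is a hypothetical name and nothing here is an Azure Functions or Puppeteer API.

```typescript
// Hypothetical helper: returns a `run` function that lets at most
// `maxConcurrent` tasks execute at the same time; extra tasks wait in line.
function createLimiter(maxConcurrent: number) {
  if (!Number.isInteger(maxConcurrent) || maxConcurrent < 1) {
    throw new Error("maxConcurrent must be a positive integer");
  }
  let active = 0;
  const waiting: Array<() => void> = [];

  const release = (): void => {
    const next = waiting.shift();
    if (next) {
      next(); // Hand the slot straight to the next waiter; `active` is unchanged.
    } else {
      active -= 1;
    }
  };

  return async function run<T>(task: () => Promise<T>): Promise<T> {
    if (active < maxConcurrent) {
      active += 1;
    } else {
      // Wait until a finishing task hands over its slot.
      await new Promise<void>((resolve) => waiting.push(resolve));
    }
    try {
      return await task();
    } finally {
      release();
    }
  };
}
```

A function body could then wrap its heavy step as `await run(() => renderImage(input))`, where `renderImage` stands for the Puppeteer work and `run` comes from a limiter created once per process. This only limits work inside one process; it does not change how many instances the platform starts.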

