Intermittent function timeout without running code

WJ 11 Reputation points
2022-08-11T16:22:56.9+00:00

I have some functions written in Python that have had intermittent timeouts. The timeouts have been more frequent over the past month or so. It feels to me like it is a cold start issue as they do work most of the time. When they don't work, the Python code isn't even running (I have logging that would show up in the invocation details). Most of the fails happen for webhook functions (included an example below) but there is also a schedule function where it has been happening.

2022-08-11 12:01:49.192
Executing 'Functions.PapertrailWebhook' (Reason='This function was programmatically called via the host APIs.', Id=47258b8b-b8e9-4417-aaac-6f69652c53a3)
Information
2022-08-11 12:11:49.192
Timeout value of 00:10:00 exceeded by function 'Functions.PapertrailWebhook' (Id: '47258b8b-b8e9-4417-aaac-6f69652c53a3'). Initiating cancellation.
Error
2022-08-11 12:11:49.193
Executed '{functionName}' ({status}, Id={invocationId}, Duration={executionDuration}ms)
Error
2022-08-11 12:11:49.193
Executed 'Functions.PapertrailWebhook' (Failed, Id=47258b8b-b8e9-4417-aaac-6f69652c53a3, Duration=600007ms)
Error
2022-08-11 12:11:49.194
Error

When this function is run successfully it takes between 1 and 20 seconds. We are on the consumption plan and it really feels to me like a cold start issue.

Azure Functions
Azure Functions
An Azure service that provides an event-driven serverless compute platform.
5,929 questions
0 comments No comments
{count} votes

1 answer

Sort by: Most helpful
  1. MughundhanRaveendran-MSFT 12,506 Reputation points
    2022-08-16T07:31:42.78+00:00

    Hi @WJ ,

    Thanks for reaching out to Q&A forum.

    Since you had provided the function name and invocation id, I was able to look into the platform and function logs at my end. Usually cold starts occur in Linux consumption plan whenever the function app is restarted or when there are container recycle events. Looking at the logs, I dont see any restarts or container recycle events. Also the workers running the function were healthy during this time and was able to successfully respond to the health check pings. So I can assure you that this is not a cold start problem and at the same time, this timeout issue isnt caused by the platform.

    I would suggest you to add more granular logging and have a retry mechanism in place to avoid this kind of issue. Although the execution timeout of function running in consumption plan is 10 minutes, the http trigger has a load balancer timeout of 230 seconds.

    I hope this helps! Feel free to reach out to me if you have any questions or concerns.


Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.