[HostMonitor] Host CPU threshold exceeded

Thomas Ligon (Office 365) 26 Reputation points
2022-09-29T15:01:03+00:00

I am using Azure Functions to run a Python app that outgrew my old desktop PC (16 GB RAM). The app does complex symbolic math and has grown to the point where Resource Monitor showed memory saturation. Because of request timeouts, I moved from a Consumption plan to a Premium plan. Now the troubleshooting feature Diagnose and Solve Problems / Availability and Performance shows:
[HostMonitor] Host CPU threshold exceeded (94 >= 80)

Azure Functions
An Azure service that provides an event-driven serverless compute platform.

7 answers

  1. MughundhanRaveendran-MSFT 12,431 Reputation points
    2022-10-03T11:57:12.337+00:00

    Hi @Thomas Ligon (Office 365),

    For a function app that processes a large number of I/O events or is I/O bound, you can significantly improve performance by running functions asynchronously. For more information, see Improve throughput performance of Python apps in Azure Functions.

    Also try adding the app setting "FUNCTIONS_WORKER_PROCESS_COUNT" with a value between 2 and 10. This setting specifies the maximum number of language worker processes and defaults to 1.

    https://learn.microsoft.com/en-us/azure/azure-functions/functions-app-settings#functions_worker_process_count
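
As a rough illustration of the async advice above, here is a minimal, self-contained sketch that uses only the standard library. `fetch_record` is a hypothetical stand-in for an I/O call (simulated with `asyncio.sleep`) and is not part of the Azure Functions API. This pattern helps I/O-bound work; it is not expected to speed up CPU-bound computation.

```python
import asyncio
import time


async def fetch_record(record_id: int) -> str:
    """Hypothetical I/O-bound call; the sleep stands in for network latency."""
    await asyncio.sleep(0.1)
    return f"record-{record_id}"


async def handle_request(record_ids: list[int]) -> list[str]:
    """Await all simulated I/O calls concurrently instead of one by one."""
    return await asyncio.gather(*(fetch_record(i) for i in record_ids))


def run_demo() -> tuple[list[str], float]:
    """Run the handler once and report the results and elapsed seconds."""
    start = time.perf_counter()
    results = asyncio.run(handle_request([1, 2, 3]))
    return results, time.perf_counter() - start
```

Calling `run_demo()` returns the three records in request order; because the waits overlap, the elapsed time should be close to one simulated call rather than three.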


  2. Thomas Ligon (Office 365) 26 Reputation points
    2022-10-04T14:59:56.16+00:00

    I changed the value to 5, tested, changed it to 10, and tested again. There are a number of things that I don't understand:

    1. Why is Availability and Performance only showing 1 execution, even though I started it twice?
    2. Why is Availability and Performance showing an execution time of 16:05 (Central European Summer Time), even though I started it around 16:40?
    3. Why is Availability and Performance showing "Timeout value of 02:00:00 was exceeded" even though it only ran for a few seconds?

    Here is the full exception:
    Microsoft.Azure.WebJobs.Host.FunctionTimeoutException : Timeout value of 02:00:00 was exceeded by function: Functions.HillHttpTrigger1
      at async Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor.TryHandleTimeoutAsync(Task invokeTask,CancellationToken shutdownToken,Boolean throwOnTimeout,CancellationToken timeoutToken,TimeSpan timeoutInterval,IFunctionInstance instance,Action onTimeout) at C:\projects\azure-webjobs-sdk-rqm4t\src\Microsoft.Azure.WebJobs.Host\Executors\FunctionExecutor.cs : 663
      at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
      at async Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor.InvokeWithTimeoutAsync(IFunctionInvoker invoker,ParameterHelper parameterHelper,CancellationTokenSource timeoutTokenSource,CancellationTokenSource functionCancellationTokenSource,Boolean throwOnTimeout,TimeSpan timerInterval,IFunctionInstance instance) at C:\projects\azure-webjobs-sdk-rqm4t\src\Microsoft.Azure.WebJobs.Host\Executors\FunctionExecutor.cs : 571
      at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
      at async Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor.ExecuteWithWatchersAsync(IFunctionInstanceEx instance,ParameterHelper parameterHelper,ILogger logger,CancellationTokenSource functionCancellationTokenSource) at C:\projects\azure-webjobs-sdk-rqm4t\src\Microsoft.Azure.WebJobs.Host\Executors\FunctionExecutor.cs : 527
      at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
      at async Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor.ExecuteWithLoggingAsync(IFunctionInstanceEx instance,FunctionStartedMessage message,FunctionInstanceLogEntry instanceLogEntry,ParameterHelper parameterHelper,ILogger logger,CancellationToken cancellationToken) at C:\projects\azure-webjobs-sdk-rqm4t\src\Microsoft.Azure.WebJobs.Host\Executors\FunctionExecutor.cs : 306
      at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
      at async Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor.ExecuteWithLoggingAsync(IFunctionInstanceEx instance,FunctionStartedMessage message,FunctionInstanceLogEntry instanceLogEntry,ParameterHelper parameterHelper,ILogger logger,CancellationToken cancellationToken) at C:\projects\azure-webjobs-sdk-rqm4t\src\Microsoft.Azure.WebJobs.Host\Executors\FunctionExecutor.cs : 352
      at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
      at async Microsoft.Azure.WebJobs.Host.Executors.FunctionExecutor.TryExecuteAsync(IFunctionInstance functionInstance,CancellationToken cancellationToken) at C:\projects\azure-webjobs-sdk-rqm4t\src\Microsoft.Azure.WebJobs.Host\Executors\FunctionExecutor.cs : 108


  3. Thomas Ligon (Office 365) 26 Reputation points
    2022-10-04T15:46:26.777+00:00

    Here is a correction (answer) to my questions 1) and 2): the time of 16:05 is for 10-03, which was yesterday. Today is 10-04.

    In the meantime, I read https://learn.microsoft.com/en-us/azure/azure-functions/python-scale-performance-reference
    and decided to add the PYTHON_THREADPOOL_THREAD_COUNT app setting, which I set to 10. I am still getting:
    Request timed out. Modify setting "azureFunctions.requestTimeout" if you want to extend the timeout.
    How do I set that timeout? Is it something like REQUEST_TIMEOUT?
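
On the timeout question: as far as I know, `azureFunctions.requestTimeout` is a setting of the VS Code Azure Functions extension (set in the editor's settings, in seconds) rather than an app setting, while the run time limit of the function itself is controlled by `functionTimeout` in host.json. Treat both statements as assumptions to verify against the documentation. The sketch below only builds a host.json fragment with a two-hour `functionTimeout`, matching the 02:00:00 reported in the exception; the helper names `format_timespan` and `build_host_json` are hypothetical.

```python
import json
from datetime import timedelta


def format_timespan(limit: timedelta) -> str:
    """Render a timedelta as the hh:mm:ss string used by functionTimeout."""
    total_seconds = int(limit.total_seconds())
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def build_host_json(limit: timedelta) -> str:
    """Hypothetical helper: return host.json content with a function timeout."""
    config = {"version": "2.0", "functionTimeout": format_timespan(limit)}
    return json.dumps(config, indent=2)


# Print a host.json fragment with a two-hour limit.
print(build_host_json(timedelta(hours=2)))
```

The printed JSON would go into the host.json file at the root of the function app project; the plan's own limits still apply on top of this value.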


  4. Thomas Ligon (Office 365) 26 Reputation points
    2022-10-04T17:00:17.457+00:00

    Looking at Availability and Performance / Functions that are not triggering, I can see
    [HostMonitor] Host CPU threshold exceeded (94 >= 80) | 18 | [HostMonitor] Host CPU threshold exceeded (94 >= 80) | 10/4/2022 3:48:50 PM


  5. Thomas Ligon (Office 365) 26 Reputation points
    2022-10-05T20:24:37.597+00:00

    Today, I did some more reading and testing. I found this:
    https://techcommunity.microsoft.com/t5/apps-on-azure-blog/azure-app-service-automatic-scaling/ba-p/2983300
    According to that, I need to scale up, not scale out, and the limits are defined by the app service plan (I am using an E1 plan). But I don't see any documentation of the limit for the CPU threshold.

    When I look at portal > App Services > Overview > Metrics, I can see that the memory working set is oscillating between 600MB and 1TB. I expect the app to require a bit more than that, so it may be growing until it reaches 1TB and the CPU threshold, at which point it fails, falls back to 600MB, and restarts.
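
One way to check that expectation before deploying is to measure the peak memory of the computation locally. Below is a minimal sketch using the standard library's `tracemalloc`. `build_terms` is a hypothetical stand-in for the symbolic math workload, and `tracemalloc` only counts allocations made through Python's allocator, so the figure will generally be smaller than the process working set shown in the portal.

```python
import tracemalloc


def build_terms(n: int) -> list[str]:
    """Hypothetical stand-in for a memory-hungry symbolic computation."""
    return [f"x**{k}" for k in range(n)]


def peak_memory_mib(func, *args) -> float:
    """Run func(*args) and return the peak traced allocation in MiB."""
    tracemalloc.start()
    try:
        func(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak / (1024 * 1024)


print(f"peak: {peak_memory_mib(build_terms, 100_000):.1f} MiB")
```

Running the real workload through `peak_memory_mib` at increasing problem sizes gives a rough growth curve to compare against the memory available in the plan.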
