Azure Functions Garbage Collection Mode

상준 이 251 Reputation points
2022-02-09T03:22:00.04+00:00

Hello.

I am developing a Service Bus Queue Trigger with .NET 5 on Azure Functions.

When a specific URI is sent to Service Bus, the Service Bus Queue Trigger picks it up, downloads the HTML, parses it, and stores the result in Cosmos DB.

The HTML is collected by an SDK that I wrote.

This crawling SDK downloads the HTML through HttpClient, loads it into HtmlAgilityPack, parses it, and creates a value object in the shape of the Cosmos DB entity I need.

Initially I did all of this in a single Azure Functions app. When I collected 20-30 HTML pages at the same time, I saw in Visual Studio that memory jumped to 4 GB and the CPU was stuck at 100%.

Even after waiting a long time, it did not stop; it just kept running very slowly.

My guess is that the SDK generates a lot of garbage and that threads are being suspended while GC is in progress.

But I don't know how to prove my guess is correct.

So, as a first step, I split the single Azure Functions app into three .NET Azure Functions projects: a crawling part, a parsing part, and a Cosmos DB insert part.

With the project split up, the resources are also created separately in Azure and memory is managed by three processes, so I expected the burden on the GC to be lower.

But it doesn't seem to work very well.

Is there a way to use Server GC instead of Workstation GC in isolated Azure Functions?

Or do you know of a way to control the load without touching the SDK when this much garbage is being generated?

I do not use PrefetchCount because I am worried about memory load, and MaxConcurrentCalls is set to 15.

Please help.


Accepted answer
  1. AnuragSingh-MSFT 19,691 Reputation points
    2022-02-10T13:15:35.36+00:00

    Hi @상준 이

    Welcome to Microsoft Q&A! Thanks for posting the question.

    I understand that you are developing an Azure Function targeting .NET 5 (isolated mode). Based on the details posted, please find my response below:

    1. Is there a way to do Server GC instead of workstation GC in Isolated Azure Functions?
    No. The available configuration options for Azure Functions are set through the host.json file (see the host.json reference). That file contains configuration that applies "after" the runtime has initialized, whereas the GC mode must be configured "before" the dotnet runtime initializes. Furthermore, just changing the GC mode to server will not lower memory usage. In fact, in most scenarios the memory requirement increases, because in server mode you have 1 dotnet heap per processor core. Server mode benefits applications that are intended to run as servers and need high throughput and scalability (not lower memory).
    In the current scenario, I suspect that the web crawling is creating large objects (probably strings/XML docs) which are rooted in your application.
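
    To confirm which GC mode the worker process is actually running in, a small check like the following could be logged once at startup. This is only a minimal sketch: `GcModeProbe` is a hypothetical helper name that does not appear in the original post, while `GCSettings.IsServerGC`, `GCSettings.LatencyMode` and `Environment.ProcessorCount` are standard .NET APIs.

    ```csharp
    using System;
    using System.Runtime;

    // Hypothetical helper (not from the original post): reports the GC
    // configuration of the process it runs in.
    public static class GcModeProbe
    {
        // Returns a one-line, human-readable description of the GC settings.
        public static string Describe()
        {
            // IsServerGC is fixed for the lifetime of the process.
            string mode = GCSettings.IsServerGC ? "Server" : "Workstation";
            return $"GC mode: {mode}, latency mode: {GCSettings.LatencyMode}, " +
                   $"logical processors: {Environment.ProcessorCount}";
        }
    }
    ```

    Writing the result of `GcModeProbe.Describe()` to the function log shows the mode in effect for that process, rather than relying on an assumption.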

    2. Or do you have the know-how to control the load without touching the SDK in a situation where there is a lot of garbage like this?
    There is no direct solution to achieve this. However, the best approach here would be to investigate the high memory usage and take appropriate actions. I have added some of the helpful resources and tips below to help you track it down and investigate it.


    Is there a memory leak and where?
    It is possible that the application genuinely needs a lot of memory to do its work. In such cases, the best approach is to modify the code/logic so that the memory requirement is lowered. You can use the tools below to investigate whether there is a memory leak:
    > Visual Studio Diagnostic tool, OR
    > Command line tools

    The tools above would help you inspect the objects in the dotnet heap, including their count and size.
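
    Alongside those tools, a few counters from the runtime itself can help check whether the GC is really the bottleneck. The sketch below is an illustration only: `GcStats` is a hypothetical helper name, and it assumes .NET 5 or later, where `GCMemoryInfo.PauseTimePercentage` is available.

    ```csharp
    using System;

    // Hypothetical helper (not from the original post): captures a few GC
    // counters that can be logged before and after processing a message.
    public static class GcStats
    {
        public static string Snapshot()
        {
            // Information about the most recent garbage collection.
            GCMemoryInfo info = GC.GetGCMemoryInfo();
            const long MB = 1024 * 1024;

            return $"heap={info.HeapSizeBytes / MB} MB, " +
                   // Total bytes allocated over the lifetime of the process.
                   $"allocated={GC.GetTotalAllocatedBytes(false) / MB} MB, " +
                   // Number of collections per generation so far.
                   $"gen0={GC.CollectionCount(0)}, " +
                   $"gen1={GC.CollectionCount(1)}, " +
                   $"gen2={GC.CollectionCount(2)}, " +
                   // Share of process time spent paused for GC.
                   $"pause={info.PauseTimePercentage:F1}%";
        }
    }
    ```

    If the gen2 count and the pause percentage climb quickly while 20-30 pages are being processed, that would support the guess that GC pauses are slowing the threads down; if they stay low, the cause is more likely elsewhere.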

    How does the garbage collector work?
    Understanding the Fundamentals of garbage collection would help you determine whether the objects identified above can be reclaimed when the GC runs, especially the concept of rooted objects.
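
    As a rough illustration of what "rooted" means, the sketch below allocates two buffers and keeps only one of them reachable from a static list. All names here (`RootingDemo`, `Cache`, `Allocate`) are hypothetical and chosen for this example only.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Runtime.CompilerServices;

    // Hypothetical demo (not from the original post) of rooted vs. unrooted objects.
    public static class RootingDemo
    {
        // A static field is a GC root: anything reachable from it cannot be
        // reclaimed for as long as the reference is held.
        private static readonly List<byte[]> Cache = new List<byte[]>();

        // NoInlining keeps the local 'buffer' out of the caller's stack frame,
        // so the only strong reference left afterwards is the one in Cache.
        [MethodImpl(MethodImplOptions.NoInlining)]
        private static WeakReference Allocate(bool root)
        {
            var buffer = new byte[1024 * 1024];
            if (root)
            {
                Cache.Add(buffer);
            }
            // A weak reference observes the object without keeping it alive.
            return new WeakReference(buffer);
        }

        public static (bool rootedAlive, bool unrootedAlive) Run()
        {
            WeakReference rooted = Allocate(root: true);
            WeakReference unrooted = Allocate(root: false);

            GC.Collect();
            GC.WaitForPendingFinalizers();
            GC.Collect();

            return (rooted.IsAlive, unrooted.IsAlive);
        }
    }
    ```

    The rooted buffer is still alive after the collection because the static list references it. The unrooted buffer is eligible for collection and will typically be gone, although the exact timing is up to the runtime. Parsed HTML documents held in static fields or long-lived caches behave like the rooted buffer.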

    Breaking the entire FunctionApp into smaller functionApp could be a good option here, as mentioned above.
    Also, you may investigate any static classes being used, since their data stays alive until the application closes. Try to minimize the static classes in the FunctionApp. You may also convert the function classes generated by the template from static to non-static, and they will still work.

    Finally, for test purposes only, you can call GC.Collect() to see how much memory would be reclaimed when the GC next runs. For more details, please refer to the GC.Collect Method documentation. This should be done for test purposes only, as it disturbs the GC's internal heuristics.
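
    A test-only measurement along those lines could look like the sketch below. `GcReclaimProbe` is a hypothetical helper name; `GC.GetTotalMemory`, `GC.Collect` and `GC.WaitForPendingFinalizers` are standard .NET APIs.

    ```csharp
    using System;

    // Hypothetical helper (not from the original post). For diagnostics only:
    // forcing collections in production disturbs the GC's own heuristics.
    public static class GcReclaimProbe
    {
        // Returns the managed heap size in bytes before and after a forced
        // full collection.
        public static (long before, long after) Measure()
        {
            long before = GC.GetTotalMemory(forceFullCollection: false);

            // Collect, let finalizers run, then collect what they released.
            GC.Collect();
            GC.WaitForPendingFinalizers();
            GC.Collect();

            long after = GC.GetTotalMemory(forceFullCollection: false);
            return (before, after);
        }
    }
    ```

    If `after` stays close to `before`, most of the memory is still rooted and the investigation should focus on what holds those references. A large drop suggests the memory was garbage that the GC simply had not reclaimed yet.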

    There are also some articles on the web on similar topics to help reduce memory consumption like:
    > Optimising C# for a serverless environment
    > Increasing performance via low memory allocation in C#

    Please let me know if you have any questions.


    Please 'Accept as answer' and ‘Upvote’ if it helped so that it can help others in the community looking for help on similar topics.

