This article discusses how to resolve an issue of increased reported memory usage in Microsoft Azure Kubernetes Service (AKS) clusters that run Kubernetes 1.25 or a later version.
Symptoms
You experience one or more of the following symptoms:
Pods report increased memory usage after you upgrade an AKS cluster to Kubernetes 1.25 or a later version.
When you run the kubectl top node command, a node reports higher memory usage than it did in earlier versions of Kubernetes.
Increased pod evictions and memory pressure occur within a node.
Cause
This increase is caused by a change in memory accounting within version 2 of the Linux control group (cgroup) API. Cgroup v2 is now the default cgroup version for Kubernetes 1.25 on AKS.
Note
This issue is distinct from the memory saturation in nodes that's caused by applications or frameworks that aren't aware of cgroup v2. For more information, see Memory saturation occurs in pods after cluster upgrade to Kubernetes 1.25.
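To confirm which cgroup version a node is using, you can check the type of the file system that's mounted at /sys/fs/cgroup from a pod that runs on that node. The following manifest is a minimal sketch; the pod name, the busybox image, and the <node-name> placeholder are illustrative choices, not values that this article prescribes.

```yaml
# Illustrative check pod: prints "cgroup2fs" on a cgroup v2 node and
# "tmpfs" on a cgroup v1 node, then exits.
apiVersion: v1
kind: Pod
metadata:
  name: cgroup-version-check   # hypothetical name
spec:
  nodeName: <node-name>        # pin the pod to the node that you want to inspect
  restartPolicy: Never
  containers:
  - name: check
    image: busybox:1.36        # any image that includes the stat utility works
    command: ["stat", "-fc", "%T", "/sys/fs/cgroup/"]
```

After the pod finishes, run kubectl logs cgroup-version-check to view the detected file system type.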
Solution
If you observe frequent memory pressure on the nodes, use a larger virtual machine (VM) size for your node pools to increase the amount of memory that's available to the nodes.
If you see a higher eviction rate on the pods, set higher memory requests and limits for those pods.
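For example, the following pod spec is a minimal sketch of raising memory requests and limits. The pod name, the image placeholder, and the specific values are illustrative and should be sized for your workload.

```yaml
# Illustrative pod spec with explicit memory requests and limits.
apiVersion: v1
kind: Pod
metadata:
  name: memory-tuned-app            # hypothetical name
spec:
  containers:
  - name: app
    image: <your-application-image> # placeholder
    resources:
      requests:
        memory: "512Mi"   # higher request so that the scheduler reserves enough memory
      limits:
        memory: "1Gi"     # higher limit to reduce out-of-memory kills and evictions
```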
cgroup v2 uses a different API than cgroup v1. If any applications directly access the cgroup file system, update them to later versions that support cgroup v2. For example:
- Third-party monitoring and security agents: Some monitoring and security agents depend on the cgroup file system. Update these agents to versions that support cgroup v2.
- Java applications: Use versions that fully support cgroup v2:
  - OpenJDK/HotSpot: jdk8u372, 11.0.16, 15, and later versions.
  - IBM Semeru Runtimes: 8.0.382.0, 11.0.20.0, 17.0.8.0, and later versions.
  - IBM Java: 8.0.8.6 and later versions.
- uber-go/automaxprocs: If you're using the uber-go/automaxprocs package, make sure that the version is v1.5.1 or later.
An alternative temporary solution is to revert the cgroup version on your nodes by using a DaemonSet. For more information, see Revert to cgroup v1 DaemonSet.
Important
- Use the DaemonSet cautiously. Test it in a lower environment before you apply it to production to make sure that it's compatible and to prevent disruptions.
- By default, the DaemonSet applies to all nodes in the cluster and reboots them to implement the cgroup change.
- To control how the DaemonSet is applied, configure a nodeSelector to target specific nodes, as shown in the example that follows this list.
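The following fragment is a minimal sketch of how a nodeSelector restricts a DaemonSet to labeled nodes. The label key and value are hypothetical; replace them with a label that you apply to the nodes that you want to target, and use the image from the Revert to cgroup v1 DaemonSet.

```yaml
# Illustrative DaemonSet fragment: schedules pods only onto nodes that carry a matching label.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: revert-cgroup-v1        # hypothetical name
spec:
  selector:
    matchLabels:
      app: revert-cgroup-v1
  template:
    metadata:
      labels:
        app: revert-cgroup-v1
    spec:
      nodeSelector:
        cgroup/revert: "true"   # hypothetical label; apply it by running "kubectl label node <node-name> cgroup/revert=true"
      containers:
      - name: revert
        image: <daemonset-image> # placeholder; use the image from the Revert to cgroup v1 DaemonSet
```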
Note
If you experience only an increase in memory use without any of the other symptoms that are mentioned in the "Symptoms" section, you don't have to take any action.
Status
We're actively working with the Kubernetes community to resolve the underlying issue. Progress on this effort can be tracked at Azure/AKS Issue #3443.
As part of the resolution, we plan to adjust the eviction thresholds or update resource reservations, depending on the outcome of the fix.
Reference
- Node memory usage on cgroupv2 reported higher than cgroupv1 (GitHub issue)
Third-party information disclaimer
The third-party products that this article discusses are manufactured by companies that are independent of Microsoft. Microsoft makes no warranty, implied or otherwise, about the performance or reliability of these products.
Third-party contact disclaimer
Microsoft provides third-party contact information to help you find additional information about this topic. This contact information may change without notice. Microsoft does not guarantee the accuracy of third-party contact information.
Contact us for help
If you have questions, you can ask Azure community support. You can also submit product feedback to Azure feedback community.