This article discusses how to resolve an issue of increased reported memory usage in Microsoft Azure Kubernetes Service clusters that run Kubernetes 1.25 or a later version.
Symptoms
You experience one or more of the following symptoms:
Pods report increased memory usage after you upgrade a Microsoft Azure Kubernetes Service (AKS) cluster to Kubernetes 1.25 or a later version.
A node reports memory usage that's greater than in earlier versions of Kubernetes when you run the kubectl top node command.
Increased pod evictions and memory pressure occur within a node.
Cause
This increase is caused by a change in memory accounting within version 2 of the Linux control group (cgroup) API. Cgroup v2 is now the default cgroup version for Kubernetes 1.25 on AKS.
Note
This issue is distinct from the memory saturation in nodes that's caused by applications or frameworks that aren't aware of cgroup v2. For more information, see Memory saturation occurs in pods after cluster upgrade to Kubernetes 1.25.
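To confirm which cgroup version a node pool is using, you can run a quick check from a pod that's scheduled on one of its nodes. The following manifest is a minimal sketch (the pod name and image are examples only); it tests for the cgroup.controllers file, which is present only on the cgroup v2 (unified) hierarchy.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: cgroup-version-check        # example name
spec:
  restartPolicy: Never
  containers:
  - name: check
    image: busybox:1.36             # example image; any image that includes a shell works
    # cgroup.controllers exists only on cgroup v2, so its presence indicates the node's cgroup version.
    command: ["sh", "-c", "if [ -f /sys/fs/cgroup/cgroup.controllers ]; then echo cgroup v2; else echo cgroup v1; fi"]
```

Check the result by running kubectl logs cgroup-version-check. Nodes that run Kubernetes 1.25 or a later version report cgroup v2 unless the cgroup version was reverted.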
Solution
If you observe frequent memory pressure on the nodes, increase the amount of memory that's available to your nodes, for example by using a larger virtual machine (VM) size for the node pool.
If you see a higher pod eviction rate, set higher memory requests and limits for your pods.
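For example, the following pod manifest is a minimal sketch of setting memory requests and limits. The pod name, image, and memory values are placeholders; size them for your workload, based on the usage that you observe (for example, from kubectl top pod).

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: memory-tuned-app            # placeholder name
spec:
  containers:
  - name: app
    image: mcr.microsoft.com/azuredocs/aks-helloworld:v1   # placeholder image
    resources:
      requests:
        memory: "512Mi"             # memory that the scheduler reserves for the container
      limits:
        memory: "1Gi"               # the container is terminated if it exceeds this amount
```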
cgroup v2 uses a different API than cgroup v1. If there are any applications that directly access the cgroup file system, update them to later versions that support cgroup v2. For example:
- Third-party monitoring and security agents: Some monitoring and security agents depend on the cgroup file system. Update these agents to versions that support cgroup v2.
- Java applications: Use versions that fully support cgroup v2:
  - OpenJDK/HotSpot: jdk8u372, 11.0.16, 15, and later versions.
  - IBM Semeru Runtimes: 8.0.382.0, 11.0.20.0, 17.0.8.0, and later versions.
  - IBM Java: 8.0.8.6 and later versions.
- uber-go/automaxprocs: If you're using the uber-go/automaxprocs package, ensure that the version is v1.5.1 or later.
An alternative temporary solution is to revert the cgroup version on your nodes by using the DaemonSet. For more information, see Revert to cgroup v1 DaemonSet.
Important
- Use the DaemonSet cautiously. Test it in a lower environment before you apply it to production to ensure compatibility and prevent disruptions.
- By default, the DaemonSet applies to all nodes in the cluster and reboots them to implement the cgroup change.
- To control how the DaemonSet is applied, configure a nodeSelector to target specific nodes, as shown in the example that follows.
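For example, the following fragment is a sketch of a nodeSelector that you could add under spec.template.spec in the DaemonSet manifest. The label key and value are examples; apply the same label to the nodes that you want to target.

```yaml
# Fragment to add under spec.template.spec in the DaemonSet manifest
# (the label key/value pair is an example).
nodeSelector:
  cgroup-revert: "true"
```

Label the target nodes first, for example by running kubectl label nodes <node-name> cgroup-revert=true.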
Note
If you experience only an increase in memory use without any of the other symptoms that are mentioned in the "Symptoms" section, you don't have to take any action.
Status
We're actively working with the Kubernetes community to resolve the underlying issue. Progress on this effort can be tracked at Azure/AKS Issue #3443.
As part of the resolution, we plan to adjust the eviction thresholds or update resource reservations, depending on the outcome of the fix.
Reference
- Node memory usage on cgroupv2 reported higher than cgroupv1 (GitHub issue)
Third-party information disclaimer
The third-party products that this article discusses are manufactured by companies that are independent of Microsoft. Microsoft makes no warranty, implied or otherwise, about the performance or reliability of these products.
Third-party contact disclaimer
Microsoft provides third-party contact information to help you find additional information about this topic. This contact information may change without notice. Microsoft does not guarantee the accuracy of third-party contact information.
Contact us for help
If you have questions or need help, create a support request, or ask Azure community support. You can also submit product feedback to Azure feedback community.