Memory saturation occurs in pods after cluster upgrade to Kubernetes 1.25

This article discusses how to fix pods that stop working because of memory saturation or out-of-memory (OOM) errors that occur after you upgrade a Microsoft Azure Kubernetes Service (AKS) cluster to Kubernetes 1.25.x.

Symptoms

One or more of the following issues occur:

  • Memory pressure on nodes

  • Increased memory usage for apps when compared to their memory usage before the upgrade

  • CPU throttling on nodes

  • Pod failure because of OOM errors

Performance degradation can occur in apps that run in the following environments:

Note

This list of environments isn't comprehensive. Other environments might also experience memory saturation or OOM issues.

Solution

Beginning with Kubernetes 1.25, the cgroup version 2 API is generally available (GA). AKS now uses Ubuntu Linux version 22.04, which uses the cgroup version 2 API by default. Applications and runtimes that aren't aware of cgroup version 2 might misread the memory limits of their containers, which can cause the symptoms that are listed earlier. To prevent the memory saturation issue, make sure that your other environments support the cgroup version 2 API by following this guidance:
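To confirm which cgroup version a workload actually sees, it can help to inspect the cgroup filesystem from inside a pod. The following Python code is a minimal sketch, not an official AKS tool. It assumes the standard Linux mount point /sys/fs/cgroup, where a cgroup version 2 hierarchy exposes the cgroup.controllers and memory.max files and a version 1 hierarchy exposes memory/memory.limit_in_bytes. The function names detect_cgroup_version and read_memory_limit are hypothetical.

```python
from pathlib import Path


def detect_cgroup_version(root="/sys/fs/cgroup"):
    """Return 2 for a cgroup v2 hierarchy, 1 for cgroup v1, or None if unknown."""
    base = Path(root)
    # cgroup v2 (unified hierarchy) exposes cgroup.controllers at its root.
    if (base / "cgroup.controllers").is_file():
        return 2
    # cgroup v1 mounts each controller separately, for example "memory".
    if (base / "memory" / "memory.limit_in_bytes").is_file():
        return 1
    return None


def read_memory_limit(root="/sys/fs/cgroup"):
    """Return the memory limit in bytes, or None if it is unlimited or unknown."""
    base = Path(root)
    version = detect_cgroup_version(root)
    if version == 2:
        limit_file = base / "memory.max"
        if not limit_file.is_file():
            return None
        raw = limit_file.read_text().strip()
        # cgroup v2 reports the literal string "max" when no limit is set.
        return None if raw == "max" else int(raw)
    if version == 1:
        # cgroup v1 reports a very large number when no limit is set;
        # this sketch returns that number unchanged.
        raw = (base / "memory" / "memory.limit_in_bytes").read_text().strip()
        return int(raw)
    return None


if __name__ == "__main__":
    print("cgroup version:", detect_cgroup_version())
    print("memory limit (bytes):", read_memory_limit())
```

If the script reports version 2 but the application runtime still sizes its heap from the node's total memory, the runtime is probably not cgroup version 2 aware.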

In addition, to enable pods to use more resources, increase their memory requests and limits.
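As an illustration of raising memory requests and limits, the following Python code is a minimal sketch that builds a patch document for a Deployment. It makes several assumptions: the workload is a Deployment, the container name and the current values (my-app, 256Mi, 512Mi) are placeholders, the scaling factor of 1.5 is arbitrary, and only whole-number quantities that use the Ki, Mi, Gi, or Ti suffixes are handled. The helper names scale_memory and build_memory_patch are hypothetical.

```python
import json
import math
import re

# Binary suffixes handled by this sketch (a subset of Kubernetes quantities).
_BINARY_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def scale_memory(quantity, factor):
    """Scale a quantity such as "512Mi" by factor, rounded up to whole Mi."""
    match = re.fullmatch(r"(\d+)(Ki|Mi|Gi|Ti)", quantity)
    if match is None:
        raise ValueError(f"unsupported memory quantity: {quantity!r}")
    size_bytes = int(match.group(1)) * _BINARY_UNITS[match.group(2)]
    return f"{math.ceil(size_bytes * factor / 2**20)}Mi"


def build_memory_patch(container_name, request, limit, factor):
    """Build a patch body that raises one container's memory request and limit."""
    return {
        "spec": {
            "template": {
                "spec": {
                    "containers": [
                        {
                            # Containers are matched by name when the patch is merged.
                            "name": container_name,
                            "resources": {
                                "requests": {"memory": scale_memory(request, factor)},
                                "limits": {"memory": scale_memory(limit, factor)},
                            },
                        }
                    ]
                }
            }
        }
    }


if __name__ == "__main__":
    # Placeholder container name and current values; replace with your own.
    patch = build_memory_patch("my-app", "256Mi", "512Mi", 1.5)
    print(json.dumps(patch, indent=2))
```

The printed JSON could be saved to a file and applied by using kubectl patch, or the same values could be edited directly in the Deployment manifest. Choose the factor based on the memory usage that you observe after the upgrade instead of treating 1.5 as a recommendation.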

Third-party information disclaimer

The third-party products that this article discusses are manufactured by companies that are independent of Microsoft. Microsoft makes no warranty, implied or otherwise, about the performance or reliability of these products.

Third-party contact disclaimer

Microsoft provides third-party contact information to help you find additional information about this topic. This contact information may change without notice. Microsoft does not guarantee the accuracy of third-party contact information.

Contact us for help

If you have questions or need help, create a support request, or ask Azure community support. You can also submit product feedback to the Azure feedback community.