Hello Abdul Aziz,
Welcome to Microsoft Q&A, and thank you for posting your question here.
Problem
I understand that you are seeing discrepancies between the memory usage reported by kubectl top nodes and the utilization metrics used by the Descheduler in your AKS (Azure Kubernetes Service) deployment.
Solution
After analyzing the output and information you provided, there are two likely causes:
- The metrics are collected and reported differently: the Descheduler computes node utilization from the pods' resource requests, while kubectl top nodes reports actual usage.
- The pods may not have resource requests and limits set, which causes the Descheduler to underestimate their actual resource usage.
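To make the first cause concrete, here is a minimal sketch of the two calculations. The numbers are hypothetical and not taken from your cluster, and the pct helper is only an illustration. It shows how a node can look lightly loaded when only requests are summed, yet heavily loaded when actual usage is measured.

```shell
#!/bin/sh
# Hypothetical numbers for one node (all values in Mi); replace with your own.
allocatable=8192      # allocatable memory, from `kubectl describe node`
requested=2048        # sum of pod memory requests on the node
used=6144             # actual usage, as shown by `kubectl top nodes`

# pct PART WHOLE: print PART as a whole-number percentage of WHOLE.
pct() {
  awk -v p="$1" -v w="$2" 'BEGIN { printf "%d\n", (p * 100) / w }'
}

echo "request-based utilization: $(pct "$requested" "$allocatable")%"
echo "usage-based utilization:   $(pct "$used" "$allocatable")%"
```

With these sample values the script prints 25% for the request-based figure and 75% for the usage-based figure, which is the kind of gap described above.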
To address these causes, you can do the following:
STAGE 1
- Ensure that the metrics server is correctly reporting resource usage.
kubectl get apiservice v1beta1.metrics.k8s.io -o yaml
kubectl get pods -n kube-system -l k8s-app=metrics-server
kubectl logs -n kube-system <metrics-server-pod>
- Compare the metrics reported by kubectl top nodes, kubectl top pods, and the Descheduler to identify any inconsistencies.
kubectl top nodes
kubectl top pods
kubectl describe nodes
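Because kubectl describe nodes prints a lot of output, a small filter can isolate the request-based totals, which are the figures to set against kubectl top nodes. This is only a sketch: the allocated helper name is made up for illustration, and it assumes the output contains an "Allocated resources:" heading followed by a short table and then an "Events:" heading, which is what recent kubectl versions print.

```shell
# allocated: read `kubectl describe node` output on stdin and keep only the
# "Allocated resources" block (the heading line plus the lines that follow it,
# stopping before the next "Events:" heading).
allocated() {
  awk '/^Allocated resources:/ { show = 1 } /^Events:/ { show = 0 } show'
}

# Usage (needs cluster access; <node-name> is a placeholder):
#   kubectl describe node <node-name> | allocated
#   kubectl top node <node-name>
```

Comparing the two outputs side by side for the same node shows how far the requested totals are from the measured usage.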
STAGE 2
- Ensure that all pods have appropriate resource requests and limits set. This will help the Descheduler to make accurate decisions.
apiVersion: v1
kind: Pod
metadata:
  name: example-pod
spec:
  containers:
  - name: example-container
    image: nginx
    resources:
      requests:
        memory: "128Mi"
        cpu: "500m"
      limits:
        memory: "256Mi"
        cpu: "1000m"
Then apply the changes to the cluster:
kubectl apply -f example-pod.yaml
Here, example-pod.yaml is the file that contains the YAML above.
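To find which pods still need requests, one option is to list every pod's memory request and filter for the empty ones. In this sketch, the missing_requests helper and the column names are my own choices. The custom-columns output format is a standard kubectl feature and, as far as I know, prints <none> where the field is absent, so treat the filter as a starting point and not a complete audit.

```shell
# missing_requests: read a whitespace-separated table on stdin, skip the
# header row, and print "namespace/name" for rows whose last column (the
# memory request) is "<none>", i.e. pods where no memory request was found.
missing_requests() {
  awk 'NR > 1 && $NF == "<none>" { print $1 "/" $2 }'
}

# Usage (needs cluster access):
#   kubectl get pods --all-namespaces \
#     -o custom-columns='NAMESPACE:.metadata.namespace,NAME:.metadata.name,MEM_REQ:.spec.containers[*].resources.requests.memory' \
#     | missing_requests
```

Pods with several containers may need a closer look, since a request set on one container can hide a missing request on another in this view.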
Finally
After setting the resource requests and limits, monitor the metrics to confirm that they align:
kubectl top nodes
kubectl top pods --all-namespaces
kubectl describe nodes
If you ensure that the metrics server is functioning correctly, set appropriate resource requests and limits, and monitor the metrics consistently, you will be able to align the resource usage reported by kubectl top nodes, the Descheduler, and kubectl describe node. This alignment is what allows the Descheduler to make accurate decisions.
References
For more detailed instructions and the sources for the above solutions, kindly use the following links:
Source: Resource Requests and Limits. Accessed, 6/13/2024.
Source: Kubernetes Metrics Server. Accessed, 6/13/2024.
Source: Azure Kubernetes Service (AKS) Documentation. Accessed, 6/13/2024.
Source: Kubernetes Descheduler. Accessed, 6/13/2024.
Source: Setting Resource Requests and Limits. Accessed, 6/13/2024.
Source: Monitoring and Troubleshooting Metrics Server. Accessed, 6/13/2024.
Accept Answer
I hope this is helpful! Do not hesitate to let me know if you have any other questions.
Please don't forget to close out the thread by upvoting and accepting this as the answer if it is helpful, so that others in the community facing similar issues can easily find the solution.
Best Regards,
Sina Salam