How to check whether an Azure Kubernetes Service cluster is performing well?

Syed Mohammed Nusrath 40 Reputation points
2024-09-30T08:20:06.3066667+00:00

How can I check whether an Azure Kubernetes Service cluster is performing well?

Azure Kubernetes Service (AKS)
An Azure service that provides serverless Kubernetes, an integrated continuous integration and continuous delivery experience, and enterprise-grade security and governance.

Accepted answer
  1. Vinodh247 20,476 Reputation points
    2024-09-30T13:59:43.31+00:00

    Hi Syed Mohammed Nusrath,

    Thanks for reaching out to Microsoft Q&A.

    To assess the performance of an Azure Kubernetes Service (AKS) cluster, you can combine several monitoring tools and techniques. Here’s a structured approach to checking whether your AKS cluster is performing well:

    Monitoring Tools and Techniques

    Azure Monitor and Container Insights:

    • Enable Container Insights: This feature provides detailed monitoring of container workloads. You can enable it during cluster creation or afterward through the Azure portal.
    • Access Metrics: Navigate to your AKS cluster in the Azure portal, go to Monitoring, and select Insights. Here, you can view CPU and memory usage metrics for nodes and containers. Set appropriate time ranges to analyze trends over time.
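
    If you prefer the command line to the portal, enabling Container Insights on an existing cluster can be scripted with the Azure CLI. The following is a minimal sketch rather than an official procedure: the function name enable_container_insights and the resource group and cluster names are placeholders, and it assumes the Azure CLI is installed and already logged in.

```shell
# Hypothetical helper: enable the Container Insights (monitoring) add-on
# on an existing AKS cluster.
# $1 = resource group name, $2 = cluster name (both placeholders).
enable_container_insights() {
  az aks enable-addons \
    --addons monitoring \
    --resource-group "$1" \
    --name "$2"
}

# Example (placeholder names):
#   enable_container_insights my-resource-group my-aks-cluster
```

    Once the add-on is enabled, it can take a few minutes before data shows up under Monitoring > Insights.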

    Use kubectl Commands:

    Check Node and Pod Performance: Use the following commands to get real-time metrics:

    kubectl top nodes

    kubectl top pods --all-namespaces

    These commands display the current CPU and memory usage for each node and pod, which helps identify resource-intensive components. They rely on the Kubernetes Metrics Server, which AKS deploys by default.
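
    To turn the raw output into a quick check, you could filter it for nodes above a usage threshold. This is a rough sketch: hot_nodes is a made-up helper name, the 80% default is an arbitrary example, and it assumes the usual column layout of kubectl top nodes (NAME, CPU(cores), CPU%, MEMORY(bytes), MEMORY%).

```shell
# Hypothetical helper: read `kubectl top nodes --no-headers` output on stdin
# and print nodes whose CPU% or MEMORY% is at or above a threshold.
# $1 = threshold in percent (defaults to 80, an arbitrary example value).
hot_nodes() {
  threshold="${1:-80}"
  awk -v t="$threshold" '
    {
      cpu = $3; mem = $5          # assumed columns: CPU% and MEMORY%
      sub(/%/, "", cpu); sub(/%/, "", mem)
      if (cpu + 0 >= t || mem + 0 >= t) print $1, $3, $5
    }'
}

# Example:
#   kubectl top nodes --no-headers | hot_nodes 80
```

    For pods, kubectl top pods --all-namespaces --sort-by=cpu (or --sort-by=memory) lists the heaviest consumers first.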

    Analyze Logs with Log Analytics

    Log Analytics Workspace: Connect your AKS logs to a Log Analytics workspace. This allows you to run queries on logs for deeper insights into performance issues.

    • Predefined Queries: Use built-in queries to assess node readiness, pod status, and other critical metrics.
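
    As an illustration of a node readiness query, the sketch below runs a log query from the Azure CLI. Treat it as an assumption-heavy example: query_node_status is a made-up name, the workspace ID is a placeholder for your workspace's GUID, and the query assumes Container Insights is populating the KubeNodeInventory table with a Status column. Depending on your CLI version, az monitor log-analytics query may require the log-analytics extension.

```shell
# Hypothetical helper: show the most recent reported status of each node.
# $1 = Log Analytics workspace ID (GUID), a placeholder supplied by you.
query_node_status() {
  az monitor log-analytics query \
    --workspace "$1" \
    --analytics-query 'KubeNodeInventory | where TimeGenerated > ago(15m) | summarize arg_max(TimeGenerated, Status) by Computer' \
    --output table
}

# Example (placeholder ID):
#   query_node_status 00000000-0000-0000-0000-000000000000
```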

    Resource Health Monitoring

    • Check Resource Health: In the Azure portal, use the Resource Health feature to monitor the overall health of your AKS resources. This tool provides status reports indicating whether your resources are available, degraded, or unavailable.

    Alerts and Notifications

    • Set Up Alerts: Configure alerts based on specific metrics (e.g., CPU usage thresholds). This proactive approach helps in identifying performance degradation before it impacts applications.
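
    As an example of a CPU threshold alert, the sketch below creates a metric alert with the Azure CLI. The function name, the alert name aks-node-cpu-high, the 80% default, and the 5-minute window are illustrative choices, not recommendations from the original answer. It assumes the cluster exposes the node_cpu_usage_percentage platform metric and that you pass the cluster's full resource ID.

```shell
# Hypothetical helper: create a metric alert on average node CPU usage.
# $1 = resource group, $2 = AKS cluster resource ID,
# $3 = threshold in percent (defaults to 80, an arbitrary example value).
create_node_cpu_alert() {
  resource_group="$1"
  cluster_id="$2"
  threshold="${3:-80}"
  az monitor metrics alert create \
    --name "aks-node-cpu-high" \
    --resource-group "$resource_group" \
    --scopes "$cluster_id" \
    --condition "avg node_cpu_usage_percentage > $threshold" \
    --window-size 5m \
    --evaluation-frequency 1m \
    --description "Average node CPU above ${threshold}%"
}

# Example (placeholder names):
#   id=$(az aks show -g my-resource-group -n my-aks-cluster --query id -o tsv)
#   create_node_cpu_alert my-resource-group "$id" 80
```

    As written, the alert only changes state in Azure Monitor; to receive notifications you would also attach an action group.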

    Key Performance Indicators (KPIs) to Monitor

    CPU Usage: Watch for sustained high CPU usage, which can indicate saturation.

    Memory Usage: Monitor memory utilization to avoid out-of-memory errors.

    Pod Status: Ensure that pods are in the Running state; investigate any that are Pending or Failed.

    Node Availability: Check whether any nodes are in the NotReady state. New pods cannot be scheduled on such nodes, which reduces capacity and can affect application performance.
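
    The pod status and node availability checks above can be approximated from the command line. This is a minimal sketch: unhealthy_pods and not_ready_nodes are made-up helper names, and both assume the default column layout of kubectl get with --no-headers.

```shell
# Hypothetical helper: read `kubectl get pods --all-namespaces --no-headers`
# on stdin and print pods that are neither Running nor Completed.
# Assumed columns: NAMESPACE NAME READY STATUS RESTARTS AGE.
unhealthy_pods() {
  awk '$4 != "Running" && $4 != "Completed" { print $1 "/" $2, $4 }'
}

# Hypothetical helper: read `kubectl get nodes --no-headers` on stdin
# and print nodes whose STATUS column does not start with "Ready".
not_ready_nodes() {
  awk '$2 !~ /^Ready/ { print $1, $2 }'
}

# Example:
#   kubectl get pods --all-namespaces --no-headers | unhealthy_pods
#   kubectl get nodes --no-headers | not_ready_nodes
```

    Empty output from both commands suggests that every pod is Running or Completed and every node reports Ready.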

    Best Practices:

    Resource Requests and Limits: Set appropriate resource requests and limits for your containers to optimize performance and prevent resource contention.

    Horizontal Pod Autoscaler (HPA): Implement HPA to automatically adjust the number of pods in response to demand, ensuring better resource utilization.
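
    Both practices can be applied to an existing Deployment with kubectl. The sketch below is only an illustration: apply_scaling_defaults is a made-up name, and the CPU and memory figures, the 70% target, and the 2 to 10 replica range are placeholder values that you would replace with numbers measured for your own workload. Newer kubectl releases may prefer a different flag than --cpu-percent for the CPU target.

```shell
# Hypothetical helper: set requests/limits on a Deployment, then add an HPA.
# $1 = deployment name, $2 = namespace (defaults to "default").
# All resource figures below are placeholder values, not recommendations.
apply_scaling_defaults() {
  deployment="$1"
  namespace="${2:-default}"
  kubectl set resources deployment "$deployment" -n "$namespace" \
    --requests=cpu=250m,memory=256Mi \
    --limits=cpu=500m,memory=512Mi &&
  kubectl autoscale deployment "$deployment" -n "$namespace" \
    --cpu-percent=70 --min=2 --max=10
}

# Example (placeholder names):
#   apply_scaling_defaults web shop
```

    The CPU-based HPA target is expressed as a percentage of the container's CPU request, which is why the requests are set first.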

    For further reading:

    https://learn.microsoft.com/en-us/troubleshoot/azure/azure-kubernetes/availability-performance/identify-high-cpu-consuming-containers-aks

    https://stackoverflow.com/questions/57096956/how-to-monitor-the-azure-kubernetes-cluster-resource-status-in-azure-portal

    https://learn.microsoft.com/vi-vn/azure/aks/monitor-aks

    https://learn.microsoft.com/en-us/azure/architecture/operator-guides/aks/aks-triage-cluster-health

    https://learn.microsoft.com/en-us/azure/aks/aks-diagnostics

