Is receiving Prometheus KubeAggregatedAPIErrors alerts a sign of an unhealthy AKS cluster?

Question

AlexandraGroschner-3808 40

Following setup:

AKS cluster, version 1.30.10
Deployment of kube-prometheus-stack (https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack), version 57.2.0
- with the Alertmanager deployment enabled and the default rule set (some rules of which I have already disabled)

For about 2 weeks I have frequently been getting alerts of type "KubeAggregatedAPIErrors" with a description like "Kubernetes aggregated API v1beta1.metrics.k8s.io/default has reported errors. It has appeared unavailable 125.3 times averaged over the past 10m."

The error then resolves itself but triggers again.

I found https://github.com/prometheus-community/helm-charts/issues/3539 which, among other things, suggests that restarts of the metrics server (due to a lack of resources) could be the problem, but that is not the case here.

Some comments in the issue also mention that people contacted Azure support and were told it is due to "normal" control plane activities.

But this now happens on 2 of my 4 clusters, and I wanted to check whether it indicates a problem or not.

Thanks in advance for any tip!

Siva Pavuluri 570 Reputation points Microsoft External Staff Moderator

2025-06-13T01:29:37.1833333+00:00

Hi Alexandra Groschner,

Just checking in to see if you have got a chance to see my response to your question in resolving the issue.

If the answer is helpful, please click "Accept Answer" and "Upvote it" as it can be helpful to others in the community.

Thank You.

Siva Pavuluri 570 Reputation points Microsoft External Staff Moderator

2025-06-16T01:35:28.3333333+00:00

Hi Alexandra Groschner,

Just checking in to see if you have got a chance to see my response to your question in resolving the issue.

If the answer is helpful, please click "Accept Answer" and "Upvote it" as it can be helpful to others in the community.

Thank You.

Accepted answer

Answer 1

Hi Alexandra Groschner,

The alert is triggered when Prometheus detects errors or unavailability in aggregated API endpoints, averaged over a specific period (e.g., 10 minutes).
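To see what the alert is measuring, the underlying counter can be queried from Prometheus directly. The following is a minimal sketch, not the chart's actual rule: it assumes the alert is based on the `aggregator_unavailable_apiservice_total` counter over a 10 minute window with a threshold of 4 (please verify against the PrometheusRule objects installed by your chart version), and it assumes Prometheus is reachable at a hypothetical port-forwarded address.

```python
import json
import urllib.parse
import urllib.request

# Assumed expression, modelled on the upstream kubernetes-mixin rule.
# Verify it against the PrometheusRule objects in your own cluster.
QUERY = (
    "sum by (name, namespace) "
    "(increase(aggregator_unavailable_apiservice_total[10m]))"
)
THRESHOLD = 4.0  # assumed alert threshold


def build_query_url(base_url: str, query: str = QUERY) -> str:
    """Return the Prometheus instant-query URL for the given expression."""
    params = urllib.parse.urlencode({"query": query})
    return base_url.rstrip("/") + "/api/v1/query?" + params


def flapping_apiservices(response: dict, threshold: float = THRESHOLD) -> dict:
    """Map APIService name -> value for every series above the threshold."""
    result = {}
    for series in response.get("data", {}).get("result", []):
        name = series["metric"].get("name", "<unknown>")
        value = float(series["value"][1])  # instant vector: [timestamp, "value"]
        if value > threshold:
            result[name] = value
    return result


def fetch(base_url: str) -> dict:
    """Run the query against a live Prometheus (needs network access)."""
    with urllib.request.urlopen(build_query_url(base_url), timeout=10) as resp:
        return json.load(resp)


# Example (hypothetical address, e.g. after a kubectl port-forward):
#   print(flapping_apiservices(fetch("http://localhost:9090")))
```

If only v1beta1.metrics.k8s.io shows up, and only in short bursts, that matches the pattern described in the question.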

In your case, the API v1beta1.metrics.k8s.io is provided by the Kubernetes metrics-server, which is crucial for collecting resource metrics (CPU/memory) across the cluster. If the metrics-server pod is restarting or under-resourced, it can temporarily become unavailable, triggering this alert. However, you've noted that this is not happening in your clusters.
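To rule out metrics-server restarts, the restart counters of its pods can be read from the pod status. The sketch below assumes the pods run in the kube-system namespace and carry the label `k8s-app=metrics-server`; both are assumptions, so adjust them if your cluster differs. The helper names are hypothetical.

```python
import json
import subprocess


def restart_counts(pod_list: dict) -> dict:
    """Map pod name -> total container restarts.

    Expects the document produced by `kubectl get pods -o json`.
    """
    counts = {}
    for pod in pod_list.get("items", []):
        statuses = pod.get("status", {}).get("containerStatuses", [])
        counts[pod["metadata"]["name"]] = sum(
            s.get("restartCount", 0) for s in statuses
        )
    return counts


def metrics_server_pods(namespace: str = "kube-system",
                        selector: str = "k8s-app=metrics-server") -> dict:
    """Fetch the metrics-server pods via kubectl (namespace/label are assumed)."""
    out = subprocess.run(
        ["kubectl", "get", "pods", "-n", namespace, "-l", selector, "-o", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)


# Example: print(restart_counts(metrics_server_pods()))
```

Counts that stay constant while the alert keeps firing suggest the unavailability is not caused by the pods restarting.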

If the alert resolves on its own and there are no visible impacts on workload performance or metrics collection, it is generally considered a transient or benign issue. This is especially true in managed environments like AKS, where the control plane components are not under direct user control (see the Kubernetes apiserver-aggregation documentation).
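One way to confirm that the API is healthy between alerts is to look at the Available condition that the API server records on the APIService object. This is a small sketch with a hypothetical helper name; it only reads the `status.conditions` list from the JSON returned by `kubectl get apiservice v1beta1.metrics.k8s.io -o json`.

```python
def available_condition(apiservice: dict):
    """Return (status, reason, message) of the Available condition, or None.

    Expects the document produced by
    `kubectl get apiservice v1beta1.metrics.k8s.io -o json`.
    """
    for cond in apiservice.get("status", {}).get("conditions", []):
        if cond.get("type") == "Available":
            return (
                cond.get("status"),       # "True", "False" or "Unknown"
                cond.get("reason", ""),
                cond.get("message", ""),
            )
    return None  # condition not reported
```

A status of "True" most of the time, with only brief flips to "False", would be consistent with the transient behaviour described above. Whether that is acceptable still depends on whether autoscaling or `kubectl top` are visibly affected.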

If you found this information helpful, please click "Upvote" on the post to let us know.

If you have any further queries, feel free to ask; we are happy to assist you.

Thank You.
