KubeEvents Table Inconsistent Results

devopsfj 246 Reputation points
2024-01-18T16:31:48.9466667+00:00

Hello,
I am trying to set up Azure Alerts that fire when any of our Pods fail or crash. I am using the query below:

KubeEvents  
| where ObjectKind == 'Pod' 
| where Reason in ('BackOff', 'Unhealthy', 'CrashLoopBackOff')
| where Name contains "reporting" 
| project TimeGenerated, Name, ObjectKind, Reason, Message, Namespace, Count 
| order by TimeGenerated desc

This mostly works, but I have found one issue: after the initial log entry for a pod, no new entry is created; the Count on the existing entry is just incremented. Take the example below: [image]

The events in the example above actually occurred about 30 minutes apart. The liveness probe first failed at 15:53, but the subsequent failures, 30 minutes later, were still recorded against the time of the first failure, which is not accurate. Is this a bug?

For example, my pod became unhealthy at 15:53. I rectified the issue, and the pod was healthy for 30 minutes. I then forced a problem that made the liveness probe fail again, and Kubernetes logged the new failure at the same time as the first liveness probe failure (15:53).

This is a problem because I am trying to set up an alert that notifies us when a pod crashes. My query looks at the last 10 minutes, but because KubeEvents always attributes new failures to the first failure time, this will not work. Is there any way around this?

Azure Kubernetes Service (AKS)

2 answers

  1. devopsfj 246 Reputation points
    2024-01-18T16:48:54.6833333+00:00

    I was thinking of using LastSeen; however, this will not work, because when you restrict the results to the last 10 minutes, the filter goes off TimeGenerated.


  2. Anveshreddy Nimmala 3,550 Reputation points Microsoft Vendor
    2024-01-22T06:11:18.4833333+00:00

    Hello devopsfj,

    Welcome to Microsoft Q&A, and thank you for posting your query here.

    You are correct that using the LastSeen property will not work in this case, as the time-range filter is based on the TimeGenerated property and will not account for the behavior you are experiencing with the KubeEvents table.

    To work around this issue, you can try using the ContainerLog table instead of the KubeEvents table. The ContainerLog table contains log lines collected from the stdout and stderr streams of containers, which includes logs for liveness and readiness probes.

    ContainerLog
    | where LogEntrySource == "stdout" and LogEntry contains "Liveness probe failed" and ContainerName contains "reporting"
    | project TimeGenerated, ContainerID, ContainerName, LogEntry
    | order by TimeGenerated desc

    This query filters the ContainerLog table to show only liveness probe failure logs for containers with "reporting" in their name.

    I hope this answer helps. Please accept the answer if it is helpful, for the sake of the community, and post any other queries or errors here.
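
For comparison, the LastSeen idea raised earlier in the thread can be sketched as follows. This is only a sketch, not a confirmed fix: it assumes that LastSeen is refreshed each time the event recurs, and that the query (or alert rule) time range is wide enough, here one day, to still include the record's original TimeGenerated. The one-day lookback and the 10-minute window are illustrative values.

```kusto
// Sketch only. Assumptions: LastSeen is updated on each recurrence of the event,
// and the surrounding time range still covers the record's original TimeGenerated.
KubeEvents
| where TimeGenerated > ago(1d)          // wide lookback so older records are not filtered out
| where ObjectKind == 'Pod'
| where Reason in ('BackOff', 'Unhealthy', 'CrashLoopBackOff')
| where Name contains "reporting"
| where LastSeen > ago(10m)              // keep only events that recurred recently
| project TimeGenerated, FirstSeen, LastSeen, Name, ObjectKind, Reason, Message, Namespace, Count
| order by LastSeen desc
```

If the alert rule forces a 10-minute range on TimeGenerated, this filter would return nothing for older records, which is the limitation described above.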

