App Insights: Metrics Outage (no-report) alerting rule

Question

I use App Insights metrics to report the delay between an application event and its processing. The processing time is, by definition, >1s since it uses a cron-based scheduler. I've written a bash script to report the time difference to Azure App Insights. This is working fine so far. Now I've configured two alerts:

avg(time difference) of the last 5 minutes > 120
avg(time difference) of the last 5 minutes <= 1

The first alert is pretty obvious: catch an instance where my application is not processing the event correctly at all.
The second alert might need some more explanation: I want to catch the case in which my bash script is not reporting any data at all (e.g. system downtime, complete application crash…). In theory, the average value would drop to 0 within 5 minutes after the application crash, thus triggering the alert and sending me an email.

This is not working at all: I can kill my bash script that transfers the data to the custom metrics API and not receive an alert at all (yes, I've waited the 5 minutes). If I manually (that is, from my bash script) report values of 0 for the time difference, the alert fires correctly. If I then change the script to report a value > 0, the alert is deactivated properly as well. I have also tested this with avg(td) < 0 (which is my preferred way of doing it), but that doesn't work either. Is this expected / documented behavior? It really doesn't make a whole lot of sense to me. Is there a better way to alert on this "non-reporting" of certain metrics?

Answer

There are a couple of features in Application Insights to be aware of if you are looking for a 1:1 mapping of activity and result instead of statistically relevant overviews. Sampling in Application Insights is one of the first things I would look at if you are not seeing specific events that you are expecting. You would also want to be aware of a 5-10 minute delay in the availability of data, although that may not be important in your scenario. I would also take a quick look at other similar services, like Stream Analytics, to see if they are more in line with your goals for this project.
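
One possible reading of the behavior in the question, sketched as a toy model in Python: an average over a window that contains no samples is undefined rather than 0, so a rule like "avg <= 1" has nothing to compare against and stays quiet. This is only an illustration under that assumption, not how Application Insights evaluates alerts internally. The function names (window_average, low_average_alert, silence_alert) and the sample data are hypothetical. The last function shows the alternative idea of alerting on the age of the newest report instead of on its value.

```python
from datetime import datetime, timedelta
from typing import Optional, Sequence, Tuple

# One reported sample: (report time, time difference in seconds).
Sample = Tuple[datetime, float]


def window_average(samples: Sequence[Sample], now: datetime,
                   window: timedelta = timedelta(minutes=5)) -> Optional[float]:
    """Average of the values reported inside the window, or None if there are none."""
    recent = [value for ts, value in samples if now - window <= ts <= now]
    if not recent:
        return None  # assumption: no data means "undefined", not 0
    return sum(recent) / len(recent)


def low_average_alert(samples: Sequence[Sample], now: datetime,
                      threshold: float = 1.0) -> bool:
    """Toy version of 'avg(time difference) of the last 5 minutes <= 1'."""
    avg = window_average(samples, now)
    return avg is not None and avg <= threshold


def silence_alert(samples: Sequence[Sample], now: datetime,
                  max_silence: timedelta = timedelta(minutes=5)) -> bool:
    """Fires when the newest sample is older than max_silence, or none exist."""
    if not samples:
        return True
    newest = max(ts for ts, _ in samples)
    return now - newest > max_silence


# The script reports every 30 seconds, then dies after 12:04:30.
start = datetime(2024, 1, 1, 12, 0, 0)
samples = [(start + timedelta(seconds=30 * i), 45.0) for i in range(10)]
later = start + timedelta(minutes=15)

print(window_average(samples, later))     # None
print(low_average_alert(samples, later))  # False
print(silence_alert(samples, later))      # True
```

In this model, explicitly reported zeros do trip the low-average rule, while silence does not, which matches what the question describes. A check on the age of the last report (run from somewhere that stays up when the script dies) is the part that actually detects the outage.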
