How to configure alerts for long running automation runbooks

Sai Praneeth Eranti 170 Reputation points
2024-05-28T06:42:04.1233333+00:00

Hello,

How can I set up an alert to notify me if an Automation runbook takes more than 30 minutes to complete? I have set up an alert for the statuses "Completed," "Failed," and "Suspended," and I receive notifications for those, but not for the "Running" status. Is there anything I am missing? Here is a screenshot of my current configuration:

User's image

Azure Automation
An Azure service that is used to automate, configure, and install updates across hybrid environments.

Accepted answer
  1. AnuragSingh-MSFT 21,236 Reputation points
    2024-06-03T09:36:44.5166667+00:00

    @Sai Praneeth Eranti, based on the requirement in the question, a metric-based alert rule will not help here, because detecting a long-running job requires two pieces of information: when the job started and whether it has ended (completed, failed, suspended, or stopped).

    This can be achieved with a log-based alert rule instead, using the logs collected through the Diagnostic settings of the Azure Automation account. The high-level steps are:

    1. Enable Diagnostic settings for the Azure Automation account. Only the "logs" category needs to be enabled, with "Log Analytics Workspace (LA workspace)" as the destination.
    2. Once the logs start flowing to the LA workspace (which might take a few minutes), use the query below to find the jobs that have started but have not yet ended:
         let threshold_minute = 30; // runtime threshold after which a job is categorized as long running
         AzureDiagnostics
         | where ResourceProvider == "MICROSOFT.AUTOMATION" and Category == "JobLogs"
         | where ResultType in ("Started", "Completed", "Failed", "Stopped", "Suspended") // the start event plus every state that ends a job
         | project TimeGenerated, ResultType, JobId_g, RunbookName_s
         | summarize startTime = min(TimeGenerated), entryCount = count() by JobId_g, RunbookName_s
         | where entryCount == 1 // a job that has completed, stopped, failed, or been suspended has 2 entries (start + end)
                                 // hence keep only the jobs for which just the start event is available
         | extend ageOfJob_minute = datetime_diff('minute', now(), startTime)
         | where ageOfJob_minute > threshold_minute
      
    3. Create a log-based alert rule from the query above, configured to fire whenever the query returns one or more rows.
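
To make the reasoning behind the query in step 2 easier to follow, here is a minimal Python sketch of the same idea, run on made-up log entries: a job counts as long running when it has a "Started" entry, no entry for a terminal state, and an age above the threshold. The field names mirror the columns used in the query; the job IDs, runbook names, timestamps, and the `long_running_jobs` helper are hypothetical and only there for illustration. Unlike the query, which counts entries per job, this sketch checks for the "Started" entry explicitly.

```python
from datetime import datetime, timedelta, timezone

# States that mark the end of a job, as listed in the query above.
TERMINAL_STATES = {"Completed", "Failed", "Stopped", "Suspended"}


def long_running_jobs(records, now, threshold_minutes=30):
    """Return (job_id, runbook, age_minutes) for jobs that started but never ended."""
    started = {}   # job_id -> (runbook_name, start_time)
    ended = set()  # job_ids that logged a terminal state
    for rec in records:
        if rec["ResultType"] == "Started":
            started[rec["JobId_g"]] = (rec["RunbookName_s"], rec["TimeGenerated"])
        elif rec["ResultType"] in TERMINAL_STATES:
            ended.add(rec["JobId_g"])

    result = []
    for job_id, (runbook, start_time) in started.items():
        if job_id in ended:
            continue  # the job has an end entry, so it is no longer running
        age = int((now - start_time).total_seconds() // 60)
        if age > threshold_minutes:
            result.append((job_id, runbook, age))
    return result


# Hypothetical sample data: one finished job, one long-running job, one recent job.
now = datetime(2024, 6, 3, 10, 0, tzinfo=timezone.utc)


def entry(job_id, runbook, result_type, minutes_ago):
    return {
        "JobId_g": job_id,
        "RunbookName_s": runbook,
        "ResultType": result_type,
        "TimeGenerated": now - timedelta(minutes=minutes_ago),
    }


sample_records = [
    entry("job-a", "Cleanup-Temp", "Started", 60),
    entry("job-a", "Cleanup-Temp", "Completed", 50),
    entry("job-b", "Backup-Db", "Started", 45),
    entry("job-c", "Backup-Db", "Started", 10),
]

print(long_running_jobs(sample_records, now))
```

Only `job-b` is reported: `job-a` has an end entry, and `job-c` has been running for less than 30 minutes. The same pairing logic also shows why the time range evaluated by the alert has to reach back far enough to include the job's start entry.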

    Hope this helps.

    If the answer did not help, please add more context/follow-up question for it. Else, if the answer helped, please click Accept answer so that it can help others in the community looking for help on similar topics.

