Process 3: Continuous Monitoring
Figure 5. Continuous monitoring
Activities: Continuous Monitoring
The third process in SMC begins once the monitoring tools in use are in place. When an event occurs, a notification is received, either by a dedicated SMC group or by a related group that has SMC responsibilities. After analysis, the event is either resolved or escalated to a higher level for eventual resolution.
This process involves the following activities:
- Receive notification.
- Analyze the event.
- Resolve or escalate the event.
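The three activities above can be sketched as a small pipeline. This is a minimal illustration, not part of the SMC guidance itself: the `Event` class, its fields, the status values, and the `known_fixes` lookup are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Event:
    """A monitoring event. All field names are illustrative assumptions."""
    source: str
    description: str
    status: str = "new"  # new -> analyzed -> resolved | escalated
    history: List[str] = field(default_factory=list)


def receive(event: Event) -> Event:
    """Activity 1: record that the notification reached the responsible group."""
    event.history.append("received")
    return event


def analyze(event: Event, known_fixes: Dict[str, str]) -> Optional[str]:
    """Activity 2: look for a documented fix matching the event description."""
    event.status = "analyzed"
    event.history.append("analyzed")
    return known_fixes.get(event.description)


def resolve_or_escalate(event: Event, fix: Optional[str]) -> Event:
    """Activity 3: resolve if a fix was found, otherwise escalate."""
    if fix is not None:
        event.status = "resolved"
        event.history.append(f"resolved: {fix}")
    else:
        event.status = "escalated"
        event.history.append("escalated")
    return event


def handle(event: Event, known_fixes: Dict[str, str]) -> Event:
    """Run the three activities in order for a single event."""
    receive(event)
    fix = analyze(event, known_fixes)
    return resolve_or_escalate(event, fix)
```

In practice the analysis step would draw on more than a lookup table (open problems, open incidents, planned changes), but the sketch shows how each event ends in exactly one of the two outcomes.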
The following table describes these activities in greater detail.
Table 6. Activities and Considerations for Continuous Monitoring
Activities
Considerations
Receive notification
Key questions:
- Who should receive alerts?
- Do incoming alerts require 24/7 support and, if so, who should handle them?
- Is there a dedicated SMC group, or is monitoring handled by other departments, such as the Service Desk or Operations?
- Is there a need for correlating events? Correlating events allows for an end-to-end look at related events and makes troubleshooting easier.
- Have events historically been treated as incidents, with the incident management process used to analyze and resolve them?
- Is there a connector between the monitoring system and the Service Desk tools or will alerts be transferred manually?
- Do other departments or resources work on a given problem?
- Are automated solutions applied?
- Can alerts automatically be solved and closed?
- How are alerts communicated to groups (via pager, text message, monitoring console, e-mail)?
Inputs:
- IT services configured in the monitoring tool
- Role descriptions
- SMC policies and procedures
- Notifications
Outputs:
- Incident information
- Event information
- Alert information
Best practice:
- If something needs immediate attention, ensure that there is a way to prioritize it.
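Two of the considerations for this activity, correlating related events and prioritizing what needs immediate attention, can be illustrated with a short sketch. It assumes a simple correlation rule (same service, arrival within a time window of the previous alert) and a numeric severity scale where 1 is most urgent. The `Alert` fields, the 300-second window, and the severity scale are assumptions for the example, not features of any particular monitoring tool.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Alert:
    service: str      # IT service the alert belongs to (hypothetical field)
    message: str
    severity: int     # 1 = most urgent; the scale is an assumption
    timestamp: float  # seconds since an arbitrary reference point


def correlate(alerts: List[Alert], window: float = 300.0) -> List[List[Alert]]:
    """Group alerts for the same service that arrive within `window`
    seconds of the previous alert in the group."""
    by_service: Dict[str, List[Alert]] = defaultdict(list)
    for alert in alerts:
        by_service[alert.service].append(alert)

    groups: List[List[Alert]] = []
    for service_alerts in by_service.values():
        service_alerts.sort(key=lambda a: a.timestamp)
        current = [service_alerts[0]]
        for alert in service_alerts[1:]:
            if alert.timestamp - current[-1].timestamp <= window:
                current.append(alert)
            else:
                # Gap too large: close this group and start a new one.
                groups.append(current)
                current = [alert]
        groups.append(current)
    return groups


def prioritize(groups: List[List[Alert]]) -> List[List[Alert]]:
    """Order groups so the one containing the most urgent alert comes first."""
    return sorted(groups, key=lambda g: min(a.severity for a in g))
```

Grouping before prioritizing means the receiving team sees one item per underlying situation rather than one item per raw alert.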
Analyze event
Key questions:
- Who is primarily responsible for event analysis?
- Who is responsible for “noise” reduction, that is, for clearing out events that aren’t real and that should be removed from view?
- Is a known problem causing the event?
- Is there clear, easily accessible information available about possible solutions?
- Is the event description understandable?
- Have there been other alerts about the same problem?
- Can certain manual tasks help solve the problem?
- Does any tool used by the Service Desk contain procedures that cover this incident?
- Are there any changes planned for the IT service or for CIs of the IT service?
- Is the event actionable? Is it valid?
- Can the alert be tuned? Alert tuning is the adjustment of a service monitoring tool to lower the level of alert noise and reduce the number of false alerts.
- Is the impact to the IT service clearly understood and communicated in the SMC tool?
Inputs:
- Information about event resolution
- Description of the event
- Open problems
- Open incidents
- Open changes
- Information from other teams
Outputs:
- Event is solved
- Event is escalated as an incident with its severity raised, and possibly transferred to another team
Best practice:
- Ensure that all alerts are understandable, relevant, and up to date.
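One way to decide which alerts to tune is to track, for each alert rule, how often analysis found the alert to be non-actionable. The sketch below assumes analysts record a `(rule_name, was_actionable)` pair for each analyzed alert. The function name, the 50% threshold, and the minimum sample size are illustrative assumptions, not recommended values.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def tuning_candidates(
    history: Iterable[Tuple[str, bool]],
    max_false_ratio: float = 0.5,
    min_samples: int = 3,
) -> List[str]:
    """Return rule names whose share of non-actionable alerts exceeds the limit.

    `history` holds (rule_name, was_actionable) pairs recorded during analysis.
    Rules with fewer than `min_samples` alerts are skipped as inconclusive.
    """
    total: Counter = Counter()
    false: Counter = Counter()
    for rule, actionable in history:
        total[rule] += 1
        if not actionable:
            false[rule] += 1
    return sorted(
        rule
        for rule, count in total.items()
        if count >= min_samples and false[rule] / count > max_false_ratio
    )
```

A rule that appears in the result is a candidate for a threshold change or removal; the decision itself still rests with whoever owns noise reduction.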
Resolve or escalate event
Key questions:
- Who has authority to escalate events?
- Who receives the escalated event?
- How can we ensure that the receiver takes ownership of the event? If the receiver can’t, is there an alternate individual or team to call upon?
- Which events should be subject to 24/7 escalation?
- Was the event resolved through the use of a knowledge base? Product knowledge? Other approaches?
- Should the alert threshold be tuned or updated?
Inputs:
- Updated knowledge about alerts
- Input for tuning the alerts
- Additional error description of the alert for further troubleshooting
- Description of previous activities (if problem is not solved)
Outputs:
- Escalated alerts
- Solved alerts
Best practice:
- Encourage each individual on the alert escalation chain to provide input and knowledge.
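The question of ownership, including the fallback to an alternate individual or team, can be sketched as a walk along an escalation chain. The receiver names, the `accepts` callback, and the log format below are hypothetical. A real implementation would depend on the paging or Service Desk tool in use.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def escalate(
    event_id: str,
    chain: Sequence[str],
    accepts: Callable[[str, str], bool],
) -> Tuple[Optional[str], List[str]]:
    """Offer the event to each receiver in order until one takes ownership.

    `accepts(receiver, event_id)` returns True if that receiver takes the event.
    Returns (owner, log); `owner` is None if nobody on the chain accepted.
    """
    log: List[str] = []
    for receiver in chain:
        if accepts(receiver, event_id):
            log.append(f"{receiver} took ownership of {event_id}")
            return receiver, log
        log.append(f"{receiver} unavailable for {event_id}")
    return None, log
```

For example, with a chain of `["tier2-primary", "tier2-alternate", "duty-manager"]` where only the primary is unavailable, the event lands with `tier2-alternate`, and the log records both attempts so later receivers can see what was already tried.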
This accelerator is part of a larger series of tools and guidance from Solution Accelerators.