Develop an incident response plan

The IT industry considers Microsoft 365 a Managed Evergreen environment. This term means that Microsoft 365 is continually changing and updating to give its customers the best service it can. At times, this kind of dynamic environment can cause service incidents. For example, organizations can experience a short outage or degradation in a valued service such as SharePoint or Skype for Business. When these incidents occur, it's the duty of the Microsoft 365 Administrator to monitor them and develop a mitigation plan.

The following graphic shows that a single service can be degraded depending on the tenant location. In this example, the health of Exchange Online is normal for a company's US tenant, but degraded for its European tenant.

Microsoft 365 Administrators should complete the following steps to develop an incident response plan:

Validate the incident and confirm that your environment is affected. This step is necessary because some service incidents don't affect your environment. Because Microsoft 365 is a global service that spans multiple data centers, notifications go out en masse once an incident reaches a certain threshold of affected tenants. By running self-assessments, the Microsoft 365 Administrator can confirm whether the issue is present and filter out false-positive notifications.
Determine whether the incident is relevant to your company. Sometimes an incident involves a service that doesn’t interfere with daily business operations. By collaborating with your specialized administrators and drawing on their product knowledge, you can determine if the incident is relevant to your company.
Check the timeline once relevance and degradation have been established. The purpose of this step is to determine whether the service group has set a specific timeframe for returning the service to a nondegraded state. If the service group didn't set a timeline for the incident, you can submit a service request to ask whether one is available. Include the incident number in the service request to speed up resolution.
Develop a backup solution in case the service is degraded for longer than an acceptable timeframe. While this step can be an inconvenience, it’s necessary to have a remedy in place when unexpected incidents occur. For example, you might need to work from the cloud until Microsoft resolves the incident. Or, you might need to work locally on a reliable system during the downtime.
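
The four steps above can be read as a simple decision sequence. The following Python sketch is one possible way to express that sequence for an internal runbook. It is illustrative only: the `Incident` record, its field names, the `triage` function, and the returned action strings are all hypothetical and don't correspond to any Microsoft 365 API or schema.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Incident:
    """Hypothetical record of one service incident (not a Microsoft schema)."""
    incident_id: str
    service: str
    started: datetime
    confirmed_locally: bool                   # outcome of your own self-assessment
    estimated_resolution: Optional[datetime]  # None if no timeline was published


def triage(incident: Incident, critical_services: set, max_outage: timedelta) -> str:
    """Walk the four planning steps in order and return a suggested next action."""
    # Step 1: validate that your environment is actually affected.
    if not incident.confirmed_locally:
        return "ignore: tenant not affected"
    # Step 2: decide whether the affected service matters to daily operations.
    if incident.service not in critical_services:
        return "monitor: service not business-critical"
    # Step 3: check whether a resolution timeline exists.
    if incident.estimated_resolution is None:
        return f"submit service request citing {incident.incident_id}"
    # Step 4: fall back to the backup solution if the expected outage is too long.
    if incident.estimated_resolution - incident.started > max_outage:
        return "activate backup solution"
    return "wait: resolution expected within acceptable window"
```

For example, a confirmed Exchange Online incident with no published timeline would return the "submit service request" action, while the same incident with an estimated resolution beyond the acceptable window would return "activate backup solution". The acceptable window (`max_outage`) and the set of critical services are values each organization would need to define for itself.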

Tip

Microsoft 365 Administrators should always check the Service Health dashboard for updates and submit a service request as needed. The service is free and provided with your subscription.
