How to investigate sign-ins requiring multifactor authentication

Microsoft Entra Health monitoring provides a set of tenant-level health metrics that you can monitor, and it raises alerts when a potential issue or failure condition is detected. You can monitor multiple health scenarios, including multifactor authentication (MFA).

This scenario:

  • Aggregates the number of users who successfully completed an MFA sign-in using a Microsoft Entra cloud MFA service.
  • Captures interactive sign-ins with MFA, aggregating both successes and failures.
  • Excludes sign-ins where a user refreshes the session without completing interactive MFA or using a passwordless sign-in method.

This article describes these health metrics and how to troubleshoot a potential issue when you receive an alert.

Prerequisites

There are different roles, permissions, and license requirements to view health monitoring signals and configure and receive alerts. Apart from Microsoft Entra admin roles, Microsoft Graph permissions are required to access health monitoring signals and alerts via the Microsoft Graph APIs. We recommend using a role with least privilege access to align with the Zero Trust guidance.

Required roles and permissions

Activities and the roles or permissions that allow them:

  • View scenario monitoring signals, alerts, and alert configurations:
    • Reports Reader
    • Security Reader
    • Security Operator
    • Security Administrator
    • Helpdesk Administrator
    • Global Reader
  • Update alerts:
    • Security Operator
    • Security Administrator
    • Helpdesk Administrator
  • Update alert notification configurations:
    • Security Administrator
    • Helpdesk Administrator
  • View and modify Conditional Access policies:
    • Conditional Access Administrator
  • View the alerts using the Microsoft Graph API:
    • HealthMonitoringAlert.Read.All permission
  • View and modify the alerts using the Microsoft Graph API:
    • HealthMonitoringAlert.ReadWrite.All permission

Gather data

Investigating an alert starts with gathering data.

  1. Gather the signal details and impact summary.
  2. Review the sign-in logs.
  3. Check the audit logs for recent policy changes.
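
If you gather the signal details and impact summary through the Microsoft Graph API rather than the portal, the first step might look like the following minimal sketch. It assumes the alerts are exposed at the beta endpoint shown in ALERTS_URL and that each alert carries its impact summary under enrichment.impacts. Both are assumptions to confirm against the current Microsoft Graph reference. The function names are hypothetical, and acquiring the access token (with the HealthMonitoringAlert.Read.All permission) is out of scope here.

```python
import json
import urllib.request

# Assumed endpoint; confirm against the current Microsoft Graph reference.
ALERTS_URL = "https://graph.microsoft.com/beta/reports/healthMonitoring/alerts"


def build_alerts_request(access_token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for health monitoring alerts."""
    return urllib.request.Request(
        ALERTS_URL,
        headers={"Authorization": f"Bearer {access_token}"},
        method="GET",
    )


def summarize_impacts(alert: dict) -> list:
    """Return (resourceType, impactedCount) pairs for one alert.

    Assumes the impact summary is stored under alert["enrichment"]["impacts"].
    """
    impacts = (alert.get("enrichment") or {}).get("impacts") or []
    return [
        (item.get("resourceType", "unknown"), int(item.get("impactedCount", 0)))
        for item in impacts
    ]


def fetch_alerts(access_token: str) -> list:
    """Send the request and return the list of alerts (performs a network call)."""
    with urllib.request.urlopen(build_alerts_request(access_token)) as response:
        return json.load(response).get("value", [])
```

Calling summarize_impacts on each alert returned by fetch_alerts gives the resourceType and impactedCount values that the investigation steps in this article refer to.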

Mitigate common issues

The following common issues could cause a spike in MFA sign-ins. This list isn't exhaustive, but provides a starting point for your investigation.

Application configuration issues

An increase in sign-ins requiring MFA could indicate that a policy change or new feature rollout triggered a large number of users to sign in around the same time.

To investigate:

  • In the impact summary, if resourceType is "application" and there's only one or two applications listed, check the audit logs for changes to the listed applications.
  • In the audit logs, use the Target column to filter for the application or open the audit logs from Enterprise Applications, so the filter is already set.
  • Determine if the application was recently added or reconfigured.
  • In the sign-in logs, use the Application column to filter for the same application or date range to look for any other patterns.
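
If you export the audit logs instead of filtering in the portal, the Target filter described above can be approximated with a short script. This is a sketch under stated assumptions: each record is a dictionary shaped like a Microsoft Graph directoryAudit entry, with targetResources, activityDisplayName, and activityDateTime fields, and the function name is hypothetical.

```python
def changes_for_applications(audit_events, app_names):
    """Return audit events that target any of the named applications, oldest first.

    audit_events: iterable of dicts shaped like directoryAudit records (assumed).
    app_names: display names of the applications listed in the impact summary.
    """
    wanted = {name.casefold() for name in app_names}
    matches = []
    for event in audit_events:
        targets = event.get("targetResources") or []
        # Compare display names case-insensitively.
        if any((t.get("displayName") or "").casefold() in wanted for t in targets):
            matches.append(event)
    # ISO 8601 UTC timestamps sort correctly as plain strings.
    return sorted(matches, key=lambda e: e.get("activityDateTime", ""))
```

Reading the returned events in time order makes it easier to see whether the application was added or reconfigured shortly before the anomaly.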

User authentication issues

An increase in sign-ins requiring MFA could indicate a brute force attack, where multiple unauthorized sign-in attempts are made to a user's account.

To investigate:

  • In the impact summary, if resourceType is "user" and the impactedCount value shows a small subset of users, the issue might be user-specific.
  • Use the following filters in the sign-in logs:
    • Status: Failure
    • Authentication requirement: Multifactor authentication
    • Adjust the date to match the timeframe indicated in the impact summary.
  • Check whether the failed sign-in attempts come from the same IP address.
  • Check whether the failed sign-in attempts target the same user.
  • Run the sign-in diagnostic to rule out standard user error issues or initial MFA setup issues.
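
The questions about shared IP addresses and users can also be answered from exported sign-in records. The following sketch assumes each record is a dictionary shaped like a Microsoft Graph signIn entry, with authenticationRequirement, status.errorCode (0 meaning success), ipAddress, and userPrincipalName fields. Verify those field names against your own export, and treat the function names as hypothetical.

```python
from collections import Counter

# Assumed value of authenticationRequirement for sign-ins that required MFA.
MFA = "multiFactorAuthentication"


def failed_mfa_sign_ins(sign_ins):
    """Keep only sign-ins that required MFA and did not succeed."""
    return [
        s for s in sign_ins
        if s.get("authenticationRequirement") == MFA
        and (s.get("status") or {}).get("errorCode", 0) != 0
    ]


def top_sources(sign_ins, limit=5):
    """Count failed MFA sign-ins per IP address and per user."""
    failed = failed_mfa_sign_ins(sign_ins)
    by_ip = Counter(s.get("ipAddress", "unknown") for s in failed)
    by_user = Counter(s.get("userPrincipalName", "unknown") for s in failed)
    return by_ip.most_common(limit), by_user.most_common(limit)
```

A single IP address or user that dominates either list is worth a closer look, while an even spread across many users points away from a targeted attack.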

Network issues

There could be a regional system outage that required a large number of users to sign in at the same time.

To investigate:

  • In the impact summary, if resourceType is "user" and the impactedCount value shows a large percentage of your organization's users, you might be looking at a widespread issue.
  • Check your system and network health to see if an outage or update matches the same timeframe as the anomaly.
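
To decide whether impactedCount represents a small subset or a large percentage of your users, you can compare it with your total user count. The sketch below is illustrative only: the function name is hypothetical, and the 25 percent threshold is an arbitrary assumption that you should tune for your organization.

```python
def impact_scope(resource_type, impacted_count, total_users, widespread_share=0.25):
    """Classify an impact summary entry by the share of users it affects.

    widespread_share is an assumed threshold, not a documented value.
    """
    if resource_type != "user":
        return "not-user-scoped"
    if total_users <= 0:
        raise ValueError("total_users must be positive")
    share = impacted_count / total_users
    return "widespread" if share >= widespread_share else "user-specific"
```

For example, 600 impacted users in a tenant of 1,000 is classified as widespread, which suggests checking system and network health first, while 5 impacted users is classified as user-specific.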

Next steps