Hello **Darrin Werre - Azure Admin**,
I understand that your Google crawlers are being denied access, even though you aren’t seeing any entries marked as “blocked” by the WAF.
Please let me know which WAF you are using: Azure Front Door (AFD) or Application Gateway?
- Azure Front Door (Standard/Premium), Application Gateway, and CDN each have their own Bot Manager integration and logging methods. Confirm that the Front Door, Application Gateway, or CDN endpoint you are inspecting is the one actually handling the traffic from Google.
- In the Managed Rules section of your WAF policy, make sure the Microsoft_BotManagerRuleSet (or BotManagerRuleSet_1.1 for Front Door Premium) is assigned. Verify that “Good bots” are allowed and not blocked by any custom rule. Also check whether the policy is in Prevention or Detection mode: Detection mode only logs matches and does not block, so if the policy is in Detection mode, the denials are coming from somewhere other than the WAF.
- Enable WAF diagnostic logs and send them to Log Analytics, Storage, or Event Hub. Filter for bot rule IDs, such as 300600 on Front Door, or search for “BotManager” in the logs. Look for entries where the user agent includes “Googlebot” or where the rule action is “Block” or “Log”.
- Google provides its crawler IP ranges here: https://developers.google.com/crawling/docs/crawlers-fetchers/verify-google-requests
- Perform a reverse DNS lookup on incoming IPs to confirm they are from Google, then run a forward DNS lookup on the returned hostname to confirm it maps back to the same IP. If any are incorrectly identified (such as appearing as “unknown bot”), you might need to use an allow-list. You can set up a custom WAF rule with an Allow action to permit traffic from these IP ranges or the UA string while troubleshooting.
- If you have custom rules in place (for example, a rule that blocks unknown user agents), or OWASP managed rules for SQLi, XSS, or size constraints, double-check that none of them are unintentionally triggering on the Googlebot user agent or its query patterns. Custom rules are processed before managed rule sets, so a general custom rule like “block on unknown UA” could be causing the issue before the bot rules are ever evaluated.
- Microsoft’s “test” crawler may only simulate a user agent check, while the production throttling and algorithms can be stricter. Try sending a request to your public endpoint with curl or a basic HTTP client, using a Googlebot user-agent string, and be sure to capture the complete WAF logs for that request.
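The reverse DNS check mentioned in the list above can be sketched in Python. This is a minimal illustration, not an official tool: the function names `is_google_hostname` and `verify_google_ip` are made up for this example, and the domain suffixes are the ones Google's verification page lists for its crawlers at the time of writing, so please check them against the page linked above. It uses only the standard `socket` module, and `socket.gethostbyname_ex` handles IPv4 only, so IPv6 crawler addresses would need a different forward lookup.

```python
import socket

# Assumption: domains Google documents for the reverse DNS names of its crawlers.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com", ".googleusercontent.com")


def is_google_hostname(hostname):
    """Return True if the hostname ends in one of the Google crawler domains."""
    return hostname.rstrip(".").lower().endswith(GOOGLE_SUFFIXES)


def verify_google_ip(ip, reverse=socket.gethostbyaddr,
                     forward=socket.gethostbyname_ex):
    """Reverse-resolve the IP, check the domain, then forward-resolve to confirm.

    The resolver functions are parameters so the logic can be exercised
    without network access; by default the real DNS lookups are used.
    """
    try:
        hostname = reverse(ip)[0]
        if not is_google_hostname(hostname):
            return False
        # The forward lookup must map the hostname back to the same IP.
        return ip in forward(hostname)[2]
    except OSError:
        # Lookup failures (no PTR record, unknown host) count as unverified.
        return False


if __name__ == "__main__":
    # Replace with a client IP taken from your own WAF or access logs.
    print(verify_google_ip("66.249.66.1"))
```

An IP that fails this check while sending a Googlebot user agent is likely a spoofed crawler, in which case a block by the WAF is the expected behavior rather than a misclassification.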
See the following reference documents for more detail:
- Bot protection for Web Application Firewall
- Custom rules for Azure Web Application Firewall on Azure Front Door
- Azure Web Application Firewall on Azure Application Gateway bot protection overview
I hope the above answer helps you! Please let us know if you have any further questions.
Please don't forget to "upvote" wherever the information provided helps you; this can be beneficial to other members of the community.