The SharePoint Server crawler ignores directives in Robots.txt
Original KB number: 3019711
Consider the following scenario:
You use the Microsoft SharePoint Server 2013 or SharePoint Server 2010 search engine to crawl various sites.
For those sites, you want to use directives in the Robots.txt file to define the paths that the search engine can crawl.
You set the following directive for the default user-agent of the crawler:
User-Agent: Mozilla/4.0 (compatible; MSIE 4.01; Windows NT; MS Search 6.0 Robot)
In this scenario, the SharePoint Server crawler doesn't apply the directive.
This issue occurs because the SharePoint Server crawl engine matches Robots.txt directives against its short robot name, not against the full user-agent string that it sends in HTTP requests. It therefore doesn't recognize the full string as referring to itself.
To resolve this issue, use the following directive in the Robots.txt file:
User-Agent: MS Search 6.0 Robot
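For example, a complete Robots.txt file that the SharePoint crawler will honor might look like the following. The Disallow and Allow paths shown here are hypothetical placeholders; replace them with the paths you want to exclude from or include in the crawl:

```
User-Agent: MS Search 6.0 Robot
Disallow: /private/
Disallow: /drafts/
Allow: /
```

Directives that appear under this User-Agent line are applied by the SharePoint crawler, while other search engines continue to match their own user-agent entries elsewhere in the file.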