Use crawl rules to determine what content gets crawled (Office SharePoint Server 2007)
Applies To: Office SharePoint Server 2007
This Office product will reach end of support on October 10, 2017. To stay supported, you will need to upgrade. For more information, see Resources to help you upgrade your Office 2007 servers and clients.
Topic Last Modified: 2016-11-14
In this article:
Create a crawl rule
Edit a crawl rule
Delete a crawl rule
Reorder crawl rules
Before you perform these procedures, confirm that:
- You have read the topic Limit or increase the quantity of content that is crawled (Office SharePoint Server).
Important
You must be a shared services administrator to perform the procedures in this article.
You can create new crawl rules or edit existing crawl rules to determine what content gets crawled. You can also reorder crawl rules to specify the order in which these rules are applied.
Create a crawl rule
Use the following procedure to create a crawl rule.
Complete one of the following steps depending on the status of your installation.
If the Infrastructure Update for Microsoft Office Servers is installed, in Central Administration, on the Quick Launch, in the Shared Services Administration group, click a shared service.
On the Shared Services Administration page, in the Search section, click Search administration.
On the Search Administration page, on the Quick Launch, in the Crawling section, click Crawl Rules.
Note
For more information, see Description of the Microsoft Office Servers Infrastructure Update (https://go.microsoft.com/fwlink/?LinkID=121886).
If the Infrastructure Update for Microsoft Office Servers is not installed, in Central Administration, on the Quick Launch, in the Shared Services Administration group, click a shared service.
On the Shared Services Administration page, in the Search section, click Search settings.
On the Configure Search Settings page, in the Crawl Settings section, click Crawl rules.
On the Manage Crawl Rules page, click New Crawl Rule.
On the Add Crawl Rule page, in the Path section, in the Path box, type the path affected by this rule. You can use standard wildcard characters in the path. For example:
- http://server1/folder* matches all Web resources with a URL that starts with http://server1/folder.
- *://*.txt matches every document with the txt file extension.
In the Crawl Configuration section, select one of the following:
- Exclude all items in this path. Select this option if you want all items in the specified path to be excluded from the crawl.
- Include all items in this path. Select this option if you want all items in the path to be crawled.
If you chose to exclude all items in this path, skip ahead to the step in which you click OK. Otherwise, you can further refine the inclusion by selecting any combination of the following:
- Follow links on the URL without crawling the URL itself. Select this option if you want to crawl links contained within the URL, but not the URL itself.
- Crawl complex URLs (URLs that contain a question mark (?)). Select this option if you want to crawl URLs that contain parameters that use the question mark (?) notation.
- Crawl SharePoint content as HTTP pages. Normally, SharePoint content is crawled by using a special protocol. Select this option if you want SharePoint content to be crawled as HTTP pages instead. When the content is crawled by using the HTTP protocol, item permissions are not stored. This means that all items that match a particular search query appear on search results pages, regardless of whether the user who initiated the query has access to those items.
This setting enables search administrators to crawl remote SharePoint sites that they do not control, where they cannot ensure that the domain account used to crawl those sites has been granted full-read permissions on them.
Note
For information about the settings in the Specify Authentication section, see Use crawl rules to specify a different content access account or authentication method (Office SharePoint Server 2007).
Click OK.
Repeat this procedure, starting with clicking New Crawl Rule on the Manage Crawl Rules page, for each new crawl rule that you want to create.
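To make the path and Crawl Configuration choices above more concrete, the following is a minimal Python sketch of how a single rule might be evaluated against a URL. It is an illustration only, not SharePoint code: the CrawlRule class and its decide method are hypothetical names, the matching is approximated with Python's fnmatch module (which also treats ? and [ ] in a pattern as wildcards), case-insensitive matching is an assumption, and the sketch assumes that a complex URL is skipped unless the Crawl complex URLs option is selected.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass
class CrawlRule:
    """Hypothetical model of one crawl rule; not a SharePoint API type."""
    path: str                          # wildcard path, e.g. "http://server1/folder*"
    include: bool                      # True = include items, False = exclude items
    crawl_complex_urls: bool = False   # only meaningful for include rules

    def matches(self, url: str) -> bool:
        # Assumption: matching ignores case and "*" matches any run of characters.
        return fnmatchcase(url.lower(), self.path.lower())

    def decide(self, url: str) -> str:
        """Return "no match", "exclude", or "include" for the given URL."""
        if not self.matches(url):
            return "no match"
        if not self.include:
            return "exclude"
        # Assumption: URLs containing "?" are skipped unless the option is selected.
        if "?" in url and not self.crawl_complex_urls:
            return "exclude"
        return "include"


folder_rule = CrawlRule("http://server1/folder*", include=True)
txt_rule = CrawlRule("*://*.txt", include=False)

for rule, url in [
    (folder_rule, "http://server1/folder/page.aspx"),
    (folder_rule, "http://server1/folder/view.aspx?id=3"),
    (txt_rule, "http://server2/docs/readme.txt"),
    (txt_rule, "http://server2/docs/readme.doc"),
]:
    print(rule.path, url, rule.decide(url))
```

In this sketch, setting crawl_complex_urls=True on folder_rule would change the decision for the second URL from "exclude" to "include". The other two refinement options (following links without crawling the URL, and crawling SharePoint content as HTTP pages) affect how content is crawled rather than whether a path matches, so they are not modeled here.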
Edit a crawl rule
You can edit an existing crawl rule at any time by clicking it and then making the necessary changes to the path and configuration, as described in the previous procedure.
Note
Editing a crawl rule requires a full crawl of the content affected by the altered rule.
Delete a crawl rule
Use the following procedure to delete a crawl rule that is no longer needed.
Complete one of the following steps depending on the status of your installation.
If the Infrastructure Update for Microsoft Office Servers is installed, in Central Administration, on the Quick Launch, in the Shared Services Administration group, click a shared service.
On the Shared Services Administration page, in the Search section, click Search administration.
On the Search Administration page, on the Quick Launch, in the Crawling section, click Crawl Rules.
Note
For more information, see Description of the Microsoft Office Servers Infrastructure Update (https://go.microsoft.com/fwlink/?LinkID=121886).
If the Infrastructure Update for Microsoft Office Servers is not installed, in Central Administration, on the Quick Launch, in the Shared Services Administration group, click a shared service.
On the Shared Services Administration page, in the Search section, click Search settings.
On the Configure Search Settings page, in the Crawl Settings section, click Crawl rules.
On the Manage Crawl Rules page, point to the crawl rule that you want to delete, click the arrow that appears, and then click Delete on the menu that appears.
Click OK to confirm the deletion.
Note
Deleting a crawl rule requires a full crawl of the content affected by the deleted rule.
Reorder crawl rules
After you create new crawl rules, we recommend that you specify the order in which you want the rules to be applied while content is being crawled. Crawl rules are applied in the order in which they are listed. Therefore, if two rules cover the same or overlapping content, the first rule that is listed is applied. Use the following procedure to specify the order of your crawl rules.
Complete one of the following steps depending on the status of your installation.
If the Infrastructure Update for Microsoft Office Servers is installed, in Central Administration, on the Quick Launch, in the Shared Services Administration group, click a shared service.
On the Shared Services Administration page, in the Search section, click Search administration.
On the Search Administration page, on the Quick Launch, in the Crawling section, click Crawl Rules.
Note
For more information, see Description of the Microsoft Office Servers Infrastructure Update (https://go.microsoft.com/fwlink/?LinkID=121886).
If the Infrastructure Update for Microsoft Office Servers is not installed, in Central Administration, on the Quick Launch, in the Shared Services Administration group, click a shared service.
On the Shared Services Administration page, in the Search section, click Search settings.
On the Configure Search Settings page, in the Crawl Settings section, click Crawl rules.
On the Manage Crawl Rules page, in the Order column in the list of crawl rules, select a value in the list that specifies the position you want the rule to occupy. Other values are shifted accordingly.
You can also use a global exclusion rule, which applies regardless of the order in which it is listed. For more information about administering crawl rules, see the Administrating Crawl Rules section in the following resource: Book Excerpt - Chapter 16 Enterprise search and indexing architecture and administration.
Note
Reordering crawl rules requires a full crawl of the content affected by the repositioned rule.
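The effect of rule order described in this section (rules are applied in the order listed, and the first listed rule that covers a URL is the one applied) can be sketched in Python as follows. This is a hedged illustration, not SharePoint code: first_matching_rule is a hypothetical helper, the server names and paths are made up, wildcard matching is approximated with fnmatch and assumed to ignore case, and global exclusion rules are not modeled.

```python
from fnmatch import fnmatchcase
from typing import List, Optional, Tuple

# A rule is modeled as (wildcard path, action); the action is "include" or "exclude".
Rule = Tuple[str, str]


def first_matching_rule(rules: List[Rule], url: str) -> Optional[Rule]:
    """Return the first rule in list order whose path matches url, or None."""
    for pattern, action in rules:
        # Assumption: case-insensitive wildcard match, approximated with fnmatch.
        if fnmatchcase(url.lower(), pattern.lower()):
            return (pattern, action)
    return None


# Two overlapping rules: the archive path is also covered by the folder path.
rules = [
    ("http://server1/folder/archive*", "exclude"),
    ("http://server1/folder*", "include"),
]
url = "http://server1/folder/archive/2006.aspx"

print(first_matching_rule(rules, url))                  # archive rule is listed first
print(first_matching_rule(list(reversed(rules)), url))  # folder rule is listed first
```

With the narrower exclude rule listed first, the archive URL is excluded. With the order reversed, the broader include rule is reached first and the exclude rule is never applied to that URL, which is why overlapping rules should be ordered deliberately.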