Use crawl rules to determine what content gets crawled (Office SharePoint Server 2007)

Applies To: Office SharePoint Server 2007

This Office product will reach end of support on October 10, 2017. To stay supported, you will need to upgrade. For more information, see Resources to help you upgrade your Office 2007 servers and clients.

 

Topic Last Modified: 2016-11-14

In this article:

  • Create a crawl rule

  • Edit a crawl rule

  • Delete a crawl rule

  • Reorder crawl rules

Before you perform these procedures, confirm that you meet the following requirement:

Important

You must be a shared services administrator to perform the procedures in this article.

You can create new crawl rules or edit existing crawl rules to determine what content gets crawled. You can also reorder crawl rules to specify the order in which these rules are applied.

Create a crawl rule

Use the following procedure to create a crawl rule.

  1. Complete one of the following steps depending on the status of your installation.

    • If the Infrastructure Update for Microsoft Office Servers is installed, in Central Administration, on the Quick Launch, in the Shared Services Administration group, click a shared service.

      On the Shared Services Administration page, in the Search section, click Search administration.

      On the Search Administration page, on the Quick Launch, in the Crawling section, click Crawl Rules.

    • If the Infrastructure Update for Microsoft Office Servers is not installed, in Central Administration, on the Quick Launch, in the Shared Services Administration group, click a shared service.

      On the Shared Services Administration page, in the Search section, click Search settings.

      On the Configure Search Settings page, in the Crawl Settings section, click Crawl rules.

  2. On the Manage Crawl Rules page, click New Crawl Rule.

  3. On the Add Crawl Rule page, in the Path section, in the Path box, type the path affected by this rule. You can use standard wildcard characters in the path. For example:

    • http://server1/folder* matches all Web resources with a URL that starts with http://server1/folder.

    • *://*.txt matches every document with the .txt file name extension.

  4. In the Crawl Configuration section, select one of the following:

    • Exclude all items in this path. Select this option if you want all items in the specified path to be excluded from the crawl.

    • Include all items in this path. Select this option if you want all items in the path to be crawled.

  5. If you chose to exclude all items in this path, skip to step 6. Otherwise, you can further refine the inclusion by selecting any combination of the following:

    • Follow links on the URL without crawling the URL itself. Select this option if you want to crawl links contained within the URL, but not the URL itself.

    • Crawl complex URLs (URLs that contain a question mark (?)). Select this option if you want to crawl URLs that contain parameters that use the question mark (?) notation.

    • Crawl SharePoint content as HTTP pages. Normally, SharePoint content is crawled by using a special protocol. Select this option if you want SharePoint content to be crawled as HTTP pages instead. When the content is crawled by using the HTTP protocol, item permissions are not stored. This means that all items that match a particular search query appear on search results pages, regardless of whether the user who initiated the query has access to those items.

      This setting is intended for search administrators who need to crawl remote SharePoint sites that they do not control, and who therefore cannot ensure that the domain account used to crawl those sites has been granted full-read permissions on them.

    Note

    For information about the settings in the Specify Authentication section, see Use crawl rules to specify a different content access account or authentication method (Office SharePoint Server 2007).

  6. Click OK.

  7. Repeat steps 2 through 6 for each new crawl rule you want to create.
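
The wildcard examples in step 3 can be illustrated with a short Python sketch. It is not the product's matching logic: the path_matches helper is hypothetical, it uses Python's fnmatch module as a stand-in, and it assumes that * matches any run of characters and that comparison is case-insensitive.

```python
from fnmatch import fnmatchcase


def path_matches(pattern: str, url: str) -> bool:
    """Return True if url matches a crawl-rule style wildcard pattern.

    Assumptions (not confirmed by the article): '*' matches any run of
    characters and the comparison ignores case. Note that fnmatch also
    treats '?' and '[...]' as wildcards, which may differ from the product.
    """
    return fnmatchcase(url.lower(), pattern.lower())


# The two patterns from step 3, each tried against one matching and
# one non-matching URL (the URLs themselves are made up for illustration).
examples = [
    ("http://server1/folder*", "http://server1/folder/page.aspx"),
    ("http://server1/folder*", "http://server1/other/page.aspx"),
    ("*://*.txt", "file://share/docs/readme.txt"),
    ("*://*.txt", "http://server1/docs/readme.doc"),
]

for pattern, url in examples:
    print(f"{pattern} vs {url}: {path_matches(pattern, url)}")
```

Trying a pattern against a handful of representative URLs in this way may help you confirm that a path is neither too broad nor too narrow before you type it in the Path box.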

Edit a crawl rule

You can edit an existing crawl rule at any time by clicking it and then making the necessary changes to the path and configuration, as described in the previous procedure.

Note

Editing a crawl rule requires a full crawl of the content affected by the altered crawl rule.

Delete a crawl rule

Use the following procedure to delete a crawl rule that is no longer needed.

  1. Complete one of the following steps depending on the status of your installation.

    • If the Infrastructure Update for Microsoft Office Servers is installed, in Central Administration, on the Quick Launch, in the Shared Services Administration group, click a shared service.

      On the Shared Services Administration page, in the Search section, click Search administration.

      On the Search Administration page, on the Quick Launch, in the Crawling section, click Crawl Rules.

    • If the Infrastructure Update for Microsoft Office Servers is not installed, in Central Administration, on the Quick Launch, in the Shared Services Administration group, click a shared service.

      On the Shared Services Administration page, in the Search section, click Search settings.

      On the Configure Search Settings page, in the Crawl Settings section, click Crawl rules.

  2. On the Manage Crawl Rules page, point to the crawl rule that you want to delete, click the arrow that appears, and then click Delete on the menu that appears.

  3. Click OK to confirm the deletion.

Note

Deleting a crawl rule requires a full crawl of the content affected by the deleted crawl rule.

Reorder crawl rules

After you create new crawl rules, we recommend that you specify the order in which you want the rules to be applied while content is being crawled. Crawl rules are applied in the order in which they are listed. Therefore, if two rules cover the same or overlapping content, the first rule that is listed is applied. Use the following procedure to specify the order of your crawl rules.

  1. Complete one of the following steps depending on the status of your installation.

    • If the Infrastructure Update for Microsoft Office Servers is installed, in Central Administration, on the Quick Launch, in the Shared Services Administration group, click a shared service.

      On the Shared Services Administration page, in the Search section, click Search administration.

      On the Search Administration page, on the Quick Launch, in the Crawling section, click Crawl Rules.

    • If the Infrastructure Update for Microsoft Office Servers is not installed, in Central Administration, on the Quick Launch, in the Shared Services Administration group, click a shared service.

      On the Shared Services Administration page, in the Search section, click Search settings.

      On the Configure Search Settings page, in the Crawl Settings section, click Crawl rules.

  2. On the Manage Crawl Rules page, in the Order column in the list of crawl rules, select a value in the list that specifies the position you want the rule to occupy. Other values are shifted accordingly.

    You can also use a global exclusion rule, which applies regardless of the order in which it is listed. For more information about administering crawl rules, see the Administrating Crawl Rules section in the following resource: Book Excerpt - Chapter 16 Enterprise search and indexing architecture and administration.

Note

Reordering crawl rules requires a full crawl of the content that is affected by the repositioned crawl rule.
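
The ordering behavior described in this section, where the first listed rule that covers a URL is the one applied, can be illustrated with a small Python sketch. The CrawlRule class and first_matching_rule function are hypothetical names introduced for illustration, the wildcard matching relies on Python's fnmatch module as an assumed stand-in, and global exclusion rules are not modeled.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import List, Optional


@dataclass
class CrawlRule:
    """Hypothetical model of a crawl rule: a wildcard path plus include/exclude."""
    path: str
    include: bool


def first_matching_rule(rules: List[CrawlRule], url: str) -> Optional[CrawlRule]:
    """Return the first rule, in list order, whose path matches url.

    Returns None when no rule matches. Matching is case-insensitive,
    which is an assumption of this sketch.
    """
    for rule in rules:
        if fnmatchcase(url.lower(), rule.path.lower()):
            return rule
    return None


url = "http://server1/folder/private/report.aspx"

# Narrow exclusion listed first: it is the rule that gets applied.
specific_first = [
    CrawlRule("http://server1/folder/private*", include=False),
    CrawlRule("http://server1/folder*", include=True),
]

# Same rules in the opposite order: the broad inclusion now wins.
broad_first = list(reversed(specific_first))

for label, rules in (("specific first", specific_first), ("broad first", broad_first)):
    rule = first_matching_rule(rules, url)
    print(label, "->", "include" if rule.include else "exclude")
```

In this sketch, the same two overlapping rules give opposite outcomes for the same URL depending only on their order, which is why narrower rules generally belong above broader ones.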