Use crawl rules to determine what content gets crawled (Search Server 2008)

Tip

Unless otherwise noted, the information in this article applies to both Microsoft Search Server 2008 and Microsoft Search Server 2008 Express.

In this article:

  • Create a crawl rule

  • Edit a crawl rule

  • Delete a crawl rule

  • Reorder crawl rules

Before you perform these procedures, confirm that you meet the following requirement.

Important

You must be a search services administrator to perform the procedures in this article. For more information, see Add or remove a search services administrator (Search Server 2008).

You can create new crawl rules or edit existing crawl rules to determine what content gets crawled. You can also reorder crawl rules to specify the order in which these rules are applied.

Create a crawl rule

Use the following procedure to create a crawl rule.

Create a crawl rule

  1. On the Search Administration page, in the Crawling section, click Crawl rules.

  2. On the Manage Crawl Rules page, click New Crawl Rule.

  3. On the Add Crawl Rule page, in the Path section, in the Path box, type the path affected by this rule. You can use standard wildcard characters in the path. For example:

    • http://server1/folder* matches all Web resources with a URL that starts with http://server1/folder.

    • *://*.txt matches every document with the .txt file name extension.

  4. In the Crawl Configuration section, select one of the following:

    • Exclude all items in this path. Select this option if you want all items in the specified path to be excluded from the crawl.

    • Include all items in this path. Select this option if you want all items in the path to be crawled.

  5. If you chose to exclude all items in this path, skip to step 6. Otherwise, you can further refine the inclusion by selecting any combination of the following:

    • Follow links on the URL without crawling the URL itself. Select this option if you want to crawl links contained within the URL, but not the URL itself.

    • Crawl complex URLs (URLs that contain a question mark (?)). Select this option if you want to crawl URLs that contain parameters that use the question mark (?) notation.

    • Crawl SharePoint content as HTTP pages. Normally, SharePoint content is crawled by using a special protocol. Select this option if you want SharePoint content to be crawled as HTTP pages instead. When the content is crawled by using the HTTP protocol, item permissions are not stored. This means that all items that match a particular search query appear on search results pages, regardless of whether the user who initiated the query has access to those items.

      This setting is intended for search administrators who crawl remote SharePoint sites that they do not control, and who therefore cannot ensure that the domain account used to crawl those sites has been granted full-read permissions on them.

  6. Click OK.

  7. Repeat steps 2 through 6 for each new crawl rule you want to create.
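The wildcard paths in step 3 can be tried out before you type them into the Path box. The following is a minimal sketch, not Search Server code: it assumes that * stands for any run of characters and that matching ignores case, and the function names wildcard_to_regex and path_matches are hypothetical.

```python
import re


def wildcard_to_regex(path_pattern: str) -> "re.Pattern[str]":
    """Translate a path containing '*' wildcards into a compiled regex."""
    # Escape every special character, then turn each escaped '*' back into
    # "any run of characters". Ignoring case is an assumption of this sketch.
    escaped = re.escape(path_pattern)
    return re.compile(escaped.replace(r"\*", ".*"), re.IGNORECASE)


def path_matches(path_pattern: str, url: str) -> bool:
    """Return True if the whole URL is covered by the wildcard path."""
    return wildcard_to_regex(path_pattern).fullmatch(url) is not None


if __name__ == "__main__":
    print(path_matches("http://server1/folder*", "http://server1/folder/page.html"))
    print(path_matches("*://*.txt", "file://share/notes.txt"))
    print(path_matches("*://*.txt", "http://server1/index.html"))
```

Under these assumptions, the first two checks report a match and the third does not, which mirrors the two example paths given in step 3.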

Edit a crawl rule

You can edit an existing crawl rule at any time by clicking it, and then making the necessary changes to the path and configuration, as described in the previous procedure.

Tip

Editing a crawl rule requires a full crawl of the content affected by the rule.

Delete a crawl rule

Use the following procedure to delete a crawl rule that is no longer needed.

Delete a crawl rule

  1. On the Manage Crawl Rules page, point to the crawl rule that you want to delete, click the arrow that appears, and then click Delete on the menu that appears.

  2. Click OK to confirm the deletion.

Tip

Deleting a crawl rule requires a full crawl of the content affected by the rule.

Reorder crawl rules

After you create new crawl rules, we recommend that you specify the order in which you want the rules to be applied while content is being crawled. Crawl rules are applied in the order in which they are listed. Therefore, if two rules cover the same or overlapping content, the first rule that is listed is applied. Use the following procedure to specify the order of your crawl rules.

Reorder crawl rules

  • On the Manage Crawl Rules page, in the Order column in the list of crawl rules, select a value in the list that specifies the position you want the rule to occupy. Other values are shifted accordingly.

Tip

Reordering crawl rules requires a full crawl of the content affected by the repositioned rule.
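To see why order matters, the first-match behavior described in this section can be modeled in a few lines. This is an illustrative sketch only, not the crawler's implementation: the CrawlRule class and the first_matching_rule function are hypothetical names, * is assumed to match any run of characters, and matching is assumed to ignore case. The article does not say what happens when no rule matches, so the sketch returns None in that case.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CrawlRule:
    path: str      # path with optional '*' wildcards
    include: bool  # True = include all items in this path, False = exclude


def _matches(path: str, url: str) -> bool:
    # Escape the path, then let each '*' match any run of characters.
    pattern = re.escape(path).replace(r"\*", ".*")
    return re.fullmatch(pattern, url, re.IGNORECASE) is not None


def first_matching_rule(rules: List[CrawlRule], url: str) -> Optional[CrawlRule]:
    """Walk the rules in listed order and return the first one that matches."""
    for rule in rules:
        if _matches(rule.path, url):
            return rule
    return None  # no rule covers this URL; the outcome is left undefined here


if __name__ == "__main__":
    exclude_private = CrawlRule("http://server1/folder/private*", include=False)
    include_folder = CrawlRule("http://server1/folder*", include=True)
    url = "http://server1/folder/private/report.html"

    # Specific exclusion listed first: the URL is excluded.
    print(first_matching_rule([exclude_private, include_folder], url).include)
    # Broad inclusion listed first: the same URL is now included.
    print(first_matching_rule([include_folder, exclude_private], url).include)
```

In this model, listing the narrower exclusion rule above the broader inclusion rule is what keeps the private folder out of the crawl; swapping the two rules lets the broader rule win.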