Best Practices: Writing SQL Syntax Queries for Relevant Results in Enterprise Search
Summary: Learn how to write SQL syntax queries to return the most relevant results for Enterprise Search in Microsoft Office SharePoint Server 2007. (4 printed pages)
Dmitriy Meyerzon, Microsoft Corporation
Jo-Anne West, Microsoft Corporation
January 2007
Applies to: Microsoft Office SharePoint Server 2007, Microsoft Office SharePoint Server 2007 for Search
Contents:
Introduction to Constructing the SQL Search Query for Relevance
Sort Query by Rank
Use Only One FREETEXT Clause
Unless Specifying a Particular Column, Specify DefaultProperties as Column for FREETEXT Predicate
Use AND Operator to Combine FREETEXT Predicate and Boolean Restriction
Use AND Boolean Operator to Combine CONTAINS Predicate with FREETEXT Predicate
Conclusion
Additional Resources
Introduction to Constructing the SQL Search Query for Relevance
By following certain basic guidelines for writing good SQL syntax queries, you can construct a SQL search query to achieve improved result relevance in custom search solutions you create for Enterprise Search in Microsoft Office SharePoint Server 2007. This article provides information about these guidelines.
To demonstrate these guidelines, we use the following example of a query, written with the focus on returning relevant results for a search of all content items about "sharepoint", authored by "John Smith".
SELECT Path, Title, Description FROM SCOPE() WHERE
FREETEXT(DefaultProperties, 'sharepoint') AND Author='John Smith'
ORDER BY Rank DESC
This query is composed of the following elements:
ORDER BY clause, specifying rank
Single FREETEXT clause
DefaultProperties column in FREETEXT clause
FREETEXT clause and Boolean restriction combined with AND operator
In the following sections, we examine these parts of the query, along with additional elements to demonstrate the guidelines and how they can help you.
Sort Query by Rank
In Enterprise Search SQL syntax, the ORDER BY clause sorts the query results based on the value of the column you specify. For more information, see ORDER BY Clause in Enterprise Search SQL Syntax. The relevance of each result that is returned for a particular search query is represented by an integer value returned in a column named "Rank". You can specify Rank as the column by which to sort the results in the ORDER BY clause, in descending order, as follows:
...ORDER BY Rank DESC
This ensures that results with a higher relevance value calculated for the query are returned at the start of the result set, while results with a lower relevance value calculated are returned at the end of the result set.
For more information about Rank, see the Understanding Relevance Values section of the Enterprise Search SQL Query Language Information in the Microsoft Office SharePoint Server 2007 Software Development Kit (SDK).
Use Only One FREETEXT Clause
When you construct your SQL syntax query statements, be aware that each FREETEXT clause represents a separate query, and rank is calculated for each separately; FREETEXT clauses are not combined. To return the most relevant results, do not use more than one FREETEXT clause in a query.
The rank value for an individual result returned for a query that contains multiple FREETEXT clauses combined by the AND operator is the smallest value between the separate rank values calculated for each FREETEXT clause in the query. This is shown in the following example.
...WHERE FREETEXT(DefaultProperties, 'sharepoint') AND FREETEXT(DefaultProperties, 'excel')...
To specify multiple query terms in a single FREETEXT clause for a query, append all the query terms to a single string, and separate each term by using a space. This is shown in the following example.
...WHERE FREETEXT(DefaultProperties, 'sharepoint excel')...
For more information, see FREETEXT Predicate in Enterprise Search SQL Syntax in the Microsoft Office SharePoint Server 2007 SDK.
Unless Specifying a Particular Column, Specify DefaultProperties as Column for FREETEXT Predicate
You can specify a single column or a column group against which to test the FREETEXT predicate. For example, the following syntax in our statement specifies to search for content items where the Title property contains the query terms "sharepoint" and "search".
...WHERE FREETEXT(Title, 'sharepoint search')
The column reference in the preceding example is optional. For relevance purposes, however, if you do not need to specify a particular column, we recommend that you specify DefaultProperties as the column. This is shown in the following example.
...WHERE FREETEXT(DefaultProperties, 'sharepoint search')
When you specify DefaultProperties, all indexed text properties that have non-zero weight are searched.
For more information see FREETEXT Predicate in Enterprise Search SQL Syntax in the Microsoft Office SharePoint Server 2007 SDK.
Note
If you do not specify a column reference, only the Contents column, which contains the body of the item, is searched.
Use AND Operator to Combine FREETEXT Predicate and Boolean Restriction
By specifying a Boolean restriction in a query, you reduce the number of possible matches to the query. For example, the following query syntax specifies that only documents authored by "John Smith" be returned.
...Author='John Smith'
In addition, if you combine a Boolean restriction clause with a FREETEXT clause by using the AND operator in a query, you reduce the number of possible matches to a query without affecting the rank values that are calculated based on the FREETEXT clause of the query. Because the rank value based on the Boolean restriction portion of the query is always 1000—the maximum possible rank value—the rank calculated for the result is the value of the FREETEXT clause, because this value is lower than 1000.
For example, when searching for documents about "SharePoint", authored by "John Smith", queries that use the following syntax…
...WHERE FREETEXT(DefaultProperties, 'sharepoint') AND Author='John Smith'...
…should return results that are more relevant than queries that use any of the following syntax examples.
...WHERE FREETEXT(DefaultProperties, 'sharepoint')
...WHERE FREETEXT(DefaultProperties, 'sharepoint "John Smith"')
...WHERE FREETEXT(DefaultProperties, 'sharepoint') OR Author='John Smith'...
For more information, see FREETEXT Predicate in Enterprise Search SQL Syntax in the Microsoft Office SharePoint Server 2007 SDK.
Use AND Boolean Operator to Combine CONTAINS Predicate with FREETEXT Predicate
Queries that contain only the CONTAINS predicate return results with unexpected rank ordering. For improved ranking of search results, you can combine a CONTAINS predicate condition with a FREETEXT predicate condition in the query. This is shown in the following example.
...WHERE FREETEXT(DefaultProperties, 'sharepoint search') AND CONTAINS(Title,'overview')...
For more information, see FREETEXT Predicate in Enterprise Search SQL Syntax and CONTAINS Predicate in Enterprise Search SQL Syntax in the Microsoft Office SharePoint Server 2007 SDK.
Conclusion
Returning the most relevant results for search queries is very important in any search solution. By following the guidelines for constructing SQL syntax queries described in this article, you can help improve the relevance of query results in custom search solutions you create for Enterprise Search.
Additional Resources
For more information, see the following resources: