Performance Tuning and Optimization (Full-Text Search)
Performance for full-text indexing and full-text queries is influenced by hardware resources, such as memory, disk speed, and CPU speed.
Full-Text Indexing Performance
The main cause for reduced full-text indexing performance is hardware-resource limits:
- If CPU usage by the MSFTESQL service and SQL Server are close to 100 percent, the CPU is the bottleneck.
- If the average disk-waiting queue length is more than two times the number of disk heads, there is a bottleneck on the disk. The primary workaround is to create full-text catalogs that are separate from the SQL Server database files and logs. Put the logs, database files, and full-text catalogs on separate disks. Buying faster disks and using RAID can also help improve indexing performance.
- If there is a shortage of physical memory (3-GB limit) and the service-pause message appeared in the crawl log, memory is the bottleneck. For information about how to resolve this problem see Paused Microsoft Full-Text Engine for SQL Server (MSFTESQL) Service. The MSFTESQL service does not support AWE. Adding more than 4 GB of memory only helps SQL Server.
If hardware bottlenecks are not observed on the system, you could fine-tune the system to maximize hardware capacity. When no hardware bottlenecks are present, the indexing performance of Full-Text Search in Microsoft SQL Server 2005 mostly depends on the following:
- How long it takes SQL Server to create full-text batches.
- How quickly the MSFTESQL service can consume those batches.
The interaction between SQL Server and the MSFTESQL service must be tuned to reach optimal performance. If SQL Server produces more batches than the MSFTESQL service can consume, the service pauses and generates a crawl-log message to indicate this paused state. For information about how to resolve this problem, see Paused Microsoft Full-Text Engine for SQL Server (MSFTESQL) Service.
If, on the other hand, SQL Server does not produce enough full-text batches to keep the MSFTESQL service busy, the service is idle. This state is not optimal. In fact, this is the most common reason for indexing that runs slowly. To make sure that MSFTESQL service is optimally occupied, you must track and tune the following counters:
- Batches in progress counter: Microsoft Full-Text Engine Filter Daemon (MSFTELFD)
This counter should be either equal to or double the number of CPUs in the system. Values of 0, 1 or 2 with low CPU usage indicate that SQL Server is not performing well. For example, if you have a four-way computer, this number should be between 4 and 8.
- Batches in ready queue: MSFTESQL service
The value should be close to 10 times the crawl range count. To determine how many ranges are used to index the table, run a query from sys.dm_fts_population_ranges.
If the counter is low, you can improve it in the following ways:
Make sure that the table has a multicrawl range. To verify, query sys.dm_fts_population_ranges. Ideally, the crawl-range count should be two times the number of CPUs. The crawl range is limited by the number of rows in the table, the number of CPUs, and the max full-text crawl range configuration option. You must restart the crawl operation for the option to take effect.
This only applies to a full crawl.
Make sure that the base table has a clustered index. Use an integer data type for the first column of the clustered index. Avoid using GUIDs in the first column of the clustered index. A multirange crawl on a clustered index can produce the highest crawl speed.
Update the statistics of the base table by using the UPDATE STATISTICS statement. More important, update the statistics on the clustered index or the full-text key for a full crawl. This helps a multirange crawl to generate good partitions on the table.
Build a secondary index on a timestamp column if you want to improve the performance of incremental population.
Unlike full crawl, incremental, manual, and auto change tracking population are not designed to maximize hardware resources to achieve faster speed. Therefore, these tuning suggestions may not enhance performance for full-text indexing.
Recommendations to Increase Full-Text Query Performance
The following is a list of recommendations that will help increase full-text query performance:
- Defragment the index of the base table by using ALTER INDEX REORGANIZE.
- Reorganize the full-text catalog by using ALTER FULLTEXT CATALOG REORGANIZE. Make sure that you do this before performance testing because running this statement causes a master merge of the full-text indexes in that catalog.
- Restrict your choice of full-text key columns to a small column. Although a 900-byte column is supported, we do not recommend building a full-text index that uses a key column of that size.
- Combine multiple CONTAINS predicates into one CONTAINS predicate. In SQL Server you can specify a list of columns in the CONTAINS query.
- If you only require full-text key or rank information, use CONTAINSTABLE or FREETEXTTABLE instead of CONTAINS or FREETEXT, respectively.
- To limit results and increase performance, use the TOP_N_BY_RANK option of the FREETEXTTABLE and CONTAINSTABLE syntax. Use this option you are not interested in all possible hits.
- Check the full-text query plan to make sure that the appropriate join plan is chosen. Use a join hint or query hint if you have to. If a parameter is used in the full-text query, the first-time value of the parameter determines the query plan. You can use the OPTIMIZE FOR query hint to force the query to compile with the value you want. This helps achieve a deterministic query plan and better performance.
Help and Information
14 April 2006