Estimate MOSS Search Disk Space Requirements:

Hi all,

I have been asked many times about storage requirements for MOSS Search, that is, what estimates can be made of the disk space requirements for the index server, query servers, and database server.

Below I have pasted content from the TechNet whitepaper on this topic. I pulled it out because many people miss this information:

Index server disk space requirements:

To estimate the index server disk space requirements, we recommend that you use the following calculations:
 

  Size of data crawled = Y
  Size of index on index server (X) = 5% to 12% of Y
  Initial disk space = 2.5 * X

A large amount of index server disk capacity is required to accommodate backups, which must reside on the same disk as the index, and to accommodate the merge process when crawled data is merged with the index.
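
As a rough illustration, the index server calculation can be sketched in Python. The function name, parameter names, and the 1,000 GB corpus are made up for this example; only the 5% to 12% range and the 2.5 multiplier come from the guidance above.

```python
def index_server_disk_gb(crawled_gb, ratio_low=0.05, ratio_high=0.12, headroom=2.5):
    """Return (low, high) initial disk space in GB for the index server."""
    if crawled_gb < 0:
        raise ValueError("crawled_gb must be non-negative")
    index_low = crawled_gb * ratio_low    # X at the 5% end of the range
    index_high = crawled_gb * ratio_high  # X at the 12% end of the range
    # Initial disk space = 2.5 * X, leaving room for backups and merges.
    return index_low * headroom, index_high * headroom


# Hypothetical corpus: 1,000 GB of crawled data (Y).
low_gb, high_gb = index_server_disk_gb(1000)
print(f"Index server disk: {low_gb:.0f} GB to {high_gb:.0f} GB")
```

For a file-share content source, passing ratio_high=0.30 would reflect the larger index size noted below.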

Note: The volume of crawled data can differ based on the content source. A content source is a set of options that you can use to specify what type of content is crawled, what URLs to crawl, and how deep and when to crawl.

For example, if the content source specifies file-share content, the index size can be up to 30 percent of the size of the content.

Content Index Sizing:

You can estimate the size of the content index with the following equation:

Index size = Average size of document * number of documents * 4 x 10^-10 GB.

Note that this equation is intended only to establish a starting-point estimate. Real-world results may vary widely based on the size of documents being indexed, and how much metadata is being indexed during a crawl operation.
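
A minimal sketch of this equation in Python follows. It reads the factor as 4 x 10^-10 and assumes the average document size is given in bytes; the whitepaper excerpt does not state the unit, so treat that as my assumption. The function name and sample numbers are hypothetical.

```python
# Factor from the equation, read as 4 x 10^-10 (GB per unit of document size).
INDEX_FACTOR_GB = 4e-10


def content_index_size_gb(avg_doc_size_bytes, doc_count, factor=INDEX_FACTOR_GB):
    """Starting-point estimate of content index size in GB.

    Assumes avg_doc_size_bytes is in bytes, which the source does not confirm.
    """
    if avg_doc_size_bytes < 0 or doc_count < 0:
        raise ValueError("inputs must be non-negative")
    return avg_doc_size_bytes * doc_count * factor


# Hypothetical farm: one million documents averaging 250,000 bytes each.
estimate_gb = content_index_size_gb(250_000, 1_000_000)
print(f"Estimated content index size: {estimate_gb:.1f} GB")
```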

Query server disk space requirements:

Content indexes are propagated from the index server to every query server in the farm. The full index is propagated to the query servers during the query server initialization phase, and incremental changes in the index are propagated on a continual basis. The merging process requires more disk space than what is required to accommodate the index itself.

Given a content index size of X, we recommend that initial disk space be at least 2.5*X for every content index on each query server in the farm.
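
Because every query server receives every content index, the per-server figure is the sum over all indexes, and the farm total scales with the number of query servers. Here is a small Python sketch of that arithmetic; the function name, the two index sizes, and the server count are invented for illustration.

```python
def query_server_disk_gb(index_sizes_gb, headroom=2.5):
    """Disk space in GB needed on ONE query server for all propagated indexes."""
    # Each content index of size X needs at least 2.5 * X on each query server.
    return sum(size * headroom for size in index_sizes_gb)


# Hypothetical farm: two content indexes (40 GB and 60 GB), three query servers.
index_sizes = [40, 60]
query_server_count = 3

per_server_gb = query_server_disk_gb(index_sizes)
farm_total_gb = per_server_gb * query_server_count
print(f"Per query server: {per_server_gb:.0f} GB, farm total: {farm_total_gb:.0f} GB")
```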

Database server disk space requirements:

The search database that stores the metadata for the search system requires more disk space than the index. This is especially the case if you crawl many SharePoint sites, which are very rich in metadata.

To estimate disk space requirements for the search database, use the following guideline: For an index size of X, we recommend initial disk space of 4*X for the hard disk that contains the search database.
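
The database guideline is a single multiplication, sketched here in Python. The function name and the 120 GB index size are hypothetical; the 4x multiplier is the one quoted above.

```python
def search_database_disk_gb(index_gb, multiplier=4.0):
    """Initial disk space in GB for the disk that holds the search database."""
    if index_gb < 0:
        raise ValueError("index_gb must be non-negative")
    # Guideline: for an index of size X, allow 4 * X for the search database.
    return index_gb * multiplier


# Hypothetical index size X of 120 GB.
db_disk_gb = search_database_disk_gb(120)
print(f"Search database disk: {db_disk_gb:.0f} GB")
```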

Note: When a farm contains only site collections, sites, lists, and document libraries, and no external content such as documents stored on file shares, the typical size of the index is approximately 1-5 percent of the size of the content database. If there are no document libraries in the farm, the typical size of the index is approximately one percent of the size of the content database. The actual size of the index relative to the content database varies depending on the size and type of the documents stored in the farm.
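
For a SharePoint-only farm, the note above gives a quick way to go from content database size to an index size range. A hedged Python sketch, with a hypothetical function name and a made-up 500 GB content database:

```python
def index_size_from_content_db_gb(content_db_gb, has_document_libraries=True):
    """Return (low, high) typical index size in GB for a SharePoint-only farm."""
    if content_db_gb < 0:
        raise ValueError("content_db_gb must be non-negative")
    if has_document_libraries:
        # Typical index size is roughly 1-5 percent of the content database.
        return content_db_gb * 0.01, content_db_gb * 0.05
    # Without document libraries, roughly one percent.
    one_percent = content_db_gb * 0.01
    return one_percent, one_percent


# Hypothetical 500 GB content database.
with_libs = index_size_from_content_db_gb(500)
without_libs = index_size_from_content_db_gb(500, has_document_libraries=False)
print(f"With document libraries: {with_libs[0]:.0f} GB to {with_libs[1]:.0f} GB")
print(f"Without document libraries: about {without_libs[0]:.0f} GB")
```

These are typical values only; as the note says, the actual ratio depends on the size and type of documents stored.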

For more information, please see the whitepaper this content was taken from: https://technet2.microsoft.com/Office/en-us/library/5465aa2b-aec3-4b87-bce0-8601ff20615e1033.mspx?mfr=true

Hope that helps,
Mike