Using Files and Filegroups
SQL Server maps a database over a set of operating-system files. Data and log information are never mixed on the same file, and individual files are used only by one database. Filegroups are named collections of files and are used to simplify data placement and administrative tasks such as backup and restore operations. For more information, see Files and Filegroups Architecture.
File and Filegroup Fill Strategy
Filegroups use a proportional fill strategy across all the files within each filegroup. As data is written to the filegroup, the SQL Server Database Engine writes an amount proportional to the free space in the file to each file within the filegroup, instead of writing all the data to the first file until full. It then writes to the next file. For example, if file f1 has 100 MB free and file f2 has 200 MB free, one extent is allocated from file f1, two extents from file f2, and so on. In this way, both files become full at about the same time, and simple striping is achieved.
As soon as all the files in a filegroup are full, the Database Engine automatically expands one file at a time in a round-robin manner to allow for more data, provided that the database is set to grow automatically. For example, a filegroup is made up of three files, all set to automatically grow. When space in all the files in the filegroup is exhausted, only the first file is expanded. When the first file is full and no more data can be written to the filegroup, the second file is expanded. When the second file is full and no more data can be written to the filegroup, the third file is expanded. If the third file becomes full and no more data can be written to the filegroup, the first file is expanded again, and so on.
Improving Database Performance
Using files and filegroups improves database performance, because it lets a database be created across multiple disks, multiple disk controllers, or RAID (redundant array of independent disks) systems. For example, if your computer has four disks, you can create a database that is made up of three data files and one log file, with one file on each disk. As data is accessed, four read/write heads can access the data in parallel at the same time. This speeds up database operations. For more information about hardware solutions, see Database Performance.
Additionally, files and filegroups enable data placement, because a table can be created in a specific filegroup. This improves performance, because all I/O for a specific table can be directed at a specific disk. For example, a heavily used table can be put on one file in one filegroup, located on one disk, and the other less heavily accessed tables in the database can be put on the other files in another filegroup, located on a second disk.
Implementing Backup and Restore Strategies
Databases made up of multiple filegroups can be restored in stages by a process known as piecemeal restore. Piecemeal restore works with all recovery models, but is more flexible for the full and bulk-logged models than for the simple model. The piecemeal restore scenario includes all three phases of restore: data copy, redo or roll forward, and undo or roll back. For more information, see Performing Piecemeal Restores.
When multiple filegroups are used, the files in a database can be backed up and restored individually. Under the simple recovery model, file backups are allowed only for read-only files. Using file backups can increase the speed of recovery by letting you restore only damaged files without restoring the rest of the database. For example, if a database is made up of several files physically located on different disks and one disk fails, only the file on the failed disk has to be restored. For more information, see BACKUP (Transact-SQL).
Rules for Designing Files and Filegroups
The following rules pertain to files and filegroups:
A file or filegroup cannot be used by more than one database. For example, file sales.mdf and sales.ndf, which contain data and objects from the sales database, cannot be used by any other database.
A file can be a member of only one filegroup.
Transaction log files are never part of any filegroups.
Recommendations
Following are some general recommendations when you are working with files and filegroups:
Most databases will work well with a single data file and a single transaction log file.
If you use multiple files, create a second filegroup for the additional file and make that filegroup the default filegroup. In this way, the primary file will contain only system tables and objects.
To maximize performance, create files or filegroups on as many different available local physical disks as possible. Put objects that compete heavily for space in different filegroups.
Use filegroups to enable placement of objects on specific physical disks.
Put different tables used in the same join queries in different filegroups. This will improve performance, because of parallel disk I/O searching for joined data.
Put heavily accessed tables and the nonclustered indexes that belong to those tables on different filegroups. This will improve performance, because of parallel I/O if the files are located on different physical disks.
Do not put the transaction log file or files on the same physical disk that has the other files and filegroups.