Creating New Data Warehouse Filegroups
You can use SQL Server filegroups to improve the performance of the Data Warehouse by distributing data across multiple disks and by using parallel threads for query processing. Additionally, filegroups can make managing the Data Warehouse databases easier.
SQL Server has a primary file group that contains the primary data file with the system tables. You can create user-defined file groups that contain data files that are grouped together for allocation and administrative purposes.
You can create multiple data files on separate disks and create a user-defined filegroup to contain the files. If you create file groups, try to have one file per physical disk. You can use filegroups to organize your data files, for example to separate files that are heavily queried from those that are heavily modified.
SQL Server designates one filegroup as the default filegroup. The default filegroup is set to the primary filegroup at the time of database creation, unless you specify otherwise. The default filegroup contains the pages for all tables and indexes that do not have a filegroup specified when they are created.
If the default filegroup is left as the primary filegroup, you must size the primary filegroup appropriately or set it to automatically grow so that you do not run out of space. The primary filegroup must be large enough to hold all system tables and any tables and indexes not allocated to a user-defined filegroup.
If the primary filegroup runs out of space, you will be unable to add any information to the system tables. However, if a user-defined filegroup runs out of space, only the user files that are specifically allocated to that file group are affected.
User-defined filegroups can improve performance by balancing data load across multiple disks and by using parallel threads to improve data access. When you create a table, you can assign it to a user-defined filegroup. As data is written to the filegroup, each file is filled in parallel.
Each file is physically placed on a disk or a set of disks. SQL Server maintains a file map that associates each database object with its location on the disk. For example, if one file is created on a filegroup that spans four disks, one file map points to the location of data on all four physical disks. If two files are created on a filegroup that spans four disks, two file maps (one map for each file) point to the location of the data on all four physical disks.
Filegroups use parallel threads to improve data access. When a table is accessed sequentially, the system creates a separate thread for each file in parallel. When the system performs a table scan for a table in a filegroup with four files, it uses four separate threads to read the data in parallel. In general, using multiple files on separate disks improves performance. Too many files in a filegroup can cause too many parallel threads and create bottlenecks.
In addition to using filegroups to balance data loads for performance, you can use filegroups to simplify maintenance.
To use filegroups to simplify maintenance:
- Back up or restore individual files or filegroups instead of backing up or restoring an entire database. Backing up files or filegroups might be necessary on large databases to have an effective backup and restore strategy.
- Group tables and indexes with similar maintenance requirements into the same filegroups. For example, you might need to perform maintenance on some objects more frequently than on others. By creating two filegroups and assigning tables to them, you can run daily maintenance tasks against the tables in a daily group, and weekly maintenance tasks against the tables in a weekly group. This limits disk contention between the two filegroups.
- Assign an individual high-maintenance table to its own filegroup. A table that has frequent updates might need to be backed up and restored separately from the database as a whole.
- A maximum of 256 filegroups can be created for each database.
- Filegroups can contain only data files.
- Transaction logs cannot be part of a filegroup.
- Filegroups cannot be created independently of database files. A filegroup is an administrative mechanism for grouping files within a database.
Best practices for creating filegroups
Before you create filegroups, you must understand your database structure, data, transactions, and queries thoroughly to determine the best way to place tables and indexes on specific filegroups.
When creating filegroups, you should:
- Use maintenance requirements rather than performance considerations to determine the number of filegroups.
In many cases, using the striping capabilities of RAID provides much of the same performance gain that you might achieve with filegroups, without the added administrative burden of defining and managing them.
- Change the default filegroup if you use filegroups.
If you create multiple filegroups for your database, you should assign one as the default. This prevents unexpected table growth from constraining the system tables in the primary filegroup.
- Be aware that filegroups do not provide fault tolerance.
To include fault tolerance, you can mirror each disk by using RAID 1. However, this is an expensive option.
- Combine filegroups with hardware-based RAID to optimize your databases.
You can combine filegroups with hardware-based RAID solutions. First, set up hardware striping, and then use filegroups to spread data across multiple hardware stripe sets. A separate thread used for each file provides parallel data access, and the load is spread among multiple disks to reduce contention.
To create a new database in a filegroup
In SQL Server Enterprise Manager, expand a server group, and then expand a server.
Right-click Databases, and then click New Database.
In the Database Properties dialog box, in the Name box, type a name for the new database.
- The primary database and transaction log files are created by using the database name you specified as the prefix, for example newdb_Data.mdf and newdb_Log.ldf. The initial sizes of the database and transaction log files are the same as the default sizes specified for the model database. The primary file contains the system tables for the database.
On the Data Files tab, click the next to the first open position in the File Name list.
In the Locate Database File dialog box, navigate to the location for the new database file, in the File name box, type the name of the new database file, and then click OK.
In the Database Properties dialog box, on the Data Files tab, do the following:
Use this To do this File Name Type the file name for the database file. Location Type the location of the database file, or click the ellipsis to navigate to the location. Initial size (MB) Type the initial size of the database file in megabytes (MB). Filegroup Type the name of the filegroup to which the database file will belong.
The database is created, and the new data file belongs to the specified file group.
Copyright © 2005 Microsoft Corporation.
All rights reserved.