The NTFS File System
Windows 2000 comes with a new version of NTFS. This newest version of NTFS provides performance, reliability, and functionality not found in FAT. Some new features in Windows 2000, such as the Active Directory™ directory service and the storage features based on reparse points, are available only on volumes formatted with NTFS.
NTFS also includes security features required for file servers and high-end personal computers in a corporate environment, and data access control and ownership privileges important for data integrity.
Multiple Data Streams
NTFS supports multiple data streams, where the stream name identifies a new data attribute on the file. A handle can be opened to each data stream. A data stream, then, is a unique set of file attributes. Streams have separate opportunistic locks, file locks, and sizes, but common permissions.
This feature enables you to manage data as a single unit. The following is an example of an alternate stream:
myfile.dat:stream2
A library of files might exist where the files are defined as alternate streams, as in the following example:
library:file1
:file2
:file3
A file can be associated with more than one application at a time, such as Microsoft® Word and Microsoft® WordPad. For instance, a file structure like the following illustrates file association, but not multiple files:
program:source_file
:doc_file
:object_file
:executable_file
You can use the Win32 application programming interface (API) function CreateFile to create an alternate data stream. Or, at the command prompt, you can type commands such as:
echo text>program:source_file
more <program:source_file
Caution
Because NTFS is not supported on floppy disks, when you copy an NTFS file to a floppy disk, data streams and other attributes not supported by FAT are lost without warning.
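The command-prompt example can also be sketched in Python. This is a minimal illustration, not a definitive implementation: the helper names stream_path, write_stream, read_stream, and demo are hypothetical, and the stream behavior applies only when the file lives on an NTFS volume under Windows. On other file systems the same name is not a stream; it may be treated as an ordinary file name that contains a colon, or the call may fail.

```python
import os
import tempfile


def stream_path(file_path: str, stream_name: str) -> str:
    """Build the 'file:stream' name that addresses a named data stream."""
    return f"{file_path}:{stream_name}"


def write_stream(file_path: str, stream_name: str, text: str) -> None:
    # Assumption: on NTFS under Windows, opening 'file:stream' reaches the
    # named stream of 'file'. Elsewhere this is just a name with a colon.
    with open(stream_path(file_path, stream_name), "w", encoding="utf-8") as f:
        f.write(text)


def read_stream(file_path: str, stream_name: str) -> str:
    with open(stream_path(file_path, stream_name), "r", encoding="utf-8") as f:
        return f.read()


def demo() -> str:
    # Hypothetical names that mirror: echo text>program:source_file
    with tempfile.TemporaryDirectory() as folder:
        program = os.path.join(folder, "program")
        write_stream(program, "source_file", "text")
        return read_stream(program, "source_file")
```

Because the stream is addressed purely by name, ordinary file I/O calls are enough; no stream-specific API is needed for simple reads and writes.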
Reparse Points
Reparse points are new file system objects in the version of NTFS included with Windows 2000. Reparse points have a definable attribute containing user-controlled data and are used to extend functionality in the input/output (I/O) subsystem.
For more information about reparse points, see the Platform Software Development Kit (SDK) link on the Web Resources page at https://windows.microsoft.com/windows2000/reskit/webresources.
Change Journal
The change journal is used by NTFS to provide a persistent log of all changes made to files on the volume. For each volume, NTFS uses the change journal to track information about added, deleted, and modified files. The change journal is much more efficient than time stamps or file notifications for determining changes in a given namespace.
The change journal is implemented as a sparse stream in which only a small active range uses any disk allocation. The active range initially begins at offset 0 in the stream and moves monotonically forward. The update sequence number (USN) of a particular record represents its virtual offset in the stream. As the active range moves forward through the stream, earlier records are deallocated and become unavailable. The size of the active range in a sparse file can be adjusted. For more information about the change journal and sparse files, see the Platform Software Development Kit (SDK) link on the Web Resources page at https://windows.microsoft.com/windows2000/reskit/webresources.
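The sliding active range can be pictured with a small in-memory model. The sketch below is illustrative only: the class ChangeJournalModel, its method names, and the fixed 64-byte record size are assumptions made for the example, not the NTFS on-disk format or API.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Record:
    usn: int        # virtual offset of the record in the journal stream
    file_name: str
    reason: str
    size: int       # bytes the record occupies (hypothetical fixed size)


class ChangeJournalModel:
    """Toy model: a stream whose active range only moves forward."""

    def __init__(self, max_active_bytes: int) -> None:
        self.max_active_bytes = max_active_bytes
        self.next_usn = 0                 # the active range starts at offset 0
        self.records: deque = deque()

    def append(self, file_name: str, reason: str, size: int = 64) -> int:
        record = Record(self.next_usn, file_name, reason, size)
        self.records.append(record)
        self.next_usn += size
        # Deallocate the oldest records once the active range grows too large.
        while (len(self.records) > 1
               and self.next_usn - self.records[0].usn > self.max_active_bytes):
            self.records.popleft()
        return record.usn

    @property
    def first_usn(self) -> int:
        return self.records[0].usn if self.records else self.next_usn

    def read_from(self, usn: int) -> list:
        if usn < self.first_usn:
            raise LookupError("records before the active range are unavailable")
        return [r for r in self.records if r.usn >= usn]
```

A reader that remembers the last USN it processed can resume from that point, and must be prepared for a LookupError-style failure if the active range has already moved past it.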
Encryption
File and directory-level encryption is implemented in the version of NTFS included with Windows 2000 for enhanced security in NTFS volumes. Windows 2000 uses Encrypting File System (EFS) to store data in encrypted form, which provides security when the storage media are removed from a system running Windows 2000. For more information about EFS, see the Microsoft® Windows® 2000 Server Resource Kit Distributed Systems Guide.
Sparse File Support
Sparse files allow programs to create very large files, but to consume disk space only as needed. A sparse file is a file with an attribute that causes the I/O subsystem to allocate only the file's meaningful (nonzero) data. All nonzero data is allocated on disk, whereas all nonmeaningful data (large strings of data composed of zeros) is not. When a sparse file is read, allocated data is returned as it was stored, and nonallocated data is returned, by default, as zeros in accordance with the C2 security requirement specification.
NTFS includes full sparse file support for both compressed and uncompressed files. NTFS handles read operations on sparse files by returning allocated data and sparse data. It is possible to read a sparse file as allocated data and a range of data without having to retrieve the entire data set, although, by default, NTFS returns the entire data set.
You can set a user-controlled file system attribute to take advantage of the sparse file function in NTFS. With the sparse file attribute set, the file system can deallocate data from anywhere in the file and, when an application calls, yield the zero data by range instead of storing and returning the actual data. File system APIs allow for the file to be copied or backed up as actual bits and sparse stream ranges. The net result is efficient file system storage and access. Figure 3.4 shows how data is stored with and without the sparse file attribute set.
Figure 3.4 Sparse Data Storage
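The storage behavior described above can be approximated with a toy model. This is a sketch under stated assumptions: SparseFileModel and the 4,096-byte BLOCK granularity are hypothetical, and the model detects zeros at write time for simplicity, whereas on a real volume an application sets the sparse attribute and the file system decides what to deallocate.

```python
BLOCK = 4096  # hypothetical allocation granularity for this model


class SparseFileModel:
    """Toy model: only blocks that hold nonzero data consume storage."""

    def __init__(self, length: int) -> None:
        self.length = length
        self.blocks = {}          # block index -> bytearray(BLOCK)

    def write(self, offset: int, data: bytes) -> None:
        if offset < 0 or offset + len(data) > self.length:
            raise ValueError("write outside the file")
        for i, byte in enumerate(data):
            index, pos = divmod(offset + i, BLOCK)
            if byte == 0 and index not in self.blocks:
                continue          # zeros in an unallocated block need no storage
            self.blocks.setdefault(index, bytearray(BLOCK))[pos] = byte

    def read(self, offset: int, size: int) -> bytes:
        out = bytearray()
        for i in range(offset, min(offset + size, self.length)):
            index, pos = divmod(i, BLOCK)
            block = self.blocks.get(index)
            # Unallocated ranges read back as zeros.
            out.append(block[pos] if block is not None else 0)
        return bytes(out)

    def allocated_bytes(self) -> int:
        return len(self.blocks) * BLOCK
```

For example, a model file with a length of 1 GB that holds four bytes of data reports a single allocated block, while reads anywhere else in the file return zeros.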
Disk Quotas
Disk quotas are a new feature in NTFS that provide more precise control of network-based storage. Disk quotas are implemented on a per-volume basis and enable both hard and soft storage limits to be implemented on a per-user basis. For more information about disk quotas, see "Data Storage and Management" in this book.
The introduction of distributed file system (Dfs), NTFS directory junctions, and volume mount points also creates situations where logical directories do not have to correspond to the same physical volume. Available disk space is based on user context, and the space reported for a volume is not necessarily representative of the space available to the user. For this reason, do not rely on space queries to make assumptions about the amount of available disk space in directories other than the current one. For more information about Dfs, see the Distributed Systems Guide. For more information about volume mount points, see "Volume Mount Points" later in this chapter.
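One way to follow this advice is to query free space for the exact directory that will receive the data, rather than for the current directory or the drive root. The sketch below uses Python's standard shutil.disk_usage; the helper names free_bytes_for and can_probably_store are hypothetical, and the result should be treated as a hint, because quotas and other writers can still cause a later write to fail.

```python
import shutil
import tempfile


def free_bytes_for(directory: str) -> int:
    """Return the free space reported for the volume that backs `directory`."""
    # The query is made against the target path itself, so a junction or
    # mount point along the way is resolved to the volume that holds the data.
    return shutil.disk_usage(directory).free


def can_probably_store(directory: str, needed_bytes: int) -> bool:
    # A hint only: the figure is not adjusted for per-user disk quotas.
    return free_bytes_for(directory) >= needed_bytes


if __name__ == "__main__":
    target = tempfile.gettempdir()
    print(target, free_bytes_for(target))
```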
Distributed Link-Tracking
Windows 2000 provides a distributed link-tracking service that enables client applications to track link sources that have been moved locally or within a domain. Clients that subscribe to this link-tracking service can maintain the integrity of their references because the objects referenced can be moved transparently. Files managed by NTFS can be referenced by a unique object identifier. Link tracking stores a file's object identifier as part of its tracking information.
The distributed link-tracking service tracks shell shortcuts and OLE links within NTFS volumes on computers running Windows 2000. For example, if a shell shortcut is created to a text document, distributed link-tracking allows the shortcut to remain correct, even if the target file moves to a new drive or computer system. Similarly, in a Microsoft® Word document that contains an OLE link to a Microsoft® Excel spreadsheet, the link remains correct even if the Excel file moves to a new drive or computer system.
If a link is made to a file on a volume formatted with the version of NTFS included with Windows 2000, and the file is moved to any other volume with the same version of NTFS within the same domain, the file is found by the tracking service, subject to time considerations. Additionally, if the file is moved outside the domain or within a workgroup, it is likely to be found.
Converting to Windows 2000 File Systems
The on-disk format for NTFS has been enhanced in Windows 2000 to enable new functionality. The upgrade to the new on-disk format occurs when Windows 2000 mounts an existing NTFS volume. The upgrade is quick and automatic; the conversion time is independent of volume size. Note that FAT volumes can be converted to NTFS format at any time using the Convert.exe utility.
Important
Performance of volumes that have been converted from FAT is not as high as that of volumes that were originally formatted with NTFS.
Multiple Booting of Windows NT and Windows 2000
Your ability to access your NTFS volumes when you multiple-boot Windows NT and Windows 2000 depends on which version you are using. (Redirected clients using NTFS volumes on file and print servers are not affected.)
Windows NT Compatibility with the Version of NTFS Included with Windows 2000
When a Windows 2000 volume is mounted on a system running Windows NT 4.0 Service Pack 4, most features of the version of NTFS included with Windows 2000 are not available. However, most read and write operations are permitted if they do not make use of any new NTFS features. Features affected by this configuration include the following:
Reparse points. Windows NT cannot use any features based on reparse points, such as Remote Storage and volume mount points.
Disk quotas. When running Windows NT, Windows 2000 disk quotas are ignored. This allows you to allocate more disk space than is allowed by your quota.
Encryption. Windows NT cannot perform any operations on files encrypted by Windows 2000.
Sparse files. Windows NT cannot perform any operations on sparse files.
Change journal. Windows NT ignores the change journal. No entries are logged when files are accessed.
Cleanup Operations on Windows NT Volumes
Because files on volumes formatted with the version of NTFS included with Windows 2000 can be read and written to by Windows NT, Windows 2000 may need to perform cleanup operations to ensure the consistency of the data structures of a volume after it was mounted on a computer that is running Windows NT. Features affected by cleanup operations are explained below.
Disk quotas. If disk quotas are turned off, Windows 2000 performs no cleanup operations. If disk quotas are turned on, Windows 2000 cleans up the quota information.
If a user exceeds the disk quota while the NTFS volume is mounted by a Windows NT 4.0 system, all further disk allocations of data by that user will fail. The user can still read and write data to any existing file, but will not be able to increase the size of a file. However, the user can delete and shrink files. When the user gets below the assigned disk quota, he or she can resume disk allocations of data. The same behavior occurs when a system is upgraded from a Windows NT system to a Windows 2000 system with quotas enforced.
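The behavior in the preceding paragraph can be modeled in a few lines. This is a hedged sketch, not the NTFS quota implementation: QuotaModel, QuotaExceeded, and write_ignoring_quota are hypothetical names, and the model assumes a simple hard limit in which any growth that would leave the user over the limit fails.

```python
class QuotaExceeded(Exception):
    pass


class QuotaModel:
    """Toy per-user quota on one volume."""

    def __init__(self, limit_bytes: int) -> None:
        self.limit_bytes = limit_bytes
        self.files = {}                    # file name -> size in bytes

    @property
    def used(self) -> int:
        return sum(self.files.values())

    def write_ignoring_quota(self, name: str, size: int) -> None:
        # Models a write made while Windows NT 4.0 has the volume mounted:
        # the quota is ignored, so usage can end up above the limit.
        self.files[name] = size

    def resize(self, name: str, new_size: int) -> None:
        old_size = self.files.get(name, 0)
        growing = new_size > old_size
        if growing and self.used - old_size + new_size > self.limit_bytes:
            raise QuotaExceeded(f"cannot grow {name!r} past the quota")
        self.files[name] = new_size        # shrinking is always allowed

    def delete(self, name: str) -> None:
        self.files.pop(name, None)         # deleting is always allowed
```

In this model a user who is over the limit can still shrink or delete files, and allocations succeed again once usage drops far enough below the limit.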
Reparse points. Because files that have reparse points associated with them cannot be accessed by computers that are running Windows NT 4.0 or earlier, no cleanup operations are necessary in Windows 2000.
Encryption. Because encrypted files cannot be accessed by computers that are running Windows NT 4.0 or earlier, no cleanup operations are necessary.
Sparse files. Because sparse files cannot be accessed by computers that are running Windows NT 4.0 or earlier, no cleanup operations are necessary.
Object identifiers. Windows 2000 maintains two references to the object identifier. One is on the file; the other is in the volume-wide object identifier index. If you delete a file with an object identifier on it, Windows 2000 must scan and clean up the leftover entry in the index.
Change journal. Computers that are running Windows NT 4.0 or earlier do not log file changes in the change journal. When Windows 2000 starts, the change journals on volumes accessed by Windows NT are reset to indicate that the journal history is incomplete. Applications that use the change journal must have the ability to accept incomplete journals.
Structure of an NTFS Volume
Like FAT, NTFS uses clusters as the fundamental unit of disk allocation. In the Disk Management snap-in, you can specify a cluster size of up to 4 KB. If you type format at the command prompt to format your NTFS volume, but do not specify an allocation unit size using the /A:<size> switch, the values in Table 3.4 will be used.
Table 3.4 Default Cluster Sizes for NTFS
Volume Size | Sectors Per Cluster | Default Cluster Size |
---|---|---|
512 MB or less | 1 | 512 bytes |
513 MB–1,024 MB (1 GB) | 2 | 1,024 bytes (1 KB) |
1,025 MB–2,048 MB (2 GB) | 4 | 2,048 bytes (2 KB) |
Greater than 2,049 MB | 8 | 4 KB |
Note
Windows 2000, like Windows NT 3.51 and Windows NT 4.0, supports file compression. Because file compression is not supported on cluster sizes above 4 KB, the default NTFS cluster size for Windows 2000 never exceeds 4 KB. For more information about NTFS compression, see "File and Folder Compression" later in this chapter.
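The defaults in Table 3.4 amount to a simple lookup, sketched below. The function names are hypothetical, the sector size is assumed to be 512 bytes as the table implies, and any volume larger than 2,048 MB is assumed to fall in the last row.

```python
def default_ntfs_cluster_size(volume_mb: int) -> int:
    """Return the default cluster size in bytes for a volume of `volume_mb` MB."""
    if volume_mb <= 0:
        raise ValueError("volume size must be positive")
    if volume_mb <= 512:
        return 512
    if volume_mb <= 1024:
        return 1024
    if volume_mb <= 2048:
        return 2048
    return 4096  # the default never exceeds 4 KB


def sectors_per_cluster(volume_mb: int, sector_bytes: int = 512) -> int:
    # Assumption: 512-byte sectors, matching the table's sector counts.
    return default_ntfs_cluster_size(volume_mb) // sector_bytes
```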
Boot Sector
The first information found on an NTFS volume is the boot sector. The boot sector starts at sector 0 and can be up to 16 sectors long. It consists of two structures:
The BIOS parameter block, which contains information on the volume layout and file system structures.
Code that describes how to find and load the startup files for the operating system being loaded. For Windows 2000, this code loads the file Ntldr. For more information about the boot sector, see "Disk Concepts and Troubleshooting" in this book.
Master File Table and Metadata
When a volume is formatted with NTFS, a Master File Table (MFT) file and other pieces of metadata are created. Metadata are the files NTFS uses to implement the file system structure. NTFS reserves the first 16 records of the MFT for metadata files.
Note
The data segment locations for both $Mft and $MftMirr are recorded in the boot sector. If the first MFT record is corrupted, NTFS reads the second record to find the MFT mirror file. A duplicate of the boot sector is located at the end of the volume.
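As an illustration of how the boot sector leads to the MFT, the sketch below reads a few BIOS parameter block fields from a raw 512-byte sector. The field offsets used here (OEM ID at byte 3, bytes per sector at 0x0B, sectors per cluster at 0x0D, total sectors at 0x28, and the $Mft and $MftMirr cluster numbers at 0x30 and 0x38) come from the commonly documented NTFS boot sector layout rather than from this chapter, so treat them as assumptions. The function name and the returned keys are hypothetical.

```python
import struct


def parse_ntfs_boot_sector(sector: bytes) -> dict:
    """Extract the MFT locations from a raw NTFS boot sector (sketch)."""
    if len(sector) < 512:
        raise ValueError("need at least one 512-byte sector")
    if sector[3:11] != b"NTFS    ":
        raise ValueError("not an NTFS boot sector")
    # Little-endian fields; offsets are assumptions stated in the lead-in.
    bytes_per_sector, sectors_per_cluster = struct.unpack_from("<HB", sector, 0x0B)
    total_sectors, mft_lcn, mftmirr_lcn = struct.unpack_from("<QQQ", sector, 0x28)
    cluster_bytes = bytes_per_sector * sectors_per_cluster
    return {
        "cluster_bytes": cluster_bytes,
        "total_sectors": total_sectors,
        "mft_byte_offset": mft_lcn * cluster_bytes,
        "mftmirr_byte_offset": mftmirr_lcn * cluster_bytes,
    }
```

Reading a real volume requires raw device access and administrative rights, so the function is easiest to try against a sector image saved to a file.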
Table 3.5 lists and briefly describes the metadata stored in the MFT.
Table 3.5 Metadata Stored in the Master File Table
System File | File Name | MFT Record | Purpose of the File |
---|---|---|---|
Master file table | $Mft | 0 | Contains one base file record for each file and directory on an NTFS volume. If the allocation information for a file or directory is too large to fit within a single record, other file records are allocated as well. |
Master file table 2 | $MftMirr | 1 | A duplicate image of the first four records of the MFT. This file guarantees access to the MFT in case of a single-sector failure. |
Log file | $LogFile | 2 | Contains a list of transaction steps used for NTFS recoverability. Log file size depends upon the volume size. It is used by Windows 2000 to restore consistency to NTFS in the event of a system failure. For more information about the log file, see "NTFS Recoverability" later in this chapter. |
Volume | $Volume | 3 | Contains information about the volume, such as the volume label and the volume version. |
Attribute definitions | $AttrDef | 4 | A table of attribute names, numbers, and descriptions. |
Root file name index | $ | 5 | The root directory. |
Cluster bitmap | $Bitmap | 6 | A representation of the volume showing which clusters are in use. |
Boot sector | $Boot | 7 | Includes the bootstrap for the volume if it is a bootable volume. |
Bad cluster file | $BadClus | 8 | Contains bad clusters for the volume. |
Security file | $Secure | 9 | Contains unique security descriptors for all files within a volume. |
Upcase table | $Upcase | 10 | Converts lowercase characters to matching Unicode uppercase characters. |
NTFS extension file | $Extend | 11 | Used for various optional extensions such as quotas, reparse point data, and object identifiers. |
 |  | 12–15 | Reserved for future use. |
The remaining records of the MFT contain the file and directory records for each file and directory on the volume.
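The record layout in Table 3.5 can be expressed as a small lookup. The sketch below is illustrative; the names METADATA_FILES and describe_mft_record are hypothetical, and the file names are copied from the table as printed.

```python
# MFT record number -> metadata file name, as listed in Table 3.5.
METADATA_FILES = {
    0: "$Mft", 1: "$MftMirr", 2: "$LogFile", 3: "$Volume",
    4: "$AttrDef", 5: "$", 6: "$Bitmap", 7: "$Boot",
    8: "$BadClus", 9: "$Secure", 10: "$Upcase", 11: "$Extend",
}


def describe_mft_record(record_number: int) -> str:
    """Classify an MFT record number using the first-16-records rule."""
    if record_number < 0:
        raise ValueError("record numbers start at 0")
    if record_number in METADATA_FILES:
        return METADATA_FILES[record_number]
    if record_number <= 15:
        return "reserved"
    return "file or directory record"
```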
NTFS creates a file record for each file and a directory record for each directory created on an NTFS volume. The MFT includes a separate file record for the MFT itself. These file and directory records are stored in the MFT. The attributes of the file are written to the allocated space in the MFT. Besides file attributes, each file record contains information about the position of the file record in the MFT.
Each file usually uses one file record. However, if a file has a large number of attributes or becomes highly fragmented, it may need more than one file record. If this is the case, the first record for the file, called the base file record, stores the location of the other file records required by the file. Small files and directories (typically 1,500 bytes or smaller) are entirely contained within the file's MFT record.
Directory records contain index information. Small directories might reside entirely within the MFT structure, while large directories are organized into B-tree structures and have records with pointers to external clusters that contain directory entries that could not be contained within the MFT structure.
NTFS File Attributes
Every allocated sector on an NTFS volume belongs to a file. Even the file system metadata is part of a file. NTFS views each file (or folder) as a set of file attributes. Elements such as the file's name, its security information, and even its data, are all file attributes. Each attribute is identified by an attribute type code and, optionally, an attribute name.
When a file's attributes can fit within the MFT file record for that file, they are called resident attributes. Information such as file name and time stamp are always resident attributes. When the information for a file is too large to fit in its MFT file record, some of the file attributes are nonresident. Nonresident attributes are allocated one or more clusters of disk space and stored as an alternate data stream in the volume. NTFS creates the Attribute List attribute to describe the location of both resident and nonresident attribute records.
Table 3.6 lists the file attributes defined by NTFS, although other file attributes might be defined in the future.
Table 3.6 NTFS File Attribute Types
Attribute Type | Description |
---|---|
Standard Information | Includes information such as time stamp and link count. |
Attribute List | Lists the location of all the attribute records that do not fit in the MFT record. |
File Name | A repeatable attribute for both long and short file names. The long name of the file can be up to 255 Unicode characters. The short name is the MS-DOS-readable, 8.3, case-insensitive name for the file. Additional names, or hard links, required by POSIX can be included as additional file name attributes. |
Security Descriptor | Shows information about who owns the file and who can access the file. |
Data | Contains file data. NTFS allows multiple data attributes per file. Each file typically has one unnamed data attribute. A file can also have one or more named data attributes, each using a particular syntax. |
Object ID | A volume-unique file identifier. Used by the link tracking service. Not all files have object identifiers. |
Logged Tool Stream | Similar to a data stream, but operations on a logged tool stream are logged to the NTFS log file just like NTFS metadata changes. Used by EFS. |
Reparse Point | Used for directory junction points and volume mount points. They are also used by file system filter drivers to mark certain files as special to that driver. |
Index Root | Used to implement folders and other indexes. |
Index Allocation | Used to implement folders and other indexes. |
Bitmap | Used to implement folders and other indexes. |
Volume Information | Used only in the $Volume system file. Contains the volume version. |
Volume Name | Used only in the $Volume system file. Contains the volume label. |
MS-DOS-Readable File Names on NTFS Volumes
By default, Windows NT and Windows 2000 generate MS-DOS-readable file names on all NTFS volumes. To improve performance on volumes with many long, similar names, you can change the default value of the registry entry NtfsDisable8dot3NameCreation (in HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\FileSystem) to 1.
Windows 2000 does not generate short (8.3) file names for files created by POSIX-based applications on an NTFS volume, regardless of the value of the NtfsDisable8dot3NameCreation registry entry. This means that MS-DOS-based and 16-bit Windows-based applications cannot view these file names if they are not valid 8.3 file names. Use standard MS-DOS 8.3 naming conventions if you want to use files that are created by a POSIX application with MS-DOS-based or Windows-based applications.
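A program that creates files for use by MS-DOS-based applications can check names before writing them. The sketch below is a simplified test of the 8.3 shape (a base of 1 to 8 characters, an optional extension of 1 to 3 characters, no spaces, and none of a set of punctuation characters). The function name is hypothetical, and the check is an approximation: it does not cover reserved device names or code-page restrictions.

```python
import re

# One character that is allowed in a short name (simplified assumption):
# anything except whitespace, a period, or the listed punctuation.
_CHAR = r'[^\s."/\\\[\]:;|=,+*?<>]'
_SHORT_NAME = re.compile(_CHAR + "{1,8}" + r"(\." + _CHAR + "{1,3})?")


def looks_like_8dot3(name: str) -> bool:
    """Return True if `name` has the basic 8.3 shape (simplified check)."""
    return _SHORT_NAME.fullmatch(name) is not None
```

For example, README.TXT passes the check, while report.html and longfilename.txt do not, so the latter two would not be visible to MS-DOS-based applications without a generated short name.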