2 Structures

This document references commonly used data types as defined in [MS-DTYP].

Unless otherwise qualified, instances of GUID in this section refer to [MS-DTYP] section 2.3.4.

Figure 6: Sectors of a compound file with FAT array at sector #0

The main structure that is used to manage sector allocation and sector chains is the file allocation table (FAT). The FAT contains an array of 32-bit sector numbers, where the index represents a sector number, and its value represents the next sector in the chain or a special value.

  •  FAT[0] contains sector #0's next sector in the chain.

  •  FAT[1] contains sector #1's next sector in the chain.

  •  ...

  •  FAT[N] contains sector #N's next sector in the chain.

This allows a compound file to contain many sector chains in a single file. Many compound file structures, including user-defined data, are implemented as sector chains that are represented in the FAT.
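
As an illustration of how a sector chain is read from the FAT, the following is a minimal sketch rather than a normative algorithm. It assumes the FAT has already been loaded into a Python list of integers. The function name `follow_chain` is hypothetical, and the end-of-chain and free-sector markers (0xFFFFFFFE and 0xFFFFFFFF) are values this section only refers to as "a special value".

```python
# Assumed special values; this section does not list them.
ENDOFCHAIN = 0xFFFFFFFE  # marks the last sector of a chain
FREESECT = 0xFFFFFFFF    # marks an unallocated sector


def follow_chain(fat, start):
    """Return the sector numbers of the chain that begins at `start`.

    `fat` is a list in which fat[n] holds the sector that follows
    sector n, or a special value.
    """
    chain = []
    seen = set()
    sector = start
    while sector != ENDOFCHAIN:
        # Guard against out-of-range entries and cycles in a damaged file.
        if sector >= len(fat) or sector in seen:
            raise ValueError("corrupt sector chain at sector %d" % sector)
        seen.add(sector)
        chain.append(sector)
        sector = fat[sector]
    return chain


# Two chains share one FAT: 0 -> 3 -> 1 and the single-sector chain 4.
fat = [3, ENDOFCHAIN, FREESECT, 1, ENDOFCHAIN]
print(follow_chain(fat, 0))  # [0, 3, 1]
print(follow_chain(fat, 4))  # [4]
```

The cycle check is a design choice for robustness against damaged files; it is not something this section requires.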

Even the FAT array itself is represented as a sector chain. Sector chains hold both internal and user-defined data streams. Because the FAT array is stored in a sector chain, the double-indirect file allocation table (DIFAT) array is used to find the FAT sector locations. Each DIFAT array entry contains a 32-bit sector number.

  •  DIFAT[0] contains FAT sector #0's location.

  •  DIFAT[1] contains FAT sector #1's location.

  •  ...

  •  DIFAT[N] contains FAT sector #N's location.
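
The relationship between the DIFAT and the FAT can be sketched as a two-step lookup. The example assumes the DIFAT is available as a Python list and that each FAT sector is completely filled with 32-bit entries, so that a sector of `sector_size` bytes holds `sector_size // 4` of them. The function name `locate_fat_entry` is hypothetical.

```python
def locate_fat_entry(difat, sector, sector_size=512):
    """Find where the FAT entry for `sector` is stored.

    Returns (fat_sector_location, byte_offset), where
    fat_sector_location is the sector number taken from the DIFAT and
    byte_offset is the position of the 32-bit entry inside that sector.
    """
    entries_per_sector = sector_size // 4  # each FAT entry is 4 bytes
    fat_index, slot = divmod(sector, entries_per_sector)
    return difat[fat_index], slot * 4


# With 512-byte sectors, each FAT sector describes 128 sectors.
difat = [0, 130]  # FAT sector #0 is at sector 0, FAT sector #1 at sector 130
print(locate_fat_entry(difat, 5))    # (0, 20)
print(locate_fat_entry(difat, 129))  # (130, 4)
```

Converting the returned sector number to a byte position in the file depends on the header layout, which this section does not cover.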

Because space for streams is always allocated in sector-sized blocks, storing objects that are much smaller than the normal sector size (either 512 bytes or 4,096 bytes) can cause considerable waste. As a solution to this problem, the concept of the mini FAT is introduced.

Figure 7: Mini sectors of a mini stream

The mini FAT is structurally equivalent to the FAT, but it is used in a different way. The sector size for objects that are represented in mini FAT is 64 bytes, instead of the 512 bytes or 4,096 bytes for normal sectors. The space for these objects comes from a special stream that is called the mini stream. The mini stream is an internal stream object that is divided into equal-length mini sectors. Each mini FAT array entry contains a 32-bit sector number for the mini stream, not the file.

  •  MiniFAT[0] contains mini stream sector #0's next sector in the chain.

  •  MiniFAT[1] contains mini stream sector #1's next sector in the chain.

  •  ...

  •  MiniFAT[N] contains mini stream sector #N's next sector in the chain.

Stream objects that have a user-defined data length less than a cutoff (4,096 bytes) are allocated with the mini FAT from the mini stream. Larger stream objects are allocated with the FAT from unallocated free sectors in the file.
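
The cutoff rule above can be sketched as follows. This is an illustration only; the function name `allocation_for` and the constant names are hypothetical, while the numeric values (64-byte mini sectors, 4,096-byte cutoff, 512-byte or 4,096-byte sectors) come from the text.

```python
MINI_SECTOR_SIZE = 64      # bytes per mini sector
MINI_STREAM_CUTOFF = 4096  # streams shorter than this use the mini FAT


def allocation_for(stream_size, sector_size=512):
    """Return (table, unit_count) for a stream of `stream_size` bytes.

    `table` names the allocation table that tracks the stream, and
    `unit_count` is the number of sectors or mini sectors it occupies.
    """
    if stream_size < MINI_STREAM_CUTOFF:
        table, unit = "mini FAT", MINI_SECTOR_SIZE
    else:
        table, unit = "FAT", sector_size
    unit_count = -(-stream_size // unit)  # ceiling division
    return table, unit_count


print(allocation_for(100))   # ('mini FAT', 2)
print(allocation_for(4095))  # ('mini FAT', 64)
print(allocation_for(4096))  # ('FAT', 8)
```

A 100-byte stream therefore occupies 128 bytes of the mini stream instead of a full 512-byte sector, which is the waste the mini FAT is meant to avoid.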

The names of all storage objects and stream objects, along with other object metadata like stream size and storage CLSIDs, are found in the directory entry array. The space for the directory entry array is allocated with the FAT like other sector chains.

  •  DirectoryEntry[0] contains information about the root storage object.

  •  DirectoryEntry[1] contains information about a storage object, stream object, or unallocated object.

  •  ...

  •  DirectoryEntry[N] contains information about a storage object, stream object, or unallocated object.

Figure 8: Entries of a directory entry array

Figure 9: Summary of compound file internal streams and connections to user-defined data streams

This diagram summarizes the compound file main internal streams and how they are linked to user-defined data streams. The DIFAT, FAT, mini FAT, directory entry arrays, and mini stream are internal streams, whereas the user-defined data streams link directly to their stream objects.

In a compound file, all integer fields, including Unicode characters that are encoded in UTF-16, MUST be stored in little-endian byte order. The only exception is in user-defined data streams, where the compound file structure does not impose any restrictions.
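
The byte-order requirement can be illustrated with Python's standard `struct` module and the `utf-16-le` codec. The helper names `read_u32` and `decode_name` are hypothetical, and the sample bytes are made up for the example.

```python
import struct


def read_u32(buf, offset=0):
    """Read one 32-bit unsigned integer stored in little-endian order."""
    return struct.unpack_from("<I", buf, offset)[0]


def decode_name(raw):
    """Decode UTF-16 text whose code units are stored little-endian."""
    return raw.decode("utf-16-le")


# The least significant byte comes first in the file.
print(hex(read_u32(b"\xfe\xff\xff\xff")))      # 0xfffffffe
print(decode_name(b"R\x00o\x00o\x00t\x00"))    # Root
```

The `<` prefix in the format string forces little-endian decoding regardless of the byte order of the machine that runs the code.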