NTFS Sparse files range boundaries

Pierre Chatelier 81 Reputation points
2023-12-09T16:49:04.04+00:00

NTFS sparse files (https://learn.microsoft.com/en-us/windows/win32/fileio/sparse-files) are not really well documented.

For instance, you can read "The default data value of a sparse file is zero; however, it can be set to other values.", but I found no further information.

More importantly: what are the requirements for an FSCTL_SET_ZERO_DATA call to be efficient?

I expected it to take effect only on cluster boundaries and in multiples of the cluster size (4 KB by default on my drives), but a few tests revealed that I had to write at least 64 KB of zeros before the size of the file reflected a sparse region.

Where does this value come from? Is there something to know here?
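
As an illustration of the observation above, here is a minimal sketch of the arithmetic involved. It assumes that space is only released in whole, aligned units of 64 KB; that figure is taken from the tests described in the question, not from documentation, and the function name `deallocatable_range` is hypothetical.

```python
def deallocatable_range(offset, length, unit=64 * 1024):
    """Return (start, end) of the part of [offset, offset + length) that
    covers whole aligned units, or None if no whole unit is covered.

    `unit` is an assumption: 64 KB, the threshold observed in the question.
    """
    # Round the start of the zeroed range up to the next unit boundary.
    start = -(-offset // unit) * unit
    # Round the end of the zeroed range down to the previous unit boundary.
    end = (offset + length) // unit * unit
    return (start, end) if end > start else None


if __name__ == "__main__":
    # A single zeroed 4 KB cluster covers no whole 64 KB unit.
    print(deallocatable_range(0, 4096))
    # 64 KB of zeros starting 4 KB into the file straddles two units.
    print(deallocatable_range(4096, 64 * 1024))
    # 128 KB starting 4 KB into the file fully covers exactly one unit.
    print(deallocatable_range(4096, 128 * 1024))
```

Under this assumption, a zeroed range that is large enough but misaligned may still release nothing, which would be consistent with the behaviour described above.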

Windows development | Windows API - Win32
Windows for business | Windows Client for IT Pros | User experience | Other

Answer accepted by question author
  1. Gary Nebbett 6,411 Reputation points
    2023-12-12T10:00:48.82+00:00

    Hello Pierre,

    Using Windows 7 almost certainly explains the trace behaviour and probably explains the unexpected filling of sparse regions that you seem to be encountering.

    The posting at https://bugs.launchpad.net/ubuntu/+source/ntfs-3g/+bug/1958180 might help inform one's understanding of NTFS sparse files. A non-Windows implementation of NTFS sparse files can create sparse regions at the smallest possible size (a cluster); a Windows autochk of the volume will report this, but Windows can still work with the file.

    Using sparse files can add to the fragmentation of a volume (a newly created large file might initially occupy a single range of clusters; making some parts of the file sparse might lead to these freed clusters being used by other files; when the large file is deleted, the initially contiguous free space is now fragmented).

    One idea would be to use a maximally sized FSCTL_SET_ZERO_DATA whenever possible (ignoring alignment and size of the region); if adjacent regions are known to be zero (because the space in the file is being managed by software - perhaps a database application) then include them in the region to be zeroed. This would allow the NTFS implementation to make the best choice about how to perform FSCTL_SET_ZERO_DATA on the region (make sparse or write zeros).

    Gary
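
The suggestion above, to include adjacent regions that are already known to be zero, can be sketched as a small range-merging step performed before issuing the control code. This is only an illustration: the function name `merge_zero_regions` is hypothetical, and the sketch assumes the application itself tracks which regions are zero. Each resulting pair corresponds to the `FileOffset` and `BeyondFinalZero` fields of the `FILE_ZERO_DATA_INFORMATION` structure that FSCTL_SET_ZERO_DATA takes as input.

```python
def merge_zero_regions(regions):
    """Merge overlapping or touching (offset, length) regions.

    Returns a list of (file_offset, beyond_final_zero) pairs, one per
    maximal contiguous zero range, sorted by offset.
    """
    merged = []
    for start, length in sorted(regions):
        end = start + length
        if merged and start <= merged[-1][1]:
            # Region touches or overlaps the previous one: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(r) for r in merged]


if __name__ == "__main__":
    # Two touching regions collapse into one 64 KB range; the third stays separate.
    known_zero = [(4096, 61440), (0, 4096), (131072, 4096)]
    for file_offset, beyond_final_zero in merge_zero_regions(known_zero):
        # Each pair would be passed to one FSCTL_SET_ZERO_DATA request.
        print(file_offset, beyond_final_zero)
```

Issuing one request per merged range, rather than one per small region, leaves the choice between deallocating and writing zeros to the file system, as described in the answer.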

    2 people found this answer helpful.
