FSRM Windows Server 2022 File Classification hash value issue

William Jackson 20 Reputation points
2025-12-02T16:37:45.1266667+00:00

Hello there,

I am trying to find an answer to a question regarding Microsoft Office file formats and FSRM File Classification. I work in a legal system as an IT Admin. I was tasked with creating a retention policy for a file server share used for evidence processing. As it is very important to preserve the forensic integrity of the data, I couldn't base my retention on the usual attributes such as created, modified, and accessed dates and times.

I installed FSRM and set up a simple classification property of type "Date-time". I then created a classification rule for the evidence share using the Windows PowerShell Classifier (simply running Get-Date), with re-evaluation switched off. When I started the rule on the share it seemed perfect: it scans the folders every day and writes the server date/time from Get-Date to each file's Classification tab. If newer evidence arrives in the folder structure, it receives its date of arrival in the next scheduled scan. I could then base my file management task (retention script) on the Classification tab without touching file integrity (or so I thought!).

Recently, after I enabled the rule, a colleague demonstrated to me that when a standard format file (txt, rtf, bmp, jpg, pdf, etc.) is dropped into the share being scanned by the rule, its hash value stays the same: no change, all good. BUT when a Microsoft Office document (docx, xlsx, etc.) is dropped into the share, its hash value changes, rendering the evidence useless. I cannot find the reason for this and can find no direct reference to this problem on the internet.

If anybody has some insight into this, I would be very interested to hear it. Up until a few days ago I thought I had cracked this retention problem, and now it's back to the drawing board.

Best Regards

Neil

Windows for business | Windows Server | Directory services | Other

Answer accepted by question author
  1. VPHAN 9,760 Reputation points Independent Advisor
    2025-12-02T17:12:35.0033333+00:00

    Good morning William Jackson,

    The problem most likely stems from how FSRM stores classification properties rather than from the rule merely reading the file. For most file types, the File Classification Infrastructure keeps property values in an NTFS alternate data stream, which leaves the main data stream (the bytes you hash) untouched. Office Open XML files (.docx, .xlsx, .pptx) are handled differently: they are ZIP packages containing XML parts and binary components, and FSRM's Office storage module writes the classification value into the package itself as a document property. Rewriting the package alters the file's binary content and therefore its cryptographic hash.
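
Because an Office Open XML file is a ZIP package, one way to see what actually changed is to hash each part separately in a copy taken before classification and a copy taken after it. The following is a minimal sketch, not an FSRM feature: the function name `Get-PackagePartHash` is hypothetical, and it should be pointed at working copies rather than the evidence originals. Custom document properties normally live in `docProps/custom.xml`, so a difference confined to that part would suggest a property was written into the document.

```powershell
# Sketch: list every part inside an Office Open XML package with its SHA-256,
# so two copies of the same document can be compared part by part.
Add-Type -AssemblyName System.IO.Compression
Add-Type -AssemblyName System.IO.Compression.FileSystem

function Get-PackagePartHash {
    param(
        # Full path to a COPY of the .docx/.xlsx/.pptx file.
        [Parameter(Mandatory)][string]$Path
    )

    # OpenRead opens the archive for reading only.
    $zip = [System.IO.Compression.ZipFile]::OpenRead($Path)
    $sha = [System.Security.Cryptography.SHA256]::Create()
    try {
        foreach ($entry in $zip.Entries) {
            $stream = $entry.Open()
            try {
                $bytes = $sha.ComputeHash($stream)
            } finally {
                $stream.Dispose()
            }
            [pscustomobject]@{
                Part   = $entry.FullName
                Sha256 = [System.BitConverter]::ToString($bytes) -replace '-', ''
            }
        }
    } finally {
        $sha.Dispose()
        $zip.Dispose()
    }
}
```

With two hypothetical copies, `Compare-Object (Get-PackagePartHash C:\work\before.docx) (Get-PackagePartHash C:\work\after.docx) -Property Part, Sha256` lists only the parts that were added, removed, or changed.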

    To timestamp arrival without altering file integrity, avoid any method that opens or writes to the file itself. Instead, leverage the NTFS USN change journal, which logs file system events without modifying the files. You can create a scheduled PowerShell script that queries the journal for USN_REASON_FILE_CREATE records on the volume hosting your evidence share, extracts the date/time at which each file was created on the volume, and writes that timestamp to a separate, secured database (e.g., SQLite or a dedicated SQL Server instance). The file itself is never opened; only the journal's records are read. Because the journal has a fixed maximum size and discards its oldest records, the script needs to run often enough to capture new entries before they are overwritten.

    Here is a simplified PowerShell example to capture file creation events:

    powershell

    $volume
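
The snippet above is cut off after its first variable in this copy of the thread. What follows is not the answerer's original code but a minimal sketch of the filtering step only. It assumes the journal records have already been read (for example from `fsutil usn readjournal` or `FSCTL_READ_USN_JOURNAL`, neither of which is parsed here) and turned into objects with hypothetical `FileName`, `Reason`, and `TimeStamp` properties. The one fixed value is `USN_REASON_FILE_CREATE` (0x00000100) from the Windows SDK.

```powershell
# USN_REASON_FILE_CREATE as defined in the Windows SDK (winioctl.h).
$USN_REASON_FILE_CREATE = 0x00000100

function Select-FileCreateRecord {
    param(
        # A journal record object; the property names are assumptions of this sketch.
        [Parameter(ValueFromPipeline)]
        $Record
    )
    process {
        # Reason is a bit mask, so test the create bit rather than comparing for equality.
        if (([int64]$Record.Reason -band $USN_REASON_FILE_CREATE) -ne 0) {
            [pscustomobject]@{
                FileName  = $Record.FileName
                TimeStamp = $Record.TimeStamp
            }
        }
    }
}
```

A USN record carries only the file name and a parent file reference, not a full path, so a real script would still need to resolve the parent directory before deciding whether a record belongs to the evidence share.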
    

    You would then base your retention policy on this external log. For absolute integrity, combine this with a write-blocker at the hardware or driver level for the evidence volume, ensuring no process can modify the files. Furthermore, immediately upon file arrival, calculate and store the original hash in your external database. Your retention script can then compare current hashes against this record to detect any unauthorized changes.
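
The hash-and-compare step can be sketched as follows. This is one possible shape under stated assumptions, not a definitive implementation: the function names `Add-EvidenceBaseline` and `Test-EvidenceIntegrity` are hypothetical, a CSV file stands in for the external database, and the manifest must live outside the evidence share. `Get-FileHash` only reads the file, although reads can still update the last-access time on volumes where that is enabled.

```powershell
# Record a SHA-256 and first-seen time for every file not yet in the manifest.
function Add-EvidenceBaseline {
    param(
        [Parameter(Mandatory)][string]$Root,          # evidence folder
        [Parameter(Mandatory)][string]$ManifestPath   # CSV kept OUTSIDE $Root
    )

    # Paths already recorded are skipped, so first-seen values are never overwritten.
    $known = @{}
    if (Test-Path -LiteralPath $ManifestPath) {
        Import-Csv -LiteralPath $ManifestPath | ForEach-Object { $known[$_.Path] = $true }
    }

    Get-ChildItem -LiteralPath $Root -File -Recurse |
        Where-Object { -not $known.ContainsKey($_.FullName) } |
        ForEach-Object {
            [pscustomobject]@{
                Path         = $_.FullName
                Sha256       = (Get-FileHash -LiteralPath $_.FullName -Algorithm SHA256).Hash
                FirstSeenUtc = (Get-Date).ToUniversalTime().ToString('o')
            }
        } |
        Export-Csv -LiteralPath $ManifestPath -Append -NoTypeInformation
}

# Re-hash every recorded file and report whether it still matches its baseline.
function Test-EvidenceIntegrity {
    param([Parameter(Mandatory)][string]$ManifestPath)

    Import-Csv -LiteralPath $ManifestPath | ForEach-Object {
        $current = if (Test-Path -LiteralPath $_.Path) {
            (Get-FileHash -LiteralPath $_.Path -Algorithm SHA256).Hash
        } else {
            $null   # a missing file is reported as not intact
        }
        [pscustomobject]@{
            Path   = $_.Path
            Intact = ($current -eq $_.Sha256)
        }
    }
}
```

A retention script could then select files by the `FirstSeenUtc` column instead of any timestamp or property stored on the files themselves.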

    I hope you've found something useful here. If it helped you gain more insight into the issue, please consider accepting the answer. Should you have more questions, feel free to leave a message. Have a nice day!

    VPHAN

    1 person found this answer helpful.

1 additional answer

  1. Q&A Assist
    2025-12-02T16:37:59.19+00:00
    AI generated content. This question contains an answer created with AI from Q&A Assist.

    It appears that the issue you're experiencing with Microsoft Office documents (like docx and xlsx) changing their hash values upon being processed by the File Server Resource Manager (FSRM) classification rules may be related to how these file formats handle metadata and internal structures.

    When a Microsoft Office document is created or modified, it often includes various metadata and properties that can change even if the visible content does not. This can include changes to the document's internal structure, which can affect the hash value. In contrast, simpler file formats like txt or jpg may not have this complexity, leading to unchanged hash values when processed by FSRM.

    To maintain the forensic integrity of your evidence, you might consider alternative approaches, such as:

    1. Using a different classification method: Instead of relying solely on the FSRM classification that modifies the files, explore options that do not alter the files' metadata.
    2. Implementing a separate logging mechanism: Keep track of file arrivals and their original states without modifying the files themselves. This could involve maintaining a separate database or log file that records the necessary information.
    3. Consulting with Microsoft Support: If this issue significantly impacts your operations, reaching out to Microsoft Support for specific guidance on handling Office document integrity with FSRM might be beneficial.

    These strategies could help you achieve your retention goals while preserving the integrity of the evidence files.
