Running Data Deduplication
Applies to: Windows Server 2022, Windows Server 2019, Windows Server 2016, Azure Stack HCI, versions 21H2 and 20H2
Running Data Deduplication jobs manually
You can run every scheduled Data Deduplication job manually by using the following PowerShell cmdlets:
- Start-DedupJob: Starts a new Data Deduplication job
- Stop-DedupJob: Stops a Data Deduplication job already in progress (or removes it from the queue)
- Get-DedupJob: Shows all the active and queued Data Deduplication jobs
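For example, a typical sequence is to queue a job, check its progress, and cancel it if needed. This sketch assumes a deduplication-enabled volume with the drive letter D: — substitute your own volume:

```powershell
# Queue a Garbage Collection job on volume D: (an assumed example volume)
Start-DedupJob -Type GarbageCollection -Volume D:

# List all active and queued Data Deduplication jobs, including progress
Get-DedupJob

# Stop the queued or running Garbage Collection job on D: if necessary
Stop-DedupJob -Type GarbageCollection -Volume D:
```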
All settings that are available when you schedule a Data Deduplication job are also available when you start a job manually, except for the scheduling-specific settings. For example, to start an Optimization job manually with high priority, maximum CPU usage, and maximum memory usage, execute the following PowerShell command with administrator privileges:
Start-DedupJob -Type Optimization -Volume <Your-Volume-Here> -Memory 100 -Cores 100 -Priority High
Monitoring Data Deduplication
Because Data Deduplication uses a post-processing model, it is important that Data Deduplication jobs succeed. An easy way to check the status of the most recent job is to use the Get-DedupStatus PowerShell cmdlet. Periodically check the following fields:
- For the Optimization job, look at LastOptimizationResult (0 = Success) and LastOptimizationTime (should be recent).
- For the Garbage Collection job, look at LastGarbageCollectionResult (0 = Success) and LastGarbageCollectionTime (should be recent).
- For the Integrity Scrubbing job, look at LastScrubbingResult (0 = Success) and LastScrubbingTime (should be recent).
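A quick way to spot-check these fields together is to select them from the Get-DedupStatus output (the D: volume here is an assumed example):

```powershell
# Show the most recent job results and times for volume D:
Get-DedupStatus -Volume D: |
    Select-Object Volume,
                  LastOptimizationResult, LastOptimizationTime,
                  LastGarbageCollectionResult, LastGarbageCollectionTime,
                  LastScrubbingResult, LastScrubbingTime
```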
More detail on job successes and failures can be found in the Windows Event Viewer under \Applications and Services Logs\Windows\Deduplication\Operational.
One indicator of Optimization job failure is a downward-trending optimization rate, which might indicate that the Optimization jobs are not keeping up with the rate of changes, or churn. You can check the optimization rate by using the Get-DedupStatus PowerShell cmdlet.
Get-DedupStatus has two fields that are relevant to the optimization rate: OptimizedFilesSavingsRate and SavingsRate. Both are important values to track, but each has a unique meaning:
- OptimizedFilesSavingsRate applies only to the files that are 'in-policy' for optimization (space used by optimized files after optimization / logical size of optimized files).
- SavingsRate applies to the entire volume (space used by optimized files after optimization / total logical size of the volume).
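To compare the two rates side by side, you can select both fields from the Get-DedupStatus output (the D: volume is an assumed example):

```powershell
# Compare in-policy savings against whole-volume savings on D:
Get-DedupStatus -Volume D: |
    Select-Object Volume, OptimizedFilesSavingsRate, SavingsRate
```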
Disabling Data Deduplication
To turn off Data Deduplication, run the Unoptimization job. To undo volume optimization, run the following command:
Start-DedupJob -Type Unoptimization -Volume <Desired-Volume>
The Unoptimization job will fail if the volume does not have sufficient space to hold the unoptimized data.
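Because the job needs room for the rehydrated data, it can help to compare the volume's free space against the space deduplication is currently saving before you start. This is a rough sketch, with D: as an assumed example volume:

```powershell
# Compare free space with the space currently saved by deduplication on D:
$status = Get-DedupStatus -Volume D:
$volume = Get-Volume -DriveLetter D

if ($volume.SizeRemaining -gt $status.SavedSpace) {
    Start-DedupJob -Type Unoptimization -Volume D:
} else {
    Write-Warning "Not enough free space on D: to hold the unoptimized data."
}
```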
Frequently Asked Questions
Is there a System Center Operations Manager Management Pack available to monitor Data Deduplication? Yes. Data Deduplication can be monitored through the System Center Management Pack for File Server. For more information, see the Guide for System Center Management Pack for File Server 2012 R2 document.