ClusterPerformanceHistory volume wants to be repaired - but how?

Stefan Falk 166 Reputation points
2021-05-05T14:46:54.957+00:00

Hello everybody,

A customer has a 4-node Windows Server 2019 Hyper-V cluster. One node had to be put into maintenance because one of its boot drives (RAID 1) was replaced. After the node was returned to the cluster, the cluster ran just fine. However, we cannot pause a node (for maintenance). Failover Cluster Manager reports error 0x80071763, saying that some cluster storage is in a degraded state.

Windows Admin Center shows us that it is the ClusterPerformanceHistory volume. It says a repair is needed, but the repair does not happen on its own (we waited several days). The volume is ReFS and uses 3-way mirroring on the cluster's S2D storage, but according to WAC it is not a CSV. WAC also says that all data on the volume is safe and available but must be synchronized with other servers in the cluster.

How can we repair this volume please?

Best Regards,
Stefan Falk

Windows Server Clustering

7 answers

  1. JiayaoZhu 3,911 Reputation points
    2021-05-06T03:11:31.9+00:00

    Hi,

    Thanks for your posting!

    Cluster storage can end up in a degraded state for many reasons. As a first step, please try to repair the volume manually via PowerShell and see whether that resolves the issue or whether the commands return any error messages. This article describes the relevant PowerShell command:

    https://learn.microsoft.com/en-us/powershell/module/storage/repair-volume?view=windowsserver2019-ps

    Besides, you mentioned that "WAC also says that all data on the volume is safe and available but must be synchronized with other servers in the cluster," so I am wondering if your issue is related to data sync. You can follow the guidance in this article to monitor the data sync:

    https://learn.microsoft.com/en-us/windows-server/storage/storage-spaces/understand-storage-resync

    Thanks for your support!

    Best regards
    Joan
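
    The checks suggested in this answer can be sketched as follows. This is a minimal sketch, assuming it is run on a cluster node (or in a remote session to one, as in the thread) with the Storage module available. The property selections are only one possible choice, and the thread does not confirm that these checks reveal the cause.

    ```powershell
    # List storage jobs, including repair and resync jobs, with their progress.
    Get-StorageJob |
        Select-Object Name, JobState, PercentComplete, BytesProcessed, BytesTotal

    # Show the health of the virtual disks in the pool. A volume sits on top of
    # a virtual disk, so it can be useful to look at both layers.
    Get-VirtualDisk |
        Select-Object FriendlyName, ResiliencySettingName, HealthStatus, OperationalStatus
    ```

    If a virtual disk reports something other than Healthy, that is the layer to look at next, before the file system on top of it.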


  2. Stefan Falk 166 Reputation points
    2021-05-06T09:35:30.633+00:00

    Thanks for the hints, Joan! I'll try and report here.


  3. Stefan Falk 166 Reputation points
    2021-05-06T17:33:21.053+00:00

    Hello Joan,

    I checked on the cluster today:

    [cluster2019]: PS C:\> Get-StorageJob
    
    Name                             IsBackgroundTask ElapsedTime JobState  PercentComplete BytesProcessed BytesTotal
    ----                             ---------------- ----------- --------  --------------- -------------- ----------
    ClusterPerformanceHistory-Repair False            00:00:00    Completed 100                        0 B        0 B
    

    Why is this job still listed even though its state is "Completed"? Is that the reason WAC tells us that a repair of this volume is (still) necessary?

    Best Regards,
    Stefan


  4. JiayaoZhu 3,911 Reputation points
    2021-05-11T02:09:48.65+00:00

    Hi,

    Thanks for your reply!

    Firstly I would like to check if you have successfully repaired your volume no matter what results appeared after running the command you have mentioned.

    Secondly, if your issue has been resolved then just ignore the "False" output which has few effects on your repair process. "IsBackgroundTask" = "False" means that the command was not run in the background. "-AsJob" is a parameter that creates a background job so that the command runs in the background. If you did not add "-AsJob" to your repair command, this "False" result can occur.

    Here is an article with more details about background jobs and the "-AsJob" parameter:

    https://learn.microsoft.com/en-us/powershell/module/microsoft.powershell.core/about/about_jobs?view=powershell-7.1

    Thanks for your patience!

    BR,
    Joan
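
    The `-AsJob` parameter mentioned in this answer could be used roughly as follows. This is a sketch only: it reuses the volume label from the thread, `$job` is an illustrative variable name, and the thread does not confirm that running the repair this way changes the `IsBackgroundTask` column shown by `Get-StorageJob`.

    ```powershell
    # Start the volume repair as a PowerShell background job instead of
    # blocking the prompt until it finishes.
    $job = Get-Volume |
        Where-Object { $_.FileSystemLabel -eq 'ClusterPerformanceHistory' } |
        Repair-Volume -AsJob

    # Wait for the job to finish, then read its result.
    $job | Wait-Job | Receive-Job
    ```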


  5. Stefan Falk 166 Reputation points
    2021-05-11T08:24:55.747+00:00

    Hello Joan,

    Thanks for your input again.

    > Firstly I would like to check if you have successfully repaired your volume

    [cluster2019]: PS C:\> Get-Volume | Where-Object {$_.FileSystemLabel -eq 'ClusterPerformanceHistory'}
    
    DriveLetter FriendlyName              FileSystemType DriveType HealthStatus OperationalStatus SizeRemaining     Size
    ----------- ------------              -------------- --------- ------------ ----------------- -------------     ----
                ClusterPerformanceHistory ReFS           Fixed     Healthy      OK                     13.86 GB 15.94 GB

    [cluster2019]: PS C:\> Get-Volume | Where-Object {$_.FileSystemLabel -eq 'ClusterPerformanceHistory'} | Repair-Volume
    NoErrorsFound
    [cluster2019]: PS C:\>
    

    The volume itself seems fine.

    > Secondly, if your issue has been resolved then just ignore the "False" output which has few effects on your repair process

    I would like to ignore it, but it prevents the cluster from letting me take a node offline for a reboot. That's the problem.

    Regards,
    Stefan
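
    Since the volume itself reports Healthy, one way to narrow the problem down is to check the virtual disks beneath the volumes before trying to pause a node. The helper below is a hypothetical sketch, not something from the thread: `Test-AllVirtualDisksHealthy` is an invented name, and it assumes the objects passed in expose a `HealthStatus` property that reads `Healthy` when all is well, as the output of `Get-VirtualDisk` does.

    ```powershell
    # Hypothetical helper: returns $true only when every supplied virtual disk
    # object reports a HealthStatus of 'Healthy'.
    function Test-AllVirtualDisksHealthy {
        param(
            [Parameter(Mandatory)]
            [AllowEmptyCollection()]
            [object[]]$VirtualDisk
        )

        # Collect every disk whose health is anything other than 'Healthy'.
        $unhealthy = @($VirtualDisk | Where-Object { $_.HealthStatus -ne 'Healthy' })
        return ($unhealthy.Count -eq 0)
    }
    ```

    On a cluster node this could be called as `Test-AllVirtualDisksHealthy -VirtualDisk (Get-VirtualDisk)`. A result of `$false` would suggest that the degraded state sits at the virtual disk layer and not in the file system that `Repair-Volume` checks.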
