
Cannot add storage back after S2D node failure in a 2 node cluster.

Jo Ford 20 Reputation points
2026-02-04T19:10:53.2133333+00:00

Hi,

I have a two-node Storage Spaces Direct cluster, with 2 NVMe cache drives and 8 SSD capacity drives per node.

Had a failure of a node, and rebuilt the OS, Drivers, etc. Same drives.

Rejoined to cluster, but drives showed as failed.

Reset the drives, no joy. Set the drives to retired, and they sit there saying 'removing from pool'.

Evicted the node. Reset and cleared each drive in turn back to 'CanPool' status.

Joined again, and same thing: drives showed "retired, removing from pool".

Tried to remove the drives from the pool, but it won't let me, as it says there is not enough capacity.

Tried to reset again, and now it says the configuration is read-only (which it isn't).

Used the Clear-PhysicalDiskHealthData script, which removed all errors except 'Transient Error'

By physically removing the NVMe drives, and reinstalling them, these have come back and seem to have joined the pool. But the same trick doesn't work with the capacity drives.

If I look at pool usage, it shows the drives all in use, but unbalanced, so it seems there is old metadata associated with the drives by serial number that is not being cleared.

Thinking I will have to rebuild the storage from scratch but want to avoid that if possible. Any ideas?

Hoping as usual I am missing something simple.

I tried to upload screen shots, but it violated some policy.



2 answers

  1. Domic Vo 22,925 Reputation points Independent Advisor
    2026-02-10T06:02:44.6733333+00:00

    Good morning

    I hope you are doing well.

    Have you found the answer useful? If everything is okay, please share your experience with the issue by accepting the answer. Should you need more information, feel free to leave a message. Happy to help! :)

    Domic Vo.


  2. Domic Vo 22,925 Reputation points Independent Advisor
    2026-02-04T19:57:26.01+00:00

    Hello Jo Ford,

    What you are running into is a Storage Spaces Direct metadata persistence issue. When a node fails and is rebuilt, the cluster still associates the physical disks by serial number with their previous pool membership. Even after resetting or evicting the node, the metadata on the SSDs remains, which is why they continue to show as “retired, removing from pool” and cannot be reclaimed. The NVMe cache drives came back after a physical reseat, which suggests their stale state was cleared when they were re-enumerated, but the capacity SSDs are still holding stale pool configuration.

    The correct way to clear this is not just Clear-PhysicalDiskHealthData, but a full wipe of the Storage Spaces metadata on each affected disk. You can do this with PowerShell using Reset-PhysicalDisk -FriendlyName <diskname> or, if that fails, by running Set-Disk -Number <disknumber> -IsReadOnly $false followed by Clear-Disk -Number <disknumber> -RemoveData (the read-only flag has to be cleared first, otherwise the wipe is refused). In some cases, you may need to use diskpart with the clean command to remove all metadata. Be careful: this will destroy all data on those drives, so only apply it to the failed/retired disks on the rebuilt node that you are trying to reclaim, never to the drives on the surviving node, which hold the only good copy of your data. Once the metadata is fully cleared, the drives should return to “CanPool” status and be available to rejoin the cluster.
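
    As a rough illustration of the wipe sequence described above, here is a minimal sketch for a single disk. It assumes it is run in an elevated PowerShell session on the rebuilt node while that node is evicted from the cluster, and that $diskNumber (a placeholder, not a value from this thread) has been set to the number Get-Disk reports for one retired capacity SSD.

    ```powershell
    # ASSUMPTION: placeholder value. Replace with the number of ONE retired
    # capacity SSD on the rebuilt node, as reported by Get-Disk.
    $diskNumber = 99

    # Confirm this is the intended disk before wiping anything.
    Get-Disk -Number $diskNumber |
        Format-List Number, FriendlyName, SerialNumber, Size, IsBoot, IsSystem, IsReadOnly, IsOffline

    # Bring the disk online and make it writable first, so Clear-Disk is allowed to run.
    Set-Disk -Number $diskNumber -IsOffline $false
    Set-Disk -Number $diskNumber -IsReadOnly $false

    # DESTRUCTIVE: removes all partitions and data from this disk.
    Clear-Disk -Number $diskNumber -RemoveData -RemoveOEM -Confirm:$false

    # Check whether the drive now reports CanPool = True, and if not, why.
    Get-PhysicalDisk | Sort-Object FriendlyName |
        Format-Table FriendlyName, SerialNumber, MediaType, Usage, CanPool, CannotPoolReason
    ```

    If Clear-Disk reports that the disk is not initialized, the disk is already RAW and there is nothing left for it to remove. The CannotPoolReason column is usually the quickest hint as to what is still holding a drive back.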

    If the pool itself is showing as read-only, check the cluster state with Get-ClusterResource and confirm that the Storage Pool resource is online. A read-only flag can also appear if the cluster lost quorum during the rebuild. Bringing the pool resource offline and back online sometimes clears that state, but if the metadata is inconsistent across nodes, the cluster will continue to block writes.
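
    The checks above could look roughly like the following sketch. It assumes the FailoverClusters and Storage modules are available on the node, and $poolResourceName is a placeholder for whatever name the first command shows for your pool resource.

    ```powershell
    # Find the Storage Pool cluster resource and see its state and owner node.
    Get-ClusterResource |
        Where-Object { $_.ResourceType.Name -eq 'Storage Pool' } |
        Format-Table Name, State, OwnerNode

    # See whether the pool itself is flagged read-only.
    Get-StoragePool -IsPrimordial $false |
        Format-Table FriendlyName, IsReadOnly, OperationalStatus, HealthStatus

    # ASSUMPTION: placeholder name. Use the Name value returned by the first command.
    $poolResourceName = '<pool resource name>'

    # Cycle the pool resource. This takes the pool, and the volumes on it, offline briefly.
    Stop-ClusterResource -Name $poolResourceName
    Start-ClusterResource -Name $poolResourceName
    ```

    Cycling the resource interrupts every workload on the pool, so it is best done in a maintenance window.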

    If you find that the pool remains unbalanced and the drives cannot be reclaimed even after metadata clearing, then unfortunately a full pool rebuild is the only guaranteed fix. Microsoft’s guidance is that once metadata corruption occurs across multiple nodes, recovery is unreliable.
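
    Before deciding on a rebuild, it may be worth confirming that nothing is still repairing or draining in the background, since drives can sit in “removing from pool” while such work is pending. A minimal read-only sketch, assuming it is run on a node that currently owns the pool:

    ```powershell
    # Repair, rebalance, and drive-removal work shows up as storage jobs.
    Get-StorageJob |
        Format-Table Name, JobState, PercentComplete, BytesProcessed, BytesTotal

    # Volumes should report Healthy before drives are pulled out of the pool.
    Get-VirtualDisk |
        Format-Table FriendlyName, ResiliencySettingName, OperationalStatus, HealthStatus

    # List the retired drives in the pool and how much of each is still allocated.
    Get-StoragePool -IsPrimordial $false | Get-PhysicalDisk |
        Where-Object Usage -eq 'Retired' |
        Format-Table FriendlyName, SerialNumber, Usage, OperationalStatus, Size, AllocatedSize
    ```

    A retired drive that still shows a non-zero AllocatedSize has data the pool has not been able to move elsewhere, which would be consistent with the "not enough capacity" message.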

    I hope you've found something useful here. If it gives you more insight into the issue, I'd appreciate it if you accepted the answer. Should you have more questions, feel free to leave a message. Have a nice day!

    Domic Vo.

