Hi,
I have a two-node Storage Spaces cluster, with 2 NVMe cache drives and 8 SSD capacity drives per node.
One node failed, so I rebuilt the OS, drivers, etc. on it, keeping the same drives.
I rejoined it to the cluster, but the drives showed as failed.
Reset the drives, no joy. Set the drives to retired, and they just sit there saying 'removing from pool'.
Evicted the node. Reset and cleared each drive in turn until it was back to 'can pool' status.
Joined again, and same thing: the drives showed 'retired, removing from pool'.
Tried to remove the drives from the pool, but it won't let me, as it says there is not enough capacity.
Tried to reset again, and now it says the configuration is read-only (which it isn't).
Used the Clear-PhysicalDiskHealthData script, which removed all errors except 'Transient Error'
By physically removing the NVMe drives, and reinstalling them, these have come back and seem to have joined the pool. But the same trick doesn't work with the capacity drives.
If I look at pool usage, it shows the drives all in use but unbalanced, so it seems there is old metadata associated with the drives by serial number that is not being cleared.
Thinking I will have to rebuild the storage from scratch but want to avoid that if possible. Any ideas?
Hoping as usual I am missing something simple.
I tried to upload screenshots, but it violated some policy.
