Hi there, sorry for the late update, but the situation keeps us quite busy…
As mentioned before, we have support cases open with HPE and Microsoft on this one, but there is no solution yet…
HPE supported us by sending an alternative to the H241 HBA, the HPE Smart Array P441. Unfortunately, this device is built on the same hardware and uses the same driver in the same version. The hardware exchange did not help: we are still losing the virtual disks as soon as there is a bit of disk IO, for instance while creating a .vhdx file on the Cluster Shared Volume. To rule out any interference from the Windows MPIO feature, we have uninstalled it. There is no need for it right now since we are not using redundant cabling. Uninstalling the MPIO component did not fix our issue either.
HPE still keeps complaining that the cabling scheme is not a supported scenario. Luckily, with a bit of organizing, we were able to set up the following scenario:
- 1x DL 380 Gen 9, Windows Server 2016 (latest updates)
- 1x H241 HBA (FW 7.00)
- 1x D3700 JBOD (FW 7.00)
We have created the simplest possible scenario:
A single server connected to a single JBOD with a single miniSAS HD cable – no MPIO feature installed. All components are running the latest firmware (including disks) and drivers.
With Storage Spaces we created a single virtual disk and made it highly available as a Cluster Shared Volume (to mimic the troubled cluster mentioned above). We are able to reproduce the error on this system as well. That makes 3 systems with similar JBODs and HBAs experiencing the same trouble. HPE is still unsure whether a JBOD connected with a single miniSAS HD cable is a supported scenario.
We first saw this problem on 27 JAN 2021, right after the upgrade, and contacted HPE right away. Unfortunately, we are not authorized to send log files of any kind.
Right now I don’t see how this could be anything but a driver or firmware issue (or both). We have eliminated as many potential obstacles as possible between the OS and the storage. We are waiting for HPE staff to analyze the system, but so far they have not found the right people who deal with Storage Spaces or Scale-Out File Server.
Does anyone have a good idea how to narrow this down further? Just to be clear once more: this is not S2D (Storage Spaces Direct). This is a Scale-Out File Server on Windows Server 2016, normally connected redundantly to 3 JBODs in a 3-way mirror configuration. I’ll also provide a sketch of our simplified scenario, maybe this helps understanding…
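For anyone who wants to generate comparable load without creating a .vhdx, here is a minimal sketch in Python. It is only an illustration, not the procedure we actually used: the function name `write_test_file` and the target path are hypothetical, and it assumes that any sustained sequential write to the volume is enough to trigger the fault, which we have not confirmed.

```python
import os
import tempfile
import time


def write_test_file(path, size_mib=64, chunk_mib=4):
    """Write size_mib MiB of zeros to path in chunks and return the rate in MiB/s."""
    chunk = b"\0" * (chunk_mib * 1024 * 1024)
    remaining = size_mib * 1024 * 1024
    start = time.perf_counter()
    with open(path, "wb") as f:
        while remaining > 0:
            piece = chunk[:remaining]  # last piece may be shorter than a full chunk
            f.write(piece)
            remaining -= len(piece)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push buffered data down to the device
    elapsed = time.perf_counter() - start
    return size_mib / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    # Hypothetical target: replace with a folder on the volume under test.
    target = os.path.join(tempfile.gettempdir(), "io_test.bin")
    try:
        rate = write_test_file(target, size_mib=64)
        print(f"wrote {target} at {rate:.1f} MiB/s")
    except OSError as exc:
        # A disappearing volume would typically surface here as an I/O error.
        print(f"write failed: {exc}")
```

Pointing `target` at a folder on the affected volume and raising `size_mib` would give a simple, repeatable load that does not depend on Hyper-V tooling.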
Best regards,
Christian