Hyper-V converged cluster seems to face low performance on new VMs

Kuznos Christos 20 Reputation points
2024-03-14T08:37:04.46+00:00

Hi!

I am facing something strange on my Hyper-V converged cluster, and I would like to hear whether you have faced something similar and how you dealt with it.

I have a cluster of 3 physical servers running Windows Server 2019 Datacenter edition. They are all identical and contain 2 SSDs and 6 HDDs each. I've created a converged Hyper-V cluster with an S2D storage pool and 3 virtual disks.

I applied 3-way mirroring, and for at least 2 years everything ran smoothly. Last week I noticed that when I create a new VM, the OS installation takes longer than usual. I tried both Windows and Linux installations on different new VMs. Creating checkpoints also seems to take more time than usual.

The storage is healthy and all disks are healthy. The only thing I noticed is that all 6 SSDs (2 per server) are almost full.

The SSDs are used as journal disks, not to store data, so I wonder whether it is normal for them to be full.

Is there a need to free up some space on the cache disks? How can I do that?

Does this derive from the 3-way mirroring, and is it related to the delay I am noticing?

I ran the optimize-storage command and it completed successfully (please let me know if you need any more screenshots). Maybe it is not related to this, but I don't know what else to check.

Thank you in advance.

User's image

User's image

User's image


2 answers

  1. Net Runner 505 Reputation points
    2024-03-15T09:31:22.3733333+00:00

    Storage Spaces Direct journal disks in fact work as a read-and-write cache for your entire storage pool. When caching for HDDs, both reads and writes are cached to provide SSD-like latency (often ~10x better) for both. The read cache stores recently and frequently read data for fast access and to minimize random HDD traffic. That means those journal disks are expected to be full during normal operation.

    In your case, I would rather check the total incoming pool writes (using Perfmon) since your journal size seems to be relatively small compared to the entire pool capacity. Your virtualized production may generate more workload (writes) than your cache can handle, resulting in a significant performance drop due to parallel de-staging of cached data down to HDDs. Try shutting down a couple of performance-hungry virtual machines (if possible) to see if that changes anything.
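    The reasoning above can be illustrated with some rough arithmetic. This is a minimal sketch under a deliberately simplified assumption: the cache fills at the difference between the incoming write rate and the rate at which data is de-staged to the HDDs. It does not describe how S2D schedules de-staging internally. The function name and all numbers are hypothetical; real values would have to come from your own Perfmon measurements.

```python
def seconds_until_cache_full(free_cache_gb, incoming_mb_s, destage_mb_s):
    """Estimate how long the write cache can absorb a sustained write load.

    Simplified model (an assumption, not S2D internals): the cache fills at
    (incoming - destage) MB/s. Returns None when de-staging keeps up.
    """
    net_fill_mb_s = incoming_mb_s - destage_mb_s
    if net_fill_mb_s <= 0:
        return None  # de-staging is at least as fast as incoming writes
    return free_cache_gb * 1024 / net_fill_mb_s


if __name__ == "__main__":
    # Hypothetical figures: 100 GB of free cache, 500 MB/s of incoming
    # writes, 300 MB/s de-staged to the HDD tier.
    t = seconds_until_cache_full(100, 500, 300)
    if t is None:
        print("De-staging keeps up; the cache should not fill under this load.")
    else:
        print(f"Cache would fill in about {t:.0f} seconds at this load.")
```

    If the measured incoming rate stays above what the HDDs can absorb for longer than this estimate, a performance drop like the one described would be plausible.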

    If the above is the case, your best option to increase storage pool performance would be to replace the SSDs with bigger ones, replace SSDs with NVMe drives of the same size, or add PCIe NVMe disks to the pool for additional caching.

    Alternatively, you may replace Storage Spaces Direct with Virtual SAN software (https://www.starwindsoftware.com/storage-spaces-direct), which uses different storage pool mechanics and true OpenCAS caching (not journaling), offering much better performance for virtual machines compared to classic S2D.


  2. Ian Xue (Shanghai Wicresoft Co., Ltd.) 29,891 Reputation points Microsoft Vendor
    2024-03-18T10:19:47.7333333+00:00

    Hi Kuznos,

    Hope you're doing well.

    The fact that the SSDs used as journals are almost full could indeed be a contributing factor to the slowdown.

    Here are some steps you can take to address the issue:

    1. Since the SSDs are almost full and are used as journals, freeing up space on them could potentially improve performance. You can do this by removing unnecessary files or data from the SSDs. However, be cautious not to delete any critical data required for the operation of your cluster.
    2. Running a disk defragmentation or optimization tool on the SSDs may help improve performance. Below is the relevant document for your reference:

    defrag | Microsoft Learn

    3. Ensure that storage tiering and caching settings are configured optimally for your workload. You may need to adjust the tiering and caching policies to better utilize the SSDs and improve performance.
    4. Use performance monitoring tools to analyze storage performance metrics such as IOPS, throughput, and latency. This can help identify any bottlenecks or issues with storage performance that may be impacting VM creation and checkpoint functions.
    5. If the performance issues persist despite optimizations, consider scaling out your Hyper-V cluster by adding more nodes or additional SSDs to distribute the workload and improve performance.
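
    For the performance-monitoring step, note that an average alone can hide short latency spikes. Below is a minimal sketch, assuming latency samples in milliseconds have already been exported from whichever monitoring tool you use. The function name and the sample values are hypothetical and only illustrate the idea of comparing the average with a high percentile.

```python
import math
from statistics import mean


def summarize_latency(samples_ms, percentile=95):
    """Return average, nearest-rank percentile, and maximum of latency samples."""
    if not samples_ms:
        raise ValueError("samples_ms must not be empty")
    ordered = sorted(samples_ms)
    # Nearest-rank method: smallest value with at least `percentile` percent
    # of the samples at or below it.
    rank = math.ceil(percentile / 100 * len(ordered))
    return {
        "avg_ms": mean(ordered),
        "pct_ms": ordered[rank - 1],
        "max_ms": ordered[-1],
    }


if __name__ == "__main__":
    # Hypothetical samples: mostly fast I/O with one slow outlier.
    samples = [2, 3, 2, 4, 50, 3, 2, 3, 2, 4]
    print(summarize_latency(samples))
```

    A large gap between the average and the high percentile suggests intermittent stalls rather than uniformly slow storage.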

    Best Regards,

    Ian Xue

