The failover is happening after draining the cluster role

Question

I have 5 nodes in a Hyper-V cluster: node1, node2, node3, node4, and node5. Earlier they shared NetApp storage for the VMs, but I have now moved the VMs to IBM storage.

To move the VMs to the IBM storage, I did the following:

Each node had two Fibre Channel (FC) cables. I removed one cable from each server and connected it to the IBM storage, so at that point each node reached NetApp through the first FC link and IBM through the second.
I configured the IBM storage as a CSV, migrated all the VMs from NetApp to IBM successfully, and gradually moved three nodes (node1, node2, node3) completely over to IBM.
Today I was moving the two remaining nodes, node4 and node5, off NetApp. I had already removed the NetApp storage, so I was just moving the cable from NetApp to IBM. But when I paused node4, I got a live migration error. If I migrate the VMs manually, live migration works. My question: why does live migration fail when I pause node4 and drain the roles?

Answer

Hi,

Thank you for reaching out. When you pause node4 and drain its roles, the cluster tries to live-migrate the VMs to the other nodes. If the nodes can't communicate properly during that process, the live migration can fail. Several things can cause this, such as network problems or storage settings that aren't configured properly.

It's possible that the network configuration on node4 isn't set up to support live migration, which could cause the drain to fail. A manual migration might get around that issue because it moves the VMs in a different way; for example, with a manual move you choose the destination node yourself, whereas a drain leaves that choice to the cluster.

To figure out what's going on, check the network configuration on node4 and make sure it's set up for live migration. It's also worth looking at the IBM storage settings and checking the VMs themselves for anything that could make the migration fail. If none of that turns anything up, the event logs on node4 and on the destination node should give you more clues about what's going wrong.
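
If it helps, here is a rough Python sketch of the configuration check, assuming you have already collected each node's live migration settings into a dictionary (for instance by copying the relevant properties from the `Get-VMHost` PowerShell cmdlet on each node). The function name `find_outliers` and the sample values are made up for illustration; they are not output from your cluster. It simply flags any node whose setting differs from what most of the other nodes have.

```python
from collections import Counter


def find_outliers(settings_by_node):
    """Return {setting: {node: value}} for nodes that differ from the majority.

    settings_by_node maps a node name to a dict of setting name -> value.
    A setting missing on a node is treated as None.
    """
    # Collect every setting name seen on any node.
    keys = set()
    for settings in settings_by_node.values():
        keys.update(settings)

    outliers = {}
    for key in sorted(keys):
        values = {node: s.get(key) for node, s in settings_by_node.items()}
        # The most common value across nodes is taken as the reference.
        majority, _ = Counter(values.values()).most_common(1)[0]
        odd = {node: v for node, v in values.items() if v != majority}
        if odd:
            outliers[key] = odd
    return outliers


# Hypothetical values, typed in by hand for the example only.
sample = {
    "node1": {"VirtualMachineMigrationEnabled": True,
              "VirtualMachineMigrationAuthenticationType": "Kerberos"},
    "node2": {"VirtualMachineMigrationEnabled": True,
              "VirtualMachineMigrationAuthenticationType": "Kerberos"},
    "node4": {"VirtualMachineMigrationEnabled": True,
              "VirtualMachineMigrationAuthenticationType": "CredSSP"},
}

report = find_outliers(sample)
print(report)
```

With the made-up sample above, only node4's authentication type is flagged, because it is the one value that doesn't match the other nodes. A difference like that isn't proof of the cause, but it tells you where to look first.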

Let me know if you have any trouble with these steps or need further help if the issue persists. Please also share what you find after going through them. Thank you!
