Hyper-V cluster breaks during VMware snapshot

Dilan Nanayakkara 1,111 Reputation points
2020-11-18T04:45:24.737+00:00

Hi All,

We have a SQL failover cluster with 2 nodes. However, every time we take a snapshot of those nodes as part of our backup process, we get event ID 1135, which indicates that a cluster node was disconnected. In addition, our file server is configured as the file share witness, and the connection to it also breaks as a result. Both nodes run on VMware.

I ran the Validate a Configuration wizard on the failover cluster to check for network configuration issues on the cluster side. However, it reported no network issues at the time I ran it.

I would appreciate it if anyone could suggest a proper solution for this.

40555-image1.jpg

System Center Virtual Machine Manager
SQL Server
Hyper-V

Accepted answer
  1. Shashank Singh 6,251 Reputation points
    2020-11-18T05:37:25.547+00:00

    I believe the snapshots might be the culprit. Microsoft has a detailed article on troubleshooting event ID 1135: troubleshooting-cluster-event-id-1135. Did you get a chance to look through it?

    Some threads with a similar issue:

    vmware-snapshot-breaks-windows-2012-cluster

    windows-failover-cluster-vmware-snapshot


2 additional answers

  1. Xiaowei He 9,906 Reputation points
    2020-11-18T07:00:30.577+00:00

    Hi,

    every time we run a snapshot of those nodes as a part of our backup process it will be a getting event ID 1135

    In my view, the better question to investigate is "Why do VMware snapshots cause network drops?" rather than troubleshooting cluster event 1135 itself.

    You can use a network monitoring tool to capture UDP port 3343 packets on the cluster nodes while a snapshot is being created on the VMware hosts. If that traffic consistently drops during snapshot creation, the problem most likely lies in VMware snapshot creation causing a network drop, and the VMware forum is the better place to get more information.

    After some research, I found the examples below:
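To make the capture easier to read, the check for dropped traffic could be sketched roughly as below. This is only an illustration: it assumes the UDP 3343 packet timestamps have already been exported from the capture tool as a list of seconds, and both the function name `find_heartbeat_gaps` and the 5-second threshold are placeholders rather than anything from the cluster configuration.

```python
from typing import List, Tuple


def find_heartbeat_gaps(
    timestamps: List[float], max_gap_s: float = 5.0
) -> List[Tuple[float, float]]:
    """Return (start, end) pairs where consecutive packets are more than max_gap_s apart.

    max_gap_s is an assumed placeholder; pick a value that matches the
    heartbeat tolerance actually configured on the cluster.
    """
    ordered = sorted(timestamps)
    gaps = []
    # Compare each packet time with the one that follows it.
    for prev, cur in zip(ordered, ordered[1:]):
        if cur - prev > max_gap_s:
            gaps.append((prev, cur))
    return gaps


if __name__ == "__main__":
    # Hypothetical capture: one packet per second, silent between t=10 and t=18.
    sample = [float(t) for t in range(0, 11)] + [float(t) for t in range(18, 25)]
    for start, end in find_heartbeat_gaps(sample):
        print(f"no UDP 3343 traffic for {end - start:.1f}s (t={start} to t={end})")
```

If the gaps reported this way line up with the snapshot times in the backup log, that would support the idea that the snapshot itself interrupts the network.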

    https://communities.vmware.com/t5/ESXi-Discussions/Losing-pings-during-creation-deletion-of-snapshots/m-p/2512683

    https://communities.vmware.com/t5/ESXi-Discussions/VM-loses-network-after-snapshot-is-taken/m-p/1341352

    (Please note: Information posted in the given link is hosted by a third party. Microsoft does not guarantee the accuracy and effectiveness of information.)

    Thanks for your time!
    Best Regards,
    Anne


  2. Dirk Hondong 871 Reputation points
    2020-11-18T07:29:33.327+00:00

    Hi there,

    It is possible that this is still network related, in combination with taking the snapshot.
    I think that for a short time neither the heartbeat connection nor the connection to the witness share is available, and therefore one node "goes down".

    Also, I'd set up a test environment where you can experiment a bit. You don't need to take snapshots there; just quickly disable and re-enable the virtual NICs (try PowerShell) on one node and see whether you get the same error.
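A minimal sketch of such a test, for a lab environment only, might look like the following. It assumes the adapter is called "Ethernet0" (a hypothetical name; check with `Get-NetAdapter`), that the script runs with administrator rights on the test node, and that a 10-second outage is long enough to trigger the error. By default it only returns the commands it would run.

```python
import subprocess
import time
from typing import List


def nic_flap_commands(adapter_name: str) -> List[List[str]]:
    """Build the PowerShell invocations that disable and then re-enable one adapter."""
    def ps(command: str) -> List[str]:
        return ["powershell.exe", "-NoProfile", "-Command", command]

    # Escape single quotes for a PowerShell single-quoted string.
    quoted = adapter_name.replace("'", "''")
    return [
        ps(f"Disable-NetAdapter -Name '{quoted}' -Confirm:$false"),
        ps(f"Enable-NetAdapter -Name '{quoted}'"),
    ]


def flap_nic(
    adapter_name: str, outage_s: float = 10.0, dry_run: bool = True
) -> List[List[str]]:
    """Disable the adapter, wait outage_s seconds, then enable it again.

    With dry_run=True (the default) nothing is executed; the commands are
    only returned so they can be reviewed first.
    """
    disable, enable = nic_flap_commands(adapter_name)
    if not dry_run:
        subprocess.run(disable, check=True)
        try:
            time.sleep(outage_s)
        finally:
            # Always try to bring the adapter back, even if interrupted.
            subprocess.run(enable, check=True)
    return [disable, enable]


if __name__ == "__main__":
    for command in flap_nic("Ethernet0"):  # hypothetical adapter name
        print(" ".join(command))
```

If event ID 1135 shows up after a short outage like this, that points to the network interruption rather than the snapshot mechanism as such.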

    As a workaround, although it is rather inconvenient:
    You may need to work with pre- and post-snapshot commands, e.g. a pre-command on the passive node that stops the cluster service (ClusSvc), then the snapshot, then a post-command that starts the service again.
    The node should then rejoin the cluster properly.
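The stop, snapshot, start sequence could be wrapped roughly like this. It is a sketch under assumptions: it runs locally on the passive node with administrator rights, it uses `net stop` / `net start` against the `ClusSvc` service, and the snapshot step itself is left as a placeholder because it depends on the backup product. The `run` parameter is a hypothetical hook so the sequence can be rehearsed without touching the service.

```python
import subprocess
from contextlib import contextmanager
from typing import Callable, Iterator, List

Runner = Callable[[List[str]], None]


def run_checked(command: List[str]) -> None:
    """Run a command and raise if it exits with a non-zero status."""
    subprocess.run(command, check=True)


@contextmanager
def cluster_service_stopped(run: Runner = run_checked) -> Iterator[None]:
    """Stop the cluster service for the duration of the with-block."""
    run(["net", "stop", "ClusSvc", "/y"])
    try:
        yield
    finally:
        # Restart the service even if the snapshot step fails.
        run(["net", "start", "ClusSvc"])


if __name__ == "__main__":
    # Rehearsal: record the commands instead of executing them.
    recorded: List[List[str]] = []
    with cluster_service_stopped(run=recorded.append):
        recorded.append(["<take snapshot here>"])  # placeholder step
    for command in recorded:
        print(" ".join(command))
```

The `try`/`finally` is the important part: if the snapshot fails, the node still gets its cluster service started again instead of staying out of the cluster.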
