DAG and cluster have failed

Louis Ruth 1 Reputation point
2021-03-15T02:47:34.207+00:00

Our company environment has three Exchange servers:

EX01, EX02, and DR EX

All run Exchange 2016 CU13.

There is no load balancer. The send connector uses IronPort to send and receive mail. Apart from VIP, no users have external access to OWA.

  1. A witness server has been set up, but there are no log files on the witness server.
  2. Failover Cluster Manager shows no record of my cluster.
  3. A cluster member was removed automatically: EX01 was removed from the cluster membership, and the database then became active on DR EX. After that, the Cluster service on EX01 stopped, and starting it manually does not work either. All users get a login prompt and cannot authenticate. How can I trace why the membership was removed, and is the solution to rebuild the cluster and DAG?

Cluster error: Windows could not start the Cluster Service on Local computer. For more information, review the System Event Log. If this is a non-Microsoft service, contact the service vendor, and refer to service-specific error code 2.

Event ID 1135
Cluster node ' NODE A ' was removed from the active failover cluster membership. The Cluster service on this node may have stopped.
This could also be due to the node having lost communication with other active nodes in the failover cluster.
Run the Validate a Configuration wizard to check your network configuration.
If the condition persists, check for hardware or software errors related to the network adapters on this node.
Also check for failures in any other network components to which the node is connected such as hubs, switches, or bridges.

have other

Exchange Server Management

1 answer

  1. Eric Yin-MSFT 4,386 Reputation points
    2021-03-16T02:37:51.133+00:00

    Hi,
    Event ID 1135 normally results from lost heartbeats between the nodes. Could you:
    1. Install Network Monitor on the nodes and run cluster validation, then capture the network activity and filter on UDP port 3343 to check that sending and responding both work. Also check whether the cluster validation report contains any errors.
    2. Check whether any other cluster-related errors exist in the event log. Feel free to post them back with personal information masked.
    3. Check that the network adapter driver has been updated to the latest version.
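
As a rough illustration of the UDP 3343 check in step 1, the sketch below assumes the capture has been exported to CSV and looks for node pairs where heartbeat packets were seen in one direction only. The column names (`src_ip`, `dst_ip`, `protocol`, `dst_port`), the IP addresses, and the sample rows are hypothetical and are not taken from this environment or from any specific capture tool's export format.

```python
import csv
import io
from collections import Counter

CLUSTER_PORT = 3343  # UDP port to filter on, as suggested in step 1

# Hypothetical capture export: column names and addresses are made up for illustration.
SAMPLE_CAPTURE = """\
time,src_ip,dst_ip,protocol,dst_port
0.0,10.0.0.11,10.0.0.12,UDP,3343
0.1,10.0.0.12,10.0.0.11,UDP,3343
1.0,10.0.0.11,10.0.0.13,UDP,3343
1.1,10.0.0.11,10.0.0.12,TCP,445
2.0,10.0.0.11,10.0.0.13,UDP,3343
"""


def heartbeat_counts(capture_csv):
    """Count UDP 3343 packets per (source, destination) pair."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(capture_csv)):
        if row["protocol"].upper() == "UDP" and int(row["dst_port"]) == CLUSTER_PORT:
            counts[(row["src_ip"], row["dst_ip"])] += 1
    return counts


def one_way_pairs(counts):
    """Return pairs where packets were seen in one direction but not the reverse."""
    return sorted((src, dst) for (src, dst) in counts if (dst, src) not in counts)


if __name__ == "__main__":
    counts = heartbeat_counts(SAMPLE_CAPTURE)
    for src, dst in one_way_pairs(counts):
        print(f"{src} -> {dst}: {counts[(src, dst)]} packets sent, none seen in return")
```

With the sample rows, only the pair 10.0.0.11 -> 10.0.0.13 is reported, which is the kind of one-sided traffic that would be worth investigating on a real capture.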

