Failover Cluster Step-by-Step Guide: Configuring the Quorum in a Failover Cluster

Applies To: Windows Server 2008, Windows Server 2008 R2

A failover cluster is a group of independent computers that work together to increase the availability of applications and services. The clustered servers (called nodes) are connected by physical cables and by software. If one of the cluster nodes fails, another node begins to provide service (a process known as failover). Users experience minimal disruption in service.

This guide describes the new quorum options in failover clusters in Windows Server® 2008 and provides steps for configuring the quorum in a failover cluster. By following the configuration steps in this guide, you can learn about failover clusters and familiarize yourself with quorum modes in failover clustering.

In Windows Server 2008, the improvements to failover clusters (formerly known as server clusters) are aimed at simplifying clusters, making them more secure, and enhancing cluster stability. Cluster setup and management are easier. Security and networking in clusters have been improved, as has the way a failover cluster communicates with storage. For more information about improvements to failover clusters, see https://go.microsoft.com/fwlink/?LinkId=62368.

In this guide

Overview of quorum in a failover cluster

Requirements and recommendations for quorum configurations

Steps for viewing the quorum configuration of a failover cluster

Steps for changing the quorum configuration in a failover cluster

Troubleshooting: how to force a cluster to start without quorum

Additional references

For additional background information, also see Appendix A: Details of How Quorum Works in a Failover Cluster and Appendix B: Additional Information About Quorum Modes.

Overview of quorum in a failover cluster

In simple terms, the quorum for a cluster is the number of elements that must be online for that cluster to continue running. In effect, each element can cast one “vote” to determine whether the cluster continues running. The voting elements are nodes or, in some cases, a disk witness or file share witness. Each voting element (with the exception of a file share witness) contains a copy of the cluster configuration, and the Cluster service works to keep all copies synchronized at all times.

It is essential that the cluster stops running if too many failures occur or if there is a problem with communication between the cluster nodes. For a more detailed explanation, see the next section, Why quorum is necessary.

Note that the full function of a cluster depends not just on quorum, but on the capacity of each node to support the services and applications that fail over to that node. For example, a cluster that has five nodes could still have quorum after two nodes fail, but each remaining cluster node would continue serving clients only if it had enough capacity to support the services and applications that failed over to it.

Why quorum is necessary

When network problems occur, they can interfere with communication between cluster nodes. A small set of nodes might be able to communicate together across a functioning part of a network, but might not be able to communicate with a different set of nodes in another part of the network. This can cause serious issues. In this “split” situation, at least one of the sets of nodes must stop running as a cluster.

To prevent the issues that are caused by a split in the cluster, the cluster software requires that any set of nodes running as a cluster must use a voting algorithm to determine whether, at a given time, that set has quorum. Because a given cluster has a specific set of nodes and a specific quorum configuration, the cluster knows how many “votes” constitute a majority (that is, a quorum). If the number of votes drops below the majority, the cluster stops running. Nodes will still listen for the presence of other nodes, in case another node appears again on the network, but the nodes will not begin to function as a cluster until quorum exists again.

For example, in a five node cluster that is using a node majority, consider what happens if nodes 1, 2, and 3 can communicate with each other but not with nodes 4 and 5. Nodes 1, 2, and 3 constitute a majority, and they continue running as a cluster. Nodes 4 and 5 are a minority and stop running as a cluster, which prevents the problems of a “split” situation. If node 3 loses communication with other nodes, all nodes stop running as a cluster. However, all functioning nodes will continue to listen for communication, so that when the network begins working again, the cluster can form and begin to run.
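The five-node example above can be traced with a few lines of Python. This is only an illustrative sketch of the majority arithmetic, not the Cluster service's actual implementation; the function name has_majority and the node numbering are assumptions made for the example.

```python
def has_majority(votes_present: int, total_votes: int) -> bool:
    """Return True when the votes present are more than half of all votes."""
    return 2 * votes_present > total_votes


TOTAL_NODES = 5  # five-node cluster using Node Majority, one vote per node

# Network split: nodes 1, 2, 3 can reach each other; nodes 4 and 5 can reach each other.
groups = [{1, 2, 3}, {4, 5}]
for group in groups:
    print(sorted(group), "keeps running:", has_majority(len(group), TOTAL_NODES))

# Node 3 then loses communication: no group holds three of the five votes,
# so no set of nodes continues running as a cluster.
groups_after = [{1, 2}, {3}, {4, 5}]
print("any group has quorum:", any(has_majority(len(g), TOTAL_NODES) for g in groups_after))
```

The comparison uses 2 * votes_present > total_votes so that "more than half" is evaluated with integers and an exact half never counts as a majority.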

For more information about how quorum works, see Appendix A: Details of How Quorum Works in a Failover Cluster.

Overview of the quorum modes

There have been significant improvements to the quorum model in Windows Server 2008. In Windows Server 2003, almost all server clusters used a disk in cluster storage (the “quorum resource”) as the quorum. If a node could communicate with the specified disk, the node could function as a part of a cluster, and otherwise it could not. This made the quorum resource a potential single point of failure. In Windows Server 2008, a majority of “votes” is what determines whether a cluster achieves quorum. Nodes can vote, and where appropriate, either a disk in cluster storage (called a “disk witness”) or a file share (called a “file share witness”) can vote. There is also a quorum mode called No Majority: Disk Only, which functions like the disk-based quorum in Windows Server 2003. Aside from that mode, there is no single point of failure with the quorum modes, since what matters is the number of votes, not whether a particular element is available to vote.

This new quorum model is flexible and you can choose the mode best suited to your cluster.

Important

In most situations, it is best to use the quorum mode selected by the cluster software. If you run the quorum configuration wizard, the quorum mode that the wizard lists as “recommended” is the quorum mode chosen by the cluster software. We only recommend changing the quorum configuration if you have determined that the change is appropriate for your cluster.

There are four quorum modes:

  • Node Majority: Each node that is available and in communication can vote. The cluster functions only with a majority of the votes, that is, more than half.

  • Node and Disk Majority: Each node plus a designated disk in the cluster storage (the “disk witness”) can vote, whenever they are available and in communication. The cluster functions only with a majority of the votes, that is, more than half.

  • Node and File Share Majority: Each node plus a designated file share created by the administrator (the “file share witness”) can vote, whenever they are available and in communication. The cluster functions only with a majority of the votes, that is, more than half.

  • No Majority: Disk Only: The cluster has quorum if one node is available and in communication with a specific disk in the cluster storage.
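As a rough illustration of how the four modes differ, the following Python sketch evaluates quorum for each one. It is a simplified model under stated assumptions: every node has exactly one vote, the witness (disk or file share) has one vote, and witness_up means the witness is available and reachable by the nodes being counted. The names QuorumMode and has_quorum are hypothetical and are not part of any cluster API.

```python
from enum import Enum


class QuorumMode(Enum):
    NODE_MAJORITY = "Node Majority"
    NODE_AND_DISK_MAJORITY = "Node and Disk Majority"
    NODE_AND_FILE_SHARE_MAJORITY = "Node and File Share Majority"
    NO_MAJORITY_DISK_ONLY = "No Majority: Disk Only"


def has_quorum(mode: QuorumMode, total_nodes: int, nodes_up: int,
               witness_up: bool = False) -> bool:
    """Simplified quorum check: one vote per node, one vote per witness."""
    if mode is QuorumMode.NO_MAJORITY_DISK_ONLY:
        # One available node in communication with the quorum disk is enough.
        return nodes_up >= 1 and witness_up
    if mode is QuorumMode.NODE_MAJORITY:
        total_votes = total_nodes
        votes = nodes_up
    else:
        # Node and Disk Majority / Node and File Share Majority:
        # the witness adds one vote to the total.
        total_votes = total_nodes + 1
        votes = nodes_up + (1 if witness_up else 0)
    # "More than half" of all configured votes.
    return 2 * votes > total_votes


# Four nodes plus a disk witness: two nodes and the witness hold 3 of 5 votes.
print(has_quorum(QuorumMode.NODE_AND_DISK_MAJORITY, 4, 2, witness_up=True))
# The same two nodes without the witness hold only 2 of 5 votes.
print(has_quorum(QuorumMode.NODE_AND_DISK_MAJORITY, 4, 2, witness_up=False))
# Disk Only: all three nodes are up, but the quorum disk is unavailable.
print(has_quorum(QuorumMode.NO_MAJORITY_DISK_ONLY, 3, 3, witness_up=False))
```

The last call shows the single point of failure in No Majority: Disk Only: the number of available nodes does not matter if the disk cannot be reached.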

Choosing the quorum mode for a particular cluster

The following table describes clusters based on the number of nodes and other cluster characteristics, and lists the quorum mode that is recommended in most cases.

A “multi-site” cluster is a cluster in which an investment has been made to place sets of nodes and storage in physically separate locations, providing a disaster recovery solution.

Description of cluster                                 Quorum recommendation
-----------------------------------------------------  ----------------------------
Odd number of nodes                                    Node Majority
Even number of nodes (but not a multi-site cluster)    Node and Disk Majority
Even number of nodes, multi-site cluster               Node and File Share Majority
Even number of nodes, no shared storage                Node and File Share Majority
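The recommendations above can be expressed as a small lookup function. This Python sketch only restates the table; the function name recommend_quorum_mode and its parameters are assumptions made for illustration, and the quorum mode that the cluster software itself recommends remains the authoritative choice.

```python
def recommend_quorum_mode(node_count: int, multi_site: bool = False,
                          shared_storage: bool = True) -> str:
    """Return the quorum mode that the table recommends in most cases."""
    if node_count < 1:
        raise ValueError("a cluster needs at least one node")
    if node_count % 2 == 1:
        # Odd number of nodes: the nodes alone can always form a clear majority.
        return "Node Majority"
    if multi_site or not shared_storage:
        # Even number of nodes, and a disk witness is not a good fit.
        return "Node and File Share Majority"
    # Even number of nodes at a single site with shared storage.
    return "Node and Disk Majority"


print(recommend_quorum_mode(3))
print(recommend_quorum_mode(4))
print(recommend_quorum_mode(4, multi_site=True))
print(recommend_quorum_mode(2, shared_storage=False))
```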

Diagrams of quorum modes

The following diagrams show how each of the quorum modes affects whether a cluster can or cannot achieve quorum.

Node Majority

The following diagram shows Node Majority used (as recommended) for a cluster with an odd number of nodes.

Node Majority quorum configuration, three nodes

In this mode, each node gets one vote. In certain circumstances, you might want to install a hotfix that lets you select which nodes will have votes. This can be useful with certain multi-site clusters, for example, where you want one site to have more votes than other sites in a disaster recovery situation. For more information, see Changing the quorum configuration in a failover cluster for unequal node weight.

Node and Disk Majority

The following diagram shows Node and Disk Majority used (as recommended) for a cluster with an even number of nodes. Each node can vote, as can the disk witness.

Node and Disk Majority quorum configuration, four nodes (plus disk)

The following diagram shows how the disk witness also contains a replica of the cluster configuration database in a cluster that uses Node and Disk Majority.

Replicas of cluster configuration in cluster that uses Node and Disk Majority

Node and File Share Majority

The following diagram shows Node and File Share Majority used (as recommended) for a cluster with an even number of nodes and a situation where having a file share witness works better than having a disk witness. Each node can vote, as can the file share witness.

Node and File Share Majority quorum configuration, four nodes (plus file share)

The following diagram shows how the file share witness can vote, but does not contain a replica of the cluster configuration database. Note that the file share witness does contain information about which version of the cluster configuration database is the most recent.

Replicas of cluster configuration in cluster that uses Node and File Share Majority

No Majority: Disk Only

The following illustration shows how a cluster that uses the disk as the only determiner of quorum can run even if only one node is available and in communication with the quorum disk. It also shows how the cluster cannot run if the quorum disk is not available (single point of failure). For this cluster, which has an odd number of nodes, Node Majority is the recommended quorum mode.

No Majority: Disk Only quorum configuration, three nodes

Additional information about quorum modes

For more information about quorum modes, see Appendix B: Additional Information About Quorum Modes.

Requirements and recommendations for quorum configurations

Before configuring the quorum for a failover cluster, you must meet the requirements for the cluster itself. For information about cluster requirements, see https://go.microsoft.com/fwlink/?LinkId=114536. For information about cluster validation, see https://go.microsoft.com/fwlink/?LinkId=114537 and https://go.microsoft.com/fwlink/?LinkId=114538.

For a cluster using the Node Majority quorum mode (which includes almost all clusters with an odd number of nodes), there are no additional requirements for the quorum. The following sections provide guidelines for clusters using the Node and Disk Majority quorum mode and the Node and File Share Majority quorum mode. (The requirements and recommendations for the Node and Disk Majority mode also apply to the No Majority: Disk Only mode.)

Requirements and recommendations for clusters using Node and Disk Majority

When using the Node and Disk Majority mode, review the following requirements and recommendations for the disk witness.

Note

These requirements and recommendations also apply to the quorum disk for the No Majority: Disk Only mode.

  • Use a small Logical Unit Number (LUN) that is at least 512 MB in size.

  • Choose a basic disk with a single volume.

  • Make sure that the LUN is dedicated to the disk witness. It must not contain any other user or application data.

  • Choose whether to assign a drive letter to the LUN based on the needs of your cluster. The LUN does not need a drive letter; leaving it without one conserves drive letters for applications.

  • As with other LUNs that are to be used by the cluster, you must add the LUN to the set of disks that the cluster can use. For more information, see https://go.microsoft.com/fwlink/?LinkId=114539.

  • Make sure that the LUN has been verified with the Validate a Configuration Wizard.

  • We recommend that you configure the LUN with hardware RAID for fault tolerance.

  • In most situations, do not back up the disk witness or the data on it. Backing up the disk witness can add to the input/output (I/O) activity on the disk and decrease its performance, which could potentially cause it to fail.

  • We recommend that you avoid all antivirus scanning on the disk witness.

  • Format the LUN with the NTFS file system.

If there is a disk witness configured, but bringing that disk online will not achieve quorum, then it remains offline. If bringing that disk online will achieve quorum, then it is brought online by the cluster software.

In certain circumstances, you might use an asymmetric storage configuration, where only a subset of cluster nodes have access to the storage array that contains the disk witness. For information about how to work within an asymmetric storage configuration and specify either Node and Disk Majority mode or No Majority: Disk Only mode, see Changing the quorum configuration in a failover cluster with asymmetric storage.

Requirements and recommendations for clusters using Node and File Share Majority

When using the Node and File Share Majority mode, review the following recommendations for the file share witness.

  • Use a Server Message Block (SMB) share on a Windows Server 2003 or Windows Server 2008 file server.

  • Make sure that the file share has a minimum of 5 MB of free space.

  • Make sure that the file share is dedicated to the cluster and is not used in other ways (including storage of user or application data).

  • Do not place the share on a node that is a member of this cluster or will become a member of this cluster in the future.

  • You can place the share on a file server that has multiple file shares servicing different purposes. This may include multiple file share witnesses, each one a dedicated share. You can even place the share on a clustered file server (in a different cluster), which would typically be a clustered file server containing multiple file shares servicing different purposes.

  • For a multi-site cluster, you can co-locate the external file share at one of the sites where a node or nodes are located. However, we recommend that you configure the external share in a separate third site.

  • Place the file share on a server that is a member of a domain, in the same forest as the cluster nodes.

  • For the folder that the file share uses, make sure that the administrator has Full Control share and NTFS permissions.

  • Do not use a file share that is part of a Distributed File System (DFS) Namespace.

Note

After the Quorum Configuration Wizard has been run, the computer object for the Cluster Name will automatically be granted read and write permissions to the file share.

If there is a file share witness configured, but bringing that file share online will not achieve quorum, then it remains offline. If bringing that file share online will achieve quorum, then it is brought online by cluster software.

For more information about file share witness recommendations, see:

Steps for viewing the quorum configuration of a failover cluster

When you install a failover cluster, the cluster software automatically chooses an appropriate quorum configuration for that cluster, based mainly on the number of nodes (even or odd). You can easily view the quorum configuration of an existing cluster using either the failover cluster snap-in or the command line.

To view the quorum configuration of an existing cluster using the failover cluster snap-in

  1. To open the failover cluster snap-in, click Start, click Administrative Tools, and then click Failover Cluster Management (in Windows Server 2008) or Failover Cluster Manager (in Windows Server 2008 R2). If the User Account Control dialog box appears, confirm that the action it displays is what you want, and then click Continue.

  2. In the console tree, if the cluster that you want to view is not displayed, right-click Failover Cluster Management or Failover Cluster Manager, click Manage a Cluster, and then select the cluster you want to view.

  3. In the center pane, find Quorum Configuration, and view the description.

    In the following example, the quorum mode is Node and Disk Majority and the disk witness is Cluster Disk 2.

To view the quorum configuration of an existing cluster using the Command Prompt window

  1. To open a Command Prompt window, on a cluster node, click Start, right-click Command Prompt, and then either click Run as administrator or click Open.

  2. If the User Account Control dialog box appears, confirm that the action it displays is what you want, and then click Continue.

  3. Review the configuration of the quorum by typing:

    cluster /quorum

Steps for changing the quorum configuration in a failover cluster

The procedure in this section describes how you can change the quorum configuration in a failover cluster by using the failover cluster snap-in. Additional subsections provide information about quorum configurations for use in certain circumstances where you want to use unequal node weight or asymmetric storage.

Important

Unless you have changed the number of nodes in your cluster, it is usually best to use the quorum configuration recommended by the quorum configuration wizard. We only recommend changing the quorum configuration if you have determined that the change is appropriate for your cluster.

Changing the quorum configuration in a failover cluster

The following procedure describes how you can change the quorum configuration in a failover cluster by using the failover cluster snap-in.

Membership in the local Administrators group on each clustered server, or equivalent, is the minimum required to complete this procedure. Also, the account you use must be a domain user account. Review details about using the appropriate accounts and group memberships at https://go.microsoft.com/fwlink/?LinkId=83477.

To change the quorum configuration in a failover cluster by using the failover cluster snap-in

  1. To open the failover cluster snap-in, click Start, click Administrative Tools, and then click Failover Cluster Management (in Windows Server 2008) or Failover Cluster Manager (in Windows Server 2008 R2). If the User Account Control dialog box appears, confirm that the action it displays is what you want, and then click Continue.

  2. In the console tree, if the cluster you want to configure is not displayed, right-click Failover Cluster Management or Failover Cluster Manager, click Manage a Cluster, and select or specify the cluster you want.

  3. With the cluster selected, under Actions, click More Actions, and then click Configure Cluster Quorum Settings.

  4. Click Next. The following illustration shows the wizard page that displays for a cluster with an even number of nodes. Note that the text on this page varies, depending on whether the cluster has an even number or odd number of nodes. To view more information about the selections on this page, at the bottom of the page, click More about quorum configurations.

  5. Select a quorum mode from the list. For more information, see Choosing the quorum mode for a particular cluster, earlier in this guide.

  6. Click Next and then go to the appropriate step in this procedure:

    • If you chose Node Majority, go to the last step in this procedure.

    • If you chose Node and Disk Majority or No Majority: Disk Only, go to the next step in this procedure.

    • If you chose Node and File Share Majority, skip to step 8 in this procedure.

  7. If you chose Node and Disk Majority or No Majority: Disk Only, a wizard page similar to the following appears. (For No Majority: Disk Only, the title of the page is Select Storage Resource.) Select the storage volume that you want to use for the disk witness (or, if you chose No Majority: Disk Only, for the quorum resource), and then skip to step 9. For information about the requirements for the disk witness, see Requirements and recommendations for clusters using Node and Disk Majority.

    If you change disk assignments on this page, the former storage volume is no longer assigned to the core Cluster Group and instead goes back to Available Storage.

  8. If you chose Node and File Share Majority, the following wizard page appears. Specify the file share you want to use, or click the Browse button and use the standard browsing interface to select the file share. For information about the requirements for the file share, see Requirements and recommendations for clusters using Node and File Share Majority.

  9. Click Next. Use the confirmation page to confirm your selections, and then click Next.

  10. After the wizard runs and the Summary page appears, if you want to view a report of the tasks that the wizard performed, click View Report.

Note

The most recent report will remain in the systemroot\Cluster\Reports folder with the name QuorumConfiguration.mht.

Changing the quorum configuration in a failover cluster for unequal node weight

In most failover clusters, each node gets one vote. In certain circumstances, you might want to install a hotfix that lets you select which nodes will have votes. This can be useful with certain multi-site clusters, for example, where you want one site to have more votes than other sites in a disaster recovery situation. Install the hotfix on all nodes (not just the node that will not have a vote). To download and install the hotfix, see https://support.microsoft.com/kb/2494036. To configure a node so that it does not have a vote, at the command prompt, type:

cluster . node <NodeName> /prop NodeWeight=0

This sets the NodeWeight property to 0. Similarly, to return the node to having a vote, set the NodeWeight property to 1.
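To see how removing a vote changes the outcome, the following Python sketch counts votes with per-node weights. It is a simplified model, assuming a Node Majority cluster with no witness and weights of only 0 or 1; the node names (SiteA-1 and so on) and the function name weighted_quorum are hypothetical.

```python
def weighted_quorum(node_weights: dict, nodes_up: set) -> bool:
    """Return True if the available nodes hold more than half of all votes.

    node_weights maps a node name to its NodeWeight (0 or 1).
    """
    total_votes = sum(node_weights.values())
    votes_present = sum(weight for node, weight in node_weights.items()
                        if node in nodes_up)
    return 2 * votes_present > total_votes


# Four-node multi-site cluster in which one node at Site B has no vote,
# so Site A holds two of the three votes.
weights = {"SiteA-1": 1, "SiteA-2": 1, "SiteB-1": 1, "SiteB-2": 0}

site_a = {"SiteA-1", "SiteA-2"}
site_b = {"SiteB-1", "SiteB-2"}

print("Site A alone has quorum:", weighted_quorum(weights, site_a))
print("Site B alone has quorum:", weighted_quorum(weights, site_b))
```

With equal weights, neither site could hold a majority on its own in this four-node layout; setting one weight to 0 lets the preferred site keep running if the link between sites fails.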

After you have applied the hotfix described in this section, you might want to start a node but prevent it from achieving quorum and forming the cluster (to prevent a "split" situation with two competing instances of the cluster running). To do this, start the Cluster service with the /preventquorum option, which can be abbreviated as /pq, as shown in the following command:

net start clussvc /pq

Changing the quorum configuration in a failover cluster with asymmetric storage

In most failover clusters, the disk witness or quorum disk can be accessed by all nodes. In certain circumstances, you might want to configure a disk witness or quorum disk that can be accessed by a subset of nodes only (asymmetric storage). Before configuring this, if your cluster is running Windows Server 2008 R2, make sure that it has Service Pack 1. Similarly, if your cluster is running Windows Server 2008, make sure that it has Service Pack 2 and hotfix 976097, which is described at https://support.microsoft.com/kb/976097.

To configure the quorum in a failover cluster with asymmetric storage, first ensure that the disk that will be the disk witness or quorum disk is online and is in the Cluster Group (not in Available Storage). Then, at the command prompt, for a Node and Disk Majority (disk witness) configuration, type:

cluster <ClusterName> /quorum:"<DiskResourceName>"

For a No Majority: Disk Only configuration, type the same command, but add /diskonly to the end of the command. With the No Majority: Disk Only configuration, the cluster may be unable to start unless a node that can access the quorum disk is available. If none of these nodes is available, it might be necessary to start the Cluster service with the net start clussvc /forcequorum command.
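If you script these changes, it can help to build the command lines in one place so that the quoting and the /diskonly option are applied consistently. The following Python sketch only assembles the command strings shown in this guide and does not run them; the function names and the sample cluster, node, and disk names are assumptions made for illustration.

```python
def view_quorum_command() -> str:
    """Command that displays the current quorum configuration."""
    return "cluster /quorum"


def set_node_weight_command(node_name: str, weight: int) -> str:
    """Command that gives a node a vote (1) or removes its vote (0)."""
    if weight not in (0, 1):
        raise ValueError("NodeWeight must be 0 or 1")
    return f"cluster . node {node_name} /prop NodeWeight={weight}"


def set_disk_quorum_command(cluster_name: str, disk_resource_name: str,
                            disk_only: bool = False) -> str:
    """Command that sets the disk witness, or the quorum disk when disk_only is True."""
    command = f'cluster {cluster_name} /quorum:"{disk_resource_name}"'
    if disk_only:
        command += " /diskonly"
    return command


# Hypothetical names, used only to show the resulting command lines.
print(view_quorum_command())
print(set_node_weight_command("Node4", 0))
print(set_disk_quorum_command("Cluster1", "Cluster Disk 2"))
print(set_disk_quorum_command("Cluster1", "Cluster Disk 2", disk_only=True))
```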

Troubleshooting: how to force a cluster to start without quorum

When troubleshooting, you might be in a situation where the cluster is offline because it does not have quorum, but you want to bring it online. The first thing to understand is your quorum mode and why you no longer have quorum. This may provide some insight into how the cluster can achieve quorum and come online automatically. If you need to force the Cluster service to start, you can make all nodes which can communicate with each other begin working together as a cluster by running the net start clussvc command with an option for forcing quorum. The cluster will use the copy of the cluster configuration that is on the node on which you run the command, and replicate it to all other nodes. To force the cluster to start, on a node that contains a copy of the cluster configuration that you want to use, type the following command:

net start clussvc /fq

The command can also be typed as net start clussvc /forcequorum.

Forcing a cluster to start that does not have quorum may be especially useful in an unbalanced multi-site cluster. If you have a five-node multi-site cluster and three nodes at Site A fail, then the two nodes at Site B will go offline since they no longer have quorum. If there is a genuine disaster at Site A, then it may take a significant amount of time for the site to come online, and so you would likely want to force Site B to come online, even though it does not have quorum.

When a cluster is forced to start without quorum, it continually looks to add nodes to the cluster and is in a special “forced” state. Once it has a majority, the cluster moves out of the forced state and behaves normally, which means it is not necessary to restart the Cluster service without the startup switch. If the cluster then loses a node and drops below quorum, it will go offline again because it is no longer in the forced state. At that point, bringing it online again while it does not have quorum would require running net start clussvc /fq again.
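The forced state described above behaves like a small state machine, which the following Python sketch models. It is a conceptual illustration only, assuming one vote per node and no witness; the class name ForcedQuorumModel and its methods are hypothetical and do not correspond to any cluster API.

```python
class ForcedQuorumModel:
    """Toy model of a cluster that can be forced to start without quorum."""

    def __init__(self, total_votes: int) -> None:
        self.total_votes = total_votes
        self.votes_present = 0
        self.running = False
        self.forced = False

    def _has_majority(self) -> bool:
        return 2 * self.votes_present > self.total_votes

    def force_start(self, votes_present: int) -> None:
        """Model running 'net start clussvc /fq' on an available node."""
        if votes_present < 1:
            raise ValueError("at least one node must be available to force a start")
        self.votes_present = votes_present
        self.running = True
        # The cluster is in the forced state only while it lacks a majority.
        self.forced = not self._has_majority()

    def update_votes(self, votes_present: int) -> None:
        """Model nodes joining or leaving after the cluster has started."""
        self.votes_present = votes_present
        if self._has_majority():
            # With a majority, the cluster runs normally and leaves the forced state.
            self.running = True
            self.forced = False
        elif not self.forced or votes_present == 0:
            # Below quorum and not forced (or no nodes left): the cluster goes offline.
            self.running = False
            self.forced = False


cluster = ForcedQuorumModel(total_votes=5)
cluster.force_start(votes_present=2)   # two surviving nodes are forced online
print(cluster.running, cluster.forced)
cluster.update_votes(3)                # a third node returns: normal operation
print(cluster.running, cluster.forced)
cluster.update_votes(2)                # a node is lost again: below quorum, offline
print(cluster.running, cluster.forced)
```

The final step shows why forcing quorum is not permanent: once the cluster has left the forced state, dropping below a majority takes it offline until it is forced to start again.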

In some situations, you might want to start a node but prevent it from achieving quorum and forming the cluster. For more information, see Changing the quorum configuration in a failover cluster for unequal node weight, earlier in this topic.

Additional references

For more information about disk witness recommendations, see:

https://go.microsoft.com/fwlink/?LinkId=115004 (Note that this is a Windows Server 2003 article, but the disk witness recommendations remain unchanged.)

For more information about file share witness recommendations, see:

For a list of technical documentation for failover clusters on the TechNet Web site, see:

https://go.microsoft.com/fwlink/?LinkId=68633

For information on Cluster Continuous Replication (CCR) in Microsoft Exchange Server 2007:

https://go.microsoft.com/fwlink/?LinkId=114542

For information about choosing and validating hardware for a failover cluster, see the TechNet Web site at:

https://go.microsoft.com/fwlink/?LinkId=115087