Guidelines for Backing Up a Windows HPC Server 2008 R2 Cluster
Updated: July 2011
Applies To: Windows HPC Server 2008 R2
This section provides an overview of the guidelines and methods for backing up a Windows HPC Server 2008 R2 cluster. This section focuses on backing up the data that is stored in the HPC databases (on the head node or on remote servers that are running at least Microsoft® SQL Server® 2008 SP1) and backing up the cluster configuration settings on the head node. If this data is regularly backed up, it is generally possible to restore a cluster to normal operations with minimal disruptions, after a hardware or software failure on the head node or on a remote server for the HPC databases. For more information about restoring a Windows HPC Server 2008 R2 cluster in different situations, see Scenarios for Restoring a Windows HPC Server 2008 R2 Cluster in this Back Up and Restore guide.
Note |
---|
Often the existing cluster nodes other than the head node can continue to operate after a cluster or head node is restored, or they can be easily redeployed by using the data that is stored in the HPC databases and the cluster configuration settings. |
In this section:
Important cluster data and configuration settings
Backup methods
Important cluster data and configuration settings
The critical data for the cluster is stored in the HPC databases and in configuration files and settings on the head node.
HPC databases
The HPC databases in the following table store data that is critical to the operation of Windows HPC Server 2008 R2 and should be backed up regularly. The HPC databases are much more dynamic than other cluster data: they change continuously while jobs are submitted and run on the cluster.
Database | Default name | Data |
---|---|---|
Cluster management database | HPCManagement | Cluster users, network configuration, nodes, node groups, node templates, operations history, performance counter metric history |
Job scheduling database | HPCScheduler | Nodes, job templates, job history, scheduler configuration |
Diagnostics database | HPCDiagnostics | Results of diagnostic tests |
Reporting database | HPCReporting | Raw and aggregated reporting data |
In Windows HPC Server 2008 R2, the cluster databases can be installed on the head node or on remote servers that are running SQL Server. When the databases are installed on the head node, they are installed by default in the COMPUTECLUSTER instance in SQL Server. The database files are located by default in the %PROGRAMFILES%\Microsoft HPC Pack 2008 R2\SQLDB folder.
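As an illustration of how the default database names above map onto a full backup of each database, the following minimal sketch builds one Transact-SQL `BACKUP DATABASE` statement per HPC database. It only generates the statements and does not connect to SQL Server. The destination folder `D:\HpcBackup`, the file naming scheme, and the function name `build_backup_statements` are assumptions made for this example, not part of the product.

```python
from datetime import datetime

# Default database names, taken from the table of HPC databases.
# Adjust the list if your cluster uses non-default names.
HPC_DATABASES = ["HPCManagement", "HPCScheduler", "HPCDiagnostics", "HPCReporting"]


def build_backup_statements(backup_dir, stamp):
    """Return one full-backup T-SQL statement per HPC database.

    All statements share the same timestamp so that the resulting files
    can be identified as a single backup set.
    """
    tag = stamp.strftime("%Y%m%d_%H%M%S")
    statements = []
    for name in HPC_DATABASES:
        # Hypothetical naming scheme: <folder>\<database>_<timestamp>.bak
        path = f"{backup_dir}\\{name}_{tag}.bak"
        statements.append(
            f"BACKUP DATABASE [{name}] TO DISK = N'{path}' WITH INIT, CHECKSUM;"
        )
    return statements


if __name__ == "__main__":
    # D:\HpcBackup is a hypothetical destination folder.
    for stmt in build_backup_statements(r"D:\HpcBackup", datetime.now()):
        print(stmt)
```

The generated statements could then be run against the SQL Server instance that hosts the databases (by default, the COMPUTECLUSTER instance on the head node), for example from SQL Server Management Studio.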
Cluster configuration settings
In addition to the data that is stored in the HPC databases, the following table lists important cluster configuration settings and data that are stored on the head node.
Important |
---|
This list is not exhaustive, but it indicates settings that are important to restore in many environments. Many settings depend on or affect the data that is stored in the HPC databases. Not all items apply to all clusters. |
Item | Location | Notes |
---|---|---|
Store of files for node setup, including operating system images and drivers | REMINST file share, including the REMINST\setup\images and REMINST\setup\drivers folders | |
Configuration files for service-oriented architecture (SOA) services | HpcServiceRegistration file share | Present if SOA services are installed. The DLLs for the SOA services that are specified in the service registration files are stored separately according to the preferences of the cluster administrator, and they should also be backed up. |
Output spool share for compute nodes | CcpSpoolShare file share | |
Results of diagnostic tests | Diagnostics file share | |
Submission and activation filters | Paths configured either in the job scheduler configuration options, or by job template (in Windows HPC Server 2008 R2 SP2 or later) | Present if installed by the cluster administrator |
Registry settings | HKEY_LOCAL_MACHINE\SOFTWARE\MICROSOFT\HPC | |
Custom diagnostic tests | %CCP_HOME%bin\DiagTests folder | Present if installed by the cluster administrator |
Other customizable files | Folders under %CCP_HOME% | Includes CcpPower.cmd, startnet.cmd, unattend.xml, HpcSession.exe.config |
Environment variables | On the head node: CCP_DATA, CCP_HOME, CCP_JOBTEMPLATE, CCP_SCHEDULER | |
Hosts file | %SystemRoot%\System32\drivers\etc\hosts | |
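Several of the items in the table can be captured with small helper routines. The sketch below is one possible approach, not a prescribed method: it builds a `reg export` command line for the HPC registry key, records the four environment variables, and lists UNC paths for the file shares. It assumes that the shares are published under the names shown in the table, and the function names and the `HEADNODE` computer name are hypothetical.

```python
import os

# Items taken from the table of cluster configuration settings.
HPC_REGISTRY_KEY = r"HKLM\SOFTWARE\Microsoft\HPC"
HPC_ENV_VARS = ["CCP_DATA", "CCP_HOME", "CCP_JOBTEMPLATE", "CCP_SCHEDULER"]
# Assumption: each share is published under the name used in the table.
HPC_SHARES = ["REMINST", "HpcServiceRegistration", "CcpSpoolShare", "Diagnostics"]


def registry_export_command(dest_file):
    """Build the argument list for exporting the HPC registry key to a .reg file."""
    # /y overwrites an existing destination file without prompting.
    return ["reg", "export", HPC_REGISTRY_KEY, dest_file, "/y"]


def snapshot_env(environ=None):
    """Return the HPC environment variables; missing ones map to None."""
    if environ is None:
        environ = os.environ
    return {name: environ.get(name) for name in HPC_ENV_VARS}


def share_paths(head_node):
    """Return the UNC path of each configuration file share on the head node."""
    return [f"\\\\{head_node}\\{share}" for share in HPC_SHARES]


if __name__ == "__main__":
    print(" ".join(registry_export_command("hpc-registry.reg")))
    print(snapshot_env())
    for path in share_paths("HEADNODE"):  # HEADNODE is a hypothetical name
        print(path)
```

On the head node, the returned argument list could be passed to `subprocess.run` to perform the export. The sketch does not cover items whose location is chosen by the administrator, such as filter paths and SOA service DLLs.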
Backup methods
To help recover your cluster if the head node or the HPC databases fail, regularly back up the following: the full system of the head node and any remote database servers, the HPC databases (on the head node or on one or more remote servers), and the cluster configuration settings on the head node. The following table provides guidelines for these three backup types.
Backup type | Description | Recommended frequency of backups |
---|---|---|
Full server | A backup of all volumes so that you can recover the full server, including all the files, data, applications, and the system state. The system state includes the boot file, the COM+ class registration database, and the registry. | Make a full server backup at regular intervals, and before and after you make major configuration changes. In a stable cluster, you can schedule full server backups once a week or even less frequently. |
Databases | A full backup of each HPC database. Database backups represent the whole database at the time the backup finished. | The HPC databases should be backed up much more often than the full system state backup. Back up all the HPC databases at the same time. Depending on the activity level in your cluster, you might want to back up the databases daily, multiple times per day, or even use a continual backup method. To ensure consistency between the databases, back up all of the databases after making configuration changes such as adding or deleting nodes. |
Cluster configuration settings | A backup of all the file shares on the head node that contain configuration settings that are critical for the operation of your cluster. | Back up all of the configuration settings at the same time. To ensure consistency between the databases and the configuration settings, back up the configuration settings at the same time that you back up the databases. |
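The frequency guidelines in the table can be expressed as a simple scheduling rule. The following sketch is illustrative only. It assumes a weekly interval for full server backups and a daily interval for database backups, which are example values within the ranges the table suggests, and it always pairs a configuration settings backup with a database backup. The names `INTERVALS` and `backups_due` are hypothetical.

```python
from datetime import datetime, timedelta

# Example intervals only; choose values that match your cluster's activity.
INTERVALS = {
    "full_server": timedelta(days=7),
    "databases": timedelta(days=1),
}


def backups_due(last_run, now, config_changed=False):
    """Return the sorted list of backup types that should run now.

    last_run maps a backup type to the datetime of its last backup;
    a missing entry means that type has never been backed up.
    """
    due = set()
    for kind, interval in INTERVALS.items():
        last = last_run.get(kind)
        if last is None or now - last >= interval:
            due.add(kind)
    if config_changed:
        # After changes such as adding or deleting nodes, back up the databases.
        due.add("databases")
    if "databases" in due:
        # Keep configuration settings consistent with the databases.
        due.add("configuration_settings")
    return sorted(due)


if __name__ == "__main__":
    now = datetime(2011, 7, 8, 12, 0)
    last = {
        "full_server": datetime(2011, 7, 1, 12, 0),
        "databases": datetime(2011, 7, 8, 6, 0),
    }
    print(backups_due(last, now))
    print(backups_due(last, now, config_changed=True))
```

The rule deliberately treats the databases and the configuration settings as one unit. A major configuration change would also warrant a full server backup before and after the change, which this sketch leaves to the administrator.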
To create a backup of your head node, the HPC databases, and the cluster configuration settings, you can choose among several standard backup solutions, which include Microsoft and non-Microsoft backup and restore solutions. These methods include the following:
Backup solution | More information |
---|---|
Windows Server Backup | |
SQL Server Management Studio Backup | |
Microsoft System Center Data Protection Manager (DPM) | |
Non-Microsoft backup and recovery solutions, such as Symantec NetBackup | Consult the documentation of the vendor. DISCLAIMER: Reference to any non-Microsoft products is intended solely for informational purposes and does not constitute or imply any endorsement by Microsoft. |
Note |
---|
If your system is running at least Windows HPC Server 2008 R2 Service Pack 2, you can also use the Export-HpcConfiguration.ps1 and Import-HpcConfiguration.ps1 HPC PowerShell® scripts to back up and restore certain cluster configuration settings on the head node. Some of these cluster configuration settings, such as node templates and job templates, are stored in the HPC databases. These scripts can also be used to migrate critical configuration settings from one cluster to another, for example, to maintain cluster operations in case of a disaster. For more information, see Export and Import Windows HPC Cluster Configuration Settings in this Back Up and Restore guide. |
Important |
---|
Regardless of which method you choose for backup and restore, to bring the cluster to a consistent state, you need to perform additional steps when you restore the cluster. For an overview of the restore steps in several scenarios, see Scenarios for Restoring a Windows HPC Server 2008 R2 Cluster. |
Example: Create a protection group for the cluster in DPM
You can use DPM to create a protection group for the cluster that includes the HPC databases (on the head node or on one or more remote computers) and the configuration settings that are stored in shared folders on the head node. For example, the protection group could include the following sources:
The SQL Server instance or instances that host the HPC databases
REMINST file share
HpcServiceRegistration file share
CcpSpoolShare file share
Diagnostics file share
The protection group could be expanded to include other data sources depending on the needs of the cluster administrator.
For information about creating a protection group in DPM, see Creating a Protection Group for File and Application Servers.