Migration to HPC Pack 2016 Update 1

This article describes the steps to migrate your HPC Pack cluster from the HPC Pack 2016 RTM version to HPC Pack 2016 Update 1.

Important

Only migration from HPC Pack 2016 RTM to HPC Pack 2016 Update 1 is supported. Upgrading or migrating an existing HPC Pack 2012 R2 or earlier cluster to HPC Pack 2016 Update 1 is not supported.

Before migration

Before the migration, you need to do the following:

  1. Stop all running jobs.
  2. Stop all Azure nodes (PaaS), if you have deployed them.
  3. Stop and delete all Azure Batch pools, if you have deployed them. Remove the Azure Batch node templates because there is a breaking change in the Burst to Azure Batch feature in HPC Pack 2016 Update 1.
  4. Back up the HPC databases manually.

Step 1: Download HPC Pack 2016 Update 1 installation package and create a network share

1.1: Decide on which head node to create the network share

For a single head node cluster, create the HPC Pack 2016 Update 1 installation network share on the head node.

For a high availability cluster, run the following PowerShell commands as administrator to determine on which head node to create the network share.

Add-PSSnapin Microsoft.Hpc
Get-HpcClusterRegistry | ?{$_.Name -eq "RuntimeDataShare"}

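The output is similar to the following. The server named in the Value column (HPCHN01 in this example) is the head node on which to create the network share.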

Name                                         Value
----                                         ----
RuntimeDataShare                             \\HPCHN01\Runtime$

1.2: Download the HPC Pack 2016 Update 1 installation package

Download the HPC Pack 2016 Update 1 installation zip package and the migration zip package from the Microsoft Download Center to the head node chosen in Step 1.1.

1.3: Unblock the installation package zip file and migration zip file

Right-click each downloaded zip file and select Properties. If there is a security alert on the General page that the file is blocked, click Unblock.
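
The same unblocking can also be done from PowerShell. The following is a minimal sketch, assuming the two zip files were saved to D:\Downloads (a hypothetical folder; substitute your own) and that PowerShell 3.0 or later is installed, which provides the Unblock-File cmdlet and the -Stream parameter of Get-Item.

```powershell
# Hypothetical download folder; replace with the location of your zip files.
$downloadDir = 'D:\Downloads'

# List the zip files that still carry the "downloaded from the Internet" mark
# (stored in the Zone.Identifier alternate data stream).
Get-ChildItem -Path $downloadDir -Filter *.zip |
    Where-Object { Get-Item -LiteralPath $_.FullName -Stream Zone.Identifier -ErrorAction SilentlyContinue } |
    Select-Object -ExpandProperty Name

# Clear the mark, which has the same effect as clicking Unblock in the Properties dialog.
Get-ChildItem -Path $downloadDir -Filter *.zip | Unblock-File
```

Running the first pipeline again afterwards should list no files.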

1.4: Create a network share for the HPC Pack 2016 Update 1 installation package

Extract the HPC Pack 2016 Update 1 installation package to a local directory, for example d:\HPCPack2016Update1, and extract the migration package to the same directory. Then create a network share as follows.

  1. Right-click the folder d:\HPCPack2016Update1, choose Properties > Security, and click Edit.

  2. Click Add to grant Read & execute permissions to “Everyone”. Click OK twice.

  3. On the Sharing tab click Share. Create a network share for the folder and add Read permission for “Everyone”.

  4. If your cluster is not in an Active Directory domain, you need to modify the local group policy. Do the following:

    a. Run gpedit.msc.

    b. Click Computer Configuration > Windows Settings > Security Settings > Local Policies > Security Options.

    c. Enable Network access: Let everyone permissions apply to anonymous users, and add “HPCPack2016Update1” to Network access: Shares that can be accessed anonymously.
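
Steps 1 to 3 above can also be scripted. The following is a minimal sketch, assuming the folder d:\HPCPack2016Update1 from the example, an elevated PowerShell console, and an English-language system (the Everyone group has a localized name on other systems). It does not cover the group policy change in step 4, which still has to be made for clusters outside an Active Directory domain.

```powershell
# Folder and share name follow the example used in this article.
$folder    = 'd:\HPCPack2016Update1'
$shareName = 'HPCPack2016Update1'

# Grant Everyone "Read & execute" on the folder; (OI)(CI) makes files and
# subfolders inherit the permission.
icacls $folder /grant 'Everyone:(OI)(CI)RX'

# Share the folder and give Everyone Read access on the share.
New-SmbShare -Name $shareName -Path $folder -ReadAccess 'Everyone'

# Display the resulting share permissions for review.
Get-SmbShareAccess -Name $shareName
```

Both the NTFS permission and the share permission are needed, because Windows applies the more restrictive of the two when the folder is accessed over the network.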

Step 2: Upgrade Windows compute, broker, and workstation nodes

Open HPC Cluster Manager on the head node, and click Resource Management > Nodes. Select all the compute, broker, and workstation nodes, and click Run Command. In the Command line field, enter the following command line, and click Run.

powershell.exe -ExecutionPolicy ByPass -Command "\\yourheadnode\HPCPack2016Update1\Migration\MigrateHpcNode.ps1 -Update1PackagePath \\yourheadnode\HPCPack2016Update1 -RunAsScheduledTask"

Note

All the nodes are shown in the “Error” state after the command runs because of a breaking change in HPC Pack 2016 Update 1: compute, broker, and workstation nodes upgraded to Update 1 cannot connect to a head node that is still running the RTM version. The connection recovers after you upgrade the head node(s).
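
Before you run the command on every node, it can be worth confirming that the migration script is reachable through the share. This is a small sketch that uses the placeholder name yourheadnode from the command above; run it from any machine that should have access to the share.

```powershell
# Placeholder share path from this article; replace yourheadnode with your head node name.
$share = '\\yourheadnode\HPCPack2016Update1'

# Returns True when the node migration script can be read through the share.
Test-Path -Path (Join-Path $share 'Migration\MigrateHpcNode.ps1')
```

A result of False usually points to a wrong share name or missing permissions from Step 1.4.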

Step 3: Upgrade Linux compute nodes

Scenario 1: Upgrade on-premises Linux compute nodes

If your Linux compute nodes were installed manually as described in Add Linux nodes to the cluster, use the following steps to migrate.

Open HPC Cluster Manager on the head node, and click Resource Management > Nodes. Select all the Linux nodes, click Run Command, and run the following commands in sequence.

First, create a temp directory on all Linux nodes.

mkdir /tmp/hpc2016u1

Second, mount the HPC Pack 2016 Update 1 installation share.

mount -t cifs //yourheadnode/HPCPack2016Update1 /tmp/hpc2016u1 -o vers=2.1,domain=<domainname>,username=<username>,password='<password>',dir_mode=0777,file_mode=0777

Third, schedule a job on all Linux nodes to start the migration one minute later.

cd /tmp/hpc2016u1; echo "python /tmp/hpc2016u1/Migration/migratelinuxnode.py -update1package:/tmp/hpc2016u1/LinuxNodeAgent/hpcnodeagent.tar.gz" | at now + 1 minute

Scenario 2: Upgrade Azure Linux compute nodes

If you deployed the HPC Pack 2016 cluster for Linux workloads with our Azure Resource Manager template, open a PowerShell console as administrator on one head node and run the following commands. You need to specify the resource group and location in which your nodes were deployed.

$resourceGroupName = "yourresourcegroup"
$location = "yourlocation"
Add-PSSnapin Microsoft.Hpc
$vmNames = get-hpcnode -GroupName LinuxNodes | %{$_.NetbiosName}
if((Get-WmiObject Win32_ComputerSystem).DomainRole -lt 3)
{
    $settings = @{"ClusterConnectionString" = "$env:CCP_SCHEDULER"}
}
else
{
    $domainName = (Get-WmiObject Win32_ComputerSystem).Domain
    $connStr = @($env:CCP_SCHEDULER.split(',') | %{$_ + '.' + $domainName}) -join ','
    $settings = @{"ClusterConnectionString" = "$connStr"}
}

foreach($vmName in $vmNames)
{
    Remove-AzureRmVMExtension -Name installHPCNodeAgent -ResourceGroupName $resourceGroupName -VMName $vmName
    Set-AzureRmVMExtension -Publisher Microsoft.HpcPack -ExtensionType LinuxNodeAgent2016U1 -Version 2.3 -Name installHPCNodeAgent -Settings $settings -ResourceGroupName $resourceGroupName -VMName $vmName -Location $location
}

Note

All the Linux nodes are shown in the “Error” state after the command runs because of a breaking change in HPC Pack 2016 Update 1: Linux nodes upgraded to Update 1 cannot connect to a head node that is still running the RTM version. The connection recovers after you upgrade the head node(s).
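
For the Azure scenario, one way to check whether the new extension was installed on each virtual machine is to query its provisioning state. This is a sketch only, assuming the same AzureRM module and signed-in Azure session that the script above relies on, and the same resource group placeholder.

```powershell
# Same placeholder and node group as in the script above.
$resourceGroupName = "yourresourcegroup"
Add-PSSnapin Microsoft.Hpc
$vmNames = Get-HpcNode -GroupName LinuxNodes | ForEach-Object { $_.NetbiosName }

foreach ($vmName in $vmNames)
{
    # Read the extension that the migration script re-created on this VM.
    $ext = Get-AzureRmVMExtension -ResourceGroupName $resourceGroupName -VMName $vmName -Name installHPCNodeAgent
    '{0}: {1}' -f $vmName, $ext.ProvisioningState
}
```

A state of Succeeded indicates that Azure finished installing the extension; it does not by itself confirm that the node has reconnected to the cluster.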

Step 4: Upgrade the head node(s)

Scenario 1: Upgrade the head node for a single head node cluster

1: Back up cluster configuration settings

Open a PowerShell console as administrator on the head node, and run the following commands.

Add-PSSnapin Microsoft.Hpc
mkdir d:\HPCBackup
Get-HpcClusterRegistry | Export-Clixml d:\HPCBackup\clusterregistry.xml
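
Because the upgrade script reads this backup later, it may be worth confirming that the file can be read back before you continue. A minimal sketch, assuming the backup path used above:

```powershell
# Path written by the Export-Clixml command above.
$backupFile = 'd:\HPCBackup\clusterregistry.xml'

# Re-import the saved registry entries and report how many were stored.
$saved = Import-Clixml -Path $backupFile
"Entries saved: $(@($saved).Count)"

# Spot-check one well-known entry.
$saved | Where-Object { $_.Name -eq 'RuntimeDataShare' }
```

The same check applies to the backup taken in the high availability scenario.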

2: Run script to upgrade head node

Open a new PowerShell console as administrator, and run the following command:

d:\HPCPack2016Update1\Migration\MigrateSingleHN.ps1 -BackupDir d:\HPCBackup -Update1PackagePath d:\HPCPack2016Update1

Scenario 2: Upgrade the high availability head nodes

An HPC Pack 2016 cluster with head node high availability has three head nodes. The head node on which you downloaded the HPC Pack 2016 Update 1 installation package and created the network share is called head node A here. The other two head nodes are head node B and head node C.

1: Back up cluster configuration settings

Open a PowerShell console as administrator on head node A, and run the following PowerShell commands.

Add-PSSnapin Microsoft.Hpc
mkdir d:\HPCBackup
Get-HpcClusterRegistry | Export-Clixml d:\HPCBackup\clusterregistry.xml

2: Upgrade the Service Fabric cluster

Run the following PowerShell commands on head node A to upgrade the Service Fabric cluster to the latest version.

Connect to the cluster and get the list of available versions that you can upgrade to.

Connect-ServiceFabricCluster
Get-ServiceFabricRegisteredClusterCodeVersion

Start a cluster upgrade to the latest version from the list (for example 6.0.232.9494).

Start-ServiceFabricClusterUpgrade -Code -CodePackageVersion 6.0.232.9494 -Monitored -FailureAction Rollback
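
Instead of reading the latest version from the list by eye, you can let PowerShell pick it. This is a sketch, assuming each object returned by Get-ServiceFabricRegisteredClusterCodeVersion exposes a CodeVersion property in the four-part numeric form shown in the example; it only prints the version and does not start the upgrade.

```powershell
Connect-ServiceFabricCluster | Out-Null

# Sort the registered code versions numerically and keep the highest one.
$latest = Get-ServiceFabricRegisteredClusterCodeVersion |
    Sort-Object { [version]$_.CodeVersion } |
    Select-Object -Last 1 -ExpandProperty CodeVersion

"Latest registered code version: $latest"
```

The printed value can then be passed to the -CodePackageVersion parameter of the command above.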

During the upgrade, the original PowerShell console closes. Open a new one as administrator, connect to the Service Fabric cluster again with the Connect-ServiceFabricCluster command, and run the following command to monitor the upgrade progress.

Get-ServiceFabricClusterUpgrade

The upgrade is complete when the UpgradeState becomes RollingForwardCompleted.
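
Rather than rerunning the command by hand, you can poll it in a loop. The following is a minimal sketch to run in the new console; the 30-second interval is an arbitrary choice, and the loop also stops on RollingBackCompleted or Failed so that it does not wait forever if the upgrade does not succeed.

```powershell
Connect-ServiceFabricCluster | Out-Null

# States in which the upgrade is no longer making progress.
$finalStates = 'RollingForwardCompleted', 'RollingBackCompleted', 'Failed'

do
{
    Start-Sleep -Seconds 30
    $upgrade = Get-ServiceFabricClusterUpgrade
    # Print a timestamp and the current state on each pass.
    '{0:T}  {1}' -f (Get-Date), $upgrade.UpgradeState
} while ($finalStates -notcontains "$($upgrade.UpgradeState)")
```

If the loop ends in any state other than RollingForwardCompleted, investigate the Service Fabric upgrade before continuing with the next step.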

3: Remove the HPC application from the Service Fabric cluster

Run the following PowerShell commands on head node A to remove the old version of the HPC application from the Service Fabric cluster.

Remove the HPC application.

Remove-ServiceFabricApplication -ApplicationName fabric:/HpcApplication -Force

Remove the HPC application type and application package.

Get-ServiceFabricApplicationType -ApplicationTypeName HpcApplicationType | Unregister-ServiceFabricApplicationType -Force
Remove-ServiceFabricApplicationPackage -ApplicationPackagePathInImageStore HpcApplicationType -ImageStoreConnectionString fabric:ImageStore
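
To confirm that the removal succeeded, you can query the cluster for the application and its type. This is a sketch that assumes the console is still connected to the Service Fabric cluster; both commands are expected to return nothing once the old HPC application is gone.

```powershell
# Should return no application after the removal.
Get-ServiceFabricApplication -ApplicationName fabric:/HpcApplication

# Should return no registered application type after the unregistration.
Get-ServiceFabricApplicationType -ApplicationTypeName HpcApplicationType
```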

4: Upgrade the first two head nodes

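Run the following PowerShell commands as administrator separately on head node B and head node C. In the commands, HPCHN01 is the example name of head node A; replace it with the name of your own head node A.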

Copy-Item \\HPCHN01\HPCPack2016Update1\Migration\MigrateHaHN.ps1 -Destination c:\Windows\Temp\MigrateHaHN.ps1 -Force
c:\Windows\Temp\MigrateHaHN.ps1 -Update1PackagePath \\HPCHN01\HPCPack2016Update1 -Prerequisites

5: Upgrade the last head node

Run the following PowerShell command as administrator on head node A.

D:\HPCPack2016Update1\Migration\MigrateHaHN.ps1 -Update1PackagePath D:\HPCPack2016Update1 -BackupDir d:\HPCBackup