Run Batch Workload on a Mixed Infrastructure (Windows Azure Worker Nodes & On-Premise HPC Server 2008 R2 Compute Nodes)
With Service Pack 1 (SP1) of HPC Server 2008 R2, it is possible to run workloads on Windows Azure. If you want to burst your batch application to the cloud (that is, extend the infrastructure from classic on-premise servers to cloud-based Azure worker nodes), you need to set specific environment variables within the on-premise server infrastructure.
Here is a sample of how to upload a batch application (azuretestbatch.exe) to Azure worker nodes and run a parametric sweep workload across the full cluster (on-premise compute nodes + Azure worker nodes):
Create the Windows Azure Package & Upload It
Use the hpcpack create command to create a *.zip file. Once the package has been created, upload it to Windows Azure storage by using the hpcpack upload command. Specify the node template of your Windows Azure worker nodes. Additionally, specify a relative path; in our sample it is named test. This forces the hpcsync command, which we call later, to unzip the package content to the folder named test:
C:\Users\tlangner>hpcpack create AzureTestPackage.zip C:\temp\folder1
C:\Users\tlangner>hpcpack upload AzureTestPackage.zip /nodetemplate:AzureNodeTemplate /relativepath:test
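If you want to script these two steps, a minimal sketch in Python could assemble the same command lines as argument lists. The helper names hpcpack_create and hpcpack_upload are hypothetical; only the commands and switches already shown above are used, and nothing is executed here.

```python
# Sketch only: builds the two hpcpack invocations from this post as argument lists.
# The helper names are hypothetical. Nothing is executed; on a machine with the
# HPC Pack client tools installed, a list could be handed to subprocess.run().


def hpcpack_create(package, source_dir):
    """Argument list for: hpcpack create <package> <source_dir>"""
    return ["hpcpack", "create", package, source_dir]


def hpcpack_upload(package, node_template, relative_path):
    """Argument list for: hpcpack upload <package> /nodetemplate:... /relativepath:..."""
    return [
        "hpcpack", "upload", package,
        "/nodetemplate:" + node_template,
        "/relativepath:" + relative_path,
    ]


if __name__ == "__main__":
    commands = (
        hpcpack_create("AzureTestPackage.zip", r"C:\temp\folder1"),
        hpcpack_upload("AzureTestPackage.zip", "AzureNodeTemplate", "test"),
    )
    for cmd in commands:
        # Print each command the way it would be typed at the prompt.
        print(" ".join(cmd))
```

Keeping the package name, node template, and relative path as parameters makes it easier to reuse the same values in the later hpcsync and job submit steps.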
You can verify your upload by calling the hpcpack view command. As the output below shows, the target directory on the Windows Azure worker node disks is %CCP_PACKAGE_ROOT%\test. The environment variable %CCP_PACKAGE_ROOT% only exists on Windows Azure worker nodes.
C:\Users\tlangner>hpcpack view azuretestpackage.zip /nodetemplate:AzureNodeTemplate
Connecting to head node: head
Querying for Windows Azure Worker Role node template "AzureNodeTemplate"
Windows Azure Worker Role node template found.
Retrieving Azure account name and key.
Found account: hpc*** and key: ********
Package Name: azuretestpackage.zip
Uploaded: 06.01.2011 10:43:44
Description:
Node Template: AzureNodeTemplate
Target Azure Dir: %CCP_PACKAGE_ROOT%\test
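To check the target directory from a script rather than by eye, the "Key: value" lines of the output can be collected into a dictionary. This is a rough sketch: parse_view_output is a hypothetical helper, and it assumes the output keeps the layout shown above.

```python
# Sketch only: parse_view_output is a hypothetical helper, not part of HPC Pack.
# It assumes the "Key: value" layout of the hpcpack view output shown in this post.


def parse_view_output(text):
    """Collect 'Key: value' lines into a dict.

    Lines without a colon are skipped. Only the first colon splits the line,
    so a timestamp such as 10:43:44 stays intact in the value. Status lines
    that happen to contain a colon are stored as well and can be ignored.
    """
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


# A few lines copied from the transcript above.
SAMPLE = """\
Package Name: azuretestpackage.zip
Uploaded: 06.01.2011 10:43:44
Target Azure Dir: %CCP_PACKAGE_ROOT%\\test
"""

if __name__ == "__main__":
    info = parse_view_output(SAMPLE)
    print(info["Target Azure Dir"])
```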
After the hpcpack upload command has finished, it is time to deploy the package content to the Windows Azure worker nodes (i.e., the local disks of the cloud machines). This is done by calling the hpcsync command:
C:\Users\tlangner>hpcsync /nodetemplate:AzureNodeTemplate
Deploy the Package Content Locally
In order to run the batch workload on the “classic” compute nodes, too, it is helpful to deploy the package using the same environment variables that are available in Windows Azure. In this sample we want the CCP_PACKAGE_ROOT variable to point to the local C:\TEMP folder of every machine in the cluster:
C:\Users\tlangner>cluscfg setenvs "CCP_PACKAGE_ROOT=C:\TEMP"
Then, let’s copy the zip package content to the target directory of every “classic” compute node in the cluster. The content will be deployed to the C:\TEMP\test directory:
C:\Users\tlangner>clusrun /nodegroup:computenodes xcopy \\head\temp\AzureTestBatch\AzureTestBatch\bin\Debug\*.* ^%CCP_PACKAGE_ROOT%^\test
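The point of this step is that the same path string, %CCP_PACKAGE_ROOT%\test, resolves to a valid directory on both kinds of node. The following sketch illustrates the idea with a small stand-in for %NAME% expansion. The function expand_percent_vars is hypothetical, and the Azure-side value is a made-up placeholder because the real value is assigned on the worker nodes.

```python
import re

# Sketch only: a simplified stand-in for %NAME% expansion, used to illustrate
# why one path string works on every node. It is not how cmd.exe is implemented.


def expand_percent_vars(path, env):
    """Replace each %NAME% with env[NAME]; unknown names are left as written."""
    def substitute(match):
        return env.get(match.group(1), match.group(0))
    return re.sub(r"%([^%]+)%", substitute, path)


# Value set with cluscfg setenvs in this post.
on_premise_env = {"CCP_PACKAGE_ROOT": r"C:\TEMP"}
# Hypothetical placeholder: the real value exists only on the Azure worker nodes.
azure_env = {"CCP_PACKAGE_ROOT": r"X:\placeholder\packages"}

if __name__ == "__main__":
    workdir = r"%CCP_PACKAGE_ROOT%\test"
    print(expand_percent_vars(workdir, on_premise_env))
    print(expand_percent_vars(workdir, azure_env))
```

Because the variable is resolved per node, the job definition never has to know which physical directory a given machine uses.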
Run the Mixed Workload
After deploying the package content to the whole cluster it’s time to run the batch job:
C:\Users\tlangner>job submit /nodegroup:X /parametric:1-1000 /workdir:^%CCP_PACKAGE_ROOT%^\test azuretestbatch.exe 5000
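As a rough illustration of what this submission asks for, the sketch below builds the same command as an argument list and counts the sweep steps. Both helper names are hypothetical, the sweep range is assumed to be inclusive at both ends, and the carets from the interactive command are left out because no command shell parses the list.

```python
# Sketch only: helper names are hypothetical and nothing is submitted.
# Assumption: a sweep spec "start-end" is inclusive at both ends.


def sweep_indices(spec):
    """Turn a 'start-end' sweep spec into the range of indices it covers."""
    start, end = (int(part) for part in spec.split("-"))
    return range(start, end + 1)


def job_submit(node_group, sweep, workdir, command):
    """Argument list for the job submit call shown above."""
    return [
        "job", "submit",
        "/nodegroup:" + node_group,
        "/parametric:" + sweep,
        "/workdir:" + workdir,
    ] + list(command)


if __name__ == "__main__":
    # "X" is the node group placeholder used in the post.
    cmd = job_submit("X", "1-1000", r"%CCP_PACKAGE_ROOT%\test",
                     ["azuretestbatch.exe", "5000"])
    print(" ".join(cmd))
    print(len(sweep_indices("1-1000")), "sweep steps")
```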
Comments
Anonymous
February 02, 2011
Thanks, good one. From an operations point of view, we may need a native GUI tool within HPC 2008 to manage burst limit transfer, parameters for this transfer, logging and errors, along with support for Azure transfer back to the cluster environment. May require USA on Azure, Sandbox with development and testing for HPC compilers! Well, I guess it's worth logging in our Connect too.
Anonymous
March 16, 2011
Regarding hpcsync, the command should be written as: clusrun /nodegroup:AzureWorkerNodesGroupName hpcsync (AzureWorkerNodesGroupName should be replaced with the name of the Azure nodes group)
Anonymous
May 16, 2011
In SP2 beta, you can enable Azure Connect with your Azure Nodes. With Azure Connect, you can enable connectivity between Azure Nodes and on-premises endpoints that have the Azure Connect agent installed. This can help provide access from Azure Nodes to UNC file shares and license servers on-premises. You can use the Remote Desktop functionality to install the Azure Connect agent on your Azure nodes, and associate the Azure nodes with an on-premise group via the Azure portal.