Install and Use the Network Troubleshooting Report Diagnostic Test
Applies To: Microsoft HPC Pack 2008 R2, Microsoft HPC Pack 2012, Microsoft HPC Pack 2012 R2, Windows HPC Server 2008 R2
This topic contains information about how to install and use the Network Troubleshooting Report diagnostic test for Microsoft® HPC Pack. The Network Troubleshooting Report diagnostic test collects and analyzes network information to help you troubleshoot networking issues on your Windows HPC cluster.
In this topic:
About the Network Troubleshooting Report diagnostic test
Install the diagnostic test
Install the diagnostic test files on workstation nodes
Install vstat.exe (optional)
Run the Network Troubleshooting Report diagnostic test
Copy test data to Microsoft Excel for custom analysis
Redistribute the diagnostic test files to cluster nodes
Edit a node template to automatically deploy test files to new nodes
Uninstall the diagnostic test
About the Network Troubleshooting Report diagnostic test
The Network Troubleshooting Report diagnostic test is installed using an installation (MSI) file, which automates the tasks of adding the test to HPC Cluster Manager and copying the necessary binary files and scripts to the nodes. A new or custom diagnostic test does not require an MSI-based installation, but the MSI automates tasks that you would otherwise perform manually. (For more information about adding custom diagnostic tests to a cluster, see Add New and Custom Diagnostic Tests.)
During the installation, the following binary files are copied to each node:
| Node | Files | Path | Function |
| --- | --- | --- | --- |
| Head node | NetTroubleshoot.exe, NetWrench.exe, NetTSAdapter.dll | %REMINST%\NetTroubleshoot | Analyze the data relayed by the NetWrench.exe and NetTSAdapter.dll files that run on the compute nodes, and generate the HTML report for the test |
| Compute nodes | NetWrench.exe, NetTSAdapter.dll | %CCP_HOME%bin | Run when the test runs, to collect network information and relay it to the head node for analysis |

Important
If a compute node does not have the NetWrench.exe and NetTSAdapter.dll files (for example, they failed to be distributed to that node during the installation of the test or the node was redeployed recently), the test will fail to run on that node. No results for that node will appear in the report for the test.
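Because a node that lacks the two collector files is silently left out of the report, it can help to check a node's bin folder before running the test. The following is a minimal sketch, not part of the diagnostic test itself: the function name `missing_test_files` is hypothetical, and the only facts taken from this topic are the two file names and the `%CCP_HOME%bin` location.

```python
from pathlib import Path

# File names listed for compute nodes in the table above.
REQUIRED_NODE_FILES = ("NetWrench.exe", "NetTSAdapter.dll")


def missing_test_files(bin_dir):
    """Return the required test files that are not present in bin_dir."""
    folder = Path(bin_dir)
    return [name for name in REQUIRED_NODE_FILES
            if not (folder / name).is_file()]
```

On a compute node, you might call it as `missing_test_files(Path(os.environ["CCP_HOME"]) / "bin")` after importing `os`. An empty list means both files are in place.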
If you have an InfiniBand network, you can choose to install vstat.exe on the nodes in that network to collect additional information for the report. The NetWrench.exe and NetTSAdapter.dll files on the compute node use vstat.exe to collect information about the status and the capabilities of the host channel adapter (HCA) cards for that network. This information is then analyzed on the head node and is included in the HTML report. For more information, see Install vstat.exe (optional), later in this topic.
A script (RedistributeCN.cmd) is provided to redistribute the binary files to the nodes. You can run this script at any time. Run it after a new node is added to the cluster, or after an existing node is redeployed. For more information, see Redistribute the diagnostic test files to cluster nodes, later in this topic.
If you redeploy nodes often, you can avoid repeatedly redistributing the binary files by adding a task to your node templates that runs a script (CopyCN.cmd). This script copies the binary files for the test during node deployment. For more information, see Edit a node template to automatically deploy test files to new nodes, later in this topic.
Unlike the basic HTML reports generated by the default tests installed with HPC Pack, the Network Troubleshooting Report test report includes custom tables that contain all of the test data. These tables can be copied to Microsoft® Excel® or a similar application for analysis. For more information, see Copy test data to Microsoft Excel for custom analysis, later in this topic.
Install the diagnostic test
Important
- To install the Network Troubleshooting Report diagnostic test, you must use the domain credentials of an HPC cluster administrator. If you use local administrator credentials, the installation will fail. It is recommended that you are logged on to the head node as an HPC cluster administrator.
- To copy the necessary binary files for the test to the compute nodes, the installation wizard starts a script that runs a clusrun command. If a compute node cannot be reached or the files are not copied successfully, the installation of the test can still complete. However, if a compute node does not have the files to collect and relay information to the head node, information about that node will not appear in the test report.
- If the binary files that are required for the test are not successfully copied to all compute nodes during the installation, you can manually redistribute the files to the nodes at a later date by running the RedistributeCN.cmd script. For more information, see Redistribute the diagnostic test files to cluster nodes, later in this topic.
- If you have workstation nodes in your cluster, you or an administrator of workstation computers may need to perform additional, manual steps to deploy the necessary binary files to the workstation nodes. For more information, see Install the diagnostic test files on workstation nodes.
- If you have an InfiniBand network, you can choose to install vstat.exe on the nodes to provide additional network information, or vstat.exe may already be installed. If you need to install vstat.exe on the nodes, see Install vstat.exe (optional), later in this topic.
To install the diagnostic test
Download the Network Troubleshooting Report diagnostic test installation program from the Microsoft Download Center to the head node of your Windows HPC cluster or to a network location. The Network Troubleshooting Report diagnostic test is included in the HPC Pack Tool Pack. For more information about the Tool Pack, see Microsoft HPC Pack: Tools and Extensibility.
Make sure that as many nodes as possible in your cluster are started and can be reached from the head node. For example, to check this, view the Node Health monitoring chart. To do this:
If HPC Cluster Manager is not already open on the head node, open it.
In Charts and Reports, view the Node Health monitoring chart.
On the head node computer, run NetTroubleshoot.msi. Follow the steps in the installation wizard.
To copy the NetWrench.exe and NetTSAdapter.dll files to the compute nodes in the cluster, the wizard automatically opens a Command Prompt window and runs a script that copies the files. When the script completes, review the Summary and then press a key to continue.
Important
- If you are prompted, type the password for your account.
- If an error occurs during the copying of the files to the compute nodes or there is a problem with your credentials, an error message appears in the command output that guides you to correct the problem.
On the final page of the installation wizard, click Finish.
Additional considerations
The Network Troubleshooting Report diagnostic test is installed in HPC Cluster Manager using the following metadata:
| Item | Description |
| --- | --- |
| Suite | Network Troubleshooting |
| Name | Network Troubleshooting Report |
| Alias | netTroubleshoot |
Install the diagnostic test files on workstation nodes
If the HPC cluster administrator credentials that you use to install the test also provide administrative credentials on workstation nodes (and unmanaged server nodes, if supported in your version of HPC Pack), the installation program automatically copies the necessary test files to the workstation computers.
In many organizations, however, HPC cluster administrators do not have administrative credentials on the workstation nodes. If you do not have administrative credentials on the workstations, the installation program cannot copy the binary files for the test to those nodes. It is also not possible to redistribute the test files to the workstation nodes by running the RedistributeCN.cmd script.
If you are not a workstation administrator for your organization, you will need to coordinate the installation of the diagnostic test files with the workstation administrator. The administrator will need to follow the deployment practices in the organization to copy the NetWrench.exe and NetTSAdapter.dll files from the head node computer (or a network location) to the %CCP_HOME%bin folder on the workstation node computers. For more information about these files and their locations, see About the Network Troubleshooting Report diagnostic test, in this topic.
Install vstat.exe (optional)
If you have an InfiniBand network and you want the Network Troubleshooting Report to display the status and the capabilities of the host channel adapter (HCA) cards in that network, the vstat.exe tool must be installed on each node that has an HCA card. The Network Troubleshooting Report diagnostic test does not install vstat.exe.
Important
Vstat.exe may be installed automatically as part of the driver package from your HCA card vendor or system builder, or you may need to download and install vstat.exe separately. Consult the documentation from your vendor or system builder.
- If you installed the driver package or vstat.exe only on the head node computer, you will need to deploy vstat.exe to the other cluster nodes. The method you use depends on how vstat.exe is installed on the head node.
- If vstat.exe is not available from your vendor or system builder, you can also download the latest device drivers that are published by the OpenFabrics Alliance (OFA), which work with most commercially available HCA cards. The vstat.exe tool is installed automatically with the device drivers that are published by the OFA. For more information and to download the latest device drivers that are published by the OFA, see The OpenFabrics Alliance (https://go.microsoft.com/fwlink/?LinkID=137347).
To install vstat.exe on the nodes in your cluster
Download the InfiniBand driver or tools installation program from the appropriate hardware vendor or system builder to the head node of your Windows HPC cluster or to a network location.
On the head node computer, run the installation program.
(Important) When prompted, choose to install the program files in a folder in the C:\Program Files folder on the head node computer. In most cases, this is a default installation option.
Depending on how vstat.exe is installed on the head node, do one of the following:
- If vstat.exe is installed as a stand-alone application, you can run a clusrun command on the head node to copy vstat.exe to the nodes.
  Important
  vstat.exe must be copied to a folder in the C:\Program Files folder on the nodes. This ensures that the diagnostic test can use vstat.exe to collect information on the nodes.
- If vstat.exe is installed with the driver package, you can add the drivers to the operating system images that are deployed to the nodes. In HPC Cluster Manager, in Configuration, in the Deployment To-do List, click Manage drivers.
Run the Network Troubleshooting Report diagnostic test
You can use the following procedure to run the Network Troubleshooting Report diagnostic test on all nodes in the cluster and to view the report.
Important
- A different suite of tests, also named “Network Troubleshooting”, is installed by default in HPC Cluster Manager. This suite includes the DNS Test, the Domain Connectivity Test, and the Ping Test. To run the Network Troubleshooting Report diagnostic test that you installed, ensure that you select the test with the alias netTroubleshoot.
- Running the Network Troubleshooting Report diagnostic test on a large cluster can take a long time.
To run the Network Troubleshooting Report diagnostic test
If HPC Cluster Manager is not already open on the head node, open it.
In Diagnostics, in the Navigation Pane, expand Tests, expand Network, and then click Network Troubleshooting.
In the view pane, right-click Network Troubleshooting Report, and then click Run.
In the Run Diagnostic Tests dialog box, in Nodes to test, select All nodes, and then click Run.
To view the report
In Diagnostics, in the Navigation Pane, click Test Results.
In the view pane, verify that the status of the Network Troubleshooting Report test is not Running.
To view the report, in the view pane, double-click Network Troubleshooting Report. The report will open in your default web browser.
To export the report, right-click Network Troubleshooting Report and then click Export Results. You can then open the report using a browser or Microsoft Excel.
Note
If you see an error message in the Nodes Excluded from this Report section of the report similar to “Netwrench.exe is not recognized as an internal or external command, operable program or batch file”, the binary files for the diagnostic test are not found on the indicated nodes. This can occur if there is a problem distributing the files to the nodes, or if a node was redeployed. To redistribute the binary files for the test to the nodes, see Redistribute the diagnostic test files to cluster nodes, later in this topic.
Copy test data to Microsoft Excel for custom analysis
You can copy the data in any report table to Microsoft Excel, and perform your custom analysis of the data. The All Data by Network tab in the report is specifically created for that purpose. It contains summary tables of the data in the different categories in the Analysis by Category tab.
To copy the data to Excel
Open a new Excel workbook.
On the test report, click All Data by Network.
Select or highlight the table that you want to copy.
In Excel, click a cell and paste the data.
You can use the tools in Excel to sort, filter, and analyze the data.
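If you prefer to script the extraction instead of copying tables by hand, the exported report can be parsed directly. The sketch below is illustrative only. It assumes that the exported report is an HTML file whose tables use ordinary `<table>`, `<tr>`, `<th>`, and `<td>` markup with explicit closing tags and no nested tables. None of this has been verified against an actual report, and the names `TableExtractor`, `tables_from_html`, and `table_to_csv` are hypothetical.

```python
import csv
import io
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect every <table> in an HTML document as a list of rows."""

    def __init__(self):
        super().__init__()
        self.tables = []   # finished tables, each a list of rows
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Collapse runs of whitespace inside the cell text.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.tables[-1].append(self._row)
            self._row = None


def tables_from_html(html_text):
    """Return all tables found in html_text, in document order."""
    parser = TableExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.tables


def table_to_csv(rows):
    """Serialize one table (a list of rows) as CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()
```

Each CSV string can be written to a file and opened in Excel. If the report nests tables or omits closing tags, the parser would need to be extended.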
Redistribute the diagnostic test files to cluster nodes
At any time, you can redistribute the files necessary to collect information for the diagnostic test to all nodes that can be reached in the cluster.
To redistribute the diagnostic test files
Make sure that as many nodes as possible in your cluster are started and can be reached from the head node.
Open an elevated Command Prompt window. Click Start, point to All Programs, click Accessories, right-click Command Prompt, and then click Run as administrator.
At the elevated Command Prompt window, type the following command:
"%ccp_home%bin\RedistributeCN.cmd"
Edit a node template to automatically deploy test files to new nodes
If you are deploying nodes from bare metal, you can edit an existing node template to automatically deploy the Network Troubleshooting diagnostic test files to new nodes.
To edit the node template
In HPC Cluster Manager, in Configuration, in the Navigation Pane, click Node Templates.
In the views pane, select a node template.
In the Actions pane, click Edit. The Node Template Editor dialog box appears.
To add a task that will copy the files for the Network Troubleshooting Report diagnostic test to each node, click Add Task, point to Deployment, and then click Run OS command.
Ensure that the new task that you created is selected in the Node template tasks list, and then click Move Down until that task is listed as the last task under Deployment. This will make the new task run after all the other deployment tasks have finished running.
Specify the following properties for the new task:
Set the ContinueOnFailure property to True.
Optionally, in the text box for the Description property, type a description for the task. For example: Copy test report files command.
In the text box for the Command property, type the following command:
\\%ccp_scheduler%\REMINST\NetTroubleshootSetup\CopyCN.cmd
To save the node template with the new task, click Save.
Uninstall the diagnostic test
To uninstall the Network Troubleshooting Report diagnostic test, do the following:
Uninstall the diagnostic test on the head node
Delete the diagnostic test files from the nodes (optional)
Delete vstat.exe from the nodes (optional)
To uninstall the diagnostic test on the head node
On the head node, close HPC Cluster Manager, if it is currently open.
Open Control Panel. Click Start, and then click Control Panel.
In Control Panel, under Programs, click Uninstall a program.
On the list of installed programs, right-click Microsoft HPC Pack 2008 R2 Network Troubleshooting Report, and then click Uninstall. Follow the steps of the wizard.
Important
To uninstall the Network Troubleshooting Report diagnostic test, you must use the domain credentials of an HPC cluster administrator. If you use local administrator credentials, the uninstallation will fail.
To delete the diagnostic test files from the nodes (optional)
Open an elevated Command Prompt window. Click Start, point to All Programs, click Accessories, right-click Command Prompt, and then click Run as administrator.
At the elevated Command Prompt window, type the following two commands:
clusrun /all del "%CCP_HOME%bin\NetWrench.exe"
clusrun /all del "%CCP_HOME%bin\NetTSAdapter.dll"
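When these commands are pasted from a formatted document, curly quotation marks can replace the straight ones that the Command Prompt expects. One way to avoid this is to generate the command lines from the file list. The following is a small sketch. The helper name `delete_commands` is hypothetical, and it only builds the command text from the file names and path given in this topic. It does not run anything.

```python
# File names and the %CCP_HOME%bin location come from this topic.
NODE_FILES = ("NetWrench.exe", "NetTSAdapter.dll")


def delete_commands(files=NODE_FILES):
    """Build one 'clusrun /all del' command line per test file."""
    return ['clusrun /all del "%CCP_HOME%bin\\{}"'.format(name)
            for name in files]


if __name__ == "__main__":
    # Print the commands so they can be reviewed before they are run.
    for line in delete_commands():
        print(line)
```

Review the printed lines, and then run them from the elevated Command Prompt window on the head node.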
To delete vstat.exe from the nodes (optional)
- Follow the instructions from your HCA card vendor or system builder. If an unattended uninstallation program is provided, you can run a clusrun command on the head node to uninstall vstat.exe on the nodes.