This document lists the new features and changes that are available in Microsoft HPC Pack 2019 Update 3.
Operating system and software requirements
HPC Pack 2019 Update 3 has an updated set of requirements for the operating system and other prerequisite software.
Note
HPC Pack 2019 supports Windows Server 2025 starting with HPC Pack 2019 Update 3, and Windows Server 2022 starting with HPC Pack 2019 Update 1. Windows Server 2022 is supported for the head node role only with the .NET Framework 4.8 cumulative update of August 9, 2022 (KB5015733) or later applied to all cluster nodes and clients on Windows Server 2022.
Important
Starting from HPC Pack 2019 Update 3, 32-bit installations are no longer available. 32-bit SOA DLLs are still supported.
Enhancements to Job Scheduler
Initial support for Kubernetes workloads within HPC Pack
Supported head node FQDN from clients on which the head node's host name cannot be resolved - To enable client connection with the head node FQDN when the clients cannot resolve the host name of a head node in another domain, add a registry value named EnableClientFQDN with DWORD value 1 under the registry key HKLM\SOFTWARE\Microsoft\HPC on the clients.
Configurable job history auto-cleanup options - Support the following configurations for job cleanup.
- These configurations can be viewed or set by Get-HpcClusterProperty or Set-HpcClusterProperty. Please use the default values unless there is a specific issue or requirement for the job history auto-cleanup.
SchedulerDeleteOldJobsTotalTimeout // default 14400 seconds
SchedulerDeleteOldJobsDefaultCommandTimeout // default 60 seconds
SchedulerDeleteOldJobRetryInterval // default 15000 milliseconds
SchedulerDeleteOldJobsMaxBatchSize // default 2048 jobs
SchedulerDeleteOldJobsMaxTimeout // default 480 seconds
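As a reading aid for the job cleanup defaults listed above, the following Python sketch normalizes the time-based values to seconds and renders the text of a Set-HpcClusterProperty command. It assumes the `-<PropertyName> <value>` parameter form that this document uses for other cluster properties. The helper names are hypothetical, and nothing here connects to a cluster.

```python
# Defaults copied from the list above, stored as (value, unit).
CLEANUP_DEFAULTS = {
    "SchedulerDeleteOldJobsTotalTimeout": (14400, "seconds"),
    "SchedulerDeleteOldJobsDefaultCommandTimeout": (60, "seconds"),
    "SchedulerDeleteOldJobRetryInterval": (15000, "milliseconds"),
    "SchedulerDeleteOldJobsMaxBatchSize": (2048, "jobs"),
    "SchedulerDeleteOldJobsMaxTimeout": (480, "seconds"),
}


def default_in_seconds(name):
    """Return a time-based default in seconds (hypothetical helper)."""
    value, unit = CLEANUP_DEFAULTS[name]
    if unit == "seconds":
        return float(value)
    if unit == "milliseconds":
        return value / 1000.0
    raise ValueError(f"{name} is not a duration (unit: {unit})")


def render_set_command(name, value):
    """Build the command text only; assumes the -<Name> <value> form."""
    if name not in CLEANUP_DEFAULTS:
        raise KeyError(f"unknown cleanup property: {name}")
    return f"Set-HpcClusterProperty -{name} {int(value)}"


if __name__ == "__main__":
    for prop, (value, unit) in CLEANUP_DEFAULTS.items():
        print(f"{prop}: {value} {unit}")
    print(render_set_command("SchedulerDeleteOldJobsMaxBatchSize", 1024))
```

The batch size of 1024 in the usage lines is an arbitrary illustration, not a recommended value.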
Supported Windows environment configurations for Windows registry settings - To use this feature, just set the environment variable with the CCP_CONFIG_ prefix, e.g., CCP_CONFIG_CertificateValidationType.
- The following set environment command would override the cluster registry CertificateValidationType and bypass the certificate validation.
set CCP_CONFIG_CertificateValidationType=0
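The override behavior described above can be pictured with a small Python model: a value found under the CCP_CONFIG_ prefix in the environment takes precedence over the registry value of the same name. This is only a sketch of the precedence rule as stated in this document. The function name is hypothetical, the registry is represented by a plain dictionary, and the registry value "1" is illustrative.

```python
import os

PREFIX = "CCP_CONFIG_"


def resolve_setting(name, registry_values, environ=None):
    """Return the environment override if present, else the registry value.

    registry_values is a plain dict standing in for the cluster registry.
    """
    env = os.environ if environ is None else environ
    override = env.get(PREFIX + name)
    if override is not None:
        return override
    return registry_values.get(name)


if __name__ == "__main__":
    registry = {"CertificateValidationType": "1"}  # illustrative value only
    # No override set: the registry value is used.
    print(resolve_setting("CertificateValidationType", registry, environ={}))
    # Override set, as with: set CCP_CONFIG_CertificateValidationType=0
    print(resolve_setting(
        "CertificateValidationType",
        registry,
        environ={"CCP_CONFIG_CertificateValidationType": "0"},
    ))
```

Passing `environ` explicitly keeps the example independent of the real process environment. Note that a plain dictionary lookup is case-sensitive, whereas Windows environment variable names are not.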
Supported jobs packing and tasks spreading on nodes - By default, jobs are spread across nodes and tasks are packed onto nodes.
- To enable jobs packing on nodes, run the following PowerShell cmdlet and then restart HpcScheduler service on all head nodes.
Set-HpcClusterProperty -SchedulerEnvFeatureFlags 'JOB_PACKING_ON_NODE'
- To enable tasks spreading on nodes, run the following PowerShell cmdlet and then restart HpcScheduler service on all head nodes.
Set-HpcClusterProperty -SchedulerEnvFeatureFlags 'TASK_SPREADING_ON_NODE'
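The difference between packing and spreading can be illustrated with a toy placement model in Python. This is a conceptual sketch only: the function names and the one-core-per-unit assumption are hypothetical, and the code does not reproduce the HPC Pack scheduler's actual allocation logic.

```python
def place_packed(unit_count, nodes, cores_per_node):
    """Fill each node to capacity before moving to the next one."""
    placement = {node: 0 for node in nodes}
    remaining = unit_count
    for node in nodes:
        take = min(cores_per_node, remaining)
        placement[node] = take
        remaining -= take
    if remaining:
        raise ValueError("not enough cores for all units")
    return placement


def place_spread(unit_count, nodes, cores_per_node):
    """Hand out one core at a time, cycling across the nodes."""
    if unit_count > cores_per_node * len(nodes):
        raise ValueError("not enough cores for all units")
    placement = {node: 0 for node in nodes}
    for i in range(unit_count):
        placement[nodes[i % len(nodes)]] += 1
    return placement


if __name__ == "__main__":
    nodes = ["node1", "node2"]  # hypothetical node names
    print("packed:", place_packed(4, nodes, cores_per_node=4))
    print("spread:", place_spread(4, nodes, cores_per_node=4))
```

With four single-core units and two four-core nodes, packing leaves the second node empty while spreading puts two units on each node.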
Fixed job failure when the cluster property DisableResourceValidation is set to True and the nodes are removed from job's node group - The job would be requeued instead.
Fixed runaway tasks under stress
Fixed clusrun job stuck when running on Linux node with a leftover named pipe from a failed task
Fixed cluster event dispatching issue which caused a scheduler memory leak, job slowness, broker timeouts, and client event loss
Fixed task stuck in queued state due to incorrect required core computation when adding tasks after a job is submitted with task dependencies
Fixed node allocation order for tasks in a job as default packing by node names
Fixed divide-by-zero exception when viewing job cost due to zero-core nodes
Fixed the issue where a GPU job finished immediately with all tasks in the queued state
Fixed job failure when all nodes are removed from their node groups when DisableResourceValidation is set to True
Fixed a job project name cleanup bug where the SP_DeleteOldJobs stored procedure was not handling null entries in the ProjectId column properly
Replaced an index in the AllocationHistory table to increase deletion performance
Linux node support updated for latest Linux distro versions
Fixed job stuck in cancelling state due to race condition
Fixed node reservation in queue mode when MIN_MAX_ON_NODE feature is enabled
Improvements to Setup and Management
- Fixes for bursting to Azure IaaS VMs
- Fixes for bursting to Azure Batch pools
- Fixed Entra ID service principal creation error
- Fixed an authentication issue when bursting to IaaS VMs in regional Azure Cloud
- Updated API versions in Azure node template
- Supported Node Cool Down Time for auto grow and shrink on Azure - A new auto grow shrink parameter NodeCoolDownTime was added for Azure IaaS VM nodes that failed to grow.
  - You may set it to 100 minutes using the following PowerShell cmdlet. By default it is set to 10 minutes.
Set-HpcClusterProperty -NodeCoolDownTime 100
- Support for new Azure IaaS VM SKUs
- Improved logging integration with Azure Monitor
- Enhanced Azure deployment using Bicep
- Inclusion of a Log Viewer GUI tool for easier log analysis
- Improved logic for handling Service Fabric certificate keys during installation
- Fixed an issue where service versions in ServiceManifest.xml were not set properly, causing Service Fabric cluster installation failure
- Security updates for dependent libraries and applications
- Fixed node stuck in draining state due to divide by zero error when removing the node
- Fixed Azure shared image version validation
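The Node Cool Down Time setting in this section is described only briefly, so the following Python sketch shows one plausible reading of it: a node that failed to grow is not considered again until the cool-down period has elapsed. This interpretation, the class name, and the method names are assumptions made for illustration. The actual auto grow shrink behavior in HPC Pack may differ.

```python
from datetime import datetime, timedelta


class GrowCoolDown:
    """Tracks nodes that failed to grow and when they may be retried.

    Assumed behavior, not taken from HPC Pack source or documentation.
    """

    def __init__(self, cool_down_minutes=10):
        # 10 minutes mirrors the documented default for NodeCoolDownTime.
        self.cool_down = timedelta(minutes=cool_down_minutes)
        self._failed_at = {}

    def record_failure(self, node, when):
        """Remember the time at which a grow attempt failed for a node."""
        self._failed_at[node] = when

    def eligible(self, node, now):
        """Return True if the node never failed or has finished cooling down."""
        failed = self._failed_at.get(node)
        return failed is None or now - failed >= self.cool_down


if __name__ == "__main__":
    start = datetime(2024, 1, 1, 12, 0)
    tracker = GrowCoolDown(cool_down_minutes=100)
    tracker.record_failure("IaaSNode001", start)  # hypothetical node name
    print(tracker.eligible("IaaSNode001", start + timedelta(minutes=30)))
    print(tracker.eligible("IaaSNode001", start + timedelta(minutes=100)))
```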
SOA Runtime and Excel
- .NET 8 SOA service hosts available on Windows compute nodes - To enable .NET 8 SOA service hosts, follow the steps below.
  - Download and install the latest .NET 8 Runtime and ASP.NET Core 8 Runtime.
  - Copy the installers to the head node file share, e.g., \\<HeadNode>\reminst, and then run the following clusrun commands on the compute nodes.
clusrun /nodegroup:ComputeNodes \\<HeadNode>\reminst\dotnet-runtime-8.0.8-win-x64.exe /install /passive /quiet
clusrun /nodegroup:ComputeNodes \\<HeadNode>\reminst\aspnetcore-runtime-8.0.10-win-x64.exe /install /passive /quiet
  - Add or update architecture="NET64" under the service section in the service registration files to switch from .NET Framework service hosts to .NET service hosts.
  - To switch the built-in Echo service to .NET 8 service hosts, just make the following change in the CcpEchoSvc.config file and run EchoClient.exe to try it out.
<service assembly="%CCP_HOME%Net\NetEchoSvcLib.dll" architecture="NET64" ... >
- Fixed SOA session stuck with slow progress for short echo requests
- Fixed OnExit handler exception caused by race conditions under stress
- Fixed the issue where the create session async call won't be called
- Fixed the exception thrown when Excel.exe couldn't be found
- Fixed the registration of the ExcelDriver Type Library (TLB)
- Support for Excel 2021 in Excel VBA offloading
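For clusters with many service registration files, the architecture="NET64" change from the .NET 8 steps in this section could be scripted. The Python sketch below adds or updates the attribute on every `service` element of an XML document. The sample document is a simplified, hypothetical stand-in, not a real registration file. Real files may use a different layout or XML namespaces, so treat this as a starting point and back up files before editing them.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical stand-in for a service registration file.
SAMPLE = """<configuration>
  <registration>
    <service assembly="MyService.dll" />
  </registration>
</configuration>"""


def set_service_architecture(xml_text, architecture="NET64"):
    """Add or update the architecture attribute on each <service> element."""
    root = ET.fromstring(xml_text)
    # iter("service") matches un-namespaced <service> tags at any depth.
    services = list(root.iter("service"))
    if not services:
        raise ValueError("no <service> element found")
    for service in services:
        service.set("architecture", architecture)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(set_service_architecture(SAMPLE))
```

`xml.etree.ElementTree` does not preserve comments or the original formatting when it re-serializes a document, which may matter for hand-maintained configuration files.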
UI & CMD & SDK
Added SDK support for .NET Standard 2.0, available as a NuGet package
Added SDK support for Linux
Fixed the job modify API exception
Fixed the connection leak in Store API
Fixed the SOA client random crash due to System.InvalidOperationException when using the .NET SDK
Fixed HPC Cluster Manager crashes
Supported fast job commands when the previous job Id macro '!!' is not used - To enable fast job commands, just set the user environment variable CCP_NO_JOB_ID to True, e.g., setx CCP_NO_JOB_ID true
Fixed potential deadlocks when calling Wait() on ConnectAsync(SchedulerConnectionContext context, CancellationToken token)