Plan for GPU acceleration in Windows Server
Applies to: Windows Server 2022, Windows Server 2016, Microsoft Hyper-V Server 2016, Windows Server 2019, Microsoft Hyper-V Server 2019
This article introduces the graphics virtualization capabilities available in Windows Server.
When to use GPU acceleration
Whether GPU acceleration is worthwhile depends on your workload. Consider the following before you choose it:
- App and desktop remoting (VDI/DaaS) workloads: If you're building an app or desktop remoting service with Windows Server, consider the catalog of apps you expect your users to run. Some types of apps, such as CAD/CAM apps, simulation apps, games, and rendering/visualization apps, rely heavily on 3D rendering to deliver smooth and responsive interactivity. Most customers consider GPUs a necessity for a reasonable user experience with these kinds of apps.
- Remote rendering, encoding, and visualization workloads: These graphics-oriented workloads tend to rely heavily on a GPU's specialized capabilities, such as efficient 3D rendering and frame encoding/decoding, in order to achieve cost-effectiveness and throughput goals. For this kind of workload, a single GPU-enabled VM may be able to match the throughput of many CPU-only VMs.
- HPC and ML workloads: For highly data-parallel computational workloads, such as high-performance compute and machine learning model training or inference, GPUs can dramatically shorten time to result, time to inference, and training time. Alternatively, they may offer better cost-effectiveness than a CPU-only architecture at a comparable performance level. Many HPC and machine learning frameworks have an option to enable GPU acceleration; consider whether this might benefit your specific workload.
GPU virtualization in Windows Server
GPU virtualization technologies enable GPU acceleration in a virtualized environment, typically within virtual machines. If your workload is virtualized with Hyper-V, you need graphics virtualization to provide GPU acceleration from the physical GPU to your virtualized apps or services. If your workload runs directly on physical Windows Server hosts, you don't need graphics virtualization; your apps and services already have access to the GPU capabilities and APIs natively supported in Windows Server.
The following graphics virtualization technologies are available to Hyper-V VMs in Windows Server:
- Discrete Device Assignment (DDA)
- RemoteFX vGPU
In addition to VM workloads, Windows Server also supports GPU acceleration of containerized workloads within Windows Containers. For more information, see GPU Acceleration in Windows containers.
Discrete Device Assignment (DDA)
Discrete Device Assignment (DDA), also known as GPU pass-through, allows you to dedicate one or more physical GPUs to a virtual machine. In a DDA deployment, virtualized workloads run on the native driver and typically have full access to the GPU's functionality. DDA offers the highest level of app compatibility and potential performance. DDA can also provide GPU acceleration to Linux VMs, subject to support.
A DDA deployment can accelerate only a limited number of virtual machines, since each physical GPU can provide acceleration to at most one VM. If you're developing a service whose architecture supports shared virtual machines, consider hosting multiple accelerated workloads per VM. For example, if you're building a desktop remoting service with RDS, you can improve user scale by leveraging the multi-session capabilities of Windows Server to host multiple user desktops on each VM. These users will share the benefits of GPU acceleration.
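The density trade-off described above comes down to simple arithmetic. The following Python sketch is illustrative only: the function name `dda_capacity`, its parameters, and the session count in the usage example are assumptions, not figures from Windows Server documentation. It assumes each physical GPU is dedicated to exactly one VM and that every accelerated VM receives the same number of GPUs.

```python
def dda_capacity(physical_gpus, gpus_per_vm=1, sessions_per_vm=1):
    """Estimate how many VMs and user sessions a DDA host can accelerate.

    Assumption: a physical GPU serves at most one VM, so the number of
    accelerated VMs is bounded by physical_gpus // gpus_per_vm.
    """
    if physical_gpus < 0 or gpus_per_vm < 1 or sessions_per_vm < 1:
        raise ValueError("GPU count must be >= 0; per-VM values must be >= 1")

    accelerated_vms = physical_gpus // gpus_per_vm
    return {
        "accelerated_vms": accelerated_vms,
        # Multi-session hosting lets several users share one accelerated VM.
        "accelerated_sessions": accelerated_vms * sessions_per_vm,
    }


if __name__ == "__main__":
    # Hypothetical host: 4 GPUs, one per VM, 10 user sessions per VM.
    print(dda_capacity(physical_gpus=4, gpus_per_vm=1, sessions_per_vm=10))
```

With single-session VMs, the same hypothetical host accelerates only four users, which is why multi-session hosting matters for user scale in a DDA deployment.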
For more information, see these topics:
- Plan for deploying Discrete Device Assignment
- Deploy graphics devices using Discrete Device Assignment
RemoteFX vGPU
Important: Because of security concerns, RemoteFX vGPU is disabled by default on all versions of Windows starting with the July 14, 2020 Security Update and removed starting with the April 13, 2021 Security Update. To learn more, see KB 4570006.
RemoteFX vGPU is a graphics virtualization technology that allows a single physical GPU to be shared among multiple virtual machines. In a RemoteFX vGPU deployment, virtualized workloads run on Microsoft's RemoteFX 3D adapter, which coordinates GPU processing requests between the host and guests. RemoteFX vGPU is most suitable for knowledge worker and high-burst workloads where dedicated GPU resources are not required. RemoteFX vGPU can only provide GPU acceleration to Windows VMs.
Comparing DDA and RemoteFX vGPU
Consider the following functionality and support differences between graphics virtualization technologies when planning your deployment:
| Feature | RemoteFX vGPU | Discrete Device Assignment |
|---|---|---|
| GPU resource model | Dedicated or shared | Dedicated only |
| VM density | High (one or more GPUs to many VMs) | Low (one or more GPUs to one VM) |
| App compatibility | DX 11.1, OpenGL 4.4, OpenCL 1.1 | All GPU capabilities provided by vendor (DX 12, OpenGL, CUDA) |
| AVC444 | Enabled by default | Available through Group Policy |
| GPU VRAM | Up to 1 GB dedicated VRAM | Up to VRAM supported by the GPU |
| Frame rate | Up to 30 fps | Up to 60 fps |
| GPU driver in guest | RemoteFX 3D adapter display driver (Microsoft) | GPU vendor driver (NVIDIA, AMD, Intel) |
| Host OS support | Windows Server 2016 | Windows Server 2016; Windows Server 2019 |
| Guest OS support | Windows Server 2012 R2; Windows Server 2016; Windows 7 SP1; Windows 8.1; Windows 10 | Windows Server 2012 R2; Windows Server 2016; Windows Server 2019; Windows 10; Linux |
| Hypervisor | Microsoft Hyper-V | Microsoft Hyper-V |
| GPU hardware | Enterprise GPUs (such as NVIDIA Quadro/GRID or AMD FirePro) | Enterprise GPUs (such as NVIDIA Quadro/GRID or AMD FirePro) |
| Server hardware | No special requirements | Modern server, exposes IOMMU to OS (usually SR-IOV compliant hardware) |
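As a planning aid, a few rows of the comparison can be encoded as a simple check. The Python sketch below is a hypothetical helper, not part of any Microsoft tooling: the class `WorkloadRequirements`, its fields, and the function `reasons_to_prefer_dda` are names invented for illustration. It encodes only the guest OS, app compatibility, VRAM, and frame rate rows of the table.

```python
from dataclasses import dataclass
from typing import List

# Limits copied from the RemoteFX vGPU column of the comparison table.
REMOTEFX_MAX_VRAM_GB = 1
REMOTEFX_MAX_FPS = 30


@dataclass
class WorkloadRequirements:
    """Hypothetical description of a workload; field names are illustrative."""
    linux_guest: bool = False        # guest OS is Linux rather than Windows
    needs_vendor_apis: bool = False  # for example DX 12 or CUDA
    vram_gb: float = 1.0             # VRAM the workload needs per VM
    target_fps: int = 30             # frame rate the workload should sustain


def reasons_to_prefer_dda(req: WorkloadRequirements) -> List[str]:
    """List the table rows where the workload exceeds RemoteFX vGPU limits."""
    reasons = []
    if req.linux_guest:
        reasons.append("Guest OS support: only DDA lists Linux guests")
    if req.needs_vendor_apis:
        reasons.append("App compatibility: DX 12 and CUDA need the vendor driver")
    if req.vram_gb > REMOTEFX_MAX_VRAM_GB:
        reasons.append("GPU VRAM: RemoteFX vGPU offers up to 1 GB")
    if req.target_fps > REMOTEFX_MAX_FPS:
        reasons.append("Frame rate: RemoteFX vGPU offers up to 30 fps")
    return reasons


if __name__ == "__main__":
    # Hypothetical CAD workload used only to exercise the helper.
    cad = WorkloadRequirements(needs_vendor_apis=True, vram_gb=8, target_fps=60)
    for reason in reasons_to_prefer_dda(cad):
        print("-", reason)
```

An empty result means none of the encoded rows rules out RemoteFX vGPU on its own. It doesn't account for host OS support, hardware requirements, or the removal of RemoteFX vGPU by the security updates noted earlier.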