GPU virtual address

2024-12-19

This article describes GPU virtual address (GPUVA) concepts and how they're managed starting with WDDM 2.0 (Windows 10).

GPUVAs are managed in logical 4-KB or 64-KB pages at the device driver interface (DDI) level. Using these page sizes allows GPUVAs to reference either:

System memory, which is always allocated at a 4-KB granularity.
Memory segment pages, which can be managed at either 4-KB or 64-KB granularity.

The video memory manager (VidMm) supports a multilevel virtual address translation scheme, where several levels of page tables are used to translate a virtual address:

The levels are numbered from zero. Level zero is assigned to the leaf level.
Translation starts from the root level page table.

When the number of page table levels is two, the root level page table can be resized to accommodate a process with variable GPUVA space size. Every level is described by the DXGK_PAGE_TABLE_LEVEL_DESC structure, which the kernel-mode display driver (KMD) fills in during a DxgkDdiQueryAdapterInfo call. The KMD also fills out the DXGK_GPUMMUCAPS structure to describe its GPUVA support.

Each process has its own GPUVA space. Before a graphics context of a process can be scheduled for execution, VidMm calls the KMD's DxgkDdiSetRootPageTable function to set the root page table address.

The virtual address translation for the case of two page table levels is shown in the following diagram.

Diagram that shows virtual address translation for two page table levels.

The GPUVA has DXGK_GPUMMUCAPS::VirtualAddressBitCount bits.
The low bits [0 - 11] represent an offset in bytes in a page.
The next DXGK_PAGE_TABLE_LEVEL_DESC::PageTableIndexBitCount bits represent the index of a page table entry in a leaf level page table.
The number of entries in a page table is 2^{DXGK_PAGE_TABLE_LEVEL_DESC::PageTableIndexBitCount} and the page table size is DXGK_PAGE_TABLE_LEVEL_DESC::PageTableSizeInBytes bytes.
The rest of the bits represent an index to a page table entry in the root page table. The root page table is resizable for the two-level translation scheme. The DxgkDdiGetRootPageTableSize DDI obtains its size.
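The bit layout described above can be sketched in C. This is only an illustration for the two-level case with 4-KB pages: the gpuva_fields type and the split_gpuva helper are hypothetical names, and the va_bits and leaf_bits parameters stand in for DXGK_GPUMMUCAPS::VirtualAddressBitCount and the leaf level's DXGK_PAGE_TABLE_LEVEL_DESC::PageTableIndexBitCount.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the three fields of a two-level GPUVA. */
typedef struct {
    uint64_t root_index;   /* index of the entry in the root page table */
    uint64_t leaf_index;   /* index of the entry in the leaf (level 0) page table */
    uint64_t page_offset;  /* byte offset inside the 4-KB page */
} gpuva_fields;

/* Splits a GPUVA into its fields.
 * va_bits   - assumed to mirror DXGK_GPUMMUCAPS::VirtualAddressBitCount
 * leaf_bits - assumed to mirror the leaf level's PageTableIndexBitCount */
static gpuva_fields split_gpuva(uint64_t va, unsigned va_bits, unsigned leaf_bits)
{
    const unsigned offset_bits = 12;  /* bits [0 - 11] address a byte in the page */
    gpuva_fields f;

    /* The root index needs at least one bit above the offset and leaf index. */
    assert(va_bits <= 64 && offset_bits + leaf_bits < va_bits);

    if (va_bits < 64)
        va &= (UINT64_C(1) << va_bits) - 1;  /* keep only the valid address bits */

    f.page_offset = va & ((UINT64_C(1) << offset_bits) - 1);
    f.leaf_index  = (va >> offset_bits) & ((UINT64_C(1) << leaf_bits) - 1);
    f.root_index  = va >> (offset_bits + leaf_bits);  /* the rest of the bits */
    return f;
}
```

For example, with a 40-bit address and 10 leaf index bits, each leaf page table has 1,024 entries and covers 4 MB of address space, so the root index selects which 4-MB region the address falls in.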

The DXGK_PTE structure is used through the DDI to represent a page table entry. This structure represents information about each entry, which the DirectX graphics kernel (Dxgkrnl) manages. The driver uses this information to build hardware-specific page table entries.

Creation of page table allocations

Page tables are created as implicit allocations and don't have a user-mode driver (UMD) or a KMD handle.

To allocate a page table, VidMm creates an allocation of DXGK_PAGE_TABLE_LEVEL_DESC::PageTableSizeInBytes bytes in the segment specified by DXGK_PAGE_TABLE_LEVEL_DESC::PageTableSegmentId. After creation, VidMm initializes every entry in the page table to invalid. Page tables never change size, except for the root page table in the two-level translation scheme.

VidMm supports resizing the root page table in the two-level translation scheme. When creating a root page table that covers a specified amount of address space, VidMm calls DxgkDdiGetRootPageTableSize to determine the required allocation size. VidMm then creates an allocation of that size in the segment specified by DXGK_PAGE_TABLE_LEVEL_DESC::PageTableSegmentId for the root level. After creation, VidMm initializes every entry in the page table to invalid using the new UpdatePageTable paging operation. The root page table can grow or shrink as the amount of video address space that a process needs grows and shrinks. Once the root page table is created, VidMm calls DxgkDdiSetRootPageTable to associate it with the contexts that will execute within that address space.

In linked display adapter configurations, root page tables are created as LinkMirrored allocations. These allocations have identical content and are located at the same physical address on each GPU in the link. Lower level page tables are created as LinkInstanced allocations to reflect the fact that their content can vary between GPUs, typically because of different peer mappings. The content of page tables is updated separately on each GPU.

Growing and shrinking a root page table

This section is applicable only for systems with two levels of page tables. When the number of page table levels is greater than two, the page table size for each level is defined by the virtual addressing caps and is fixed.

When the UMD requests GPUVAs, VidMm grows the size of the address space of a process to accommodate the request. It does so by growing the size of the current root page table (if necessary) and allocating new page tables for the new range.

To grow a root page table, VidMm creates another root page table allocation, makes it resident, initializes its entries, and destroys the old allocation. The DxgkDdiGetRootPageTableSize function is used to get the size of the new page table in bytes.

To shrink a root page table, VidMm creates a new page table allocation, makes it resident, copies a portion of the old page table to the new one, and destroys the old allocation.

After the resize operation completes, VidMm calls DxgkDdiSetRootPageTable to associate the impacted contexts with their new root page table.
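As a rough illustration of why the root page table has to grow with the address space, the following C sketch estimates how many root entries a given amount of GPUVA space needs. It assumes the two-level scheme with 4-KB leaf pages. The root_entries_needed helper and its parameters are hypothetical. The actual allocation size in bytes is whatever the driver returns from DxgkDdiGetRootPageTableSize, which this sketch doesn't model.

```c
#include <assert.h>
#include <stdint.h>

/* Estimates the number of root page table entries needed to cover
 * space_bytes of GPUVA space. Each root entry references one leaf page
 * table with 2^leaf_bits entries, and each leaf entry maps one 4-KB page.
 * leaf_bits is assumed to mirror the leaf level's PageTableIndexBitCount. */
static uint64_t root_entries_needed(uint64_t space_bytes, unsigned leaf_bits)
{
    uint64_t bytes_per_root_entry;

    assert(leaf_bits < 52);  /* keeps the shift below within 64 bits */
    bytes_per_root_entry = UINT64_C(4096) << leaf_bits;

    /* Round up without overflowing for very large address spaces. */
    return space_bytes / bytes_per_root_entry
         + (space_bytes % bytes_per_root_entry != 0 ? 1 : 0);
}
```

With 10 leaf index bits, each root entry covers 4 MB, so a process that grows from 4 MB to 1 GB of address space goes from needing 1 root entry to needing 256.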

Updating page tables

As surfaces move around in memory, VidMm updates the content of page tables to reflect the new location of surfaces.

Moving a page table

VidMm can relocate or evict page tables when a device is idle or suspended. When VidMm moves a page table, it updates the higher-level page table to reference the new location of the page table.

When the root page table itself is relocated, VidMm calls DxgkDdiSetRootPageTable to inform impacted contexts of the new location of their page directory.

Physical page size

As mentioned previously, VidMm supports two page sizes. System memory is always managed in 4-KB pages, while memory segments can be managed at either 4-KB or 64-KB granularity, as determined by the KMD.

When the KMD opts for virtual memory to be managed in 64-KB pages, all allocations are automatically aligned to 64 KB and sized to a multiple of 64 KB.

Expanding all allocations to a multiple of 64 KB can significantly increase memory consumption. The UMD is responsible for packing small allocations into a larger one to avoid wasting memory.
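The cost of 64-KB rounding is easy to quantify. The following C sketch uses two hypothetical helpers, round_up_64kb and padding_64kb, to show how much padding an allocation picks up when its size is expanded to a multiple of 64 KB. It illustrates only the arithmetic, not how VidMm implements it.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_64KB UINT64_C(0x10000)

/* Rounds size up to the next multiple of 64 KB.
 * Assumes size is small enough that adding 64 KB - 1 doesn't overflow. */
static uint64_t round_up_64kb(uint64_t size)
{
    return (size + PAGE_64KB - 1) & ~(PAGE_64KB - 1);
}

/* Returns the number of padding bytes added by the rounding. */
static uint64_t padding_64kb(uint64_t size)
{
    return round_up_64kb(size) - size;
}
```

A 4-KB allocation, for instance, is padded by 60 KB, so sixteen separate 4-KB allocations would occupy 1 MB, whereas the same data packed into a single allocation fits in 64 KB.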

When VidMm maps a GPUVA to a large 64-KB memory segment page, it maps 4-KB page table entries to 16 contiguous 4-KB pages in the memory segment. Both the virtual address and the physical address are guaranteed to share the same 64-KB alignment. That is, the bottom 16 bits of the virtual address and the physical address are guaranteed to match.
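The alignment guarantee can be checked with a small model. In the following C sketch, a 64-KB memory segment page is represented by 16 entries that each hold the physical address of one 4-KB page. The map_large_page and translate helpers are hypothetical, and the plain address array is a simplification that doesn't reflect the DXGK_PTE layout or any hardware page table format.

```c
#include <assert.h>
#include <stdint.h>

#define SMALL_PAGE UINT64_C(0x1000)    /* 4 KB */
#define LARGE_PAGE UINT64_C(0x10000)   /* 64 KB */
#define PTES_PER_LARGE_PAGE 16         /* 64 KB / 4 KB */

/* Fills pte[] with the physical addresses of the 16 contiguous 4-KB pages
 * that make up one 64-KB page starting at phys_base. */
static void map_large_page(uint64_t phys_base, uint64_t pte[PTES_PER_LARGE_PAGE])
{
    unsigned i;

    assert(phys_base % LARGE_PAGE == 0);  /* physical side is 64-KB aligned */
    for (i = 0; i < PTES_PER_LARGE_PAGE; i++)
        pte[i] = phys_base + i * SMALL_PAGE;
}

/* Translates va, which must lie in the 64-KB range starting at va_base. */
static uint64_t translate(uint64_t va, uint64_t va_base,
                          const uint64_t pte[PTES_PER_LARGE_PAGE])
{
    uint64_t delta;

    assert(va_base % LARGE_PAGE == 0);    /* virtual side is 64-KB aligned */
    assert(va >= va_base && va - va_base < LARGE_PAGE);
    delta = va - va_base;
    return pte[delta / SMALL_PAGE] + delta % SMALL_PAGE;
}
```

Because both base addresses are multiples of 64 KB, the translation changes only the bits above bit 15, which is why the bottom 16 bits of the virtual and physical addresses match.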