Supporting dGPUs on Modern Standby Systems

Overview

Discrete GPUs (dGPUs) present a challenging problem for Modern Standby designs because they strain the platform in both power consumption and resume latency. This is a fundamental power versus performance tradeoff. If the dGPU keeps its contents in VRAM (either by staying active in D0 or by enabling VRAM self-refresh in D3), it consumes additional power, which is a problem in the face of upcoming power regulations. However, if the dGPU enters D3 and fully offloads its contents so that VRAM can be powered down, the system can suffer extended resume latency, because restoring VRAM contents from main memory can take several seconds. This tradeoff must be balanced to ease adoption of Modern Standby on systems with dGPUs and to provide the best possible user experience. This document explains the dGPU problem in detail and outlines guidance for supporting dGPUs in Modern Standby systems.

Current dGPU Support

Supporting dGPUs can be considered in two separate categories: (1) add-in (plug-in) dGPU cards and (2) soldered-down dGPUs. The following summary outlines the situation and guidance for each scenario, and the remaining sections expand on the information and requirements it highlights.

As the ecosystem works towards better dGPU power and performance:

Plug-in dGPUs: dGPUs that are plugged into an open PCIe slot, either in a shipping configuration or added independently by an end user. This applies to any system with an open PCIe slot capable of supporting an add-in dGPU card.

  • Systems must implement the PCI ECN to allow the dGPU to enter its D3cold state.
  • The dGPU must be capable of entering D3cold.
  • Self-refresh support in the dGPU is optional. If supported, the system provides a better user experience by delivering the “Instant On” capability, but power regulations should be considered.
  • The OS uses the dGPU’s self-refresh capability only when more than 300 MB of VRAM is in use.
  • If the dGPU is not using self-refresh (because it lacks support or the feature is disabled in the graphics driver), Microsoft’s Directed PoFx (DFx) framework forces the dGPU out of D0 after 2 minutes of idle, and VRAM is offloaded.
  • Systems with dGPUs are exempt from the HLK requirement of a 1-second resume latency from Modern Standby, because restoring offloaded VRAM adds latency.

Soldered-down designs: This covers high-end notebooks and all-in-ones that ship with a dGPU, as well as hybrid systems that use both an integrated and a discrete GPU.

  • The dGPU must be capable of entering D3cold.
  • Self-refresh support in the dGPU is recommended—if supported, the system provides a better user experience by delivering the “Instant On” capability, but the system must be carefully engineered to meet power regulations.
  • The OS uses the dGPU’s self-refresh capability only when more than 300 MB of VRAM is in use.
  • If the dGPU is not using self-refresh (because it lacks support or the feature is disabled in the kernel-mode graphics driver), Microsoft’s Directed PoFx (DFx) framework forces the dGPU out of D0 after 2 minutes of idle, and VRAM is offloaded.
  • Systems with dGPUs are exempt from the HLK requirement of a 1-second resume latency from Modern Standby, because restoring offloaded VRAM adds latency.

In summary, two main points apply to dGPU support in Modern Standby designs:

  • System manufacturers should first ensure their systems meet power regulations, and then optimize for the best possible user experience with respect to resume time from Modern Standby.
  • The ecosystem is moving towards self-refresh VRAM being a requirement for maintaining a great user experience with lower dGPU power consumption.
    • It is in the ecosystem’s best interest to invest in improving GDDR power consumption.

dGPU VRAM Self-Refresh Behavior

This section discusses the current heuristics for dGPU self-refresh behavior. System designers should take these into account when evaluating system behavior and performance, because the outcome depends on the scenario: specifically, on the dGPU’s self-refresh capability and on how much content is held in VRAM.

Beginning with Windows 10, the operating system decides when to use self-refresh based on VRAM usage. If VRAM is relatively empty when the system enters sleep, VRAM is powered off without using self-refresh. Otherwise, self-refresh is used. The threshold for this behavior is currently defined as 300 MB of contents in VRAM and may be further optimized in the future. The current self-refresh heuristics are as follows:

dGPU with self-refresh VRAM support

  • Entering Modern Standby with <= 300 MB of VRAM in use:
    • VRAM is evicted
    • D3cold with VRAM off
  • Entering Modern Standby with > 300 MB of VRAM in use:
    • VRAM is preserved
    • D3cold with VRAM in self-refresh

dGPU without self-refresh VRAM

  • Entering Modern Standby with <= 300 MB of VRAM in use:
    • VRAM is evicted
    • D3cold with VRAM off
  • Entering Modern Standby with > 300 MB of VRAM in use:
    • VRAM is preserved
    • dGPU stays in D0
    • Microsoft’s Directed PoFx (DFx) framework will force the dGPU into D3cold and evict VRAM after 2 minutes
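The heuristics above can be summarized as a small decision function. The following is a minimal sketch for reasoning about expected behavior, not the operating system's implementation: `StandbyPlan`, `standby_plan`, and the constant names are hypothetical, and only the 300 MB threshold and the 2-minute DFx timeout come from the text above.

```python
from dataclasses import dataclass
from typing import Optional

# Values taken from the text above; the names themselves are hypothetical.
SELF_REFRESH_THRESHOLD_MB = 300  # VRAM usage threshold for using self-refresh
DFX_IDLE_TIMEOUT_S = 120         # DFx forces D3cold after 2 minutes of idle


@dataclass(frozen=True)
class StandbyPlan:
    initial_state: str             # dGPU state on entering Modern Standby
    vram_preserved: bool           # True if VRAM contents are kept at entry
    dfx_timeout_s: Optional[int]   # seconds until DFx forces D3cold, if any


def standby_plan(vram_in_use_mb: float, supports_self_refresh: bool) -> StandbyPlan:
    """Return the expected dGPU behavior when entering Modern Standby."""
    if vram_in_use_mb <= SELF_REFRESH_THRESHOLD_MB:
        # Little content in VRAM: evict it and power VRAM off.
        return StandbyPlan("D3cold, VRAM off", False, None)
    if supports_self_refresh:
        # Enough content to be worth preserving, and the hardware can do so.
        return StandbyPlan("D3cold, VRAM in self-refresh", True, None)
    # No self-refresh: the dGPU stays in D0 until DFx forces it down,
    # at which point VRAM is evicted.
    return StandbyPlan("D0", True, DFX_IDLE_TIMEOUT_S)
```

For example, `standby_plan(1024, supports_self_refresh=False)` describes a dGPU that stays in D0 with its VRAM intact and is forced into D3cold by DFx after 120 seconds.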

Add-in dGPU card solution

This section outlines, at a high level, the requirements for supporting an add-in dGPU card in a Modern Standby desktop system design. For implementation specifics, refer to hardware vendor guidance.

A complete solution for a Modern Standby desktop system that supports an add-in dGPU card must meet a few key requirements. These requirements span dGPU design, dGPU drivers, motherboard design, and firmware implementation.

Requirement: The BIOS implements the _DSM specified in the attached PCI ECN

  • Description: These functions allow the PCIe device driver to negotiate with the platform for the auxiliary power the dGPU card needs to support self-refresh in D3cold.
  • Resources: _DSM Additions for Runtime Device Power Management

Requirement: The SoC can provide up to 1 A of auxiliary power to the PCIe slot

  • Description: This is specified in the ECN to the PCI specification. Self-refresh power consumption varies by card, up to 1 A, so the SoC must be able to supply the auxiliary power each card needs.
  • Resources: Refer to hardware vendor's implementation guidance

Requirement: The dGPU card supports self-refresh VRAM and D3cold

  • Description: Self-refresh VRAM allows the dGPU card to enter D3cold while still maintaining its memory contents for a short resume latency. This is essential to the Modern Standby promise of an “Instant On” experience.
  • Resources: Refer to dGPU designer’s implementation guidance
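To illustrate the auxiliary power requirement, the sketch below checks whether a card's self-refresh current fits within what the slot can supply. It is a simplified model under stated assumptions, not the _DSM interface defined in the ECN: `can_self_refresh_in_d3cold` and its parameters are hypothetical, and only the 1 A upper bound comes from the text above.

```python
# Upper bound on slot auxiliary current, taken from the text above (1 A).
MAX_SLOT_AUX_CURRENT_MA = 1000


def can_self_refresh_in_d3cold(card_required_ma: int, platform_available_ma: int) -> bool:
    """Return True if the platform can power the card's VRAM self-refresh.

    card_required_ma: auxiliary current the card needs for self-refresh
        (hypothetical input; real values come from the dGPU vendor).
    platform_available_ma: auxiliary current the platform can supply to the
        slot (hypothetical input; real values come from the platform design).
    """
    if card_required_ma < 0 or platform_available_ma < 0:
        raise ValueError("current values must be non-negative")
    # The slot never supplies more than the 1 A bound, whatever the platform has.
    grantable_ma = min(platform_available_ma, MAX_SLOT_AUX_CURRENT_MA)
    return card_required_ma <= grantable_ma
```

In this model, a card that needs 800 mA is supported on a platform that can supply the full 1 A, while a platform limited to 500 mA would leave that card unable to keep VRAM in self-refresh while in D3cold.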