Direct3D 11 lets you use compute shaders on most Direct3D 10.x hardware, with some limitations. Compute shader technology is also known as DirectCompute. This topic discusses how to use compute shaders in a Direct3D 11 app on Direct3D 10.x hardware.
Support for compute shaders on downlevel hardware is only for devices compatible with Direct3D 10.x. Compute shaders cannot be used on Direct3D 9.x hardware.
To check whether Direct3D 10.x hardware supports compute shaders, call ID3D11Device::CheckFeatureSupport. In the call, pass the D3D11_FEATURE_D3D10_X_HARDWARE_OPTIONS value to the Feature parameter, a pointer to a D3D11_FEATURE_DATA_D3D10_X_HARDWARE_OPTIONS structure to the pFeatureSupportData parameter, and the size of that structure to the FeatureSupportDataSize parameter. If the Direct3D 10.x hardware supports compute shaders, CheckFeatureSupport sets the ComputeShaders_Plus_RawAndStructuredBuffers_Via_Shader_4_x member of D3D11_FEATURE_DATA_D3D10_X_HARDWARE_OPTIONS to TRUE.
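The call described above might look like the following minimal sketch. The helper name SupportsDownlevelComputeShaders and the null-pointer check are assumptions made for illustration; the enumeration value, structure, and member names are the ones named in the text. The `#ifdef _WIN32` guard is only there so the snippet still compiles where the Windows SDK header is unavailable.

```cpp
#ifdef _WIN32  // d3d11.h ships with the Windows SDK only
#include <d3d11.h>

// Hypothetical helper (not part of the Direct3D API): returns true when the
// device reports compute shader support on Direct3D 10.x hardware.
bool SupportsDownlevelComputeShaders(ID3D11Device* device)
{
    if (device == nullptr)
        return false;  // assumption: treat a missing device as "not supported"

    D3D11_FEATURE_DATA_D3D10_X_HARDWARE_OPTIONS options = {};
    const HRESULT hr = device->CheckFeatureSupport(
        D3D11_FEATURE_D3D10_X_HARDWARE_OPTIONS,  // Feature
        &options,                                // pFeatureSupportData
        sizeof(options));                        // FeatureSupportDataSize

    // The member is a BOOL that the runtime fills in on success.
    return SUCCEEDED(hr) &&
           options.ComputeShaders_Plus_RawAndStructuredBuffers_Via_Shader_4_x != FALSE;
}
#endif  // _WIN32
```

An app would typically call such a helper once after device creation and fall back to a non-compute path when it returns false.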
Raw (RWByteAddressBuffer) and Structured (RWStructuredBuffer) Unordered Access Views are supported on downlevel hardware, with the following limitations:
Pixel Shaders on downlevel hardware do not support unordered access.
Shader Resource Views (SRVs)
Raw and Structured Buffers as Shader Resource Views are supported on downlevel hardware for read-only access, as they are on Direct3D 11 hardware. These resource types are supported for Vertex Shaders, Geometry Shaders, Pixel Shaders as well as Compute Shaders.
Thread Groups
A compute shader can execute on many threads in parallel, within a thread group.
Thread groups are supported on downlevel hardware, with the following limitations:
Thread Group Dimensions
Thread groups defined for downlevel hardware are limited to X and Y dimensions of 768. This is less than the maximum values of 1024 for Direct3D 11 hardware. The maximum Z dimension of 64 is unchanged.
The total number of threads in the group (X × Y × Z) is limited to 768. This is less than the limit of 1024 for Direct3D 11 hardware.
If these numbers are exceeded, shader compilation will fail.
Two-Dimensional Thread Indices
A particular thread within a thread group is indexed using a 3D vector given by (x,y,z).
For compute shaders operating on downlevel hardware, thread groups only support two dimensions. This means that the Z value in the 3D vector must always be 1.
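As a rough illustration, the thread group limits stated so far can be checked on the CPU before a shader is compiled. This is a sketch only: the constant and function names are hypothetical, and the numbers are the ones quoted in this topic (X and Y at most 768, at most 768 threads in total, Z equal to 1).

```cpp
#include <cassert>
#include <cstdint>

// Limits quoted in this topic for Direct3D 10.x (downlevel) hardware.
constexpr std::uint32_t kMaxDownlevelGroupX = 768;
constexpr std::uint32_t kMaxDownlevelGroupY = 768;
constexpr std::uint32_t kMaxDownlevelThreadsPerGroup = 768;

// Hypothetical helper: true if a thread group of size (x, y, z) respects the
// downlevel limits described in the text.
constexpr bool IsValidDownlevelThreadGroup(std::uint32_t x, std::uint32_t y, std::uint32_t z)
{
    if (x == 0 || y == 0)
        return false;  // a group needs at least one thread per dimension
    if (z != 1)
        return false;  // downlevel thread groups are two-dimensional
    if (x > kMaxDownlevelGroupX || y > kMaxDownlevelGroupY)
        return false;  // per-dimension limit
    return x * y <= kMaxDownlevelThreadsPerGroup;  // total thread limit
}

// Hypothetical helper: thread count of a group that was already validated.
inline std::uint32_t DownlevelThreadCount(std::uint32_t x, std::uint32_t y)
{
    assert(IsValidDownlevelThreadGroup(x, y, 1));
    return x * y;
}

// Compile-time examples.
static_assert(IsValidDownlevelThreadGroup(32, 24, 1), "32 x 24 = 768 threads fits");
static_assert(!IsValidDownlevelThreadGroup(32, 32, 1), "1024 threads exceeds the downlevel limit");
static_assert(!IsValidDownlevelThreadGroup(16, 16, 2), "Z must be 1 on downlevel hardware");
```

A check like this does not replace the compiler's own validation; it only lets an app reject an unsuitable configuration early.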
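This limitation applies specifically to the numthreads attribute in HLSL, whose Z value must be 1, and to ID3D11DeviceContext::Dispatch, whose ThreadGroupCountZ argument must be 1.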
Thread Group Shared Memory (TGSM) is limited to 16 KB on downlevel hardware. This is less than the 32 KB that is available to Direct3D 11 hardware.
A Compute Shader thread may only write to its own region of TGSM. This writable region is at most 256 bytes, and the maximum decreases as the number of threads declared for the group increases.
The following table defines the per-thread maximum size of a TGSM region for the number of threads in the group:
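| Number of Threads in Group | Maximum TGSM Size Per Thread (bytes) |
|---|---|
| 0-64 | 256 |
| 65-68 | 240 |
| 69-72 | 224 |
| 73-76 | 208 |
| 77-84 | 192 |
| 85-92 | 176 |
| 93-100 | 160 |
| 101-112 | 144 |
| 113-128 | 128 |
| 129-144 | 112 |
| 145-168 | 96 |
| 169-204 | 80 |
| 205-256 | 64 |
| 257-340 | 48 |
| 341-512 | 32 |
| 513-768 | 16 |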
A Compute Shader thread may read the TGSM from any location.
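The per-thread limits from the table can be encoded as a small lookup, for example to validate a shader configuration before compiling it. This is a sketch under stated assumptions: the type and function names are hypothetical, the rows are transcribed from the table above, and returning 0 for more than 768 threads is a convention chosen here, not something the API defines.

```cpp
#include <cassert>
#include <cstdint>

struct TgsmLimit {
    std::uint32_t maxThreads;      // inclusive upper bound of the thread-count range
    std::uint32_t bytesPerThread;  // maximum writable TGSM region per thread
};

// Rows transcribed from the table in this topic, in ascending order.
constexpr TgsmLimit kDownlevelTgsmLimits[] = {
    {64, 256},  {68, 240},  {72, 224},  {76, 208},
    {84, 192},  {92, 176},  {100, 160}, {112, 144},
    {128, 128}, {144, 112}, {168, 96},  {204, 80},
    {256, 64},  {340, 48},  {512, 32},  {768, 16},
};

// Hypothetical helper: maximum bytes of TGSM a single thread may write on
// downlevel hardware, or 0 if the group has more than 768 threads.
constexpr std::uint32_t MaxWritableTgsmBytesPerThread(std::uint32_t threadsInGroup)
{
    for (const TgsmLimit& row : kDownlevelTgsmLimits) {
        if (threadsInGroup <= row.maxThreads)
            return row.bytesPerThread;
    }
    return 0;  // assumption: signal "not valid on downlevel hardware" with 0
}

// Hypothetical helper: bytes the whole group can write if every thread uses
// its full region.
inline std::uint32_t MaxWritableTgsmBytesPerGroup(std::uint32_t threadsInGroup)
{
    const std::uint32_t total =
        threadsInGroup * MaxWritableTgsmBytesPerThread(threadsInGroup);
    assert(total <= 16u * 1024u);  // stays within the 16 KB downlevel TGSM size
    return total;
}

static_assert(MaxWritableTgsmBytesPerThread(64) == 256, "first table row");
static_assert(MaxWritableTgsmBytesPerThread(768) == 16, "last table row");
```

A table lookup is used rather than a closed-form expression because the topic only gives the ranges, not a formula behind them.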
D3DCompile with D3DCOMPILE_SKIP_OPTIMIZATION
D3DCompile returns E_NOTIMPL when you pass cs_4_0 as the shader target along with the D3DCOMPILE_SKIP_OPTIMIZATION compile option. The cs_5_0 shader target works with D3DCOMPILE_SKIP_OPTIMIZATION.