WGF11 Interfaces
This automated test verifies different parts of the hardware and the driver for valid shader execution when interfaces are used.
The test focuses on covering hidden buffer information that is provided to the driver through the DDI, which includes interface types and resource locations. Some information that is used in interfaces is embedded in the shader itself, such as the function tables and class type tables. The hardware is only required to correctly call and index these tables, because the runtime does all the management required to keep the information organized and mapped correctly.
The test only executes valid scenarios and verifies results are successful. No failures are expected to occur on certifiable D3D11 hardware.
The test logs to WTT, as previous conformance tests have, and the logs contain useful information about failures to help IHVs debug them.
This topic applies to the following test jobs:
WGF11 Interfaces
WGF11 Interfaces (WoW64)
Test details
Associated requirements | Device.Graphics.AdapterRender.D3D111Core.D3D111CorePrimary, Device.Graphics.AdapterRender.D3D11Core.D3D11CorePrimary |
Platforms | Windows 7 (x64), Windows 7 (x86), Windows RT (ARM-based), Windows 8 (x64), Windows 8 (x86), Windows Server 2012 (x64), Windows Server 2008 R2 (x64), Windows RT 8.1, Windows 8.1 x64, Windows 8.1 x86, Windows Server 2012 R2 |
Expected run time | ~2 minutes |
Categories | Certification Functional |
Type | Automated |
Running the test
Before you run the test, complete the test setup as described in the test requirements: Graphic Adapter or Chipset Testing Prerequisites.
Troubleshooting
For troubleshooting information, see Troubleshooting Device.Graphics Testing.
The test returns SKIP when run on feature level < 11.
More information
WGF11Interfaces - Interface Shader Execution
The WGF11Interfaces tests cover all aspects of data transfer to the driver, along with correct execution of shader IL.
A description of each test group and its command-line parameter follows in this section. Whole shaders are not provided in this documentation. Instead, each shader's intended goal and the types of its inputs are described to provide information about how to test in Windows Hardware Quality Labs (WHQL). In addition, each test is run on all shader stages to verify that the feature behaves consistently and adheres to the unified architecture goals.
The tests use ints and uints as inputs and during calculation wherever possible, because precision and verification of floating point math is covered in a different conformance test.
Tests that use samplers perform point sampling and use border color to verify whether the correct sampler is used. Filtering and other aspects of sampler coverage are covered in a different conformance test. Interface testing is only concerned with the correct indexing of samplers that are used by class instances during execution.
Tests that use resources focus on formats with 8-bit channels and no MIP levels. Other tests verify resource correctness. Interface testing is only concerned with the correct indexing of textures that are used by class instances during execution. Only resource loads are used. Because they cannot be indexed, UAVs are not important for interfaces.
Tests are run against every shader stage, because the feature is supposed to fit within the unified architecture of Shader Model 5.0.
Each test has a Pri-1 task and a Pri-2 task, which, when combined, complete coverage of the feature. Pri-1 tasks require a test to run only on a specific shader stage. Pri-2 tasks test the remaining shader stages.
All instances are created by the runtime using the following API calls:
HRESULT CreateClassInstance(
    LPCSTR pszClassTypeName,
    UINT ConstantBufferOffset,
    UINT ConstantVectorOffset,
    UINT TextureOffset,
    UINT SamplerOffset,
    ID3D11ClassInstance **ppInstance);
Instances are set when the shader is set by using XXSetShader() calls.
WGF11Interfaces.exe Interfaces\FunctionTables and fcall\[PS]
Pri-1 16 hours
The large function table test verifies whether hardware is able to manage the shader programs output by the compiler. This verification is specific to shaders whose many cascading interface calls result in large amounts of generated code with many function bodies. This test does not test the performance of such shaders, but tests whether the execution is correct when compared against the reference rasterizer.
Several shaders are written, each doubling the number of function bodies that the compiler creates relative to the previous shader. These shaders are then run with several variations on the instances that fill the slots, in order to verify correct execution through a subset of code paths. At any time, the test may attempt to validate all code paths, and it is expected that hardware will not fail on any of them. If the hardware returns out-of-memory during shader creation time, the test returns RESULT_SKIP, when reasonable. The resource requirements of these shaders should not push the limits of hardware. As such, the shader should bind and execute without error. If hardware fails on even small function tables, a warning occurs. The hardware should support a minimum of 4K function bodies.
The cascading functions are designed by using several interface types, each of which relies on an object of another instance for its parameter. These calls are designed to be several layers deep and may easily result in more than 1k function bodies being created.
The test also provides coverage of the fcall instruction by extensively calling other functions to get appropriate coverage of the 4K function bodies in the shader. This can be done by varying the instances that the runtime binds through XXSetShader. The framework provides a simple way to get adequate coverage through permutations of bound types.
To ensure that the test does not need maintenance due to changes in the compiler, each function body must differ from every other function body. This is because the compiler may at some point collapse function bodies with identical code to reduce code size and provide a more optimized function table. Ensuring that all function bodies are different prevents the test from needing changes or becoming obsolete when this optimization is added to the compiler. It is important to review the IL to make sure that the expected code is produced.
Example:
interface Type0
{
    float2 func( float x );
};
interface Type1
{
    float4 func( const Type0 param, inout float2 y );
};
interface Type2
{
    float func( Type0 param0, Type1 param1 );
};
interface Type3
{
    uint func( out float z, Type0 param0, Type1 param1, Type2 param2 );
};
class AT0 : Type0; class BT0 : Type0; class CT0 : Type0; class DT0 : Type0; class ET0 : Type0; class FT0 : Type0;
class AT1 : Type1; class BT1 : Type1; class CT1 : Type1; class DT1 : Type1;
class AT2 : Type2; class BT2 : Type2; class CT2 : Type2; class DT2 : Type2; class ET2 : Type2;
class AT3 : Type3; class BT3 : Type3; class CT3 : Type3; class DT3 : Type3; class ET3 : Type3; class FT3 : Type3;
If the interface for Type3 is called, and each implementation calls the interface method of an input parameter, 720 function bodies are created by all possible code paths, each one optimized for its particular code path. Because there is no limitation to code size other than memory, even very large shaders should execute without error.
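The 720 figure follows from the example declarations: Type0 has six implementations, Type1 has four, Type2 has five, and Type3 has six, and 6 × 4 × 5 × 6 = 720. The following C++ sketch reproduces that count, assuming every combination of implementations is reachable and receives its own specialized body. The helper name `countFunctionBodies` is hypothetical and is not part of the test.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: upper bound on specialized function bodies when every
// combination of interface implementations is reachable.
std::size_t countFunctionBodies(const std::vector<std::size_t>& implsPerInterface) {
    std::size_t total = 1;
    for (std::size_t n : implsPerInterface) {
        total *= n;  // each interface multiplies the number of distinct code paths
    }
    return total;
}
```

For the example above, `countFunctionBodies({6, 4, 5, 6})` yields 720.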
Shader code is optimized across fcall sites, so it is possible for each call to be unique based on information from the caller and the callee. Simply multiplying by a constant is not enough; instead, each function body must be made unique by performing different operations and using member variables. The resulting output must be verifiable, so multiplication by prime numbers or something similar would work. To verify that the shader was executed correctly, the same math must be done on the CPU based on the bound class instances.
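The CPU-side check described above might look like the following sketch. It assumes a hypothetical scheme in which each bound class multiplies by its own prime and adds a member variable; the `InstanceDesc` type and `expectedResult` function are illustrative names, not the test's actual code. Unsigned 32-bit arithmetic is used so that overflow wraps the same way on the CPU as it does for `uint` math in a shader.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical description of one bound class instance.
struct InstanceDesc {
    uint32_t prime;   // distinct prime that makes this function body unique
    uint32_t member;  // member variable read by the method
};

// Replays the call chain on the CPU in the order the instances are bound.
uint32_t expectedResult(uint32_t input, const std::vector<InstanceDesc>& boundChain) {
    uint32_t value = input;
    for (const InstanceDesc& inst : boundChain) {
        value = value * inst.prime + inst.member;  // wraps modulo 2^32
    }
    return value;
}
```

Because the operations do not commute, binding the same instances in a different order gives a different expected value: with an input of 1, the chain {2, 1} then {3, 0} yields 9, while the reverse order yields 7. That makes a wrong function table entry visible in the output.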
This test also covers call tree depth (32). It is important to test that more than 32 fcalls result in no-ops (you can only test 33 at a given time). The compiler does not allow code to be written to simply test this case, so you need a more sophisticated approach that can dynamically change the call depth and verify that each call was made (or not made).
A Fibonacci sequence or similar could be useful for this. Recursive calls are not allowed, so there must be 33 interfaces that can be called one after another. Locally, the instances must decide whether or not to continue the fcall depth. The runtime binds the data to drive these tests.
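As a rough CPU model of the behavior this test expects, the sketch below walks a chain of calls, advances a Fibonacci term for each call that is made, and treats any call beyond depth 32 as a no-op, as stated above. The names `simulateCallChain` and `DepthResult` are hypothetical, and this is a model of the expected result rather than the test's implementation.

```cpp
#include <cstdint>

constexpr int kMaxCallDepth = 32;  // call tree depth limit stated in the text

struct DepthResult {
    uint32_t value;  // Fibonacci term reached by the calls that were made
    int callsMade;   // number of calls that actually executed
};

// Models a chain of requestedDepth nested interface calls.
DepthResult simulateCallChain(int requestedDepth) {
    uint32_t prev = 0;
    uint32_t cur = 1;
    int made = 0;
    for (int depth = 1; depth <= requestedDepth; ++depth) {
        if (depth > kMaxCallDepth) {
            break;  // calls past the limit are modeled as no-ops
        }
        const uint32_t next = prev + cur;
        prev = cur;
        cur = next;
        ++made;
    }
    return {cur, made};
}
```

Under this model, a requested depth of 33 produces the same value as a requested depth of 32, which is the property the verification relies on.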
This test is written for the pixel shader as Pri-1.
Completion of this test proves the following:
The fcall works.
The function tables are correct and are being used correctly.
The 4K function table limit is supported.
The call depth is 32.
Results for the test are captured in a render target.
WGF11Interfaces.exe Interfaces\FunctionTables and fcall\[VS,GS,HS,DS,CS]
Pri-2
The test covers all shader stages.
Verification of this feature supports a wide range of valid design patterns and OO programming techniques that Shader Model 5.0 was designed to support.
Results for these tests are captured in a stream out buffer for VS, HS, DS, and GS. Results for PS and CS are captured by render targets and writable buffers.
WGF11Interfaces.exe Interfaces\GroupExecutionPathDifferences\[CS]
Pri-1 40 hours
Exploiting parallelism is very important when designing hardware. However, capitalizing on such opportunities should not disrupt the correct execution of a shader when code paths diverge. This test runs on each shader stage and provides coverage that executes different code paths despite the fact that pixels, vertices, and other pipeline stage primitives might be executed in lockstep groups. This test does not test performance in such cases. Instead, it only verifies a valid result after execution.
Providing data to each pipeline stage is important and must be done in sizeable chunks to verify coverage of varied execution across groups. The following method provides the data to the tests for each shader stage:
Compute Shader:
Data that is used to select instances based on array slots is provided to the compute shader through resources. The SV_GroupID and SV_GroupThreadID are used to select the correct data from the resource to choose which class instances to use during the invocation of a shader. The results of execution are written to a writable buffer so that they can be verified. A large number of threads is used in each dimension. This includes large prime numbers of threads and multiple groups that are dispatched to get adequate coverage for this test.
Each hardware thread should execute a method call in a different type provided by the runtime. The runtime is able to calculate the result for verification based on the type used, the data provided, and the algorithm of the type. The code for each method should vary in length and complexity to prove that the hardware can handle shaders like this. Between 12 and 18 different class implementations should be used, and each SV_[Index] value can be pseudo-modded to pick which array index and which class instance to execute.
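The selection and verification described above can be sketched on the CPU as follows. This is an illustration under assumptions: the class count of 16 is one value in the 12-18 range, a plain modulo stands in for the "pseudo mod", and the per-class algorithm in `expectedThreadResult` is invented for the example. All names are hypothetical.

```cpp
#include <cstdint>

// Assumed number of class implementations; the text calls for 12 to 18.
constexpr uint32_t kClassCount = 16;

// Flattens the group ID and in-group thread ID, then reduces to a class slot.
uint32_t selectClassSlot(uint32_t groupId, uint32_t groupThreadId, uint32_t threadsPerGroup) {
    const uint32_t flat = groupId * threadsPerGroup + groupThreadId;
    return flat % kClassCount;
}

// Hypothetical per-class algorithm: each class uses a distinct multiplier and
// offset so the runtime can recompute the expected value for any thread.
uint32_t expectedThreadResult(uint32_t classSlot, uint32_t input) {
    const uint32_t multiplier = 2u * classSlot + 3u;
    return input * multiplier + classSlot;
}
```

With this mapping, neighboring threads land on different class slots, so threads that may run in lockstep take different code paths. A prime thread count per group (for example 61) keeps the slot pattern from repeating on group boundaries.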
WGF11Interfaces.exe Interfaces\GroupExecutionPathDifferences\[VS,GS,PS,HS,DS]
Pri-2 40 hours
The test is extended to the other shader stages.
Vertex Shader:
A set of interface slot array indexes is added to the vertex data and used during vertex shader execution to determine which interface instance to invoke. The data also covers which instances are used as parameters when methods are invoked. A block of at least 512 points is drawn to verify hardware behavior. The results are caught by a stream out buffer.
Hull Shader:
The interface slot array indexes are added to the control point input structure, allowing each point to determine which class instances are invoked over the course of the shader. That data also covers which instances are used as parameters when methods are invoked. A full 32-point patch is drawn several times to verify hardware behavior. The results are caught by a stream out buffer.
This test also forwards the data to the Patch Constant function, which also verifies correct behavior.
In addition, the SV_OutputControlPointID and specific forking code are turned on in the compiler. The forking code causes divergent code execution paths by using interfaces in this stage as well. Forking can be accessed by using a loop and calling an interface method from within the loop.
Domain Shader:
Data is passed through the Hull Shader on each control point and then recovered by using the SV_PrimitiveID available in the Domain Shader. The tessellator output positions are combined with the SV_PrimitiveID to create the indexes into the available class instance slots during execution. The full 32-point patch is drawn several times to verify hardware behavior. The results are caught by a stream out buffer. The focus is not to cover all domain types.
Geometry Shader:
Interface slot indexes are attached to point primitives that are provided to the geometry shader. The indexes are used to choose class instances to invoke methods on and use as parameters during shader execution. A block of at least 512 points is drawn to verify hardware behavior. The results are captured in a stream out buffer for verification.
Pixel Shader:
For pixel shaders, a texture is used to provide the class instance indexes for each pixel. A large quad is drawn, and the texture exactly matches the size of the render target. An area of at least 512x512 pixels is drawn, with the supporting resources, to verify hardware behavior. The results are captured in a render target for validation. SV_Position and SV_SampleIndex can also be used to determine how a pixel indexes into the interface slots. Drawing a triangle is more interesting than drawing a quad.
WGF11Interfaces.exe Interfaces\IndexingResources and this[]\[VS]
Pri-1 26 hours
All resources that are used by an interface were made indexable by the IL in order to support dynamic runtime binding. The data is provided through the DDI and includes the following information:
Class type ID
Cbuffer
Cbuffer offset
Texture slot
Sampler slot
This test ensures that all this information is used correctly by the driver and shader execution. Access to this information can only be done through the "this" keyword, which uses a hidden reserved buffer. Class instances that are 256 bytes can be bound to a shader, so this test provides coverage for using all 256 instance slots. Doing so implies that "this" needs to be used in combination with each of the other areas in this test. However, the other areas do not need to be verified through permutation with each other.
The test cycles location for all the different slots and offsets in the resources and uses those resources when producing results.
In order to get full coverage, each class instance should execute a method that makes use of the resource data to produce a result. Doing this ensures that the instance type ID is being used correctly with respect to the function tables.
Each cbuffer must be tested for class data. The data needs to be placed throughout the buffer using the offset parameter. This can be done by binding 256 instances each with a different location set by the runtime. The shader can execute 256 vertices and use the SV_PrimitiveID to determine the instance slot to use.
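One way to spread 256 instances across the cbuffers is sketched below. It computes values that could be passed as the ConstantBufferOffset and ConstantVectorOffset arguments of CreateClassInstance. The even spacing, the round-robin slot choice, and the names `InstancePlacement` and `placeInstance` are assumptions made for illustration; the slot count and buffer size are passed in rather than fixed.

```cpp
#include <cstdint>

// Hypothetical placement of one class instance's data.
struct InstancePlacement {
    uint32_t cbufferSlot;   // candidate ConstantBufferOffset value
    uint32_t vectorOffset;  // candidate ConstantVectorOffset value
};

// Spaces instanceCount instances evenly through a buffer of vectorsPerBuffer
// vectors and cycles through the available cbuffer slots.
InstancePlacement placeInstance(uint32_t index, uint32_t instanceCount,
                                uint32_t cbufferSlots, uint32_t vectorsPerBuffer) {
    // Guard against division by zero for degenerate inputs.
    const uint32_t stride = (instanceCount == 0) ? 0 : vectorsPerBuffer / instanceCount;
    const uint32_t slot = (cbufferSlots == 0) ? 0 : index % cbufferSlots;
    return {slot, index * stride};
}
```

Assuming 14 cbuffer slots and 4096 vectors per buffer, 256 instances get a stride of 16 vectors, so instance 255 lands in slot 3 at vector offset 4080. Each instance's data must fit within the stride for the placements not to overlap.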
Each tbuffer slot in the 128 available must be used in the same way as mentioned previously. Only a simple buffer or texture2d need be used, and only the load instruction is tested. The test is only interested in correct indexing of the texture registers.
Each sampler slot in the 16 available to a shader stage must be used in the same way as mentioned previously. Samplers will be sampled outside of texture boundary so that a border color is returned. Each of the 16 samplers should have a unique border color in order for the test to verify that the correct sampler was used.
These can be tested separately; combined coverage is unnecessary.
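The sampler check relies on each of the 16 slots returning a distinct border color. A minimal sketch of that idea follows, assuming 8-bit RGBA results and an arbitrary color scheme; `borderColorForSlot` and `slotFromBorderColor` are hypothetical helpers, and the actual colors used by the test are not documented here.

```cpp
#include <cstdint>

constexpr int kSamplerSlots = 16;  // sampler slots per shader stage, per the text

struct Rgba8 {
    uint8_t r, g, b, a;
};

// Assigns each sampler slot a unique border color (illustrative scheme).
Rgba8 borderColorForSlot(int slot) {
    return {static_cast<uint8_t>(slot * 16),
            static_cast<uint8_t>(255 - slot * 16),
            static_cast<uint8_t>(slot),
            255};
}

// Maps a sampled color back to the slot that produced it, or -1 if none match.
int slotFromBorderColor(const Rgba8& c) {
    for (int slot = 0; slot < kSamplerSlots; ++slot) {
        const Rgba8 e = borderColorForSlot(slot);
        if (c.r == e.r && c.g == e.g && c.b == e.b && c.a == e.a) {
            return slot;
        }
    }
    return -1;
}
```

Comparing the decoded slot against the SamplerOffset that the runtime bound for the instance shows whether the shader indexed the intended sampler.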
WGF11Interfaces.exe Interfaces\IndexingResources and this[]\[GS,PS,HS,DS,CS]
Pri-2 26 hours
The previous test is extended to cover all shader stages.
(Pri 1 18 hours) The deferred context should also be used in these tests.
The test cases outlined will also be run on a deferred context by using command lists to set classes and instances.
Command lists do not inherit state from the immediate context. Therefore, instances set on the immediate context should not be accessible when executing a command list.
State clearing on the deferred context (through the bool parameter in ExecuteCommandList and FinishCommandList) should be tested with interfaces and classes.
Command syntax
Command option | Description |
---|---|
Wgf11interfaces | Runs the test jobs. Without any options, the test enumerates devices. |
-FeatureLevel:XX.X | Sets the feature level, where XX.X is the feature level the test will run at: 10.0, 10.1, or 11.0. |
Note
For command line help for this test binary, type /?.
File list
File | Location |
---|---|
Configdisplay.exe | <[testbinroot]>\nttest\windowstest\tools\ |
D3d11_1sdklayers.dll | <[testbinroot]>\nttest\windowstest\graphics\d3d\support\ |
D3d11ref.dll | <[testbinroot]>\nttest\windowstest\graphics\d3d\support\ |
D3d11sdklayers.dll | <[testbinroot]>\nttest\windowstest\graphics\d3d\support\ |
D3dcompiler_test.dll | <[testbinroot]>\nttest\windowstest\graphics\d3d\support |
D3dx10_test.dll | <[testbinroot]>\nttest\windowstest\graphics\d3d\support\ |
d3dx11_test.dll | <[testbinroot]>\nttest\windowstest\graphics\d3d\support\ |
TDRWatch.exe | <[testbinroot]>\nttest\windowstest\graphics\ |
Wgf11interfaces.exe | <[testbinroot]>\nttest\windowstest\graphics\d3d\conf |