Disabling TDR on Windows 8 for your C++ AMP algorithms

The Windows Timeout Detection and Recovery (TDR) mechanism prevents processes from hogging the GPU, which would render the system display unresponsive or deny other processes a fair share of the GPU. I encourage you to read our post Handling TDRs in C++ AMP, which covers TDR extensively along with various ways of handling TDR occurrences in C++ AMP applications. However, in compute scenarios there is often a genuine need to execute commands on the GPU that run for longer than the stipulated TDR timeout period. Windows 8 offers the ability to programmatically disable TDR for specific devices, allowing commands on such a device to run for longer than the TDR timeout period, provided the OS or other processes are not contending for that GPU at the same time. In this post, I will show how you can use this new Windows 8 feature to create a C++ AMP accelerator_view on which long-running commands can execute without causing a TDR.

Creating a Direct3D 11 device using the D3D11CreateDevice API

The D3D11CreateDevice API creates an ID3D11Device interface, which represents a logical device on a display adapter. The “Flags” parameter of this API is a combination of device creation settings from the D3D11_CREATE_DEVICE_FLAG enumeration. Windows 8 introduces a new member of this enumeration, D3D11_CREATE_DEVICE_DISABLE_GPU_TIMEOUT, which specifies that commands on that device are allowed to run for longer than the usual timeout period without causing a TDR, in the absence of contention for that GPU.

Creating a C++ AMP accelerator_view from an ID3D11Device interface pointer

An accelerator_view is an isolated resource and execution context/domain on an accelerator, and is your gateway to executing commands on a GPU accelerator in C++ AMP, as described in my previous post on accelerator_view queuing_mode. The default accelerator_view of an accelerator, and any accelerator_view created through the accelerator::create_view API, is subject to the TDR timeout limit; if execution of a command on such an accelerator_view exceeds the limit, a TDR is initiated.

However, if your application needs to execute long-running commands on the GPU, on Windows 8 you can create a Direct3D 11 device with GPU timeout disabled using the D3D11CreateDevice function mentioned above, and subsequently create a C++ AMP accelerator_view from it using the C++ AMP DirectX interoperability function concurrency::direct3d::create_accelerator_view. On accelerator_views created through this mechanism, commands are allowed to execute beyond the TDR timeout limit as long as the OS or other processes are not simultaneously contending for the same GPU accelerator.
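The rule above can be modeled as a small predicate. This is only a simplified sketch of the behavior described in this post, not a description of how Windows actually schedules the GPU; the function name tdr_expected and its parameters are hypothetical and are not part of any Windows or C++ AMP API.

```cpp
// Simplified model (an assumption for illustration) of when a TDR is expected:
//  - a command that finishes within the timeout never triggers a TDR;
//  - on an ordinary device, exceeding the timeout triggers a TDR;
//  - on a device created with GPU timeout disabled, exceeding the timeout
//    triggers a TDR only if something else is contending for the same GPU.
bool tdr_expected(bool exceeds_timeout,
                  bool timeout_disabled,
                  bool gpu_contended)
{
    if (!exceeds_timeout) {
        return false;
    }
    if (!timeout_disabled) {
        return true;
    }
    return gpu_contended;
}
```

Read this way, disabling the timeout only changes the outcome for a long-running command on a GPU that nothing else is using.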

The following code snippet illustrates the creation of a C++ AMP accelerator_view that is not subject to the TDR timeout:

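// Requires <d3d11.h>, <amp.h> and <stdio.h>; link against d3d11.lib.
// pAdapter is the IDXGIAdapter* of the GPU to use. With a non-NULL adapter the
// driver type must be D3D_DRIVER_TYPE_UNKNOWN; if you pass NULL to get the
// default adapter, use D3D_DRIVER_TYPE_HARDWARE instead.
unsigned int createDeviceFlags = D3D11_CREATE_DEVICE_DISABLE_GPU_TIMEOUT;
ID3D11Device *pDevice = NULL;
ID3D11DeviceContext *pContext = NULL;
D3D_FEATURE_LEVEL featureLevel;
HRESULT hr = D3D11CreateDevice(pAdapter,
                               D3D_DRIVER_TYPE_UNKNOWN,
                               NULL,                // no software rasterizer module
                               createDeviceFlags,
                               NULL,                // default feature level list
                               0,
                               D3D11_SDK_VERSION,
                               &pDevice,
                               &featureLevel,
                               &pContext);

if (FAILED(hr) ||
    ((featureLevel != D3D_FEATURE_LEVEL_11_1) &&
     (featureLevel != D3D_FEATURE_LEVEL_11_0)))
{
    fprintf(stderr, "Failed to create Direct3D 11 device\n");
    // Report an error even if a device was created at an unsupported feature level.
    return FAILED(hr) ? hr : E_FAIL;
}

// Wrap the device in an accelerator_view that is not subject to the TDR timeout.
accelerator_view noTimeoutAcclView =
    concurrency::direct3d::create_accelerator_view(pDevice);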

 

Please note that a TDR can occur for various reasons, and handling it properly in your C++ AMP application depends on its underlying cause, as discussed in detail in our post on Handling TDRs in C++ AMP. The technique of creating an accelerator_view with TDR disabled should be employed only if your application has a genuine need for accelerator operations that exceed the TDR timeout limit. Also remember that:

1) This feature is only available on Windows 8.

2) Disabling GPU timeout on a device prevents TDR occurrences only if the OS or other processes are not simultaneously contending for that GPU. If Windows detects contention for the GPU from the Desktop Window Manager or other processes, it will initiate a TDR to reset the accelerator_view on which a long-running command is executing, regardless of whether GPU timeout is disabled on that device. Hence, for this technique to be effective in preventing your long-running C++ AMP computations from causing a TDR, you must pick a dedicated GPU accelerator that is not connected to a display and is not concurrently used by other processes, thus eliminating any chance of contention.
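The selection step described in point 2 can be sketched as follows. The AdapterInfo struct and the pick_dedicated_adapter function are hypothetical names used for illustration only. In a real application the fields would presumably be filled from DXGI, for example by walking IDXGIFactory1::EnumAdapters1, checking DXGI_ADAPTER_DESC1 for the DXGI_ADAPTER_FLAG_SOFTWARE flag, and counting outputs with IDXGIAdapter::EnumOutputs. The sketch cannot tell whether another process is using the GPU, so that part remains your responsibility.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical per-adapter record; not a DXGI or C++ AMP type.
struct AdapterInfo {
    std::string name;          // human-readable adapter description
    bool is_software;          // true for software/emulated adapters
    unsigned int output_count; // number of displays attached to the adapter
};

// Returns the index of the first hardware adapter with no display attached,
// or -1 if no such adapter exists.
int pick_dedicated_adapter(const std::vector<AdapterInfo>& adapters)
{
    for (std::size_t i = 0; i < adapters.size(); ++i) {
        const AdapterInfo& a = adapters[i];
        if (!a.is_software && a.output_count == 0) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

If the function returns -1, every hardware GPU in the system drives a display, and disabling the GPU timeout is unlikely to prevent TDRs reliably.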

I hope this post helps you work within the Windows TDR timeout limit when you have a genuine need for long-running C++ AMP computations on GPU accelerators. Please feel free to ask questions below or in our MSDN concurrency forum!

Comments

  • Anonymous
    March 22, 2012
    This is a bit confusing. If there really is no contention on a GPU (for example, a device not connected to a display), will Windows 8 still try to perform TDR if not programmatically disabled?

  • Anonymous
    March 29, 2012
That's correct, Rahul. TDR will be performed by Windows even in the absence of contention, unless it is programmatically disabled. The idea is that Windows will continue to notify existing applications of a potential hang on expiration of the timeout period so that they can investigate any bugs in their code or in the driver. Applications that expect their kernels to run for longer than the default timeout period have to opt into this feature.

  • Anonymous
    April 20, 2012
    Ah just saw your reply now :) Thanks for the clarification!

  • Anonymous
    November 12, 2013
    I tried this code, using NULL as pAdapter (with the intention of choosing the default), but I just get "Failed to create Direct3D 11 device". What am I doing wrong? How can I pick the default adapter?

  • Anonymous
    January 12, 2014
    Thanks for the info. Is this supported on Windows Server 2012 R2 as well?

  • Anonymous
    April 04, 2014
    It fails for me if I use D3D_DRIVER_TYPE_UNKNOWN; I have to use D3D_DRIVER_TYPE_HARDWARE.

  • Anonymous
    April 05, 2014
    Thank you for your comment, Michael. The API of this function is somewhat involved: the value of the second parameter depends on the first parameter, and it has also changed between Direct3D 10 and Direct3D 11 (note that the C++ AMP implementation in Visual Studio 2012 and 2013 depends on the latter). For details, please refer to the last part of this article: msdn.microsoft.com/.../ff476082(v=vs.85).aspx .