Direct3D video motion estimation

This article contains guidance for motion vector estimation with Direct3D 12 video APIs. This feature was introduced in Windows 10, version 2004 (10.0; Build 19041). Motion estimation is the process of determining motion vectors that describe the transformation from one 2D image to another. Motion estimation is an essential part of video encoding and can be used in frame rate conversion algorithms.

While motion estimation can be implemented with shaders, the purpose of the D3D12 motion estimation feature is to expose fixed-function acceleration for motion searching so that this part of the work is offloaded from the 3D engine. Often this takes the form of exposing the GPU video encoder's motion estimator. The goal of D3D12 motion estimation is optical flow, but note that encoder motion estimators may be optimized for improving compression.

Checking for support

Determine support for any D3D video feature by calling ID3D12VideoDevice::CheckFeatureSupport and passing a value from the D3D12_FEATURE_VIDEO enumeration to specify the feature being queried. To query the supported block sizes and resolutions for a given format, use the D3D12_FEATURE_VIDEO_MOTION_ESTIMATOR value and supply a D3D12_FEATURE_DATA_VIDEO_MOTION_ESTIMATOR structure for the pFeatureSupportData parameter, as shown in the example below. In the current release, only DXGI_FORMAT_NV12 is supported, so content in other formats may need to be color converted and downsampled before it can be used for motion estimation.

D3D12_FEATURE_DATA_VIDEO_MOTION_ESTIMATOR MotionEstimatorSupport = {0u, DXGI_FORMAT_NV12};
VERIFY_SUCCEEDED(spVideoDevice->CheckFeatureSupport(D3D12_FEATURE_VIDEO_MOTION_ESTIMATOR, &MotionEstimatorSupport, sizeof(MotionEstimatorSupport)));

The motion estimator context object

The ID3D12VideoMotionEstimator object maintains context for video motion estimation operations. Create a new instance of this interface by calling ID3D12VideoDevice1::CreateVideoMotionEstimator.

The block size, precision, and size range you select must be among the values that the D3D12_FEATURE_VIDEO_MOTION_ESTIMATOR feature check reports as supported by the hardware. You can select a smaller size range than the driver supports. The size range informs internal allocation sizes.

D3D12_VIDEO_MOTION_ESTIMATOR_DESC motionEstimatorDesc = { 
    0, //NodeIndex
    DXGI_FORMAT_NV12, 
    D3D12_VIDEO_MOTION_ESTIMATOR_SEARCH_BLOCK_SIZE_16X16,
    D3D12_VIDEO_MOTION_ESTIMATOR_VECTOR_PRECISION_QUARTER_PEL, 
    {1920, 1080, 1280, 720} // D3D12_VIDEO_SIZE_RANGE
    }; 

CComPtr<ID3D12VideoMotionEstimator> spVideoMotionEstimator;
VERIFY_SUCCEEDED(spVideoDevice->CreateVideoMotionEstimator(
    &motionEstimatorDesc, 
    nullptr,
    IID_PPV_ARGS(&spVideoMotionEstimator)));
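
Because a narrower size range than the one reported by the driver may be requested, it can be useful to validate the requested range before creating the estimator. The following is a minimal sketch of such a check, not part of the D3D12 API. SizeRange is a hypothetical stand-in that mirrors the member order of D3D12_VIDEO_SIZE_RANGE so the sketch compiles without the Windows SDK headers, and IsSizeRangeSupported is an illustrative helper name.

```cpp
#include <cstdint>

// Hypothetical stand-in mirroring the member order of D3D12_VIDEO_SIZE_RANGE
// (MaxWidth, MaxHeight, MinWidth, MinHeight). Not a type from the D3D12 headers.
struct SizeRange {
    std::uint32_t MaxWidth;
    std::uint32_t MaxHeight;
    std::uint32_t MinWidth;
    std::uint32_t MinHeight;
};

// Illustrative helper: returns true when `requested` is well formed
// (min <= max) and lies entirely inside `supported`.
constexpr bool IsSizeRangeSupported(const SizeRange& requested,
                                    const SizeRange& supported) {
    return requested.MinWidth  <= requested.MaxWidth  &&
           requested.MinHeight <= requested.MaxHeight &&
           requested.MinWidth  >= supported.MinWidth  &&
           requested.MinHeight >= supported.MinHeight &&
           requested.MaxWidth  <= supported.MaxWidth  &&
           requested.MaxHeight <= supported.MaxHeight;
}
```

In real code, the supported range would come from the SizeRange member filled in by the feature check, and the requested range would be the one placed in D3D12_VIDEO_MOTION_ESTIMATOR_DESC.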

Storage of motion vectors

An ID3D12VideoMotionVectorHeap stores motion vectors. ID3D12VideoEncodeCommandList::EstimateMotion writes its results to the heap referenced by the D3D12_VIDEO_MOTION_ESTIMATOR_OUTPUT structure passed to it. Call ID3D12VideoEncodeCommandList::ResolveMotionVectorHeap to translate the motion vector output of EstimateMotion from hardware-dependent formats into a consistent format defined by the video motion estimation APIs. The resolved output is a 2D texture with the format DXGI_FORMAT_R16G16_SINT, where R holds the horizontal component and G holds the vertical component of the motion vector. This texture is sized to hold one pair of components per block.

D3D12_VIDEO_MOTION_VECTOR_HEAP_DESC MotionVectorHeapDesc = { 
    0, // NodeIndex 
    DXGI_FORMAT_NV12, 
    D3D12_VIDEO_MOTION_ESTIMATOR_SEARCH_BLOCK_SIZE_16X16,
    D3D12_VIDEO_MOTION_ESTIMATOR_VECTOR_PRECISION_QUARTER_PEL, 
    {1920, 1080, 1280, 720} // D3D12_VIDEO_SIZE_RANGE
    }; 

CComPtr<ID3D12VideoMotionVectorHeap> spVideoMotionVectorHeap;
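VERIFY_SUCCEEDED(spVideoDevice->CreateVideoMotionVectorHeap(
    &MotionVectorHeapDesc, 
    nullptr, 
    IID_PPV_ARGS(&spVideoMotionVectorHeap)));

// Heap properties for the resolved texture. A default (GPU-local) heap is
// assumed here; the original sample does not show how Properties is defined.
CD3DX12_HEAP_PROPERTIES Properties(D3D12_HEAP_TYPE_DEFAULT);

// Describe the texture that receives the resolved motion vectors.
D3D12_RESOURCE_DESC resolvedMotionVectorDesc =
    CD3DX12_RESOURCE_DESC::Tex2D(
        DXGI_FORMAT_R16G16_SINT, 
        Align(1920, 16) / 16, // This example uses a 16x16 block size. Pixel width and height
        Align(1080, 16) / 16, // are adjusted to store the vectors for those blocks.
        1, // ArraySize
        1  // MipLevels
        );

CComPtr<ID3D12Resource> spResolvedMotionVectors;
VERIFY_SUCCEEDED(pDevice->CreateCommittedResource(
    &Properties,
    D3D12_HEAP_FLAG_NONE,
    &resolvedMotionVectorDesc,
    D3D12_RESOURCE_STATE_COMMON,
    nullptr,
    IID_PPV_ARGS(&spResolvedMotionVectors)));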
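
The sample above calls an Align helper that this article does not define. The following is a minimal sketch of one way to compute the resolved texture dimensions, assuming Align rounds its first argument up to the next multiple of the second. BlockGrid and ResolvedTextureSize are hypothetical names introduced only for illustration.

```cpp
#include <cstdint>

// Rounds `value` up to the next multiple of `alignment`.
// `alignment` must be non-zero.
constexpr std::uint32_t Align(std::uint32_t value, std::uint32_t alignment) {
    return ((value + alignment - 1) / alignment) * alignment;
}

// Hypothetical result type: texture dimensions measured in blocks.
struct BlockGrid {
    std::uint32_t Width;
    std::uint32_t Height;
};

// Computes how many blocks are needed to cover a frame of the given pixel
// size, which is the size of a texture holding one motion vector per block.
constexpr BlockGrid ResolvedTextureSize(std::uint32_t pixelWidth,
                                        std::uint32_t pixelHeight,
                                        std::uint32_t blockSize) {
    return BlockGrid{Align(pixelWidth, blockSize) / blockSize,
                     Align(pixelHeight, blockSize) / blockSize};
}
```

For a 1920x1080 frame and a 16x16 block size, this gives a 120x68 texture, because 1080 is rounded up to 1088 before dividing by 16.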

ID3D12VideoMotionVectorHeap is also used to supply hint vectors in the D3D12_VIDEO_MOTION_ESTIMATOR_INPUT structure.

Estimate motion in a command list

Call EstimateMotion on an ID3D12VideoEncodeCommandList to invoke the motion estimation operation.

The example below executes the motion search and resolves the motion vectors to the 2D texture on a command list of type D3D12_COMMAND_LIST_TYPE_VIDEO_ENCODE. D3D12 resources used as input to EstimateMotion must be in the D3D12_RESOURCE_STATE_VIDEO_ENCODE_READ state, and the resource written to by ResolveMotionVectorHeap must be in the D3D12_RESOURCE_STATE_VIDEO_ENCODE_WRITE state.

const D3D12_VIDEO_MOTION_ESTIMATOR_OUTPUT outputArgs = {spVideoMotionVectorHeap};

const D3D12_VIDEO_MOTION_ESTIMATOR_INPUT inputArgs = {
    spCurrentResource,
    0,
    spReferenceResource,
    0,
    nullptr // pHintMotionVectorHeap
    };

spCommandList->EstimateMotion(spVideoMotionEstimator, &outputArgs, &inputArgs);

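// The resolve arguments use distinct names so they do not redeclare the
// outputArgs and inputArgs variables used for EstimateMotion above.
const D3D12_RESOLVE_VIDEO_MOTION_VECTOR_HEAP_OUTPUT resolveOutputArgs = { 
    spResolvedMotionVectors,
    {}};

const D3D12_RESOLVE_VIDEO_MOTION_VECTOR_HEAP_INPUT resolveInputArgs = {
    spVideoMotionVectorHeap,
    1920,
    1080
    };

spCommandList->ResolveMotionVectorHeap(&resolveOutputArgs, &resolveInputArgs);

VERIFY_SUCCEEDED(spCommandList->Close());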

// Execute the command list.
ID3D12CommandList *ppCommandLists[1] = { spCommandList.p };
spCommandQueue->ExecuteCommandLists(1, ppCommandLists);
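
Once the GPU work has completed, each texel of the resolved DXGI_FORMAT_R16G16_SINT texture can be interpreted as a motion vector. The following is a minimal sketch of that interpretation, assuming the texture contents have already been copied to CPU-readable memory (a step this article does not cover) and that the components are in quarter-pel units, matching the D3D12_VIDEO_MOTION_ESTIMATOR_VECTOR_PRECISION_QUARTER_PEL setting used in the examples above. MotionVectorTexel, PixelOffset, and ToPixels are hypothetical names, not D3D12 types.

```cpp
#include <cstdint>

// Hypothetical CPU-side view of one texel of the resolved texture:
// two signed 16-bit components.
struct MotionVectorTexel {
    std::int16_t R; // horizontal component
    std::int16_t G; // vertical component
};

// Hypothetical result type: displacement measured in whole pixels.
struct PixelOffset {
    float X;
    float Y;
};

// Assumption: quarter-pel precision, so four units equal one pixel.
constexpr float kUnitsPerPixel = 4.0f;

// Converts one quarter-pel motion vector to a displacement in pixels.
inline PixelOffset ToPixels(MotionVectorTexel mv) {
    return PixelOffset{mv.R / kUnitsPerPixel, mv.G / kUnitsPerPixel};
}
```

Under these assumptions, a texel with R = 6 and G = -10 corresponds to a displacement of 1.5 pixels horizontally and -2.5 pixels vertically.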

Protected resources

Direct3D 12 motion vector estimation supports reading from and writing to hardware DRM protected resources when supported by the driver. If the input resources are hardware DRM protected, the output is also a hardware DRM protected resource. The methods used to create the motion estimation context object and the motion vector heap, ID3D12VideoDevice1::CreateVideoMotionEstimator and ID3D12VideoDevice1::CreateVideoMotionVectorHeap, both accept an ID3D12ProtectedResourceSession that is used to manage access to protected resources.

Use ID3D12VideoMotionEstimator::GetProtectedResourceSession and ID3D12VideoMotionVectorHeap::GetProtectedResourceSession to retrieve the ID3D12ProtectedResourceSession objects provided when the objects were created.