Extending Media Foundation Encoder to support 10-bit video encoding

Michael Chourdakis 66 Reputation points
2022-05-05T11:56:19.623+00:00

The HEVC Media Foundation encoder in Windows only encodes 8-bit video. My NVIDIA graphics card also supports 10-bit HDR and alpha-mode video encoding, so I decided to write my own IMFTransform on top of the NVIDIA SDK.

I've registered my DLL using MFTRegister with a nonexistent input type, so that the Sink Writer picks my encoder instead of Microsoft's built-in transform when I call SetInputMediaType. This works OK.

int wi = 1920;  
int he = 1080;  
int fps = 30;  
int br = 4000;  
auto fmt = MFVideoFormat_H264;  
bool Our = 1;  
const wchar_t* fil = L"r:\\1.mp4";  
std::vector<DWORD> frame;  
frame.resize(wi * he);  
  
// Test  
CComPtr<IMFSinkWriter> wr;  
DeleteFile(fil);  
  
CComPtr<IMFAttributes> attrs;  
MFCreateAttributes(&attrs, 0);  
auto hr = MFCreateSinkWriterFromURL(fil, 0, attrs, &wr);  
DWORD str = (DWORD)-1;  
CComPtr<IMFMediaType> mt2;  
MFCreateMediaType(&mt2);  
mt2->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Video);  
mt2->SetGUID(MF_MT_SUBTYPE, fmt);  
MFSetAttributeRatio(mt2, MF_MT_FRAME_RATE, fps, 1);  
hr = MFSetAttributeSize(mt2, MF_MT_FRAME_SIZE,wi, he);  
MFSetAttributeRatio(mt2, MF_MT_PIXEL_ASPECT_RATIO, 1, 1);  
mt2->SetUINT32(MF_MT_INTERLACE_MODE, MFVideoInterlace_Progressive);  
mt2->SetUINT32(MF_MT_VIDEO_NOMINAL_RANGE, MFNominalRange_Normal);  
mt2->SetUINT32(MF_MT_AVG_BITRATE, br*1000);  
hr = wr->AddStream(mt2, &str);  
CComPtr<IMFMediaType> mt1;  
MFCreateMediaType(&mt1);  
mt1->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Video);  
mt1->SetGUID(MF_MT_SUBTYPE, MFVideoFormat_ARGB32);  
hr = MFSetAttributeSize(mt1, MF_MT_FRAME_SIZE, wi, he);  
// Force our selection  
if (Our)  
{  
    mt1->SetGUID(MF_MT_SUBTYPE, MyFakeFmt);  
    hr = wr->SetInputMediaType(str, mt1, 0);  
}  
mt1->SetGUID(MF_MT_SUBTYPE, MFVideoFormat_ARGB32);  
hr = wr->SetInputMediaType(str, mt1, 0);  
hr = wr->BeginWriting();  
for(int i = 0 ; i < 15 ; i++)  
{  
    auto i2 = i % 5;  
    if (i2 == 0)            Frm(frame, wi, he, 0xFFFFFFFF);  
    if (i2 == 1 || i2 == 4)         Frm(frame, wi, he, 0xFF0000FF); // some colors  
    if (i2 == 2)            Frm(frame, wi, he, 0xFFFF00FF); //   
    if (i2 == 3)            Frm(frame, wi, he, 0xFF00FFFF); //  
  
    CComPtr<IMFSample> s;  
    MFCreateSample(&s);  
    int secs = 1;  
  
    // Times are in 100-ns units; 64-bit literals avoid int overflow for larger frame counts.  
    hr = s->SetSampleDuration(10LL * 1000 * 1000 * secs);  
    hr = s->SetSampleTime(10LL * 1000 * 1000 * i);  
  
    CComPtr<IMFMediaBuffer> b;  
    MFCreateMemoryBuffer((DWORD)(frame.size() * 4), &b);          
    b->SetCurrentLength((DWORD)(frame.size() * 4));  
    BYTE* by = 0;  
    DWORD ml = 0, cl = 0;  
    b->Lock(&by, &ml, &cl);  
    memcpy(by, frame.data(), frame.size() * 4);  
    b->Unlock();  
    hr = s->AddBuffer(b);  
    b = 0;  
    hr = wr->WriteSample(str, s);  
}  
  
hr = wr->Finalize();  
wr = 0;  
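
For reference, the listing above gives every sample a fixed one-second duration even though the output type declares 30 fps. A minimal sketch of deriving timestamps from the frame rate instead, in the 100-nanosecond units Media Foundation uses, could look like the following. The names `kHnsPerSecond`, `FrameTimeHns` and `FrameDurationHns` are hypothetical helpers, not part of any SDK, and this is not claimed to be the cause of the problem described below.

```cpp
#include <cassert>
#include <cstdint>

// Media Foundation sample times and durations are in 100-nanosecond units.
constexpr std::int64_t kHnsPerSecond = 10'000'000;

// Start time of frame `index` at `fps` frames per second, in 100-ns units.
// Multiplying before dividing keeps rounding error from accumulating.
inline std::int64_t FrameTimeHns(std::int64_t index, std::int64_t fps)
{
    assert(fps > 0 && index >= 0);
    return index * kHnsPerSecond / fps;
}

// Duration of frame `index`, chosen so consecutive frames tile without gaps.
inline std::int64_t FrameDurationHns(std::int64_t index, std::int64_t fps)
{
    return FrameTimeHns(index + 1, fps) - FrameTimeHns(index, fps);
}
```

In the loop these would be used as `s->SetSampleTime(FrameTimeHns(i, fps))` and `s->SetSampleDuration(FrameDurationHns(i, fps))`.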

The problems start with the call to Finalize that ends the writing. Up to that point, everything seems to work normally. Note that I have tested the NVIDIA IMFTransform I've created with input frames, and it encodes them and outputs them correctly as raw data.

When I call Finalize and the type is MFVideoFormat_H264, the call succeeds. However, the generated MP4 plays weirdly:

[Image: 199244-image.png]

When the output is MFVideoFormat_HEVC, Finalize fails with 0xC00D4A45 (MF_E_SINK_HEADERS_NOT_FOUND): "Sink could not create valid output file because required headers were not provided to the sink."

I've also tried converting the raw .h264 file I'm saving to MP4 with ffmpeg, and this works. The generated MP4 plays correctly.

Adding an MF_MT_MPEG_SEQUENCE_HEADER didn't help (besides, I think this is only needed for H.264):

const char* bl4 = "\x00\x00\x00\x01\x67\x42\xC0\x28\x95\xB0\x1E\x00\x89\xF9\x70\x16\xC8\x00\x00\x03\x00\x08\x00\x00\x03\x01\xE0\x6D\x04\x42\x37\x00\x00\x00\x01\x68\xCA\x8F\x20";
mt2->SetBlob(MF_MT_MPEG_SEQUENCE_HEADER, (UINT8*)bl4, 39);
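
As a sanity check on a blob like this, a small sketch that scans an Annex B buffer for start codes and reports the H.264 NAL unit types it contains may help. `H264NalTypes` is a hypothetical helper written for illustration. It assumes H.264 framing only, where the type is the low 5 bits of the first byte after the start code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the H.264 nal_unit_type of every NAL unit found in an Annex B
// buffer. A 4-byte start code (00 00 00 01) ends in the 3-byte pattern
// 00 00 01, so scanning for the 3-byte form covers both.
inline std::vector<int> H264NalTypes(const std::uint8_t* data, std::size_t size)
{
    assert(data != nullptr || size == 0);
    std::vector<int> types;
    for (std::size_t i = 0; i + 3 < size; ++i)
    {
        if (data[i] == 0 && data[i + 1] == 0 && data[i + 2] == 1)
        {
            // Low 5 bits of the NAL header byte: 7 = SPS, 8 = PPS.
            types.push_back(data[i + 3] & 0x1F);
            i += 2; // the loop's ++i then lands on the NAL header byte
        }
    }
    return types;
}
```

For the 39-byte blob above, this returns 7 and 8, that is, one SPS followed by one PPS. HEVC uses a different NAL header layout (the type is `(byte >> 1) & 0x3F`, with VPS, SPS and PPS as types 32, 33 and 34), so an H.264 header like this one would not describe an HEVC stream.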

What do you make of all that?
:)

Windows API - Win32

1 answer

  1. Michael Chourdakis 66 Reputation points
    2024-03-20T11:15:11.68+00:00

    Yes, I was finally able to create a Media Foundation-compatible HDR10 encoder.
