Hello,
I am developing a command-line application that must fully support Unicode.
To achieve this, the application
a) runs in modern "Windows Terminal"
b) uses ReadConsoleW / WriteConsoleW instead of the C and C++ standard library functions.
After ensuring b) for all code paths, everything works fine except for one thing:
I use ReadConsoleW to read an input line from the user. ReadConsoleW has some handy features that I want to use, such as a built-in history (arrow up/down) and proper handling of cursor movement, Backspace, Del, and so on.
Unfortunately, it struggles with the automatically echoed user input.
It seems that Unicode characters encoded as a UTF-16 surrogate pair (that is, two 16-bit UTF-16 code units) cannot be displayed by that automatic echo.
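To make "surrogate pair" concrete, here is a minimal, portable sketch of how a code point above U+FFFF is split into two UTF-16 code units. The helper name `to_surrogate_pair` is made up for this illustration and is not part of the Windows API.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical helper: split a code point outside the first Unicode plane
// into its UTF-16 high and low surrogate.
std::pair<char16_t, char16_t> to_surrogate_pair( char32_t cp )
{
    // Only code points U+10000..U+10FFFF are encoded as surrogate pairs.
    assert( cp >= 0x10000 && cp <= 0x10FFFF );
    const std::uint32_t v = static_cast<std::uint32_t>( cp ) - 0x10000;    // 20 bits remain
    const char16_t high = static_cast<char16_t>( 0xD800 + ( v >> 10 ) );   // upper 10 bits
    const char16_t low  = static_cast<char16_t>( 0xDC00 + ( v & 0x3FF ) ); // lower 10 bits
    return { high, low };
}
```

For example, 🚀 (U+1F680) becomes the two code units 0xD83D 0xDE80, and 🍀 (U+1F340) becomes 0xD83C 0xDF40. Both units have to reach the echo together to be rendered as one glyph.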
More details:
Call ReadConsoleW on the STD_INPUT_HANDLE with ENABLE_LINE_INPUT and ENABLE_ECHO_INPUT enabled (the default) to read user input in the Terminal. The echo it produces is garbage for all Unicode characters that are encoded as UTF-16 surrogate pairs (they need two 16-bit code units), e.g. 🚀 🍀 🔥.
It works if ENABLE_LINE_INPUT is removed, but then all the handy input features are gone as well (e.g. Backspace no longer deletes input, UP/DOWN ARROW no longer recalls the history, etc.).
Since ENABLE_LINE_INPUT is the default mode for STD_INPUT_HANDLE, I expect it to work with the full set of Unicode characters and not only with the first Unicode plane.
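To illustrate what would have to be reimplemented by hand without ENABLE_LINE_INPUT, here is a rough sketch of just the Backspace part: removing the last character from an in-memory line buffer without tearing a surrogate pair apart. The buffer type and the name `erase_last_code_point` are assumptions for this example. It is portable C++ and does not touch the console. It also ignores grapheme clusters (combining marks, ZWJ sequences), which a real line editor would need to handle too.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

bool is_high_surrogate( char16_t c ) { return c >= 0xD800 && c <= 0xDBFF; }
bool is_low_surrogate( char16_t c )  { return c >= 0xDC00 && c <= 0xDFFF; }

// Hypothetical helper: erase the last code point of the line and
// return how many UTF-16 code units were removed (0, 1 or 2).
std::size_t erase_last_code_point( std::u16string &line )
{
    if( line.empty() ) {
        return 0;
    }
    const std::size_t len = line.size();
    std::size_t n = 1;
    // A well-formed pair is a high surrogate followed by a low surrogate.
    if( len >= 2 && is_low_surrogate( line[len - 1] ) && is_high_surrogate( line[len - 2] ) ) {
        n = 2;
    }
    assert( n <= len );
    line.erase( len - n );
    return n;
}
```

History, cursor movement and Del would each need the same kind of care, which is why I would prefer to keep the built-in line input.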
Is there a way to make it work?
Or is there an alternative way to get the handy input features that are available when ENABLE_LINE_INPUT is set?
Do you consider this a bug?
Here is a screenshot:
Here is a minimal example program:
#include <cstdlib> // EXIT_SUCCESS, EXIT_FAILURE
#include <Windows.h>
int wmain( int argc, wchar_t **argv )
{
    // the prompt is 15 UTF-16 code units long
    ::WriteConsoleW( ::GetStdHandle( STD_OUTPUT_HANDLE ), L"Type a string: ", 15, 0x0, 0x0 );
    HANDLE h = ::GetStdHandle( STD_INPUT_HANDLE );
    DWORD mode = 0;
    ::GetConsoleMode( h, &mode );
    // both flags are already set by default; they are only named here explicitly.
    ::SetConsoleMode( h, mode | (ENABLE_LINE_INPUT | ENABLE_ECHO_INPUT) );
    wchar_t wbuf[128] = {}; // zero-initialized
    DWORD read = 0;
    // the automatically produced echo is garbage for UTF-16 surrogate pairs.
    // reserve 2 elements: one for a trailing low surrogate, one for the terminator.
    if( !::ReadConsoleW( h, wbuf, ARRAYSIZE( wbuf ) - 2, &read, 0x0 ) ) {
        return EXIT_FAILURE;
    }
    if( read > 0 && IS_HIGH_SURROGATE( wbuf[read - 1] ) ) {
        // the buffer ended in the middle of a pair: try to read the low surrogate
        DWORD extra = 0;
        if( ::ReadConsoleW( h, wbuf + read, 1, &extra, NULL ) && extra == 1 ) {
            ++read;
        }
    }
    wbuf[read] = L'\0'; // ensure zero termination.
    // output it via WriteConsoleW (this works correctly)
    ::WriteConsoleW( ::GetStdHandle( STD_OUTPUT_HANDLE ), wbuf, read, 0x0, 0x0 );
    ::SetConsoleMode( h, mode ); // restore the original mode
    return EXIT_SUCCESS;
}
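As a side note on the example: the count returned by ReadConsoleW is in UTF-16 code units, not in user-visible characters, so each of the emoji above counts as 2. The following portable sketch shows the difference. The name `count_code_points` is made up for this illustration, and `char16_t` stands in for the 16-bit `wchar_t` used on Windows.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: count code points in a UTF-16 buffer of `units` code units.
// A well-formed surrogate pair counts once; an unpaired surrogate counts as one.
std::size_t count_code_points( const char16_t *buf, std::size_t units )
{
    assert( buf != nullptr || units == 0 );
    std::size_t count = 0;
    for( std::size_t i = 0; i < units; ++i ) {
        const bool high = buf[i] >= 0xD800 && buf[i] <= 0xDBFF;
        const bool next_low = i + 1 < units
                           && buf[i + 1] >= 0xDC00 && buf[i + 1] <= 0xDFFF;
        if( high && next_low ) {
            ++i; // skip the low half of the pair
        }
        ++count;
    }
    return count;
}
```

With the input "🚀🍀x", the buffer holds 5 code units but only 3 code points.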
Thank you very much in advance!
Kind Regards,
Florian Thake