Read UNICODE by std::wifstream

Flaviu_ 771 Reputation points
2023-11-12T10:22:51.9866667+00:00

I have an HTML file with the following text:

ânii așteaptă cu nerăbdare să această Până

How can I correctly read this with:

	std::wifstream ifsw("C:/Transfer/test9.html", std::ios::app);
	std::wstring htmlw((std::istreambuf_iterator<wchar_t>(ifsw)),
		(std::istreambuf_iterator<wchar_t>()));
	std::wcout << htmlw;

But I see weird characters on the screen:

ânii așteaptă cu nerăbdare această Până

What did I do wrong?

C++

2 additional answers

  1. Flaviu_ 771 Reputation points
    2023-11-12T15:48:42.8566667+00:00

    I have also tried:

    	std::wstring s = L"așteaptă cu nerăbdare această";
    	std::string ss(s.begin(), s.end());
    	std::cout << ss.c_str();
    

    Even worse:

    a↓teapt♥ cu ner♥bdare aceast♥
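
    A likely explanation for this output: the iterator-range constructor `std::string ss(s.begin(), s.end())` does no encoding conversion. It copies each `wchar_t` into a `char`, which on typical implementations keeps only the low 8 bits. So `ș` (U+0219) becomes byte 0x19 and `ă` (U+0103) becomes byte 0x03, which an OEM console code page such as CP437 draws as `↓` and `♥`. The sketch below models that narrowing; `narrow_by_truncation` is a hypothetical name used only for illustration.

    ```cpp
    #include <string>

    // Hypothetical helper that models std::string ss(s.begin(), s.end()):
    // every wchar_t is narrowed to a char, so only the low 8 bits survive
    // on typical implementations. No character encoding is applied.
    std::string narrow_by_truncation(const std::wstring& w)
    {
        std::string out;
        out.reserve(w.size());
        for (wchar_t c : w)
            out.push_back(static_cast<char>(c));
        return out;
    }
    ```

    For example, `narrow_by_truncation(L"a\u0219\u0103")` yields three bytes: `'a'`, 0x19 and 0x03. A real conversion (for instance to UTF-8) is needed instead of a element-by-element copy.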


  2. RLWA32 36,711 Reputation points
    2023-11-12T17:05:25.9933333+00:00

    If I'm guessing correctly, you are trying to write a UTF-8 string to a conventional console, so you would need to change the code page used by the console to UTF-8.

    Try this -

    #define WIN32_LEAN_AND_MEAN
    #include <Windows.h>
    
    #define _ATL_CSTRING_EXPLICIT_CONSTRUCTORS      // some CString constructors will be explicit
    //#include <atlbase.h>
    #include <atlstr.h>
    
    
    #include <string>
    #include <iostream>
    
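    // Convert a UTF-16 wide string to UTF-8 so the bytes match the console
    // code page that main() selects with SetConsoleOutputCP(CP_UTF8).
    std::string ConvertWideToUtf8(const std::wstring& wstr)
    {
        if (wstr.empty())
            return std::string();
        const int len = static_cast<int>(wstr.length());
        // First call: ask how many bytes are needed (no terminating null, because an explicit length is passed).
        const int count = WideCharToMultiByte(CP_UTF8, 0, wstr.c_str(), len, NULL, 0, NULL, NULL);
        std::string str(count, 0);
        // Second call: pass the same length again so the buffer size matches the first call.
        WideCharToMultiByte(CP_UTF8, 0, wstr.c_str(), len, &str[0], count, NULL, NULL);
        return str;
    }
    
    int main()
    {
        CString s(L"așteaptă cu nerăbdare această");
        std::string ss = ConvertWideToUtf8(s.GetString());
        SetConsoleOutputCP(CP_UTF8);
        std::cout << ss << std::endl;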
        return 0;
    }
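
    For the original file-reading question, one possible approach is to skip `std::wifstream` and read the file as raw bytes, so the UTF-8 data reaches the console unchanged. This is only a sketch and it assumes the HTML file is saved as UTF-8, which the thread does not state. `read_file_bytes` and `strip_utf8_bom` are hypothetical helper names, not part of any library.

    ```cpp
    #include <fstream>
    #include <iterator>
    #include <string>

    // Hypothetical helper: read the whole file as raw bytes. Binary mode avoids
    // newline and locale translation, so UTF-8 bytes arrive unchanged.
    std::string read_file_bytes(const std::string& path)
    {
        std::ifstream in(path, std::ios::in | std::ios::binary);
        return std::string(std::istreambuf_iterator<char>(in),
                           std::istreambuf_iterator<char>());
    }

    // Hypothetical helper: drop a leading UTF-8 byte order mark (EF BB BF), if any.
    std::string strip_utf8_bom(const std::string& text)
    {
        if (text.compare(0, 3, "\xEF\xBB\xBF") == 0)
            return text.substr(3);
        return text;
    }
    ```

    On Windows, after `SetConsoleOutputCP(CP_UTF8)`, something like `std::cout << strip_utf8_bom(read_file_bytes("C:/Transfer/test9.html"));` should then print the text, provided the console font has the required glyphs. If a `std::wstring` is needed afterwards, `MultiByteToWideChar` with `CP_UTF8` can convert the bytes.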