Read Unicode with std::wifstream

Flaviu_ 1,031 Reputation points
2023-11-12T10:22:51.9866667+00:00

I have an HTML file containing the following text:

ânii așteaptă cu nerăbdare să această Până

How can I correctly read this with:

	std::wifstream ifsw("C:/Transfer/test9.html", std::ios::app);
	std::wstring htmlw((std::istreambuf_iterator<wchar_t>(ifsw)),
		(std::istreambuf_iterator<wchar_t>()));
	std::wcout << htmlw;

But I see strange characters on the screen:

ânii așteaptă cu nerăbdare această Până

What did I do wrong?

Developer technologies C++

Accepted answer
  1. David Lowndes 2,640 Reputation points MVP
    2023-11-12T16:12:11.2233333+00:00

    The documentation/code I referenced originally appears fine to me:
    [image not preserved]


2 additional answers

  1. Flaviu_ 1,031 Reputation points
    2023-11-12T15:48:42.8566667+00:00

    I have also tried:

    	std::wstring s = L"așteaptă cu nerăbdare această";
    	std::string ss(s.begin(), s.end());
    	std::cout << ss.c_str();
    

    Even worse:

    a↓teapt♥ cu ner♥bdare aceast♥
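
A note on the output above: copying a std::wstring into a std::string element by element narrows each wchar_t to a single char, which in practice keeps only its low byte. ș is U+0219 and ă is U+0103, so they become the bytes 0x19 and 0x03, which the Windows console typically draws as ↓ and ♥. A minimal, portable sketch of an actual wide-to-UTF-8 conversion follows. `WideToUtf8` is a hypothetical helper, not something from this thread, and it assumes the wide string holds valid UTF-16 (16-bit wchar_t) or UTF-32 (32-bit wchar_t).

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper (not from the thread): encode a wide string as UTF-8.
// Assumes valid UTF-16 when wchar_t is 16 bits (Windows) and valid UTF-32
// when wchar_t is 32 bits (most other platforms).
std::string WideToUtf8(const std::wstring& w)
{
    std::string out;
    for (std::size_t i = 0; i < w.size(); ++i) {
        std::uint32_t cp = static_cast<std::uint32_t>(w[i]);

        // With 16-bit wchar_t, combine a surrogate pair into one code point.
        if (sizeof(wchar_t) == 2 && cp >= 0xD800 && cp <= 0xDBFF && i + 1 < w.size()) {
            const std::uint32_t lo = static_cast<std::uint32_t>(w[i + 1]);
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
                ++i;
            }
        }

        // Emit 1 to 4 bytes depending on the size of the code point.
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```

For example, `WideToUtf8(L"a\u0219")` yields the three bytes 61 C8 99. Whether those bytes display correctly still depends on the console using a UTF-8 code page.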


  2. RLWA32 49,536 Reputation points
    2023-11-12T17:05:25.9933333+00:00

    If I'm guessing correctly, you are trying to write a UTF-8 string to a conventional console, so you would need to change the code page used by the console to UTF-8.

    Try this -

    #define WIN32_LEAN_AND_MEAN
    #include <Windows.h>
    
    #define _ATL_CSTRING_EXPLICIT_CONSTRUCTORS      // some CString constructors will be explicit
    //#include <atlbase.h>
    #include <atlstr.h>
    
    
    #include <string>
    #include <iostream>
    
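    // Convert a UTF-16 wide string to UTF-8, to match the UTF-8 console code page set in main().
    std::string ConvertWideToUtf8(const std::wstring& wstr)
    {
        if (wstr.empty())
            return std::string();
        const int len = static_cast<int>(wstr.length());
        // First call: query the required buffer size in bytes.
        const int count = WideCharToMultiByte(CP_UTF8, 0, wstr.c_str(), len, NULL, 0, NULL, NULL);
        std::string str(count, 0);
        // Second call: convert, passing the same length so the sizes match and no terminator is written.
        WideCharToMultiByte(CP_UTF8, 0, wstr.c_str(), len, &str[0], count, NULL, NULL);
        return str;
    }
    
    int main()
    {
        // Assumes the compiler reads this source file as UTF-8 (e.g. MSVC /utf-8 or a BOM).
        // CStringW keeps the string wide regardless of the project's character set.
        CStringW s(L"așteaptă cu nerăbdare această");
        std::string ss = ConvertWideToUtf8(s.GetString());
        SetConsoleOutputCP(CP_UTF8);
        std::cout << ss << std::endl;
        return 0;
    }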
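
On the reading side of the original question: "Ã¢" is what the UTF-8 bytes C3 A2 (â) look like when each byte is treated as a separate character, which suggests the file is UTF-8 and the stream widened it byte by byte instead of decoding it. One way around that, sketched here under the assumption that the file really is UTF-8, is to read the raw bytes with a narrow stream and decode them explicitly. `Utf8ToWide` and `ReadUtf8File` are hypothetical names, not part of any library, and the decoder does not fully validate malformed input. On Windows, `MultiByteToWideChar` with `CP_UTF8` could do the decoding step instead.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: decode UTF-8 bytes into a wide string.
// Assumes mostly well-formed input; bad lead bytes become U+FFFD,
// and truncated sequences are not rejected.
std::wstring Utf8ToWide(const std::string& bytes)
{
    std::wstring out;
    std::size_t i = 0;

    // Skip a UTF-8 byte order mark if the file starts with one.
    if (bytes.compare(0, 3, "\xEF\xBB\xBF") == 0)
        i = 3;

    while (i < bytes.size()) {
        const unsigned char b = static_cast<unsigned char>(bytes[i++]);
        std::uint32_t cp;
        int extra;  // number of continuation bytes that follow the lead byte
        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
        else                         { cp = 0xFFFD;   extra = 0; }

        // Fold in the low 6 bits of each continuation byte.
        for (; extra > 0 && i < bytes.size(); --extra, ++i)
            cp = (cp << 6) | (static_cast<unsigned char>(bytes[i]) & 0x3F);

        if (sizeof(wchar_t) == 2 && cp > 0xFFFF) {
            // 16-bit wchar_t (Windows): emit a UTF-16 surrogate pair.
            cp -= 0x10000;
            out += static_cast<wchar_t>(0xD800 + (cp >> 10));
            out += static_cast<wchar_t>(0xDC00 + (cp & 0x3FF));
        } else {
            out += static_cast<wchar_t>(cp);
        }
    }
    return out;
}

// Hypothetical helper: read a whole file as raw bytes, then decode as UTF-8.
// Returns an empty string if the file cannot be opened.
std::wstring ReadUtf8File(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    const std::string bytes((std::istreambuf_iterator<char>(in)),
                            std::istreambuf_iterator<char>());
    return Utf8ToWide(bytes);
}
```

Called as `ReadUtf8File("C:/Transfer/test9.html")`, this would return the decoded text as a std::wstring. Printing it correctly is a separate step and still requires the console to be set up for Unicode output.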
    
