Getting the wchar_t string length in C++

thebluetropics 1,046 Reputation points
2022-10-05T06:37:42.19+00:00

Win32 uses UTF-16 encoding for wchar_t strings.
Each character (code point) takes either 1 or 2 wchar_t code units, depending on the character.
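To make the 1-or-2 rule concrete, here is a minimal sketch of how a code point maps to UTF-16 code units. The names `utf16_units_for` and `encode_utf16` are made up for illustration, and `char16_t` stands in for the 16-bit `wchar_t` of Win32 so the sketch behaves the same on platforms where `wchar_t` is wider. It assumes the input is a valid code point (not a surrogate, at most U+10FFFF).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: number of UTF-16 code units one code point needs.
// Code points below U+10000 fit in one unit; the rest need a surrogate pair.
std::size_t utf16_units_for(char32_t cp) {
    assert(cp <= 0x10FFFF);  // assumption: caller passes a valid code point
    return cp < 0x10000 ? 1 : 2;
}

// Hypothetical helper: encode one code point as UTF-16 code units.
std::u16string encode_utf16(char32_t cp) {
    assert(cp <= 0x10FFFF);
    std::u16string out;
    if (cp < 0x10000) {
        out.push_back(static_cast<char16_t>(cp));
    } else {
        const char32_t v = cp - 0x10000;                            // 20-bit value
        out.push_back(static_cast<char16_t>(0xD800 + (v >> 10)));   // high surrogate
        out.push_back(static_cast<char16_t>(0xDC00 + (v & 0x3FF))); // low surrogate
    }
    return out;
}
```

With this sketch, U+79C1 (私) encodes to the single unit 0x79C1, while U+2070E encodes to the pair 0xD841 0xDF0E.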

However, neither wcslen() nor lstrlenW() counts a character that takes 2 wchar_t's as a single character.

This Japanese Kanji uses only 1 code unit (1 wchar_t):

   wchar_t text[] = L"私";  
   int length = lstrlenW(text); // Returns 1, as expected

However, when I use the Chinese character U+2070E, which takes 2 code units (2 wchar_t's), it is counted as two characters instead of one.
I can't put the code here for some reason, here is the link to the code.

So, I assume that lstrlenW() and wcslen() are counting the number of wchar_t's, not the number of characters :I
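The difference between the two counts can be sketched as follows. This is only an illustration under stated assumptions: `count_code_points` is a hypothetical helper, not a Win32 or CRT function, and `std::u16string` stands in for a 16-bit `wchar_t` string. It counts every code unit except low (trailing) surrogates, so a well-formed surrogate pair is counted once. It does not validate the input, so an unpaired low surrogate is simply skipped.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: count Unicode code points in a UTF-16 string.
// Every code point contributes exactly one unit that is NOT a low
// surrogate (0xDC00..0xDFFF), so counting those units counts code points.
std::size_t count_code_points(const std::u16string& s) {
    std::size_t n = 0;
    for (char16_t u : s) {
        if (u < 0xDC00 || u > 0xDFFF) {
            ++n;  // BMP character or high (leading) surrogate
        }
    }
    return n;
}
```

For `u"\U0002070E"`, `size()` reports 2 code units while `count_code_points` reports 1. Note that a code point is still not always what a user sees as one character: combining sequences span several code points, and this sketch counts each of them.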

Is there a way to get the correct character count of a wchar_t string in Win32 applications?

Windows development Windows API - Win32
Developer technologies C++

Accepted answer
  1. Xiaopo Yang - MSFT 12,731 Reputation points Microsoft External Staff
    2022-10-05T08:28:26.857+00:00

    As the Double-byte Character Sets documentation points out, use the _mbs versions of the functions. To get the length based on the locale, use _mbstrlen. See the example.
