Potential bug in NumberFormatInfo.NativeDigits for cultures "zh-Hans-HK" and "zh-Hans-MO"

Taylour 11 Reputation points
2022-09-07T16:12:59.3+00:00

Chinese, Japanese and Korean cultures use numbering systems for which simple digit substitution from Western (English) digits is not sufficient to convert numbers. I discovered that for the cultures "zh-Hans-HK" and "zh-Hans-MO", the NumberFormatInfo.NativeDigits array is filled with Chinese numerals instead of Western digits. This is reproducible in both .NET Framework and .NET Core.

using System.Globalization;

var nd1 = new CultureInfo("zh-Hans-HK").NumberFormat.NativeDigits; // incorrect: ["〇", "一", "二", "三", "四", "五", "六", "七", "八", "九"]
var nd2 = new CultureInfo("zh-Hans-MO").NumberFormat.NativeDigits; // incorrect: ["〇", "一", "二", "三", "四", "五", "六", "七", "八", "九"]
var nd3 = new CultureInfo("zh").NumberFormat.NativeDigits;         // correct:   ["0", "1", "2", "3", "4", "5", "6", "7", "8", "9"]
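
To check how widespread this is on a given machine, a rough survey of every installed Chinese culture might look like the sketch below. The class and method names (NativeDigitsSurvey, IsWestern, PrintChineseCultures) are made up for illustration; only CultureInfo.GetCultures and NumberFormatInfo.NativeDigits are framework APIs. The output depends on the culture data shipped with the OS or runtime, so no expected output is shown.

```csharp
using System;
using System.Globalization;
using System.Linq;

static class NativeDigitsSurvey
{
    // The default digits documented for NumberFormatInfo.NativeDigits.
    static readonly string[] Western = { "0", "1", "2", "3", "4", "5", "6", "7", "8", "9" };

    // Hypothetical helper: true when the array is exactly "0".."9".
    public static bool IsWestern(string[] digits) => digits.SequenceEqual(Western);

    // Prints one line per Chinese culture with the digits it reports.
    public static void PrintChineseCultures()
    {
        foreach (var culture in CultureInfo.GetCultures(CultureTypes.AllCultures))
        {
            if (culture.TwoLetterISOLanguageName != "zh")
                continue;

            string[] digits = culture.NumberFormat.NativeDigits;
            string shown = IsWestern(digits) ? "Western" : string.Join("", digits);
            Console.WriteLine($"{culture.Name,-12} {shown}");
        }
    }
}
```

Calling NativeDigitsSurvey.PrintChineseCultures() on both .NET Framework and .NET Core should make it easy to compare which cultures deviate from the default.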

C#

2 answers

  1. Jack J Jun 24,281 Reputation points Microsoft Vendor
    2022-09-08T08:19:42.39+00:00

    @Taylour, welcome to Microsoft Q&A. Based on my research, this is not a bug. Please read the Microsoft documentation for the NumberFormatInfo.NativeDigits property, which contains the following sentence:

    A string array that contains the native equivalent of the Western digits 0 through 9. The default is an array having the elements "0", "1", "2", "3", "4", "5", "6", "7", "8", and "9".

    Therefore, the property does not always return the array of Western (English) digits.

    I also ran a test with another culture, and it returned that culture's corresponding digit array.

    238909-image.png

    I hope this explanation helps.

    Best Regards,
    Jack



  2. Taylour 11 Reputation points
    2022-09-08T15:37:31.727+00:00

    Hi Jack,

    Thanks for your reply, but that is not quite how NumberFormatInfo.NativeDigits works.

    Your example with the locale "gu-IN" shows the Gujarati script equivalents of the Western digits. In this case simple digit substitution works, because the Indian number system is the same as the Western number system; only the characters for the native digits are different. For example, the Western number 23 can be converted to the native Gujarati format by writing the character for "2" ("૨") followed by the character for "3" ("૩"), giving ૨૩.
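
    To make the idea of simple digit substitution concrete, here is a minimal sketch. DigitSubstitution.Substitute is a hypothetical helper written for this post, not a framework API; it only swaps each ASCII digit for the matching entry of whatever digit array it is given.

```csharp
using System;
using System.Text;

static class DigitSubstitution
{
    // Hypothetical helper: replaces each ASCII digit '0'..'9' with
    // nativeDigits[digit]. Signs and separators pass through unchanged.
    public static string Substitute(string westernNumber, string[] nativeDigits)
    {
        if (nativeDigits == null || nativeDigits.Length != 10)
            throw new ArgumentException("Expected exactly ten digit strings.", nameof(nativeDigits));

        var sb = new StringBuilder();
        foreach (char c in westernNumber)
        {
            if (c >= '0' && c <= '9')
                sb.Append(nativeDigits[c - '0']);
            else
                sb.Append(c);
        }
        return sb.ToString();
    }
}
```

    Passing new CultureInfo("gu-IN").NumberFormat.NativeDigits as the second argument should turn "23" into ૨૩, assuming the culture reports the Gujarati digits shown in the screenshot. Passing the Chinese array from the question turns "11" into "一一", which is exactly the kind of result that substitution gets wrong for Chinese.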

    Chinese, Japanese, Korean and Hebrew use entirely different numbering systems, so simple digit substitution does not work. For these cultures, NumberFormatInfo.NativeDigits should be set to the Western equivalent: ["0", "1", "2", "3", "4", "5", "6", "7", "8", "9"]. This is the case for all relevant cultures except the two I mentioned ("zh-Hans-HK" and "zh-Hans-MO"), where there is a bug and the array is incorrectly set to the Chinese numerals from 0 to 9.

    If you look at this Microsoft documentation on Number Formatting, you will see the ellipsis character ( … ) following the example numbers for these special cultures. In Chinese, the number 10 uses the character "十", which is different from "1" ("一") + "0" ("〇"). Likewise, the Chinese numbering system has special characters for 100, 1000 and other large numbers. The number 11 in Chinese is "十一", which is the character for 10 followed by the character for 1; it is not the character for 1 written twice as in our Western system.
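
    As a rough illustration of why a lookup table of ten digits is not enough, here is a deliberately tiny sketch that handles only 0 through 99. ChineseNumeralSketch.ToChinese is a hypothetical name, and the code ignores the characters for 100, 1000 and larger units as well as formal/financial numerals, so treat it as a demonstration of the rule for "十" and nothing more.

```csharp
using System;

static class ChineseNumeralSketch
{
    static readonly string[] Digits = { "〇", "一", "二", "三", "四", "五", "六", "七", "八", "九" };

    // Hypothetical helper: converts 0..99 to Chinese numerals.
    public static string ToChinese(int n)
    {
        if (n < 0 || n > 99)
            throw new ArgumentOutOfRangeException(nameof(n), "This sketch only handles 0 through 99.");

        if (n < 10)
            return Digits[n];

        int tens = n / 10;
        int ones = n % 10;

        // 10..19 start with a bare "十"; 20 and above prefix the tens digit.
        string result = (tens > 1 ? Digits[tens] : "") + "十";

        // A trailing zero is not written: 20 is "二十", not "二十〇".
        if (ones > 0)
            result += Digits[ones];

        return result;
    }
}
```

    With this sketch, ToChinese(10) gives "十", ToChinese(11) gives "十一" and ToChinese(23) gives "二十三", none of which can be produced by substituting digits one at a time.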

    For more information on the Chinese numbering system, you can also look at the Wikipedia Page or this online number converter

    Since simple digit substitution does not work for the Chinese, Japanese, Korean and Hebrew numbering systems, the NativeDigits array should be set to the Western equivalent for those cultures. To convert Western numbers to Chinese correctly, you would need either to roll your own (potentially buggy) solution or to use an established library like ICU. ICU4C has a .NET wrapper, but it is not complete and currently does not expose the functionality to convert between numbering systems.

    Kind Regards,
    -Taylour