Globalize test data
After the environment has been set for globalized testing, your regular test cases must be run with special attention paid to potential globalization problems. The test data must match the test. Use a mix of scripts such as Latin, Cyrillic, Armenian, Georgian, Thai, and Indic languages.
The rest of this article describes Some particular areas of functionality might require special test data. These examples are just a few of the possibilities. A realistic test suite will cover more.
If the application converts the case of text, you can verify the functionality with scripts that do not have uppercase and lowercase symbols, such as Cyrillic, East Asian scripts, and Hebrew.
You can also use text where there are locale-specific rules for case conversion, such as Turkish. Turkish uses upper and lower case, dotted and dotless letter "I". Unlike other languages written in the Latin script, Turkish "I"s do not gain or lose their dots when changing case.
|Code point||Unicode Character Name||Letter||Opposite case (Turkish)|
|U+0049||LATIN CAPITAL LETTER I||
|U+0069||LATIN SMALL LETTER I||
|U+0130||LATIN CAPITAL LETTER I WITH DOT ABOVE||
|U+0131||LATIN SMALL LETTER DOTLESS I||
Another example of an area requiring functionality-specific test data is sorting, for which there are some specific examples of data you can use. These are only a few of the cases that you should prepare test data for. You should consider the coverage that you require for the target markets you expect to reach.
Under Turkish conventions, dotless "I" (the uppercase "I" or the lowercase "ı") precedes dotted "I" (the uppercase "İ" or lowercase "i").
In Turkish, the letters sort in the order
I ı İ i (U+0049, U+0131, U+0130, U+0069).
Note that this order is not the same as the order of the values of the letter's Unicode code points.
China has two different sort orders, one by pronunciation and one by stroke order. The results will vary based on the sort order the user chooses.
|Pronunciation Order||Unicode Value||PinYin||Stroke Order||Unicode Value||PinYin|
Text processing and storage
When planning your test, put greater importance on test cases that deal with the direct or indirect I/O operations on strings. You should also give special priority to tests involving text and formatted data (such as numbers, monetary values, date, and time). Direct input happens when a user enters text in the user interface. Indirect text input is text fetched from some external source, like a text file, database, or communication channel. Examples of indirect string input can host names, user names and email addresses, folder names, and so on. If your application deals with information of this kind, make sure this data is multilingual when your tests are running.
In a small number of cases the range of characters in the test input might be limited in accordance with certain design limitations. Set these tests in a globalized environment. See "Create the test environment." It might be hard to enter all of these test inputs manually if you do not know the languages in which you are preparing your test data. A simple Unicode text generator can be very helpful at this point.