Factoids for East Asian Languages

In this topic, East Asian languages are Japanese, Chinese (Simplified), Chinese (Traditional), and Korean. The formats within the factoids below are specific to each language's recognizer.

For example, the Telephone factoid differs in each language. Each factoid is also specific to a particular recognizer: only the Japanese Telephone factoid can be used with the Japanese recognizer. In addition to the factoids below, all languages use the factoids listed in Factoids Common Across Languages.

Note: The factoids for East Asian languages are implemented by specifying a list of acceptable Unicode characters. The factoids for western languages are implemented by using regular expressions that describe the expected input. This is because western languages are composed of letters that are combined to make words, whereas East Asian languages are character-based.
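The difference between the two approaches can be illustrated with a small Python sketch. It is not the recognizer's implementation: the function names and the regular expression are hypothetical, and only the code points (taken from the Percent factoid listed in this topic) come from the documentation. A character list constrains which characters may appear, while a regular expression also constrains their order.

```python
import re

# Code points of the Percent factoid as listed in this topic:
# U+0025 (%), U+002E (.), and U+0030 through U+0039 (digits).
PERCENT_CHARS = {chr(cp) for cp in [0x25, 0x2E, *range(0x30, 0x3A)]}


def allowed_by_char_list(text, allowed):
    """Return True when every character of text is in the allowed set."""
    return all(ch in allowed for ch in text)


# Hypothetical pattern for a percentage, shown only for contrast.
# It is an assumption, not a pattern taken from any recognizer.
PERCENT_PATTERN = re.compile(r"[0-9]+(\.[0-9]+)?%")


def allowed_by_pattern(text):
    """Return True when the whole text has the shape described by the pattern."""
    return PERCENT_PATTERN.fullmatch(text) is not None
```

In this sketch, "12.5%" passes both checks, whereas "%5" passes the character-list check but fails the pattern, because a list of acceptable characters says nothing about their arrangement.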

East Asian recognizers support combining up to ten factoids. These combinations use a logical OR operator, so the input can match any one of the factoids in the expression.
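The OR behavior can be modeled with a short Python sketch. The names `FACTOID_CHARS`, `MAX_COMBINED`, and `matches_any` are hypothetical and do not belong to any recognizer API. The character sets come from the PostalCode and UpperChar factoids listed in this topic, and the sketch assumes that "match" means every input character belongs to a single factoid's list.

```python
MAX_COMBINED = 10  # limit on combined factoids stated in this topic

# Character lists for two factoids from this topic.
FACTOID_CHARS = {
    # U+002D (hyphen-minus) and U+0030 through U+0039 (digits)
    "PostalCode": {chr(cp) for cp in [0x2D, *range(0x30, 0x3A)]},
    # U+0041 through U+005A (A to Z)
    "UpperChar": {chr(cp) for cp in range(0x41, 0x5B)},
}


def matches_any(text, factoid_names):
    """Return True when text fits at least one of the named factoids (logical OR)."""
    if len(factoid_names) > MAX_COMBINED:
        raise ValueError("at most ten factoids can be combined")
    return any(
        all(ch in FACTOID_CHARS[name] for ch in text)
        for name in factoid_names
    )
```

Under these assumptions, `matches_any("123-4567", ["PostalCode", "UpperChar"])` and `matches_any("ABC", ["PostalCode", "UpperChar"])` both return `True`, while `"A1"` is rejected because neither factoid accepts it on its own. Whether a real recognizer treats mixed input this way is not stated in this topic.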

Factoid: OneChar
Description: One character.
Unicode values: U+0020 through U+007E

Factoid: Percent
Description: Numbers with a percent symbol.
Unicode values: U+0025, U+002E, U+0030 through U+0039

Factoid: PostalCode
Description: Numerical postal codes.
Unicode values: U+002D, U+0030 through U+0039

Factoid: UpperChar
Description: Uppercase Latin script characters.
Unicode values: U+0041 through U+005A

The following topics show the formats supported for each factoid in Japanese, Chinese (Simplified), Chinese (Traditional), and Korean.