Not all custom locales are created equal.

If you're going to use the Locale Builder tool to create a custom locale, you have two options: you can create a brand new locale for a language-region combination that Microsoft does not currently support, or you can replace a locale that already exists on your machine with appropriately customized data. If you add support for an entirely new language-region combination, you'll be creating a supplemental locale, which supplements the existing Windows list. If you replace the data for an existing locale, you'll be creating a replacement locale, which takes the place of a locale that Windows already supports.

So how do you know which you really want?

A replacement locale is a good choice if you want to continue identifying the locale by its Windows name and LCID; the crucial property of a replacement locale that cannot be changed is its identifier (e.g. en-US, ii-CN, or ja-JP). This means that applications can call our APIs with existing Windows identifiers, and on your machine those calls will return the data that you have provided rather than the data that shipped with Windows. A few other properties cannot be changed in a replacement locale either. One of the most crucial of these is the code page assignments. We know that you wouldn't dream of developing using anything but Unicode :), but just in case you're still maintaining or running ANSI applications, we don't want those applications to break because code page assignments have changed.
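
As a rough illustration of what a replacement locale can and cannot change, here is a toy Python model. It is not the Windows API or the Locale Builder format: the `LocaleData` fields, the `make_replacement` helper, and the sample values are all hypothetical, chosen only to show that the name, LCID, and code page stay fixed while other data is overridden.

```python
from dataclasses import dataclass, replace

# Properties a replacement locale must keep, per the description above.
# (Field names are illustrative, not real Windows structures.)
LOCKED_FIELDS = {"name", "lcid", "ansi_code_page"}


@dataclass(frozen=True)
class LocaleData:
    name: str
    lcid: int
    ansi_code_page: int
    short_date: str
    currency_symbol: str


def make_replacement(shipped: LocaleData, **overrides) -> LocaleData:
    """Return a copy of `shipped` with customized data, refusing locked fields."""
    locked = LOCKED_FIELDS.intersection(overrides)
    if locked:
        raise ValueError(f"cannot change in a replacement locale: {sorted(locked)}")
    return replace(shipped, **overrides)


# Sample values for illustration only.
shipped_en_us = LocaleData("en-US", 0x0409, 1252, "M/d/yyyy", "$")
custom_en_us = make_replacement(shipped_en_us, short_date="yyyy-MM-dd")
```

An application that asks for en-US in this model still gets the same name, LCID, and code page; only the customized date format differs.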

Internally, we also try not to change sorting behavior for existing locales between major versions -- and if we discover a customer need for custom sorting behavior in the future, we'll have to think very hard about what to do here for replacement locales. When we change sorting behavior for existing locales -- say, if we discover that our collation isn't linguistically accurate -- applications that rely on consistent sorting results need to make changes to accommodate our updates (e.g. database applications that need to reindex whenever sorting behavior changes). So we want to keep this kind of update to a minimum. (We are introducing APIs that will let applications know which version of our sorting behavior is on a particular machine, which will help mitigate this, but that's fodder for another post.)
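
The reindexing concern can be sketched roughly as follows. This is a minimal Python illustration, not the versioning API mentioned above: `current_sort_version` is a hypothetical stand-in for whatever version the platform reports, the version numbers are made up, and `str.casefold` merely stands in for a locale-aware sort key.

```python
from dataclasses import dataclass, field


@dataclass
class SortedIndex:
    """An index that remembers which sorting version produced its order."""
    sort_version: int
    keys: list = field(default_factory=list)


def ensure_fresh(index, current_sort_version, sort_key):
    """Rebuild the index if the platform's sorting version has changed.

    Returns True when a rebuild happened, False when the index was current.
    """
    if index.sort_version == current_sort_version:
        return False
    index.keys.sort(key=sort_key)
    index.sort_version = current_sort_version
    return True


# Hypothetical version numbers, for illustration only.
index = SortedIndex(sort_version=1, keys=["apple", "Banana", "cherry"])
rebuilt = ensure_fresh(index, current_sort_version=2, sort_key=str.casefold)
```

Storing the version alongside the index is what lets an application skip the rebuild when nothing has changed.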

Supplemental locales require you to choose identifier names that do not overlap with the names of locales already on your machine. We recommend that you choose identifiers that conform as much as possible to the IETF language tag standard; one reason that our names generally follow this model is that people recognize and use them. If you want your custom locale to be readily accessible to applications, using a standard identifier name is a good start. Once you've chosen a new name, you can customize code page assignments or anything else, since we don't run the same kind of risk of breaking applications that rely on long-standing settings.
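
The two naming checks described above can be sketched in Python. This is only an approximation under stated assumptions: the pattern covers the common language[-Script][-REGION] shape rather than the full IETF grammar, names are assumed to compare case-insensitively, and `check_supplemental_name` and the `installed` list are hypothetical rather than anything Locale Builder exposes.

```python
import re

# Simplified shape: language[-Script][-REGION]. This covers common tags such as
# "en-US" or "sr-Latn-RS" but is NOT a full IETF language tag parser.
TAG_SHAPE = re.compile(r"^[A-Za-z]{2,3}(-[A-Za-z]{4})?(-([A-Za-z]{2}|[0-9]{3}))?$")


def check_supplemental_name(candidate, existing_names):
    """Return a list of problems with a proposed supplemental locale name."""
    problems = []
    if not TAG_SHAPE.match(candidate):
        problems.append("does not look like a language-region tag")
    # Assumption: compare case-insensitively so "EN-us" collides with "en-US".
    if candidate.lower() in {name.lower() for name in existing_names}:
        problems.append("overlaps with a locale already on the machine")
    return problems


# Hypothetical sample of installed locales, for illustration only.
installed = ["en-US", "ii-CN", "ja-JP"]
print(check_supplemental_name("en-US", installed))
print(check_supplemental_name("fj-FJ", installed))
```

An empty list means the candidate passed both checks; a name that collides with an existing locale would instead be a candidate for a replacement locale.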

In the end, whether you choose to create a supplemental locale or a replacement locale depends on your needs; it is just as easy to create, share, and install either of them.
