How to tell if the collation version changed.

I added this to the MSDN wiki for collation, but I'll blog about it here too.

Occasionally we change the sorting behavior because new code points are added to Unicode, or we find better data, or we made a mistake (never!), or whatever.  Unfortunately, if you built an index (say, for a binary search) under the old order, a lookup with CompareStringEx might not find what you're looking for once the sort (collation) order has changed.

To help work around that we provide the GetNLSVersionEx() function, which tells you the version of the sort you're using.  Then when you use the data you can check that the sort hasn't changed.  If it has, then you need to reindex your data.

To prepare for this step you need to remember what version your data is indexed with, so when indexing:

  1. Use GetNLSVersionEx() to retrieve an NLSVERSIONINFOEX structure when doing the original indexing of your data.
  2. Store the following properties with your index to identify the version:
  • NLSVERSIONINFOEX.dwNLSVersion - This specifies the version of the sorting table you're using.
  • NLSVERSIONINFOEX.guidCustomVersion - This is a GUID specifying the locale-specific behavior for the locale you requested.
    • Note: Prior to Windows 8 this wasn't used.
  • Note: dwEffectiveId is deprecated in Windows 8.
  • Note: dwDefinedVersion is deprecated in Windows 8.
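
Here's a rough sketch of what storing those two properties might look like. The StoredSortVersion record, the 20-byte layout, and all the helper names are made up for illustration. Only GetNLSVersionEx(), COMPARE_STRING, and the NLSVERSIONINFOEX fields come from Windows. The capture function only compiles on Windows; the two byte helpers are plain C, so you can keep the record in whatever header your index file already has.

```c
#include <stdint.h>
#include <string.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical on-disk size: 4 bytes of version + 16 bytes of GUID. */
#define SORT_VERSION_BYTES 20

/* Hypothetical record kept alongside the index (not a Windows type). */
typedef struct {
    uint32_t nls_version;      /* copy of NLSVERSIONINFOEX.dwNLSVersion */
    uint8_t  custom_guid[16];  /* raw bytes of NLSVERSIONINFOEX.guidCustomVersion */
} StoredSortVersion;

#ifdef _WIN32
/* Asks Windows for the sort version of locale_name (for example L"en-US").
   Returns 1 on success, 0 if GetNLSVersionEx() fails. */
int capture_sort_version(const wchar_t *locale_name, StoredSortVersion *out)
{
    NLSVERSIONINFOEX info;
    memset(&info, 0, sizeof info);
    info.dwNLSVersionInfoSize = sizeof info;  /* must be set before the call */
    if (!GetNLSVersionEx(COMPARE_STRING, locale_name, &info))
        return 0;
    out->nls_version = info.dwNLSVersion;
    memcpy(out->custom_guid, &info.guidCustomVersion, sizeof out->custom_guid);
    return 1;
}
#endif

/* Writes the record into a SORT_VERSION_BYTES buffer (version little-endian). */
void sort_version_to_bytes(const StoredSortVersion *v, uint8_t *out)
{
    out[0] = (uint8_t)(v->nls_version & 0xFFu);
    out[1] = (uint8_t)((v->nls_version >> 8) & 0xFFu);
    out[2] = (uint8_t)((v->nls_version >> 16) & 0xFFu);
    out[3] = (uint8_t)((v->nls_version >> 24) & 0xFFu);
    memcpy(out + 4, v->custom_guid, 16);
}

/* Reads the record back from a SORT_VERSION_BYTES buffer. */
void sort_version_from_bytes(const uint8_t *in, StoredSortVersion *v)
{
    v->nls_version = (uint32_t)in[0]
                   | ((uint32_t)in[1] << 8)
                   | ((uint32_t)in[2] << 16)
                   | ((uint32_t)in[3] << 24);
    memcpy(v->custom_guid, in + 4, 16);
}
```

The deprecated dwDefinedVersion and dwEffectiveId fields are deliberately left out of the record.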

Then when you use your data (probably just the first time when you run your app, since people are unlikely to upgrade Windows while your app is running :)):

  1. When using the index, call GetNLSVersionEx() to get the current sort version and compare it with the version you stored with your index.
  2. If either of the stored properties has changed, the sorting data you're using could return different results and any indexing you have may fail to find records.

So if the versions are different, then you need to reindex before you try to use that index.
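
The check itself is tiny. Here's a minimal sketch, assuming you've already copied dwNLSVersion and guidCustomVersion into a small record of your own, once from the index and once from a fresh GetNLSVersionEx() call. The SortVersion type and index_needs_rebuild() are hypothetical names, not Windows APIs.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical record holding the two fields worth comparing. */
typedef struct {
    uint32_t nls_version;      /* NLSVERSIONINFOEX.dwNLSVersion */
    uint8_t  custom_guid[16];  /* NLSVERSIONINFOEX.guidCustomVersion bytes */
} SortVersion;

/* Returns 1 if the index was built under a different sort version than the
   one in use now, 0 if both properties still match. */
int index_needs_rebuild(const SortVersion *stored, const SortVersion *current)
{
    if (stored->nls_version != current->nls_version)
        return 1;
    return memcmp(stored->custom_guid, current->custom_guid,
                  sizeof stored->custom_guid) != 0;
}
```

This version treats any difference as a reason to reindex, which is the safe default.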

The specific fields of the NLSVERSIONINFOEX structure used for sort versioning are below.  Remember that when you ask for the version you pass in a locale name (e.g., en-US or fj-FJ), so the data isn't the same for all locales.

dwNLSVersion

The version number of the collation in the form 0xRRMMMMmm, where R is reserved, M is the major version, and m is the minor version.  The low "minor" byte is ignored when resolving requests to CompareStringEx, and indicates a small change.  Applications might be able to avoid a complete reindex when only the minor revision of the NLS sorting version changes; the minor value is still reported so that applications are notified of the change and can verify their index.
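
To make the 0xRRMMMMmm layout concrete, here's a small sketch that pulls the pieces apart with shifts and masks. The function names are made up for illustration, and treating a change in the reserved byte as "not just a minor change" is my own conservative assumption.

```c
#include <stdint.h>

/* Top byte (RR): reserved. */
uint32_t nls_reserved(uint32_t version) { return version >> 24; }

/* Middle two bytes (MMMM): major version. */
uint32_t nls_major(uint32_t version) { return (version >> 8) & 0xFFFFu; }

/* Low byte (mm): minor version. */
uint32_t nls_minor(uint32_t version) { return version & 0xFFu; }

/* Returns 1 when the two versions differ only in the low minor byte, so a
   lighter-weight index verification might be enough. Returns 0 when they
   are identical or when anything above the minor byte differs. */
int nls_only_minor_changed(uint32_t stored, uint32_t current)
{
    return stored != current && (stored >> 8) == (current >> 8);
}
```

For a made-up value such as 0x12345678, the reserved byte is 0x12, the major version is 0x3456, and the minor version is 0x78.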

dwDefinedVersion

Deprecated in Windows 8

dwEffectiveId

Deprecated in Windows 8

guidCustomVersion

A GUID that identifies the locale-specific tailoring applied to the sort for the given dwNLSVersion.  Different locales may share behavior in a particular version, so if you request the NLSVERSIONINFOEX for en-US and en-GB you may get the same guidCustomVersion for both locales.  However, de-DE and fr-FR sort a bit differently than en-US, so you will get a different guidCustomVersion for those locales.

Hope that helps, Shawn