Creating Portable Encoded Custom Dictionaries that Improve Handwriting Recognition Results

 

Ryoma Ito
Microsoft Corporation

January 2007

Applies to:
   Microsoft Tablet PC Platform SDK
   Windows Vista
   Handwriting Recognition
   Custom Dictionaries

Requirements

   Microsoft Visual Studio 2005
   Update for Microsoft Windows XP Tablet PC Edition Development Kit 1.7
   Microsoft Speech SDK version 5.1 or later
   Microsoft Windows XP, Microsoft Windows XP Tablet PC Edition, or Windows Vista

Summary: This white paper demonstrates how a developer can use Windows Vista to compile an encoded custom dictionary that is compatible with the Tablet PC Platform. It also provides guidelines for implementing this solution in managed code and discusses the associated caveats. Although support for East Asian languages is planned, this solution currently supports U.S. and U.K. English dictionaries only. (12 printed pages)

Contents

Introduction
Encoded Dictionary Solution
   Compiler
      Compiler Sample Code
   Installer
      Installer Sample Code
      Coding Instructions for the closeIPS() Method
      Coding Instructions for the modRegData() Method
      Coding Instructions for the cleanUp() Method
Conclusion

Introduction

The Tablet PC uses three types of dictionaries to increase recognition accuracy. The System Dictionary, which is included in the Handwriting Recognizer, contains all the commonly used words in a language. The Speech User Dictionary (hereafter referred to as the User Dictionary) is a customizable list accessible by all applications. It is empty by default. The Application Dictionaries, which are not installed by default, provide a list of application-specific words that are only accessible during application run time.

Thus far, Independent Software Vendors (ISVs) and other developers have leveraged User and Application dictionaries to enhance recognition accuracy in environments requiring domain-specific terminology. They have used the Tablet PC Input Panel, Dictionary Tool, and the Speech API (SAPI) to add domain-specific words to the User Dictionary. See Using Speech Dictionaries to Improve Handwriting Recognition Results for details. Furthermore, they have used the WordList property of the RecognizerContext class and the SetWordlist method of the AnalysisHintNode class to create custom application dictionaries.

Unfortunately, the aforementioned methods of dictionary manipulation limit the input used to a pure text source. Essentially, developers must read in a pure text file, iterate through all the words, and add those words to the user or application dictionaries. Although the method is effective, it's a major pain point for organizations or individuals making a business of distributing word lists. Word-list distributors consider their lists to be intellectual property and they do not wish to expose them in a non-encoded format. Thus, the remainder of this paper will describe how to develop and support scalable, encoded custom dictionaries.

Note   The encoded format shown in this white paper is simply the usual runtime binary format used by the handwriting recognizer; no encryption or additional encoding occurs.

Encoded Dictionary Solution

This solution comprises a compiler and an installer. The first part of this article describes how to create a portable encoded dictionary compiler. The second part of the article describes how to deploy custom dictionaries on clients' computers.

Compiler

To understand the Compiler, you must first understand a new Windows Vista feature for handwriting recognition on the Tablet PC, called the Input Personalization System (IPS). One function of IPS is to collect text and ink data from the user. For example, IPS collects words from the SAPI-accessible User and Application Dictionaries. IPS transfers the collected data to a Text Trainer, which is part of the Recognizer. The Text Trainer tunes the collected data and stores it in a private encoded file format called a blob.

Note   The Text Trainer creates and maintains a number of blobs, including an Application Lexicon blob, which contains all the words from the SAPI-accessible Application Dictionaries, and a User Lexicon blob, which contains all the words from the User Dictionary. Both blobs are components of the Handwriting Recognizer, and both are updated when new data is received from IPS, thereby improving handwriting recognition accuracy.

The purpose of the Compiler, which runs on the developer's computer, is to add custom words to the User Dictionary using SAPI. When the Compiler modifies the User Dictionary, IPS detects the changes and, through the Text Trainer, makes the appropriate updates to the User Lexicon blob. Thus, User Lexicon blobs can store proprietary dictionary files in an encoded, updateable format which is also appropriate for distribution.

Not only does the User Lexicon blob's format keep prying eyes from obtaining the list of words in the proprietary dictionary, its words can be leveraged by all applications requiring ink recognition. Figure 1 provides a high-level architectural layout for the encoded dictionary.

Figure 1. Relationships between ink applications, SAPI, and IPS

Note   The custom dictionary's data is stored in the User Lexicon blob, instead of the Application Lexicon blob, because the User Lexicon blob's update model is more suitable. Updates to the User Lexicon blob are accomplished by merging new words with the existing set; similarly, the deployment of a custom dictionary often requires an existing dictionary to be merged with a new one. In contrast, the Text Trainer updates the Application Lexicon blob by overwriting it.

Compiler Sample Code

In order to create a compiler:

  1. Create a new C# project in Visual Studio 2005.

  2. Create a reference to the Speech API. This is required to access the User Dictionary.

    • Click the Project menu, and then click Add Reference.
    • In the Add Reference dialog, on the COM tab, click Microsoft Speech Object Library in the Component Name column, and then click OK.
  3. Add the following lines of code at the very top of the source file, so that you do not have to write the complete namespace for each class.

    using SpeechLib;
    using System.Globalization;
    
  4. Create an addWord method. addWord() is called after form initialization is complete. The remainder of the Compiler code discussed in this section should be placed in addWord().

    private void addWord()
    {
    ...
    }
    
  5. Use the CultureInfo class, which contains culture-specific information such as the writing system, the calendar used, and the formatting for dates and sort strings, to retrieve the current locale ID (LCID).

    int currentLCID = CultureInfo.CurrentCulture.LCID;
    
  6. Instantiate the SpLexicon class, so that words may be added to the User Dictionary. SpLexicon, included in SAPI, provides standard methods that can be used to create, access, modify, and synchronize lexicons.

    SpLexiconClass customLex = new SpLexiconClass();
    
  7. Use the AddPronunciation method of the SpLexicon class to add a new word to the User Dictionary.

    customLex.AddPronunciation("newSampleWord",currentLCID,
    SpeechPartOfSpeech.SPSVerb,null);
    

    Note   The preceding code adds only a single word, newSampleWord, to the User Dictionary. In practice, this line of code should be placed inside a loop so that multiple words can be retrieved from a text file (or another external source) and added to the User Dictionary. Be aware that if a text file is used as an input source, each entry must be on a separate line, and the file must be saved using ANSI encoding.
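
The loop described in the preceding note might start from a helper like the following minimal sketch. WordListReader and ReadWords are hypothetical names that are not part of SAPI or the Tablet PC SDK. The sketch assumes the .NET Framework, where Encoding.Default is the system ANSI code page.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical helper: reads a word list with one entry per line.
static class WordListReader
{
    public static List<string> ReadWords(string path)
    {
        List<string> words = new List<string>();
        // Assumption: on the .NET Framework, Encoding.Default is the
        // system ANSI code page, matching the ANSI requirement above.
        foreach (string line in File.ReadAllLines(path, Encoding.Default))
        {
            string word = line.Trim();
            if (word.Length > 0)   // skip blank lines
                words.Add(word);
        }
        return words;
    }
}
```

Inside addWord(), each returned word could then be passed to customLex.AddPronunciation in place of the "newSampleWord" literal shown in step 7.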

The Compiler now contains all the required code. If the Compiler succeeds in adding any words to the User Dictionary, IPS will update the User Lexicon blob to reflect the change.

Note   To extend the capabilities of this application to remove words from the User Dictionary, use SpLexicon's RemovePronunciation method.

Compiler Output

The following registry key on the developer's computer contains a number of subkeys, including one that identifies the location of the custom dictionary.

HKEY_CURRENT_USER\Software\Microsoft\InputPersonalization\TrainedDataStore\

As Figure 2 illustrates, the TrainedDataStore registry key contains a list of all recognizers supported by IPS.

Figure 2. Depiction of the registry-key tree for path-location values

The registry keys for the U.S. and U.K. recognizers are the only ones of interest at this time, because they are the only ones fully functional with IPS.

{6D1087D7-61D2-495F-9293-5B7B1C3FCEAB} (for the US Recognizer)
{6DA087D7-61D2-495F-9293-5B7B1C3FCEAB} (for the UK Recognizer)

Let's focus on the US Recognizer for a moment. The highlighted subkey in Figure 2 is the subkey for the US Recognizer. It contains a string value identifying the US dictionary blob's location on the developer's computer. The complete registry-key name for the highlighted subkey should look like the following.

HKEY_CURRENT_USER\Software\Microsoft\InputPersonalization\
TrainedDataStore\{6D1087D7-61D2-495F-9293-5B7B1C3FCEAB}\4\

Notice that the string value {E03B7BD0-CEAC-43C3-9677-3F51908ECCD5} is located inside the "4" subkey under the US Recognizer. (Unfortunately, Figure 2 truncates it.) This string value's data identifies the location of the newly created English U.S. dictionary blob on the developer's computer. The string value's data should look like the following.

%USERPROFILE%\AppData\Local\microsoft\InputPersonalization\
TrainedDataStore\{6D1087D7-61D2-495F-9293-5B7B1C3FCEAB}\
{E03B7BD0-CEAC-43C3-9677-3F51908ECCD5}_4ac
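
To locate the blob on the developer's computer without browsing the registry by hand, the lookup above can be sketched as follows. BlobLocator and its members are hypothetical names. The sketch assumes that the U.K. recognizer stores its value under a "4" subkey as well, which this paper confirms only for the U.S. recognizer, and it runs only on Windows.

```csharp
using System;
using Microsoft.Win32;

// Hypothetical helper: finds the User Lexicon blob path for a recognizer.
static class BlobLocator
{
    // Recognizer GUIDs and value name as quoted in this paper.
    public const string UsRecognizer = "{6D1087D7-61D2-495F-9293-5B7B1C3FCEAB}";
    public const string UkRecognizer = "{6DA087D7-61D2-495F-9293-5B7B1C3FCEAB}";
    const string BlobValueName = "{E03B7BD0-CEAC-43C3-9677-3F51908ECCD5}";

    // Builds the subkey path relative to HKEY_CURRENT_USER.
    // Assumption: the "4" subkey applies to every recognizer.
    public static string SubKeyPath(string recognizerGuid)
    {
        return @"Software\Microsoft\InputPersonalization\TrainedDataStore\"
            + recognizerGuid + @"\4";
    }

    // Returns the expanded blob path, or null if the key or value is missing.
    public static string GetBlobPath(string recognizerGuid)
    {
        using (RegistryKey key =
            Registry.CurrentUser.OpenSubKey(SubKeyPath(recognizerGuid)))
        {
            if (key == null)
                return null;
            string raw = key.GetValue(BlobValueName) as string;
            // Expand %USERPROFILE% in case the value is stored unexpanded.
            return raw == null ? null : Environment.ExpandEnvironmentVariables(raw);
        }
    }
}
```

Calling BlobLocator.GetBlobPath(BlobLocator.UsRecognizer) after the Compiler has run should return the file to hand to the Installer.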

Installer

Although the following procedure can be completed manually, it should really be automated with a script.

The Installer script is responsible for deploying the User Lexicon blob on the client's computer. It runs from a fresh login in order to achieve a clean shutdown of any instance of IPS running on the client's computer. A fresh login reduces the potential for IPS shutdown conflicts between IPS-dependent applications. It also prevents client computers from updating the existing User Lexicon blob while the Installer is executing.

Note   This is a per-user installation, not a per-computer installation. Therefore, if a computer has two users, each user will have their own reference to the custom dictionary. Each computer user must run the Installer before they can access the custom dictionary, even if another user of the same computer has already run the Installer. Note also that User Lexicon blobs can only be deployed on client computers running Windows Vista.

The Installer performs the following tasks:

  1. Creates a mutex lock on the IPS process. This lock restricts IPS use during Installer execution and decreases the potential for data loss or corruption.
  2. Copies the developer-created User Lexicon blob onto the client computer. The specific whereabouts can be chosen freely by the developer.
  3. Modifies the path location value under the IPS registry key, configuring IPS to point to the developer-created User Lexicon blob copied to the user's system by the Installer.
  4. Sets the Lexicon Generation registry key data back to 0. When IPS restarts, this value will trigger the merge of existing words in the User Dictionary with the words contained in the developer-created blob.
  5. Removes the mutex lock on the IPS process, and then proceeds to restart IPS.

Figure 3. Sequence for merging the words from the User Dictionary with the new User Lexicon Blob

Note   As seen in Figure 3, merging does not occur until IPS is restarted.

When IPS restarts, it iterates through the following steps:

  1. IPS retrieves the Lexicon Generation value from the registry. IPS compares the Lexicon Generation value with the number of previous changes made in the User Dictionary.

  2. If IPS finds that the values differ, IPS merges the User Dictionary words with the User Lexicon blob that was copied onto the system by the Installer.

    Note   In the merge process, IPS creates a new User Lexicon blob containing all the words from the developer-created User Lexicon blob, and all the words from the User Dictionary. In two cases, the merge process may not succeed in entering all the words from the developer-created User Lexicon blob into the merged blob.

    One such case occurs when the user, prior to running the Installer, removes a word from the User Dictionary that is also found in the developer-created User Lexicon blob. When the Installer is run, IPS notices that the word was previously removed by the user, and restricts itself from adding that word into the merged blob.

    The other special case occurs when IPS notices that the number of changes made previously in the User Dictionary exceeds a threshold value of 25. In this situation, IPS creates a new blob that completely ignores all the words from the developer-created User Lexicon blob. In other words, in this situation, the resulting blob only contains words from the User Dictionary. There is currently no clean, programmatic way around this issue. One possible mitigation is to instruct custom-dictionary users to minimize use of the user-dictionary feature, and to opt into automatic learning for recognizers available starting with Windows Vista.

  3. Following the merge, IPS updates its pointer to the merged blob.
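
The two special cases above are easier to reason about with a toy model. The following sketch only illustrates the behavior as described in this paper and is not IPS's actual implementation. MergeModel, Merge, and all parameter names are hypothetical, and the model covers only the merge outcome, not the Lexicon Generation comparison.

```csharp
using System;
using System.Collections.Generic;

// Toy model of the merge outcome described above (not the real IPS code).
static class MergeModel
{
    // Threshold quoted in this paper.
    public const int ChangeThreshold = 25;

    public static List<string> Merge(
        IEnumerable<string> developerWords,   // words in the developer-created blob
        IEnumerable<string> userWords,        // words in the User Dictionary
        ICollection<string> removedByUser,    // words the user removed earlier
        int userDictionaryChanges)            // prior changes to the User Dictionary
    {
        List<string> merged = new List<string>(userWords);

        // Special case 2: too many prior changes, so the developer
        // blob is ignored and only User Dictionary words remain.
        if (userDictionaryChanges > ChangeThreshold)
            return merged;

        foreach (string word in developerWords)
        {
            // Special case 1: skip words the user previously removed.
            if (!removedByUser.Contains(word) && !merged.Contains(word))
                merged.Add(word);
        }
        return merged;
    }
}
```

Under this model, a user with more than 25 prior changes ends up with none of the developer's words, which is why the paper suggests limiting use of the user-dictionary feature.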

The dictionary installation is now complete. Its custom words may now be used by all application instances requiring recognition. You can confirm the installation of the custom dictionary by using the Tablet PC Input Panel to verify that the custom words get recognized correctly.

Installer Sample Code

In order to create an installer:

  1. Create a new C# project in Visual Studio 2005.

  2. Add the following lines of code at the very top of the source file, so that you do not have to write the complete namespace for each class.

    using System.IO;
    using System.Threading;
    using System.Diagnostics;
    using Microsoft.Win32;
    using System.Runtime.InteropServices;
    
  3. Define some global variables.

    WM_CLOSE contains a close action, and is used as a message to close the IPS window.

    private const int WM_CLOSE = 0x0010;
    

    INFINITE is the time-out interval passed to the WaitForSingleObject() method call. The Win32 value is 0xFFFFFFFF, which is -1 as a signed 32-bit integer, and it means that the wait never times out.

    private const int INFINITE = -1;
    

    s_wszMutexName stores the GUID required to synchronize IPS access through a system mutex.

    private const string s_wszMutexName = 
    "{7CDC5061-0425-42a0-AF4A-3480170B0F44}";
    

    fileLocation specifies the location for the developer-created User Lexicon blob, on the client's computer.

    private const string fileLocation = 
    "D:\\{E03B7BD0-CEAC-43C3-9677-3F51908ECCD5}_4ac";
    

    mInstance is a system mutex associated with the IPS process. It restricts IPS activities while the Installer executes.

    private static Mutex mInstance = null;
    
  4. Define DllImport declarations to import functionality from user32.dll and kernel32.dll. First, declare the PostMessage method, which is used to send a close message to the thread that created the IPS window.

    [DllImport("user32.dll")]
    public static extern bool PostMessage(int hwnd, uint wMsg, 
    int wParam, int lParam);
    
  5. Declare the FindWindow method, which is used to obtain the handle for the top-level IPS window.

    [DllImport("user32.dll")]
    public static extern int FindWindow(string lpClassName, 
    string lpWindowName);
    
  6. Declare the WaitForSingleObject method, which is used to delay installation until IPS is in a signaled state, or until the time-out interval elapses.

    [DllImport("kernel32.dll")]
    static extern int WaitForSingleObject(int hHandle, 
    int dwMilliseconds);
    
  7. Create three new methods to represent the major steps in the deployment process. After form initialization, closeIPS() will be called, followed by modRegData(), and then cleanUp().

    The closeIPS() method ensures that IPS is not running. It also puts a mutex lock on IPS, thereby preventing new instances of IPS from starting up during Installer execution.

    private bool closeIPS()
    {
       ...
    }
    

    The modRegData() method handles file distribution and registry updates. It copies the new User Lexicon blob onto the client computer; modifies the registry so that IPS points to the new User Lexicon blob at restart; and resets the Lexicon Generation value in the registry to 0.

    private bool modRegData()
    {
       ...
    }
    

    The cleanUp() method releases the mutex lock on IPS, and starts a new instance of IPS.

    private bool cleanUp()
    {
    ...
    }
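
The declarations in steps 4 through 6 pass window and kernel handles as int, which can truncate them in a 64-bit process. As a hedged alternative, the same three Win32 functions can be declared with pointer-sized handles. NativeMethods is a hypothetical class name, and the call sites shown in this paper would need matching changes (for example, comparing against IntPtr.Zero instead of 0).

```csharp
using System;
using System.Runtime.InteropServices;

// Sketch: pointer-sized P/Invoke declarations for the same Win32 functions.
static class NativeMethods
{
    // Returns IntPtr.Zero when no matching top-level window is found.
    [DllImport("user32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    internal static extern IntPtr FindWindow(string lpClassName, string lpWindowName);

    // Posts a message and returns immediately; false means posting failed.
    [DllImport("user32.dll", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    internal static extern bool PostMessage(IntPtr hWnd, uint msg,
        IntPtr wParam, IntPtr lParam);

    // Returns a WAIT_* code; 0xFFFFFFFF as the time-out means wait forever.
    [DllImport("kernel32.dll", SetLastError = true)]
    internal static extern uint WaitForSingleObject(IntPtr hHandle,
        uint dwMilliseconds);
}
```

With these declarations, the mutex handle from mInstance.SafeWaitHandle.DangerousGetHandle() can be passed directly, without the ToInt32() conversion.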
    

Coding Instructions for the closeIPS() Method

  1. Declare the following local variables:

    ipsMutexDNE stores the result of opening an existing IPS mutex.

    bool ipsMutexDNE = false;
    

    iHandle stores the result of finding the top-level IPS window.

    int iHandle = 0;
    

    closeResult stores the result of posting the WM_CLOSE message to the thread that created the IPS window.

    bool closeResult = false;
    

    waitResult stores the result of closing down the top-level IPS window.

    int waitResult = 0;
    
  2. Create exception handlers to deal with possible problems with IPS locking. If the Installer succeeds in opening an existing system-wide mutex using the OpenExisting method, it locks IPS, thereby preventing other instances of IPS from starting up. Otherwise, one of the following exceptions is thrown. WaitHandleCannotBeOpenedException is thrown if a system-wide mutex with a name matching the IPS name does not exist. UnauthorizedAccessException is thrown if a valid system mutex is found, but the Installer lacks the security permissions to access it.

    Note   If UnauthorizedAccessException is caught, the Installer will be unable to proceed.

    try
    {
       mInstance = Mutex.OpenExisting(s_wszMutexName);
    }
    catch (WaitHandleCannotBeOpenedException)
    {
       ipsMutexDNE = true;
    }
    catch (UnauthorizedAccessException)
    {
       return false;
    }
    
  3. If the WaitHandleCannotBeOpenedException is thrown, IPS is not locked and closeIPS() should create a new mutex instance. Assign this new instance the IPS-specific name, and assign initial ownership of the mutex to the Installer. The Installer can continue processing the modRegData() method from here.

    if (ipsMutexDNE)
    {
       mInstance = new Mutex(true, s_wszMutexName);
       return true;
    }
    
  4. If the OpenExisting() method in step 2 successfully retrieves the IPS system mutex, closeIPS() should locate the top-level IPS instance using the FindWindow() method. FindWindow() should search for a window name matching "SettingsManager Message Window," an IPS-specific window name. If FindWindow() returns '0', a failure has occurred and the Installer is unable to proceed.

    iHandle = dictInstall.FindWindow(null,
    "SettingsManager Message Window");
    if (iHandle == 0)
       return false;
    
  5. If a valid IPS handle is retrieved, post a close message to the top-level IPS window. (The WM_CLOSE message is the safest way to end IPS without corrupting data.) If PostMessage() fails, the Installer cannot proceed. PostMessage() failures are typically associated with an incorrect handle parameter. Because the PostMessage() method returns without waiting for the thread to process the message, its failure indicates that an error occurred while posting the message, not while attempting to close IPS.

    closeResult = dictInstall.PostMessage(iHandle, WM_CLOSE, 0, 0);
    if (!closeResult)
       return false;
    
  6. After the close message is posted, the Installer should wait until the IPS process actually terminates. If the WaitForSingleObject method returns '0', IPS has terminated successfully and flipped to a signaled state. Otherwise, the IPS process has hung or is processing something. The Installer may not execute the modRegData() method until WaitForSingleObject() returns '0'.

    waitResult = dictInstall.WaitForSingleObject
    (mInstance.SafeWaitHandle.DangerousGetHandle().ToInt32(), INFINITE);
    if (waitResult == 0)
       return true;
    else
       return false;
    

    If the closeIPS() method returns true, modRegData() is called.
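
Step 6 treats every nonzero result the same way. When diagnosing a failed installation, it can help to distinguish the documented Win32 return codes. The following sketch maps them to descriptions. WaitResult and Describe are hypothetical names, and the constants are expressed as signed integers to match the int return type declared earlier.

```csharp
using System;

// Sketch: interprets the value returned by WaitForSingleObject().
static class WaitResult
{
    public const int WAIT_OBJECT_0  = 0x00000000;
    public const int WAIT_ABANDONED = 0x00000080;
    public const int WAIT_TIMEOUT   = 0x00000102;
    public const int WAIT_FAILED    = -1;          // 0xFFFFFFFF as a signed int

    public static string Describe(int code)
    {
        switch (code)
        {
            case WAIT_OBJECT_0:
                return "signaled";
            case WAIT_ABANDONED:
                // The previous owner exited without releasing the mutex;
                // ownership passes to the caller.
                return "abandoned";
            case WAIT_TIMEOUT:
                return "timed out";
            case WAIT_FAILED:
                return "failed";
            default:
                return "unknown";
        }
    }
}
```

Because the Installer waits with INFINITE, a time-out should not occur here. Whether to proceed on an abandoned mutex is a judgment call that this paper does not address.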

Coding Instructions for the modRegData() Method

  1. Declare the following local variables:

    lexGenRegKey is a RegistryKey object used to modify the Lexicon Generation data in the IPS registry key.

    RegistryKey lexGenRegKey = Registry.CurrentUser;
    

    pathRegKey is a RegistryKey object used to modify the file-location value used by IPS to locate the User Lexicon blob.

    RegistryKey pathRegKey = Registry.CurrentUser;
    
  2. Use the Copy method to copy the developer-created User Lexicon blob onto the client computer. If any associated exceptions are caught, the Installer will not proceed.

    try
    {
       File.Copy(@"D:\file2CopyOntoClientMachine",fileLocation,true);
    }
    catch (Exception)
    {
       return false;
    }
    
  3. After the file is copied, update the pointer on the user's computer that identifies the original User Lexicon blob: it should now point to the updated User Lexicon blob.

    Access the HKEY_CURRENT_USER\Software\Microsoft\InputPersonalization registry key using the OpenSubKey method. A null value will be returned by the OpenSubKey() method if the registry key is not found.

    lexGenRegKey = 
    lexGenRegKey.OpenSubKey(@"SOFTWARE\Microsoft\InputPersonalization",
    true);
    if (lexGenRegKey == null)
       return false;
    

    Figure 4 shows where to find the Lexicon Generation value under the InputPersonalization registry key.

    Figure 4. The various values under the InputPersonalization registry key

    If the IPS registry key opens successfully, set the Lexicon Generation value back to '0'.

    lexGenRegKey.SetValue("Lexicon Generation", 0);
    

    Use the OpenSubKey() method to open the registry key that holds the {E03B7BD0-CEAC-43C3-9677-3F51908ECCD5} string value. This value's data identifies the file location of the User Lexicon blob.

    The particular recognizer used to create the dictionary blob (for example, the US Recognizer) determines the registry key accessed here. Refer to the previous "Compiler Output" subsection for details. This example assumes that the US Recognizer is used.

    pathRegKey = 
    pathRegKey.OpenSubKey(@"SOFTWARE\Microsoft\InputPersonalization\" +
    @"TrainedDataStore\{6D1087D7-61D2-495F-9293-5B7B1C3FCEAB}\4", true);
    if (pathRegKey == null)
       return false;
    

    Set the {E03B7BD0-CEAC-43C3-9677-3F51908ECCD5} value's data to the location of the developer-created User Lexicon blob, thereby configuring IPS to point to the updated file.

    pathRegKey.SetValue
    ("{E03B7BD0-CEAC-43C3-9677-3F51908ECCD5}",fileLocation);
    

    Close the registry keys so that registry changes are flushed to disk.

    lexGenRegKey.Close();
    pathRegKey.Close();
    return true;
    

    If the modRegData() method returns true, call the cleanUp() method.

Coding Instructions for the cleanUp() Method

  1. Remove the lock on IPS by closing the system mutex.

    mInstance.Close();
    
  2. Instantiate the ProcessStartInfo class to start a new instance of IPS. If an exception is caught during this process, the client must manually restart the computer, or alternatively restart IPS by executing the following: (%commonprogramfiles%\microsoft shared\ink\InputPersonalization.exe).

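    // Build the path in one piece; a verbatim string split across
    // lines would embed the line break in the path.
    ProcessStartInfo ipsInfo = new ProcessStartInfo(
       System.Environment.GetEnvironmentVariable("COMMONPROGRAMFILES") +
       @"\microsoft shared\ink\InputPersonalization.exe");
    try
    {
       Process.Start(ipsInfo);
    }
    catch (Exception)
    {
       return false;
    }
    return true;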
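
The three methods are meant to run in sequence, each one only if the previous one returned true. A minimal sketch of that control flow follows. InstallerFlow, Step, and Run are hypothetical names, and the three steps are passed in as delegates so that the sketch stays independent of the form class.

```csharp
using System;

// Sketch: chains the three Installer steps in the order described above.
static class InstallerFlow
{
    public delegate bool Step();

    // Returns true only if every step succeeds; stops at the first failure.
    public static bool Run(Step closeIps, Step modRegData, Step cleanUp)
    {
        if (!closeIps())
            return false;
        if (!modRegData())
            return false;
        return cleanUp();
    }
}
```

From the form, after initialization, this could be invoked as InstallerFlow.Run(closeIPS, modRegData, cleanUp). Note that, as in the paper's description, cleanUp() is skipped when modRegData() fails, so the mutex stays held until the Installer process exits.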
    

Conclusion

This paper outlines an approach to customizing handwriting recognition that enables you to deploy encoded custom dictionaries. Please note, however, that the encoded format is simply the usual runtime binary format used by the handwriting recognizer; no encryption or additional encoding occurs.