The usual path

Text Services Framework assumes that your text service follows a particular processing path. If your text service doesn't conform to these assumptions, your programming job will be more complicated (not impossible, just more complex). The text service samples on MSDN also follow these assumptions, but they aren't explicitly stated anywhere that I know of. I've mentioned some of these assumptions in previous articles, but I thought I'd bring them together in one post.

Text Services Framework makes the following assumptions:

  1. Your service must perform all changes to a context or range object within an edit session.  Text Services Framework enforces this assumption through the use of edit cookies.
  2. Your service should not assume that it is possible to request a synchronous edit session.  (I discussed this here.)
  3. Your service should track focus changes between applications and between controls within an application.  This means that your text service must install event sinks for ITfThreadFocusSink and ITfThreadMgrEventSink.
  4. Your text service should use compositions to handle partially formed input. 
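
The first two assumptions can be modeled without the Windows headers. What follows is only a sketch under stated assumptions: ToyContext, EditCookie, SessionResult and SetBusy are hypothetical names invented for illustration, not TSF types. They loosely mirror the shape of ITfContext::RequestEditSession, where you hand TSF a callback (ITfEditSession::DoEditSession) and only receive the edit cookie inside that callback, and where the session may be granted later rather than immediately.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-ins for illustration only; none of these are TSF types.
using EditCookie = unsigned;
constexpr EditCookie kNoCookie = 0;

enum class SessionResult { RanSynchronously, Queued };

class ToyContext {
public:
    using EditFn = std::function<void(ToyContext&, EditCookie)>;

    // Runs the session now if the document is free, otherwise queues it.
    // Callers must be prepared for either outcome (assumption 2).
    SessionResult RequestEditSession(EditFn fn) {
        if (busy_) {
            pending_.push(std::move(fn));
            return SessionResult::Queued;
        }
        Run(fn);
        return SessionResult::RanSynchronously;
    }

    // Every change must present the cookie of the active session (assumption 1).
    void InsertText(EditCookie ec, const std::string& s) {
        if (ec == kNoCookie || ec != active_) {
            throw std::logic_error("edit attempted outside an edit session");
        }
        text_ += s;
    }

    // Simulates the application holding or releasing its document lock.
    // Releasing the lock drains any queued sessions in order.
    void SetBusy(bool busy) {
        busy_ = busy;
        while (!busy_ && !pending_.empty()) {
            EditFn fn = std::move(pending_.front());
            pending_.pop();
            Run(fn);
        }
    }

    const std::string& Text() const { return text_; }

private:
    void Run(const EditFn& fn) {
        assert(active_ == kNoCookie);  // this sketch does not support nested sessions
        active_ = next_++;
        fn(*this, active_);
        active_ = kNoCookie;           // the cookie is worthless once the session ends
    }

    bool busy_ = false;
    EditCookie active_ = kNoCookie;
    EditCookie next_ = 1;
    std::queue<EditFn> pending_;
    std::string text_;
};
```

The point of the design is that the code performing the edit lives in the callback, so it behaves the same whether the session ran immediately or was queued. A text service written this way never needs to know which one happened.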

This last assumption is the big one.  It can cause problems for text services that aren't keyboard-related (speech, for example).

The problem is that TSF handles the (admittedly very difficult) job of interacting with applications that aren't TSF-aware entirely through compositions. Once you close the composition, TSF assumes that you're completely finished with that piece of input.

Unfortunately, it's hard to tell beforehand when you're done with a piece of dictation. SAPI will tell you when it has recognized a piece of text, but ideally you would like to be able to correct that text after you've dictated it. That requires leaving the composition open.

In an application that isn't TSF-aware, though, you need to close that composition as soon as you can (it's bad form to have large open compositions; most IMEs have compositions that are a few characters in size).

So there's a tradeoff here. Dictation in Windows Vista currently closes the composition as soon as the text is recognized. (In fact, it doesn't use compositions at all.) That works fine for TSF-aware applications, but causes problems with TSF-unaware applications. In particular, once you've dictated some text into a TSF-unaware application, you can't correct it by voice. That's why Windows Speech Recognition makes you confirm every dictation into a TSF-unaware application.
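
The tradeoff can be shown with a toy model. This is a sketch only: ToyComposition, ClosePolicy and Dictate are hypothetical names, and the code makes no claim about how Windows Speech Recognition or TSF is actually implemented. It captures just the one rule described above, that text can be revised only while its composition is still open.

```cpp
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical model, not a TSF interface.
class ToyComposition {
public:
    explicit ToyComposition(std::string recognized) : text_(std::move(recognized)) {}

    // Correction is possible only while the composition is open.
    void Correct(const std::string& replacement) {
        if (!open_) {
            throw std::logic_error("composition already closed");
        }
        text_ = replacement;
    }

    // Closing tells the framework that this piece of input is finished.
    void Close() { open_ = false; }

    bool IsOpen() const { return open_; }
    const std::string& Text() const { return text_; }

private:
    std::string text_;
    bool open_ = true;
};

// The two ends of the tradeoff.
enum class ClosePolicy { CloseOnRecognition, KeepOpenForCorrection };

// Called when the recognizer reports a finished phrase.
ToyComposition Dictate(const std::string& recognized, ClosePolicy policy) {
    ToyComposition composition(recognized);
    if (policy == ClosePolicy::CloseOnRecognition) {
        composition.Close();  // small, short-lived composition; no later correction
    }
    return composition;
}
```

With KeepOpenForCorrection a misrecognized phrase can still be fixed, at the cost of a composition that stays open and keeps growing. With CloseOnRecognition the composition is short-lived, but a later Correct call fails.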