Speech platforms

Since a few different speech technologies were mentioned in the comments, I thought it might be useful to summarize the speech platform technology we currently provide.

Windows

  • Windows XP, Windows XP Tablet PC Edition, and Windows Server 2003 all include:
    • SAPI 5.1: the COM API for use by speech applications, speech synthesis engines, and speech recognition engines.
    • Microsoft Sam: a speech synthesis engine.
  • Tablet also includes a speech recognition engine for English, Japanese, or Chinese (Simplified or Traditional).

SAPI 5.1 SDK

  • The documentation, headers, etc., for building SAPI 5 apps and engines.
  • The SAPI 5 binaries, for pre-XP versions of Windows.
  • Two more speech synthesis engines: Mike and Mary.
  • Speech recognition engines for English, Japanese, and Chinese (Simplified).
  • The best way to learn more is to install the SDK and take a look at the overview topics and whitepapers.
  • We also have SAPI 4.0, which is an older API. I recommend you use SAPI 5, but SAPI 4 is there for those apps that need it.

Speech Server

  • A server for running interactive voice response (IVR) applications (e.g., a customer service app on a 1-800 number).
  • The linked web site has a lot of info about this.

Speech Application SDK

  • The SDK for building Speech Server apps. Compared to the SAPI 5.1 SDK, the SASDK has a higher level of abstraction and is focused on IVR apps. If you want to build an IVR app, the SASDK and Speech Server are the products you should use.