Compliance Tests (SAPI 5.3)

Microsoft Speech API 5.3

Compliance Tests



1         Contents

1      Contents

2      Table of Tables

3      Compliance Testing Overview

3.1       SAPI Compliance Required Tests

3.2       SAPI Compliance Feature List Tests

3.3       Minimum Requirements

4      Using the compliance testing tool

5      Compliance Tests

5.1       Test Result Log

6      Compliance Testing Configuration Options

6.1       SAPI 5.0 Compliance Testing Application Toolbar

6.2       SAPI 5.0 Compliance Testing Application Menu Choices

6.3       SAPI 5.0 Compliance Testing Logging Options

6.4       SAPI 5.0 Compliance Testing Run Options

6.5       SAPI 5.0 Compliance Test Selection Options

7      SAPI Compliance: SR

7.1       Required Tests

7.2       Feature Tests

7.3       SR Sample Engine

7.4       Compliance Test Customization

7.5       Multilingual Support

7.6       OS Language Incompatibility

8      SAPI Compliance: TTS

8.1       Required Tests

8.2       Feature Tests

8.3       TTS Sample Engine

8.4       Multilingual Support

8.5       OS Language Incompatibility

 

2         Table of Tables

Table 1: Events Compliance Test

Table 2: Lexicon Compliance Test

Table 3: Command and Control Compliance Test

Table 4: Required Compliance Tests

Table 5: Events Feature Compliance Test

Table 6: Grammar Feature Compliance Test

Table 7: Feature Compliance Tests

Table 8: Sample Engine Required Compliance Test results

Table 9: Sample Engine Feature Compliance Test results

Table 10: Strings to be localized

Table 12: Speak Flag Tests

Table 13: Speak Tests

Table 14: Lexicon Tests

Table 15: SAPI XML tests

Table 16: Events Tests

Table 17: Sample Engine Required Test Results

Table 18: Sample Engine Feature List Test Results

Table 19: Strings to be localized for compliance tests

Table 21: Required Compliance Tests Failed

Table 22: Feature Compliance Tests Not Supported

 

 

3         Compliance Testing Overview

This paper, directed toward engine vendors, describes the SAPI 5.0 compliance testing tool by answering the following questions:

·         What does SAPI compliance for SAPI 5.0 imply?

·         What are the SAPI compliance tests?

·         What does each test look for?  

The goals of the compliance tool are to help engine vendors test their speech engines for SAPI compliance and port these speech engines to SAPI 5.0. The tests also help vendors to support various SAPI features that are not required for compliance. These tests do not test the speech or performance quality of the engines.  All compliance tests assume that SAPI will do parameter validation, and as such, they do not check the engine's ability to handle invalid parameters such as null, bad pointers, or values out of range.

 

To run the compliance tests, the SPcomp.exe tool is used and either the Text-to-Speech (TTS) or the Speech Recognition (SR) test suite is selected. This tool generates a log report indicating the results of the compliance tests.

 

There are two types of SAPI 5 compliance tests:

1)      required tests

2)      feature list tests

 

The compliance tests do not necessarily test the DDI directly; instead, they use SAPI API function calls to test the engine's response to the DDI. The default engine is always used as the engine in the compliance test. Currently, the supported languages for the compliance tests are English, Japanese, and Simplified Chinese[1]. Please check Microsoft® Speech.NET Technologies for language pack updates and information.

 

3.1       SAPI Compliance Required Tests

The results of the required tests are of a pass/fail nature. These tests were designed to help an engine reach a minimal amount of functionality with the SAPI DDI layer. In order to be SAPI compliant, the engine must pass all required SAPI tests.

 

3.2       SAPI Compliance Feature List Tests

The results of the feature list tests are either "Supported" or "Unsupported". Feature list tests were designed to help engine vendors port advanced features to SAPI 5.0. To be SAPI compliant, the engine does not need to pass any feature list test, although it is recommended that all features be implemented if possible.
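
The difference between the two test types can be summarized in a short sketch. The function name and the dictionary layout below are illustrative assumptions; they are not part of the compliance tool or its log format.

```python
def is_sapi_compliant(required_results, feature_results):
    """Return True when every required test passed.

    required_results: dict of test name -> True (pass) or False (fail)
    feature_results:  dict of test name -> True (supported) or False (unsupported)

    Feature list results are accepted for reporting purposes only;
    they never affect the compliance decision.
    """
    return all(required_results.values())


# Hypothetical results for illustration.
required = {"SoundStart": True, "SoundEnd": True, "Recognition": True}
features = {"Hypothesis": True, "Interference": False}

# An unsupported feature does not block compliance.
print(is_sapi_compliant(required, features))
```

A single failed required test makes the function return False, regardless of how many features are supported.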

 

3.3       Minimum Requirements

The minimum requirements for speech recognition for dictation are a 200 MHz Pentium with 64 MB of RAM for Windows 95/98 or 96 MB of RAM for Windows NT. The recommended computer is a 300 MHz Pentium II with 128 MB of RAM or better.

 

 

4         Using the compliance testing tool

The SAPI 5.0 compliance test tool, SPcomp.exe, enables you to load compliance test suites and determine the test result logging options. Please see Compliance Tests for more options.

The SPcomp.exe test tool creates the .pro file for a given test suite. Perform compliance tests by starting the SPcomp.exe application and loading a test suite from a .pro file. The .pro file loads the associated dynamic link library (.dll), which contains the SR or TTS compliance tests.

To start the SAPI 5.0 compliance test tool SPcomp.exe from Windows Explorer, double-click the compliance tool icon. Alternatively, you can run each compliance test suite from the command line by passing its .pro file to the compliance test tool.

For example, the command line syntax for running the SRcomp compliance test in the srcompreq.pro test suite is as follows:

  
    C:\> SPcomp.exe srcompreq.pro
  

SPcomp.exe srcompreq.pro

Starts the compliance test tool and loads the speech recognition (SR) required tests from the SRcomp.dll.

SPcomp.exe srcompopt.pro

Starts the compliance test tool and loads the speech recognition (SR) feature list tests from the SRcomp.dll.

SPcomp.exe ttscompreq.pro

Starts the compliance test tool and loads the text-to-speech (TTS) required tests from the TTScomp.dll.

SPcomp.exe ttscompopt.pro

Starts the compliance test tool and loads the text-to-speech (TTS) feature list tests from the TTScomp.dll.
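
The four command lines above can be collected in a small lookup table, for example when scripting test runs. This is only a sketch: the suite and DLL names come from the list above, while the helper function and its behavior are hypothetical, and the sketch builds the command line without launching the tool.

```python
# .pro file -> (description, test DLL), taken from the list above.
SUITES = {
    "srcompreq.pro": ("SR required tests", "SRcomp.dll"),
    "srcompopt.pro": ("SR feature list tests", "SRcomp.dll"),
    "ttscompreq.pro": ("TTS required tests", "TTScomp.dll"),
    "ttscompopt.pro": ("TTS feature list tests", "TTScomp.dll"),
}


def command_line(pro_file, tool="SPcomp.exe"):
    """Build the argument list for one suite (hypothetical helper)."""
    if pro_file not in SUITES:
        raise ValueError(f"unknown test suite: {pro_file}")
    return [tool, pro_file]


for pro_file, (description, dll) in SUITES.items():
    print(" ".join(command_line(pro_file)), "->", description, "from", dll)
```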

 

 

5         Compliance Tests

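The SAPI 5.0 compliance tests verify that you have successfully implemented the features required to be considered compatible with SAPI 5.0. The tests are grouped into the following four batch files. To be compliant with SAPI 5.0, your engine must complete the required tests with a 100 percent pass rate; the feature list tests report each feature as supported or unsupported and, as described in Section 3.2, do not need to pass.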

1.      srcompreq.bat
  Speech Recognition (SR) required test batch file.

2.      srcompopt.bat
  Speech Recognition (SR) feature list test batch file.

3.      ttscompreq.bat
  Text-to-speech (TTS) required test batch file.

4.      ttscompopt.bat
  Text-to-speech (TTS) feature list test batch file.

5.1       Test Result Log:

The SAPI 5.0 compliance tool generates a result log, which you can display in the application window or save to a file. The test result log contains pass or fail state information for each segment of the test suite. If a compliance test fails, you can review the result log to determine the origin of the failure.

Example result log with a 100% pass rate for all tests.

  
    Total:
  


 

 

6         Compliance Testing Configuration Options

The SAPI 5.0 compliance test application user interface (UI) enables you to configure the testing options. The following section provides additional compliance test configuration information.

·         SAPI 5.0 Compliance Testing Application Toolbar

·         SAPI 5.0 Compliance Testing Application Menu Choices

·         SAPI 5.0 Compliance Testing Logging Options

·         SAPI 5.0 Compliance Testing Run Options

·         SAPI 5.0 Compliance Test Selection Options

6.1       SAPI 5.0 Compliance Testing Application Toolbar

The main window of the SAPI 5.0 compliance testing application contains a toolbar from which you can access the configuration options. Additionally, the configuration options are also available from the menu bar located at the top of the compliance testing application window.


Pause on an icon to display tooltip text. Click an icon to view the information associated with the feature.

Load the test DLL

Loads a test dynamic-link library (DLL).

You can run compliance tests using one of the following methods:

1.      From the SAPI Engine Compliance Tool, click File, and then click Load Test DLL.

2.      Load the test DLL into SPcomp.exe from the command line.
For more information, see Using the compliance testing tool.

Note: loading a compliance test with either method results in automatically unloading any previously loaded compliance tests.

Load the test settings

Loads one of the pre-configured test suites.

You can run compliance tests using one of the following methods:

1.      From the SAPI Engine Compliance Tool, click File, and then click Load Settings.

2.      Load the test settings (.pro file) into SPcomp.exe from the command line.
For more information, see Using the compliance testing tool.

Note: loading a compliance test with either method results in automatically unloading any previously loaded compliance tests.

Save settings

Saves the configuration settings for the compliance test application.

Copy

Selects and copies content from the display log.

Clear Window

Clears the display contents of the result log.

Find

Searches for a specific word or phrase within the result log.

Find Next

Searches for the next occurrence of a specific word or phrase within the result log.

Run Test

Begins the compliance test.

Stop Test

Stops the compliance test.

Set Run Options

Configures the compliance test options.

Select Tests

Chooses which compliance tests contained in the test suite to run.

Set Logging

Determines location of the compliance test log information.

 

6.2       SAPI 5.0 Compliance Testing Application Menu Choices

The SAPI 5.0 compliance testing application configuration choices are accessible through the menu system. The following items are covered in this section:

·         File menu

·         Edit menu

·         Test menu

·         Options menu

·         Help menu

 

6.2.1        File menu

Click File to set configuration options to load settings, save settings, or load the appropriate test DLL. Use the arrow keys to view various menu choices. Press ENTER to select a menu choice.


6.2.2        Edit menu

Click Edit to copy text from the result log and search for text within the result log. Use the arrow keys to view various menu choices. Press ENTER to select a menu choice.


6.2.3        Test menu

Click Test to run the test or select a test. Use the arrow keys to view various menu choices. Press ENTER to select a menu choice.


6.2.4        Options menu

Click Options to view the various configuration settings. Use the arrow keys to view various menu choices. Press ENTER to select a menu choice.


6.2.5        Help menu

Click Help and then click About to display the SAPI 5.0 Engine Compliance Tool Version dialog box. Use the arrow keys to view the various menu choices. Press ENTER to select a menu choice.


6.3       SAPI 5.0 Compliance Testing Logging Options

From the Options menu, choose Logging Settings to set SAPI 5.0 compliance test result log configuration options.


Window

Displays the test result information in the main window of the compliance testing application.

Log File

Saves the test result information as text in a log file.

The log file is written to the same directory as the SPcomp.exe tool, and its name has the following form:

  spcomp@442.log

The number in the file name (442 in this example) is generated by the SPcomp.exe tool and is incremented by one each time you restart the tool and run a test. A new log file is generated each time you start SPcomp.exe and run a compliance test.
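
Because each run produces a new numbered file, it can be convenient to pick out the most recent log programmatically. The following is a minimal sketch that assumes only the spcomp@NNN.log naming described above; the function name is hypothetical, and the match is case-insensitive because the file name also appears as SPcomp@xxx.log in this document.

```python
import re

# Matches names such as "spcomp@442.log" and captures the run number.
LOG_NAME = re.compile(r"^spcomp@(\d+)\.log$", re.IGNORECASE)


def latest_log(file_names):
    """Return the log name with the highest run number, or None if there is none."""
    numbered = []
    for name in file_names:
        match = LOG_NAME.match(name)
        if match:
            # Compare run numbers as integers so that 10 sorts after 9.
            numbered.append((int(match.group(1)), name))
    if not numbered:
        return None
    return max(numbered)[1]


print(latest_log(["spcomp@442.log", "spcomp@443.log", "readme.txt"]))
```

In practice the list of names could come from os.listdir() on the directory that contains SPcomp.exe.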

Detailed

Specifies detailed result log information.

Summary

Specifies summary result log information.

 

6.4       SAPI 5.0 Compliance Testing Run Options

From the Options menu, click Run Options to configure SAPI 5.0 compliance testing run options.


Random

Randomizes the test order.

Close after execution

Closes the compliance testing application after the test sequence.

Stress

This option should not be selected for compliance tests.

Run count

Specifies the number of iterations the selected test should run.

Disable screen saver

Disables the screen saver.

Quiet

Runs the selected test in quiet mode.

Random Seed

The random seed value set here is used for the next time you run the compliance test.

Note: When troubleshooting a failed compliance test, you need to enter the same seed value information that was used for the failed compliance test before you repeat the compliance test procedure.

You can obtain the compliance test seed value from the "Random Seed" field information in the SPcomp@xxx.log file that was generated during the unsuccessful compliance test.
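
The reason for reusing the seed is that a seeded random number generator reproduces the same sequence, so the randomized test order of the failed run can be repeated. The sketch below illustrates the idea with Python's random module as a stand-in; how SPcomp.exe randomizes its test order internally is not documented here, and the function name and seed value are assumptions.

```python
import random


def shuffled_order(test_names, seed):
    """Return the test names in a pseudo-random order determined by the seed."""
    rng = random.Random(seed)   # private generator, seeded explicitly
    order = list(test_names)    # copy so the caller's list is untouched
    rng.shuffle(order)
    return order


tests = ["SoundStart", "SoundEnd", "PhraseStart", "Recognition"]

failed_run = shuffled_order(tests, seed=442)
repeat_run = shuffled_order(tests, seed=442)

# The same seed gives the same order, so the failure can be reproduced.
print(failed_run == repeat_run)
```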

 

6.5       SAPI 5.0 Compliance Test Selection Options

From the Test menu, click Select Test to configure SAPI 5.0 compliance test choices.


Test Cases

Displays the current test suite.

Selected Test Cases

Displays the current selected tests.

Add Case(s)

Adds test items to the list of selected test cases.
Alternatively, to add test cases, right-click the test case in the test case display window and click Add Item.


Remove Case

Removes the selected test case from the current test. However, removing the test case does not remove the requirement to pass it in order to achieve SAPI compliance.

Alternatively, to remove test cases, right-click the test cases in the selected test case display window and click Remove Case.


Remove All

Removes all test cases.

 

7         SAPI Compliance: SR

SAPI compliant SR engines must be able to perform the following[2]:

§         Generate certain SR events

§         Interact with the SAPI lexicon

§         Handle Command and Control (C&C) grammars

§         Generate Phrase Elements

§         Support auto pause on recognition

§         Support rule synchronization

§         Support multiple instances of the engine

§         Support multiple application contexts

 

7.1       Required Tests

7.1.1        Events

Events are checked using .wav files. The test feeds the .wav file to the engine and expects a specific event notification to occur. Note that whether the engine can fire a specific event depends on the engine's confidence threshold. Engine vendors may change the .wav files if the .wav quality does not meet their requirements.

 

For English:

 

| Test | Description | Resource ID | Resource description |
| --- | --- | --- | --- |
| SoundStart | Test checks that a sound start event occurs. | IDS_WAV_SOUNDSTART | Input .wav file, tag_l.wav |
| | | IDR_L_GRAMMAR | Input CFG grammar |
| SoundEnd | Test checks that a sound end event occurs. | IDS_WAV_SOUNDEND | Input .wav file, tag_l.wav |
| | | IDR_L_GRAMMAR | Input CFG grammar |
| PhraseStart | A .wav file with audio that the engine can do recognition on. Test ensures that a phrase start event occurs. | IDS_WAV_PHRASESTART | Input .wav file, tag_l.wav |
| | | IDR_L_GRAMMAR | Input CFG grammar |
| Recognition | A .wav file with audio that the engine can do recognition on. Test ensures that a recognition event occurs. | IDS_WAV_RECOGNITION_1 | Input .wav file, tag_l.wav |
| | | IDR_L_GRAMMAR | Input CFG grammar |
| False Recognition | A .wav file and a mismatching C&C grammar are loaded. Test ensures that a false recognition event occurs. | IDS_WAV_RECOGNITION_1 | Input .wav file, tag_l.wav |
| | | IDR_RULE_GRAMMAR | Input CFG grammar |
| SoundStart/SoundEnd | Test checks that the sound start event occurs before the sound end event. | IDS_WAV_SOUNDSTARTEND | Input .wav file, tag_l.wav |
| | | IDR_L_GRAMMAR | Input CFG grammar |
| PhraseStart/Recognition | Test checks that the phrase start event occurs before the recognition event. | IDS_WAV_RECOGNITION_1 | Input .wav file, tag_l.wav |
| | | IDR_L_GRAMMAR | Input CFG grammar |
| SoundStart/PhraseStart/Recognition/SoundEnd | A .wav file with audio that the engine can do recognition on. Test ensures that the audio offsets of these events are correct in terms of value comparison. | IDS_WAV_RECOGNITION_1 | Input .wav file, tag_l.wav |
| | | IDR_L_GRAMMAR | Input CFG grammar |

Table 1: Events Compliance Test
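
The two ordering tests in Table 1 amount to checking that one event is notified before another. The following sketch shows that check on a plain list of event names. The list representation and the function name are assumptions made for illustration; the real tests receive events through SAPI.

```python
# Pairs from Table 1: the first event must be notified before the second.
REQUIRED_ORDER = [("SoundStart", "SoundEnd"), ("PhraseStart", "Recognition")]


def order_violations(events):
    """Return a list of problems found in the notification order.

    events: event names in the order the notifications arrived.
    """
    first_seen = {}
    for index, name in enumerate(events):
        first_seen.setdefault(name, index)  # remember the first occurrence only

    problems = []
    for earlier, later in REQUIRED_ORDER:
        if earlier not in first_seen or later not in first_seen:
            problems.append(f"missing {earlier} or {later}")
        elif first_seen[earlier] >= first_seen[later]:
            problems.append(f"{earlier} must occur before {later}")
    return problems


print(order_violations(["SoundStart", "PhraseStart", "Recognition", "SoundEnd"]))
```

An empty list means that both ordering requirements were met.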

 

7.1.2        Lexicon

It is expected that changes in the user and application lexicon will be synchronized with the engine both when the engine starts up and after it has loaded a command and control grammar.

 

 

| Test | Description | Resource ID | Resource description |
| --- | --- | --- | --- |
| User Lexicon Before C&C Grammar Loaded | A made-up word with its customized pronunciation is added to the user lexicon. After command and control grammar is loaded, audio will be sent with the word added and the expected result is checked for. | IDS_WAV_SYNCH_BEFORE_LOAD | Input .wav file, lexicon.wav |
| | | IDR_SNORK_GRAMMAR | Input CFG grammar |
| | | IDS_RECO_SYNCH_BEFORE_LOAD | The lexicon form of new word |
| | | IDS_RECO_NEWWORD_PRON | The pronunciation of the new word in user lexicon |
| User Lexicon After C&C Grammar Loaded | After command and control grammar is loaded, a made-up word with its customized pronunciation is added to the user lexicon. Audio will be sent with the word added and the expected result is checked for. | IDS_WAV_SYNCH_AFTER_GRAM | Input .wav file, lexicon.wav |
| | | IDR_SNORK_GRAMMAR | Input CFG grammar |
| | | IDS_RECO_SYNCH_AFTER_GRAM | The lexicon form of new word |
| | | IDS_RECO_NEWWORD_PRON | The pronunciation of the new word in user lexicon |
| Application Lexicon and C&C Grammar | A made-up word with its customized pronunciation is added to the application lexicon. After command and control grammar is loaded, audio will be sent with the word added and the expected result is checked for. | IDS_WAV_APPLEX | Input .wav file, lexicon.wav |
| | | IDR_SNORK_GRAMMAR | Input CFG grammar |
| | | IDS_APPLEX_WORD | The lexicon form of new word |
| | | IDS_APPLEX_PROP | The pronunciation of the new word in application lexicon |
| User lexicon before application lexicon | A made-up word is added to both user lexicon and application lexicon using the different customized pronunciations. After command and control grammar is loaded, audio will be sent with the word's pronunciation in user lexicon and the expected result is checked for. | IDS_WAV_USERLEXBEFOREAPPLEX | Input .wav file, lexicon.wav |
| | | IDR_SNORK_GRAMMAR | Input CFG grammar |
| | | IDS_USERLEXBEFOREAPPLEX_WORD | The lexicon form of new word |
| | | IDS_USERLEXBEFOREAPPLEX_USERPROP | The pronunciation of the new word in user lexicon |
| | | IDS_USERLEXBEFOREAPPLEX_APPPROP | The pronunciation of the new word in application lexicon |

Table 2: Lexicon Compliance Test
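
The last test in Table 2 expects the user lexicon pronunciation to be used when a word appears in both lexicons. A minimal sketch of that lookup order follows. The dictionaries, the word, and the pronunciation strings are invented placeholders, and real engines read lexicon data through SAPI rather than from Python dictionaries.

```python
def pronunciation_for(word, user_lexicon, app_lexicon):
    """Look up a word, preferring the user lexicon over the application lexicon."""
    if word in user_lexicon:
        return user_lexicon[word]
    return app_lexicon.get(word)  # None when the word is in neither lexicon


# Placeholder data: the same made-up word with two different pronunciations.
user_lexicon = {"snork": "user-pronunciation"}
app_lexicon = {"snork": "app-pronunciation", "blarg": "app-only-pronunciation"}

print(pronunciation_for("snork", user_lexicon, app_lexicon))  # user entry wins
print(pronunciation_for("blarg", user_lexicon, app_lexicon))  # falls back to the app lexicon
```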

 

7.1.3        Command and Control Grammar

Testing the engine for grammar compliance is perhaps the most complex set of tests. The engine must process a grammar correctly.  Each test will use a grammar specifically tailored for the particular feature.

 

 

| Test | Description | Resource ID | Resource description |
| --- | --- | --- | --- |
| L Tag | A three-element list grammar is loaded. Audio containing the middle item is sent to the engine, and the result is checked for this item. | IDS_RECO_L_TAG | Expected Result |
| | | IDS_WAV_L_TAG | Input .wav file, tag_l.wav |
| | | IDR_L_GRAMMAR | Input CFG grammar |
| Expected Rule | A grammar with two identical rules is loaded. The first rule will be activated. Audio that triggers this rule is sent and test verifies that the engine uses the first rule. The first rule is then de-activated and the second rule is activated. The same audio is sent and the test verifies that the engine uses the second rule. | IDS_RECO_EXPRULE_FIRSTRULE | First Rule's Name |
| | | IDS_RECO_EXPRULE_SECONDRULE | Second Rule's Name |
| | | IDS_WAV_EXPRULE_TAG | Input .wav file, tag_exprule.wav |
| | | IDR_EXPRULE_GRAMMAR | Input CFG grammar |
| P Tag | A simple grammar with a single phrase. Audio is sent and recognition is expected. Audio that does not contain the phrase is sent and no recognition is expected. | IDS_RECO_P_TAG | Expected Result |
| | | IDS_WAV_P_TAG | Input .wav file, tag_p.wav |
| | | IDR_P1_GRAMMAR | Input CFG grammar |
| O Tag | A grammar will be defined with a phrase and an optional phrase preceding and following it. Three audio streams will be sent: one with the first optional phrase, one with the second, and a third that does not contain any optional phrases. The appropriate recognition result is checked for in each case. | IDS_RECO_O_TAG_1 | FirstOptionalWord |
| | | IDS_RECO_O_TAG_2 | Required word |
| | | IDS_RECO_O_TAG_3 | Second Optional word |
| | | IDS_WAV_O_TAG_1 | Input .wav file containing the first optional word, tag_o1.wav |
| | | IDS_WAV_O_TAG_2 | Input .wav file containing the second optional word, tag_o2.wav |
| | | IDS_WAV_O_TAG_3 | Input .wav file without optional words, tag_o3.wav |
| RULEREF Tag | A grammar with a phrase with a rule reference and a rule defined will be loaded. Audio that triggers the rule will be sent and the result checked. | IDS_RECO_RULE_TAG | Expected Result |
| | | IDS_WAV_RULE_TAG | Input .wav file, tag_rule.wav |
| | | IDR_RULE_GRAMMAR | Input CFG grammar |
| /Disp/lex/pron format | Test ensures engine can support customized pronunciation provided in the command and control grammar file. | IDS_CUSTOMPROP_NEWWORD_PRON | The customized pronunciation form of the new word |
| | | IDS_CUSTOMPROP_NEWWORD_DISP | The customized display form of the new word |
| | | IDS_CUSTOMPROP_NEWWORD_LEX | The customized lexicon form of the new word |
| | | IDS_CUSTOMPROP_RULE | The dynamic grammar rule name |
| | | IDS_WAV_CUSTOMPROP | Input .wav file, lexicon.wav |

Table 3: Command and Control Compliance Test

 

7.1.4        Phrase Elements, Auto Pause, Rule invalidation, multiple instances and contexts.

 

 

 

| Test | Description | Resource ID | Resource description |
| --- | --- | --- | --- |
| Phrase Elements | The audio offsets of SPPHRASEELEMENTs in one SPPHRASE are correctly filled in, which means that the audio offset of the first SPPHRASEELEMENT is less than the audio offset of the second SPPHRASEELEMENT, the audio offset of the second SPPHRASEELEMENT is less than the third one, etc. | IDS_WAV_RULE_TAG | Input .wav file, tag_rule.wav |
| | | IDR_RULE_GRAMMAR | Input CFG grammar |
| Auto Pause | The test makes sure the engine can support the auto pause feature provided by SAPI. | IDS_AUTOPAUSE_DYNAMICWORD1 | The word in the first rule |
| | | IDS_AUTOPAUSE_DYNAMICWORD2 | The word in the second rule |
| | | IDS_AUTOPAUSE_DYNAMICRULE1 | The name of the first rule |
| | | IDS_AUTOPAUSE_DYNAMICRULE2 | The name of the second rule |
| | | IDS_WAV_AUTOPAUSE | Input .wav file, autopause.wav |
| Top-level rule invalidation | Test verifies that the engine can synchronize the rule information after SAPI notifies the engine of top-level rule invalidation. | IDS_INVALIDATETOPLEVEL_DYNAMICWORDS | The words in the dynamic grammar |
| | | IDS_INVALIDATETOPLEVEL_DYNAMICRULE | The rule name in the dynamic grammar |
| | | IDS_WAV_INVALIDATETOPLEVEL_OLD | Input .wav file used before invalidation, tag_exprule.wav |
| | | IDS_INVALIDATETOPLEVEL_DYNAMICNEWWORDS | The new words in the dynamic grammar |
| | | IDS_WAV_INVALIDATETOPLEVEL_NEW | Input .wav file used after invalidation |
| Non-top-level rule invalidation | Test verifies that the engine can synchronize the rule information after SAPI notifies the engine of non-top-level rule invalidation. | IDS_INVALIDATENONTOPLEVEL_RULE1 | The first rule name |
| | | IDS_INVALIDATENONTOPLEVEL_RULE2 | The second rule name |
| | | IDS_INVALIDATENONTOPLEVEL_TOPLEVELRULE | The top-level rule name |
| | | IDS_INVALIDATENONTOPLEVEL_OLDWORD1 | The word in the first rule used before invalidation |
| | | IDS_INVALIDATENONTOPLEVEL_OLDWORD2 | The word in the second rule used before invalidation |
| | | IDS_WAV_INVALIDATENONTOPLEVEL_OLD | Input .wav file used before invalidation, tag_exprule.wav |
| | | IDS_INVALIDATENONTOPLEVEL_NEWWORD1 | The word in the first rule used after invalidation |
| | | IDS_INVALIDATENONTOPLEVEL_NEWWORD2 | The word in the second rule used after invalidation |
| | | IDS_WAV_INVALIDATENONTOPLEVEL_NEW | Input .wav file used after invalidation, tag_rule.wav |
| Multiple recognition contexts | Multiple recognition contexts will be created with different grammars. The test will verify that the recognition event is generated by the correct recognition contexts. | IDS_RECO_P_TAG | The result expected in the second grammar |
| | | IDS_WAV_MULT_RECO | Input .wav file, multireco.wav |
| | | IDR_P1_GRAMMAR | The first grammar used by the first recocontext |
| | | IDR_P2_GRAMMAR | The second grammar used by the second recocontext |
| Multiple recognition engine instances | Basic tests are run separately on different threads to see if the engine can support multiple instances. | NA | NA |

Table 4: Required Compliance Tests
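
The Phrase Elements requirement in Table 4 is that each element's audio offset is less than the offset of the element that follows it. As a rough sketch, the check reduces to a pairwise comparison. The list of integers stands in for the audio offsets of the SPPHRASEELEMENT entries, the function name is hypothetical, and the sample values are invented.

```python
def offsets_strictly_increase(audio_offsets):
    """Return True when every offset is less than the one that follows it."""
    return all(
        current < following
        for current, following in zip(audio_offsets, audio_offsets[1:])
    )


# Invented offsets for a three-element phrase.
print(offsets_strictly_increase([0, 4800, 11200]))     # offsets rise: requirement met
print(offsets_strictly_increase([0, 11200, 4800]))     # second offset exceeds third: not met
```

Equal offsets also fail the check, because the table requires "less than" rather than "less than or equal to".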

 

7.2       Feature Tests

Some features exposed through SAPI can offer a competitive advantage. These features are not required for SAPI compliance, but engine vendors may find them attractive to implement. The SAPI features are:

§         Interference and hypothesis events

§         Dictation functionalities

§         Advanced command and control features

§         Command and control alternate

§         Engine properties

§         Inversed text normalization

 

 

 

7.2.1        Events

Events are checked using .wav files. The test feeds the .wav file to the engine and expects a specific event notification to occur. Note that whether the engine can fire a specific event depends on the engine's confidence threshold. Engine vendors may change the .wav files if the .wav quality does not meet their requirements (refer to Section 7.4).

 

| Test | Description | Resource ID | Resource description |
| --- | --- | --- | --- |
| Interference | A .wav file with noises. Test checks that an interference event occurs. | IDS_WAV_INTERFERENCE | Input .wav file, tag_l.wav |
| | | IDR_L_GRAMMAR | Input CFG grammar |
| Hypothesis | A .wav file with audio that the engine can do recognition on. Test ensures a hypothesis event occurs. | IDS_WAV_HYPOTHESIS | Input .wav file, tag_exprule.wav |
| | | IDR_EXPRULE_GRAMMAR | Input CFG grammar |

Table 5: Events Feature Compliance Test

 

7.2.2        Dictation functionalities

These features are required if an engine supports dictation grammars. They cover basic dictation functionality: lexicon synchronization, the dictation tag, and dictation alternates.

 

| Test | Description | Resource ID | Resource description |
| --- | --- | --- | --- |
| User Lexicon Before dictation Grammar Loaded | A made-up word with its customized pronunciation is added to the user lexicon. After dictation grammar is loaded, audio will be sent with the word added and the expected result is checked for. | IDS_WAV_SYNCH_BEFORE_LOAD | Input .wav file, lexicon.wav |
| | | IDS_RECO_SYNCH_BEFORE_LOAD | The lexicon form of new word |
| | | IDS_RECO_NEWWORD_PRON | The pronunciation of the new word in user lexicon |
| User Lexicon After dictation Grammar Loaded | After dictation grammar is loaded, a made-up word with its customized pronunciation is added to the user lexicon. Audio will be sent with the word added and the expected result is checked for. | IDS_WAV_SYNCH_AFTER_DICT | Input .wav file, lexicon.wav |
| | | IDS_RECO_SYNCH_AFTER_DICT | The lexicon form of new word |
| | | IDS_RECO_NEWWORD_PRON | The pronunciation of the new word in user lexicon |
| Dictation Tag | A rule with dictation tag is loaded. Audio is fed and the test verifies the recognition event is generated. | IDS_DICTATIONTAG_WORDS | The word before the dictation tag |
| | | IDS_DICTATIONTAG_RULE | The dynamic grammar rule name |
| | | IDS_WAV_DICTATIONTAG | Input .wav file, tag_exprule.wav |
| Dictation alternates | Test ensures that the engine can generate alternate results for dictation grammar. The test makes sure that the engine has its own alternate object and the object can generate some alternate results. | IDS_WAV_EXPRULE_TAG | Input .wav file, tag_exprule.wav |

 

Table 5: Dictation Compliance Test

 

7.2.3        Grammar

Each test uses a grammar specifically tailored for the particular feature. Some tests use a dynamic grammar instead of a static grammar.

 

 

| Test | Description | Resource ID | Resource description |
| --- | --- | --- | --- |
| WildCard Tag | A rule with wildcard tag is loaded. Audio is fed and the test verifies the recognition event is generated. | IDS_WILDCARD_WORDS | The word before the wildcard tag |
| | | IDS_WILDCARD_RULE | The dynamic grammar rule name |
| | | IDS_WAV_WILDCARD | Input .wav file, tag_rule.wav |
| TextBuffer Tag | A grammar with <TextBuffer> tag will be loaded. Test fills in the content of TextBuffer on the fly. Audio with both the static part and the dynamic part of the grammar is fed and the result is checked. | IDS_CFGTEXTBUFFER_WORDS | The words before TEXTBUFFER tag |
| | | IDS_CFGTEXTBUFFER_BUFFERWORD | The word for TEXTBUFFER tag |
| | | IDS_CFGTEXTBUFFER_RULE | The rule name |
| | | IDS_WAV_CFGTEXTBUFFER | Input .wav file, tag_exprule.wav |
| Use the correct grammar | Two unambiguous grammars are loaded to test if the engine can use the correct grammar to do recognition. | IDS_RECO_RULE_TAG | Expected result for the second grammar |
| | | IDS_WAV_RULE_TAG | Input .wav file, tag_rule.wav |
| | | IDR_L_GRAMMAR | The first grammar |
| | | IDR_RULE_GRAMMAR | The second grammar |
| Use the most recently activated grammar | Two ambiguous grammars are loaded to test if the engine can use the most recently activated grammar to do the recognition. | IDS_WAV_RULE_TAG | Input .wav file, tag_rule.wav |
| | | IDR_RULE_GRAMMAR | Input CFG grammar |

Table 6: Grammar Feature Compliance Test
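
The last two tests in Table 6 describe how an engine is expected to choose between loaded grammars: when only one grammar contains the phrase, that grammar is used, and when several do, the most recently activated one wins. The sketch below models this with plain sets of phrases. The data layout, the function name, and the phrases are assumptions made for illustration and do not reflect how an engine stores grammars.

```python
def choose_grammar(phrase, active_grammars):
    """Pick the grammar that should produce the recognition.

    active_grammars: list of (name, set_of_phrases) tuples,
    ordered from the oldest activation to the most recent one.
    """
    # Walk from the most recent activation backwards and take the first match.
    for name, phrases in reversed(active_grammars):
        if phrase in phrases:
            return name
    return None  # no active grammar contains the phrase


unambiguous = [("first", {"play music"}), ("second", {"open file"})]
ambiguous = [("first", {"open file"}), ("second", {"open file"})]

print(choose_grammar("play music", unambiguous))  # only "first" contains the phrase
print(choose_grammar("open file", ambiguous))     # both match; "second" was activated last
```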

 

7.2.4        Alternates, engine properties, inversed text normalization

 

 

| Test | Description | Resource ID | Resource description |
| --- | --- | --- | --- |
| Command and Control alternates | Test ensures that the engine can generate alternate results for command and control grammar | IDS_ALTERNATESCFG_BESTWORD | The best choice of the CFG grammar |
| | | IDS_ALTERNATESCFG_ALTERNATE1 | Alternate word in CFG grammar |
| | | IDS_ALTERNATESCFG_ALTERNATE2 | Alternate word in CFG grammar |
| | | IDS_ALTERNATESCFG_WORDS | Others words in the CFG grammar |
| | | IDS_WAV_ALTERMATESCFG | tag_exprule.wav |
| Engine numeric properties | If engine supports the numeric properties specified by SAPI | NA | |
| Engine text properties | If engine can return S_FALSE on the text properties that are not supported. | NA | |
| Inversed Text Normalization | The test uses a wav file and expects engine to pass back a result containing digits together with the normal result. Please note that this is a very specific ITN test and is not coverage of ITN related issues. | IDS_RECO_GETITNRESULT | Expected ITN result |
| | | IDS_WAV_GETITNRESULT | Input .wav file, tag_rule.wav |
| | | IDR_RULE_GRAMMAR | Input CFG grammar |

Table 7: Feature Compliance Tests

 

7.3       SR Sample Engine

The sample engine is not fully SAPI compliant because it does not have the full range of functionality that a true SR engine would have. Table 8 indicates which compliance tests will pass. Table 9 indicates which features are supported.

 

 

Test

Result

Description

Events

 

 

SoundStart

Pass

 

SoundEnd

Pass

 

PhraseStart

Pass

 

FalseRecognition

Fail

The sample engine doesn't generate this event based on the real SR job.

Recognition

Pass

 

SoundStart/SoundEnd order

Pass

 

PhraseStart/Recognition order

Pass

 

Event offset

Pass

 

Lexicon

 

 

User Lexicon Before C&C Grammar Loaded

Fail

The sample engine doesn't use user lexicon.

User Lexicon After C&C Grammar Loaded

Fail

The sample engine doesn't use user lexicon.

App Lexicon

Fail

The sample engine doesn't use application lexicon.

Use user lexicon before application lexicon

Fail

The sample engine does not use either a user lexicon or an application lexicon.

Grammar

 

 

L Tag

Fail

The result might be sometimes fail and sometimes pass. The sample engine randomly generates results based on the given grammar. It doesn't do any real recognition.

Expected Rule

Pass

 

P Tag

Fail

The result may sometimes fail and sometimes pass. The sample engine randomly generates results based on the given grammar. It doesn't do any real recognition.

O Tag

Fail

The result may sometimes fail and sometimes pass. The sample engine randomly generates results based on the given grammar. It doesn't do any real recognition.

Ruleref Tag

Fail

The result may sometimes fail and sometimes pass. The sample engine randomly generates results based on the given grammar. It doesn't do any real recognition.

/Disp/lex/pron format

Fail

The result may sometimes fail and sometimes pass. The sample engine randomly generates results based on the given grammar. It doesn't do any real recognition.

Other

 

 

Phrase Elements

Pass

 

Auto Pause

Pass

 

Top-level rule invalidation

Fail

The sample engine randomly generates results based on the given grammar. It doesn't do any real recognition.

Non-top-level rule invalidation

Fail

The sample engine randomly generates results based on the given grammar. It doesn't do any real recognition.

Multiple recognition contexts

Pass

 

Multiple recognition engine instances

 

The sample engine randomly generates a CFG result based on the given grammar. It doesn't do any real recognition.

Table 8: Sample Engine Required Compliance Test results

 

 

Test

Result

Description

Events

 

 

Hypothesis

SUPPORTED

 

Interference

UNSUPPORTED

The sample engine doesn't generate the event correctly.

Dictation

 

 

User lexicon synchronize before dictation grammar loaded

UNSUPPORTED

The sample engine doesn't use user lexicon.

User lexicon synchronize after dictation grammar loaded

UNSUPPORTED

The sample engine doesn't use user lexicon.

Dictation Tag

SUPPORTED

 

Dictation alternates

SUPPORTED

 

Grammar

 

 

Wildcard Tag

SUPPORTED

 

TextBuffer Tag

SUPPORTED

 

Use the correct grammar

UNSUPPORTED

The sample engine randomly generates results based on the given grammar. It doesn't do any real recognition.

Use the most recently activated grammar

UNSUPPORTED

The sample engine randomly generates results based on the given grammar. It doesn't do any real recognition.

Other

 

 

Command and Control alternates

UNSUPPORTED

The compliance test only uses one rule while the sample engine needs at least two rules.

Engine numeric properties

SUPPORTED

 

Engine text properties

SUPPORTED

 

Inversed Text Normalization

UNSUPPORTED

The sample engine doesn't have this functionality.

Table 9: Sample Engine Feature Compliance Test results

7.4       Compliance Test Customization

 

Many of the tests require that a specific recognition result be returned to verify proper handling of such things as the grammar format. To accommodate variability among engines in recognizing different voices, and to support non-English engines, these tests enable the engine vendor to supply a sound file that passes the test (Refer to Section 7.5). Because some tests might share the same .wav file, it is recommended to supply a .wav file with a different name. Additionally, the grammars can be changed to accommodate words that the engine is able to recognize better (Refer to Section 7.6).

 

7.5       Multilingual Support

 

The compliance tests will test engines for the supported languages[3]. To test an SR engine that uses another language, one must:

§         Ensure that the correct language pack is installed. For Windows 2000 and Millennium Edition, this may be done by installing the language pack from the Windows 2000 or Windows Millennium CD. For Windows 98 and Windows NT 4.0, install the language pack from the Windows Update web site.

§         Select the engine as the default engine using the Speech Recognition tab in Speech properties.

§         Create and insert a string table in the sapi5sdk\tools\comp\sr\srcomp.rc that is localized for the language. (Refer to Table 10)

§         Create the .wav files[4] according to the new string table and place them under the specified directory, following the search path precedence (Refer to Section 7.5.2).

§         Create and compile the appropriate XML files using a grammar editor and compiler.

§         Include the CFG binaries into the .dll by importing the CFG file names into srcomp.rc[5].

§         Recompile the sr.dsp.

§         Run the compliance tests.

 

 

7.5.1        Example:

 

To add resources for the SoundStart test for language 888:

1.      Create and insert a copy of the string table in srcomp.rc for language 888.

2.      Change the string "IDS_WAV_SOUNDSTART" to the new .wav file you want to use.

3.      Insert the XML grammar file you want to use into the project. Modify the IDR_L_GRAMMAR reference to your CFG binary.
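Because the replacement .wav files must have different names than the originals (see footnote 4), a quick check of the localized string table can catch entries that were copied but never changed. The following Python sketch assumes both string tables have already been read into dictionaries. The helper name and the sample file names are hypothetical and are not part of the SDK.

```python
def unchanged_wav_entries(original, localized):
    """Return the IDS_WAV_* IDs whose file name is the same in both tables."""
    return sorted(
        string_id
        for string_id, file_name in localized.items()
        if string_id.startswith("IDS_WAV_")
        # File names are compared case-insensitively, as on Windows.
        and original.get(string_id, "").lower() == file_name.lower()
    )


# Hypothetical string tables; only the .wav entries matter for this check.
original = {"IDS_WAV_SOUNDSTART": "tag_l.wav", "IDS_RECO_L_TAG": "put"}
localized = {"IDS_WAV_SOUNDSTART": "tag_l.wav", "IDS_RECO_L_TAG": "localized put"}

print(unchanged_wav_entries(original, localized))  # ['IDS_WAV_SOUNDSTART']
```

An empty list means every .wav entry in the localized table points at a renamed file.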

 

NOTE: If the default engine supports multiple languages, the compliance test runs only on the first language specified in the "Language" string under the key HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Speech\Recognizers\Tokens\MSASREnglish\Attributes. In other words, you need to change the order of the languages in the Attributes key under your speech recognizer token for each language you wish to test. (Refer to Section 7.6)

 

The strings that need to be localized are shown in Table 10.

 

String Number

English Text

 

IDS_RECO_L_TAG

put

Translate string

IDS_RECO_EXPRULE_TAG_1

play

Translate string

IDS_RECO_EXPRULE_TAG_2

the

Translate string

IDS_RECO_P_TAG

white

Translate string

IDS_RECO_O_TAG_1

please

Translate string

IDS_RECO_O_TAG_2

walk

Translate string

IDS_RECO_O_TAG_3

slowly

Translate string

IDS_RECO_RULE_TAG

seven

Translate string

IDS_RECO_LN_TAG

red

Translate string

IDS_RECO_NEWWORD_PRON

s n ao 1 r k

Translate phonemes

IDS_AUTOPAUSE_DYNAMICWORD1

put

Translate string

IDS_AUTOPAUSE_DYNAMICWORD2

red

Translate string

IDS_AUTOPAUSE_DYNAMICRULE1

Action

Translate string

IDS_AUTOPAUSE_DYNAMICRULE2

color

Translate string

IDS_INVALIDATETOPLEVEL_DYNAMICWORDS

play the oboe

Translate string

IDS_INVALIDATETOPLEVEL_DYNAMICRULE

Play

Translate string

IDS_INVALIDATETOPLEVEL_DYNAMICNEWWORDS

please play the seven

Translate string

IDS_INVALIDATENONTOPLEVEL_RULE1

option

Translate string

IDS_INVALIDATENONTOPLEVEL_RULE2

Thing

Translate string

IDS_INVALIDATENONTOPLEVEL_TOPLEVELRULE

play

Translate string

IDS_INVALIDATENONTOPLEVEL_OLDWORD1

empty

Translate string

IDS_INVALIDATENONTOPLEVEL_OLDWORD2

Oboe

Translate string

IDS_INVALIDATENONTOPLEVEL_NEWWORD2

Seven

Translate string

IDS_INVALIDATENONTOPLEVEL_TOPLEVELWORDS

Play the

Translate string

IDS_CFGTEXTBUFFER_WORDS

Play the

Translate string

IDS_CFGTEXTBUFFER_BUFFERWORD

oboe

Translate string

IDS_CFGTEXTBUFFER_RULE

play

Translate string

IDS_ALTERNATESCFG_BESTWORD

play

Translate string

IDS_ALTERNATESCFG_ALTERNATE1

played

Translate string

IDS_ALTERNATESCFG_ALTERNATE2

pay

Translate string

IDS_ALTERNATESCFG_WORDS

The oboe

Translate string

IDS_ALTERNATESCFG_RULE

play

Translate string

IDS_RECO_GETITNRESULT

please play the 7

Translate string

IDS_CUSTOMPROP_NEWWORD_PRON

s n ao 1 r k

Translate phonemes

IDS_CUSTOMPROP_RULE

play

Translate string

IDS_CUSTOMPROP_NEWWORD_DISP

abc

Translate string

IDS_CUSTOMPROP_NEWWORD_LEX

play

Translate string

IDS_DICTATIONTAG_WORDS

Play the

Translate string

IDS_DICTATIONTAG_RULE

play

Translate string

IDS_WILDCARD_WORDS

Please play

Translate string

IDS_WILDCARD_RULE

play

Translate string

IDS_APPLEX_PROP

s n ao 1 r k

Translate phonemes

IDS_USERLEXBEFOREAPPLEX_USERPROP

s n ao 1 r k

Translate phonemes

IDS_USERLEXBEFOREAPPLEX_APPPROP

P l ey

Translate phonemes

IDS_INVALIDATENONTOPLEVEL_NEWWORD1

Please

Translate string

Table 10: Strings to be localized

 

7.5.2        Search Path Precedence

The compliance tests use a search path precedence to find the various .wav files needed for the compliance tests. The order of search is:

§         current directory

§         "../resources"

§         "../../../resources"

§         "../../resources"

 

If the compliance tests cannot find the .wav files in these directories, the test displays a dialog asking the user to enter a custom directory from which to open the file. The new directory is added to the search order, and the new search order persists for the life of SpComp.
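The lookup described above can be sketched in Python as follows. This is only an illustration of the search order, not the code SpComp uses. The function name is hypothetical, and trying user-supplied directories after the built-in ones is an assumption, since the text does not say where a new directory is placed in the order.

```python
from pathlib import Path

# Search order taken from the list above, relative to the current directory.
SEARCH_DIRS = [".", "../resources", "../../../resources", "../../resources"]


def find_wav(name, base_dir, extra_dirs=()):
    """Return the first existing path for `name`, or None if it is not found.

    `extra_dirs` stands in for directories the user entered in the dialog.
    Assumption: they are tried after the built-in locations.
    """
    for rel in list(SEARCH_DIRS) + list(extra_dirs):
        candidate = Path(base_dir) / rel / name
        if candidate.is_file():
            return candidate
    return None
```

A result of None corresponds to the point at which the tool would prompt the user for a directory.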

It is important to note that for languages that do not have a SAPI standard phoneme set (i.e. languages which are not supported in this version of SAPI), the engine will fail the following required compliance tests:

 

Test

Result

Lexicon

 

User Lexicon Before C&C Grammar Loaded

Fail

User Lexicon After C&C Grammar Loaded

Fail

App Lexicon

Fail

Use user lexicon before application lexicon

Fail

User lexicon synchronize before dictation grammar loaded

Fail

User lexicon synchronize after dictation grammar loaded

Fail

/Disp/lex/pron format

Fail

 

 

7.6       OS Language Incompatibility

The compliance tests are not based on the language of the OS. They are based on the first language in the token of the default engine. In other words, a Japanese engine on an English OS will cause the compliance tests to load the Japanese resources and run the compliance tests expecting a Japanese engine. In order to run the compliance test for an engine that has a different language than the OS, you will need to set the default engine on the Speech Recognition tab in Speech properties.

 

NOTE: If the default engine supports multiple languages, then the compliance test will only run the first language. In other words, you need to change the order of the languages in the attributes key under your speech recognizer token for each language you wish to test. For example, if your engine token supports both Japanese and English, to test English, the string "Language" must be like "409; 411" under HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Speech\Recognizers\Tokens\MSASREnglish\Attributes

To test Japanese, the string "Language" must be like "411; 409" under the same key.
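As an illustration of how the "Language" string determines the language under test, the following Python sketch reads the first entry and reorders the list. It assumes the attribute is a semicolon-separated list of hexadecimal language IDs, as in the examples above (409 is English, 411 is Japanese). The function names are hypothetical, and the sketch works on plain strings only; it does not read or write the registry.

```python
def first_language(language_attr):
    """Return the first language ID in a "Language" attribute as an integer."""
    first = language_attr.split(";")[0].strip()
    return int(first, 16)  # the IDs are written in hexadecimal


def put_first(language_attr, wanted):
    """Return the attribute with `wanted` (a hex string) moved to the front."""
    ids = [part.strip() for part in language_attr.split(";") if part.strip()]
    # Stable sort: the wanted ID sorts first, all others keep their order.
    ids.sort(key=lambda lang: lang.lower() != wanted.lower())
    return "; ".join(ids)


print(hex(first_language("409; 411")))  # 0x409
print(put_first("409; 411", "411"))     # 411; 409
```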

 

8         SAPI Compliance: TTS

SAPI compliant TTS engines must be able to perform the following:

§         Speak with SAPI defined speak flags

§         Return the supported audio output formats

§         Interact with the SAPI lexicon

§         Interpret SAPI XML tags

§         React to programmatic volume changes

§         React to programmatic rate changes

§         Synthesize certain SAPI events

§         Skip forward and backward through a segment of text

§         Support multiple instances

 

8.1       Required Tests

8.1.1        Speak

The speak tests check the engine's ability to interact with ISpTTSEngine::Speak as well as to process and react to actions. Before SAPI passes a speak call to the engine, it calls QueryInterface (QI) on the engine object for the ISpTTSEngine and ISpObjectWithToken interfaces. If either of these is not implemented correctly, the speak call will fail. If both QIs succeed, SAPI passes the speak call to the engine. The speak call contains a number of speak flags. Table 12 describes the individual tests.

 

 

 

Test

Description

SPF_DEFAULT

This is the normal, default speak flag. The engine should be able to render text to the output site under normal conditions. The engine should also be able to continue rendering when passed the continue action from SAPI.

SPF_PURGEBEFORESPEAK

The engine should be able to stop rendering when passed the abort action.

SPF_IS_XML

The engine should be able to interpret the SAPI XML when passed to it by SAPI.

SPF_ASYNC

The engine should be able to render text to the output site. The asynchronous environment should not affect the engine since SAPI handles the multithreading issues.

SPF_SPEAK_NLP_PUNC

The engine should be able to speak punctuation.

Table 12: Speak Flag Tests

 

The compliance tests also test a number of combination scenarios to help ensure that the engine is able to stop speaking and start speaking in various combinations, as shown in Table 13.

 

Test

Description

Speak Destroy

SAPI sends the engine a speak call followed by an abort action.

Speak Stop

SAPI sends the engine a speak call followed by a purge call. This is to ensure that the engine is able to stop speaking and clear memory.

Table 13: Speak Tests

 

8.1.2        Output Format

The output format test checks to ensure that the engine is capable of passing its supported output format to SAPI. This is testing the ISpTTSEngine::GetOutputFormat function.

 

8.1.3        Lexicon

The lexicon tests check the interaction between the engine and the SAPI lexicon. These tests add a word to the user and application lexicons and check that the engine detects the words in the lexicon and, if a word is present in both lexicons, uses the pronunciation from the user lexicon first and the application lexicon second. There are two separate tests as shown:

Test

Description

User lexicon

A word is added to the user lexicon and the engine is requested to pronounce this word. The engine should be aware of the case sensitivity. The engine must support the SAPI phoneme, SAPI part of speech, as well as the lexicon APIs.

Application lexicon

A word is added to the application lexicon and the engine is requested to pronounce this word. The engine must support the SAPI phoneme, SAPI part of speech, as well as the lexicon APIs.

Table 14: Lexicon Tests
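The lookup order that these tests expect can be sketched as follows. This is a minimal Python model using plain dictionaries in place of the SAPI lexicon interfaces. The function name and the placeholder pronunciations are hypothetical.

```python
def lookup_pronunciation(word, user_lexicon, app_lexicon):
    """Return the pronunciation for `word`, preferring the user lexicon.

    The lookup is case sensitive: "Computer" and "computer" are different keys.
    Returns None when the word is in neither lexicon.
    """
    if word in user_lexicon:
        return user_lexicon[word]
    return app_lexicon.get(word)


# Placeholder pronunciations; real entries would use SAPI phonemes.
user_lexicon = {"Computer": "user pronunciation"}
app_lexicon = {"Computer": "application pronunciation", "test": "application only"}

print(lookup_pronunciation("Computer", user_lexicon, app_lexicon))  # user pronunciation
print(lookup_pronunciation("test", user_lexicon, app_lexicon))      # application only
```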

 

8.1.4        XML Tags

The SAPI XML tags are required for compliance. The following tag tests are required:

 

Test

Description

Bookmark

Tests if the engine is able to process the bookmark tag and write the appropriate bookmark event.

Silence

Tests if the engine outputs the correct amount of silence as indicated by the silence tag.

Volume

Tests if the engine can change the volume by the correct amount.

Spell 

Tests if the engine can handle the tag and spell out the text.

Pron

Tests if the engine can handle the SAPI defined phoneme set.

Rate

Tests if the engine can change the rate by the correct amount.

Pitch

Tests if the engine can change the pitch by the correct amount.

Context

Tests if the engine can handle the SAPI defined context tag.

Engine proprietary SAPI tags

Tests if the engine can handle non-SAPI tags.

Table 15: SAPI XML tests

 

8.1.5        SetVolume

This test checks to see if the engine is capable of processing the volume change action. When the engine receives this action, it should call the GetVolume function from SAPI to get the new volume, and reflect the change in the audio output.
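One way an engine might reflect a volume change in its output is to scale the audio samples it writes. The following Python sketch assumes signed 16-bit PCM samples, a volume level from 0 to 100, and simple linear scaling. The function name is hypothetical, and real engines are free to apply the volume differently.

```python
def apply_volume(samples, volume):
    """Scale signed 16-bit PCM samples by a 0-100 volume level (linear)."""
    if not 0 <= volume <= 100:
        raise ValueError("volume must be between 0 and 100")
    # int() truncates toward zero, so positive and negative samples
    # are treated symmetrically.
    return [int(sample * volume / 100) for sample in samples]


print(apply_volume([1000, -1000, 32767], 50))  # [500, -500, 16383]
```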

 

8.1.6        SetRate

This test checks to see if the engine is capable of processing the rate change action. When the engine receives this action, it should call the GetRate function from SAPI to get the new rate, and reflect the change in the audio output.

 

8.1.7    Events

The events test checks to ensure that the engine is writing the correct data, especially wParam and lParam, to the event structure. For the sentence boundary event, wParam is the character length of the sentence including punctuation in the current input stream being synthesized. lParam is the character position within the current text input stream of the sentence being synthesized. For the word boundary event, wParam is the character length of the word in the current input stream being synthesized. lParam is the character position within the current text input stream of the word being synthesized. Any leading and ending spaces will not be included in the length of the word or the sentence. There are three events which are required for SAPI compliance:

 

Test

Description

SPEI_TTS_BOOKMARK

Checks the engine's ability to properly fire bookmarks embedded in the text.

SPEI_WORD_BOUNDARY

Checks the engine's ability to generate word boundaries given a segment of text.

SPEI_SENTENCE_BOUNDARY

Checks the engine's ability to detect and generate sentence boundaries.

Table 16: Events Tests
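The position and length values described above can be illustrated with a short Python sketch. It returns (lParam, wParam) pairs, that is, the character position followed by the character length. The regular expressions are assumptions made for illustration: words are treated as runs of characters without white space or common punctuation, and sentences are assumed to end at ".", "!" or "?". A real engine would use its own word and sentence breaker.

```python
import re

WORD = re.compile(r"[^\s.,!?;:]+")       # assumption: punctuation is not part of a word
SENTENCE = re.compile(r"[^.!?]+[.!?]*")  # assumption: ., ! and ? end a sentence


def word_events(text):
    """Return (position, length) for each word in `text`."""
    return [(m.start(), len(m.group())) for m in WORD.finditer(text)]


def sentence_events(text):
    """Return (position, length) per sentence, without surrounding spaces."""
    events = []
    for m in SENTENCE.finditer(text):
        chunk = m.group()
        body = chunk.strip()
        if body:
            leading = len(chunk) - len(chunk.lstrip())
            events.append((m.start() + leading, len(body)))
    return events


print(word_events("This is a test."))                   # [(0, 4), (5, 2), (8, 1), (10, 4)]
print(sentence_events("Hello world. This is a test."))  # [(0, 12), (13, 15)]
```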

 

8.1.8        Skip

This test examines the engine's ability to interact with the skip action. Once the engine receives this action, it should call the ISpTTSEngineSite::GetSkipInfo function. After it has completed the skip, it should call the ISpTTSEngineSite::CompleteSkip function.
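The bookkeeping behind a skip can be modeled as follows. This Python sketch is not the SAPI interface: it only shows how an engine might turn a requested skip count into a new sentence position and the count it reports back. Clamping at the first and last sentence is an assumption, and the function name is hypothetical.

```python
def complete_skip(current, total, requested):
    """Return (new_index, skipped) for a skip of `requested` sentences.

    `requested` may be negative to skip backward. The target is clamped to
    the range 0 .. total - 1, so `skipped` can be smaller in magnitude than
    `requested` when the skip runs past either end of the text.
    """
    target = max(0, min(total - 1, current + requested))
    return target, target - current


print(complete_skip(2, 10, 3))   # (5, 3)
print(complete_skip(2, 10, -5))  # (0, -2)
```

In this model the second value is what the engine would pass back when it reports that the skip is complete.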

 

8.1.9    Multi-Instance

The multiple instances test checks that the engine can handle multiple simultaneous calls from SAPI. The test uses a total of 4 threads, each with its own ISpVoice object, and runs a random combination of the following tests 20 times consecutively:

 

§         Speak

§         Skip

§         GetOutputFormat

§         SetRate

§         SetVolume

§         Check SAPI required Event

§         XML Bookmark

§         XML Silence

§         XML Spell

§         XML Pron

§         XML Rate

§         XML Volume

§         XML Pitch

§         Real time Rate changes

§         Real time Volume changes

§         Speak Stop

§         Lexicon

§         XML context

§         Engine proprietary SAPI tags and other combination of XML tags
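The shape of this test can be sketched in Python as a small thread harness. It is an illustration only: the `run_test` callback stands in for the real tests, only a few test names from the list above are included, and running 20 iterations in each of the 4 threads is one reading of the description.

```python
import random
import threading

# A subset of the test names listed above.
TESTS = ["Speak", "Skip", "GetOutputFormat", "SetRate", "SetVolume"]


def run_multi_instance(run_test, threads=4, iterations=20, seed=0):
    """Run randomly chosen tests on several threads and collect the results."""
    results = []
    lock = threading.Lock()

    def worker(index):
        rng = random.Random(seed + index)  # each thread has its own generator
        for _ in range(iterations):
            name = rng.choice(TESTS)
            outcome = run_test(index, name)
            with lock:  # protect the shared result list
                results.append((index, name, outcome))

    pool = [threading.Thread(target=worker, args=(i,)) for i in range(threads)]
    for thread in pool:
        thread.start()
    for thread in pool:
        thread.join()
    return results


results = run_multi_instance(lambda index, name: True)
print(len(results))  # 80
```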

 

8.2       Feature Tests

Some of the features exposed through SAPI can offer a competitive advantage. Features are not required for SAPI compliance, but may be attractive for engine vendors to implement. The SAPI features are:

§         Generation of phoneme events (determines if the engine can generate a phoneme event for a given string of text. The phonemes must correspond to the SAPI defined phonemes).

§         Generation of viseme events (determines if the engine can generate a viseme for a given string of text. The viseme must correspond with the SAPI defined visemes).

§         XML emphasis tag (the engine should change the volume, rate, or pitch of the audio rendered).

§         XML PartOfSp tag (the engine should handle the SAPI defined part of speech; the engine will need to implement this for the lexicon compatibility tests).

 

 

8.3       TTS Sample Engine

The sample engine is not fully SAPI compliant because it does not have the full range of functionality that a true TTS engine would have. Table 17 indicates which compliance tests will pass. Table 18 indicates which features are supported.

 

Test

Result

Description

Speak

 

 

SPF_DEFAULT

Pass

 

SPF_PURGEBEFORESPEAK

Pass

 

SPF_IS_XML

Pass

 

SPF_ASYNC

Pass

 

SPF_SPEAK_NLP_PUNC

Pass

 

Speak Destroy

Pass

 

Speak Stop

Pass

 

GetOutputFormat

Pass

 

Lexicon

 

 

User Lexicon

Fail

The sample engine does not use a lexicon.

App Lexicon

Fail

The sample engine does not use an application lexicon.

XML Tags

 

 

Bookmark

Pass

 

Silence

Pass

 

Volume

Fail

The sample engine uses pre-recorded .wav files and cannot adjust the volume of the .wav files.

Spell

Fail

The sample engine uses pre-recorded .wav files and cannot spell each word.

Pron

Fail

The sample engine uses pre-recorded .wav files and cannot interpret the SAPI phonemes.

Rate

Fail

The sample engine uses pre-recorded .wav files and cannot adjust the rate of the .wav files.

Pitch

Fail

The sample engine uses pre-recorded .wav files and cannot adjust the pitch of the .wav files.

Context

Pass

 

Engine proprietary SAPI tags

Pass

 

SetVolume

Fail

The sample engine uses pre-recorded .wav files and cannot adjust the volume of the .wav files.

SetRate

Fail

The sample engine uses pre-recorded .wav files and cannot adjust the rate of the .wav files.

Events

 

 

SPEI_TTS_BOOKMARK

Pass

 

SPEI_WORD_BOUNDARY

Pass

 

SPEI_SENTENCE_BOUNDARY

Pass

 

Skip

Fail

The sample engine uses pre-recorded .wav files and cannot skip sentences since it does not have a sentence breaker.

Multiple Instances Test

Fail

The sample engine uses pre-recorded .wav files and cannot skip sentences, change rate, pitch, and volume, or use lexicons.

Table 17: Sample Engine Required Test Results

 

 

Test

Result

Description

Phoneme events

UNSUPPORTED

The sample engine uses pre-recorded .wav files and cannot synthesize the phoneme events.

Viseme events

UNSUPPORTED

The sample engine uses pre-recorded .wav files and cannot synthesize the viseme events.

XML Emph Tag

UNSUPPORTED

The sample engine uses pre-recorded .wav files and cannot adjust the emphasis of the .wav files.

XML PartOfSp

UNSUPPORTED

The sample engine uses pre-recorded .wav files and cannot adjust the part of speech of the .wav files.

Table 18: Sample Engine Feature List Test Results

 

8.4       Multilingual Support

To test an engine that uses a language other than the supported languages, one must:

 

§         Ensure that the correct language pack is installed. For Windows 2000 and Millennium Edition, this may be done by installing the language pack from the Windows 2000 or Windows Millennium CD. For Windows 98 and Windows NT 4.0, install the language pack from the Windows Update web site.

§         Select the engine as the default engine using the TTS tab in Speech properties.

§         Create a string table in the \sapi5sdk\tools\comp\tts\ttscomp.rc which is localized for the language. (Refer to Table 19) 

o       Go to ResourceView in the ttscomp workspace, right click the mouse on "String Table", and select "Insert Copy". The following window will appear. From the window, select the language that the engine supports, and then click OK.

[Image: language selection window for the new string table copy]

 

 

 

 

o       Open \sapi5sdk\tools\comp\tts\ttscomp.rc in your editor and edit your language resources (Refer to Table 19), and then save ttscomp.rc.

o       The following is an example of how to support the GetOutputFormat test in Korean using the Microsoft FrontPage editor:

 

§         Open ttscomp.dsw and create a string table in Korean, save ttscomp.rc

§         Go to Table 19 below and find the strings used in the GetOutputFormat test. Only one string, IDS_STRING65, corresponds to the test.

§         Open ttscomp.rc in Notepad and find IDS_STRING65 under Korean resources

§         Launch Microsoft FrontPage and select File | New and Normal tab

§         Translate "This is the TTS Compliance Test" to Korean.

§         Select the Preview tab, right-click, select Encoding | Western European (Windows), and then cut and paste the string from Microsoft FrontPage into IDS_STRING65 under the Korean resources in Notepad.

§         Save ttscomp.rc

 

§         Recompile the tts.dsp.

§         Run the compliance tests.

 

The strings that need to be localized are:

Test Name

String Number

English Text

 

Speak Destroy

IDS_STRING6

This is a long string of text that will not complete because it will be released it in the next line of code. The engine is expected to clean-up correctly and not fault.

Translate string

Speak

 

IDS_STRING8

Hello <BOOKMARK MARK= "12">World

Translate "Hello World"

IDS_STRING10

This is a test.

Translate string

IDS_STRING11

Blah blah …

May need to translate if engine does not understand phonemes

Phoneme & Viseme Events

IDS_STRING10

 

 

SetVolume

 

IDS_STRING10

 

 

SetRate

IDS_STRING10

 

 

Check SAPI required Events

 

IDS_STRING20

Bookmark <BOOKMARK MARK= "123"/>test

Translate "Bookmark test"

XML Bookmark

IDS_STRING20

 

 

XML Silence

 

IDS_STRING23

Hello World

Translate string

IDS_STRING24

Hello <SILENCE MSEC = "8000"/> World

Translate "Hello World"

XML Spell

 

 

IDS_STRING26

<SAPI> ENGLISH LANGUAGE</SAPI>

May need to translate "ENGLISH LANGUAGE" if letters are not known (Refer to IDS_STRING27)

IDS_STRING27

<SPELL> ENGLISH LANGUAGE </SPELL>

May need to translate "ENGLISH LANGUAGE" if letters are not known

XML Rate

 

IDS_STRING30

<RATE SPEED= "-5"> hello world </RATE>

Translate "hello world"

IDS_STRING31

<RATE SPEED= "5"> hello world </RATE>

Translate "hello world"

XML Volume

 

IDS_STRING33

<VOLUME LEVEL = "100"> hello </VOLUME>

Translate "hello"

IDS_STRING34

<VOLUME LEVEL = "1"> hello </VOLUME>

Translate "hello"

XML Pitch

 

IDS_STRING37

<PITCH MIDDLE ="-10"> a </PITCH>

Translate the letter "a"

IDS_STRING38

<PITCH MIDDLE ="+10"> a </PITCH>

Translate the letter "a"

XML PartOfSp

 

 

 

 

IDS_STRING48

H l ow

Translate phonemes

IDS_STRING52

N ow n p r ow n ow aa aa ah ao aw b ch eh er

Translate phonemes

IDS_STRING76

test

Translate "test"

Real time volume change

 

IDS_STRING53

This <BOOKMARK MARK="1234"/>string is used in the real time rate and volume tests. It's rate and volume are adjusted mid stream. Engines should pick these changes up.

Translate "This string is used in the real time rate and volume tests. It's rate and volume are adjusted mid stream. Engines should pick these changes up."

Real time rate change

IDS_STRING53

 

 

XML Pronounce

 

IDS_STRING56

A

Translate the word "a"

IDS_STRING57

<PRON SYM="aa n th ow p ow l ow jh iy aa n th ow p ow l ow jh iy aa n th ow p ow l ow jh iy">a</PRON>

Translate phonemes

Skip

IDS_STRING63

<SAPI>one.  Two.  Three. Four. Five. Six.  Seven. Eight. Nine. Ten. <BOOKMARK MARK="123"/>bookmark event.  One.  Two.  Three. Four. Five. Six.  Seven. Eight. Nine. Ten. </SAPI>

Translate "one.  Two.  Three. Four. Five. Six.  Seven. Eight. Nine. Ten. Bookmark event.  One.  Two.  Three. Four. Five. Six.  Seven. Eight. Nine. Ten."

User Lexicon Test

 

IDS_STRING64

Computer

Translate "computer"

IDS_STRING71

dh aa n th ow p ow l ow jh iy aa n th ow p ow l ow jh iy aa n th ow p ow l ow jh ch ow ao ah ow ow p ow l ow jh ch ow ao ah ow

Translate phonemes

IDS_STRING76

 

 

IDS_STRING94

h eh l ow w er l d h eh l ow w er l d h eh l ow w er l d h eh l ow w er l d

Translate phonemes

GetOutputFormat

IDS_STRING65

This is the TTS Compliance Test

Translate "This is the TTS Compliance Test"

 

XML Non-SAPI tags

IDS_STRING67

<SOMEBOGUSTAGS> Non-SAPI tags test </SOMEBOGUSTAGS>

Translate "Non-SAPI tags test"

XML Emph

 

IDS_STRING68

<SAPI>Do you hear me?</SAPI>

Translate "Do you hear me?"

IDS_STRING69

<SAPI><EMPH>Do you hear</EMPH>me?</SAPI>

Translate "Do you hear me?"

App Lexicon Test

IDS_STRING76

 

 

IDS_STRING94

 

 

XML Context

 

IDS_STRING99

<context id="date_mdy">12/21/99</context><context id="date_mdy">12.21.00</context> <context id="date_mdy">12-21-9999</context>

May need to translate dates

IDS_STRING100

<context id="date_dmy">21/12/00</context><context id="date_dmy">21.12.33</context><context id="date_dmy">21-12-1999</context>

May need to translate dates

IDS_STRING101

<context id="date_ymd">99/12/21</context><context id="date_ymd">99.12.21</context> <context id="date_ymd">1999-12-21</context>

May need to translate dates

IDS_STRING102

<context id="date_ym">99-12</context><context id="date_ym">1999.12</context><context id="date_ym">99/12</context>

May need to translate dates

IDS_STRING103

<context id="date_my">12-99</context><context id="date_my">12.1999</context><context id="date_my">12/99</context>

May need to translate dates

IDS_STRING104

<context id="date_dm">21.12</context> <context id="date_dm">21-12</context> <context id="date_dm">21/12</context>"

May need to translate dates

IDS_STRING105

<context id="date_md">12-21</context> <context id="date_md">12.21</context> <context id="date_md">12/21</context>

May need to translate dates

IDS_STRING106

<context ID = "date_year"> 1999</context> <context ID = "date_year"> 2001</context>

May need to translate dates

IDS_STRING107

<context id='time">12:30:10</context><context id='time">12:30</context><context id='time">1"21"</context>

May need to translate times

IDS_STRING108

<context id="number_cardinal">3432</context>

May need to translate number

IDS_STRING109

<context id="number_digit">3432</context>

May need to translate number

IDS_STRING110

<context id="number_fraction">3/15</context>

May need to translate number

IDS_STRING111

<context id="number_decimal">423.12433</context>

May need to translate number

IDS_STRING112

<context id="phone_number">(425)706-2693</context>

May need to translate phone number

IDS_STRING113

<context id="currency">$12312.90</context>

May need to translate currency

IDS_STRING116

<context ID = "address">One Microsoft Way, Redmond, WA, 98052</context>

May need to translate address

IDS_STRING117

<context ID = "address_postal"> A2C 4X5</context>

May need to translate postal address

IDS_STRING118

<CONTEXT ID = "MS_My_Context"> text </CONTEXT>

Translate "text"

Multiple-Instance Test

 

IDS_STRING41

Hello <SILENCE MSEC = "%d"/> World

Translate "Hello World"

IDS_STRING43

<PRON SYM = "aa n th ow p ow l iw jh iy"> hello </PRON>

Translate SYM phonemes and "hello"

IDS_STRING44

<RATE SPEED = "%d"> hello </RATE>

Translate "Hello"

IDS_STRING45

<VOLUME LEVEL = "%d"> hello </VOLUME>

Translate "Hello"

IDS_STRING46

<PITCH MIDDLE = "%d"> hello </PITCH>

Translate "hello"

IDS_STRING8

Hello <BOOKMARK MARK= "12">World

Translate "Hello World"

IDS_STRING27

 

 

IDS_STRING67

 

 

IDS_STRING76

 

 

IDS_STRING94

 

 

IDS_STRING119

 

 

Table 19: Strings to be localized for compliance tests

 

It is important to note that for languages which do not have a SAPI standard phoneme set (i.e. languages which are not supported in this SAPI release), the engine will fail the following required compliance tests:

 

Test

Result

Lexicon

 

User Lexicon

Fail

App Lexicon

Fail

XML Tags

 

Pron

Fail

Table 21: Required Compliance Tests Failed

 

The engine will also fail the following feature tests:

 

Test

Result

Phoneme events

UNSUPPORTED

Viseme events

UNSUPPORTED

XML PartOfSp

UNSUPPORTED

Table 22: Feature Compliance Tests Not Supported

 

 

8.5       OS Language Incompatibility

The compliance tests are not based on the language of the OS. They are based on the first language in the token of the default engine. In other words, a Japanese engine on an English OS will cause the compliance tests to load the Japanese resources and run the compliance tests expecting a Japanese engine. In order to run the compliance test for an engine that has a different language than the OS, you will need to set the default engine on the TTS tab of Speech properties.

[1] For SR Simplified Chinese, the .wav files, cfg files, and resources do not ship with the SAPI 5.0 SDK. To request these localized files, please send mail to sapi5@microsoft.com.

[2] All SR tests use the default recognition profile. To increase the accuracy of the tests, you may wish to change the .wav files used (Refer to Section 7.4) or train the recognition profile.

[3] The Simplified Chinese SR resource files, cfg files and .wav files are not included in the SAPI 5 SDK. Please e-mail sapi5@microsoft.com to obtain these files.

[4] These wav files MUST have different names than the original wav files. The new wav file names should be reflected in the string table.

[5] A basic assumption of the compliance tests is that the CFG files are included in the dll whereas the wav files are external.