How to: Design Grammar Rules

08/18/2014

This content is no longer actively maintained. It is provided as is, for anyone who may still be using these technologies, with no warranties or claims of accuracy with regard to the most recent product version or service release.

A grammar contains a set of rules that specify the spoken words, phrases, and commands that a Web-based voice response application recognizes. XML elements and plain text are used to create the rules that identify the words or phrases that make up a user's spoken commands. These rules are represented by rule elements. In addition to using rules to identify spoken commands, you can add structure to rules by organizing spoken commands into groups of semantically related words or phrases. The resulting logical structure makes it easy to reuse rules within the containing grammar or from external grammars.

A grammar must provide structured, logical speech statements that apply to specific situations. At the same time, a grammar must be general enough to allow slight variations in the statement to enable a more natural speaking style and provide a better user experience. For example, a coffee ordering application must accept and respond to multiple ways that a user can order coffee:

"I would like a decaf latte"
"I want a decaf latte"
"Gimme a decaf latte"

A coffee ordering application must also process and respond to orders for different types of coffee drinks in multiple variations. For example, an application must be able to substitute "cappuccino" or "mocha" for the word "latte" and must be able to accept modifiers for those drinks, such as "decaf" or "iced." However, a coffee ordering application does not need to process and respond to orders for airline tickets; a grammar need only include words and phrases for a specific situation or task.

You can define a grammar structure that uses different elements in a specific sequence to provide the user with logical variations of phrases or single word options. For specific information about each valid grammar element, see Grammar XML. For an introduction to the elements used in a grammar structure, see Grammars Overview.

The following grammar code examples are simple progressions that demonstrate the process for designing grammars and grammar rules and for referencing grammar rules appropriately within a grammar structure. Ultimately, the grammar can become complex. For a moderately complex grammar that contains several rules, see Grammar Example: Solitaire.

Use the following procedures to assist you in designing grammar rules and in using rules appropriately within a grammar structure.

To design grammar rules

Create a List of Spoken User Commands.
Create Rules for Command Recognition.
Create Recognizable Sentences for User Commands.
Create a Series of Recognizable User Commands.
Create Variations of User Commands.

Create a List of Spoken User Commands

To create a grammar for user commands

To begin designing a grammar, list the obvious spoken commands that a user might say. In a simple case, user commands can be a single word, such as Open or Print.

Using the following information and code example as a guide, place the list of spoken user commands into a grammar structure.

grammar

Every grammar structure must include the grammar element's start and end tags. The grammar element is a container for all grammar rule definitions. The grammar element has the following required attributes to further define the grammar:

version attribute - An identifier (the default value is 1.0) that identifies the version of the XML Speech Recognition Grammar Format.
xml:lang attribute - The identifier of the language used in the grammar document (for example, en-US).
root attribute - The explicit name of the default grammar rule.
xmlns attribute - The XML namespace (http://www.w3.org/2001/06/grammar).
tag-format attribute - An identifier (default value: semantics-ms/1.0) that identifies the content type of all tag elements contained in the grammar.

rule

Each rule element must contain a unique identifier that defines the rule. The ID attribute provides this unique identifier. Developers can reference the grammar rule from elsewhere within the containing grammar by using the ID attribute. The example that follows uses the value ruleOpen for the ID attribute. The scope attribute of the rule element designates the rule as public or private, which indicates whether the rule can be referenced from an external grammar structure.

item

Each item element specifies a possible command that a user might say. In the example that follows, the item element contains the user command Open. Each item element can contain one word or a phrase. A rule does not require item elements.

Using the following example grammar code as a template, create a grammar structure that contains your list of spoken commands. In the example, the user's command is Open.

<grammar root="ruleOpen" version="1.0"
xmlns="http://www.w3.org/2001/06/grammar"
 xml:lang="en-US" tag-format="semantics-ms/1.0" ... >
    <rule id="ruleOpen" scope="public">
        <item>open</item>
    </rule>
</grammar>
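
The statement that a rule does not require item elements can be illustrated with a minimal sketch. It assumes the recognizer follows the W3C Speech Recognition Grammar Specification, which allows command text to appear directly inside a rule element; the attribute values are copied from the example above.

```xml
<grammar root="ruleOpen" version="1.0"
         xmlns="http://www.w3.org/2001/06/grammar"
         xml:lang="en-US" tag-format="semantics-ms/1.0">
    <!-- The command text sits directly in the rule; no item element is used -->
    <rule id="ruleOpen" scope="public">
        open
    </rule>
</grammar>
```

Wrapping the text in an item element, as the template above does, makes it easier to add alternatives later.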

Create rules for command recognition as described in the next section.

Create Rules for Command Recognition

A grammar can designate one rule, called the root rule, as the rule that an application uses by default for user command recognition. The identifying name of the root rule is the value of the root attribute in the grammar element. Each rule is a separate and independent definition that uniquely identifies the user commands an application uses for recognition.

To create grammar rules for command recognition

1. Create two rules that define the content for two user commands: Open and Print. In the code example listed in step 4, ruleOpen identifies the Open command and rulePrint identifies the Print command.
2. Create a top-level rule containing a list of the two alternatives by placing two item elements in a one-of element. In the first item, place a ruleref element that refers to the ruleOpen rule. In the second item, place a ruleref element that refers to the rulePrint rule.
3. Specify the top-level rule as the root rule of the grammar by setting the value of the root attribute in the grammar element to the value of the ID attribute in the top-level rule element. In the example, ruleTopLevel is the value to use.
4. Using the following grammar example code as a guide, create the grammar structure.

<grammar root="ruleTopLevel" version="1.0"
xmlns="http://www.w3.org/2001/06/grammar"
 xml:lang="en-US" tag-format="semantics-ms/1.0" ... >
    <rule id="ruleTopLevel" scope="public">
        <one-of>
            <item> <ruleref uri="#ruleOpen" /> </item>
            <item> <ruleref uri="#rulePrint" /> </item>
        </one-of>
    </rule>
    <rule id="ruleOpen" scope="public">
        <item>open</item>
    </rule>
    <rule id="rulePrint" scope="public">
        <item>print</item>
    </rule>
</grammar>

The user can say "open" or "print," and the application recognizes the commands by matching the appropriate rule identifiers.
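
The scope attribute described earlier can be sketched with the same grammar. In this variation, which is an illustration rather than part of the original procedure, ruleOpen and rulePrint are referenced only from ruleTopLevel, so they are marked private. This assumes that private rules remain reachable through ruleref elements inside the containing grammar, as the description of the scope attribute implies.

```xml
<grammar root="ruleTopLevel" version="1.0"
         xmlns="http://www.w3.org/2001/06/grammar"
         xml:lang="en-US" tag-format="semantics-ms/1.0">
    <!-- Public: can also be referenced from an external grammar -->
    <rule id="ruleTopLevel" scope="public">
        <one-of>
            <item><ruleref uri="#ruleOpen"/></item>
            <item><ruleref uri="#rulePrint"/></item>
        </one-of>
    </rule>
    <!-- Private: intended for use only inside this grammar -->
    <rule id="ruleOpen" scope="private">
        <item>open</item>
    </rule>
    <rule id="rulePrint" scope="private">
        <item>print</item>
    </rule>
</grammar>
```

Keeping helper rules private limits what other grammars can depend on, which makes later changes to those rules safer.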

Create recognizable sentences for user commands as described in the next section.

Create Recognizable Sentences for User Commands

Within a rule, all the words and the sequence or pattern of the words must match for a successful recognition. When using phrases or sentences as the user commands (the text or a phrase inside the item elements), developers must include all the words a user needs to say to accomplish the command or task. Developers can separate item text or phrases into segments.

To create recognizable sentences

1. Define the grammar structure for a recognizable command sentence as illustrated in the XML code in step 3.
2. Insert a top-level rule element, set its ID attribute to ruleCoffee, and set the grammar element's root attribute to the same value. This is illustrated in the XML code in step 3.
3. Define the spoken content for the item element. Use a familiar user command such as "I would like a coffee," as shown in the following code.
```
<grammar root="ruleCoffee" version="1.0"
xmlns="http://www.w3.org/2001/06/grammar"
 xml:lang="en-US" tag-format="semantics-ms/1.0" ... >
    <rule id="ruleCoffee" scope="public">
        <item>I would like a coffee</item>
    </rule>
</grammar>
```
The user must say the exact phrase, "I would like a coffee," for the application to match the rule and result in a successful recognition.
Create a series of possible user commands as described in the next section.

Create a Series of Recognizable User Commands

Often the user is presented with a list of choices; however, these additional choices do not change the basic structure of the request. It is still appropriate for a user to say "I would like a...", and only the last word changes. To start this task, use the following grammar code example, and build on it.

<grammar root="ruleCoffee" version="1.0"
xmlns="http://www.w3.org/2001/06/grammar"
 xml:lang="en-US" tag-format="semantics-ms/1.0" ... >
    <rule id="ruleCoffee" scope="public">
        <item>I would like a</item>
        <item>coffee</item>
    </rule>
</grammar>

To create a series of recognizable user commands

1. Insert a one-of element after the first item element already defined for this grammar rule. A one-of element presents the user with a selection of recognizable choices. It is a subcontainer that can consist of nested item elements, which contain the text or phrases that a user might say.
2. Nest a series of item elements within the opening and closing tags of the one-of element, as shown in the code in step 4.
3. Define the item text or phrases that make up the series of possible user commands: coffee, mocha, latte, and water.
4. Use the following grammar code example to create a grammar structure that presents four phrases for possible user commands.

<grammar root="ruleCoffee" version="1.0"
xmlns="http://www.w3.org/2001/06/grammar"
 xml:lang="en-US" tag-format="semantics-ms/1.0" ... >
    <rule id="ruleCoffee" scope="public">
        <item>I would like a</item>
        <one-of>
            <item>coffee</item>
            <item>mocha</item>
            <item>latte</item>
            <item>water</item>
        </one-of>
    </rule>
</grammar>

The application matches the ruleCoffee rule if the user says "I would like a" followed by the text or phrase contained in any one of the item elements in the one-of element. The user can now say "I would like a coffee" or "I would like a latte," and the application recognizes either command and matches the rule.
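
The same one-of technique can cover the opening phrases listed at the start of this topic ("I would like," "I want," and "Gimme"). The following is a minimal sketch, not part of the original procedure, that places two one-of elements in sequence: the first lists the opening phrases and the second lists the drinks.

```xml
<grammar root="ruleCoffee" version="1.0"
         xmlns="http://www.w3.org/2001/06/grammar"
         xml:lang="en-US" tag-format="semantics-ms/1.0">
    <rule id="ruleCoffee" scope="public">
        <!-- First choice: how the user opens the request -->
        <one-of>
            <item>I would like a</item>
            <item>I want a</item>
            <item>gimme a</item>
        </one-of>
        <!-- Second choice: the drink being ordered -->
        <one-of>
            <item>coffee</item>
            <item>mocha</item>
            <item>latte</item>
            <item>water</item>
        </one-of>
    </rule>
</grammar>
```

Because the two lists are matched in sequence, three opening phrases and four drinks describe twelve accepted sentences, such as "I want a mocha" or "Gimme a latte."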

Create variations of user commands as described in the next section.

Create Variations of User Commands

To make the user commands more versatile, define the text or phrase for additional grammar one-of and item elements. For reuse and ease of referencing from within a containing grammar or an external grammar, encapsulate these additional sets of one-of and item elements into additional rules.

The following instructions show how to create variations for user commands and use the code in the previous example (the grammar containing the ruleCoffee rule) as a starting point.

To create variations for user commands

1. Add another one-of element, below the closing tag of the existing one-of element, to allow the user to modify the drink order.
2. Encapsulate the existing one-of element and its nested item elements, which define the drink types, in another rule element, and then set that rule's ID attribute to drinkTypes for reuse and ease of reference.
3. Encapsulate the new one-of element added in step 1 in another rule element, and then set that rule's ID attribute to drinkVariations to identify the coffee variations for reuse and ease of reference.
4. Nest a series of item elements within the new one-of element, and then define the item text or phrase using "decaf," "hot," and "iced." The user can now order a latte, coffee, mocha, or water with a variation.
5. In the ruleCoffee rule, insert an item element, and then insert the grammar element described in the following table to reference the drinkVariations rule created in step 3.

Element	Description
ruleref	The ruleref element imports rules from the containing grammar or from an external grammar file. The URI attribute identifies the referenced rule. The ruleref element is especially useful for reusing component or predefined rules and grammars.

6. Define the URI attribute for the ruleref element created in step 5, using the #drinkVariations rule identifier, to reference the rule created in step 3.
7. Insert another ruleref element after the item element created in step 5, and then define the URI attribute with the #drinkTypes rule identifier to reference the rule created in step 2.
8. Use the following grammar code example to create a grammar structure that identifies user command variations and enables the application to match the rule for a successful recognition.

<grammar root="ruleCoffee" version="1.0"
xmlns="http://www.w3.org/2001/06/grammar"
 xml:lang="en-US" tag-format="semantics-ms/1.0" ... >
    <rule id="ruleCoffee" scope="public">
        <item>I would like a</item>
        <ruleref uri="#drinkVariations"/>
        <ruleref uri="#drinkTypes"/>
        <!-- Special rule GARBAGE: matches any speech that follows the drink name -->
        <ruleref special="GARBAGE"/>
    </rule>

    <rule id="drinkTypes" scope="public">
        <one-of>
            <item>coffee</item>
            <item>mocha</item>
            <item>latte</item>
            <item>water</item>
        </one-of>
    </rule>

    <rule id="drinkVariations" scope="public">
        <one-of>
            <item>decaf</item>
            <item>hot</item>
            <item>iced</item>
        </one-of>
    </rule>
</grammar>

The drinkTypes and drinkVariations rules encapsulate the series of drink types and drink variations so that they can be reused and referenced easily. The ruleCoffee rule references both rules through ruleref elements. The application matches the ruleCoffee rule if the user says "I would like a" followed by the text or phrase from one item in each referenced rule, in the order defined by the placement of the ruleref elements: "I would like a decaf latte" matches, but "I would like a latte decaf" does not.
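
Because drinkTypes is a public rule, a separate grammar could reuse it. The following is a minimal sketch under stated assumptions: the grammar above is saved as drinks.grxml (a hypothetical file name) in the same location as the referencing grammar, and ruleRefill is a hypothetical rule name introduced only for illustration.

```xml
<grammar root="ruleRefill" version="1.0"
         xmlns="http://www.w3.org/2001/06/grammar"
         xml:lang="en-US" tag-format="semantics-ms/1.0">
    <rule id="ruleRefill" scope="public">
        <item>I would like another</item>
        <!-- File name before '#', rule ID after it; drinks.grxml is a hypothetical file name -->
        <ruleref uri="drinks.grxml#drinkTypes"/>
    </rule>
</grammar>
```

With this arrangement, the list of drinks is maintained in one place, and a phrase such as "I would like another mocha" is matched by way of the external rule.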
