Delivering Help
Everything a system says to users should help them complete their task. Nevertheless, no matter how well designed a system and how clearly written the initial prompts, some users will inevitably need and ask for help.
To provide the right information in the right order, designers need to imagine every possible reason why a user might need assistance at any given time. The difficulty is that a system typically does not know how the user arrived at the moment of the help request, because the user's progress through the application was not tracked.
Help in speech recognition systems falls into three basic categories:
- Introductory or tutorial help
- Error recovery help
- Explicit responses to in-context help requests
Introductory or Tutorial Help
When a user first connects to a large system, whether personalized or not, there may be a tutorial or introductory message to explain basic facts about the system's operation. The tutorial may include a synopsis of functions the system is able to perform or perhaps some basic guidelines for communicating with the recognizer.
Because of the linear nature of speech, delivering even moderately complex tutorials through narration is a considerable challenge. How can a system keep callers engaged without making them feel bombarded by long or boring instructions? A user's attention wanders when instructions run long.
One approach is to mark the beginning of the tutorial with an indication of how much information follows and a quick overview of what is to come. Stating how many items will be covered also helps callers understand where they are in the tutorial.
SYSTEM: | Hi. Thanks for calling Wing Tip Toy's automated help line. As you use the system, there are three basic things you'll want to remember... |
Callers may be reluctant to interrupt the tutorial, even if they feel they have already heard the information they need, because they worry about missing some other important tip. When the system sets expectations up front, users who have heard what they need can interrupt with more confidence, particularly if discourse markers (such as "and finally") are clearly used.
SYSTEM: | ...and finally, if you'd like to receive our compli- |
CALLER: | (barging in) Check on my order. |
During introductory tutorials, it is advisable to allow barge-in only with a very limited set of commands, which prevents (or at least minimizes) accidental barge-in during the instructions. These should be specific commands that only experienced users are likely to utter, and they should be ignored unless the recognizer returns a high confidence rating.
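The barge-in rule described above can be sketched in a few lines of Python. This is a minimal illustration, not part of any real speech platform: the `RecognitionResult` type, the command list, and the 0.85 confidence threshold are all assumptions made for the example, and a production system would take these values from its own recognizer and tuning data.

```python
from dataclasses import dataclass

# Hypothetical values for illustration only; neither comes from the source text.
TUTORIAL_BARGE_IN_COMMANDS = {"check on my order", "skip tutorial", "main menu"}
MIN_CONFIDENCE = 0.85


@dataclass
class RecognitionResult:
    """A simplified stand-in for what a recognizer might return."""
    text: str
    confidence: float  # assumed to range from 0.0 to 1.0


def should_interrupt_tutorial(result: RecognitionResult) -> bool:
    """Allow barge-in only for a listed command recognized with high confidence."""
    phrase = result.text.strip().lower()
    is_known_command = phrase in TUTORIAL_BARGE_IN_COMMANDS
    is_confident = result.confidence >= MIN_CONFIDENCE
    return is_known_command and is_confident
```

With these assumed values, "Check on my order" recognized at a confidence of 0.93 would stop the tutorial, while the same phrase at 0.40, or an unlisted utterance at any confidence, would be ignored and the tutorial would continue.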
Error Recovery Help
Error handling prompts embedded in each dialogue are probably the most common type of help in systems today.
As described previously in the dialogue design section, most basic dialogue error handling involves one or more prompts that support the initial question or prompt. These prompts offer help to the user in the event of silence or out-of-grammar speech.
The user does not ask for error recovery help; the system offers it in response to silence or out-of-grammar speech. Because designers cannot know exactly why an error has occurred, prompts should be as brief and as targeted as possible. In the case of misrecognitions, the prompt may include examples of in-grammar commands, as well as information about how the user can ask for more help. Systems can avoid gratuitous help by including a pause between the restatement of the main question and the subsequent help information. This pause cues users that they can barge in before the system embarks on lengthier help information.
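One way to picture the structure of such an error prompt is as a short sequence of spoken segments with a pause after the restated question. The sketch below is only an illustration: the segment representation, the `build_error_prompt` name, and the 1.5 second default pause are assumptions, not values taken from any particular platform.

```python
from typing import List, Tuple, Union

# Each segment is ("speak", text) or ("pause", seconds). This representation
# is a hypothetical one chosen for the example.
Segment = Tuple[str, Union[str, float]]


def build_error_prompt(
    question: str,
    examples: List[str],
    pause_seconds: float = 1.5,  # assumed value; tune against real callers
) -> List[Segment]:
    """Restate the question, pause so the caller can barge in, then offer help."""
    example_text = "You can say things like: " + ", ".join(examples) + "."
    return [
        ("speak", question),
        ("pause", pause_seconds),  # the caller may answer here and skip the help
        ("speak", example_text),
        ("speak", "For more information, say HELP."),
    ]
```

A caller who only needed to hear the question again can answer during the pause, so the example commands and the pointer to further help are heard only by callers who stay silent.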
System Responses to In-context Help Requests
When a user asks for help, the system can respond in more detail without risk of overwhelming or annoying the user. In response to the request, the system effectively takes time out from the main conversation to explain a feature and to clarify why the information is needed and how it will be used.
Example:
CALLER: | Help |
SYSTEM: | Sure. We need to pick a descriptive name for your appointment. The simplest thing is to call it a MEETING, but you can also say: BREAKFAST, LUNCH, DINNER, PHONECALL... |
Most systems attach increasing help information to the first and second level error prompts. In some schemes, the help prompt offers full detail, while the first and second level errors only reveal a subset of the help information. Alternatively, some schemes have already exposed the complete command list by the second level error, and the help prompt merely restates the help and offers the context or rationale behind the help.
It is unnecessary to treat the incremental help strategy as a linear progression that starts with first level error handling, moves to the second level, and culminates with the help prompt as the most informative prompt. Users may ask for help at any point in a node without progressing through the stages of error handling first. Therefore, designers should not be concerned about redundancy between the second level error prompt and the help prompt, because there is no guarantee that the user will hear them in order.
A design could, in theory, customize individual help messages based on tracking which error messages have already been played. However, this level of design is rare. Typically, the system is unaware of the level of help users have already received when they request help.
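The relationship between escalating error prompts and a free-standing help prompt can be sketched as follows. The `DialogueNode` class and its method names are hypothetical, introduced only for this example; the point is that `on_help` returns a complete message no matter how many error prompts have been played, while the error counter is the kind of tracking a design could use if it did want to customize help.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DialogueNode:
    """One question in the dialogue, with its recovery and help prompts."""
    question: str
    error_prompts: List[str]  # first level, second level, and so on
    help_prompt: str          # written to be complete on its own
    errors_so_far: int = 0    # optional tracking; many systems do not keep this

    def on_error(self) -> str:
        """Return the next error prompt, repeating the last one once exhausted."""
        index = min(self.errors_so_far, len(self.error_prompts) - 1)
        self.errors_so_far += 1
        return self.error_prompts[index]

    def on_help(self) -> str:
        """Return the full help prompt, independent of the error history."""
        return self.help_prompt
```

Because `on_help` never consults `errors_so_far`, a caller who says "Help" immediately hears the same full explanation as a caller who has already heard both error prompts, which is why overlap between the second level prompt and the help prompt is acceptable.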
Canonical Phrases
Canonical phrases are the basic phrases that users can say to invoke the different capabilities of the system. Top-level grammars often contain several commands that route users toward the various domains or parts of the dialogue model. Constructing canonical phrases that sound unique to the recognizer, yet are semantically clear and still conversational for the user, is a significant challenge in creating natural dialogue systems.
Differentiating Canonical Phrases
The challenge of creating unique canonical phrases and commands is more difficult in larger systems where a large number of phrases may be active at one time. For example, "repeat it" and "delete it" could return ambiguous recognition results. However, "say that again" or "get rid of it" would be less likely to confuse the recognizer.
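A rough screening pass for confusable phrases can be sketched with Python's standard `difflib` module. This is a crude proxy, stated as an assumption: it compares spelling, not sound, whereas a real evaluation would compare phonetic transcriptions or test against the recognizer itself. The function names and the 0.6 threshold are hypothetical choices for the example.

```python
from difflib import SequenceMatcher
from itertools import combinations
from typing import List, Tuple


def similarity(a: str, b: str) -> float:
    """Spelling similarity in [0, 1]; only a stand-in for acoustic similarity."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def find_confusable_pairs(
    phrases: List[str],
    threshold: float = 0.6,  # arbitrary cutoff chosen for illustration
) -> List[Tuple[str, str]]:
    """Return every pair of phrases whose similarity meets the threshold."""
    return [
        (a, b)
        for a, b in combinations(phrases, 2)
        if similarity(a, b) >= threshold
    ]
```

Under this measure, "repeat it" and "delete it" score noticeably higher than "say that again" and "get rid of it", which matches the intuition in the text. A pair flagged this way is a candidate for rewording, not proof that the recognizer will confuse it.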
With current speech recognition technology, recognizers rarely mistake "Yes" for "No" because the recognition task is so severely constrained and the two inputs sound so radically different. Of course, it would be unrealistic to constrain an entire dialogue model this tightly, but the notion of setting manageable limits on grammar size and ensuring differentiation is instructive. The designer's challenge is to hide (as artfully as possible) the application's constraints behind logical organization of the task and carefully choreographed conversation.
Canonical Phrases in the Writing Process
Coaxing the user into saying something expected through careful prompt writing is similar to creating a "magician's choice" for the user. The term describes the magician's artful coercion of subjects into choosing the card that the magician wants while believing that they have chosen of their own free will. A good Natural Language (NL) interface maintains the illusion of conversational freedom for users, but in reality, the system is directing them to say exactly what the designers intend.
To maintain the benefits of an NL system, designers should guide users to respond with conversational utterances. If designers revert to a DTMF-style (dual-tone multi-frequency, or touch-tone) command vocabulary, recognizing words such as SELECT rather than THAT ONE, then the system no longer relies on the user's conversational expertise. Instead, the interface regresses into metaphor, and the user is forced to learn a larger share of the interface.
For this reason, the prompt writer should consider the phrases for canonical input as well as the output phrases. The canonical phrases are half of the conversation. Although the choice will ultimately be governed by many variables, including recognition and differentiation factors, writers accustomed to creating dialogue can provide valuable input.