Training
Module
Discover Microsoft guidelines for responsible conversational AI development - Training
Learn Microsoft guidelines for the development of responsible conversational AI, such as chat bots and voice-controlled systems.
Now that you've determined the right level of disclosure for your text to speech avatar experience, it's a good time to explore potential design patterns.
There's a spectrum of disclosure design patterns you can apply to your synthetic voice experience. If the outcome of your disclosure assessment was 'High Disclosure', we recommend explicit disclosure, which means communicating the origins of the synthetic voice outright. Implicit disclosure includes cues and interaction patterns that benefit voice experiences whether required disclosure levels are high or low.
Category | Examples |
---|---|
Explicit disclosure patterns | |
Implicit disclosure patterns | |
Use the following chart to refer directly to the patterns that apply to your synthetic voice. Some of the other conditions in this chart may also apply to your scenario:
If your synthetic voice experience… | Recommendations | Design patterns |
---|---|---|
Requires High Disclosure | Use at least one explicit pattern and implicit cues up front to help users build associations. | |
Requires Low Disclosure | Disclosure may be minimal or unnecessary, but could benefit from some implicit patterns. | |
Has a high level of engagement | Build for the long term and offer multiple entry points to disclosure along the user journey. It is highly recommended to have an onboarding experience. | |
Includes children as the primary intended audience | Target parents as the primary disclosure audience and ensure that they can effectively communicate disclosure to children. | |
Includes blind users or people with low vision as the primary intended audience | Be inclusive of all users and ensure that any form of visual disclosure has associated alternative text or sound effects. Adhere to accessibility standards for contrast ratio and display size. Use auditory cues to communicate disclosure. | |
Is screen-less, device-less or uses voice as the primary or only mode of interaction | Use auditory cues to communicate disclosure. | |
Potentially includes multiple users/listeners (e.g., a personal assistant in a multi-person household) | Be mindful of various user contexts and levels of understanding, and offer multiple opportunities for disclosure in the user journey. | |
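As a rough illustration, the chart above can be read as a lookup from scenario conditions to recommendations. The sketch below is one way to encode it; the `Scenario` class, its field names, and the `recommendations` function are hypothetical names invented for this example and are not part of any Microsoft SDK. The recommendation strings are shortened paraphrases of the chart.

```python
from __future__ import annotations

from dataclasses import dataclass

AUDITORY = "Use auditory cues to communicate disclosure."


@dataclass(frozen=True)
class Scenario:
    """Hypothetical description of a synthetic voice experience."""
    high_disclosure: bool = False
    high_engagement: bool = False
    children_primary: bool = False
    low_vision_primary: bool = False
    voice_only: bool = False
    multiple_listeners: bool = False


def recommendations(scenario: Scenario) -> list[str]:
    """Collect the chart rows that apply to the given scenario."""
    recs: list[str] = []
    if scenario.high_disclosure:
        recs.append("Use at least one explicit pattern, plus implicit cues up front.")
    else:
        recs.append("Disclosure may be minimal; consider some implicit patterns.")
    if scenario.high_engagement:
        recs.append("Offer multiple entry points to disclosure and an onboarding experience.")
    if scenario.children_primary:
        recs.append("Target parents as the primary disclosure audience.")
    if scenario.low_vision_primary:
        recs.append("Give visual disclosure alternative text or sound effects.")
    # Two chart rows share this recommendation, so add it only once.
    if scenario.low_vision_primary or scenario.voice_only:
        recs.append(AUDITORY)
    if scenario.multiple_listeners:
        recs.append("Offer multiple opportunities for disclosure in the user journey.")
    return recs
```

Several conditions can hold at once, which is why the function accumulates recommendations instead of returning on the first match.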
If your synthetic voice experience requires High Disclosure, it's best to use at least one of the following explicit patterns to clearly state the synthetic nature.
Before the voice experience begins, introduce the digital assistant by being fully transparent about the origins of its voice and its capabilities. The optimal moment to use this pattern is when onboarding a new user or when introducing new features to a returning user. Implementing implicit cues during an introduction helps users form a mental model about the synthetic nature of the digital agent.
The synthetic voice is introduced while onboarding a new user.
Recommendations
If a user skips the onboarding experience, continue to offer entry points to the Transparent Introduction experience until the user triggers the voice for the first time.
Provide a consistent entry point to the synthetic voice experience. Allow the user to return to the onboarding experience when they trigger the voice for the first time at any point in the user journey.
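The two recommendations above amount to a small piece of state logic. The following is a minimal sketch, assuming the application tracks two flags per user; `UserState`, `should_offer_intro_entry_point`, and `on_voice_triggered` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class UserState:
    """Hypothetical per-user flags an application might persist."""
    completed_onboarding: bool = False
    has_triggered_voice: bool = False


def should_offer_intro_entry_point(state: UserState) -> bool:
    """Keep surfacing the introduction until the voice is first triggered."""
    return not state.completed_onboarding and not state.has_triggered_voice


def on_voice_triggered(state: UserState) -> str:
    """Decide which screen to show when the user triggers the voice."""
    first_time = not state.has_triggered_voice
    state.has_triggered_voice = True
    if first_time and not state.completed_onboarding:
        # The user skipped onboarding: route them back to the introduction.
        return "show_transparent_introduction"
    return "start_voice_experience"
```

A user who skipped onboarding is routed to the introduction on their first trigger, and goes straight to the voice experience afterwards.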
A spoken prompt stating the origins of the digital assistant's voice is explicit enough on its own to achieve disclosure. This pattern is best for High Disclosure scenarios where voice is the only mode of interaction available.
Use a transparent introduction when there are moments in the user experience where you might already introduce or attribute a person's voice.
For additional transparency, the voice actor can disclose the origins of the synthetic voice in the first person.
Use this pattern if the user will be interacting with an audio player or interactive component to trigger the voice.
An explicit byline is the attribution of where the voice came from.
Recommendations
Provide users with control over how the digital assistant responds to them (i.e., how the voice sounds). When a user interacts with a system on their own terms and with specific goals in mind, then by definition, they have already understood that it's not a real person.
Offer choices that have a meaningful and noticeable impact on the synthetic voice experience.
User preferences allow users to customize and improve their experience.
Recommendations
Offer ways to customize the digital assistant's voice. If the voice is based on a celebrity or a widely recognizable person, consider using both visual and spoken introductions when users preview the voice.
Offering the ability to select from a set of voices helps convey the artificial nature.
Recommendations
In addition to complying with COPPA regulations, provide disclosure to parents if your primary intended audience is young children and your exposure level is high. For sensitive uses, consider gating the experience until an adult has acknowledged the use of the synthetic voice. Encourage parents to communicate the message to their children.
A transparent introduction optimized for parents ensures that an adult was made aware of the synthetic nature of the voice before a child interacts with it.
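Holding back a sensitive experience until an adult has acknowledged the synthetic voice can be expressed as a simple check. This is a sketch under stated assumptions: the `Session` class and its fields are hypothetical, and how an adult's acknowledgment is actually verified (and how COPPA obligations are met) is outside its scope.

```python
from dataclasses import dataclass


@dataclass
class Session:
    """Hypothetical session flags; verification of the adult is not shown."""
    primary_audience_is_children: bool
    sensitive_use: bool
    adult_acknowledged: bool = False


def can_start_voice(session: Session) -> bool:
    """Block the voice experience until an adult acknowledges it, when required."""
    needs_gate = session.primary_audience_is_children and session.sensitive_use
    return session.adult_acknowledged or not needs_gate
```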
Recommendations
Offer context-sensitive entry points to a page, pop-up, or external site that provides more information about synthetic voice technology. For example, you could surface a link to learn more during onboarding or when the user prompts for more information during conversation.
Example of an entry point to offer the opportunity to learn more about the synthesized voice.
Once a user requests more information about the synthetic voice, the primary goal is to educate them about the origins of the synthetic voice and to be transparent about the technology.
More information can be offered on an external help site.
Recommendations
Consistency is the key to achieving disclosure implicitly throughout the user journey. Consistent use of visual and auditory cues across devices and modes of interaction can help build associations between implicit patterns and explicit disclosure.
Anthropomorphism can manifest in different ways, from the actual visual representation of the agent to the voice, sounds, patterns of light, bouncing shapes, or even the vibration of a device. When defining your persona, leverage implicit cues and feedback patterns rather than aim for a very human-like avatar. This is one way to minimize the need for more explicit disclosure.
These cues help anthropomorphize the agent without being too human-like. They can also become effective disclosure mechanisms on their own when used consistently over time.
Consider the different modes of interaction in your experience when incorporating the following types of cues:
Category | Examples |
---|---|
Visual Cues | |
Auditory Cues | |
Haptic Cues | |
Disclosure can be achieved implicitly by setting accurate expectations for what the digital assistant is capable of. Provide sample commands so that users can learn how to interact with the digital assistant and offer contextual help to learn more about the synthetic voice during the early stages of the experience.
When conversations fall into unexpected paths, consider crafting default responses that can help reset expectations, reinforce transparency, and steer users towards successful paths. There are opportunities to use explicit disclosure in conversation as well.
Off-task or "personal" questions directed to the agent are a good time to remind users of the synthetic nature of the agent and steer them to engage with it appropriately or to redirect them to a real person.
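One way to picture this is a default response that fires when an off-task or personal question is detected. The sketch below uses a naive keyword match purely for illustration; the marker phrases, the reply wording, and the `respond` function are all assumptions, and a production system would more likely rely on its own intent detection.

```python
from typing import Callable

# Hypothetical phrases that suggest a personal or off-task question.
OFF_TASK_MARKERS = ("are you human", "are you real", "are you a robot", "who are you")

# Hypothetical default reply that restates the synthetic nature of the agent.
DISCLOSURE_REPLY = (
    "I'm a digital assistant with a synthetic voice, not a person. "
    "I can help with your request or connect you with a real person."
)


def respond(utterance: str, handle_task: Callable[[str], str]) -> str:
    """Return the disclosure reply for off-task questions, else handle the task."""
    normalized = utterance.lower().strip()
    if any(marker in normalized for marker in OFF_TASK_MARKERS):
        return DISCLOSURE_REPLY
    return handle_task(utterance)
```

Here `handle_task` stands in for whatever the assistant normally does with an on-task request.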
There are many opportunities for disclosure throughout the user journey. Design for the first use, second use, nth use…, but also embrace moments of "failure" to highlight transparency, such as when the system makes a mistake or when the user discovers a limitation of the agent's capabilities.
Example of a standard digital assistant user journey highlighting various disclosure opportunities.
The optimal moment for disclosure is the first time a person interacts with the synthetic voice. In a personal voice assistant scenario, this would be during onboarding, or the first time the user virtually unboxes the experience. In other scenarios, it could be the first time a synthetic voice reads content on a website or the first time a user interacts with a virtual character.
Users should be able to easily access additional information, control preferences, and receive transparent communication at any point during the user journey when requested.
Use the implicit design patterns that enhance the user experience continuously.
Use disclosure as an opportunity to fail gracefully.