Use interactive voice response in your copilots

Copilot Studio supports interactive voice response capabilities, including speech and dual-tone multi-frequency (DTMF) input, context variables, call transfer, and speech and DTMF customization.

Before you can create or edit copilots for voice scenarios, you need a phone number. With Azure Communication Services, you can get a new phone number or use an existing one. For more information, see Quickstart: Configure voice-enabled copilot with a phone number.

Key concepts for voice authoring

With the growing trend toward self-service applications, businesses use voice-enabled copilots in call centers, mobile apps, and messaging platforms.

Voice-enabled copilots can collect user input through speech and dual-tone multi-frequency (DTMF) key presses.
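
To make multi-digit DTMF input more concrete, the following Python sketch simulates how a sequence of key presses might be gathered into one value. It is an illustration only, not Copilot Studio code: `KeyPress`, `collect_digits`, the 3-second interdigit timeout, the `#` terminator, and the four-digit limit are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class KeyPress:
    """One DTMF key press (hypothetical type for this sketch)."""
    key: str    # one of 0-9, * or #
    at_ms: int  # time of the press, in milliseconds after the prompt ended


def collect_digits(
    presses: Iterable[KeyPress],
    max_digits: int = 4,               # assumed limit, for illustration
    interdigit_timeout_ms: int = 3000,  # assumed timeout, for illustration
    terminator: str = "#",             # assumed end-of-input key
) -> str:
    """Gather key presses into a single multi-digit string.

    Collection stops when the terminator key is pressed, when the gap
    between two presses exceeds the interdigit timeout, or when
    max_digits keys have been collected.
    """
    digits = []
    last_at = None
    for press in presses:
        # A long pause between keys means the caller has finished.
        if last_at is not None and press.at_ms - last_at > interdigit_timeout_ms:
            break
        if press.key == terminator:
            break
        digits.append(press.key)
        last_at = press.at_ms
        if len(digits) == max_digits:
            break
    return "".join(digits)
```

With these assumed defaults, the keys 1, 2, 3, and # pressed half a second apart are collected as the value 123, while a pause longer than the interdigit timeout ends collection with whatever digits arrived before the pause.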

Supported voice features

After your copilot is ready for voice services, you can configure features for your scenario.

| Feature | Description |
| --- | --- |
| Barge-in | Allows users to interrupt the system at any time during the conversation. |
| Dual-tone multi-frequency (DTMF) | Allows users to enter data by pressing keys on their phone keypad. DTMF supports single-key menu navigation and multi-digit entry of business information. |
| Latency message | Sends messages or audio to inform users that the system is still processing their request during long-running operations. |
| Silence detection and timeouts | Detects when the user stops speaking, allowing the system to respond appropriately. |
| Speech recognition improvement | Lets users speak naturally, without a script. A user's spoken command or question is translated for the voice-enabled copilot to process. |
| Speech Synthesis Markup Language (SSML) | Controls how your copilot's voice sounds and behaves with end users, including the tone, pitch, and speed of the voice that interacts with the user. |
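
As a small illustration of the SSML entry above, this Python sketch builds an SSML string that sets pitch and speaking rate. The `<speak>` and `<prosody>` elements and the `pitch` and `rate` attributes come from the W3C SSML specification; the helper name `build_ssml` and its defaults are hypothetical, and which tags and values Copilot Studio accepts should be confirmed in the product documentation.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(
    text: str,
    pitch: str = "medium",  # W3C SSML label, for example "low" or "high"
    rate: str = "medium",   # W3C SSML label, for example "slow" or "fast"
    lang: str = "en-US",
) -> str:
    """Wrap plain text in a minimal SSML document (hypothetical helper).

    The text is XML-escaped so characters such as & and < cannot
    break the markup.
    """
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        f"xml:lang={quoteattr(lang)}>"
        f"<prosody pitch={quoteattr(pitch)} rate={quoteattr(rate)}>"
        f"{escape(text)}"
        "</prosody>"
        "</speak>"
    )
```

Calling `build_ssml("Your order has shipped.", pitch="high", rate="slow")` returns a single string that could be pasted into a message that accepts SSML, assuming the target supports the `prosody` element.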

How to configure voice features

The following articles show you how to enable features, for a given scenario, step by step.

Known limitations

These tips and limitations help you successfully integrate voice into your copilot.

| Feature | Tip or limitation |
| --- | --- |
| Channel order | Enable the Telephony channel first and then connect with Omnichannel. This sequence is for channel reconnection. |
| Language/Locale | For a full list of supported languages and locales, see Supported languages for voice-enabled copilots. If you need a custom locale, contact the Copilot Studio team. |
| DTMF | The question node supports both copilot-level single-digit DTMF (global commands) and multi-digit DTMF at the same time, with conflict handling for DTMF keys. |
| DTMF only | When the DTMF-only option for voice is enabled, some timers might not take effect, such as the interdigit timeout for DTMF or the silence detection timeout. |
| Latency message on Action node | If you don't enable a latency message or the message is empty, all messages before the action node are blocked and sent after the action completes. If you use multiple consecutive action nodes in one topic and get unexpected results, add a message node between the consecutive action nodes. |
| Test chat dial pad | Pressing a key on the dial pad in the Test chat returns "/DTMF#", which isn't supported and isn't valid input. Instead, type the command "/DTMFkey#" into the chat. |
| Multilingual voice-enabled copilots | Voice support is available only for the copilot's primary language, not for secondary languages. |
| Customer engagement hub | Apart from Omnichannel, customer engagement channels work only with chat-based copilots. The following aren't supported for voice-enabled copilots: Genesys, LivePerson, Salesforce, and ServiceNow. |
| Generative AI for voice-enabled copilots | Generative answers aren't supported for voice-enabled copilots. Creating and editing topics using Copilot isn't supported for voice-enabled copilots; Copilot doesn't create messages for Speech & DTMF and doesn't configure DTMF. Generative mode, which orchestrates a copilot's topics and actions with generative AI, isn't supported for voice-enabled copilots. |
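
The latency message behavior described above can be pictured as a timer racing a long-running action: if the action has not finished after a short delay, the caller hears a holding message, and the result follows once the action completes. The Python sketch below models that idea with `asyncio`. It is a conceptual illustration, not how Copilot Studio implements the feature; `run_with_latency_message`, its parameters, and the delay value are hypothetical.

```python
import asyncio
from typing import Awaitable, Callable


async def run_with_latency_message(
    action: Callable[[], Awaitable[str]],  # the long-running operation
    send: Callable[[str], None],           # delivers a message to the caller
    latency_message: str = "Still working on your request.",
    delay_s: float = 1.0,                  # assumed wait before the message
) -> str:
    """Run an action and send a latency message if it is slow.

    The message is sent at most once, and only when the action is
    still running after delay_s seconds.
    """
    task = asyncio.ensure_future(action())
    # Wait up to delay_s for the action; a timeout does not cancel it.
    done, _pending = await asyncio.wait({task}, timeout=delay_s)
    if not done:
        send(latency_message)
    return await task
```

In this model, an action that finishes before the delay produces no extra message, which mirrors the idea that latency messages are meant only for long-running operations.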