What is Direct Line Speech?
Direct Line Speech is a robust, end-to-end solution for creating a flexible, extensible voice assistant. It is powered by the Bot Framework and its Direct Line Speech channel, which is optimized for voice-in, voice-out interaction with bots.
Voice assistants listen to users and take an action in response, often speaking back. They use speech-to-text to transcribe the user's speech, then apply natural language understanding to the transcribed text to decide which action to take. This action frequently includes spoken output from the assistant, generated with text-to-speech.
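The listen, understand, act, and speak flow described above can be sketched as a single conversational turn. This is a minimal illustration only: it does not use the Speech SDK or the Direct Line Speech channel, and every name in it (`AssistantTurn`, `run_turn`, and the `fake_*` stand-ins) is hypothetical. In a real assistant, each stage would be handled by the speech services and your bot rather than by local functions.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class AssistantTurn:
    """Result of one voice-in, voice-out exchange (hypothetical type)."""
    transcript: str     # text produced by speech-to-text
    intent: str         # label produced by language understanding
    reply_text: str     # the assistant's response as text
    reply_audio: bytes  # the response rendered by text-to-speech


def run_turn(
    audio: bytes,
    speech_to_text: Callable[[bytes], str],
    understand: Callable[[str], str],
    act: Callable[[str], str],
    text_to_speech: Callable[[str], bytes],
) -> AssistantTurn:
    """Run the four stages in order and collect the intermediate results."""
    transcript = speech_to_text(audio)
    intent = understand(transcript)
    reply_text = act(intent)
    reply_audio = text_to_speech(reply_text)
    return AssistantTurn(transcript, intent, reply_text, reply_audio)


# Stand-in stages (assumptions for illustration, not real speech services).
def fake_stt(audio: bytes) -> str:
    # Pretend the audio bytes are already the spoken words.
    return audio.decode("utf-8")


def fake_nlu(text: str) -> str:
    # A trivial keyword match in place of real language understanding.
    return "turn_on_lights" if "lights" in text.lower() else "unknown"


def fake_act(intent: str) -> str:
    # Map the intent to a spoken response; a bot would do real work here.
    replies = {"turn_on_lights": "Turning on the lights."}
    return replies.get(intent, "Sorry, I didn't catch that.")


def fake_tts(text: str) -> bytes:
    # Pretend the encoded text is synthesized audio.
    return text.encode("utf-8")


turn = run_turn(b"Turn on the lights", fake_stt, fake_nlu, fake_act, fake_tts)
print(turn.transcript, "->", turn.intent, "->", turn.reply_text)
```

Passing each stage in as a function keeps the turn logic independent of any particular service, so the stand-ins can later be swapped for real components without changing `run_turn`.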
Direct Line Speech offers the highest levels of customization and sophistication for voice assistants. It's designed for conversational scenarios that are open-ended, natural, or hybrids of the two with task completion or command-and-control use. This high degree of flexibility comes with greater complexity. If your scenario is scoped to well-defined tasks using natural language input, consider Custom Commands for a more streamlined solution experience.
Direct Line Speech supports these locales:
Getting started with Direct Line Speech
For a complete, step-by-step guide on creating a simple voice assistant using Direct Line Speech, see the tutorial for speech-enabling your bot with the Speech SDK and the Direct Line Speech channel.
We also offer quickstarts designed to have you running code and learning the APIs quickly. This table includes a list of voice assistant quickstarts organized by language and platform.
| Java | Windows, macOS, Linux | Browse |
Sample code for creating a voice assistant is available on GitHub. These samples cover the client application for connecting to your assistant in several popular programming languages.
Customization options vary by language/locale (see Supported languages).
Direct Line Speech and its associated functionality for voice assistants are an ideal supplement to the Virtual Assistant Solution and Enterprise Template. Though Direct Line Speech can work with any compatible bot, these resources provide a reusable baseline for high-quality conversational experiences as well as common supporting skills and models to get started quickly.