Quickstart: Hear and speak with chat models in the AI Studio chat playground
Give your app the ability to hear and speak by pairing Azure OpenAI Service with Azure AI Speech to enable richer interactions.
In this quickstart, you use Azure OpenAI Service and Azure AI Speech to:
- Speak to the assistant via speech to text.
- Hear the assistant's response via text to speech.
The speech to text and text to speech features can be used together or separately in the AI Studio chat playground. You can use the playground to test your chat model before deploying it.
Prerequisites
- An Azure subscription - Create one for free.
- An AI Studio project.
- A deployed Azure OpenAI chat model. This guide is tested with a gpt-4 model.
Configure the chat playground
Before you can start a chat session, you need to configure the chat playground to use the speech to text and text to speech features.
Sign in to Azure AI Studio.
Go to your project or create a new project in Azure AI Studio.
Select Chat from the list of playgrounds.
Select your deployed chat model from the Deployment dropdown.
Select the Chat capabilities button.
Note
You should also see the microphone and speaker buttons. If you select either button before enabling speech to text or text to speech, you're prompted to enable them in Chat capabilities.
On the Chat capabilities page, select the box to acknowledge that usage of the speech feature will incur additional costs. For more information, see Azure AI Speech pricing.
Select Enable speech to text and Enable text to speech.
Select the language locale and voice you want to use for speaking and hearing. The list of available voices depends on the locale that you select.
Optionally, you can try the voice before you return to the chat session. Enter some sample text and select Play to hear the selected voice speak it.
Select Save.
Start a chat session
In this chat session, you use both speech to text and text to speech. You use the speech to text feature to speak to the assistant, and the text to speech feature to hear the assistant's response.
Complete the steps in the Configure the chat playground section if you haven't already done so. To complete this quickstart, you need both the speech to text and text to speech features enabled.
Select the microphone button and speak to the assistant. For example, you can say "Do you know where I can get an Xbox?"
Select the send button (right arrow) to send your message to the assistant. The assistant's response is displayed in the chat session pane.
Note
If the speaker button is turned on, you hear the assistant's response read aloud. If it's turned off, you don't hear the response, but it's still displayed in the chat session pane.
You can change the system prompt to change the assistant's response format or style.
For example, enter:
"You're an AI assistant that helps people find information. Answers shouldn't be longer than 20 words because you are on a phone. You could use 'um' or 'let me see' to make it more natural and add some disfluency."
The response is shown in the chat session pane. Since the speaker button is turned on, you also hear the response.
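If you later move from the playground to your own code, the system prompt corresponds to a system message at the start of the chat history. The following is a minimal sketch using the openai Python package; the endpoint, key, API version, and deployment name are placeholders, not values from this quickstart.

```python
def build_messages(system_prompt: str, user_text: str) -> list[dict]:
    """Assemble the chat history: the system prompt first, then the user turn."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_text},
    ]

SYSTEM_PROMPT = (
    "You're an AI assistant that helps people find information. "
    "Answers shouldn't be longer than 20 words because you are on a phone."
)

if __name__ == "__main__":
    # Lazy import so the helper above works without the package installed.
    from openai import AzureOpenAI  # pip install openai

    # Placeholder resource details; replace with your own before running.
    client = AzureOpenAI(
        azure_endpoint="https://<your-resource>.openai.azure.com",
        api_key="<your-api-key>",
        api_version="2024-06-01",
    )
    messages = build_messages(SYSTEM_PROMPT, "Do you know where I can get an Xbox?")
    response = client.chat.completions.create(model="gpt-4", messages=messages)
    print(response.choices[0].message.content)
```

Changing only the system message is usually enough to shift the assistant's response format or style; the rest of the call stays the same.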
View sample code
Select the View code button to view and copy the sample code, which includes configuration for the Azure OpenAI and Speech services. You can use the sample code to enable speech to text and text to speech in your own application.
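The playground generates this code for you, but as an illustration of the overall shape, a speech to speech loop might look like the following sketch using the azure-cognitiveservices-speech and openai packages. The keys, region, endpoint, voice, and deployment name are placeholders, and the stop-phrase helper is a hypothetical addition for ending the loop.

```python
def is_stop_phrase(text: str) -> bool:
    """End the conversation when the user says a stop phrase."""
    return text.strip().strip(".!?").lower() in {"stop", "goodbye"}

def main() -> None:
    # pip install azure-cognitiveservices-speech openai
    import azure.cognitiveservices.speech as speechsdk
    from openai import AzureOpenAI

    # Placeholder Speech resource details; replace with your own.
    speech_config = speechsdk.SpeechConfig(
        subscription="<speech-key>", region="<speech-region>"
    )
    speech_config.speech_recognition_language = "en-US"
    speech_config.speech_synthesis_voice_name = "en-US-JennyNeural"

    # Default microphone for input, default speaker for output.
    recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config)
    synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config)

    client = AzureOpenAI(
        azure_endpoint="https://<your-resource>.openai.azure.com",
        api_key="<openai-key>",
        api_version="2024-06-01",
    )
    messages = [{"role": "system", "content": "You are a helpful assistant."}]

    while True:
        # Speech to text: capture one utterance from the microphone.
        result = recognizer.recognize_once()
        if result.reason != speechsdk.ResultReason.RecognizedSpeech:
            continue
        if is_stop_phrase(result.text):
            break
        messages.append({"role": "user", "content": result.text})

        # Send the transcript to the deployed chat model.
        response = client.chat.completions.create(model="gpt-4", messages=messages)
        reply = response.choices[0].message.content
        messages.append({"role": "assistant", "content": reply})

        # Text to speech: speak the assistant's reply aloud.
        synthesizer.speak_text_async(reply).get()

if __name__ == "__main__":
    main()
```

Keeping the full message history in the loop lets the assistant respond with conversational context, just as the chat session pane does in the playground.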
Tip
For another example, see the speech to speech chat code example.
Clean up resources
To avoid incurring unnecessary Azure costs, you should delete the resources you created in this quickstart if they're no longer needed. To manage resources, you can use the Azure portal.