Quickstart: Hear and speak with chat models in the Azure AI Studio playground

Note

Azure AI Studio is currently in public preview. This preview is provided without a service-level agreement, and we don't recommend it for production workloads. Certain features might not be supported or might have constrained capabilities. For more information, see Supplemental Terms of Use for Microsoft Azure Previews.

Give your app the ability to hear and speak by pairing Azure OpenAI Service with Azure AI Speech to enable richer interactions.

In this quickstart, you use Azure OpenAI Service and Azure AI Speech to:

  • Speak to the assistant via speech to text.
  • Hear the assistant's response via text to speech.

You can use the speech to text and text to speech features together or separately in the Azure AI Studio playground, where you can test your chat model before you deploy it.

Prerequisites

Note

This feature isn't available if you created an Azure AI hub resource together with an existing Azure OpenAI Service resource. You must create an AI hub with an Azure AI services provider. We're gradually rolling out this feature to all customers. If you don't see it yet, check back later.

Configure the playground

Before you can start a chat session, you need to configure the playground to use the speech to text and text to speech features.

  1. Sign in to Azure AI Studio.

  2. Go to your project or create a new project in Azure AI Studio.

  3. Select Build from the top menu and then select Playground from the collapsible left menu.

  4. Make sure that Chat is selected from the Mode dropdown. Select your deployed chat model from the Deployment dropdown.

    Screenshot of the chat playground with mode and deployment highlighted.

  5. Select the Playground Settings button.

    Screenshot of the chat playground with options to get to the playground settings.

    Note

    You should also see microphone and speaker buttons. If you select either button before enabling speech to text or text to speech, you're prompted to enable them in Playground Settings.

  6. On the Playground Settings page, select the box to acknowledge that usage of the speech feature will incur additional costs. For more information, see Azure AI Speech pricing.

  7. Select Enable speech to text and Enable text to speech.

    Screenshot of the playground settings page.

  8. Select the language locale and voice you want to use for speaking and hearing. The list of available voices depends on the locale that you select.

    Screenshot of the playground settings page with a voice that speaks Japanese selected.

  9. Optionally, enter some sample text and select Play to preview the voice.

  10. Select Save.

Start a chat session

In this chat session, you use both features: speech to text to speak to the assistant, and text to speech to hear its response.

  1. Complete the steps in the Configure the playground section if you haven't already done so. To complete this quickstart, you need to enable both the speech to text and text to speech features.

  2. Select the microphone button and speak to the assistant. For example, you can say "Do you know where I can get an Xbox?"

    Screenshot of the chat session with the enabled microphone icon and send button highlighted.

  3. Select the send button (right arrow) to send your message to the assistant. The assistant's response is displayed in the chat session pane.

    Screenshot of the chat session with the assistant's response.

    Note

    If the speaker button is turned on, you hear the assistant's response. If it's turned off, the response is still displayed in the chat session pane, but you don't hear it.

  4. You can edit the system prompt to change the assistant's response format or style.

    For example, enter:

    "You're an AI assistant that helps people find information. Answers shouldn't be longer than 20 words because you are on a phone. You could use 'um' or 'let me see' to make it more natural and add some disfluency."
    

    The response is shown in the chat session pane. Since the speaker button is turned on, you also hear the response.

    Screenshot of the chat session with the system prompt edited.
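Behind the playground, the system prompt and the chat turns are combined into a single messages list sent to the chat model. The sketch below is illustrative, assuming the OpenAI Chat Completions message format (`role`/`content` fields) that Azure OpenAI chat deployments use; the strings are taken from the example above.

```python
# Illustrative sketch: how the edited system prompt and a spoken user turn
# fit into the messages payload for a chat completion request.
system_prompt = (
    "You're an AI assistant that helps people find information. Answers "
    "shouldn't be longer than 20 words because you are on a phone."
)

messages = [
    {"role": "system", "content": system_prompt},  # sets the response style
    {"role": "user", "content": "Do you know where I can get an Xbox?"},
]
```

Each assistant reply is appended as a `{"role": "assistant", ...}` entry, so the system message continues to shape every later turn.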

View sample code

You can select the View Code button to view and copy the sample code, which includes configuration for Azure OpenAI and Speech services. You can use the sample code to enable speech to text and text to speech in your application.

Screenshot of viewing the code in the playground.
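Conceptually, the sample code wires three pieces into one loop: speech to text for the microphone, the chat deployment for the reply, and text to speech for the speaker. The sketch below shows that loop with the three services passed in as callables; the function and parameter names are illustrative stand-ins, not the actual sample code, and in a real app you would replace them with Azure AI Speech SDK and Azure OpenAI calls.

```python
# Minimal sketch of one speech-to-speech chat turn. The three callables
# stand in for: speech to text (microphone), the chat model deployment,
# and text to speech (speaker). All names here are illustrative.
from typing import Callable, Dict, List

Message = Dict[str, str]

def speech_chat_turn(
    messages: List[Message],
    recognize: Callable[[], str],               # speech to text
    complete: Callable[[List[Message]], str],   # chat model call
    synthesize: Callable[[str], None],          # text to speech
) -> List[Message]:
    """Hear the user, get a reply, speak it, and return the updated transcript."""
    user_text = recognize()
    messages = messages + [{"role": "user", "content": user_text}]
    reply = complete(messages)
    messages = messages + [{"role": "assistant", "content": reply}]
    synthesize(reply)
    return messages

if __name__ == "__main__":
    # Stand-in wiring so the sketch runs without any Azure resources.
    transcript = [{"role": "system", "content": "You are a helpful assistant."}]
    transcript = speech_chat_turn(
        transcript,
        recognize=lambda: "Do you know where I can get an Xbox?",
        complete=lambda msgs: "Most electronics retailers sell Xbox consoles.",
        synthesize=print,  # print instead of speaking aloud
    )
```

Separating the transcript logic from the service calls this way also makes the loop easy to test without a microphone or speaker.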

Tip

For another example, see the speech to speech chat code example.

Clean up resources

To avoid incurring unnecessary Azure costs, you should delete the resources you created in this quickstart if they're no longer needed. To manage resources, you can use the Azure portal.

Next steps