Can we reference the same sample voices that Microsoft Azure TTS gives in Voice Gallery?

Question

Nishant Dalvi 45

We're building an application that lets users create voiceovers from their scripts. We want to offer a voice gallery feature similar to Microsoft's Voice Gallery, where our users can preview sample voices before selecting one for each sentence. We checked the Speech SDK, and it doesn't let us simply reference the sample audio stored by Microsoft.

Is there a way to fetch these sample audio files, or will we have to store our own in Azure Blob Storage for every voice available in the Azure TTS service?

Accepted answer

Answer 1

https://docs.d-id.com/reference/tts-microsoft#:~:text=Microsoft%20Azure%20Text%20to%20Speech,it%20in%20your%20API%20request

Azure TTS supports more than 100 languages and a wide range of natural-sounding voices. These voices are accessible through Azure's API, and you can specify your preferred voice in each API call.

https://docs.d-id.com/reference/tts-microsoft#:~:text=Go%20to%20Microsoft%20Voice%20Gallery,AbbiNeural%60%20in%20the%20%60voice_id%60%20field

You can browse the Microsoft Voice Gallery to find and preview different voices. Each voice in the gallery has a "Sample code" tab. From there you can copy the voice name (for example, config.SpeechSynthesisVoiceName = "en-GB-AbbiNeural") and use it wherever your application specifies the voice, such as the voice_id field in the D-ID API linked above.
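If you would rather not copy voice names by hand, the Speech service REST API also exposes a voices list endpoint. The following is a minimal sketch using only the Python standard library. The function names, the region value, and the key are placeholders I made up for illustration; the endpoint path, the Ocp-Apim-Subscription-Key header, and the ShortName and Locale fields are, to my knowledge, what the service uses, but please check them against the current documentation.

```python
import json
import urllib.request


def build_voices_request(region: str, key: str) -> urllib.request.Request:
    """Build a GET request for the voices/list endpoint of a Speech resource.

    `region` (e.g. "westeurope") and `key` are placeholders for your own values.
    """
    url = f"https://{region}.tts.speech.microsoft.com/cognitiveservices/voices/list"
    return urllib.request.Request(url, headers={"Ocp-Apim-Subscription-Key": key})


def list_voices(region: str, key: str) -> list:
    """Call the endpoint and return the parsed JSON array of voice descriptions."""
    with urllib.request.urlopen(build_voices_request(region, key), timeout=30) as resp:
        return json.load(resp)


def voices_for_locale(voices: list, locale: str) -> list:
    """Return the sorted short names (e.g. "en-GB-AbbiNeural") for one locale."""
    return sorted(v["ShortName"] for v in voices if v.get("Locale") == locale)
```

Calling voices_for_locale(list_voices(region, key), "en-GB") should then give you the names to show in your own gallery for British English.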

If you want to offer a voice gallery so that users can preview voices before choosing one for their voiceovers, you will most likely need to generate the samples yourself with Azure TTS and store them somewhere you control, for example in Azure Blob Storage. You can use the voice names shown in Azure's Voice Gallery, but you cannot directly reference or fetch the gallery's existing sample audio for use in your application.
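As a rough sketch of that approach, you could synthesize one short clip per voice through the text-to-speech REST endpoint and upload it to a blob container. Everything named here (the function names, the "samples/" prefix, the sample text, the output format choice) is my own assumption for illustration, not something taken from the Voice Gallery. The upload step assumes the azure-storage-blob package is installed; it is imported inside the function so the rest of the code runs without it.

```python
import urllib.request
from xml.sax.saxutils import escape, quoteattr

# One of the MP3 output formats accepted by the service; pick whatever suits your player.
OUTPUT_FORMAT = "audio-24khz-48kbitrate-mono-mp3"


def build_ssml(voice: str, text: str) -> str:
    """Wrap the preview text in minimal SSML for the given voice name."""
    lang = "-".join(voice.split("-")[:2])  # "en-GB-AbbiNeural" -> "en-GB"
    return (
        f'<speak version="1.0" xml:lang={quoteattr(lang)}>'
        f"<voice name={quoteattr(voice)}>{escape(text)}</voice></speak>"
    )


def sample_blob_name(voice: str) -> str:
    """Hypothetical naming scheme: one MP3 per voice under a 'samples/' prefix."""
    return f"samples/{voice}.mp3"


def synthesize_sample(region: str, key: str, voice: str, text: str) -> bytes:
    """POST the SSML to the text-to-speech endpoint and return the audio bytes."""
    req = urllib.request.Request(
        f"https://{region}.tts.speech.microsoft.com/cognitiveservices/v1",
        data=build_ssml(voice, text).encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/ssml+xml",
            "X-Microsoft-OutputFormat": OUTPUT_FORMAT,
            "User-Agent": "voice-gallery-sample-builder",  # placeholder app name
        },
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return resp.read()


def upload_sample(conn_str: str, container: str, voice: str, audio: bytes) -> None:
    """Store the clip in Blob Storage (assumes azure-storage-blob is installed)."""
    from azure.storage.blob import BlobClient

    blob = BlobClient.from_connection_string(
        conn_str, container_name=container, blob_name=sample_blob_name(voice)
    )
    blob.upload_blob(audio, overwrite=True)
```

You would run this once per voice (and again when new voices are released), and your gallery page then plays the stored clips instead of calling the TTS service on every preview.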

https://learn.microsoft.com/en-us/azure/ai-services/speech-service/text-to-speech

https://learn.microsoft.com/en-us/azure/ai-services/speech-service/text-to-speech#:~:text=Prebuilt%20neural%20voice%20,voice%20for%20your%20business%20needs

You can use the Azure Speech SDK or the Speech Studio portal to work with these voices in your application. You can choose from prebuilt neural voices or create a custom neural voice that is specific to your product or business. The Azure documentation linked above is a good place to start when adding these capabilities to your apps.

https://techcommunity.microsoft.com/t5/azure-ai-services-blog/azure-neural-tts-previews-a-new-contextual-voice-model-for-long/ba-p/3587139

Microsoft's Voice Gallery provides voice samples as well as sample code, which can be useful as a reference. This includes samples and code for Azure's newer contextual voice models, such as "RogerNeural". These models are especially helpful if your application handles long-form text, such as full paragraphs.

Nishant Dalvi 45 Reputation points

2023-11-20T11:43:11.1966667+00:00

Thanks @Sedat SALMAN , this clarifies.
