Can we reference the same sample voices that Microsoft Azure TTS gives in Voice Gallery?

Nishant Dalvi 45 Reputation points
2023-11-20T11:03:25.23+00:00

We're building an application that will allow users to create voiceovers from their scripts. We want to offer a voice gallery feature similar to Microsoft's Voice Gallery, where our users can preview sample voices before selecting one for each sentence. We checked the Speech SDK, and it doesn't allow us to simply reference the sample audio stored by Microsoft.

Is there a way to fetch these sample audio files, or will we have to store our own in Azure Blob Storage for all the voices available in the Azure TTS service?

Azure AI Speech
An Azure service that integrates speech processing into apps and services.
Azure AI services
A group of Azure services, SDKs, and APIs designed to make apps more intelligent, engaging, and discoverable.

Accepted answer
  1. Sedat SALMAN 14,140 Reputation points MVP
    2023-11-20T11:38:26.7566667+00:00

    https://docs.d-id.com/reference/tts-microsoft#:~:text=Microsoft%20Azure%20Text%20to%20Speech,it%20in%20your%20API%20request

    Azure TTS supports over 100 languages and a wide range of natural-sounding voices. These voices are accessible through Azure's API, and you select one by passing its voice name in your API calls.
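To populate a gallery with the voices that are actually available, one option is the Speech service's REST voices list endpoint. The following is a minimal sketch, not the answer author's method: it assumes a standard regional endpoint and a resource key passed in the `Ocp-Apim-Subscription-Key` header, and the function names (`voices_list_request`, `fetch_voices`, `voices_for_locale`) are hypothetical helpers introduced for illustration.

```python
import json
import urllib.request


def voices_list_request(region: str, key: str) -> urllib.request.Request:
    """Build a GET request for the voices list of a Speech resource.

    Assumes the regional endpoint pattern of the Speech REST API;
    `region` (e.g. "westeurope") and `key` come from your own resource.
    """
    url = f"https://{region}.tts.speech.microsoft.com/cognitiveservices/voices/list"
    return urllib.request.Request(url, headers={"Ocp-Apim-Subscription-Key": key})


def fetch_voices(region: str, key: str) -> list:
    """Call the endpoint and return the parsed JSON array of voice entries."""
    with urllib.request.urlopen(voices_list_request(region, key), timeout=30) as resp:
        return json.load(resp)


def voices_for_locale(voices: list, locale: str) -> list:
    """Return the sorted short names (e.g. "en-GB-AbbiNeural") for one locale."""
    return sorted(v["ShortName"] for v in voices if v.get("Locale") == locale)
```

A gallery page could call `fetch_voices` once, cache the result, and use `voices_for_locale` to group voices by language in the UI.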


    https://docs.d-id.com/reference/tts-microsoft#:~:text=Go%20to%20Microsoft%20Voice%20Gallery,AbbiNeural%60%20in%20the%20%60voice_id%60%20field

    You can browse the Microsoft Voice Gallery to find and preview different voices. Each voice in the gallery has a "Sample code" tab, where you can copy the voice name (for example, config.SpeechSynthesisVoiceName = "en-GB-AbbiNeural"). If you call Azure TTS through D-ID's API, as in the link above, that name goes in the voice_id field; otherwise, set it in your own application's speech configuration.
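Because the question is about choosing a voice per sentence, it may help to see how copied voice names can be combined in one request. SSML allows several `voice` elements inside a single `speak` element, so each sentence can name its own voice. This is only a sketch: `multi_voice_ssml` is a hypothetical helper, and the second voice name and the sample sentences are illustrative.

```python
from xml.sax.saxutils import escape, quoteattr


def multi_voice_ssml(segments, lang="en-US"):
    """Build one SSML document from (voice_name, sentence) pairs.

    Each pair becomes its own <voice> element, so every sentence
    can be spoken by a different voice.
    """
    parts = [
        f"<voice name={quoteattr(voice)}>{escape(text)}</voice>"
        for voice, text in segments
    ]
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        f"xml:lang={quoteattr(lang)}>"
        + "".join(parts)
        + "</speak>"
    )


script = [
    ("en-GB-AbbiNeural", "Welcome to the tour."),
    ("en-US-JennyNeural", "Let's get started."),
]
ssml = multi_voice_ssml(script, lang="en-GB")
```

The resulting string can be sent to the synthesis API as the request body, or passed to the Speech SDK's SSML synthesis call.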


    If you want to offer a voice gallery so that your users can preview voices before choosing one for their voiceovers, you'll probably need to generate the samples yourself with Azure TTS and store them elsewhere, for example in Azure Blob Storage. This is because, while you can use the voice names shown in Azure's Voice Gallery, you cannot directly reference or fetch the gallery's pre-existing sample audio for use in your application.
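Generating your own samples could look roughly like the sketch below, which builds a request for the Speech REST text-to-speech endpoint and returns MP3 bytes for one voice. It assumes a regional endpoint, key-based authentication, and the `audio-16khz-128kbitrate-mono-mp3` output format. The helper names, the `voice-samples/` naming scheme, the `User-Agent` value, and the preview sentence are all hypothetical. Uploading the bytes to Blob Storage is left out and would be done with your storage client of choice.

```python
import urllib.request
from xml.sax.saxutils import escape, quoteattr

# Hypothetical preview sentence; pick your own text per language.
SAMPLE_TEXT = "Hello, this is a preview of my voice."


def build_ssml(voice_name: str, locale: str, text: str) -> str:
    """Wrap text in SSML for a single voice, escaping XML special characters."""
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        f"xml:lang={quoteattr(locale)}>"
        f"<voice name={quoteattr(voice_name)}>{escape(text)}</voice>"
        "</speak>"
    )


def synthesis_request(region, key, voice_name, locale, text):
    """Build the POST request for the text-to-speech REST endpoint."""
    return urllib.request.Request(
        f"https://{region}.tts.speech.microsoft.com/cognitiveservices/v1",
        data=build_ssml(voice_name, locale, text).encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/ssml+xml",
            "X-Microsoft-OutputFormat": "audio-16khz-128kbitrate-mono-mp3",
            "User-Agent": "voice-gallery-sample-builder",  # hypothetical app name
        },
        method="POST",
    )


def sample_blob_name(locale: str, voice_name: str) -> str:
    """Hypothetical naming scheme for stored samples, one file per voice."""
    return f"voice-samples/{locale}/{voice_name}.mp3"


def generate_sample(region, key, voice_name, locale, text=SAMPLE_TEXT) -> bytes:
    """Synthesize one preview clip and return the MP3 bytes."""
    req = synthesis_request(region, key, voice_name, locale, text)
    with urllib.request.urlopen(req, timeout=60) as resp:
        return resp.read()
```

Running this once per voice as a batch job, then serving the stored files, keeps the gallery from calling the synthesis API every time a user presses play.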


    https://learn.microsoft.com/en-us/azure/ai-services/speech-service/text-to-speech

    https://learn.microsoft.com/en-us/azure/ai-services/speech-service/text-to-speech#:~:text=Prebuilt%20neural%20voice%20,voice%20for%20your%20business%20needs

    You can use the Azure Speech SDK or the Speech Studio portal to integrate these voices into your application. You can choose from prebuilt neural voices or create a custom neural voice that is specific to your product or business. The Azure documentation linked above is a good place to start when adding these capabilities to your apps.


    https://techcommunity.microsoft.com/t5/azure-ai-services-blog/azure-neural-tts-previews-a-new-contextual-voice-model-for-long/ba-p/3587139

    Microsoft's Voice Gallery provides both voice samples and sample code, which can be useful as a reference. This includes Azure's newer contextual voice models, such as "RogerNeural", which are especially beneficial if your application handles long-form text such as paragraphs.

