Azure Text to speech (Preview)
Azure Text-to-speech allows you to build apps and services that speak naturally with more than 400 voices across 140 languages and dialects.
This connector is available in the following products and regions:
Service | Class | Regions |
---|---|---|
Logic Apps | Standard | All Logic Apps regions except the following: - Azure China regions |
Power Automate | Premium | All Power Automate regions except the following: - China Cloud operated by 21Vianet |
Power Apps | Premium | All Power Apps regions except the following: - China Cloud operated by 21Vianet |
Contact | |
---|---|
Name | Speech Service Power Platform Team |
URL | https://docs.microsoft.com/azure/cognitive-services/speech-service/support |
Email | speechpowerplatform@microsoft.com |
Connector Metadata | |
---|---|
Publisher | Microsoft |
Website | https://docs.microsoft.com/azure/cognitive-services/speech-service/ |
Privacy policy | https://privacy.microsoft.com |
Categories | AI;Website |
The Speech service allows you to convert text into synthesized speech and get a list of supported voices for a region by using a REST API.
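As a rough illustration of the REST calls behind these two capabilities, the sketch below builds (but does not send) one request for the voices list and one for synthesis. The host pattern, paths, and header names are assumptions taken from the public Speech service REST API, not from this connector page, and the function names are hypothetical.

```python
import urllib.request

# Assumed host pattern for the regional text-to-speech endpoint.
HOST = "https://{region}.tts.speech.microsoft.com"


def build_voices_list_request(region: str, key: str) -> urllib.request.Request:
    """Prepare a GET request for the list of voices in a region."""
    url = HOST.format(region=region) + "/cognitiveservices/voices/list"
    return urllib.request.Request(
        url,
        headers={"Ocp-Apim-Subscription-Key": key},
        method="GET",
    )


def build_synthesis_request(
    region: str,
    key: str,
    ssml: str,
    output_format: str = "riff-24khz-16bit-mono-pcm",
) -> urllib.request.Request:
    """Prepare a POST request that submits SSML for synthesis."""
    url = HOST.format(region=region) + "/cognitiveservices/v1"
    headers = {
        "Ocp-Apim-Subscription-Key": key,
        "Content-Type": "application/ssml+xml",
        "X-Microsoft-OutputFormat": output_format,
        "User-Agent": "tts-sketch",  # arbitrary client name
    }
    return urllib.request.Request(
        url, data=ssml.encode("utf-8"), headers=headers, method="POST"
    )
```

Passing either request to `urllib.request.urlopen` would perform the call; the synthesis response body would then be the audio in the requested format.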
Prerequisites
You will need the following to proceed:
- Azure subscription - Create one for free.
- Create a Speech resource in the Azure portal.
- Get the Speech resource key and region. After your Speech resource is deployed, select Go to resource to view and manage keys. For more information about Cognitive Services resources, see Get the keys for your resource.
- Learn more about Azure Text-to-speech supported locales and voices.
Creating a connection
The connector supports the following authentication types:
Api Key | ApiKey | All regions | Shareable |
Microsoft Entra ID Integrated | Use Microsoft Entra ID to access your speech service. | All regions | Not shareable |
Default [DEPRECATED] | This option is only for older connections without an explicit authentication type, and is only provided for backward compatibility. | All regions | Not shareable |
Api Key
Auth ID: keyBasedAuth
Applicable: All regions
ApiKey
This is a shareable connection. If the power app is shared with another user, the connection is shared as well. For more information, see the Connectors overview for canvas apps - Power Apps | Microsoft Docs.
Name | Type | Description | Required |
---|---|---|---|
Account Key | securestring | Speech service key | True |
Region | string | Speech service region (Example: eastus) | True |
Microsoft Entra ID Integrated
Auth ID: tokenBasedAuth
Applicable: All regions
Use Microsoft Entra ID to access your speech service.
This is not a shareable connection. If the power app is shared with another user, that user will be prompted to create a new connection explicitly.
Name | Type | Description | Required |
---|---|---|---|
Resource ID | string | The cognitive services resource id (Example: /subscriptions//resourceGroups//providers/Microsoft.CognitiveServices/accounts/) | True |
Custom Subdomain | string | Custom subdomain endpoint url (Example: contoso) | True |
Default [DEPRECATED]
Applicable: All regions
This option is only for older connections without an explicit authentication type, and is only provided for backward compatibility.
This is not a shareable connection. If the power app is shared with another user, that user will be prompted to create a new connection explicitly.
Name | Type | Description | Required |
---|---|---|---|
Account Key | securestring | Azure Cognitive Services for Neural Text-to-speech Account Key | True |
Region | string | Speech service region (Example: eastus) | True |
Throttling Limits
Name | Calls | Renewal Period |
---|---|---|
API calls per connection | 100 | 60 seconds |
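A client that calls the connector in a loop can stay under this limit by pacing itself. The following is a minimal sketch of a sliding-window limiter, assuming the limit is 100 calls in any 60-second span; the page does not say whether the service counts with a sliding or a fixed window, so this is the conservative reading. The class name and the injectable `clock` parameter are hypothetical.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Track call timestamps and report how long to wait before the next call."""

    def __init__(self, max_calls: int = 100, period: float = 60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.calls = deque()

    def wait_time(self) -> float:
        """Return seconds to wait before another call is allowed (0.0 if allowed now)."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self.calls and now - self.calls[0] >= self.period:
            self.calls.popleft()
        if len(self.calls) < self.max_calls:
            return 0.0
        # The oldest call in the window must expire before a new one fits.
        return self.period - (now - self.calls[0])

    def record(self) -> None:
        """Register that a call was just made."""
        self.calls.append(self.clock())
```

A caller would typically do `time.sleep(limiter.wait_time())`, make the request, and then call `limiter.record()`.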
Actions
Convert text to speech | Convert single text to speech. |
Convert text to speech with SSML | Convert text to speech by using Speech Synthesis Markup Language (SSML). |
Get list of voices | Get a full list of voices for a specific region or endpoint. |
Convert text to speech
Convert single text to speech.
Parameters
Name | Key | Required | Type | Description |
---|---|---|---|---|
Voice Name | voiceName | True | string | The voice name output for text to speech. For example: en-US-JennyNeural. |
Locale | locale | True | string | The locale of the contained data. For example: en-US. |
Synthesized Text | synthesizedText | True | string | The text to convert to speech. |
Output Audio Format | outputFormat | | string | The non-streaming audio formats. Default: riff-24khz-16bit-mono-pcm. |
Style | style | | string | The expressive style of speech. For example: cheerful. |
Speaking Rate | speakingRate | | string | The speaking rate. For example: -40.00%. |
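To show how these parameters relate to one another, the sketch below assembles them into an SSML document. This page does not document how the connector maps the parameters internally; the element names (`voice`, `prosody`, `mstts:express-as`) and the two namespace URIs are assumptions based on the public Azure SSML format, and `build_ssml` is a hypothetical helper.

```python
from typing import Optional
from xml.sax.saxutils import escape, quoteattr


def build_ssml(
    voice_name: str,
    locale: str,
    text: str,
    style: Optional[str] = None,
    speaking_rate: Optional[str] = None,
) -> str:
    """Wrap plain text in SSML using the action's parameters."""
    body = escape(text)  # escape &, <, > so the text cannot break the markup
    if speaking_rate:
        body = f"<prosody rate={quoteattr(speaking_rate)}>{body}</prosody>"
    if style:
        body = f"<mstts:express-as style={quoteattr(style)}>{body}</mstts:express-as>"
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        'xmlns:mstts="https://www.w3.org/2001/mstts" '
        f"xml:lang={quoteattr(locale)}>"
        f"<voice name={quoteattr(voice_name)}>{body}</voice>"
        "</speak>"
    )


if __name__ == "__main__":
    print(build_ssml("en-US-JennyNeural", "en-US", "Hello", "cheerful", "-40.00%"))
```

The optional parameters are applied only when supplied, mirroring the fact that Style and Speaking Rate are not required by the action.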
Convert text to speech with SSML
Convert text to speech by using Speech Synthesis Markup Language (SSML).
Parameters
Name | Key | Required | Type | Description |
---|---|---|---|---|
SSML Text | ssmlText | True | string | The text in SSML format (e.g. power connector) |
Output Audio Format | outputFormat | | string | The non-streaming audio formats. Default: riff-24khz-16bit-mono-pcm. |
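Because the SSML Text parameter is passed as a plain string, malformed markup is easy to send by accident. The sketch below is a minimal pre-flight check using only the Python standard library. The specific rules it enforces (a `speak` root and at least one named `voice` element) are assumptions based on the Azure SSML format rather than requirements stated on this page, and `check_ssml` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET


def _local_name(tag: str) -> str:
    """Strip an ElementTree '{namespace}' prefix from a tag name."""
    return tag.rsplit("}", 1)[-1]


def check_ssml(ssml_text: str) -> list:
    """Return a list of problems found; an empty list means the checks passed."""
    try:
        root = ET.fromstring(ssml_text)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]

    problems = []
    if _local_name(root.tag) != "speak":
        problems.append(f"root element is <{_local_name(root.tag)}>, expected <speak>")

    voices = [el for el in root.iter() if _local_name(el.tag) == "voice"]
    if not voices:
        problems.append("no <voice> element found")
    elif any(not v.get("name") for v in voices):
        problems.append("a <voice> element has no name attribute")
    return problems
```

Running the check before invoking the action lets a flow fail fast with a readable message instead of spending one of its throttled calls on a request that is likely to be rejected.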
Get list of voices
Get a full list of voices for a specific region or endpoint.
Returns
Name | Path | Type | Description |
---|---|---|---|
| | array of object | |
items | | object | |
array |
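Since the action returns an array of objects, a common follow-up step is to narrow the list to one locale. The sketch below assumes each object carries `ShortName` and `Locale` fields, as in the public voices list REST response; this page does not list the object's properties, so treat both field names as assumptions. The sample data is illustrative only.

```python
def voices_for_locale(voices: list, locale: str) -> list:
    """Return the sorted short names of voices whose locale matches (case-insensitive)."""
    wanted = locale.lower()
    return sorted(
        v["ShortName"]
        for v in voices
        if "ShortName" in v and str(v.get("Locale", "")).lower() == wanted
    )


if __name__ == "__main__":
    # Illustrative sample; a real response contains many more entries and fields.
    sample = [
        {"ShortName": "en-US-JennyNeural", "Locale": "en-US"},
        {"ShortName": "en-GB-SoniaNeural", "Locale": "en-GB"},
        {"ShortName": "en-US-GuyNeural", "Locale": "en-US"},
    ]
    print(voices_for_locale(sample, "en-us"))
    # prints ['en-US-GuyNeural', 'en-US-JennyNeural']
```

A short name selected this way is the kind of value the Voice Name parameter expects (for example, en-US-JennyNeural).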