Azure Text to speech

Azure Text-to-speech allows you to build apps and services that speak naturally with more than 400 voices across 140 languages and dialects.

This connector is available in the following products and regions:

Service	Class	Regions
Copilot Studio	Premium	All Power Automate regions except China Cloud operated by 21Vianet
Logic Apps	Standard	All Logic Apps regions except Azure China regions
Power Apps	Premium	All Power Apps regions except China Cloud operated by 21Vianet
Power Automate	Premium	All Power Automate regions except China Cloud operated by 21Vianet

Contact
Name	Speech Service Power Platform Team
URL	https://docs.microsoft.com/azure/cognitive-services/speech-service/support
Email	speechpowerplatform@microsoft.com

Connector Metadata
Publisher	Microsoft
Website	https://docs.microsoft.com/azure/cognitive-services/speech-service/
Privacy policy	https://privacy.microsoft.com
Categories	AI;Website

The Speech service allows you to convert text into synthesized speech and get a list of supported voices for a region by using a REST API.

Prerequisites

You will need the following to proceed:

Azure subscription - Create one for free.
Create a Speech resource in the Azure portal.
Get the Speech resource key and region. After your Speech resource is deployed, select Go to resource to view and manage keys. For more information about Cognitive Services resources, see Get the keys for your resource.
Learn more about Azure Text-to-speech supported locales and voices.

Creating a connection

The connector supports the following authentication types:


Name	Description	Applicable regions	Shareability
Api Key	ApiKey	All regions	Shareable
Microsoft Entra ID Integrated	Use Microsoft Entra ID to access your speech service.	All regions except Azure Government and Department of Defense (DoD) in Azure Government and US Government (GCC-High)	Not shareable
Microsoft Entra ID Integrated (Azure Government)	Use Microsoft Entra ID to access your speech service.	Azure Government and Department of Defense (DoD) in Azure Government and US Government (GCC-High) only	Not shareable
Default [DEPRECATED]	This option is only for older connections without an explicit authentication type, and is only provided for backward compatibility.	All regions	Not shareable

Api Key

Auth ID: keyBasedAuth

Applicable: All regions

ApiKey

This is a shareable connection. If the power app is shared with another user, the connection is shared as well. For more information, see the Connectors overview for canvas apps - Power Apps | Microsoft Docs.

Name	Type	Description	Required
Account Key	securestring	Speech service key	True
Region	string	Speech service region (Example: eastus)	True
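As a rough illustration of what the Account Key and Region values are for, the sketch below builds the URL and headers for a direct call to the Speech service REST API. The endpoint pattern and header names come from the public Speech REST API rather than from this page, and whether the connector issues exactly this request internally is an assumption. `build_tts_request` is a hypothetical helper name.

```python
from __future__ import annotations


def build_tts_request(region: str, account_key: str) -> tuple[str, dict[str, str]]:
    """Return the (url, headers) pair for a key-authenticated synthesis call.

    Assumption: the regional endpoint pattern and header names below follow the
    public Speech service REST API; this connector page does not state them.
    """
    # Region names such as "eastus" or "westeurope" are plain alphanumerics.
    if not region or not region.isalnum():
        raise ValueError(f"unexpected region name: {region!r}")

    url = f"https://{region.lower()}.tts.speech.microsoft.com/cognitiveservices/v1"
    headers = {
        "Ocp-Apim-Subscription-Key": account_key,      # the Account Key value
        "Content-Type": "application/ssml+xml",        # request body is SSML
        "X-Microsoft-OutputFormat": "riff-24khz-16bit-mono-pcm",  # default format on this page
    }
    return url, headers


# Usage with placeholder values (not real credentials).
url, headers = build_tts_request("eastus", "<your-speech-key>")
```

Keeping the key in a header rather than in the URL avoids leaking it into request logs.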

Microsoft Entra ID Integrated

Auth ID: tokenBasedAuth

Applicable: All regions except Azure Government and Department of Defense (DoD) in Azure Government and US Government (GCC-High)

Use Microsoft Entra ID to access your speech service.

This is not a shareable connection. If the power app is shared with another user, that user will be prompted to create a new connection explicitly.

Name	Type	Description	Required
Resource ID	string	The Cognitive Services resource ID (Example: /subscriptions/<Subscription ID>/resourceGroups/<ResourceGroup Name>/providers/Microsoft.CognitiveServices/accounts/<CognitiveServices Resource Name>)	True
Custom Subdomain	string	Custom subdomain of the endpoint URL (Example: contoso)	True
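The Resource ID has a fixed path shape, so a malformed value can be caught before a connection is created. The following is a minimal sketch of such a check, assuming only the format shown in the example above. `parse_resource_id` and the sample identifiers are hypothetical.

```python
from __future__ import annotations

import re

# Pattern taken from the example format in the parameter table above.
RESOURCE_ID_PATTERN = re.compile(
    r"^/subscriptions/(?P<subscription_id>[^/]+)"
    r"/resourceGroups/(?P<resource_group>[^/]+)"
    r"/providers/Microsoft\.CognitiveServices/accounts/(?P<account>[^/]+)$",
    re.IGNORECASE,
)


def parse_resource_id(resource_id: str) -> dict[str, str]:
    """Split a Cognitive Services resource ID into its named parts."""
    match = RESOURCE_ID_PATTERN.match(resource_id.strip())
    if match is None:
        raise ValueError("not a Cognitive Services account resource ID")
    return match.groupdict()


# Usage with made-up identifiers.
parts = parse_resource_id(
    "/subscriptions/0000-1111/resourceGroups/demo-rg"
    "/providers/Microsoft.CognitiveServices/accounts/demo-speech"
)
```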

Microsoft Entra ID Integrated (Azure Government)

Auth ID: tokenBasedAuth

Applicable: Azure Government and Department of Defense (DoD) in Azure Government and US Government (GCC-High) only

Use Microsoft Entra ID to access your speech service.

This is not a shareable connection. If the power app is shared with another user, that user will be prompted to create a new connection explicitly.

Name	Type	Description	Required
Resource ID	string	The Cognitive Services resource ID (Example: /subscriptions/<Subscription ID>/resourceGroups/<ResourceGroup Name>/providers/Microsoft.CognitiveServices/accounts/<CognitiveServices Resource Name>)	True
Custom Subdomain	string	Custom subdomain of the endpoint URL (Example: contoso)	True

Default [DEPRECATED]

Applicable: All regions

This option is only for older connections without an explicit authentication type, and is only provided for backward compatibility.

This is not a shareable connection. If the power app is shared with another user, that user will be prompted to create a new connection explicitly.

Name	Type	Description	Required
Account Key	securestring	Azure Cognitive Services for Neural Text-to-speech Account Key	True
Region	string	Speech service region (Example: eastus)	True

Throttling Limits

Name	Calls	Renewal Period
API calls per connection	100	60 seconds
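A caller that issues many requests in a loop may want to stay under this limit on its own side. The sketch below is one possible client-side guard using a sliding window. It assumes nothing about how the service measures the 60-second period (which this page does not specify), and `SlidingWindowLimiter` is a hypothetical helper, not part of the connector.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Track call timestamps and report how long to wait before the next call."""

    def __init__(self, max_calls=100, period=60.0, clock=time.monotonic):
        self.max_calls = max_calls  # 100 calls per connection (from the table)
        self.period = period        # 60-second renewal period (from the table)
        self.clock = clock          # injectable clock, handy for testing
        self._calls = deque()

    def wait_time(self) -> float:
        """Seconds until another call fits in the window (0.0 if it fits now)."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self._calls and now - self._calls[0] >= self.period:
            self._calls.popleft()
        if len(self._calls) < self.max_calls:
            return 0.0
        return self.period - (now - self._calls[0])

    def record(self) -> None:
        """Note that a call was just made."""
        self._calls.append(self.clock())


def throttled(limiter: SlidingWindowLimiter, call):
    """Sleep if needed, then run `call` and record it."""
    delay = limiter.wait_time()
    if delay > 0:
        time.sleep(delay)
    limiter.record()
    return call()
```

A sliding window is slightly more conservative than a fixed window that resets every minute, which is a safe choice when the server-side policy is unknown.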

Actions

Convert text to speech	Convert single text to speech.
Convert text to speech with SSML	Convert text to speech by using Speech Synthesis Markup Language (SSML)
Get list of voices	Get a full list of voices for a specific region or endpoint.

Convert text to speech

Operation ID: ConvertTextToSpeech

Convert single text to speech.

Parameters

Name	Key	Required	Type	Description
Voice Name	voiceName	True	string	The voice name output for text to speech. For example: en-US-JennyNeural.
Locale	locale	True	string	The locale of the contained data. For example: en-US.
Synthesized Text	synthesizedText	True	string	The text to convert to speech.
Output Audio Format	outputFormat		string	The non-streaming audio formats. Default: riff-24khz-16bit-mono-pcm.
Style	style		string	The speaking style of the voice. For example: cheerful.
Speaking Rate	speakingRate		string	The speaking rate, as a relative change. For example: -40.00%.
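To show how these parameters relate to one another, the sketch below assembles them into an SSML document of the kind the Speech service accepts. The mapping is an assumption: this page does not say how the connector combines the values. `mstts:express-as` and `prosody` are standard Azure SSML elements, and `build_ssml` is a hypothetical helper.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(voice_name, locale, text, style=None, speaking_rate=None):
    """Assemble an SSML string from the action's parameters.

    Assumption: Style maps to <mstts:express-as> and Speaking Rate to
    <prosody rate=...>; the connector's own mapping is not documented here.
    """
    body = escape(text)  # escape &, <, > so the text cannot break the XML
    if speaking_rate:
        body = f"<prosody rate={quoteattr(speaking_rate)}>{body}</prosody>"
    if style:
        body = f"<mstts:express-as style={quoteattr(style)}>{body}</mstts:express-as>"
    return (
        "<speak version='1.0' xmlns='http://www.w3.org/2001/10/synthesis' "
        "xmlns:mstts='http://www.w3.org/2001/mstts' "
        f"xml:lang={quoteattr(locale)}>"
        f"<voice name={quoteattr(voice_name)}>{body}</voice></speak>"
    )


# Usage with the example values from the parameter table.
ssml = build_ssml(
    "en-US-JennyNeural", "en-US", "Fish & chips",
    style="cheerful", speaking_rate="-40.00%",
)
```

Not every voice supports every style, so a style the chosen voice lacks may simply be ignored by the service.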

Convert text to speech with SSML

Operation ID: ConvertTextToSpeechWithSSML

Convert text to speech by using Speech Synthesis Markup Language (SSML)

Parameters

Name	Key	Required	Type	Description
SSML Text	ssmlText	True	string	The text in SSML format (e.g. <speak xmlns='http://www.w3.org/2001/10/synthesis' xmlns:mstts='http://www.w3.org/2001/mstts' xmlns:emo='http://www.w3.org/2009/10/emotionml' version='1.0' xml:lang='en-US'><voice name='en-US-ChristopherNeural'>power connector</voice></speak>)
Output Audio Format	outputFormat		string	The non-streaming audio formats. Default: riff-24khz-16bit-mono-pcm.
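Because the SSML Text parameter is raw XML, a malformed document is an easy mistake to make. The sketch below is a minimal local pre-check using Python's standard XML parser. It tests only well-formedness, the root element, and the presence of a named voice, not the full SSML schema. `check_ssml` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"


def check_ssml(ssml_text: str) -> list:
    """Return the voice names used in the SSML, or raise ValueError if it looks wrong."""
    try:
        root = ET.fromstring(ssml_text)
    except ET.ParseError as exc:
        raise ValueError(f"SSML is not well-formed XML: {exc}") from exc

    # The root must be <speak> in the SSML namespace.
    if root.tag != f"{{{SSML_NS}}}speak":
        raise ValueError("root element must be <speak> in the SSML namespace")

    # Collect the name attribute of every <voice> element.
    voices = [v.get("name") for v in root.iter(f"{{{SSML_NS}}}voice")]
    if not voices or None in voices:
        raise ValueError("SSML needs at least one <voice name='...'> element")
    return voices


# Usage with the example document from the parameter table.
sample = (
    "<speak xmlns='http://www.w3.org/2001/10/synthesis' "
    "xmlns:mstts='http://www.w3.org/2001/mstts' "
    "xmlns:emo='http://www.w3.org/2009/10/emotionml' "
    "version='1.0' xml:lang='en-US'>"
    "<voice name='en-US-ChristopherNeural'>power connector</voice></speak>"
)
voice_names = check_ssml(sample)
```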

Get list of voices

Operation ID: GetVoicesList

Get a full list of voices for a specific region or endpoint.

Returns

Name	Path	Type	Description
		array of object
items		object	array
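The returned array is usually filtered before use, for example to offer only the voices for one locale. The sketch below assumes each item carries `ShortName` and `Locale` fields, as in the Speech service REST voice list. This page does not document the item fields, so treat those names as an assumption. `voices_for_locale` and the sample data are hypothetical.

```python
def voices_for_locale(voices, locale):
    """Return the sorted short names of voices matching a locale (case-insensitive).

    Assumption: items have "ShortName" and "Locale" keys; the connector's
    actual item schema is not described on this page.
    """
    wanted = locale.lower()
    return sorted(
        v["ShortName"]
        for v in voices
        if "ShortName" in v and v.get("Locale", "").lower() == wanted
    )


# Made-up sample shaped like the assumed response; not real service output.
sample_voices = [
    {"ShortName": "en-US-JennyNeural", "Locale": "en-US"},
    {"ShortName": "de-DE-KatjaNeural", "Locale": "de-DE"},
    {"ShortName": "en-US-ChristopherNeural", "Locale": "en-US"},
]
english = voices_for_locale(sample_voices, "en-us")
```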