TTS can't read "Cоmmuniсаtiоn is often dоnе without our own соnѕсiоuѕ awareness"
Hello everyone, I'm using Azure TTS and encountered a weird problem. I tried testing the following line "Cоmmuniсаtiоn is often dоnе without our own соnѕсiоuѕ awareness" on the various voices available in the voice catalog. So far all I…
Use Azure Speech through a fixed public IP
A customer wishes to use the Azure Speech service over the internet while reducing the number of IP addresses their firewall must unblock. I tried to do that by adding a virtual network to my speech resource, creating a public IP, and…
en-NG availability in Embedded Speech
Hi, I would like to request availability of the English (Nigeria) variant for embedded TTS.
Usage cost calculation using Azure Retail Prices API for Azure Speech to Text and Blob Storage
We are using an Azure subscription with the Standard tier. We need to calculate the monthly usage cost in JPY (Japanese Yen) of the Azure Speech to Text service and Azure Blob Storage in our application. We analyzed the Azure Retail Prices API…
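Once the relevant meters have been pulled from the Retail Prices API (which accepts a `currencyCode='JPY'` query parameter so prices come back in yen), the monthly cost is a sum of per-meter usage times unit price. A minimal sketch, assuming records shaped like the API's `Items` array; the meter names, prices, and usage figures below are illustrative only:

```python
# Sketch: estimate monthly cost from Azure Retail Prices API records.
# Assumes "items" came from GET https://prices.azure.com/api/retail/prices
# with currencyCode='JPY', so "retailPrice" is already in yen.

def monthly_cost_jpy(items, usage_by_meter):
    """Sum retailPrice * usage for each meter we have usage for."""
    return sum(
        item["retailPrice"] * usage_by_meter.get(item["meterName"], 0.0)
        for item in items
    )

# Illustrative (made-up) prices and usage:
items = [
    {"meterName": "Standard Speech to Text", "retailPrice": 150.0},  # per audio hour
    {"meterName": "Hot LRS Data Stored", "retailPrice": 3.0},        # per GB / month
]
usage = {"Standard Speech to Text": 40.0, "Hot LRS Data Stored": 100.0}
print(monthly_cost_jpy(items, usage))  # 40*150 + 100*3 = 6300.0
```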
The Cognitive Services Speech SDK produces no sound in Safari on iPhone, but plays successfully in Safari on Mac. How should this be handled?
The Cognitive Services Speech SDK produces no sound in Safari on iPhone, but plays successfully in Safari on Mac. How should this be handled? (in React) const initializeSynthesizer = () => { const speechConfig =…
In, e.g., 0001.sentence.json, quotation marks present in the original sentence are dropped if that quotation mark occurs at the beginning or end of the detected sentence. Is this expected behavior?
This is mostly in the title. Initially, I suspected this was a bug in the JSON serialization, since JSON also uses " to delimit its fields, and these also have to be escaped in SSML. Upon further investigation, however, I found it also affects…
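Since the question raises SSML escaping as a suspect, one quick way to rule out escaping problems on the sending side is to escape the XML-reserved characters before embedding a sentence in SSML. A minimal sketch using the standard library (`to_ssml_text` is a hypothetical helper name):

```python
from xml.sax.saxutils import escape

def to_ssml_text(s: str) -> str:
    # escape() handles &, <, > by default; also encode the quote
    # characters so the text is safe in SSML attribute values too.
    return escape(s, {'"': "&quot;", "'": "&apos;"})

print(to_ssml_text('She said "hello" & left.'))
# She said &quot;hello&quot; &amp; left.
```

If quotes survive this escaping on the way in but are still missing from the sentence JSON, the drop is happening on the service side, not in your serialization.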
Custom neural voice data size is at 0 after training. Should I deploy the model?
Hello, we prepared 1749 utterances in order to create a Custom Neural Voice. In Step 3, "Train Model", it identified these 1749 utterances and suggested 25 hours of training time (see image attached). The training finished in over…
Can some voices in SPX text to speech not read the phonetic alphabet?
Hello! I am using the Azure text to speech service with SSML to read the phonetic alphabet. It works well except when I pick the voice "Andrew Multilingual": the spx command does not generate any voice, but there is no error in the output. Are…
How to receive a real-time audio stream over WebSocket in Spring Boot with the SDK
Hello. This is really driving me crazy. I send an audio stream from the web client to the server; the server must convert the stream to text using the SDK. However, the stream in WAV format does not appear to be sent from the client to the server. I…
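A common pitfall with streamed recognition is format confusion: the Speech SDK's default push-stream format is raw 16 kHz, 16-bit mono PCM, while a WAV file carries a container header that should not be fed in as audio samples. A small sketch (language chosen for illustration; the 44-byte header length assumes a canonical PCM WAV file) for detecting and stripping that header from the first chunk:

```python
def looks_like_wav(chunk: bytes) -> bool:
    # A WAV container begins with "RIFF", a 4-byte size, then "WAVE".
    return len(chunk) >= 12 and chunk[:4] == b"RIFF" and chunk[8:12] == b"WAVE"

def strip_wav_header(chunk: bytes, header_len: int = 44) -> bytes:
    # Drop the canonical 44-byte PCM WAV header. Real-world files can
    # contain extra chunks, so locating the "data" chunk explicitly is
    # safer in production than a fixed offset.
    return chunk[header_len:] if looks_like_wav(chunk) else chunk
```

If the first chunk arriving at the server fails `looks_like_wav`, the client may be sending raw PCM (fine for the SDK) or the framing may be broken upstream.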
How to use a Microsoft Entra ID to authenticate with the Speech to Text REST API (for batch transcription)
It looks like you can only authenticate to the Speech to Text REST API with an API key (Ocp-Apim-Subscription-Key). What we would like is to authenticate with a Microsoft Entra ID. Why? Our application runs on AKS, and all our containers…
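Batch transcription REST calls can carry an Entra ID bearer token instead of the subscription key, though the Speech resource generally needs a custom subdomain and the caller needs an appropriate Cognitive Services role assignment. A minimal sketch of the header swap (`bearer_header` is a hypothetical helper; token acquisition via the `azure-identity` package is shown only in comments because it needs live credentials):

```python
def bearer_header(token: str) -> dict:
    # Replace the Ocp-Apim-Subscription-Key header with an Entra ID
    # bearer token on batch-transcription REST requests.
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }

# Acquiring the token (assumes the azure-identity package is installed,
# the resource has a custom subdomain, and the workload identity holds
# a Cognitive Services role):
#
#   from azure.identity import DefaultAzureCredential
#   token = DefaultAzureCredential().get_token(
#       "https://cognitiveservices.azure.com/.default").token
```

On AKS this pairs naturally with workload identity, so no key ever has to be mounted into the containers.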
Is there any way of accessing the sounds that are sent to the Speech SDK server?
Hi, I'm trying to build an AI assistant using the Speech SDK. The device is Linux kernel based, and I've configured ALSA loopback and PulseAudio to use the echo cancellation feature, which should be supported by the SDK. One thing that I noticed is that…
Understanding "standard paid (S0)" pricing for "Audio Content Creation"
If I created a speech service with standard paid (S0) and I am only going to use Audio Content Creation, what will the pricing be? Will the free quota (500k characters) be included?…
I can't access anything in "Audio Content Creation"; error: "You don't have operation permissions"
I just created a speech service, but when I go to "Audio Content Creation", I can't do anything (New, Upload, Export). I tried adding myself as the Owner role, and other roles, but I still can't do anything in Audio Content Creation.
Will Azure AI Speech generate styles such as "happy", "cheerful", "excited" automatically from the data given?
I've added data with about 750 utterances: 80% are normal sentences, 10% are questions, and the other 10% are exclamations. What will Speech Studio need to generate styles such as Happy, Cheerful, etc.? Do I have to give it more data? Or will…
Bug Report: Mispronunciation of Welsh Contraction "i’w" in Azure Neural TTS
Subject: Bug Report: Mispronunciation of Welsh Contraction "i’w" in Azure Neural TTS Description: The Azure Neural TTS system is mispronouncing the Welsh contraction "i’w." Instead of producing the correct pronunciation…
Speech Studio "Text to Speech" not respecting <break> markup
The text to speech renderer fails to apply the "break" markup in the Audio Content Creation interface of Speech Studio. I haven't tried other markup. Yesterday it didn't work with RyanMultilingualNeural but worked with AndrewNeural. Now…
Why am I getting a quota error?
I'm using Azure TTS and getting the following quota error: "You have reached the quota with your free-tier (F0) Speech resource. To continue to create audios with neural voices, switch to a standard paid resource, or upgrade your free-tier…
No module named 'azure' when using azure.cognitiveservices.speech
Hello, I have a problem importing azure.cognitiveservices.speech. I pip installed the package, but when importing it I get this error: ModuleNotFoundError: No module named 'azure'
How to transcribe silences to train a custom STT model?
Hey! 🙂 I'm about to fine-tune an STT model with audio + human-labeled transcript data. I've gone through the docs and I'm pretty confident that I have the right use case for this type of custom model training. Also, I already know how to organize the data…
Speech-to-Text batch transcription API in germanywestcentral doesn't work
Last Friday (May 31, 2024) we started getting the following errors on all transcripts sent to the batch transcription API on our speech resource in…