TTS can't read "Cоmmuniсаtiоn is often dоnе without our own соnѕсiоuѕ awareness"
Hello everyone, I'm using Azure TTS and encountered a weird problem. I tried testing the following line "Cоmmuniсаtiоn is often dоnе without our own соnѕсiоuѕ awareness" on the various voices available in the voice catalog. So far all I…
Use Azure Speech through a fixed public IP
A customer wishes to use the Azure Speech service over the internet while reducing the number of IP addresses their firewall must unblock. I tried to do that by adding a virtual network to my speech resource, creating a public IP, and…
en-NG availability in Embedded Speech
Hi, I would like to request availability of the English (Nigeria) variant for embedded TTS.
Usage cost calculation using Azure Retail Prices API for Azure Speech to Text and Blob Storage
We are using an Azure subscription with the Standard tier. We need to calculate the monthly usage cost in JPY (Japanese Yen) of the Azure Speech to Text service and Azure Blob Storage in our application. We analyzed the Azure Retail Prices API…
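Once the relevant meters have been pulled from the Retail Prices API (which accepts a `currencyCode='JPY'` query parameter so prices come back in yen), the monthly cost is a sum of per-meter usage times unit price. A minimal sketch, assuming records shaped like the API's `Items` array; the meter names, prices, and usage figures below are illustrative only:

```python
# Sketch: estimate monthly cost from Azure Retail Prices API records.
# Assumes "items" came from GET https://prices.azure.com/api/retail/prices
# with currencyCode='JPY', so "retailPrice" is already in yen.

def monthly_cost_jpy(items, usage_by_meter):
    """Sum retailPrice * usage for each meter we have usage for."""
    return sum(
        item["retailPrice"] * usage_by_meter.get(item["meterName"], 0.0)
        for item in items
    )

# Illustrative (made-up) prices and usage:
items = [
    {"meterName": "Standard Speech to Text", "retailPrice": 150.0},  # per audio hour
    {"meterName": "Hot LRS Data Stored", "retailPrice": 3.0},        # per GB / month
]
usage = {"Standard Speech to Text": 40.0, "Hot LRS Data Stored": 100.0}
print(monthly_cost_jpy(items, usage))  # 40*150 + 100*3 = 6300.0
```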
The Cognitive Services Speech SDK produces no sound in Safari on iPhone, but plays successfully in Safari on Mac. How should this be handled?
The Cognitive Services Speech SDK produces no sound in Safari on iPhone, but plays successfully in Safari on Mac. How should this be handled? (in React) const initializeSynthesizer = () => { const speechConfig =…
In, e.g., 0001.sentence.json, quotation marks present in the original sentence are dropped if that quotation mark occurs at the beginning or end of the detected sentence. Is this expected behavior?
This is mostly in the title. Initially, I suspected this was a bug in the JSON serialization, since JSON also uses " to delimit its fields, and these also have to be escaped in SSML. Upon further investigation, however, I found it also affects…
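Since the question raises SSML escaping as a suspect, one quick way to rule out escaping problems on the sending side is to escape the XML-reserved characters before embedding a sentence in SSML. A minimal sketch using the standard library (`to_ssml_text` is a hypothetical helper name):

```python
from xml.sax.saxutils import escape

def to_ssml_text(s: str) -> str:
    # escape() handles &, <, > by default; also encode the quote
    # characters so the text is safe in SSML attribute values too.
    return escape(s, {'"': "&quot;", "'": "&apos;"})

print(to_ssml_text('She said "hello" & left.'))
# She said &quot;hello&quot; &amp; left.
```

If quotes survive this escaping on the way in but are still missing from the sentence JSON, the drop is happening on the service side, not in your serialization.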
Custom neural voice data size is at 0 after training. Should I deploy the model?
Hello, we prepared 1749 utterances in order to create a Custom Neural Voice. In Step 3, "Train Model", it identified these 1749 utterances and suggested 25 hours of training time (see image attached). The training finished in over…
Can some voices in SPX text to speech not read the phonetic alphabet?
Hello! I am using the Azure text to speech service with SSML to read the phonetic alphabet. It works well except when I pick the voice "Andrew Multilingual": the spx command does not generate any voice, but there is no error in the output. Are…
How to receive a real-time audio stream over WebSocket in Spring Boot with the SDK
Hello. This is really driving me crazy. I send an audio stream from the web client to the server; the server must convert the stream to text using the SDK. However, the stream in WAV format does not appear to be sent from the client to the server. I…
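A common pitfall with streamed recognition is format confusion: the Speech SDK's default push-stream format is raw 16 kHz, 16-bit mono PCM, while a WAV file carries a container header that should not be fed in as audio samples. A small sketch (language chosen for illustration; the 44-byte header length assumes a canonical PCM WAV file) for detecting and stripping that header from the first chunk:

```python
def looks_like_wav(chunk: bytes) -> bool:
    # A WAV container begins with "RIFF", a 4-byte size, then "WAVE".
    return len(chunk) >= 12 and chunk[:4] == b"RIFF" and chunk[8:12] == b"WAVE"

def strip_wav_header(chunk: bytes, header_len: int = 44) -> bytes:
    # Drop the canonical 44-byte PCM WAV header. Real-world files can
    # contain extra chunks, so locating the "data" chunk explicitly is
    # safer in production than a fixed offset.
    return chunk[header_len:] if looks_like_wav(chunk) else chunk
```

If the first chunk arriving at the server fails `looks_like_wav`, the client may be sending raw PCM (fine for the SDK) or the framing may be broken upstream.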
How to use a Microsoft Entra ID to authenticate with the Speech to Text REST API (for batch transcription)
It looks like you can only authenticate to the Speech to Text REST API with an API key (Ocp-Apim-Subscription-Key). What we would like is to authenticate with a Microsoft Entra ID. Why? Our application runs on AKS, and all our containers…
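Batch transcription REST calls can carry an Entra ID bearer token instead of the subscription key, though the Speech resource generally needs a custom subdomain and the caller needs an appropriate Cognitive Services role assignment. A minimal sketch of the header swap (`bearer_header` is a hypothetical helper; token acquisition via the `azure-identity` package is shown only in comments because it needs live credentials):

```python
def bearer_header(token: str) -> dict:
    # Replace the Ocp-Apim-Subscription-Key header with an Entra ID
    # bearer token on batch-transcription REST requests.
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }

# Acquiring the token (assumes the azure-identity package is installed,
# the resource has a custom subdomain, and the workload identity holds
# a Cognitive Services role):
#
#   from azure.identity import DefaultAzureCredential
#   token = DefaultAzureCredential().get_token(
#       "https://cognitiveservices.azure.com/.default").token
```

On AKS this pairs naturally with workload identity, so no key ever has to be mounted into the containers.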
Is there any way of accessing the sounds that are sent to the Speech SDK server?
Hi, I'm trying to build an AI assistant using the Speech SDK. The device is Linux kernel based, and I've configured ALSA loopback and PulseAudio to use the echo cancellation feature, which should be supported by the SDK. One thing that I noticed is that…
Understanding "standard paid (S0)" pricing for "Audio Content Creation"
If I created a speech service with standard paid (S0) and I am only going to use Audio Content Creation, what will the pricing be? Will the free quota (500k characters) be included?…
I can't access anything in "Audio Content Creation"; error: "You don't have operation permissions"
I just created a speech service, but when I go to "Audio Content Creation", I can't do anything (New, Upload, Export). I tried adding myself as the Owner role, and other roles, but I still can't do anything in Audio Content Creation.
Will Azure AI Speech generate styles such as "happy", "cheerful", "excited" automatically from the data given?
I've added data with about 750 utterances: 80% are normal sentences, 10% are questions, and the other 10% are exclamations. What will Speech Studio need to generate styles such as Happy, Cheerful, etc.? Do I have to give it more data? Or will…
Bug Report: Mispronunciation of Welsh Contraction "i’w" in Azure Neural TTS
Subject: Bug Report: Mispronunciation of Welsh Contraction "i’w" in Azure Neural TTS Description: The Azure Neural TTS system is mispronouncing the Welsh contraction "i’w." Instead of producing the correct pronunciation…
Speech Studio "Text to Speech" not respecting <break> markup
The text to speech renderer fails to apply the "break" markup in the Audio Content Creation interface of Speech Studio. I haven't tried other markup. Yesterday it didn't work with RyanMultilingualNeural but worked with AndrewNeural. Now…
Why am I getting a quota error?
I'm using Azure TTS and getting the following quota error: "You have reached the quota with your free-tier (F0) Speech resource. To continue to create audios with neural voices, switch to a standard paid resource, or upgrade your free-tier…
No module named 'azure' when using azure.cognitiveservices.speech
Hello, I have a problem importing azure.cognitiveservices.speech. I pip installed the package, but when importing it I get this error: ModuleNotFoundError: No module named 'azure'
How to transcribe silences to train a custom STT model?
Hey! 🙂 I'm about to fine-tune an STT model with audio + human-labeled transcript data. I've gone through the docs and I'm pretty confident that I have the right use case for this type of custom model training. Also, I already know how to organize the data…
Speech-to-Text batch transcription API in germanywestcentral doesn't work
Last Friday (May 31, 2024) we started getting the following errors on all transcripts sent to the batch transcription API on our speech resource in…