An error occurred in the text-to-speech preview area
An error occurred in the text-to-speech preview area. I added Southeast Asia as the region of the Speech Service resource because it can use the preview speaker, but when I select the Chinese model "zh-CN-YunjianNeural", it has an…
Docker container fails to run with a model trained on a structured text dataset
I'm testing and using a model trained with a specific dataset using a custom-speech-to-text container. Below is the command line I used. I only change the modelId for this command. docker run --name stt --rm -it -p 5500:5000 \ -v…
How to poll asynchronous speech synthesis for status in Python
Hello, I have an object of type speechsdk.SpeechSynthesizer on which I am running asynchronous speech synthesis with speech_synthesizer.speak_ssml_async(), and I want to be able to tell when the synthesis has completed (i.e. how to poll it for an…
How to transcribe an interview with two speakers from a single audio file, similar to Word 365, using the spx recognize CLI
Hello everyone, I have a series of interviews recorded as MP3 files and I would like to use the Azure Speech CLI to transcribe them in a way similar to the integrated Word 365 transcription format, which is: I would like to use the Azure Speech service,…


There is something wrong with the Chinese model Yunye voice.
I am using Text-to-Speech in Microsoft Azure with the Chinese model "Yunye", and when I play it I hear a buzzing current sound (as in the following link): https://1drv.ms/v/s!AtIg22Hya6zakk7cF2N5LyUTRM4s?e=cX6rwy The same problem doesn't happen when I use…


Speech-to-text: Disfluency Removal configuration
I am using the speech-to-text REST API (Python) to do some research regarding fillers, pauses, and backtracking in Japanese (ja-JP). Can I configure disfluency removal while using the Speech-to-text service? I need to have true text with all the fillers…


Mac M1: CLI SPX Command not found
On my MacBook M1 (20219, macOS 12.4) I have successfully installed dotnet-sdk-6.0.301-osx-arm64 and the Speech CLI via dotnet tool install --global Microsoft.CognitiveServices.Speech.CLI but when I type spx help I get zsh:…


Why is there a buzzing sound during playback when I use the text-to-speech tool?
Hi there, I am using Text-to-Speech in Microsoft Azure. When I select the Chinese language and a voice like "Yunye" and play, I hear a buzzing current sound (see attachment). When I choose other voices, they are all normal. Only this one voice has a…


Unable to delete audio wav file after stop_continuous_recognition
Hi, I am using the Azure start/stop_continuous_recognition functions for continuous transcription of large wav audio files. After transcription I need to delete the files from local storage so that my server does not run out of space after transcribing many files. It…


Speech Service Cost
I am confused about the free product and the free trial. There are some free products like Speech, but it says my usage is over the quota. If I move to pay-as-you-go, is Speech still free? Is that because my free trial ended? I am new to Azure, I apologize for…
SpeechSynthesisWordBoundaryEventArgs Class
I found in the Speech SDK documentation that the SpeechSynthesisWordBoundaryEventArgs class is a possible solution for us. Unlike the REST API documentation, there is no sample code showing how to use it. Is SSML required for this part? How do we locate words?
How to train a custom speech-to-text model to recognize a word and capitalize the first letter of the phrase
Hi, I have created a custom speech-to-text model that recognizes a phrase. It's recognizing it perfectly, but I want the resulting text in capitalized form. For example, the phrase is "Terminator" and the resulting text is "terminator"…


Unable to delete audio file
Hi, I am using the Azure speech-to-text service. Originally I have a video file, and I extract the audio file from it using ffmpeg. import azure.cognitiveservices.speech as speechsdk speech_config = speechsdk.SpeechConfig(subscription=key, endpoint=endpoint) …


Can't view custom speech model data
I am trying to create and test a custom speech model. I'm able to go through all of the steps to upload data, train the model, and test the model. However, I can't view the contents of files after I upload them (for example, a plain text file has a…
Waiting on Microsoft Azure Ashley to unlock speaking styles
Hello Microsoft Q&A community, I have been trying to use the speaking style selection feature on Microsoft Azure's TTS service to make Ashley's TTS voice sound more human and emote. However, I have noticed that the speaking style is stuck on…


How to use more than one voice in a TTS JavaScript snippet
The TTS JavaScript project I am currently working on needs to use two voices and be able to switch between them. However, as far as I can see, I can only configure the synthesizer engine with one voice using…


Can't play Custom Neural Voice
Hello, I can't play a custom voice because of this error: Unsupported voice CustomVoiceNeural. websocket error code: 1007 Code for fetching: ` async function synthesizeSpeech(responseText) { const speechConfig =…


How To Fix Error 4429 "exceeded the concurrent request limit"
I can't synthesise any speech anymore. I get this error: Speech synthesis canceled: Connection was closed by the remote host. Error code: 4429. The request is throttled because you have exceeded the concurrent request limit allowed for your sub USP…


How to configure disfluency removal using the REST API
I am using the speech-to-text REST API (Python) to do some research regarding fillers, pauses, and backtracking in Japanese (ja-JP). Can I configure disfluency removal while using the Speech-to-text service? I need to have true text with all the fillers in…


Finding a Specific Azure Text-To-Speech Voice
Can anyone tell me which text-to-speech voice is used in this YouTube Shorts video: https://www.youtube.com/shorts/zS2mgrbPGSk I've been looking for days and would really appreciate it if someone could help out. I'm 99.3% sure it's on Azure.

