How to eliminate speaker audio bleeding into the microphone while both are in use at the same time, using speech recognition and synthesis
Hi, I'm creating a real-time voice chatbot. For speech recognition and synthesis I am using Azure Speech. What I do is recognize the voice, send it to an LLM to get a response, and then synthesize the response into audio in real time. My goal is that…
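Since no resolution appears in the thread, here is one common mitigation, sketched in Python: when acoustic echo cancellation is not available, gate the pipeline half-duplex style and discard recognition results that arrive while the bot's own synthesized audio is playing. The class below is illustrative glue code, not part of the Azure Speech SDK; you would wire its methods into the SDK's synthesis-started/completed and recognized callbacks.

```python
class HalfDuplexGate:
    """Tracks whether the bot is currently speaking. Recognition results
    that arrive during playback are discarded, so the microphone does not
    feed the bot's own synthesized audio back into the recognizer."""

    def __init__(self):
        self._speaking = False
        self.accepted = []  # transcript fragments that passed the gate

    def playback_started(self):
        # Hook this to the synthesizer's "synthesis started" event.
        self._speaking = True

    def playback_finished(self):
        # Hook this to the synthesizer's "synthesis completed" event.
        self._speaking = False

    def on_recognized(self, text):
        # Hook this to the recognizer's "recognized" event.
        # Returns True if the text was accepted, False if dropped.
        if self._speaking:
            return False
        self.accepted.append(text)
        return True
```

The trade-off of this approach is that the user cannot barge in while the bot is talking; proper echo cancellation (in the OS, the browser, or a headset) is needed for full-duplex conversation.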
How to read English words aloud syllable by syllable with text-to-speech? The purpose is to make videos for memorizing English words.
It feels like these voices are optimized for reading complete sentences, but they can't read a word out in detail, syllable by syllable.
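One workaround, sketched below, is to split each word into syllables yourself and build SSML that inserts a pause between them; the voice then reads the fragments separately. The voice name and pause length here are assumptions, and a bare fragment is still pronounced as spelled, so for precise per-syllable sounds you may additionally need SSML's `<phoneme>` element with IPA.

```python
def ssml_syllables(syllables, voice="en-US-JennyNeural", pause_ms=300):
    """Build an SSML document that reads a word syllable by syllable,
    inserting a pause between syllables so each one is heard separately.

    syllables: list of syllable strings, e.g. ["beau", "ti", "ful"]
    """
    body = f'<break time="{pause_ms}ms"/>'.join(syllables)
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">'
        f'<voice name="{voice}">{body}</voice>'
        "</speak>"
    )
```

The resulting string can be passed to any SSML-capable synthesis call (for example, the Speech SDK's speak-SSML method).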
About speaker separation in "fast-transcription-api"
Dear Azure Support Team, the details of the TranscribeDefinition class used by https://learn.microsoft.com/en-us/rest/api/speechtotext/transcriptions/transcribe?view=rest-speechtotext-2024-05-15-preview&tabs=HTTP are not documented anywhere, so how should I…
Who can provide assistance? The time required for speech-to-text processing of the same file varies greatly, by up to around 40%. Is Azure's performance normally like this?
As shown in the attached logs, the first test took approximately 8.5 seconds (first_test_log.txt), but the second test took only approximately 5 seconds (Second_test_log.txt).
Do you have any suggestions, or can you assist, with using the speech-to-text function to recognize homophones that may cause errors?
For example, the Chinese 枯 (kū) gets recognized as 哭 (kū), and the context doesn't resolve it; this is just a probabilistic issue.
My speech-to-text doesn't recognize multiple channels
When I upload my own file in the Ingestion Client it works, but when I use the sample code on GitHub it doesn't work and only gives me a single channel, so it doesn't pick up the multiple speakers. It works here:…
Error com.microsoft.cognitiveservices.speech.SpeechConfig.setTempDirectory(Ljava/lang/String;)V when running the Java demo on Windows 10
I get the error com.microsoft.cognitiveservices.speech.SpeechConfig.setTempDirectory(Ljava/lang/String;)V when running the Java demo on Windows 10. How do I resolve this issue? I have installed the Microsoft Visual C++ Redistributable for Visual Studio 2015, 2017, 2019, and 2022. The…
How to fix this error: Speech synthesis canceled: CancellationReason.Error. Error details: Connection failed (no connection to the remote host). Internal error: 1. Error details: Failed with error: WS_OPEN_ERROR_UNDERLYING_IO_OPEN_FAILED wss://westeurope.tts.
I would like to use Azure Text to Speech on a Raspberry Pi 4 with Python, but it doesn't work. I get the following error: Speech synthesis canceled: CancellationReason.Error Error details: Connection failed (no connection to the remote host). Internal error:…
Azure pronunciation assessment
In Azure pronunciation assessment for scripted speech, when I insert a word into my speech that does not exist in the script, why is that word not reported as an insertion in the pronunciation assessment result?
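For context: insertion and omission errors are only reported when miscue detection is enabled in the pronunciation assessment configuration. With the REST endpoint this is done by base64-encoding a JSON config into the `Pronunciation-Assessment` request header; a small helper is sketched below (the field values shown are typical defaults, adjust as needed).

```python
import base64
import json


def pron_assessment_header(reference_text):
    """Build the value of the Pronunciation-Assessment header for the
    Speech-to-Text REST API. EnableMiscue must be true for spoken words
    that are not in the script to be reported as Insertion errors (and
    script words that were skipped as Omission errors)."""
    config = {
        "ReferenceText": reference_text,
        "GradingSystem": "HundredMark",
        "Granularity": "Phoneme",
        "EnableMiscue": True,
    }
    return base64.b64encode(json.dumps(config).encode("utf-8")).decode("ascii")
```

With the Speech SDK instead of raw REST, the equivalent switch is the miscue option on the pronunciation assessment config object.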
Azure Real-Time Speech to Text fails to take input from a Blob URL
I have implemented Azure real-time speech to text using the Speech SDK in Python for pre-recorded audio files. It works fine when the input audio is on my machine, but it fails when I give the input as the blob URL containing the audio. Please help!
Internal Error for Custom Model for Italian language Project
I have a query regarding an issue we are facing while creating a custom model in the Azure Speech portal for the Italian language: it throws an internal error. The following is the list of items we have used. However, when we used the same…
When will fast transcription be GA?
I want to use fast transcription in production. When will it be GA?
Reading from a Blob container instead of a public URI for the Azure Speech API
I can download the file from my blob storage with the code below: public async Task<Stream> ReadFileFromBlobStorageAsync(string blobName) { try { var containerClient =…
Pronunciation assessment SDK is getting stuck
I'm trying to integrate the pronunciation assessment speech services Python SDK. Specifically, a web front end uploads an audio file to a FastAPI backend, which then uses Whisper to transcribe it and sends the transcription together with the…
Create or join a resource group
I would like to create or join a resource group so that I can create a Speech resource and get access to the text-to-speech tool. It is key for HP University.
Custom external lexicon does not work when calling the TTS speech synthesis service using the Java SDK
We don't want the * sign to be spoken, so we set up a custom lexicon, but the synthesized speech doesn't seem to be affected by the lexicon. <speak xmlns="http://www.w3.org/2001/10/synthesis" …
How to synchronize real-world events with individual spoken words while speech recognition is happening
I am trying to synchronize real-world events that occur during live streaming of speech to Azure speech recognition services (e.g., eye-gaze shifts, hardware device interactions, etc.). I note the time when I start speech recognition and record…
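One detail that helps here: in Speech SDK results, each word's offset and duration are expressed in 100-nanosecond ticks relative to the start of the audio stream, so they can be mapped onto wall-clock time from the instant recognition started. A small sketch follows; note that "recognition started" and the first captured audio sample are not perfectly aligned, so expect a roughly constant skew you may need to calibrate out.

```python
TICKS_PER_SECOND = 10_000_000  # Speech SDK offsets/durations are 100-ns ticks


def word_wall_clock(recognition_start_epoch, offset_ticks, duration_ticks):
    """Map a recognized word's offset/duration (ticks relative to the
    start of the audio stream) onto wall-clock epoch seconds, given the
    epoch time at which recognition of the stream began.

    Returns (word_start_epoch, word_end_epoch) in seconds.
    """
    start = recognition_start_epoch + offset_ticks / TICKS_PER_SECOND
    end = start + duration_ticks / TICKS_PER_SECOND
    return start, end
```

Once each word has an epoch timestamp, it can be matched against independently timestamped events (gaze shifts, device interactions) by simple interval overlap.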
SpeechSynthesizer sometimes plays speech depending on SpeechSynthesisOutputFormat
In a C# WPF application, I call this function to convert text to speech: SpeechSynthesisResult speechSynthesisResult = await speechSynthesizer.SpeakSsmlAsync(strSsml); The audio data is returned OK, but the function also sometimes plays the speech as…
Pronunciation Assessment: Inconsistent Results
Hi, I'm experiencing very inconsistent results with the pronunciation assessment SDK for the same audio file when using different regions. I have tested the swedencentral and westeurope regions. I tested them in different languages, and the results…
Internal Server Error when running evaluation on Custom Speech
I trained a speech-to-text model on Azure and tried running an evaluation on a test set, and I'm getting "Token error rate results are not applicable for some old tests." in Speech Studio. A few weeks ago the same test wasn't giving any issues.