Logs Not Generated for Custom Model in Azure Speech Services
Why are the logs not being generated for my custom speech model in Azure Speech Services, and how can I ensure that transcription is using the custom model? PS - Based on common issues that we have seen from customers and other sources, we are posting these questions to help the Azure community.
Training German Audio Files in Azure Custom Speech Fails
What is causing the failure when trying to train German audio files with human-made transcripts in Azure Custom Speech?
How to Monitor Speech Service Models and Understand Rate Limits in Azure
How can I monitor my Azure Speech Service models, enable alerts, and understand the rate limits for real-time transcription and other speech resources?
Ensuring Uninterrupted Speech Services During Azure Failover Scenarios
How can I ensure that speech recognition and synthesis services continue without interruption during an Azure failover scenario?
Fixing Premature Session Stopping in Azure Speech SDK for Long Audios
What should I do if I receive a "Session Stopped" message before the end of my audio file when using Azure Speech SDK?
I want to use the API
I want to use this API. When will I be able to use it? https://learn.microsoft.com/zh-cn/azure/ai-services/speech-service/fast-transcription-create
About speaker separation in "fast-transcription-api"
Dear Azure Support Team, regarding https://learn.microsoft.com/en-us/rest/api/speechtotext/transcriptions/transcribe?view=rest-speechtotext-2024-05-15-preview&tabs=HTTP: the details of the TranscribeDefinition class are not described anywhere, so how should I do…
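As a sketch of how the fast transcription request is typically assembled: the endpoint takes a multipart form with an audio part and a JSON "definition" part. The `locales` field is documented; the `diarization` block below is an assumption based on the preview documentation (its exact field names are what this question is asking about), so verify it against the current API reference before relying on it.

```python
import json

# Sketch: build the JSON "definition" part for the fast transcription API
# (POST {endpoint}/speechtotext/transcriptions:transcribe?api-version=2024-05-15-preview).
# "locales" is the documented language hint; the "diarization" shape is an
# ASSUMPTION and may differ from the actual TranscribeDefinition schema.
def build_transcribe_definition(locales, max_speakers=None):
    definition = {"locales": locales}
    if max_speakers is not None:
        # Assumed field names for speaker separation; verify against the docs.
        definition["diarization"] = {"enabled": True, "maxSpeakers": max_speakers}
    return json.dumps(definition)

# The full request is multipart/form-data: an "audio" file part plus this
# "definition" part, authenticated with an Ocp-Apim-Subscription-Key header.
definition = build_transcribe_definition(["en-US", "ja-JP"], max_speakers=2)
```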
Transcription Denormalization
Is there a way to "denormalize" Azure speech transcription, so it provides verbatim transcription (as close as possible, with word fillers, hesitations, repeats, etc)? I will also need word level timestamping and diarization. I am hoping there…
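For reference, word-level timestamps and diarization can be requested through the batch transcription REST API. The payload below is only a sketch: the property names follow the v3.x Speech to text REST API and should be verified against the current reference, and the content URL is a placeholder. Fully verbatim output (fillers, hesitations, repeats) has no documented switch; the `lexical` field of each recognized phrase in the result file is the closest, unformatted form.

```python
import json

# Sketch of a batch transcription request body
# (POST {endpoint}/speechtotext/v3.2/transcriptions).
# Property names follow the v3.x REST API; verify before use.
payload = {
    "displayName": "verbatim-style transcription",
    "locale": "en-US",
    "contentUrls": ["https://example.com/audio.wav"],  # placeholder URL
    "properties": {
        "wordLevelTimestampsEnabled": True,   # word timings in the result file
        "diarizationEnabled": True,           # speaker labels for mono audio
        "punctuationMode": "DictatedAndAutomatic",
        "profanityFilterMode": "None",        # keep words as spoken
    },
}
body = json.dumps(payload)
```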
Improving Accuracy of Azure Speech-to-Text with Continuous Language Identification
How can I improve the accuracy of language identification and speech-to-text (STT) capabilities in Azure Speech Service for my voice bot, which is experiencing issues with detecting English language and picking up background noise?
Issues with Recognizing Mixed Thai and English Audio in Azure Speech Service
How can I improve the accuracy of recognizing mixed Thai and English audio using Azure Speech Service?
Enhancing Multilingual Transcription Accuracy with Azure Speech Service
What steps can I take to improve the transcription accuracy of audio files that contain multiple languages using Azure Speech Service?
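As a sketch, batch transcription supports supplying candidate locales for automatic language identification, which also applies to the mixed Thai/English case above. The `languageIdentification` property name is taken from the v3.x REST API and should be checked against the current docs; the content URL is a placeholder.

```python
import json

# Sketch: batch transcription with automatic language identification.
# "languageIdentification.candidateLocales" follows the v3.x REST API;
# the service chooses among these locales when transcribing.
payload = {
    "displayName": "multilingual transcription",
    "locale": "en-US",  # fallback locale
    "contentUrls": ["https://example.com/mixed-language.wav"],  # placeholder
    "properties": {
        "languageIdentification": {
            "candidateLocales": ["en-US", "th-TH"]
        }
    },
}
body = json.dumps(payload)
```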
How to Associate Client-Side Live Transcription Sessions with Logged Audio Files in Azure AI Services
How can I associate client-side live transcription sessions with the logged audio files in Azure AI services?
Troubleshooting WebRTC Disconnection in Avatar Service
What could be causing the disconnection issue when trying to start an avatar using WebRTC in Azure Cognitive Services-Speech Services, and how can it be resolved?
Resolving Segmentation Fault with Azure Speech SDK and jemalloc
What steps can I take to resolve a segmentation fault when using jemalloc with the Azure Speech SDK in a Java application?
503 Error When Downloading Azure TTS License for Disconnected Containers
What should I do if I am unable to download the TTS license for Azure speech disconnected containers and encounter a 503 error code?
Resolving FetchDataError in Azure Speech Service
What steps should be taken when encountering a FetchDataError in the Azure Speech Service, causing a fatal error when accessing features?
Limitation on Text-to-Speech Audio Length in Azure Cognitive Services
How can I generate audio files longer than 10 minutes using Azure Cognitive Services' Text-to-Speech API?
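The 10-minute limit applies to a single real-time synthesis request; longer audio is usually produced with the batch synthesis REST API instead. The payload below is a sketch: field names (`inputKind`, `inputs`, `synthesisConfig`) follow the batch synthesis API but should be verified against the current reference, and the voice name is just an example.

```python
import json

# Sketch of a batch synthesis request body
# (PUT {endpoint}/texttospeech/batchsyntheses/{job-id}?api-version=...).
# Field names follow the batch synthesis REST API; verify before use.
payload = {
    "inputKind": "PlainText",
    "inputs": [
        {"content": "A long script that would exceed the real-time limit..."}
    ],
    "synthesisConfig": {
        "voice": "en-US-JennyNeural"  # example voice name
    },
}
body = json.dumps(payload)
```

The job runs asynchronously; the result is polled via GET on the same job URL and downloaded when it reaches a succeeded state.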
Unable to Save Lexicon in Azure Cognitive Services
Why am I unable to save a lexicon in Azure Cognitive Services, and how can I resolve this issue?
Do you have any suggestions or assistance for using the speech-to-text function to recognize homophones that may cause errors?
For example, in Chinese, "枯" (kū) is recognized as "哭" (kū). The surrounding context cannot disambiguate them, so this is just a probabilistic issue.
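Phrase lists and custom speech training are the service-side options for biasing recognition. As a lightweight client-side mitigation (a plain substitution pass, not a Speech service feature), known homophone confusions can be corrected after transcription; the mapping below is a hypothetical example for 枯/哭-style errors.

```python
# Sketch: client-side post-processing for known homophone confusions.
# This is plain text substitution, not a Speech service feature; the
# mapping is a hypothetical example and should be built from observed errors.
CORRECTIONS = {
    "哭木逢春": "枯木逢春",  # hypothetical misrecognition -> intended phrase
}

def fix_homophones(text: str) -> str:
    """Replace each known misrecognized phrase with the intended one."""
    for wrong, right in CORRECTIONS.items():
        text = text.replace(wrong, right)
    return text
```

Matching multi-character phrases rather than single characters keeps the substitution from corrupting legitimate uses of the homophone.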
How can text-to-speech read English words aloud syllable by syllable? The purpose is to make videos for memorizing English words.
The voices seem to be optimized for reading complete sentences, but they cannot read words out syllable by syllable.
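One workaround is to insert SSML breaks between manually hyphenated syllables so the voice pauses between them. The snippet below builds such an SSML document; the voice name is only an example, and the syllable split is supplied by the caller, since neural voices have no documented "read by syllable" mode.

```python
# Sketch: build SSML that pauses between caller-supplied syllables, as a
# workaround for voices with no syllable-by-syllable reading mode.
def syllable_ssml(syllables, voice="en-US-JennyNeural", pause_ms=400):
    """Join syllables with SSML <break> pauses inside a <voice> element."""
    spoken = f'<break time="{pause_ms}ms"/>'.join(syllables)
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        'xml:lang="en-US">'
        f'<voice name="{voice}">{spoken}</voice>'
        "</speak>"
    )

ssml = syllable_ssml(["in", "ter", "est", "ing"])
```

Passing the resulting string to a speak-SSML call (rather than plain text) makes the service honor the pauses; `<phoneme>` tags can be layered in if individual syllables are mispronounced.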