Questions about the Speech to Text service

Lewis Liu 0 Reputation points
2025-03-16T07:30:56.83+00:00

Dear Azure Support Team,

I hope this message finds you well. I am writing to inquire about the usage and accuracy of the Azure Speech to Text service. Please share your recommendations for the following cases:

  1. Common Type Recognition/Pronunciation: Can Azure TTS accurately recognize and pronounce common types like dates, times, and abbreviations?
  2. Turn Detection: Does Azure TTS support turn detection to detect when a user has finished speaking, or is it solely based on Voice Activity Detection (VAD)?
  3. Accents: Does Azure TTS offer customization options for regional accents or dialects within a language?
  4. Name Recognition: How does Azure TTS handle the pronunciation of uncommon or region-specific names?
  5. Noise: Does Azure TTS include noise suppression features for clearer audio output?
  6. Multilingual Support and Mixed Mode: Can Azure TTS seamlessly handle multilingual text within a single input?
  7. Streaming Support: Does Azure TTS support real-time streaming of synthesized speech, and what are the latency considerations?
  8. Custom Term/Vocabulary: Is it possible to customize terms or vocabulary to ensure accurate pronunciation of specialized or domain-specific words?

Thanks in advance.

Lewis Liu

Azure AI Speech
An Azure service that integrates speech processing into apps and services.

2 answers

  1. Abiola Akinbade 29,490 Reputation points Volunteer Moderator
    2025-03-16T10:36:02.7733333+00:00

    Hello Lewis Liu,

    Thanks for your question.

    See the overview of the text to speech service here: https://docs.azure.cn/en-us/ai-services/speech-service/text-to-speech

    1. Common Type Recognition/Pronunciation: Yes, Azure TTS can recognize and pronounce common types such as dates, times, and abbreviations.
    2. Turn Detection: No, Azure TTS does not support turn detection.
    3. Accents: Yes, Azure TTS offers a variety of regional accents and dialects within supported languages. See https://techcommunity.microsoft.com/blog/azure-ai-services-blog/azure-ai-speech-text-to-speech-feb-2025-updates-new-hd-voices-and-more/4387263
    4. Name Recognition: Yes, it allows customization to handle the pronunciation of uncommon or region-specific names.
    5. Noise: TTS generates clean synthesized audio, so noise suppression is not a concern for its output. Background noise matters on the audio input side, that is, for speech recognition rather than synthesis.
    6. Multilingual Support and Mixed Mode: Yes, it can handle multilingual text within a single input.
    7. Streaming Support: Yes, it supports real-time streaming of synthesized speech with low latency.
    8. Custom Term/Vocabulary: Yes, it allows customization of terms or vocabulary.
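
To make points 1, 4, 6, and 8 above more concrete, here is a minimal sketch of how pronunciation hints are usually expressed in SSML: `say-as` for dates, `sub` for abbreviations or specialized terms, and `lang` for a language switch inside one input. The function name `build_ssml`, the sample sentence, and the voice name are assumptions chosen for illustration; the code only builds the SSML string and does not call the Speech service.

```python
from xml.sax.saxutils import escape, quoteattr

# Assumed voice name; substitute any multilingual neural voice available to you.
VOICE = "en-US-AvaMultilingualNeural"


def build_ssml(date_mdy: str, abbreviation: str, spoken_form: str,
               foreign_text: str, foreign_lang: str) -> str:
    """Build an SSML document with pronunciation hints (hypothetical helper)."""
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">'
        f'<voice name={quoteattr(VOICE)}>'
        # say-as tells the voice to read the text as a month/day/year date.
        'Your appointment is on '
        f'<say-as interpret-as="date" format="mdy">{escape(date_mdy)}</say-as>. '
        # sub replaces the written form with the alias when speaking.
        'Please contact '
        f'<sub alias={quoteattr(spoken_form)}>{escape(abbreviation)}</sub>. '
        # lang switches the speaking language for part of the input.
        f'<lang xml:lang={quoteattr(foreign_lang)}>{escape(foreign_text)}</lang>'
        '</voice></speak>'
    )


if __name__ == "__main__":
    print(build_ssml("03/16/2025", "W3C", "World Wide Web Consortium",
                     "Спасибо", "ru-RU"))
```

With the `azure-cognitiveservices-speech` package, a string like this could be passed to `SpeechSynthesizer.speak_ssml_async`; whether a given voice honors every element is something to verify against the voice you choose.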

    You can mark it 'Accept Answer' and 'Upvote' if this helped you

    (Please note: If you have Priority Community support please wait for a dedicated Microsoft support representative to assist you, as they have access to the necessary backend resources. If you have not yet opened a support case, we recommend reaching out through the support channel available under your subscription level.)

    Regards,

    Abiola


  2. VSawhney 800 Reputation points Microsoft External Staff Moderator
    2025-03-20T08:12:49.7+00:00

    Hello Lewis Liu,

    You can try optimizing the above-mentioned areas as follows:

    1. Name Recognition: Sometimes, the system makes mistakes in recognizing names within email addresses. For example, "******@outlook.com" is often returned as "louis dot leo at outlook.com" or "******@outlook.com." How can I optimize this?
      Solution: In order to improve name recognition, you can use a custom model trained specifically on email address data. Additionally, implementing a post-processing step to validate and correct recognized names based on common patterns in email addresses can help reduce errors.
      Ref. doc: https://learn.microsoft.com/en-us/azure/ai-services/language-service/custom-named-entity-recognition/overview?form=MG0AV3
    2. Noise: My application operates in street environments with significant background noise. The base model does not perform well in these conditions. Should I use a custom model trained with noise data?
      Solution: Yes, using a custom model trained with noise data is a great approach. You can collect audio samples from street environments and use them to train a custom model. This will help the system adapt to noisy conditions and improve its performance.
      Ref. doc: https://learn.microsoft.com/en-us/azure/ai-services/speech-service/professional-voice-create-training-set?pivots=speech-studio
    3. Multilingual Support and Mixed Mode: Could you please share the link on how to enable Multilingual Support? My application may involve Chinese, English, and Russian, and sometimes multiple languages may appear in a single sentence.
      Solution: Azure AI provides multilingual support for various languages, including Chinese, English, and Russian. You can find detailed information on enabling multilingual support in the reference documentation below. For mixed-mode scenarios, ensure your model is trained with examples that include multiple languages in a single sentence.
      Ref. doc: https://learn.microsoft.com/en-us/azure/ai-services/document-intelligence/language-support/custom?view=doc-intel-4.0.0&form=MG0AV3&tabs=printed
    4. Custom Term/Vocabulary: Yes, the system allows customization of terms or vocabulary. Should I use a phrase list or a custom model to address this?
      Solution: If your application requires specific terms or vocabulary, using a phrase list can be effective for simple scenarios. However, for more complex requirements, a custom model trained with your specific terms and vocabulary will provide better results.
      Ref. doc: https://learn.microsoft.com/en-us/azure/ai-services/speech-service/improve-accuracy-phrase-list?form=MG0AV3&tabs=terminal&pivots=programming-language-csharp
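
The post-processing step suggested for email addresses in point 1 could look roughly like the sketch below. It is only an illustration under stated assumptions: the function name `normalize_spoken_email`, the spoken-token table, and the validation pattern are hypothetical and not part of any Azure API. The idea is to map spoken tokens such as "dot" and "at" back to symbols, then accept the result only if it looks like an email address.

```python
import re
from typing import Optional

# Spoken words mapped to the symbols they stand for (illustrative, not exhaustive).
SPOKEN_SYMBOLS = {
    "at": "@",
    "dot": ".",
    "underscore": "_",
    "dash": "-",
    "hyphen": "-",
}

# Deliberately simple shape check: local part, "@", domain with at least one dot.
EMAIL_RE = re.compile(r"^[a-z0-9._%+-]+@[a-z0-9-]+(\.[a-z0-9-]+)+$")


def normalize_spoken_email(transcript: str) -> Optional[str]:
    """Turn a spoken-form email transcript into an address, or None if invalid."""
    # Lowercase and drop sentence punctuation the recognizer may append.
    cleaned = transcript.strip().lower().rstrip(".,!?")
    # Replace spoken symbol words and join the pieces without spaces.
    candidate = "".join(SPOKEN_SYMBOLS.get(tok, tok) for tok in cleaned.split())
    return candidate if EMAIL_RE.match(candidate) else None


if __name__ == "__main__":
    print(normalize_spoken_email("louis dot leo at outlook.com"))
```

This only repairs the formatting of the address. A misheard name still needs either a phrase list or custom model on the recognition side, or a comparison against addresses your application already knows.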

    I hope this resolves your difficulties. Please feel free to reach out if you have further questions.

