Does Azure provide a pretrained speech-to-text model for medical/healthcare use?

Ubaid Ghante 0 Reputation points
2024-10-08T17:05:05.09+00:00

Hi,

I am developing a speech-to-text application using Azure Speech Service for physicians, which requires high accuracy in recognizing medical terminology and achieving a low WER.

I’ve encountered a couple of challenges:

  1. Fine-tuning the base Whisper model in Azure Speech Studio took nearly 4 hours for a 3 KB file containing 1000 utterances.
  2. The phrase list feature has a limit of 500 phrases, which restricts customization for medical terms.

Azure provides a medical NER model in Language Studio—does Azure offer, or can I request, a medical-specific speech-to-text model that is better suited for healthcare use cases?

Thank you!

Azure Speech in Foundry Tools
Azure Machine Learning
Foundry Tools

1 answer

  1. Sina Salam 29,101 Reputation points Volunteer Moderator
    2024-10-08T19:57:01.7933333+00:00

    Hello Ubaid Ghante,

    Welcome to the Microsoft Q&A and thank you for posting your questions here.

    I understand that you would like to know whether Azure provides a pretrained speech-to-text model for medical/healthcare use.

    Azure's documentation does not explicitly mention a dedicated pretrained medical speech-to-text model. However, you can use the custom speech capabilities to train a model specifically for medical use cases. To learn more about creating custom speech models and working with phrase lists, see https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-to-text
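
    Since your goal is a low WER, it may help to measure WER on your own medical recordings before and after training a custom model. Below is a minimal sketch in plain Python, assuming the usual definition WER = (substitutions + deletions + insertions) / number of reference words. The function name `word_error_rate` and the sample sentences are illustrative only and are not part of any Azure SDK.

    ```python
    def word_error_rate(reference: str, hypothesis: str) -> float:
        """Return WER = (substitutions + deletions + insertions) / reference word count."""
        # Simple normalization (lowercase, whitespace split); this is an assumption,
        # real evaluations often also strip punctuation and normalize numbers.
        ref = reference.lower().split()
        hyp = hypothesis.lower().split()
        if not ref:
            raise ValueError("reference transcript must contain at least one word")

        # Word-level Levenshtein distance, keeping only two rows of the table.
        prev = list(range(len(hyp) + 1))  # distance from an empty reference
        for i, ref_word in enumerate(ref, start=1):
            curr = [i]  # distance to an empty hypothesis: i deletions
            for j, hyp_word in enumerate(hyp, start=1):
                cost = 0 if ref_word == hyp_word else 1
                curr.append(min(
                    prev[j] + 1,         # deletion (reference word missing)
                    curr[j - 1] + 1,     # insertion (extra hypothesis word)
                    prev[j - 1] + cost,  # match or substitution
                ))
            prev = curr
        return prev[-1] / len(ref)


    # Hypothetical example: "atrial" misrecognized as "a trial"
    # (one substitution plus one insertion against four reference words).
    wer = word_error_rate(
        "patient has atrial fibrillation",
        "patient has a trial fibrillation",
    )
    print(f"WER: {wer:.2f}")
    ```

    Running the same held-out set of physician recordings through the base model and the custom model, then comparing the two scores, shows whether the customization actually improves recognition of medical terms.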

    Also, Azure's Text Analytics for Health, available in Language Studio, provides natural language processing (NLP) capabilities to detect and identify medical terms in text. Note that it is focused on text analysis rather than direct speech-to-text conversion. You can read more here: https://azure.microsoft.com/en-us/blog/expanding-ai-technology-for-unstructured-text-beyond-english

    I hope this is helpful! Do not hesitate to let me know if you have any other questions.

    Please don't forget to close the thread by upvoting this answer and accepting it if it was helpful.