Does Azure provide speech-to-text medical/healthcare pretrained model?

Ubaid Ghante 0 Reputation points
2024-10-08T17:05:05.09+00:00

Hi,

I am developing a speech-to-text application using Azure Speech Service for physicians, which requires high accuracy in recognizing medical terminology and achieving a low WER.

I’ve encountered a couple of challenges:

  1. Fine-tuning the base Whisper model in Azure Speech Studio took nearly 4 hours for a 3 KB file containing 1000 utterances.
  2. The phrase list feature has a limit of 500 phrases, which restricts customization for medical terms.

Azure provides a medical NER model in Language Studio—does Azure offer, or can I request, a medical-specific speech-to-text model that is better suited for healthcare use cases?

Thank you!

Azure AI Speech
Azure AI Speech
An Azure service that integrates speech processing into apps and services.
1,976 questions
Azure Machine Learning
Azure Machine Learning
An Azure machine learning service for building and deploying models.
3,232 questions
Azure AI services
Azure AI services
A group of Azure services, SDKs, and APIs designed to make apps more intelligent, engaging, and discoverable.
3,350 questions
0 comments No comments
{count} votes

1 answer

Sort by: Most helpful
  1. Sina Salam 18,961 Reputation points
    2024-10-08T19:57:01.7933333+00:00

    Hello Ubaid Ghante,

    Welcome to the Microsoft Q&A and thank you for posting your questions here.

    I understand that you would like to know if Azure provides speech-to-text medical/healthcare pretrained model.

    Yes, Azure does offer solutions that can help with your requirements for a medical-specific speech-to-text model. If you would like to create custom speech models or there is a need for phrase list limitation, use this link to learn more https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-to-text Though, Azure does not explicitly mention a dedicated pretrained medical speech-to-text model. But you can leverage the custom speech model capabilities to train a model specifically for medical use cases.

    Also, for Azure's Text Analytics for Health, part of the Language Studio, provides powerful natural language processing (NLP) capabilities to detect and identify medical terms in text. This also is more focused on text analysis rather than direct speech-to-text conversion. You can read more here: https://azure.microsoft.com/en-us/blog/expanding-ai-technology-for-unstructured-text-beyond-english

    I hope this is helpful! Do not hesitate to let me know if you have any other questions.


    Please don't forget to close up the thread here by upvoting and accept it as an answer if it is helpful.

    0 comments No comments

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.