Hello @Payne 谢!
Welcome to Microsoft Q&A!
You can use Azure Custom Speech for this:
https://learn.microsoft.com/en-us/azure/ai-services/speech-service/custom-speech-overview
From the documentation:
"With Custom Speech, you can evaluate and improve the accuracy of speech recognition for your applications and products. A custom speech model can be used for real-time speech to text, speech translation, and batch transcription.
Out of the box, speech recognition utilizes a Universal Language Model as a base model that is trained with Microsoft-owned data and reflects commonly used spoken language. The base model is pre-trained with dialects and phonetics representing various common domains. When you make a speech recognition request, the most recent base model for each supported language is used by default. The base model works well in most speech recognition scenarios.
A custom model can be used to augment the base model to improve recognition of domain-specific vocabulary specific to the application by providing text data to train the model. It can also be used to improve recognition based on the specific audio conditions of the application by providing audio data with reference transcriptions."
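Once you have trained and deployed a custom model, pointing the Speech SDK at it could look roughly like the sketch below. This is only an illustration, assuming the Python package azure-cognitiveservices-speech is installed and that you have a 16 kHz mono WAV file to test with. The CustomSpeechSettings class, the transcribe_once helper, and the environment variable names are hypothetical names I made up for the example; the key, region, and endpoint ID must come from your own Speech resource and deployment.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CustomSpeechSettings:
    """Hypothetical container for the values your own deployment provides."""
    key: str              # Speech resource key
    region: str           # Speech resource region, e.g. "westus"
    endpoint_id: str      # ID of the deployed custom model endpoint
    language: str = "en-US"


def transcribe_once(settings: CustomSpeechSettings, wav_path: str) -> Optional[str]:
    """Recognize a single utterance from a WAV file using the custom endpoint.

    Returns the recognized text, or None if nothing was recognized.
    """
    # Imported lazily so the module loads even where the SDK is not installed.
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(
        subscription=settings.key, region=settings.region
    )
    speech_config.speech_recognition_language = settings.language
    # Without an endpoint ID the request goes to the default base model.
    speech_config.endpoint_id = settings.endpoint_id

    audio_config = speechsdk.audio.AudioConfig(filename=wav_path)
    recognizer = speechsdk.SpeechRecognizer(
        speech_config=speech_config, audio_config=audio_config
    )

    result = recognizer.recognize_once()
    if result.reason == speechsdk.ResultReason.RecognizedSpeech:
        return result.text
    return None


# Example call (assumed environment variable names, adjust to your setup):
#   import os
#   settings = CustomSpeechSettings(
#       key=os.environ["SPEECH_KEY"],
#       region=os.environ["SPEECH_REGION"],
#       endpoint_id=os.environ["CUSTOM_SPEECH_ENDPOINT_ID"],
#   )
#   print(transcribe_once(settings, "sample.wav"))
```

If you leave out the endpoint ID, the same code falls back to the base model, which makes it easy to compare the two on the same audio.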
I hope this helps!
If this answer helped, please mark it as Accepted and upvote it!
Regards