Hello, thank you for reaching out to us here.
I think you are referring to two scenarios:
- Multi-device conversation: https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/multi-device-conversation
- Real-time conversation transcription
Please correct me if I misunderstood, since the first link you shared is the documentation for the general Speech to Text service.
For the Multi-device conversation feature, every participant joins with the conversation's ID; this feature uses the Speech service's default model.
Real-time conversation transcription creates voice signatures for the conversation participants so that they can be identified as unique speakers, but this is not necessary if you don't want to pre-enroll users.
Both of them use the Speech SDK's default models, but real-time conversation transcription adds speaker identification, which may improve the results.
If you want to train a model on your own data set, I think you are referring to Custom Speech. With Custom Speech, you can train and deploy your own model using your data set.
Hope this helps! Please let us know if you have more questions.
Please accept the answer if you found it helpful, thank you.
My first requirement is that Speech can recognize English words (for example IT terms: Jira, Confluence, Azure, etc.) in a Hungarian context, where the main language is Hungarian.
Second, I want to use the "Speech Studio" model in the "multi-device" scenario, both as host and as participant (I mean, they could access the actual model from Speech Studio instead of the default one, for better results; it would make it easier for deaf users to understand the situation).