How to create Multiple Keywords in Speech Studio

dpaul 31 Reputation points

I am learning to use the Azure Speech SDK. I have used Speech Studio to create a .table file that contains a keyword so my app can recognize it.

I want my app to recognize multiple keywords. How do you do this with Speech Studio? It appears that a "Custom Keyword" project in Speech Studio will only take one word or phrase.
What is the solution when you want 5 keywords? Do you create 5 keyword model .table files and assign them to 5 separate KeywordRecognizer objects?

In .NET, the SpeechRecognitionEngine class lets you load a list of Grammars representing the commands/words you want it to recognize. But I can't determine the correct way to do this with the Azure Speech SDK.


Azure AI Speech
An Azure service that integrates speech processing into apps and services.

1 answer

  1. Ramr-msft 17,611 Reputation points

    @dpaul Thanks for the question. Can you please add more details about the use case you are trying to implement?
    Here is a link to the documentation for Custom Keyword. Currently, custom keyword models, and the resulting .table files, can only be created in Speech Studio. You cannot create custom keywords from the SDK or with REST calls.
    If you use the Speech SDK, it is true that at the moment a KeywordRecognizer object can only support one model file (one keyword), but there is likely a way to work around it. What programming language will you be using? You can try creating multiple KeywordRecognizer objects, each one created with a different keyword model. Then, when one of them "fires" (a keyword is recognized), you can get the audio stream from that recognizer's result and feed it to a new SpeechRecognizer.

    We don't have a sample showing how to do that. But assuming C#, it would be a combination of a few KeywordRecognizers, with audio extracted from the result as shown in the short sample "KeywordRecognizer()", together with a sample showing how to do speech recognition from an input audio stream, for example "RecognitionWithPushAudioStreamAsync()".

    The key phrase extraction feature can evaluate unstructured text, and for each document, return a list of key phrases.
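
The multi-recognizer workaround suggested in the answer might look roughly like the sketch below. It is written in Python only for brevity (the Speech SDK exposes the same classes in C#), and it is an untested sketch, not an official sample. The helper names `build_recognizers` and `wait_for_first_keyword` are hypothetical, and whether several KeywordRecognizer objects can listen on the same default microphone at once is an assumption that should be checked on the target platform.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def build_recognizers(table_files):
    """Create one KeywordRecognizer per .table file.

    table_files maps a label of your choosing to a .table path
    (both hypothetical, e.g. {"lights": "lights.table"}).
    """
    # Imported lazily so the waiting logic below can be tried without the SDK.
    import azure.cognitiveservices.speech as speechsdk

    pairs = {}
    for name, path in table_files.items():
        model = speechsdk.KeywordRecognitionModel(filename=path)
        # Assumption: every recognizer can open the default microphone itself.
        audio = speechsdk.audio.AudioConfig(use_default_microphone=True)
        pairs[name] = (speechsdk.KeywordRecognizer(audio_config=audio), model)
    return pairs


def _recognize(recognizer, model):
    # recognize_once_async returns a future-like object; get() blocks until done.
    return recognizer.recognize_once_async(model).get()


def wait_for_first_keyword(pairs, timeout=None):
    """Start every recognizer; return (name, result) for the first to finish.

    Returns None if nothing finishes within `timeout` seconds.
    """
    pool = ThreadPoolExecutor(max_workers=max(1, len(pairs)))
    futures = {
        pool.submit(_recognize, rec, model): name
        for name, (rec, model) in pairs.items()
    }
    done, _ = wait(futures, timeout=timeout, return_when=FIRST_COMPLETED)
    winner = next(iter(done), None)

    # Stop the recognizers that did not fire so their worker threads can exit.
    for future, name in futures.items():
        if future is not winner:
            pairs[name][0].stop_recognition_async().get()
    pool.shutdown(wait=False)

    if winner is None:
        return None
    return futures[winner], winner.result()
```

A caller would pass the output of `build_recognizers` to `wait_for_first_keyword`, then check that the returned result's `reason` is `ResultReason.RecognizedKeyword` before treating it as a hit, since a recognizer can also finish because it was cancelled. The audio that follows the keyword can then be taken from that result and passed to a SpeechRecognizer, as the answer describes.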