Quickstart: Create a custom keyword

Reference documentation | Package (NuGet) | Additional Samples on GitHub

In this quickstart, you learn the basics of working with custom keywords. A keyword is a word or short phrase that allows your product to be voice activated. You create keyword models in Speech Studio, and then export a model file that you use with the Speech SDK in your applications.

Prerequisites

Create a keyword in Speech Studio

Before you can use a custom keyword, you need to create one on the Custom Keyword page in Speech Studio. After you provide a keyword, Speech Studio produces a .table file that you can use with the Speech SDK.

Important

Custom keyword models, and the resulting .table files, can only be created in Speech Studio. You cannot create custom keywords from the SDK or with REST calls.

  1. Go to Speech Studio and sign in. If you don't have a Speech subscription, go to Create Speech Services.

  2. On the Custom Keyword page, select Create a new project.

  3. Enter a Name, Description, and Language for your custom keyword project. You can only choose one language per project, and support is currently limited to English (United States) and Chinese (Mandarin, Simplified).

    Describe your keyword project

  4. Select your project's name from the list.

    Select your keyword project.

  5. To create a custom keyword for your virtual assistant, select Create a new model.

  6. Enter a Name for the model, Description, and Keyword of your choice, then select Next. See the guidelines on choosing an effective keyword.

    Enter your keyword

  7. The portal creates candidate pronunciations for your keyword. Listen to each candidate by selecting the play buttons, and remove the checks next to any pronunciations that are incorrect. Select all pronunciations that correspond to how you expect your users to say the keyword, and then select Next to begin generating the keyword model.

    Screenshot that shows where you choose the correct pronunciations.

  8. Select a model type, then select Create. You can view a list of regions that support the Advanced model type in the Keyword recognition region support documentation.

  9. It can take up to 30 minutes to generate the model. The keyword's entry in the list changes from Processing to Succeeded when the model is complete.

    Review your keyword.

  10. From the collapsible menu on the left, select Tune for options to tune and download your model. The downloaded file is a .zip archive. Extract the archive, and you see a file with the .table extension. You use the .table file with the SDK, so make sure to note its path.

    Download your model table.

Use a keyword model with the Speech SDK

First, load your keyword model file by using the static FromFile() function, which returns a KeywordRecognitionModel. Use the path to the .table file that you downloaded from Speech Studio. Then create an AudioConfig that uses the default microphone, and instantiate a new KeywordRecognizer with the audio configuration.

using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

var keywordModel = KeywordRecognitionModel.FromFile("your/path/to/Activate_device.table");
using var audioConfig = AudioConfig.FromDefaultMicrophoneInput();
using var keywordRecognizer = new KeywordRecognizer(audioConfig);

Next, run keyword recognition with a single call to RecognizeOnceAsync(), passing your model object. This call starts a keyword recognition session that lasts until the keyword is recognized. Because the call blocks until then, you generally use this design pattern in multithreaded applications, or in use cases where you might wait for a wake word indefinitely.

KeywordRecognitionResult result = await keywordRecognizer.RecognizeOnceAsync(keywordModel);
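One way to act on the result is to check its Reason property; ResultReason.RecognizedKeyword indicates that the keyword was detected. A minimal sketch, continuing from the keywordRecognizer and keywordModel objects created above:

```csharp
using System;
using Microsoft.CognitiveServices.Speech;

// Block until the keyword is heard, then inspect the result.
KeywordRecognitionResult result = await keywordRecognizer.RecognizeOnceAsync(keywordModel);

if (result.Reason == ResultReason.RecognizedKeyword)
{
    Console.WriteLine($"Keyword recognized: {result.Text}");
    // Hand off to your app's speech recognition or assistant logic here.
}
```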

Note

The example shown here uses local keyword recognition, because it doesn't require a SpeechConfig object for authentication context and doesn't contact the back end. However, you can run both keyword recognition and verification by using a direct back-end connection.

Continuous recognition

Other classes in the Speech SDK support continuous recognition (for both speech and intent recognition) with keyword recognition. This allows you to use the same code you would normally use for continuous recognition, with the ability to reference a .table file for your keyword model.

For speech-to-text, follow the same design pattern shown in the recognize speech guide to set up continuous recognition. Then, replace the call to recognizer.StartContinuousRecognitionAsync() with recognizer.StartKeywordRecognitionAsync(KeywordRecognitionModel), and pass your KeywordRecognitionModel object. To stop continuous recognition with keyword recognition, use recognizer.StopKeywordRecognitionAsync() instead of recognizer.StopContinuousRecognitionAsync().
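As a sketch of that substitution (the subscription key, region, and .table path are placeholders, and the event handlers are illustrative):

```csharp
using System;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

var speechConfig = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");
using var audioConfig = AudioConfig.FromDefaultMicrophoneInput();
using var recognizer = new SpeechRecognizer(speechConfig, audioConfig);
var keywordModel = KeywordRecognitionModel.FromFile("your/path/to/Activate_device.table");

recognizer.Recognized += (s, e) =>
{
    if (e.Result.Reason == ResultReason.RecognizedKeyword)
        Console.WriteLine($"Keyword: {e.Result.Text}");
    else if (e.Result.Reason == ResultReason.RecognizedSpeech)
        Console.WriteLine($"Recognized: {e.Result.Text}");
};

// Start keyword-gated continuous recognition instead of StartContinuousRecognitionAsync().
await recognizer.StartKeywordRecognitionAsync(keywordModel);
Console.ReadLine(); // Keep listening until the user presses Enter.
await recognizer.StopKeywordRecognitionAsync();
```

Recognition stays idle until the keyword is spoken, after which speech following the keyword is recognized as usual.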

Intent recognition uses an identical pattern with the StartKeywordRecognitionAsync and StopKeywordRecognitionAsync functions.

Reference documentation | Package (NuGet) | Additional Samples on GitHub

The Speech SDK for C++ does support keyword recognition, but we haven't yet included a guide here. Please select another programming language to get started and learn about the concepts, or see the C++ reference and samples linked from the beginning of this article.

Reference documentation | Package (Go) | Additional Samples on GitHub

In this quickstart, you learn the basics of working with custom keywords. A keyword is a word or short phrase that allows your product to be voice activated. You create keyword models in Speech Studio, and then export a model file that you use with the Speech SDK in your applications.

Prerequisites

Set up the environment

Install the Speech SDK for Go.

Recognize speech from a microphone

Follow these steps to create a new Go module.

  1. Open a command prompt where you want the new module, and create a new file named speech-recognition.go.

  2. Replace the contents of speech-recognition.go with the following code.

    package main
    
    import (
    	"bufio"
    	"fmt"
    	"os"
    
    	"github.com/Microsoft/cognitive-services-speech-sdk-go/audio"
    	"github.com/Microsoft/cognitive-services-speech-sdk-go/speech"
    )
    
    func sessionStartedHandler(event speech.SessionEventArgs) {
    	defer event.Close()
    	fmt.Println("Session Started (ID=", event.SessionID, ")")
    }
    
    func sessionStoppedHandler(event speech.SessionEventArgs) {
    	defer event.Close()
    	fmt.Println("Session Stopped (ID=", event.SessionID, ")")
    }
    
    func recognizingHandler(event speech.SpeechRecognitionEventArgs) {
    	defer event.Close()
    	fmt.Println("Recognizing:", event.Result.Text)
    }
    
    func recognizedHandler(event speech.SpeechRecognitionEventArgs) {
    	defer event.Close()
    	fmt.Println("Recognized:", event.Result.Text)
    }
    
    func cancelledHandler(event speech.SpeechRecognitionCanceledEventArgs) {
    	defer event.Close()
    	fmt.Println("Received a cancellation: ", event.ErrorDetails)
    }
    
    func main() {
    	subscription := "YourSubscriptionKey"
    	region := "YourServiceRegion"
    
    	audioConfig, err := audio.NewAudioConfigFromDefaultMicrophoneInput()
    	if err != nil {
    		fmt.Println("Got an error: ", err)
    		return
    	}
    	defer audioConfig.Close()
    	speechConfig, err := speech.NewSpeechConfigFromSubscription(subscription, region)
    	if err != nil {
    		fmt.Println("Got an error: ", err)
    		return
    	}
    	defer speechConfig.Close()
    	speechRecognizer, err := speech.NewSpeechRecognizerFromConfig(speechConfig, audioConfig)
    	if err != nil {
    		fmt.Println("Got an error: ", err)
    		return
    	}
    	defer speechRecognizer.Close()
    	speechRecognizer.SessionStarted(sessionStartedHandler)
    	speechRecognizer.SessionStopped(sessionStoppedHandler)
    	speechRecognizer.Recognizing(recognizingHandler)
    	speechRecognizer.Recognized(recognizedHandler)
    	speechRecognizer.Canceled(cancelledHandler)
    	speechRecognizer.StartContinuousRecognitionAsync()
    	defer speechRecognizer.StopContinuousRecognitionAsync()
    	bufio.NewReader(os.Stdin).ReadBytes('\n')
    }
    
  3. In speech-recognition.go, replace YourSubscriptionKey with your Speech resource key, and replace YourServiceRegion with your Speech resource region.

Run the following commands to create a go.mod file that links to components hosted on GitHub:

go mod init speech-recognition
go get github.com/Microsoft/cognitive-services-speech-sdk-go

Now build and run the code:

go build
go run speech-recognition

Clean up resources

You can use the Azure portal or Azure Command Line Interface (CLI) to remove the Speech resource you created.

Reference documentation | Additional Samples on GitHub

The Speech SDK for Java does support keyword recognition, but we haven't yet included a guide here. Please select another programming language to get started and learn about the concepts, or see the Java reference and samples linked from the beginning of this article.

Reference documentation | Package (npm) | Additional Samples on GitHub | Library source code

The Speech SDK for JavaScript does not support keyword recognition. Please select another programming language or the JavaScript reference and samples linked from the beginning of this article.

Reference documentation | Package (Download) | Additional Samples on GitHub

In this quickstart, you learn the basics of working with custom keywords. A keyword is a word or short phrase that allows your product to be voice activated. You create keyword models in Speech Studio, and then export a model file that you use with the Speech SDK in your applications.

Prerequisites

Create a keyword in Speech Studio

Before you can use a custom keyword, you need to create one on the Custom Keyword page in Speech Studio. After you provide a keyword, Speech Studio produces a .table file that you can use with the Speech SDK.

Important

Custom keyword models, and the resulting .table files, can only be created in Speech Studio. You cannot create custom keywords from the SDK or with REST calls.

  1. Go to Speech Studio and sign in. If you don't have a Speech subscription, go to Create Speech Services.

  2. On the Custom Keyword page, select Create a new project.

  3. Enter a Name, Description, and Language for your custom keyword project. You can only choose one language per project, and support is currently limited to English (United States) and Chinese (Mandarin, Simplified).

    Describe your keyword project

  4. Select your project's name from the list.

    Select your keyword project.

  5. To create a custom keyword for your virtual assistant, select Create a new model.

  6. Enter a Name for the model, Description, and Keyword of your choice, then select Next. See the guidelines on choosing an effective keyword.

    Enter your keyword

  7. The portal creates candidate pronunciations for your keyword. Listen to each candidate by selecting the play buttons, and remove the checks next to any pronunciations that are incorrect. Select all pronunciations that correspond to how you expect your users to say the keyword, and then select Next to begin generating the keyword model.

    Screenshot that shows where you choose the correct pronunciations.

  8. Select a model type, then select Create. You can view a list of regions that support the Advanced model type in the Keyword recognition region support documentation.

  9. It can take up to 30 minutes to generate the model. The keyword's entry in the list changes from Processing to Succeeded when the model is complete.

    Review your keyword.

  10. From the collapsible menu on the left, select Tune for options to tune and download your model. The downloaded file is a .zip archive. Extract the archive, and you see a file with the .table extension. You use the .table file with the SDK, so make sure to note its path.

    Download your model table.

Use a keyword model with the Speech SDK

See the sample on GitHub for using your Custom Keyword model with the Objective C SDK.

Reference documentation | Package (Download) | Additional Samples on GitHub

In this quickstart, you learn the basics of working with custom keywords. A keyword is a word or short phrase that allows your product to be voice activated. You create keyword models in Speech Studio, and then export a model file that you use with the Speech SDK in your applications.

Prerequisites

Create a keyword in Speech Studio

Before you can use a custom keyword, you need to create one on the Custom Keyword page in Speech Studio. After you provide a keyword, Speech Studio produces a .table file that you can use with the Speech SDK.

Important

Custom keyword models, and the resulting .table files, can only be created in Speech Studio. You cannot create custom keywords from the SDK or with REST calls.

  1. Go to Speech Studio and sign in. If you don't have a Speech subscription, go to Create Speech Services.

  2. On the Custom Keyword page, select Create a new project.

  3. Enter a Name, Description, and Language for your custom keyword project. You can only choose one language per project, and support is currently limited to English (United States) and Chinese (Mandarin, Simplified).

    Describe your keyword project

  4. Select your project's name from the list.

    Select your keyword project.

  5. To create a custom keyword for your virtual assistant, select Create a new model.

  6. Enter a Name for the model, Description, and Keyword of your choice, then select Next. See the guidelines on choosing an effective keyword.

    Enter your keyword

  7. The portal creates candidate pronunciations for your keyword. Listen to each candidate by selecting the play buttons, and remove the checks next to any pronunciations that are incorrect. Select all pronunciations that correspond to how you expect your users to say the keyword, and then select Next to begin generating the keyword model.

    Screenshot that shows where you choose the correct pronunciations.

  8. Select a model type, then select Create. You can view a list of regions that support the Advanced model type in the Keyword recognition region support documentation.

  9. It can take up to 30 minutes to generate the model. The keyword's entry in the list changes from Processing to Succeeded when the model is complete.

    Review your keyword.

  10. From the collapsible menu on the left, select Tune for options to tune and download your model. The downloaded file is a .zip archive. Extract the archive, and you see a file with the .table extension. You use the .table file with the SDK, so make sure to note its path.

    Download your model table.

Use a keyword model with the Speech SDK

See the sample on GitHub for using your Custom Keyword model with the Objective C SDK.

Reference documentation | Package (PyPi) | Additional Samples on GitHub

In this quickstart, you learn the basics of working with custom keywords. A keyword is a word or short phrase that allows your product to be voice activated. You create keyword models in Speech Studio, and then export a model file that you use with the Speech SDK in your applications.

Prerequisites

Create a keyword in Speech Studio

Before you can use a custom keyword, you need to create one on the Custom Keyword page in Speech Studio. After you provide a keyword, Speech Studio produces a .table file that you can use with the Speech SDK.

Important

Custom keyword models, and the resulting .table files, can only be created in Speech Studio. You cannot create custom keywords from the SDK or with REST calls.

  1. Go to Speech Studio and sign in. If you don't have a Speech subscription, go to Create Speech Services.

  2. On the Custom Keyword page, select Create a new project.

  3. Enter a Name, Description, and Language for your custom keyword project. You can only choose one language per project, and support is currently limited to English (United States) and Chinese (Mandarin, Simplified).

    Describe your keyword project

  4. Select your project's name from the list.

    Select your keyword project.

  5. To create a custom keyword for your virtual assistant, select Create a new model.

  6. Enter a Name for the model, Description, and Keyword of your choice, then select Next. See the guidelines on choosing an effective keyword.

    Enter your keyword

  7. The portal creates candidate pronunciations for your keyword. Listen to each candidate by selecting the play buttons, and remove the checks next to any pronunciations that are incorrect. Select all pronunciations that correspond to how you expect your users to say the keyword, and then select Next to begin generating the keyword model.

    Screenshot that shows where you choose the correct pronunciations.

  8. Select a model type, then select Create. You can view a list of regions that support the Advanced model type in the Keyword recognition region support documentation.

  9. It can take up to 30 minutes to generate the model. The keyword's entry in the list changes from Processing to Succeeded when the model is complete.

    Review your keyword.

  10. From the collapsible menu on the left, select Tune for options to tune and download your model. The downloaded file is a .zip archive. Extract the archive, and you see a file with the .table extension. You use the .table file with the SDK, so make sure to note its path.

    Download your model table.

Use a keyword model with the Speech SDK

See the sample on GitHub for using your Custom Keyword model with the Python SDK.

Speech-to-text REST API reference | Speech-to-text REST API for short audio reference | Additional Samples on GitHub

The Speech to text REST API does not support keyword recognition. Please select another programming language or the reference and samples linked from the beginning of this article.

The Speech CLI does support keyword recognition, but we haven't yet included a guide here. Please select another programming language to get started and learn about the concepts.

Next steps