Customize a language model with Azure AI Video Indexer
Azure AI Video Indexer supports automatic speech recognition through integration with the Microsoft Custom Speech Service. You can customize the language model by uploading adaptation text, that is, text from the domain whose vocabulary you'd like the engine to adapt to. Once you train your model, new words appearing in the adaptation text are recognized, assuming default pronunciation, and the language model learns new probable sequences of words. For the list of languages that Azure AI Video Indexer supports, see supported languages.
For example, "Kubernetes" (in the context of Azure Kubernetes Service) is a highly specific word. Because the word is new to Azure AI Video Indexer, it's recognized as "communities". You need to train the model to recognize it as "Kubernetes". In other cases, the words exist, but the language model isn't expecting them to appear in a certain context. For example, "container service" isn't a two-word sequence that a nonspecialized language model would recognize as a specific set of words.
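To build intuition for what "learning new probable sequences of words" means, the following minimal sketch estimates how likely "service" is to follow "container" before and after adding domain sentences. It's a toy bigram counter for illustration only. The function name and sample sentences are hypothetical, and the actual speech engine doesn't necessarily work this way.

```python
from collections import Counter


def bigram_probability(corpus_lines, first, second):
    """Estimate P(second | first) from raw bigram counts in the given lines."""
    pair_counts = Counter()
    first_counts = Counter()
    for line in corpus_lines:
        words = line.lower().split()
        # Count every adjacent word pair in the line.
        for left, right in zip(words, words[1:]):
            pair_counts[(left, right)] += 1
            first_counts[left] += 1
    if first_counts[first] == 0:
        return 0.0
    return pair_counts[(first, second)] / first_counts[first]


# Hypothetical general-purpose text: "container" is never followed by "service".
general = [
    "put the container on the shelf",
    "the service was slow",
    "a container of milk",
]

# Hypothetical adaptation text from the target domain.
adaptation = [
    "the container service schedules workloads",
    "our container service runs on kubernetes",
]

before = bigram_probability(general, "container", "service")
after = bigram_probability(general + adaptation, "container", "service")
```

In this toy setup, the probability of "service" following "container" rises once the domain sentences are added, which is the effect that adaptation text aims for.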
There are two ways to customize a language model:
- Option 1: Edit the transcript that was generated by Azure AI Video Indexer. By editing and correcting the transcript, you're training a language model to provide improved results in the future.
- Option 2: Upload one or more text files to train the language model. A file can contain either a list of words as you would like them to appear in the Video Indexer transcript, or the relevant words included naturally in sentences and paragraphs. The latter approach gives better results, so it's recommended that the upload file contain full sentences or paragraphs related to your content.
Important
Don't include the words or sentences as currently incorrectly transcribed (for example, "communities") in the upload file as this will negate the intended impact. Only include the words as you would like them to appear (for example, "Kubernetes").
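Before uploading, you might want to scan your adaptation text for the misrecognized forms that you're trying to get rid of. The following is a minimal sketch, not a Video Indexer feature. The function name and the sample term list are hypothetical, and you maintain the list of misrecognized terms yourself.

```python
import re


def find_misrecognized_terms(adaptation_text, misrecognized_terms):
    """Return the misrecognized terms that still occur in the adaptation text."""
    found = []
    for term in misrecognized_terms:
        # Whole-word, case-insensitive match so "communities" doesn't match inside another word.
        pattern = r"\b" + re.escape(term) + r"\b"
        if re.search(pattern, adaptation_text, flags=re.IGNORECASE):
            found.append(term)
    return found


sample_text = "We run Kubernetes in production.\nThe communities cluster failed."
leftovers = find_misrecognized_terms(sample_text, ["communities"])
```

A hit isn't necessarily an error, because a word such as "communities" can also appear legitimately. Treat the result as a list of lines to review by hand.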
Optimize your custom language model
Azure AI Video Indexer learns based on probabilities of word combinations, so to learn best:
- Give enough real examples of sentences as they would be spoken.
- Put only one sentence per line, not more. Otherwise the system will learn probabilities across sentences.
- It's okay to put one word as a sentence to boost the word against others, but the system learns best from full sentences.
- When introducing new words or acronyms, give as many examples of their usage in full sentences as possible, so that the system gets as much context as it can.
- Try several adaptation options and see how they work for you.
- Avoid repetition of the exact same sentence multiple times. It may create bias against the rest of the input.
- Avoid including uncommon symbols (~, #, @, %, &) because they get discarded. The sentences in which they appear are also discarded.
- Avoid very large inputs, such as hundreds of thousands of sentences, because doing so dilutes the effect of boosting.
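Several of the guidelines above can be applied automatically before you upload a file. The following is a minimal sketch that puts one sentence on each line, drops exact duplicates, and drops lines that contain the uncommon symbols listed above. The function name is hypothetical, the sentence splitting is deliberately naive (it breaks after ".", "!", or "?"), and the symbol set is only the examples named in this article.

```python
import re

# Symbols this article calls out as uncommon; extend the set as needed (assumption).
UNCOMMON_SYMBOLS = set("~#@%&")


def prepare_adaptation_lines(text):
    """Return one sentence per line, without exact duplicates or symbol-bearing lines."""
    # Naive split: after sentence-ending punctuation followed by whitespace, or at newlines.
    candidates = re.split(r"(?<=[.!?])\s+|\n+", text)
    seen = set()
    lines = []
    for candidate in candidates:
        sentence = " ".join(candidate.split())  # collapse runs of whitespace
        if not sentence:
            continue
        if any(ch in UNCOMMON_SYMBOLS for ch in sentence):
            continue  # the service would discard these sentences anyway
        if sentence in seen:
            continue  # avoid repeating the exact same sentence
        seen.add(sentence)
        lines.append(sentence)
    return lines


raw = (
    "We deploy to Azure Kubernetes Service. Kubernetes runs our containers.\n"
    "We deploy to Azure Kubernetes Service.\n"
    "Use 100% of the #cluster.\n"
    "Kubernetes"
)
cleaned = prepare_adaptation_lines(raw)
```

You can then write the result to a text file with `"\n".join(cleaned)` and upload it as described in the next section. Abbreviations such as "e.g." would be split incorrectly by this simple approach, so review the output before training.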
Prerequisites
- An Azure account
- An Azure AI Video Indexer account
Create a language model
- Go to the Azure AI Video Indexer website and sign in.
- To customize a model in your account, select the Content model customization button on the left of the page.
- Select the Language tab. You see a list of supported languages.
- Under the language that you want, select Add model.
- Type in the name for the language model and hit enter. This step creates the model and gives the option to upload text files to the model.
- To add a text file, select Add file. Your file explorer will open.
- Navigate to and select the text file. You can add multiple text files to a language model. You can also add a text file by selecting the ... button on the right side of the language model and selecting Add file.
- Once you're done uploading the text files, select the green Train option.
The training process can take a few minutes. Once the training is done, Trained appears next to the model. You can preview, download, and delete the file from the model.
Using a language model on a new video
To use your language model on a new video, follow these steps:
- Select the Upload button on the top of the page.
- Drop your audio or video file or browse for your file.
- Select a language model you created from the Video source language dropdown list.
- Select the Upload option at the bottom of the page. Your new video is indexed using your language model.
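If you upload videos programmatically rather than through the website, the same choice can be expressed as a query parameter. The following sketch only builds the request URL and doesn't send anything. It assumes the Video Indexer REST API's Upload Video operation and its linguisticModelId parameter; confirm the path and parameter names against the current API reference before relying on them. The function name and all sample values are hypothetical.

```python
from urllib.parse import urlencode

# Assumed API root; verify against the Azure AI Video Indexer API reference.
API_ROOT = "https://api.videoindexer.ai"


def build_upload_url(location, account_id, access_token,
                     video_name, video_url, language, linguistic_model_id):
    """Build an upload URL that asks for indexing with a custom language model."""
    query = urlencode({
        "name": video_name,
        "videoUrl": video_url,
        "language": language,
        # Assumed parameter name for selecting the custom language model.
        "linguisticModelId": linguistic_model_id,
        "accessToken": access_token,
    })
    return f"{API_ROOT}/{location}/Accounts/{account_id}/Videos?{query}"


# Placeholder values for illustration only.
upload_url = build_upload_url(
    location="trial",
    account_id="00000000-0000-0000-0000-000000000000",
    access_token="<access-token>",
    video_name="demo",
    video_url="https://example.com/demo.mp4",
    language="en-US",
    linguistic_model_id="model-123",
)
```

You would send a POST request to the resulting URL with an HTTP client of your choice.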
Using a language model to reindex
- Sign in to the Azure AI Video Indexer home page.
- Select the ellipsis (...) button on the video, and then select Re-index.
- Select the Video source language drop-down and select a language model that you created from the list.
- Select the Re-index button and your video will be reindexed using your language model.
Edit a language model
You can edit a language model by changing its name, adding files to it, and deleting files from it. If you add or delete files from the language model, you'll have to train the model again by selecting the green Train option.
Rename the language model
You can change the name of the language model by selecting the ellipsis (...) button on the right side of the language model and selecting Rename. Enter the new name.
Add files
- Select Add file. Your file explorer will open.
- Navigate to and select the text file. You can add multiple text files to a language model.
You can also add a text file by selecting the ellipsis (...) button on the right side of the language model and selecting Add file.
Delete files
This action removes the file completely from the language model.
- Select the ellipsis (...) button on the right side of the text file.
- Select Delete. A new window pops up telling you that the deletion can't be undone.
- Select the Delete option in the new window.
Delete a language model
This action removes the language model completely from your account. Any video that was using the deleted language model keeps the same index until you reindex the video. When you reindex the video, you can assign a new language model to it. Otherwise, Azure AI Video Indexer uses its default model to reindex the video.
- Select the ellipsis (...) button on the right side of the Language model.
- Select Delete. A new window pops up telling you that the deletion can't be undone.
- Select the Delete option in the new window.
Customize language models by correcting transcripts
Azure AI Video Indexer customizes language models based on the actual corrections users make to the transcriptions of their videos. It captures all lines that you corrected in the transcription of your video and adds them to a text file called From transcript edits. These edits are used to retrain the language model that was used to index the video.
Edits that were done in the widget's timeline are also included.
If you didn't specify a language model when indexing this video, all edits for this video are stored in a default language model called Account adaptations within the detected language of the video.
In case multiple edits have been made to the same line, only the last version of the corrected line is used for updating the Language model.
Note
Only textual corrections are used for the customization. Corrections that don't involve actual words (for example, punctuation marks or spaces) aren't included.
- Select the video that you want to edit from your library.
- Select the Timeline tab.
- Select the pencil icon to edit the transcript of your video.
- You'll see transcript corrections show up in the Language tab of the Content model customization page. To look at the "From transcript edits" file for each of your Language models, select it to open it.
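The rules in this section (only the last correction of a line counts, and corrections that change only punctuation or spacing are ignored) can be sketched as follows. This is an illustration of the described behavior under stated assumptions, not the service's implementation. The function names and the data shapes are hypothetical, and treating a change in letter case as a textual correction is an assumption.

```python
import string


def _words_only(text):
    """Split text into words with punctuation removed, so only word changes count."""
    stripped = text.translate(str.maketrans("", "", string.punctuation))
    return stripped.split()


def collect_transcript_edits(original, edits):
    """Return the corrected lines that would plausibly feed the language model.

    original maps a line ID to the machine transcript of that line.
    edits is a chronological list of (line ID, corrected text) pairs.
    """
    latest = {}
    for line_id, corrected in edits:
        latest[line_id] = corrected  # a later edit overwrites an earlier one
    kept = []
    for line_id in sorted(latest):
        # Keep the line only if the words differ, not just punctuation or spacing.
        if _words_only(latest[line_id]) != _words_only(original[line_id]):
            kept.append(latest[line_id])
    return kept


original_lines = {
    1: "we deploy to communities",
    2: "hello world",
}
user_edits = [
    (1, "we deploy to Kubernetes"),
    (2, "hello, world."),             # punctuation only, so it's ignored
    (1, "We deploy to Kubernetes."),  # last version of line 1 wins
]
training_lines = collect_transcript_edits(original_lines, user_edits)
```

In this example, only the final correction of line 1 is kept, because the edit to line 2 changes nothing but punctuation.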