Customize a language model with Azure AI Video Indexer

07/12/2024

Warning

Over the past year, Azure AI Video Indexer (VI) announced the removal of its dependency on Azure Media Services (AMS) due to its retirement. Feature adjustments and changes were announced, and a migration guide was provided.

The deadline to complete migration was June 30, 2024. VI has extended the update/migrate deadline so you can update your VI account and opt in to the AMS VI asset migration through July 15th, 2024. To use the AMS VI asset migration, you also must extend your AMS account through July. Navigate to your AMS account in the Azure portal and select Click here to extend.

However, after June 30, if you have not updated your VI account, you won't be able to index new videos nor will you be able to play any videos that have not been migrated. If you update your account after June 30, you can resume indexing immediately but you won't be able to play videos indexed before the account update until they are migrated through the AMS VI migration.

Azure AI Video Indexer supports automatic speech recognition through integration with the Microsoft Custom Speech Service. You can customize the language model by uploading adaptation text, that is, text from the domain whose vocabulary you'd like the engine to adapt to. Once you train your model, new words appearing in the adaptation text are recognized, assuming default pronunciation, and the language model learns new probable sequences of words. For the list of languages that Azure AI Video Indexer supports, see supported languages.

For example, "Kubernetes" (in the context of Azure Kubernetes Service) is a highly specific word. Because the word is new to Azure AI Video Indexer, it's recognized as "communities". Train the model to recognize it as "Kubernetes". In other cases, the words exist, but the language model isn't expecting them to appear in a certain context. For example, "container service" isn't a two-word sequence that a nonspecialized language model would recognize as a specific term.

There are two ways to customize a language model:

Option 1: Edit the transcript that was generated by Azure AI Video Indexer. By editing and correcting the transcript, you're training a language model to provide improved results in the future.
Option 2: Upload text file(s) to train the language model. The file can either contain a list of words as you would like them to appear in the Video Indexer transcript or the relevant words included naturally in sentences and paragraphs. As better results are achieved with the latter approach, it's recommended for the upload file to contain full sentences or paragraphs related to your content.

Important

Don't include the words or sentences as currently incorrectly transcribed (for example, "communities") in the upload file as this will negate the intended impact. Only include the words as you would like them to appear (for example, "Kubernetes").

Optimize your custom language model

Azure AI Video Indexer learns based on probabilities of word combinations, so to learn best:

Give enough real examples of sentences as they would be spoken.
Put only one sentence per line, not more. Otherwise the system will learn probabilities across sentences.
It's okay to put one word as a sentence to boost the word against others, but the system learns best from full sentences.
When introducing new words or acronyms, give as many examples of usage in full sentences as you can, so the system gets as much context as possible.
Try several adaptation options and see how they work for you.
Avoid repetition of the exact same sentence multiple times. It may create bias against the rest of the input.
Avoid including uncommon symbols (~, #, @, %, &) as they'll get discarded. The sentences in which they appear will also get discarded.
Avoid very large inputs, such as hundreds of thousands of sentences, because doing so will dilute the effect of boosting.
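The guidelines above can be applied to a raw text file before you upload it. The following is a minimal sketch, not part of Azure AI Video Indexer: the function name prepare_adaptation_text is hypothetical, and the sentence splitting is a rough heuristic based on end punctuation. It puts one sentence per line, drops sentences that contain the uncommon symbols listed above, and removes exact repeats.

```python
import re

# Symbols the guidance above calls out; sentences that contain them get discarded.
UNCOMMON_SYMBOLS = set("~#@%&")


def prepare_adaptation_text(raw_text: str) -> list[str]:
    """Return cleaned adaptation lines (hypothetical helper, one sentence per item)."""
    # Rough heuristic: a sentence ends at ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", raw_text)
    seen = set()
    cleaned = []
    for sentence in sentences:
        sentence = " ".join(sentence.split())  # collapse inner whitespace and newlines
        if not sentence:
            continue
        if any(ch in UNCOMMON_SYMBOLS for ch in sentence):
            continue  # drop it here rather than have it discarded during training
        if sentence in seen:
            continue  # exact repeats may bias the model
        seen.add(sentence)
        cleaned.append(sentence)
    return cleaned


raw = (
    "Deploy the app to Azure Kubernetes Service. "
    "Deploy the app to Azure Kubernetes Service. "
    "Use 100% of the cluster!\nKubernetes"
)
lines = prepare_adaptation_text(raw)
print("\n".join(lines))
```

Writing "\n".join(lines) to a .txt file gives an upload file with one sentence per line. A single word such as "Kubernetes" is kept as its own line, which the guidance above allows.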

Prerequisites

An Azure account
An Azure AI Video Indexer account

Web portal
API

Create a language model

Go to the Azure AI Video Indexer website and sign in.
To customize a model in your account, select the Content model customization button on the left of the page.
Select the Language tab. You see a list of supported languages.
Under the language that you want, select Add model.
Type in the name for the language model and press Enter. This step creates the model and gives the option to upload text files to the model.
To add a text file, select Add file. Your file explorer will open.
Navigate to and select the text file. You can add multiple text files to a language model. You can also add a text file by selecting the ... button on the right side of the language model and selecting Add file.
Once you're done uploading the text files, select the green Train option.

The training process can take a few minutes. Once the training is done, Trained appears next to the model. You can preview, download, and delete the file from the model.

Using a language model on a new video

To use your language model on a new video, follow these steps:

Select the Upload button at the top of the page.
Drop your audio or video file or browse for your file.
Select a language model you created from the Video source language dropdown list.
Select the Upload option at the bottom of the page, and your new video will be indexed using your language model.

Using a language model to reindex

Sign in to the Azure AI Video Indexer home page.
Select the ellipsis (...) button on the video and select Re-index.
Select the Video source language drop-down and select a language model that you created from the list.
Select the Re-index button and your video will be reindexed using your language model.

Edit a language model

You can edit a language model by changing its name, adding files to it, and deleting files from it. If you add or delete files from the language model, you'll have to train the model again by selecting the green Train option.

Rename the language model

You can change the name of the language model by selecting the ellipsis (...) button on the right side of the language model and selecting Rename. Enter the new name.

Add files

Select Add file. Your file explorer will open.
Navigate to and select the text file. You can add multiple text files to a language model.

You can also add a text file by selecting the ellipsis (...) button on the right side of the language model and selecting Add file.

Delete files

This action removes the file completely from the language model.

Select the ellipsis (...) button on the right side of the text file.
Select Delete. A new window pops up telling you that the deletion can't be undone.
Select the Delete option in the new window.

Delete a language model

This action removes the language model completely from your account. Any video that was using the deleted language model will keep the same index until you reindex the video. If you reindex the video, you can assign a new language model to the video. Otherwise, Azure AI Video Indexer will use its default model to reindex the video.

Select the ellipsis (...) button on the right side of the Language model.
Select Delete. A new window pops up telling you that the deletion can't be undone.
Select the Delete option in the new window.

Customize language models by correcting transcripts

Azure AI Video Indexer customizes language models based on the actual corrections users make to the transcriptions of their videos. It captures all lines that you corrected in the transcription of your video and adds them to a text file called From transcript edits. These edits are used to retrain the language model that was used to index the video.

Edits that were done in the widget's timeline are also included.

If you didn't specify a language model when indexing this video, all edits for this video are stored in a default language model called Account adaptations within the detected language of the video.

In case multiple edits have been made to the same line, only the last version of the corrected line is used for updating the Language model.

Note

Only textual corrections are used for the customization. Corrections that don't involve actual words (for example, punctuation marks or spaces) aren't included.

Select the video that you want to edit from your library.
Select the Timeline tab.
Select the pencil icon to edit the transcript of your video.
You'll see transcript corrections show up in the Language tab of the Content model customization page. To look at the "From transcript edits" file for each of your Language models, select it to open it.
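The two rules described in this section (only the last version of a corrected line is used, and corrections that don't change actual words are ignored) can be illustrated with a short sketch. This is not the service's implementation. The function names and the (line_id, original, corrected) tuple shape are hypothetical, and treating a capitalization change as a textual correction is an assumption.

```python
import re


def words_only(text: str) -> list[str]:
    """Word tokens of a line, ignoring punctuation and spacing."""
    return re.findall(r"\w+", text)


def collect_transcript_edits(edits):
    """Return the corrected lines that would feed the customization.

    edits: iterable of (line_id, original_text, corrected_text), in the
    order the edits were made (hypothetical input shape).
    """
    last_version = {}
    for line_id, original, corrected in edits:
        # A later edit to the same line replaces the earlier one.
        last_version[line_id] = (original, corrected)
    return [
        corrected
        for original, corrected in last_version.values()
        # Keep only corrections that change at least one word.
        if words_only(original) != words_only(corrected)
    ]


edits = [
    (1, "we deploy to communities", "we deploy to Kubernetes"),
    (2, "hello world", "hello, world."),  # punctuation only
    (1, "we deploy to communities", "We deploy to Azure Kubernetes Service"),
]
print(collect_transcript_edits(edits))
```

Here line 1 contributes only its second correction, and line 2 contributes nothing because only punctuation changed.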

Create a language model

The Create Language Model request creates a new custom language model for the specified account. You can upload files for the language model using this request. Alternatively, you can create the language model here and upload files for the model later by updating the language model.

You must upload files in the body using FormData in addition to providing values for the required parameters. There are two ways to define the key pair for this task:

Key is the file name and value is the txt file.
Key is the file name and value is a URL to the txt file.

Note

You must still train the model with its enabled files for the model to learn the contents of its files.
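As a rough illustration of the two key pair styles, the sketch below assembles the pieces of a Create Language Model call without sending it. Everything specific in it is an assumption to verify against the API reference: the base URL, the route, and the query parameter names modelName, language, and accessToken are not stated in this article, and build_create_language_model_request is a hypothetical helper.

```python
from urllib.parse import urlencode

# Assumption: base URL and route pattern of the Video Indexer REST API.
API_BASE = "https://api.videoindexer.ai"


def build_create_language_model_request(location, account_id, access_token,
                                        model_name, language, files=None):
    """Return (url, form_parts) for the request; nothing is sent (hypothetical helper)."""
    # Assumption: these query parameter names match the API reference.
    query = urlencode({
        "modelName": model_name,
        "language": language,
        "accessToken": access_token,
    })
    url = f"{API_BASE}/{location}/Accounts/{account_id}/Customization/Language?{query}"

    form_parts = []
    for file_name, value in (files or {}).items():
        # Key is the file name; value is either the txt file's bytes
        # or a URL string that points to the txt file.
        kind = "file" if isinstance(value, bytes) else "url"
        form_parts.append({"key": file_name, "kind": kind, "value": value})
    return url, form_parts


url, parts = build_create_language_model_request(
    "trial", "acct-1", "tok", "TestModel", "en-US",
    files={
        "hellofile": b"hello\r\nworld",
        "worldfile": "https://example.com/world.txt",
    },
)
print(url)
print([(p["key"], p["kind"]) for p in parts])
```

The returned parts can then be passed to the multipart form-data support of whichever HTTP client you use.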

Example response

{
    "id": "dfae5745-6f1d-4edd-b224-42e1ab57a891",
    "name": "TestModel",
    "language": "En-US",
    "state": "None",
    "languageModelId": "00000000-0000-0000-0000-000000000000",
    "files": [
    {
        "id": "25be7c0e-b6a6-4f48-b981-497e920a0bc9",
        "name": "hellofile",
        "enable": true,
        "creator": "John Doe",
        "creationTime": "2018-04-28T11:55:34.6733333"
    },
    {
        "id": "33025f5b-2354-485e-a50c-4e6b76345ca7",
        "name": "worldfile",
        "enable": true,
        "creator": "John Doe",
        "creationTime": "2018-04-28T11:55:34.86"
    }
    ]
}

Train a language model

The Train Language Model request trains a custom Language model for the specified account with the contents of the uploaded and enabled files in the language model.

Note

You must first create the language model and upload its files. You can upload files when creating the language model or by updating the language model.

Example response

{
    "id": "41464adf-e432-42b1-8e09-f52905d7e29d",
    "name": "TestModel",
    "language": "En-US",
    "state": "Waiting",
    "languageModelId": "531e5745-681d-4e1d-b124-12e5ab57a891",
    "files": [
    {
        "id": "84fcf1ac-1952-48f3-b372-18f768eedf83",
        "name": "RenamedFile",
        "enable": false,
        "creator": "John Doe",
        "creationTime": "2018-04-27T20:10:10.5233333"
    },
    {
        "id": "9ac35b4b-1381-49c4-9fe4-8234bfdd0f50",
        "name": "hellofile",
        "enable": true,
        "creator": "John Doe",
        "creationTime": "2018-04-27T20:10:10.68"
    }
    ]
}

The id is a unique ID used to distinguish between language models, while languageModelId is the value you pass when uploading a video to index or when reindexing a video (it's known as linguisticModelId in the Azure AI Video Indexer upload and reindex requests).
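To make the distinction between the two IDs concrete, the following sketch parses a trimmed copy of the example response above and picks out the value to pass as linguisticModelId, along with the enabled files that training uses. The helper name summarize_model and the keys of the summary it returns are hypothetical.

```python
import json

# Trimmed copy of the Train Language Model example response above.
TRAIN_RESPONSE = """
{
    "id": "41464adf-e432-42b1-8e09-f52905d7e29d",
    "name": "TestModel",
    "language": "En-US",
    "state": "Waiting",
    "languageModelId": "531e5745-681d-4e1d-b124-12e5ab57a891",
    "files": [
        {"id": "84fcf1ac-1952-48f3-b372-18f768eedf83", "name": "RenamedFile", "enable": false},
        {"id": "9ac35b4b-1381-49c4-9fe4-8234bfdd0f50", "name": "hellofile", "enable": true}
    ]
}
"""


def summarize_model(model: dict) -> dict:
    """Pick out the fields you typically need after a training request."""
    return {
        # Identifies the model in the customization requests.
        "customization_id": model["id"],
        # Value to send as linguisticModelId when uploading or reindexing a video.
        "linguisticModelId": model["languageModelId"],
        "state": model["state"],
        # Only enabled files contribute to training.
        "enabled_files": [f["name"] for f in model["files"] if f["enable"]],
    }


summary = summarize_model(json.loads(TRAIN_RESPONSE))
print(summary["linguisticModelId"])
print(summary["enabled_files"])
```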

Delete a language model

The Delete Language Model request deletes a custom Language model from the specified account. Any video that was using the deleted Language model keeps the same index until you reindex the video. If you reindex the video, you can assign a new Language model to the video. Otherwise, Azure AI Video Indexer uses its default model to reindex the video.

Example response

There's no returned content when the language model is deleted successfully.

Update a language model

The Update Language Model request updates a custom Language model in the specified account.

Note

You must have already created the language model. You can use this call to enable or disable all files under the model, update the name of the Language model, and upload files to be added to the language model.

To upload files to be added to the language model, you must upload files in the body using FormData in addition to providing values for the required parameters. There are two ways to define the key pair for this task:

Key is the file name and value is the txt file.
Key is the file name and value is a URL to the txt file.

Example response

{
    "id": "41464adf-e432-42b1-8e09-f52905d7e29d",
    "name": "TestModel",
    "language": "En-US",
    "state": "Waiting",
    "languageModelId": "531e5745-681d-4e1d-b124-12e5ab57a891",
    "files": [
    {
        "id": "84fcf1ac-1952-48f3-b372-18f768eedf83",
        "name": "RenamedFile",
        "enable": true,
        "creator": "John Doe",
        "creationTime": "2018-04-27T20:10:10.5233333"
    },
    {
        "id": "9ac35b4b-1381-49c4-9fe4-8234bfdd0f50",
        "name": "hellofile",
        "enable": true,
        "creator": "John Doe",
        "creationTime": "2018-04-27T20:10:10.68"
    }
    ]
}

Use the id of the files returned in the response to download the contents of the file.
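Because an update can enable or disable every file under a model, it can help to compare the files list before and after the call. The sketch below is one way to do that. The helper changed_enable_flags is hypothetical, and the two dictionaries are trimmed from the Train and Update example responses in this article.

```python
def changed_enable_flags(before: dict, after: dict) -> dict:
    """Map file name -> (old, new) for files whose enable flag differs, matched by file id."""
    old_files = {f["id"]: f for f in before["files"]}
    changes = {}
    for f in after["files"]:
        prev = old_files.get(f["id"])
        if prev is not None and prev["enable"] != f["enable"]:
            changes[f["name"]] = (prev["enable"], f["enable"])
    return changes


# Files as they appear in the Train example response.
before = {"files": [
    {"id": "84fcf1ac-1952-48f3-b372-18f768eedf83", "name": "RenamedFile", "enable": False},
    {"id": "9ac35b4b-1381-49c4-9fe4-8234bfdd0f50", "name": "hellofile", "enable": True},
]}
# Files as they appear in the Update example response.
after = {"files": [
    {"id": "84fcf1ac-1952-48f3-b372-18f768eedf83", "name": "RenamedFile", "enable": True},
    {"id": "9ac35b4b-1381-49c4-9fe4-8234bfdd0f50", "name": "hellofile", "enable": True},
]}
print(changed_enable_flags(before, after))
```

Remember that a change to the enabled files only takes effect once the model is trained again.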

Update a file from a language model

The Update Language Model File request allows you to update the name and enable state of a file in a custom Language model in the specified account.

Example response

{
  "id": "84fcf1ac-1952-48f3-b372-18f768eedf83",
  "name": "RenamedFile",
  "enable": false,
  "creator": "John Doe",
  "creationTime": "2018-04-27T20:10:10.5233333"
}

Use the id of the file returned in the response to download the contents of the file.

Get a specific language model

The Get Language Model request returns information on the specified language model in the specified account such as language and the files that are in the language model.

Example response

{
    "id": "dfae5745-6f1d-4edd-b224-42e1ab57a891",
    "name": "TestModel",
    "language": "En-US",
    "state": "None",
    "languageModelId": "00000000-0000-0000-0000-000000000000",
    "files": [
    {
        "id": "25be7c0e-b6a6-4f48-b981-497e920a0bc9",
        "name": "hellofile",
        "enable": true,
        "creator": "John Doe",
        "creationTime": "2018-04-28T11:55:34.6733333"
    },
    {
        "id": "33025f5b-2354-485e-a50c-4e6b76345ca7",
        "name": "worldfile",
        "enable": true,
        "creator": "John Doe",
        "creationTime": "2018-04-28T11:55:34.86"
    }
    ]
}

Use the id of the file returned in the response to download the contents of the file.

Get all the language models

The Get Language Models request returns all of the custom Language models in the specified account in a list.

Example response

[
    {
        "id": "dfae5745-6f1d-4edd-b224-42e1ab57a891",
        "name": "TestModel",
        "language": "En-US",
        "state": "None",
        "languageModelId": "00000000-0000-0000-0000-000000000000",
        "files": [
        {
            "id": "25be7c0e-b6a6-4f48-b981-497e920a0bc9",
            "name": "hellofile",
            "enable": true,
            "creator": "John Doe",
            "creationTime": "2018-04-28T11:55:34.6733333"
        },
        {
            "id": "33025f5b-2354-485e-a50c-4e6b76345ca7",
            "name": "worldfile",
            "enable": true,
            "creator": "John Doe",
            "creationTime": "2018-04-28T11:55:34.86"
        }
        ]
    },
    {
        "id": "dfae5745-6f1d-4edd-b224-42e1ab57a892",
        "name": "AnotherTestModel",
        "language": "En-US",
        "state": "None",
        "languageModelId": "00000000-0000-0000-0000-000000000001",
        "files": []
    }
]
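Once you have the list, you usually need to pick out one model or spot models that have nothing to train on. The following sketch filters a trimmed copy of the example response above. The helper names find_models and models_without_files are hypothetical, and comparing the language code case-insensitively is an assumption based on the "En-US" casing shown in the response.

```python
import json

# Trimmed copy of the Get Language Models example response above.
MODELS_RESPONSE = """
[
    {
        "id": "dfae5745-6f1d-4edd-b224-42e1ab57a891",
        "name": "TestModel",
        "language": "En-US",
        "state": "None",
        "languageModelId": "00000000-0000-0000-0000-000000000000",
        "files": [
            {"id": "25be7c0e-b6a6-4f48-b981-497e920a0bc9", "name": "hellofile", "enable": true},
            {"id": "33025f5b-2354-485e-a50c-4e6b76345ca7", "name": "worldfile", "enable": true}
        ]
    },
    {
        "id": "dfae5745-6f1d-4edd-b224-42e1ab57a892",
        "name": "AnotherTestModel",
        "language": "En-US",
        "state": "None",
        "languageModelId": "00000000-0000-0000-0000-000000000001",
        "files": []
    }
]
"""


def find_models(models, name=None, language=None):
    """Return the models that match every filter that was given."""
    matches = []
    for model in models:
        if name is not None and model["name"] != name:
            continue
        if language is not None and model["language"].lower() != language.lower():
            continue
        matches.append(model)
    return matches


def models_without_files(models):
    """Names of models that have no files to train on yet."""
    return [model["name"] for model in models if not model["files"]]


models = json.loads(MODELS_RESPONSE)
print(find_models(models, name="TestModel")[0]["id"])
print(models_without_files(models))
```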

Delete a file from a language model

The Delete Language Model File request deletes the specified file from the specified Language model in the specified account.

Example response

There's no returned content when the file is deleted from the language model successfully.

Get metadata on a file from a Language model

The Get Language Model File Data request returns the contents of and metadata on the specified file from the chosen language model in your account.

Example response

{
    "content": "hello\r\nworld",
    "id": "84fcf1ac-1952-48f3-b372-18f768eedf83",
    "name": "Hello",
    "enable": true,
    "creator": "John Doe",
    "creationTime": "2018-04-27T20:10:10.5233333"
}

Note

The contents of this example file are the words "hello" and "world" in two separate lines.
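The content field holds the whole file as a single string with line breaks embedded. As a small sketch, the function below (a hypothetical helper named adaptation_lines) turns the example response above into a list of lines.

```python
import json

# The example response above, kept as raw JSON text so the \r\n escape survives.
FILE_DATA_RESPONSE = r"""
{
    "content": "hello\r\nworld",
    "id": "84fcf1ac-1952-48f3-b372-18f768eedf83",
    "name": "Hello",
    "enable": true,
    "creator": "John Doe",
    "creationTime": "2018-04-27T20:10:10.5233333"
}
"""


def adaptation_lines(file_data: dict) -> list[str]:
    """Split the file content into lines; splitlines() handles \r\n and \n alike."""
    return file_data["content"].splitlines()


file_data = json.loads(FILE_DATA_RESPONSE)
print(adaptation_lines(file_data))
```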

Download a file from a language model

The Download Language Model File Content request downloads a text file containing the contents of the specified file from the specified Language model in the specified account. This text file should match the contents of the text file that was originally uploaded.

Example response

The response is the download of a text file with the contents of the file in the JSON format.
