Automatically identify the spoken language with language identification model

Article
03/22/2024

Important

Due to the Azure Media Services retirement announcement, Azure AI Video Indexer announces Azure AI Video Indexer features adjustments. See Changes related to Azure Media Service (AMS) retirement to understand what this means for your Azure AI Video Indexer account. See the Preparing for AMS retirement: VI update and migration guide.

Azure AI Video Indexer supports automatic language identification (LID), which is the process of automatically identifying the spoken language from audio content. The media file is transcribed in the dominant identified language.

See the list of supported by Azure AI Video Indexer languages in supported languages.

Make sure to review the Guidelines and limitations section.

Choosing auto language identification on indexing

When indexing or reindexing a video using the API, choose the auto detect option in the sourceLanguage parameter.

When using portal, go to your Account videos on the Azure AI Video Indexer home page and hover over the name of the video that you want to reindex. On the right-bottom corner, select the Re-index button. In the Re-index video dialog, choose Auto detect from the Video source language drop-down box.

Screenshot showing where to select auto detect.

Model output

Azure AI Video Indexer transcribes the video according to the most likely language if the confidence for that language is > 0.6. If the language can't be identified with confidence, it assumes the spoken language is English.

Model dominant language is available in the insights JSON as the sourceLanguage attribute (under root/videos/insights). A corresponding confidence score is also available under the sourceLanguageConfidence attribute.

"insights": {
        "version": "1.0.0.0",
        "duration": "0:05:30.902",
        "sourceLanguage": "fr-FR",
        "language": "fr-FR",
        "transcript": [...],
        . . .
        "sourceLanguageConfidence": 0.8563
      }

Guidelines and limitations

Automatic language identification (LID) supports the following languages: