Keywords extraction detects insights on the different keywords discussed in media files. It extracts insights from both single-language and multi-language media files.
Once you have uploaded and indexed a video, insights are available in JSON format for download using the web portal.
If you request the insights through the API instead, the query string can include &includeSummarizedInsights=false.
```json
"keywords": [
  {
    "id": 1,
    "text": "office insider",
    "confidence": 1,
    "language": "en-US",
    "instances": [
      {
        "adjustedStart": "0:00:00",
        "adjustedEnd": "0:00:05.75",
        "start": "0:00:00",
        "end": "0:00:05.75"
      },
      {
        "adjustedStart": "0:01:21.82",
        "adjustedEnd": "0:01:24.7",
        "start": "0:01:21.82",
        "end": "0:01:24.7"
      },
      {
        "adjustedStart": "0:01:31.32",
        "adjustedEnd": "0:01:32.76",
        "start": "0:01:31.32",
        "end": "0:01:32.76"
      },
      {
        "adjustedStart": "0:01:35.8",
        "adjustedEnd": "0:01:37.84",
        "start": "0:01:35.8",
        "end": "0:01:37.84"
      }
    ]
  },
  {
    "id": 2,
    "text": "insider tip",
    "confidence": 0.9975,
    "language": "en-US",
    "instances": [
      {
        "adjustedStart": "0:01:14.91",
        "adjustedEnd": "0:01:19.51",
        "start": "0:01:14.91",
        "end": "0:01:19.51"
      }
    ]
  }
]
```
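As an illustration of how the keywords sample can be consumed, the following minimal Python sketch lists every keyword instance with its start and end time in seconds. The helper names `to_seconds` and `keyword_instances` are hypothetical, and the sketch assumes the `keywords` array has already been located and wrapped in a JSON object; in a full downloaded insights file the array may be nested more deeply.

```python
import json

# Trimmed copy of the sample, wrapped in braces so it parses as a JSON object.
SAMPLE = """
{
  "keywords": [
    {
      "id": 1,
      "text": "office insider",
      "confidence": 1,
      "language": "en-US",
      "instances": [
        {"start": "0:00:00", "end": "0:00:05.75"},
        {"start": "0:01:21.82", "end": "0:01:24.7"}
      ]
    },
    {
      "id": 2,
      "text": "insider tip",
      "confidence": 0.9975,
      "language": "en-US",
      "instances": [
        {"start": "0:01:14.91", "end": "0:01:19.51"}
      ]
    }
  ]
}
"""


def to_seconds(timestamp: str) -> float:
    """Convert an 'H:MM:SS' or 'H:MM:SS.ff' timestamp string to seconds."""
    hours, minutes, seconds = timestamp.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def keyword_instances(insights: dict) -> list:
    """Return (keyword text, start seconds, end seconds) for every instance."""
    rows = []
    for keyword in insights.get("keywords", []):
        for instance in keyword.get("instances", []):
            rows.append(
                (
                    keyword["text"],
                    to_seconds(instance["start"]),
                    to_seconds(instance["end"]),
                )
            )
    return rows


if __name__ == "__main__":
    for text, start, end in keyword_instances(json.loads(SAMPLE)):
        print(f"{text}: {start:.2f}s to {end:.2f}s")
```

Converting the timestamps to seconds makes it straightforward to sort instances or to seek a media player to the point where a keyword is mentioned.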
Important
Read the transparency note overview for all Video Indexer (VI) features. Each insight also has a transparency note of its own.
- Always upload high-quality audio and video content. The recommended maximum frame size is HD and the recommended frame rate is 30 FPS.
- A frame should contain no more than 10 people.
- When outputting frames from videos to AI models, send only around 2 or 3 frames per second. Processing 10 or more frames might delay the AI result.
- At least 1 minute of spontaneous conversational speech is required to perform analysis.
- Audio effects are detected in nonspeech segments only. The minimal duration of a nonspeech section is 2 seconds.
- Voice commands and singing aren't supported.
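The frame-rate guidance can be sketched as a simple downsampling step. The following Python example is only an illustration under stated assumptions: the function `sample_frame_indices` is hypothetical and not part of any Video Indexer API, and the default of 2 frames per second is taken from the lower end of the range mentioned in the guidance.

```python
import math


def sample_frame_indices(total_frames: int, source_fps: float, target_fps: float = 2.0) -> list:
    """Pick evenly spaced frame indices so roughly target_fps frames per second are kept."""
    if total_frames < 0:
        raise ValueError("total_frames must not be negative")
    if source_fps <= 0 or target_fps <= 0:
        raise ValueError("frame rates must be positive")
    # Never sample more densely than the source; a step below 1 would repeat frames.
    step = max(source_fps / target_fps, 1.0)
    count = math.ceil(total_frames / step)
    # The final filter guards against floating-point edge cases at the end of the clip.
    return [int(i * step) for i in range(count) if int(i * step) < total_frames]


if __name__ == "__main__":
    # Two seconds of 30 FPS video reduced to about 2 frames per second.
    print(sample_frame_indices(total_frames=60, source_fps=30.0))
```

For a 30 FPS source this keeps every fifteenth frame, which stays well below the 10 frames per second at which the guidance says results might be delayed.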
During the keywords extraction procedure, audio and images in a media file are processed as follows:
Component | Definition |
---|---|
Source language | The user uploads the source file for indexing. |
Transcription API | The audio file is sent to Azure AI services and the translated transcribed output is returned. If a language has been specified, the audio is processed in that language. |
OCR of video | Images in a media file are processed using the Azure AI Vision Read API to extract text, its location, and other insights. |
Keywords extraction | An extraction algorithm processes the transcribed audio. The results are then combined with the insights detected in the video during the OCR process. The keywords and where they appear in the media are then detected and identified. |
Confidence level | The estimated confidence level of each keyword is calculated as a value in the range 0 to 1. The confidence score represents the certainty in the accuracy of the result. For example, an 82% certainty is represented as a 0.82 score. |
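
Because each keyword carries a confidence score between 0 and 1, a consumer of the insights can filter on it. The sketch below is a hedged illustration: the helper names and the 0.8 threshold are assumptions chosen for the example, not values recommended by the service.

```python
def as_percent(confidence: float) -> str:
    """Render a 0-to-1 confidence score as a whole-number percentage string."""
    return f"{confidence:.0%}"


def confident_keywords(keywords: list, threshold: float = 0.8) -> list:
    """Return the text of every keyword whose confidence is at or above threshold."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    return [k["text"] for k in keywords if k["confidence"] >= threshold]


if __name__ == "__main__":
    sample = [
        {"text": "office insider", "confidence": 1},
        {"text": "insider tip", "confidence": 0.9975},
        {"text": "hypothetical low score", "confidence": 0.41},  # invented entry for illustration
    ]
    print(confident_keywords(sample))
    print(as_percent(0.82))
```

A stricter threshold trades recall for precision, so the right value depends on how the keywords are used downstream.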