How to customize language model for Azure Media Services API v3 (without using videoindexer.ai)

Taylor Ackley 21 Reputation points
2022-03-25T21:41:59.927+00:00

Hi there - I have been successfully using VideoIndexer.ai in production now for several months, but with a different encoding provider.

I am looking into using the Azure Media Services encoder v3, but I don't see a way to customize the language model for domain-specific words.

I don't see anything in the main REST API, and this article only covers videoindexer.ai. Is it possible to customize the Azure Media Encoder language model? Or do the insights/analytics need to stay a separate step, so that I keep doing what I am doing on videoindexer.ai?

Azure Media Services
A group of Azure services that includes encoding, format conversion, on-demand streaming, content protection, and live streaming services.

Accepted answer
  1. John Deutscher (MSFT) 2,126 Reputation points
    2022-03-25T22:06:34.013+00:00

    Not sure which client SDK language you are using, but I have a few examples of setting and overriding the language in an AMS Transform for Audio Analyzer here.

    https://github.com/Azure-Samples/media-services-v3-node-tutorials/blob/40514b6339c6cd9542a9cfdb8aa339da149aca3e/AudioAnalytics/index.ts#L84

    You can first create a transform and set it up with a specific default language. Then, if needed, you can use the presetOverride feature shown here to swap the language code on a per-job basis.
    https://github.com/Azure-Samples/media-services-v3-node-tutorials/blob/40514b6339c6cd9542a9cfdb8aa339da149aca3e/AudioAnalytics/index.ts#L131
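The two steps above (a transform with a default language, then a per-job override) can be sketched roughly as follows. This is a minimal sketch and not the code from the linked sample: it only builds plain objects shaped like the AMS v3 REST request bodies, and the helper names (`audioAnalyzerPreset`, `transformBody`, `jobBody`) and asset names are hypothetical. Note that this changes only the language code of the built-in model; it does not load a custom language model.

```typescript
// Sketch only: plain objects shaped like AMS v3 REST request bodies.
// Helper names and asset names are illustrative assumptions.

type AnalyzerMode = "Basic" | "Standard";

interface AudioAnalyzerPreset {
  "@odata.type": "#Microsoft.Media.AudioAnalyzerPreset";
  audioLanguage: string; // BCP-47 code such as "en-US"
  mode: AnalyzerMode;
}

// Build an Audio Analyzer preset for a given language.
function audioAnalyzerPreset(
  audioLanguage: string,
  mode: AnalyzerMode = "Basic"
): AudioAnalyzerPreset {
  return {
    "@odata.type": "#Microsoft.Media.AudioAnalyzerPreset",
    audioLanguage,
    mode,
  };
}

// Body for creating a transform whose single output uses a default language.
function transformBody(defaultLanguage: string) {
  return {
    properties: {
      outputs: [{ preset: audioAnalyzerPreset(defaultLanguage) }],
    },
  };
}

// Body for submitting a job. When languageOverride is given, the job output
// carries a presetOverride that replaces the transform's preset for this job.
function jobBody(
  inputAssetName: string,
  outputAssetName: string,
  languageOverride?: string
) {
  const output: Record<string, unknown> = {
    "@odata.type": "#Microsoft.Media.JobOutputAsset",
    assetName: outputAssetName,
  };
  if (languageOverride) {
    output.presetOverride = audioAnalyzerPreset(languageOverride);
  }
  return {
    properties: {
      input: {
        "@odata.type": "#Microsoft.Media.JobInputAsset",
        assetName: inputAssetName,
      },
      outputs: [output],
    },
  };
}

// Usage: default to en-US on the transform, override to es-ES for one job.
const transform = transformBody("en-US");
const spanishJob = jobBody("input-asset", "output-asset", "es-ES");
console.log(JSON.stringify(transform, null, 2));
console.log(JSON.stringify(spanishJob, null, 2));
```

Jobs submitted without an override keep the transform's default language, so the transform only needs to be created once.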

    Let me know if that is what you had in mind.

    UPDATE:
    As noted, AMS does not support defining or using custom speech models in processing. AMS only supports the built-in speech-to-text models provided by our speech services team.
    You can either continue to use the Video Indexer custom speech support, or you can look at integrating directly with the custom speech API here:
    https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/custom-speech-overview
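For the direct integration route, a rough sketch of how a request to a deployed custom speech endpoint might be addressed is shown below. It assumes the short-audio speech-to-text REST endpoint and that the deployed model is selected with a `cid` query parameter; both should be checked against the custom speech documentation linked above. The helper name, region, endpoint ID, and key are placeholders, not values from this thread.

```typescript
// Sketch only: builds the URL and headers for a speech-to-text request that
// targets a custom speech deployment. The `cid` parameter and URL shape are
// assumptions to verify against the Speech service documentation.

interface SpeechRequest {
  url: string;
  headers: Record<string, string>;
}

function customSpeechRequest(
  region: string,     // e.g. "westus" (placeholder)
  endpointId: string, // ID of the deployed custom model (placeholder)
  language: string,   // e.g. "en-US"
  subscriptionKey: string
): SpeechRequest {
  const base =
    `https://${region}.stt.speech.microsoft.com` +
    `/speech/recognition/conversation/cognitiveservices/v1`;
  const query =
    `cid=${encodeURIComponent(endpointId)}` +
    `&language=${encodeURIComponent(language)}`;
  return {
    url: `${base}?${query}`,
    headers: {
      "Ocp-Apim-Subscription-Key": subscriptionKey,
      "Content-Type": "audio/wav; codecs=audio/pcm; samplerate=16000",
    },
  };
}

// Usage with placeholder values; the audio bytes would be sent as the POST body.
const req = customSpeechRequest("westus", "my-endpoint-id", "en-US", "<key>");
console.log(req.url);
```

In this arrangement the audio extracted by the encoding step would be sent to the Speech service as a separate step, outside AMS.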

