How to customize language model for Azure Media Services API v3 (without using videoindexer.ai)

Question

Taylor Ackley 21 Reputation points

Mar 25, 2022, 9:41 PM

Hi there - I have been successfully using VideoIndexer.ai in production now for several months, but with a different encoding provider.

I am looking into using the Azure Media Services encoder v3, but I don't see a way to customize the language model for domain-specific words.

I don't see anything in the main REST API, and the article I found applies only to videoindexer.ai. Is it possible to customize the Azure Media Encoder language model? Or do the insights/analytics need to stay a separate step, so that I keep doing what I am doing on videoindexer.ai?

Accepted answer

Answer 1

John Deutscher (MSFT) 2,126 Reputation points

Mar 25, 2022, 10:06 PM

Not sure which client SDK language you are using, but I have a few examples of setting and overriding the language in an AMS Transform for Audio Analyzer here.

https://github.com/Azure-Samples/media-services-v3-node-tutorials/blob/40514b6339c6cd9542a9cfdb8aa339da149aca3e/AudioAnalytics/index.ts#L84

You can first create a transform and set it up with a specific default language. Then you can use the presetOverride feature shown here to swap the language code on a per-job basis if needed:
https://github.com/Azure-Samples/media-services-v3-node-tutorials/blob/40514b6339c6cd9542a9cfdb8aa339da149aca3e/AudioAnalytics/index.ts#L131
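The two steps described above can be sketched roughly as follows. This is only an illustration of the request bodies, not the linked sample itself: the local types mirror just the fields used here, and the helper names `audioPreset`, `buildTransform`, and `buildJob` are hypothetical. In a real project the equivalent types would come from the Media Services client SDK, and the bodies would be passed to its transform and job create calls. Note that this only switches between built-in languages; it does not add custom vocabulary.

```typescript
// Local stand-ins for the SDK types; only the fields used in this sketch.
type AudioAnalyzerPreset = {
  odataType: "#Microsoft.Media.AudioAnalyzerPreset";
  audioLanguage: string; // BCP-47 code such as "en-US"
  mode: "Standard" | "Basic";
};

type JobOutput = {
  odataType: "#Microsoft.Media.JobOutputAsset";
  assetName: string;
  presetOverride?: AudioAnalyzerPreset; // optional per-job replacement preset
};

type TransformBody = {
  description: string;
  outputs: { preset: AudioAnalyzerPreset }[];
};

type JobBody = {
  input: { odataType: "#Microsoft.Media.JobInputAsset"; assetName: string };
  outputs: JobOutput[];
};

// Hypothetical helper: an Audio Analyzer preset for one built-in language.
function audioPreset(
  language: string,
  mode: "Standard" | "Basic" = "Basic"
): AudioAnalyzerPreset {
  return {
    odataType: "#Microsoft.Media.AudioAnalyzerPreset",
    audioLanguage: language,
    mode,
  };
}

// Step 1: the transform carries the default language.
function buildTransform(defaultLanguage: string): TransformBody {
  return {
    description: `Audio analysis, default language ${defaultLanguage}`,
    outputs: [{ preset: audioPreset(defaultLanguage) }],
  };
}

// Step 2: a job may override the preset (and so the language) for one run.
function buildJob(
  inputAsset: string,
  outputAsset: string,
  languageOverride?: string
): JobBody {
  const output: JobOutput = {
    odataType: "#Microsoft.Media.JobOutputAsset",
    assetName: outputAsset,
  };
  if (languageOverride !== undefined) {
    output.presetOverride = audioPreset(languageOverride);
  }
  return {
    input: { odataType: "#Microsoft.Media.JobInputAsset", assetName: inputAsset },
    outputs: [output],
  };
}
```

Calling `buildJob("input-asset", "output-asset")` leaves the transform's default language in effect, while `buildJob("input-asset", "output-asset", "en-GB")` attaches an override for that job only. The asset names here are placeholders.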

Let me know if that is what you had in mind.

UPDATE:
As noted, AMS does not support defining and using custom speech models in processing. AMS only supports the built-in speech-to-text models provided by our speech services team.
You can either continue to use the Video Indexer custom speech support, or you can look at integrating directly with the Custom Speech API here:
https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/custom-speech-overview

Taylor Ackley 21 Reputation points

Mar 25, 2022, 10:36 PM

Thank you for the fast reply! I am looking more for how to customize the language model, as I can in videoindexer.ai. Let's say I work for a medical services company and I want to teach it the names of instruments and other specific healthcare terms. How can I provide feedback by uploading correct phrases, the way videoindexer.ai allows? I am looking for the equivalent API, or for how to do that in AMS without having to use the separate videoindexer.ai app.

ajkuma 28,116 Reputation points Microsoft Employee

Mar 28, 2022, 12:24 PM

TaylorAckley-4185, Thanks for the follow-up and additional info. Apologies for the delay from over the weekend.
We will get back to you on this shortly. // @John Deutscher (MSFT)

John Deutscher (MSFT) 2,126 Reputation points

Mar 28, 2022, 5:18 PM

I understand your scenario a bit better now, thanks for the clarification.

There is currently no support in AMS for custom speech model integration. Only Video Indexer provides the capability to train a custom speech model and apply it to your speech-to-text processing. For this situation, you should consider continuing to use the Video Indexer API directly or, if that does not meet your needs, look at using the Custom Speech service here:
https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/custom-speech-overview
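For the Custom Speech route, a minimal sketch of how a request to a deployed custom model might be assembled is shown below. It assumes the Speech service's speech-to-text REST endpoint for short audio, with the deployed model selected through a `cid` query parameter. The helper name `buildCustomSpeechRequest` is hypothetical, and the region, endpoint ID, and key are placeholders you would take from your own Speech resource. Check the linked documentation before relying on the exact URL shape or headers.

```typescript
// Shape of the pieces needed for an HTTP POST; the audio bytes go in the body.
type SpeechRequest = {
  url: string;
  headers: Record<string, string>;
};

// Hypothetical helper: builds the URL and headers for recognizing a short
// WAV clip against a deployed Custom Speech model.
// Assumption: the custom endpoint is addressed with ?cid=<endpointId>.
function buildCustomSpeechRequest(
  region: string,          // e.g. "westus", the region of the Speech resource
  endpointId: string,      // ID of the deployed custom model endpoint
  subscriptionKey: string  // key of the Speech resource
): SpeechRequest {
  const base =
    `https://${region}.stt.speech.microsoft.com` +
    "/speech/recognition/conversation/cognitiveservices/v1";
  return {
    url: `${base}?cid=${encodeURIComponent(endpointId)}`,
    headers: {
      "Ocp-Apim-Subscription-Key": subscriptionKey,
      // 16 kHz PCM WAV; adjust to match the audio actually sent.
      "Content-Type": "audio/wav; codecs=audio/pcm; samplerate=16000",
      Accept: "application/json",
    },
  };
}
```

The returned `url` and `headers` can be handed to any HTTP client along with the audio bytes. This style of request suits short clips; for full-length media files, batch transcription or the Speech SDK is likely the better fit, and the audio track would need to be extracted first as a separate step.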
