Azure transcription service details

Question

Andy Bell

We are using Azure batch transcription service. We are a bit unclear from the documentation and API requests which models are being used. Could anyone provide a bit more information on identifying model types from IDs, and whether some of these models are based on Whisper Large v2 (and if yes, for which model IDs)?

Anonymous

2025-12-17T18:20:04.4233333+00:00

Hi Andy Bell

Did you get a chance to review the above response?

Thank you!

Answer accepted by question author

Answer 1

Hi Andy Bell

Welcome to Microsoft Q&A, and thank you for posting your question here.

Azure does not publish any official mapping between Batch Transcription model IDs (GUIDs) and the underlying model architectures (e.g., Whisper Large v2). These IDs are internal service identifiers, and you cannot determine the model type from the GUID alone. [Foundry Mo...soft Learn | Learn.Microsoft.com]

How to Identify Whisper in Batch Transcription:

Because GUIDs cannot be decoded, you have to rely on behavioral indicators to confirm that a Batch Transcription job used Whisper. Whisper is a display‑only model, so the output contains display text only, with no lexical field. Whisper also uses the property displayFormWordLevelTimestampsEnabled, whereas standard STT models use wordLevelTimestampsEnabled. These behavioral differences are the reliable way to confirm that the job used Whisper. [Foundry Mo...soft Learn | Learn.Microsoft.com]

Another strong indicator is that punctuationMode is ignored for Whisper models: Microsoft explicitly documents that punctuationMode does not apply to Whisper. If changing punctuationMode has no effect on your transcription output, that points to the job having been processed using Whisper. [ai.azure.com]

Whisper Visibility in the Models API:

Whisper models do not always appear in the Models API list unless they have been explicitly enabled for your Speech resource and region. When enabled, Whisper appears with human‑readable IDs such as “whisper-base” and displayName “Whisper Base”, not as GUIDs. Absence from the list simply means Whisper is not enabled for that resource.
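As a rough illustration of the check described above, the sketch below filters an already-fetched Models API response for Whisper entries. It assumes the response has been parsed into a dict with a "values" list whose items carry "displayName" and "self" fields; those field names and the helper name find_whisper_models are assumptions made for this example, so verify them against the response your resource actually returns.

```python
from typing import Any


def find_whisper_models(models_response: dict[str, Any]) -> list[dict[str, Any]]:
    """Return the model entries whose name or URL mentions Whisper.

    Assumes (hypothetically) a parsed response shaped like
    {"values": [{"displayName": ..., "self": ...}, ...]}.
    """
    matches = []
    for model in models_response.get("values", []):
        name = str(model.get("displayName", "")).lower()
        ref = str(model.get("self", "")).lower()
        if "whisper" in name or "whisper" in ref:
            matches.append(model)
    return matches


# Illustrative data only; these are not real model entries.
sample_response = {
    "values": [
        {"displayName": "Example base model", "self": "https://example.invalid/models/base/1234"},
        {"displayName": "Whisper Base", "self": "https://example.invalid/models/base/whisper-base"},
    ]
}

whisper_models = find_whisper_models(sample_response)
print([m["displayName"] for m in whisper_models])
```

An empty result here would be consistent with Whisper not being enabled for that resource and region. It does not prove that Whisper is unavailable everywhere.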

Azure Uses Whisper Large v2 (Not v3):

Microsoft’s internal product guidance confirms that Azure Speech currently deploys Whisper Large v2, and Whisper Large v3 is not yet released for Azure Speech Batch Transcription. Therefore, any Batch Transcription job using Whisper today is using Whisper Large v2. [github.com]

Correct Interpretation

You cannot identify Whisper Large v2 from the GUID itself, because GUIDs are not descriptive. The correct method is:

Check for display‑only text (no lexical output).
Check whether the job uses displayFormWordLevelTimestampsEnabled.
Confirm punctuationMode has no effect.

If all three are true, the job was processed using Whisper, and since Azure uses Whisper Large v2, that is the model behind your job.
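The first two checks above can be sketched in a few lines of Python, assuming you have the parsed transcription result file and the properties you sent when creating the job. The result field names ("recognizedPhrases", "nBest", "lexical", "display") and the helper name whisper_indicators are assumptions for this example; compare them with your own result files before relying on it. The punctuationMode check is not automated here, because it requires comparing two jobs run with different settings.

```python
from typing import Any


def whisper_indicators(
    result: dict[str, Any], job_properties: dict[str, Any]
) -> dict[str, bool]:
    """Report two behavioral indicators for one transcription job.

    result: parsed result file (assumed shape, see the lead-in).
    job_properties: the "properties" object submitted with the job.
    """
    phrases = result.get("recognizedPhrases", [])
    candidates = [nb for phrase in phrases for nb in phrase.get("nBest", [])]

    # Display-only: every candidate has display text and no lexical text.
    display_only = bool(candidates) and all(
        bool(nb.get("display")) and not nb.get("lexical") for nb in candidates
    )

    # Whisper jobs are described as using this property for word timestamps.
    display_form_timestamps = bool(
        job_properties.get("displayFormWordLevelTimestampsEnabled")
    )

    return {
        "display_only": display_only,
        "display_form_timestamps": display_form_timestamps,
    }


# Illustrative data only.
whisper_like = {"recognizedPhrases": [{"nBest": [{"display": "Hello, world."}]}]}
standard_like = {
    "recognizedPhrases": [
        {"nBest": [{"lexical": "hello world", "display": "Hello, world."}]}
    ]
}

print(whisper_indicators(whisper_like, {"displayFormWordLevelTimestampsEnabled": True}))
print(whisper_indicators(standard_like, {"wordLevelTimestampsEnabled": True}))
```

If both flags come back true and changing punctuationMode makes no difference to the output, that matches the Whisper behavior described in this answer.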

I hope this helps. Do let me know if you have any further queries.

Thank you!
