How does Speech-to-Text billing work?

Jefferson 0 Reputation points
2025-04-10T08:43:05.4133333+00:00

Hi, we currently have three implementations using Microsoft AI Services (Cognitive Services). Fast Transcription and Batch Transcription run as background processes. However, a single endpoint and key are shared by Batch Transcription and the Speech SDK, which we use for front-end speech recognition.

We assumed that, because of this, SDK usage is also billed as Batch Transcription. Could you confirm whether that is correct? If it is not, is there any way to change how we use the SDK so that we are charged less than the real-time transcription rate?

I have read online that the SDK can be used with recognizeOnceAsync rather than startContinuousRecognitionAsync, and that it then behaves like Fast Transcription and is billed as such.

Could you please confirm this? Thanks!

Azure AI services
A group of Azure services, SDKs, and APIs designed to make apps more intelligent, engaging, and discoverable.

1 answer

  1. Abdelaziz Khajour 235 Reputation points Microsoft Employee
    2025-04-18T08:57:27.71+00:00

    Hi Jefferson,

    It's great to hear that you are utilizing Microsoft AI Services - Cognitive Services for your implementations.

    Based on the information you provided, it seems that you are concerned about the billing charges related to using the Microsoft SDK for front-end speech recognition.

    When you use the Speech SDK for front-end speech recognition, you are charged for the Speech-to-text recognition request. If you also use intent recognition with Conversational Language Understanding (CLU), you are additionally charged for the Language service request. Billing is based on the feature you call and your pricing tier, so sharing one endpoint and key with Batch Transcription does not make SDK recognition bill at the batch rate.

    If you are looking to reduce costs, note that recognizeOnceAsync and startContinuousRecognitionAsync both perform real-time recognition. The recognizeOnceAsync method is suitable for scenarios where you need a single recognition result and do not require continuous recognition: it stops after one utterance, so it may process less audio and result in lower charges than leaving continuous recognition running. It is not the same as Fast Transcription, which is a separate API for audio files, so it should not be assumed to bill at the Fast Transcription rate.

    To confirm the billing details and explore ways to optimize costs further, I recommend referring to the official Azure Cognitive Services pricing page for the latest information and details specific to your subscription level and usage patterns.
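
    As a rough illustration of how per-hour rates turn into a monthly estimate, here is a minimal Python sketch. It assumes the charge scales linearly with audio duration, and every rate in it is a placeholder, not an actual Azure price. The names `HYPOTHETICAL_RATES_PER_HOUR`, `estimate_cost`, and `estimate_monthly` are made up for this example; substitute the figures from the pricing page for your region and tier.

    ```python
    # Placeholder hourly rates in USD. These are NOT actual Azure prices;
    # replace them with the values from the pricing page.
    HYPOTHETICAL_RATES_PER_HOUR = {
        "real_time": 1.00,
        "fast": 0.50,
        "batch": 0.25,
    }


    def estimate_cost(audio_seconds, mode, rates=HYPOTHETICAL_RATES_PER_HOUR):
        """Estimate the charge for one mode, assuming cost is linear in audio duration."""
        if audio_seconds < 0:
            raise ValueError("audio_seconds must be non-negative")
        if mode not in rates:
            raise KeyError(f"unknown mode: {mode}")
        return audio_seconds / 3600 * rates[mode]


    def estimate_monthly(usage_seconds, rates=HYPOTHETICAL_RATES_PER_HOUR):
        """Map each mode to its estimated charge, rounded to cents."""
        return {
            mode: round(estimate_cost(seconds, mode, rates), 2)
            for mode, seconds in usage_seconds.items()
        }


    # Example: 10 hours of front-end (real-time) audio and 100 hours of batch audio.
    monthly = estimate_monthly({"real_time": 36_000, "batch": 360_000})
    print(monthly)  # {'real_time': 10.0, 'batch': 25.0} with the placeholder rates
    ```

    Comparing the per-mode totals this way shows where a saving would come from: moving audio to a cheaper feature, or sending less audio to the real-time one.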

    If you have any more questions or need further assistance, feel free to ask!

    Abdelaziz
