Pricing and Details of the Whisper Model (OpenAI)

Sakib Ali Choudhary 225 Reputation points
2023-10-20T16:44:41.46+00:00

I was looking at the Azure OpenAI functions and came across the Whisper model. I would like to know its pricing details, as I cannot find any documentation that states them.

I would also like to know which service I should use for the following use cases:

  1. Real-time speech to text and text to speech, as in a conversational chatbot.
  2. Customer call automation: would the Whisper model be helpful there? For example, a call chatbot where all the speech is converted into text, which we then process before returning a voice output.

Also, what is the average response time for Whisper to return a response?

While reading the docs, I could not work out which models are the Azure AI Speech models.

Eagerly waiting for your response.

Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.

Accepted answer
  1. Saurabh Sharma 23,851 Reputation points Microsoft Employee Moderator
    2023-10-24T22:14:16.5666667+00:00

    @Sakib Ali Choudhary

    Please find the pricing details below:

    • Azure Speech Batch w/Whisper model $0.36/hour
    • Whisper in Azure OpenAI Service $0.36/hour
    • 20% discount for 2,000 hours
    • 35% discount for 10,000 hours
    • 50% discount for 50,000 hours
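
    As a rough illustration of the arithmetic, here is a minimal Python sketch that estimates a bill from the figures above. It assumes that the discount for the highest tier reached applies to the whole volume, which the list above does not confirm, so please check the official pricing page for how the tiers are actually billed. The names `estimate_cost` and `DISCOUNT_TIERS` are hypothetical and used only for this example.

```python
# Rate taken from the pricing list above (USD per hour of audio).
BASE_RATE_USD_PER_HOUR = 0.36

# (minimum hours, discount fraction), highest tier first.
# ASSUMPTION: the discount of the highest tier reached applies to the
# entire volume. The list above does not say how tiers are billed.
DISCOUNT_TIERS = [
    (50_000, 0.50),
    (10_000, 0.35),
    (2_000, 0.20),
]


def estimate_cost(hours: float) -> float:
    """Return an estimated cost in USD for the given hours of audio."""
    if hours < 0:
        raise ValueError("hours must be non-negative")
    # Pick the first (largest) tier whose threshold is met, else no discount.
    discount = next(
        (rate for threshold, rate in DISCOUNT_TIERS if hours >= threshold),
        0.0,
    )
    return round(hours * BASE_RATE_USD_PER_HOUR * (1 - discount), 2)


if __name__ == "__main__":
    for hours in (100, 2_000, 10_000, 50_000):
        print(f"{hours} hours: ${estimate_cost(hours):,.2f}")
```

    Under these assumptions, 100 hours comes to about $36 and 2,000 hours to about $576.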

    Regarding your other questions:

    The choice between Azure OpenAI's Whisper model and Azure Cognitive Services Speech Services depends on your specific use case and requirements.

    • Whisper is optimized for transcribing audio files that contain speech in English, while Speech Services supports over 100 languages and dialects for speech to text and over 60 languages for speech translation.
    • On the other hand, Azure Cognitive Services Speech Services are designed for speech-related tasks such as speech-to-text, text-to-speech, speaker recognition, and translation. They provide APIs that can be used to build conversational chatbots that can interact with users via voice. The Speech Services are generally easier to use than the Whisper model, as they require less expertise in NLP and machine learning.
    • When using Whisper, the Azure OpenAI Service is recommended for fast processing of individual audio files, while the Azure AI Speech service is recommended for batch processing of large files, diarization, and word-level timestamps. For more information, see the documentation page "Whisper model via Azure AI Speech or via Azure OpenAI Service?"
    • Also, please note that Whisper is currently in preview and may not be available in all regions or have the same level of reliability, security, and privacy as Speech Services.
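
    To make the guidance above easier to apply, here is a small Python sketch that encodes it as a decision helper. It is only a summary of the points in this answer, not an official selection tool. The names `Needs` and `recommend_service` are hypothetical, and treating text-to-speech as a reason to pick Azure AI Speech is my reading of the second point above.

```python
from dataclasses import dataclass


@dataclass
class Needs:
    """Hypothetical description of what a project requires."""
    batch_large_files: bool = False
    diarization: bool = False
    word_level_timestamps: bool = False
    text_to_speech: bool = False


def recommend_service(needs: Needs) -> str:
    """Return the service suggested by the guidance in this answer."""
    # Batch processing of large files, diarization, and word-level
    # timestamps are the cases where Azure AI Speech is recommended.
    # Text-to-speech is listed as a Speech Services task (assumption:
    # that makes it a reason to choose Azure AI Speech).
    if (
        needs.batch_large_files
        or needs.diarization
        or needs.word_level_timestamps
        or needs.text_to_speech
    ):
        return "Azure AI Speech"
    # Fast processing of individual audio files.
    return "Azure OpenAI Service"


if __name__ == "__main__":
    # A call bot that must speak back to the caller needs text-to-speech.
    print(recommend_service(Needs(text_to_speech=True)))
    # A one-off transcription of a single short file.
    print(recommend_service(Needs()))
```

    For the call automation scenario in the question, the voice output requirement alone points toward Azure AI Speech in this sketch.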

    Overall, the choice between the Whisper model and the Speech Services depends on your specific use case. Please refer to the documentation page "Whisper model vs Azure AI Speech models" to understand which is appropriate for your scenario.

    Please let me know if you have any other questions.

    Thanks

    Saurabh


    Please 'Accept as answer' and Upvote if it helped so that it can help others in the community looking for help on similar topics.

    3 people found this answer helpful.