Pricing And Details Of Whisper Model (Open AI)

Question

Sakib Ali Choudhary 225

I was looking at the Azure OpenAI offerings and came across the Whisper model. I would like to know its pricing details, as I could not find any documentation that states them.

I would also like to know which service I should use for the following use case:

I want to do real-time speech to text and text to speech, as in a conversational chatbot.
We are automating customer calls. Would the Whisper model be helpful there? Think of a call chatbot where all speech is converted to text, which we then process to return a voice output.

Also, what is the average response time for Whisper to return a result?

And while looking through the docs, I could not work out which models are the Azure AI Speech models.

Eagerly waiting for your response.

Saurabh Sharma 23,851 Reputation points Microsoft Employee Moderator

2023-10-20T20:14:08.9233333+00:00

Hi @Sakib Ali Choudhary

Welcome to Microsoft Q&A! Thanks for posting the question.

I am checking internally on the pricing details of the Whisper model and will get back to you as soon as I have an update.

Regarding your use case, the Azure Cognitive Services Speech Services may be a better fit for your needs. The Speech Services provide real-time speech-to-text and text-to-speech capabilities, as well as other features such as speaker recognition and translation. You can use the Speech Services to build conversational chatbots that can interact with users via voice.

The Speech Services offer several different models, including the Speech-to-Text, Text-to-Speech, and Speech Translation models. You can choose the model that best fits your needs based on factors such as language support, accuracy, and performance.

The response time for the Whisper model depends on several factors, including the length and complexity of the input audio and the current load on the system. Note that Whisper transcribes audio files and does not offer real-time transcription, so for a live conversation the real-time Speech Services are the relevant option.

To get started with the Azure Cognitive Services Speech Services, you can visit the Azure Speech Services documentation page. This page provides an overview of the different models and features available, as well as tutorials and sample code to help you get started.
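As a rough illustration of the call-automation flow described in the question (speech to text, process the text, text to speech), here is a minimal sketch using the Azure Speech SDK for Python. The helper names `run_turn` and `make_speech_io` are hypothetical, and the sketch assumes the default microphone and speaker; a real call-center integration would feed the telephony audio stream instead.

```python
from typing import Callable, Tuple


def run_turn(listen: Callable[[], str],
             reply: Callable[[str], str],
             speak: Callable[[str], None]) -> str:
    """Run one conversational turn: hear the caller, build a reply, speak it."""
    heard = listen()        # speech to text
    answer = reply(heard)   # your own processing (bot logic, LLM call, ...)
    speak(answer)           # text to speech
    return answer


def make_speech_io(key: str, region: str) -> Tuple[Callable[[], str],
                                                   Callable[[str], None]]:
    """Return (listen, speak) callables backed by the Azure Speech SDK.

    Requires: pip install azure-cognitiveservices-speech
    ASSUMPTION: default microphone for input, default speaker for output.
    """
    import azure.cognitiveservices.speech as speechsdk

    config = speechsdk.SpeechConfig(subscription=key, region=region)
    recognizer = speechsdk.SpeechRecognizer(speech_config=config)
    synthesizer = speechsdk.SpeechSynthesizer(speech_config=config)

    def listen() -> str:
        # recognize_once_async returns after a single utterance.
        result = recognizer.recognize_once_async().get()
        if result.reason == speechsdk.ResultReason.RecognizedSpeech:
            return result.text
        return ""  # nothing recognized (silence, noise, or an error)

    def speak(text: str) -> None:
        # Blocks until the audio has been synthesized and played.
        synthesizer.speak_text_async(text).get()

    return listen, speak
```

Typical usage would be `listen, speak = make_speech_io(key, region)` followed by `run_turn(listen, my_bot, speak)` in a loop, where `my_bot` is whatever function turns the caller's text into a reply. Passing the three steps in as callables keeps the bot logic testable without audio hardware.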

Please let me know if you have any other questions.

Thanks

Saurabh

Sakib Ali Choudhary 225 Reputation points

2023-10-21T00:57:29.55+00:00

Hi,
Thanks for the response. I will be waiting for the pricing of the Whisper model. I also wanted to ask: in the scenarios above, should we use Whisper or the Speech Services, and what are the major differences between them, so that we can choose the optimal solution?

Thanks.

Sakib Ali Choudhary 225 Reputation points

2023-10-23T08:22:36.41+00:00

Hi Saurabh Sharma, I am eagerly waiting for your response.

Sakib Ali Choudhary 225 Reputation points

2023-10-25T15:46:59.25+00:00

Thanks for the answer. I'll trouble you one more time. Could you please point me to some real-time API documentation for the Whisper model?

Saurabh Sharma 23,851 Reputation points Microsoft Employee Moderator

2023-10-25T17:25:20.98+00:00

@Sakib Ali Choudhary

As mentioned in the comparison chart, real-time transcription is not available for the Whisper model.

For real-time speech to text with the Azure AI Speech models, please refer to Real-time speech to text.

For Whisper model REST requests, please refer to:

Request a speech to text transcription

Speech to text with the Azure OpenAI Whisper model
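For orientation, a Whisper REST request through Azure OpenAI could look roughly like the sketch below. It assumes you already have an Azure OpenAI resource with a Whisper deployment, that the `2023-09-01-preview` API version is the one your resource accepts (check the linked docs for the current value), and that the third-party `requests` package is installed. The function names are hypothetical.

```python
def build_transcription_url(endpoint: str, deployment: str,
                            api_version: str = "2023-09-01-preview") -> str:
    """Build the audio transcription URL for a Whisper deployment.

    ASSUMPTION: the default api_version is only an example value.
    """
    return (f"{endpoint.rstrip('/')}/openai/deployments/{deployment}"
            f"/audio/transcriptions?api-version={api_version}")


def transcribe_file(endpoint: str, deployment: str, api_key: str,
                    audio_path: str) -> str:
    """Upload one complete audio file and return the transcribed text.

    This is a file-based request, not a streaming one.
    """
    import requests  # third-party: pip install requests

    url = build_transcription_url(endpoint, deployment)
    with open(audio_path, "rb") as audio:
        response = requests.post(
            url,
            headers={"api-key": api_key},
            files={"file": audio},  # sent as multipart/form-data
            timeout=300,
        )
    response.raise_for_status()
    # The default JSON response carries the transcript in the "text" field.
    return response.json()["text"]
```

You would call it as `transcribe_file("https://<your-resource>.openai.azure.com", "<your-deployment>", "<your-key>", "call.wav")`. Because the whole file is uploaded before any text comes back, this pattern suits recorded calls rather than live conversations.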

Thanks

Saurabh

Accepted answer

Answer 1

@Sakib Ali Choudhary

Please find the pricing details below:

Azure Speech Batch with Whisper model: $0.36/hour
Whisper in Azure OpenAI Service: $0.36/hour
20% discount for 2,000 hours
35% discount for 10,000 hours
50% discount for 50,000 hours
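To get a feel for what these numbers mean for a given volume, here is a small worked estimate. It is only a sketch: it assumes the discount of the highest tier reached applies to every hour, which the figures above do not spell out, so please confirm the exact tier mechanics on the pricing page. The function name `estimate_cost` is hypothetical.

```python
# Base rate and discount tiers as quoted in the pricing details above.
BASE_RATE_PER_HOUR = 0.36  # USD per audio hour
# (minimum hours, discount fraction), highest tier first.
DISCOUNT_TIERS = [(50_000, 0.50), (10_000, 0.35), (2_000, 0.20)]


def estimate_cost(hours: float) -> float:
    """Estimate the cost in USD for a number of audio hours.

    ASSUMPTION: the discount of the highest tier reached applies to
    all hours. The quoted pricing does not state how tiers are applied.
    """
    if hours < 0:
        raise ValueError("hours must be non-negative")
    discount = 0.0
    for min_hours, tier_discount in DISCOUNT_TIERS:
        if hours >= min_hours:
            discount = tier_discount
            break
    return round(hours * BASE_RATE_PER_HOUR * (1 - discount), 2)


if __name__ == "__main__":
    for h in (100, 2_000, 10_000, 50_000):
        print(f"{h:>6} hours: ${estimate_cost(h):,.2f}")
```

Under that assumption, 100 hours cost 100 × 0.36 = $36, and 2,000 hours cost 2,000 × 0.36 × 0.80 = $576.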

Regarding your other question:

The choice between Azure OpenAI's Whisper model and Azure Cognitive Services Speech Services depends on your specific use case and requirements.

Whisper is optimized for transcribing audio files that contain speech in English, while Speech Services supports over 100 languages and dialects for speech to text and over 60 languages for speech translation.
Azure Cognitive Services Speech Services are designed for speech-related tasks such as speech to text, text to speech, speaker recognition, and translation. They provide APIs that can be used to build conversational chatbots that interact with users via voice. The Speech Services are generally easier to use than the Whisper model, as they require less expertise in NLP and machine learning.
The Azure OpenAI Service is recommended for fast processing of individual audio files, while the Azure AI Speech service is recommended for batch processing of large files, diarization, and word level timestamps. For more information, see Whisper model via Azure AI Speech or via Azure OpenAI Service?
Also, please note that Whisper is currently in preview and may not be available in all regions or have the same level of reliability, security, and privacy as Speech Services.
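The guidance in this thread can be summarized as a simple decision rule. The sketch below only encodes the recommendations stated here (no real-time transcription with Whisper; Azure AI Speech for large batches, diarization, and word-level timestamps; Azure OpenAI for fast processing of individual files). The function name and its parameters are hypothetical, and other factors such as language support and region availability are not covered.

```python
def recommend_service(real_time: bool = False,
                      needs_diarization: bool = False,
                      needs_word_timestamps: bool = False,
                      large_batch: bool = False) -> str:
    """Map a few requirements to the service suggested in this thread."""
    if real_time:
        # Whisper does not offer real-time transcription.
        return "Azure AI Speech (real-time speech to text)"
    if needs_diarization or needs_word_timestamps or large_batch:
        # Batch processing of large files, diarization, word-level timestamps.
        return "Whisper model via Azure AI Speech (batch transcription)"
    # Fast processing of individual audio files.
    return "Whisper model via Azure OpenAI Service"


if __name__ == "__main__":
    # The customer-call chatbot from the question needs real-time audio.
    print(recommend_service(real_time=True))
```

For the customer-call chatbot in the question, the real-time requirement decides it before any of the other options are considered.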

Overall, the choice between the Whisper model and the Speech Services depends on your specific use case. Please refer to the documentation - Whisper model vs Azure AI Speech models to understand which is appropriate for your scenario.

Please let me know if you have any other questions.

Thanks

Saurabh

Please 'Accept as answer' and Upvote if it helped so that it can help others in the community looking for help on similar topics.
