Please find the pricing details below:
- Azure Speech Batch with Whisper model $0.36/hour
- Whisper in Azure OpenAI Service $0.36/hour
- 20% discount for 2,000 hours
- 35% discount for 10,000 hours
- 50% discount for 50,000 hours
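To get a rough feel for what these numbers mean for your volume, here is a minimal Python sketch. It assumes the discount applies to the whole volume once you reach a tier, which the list above does not spell out, so please confirm the exact terms before budgeting. The function names are only illustrative.

```python
# Base rate and discount tiers taken from the list above.
BASE_RATE_USD_PER_HOUR = 0.36

# (minimum hours, discount fraction), highest tier first.
DISCOUNT_TIERS = [
    (50_000, 0.50),
    (10_000, 0.35),
    (2_000, 0.20),
]


def discount_for(hours: float) -> float:
    """Return the discount fraction for a given number of audio hours."""
    for min_hours, discount in DISCOUNT_TIERS:
        if hours >= min_hours:
            return discount
    return 0.0


def estimated_cost(hours: float) -> float:
    """Estimate cost in USD.

    ASSUMPTION: the discount applies to all hours once a tier is reached
    (not only to the hours above the threshold).
    """
    if hours < 0:
        raise ValueError("hours must be non-negative")
    rate = BASE_RATE_USD_PER_HOUR * (1 - discount_for(hours))
    return round(hours * rate, 2)


if __name__ == "__main__":
    for h in (1_000, 2_000, 10_000, 50_000):
        print(f"{h} hours: ${estimated_cost(h):,.2f}")
```

Under that assumption, 2,000 hours works out to 2,000 x $0.36 x 0.80 = $576, compared with $720 at the undiscounted rate.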
Regarding your other question:
The choice between Azure OpenAI's Whisper model and Azure Cognitive Services Speech Services depends on your specific use case and requirements.
- Whisper is optimized for transcribing audio files that contain speech in English, while Speech Services supports over 100 languages and dialects for speech to text and over 60 languages for speech translation.
- Azure Cognitive Services Speech Services are designed for speech-related tasks such as speech to text, text to speech, speaker recognition, and translation. They provide APIs that can be used to build conversational chatbots that interact with users by voice. They are also generally easier to use than the Whisper model, as they require less expertise in NLP and machine learning.
- The Azure OpenAI Service is recommended for fast processing of individual audio files, while the Azure AI Speech service is recommended for batch processing of large files, diarization, and word-level timestamps. For more information, see the documentation page "Whisper model via Azure AI Speech or via Azure OpenAI Service?".
- Also, please note that Whisper is currently in preview and may not be available in all regions or have the same level of reliability, security, and privacy as Speech Services.
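If it helps, the recommendation in the points above can be written down as a small decision helper. This is only a sketch of that guidance, not an official selection tool; the class, field, and function names are made up for illustration, and it looks only at the three features mentioned above (not language coverage, region availability, or preview status).

```python
from dataclasses import dataclass


@dataclass
class TranscriptionNeeds:
    """Hypothetical description of a workload (names are illustrative)."""
    batch_of_large_files: bool = False
    diarization: bool = False
    word_level_timestamps: bool = False


def recommend_service(needs: TranscriptionNeeds) -> str:
    """Map the guidance above to a recommendation string.

    Azure AI Speech is suggested when any of the batch, diarization, or
    word-level timestamp requirements apply; otherwise Azure OpenAI
    Service is suggested for fast processing of individual files.
    """
    if (
        needs.batch_of_large_files
        or needs.diarization
        or needs.word_level_timestamps
    ):
        return "Azure AI Speech"
    return "Azure OpenAI Service"


if __name__ == "__main__":
    print(recommend_service(TranscriptionNeeds()))
    print(recommend_service(TranscriptionNeeds(diarization=True)))
```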
Overall, the choice between the Whisper model and the Speech Services depends on your specific use case. Please refer to the documentation page "Whisper model vs Azure AI Speech models" to understand which is appropriate for your scenario.
Please let me know if you have any other questions.
Thanks
Saurabh