What is batch transcription?


New pricing is in effect for batch transcription via Speech to text REST API v3.2. For more information, see the pricing guide.

Batch transcription is used to transcribe a large amount of audio data in storage. Both the Speech to text REST API and Speech CLI support batch transcription.

You should provide multiple files per request or point to an Azure Blob Storage container that holds the audio files to transcribe. The batch transcription service can handle a large number of submitted transcriptions. The service transcribes the files concurrently, which reduces the turnaround time.
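The two ways of supplying audio can be sketched as a small helper that builds the request body. This is a minimal illustration, not the service's official client: the helper name `build_transcription_body` is hypothetical, and the field names `contentUrls`, `contentContainerUrl`, `displayName`, and `locale` are assumed from the v3.2 REST API, so check them against the API reference before relying on them.

```python
def build_transcription_body(display_name, locale, content_urls=None, container_url=None):
    """Build a JSON-serializable body for a create-transcription request.

    Hypothetical helper. Field names are assumed from the v3.2 REST API.
    """
    # Exactly one audio source is expected: a list of file URIs or one container URI.
    if (content_urls is None) == (container_url is None):
        raise ValueError("Provide either content_urls or container_url, not both or neither.")

    body = {"displayName": display_name, "locale": locale}
    if content_urls is not None:
        urls = list(content_urls)
        if not urls:
            raise ValueError("content_urls must contain at least one URI.")
        # Several audio files in a single request.
        body["contentUrls"] = urls
    else:
        # A whole Azure Blob Storage container (typically a SAS URI).
        body["contentContainerUrl"] = container_url
    return body


if __name__ == "__main__":
    # Placeholder URIs for illustration only.
    print(build_transcription_body(
        "My batch", "en-US",
        content_urls=["https://example.com/a.wav", "https://example.com/b.wav"],
    ))
```

Rejecting a request that names both sources (or neither) keeps the intent of each job unambiguous before anything is sent to the service.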

How does it work?

With batch transcription, you submit the audio data and then retrieve the transcription results asynchronously. The service transcribes the audio data and stores the results in a storage container, from which you can then retrieve them.


For a low-code or no-code solution, you can use the Batch Speech to text Connector in Power Platform applications such as Power Automate, Power Apps, and Logic Apps. See the Power Automate batch transcription guide to get started.

To use the batch transcription REST API:

  1. Locate audio files for batch transcription - You can upload your own data or use existing audio files via a public URI or a shared access signature (SAS) URI.
  2. Create a batch transcription - Submit the transcription job with parameters such as the audio files, the transcription language, and the transcription model.
  3. Get batch transcription results - Check transcription status and retrieve transcription results asynchronously.
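The "create" step above can be sketched with Python's standard library. Treat this as an illustration under assumptions: the endpoint path `/speechtotext/v3.2/transcriptions`, the `Ocp-Apim-Subscription-Key` header, and the body fields are taken from the v3.2 REST API as commonly documented, and the function names, region, key, and audio URI are placeholders.

```python
import json
import urllib.request


def create_transcription_request(region, key, body):
    """Prepare (but do not send) the POST request that creates a batch transcription.

    Hypothetical helper. Endpoint and header names are assumed from the v3.2 REST API.
    """
    url = f"https://{region}.api.cognitive.microsoft.com/speechtotext/v3.2/transcriptions"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/json",
        },
    )


def submit(request):
    """Send the request and return the parsed JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    request = create_transcription_request(
        "eastus",               # placeholder region
        "YOUR_SPEECH_KEY",      # placeholder key
        {
            "displayName": "My batch",
            "locale": "en-US",
            "contentUrls": ["https://example.com/audio.wav"],  # placeholder URI
        },
    )
    # submit(request) would perform the network call; it is left out here
    # because it needs a real key and reachable audio.
    print(request.get_method(), request.full_url)
```

Keeping request construction separate from sending makes the payload easy to inspect or log before a job is actually submitted.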


Batch transcription jobs are scheduled on a best-effort basis. At peak hours, it can take 30 minutes or longer for a transcription job to start processing. See how to check the current status of a batch transcription job in this section.
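Because a job may wait a long time before it starts, a client usually polls with a generous timeout and increasing delays rather than a tight loop. The sketch below shows one way to do that. It assumes the status values `NotStarted`, `Running`, `Succeeded`, and `Failed`, and the `get_status` callable is a hypothetical stand-in for whatever code fetches the job and reads its status.

```python
import time

# Assumed terminal status values for a batch transcription job.
TERMINAL_STATUSES = {"Succeeded", "Failed"}


def wait_for_transcription(get_status, timeout_s=3600.0, initial_delay_s=5.0,
                           max_delay_s=60.0, sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until the job finishes or timeout_s elapses.

    get_status is a caller-supplied function that returns the current status string.
    The delay doubles after each poll, capped at max_delay_s.
    """
    deadline = clock() + timeout_s
    delay = initial_delay_s
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"Job still '{status}' after {timeout_s} seconds.")
        sleep(delay)
        delay = min(delay * 2, max_delay_s)


if __name__ == "__main__":
    # Simulated status sequence; a real get_status would query the service.
    statuses = iter(["NotStarted", "NotStarted", "Running", "Succeeded"])
    print(wait_for_transcription(lambda: next(statuses), sleep=lambda _: None))
```

The default timeout of one hour is an arbitrary choice that leaves room for a delayed start at peak hours; adjust it to the workload.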

Next steps