Batch Transcription API – Excessive Processing Time (24+ hours for 2h 24m audio)


Yoichi NARUSE 0

We are experiencing an unexpected and significant delay with the Azure Cognitive Services Batch Transcription API.

Issue:

An audio file of approximately 2 hours and 24 minutes in length has been submitted for batch transcription and remains in a processing state after more than 24 hours, with no completion or error returned.

Expected Behavior:

Last month, the exact same audio file (same duration, same format, same configuration) completed transcription in approximately 1 hour.

Impact:

This regression in processing time is blocking our production workflow and affecting end users who depend on timely transcription results.

Details:

| Item | Value |
|---|---|
| Service | Azure Cognitive Services – Batch Transcription API |
| Audio duration | 2 hours 24 minutes |
| Current processing time | 24+ hours (still in progress) |
| Expected processing time | ~1 hour (based on last month's run) |
| Audio file | Same file, same format, same configuration as previous run |

Questions:

Is there currently any known degradation or throttling affecting the Batch Transcription API in our region?
Has there been a change in resource allocation or queue prioritization that could explain this regression?
What SLA applies to batch transcription jobs of this duration?

Please advise on the root cause.

Anshika Varshney 13,320 Reputation points Microsoft External Staff Moderator

2026-05-22T07:47:44.64+00:00

Hi @Yoichi NARUSE

Thanks for your question. This behavior with batch transcription taking a long time is actually expected in many cases, so let me explain it in a simple way and also share what you can check.

First, understand how batch transcription works:

Batch transcription is asynchronous. When you submit a job, it does not start immediately or run in real time. The service places the job in a queue and processes it as capacity becomes available.

Because of this, there can be delays before processing even begins.

Now, regarding your concern about the long processing time:

There are a few common reasons why you may see a long processing duration.

Job queue and service load: The service uses best-effort scheduling. During peak usage, your job may wait in the queue before it starts.

It can take up to 30 minutes just to start processing and, in some cases, up to several hours to complete, depending on workload. [learn.microsoft.com]

Size and number of audio files: Processing time depends on the audio length, the number of files, and the total size of the data.

Larger or multiple files naturally take more time, and even similar jobs can take different amounts of time depending on load. [docs.azure.cn]

Submitting too many jobs quickly: If you submit many jobs at once, they will not all run in parallel immediately. The system processes jobs in sequence or with limited parallel capacity, so the remaining jobs wait. [learn.microsoft.com]

Audio quality and format: Poor-quality audio or certain formats may take longer to process because the system needs more internal retries or transformations. This can slow down the job.

Here are some practical troubleshooting steps you can try:

Check job status properly: Instead of checking very frequently, monitor the status at intervals. Jobs may stay in the Not Started or Running state for some time before completing.
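
The interval-based monitoring described above can be sketched as a polling loop with exponential backoff. This is a minimal sketch, not the service's own client: `poll_until_done` and its `get_status` callable are hypothetical names, and the status strings `NotStarted`, `Running`, `Succeeded`, and `Failed` are assumed to match what the transcription status endpoint returns.

```python
import time
from typing import Callable

# Assumed terminal status strings; verify against the API version you use.
TERMINAL_STATES = {"Succeeded", "Failed"}


def poll_until_done(
    get_status: Callable[[], str],
    initial_delay: float = 30.0,
    max_delay: float = 600.0,
    timeout: float = 24 * 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll a job until it reaches a terminal state.

    get_status is a hypothetical callable that returns the current
    status string of one transcription job.
    """
    delay = initial_delay
    waited = 0.0
    while True:
        status = get_status()
        if status in TERMINAL_STATES:
            return status
        if waited >= timeout:
            raise TimeoutError(f"job still '{status}' after {waited:.0f} s")
        sleep(delay)
        waited += delay
        # Double the wait each round, capped at max_delay.
        delay = min(delay * 2, max_delay)
```

Raising `TimeoutError` instead of waiting forever gives the caller a clear point at which to escalate, for example by opening a support ticket.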

Submit multiple files in one request: Instead of sending many small jobs, try combining the files in a single batch request. This helps the service process them more efficiently.
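
As a rough illustration of combining files, the sketch below builds one request body that lists several audio URLs. The helper `build_batch_request` is hypothetical, and the field names `contentUrls`, `locale`, and `displayName` are assumptions based on the v3.x Speech to text REST API; check them against the API version you call.

```python
import json
from typing import Dict, List


def build_batch_request(
    content_urls: List[str], locale: str, display_name: str
) -> Dict[str, object]:
    """Build one transcription request body covering several audio files."""
    if not content_urls:
        raise ValueError("at least one audio URL is required")
    return {
        "contentUrls": list(content_urls),  # assumed field name
        "locale": locale,                   # assumed field name
        "displayName": display_name,        # assumed field name
    }
```

The resulting dictionary can be serialized with `json.dumps` and sent as the body of the job creation request.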

Distribute your requests over time: Do not submit all jobs at once. Spread them out to avoid queue delays.
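
One simple way to spread submissions is to assign each job a planned submit time at a fixed spacing. This is only a sketch: `stagger_submissions` is a hypothetical helper, and the right interval depends on your own volume and on the service's limits.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def stagger_submissions(
    job_names: List[str], start: datetime, interval: timedelta
) -> List[Tuple[str, datetime]]:
    """Return (job name, planned submit time) pairs spaced by interval."""
    return [(name, start + i * interval) for i, name in enumerate(job_names)]
```

For example, three jobs starting at 08:00 with a 10-minute interval would be planned for 08:00, 08:10, and 08:20.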

Reduce file size if possible: If you are using very large audio files, try splitting them into smaller chunks. This can sometimes improve overall completion time.
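
To illustrate the splitting step, the sketch below computes chunk boundaries for a recording of a given length. The helper `chunk_boundaries` and the 30-minute chunk size are assumptions for illustration. Cutting at fixed offsets can split a word in half, so a real pipeline would probably cut at silences instead.

```python
from typing import List, Tuple


def chunk_boundaries(total_seconds: int, chunk_seconds: int) -> List[Tuple[int, int]]:
    """Return (start, end) offsets in seconds that cover the whole recording."""
    if total_seconds <= 0 or chunk_seconds <= 0:
        raise ValueError("durations must be positive")
    return [
        (start, min(start + chunk_seconds, total_seconds))
        for start in range(0, total_seconds, chunk_seconds)
    ]


# A 2 h 24 min file is 8640 s; 30-minute chunks give four full chunks
# plus a final 24-minute chunk.
chunks = chunk_boundaries(8640, 1800)
```

The offsets can then be passed to whatever audio tool you use to cut the file.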

Use fast transcription if applicable: If your files are smaller and you need faster results, consider the fast transcription API, which is designed for quicker responses. [docs.azure.cn]

Keep expectations aligned: Batch transcription is designed for large-scale processing, not for immediate responses, so delays of minutes to hours can happen depending on load.

In short, this is usually expected behavior caused by queueing, workload, and data size rather than a fault. Try optimizing how jobs are submitted and monitored to improve the overall experience.

I hope this helps. Do let me know if you have any further queries.

Thank you!
Anshika Varshney 13,320 Reputation points Microsoft External Staff Moderator

2026-05-25T17:31:34.9966667+00:00

Hi @Yoichi NARUSE

Did you get a chance to review the response?

Thank you!
Anshika Varshney 13,320 Reputation points Microsoft External Staff Moderator

2026-05-26T15:52:16.1433333+00:00

Hi @Yoichi NARUSE

Just checking back to see if you’re still facing the same issue. If the problem persists, please share a few more details and we’ll be happy to help you further.

Thank you!

2 answers


Answer 1

Hey Yoichi NARUSE

Sorry for the delay in response.

Ideally, optimizing the following would help reduce the overall processing time for batch transcription:

Audio file size
Batch size
Load balancing across multiple Speech resources
Benchmarking other available batch transcription models (MAI transcribe, Whisper, etc. from the model catalogue)
Concurrency and asynchronous processing
Polling and retrying, and tuning job configuration parameters such as TTL (time to live)

References:

https://learn.microsoft.com/en-us/azure/ai-services/speech-service/batch-transcription#best-practices-for-improving-performance (for the default Speech service)

https://microsoft.ai/news/today-were-announcing-3-new-world-class-mai-models-available-in-foundry/ (newer in-house MAI models for benchmarking)

I am reviewing closed support and engineering tickets on the same issue.

If the issue persists, could you please create a support ticket using the consent link shared by private message?

Thank you for your input on the forum.

Answer 2

kagiyama yutaka 3,605

I think Batch Transcription has no per-job SLA and exposes no queue data, so 24h+ delays cannot be diagnosed externally. Check Azure Service Health and open a support ticket with the job ID.
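
When opening the ticket, it may help to attach a short summary of the stuck job. The sketch below is one possible shape, not an official format: `summarize_job` is a hypothetical helper, and the field names `self`, `status`, and `createdDateTime` are assumed to match the JSON that the transcription status endpoint returns.

```python
from datetime import datetime, timezone
from typing import Dict


def summarize_job(job: Dict[str, str], now: datetime) -> Dict[str, object]:
    """Collect the details a support engineer is likely to ask for.

    job is assumed to be the parsed JSON of one transcription resource.
    """
    # Accept a trailing "Z" so the timestamp parses on older Python versions.
    created = datetime.fromisoformat(job["createdDateTime"].replace("Z", "+00:00"))
    elapsed_hours = (now - created).total_seconds() / 3600
    return {
        "job_url": job["self"],  # the job ID is the last path segment
        "status": job["status"],
        "created_utc": created.astimezone(timezone.utc).isoformat(),
        "elapsed_hours": round(elapsed_hours, 1),
    }
```

Passing `datetime.now(timezone.utc)` as `now` gives the elapsed time at the moment the ticket is filed.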
