Batch Transcription API – Excessive Processing Time (24+ hours for 2h 24m audio)

Yoichi NARUSE 0 Reputation points
2026-05-22T04:35:09.2533333+00:00

We are experiencing an unexpected and significant delay with the Azure Cognitive Services Batch Transcription API.

Issue:

An audio file of approximately 2 hours and 24 minutes in length has been submitted for batch transcription and remains in a processing state after more than 24 hours, with no completion or error returned.

Expected Behavior:

Last month, the exact same audio file (same duration, same format, same configuration) completed transcription in approximately 1 hour.

Impact:

This regression in processing time is blocking our production workflow and affecting end users who depend on timely transcription results.

Details:

| Item | Value |
|---|---|
| Service | Azure Cognitive Services – Batch Transcription API |
| Audio duration | 2 hours 24 minutes |
| Current processing time | 24+ hours (still in progress) |
| Expected processing time | ~1 hour (based on last month's run) |
| Audio file | Same file, same format, same configuration as previous run |

Questions:

  1. Is there currently any known degradation or throttling affecting the Batch Transcription API in our region?
  2. Has there been a change in resource allocation or queue prioritization that could explain this regression?
  3. What SLA applies to batch transcription jobs of this duration?

Please advise on the root cause.

Azure Speech in Foundry Tools

2 answers

  1. Manas Mohanty 17,185 Reputation points Microsoft External Staff Moderator
    2026-06-04T07:17:28.4633333+00:00

    Hi Yoichi NARUSE,

    Sorry for the delayed response.

    Optimizing the following should help reduce the overall processing time for batch transcription:

    1. Audio file size
    2. Batch size
    3. Load balancing across multiple Speech resources
    4. Benchmarking other available batch transcription models (MAI transcribe, Whisper, etc. from the model catalog)
    5. Concurrency and asynchronous processing
    6. Polling and retrying, and tuning job configuration parameters such as TTL (time to live)
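As a rough illustration of the "polling and retrying" item above, here is a minimal sketch of polling a job with exponential backoff and an overall deadline. It does not call any Azure API: `fetch_status` is a hypothetical callable you would supply (for example, a wrapper around whatever request returns the job's status), the `"TimedOut"` return value is a sentinel invented for this sketch, and the delay and deadline defaults are arbitrary assumptions, not service recommendations.

```python
import time
from typing import Callable

# States after which the job will not change again.
TERMINAL_STATES = {"Succeeded", "Failed"}


def poll_until_done(
    fetch_status: Callable[[], str],   # hypothetical: returns the job's current status
    max_wait_s: float = 6 * 3600,      # assumed overall deadline (time spent sleeping)
    initial_delay_s: float = 30.0,     # assumed first wait between polls
    max_delay_s: float = 600.0,        # assumed cap on the wait between polls
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll with exponential backoff until a terminal state or the deadline."""
    waited = 0.0
    delay = initial_delay_s
    while True:
        status = fetch_status()
        if status in TERMINAL_STATES:
            return status
        if waited >= max_wait_s:
            # Sentinel for this sketch only; the caller decides whether to
            # resubmit the job or escalate to support.
            return "TimedOut"
        # Never sleep past the deadline.
        step = min(delay, max_wait_s - waited)
        sleep(step)
        waited += step
        # Double the wait each round, up to the cap.
        delay = min(delay * 2, max_delay_s)
```

With a deadline in place, a job that stays in a non-terminal state (as in the question) surfaces as `"TimedOut"` instead of blocking the workflow indefinitely.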

    References:

    https://learn.microsoft.com/en-us/azure/ai-services/speech-service/batch-transcription#best-practices-for-improving-performance (for the default Speech service)

    https://microsoft.ai/news/today-were-announcing-3-new-world-class-mai-models-available-in-foundry/ (newer in-house MAI models for benchmarking)

    I am reviewing closed support and engineering tickets on the same topic.

    If the issue persists, could you please create a support ticket using the consent link shared by private message?

    Thank you for your input on the forum.

  2. kagiyama yutaka 3,605 Reputation points
    2026-05-22T08:41:32.15+00:00

    I believe Batch Transcription has no per-job SLA and exposes no queue data, so a 24h+ delay cannot be diagnosed externally. Check Azure Service Health and open a support ticket that includes the job ID.
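To make the support ticket concrete, it may help to state the slowdown against the earlier run. The sketch below computes that from a submission time and a baseline. The function name `stall_report`, the `factor` threshold of 3, and the timestamps and job ID in the usage lines are all illustrative assumptions; only the 1-hour baseline and the roughly 24-hour wait come from the question.

```python
from datetime import datetime, timedelta, timezone


def stall_report(job_id: str, submitted_at: datetime, now: datetime,
                 baseline: timedelta, factor: float = 3.0) -> dict:
    """Summarize how far a job has overrun its baseline processing time."""
    elapsed = now - submitted_at
    slowdown = elapsed / baseline  # timedelta / timedelta yields a float
    return {
        "job_id": job_id,
        "elapsed_hours": round(elapsed.total_seconds() / 3600, 2),
        "slowdown": round(slowdown, 1),
        # Assumed rule of thumb: flag once the job takes `factor` times the baseline.
        "stalled": slowdown >= factor,
    }


# Illustrative values only: placeholder job ID and timestamps.
report = stall_report(
    job_id="example-job-id",
    submitted_at=datetime(2026, 5, 21, 4, 0, tzinfo=timezone.utc),
    now=datetime(2026, 5, 22, 4, 0, tzinfo=timezone.utc),
    baseline=timedelta(hours=1),
)
print(report)
```

With these placeholder timestamps, 24 hours elapsed against a 1-hour baseline is a 24x slowdown, which is the figure worth quoting alongside the job ID.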
