Azure OpenAI batch API failing to process all files without error notifications

Chen Chen 15 Reputation points
2025-03-15T02:46:04.5933333+00:00

Hi there, I'm experiencing an issue with the Azure OpenAI batch API. When submitting multiple files for processing, the API doesn't complete all the files in the batch. And there are no error notifications or warnings when this happens - the process simply completes without processing all files. Is this a known issue with the Azure OpenAI batch API?

I'm wondering if there's a status page available where I can follow these types of issues and check if there are any known service disruptions?

Thanks

Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.

2 answers

  1. Chakaravarthi Rangarajan Bhargavi 1,115 Reputation points MVP
    2025-03-16T04:28:39.2366667+00:00

    Hi Chen Chen,

    Welcome to the Microsoft Q&A forum! Thanks for your question. 😊

    The Azure OpenAI Batch API failing to process every file in a batch, without any error notification, can have several causes. Could you confirm whether you have already tried the steps below? That will help narrow down the issue.

    Check Batch Job Status for Each File

    • The batch API processes files asynchronously. Ensure you're checking the status of each file in the batch using the /batches/{batch_id} endpoint.
    • Use the following API call to check batch details:
        GET https://{your-resource-name}.openai.azure.com/openai/batches/{batch_id}?api-version=2024-10-21
      
    • If some files are missing, look at the processing status to identify if they were skipped or failed silently.
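A minimal sketch of that status check, assuming you already have the batch object as parsed JSON (for example, the body returned by the GET call above). The field names `status`, `request_counts` and `error_file_id` follow the batch object format; the helper name `summarize_batch` and the example IDs are hypothetical.

```python
from typing import Any


def summarize_batch(batch: dict[str, Any]) -> dict[str, Any]:
    """Summarize a batch object so unprocessed requests stand out."""
    counts = batch.get("request_counts") or {}
    total = counts.get("total", 0)
    completed = counts.get("completed", 0)
    failed = counts.get("failed", 0)
    return {
        "status": batch.get("status"),
        "total": total,
        "completed": completed,
        "failed": failed,
        # Requests that are neither completed nor failed
        # (for example still pending, expired, or cancelled).
        "unaccounted": total - completed - failed,
        # An error file ID means per-request failures were recorded.
        "has_error_file": bool(batch.get("error_file_id")),
    }


# Hypothetical batch object, trimmed to the fields used above.
example = {
    "id": "batch_example",
    "status": "completed",
    "request_counts": {"total": 10, "completed": 7, "failed": 2},
    "error_file_id": "file-example-errors",
}
summary = summarize_batch(example)
print(summary)
```

A non-zero `failed` or `unaccounted` value on a batch whose status is `completed` is the "silent" case described in the question, and the error file is the first place to look.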

    Batch Size Limitations

    • Ensure that your batch request does not exceed API limits.
    • Try submitting smaller batches and verify if the issue persists.
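One way to try smaller batches is to split the input JSONL into fixed-size chunks before uploading. The sketch below only does the splitting; `chunk_requests` and `max_requests` are hypothetical names, and the chunk size should be chosen from the limits documented for your resource rather than from this example.

```python
from typing import Iterable, Iterator


def chunk_requests(lines: Iterable[str], max_requests: int) -> Iterator[list[str]]:
    """Yield lists of at most max_requests non-empty JSONL lines."""
    if max_requests < 1:
        raise ValueError("max_requests must be at least 1")
    chunk: list[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        chunk.append(line)
        if len(chunk) == max_requests:
            yield chunk
            chunk = []
    if chunk:
        yield chunk  # final, possibly shorter, chunk


# Five placeholder request lines split into chunks of two.
sample = [f'{{"custom_id": "task-{i}"}}' for i in range(5)]
chunks = list(chunk_requests(sample, max_requests=2))
print([len(c) for c in chunks])
```

Each chunk can then be written to its own `.jsonl` file and submitted as a separate batch, which makes it easier to see which inputs are not being processed.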

    Enable Logging for Debugging

    • If possible, enable logging in your Azure OpenAI service to capture any silent failures.
    • Navigate to Azure Portal > OpenAI Resource > Monitoring > Logs to check for potential warnings or errors.

    Check for Service Health Issues

    Validate API Version & Retry Mechanism

    • Ensure you're using a current API version that supports batch (for example, 2024-10-21 or later).
    • Implement a retry mechanism in case some files fail due to transient issues.
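A retry mechanism for transient failures could look roughly like the sketch below. It is not taken from the Azure documentation; `retry_with_backoff` and its parameters are hypothetical, and the exceptions worth retrying depend on the client library you use.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    retry_on: tuple[type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation, retrying on the given exceptions with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Delay doubles each attempt, capped at max_delay, plus random jitter.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay / 2))
    raise AssertionError("unreachable")


# Demo: an operation that fails twice, then succeeds.
calls = {"n": 0}


def flaky() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient")
    return "ok"


delays: list[float] = []
result = retry_with_backoff(flaky, sleep=delays.append)  # record delays instead of sleeping
print(result, calls["n"])
```

In real use, `operation` would wrap the upload, submit, or status call, and `sleep` would be left at its default.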

    References for Further Troubleshooting

    Azure OpenAI Batch API Documentation

    Azure Machine Learning Known Issues

    As part of the next steps, please try these steps and check if the issue persists. If you need further assistance, feel free to comment below, and I’d be happy to help!

    Regards,

    Chakravarthi Rangarajan Bhargavi

    - Please accept the answer and vote 'Yes' if you found it helpful to support the community. Thanks a lot!


  2. Manas Mohanty 5,620 Reputation points Microsoft External Staff Moderator
    2025-03-19T15:01:06.03+00:00

    Hi Chen Chen

    I hope Chakaravarthi Rangarajan Bhargavi's pointers shed some light on Azure status, batch size limitations, and debugging.

    The cause might be an unsupported operation or an intermittent server issue.

    I have attached the known issues section for reference.

    Note: Embedding operations are not supported in batch.

    I wanted to add that you can also debug this properly through the Python SDK.

    Here is the step-by-step troubleshooting process:

    1. Find the details of the failure in the errors property for the respective file, then look it up in the Error code section to find a fix.
    2. Try a smaller file to isolate issues associated with specific files, and change your code accordingly.
    3. Once smaller batches work, move to bigger batches that stay within the batch limit and batch quota.
    4. See the attached reference code for exponential backoff retry logic.
    5. Switch to another API version, or opt for a Global Standard deployment, which can route traffic to the most stable data center and provides higher TPM and rate limits.
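To see exactly which requests were dropped, it can help to compare the `custom_id` values you submitted with those that came back. This is a rough sketch, assuming the input, output, and error files have already been downloaded as JSONL text and that each result line carries `custom_id`, `response.status_code`, and `error` fields; the function name `reconcile` and the sample data are hypothetical.

```python
import json
from typing import Iterable


def reconcile(input_lines: Iterable[str], result_lines: Iterable[str]) -> dict[str, list[str]]:
    """Report which submitted custom_ids failed and which never came back."""
    submitted = {json.loads(line)["custom_id"] for line in input_lines if line.strip()}
    succeeded: set[str] = set()
    failed: set[str] = set()
    for line in result_lines:
        if not line.strip():
            continue
        record = json.loads(line)
        response = record.get("response") or {}
        # Treat an explicit error or a non-200 status as a failed request.
        if record.get("error") or response.get("status_code") != 200:
            failed.add(record["custom_id"])
        else:
            succeeded.add(record["custom_id"])
    return {
        "failed": sorted(failed),
        "missing": sorted(submitted - succeeded - failed),
    }


# Hypothetical data: three requests submitted, one succeeded, one failed, one absent.
inputs = ['{"custom_id": "a"}', '{"custom_id": "b"}', '{"custom_id": "c"}']
results = [
    '{"custom_id": "a", "response": {"status_code": 200}, "error": null}',
    '{"custom_id": "b", "response": null, "error": {"code": "example_error"}}',
]
report = reconcile(inputs, results)
print(report)
```

Pass the lines of the output file and the error file together as `result_lines`. Anything listed under `missing` was submitted but appears in neither file, which is the case worth raising with support.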

    Reference: Batch operation OpenAI

    Thank you.

