
mistral-document-ai-2512: intermittent 408 and 503 errors on PDF inputs (document_url) while image_url works

Noe G 10 Reputation points
2026-05-05T14:30:01.54+00:00

Summary

Calls to a mistral-document-ai-2512 serverless deployment on Microsoft Foundry intermittently return HTTP 408 "upstream request timeout" and HTTP 503 "service unavailable" when the payload uses document_url with a base64-encoded PDF. The exact same endpoint, deployment, model and credentials succeed reliably when the payload uses image_url with a base64-encoded image. PDF inputs were working consistently until a few days ago, without any change on our side.

Endpoint pattern

https://<resource>.cognitiveservices.azure.com/providers/mistral/azure/ocr

Model

mistral-document-ai-2512 (deployed as global-standard serverless)

Failing request body (PDF)


{
  "model": "mistral-document-ai-2512",
  "document": {
    "type": "document_url",
    "document_url": "data:application/pdf;base64,<...>"
  },
  "include_image_base64": false
}

Working request body (image, same endpoint)


{
  "model": "mistral-document-ai-2512",
  "document": {
    "type": "image_url",
    "image_url": "data:image/png;base64,<...>"
  },
  "include_image_base64": false
}
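
For reference, here is a minimal Python sketch of how the two bodies above can be built from raw bytes. The function name build_ocr_payload is hypothetical, and placing pages at the top level of the body is an assumption rather than something confirmed in this thread.

```python
import base64
from typing import List, Optional

MODEL = "mistral-document-ai-2512"


def build_ocr_payload(data: bytes, mime_type: str,
                      pages: Optional[List[int]] = None) -> dict:
    """Build an OCR request body for a PDF or an image held in memory."""
    # PDFs go under document_url and everything else under image_url,
    # mirroring the failing and working bodies shown above.
    kind = "document_url" if mime_type == "application/pdf" else "image_url"
    encoded = base64.b64encode(data).decode("ascii")
    payload = {
        "model": MODEL,
        "document": {"type": kind, kind: f"data:{mime_type};base64,{encoded}"},
        "include_image_base64": False,
    }
    if pages is not None:
        # Assumption: "pages" sits at the top level of the request body.
        payload["pages"] = pages
    return payload
```

Calling build_ocr_payload(pdf_bytes, "application/pdf", pages=[0]) gives the PDF variant restricted to the first page, and build_ocr_payload(png_bytes, "image/png") gives the image variant.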

Error responses

Two distinct errors are returned, both server-side:

408 - upstream request timeout

503 - service unavailable

The errors appear non-deterministic: the same PDF payload can return 408 on one attempt and 503 on the next. Request volume was far below our 50 RPM quota in every case (no 429 was ever observed).

Reproduction details

  • PDF size tested: small (single page, well under the documented 30 MB / 30 page limit)
  • Also tested with pages: [0] to restrict processing to the first page only — same 408 / 503 mix
  • Tried with include_image_base64 set to both true and false — same behavior
  • Client-side timeout set to 180000 ms — errors are clearly server-side, not client-side
  • Multiple retries with 30 to 45 second spacing all fail
  • Image inputs (PNG, same byte range) succeed within seconds on the same endpoint
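
A rough sketch of the kind of harness that separates server-side errors from client-side timeouts, using only the Python standard library. The helper names and the bearer-style Authorization header are assumptions for illustration. Use whatever authentication your deployment expects.

```python
import json
import urllib.error
import urllib.request
from typing import Tuple


def build_request(endpoint: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Prepare a JSON POST request for the OCR endpoint without sending it."""
    body = json.dumps(payload).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        # Assumption: bearer-style key auth; adjust to your deployment.
        "Authorization": f"Bearer {api_key}",
    }
    return urllib.request.Request(endpoint, data=body, headers=headers, method="POST")


def post_ocr(request: urllib.request.Request, timeout_s: float = 180.0) -> Tuple[int, str]:
    """Send the request and return (status, body) for successes and HTTP errors."""
    try:
        with urllib.request.urlopen(request, timeout=timeout_s) as resp:
            return resp.status, resp.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as err:
        # 408 and 503 land here because the server actually answered.
        # A client-side timeout raises a different exception and is not caught.
        return err.code, err.read().decode("utf-8", "replace")
```

With this split, a 408 or 503 is returned as a status code, while a client timeout surfaces as an uncaught exception, which makes it easier to tell the two apart.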

What we have tried

  • Increased client timeout up to 180 seconds
  • Added retry logic (up to 5 attempts, 45 seconds between tries)
  • Sent only pages: [0] to minimize processing load
  • Toggled include_image_base64
  • Verified the payload matches the documented API contract

Questions

  1. Is there a known regression or capacity issue affecting the document_url (PDF) path of mistral-document-ai-2512 over the past few days?
  2. Is the PDF processing pipeline routed differently from the image_url pipeline? The clear asymmetry (PDFs fail, images succeed on the same endpoint) suggests separate backend handling.
  3. Are 408 and 503 expected responses when underlying capacity is constrained, even when the client is well below its assigned RPM quota?
  4. Are there recommended workarounds while this is investigated, other than client-side PDF rasterization to PNG images?

Thanks for any guidance from the Foundry or Mistral team.


Foundry Models

A catalog of AI models in Microsoft Foundry that you can discover, compare, and deploy using Azure’s built‑in tools for evaluation, fine‑tuning, and inference


3 answers

  1. Karnam Venkata Rajeswari 3,335 Reputation points Microsoft External Staff Moderator
    2026-05-18T19:05:56.58+00:00

    Hello @Noe G,

    Welcome to Microsoft Q&A. Thank you for reaching out to us.

    Thanks for sharing the detailed investigation results and reproduction steps. Based on the observed behavior, the issue appears consistent with a transient backend service condition affecting the PDF (document_url) processing workflow for mistral-document-ai-2512.

    The key observation is that the same endpoint, deployment, credentials, and request structure continue to work reliably for image_url inputs, while only PDF requests intermittently return HTTP 408 (“upstream request timeout”) and HTTP 503 (“service unavailable”). This helps isolate the issue specifically to the PDF-processing path rather than authentication, networking, quota limits, or the core OCR inference service.

    Although both request types use the same OCR endpoint, the PDF workflow follows a different and more compute-intensive processing path compared to image-based OCR. PDF requests typically require additional backend stages such as:

    • PDF parsing
    • Page rendering/rasterization
    • Layout extraction
    • OCR orchestration
    • Document post-processing

    Because of these additional stages, the PDF path can be more sensitive to transient backend latency, orchestration delays, or temporary service-side capacity conditions.

    The observed 408 and 503 responses are generally associated with temporary backend processing interruptions rather than client-side timeout or payload validation issues. The absence of HTTP 429 (“Too Many Requests”) responses, combined with request volumes remaining well below documented quotas, further indicates that this is unlikely to be related to throttling or quota exhaustion.

    Additionally, the fact that the same requests later completed successfully without any client-side changes strongly suggests a transient backend condition that self-recovered.

    To improve reliability and help further isolate the behavior, please check whether the following steps help:

    1. Use HTTPS-hosted PDF URLs instead of inline base64 PDFs. Instead of "document_url": "data:application/pdf;base64,...", test with an Azure Blob Storage SAS URL or another HTTPS-accessible PDF location. This reduces inline payload overhead and helps determine whether the issue lies in PDF ingestion or in backend document processing.
    2. Continue retrying with exponential backoff and jitter. Suggested retry intervals: 5 s, 15 s, 30 s, 60 s, 120 s.
    3. Test alternate Azure regions if feasible. This can help determine whether the behavior is tied to temporary regional backend conditions.
    4. Simplify or flatten PDFs where possible. Re-saving or flattening PDFs may help eliminate parser or rendering edge cases.
    5. Use a temporary workaround for critical workloads. Converting PDF pages into PNG/JPEG images and using the image_url workflow may provide more stable processing while the PDF path is monitored.
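
    The retry suggestion in step 2 could look roughly like the sketch below. It assumes a caller-supplied send() function that returns a (status, body) tuple. The function name call_with_backoff, the set of retryable status codes, and the 50% to 100% jitter range are illustrative choices, not documented service guidance.

```python
import random
import time
from typing import Callable, Sequence, Tuple

RETRYABLE = {408, 429, 503}            # assumption: statuses worth retrying
BASE_DELAYS_S = (5, 15, 30, 60, 120)   # retry intervals suggested in step 2


def call_with_backoff(send: Callable[[], Tuple[int, str]],
                      delays: Sequence[float] = BASE_DELAYS_S,
                      sleep: Callable[[float], None] = time.sleep,
                      rng: Callable[[], float] = random.random) -> Tuple[int, str]:
    """Call send() until it returns a non-retryable status or delays run out."""
    status, body = send()
    for base in delays:
        if status not in RETRYABLE:
            break
        # Jitter: wait between 50% and 100% of the base delay so that
        # many clients retrying at once do not hit the service in lockstep.
        sleep(base * (0.5 + 0.5 * rng()))
        status, body = send()
    return status, body
```

    The sleep and rng parameters are injectable so the schedule can be exercised in tests without actually waiting. With the default delays this makes at most six attempts.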

    The following reference might be helpful:

    Reference: Document Intelligence Errors - Foundry Tools | Microsoft Learn

    Thank you

     


  2. Noe G 10 Reputation points
    2026-05-06T07:19:09.5266667+00:00

    The region is Sweden Central, but the issue was fixed this morning. I think it was just a server overload.

  3. AI answer

    2026-05-05T14:31:17.38+00:00

    The context only documents mistral-document-ai-2512 as an image-to-text model that accepts image or PDF pages as input and describes general behavior for capacity and throttling, but it doesn’t contain any product note or incident information about regressions, nor internal routing details for document_url vs image_url. It also doesn’t document 408/503 patterns specific to this model.

    From the available information, only the following can be stated:

    1. mistral-document-ai-2512 capabilities

    mistral-document-ai-2512 is listed as an Image-to-Text model with:

    • Input: image or PDF pages (30 pages, max 30 MB PDF file)
    • Output: text
    • Languages: en
    • Response formats: Text, JSON, Markdown
    • Deployment type: Global standard (all regions) and Data zone standard (US and EU)

    This confirms that PDF input is officially supported for this model, but the context doesn’t describe any current regression or known issue with the PDF (document_url) path.

    2. Capacity and error semantics

    The provisioned-throughput documentation explains how capacity issues are normally surfaced:

    • When capacity is exceeded, the API returns HTTP 429 and continues to do so until utilization drops below 100%.
    • 429 is explicitly described as the signal for capacity limits, with retry-after headers to guide client behavior.

    There is no mention in the context of 408 or 503 being the expected signal for capacity exhaustion on Foundry Models; instead, 429 is the documented behavior when capacity is constrained.
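
    As an illustration of the documented 429 behavior, a client can read the Retry-After header before retrying. The sketch below is a generic HTTP example: the helper name retry_after_seconds and the 1-second fallback are assumptions, and it handles both the delay-in-seconds and HTTP-date forms of the header.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_after_seconds(header_value: Optional[str],
                        now: Optional[datetime] = None,
                        default: float = 1.0) -> float:
    """Translate a Retry-After header value into a wait time in seconds."""
    if not header_value:
        return default
    value = header_value.strip()
    if value.isdigit():
        # Delay-seconds form, e.g. "30".
        return float(value)
    try:
        # HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # Never return a negative wait if the date is already in the past.
    return max(0.0, (when - now).total_seconds())
```

    This applies only to responses that carry the header. Nothing in the context says the 408 or 503 responses discussed here include one.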

    3. Known error patterns for Foundry / Azure OpenAI

    The batch and “on your data” troubleshooting content documents other error codes and conditions (e.g., too_many_tasks, model_not_found, invalid_request, search 503s due to Azure AI Search throttling), but none of these are specific to mistral-document-ai-2512 or to PDF ingestion via document_url.

    The 503 guidance that does exist is for Azure AI Search queries in “on your data” scenarios, where 503 can be caused by insufficient replicas/partitions and mitigated by scaling or adding retry/sleep logic. That guidance is specific to search resources and doesn’t describe the Mistral OCR endpoint.

    4. Answers to the specific questions (limited to what is in context)
    1. Known regression or capacity issue on document_url for mistral-document-ai-2512

    The provided documentation and Q&A snippets contain no statement about any regression, outage, or capacity issue specific to mistral-document-ai-2512, nor to its document_url (PDF) path.

    2. Whether PDF processing is routed differently from image_url

    The context only states that the model accepts “image or PDF pages” as input. It does not describe internal routing, separate pipelines, or architectural differences between PDF and image handling for mistral-document-ai-2512.

    3. Whether 408 and 503 are expected for capacity constraints

    The capacity documentation for Foundry Models states that when capacity is exceeded, the API returns 429 and that this is by design as the capacity signal. There is no documentation here that 408 or 503 are used as the standard signal for capacity issues on Foundry Models, even when within RPM quota.

    4. Recommended workarounds beyond rasterizing PDFs

    The context does not provide any model-specific workaround guidance for mistral-document-ai-2512 PDF issues. General guidance for capacity or backend issues includes:

    • Retrying with backoff when receiving capacity-related responses (documented for 429).
    • Redirecting traffic to other deployments or models when a given deployment is fully utilized.

    However, there is no explicit recommendation in the context for handling intermittent 408/503 on this specific OCR endpoint, nor any alternative PDF ingestion pattern beyond what is already being used.

    Given the constraints of the provided documentation, any further guidance (such as confirming a regression, explaining backend routing, or proposing product-level mitigations) would require escalation through Azure support or service health channels, which are not described in detail in this context.


    AI-generated content may be incorrect. Read our transparency notes for more information.
