Azure Mistral OCR API does not support confidence_scores_granularity parameter

Question

Vu Thi Mai Linh(TMI) 0

I am currently integrating the Azure-hosted Mistral OCR API.

I would like to retrieve OCR confidence scores (especially word-level confidence), similar to the native Mistral OCR API functionality.

However, when sending the following parameter:

{
  "confidence_scores_granularity": "word"
}

the API returns HTTP 422 with this error:

{
  "error": {
    "code": "Invalid input",
    "message": "{\"detail\":[{\"type\":\"extra_forbidden\",\"loc\":[\"body\",\"confidence_scores_granularity\"],\"msg\":\"Extra inputs are not permitted\",\"input\":\"word\"}]}",
    "status": 422
  }
}

Environment details:

Region: Japan East
Endpoint type: Azure-hosted Mistral OCR

Questions:

Does Azure Mistral OCR currently support OCR confidence scores?
Is there another Azure-compatible parameter or API version for retrieving word/page confidence values in Mistral?

Our use case requires OCR confidence values to validate scanned PDF quality and detect low-confidence OCR regions.

Thank you for your support.

Vu Thi Mai Linh(TMI) 0 Reputation points

2026-05-15T01:31:27.4066667+00:00

Example implementation:

const endpoint = process.env.MISTRAL_BASE_URL;
const apiKey = process.env.MISTRAL_API_KEY;
const modelName = "mistral-document-ai-2512";
// pdfBuffer is a Node.js Buffer holding the PDF bytes (loaded elsewhere).
const base64Pdf = pdfBuffer.toString("base64");
const response = await fetch(endpoint, {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Authorization: `Bearer ${apiKey}`,
  },
  body: JSON.stringify({
    model: modelName,
    document: {
      type: "document_url",
      document_url: `data:application/pdf;base64,${base64Pdf}`,
    },
    include_image_base64: true,
    // Uncommenting the next line triggers the HTTP 422 shown in the question:
    // confidence_scores_granularity: "word",
  }),
});

Vu Thi Mai Linh(TMI) 0 Reputation points

2026-05-15T01:31:53.58+00:00

Link doc about Mistral OCR confidence_scores_granularity: https://docs.mistral.ai/studio-api/document-processing/basic_ocr?tab=confidence-scores-example#explorer-tabs-confidence-scores

Answer 1

Vu Thi Mai Linh(TMI)

Azure-hosted Mistral OCR currently does not support the confidence_scores_granularity parameter.

The HTTP 422 (extra_forbidden) error confirms the parameter is not part of the Azure-exposed API schema.

There is no alternative Azure parameter today to retrieve word/page confidence scores from Mistral OCR.

🔍 What might be happening

1. Native Mistral vs Azure-hosted Mistral (key difference)

The native Mistral OCR API supports:

"confidence_scores_granularity": "word" | "page"

→ returns confidence scores per word/page [docs.mistral.ai]

However, in Azure AI Foundry (Azure-hosted Mistral):

The request schema is restricted

Unsupported fields are rejected with:

extra_forbidden → Extra inputs are not permitted

Azure exposes a subset of the upstream Mistral API, and confidence_scores_granularity is currently not included.
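To tell this kind of rejection apart from other 422 failures, a client can unpack the error body. The following is a minimal sketch that assumes the error always has the shape quoted in the question, where `error.message` is a JSON document encoded as a string. The helper name `findRejectedFields` is hypothetical and is not part of any Azure or Mistral SDK.

```javascript
// Return the names of request fields the endpoint rejected as unknown.
// `errorBody` is the parsed JSON body of the HTTP 422 response.
function findRejectedFields(errorBody) {
  const raw = errorBody?.error?.message;
  if (typeof raw !== "string") return [];

  let parsed;
  try {
    // The validation detail is itself JSON, embedded as a string.
    parsed = JSON.parse(raw);
  } catch {
    return []; // the message was plain text, not embedded JSON
  }

  const detail = Array.isArray(parsed?.detail) ? parsed.detail : [];
  return detail
    .filter((d) => d?.type === "extra_forbidden" && Array.isArray(d.loc))
    .map((d) => d.loc[d.loc.length - 1]); // last path segment is the field name
}

// The error body quoted in the question.
const sampleError = {
  error: {
    code: "Invalid input",
    message:
      '{"detail":[{"type":"extra_forbidden","loc":["body","confidence_scores_granularity"],"msg":"Extra inputs are not permitted","input":"word"}]}',
    status: 422,
  },
};

const rejectedFields = findRejectedFields(sampleError);
console.log(rejectedFields);
```

A caller could use the returned list to drop the unsupported fields and retry, or to log which parts of the upstream schema the Azure deployment does not accept.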

Does Azure Mistral OCR return confidence scores?

Based on available documentation and behavior:

Azure Mistral OCR:

✅ Returns extracted content (markdown/text/structured output)

❌ Does NOT return confidence scores (word/page)

There is no documented field in Azure Mistral responses for confidence values.

⚠️ Important implication for your use case

“validate scanned PDF quality and detect low-confidence regions”

This cannot currently be implemented using Azure Mistral OCR alone.

✅ Recommended alternatives (Azure-supported)

Since you’re doing quality validation / low-confidence detection, here are practical workarounds used in real customer scenarios:

Option 1 — Use Azure Document Intelligence

Azure Document Intelligence provides:

✅ Word-level confidence scores

✅ Page-level / field-level confidence

✅ Production-grade OCR + structured extraction

Microsoft explicitly documents:

“Document Intelligence returns confidence for predicted words… between 0 and 1” [learn.microsoft.com]

👉 Best fit for:

OCR quality validation

Threshold-based filtering (e.g., reject < 0.8 confidence)

Compliance / human-in-the-loop workflows
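Threshold-based filtering could look roughly like the sketch below. It assumes the analyze result has already been fetched and parsed, and that each page carries a `words` array with `content` and `confidence` fields, as in the Document Intelligence read output. The function name, the 0.8 threshold, and the sample data are illustrative only.

```javascript
// Collect every word whose confidence falls below the threshold.
// `analyzeResult` is assumed to be the `analyzeResult` object of a
// Document Intelligence response.
function findLowConfidenceWords(analyzeResult, threshold = 0.8) {
  const flagged = [];
  for (const page of analyzeResult?.pages ?? []) {
    for (const word of page.words ?? []) {
      if (typeof word.confidence === "number" && word.confidence < threshold) {
        flagged.push({
          page: page.pageNumber,
          content: word.content,
          confidence: word.confidence,
        });
      }
    }
  }
  return flagged;
}

// Hypothetical sample data, not real service output.
const sampleResult = {
  pages: [
    {
      pageNumber: 1,
      words: [
        { content: "Invoice", confidence: 0.99 },
        { content: "T0tal", confidence: 0.42 },
      ],
    },
    {
      pageNumber: 2,
      words: [{ content: "Signature", confidence: 0.77 }],
    },
  ],
};

const lowConfidenceWords = findLowConfidenceWords(sampleResult, 0.8);
console.log(lowConfidenceWords);
```

The flagged entries keep their page number so that a reviewer, or a later pipeline stage, can route only the affected pages to manual checking.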

Option 2 — Dual-pass pipeline (common workaround)

If you must use Mistral OCR for layout/quality:

Pattern

1. Run Mistral OCR: get high-quality markdown + structure.

2. Run Document Intelligence (Read OCR): extract confidence scores.

3. Align results: map words/regions between the two outputs and use the Document Intelligence confidence as a proxy.
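The alignment step is the hard part of this pattern. The sketch below shows one deliberately naive approach: match tokens by normalized text and ignore position entirely. It assumes the Mistral output is a markdown string and the Document Intelligence words are objects with `content` and `confidence`. All function names and sample values are hypothetical, and a real pipeline would likely need position-aware matching to handle repeated words.

```javascript
// Lowercase and strip markdown/punctuation so both outputs compare equal.
function normalizeToken(token) {
  return token.toLowerCase().replace(/[^\p{L}\p{N}]/gu, "");
}

// Map each normalized word to the lowest confidence seen for it,
// which is the conservative choice when a word occurs more than once.
function buildConfidenceIndex(words) {
  const index = new Map();
  for (const { content, confidence } of words) {
    const key = normalizeToken(content);
    if (!key) continue;
    const previous = index.get(key);
    index.set(key, previous === undefined ? confidence : Math.min(previous, confidence));
  }
  return index;
}

// Attach a proxy confidence to every token of the markdown text.
// Tokens with no match in the index get `null`.
function annotateMarkdown(markdown, words) {
  const index = buildConfidenceIndex(words);
  return markdown
    .split(/\s+/)
    .map((token) => ({ token, key: normalizeToken(token) }))
    .filter(({ key }) => key.length > 0)
    .map(({ token, key }) => ({
      token,
      confidence: index.has(key) ? index.get(key) : null,
    }));
}

// Hypothetical sample data for both services.
const mistralMarkdown = "# Invoice\n\n**Total:** 1,250 USD";
const diWords = [
  { content: "Invoice", confidence: 0.99 },
  { content: "Total:", confidence: 0.95 },
  { content: "1,250", confidence: 0.61 },
];

const annotated = annotateMarkdown(mistralMarkdown, diWords);
console.log(annotated);
```

Tokens that come back with `null` are words the second OCR pass did not produce at all, which is itself a useful signal that the two engines disagree about that region.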

Option 3 — Native Mistral (non-Azure endpoint)

The native Mistral API supports the parameter, so you can call it directly, for example from custom functions.
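If the same code has to serve both endpoints, the request body can be built so that the parameter is only sent where it is accepted. This is a sketch that reuses the body shape from the question. The `target` switch, the function name, the native URL constant, and the native model name `mistral-ocr-latest` are assumptions to verify against the Mistral documentation linked in the question.

```javascript
// Assumed native endpoint; check the Mistral docs before relying on it.
const NATIVE_OCR_URL = "https://api.mistral.ai/v1/ocr";

// Build the OCR request body. The confidence parameter is added only
// for the native endpoint, because the Azure-hosted schema rejects it.
function buildOcrBody({ model, base64Pdf, granularity, target }) {
  const body = {
    model,
    document: {
      type: "document_url",
      document_url: `data:application/pdf;base64,${base64Pdf}`,
    },
    include_image_base64: true,
  };
  if (target === "native" && granularity) {
    body.confidence_scores_granularity = granularity;
  }
  return body;
}

// Placeholder bytes standing in for a real PDF file.
const samplePdf = Buffer.from("%PDF-1.4 placeholder").toString("base64");

const nativeBody = buildOcrBody({
  model: "mistral-ocr-latest", // assumed native model name
  base64Pdf: samplePdf,
  granularity: "word",
  target: "native",
});

const azureBody = buildOcrBody({
  model: "mistral-document-ai-2512",
  base64Pdf: samplePdf,
  granularity: "word",
  target: "azure",
});

console.log(Object.keys(nativeBody), Object.keys(azureBody));
```

The body would then be sent with the same `fetch` call as in the question, pointing at `NATIVE_OCR_URL` with a native Mistral API key instead of the Azure endpoint and key.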

Hope it helps.

Thank you.
