Azure Document Intelligence custom extraction model fails to recognize a leading '.' (period)

BIGWORKS Support 0 Reputation points
2025-05-05T11:58:25.34+00:00

When training a custom extraction model in Document Intelligence Studio, the default Optical Character Recognition (OCR) engine does not appear to recognize a period when it immediately precedes numeric digits. In practice, I cannot select the leading period while labelling the documents.

The image below illustrates the issue.
User's image

In contrast, the OCR engine recognizes the period correctly when it acts as a decimal separator following a numeric value.

User's image

This limitation prevents accurate labelling of fields whose values begin with a period (e.g., ".789").
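
To make the two cases concrete, here is a minimal sketch in plain Python. It does not call Document Intelligence; the helper names `classify_decimal_token` and `find_missing_leading_periods` are hypothetical, and the list of OCR words is assumed to have been copied out of the Studio or an analysis result by hand. It only flags expected values such as ".789" that show up in the OCR output without their leading period.

```python
import re

# Hypothetical helpers for illustration only; not part of any Azure SDK.
LEADING_PERIOD = re.compile(r"\.\d+")       # period before digits, e.g. ".789"
EMBEDDED_PERIOD = re.compile(r"\d+\.\d+")   # period after digits, e.g. "12.5"


def classify_decimal_token(token: str) -> str:
    """Classify a token by where its period sits relative to the digits."""
    if LEADING_PERIOD.fullmatch(token):
        return "leading-period"
    if EMBEDDED_PERIOD.fullmatch(token):
        return "embedded-period"
    return "other"


def find_missing_leading_periods(expected_values, ocr_words):
    """Return expected values whose leading period was dropped by OCR.

    A value counts as affected when it starts with a period, is absent
    from the OCR words, and the same digits without the period are present.
    """
    ocr_set = set(ocr_words)
    return [
        value
        for value in expected_values
        if classify_decimal_token(value) == "leading-period"
        and value not in ocr_set
        and value[1:] in ocr_set
    ]
```

For example, passing the expected values `[".789", "12.5"]` together with OCR words `["789", "12.5"]` would report only `".789"` as affected, which matches the behaviour described above.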

Any insights or potential solutions would be greatly appreciated.

Azure AI Document Intelligence
An Azure service that turns documents into usable data. Previously known as Azure Form Recognizer.