With the Azure AI Vision (Cognitive Services) Read OCR, I get overlapping bounding boxes for recognized words, which is not correct. What is the solution?

Thomas Pfaendler 0 Reputation points
2024-07-12T10:00:26.8133333+00:00

When calling the Azure AI Vision 3.2 GA Read API with the latest GA model to read machine-printed text, using the endpoint https://{endpoint}/vision/v3.2/read/analyze, we get:

1.) Sporadically overlapping bounding boxes for recognized words. This happens particularly when the lines are written close together. In such cases, two words on different lines can partly overlap. This causes issues in further processing.

2.) The overlap occurs because the bounding boxes of the recognized words often do not match the words in the image exactly. They are often slightly larger than the actual word rectangles.
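As a possible stopgap for the further processing, the overlaps can be detected and clipped on the client side. The sketch below is only an illustration, not an official fix: it assumes one entry of `analyzeResult.readResults` from the v3.2 response, where each word carries a `boundingBox` of eight numbers (four corner points), and it approximates each box by its axis-aligned rectangle, which ignores rotation. The helper names `to_rect`, `overlap_area`, `find_cross_line_overlaps` and `split_vertical_overlap` are hypothetical and not part of any SDK.

```python
from itertools import combinations


def to_rect(bounding_box):
    """Convert [x1, y1, ..., x4, y4] to an axis-aligned (left, top, right, bottom)."""
    xs = bounding_box[0::2]
    ys = bounding_box[1::2]
    return (min(xs), min(ys), max(xs), max(ys))


def overlap_area(a, b):
    """Area of the intersection of two rectangles, 0 if they do not overlap."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(0, width) * max(0, height)


def find_cross_line_overlaps(read_result):
    """Return (text_a, text_b, area) for overlapping words on different lines."""
    words = [
        (line_idx, word["text"], to_rect(word["boundingBox"]))
        for line_idx, line in enumerate(read_result["lines"])
        for word in line["words"]
    ]
    hits = []
    for (line_a, text_a, rect_a), (line_b, text_b, rect_b) in combinations(words, 2):
        if line_a != line_b:
            area = overlap_area(rect_a, rect_b)
            if area > 0:
                hits.append((text_a, text_b, area))
    return hits


def split_vertical_overlap(upper, lower):
    """Clip two overlapping rectangles at the middle of their vertical overlap.

    Assumes `upper` belongs to the line above `lower`.
    """
    if overlap_area(upper, lower) == 0:
        return upper, lower
    cut = (upper[3] + lower[1]) / 2
    return (upper[0], upper[1], upper[2], cut), (lower[0], cut, lower[2], lower[3])
```

Clipping at the midpoint only hides the symptom: the boxes still do not fit the glyphs tightly, so whether this is acceptable depends on what the downstream processing needs.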

Azure Computer Vision
An Azure artificial intelligence service that analyzes content in images and video.

1 answer

  1. VasaviLankipalle-MSFT 17,121 Reputation points
    2024-07-12T18:42:25.8566667+00:00

    Hello @Thomas Pfaendler, thanks for using the Microsoft Q&A platform.

    Is it possible to share the document with us so that we can reproduce the issue on our end?

    I suggest trying the Document Intelligence Read model or one of the prebuilt models, if that fits your use case, as they work better with documents. These models are optimized for text-heavy scanned and digital documents.
