With the MS Vision Cognitive Services Read OCR, I get overlapping bounding boxes for words, which is not correct. What is the solution?

Thomas Pfaendler 0 Reputation points
2024-07-12T10:00:26.8133333+00:00

When calling the Azure AI Vision 3.2 GA Read API with the latest GA model to read machine-printed text, using the endpoint https://{endpoint}/vision/v3.2/read/analyze, we get:

1.) Sporadically overlapping bounding boxes for recognized words. This happens particularly when lines are written close together. In such cases, two words on different lines can partially overlap. This causes problems in further processing.

2.) The overlap arises because the bounding boxes of the recognized words often do not match the words in the image exactly: the returned boxes are often slightly larger than the actual word rectangles.
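
To make the first point easier to reproduce, here is a minimal sketch of how such overlaps can be detected in one page of the Read result (one entry of `analyzeResult.readResults`). It assumes each word carries a `boundingBox` of eight numbers (four x, y corner points) and compares the axis-aligned envelopes of those quadrilaterals, which is only an approximation for rotated text. The helper names and the sample page are made up for illustration; they are not real service output.

```python
from itertools import combinations


def envelope(bounding_box):
    """Return the axis-aligned (x_min, y_min, x_max, y_max) of an 8-number quadrilateral."""
    xs = bounding_box[0::2]
    ys = bounding_box[1::2]
    return min(xs), min(ys), max(xs), max(ys)


def overlap_area(a, b):
    """Area of the intersection of two (x_min, y_min, x_max, y_max) rectangles."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0) * max(height, 0)


def find_cross_line_overlaps(page):
    """List (word_a, word_b, area) for words on different lines whose envelopes overlap."""
    words = [
        (line_index, word["text"], envelope(word["boundingBox"]))
        for line_index, line in enumerate(page["lines"])
        for word in line["words"]
    ]
    overlaps = []
    for (line_a, text_a, box_a), (line_b, text_b, box_b) in combinations(words, 2):
        if line_a == line_b:
            continue  # only overlaps between different lines are of interest here
        area = overlap_area(box_a, box_b)
        if area > 0:
            overlaps.append((text_a, text_b, area))
    return overlaps


# Hypothetical page with two closely spaced lines (values invented for illustration).
sample_page = {
    "lines": [
        {"text": "Invoice total", "words": [
            {"text": "Invoice", "boundingBox": [10, 10, 80, 10, 80, 32, 10, 32]},
            {"text": "total", "boundingBox": [90, 10, 140, 10, 140, 32, 90, 32]},
        ]},
        {"text": "Due date", "words": [
            {"text": "Due", "boundingBox": [10, 28, 50, 28, 50, 50, 10, 50]},
            {"text": "date", "boundingBox": [150, 28, 200, 28, 200, 50, 150, 50]},
        ]},
    ]
}

overlaps = find_cross_line_overlaps(sample_page)
print(overlaps)
```

In this invented sample, the box of "Invoice" extends 4 units into the box of "Due" on the next line, which is the kind of partial overlap described in 1.).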

Azure Computer Vision
An Azure artificial intelligence service that analyzes content in images and video.