With the MS Vision Cognitive Services Read OCR, I get overlapping bounding boxes for words, which is not correct. What is the solution?

Question

When calling the Azure AI Vision 3.2 GA Read API with the latest GA model to read machine-printed text via the endpoint https://{endpoint}/vision/v3.2/read/analyze, we observe the following:

1.) Sporadically overlapping bounding boxes for recognized words. This happens particularly when lines are written close together. In such cases, two words on different lines can partly overlap, which causes issues in further processing.

2.) The overlap arises because the bounding boxes of the recognized words often do not match the words in the image exactly: they are frequently somewhat larger than the actual word rectangles.
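
To make the issue concrete, here is a minimal Python sketch of how such overlaps can be detected in one page of the Read result. It assumes the v3.2 response layout, where each entry in `readResults` has `lines`, each line has `words`, and each word carries a `boundingBox` of eight numbers (four corner points). The helper names and the sample data are hypothetical and only for illustration.

```python
from itertools import combinations


def to_rect(bounding_box):
    """Convert an 8-number polygon [x1, y1, ..., x4, y4] to (left, top, right, bottom)."""
    xs = bounding_box[0::2]
    ys = bounding_box[1::2]
    return min(xs), min(ys), max(xs), max(ys)


def overlap_area(a, b):
    """Area of the intersection of two axis-aligned rectangles (0 if they are disjoint)."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return max(width, 0) * max(height, 0)


def find_overlapping_words(read_result):
    """Return (text_a, text_b, area) for words on different lines whose boxes overlap."""
    words = [
        (line_index, word["text"], to_rect(word["boundingBox"]))
        for line_index, line in enumerate(read_result["lines"])
        for word in line["words"]
    ]
    return [
        (text_a, text_b, overlap_area(rect_a, rect_b))
        for (line_a, text_a, rect_a), (line_b, text_b, rect_b) in combinations(words, 2)
        if line_a != line_b and overlap_area(rect_a, rect_b) > 0
    ]


# Hypothetical sample: two closely spaced lines, one word each.
sample_page = {
    "lines": [
        {"words": [{"text": "Invoice", "boundingBox": [10, 10, 90, 10, 90, 32, 10, 32]}]},
        {"words": [{"text": "Number", "boundingBox": [12, 28, 85, 28, 85, 50, 12, 50]}]},
    ]
}
overlaps = find_overlapping_words(sample_page)
```

In this sample the box of "Invoice" extends 4 units into the line below, which is the kind of partial overlap described in point 1.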

Answer

Hello @Thomas Pfaendler, thanks for using the Microsoft Q&A platform.

Could you share the document with us so that we can reproduce the issue on our end?

I suggest trying the Document Intelligence Read or prebuilt models, if that fits your use case, as they work better with documents. These models are optimized for text-heavy scanned and digital documents.
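
If switching models is not an option, one possible client-side workaround (not an official fix) is to trim overlapping word boxes after the fact. The Python sketch below assumes the boxes have already been reduced to axis-aligned `[left, top, right, bottom]` rectangles and that the two words come from adjacent lines with only a partial vertical overlap. The function name and the midpoint rule are assumptions chosen for illustration.

```python
def split_vertical_overlap(rect_a, rect_b):
    """Trim two [left, top, right, bottom] rects so they no longer overlap vertically.

    Returns (upper, lower) as new lists, ordered by their top edge.
    Rects that do not overlap are returned unchanged.
    """
    upper, lower = sorted((list(rect_a), list(rect_b)), key=lambda r: r[1])
    overlaps_horizontally = upper[0] < lower[2] and lower[0] < upper[2]
    overlaps_vertically = lower[1] < upper[3]
    if overlaps_horizontally and overlaps_vertically:
        # Assumption: place the new shared edge halfway through the overlapping band.
        boundary = (upper[3] + lower[1]) / 2
        upper[3] = boundary
        lower[1] = boundary
    return upper, lower


# Hypothetical boxes for two words on closely spaced lines.
upper_word, lower_word = split_vertical_overlap([10, 10, 90, 32], [12, 28, 85, 50])
```

Splitting at the midpoint treats both words equally. Since the boxes are reported to be larger than the actual words, the trimmed strip should mostly contain padding, but this is worth verifying on your own documents.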
