READ OCR Bounding Box Accuracy

Question

Baruch Shadrouz 31

I am using the basic code for READ OCR that Microsoft gives in its getting started guide. The only modification is that I run the program with multiprocessing on 4 cores, so I make multiple calls to the API at the same time.

READ OCR sometimes gives me a bounding box that is either too large or too small for the word. You can see in the screenshot below that the word "40.00" was read completely, but the bounding box is too small and does not cover the entire word. The same issue happens with the word "Current". There are also cases where the bounding box is too large.

Any help would be appreciated!
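
For anyone trying to reproduce this, here is a minimal sketch for turning a returned word box into a plain rectangle before drawing it on the image. It assumes each word carries a `boundingBox` of eight numbers (the x, y coordinates of the four corners), which is how the Read 3.x JSON lays it out. The helper name `polygon_to_rect` and the sample word are made up for illustration and do not come from the sample code.

```python
from typing import List, Tuple


def polygon_to_rect(bounding_box: List[float]) -> Tuple[float, float, float, float]:
    """Return (left, top, right, bottom) of the rectangle enclosing the polygon.

    Assumes the eight numbers are x1, y1, x2, y2, x3, y3, x4, y4.
    """
    if len(bounding_box) != 8:
        raise ValueError("expected 8 numbers: x1, y1, x2, y2, x3, y3, x4, y4")
    xs = bounding_box[0::2]  # every even index is an x coordinate
    ys = bounding_box[1::2]  # every odd index is a y coordinate
    return min(xs), min(ys), max(xs), max(ys)


# Hypothetical word entry shaped like a word in a Read result (values invented).
word = {"text": "40.00", "boundingBox": [100, 50, 160, 52, 159, 72, 99, 70]}
left, top, right, bottom = polygon_to_rect(word["boundingBox"])
print(word["text"], left, top, right, bottom)
```

On a slightly skewed page, a rectangle drawn from only the first and third corners can end up narrower than the word, so using all four corners is one thing to rule out before concluding that the returned box itself is wrong.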

GiftA-MSFT 11,176 Reputation points

2021-03-08T17:36:22.483+00:00

Hi, can you please provide steps to reproduce this issue? If you have a repo to share, that would be helpful as well. Thanks.

GiftA-MSFT 11,176 Reputation points

2021-03-08T18:27:04.277+00:00

Using the sample, I'm able to get great accuracy. Please share steps to reproduce the issue you've described above. Thanks.

Baruch Shadrouz 31 Reputation points

2021-03-10T21:23:36.003+00:00

I have a parent program that creates 6-8 forks of the sample code that Microsoft provides for the Cognitive Services READ API. The files being passed to Microsoft are on my local machine. The only difference between my code and the sample code is that the output is saved to a file in a specific format.

The recognized text is good, but the "boundingbox" for each word seems to be off. You can see in the image posted in the question that the value "40.00" was read correctly, but the bounding box is not aligned correctly.

Thank you

GiftA-MSFT 11,176 Reputation points

2021-03-12T20:45:29.013+00:00

Hi, thanks for your feedback. We are still reviewing it and will share updates as soon as possible. Thanks.

6 answers

Answer 1

Mitchell 11

I have found the same. The image below shows Tesseract bounding boxes in blue and Azure Read API bounding boxes in red. Almost every bounding box from Azure is taller than the actual text, and nearly all are shifted slightly to the left.

Is there any update for a fix to this issue? @GiftA-MSFT
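
To put numbers on a comparison like this one, a rough sketch is to compute the overlap (intersection over union) and the per-edge offsets between a reference box and the box being checked. The `Box` type, the function names, and the sample coordinates are all assumptions made for illustration; they are not taken from Tesseract or the Read API, and both boxes are assumed to be in the same pixel coordinate space with y growing downward.

```python
from typing import Dict, NamedTuple


class Box(NamedTuple):
    left: float
    top: float
    right: float
    bottom: float


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes (0.0 when disjoint)."""
    inter_w = max(0.0, min(a.right, b.right) - max(a.left, b.left))
    inter_h = max(0.0, min(a.bottom, b.bottom) - max(a.top, b.top))
    inter = inter_w * inter_h
    area_a = (a.right - a.left) * (a.bottom - a.top)
    area_b = (b.right - b.left) * (b.bottom - b.top)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def edge_offsets(reference: Box, candidate: Box) -> Dict[str, float]:
    """Signed differences, candidate minus reference, for each edge and the height."""
    return {
        "left": candidate.left - reference.left,
        "top": candidate.top - reference.top,
        "right": candidate.right - reference.right,
        "bottom": candidate.bottom - reference.bottom,
        "height": (candidate.bottom - candidate.top) - (reference.bottom - reference.top),
    }


# Invented coordinates for the same word from two engines.
tesseract_box = Box(100, 50, 160, 70)
azure_box = Box(96, 46, 158, 74)

print(round(iou(tesseract_box, azure_box), 3))
print(edge_offsets(tesseract_box, azure_box))
```

A negative `left` offset together with a positive `height` difference across many words would match the "taller and shifted left" pattern described above, and averaging the offsets over a document gives a concrete figure to attach to a bug report.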

Answer 2

cr 6

I am struggling with the same problem as @Mitchell. Bounding boxes are inaccurate and often shifted to the left.
Is there any progress on this? Can we expect this problem to be fixed at some point? @GiftA-MSFT
As my use case depends on exact bounding boxes, I'd need to find another solution if fixing this is not on the roadmap.

Thanks!

Answer 3

Eike Thies 5

It is 2024 and this problem is still there. We see it on thousands of documents. What is even stranger is that in Document Intelligence Studio, hovering over a word shows the correct values, and the words are highlighted correctly. But the results in the JSON returned by the API, or in the "result" tab in Studio, are slightly off.
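
One thing that may be worth ruling out when JSON coordinates look off on a rendered page is a scale mismatch. As far as I know, the service reports each page's `width`, `height`, and `unit` (inches for PDFs, pixels for images), so polygon values have to be scaled to the size of the image you draw on. The sketch below assumes that layout; the function name and all sample numbers are hypothetical, and nothing here confirms this is the cause of the offsets reported in this thread.

```python
from typing import List


def scale_polygon(
    polygon: List[float],
    page_width: float,
    page_height: float,
    image_width_px: int,
    image_height_px: int,
) -> List[float]:
    """Scale x1, y1, ..., xn, yn from page units to pixels of the rendered image."""
    if len(polygon) % 2 != 0:
        raise ValueError("polygon must hold x, y pairs")
    scale_x = image_width_px / page_width    # pixels per page unit, horizontally
    scale_y = image_height_px / page_height  # pixels per page unit, vertically
    scaled = []
    for i in range(0, len(polygon), 2):
        scaled.append(polygon[i] * scale_x)
        scaled.append(polygon[i + 1] * scale_y)
    return scaled


# Invented example: an 8.5 x 11 inch page rendered to a 1700 x 2200 pixel image.
polygon_in_inches = [1.0, 2.0, 1.5, 2.0, 1.5, 2.25, 1.0, 2.25]
print(scale_polygon(polygon_in_inches, 8.5, 11.0, 1700, 2200))
```

Deriving the scale from the reported page size instead of a hard-coded DPI keeps the overlay consistent even when the page was rendered at a different resolution than expected.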

Answer 4

dj-ancora 1

I too am having the same problem, with the bounding boxes often being too spacious on the left, top, and bottom sides and not properly encompassing the text on the right side.

Answer 5

Anonymous

I also have the same problem. In most documents, the characters 1, D, and 5 are extracted incorrectly.
