OCR seems to get characters wrong.

Question

nick 6

I am running a few documents through Computer Vision to extract the text within them. The documents contain no handwritten characters, and the font seems reasonably easy to read (although I am not a computer).

Unfortunately, I am getting inconsistent results: Computer Vision frequently mistakes one character for another.
Examples:

I for 1
O for Q

Does anybody have any experience or advice on what can be done to resolve this?

I have attached an example screenshot. While scanning the forums for similar issues, I found one that might be the cause: read-ocr-bounding-box-accuracy.html

1 answer

Answer 1

romungi-MSFT 48,911 Microsoft Employee Moderator

@nick I tried the Read API and checked its accuracy; it seems to have detected the characters correctly.

The response from the OCR API also seems accurate.

romungi-MSFT 48,911 Reputation points Microsoft Employee Moderator

2021-03-24T07:06:18.913+00:00

@nick Did you get a chance to review the above response and try the scenario again?

nick 6 Reputation points

2021-03-24T22:27:24.383+00:00

Hey @romungi-MSFT, sorry for not getting back to you sooner.

After your comment, I noticed that the C# SDK I was using was calling the v2 Computer Vision API instead of v3. After updating to the latest version, that particular OCR issue was resolved.

Thanks for your help!
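
For anyone who hits the same thing, here is a minimal sketch of one way to spot an outdated API version, assuming you can see the REST URL your client ends up calling (for example in a request log). The helper names `api_version` and `uses_legacy_api` are made up for illustration, the host names are placeholders, and the sketch only assumes the URL carries a `/vision/v<major>.<minor>/` path segment. It is not part of any SDK.

```python
import re
from typing import Optional, Tuple

# Matches the version segment of a Computer Vision REST path, e.g. "/vision/v3.2/".
# An optional suffix such as "-preview.1" is tolerated but ignored.
_VERSION_RE = re.compile(r"/vision/v(\d+)\.(\d+)(?:-[\w.]+)?/")


def api_version(url: str) -> Optional[Tuple[int, int]]:
    """Return (major, minor) parsed from the URL, or None if no version segment is found."""
    match = _VERSION_RE.search(url)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def uses_legacy_api(url: str) -> bool:
    """True when the URL targets a major version below 3 (hypothetical helper)."""
    version = api_version(url)
    return version is not None and version[0] < 3


if __name__ == "__main__":
    # Placeholder resource names; substitute the URLs from your own request log.
    for url in (
        "https://example.cognitiveservices.azure.com/vision/v2.0/ocr",
        "https://example.cognitiveservices.azure.com/vision/v3.2/read/analyze",
    ):
        print(url, "-> legacy" if uses_legacy_api(url) else "-> v3 or newer")
```

If the check reports a legacy version, updating the SDK package (as described above) is what resolved it in this case.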
