OCR seems to get characters wrong.

nick 6 Reputation points
2021-03-17T09:16:25.493+00:00

I am running a few documents through Computer Vision to extract the text within them. These documents contain no handwritten characters, and the font used seems reasonably easy to read (to me, at least; I am not a computer).

Unfortunately, I am encountering a lot of consistency issues, as Computer Vision frequently mistakes one character for another.
Examples:

  • I for 1
  • O for Q

Does anybody have any experience or advice on what can be done to resolve this?

An example screenshot is attached. I searched the forums for similar issues, and this might be caused by the issue described in read-ocr-bounding-box-accuracy.html

78638-screen-shot-2021-03-17-at-82337-pm.png

Azure Computer Vision
An Azure artificial intelligence service that analyzes content in images and video.

1 answer

  1. romungi-MSFT 46,986 Reputation points Microsoft Employee
    2021-03-17T16:39:35.917+00:00

    @nick I have used the Read API and checked the accuracy, and it seems to have detected the characters correctly.

    78896-image.png

    The response from the OCR API also seems accurate.

    78836-image.png
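
If the mix-ups persist on your side, one way to narrow them down is to look at the per-word confidence values in the Read response. The snippet below is only a rough sketch, not an official sample: it assumes the v3.x Read JSON layout (`analyzeResult` > `readResults` > `lines` > `words`, each word carrying `text` and `confidence`), and the names `flag_suspect_words`, `CONFUSABLE`, the 0.9 threshold, and the `sample` payload are all made up for illustration.

```python
from typing import Any, Dict, Iterator, List, Tuple

# Characters the question reports as being confused (I/1 and O/Q).
# This set is an illustrative assumption; extend it for your documents.
CONFUSABLE = set("I1OQ")


def iter_words(read_result: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield every word entry, assuming the Read v3.x JSON layout."""
    pages = read_result.get("analyzeResult", {}).get("readResults", [])
    for page in pages:
        for line in page.get("lines", []):
            for word in line.get("words", []):
                yield word


def flag_suspect_words(
    read_result: Dict[str, Any], threshold: float = 0.9
) -> List[Tuple[str, float]]:
    """Return (text, confidence) for words that score below the threshold
    and contain at least one easily confused character."""
    suspects = []
    for word in iter_words(read_result):
        text = word.get("text", "")
        confidence = word.get("confidence", 1.0)
        if confidence < threshold and CONFUSABLE & set(text):
            suspects.append((text, confidence))
    return suspects


# Hand-written sample payload (not real service output).
sample = {
    "analyzeResult": {
        "readResults": [
            {
                "lines": [
                    {
                        "text": "INVOICE QI23",
                        "words": [
                            {"text": "INVOICE", "confidence": 0.98},
                            {"text": "QI23", "confidence": 0.61},
                        ],
                    }
                ]
            }
        ]
    }
}

print(flag_suspect_words(sample))
```

Words flagged this way can then be reviewed by hand or checked against an expected pattern (for example, an ID field that should contain only digits), rather than trusting every character equally.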

