File extensions missing in OCR APA despite settings

Jim Gurley 21 Reputation points
2022-03-27T15:53:12.33+00:00

I'm trying to get an AutoIt3 script to work and I'm just a user of another script, so I don't have a lot of control over how it calls the OCR API.

If I provide a JPG image, the text returned by the OCR API omits words that also happen to be file extensions (at least XML, xml, and PDF). It recognizes other text with no issue. It also treats a double underscore "__" as a delimiter and turns the string into a table, jumbling everything up.
I'm trying to parse screens created by an app over which I have absolutely no control, and they are chock full of the above strings.

I have File Explorer options set to show extensions, and confirmed that Developer Settings shows the same option.

Are there some buried configuration options for the OCR API?

If I try to recognize a jpg image containing the string "XML hello" I get "hello" as the return value.

Windows 10

3 answers

  1. Jim Gurley 21 Reputation points
    2022-03-27T15:53:54.973+00:00

    OCR API, not APA :-)


  2. Jim Gurley 21 Reputation points
    2022-03-27T16:30:46.337+00:00

    It's not that simple. "XML Hello" translates fine if it comes from a Notepad screenshot. My image from my app doesn't work, despite enhancing contrast, etc. Not sure how to post images here.


  3. Limitless Technology 39,461 Reputation points
    2022-04-01T11:58:48.113+00:00

    Hello @Jim Gurley

    Have you visited the Autoitscript forums for assistance with this issue? This is not really a Windows issue.

    I do hope this answers your question.

    Thanks.

