How to parse numbers from screen

Mattia Fanti 356 Reputation points
2022-01-02T00:59:28.32+00:00

Hi, I'm trying to parse numbers, specifically percentages, from the screen using OCR.
There are plenty of topics online, but almost all of them are old, and all of them ask how to get text from an image.
In my case, I simply want a PictureBox that scans whatever text is underneath it.
I found a fairly old YouTube tutorial, and the code given in its description is:

Imports Emgu.CV
Imports Emgu.Util
Imports Emgu.CV.OCR
Imports Emgu.CV.Structure

Public Class Form1

Dim OCRz As Tesseract = New Tesseract("tessdata", "eng", Tesseract.OcrEngineMode.OEM_TESSERACT_ONLY)
Dim pic As Bitmap = New Bitmap(270, 100)
Dim gfx As Graphics = Graphics.FromImage(pic)

Private Sub Timer1_Tick(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Timer1.Tick

'If Windows XP
gfx.CopyFromScreen(New Point(Me.Location.X + PictureBox1.Location.X + 4, Me.Location.Y + PictureBox1.Location.Y + 30), New Point(0, 0), pic.Size)
PictureBox1.Image = pic

'If Windows 7
'gfx.CopyFromScreen(MousePosition, New Point(0, 0), pic.Size)

End Sub

Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click

OCRz.Recognize(New Image(Of Bgr, Byte)(pic))
RichTextBox1.Text = OCRz.GetText

End Sub
End Class

The code is more than 8 years old, so I think there must be something better. I installed Emgu from NuGet (the first and most downloaded Emgu package), but at runtime I get the error "Unable to create ocr model using Path 'tessdata' and language 'eng'." At compile time I get the error "'Tesseract' is not defined." That's really frustrating because I've been looking everywhere, including C# forums, but found nothing that helps. Do you have any solution? I would appreciate your help. Thanks.

Developer technologies VB

Accepted answer
  1. Castorix31 90,521 Reputation points
    2022-01-03T08:14:10.833+00:00

    Following the comments on the first sample, here is an updated sample that uses a transparent PictureBox (it seems to work better than with a Bitmap...).
    Add a PictureBox with a transparent color, a Button for the click, and a RichTextBox =>

    ' Add reference to : "C:\Program Files (x86)\Windows Kits\10\UnionMetadata\Windows.winmd"
    ' For Await :
    ' Add reference to : "C:\Program Files (x86)\Reference Assemblies\Microsoft\Framework\.NETCore\v4.5\System.Runtime.WindowsRuntime.dll"
    
    Imports Windows.Media.Ocr
    ' For .AsStream()
    Imports System.IO
    Imports System.Runtime.InteropServices.WindowsRuntime
    
    Public Class Form1
        Private Async Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
            Dim softwareBmp As Windows.Graphics.Imaging.SoftwareBitmap
            Using bmp As Bitmap = New Bitmap(PictureBox1.Width, PictureBox1.Height)
                Using g As Graphics = Graphics.FromImage(bmp)
                    Dim pt As Point = Me.PointToScreen(New Point(PictureBox1.Left, PictureBox1.Top))
                    g.CopyFromScreen(pt.X, pt.Y, 0, 0, bmp.Size, CopyPixelOperation.SourceCopy)
                    Using memStream = New Windows.Storage.Streams.InMemoryRandomAccessStream()
                        bmp.Save(memStream.AsStream(), System.Drawing.Imaging.ImageFormat.Bmp)
                        Dim decoder As Windows.Graphics.Imaging.BitmapDecoder = Await Windows.Graphics.Imaging.BitmapDecoder.CreateAsync(memStream)
                        softwareBmp = Await decoder.GetSoftwareBitmapAsync()
                    End Using
                End Using
            End Using
    
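            ' TryCreateFromUserProfileLanguages returns Nothing when no OCR language is available for the user profile
            Dim ocrEng = OcrEngine.TryCreateFromUserProfileLanguages()
            'Dim ocrEng = OcrEngine.TryCreateFromLanguage(New Windows.Globalization.Language("en-US"))
            If ocrEng Is Nothing Then
                RichTextBox1.Text = "No OCR language is available."
                Return
            End If
    
            ' For test (AvailableRecognizerLanguages and MaxImageDimension are Shared members of OcrEngine)
            Dim languages As IReadOnlyList(Of Windows.Globalization.Language) = OcrEngine.AvailableRecognizerLanguages
            For Each language In languages
                Console.WriteLine(language.LanguageTag)
            Next
            Dim r = ocrEng.RecognizerLanguage
            Dim n = OcrEngine.MaxImageDimension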
    
            Dim ocrResult = Await ocrEng.RecognizeAsync(softwareBmp)
            'Console.WriteLine(ocrResult.Text)
            RichTextBox1.Text = ocrResult.Text
            ' For test (extract lines from result text)
            Dim lines As IReadOnlyList(Of OcrLine) = ocrResult.Lines
            For Each line In lines
                Console.WriteLine(line.Text)
            Next
        End Sub
    End Class
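
    The question asks specifically for percentages, while the sample above stops at the raw recognized text. As a minimal sketch of that last step, the text in ocrResult.Text could be filtered with a regular expression. The module and function names (PercentageParser, ExtractPercentages) are hypothetical and not part of the sample above, and the pattern assumes a percentage is written as digits with an optional "." or "," decimal part followed by "%". OCR can misread characters (the first test returned "Forml Button I"), so misread digits would simply be missed by this filter.

    ```vb
    Imports System
    Imports System.Collections.Generic
    Imports System.Globalization
    Imports System.Text.RegularExpressions

    Module PercentageParser
        ' Hypothetical helper: returns every number that is followed by "%" in the given text.
        Public Function ExtractPercentages(text As String) As List(Of Double)
            Dim values As New List(Of Double)
            ' Digits, an optional decimal part using "." or ",", optional whitespace, then "%"
            For Each m As Match In Regex.Matches(text, "(\d+(?:[.,]\d+)?)\s*%")
                ' Normalize a decimal comma to a dot so the invariant culture can parse it
                Dim raw As String = m.Groups(1).Value.Replace(","c, "."c)
                Dim value As Double
                If Double.TryParse(raw, NumberStyles.Float, CultureInfo.InvariantCulture, value) Then
                    values.Add(value)
                End If
            Next
            Return values
        End Function
    End Module
    ```

    In Button1_Click it could be called after RecognizeAsync, for example: Dim percents = PercentageParser.ExtractPercentages(ocrResult.Text). For the input "CPU 45% Memory 72,5 %" the list would contain 45 and 72.5.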
    

1 additional answer

  1. Castorix31 90,521 Reputation points
    2022-01-02T14:13:02.21+00:00

    I tested Windows.Media.Ocr, but the result is not perfect.
    I get "Forml Button I" with this test on a Form with a Button:

    ' Add reference to : "C:\Program Files (x86)\Windows Kits\10\UnionMetadata\Windows.winmd"  
    ' For Await :  
    ' Add reference to : "C:\Program Files (x86)\Reference Assemblies\Microsoft\Framework\.NETCore\v4.5\System.Runtime.WindowsRuntime.dll"  
      
    Imports Windows.Media.Ocr  
      
    Public Class Form1  
        Private Async Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click  
            Dim sFile As String = "e:\ScreenCopy_Test.bmp"    
            Using bitmap As Bitmap = New Bitmap(Me.Width, Me.Height, System.Drawing.Imaging.PixelFormat.Format32bppArgb)  
                Using g As Graphics = Graphics.FromImage(bitmap)  
                    g.SmoothingMode = Drawing2D.SmoothingMode.HighQuality  
                    g.InterpolationMode = Drawing2D.InterpolationMode.HighQualityBilinear  
                    g.CompositingQuality = Drawing2D.CompositingQuality.HighQuality  
                    g.PixelOffsetMode = Drawing2D.PixelOffsetMode.HighQuality                
                    g.CopyFromScreen(Me.Left, Me.Top, 0, 0, bitmap.Size, CopyPixelOperation.SourceCopy)  
                End Using  
                bitmap.Save(sFile, System.Drawing.Imaging.ImageFormat.Bmp)  
            End Using  
      
            Dim ocrEng = OcrEngine.TryCreateFromUserProfileLanguages()  
            'Dim ocrEng = OcrEngine.TryCreateFromLanguage(New Windows.Globalization.Language("en-US"))  
            Dim languages As IReadOnlyList(Of Windows.Globalization.Language) = ocrEng.AvailableRecognizerLanguages  
            For Each language In languages  
                Console.WriteLine(language.LanguageTag)  
            Next  
            Dim r = ocrEng.RecognizerLanguage  
            Dim n = ocrEng.MaxImageDimension  
            Dim file = Await Windows.Storage.StorageFile.GetFileFromPathAsync(sFile)  
            Dim stream = Await file.OpenAsync(Windows.Storage.FileAccessMode.Read)  
            Dim bitmapDecoder = Await Windows.Graphics.Imaging.BitmapDecoder.CreateAsync(stream)  
            Dim softwareBitmap = Await bitmapDecoder.GetSoftwareBitmapAsync()  
            Dim ocrResult = Await ocrEng.RecognizeAsync(softwareBitmap)  
            Console.WriteLine(ocrResult.Text)  
        End Sub  
    End Class  
    
