Edit

Share via


Text recognizer walkthrough

This short tutorial walks through the text recognition functionality included in the Windows AI Foundry Sample for WinForms. Specifically, it demonstrates how to use Windows AI Foundry APIs to perform text recognition on an image and summarize the recognized text.

Prerequisites

Complete the steps in the Getting Started page for WinForms.

Introduction

The MainForm class in MainForm.cs is the main user interface for the Windows AI Foundry Sample application and implements the following functionality:

  • Select File: Lets the user select an image file from their file system and displays that image in a PictureBox.
  • Process Image: Processes the selected image to extract text using Optical Character Recognition (OCR) and then summarizes the extracted text.

Key functions and event handlers

Some of the more significant functions and event handlers in the Windows AI Foundry Sample for WinForms include the following:

  • SelectFile_Click: Opens a file dialog for the user to select an image file and displays the selected image.
  • ProcessButton_Click: Handles the processing of the selected image, including loading AI models, performing text recognition, and summarizing the text.
  • LoadAIModels: Loads the necessary AI models (TextRecognizer and LanguageModel) for text recognition and summarization.
  • PerformTextRecognition: Uses the TextRecognizer to perform OCR on the selected image and extracts the text. This function is included in the following Text recognition example.
  • SummarizeImageText: Uses the LanguageModel to generate a summary of the extracted text given a prompt.

Text recognition example

The PerformTextRecognition function in this example

The input image.

The initialized sample app.

private async Task<string> PerformTextRecognition()
{
    using TextRecognizer textRecognizer = await TextRecognizer.CreateAsync();
    ImageBuffer? imageBuffer = await LoadImageBufferFromFileAsync(pathToImage);

    if (imageBuffer == null)
    {
        throw new Exception("Failed to load image buffer.");
    }

    RecognizedText recognizedText = 
        textRecognizer!.RecognizeTextFromImage(imageBuffer);

    var recognizedTextLines = recognizedText.Lines.Select(line => line.Text);
    string text = string.Join(Environment.NewLine, recognizedTextLines);

    richTextBoxForImageText.Text = text;
    return text;
}

Sample app after capturing image text (displayed in bottom left box) and summarizing image text (displayed in bottom right box).

Build and run the sample

  1. Clone the repository onto your Copilot+PC.
  2. Open the solution file WindowsCopilotRuntimeSample.sln in Visual Studio 2022.
  3. Change the Solution Platform to match the architecture of your Copilot+ PC.
  4. Right-click on the solution in Solution Explorer and select "Build" to build solution.
  5. Once the build is successful, right-click on the project in Solution Explorer and select "Set as Startup Project".
  6. Press F5 or select "Start Debugging" from the Debug menu to run the sample. Note: The sample can also be run without debugging by selecting "Start Without Debugging" from the Debug menu or Ctrl+F5.

See also