Text Extractor utility

Text Extractor enables you to copy text from anywhere on your screen, including inside images or videos. This code is based on Joe Finney's Text Grab.

How to activate

Press the activation shortcut (default: ⊞ Win+Shift+T) to show an overlay on the screen. Click and hold your primary mouse button, then drag to select the region you want to capture. The recognized text is saved to your clipboard.

How to deactivate

Capture mode is deactivated immediately after text in the selected region is recognized and copied to the clipboard. You can exit capture mode by pressing Esc at any moment.

Adjust while trying to capture

Hold Shift while dragging to switch from resizing the capture region to moving it. Release Shift to go back to resizing.

Important

  1. The produced text may not be perfect, so proofread the output.
  2. This tool uses OCR (Optical Character Recognition) to read text on the screen.
  3. The default language is based on your Windows system language and keyboard settings (additional OCR language packs are available for install).

Settings

From the Settings menu, the following options can be configured:

Setting              Description
Activation shortcut  The customizable keyboard command to turn this module on or off.
Preferred language   The language used for OCR.

Supported languages

Text Extractor can only recognize languages that have the OCR language pack installed.

The list can be obtained via PowerShell by running the following commands:

# Please use Windows PowerShell, not PowerShell 7 as these aren't .NET Core libraries

[Windows.Media.Ocr.OcrEngine, Windows.Foundation, ContentType = WindowsRuntime]

[Windows.Media.Ocr.OcrEngine]::AvailableRecognizerLanguages

How to query for OCR language packs

To return the list of all supported language packs, open PowerShell as an Administrator (right-click, then select "Run as Administrator"), and enter the following command:

Get-WindowsCapability -Online | Where-Object { $_.Name -Like 'Language.OCR*' }

An example output:

Name  : Language.OCR~~~el-GR~0.0.1.0
State : NotPresent

Name  : Language.OCR~~~en-GB~0.0.1.0
State : NotPresent

Name  : Language.OCR~~~en-US~0.0.1.0
State : Installed

Name  : Language.OCR~~~es-ES~0.0.1.0
State : NotPresent

Name  : Language.OCR~~~es-MX~0.0.1.0
State : NotPresent

The language and location are abbreviated, so "en-US" is "English (United States)" and "en-GB" is "English (United Kingdom)". If a language does not appear in the output, it is not supported by OCR. Languages with State: NotPresent must be installed before they can be used.
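To illustrate how the capability names in the output break down, the following sketch pulls the language tag and version out of a name such as Language.OCR~~~en-US~0.0.1.0. The function name ConvertFrom-OcrCapabilityName is hypothetical, and the pattern assumes that every name follows the Language.OCR~~~<tag>~<version> shape seen in the example output.

```powershell
function ConvertFrom-OcrCapabilityName {
    param(
        [Parameter(Mandatory)]
        [string]$Name   # for example 'Language.OCR~~~en-US~0.0.1.0'
    )

    # Assumed shape: Language.OCR~~~<language tag>~<version>
    if ($Name -match '^Language\.OCR~~~(?<Tag>[^~]+)~(?<Version>.+)$') {
        [pscustomobject]@{
            LanguageTag = $Matches['Tag']
            Version     = $Matches['Version']
        }
    }
    # Names that do not match the assumed shape produce no output.
}

# Possible usage in an elevated Windows PowerShell session,
# listing only the packs that are already installed:
# Get-WindowsCapability -Online |
#     Where-Object { $_.Name -like 'Language.OCR*' -and "$($_.State)" -eq 'Installed' } |
#     ForEach-Object { ConvertFrom-OcrCapabilityName -Name $_.Name }
```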

How to install an OCR language pack

The following commands install the OCR pack for "en-US":

$Capability = Get-WindowsCapability -Online | Where-Object { $_.Name -Like 'Language.OCR*en-US*' }
$Capability | Add-WindowsCapability -Online
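A slightly more defensive variant checks the State value first, so a pack that is already installed is skipped. This is a minimal sketch: Install-OcrLanguagePack is a hypothetical helper name, not part of PowerToys or Windows, and it assumes an elevated Windows PowerShell session.

```powershell
function Install-OcrLanguagePack {
    param(
        [Parameter(Mandatory)]
        [string]$LanguageTag   # for example 'en-US'
    )

    # Query the OCR capabilities, narrowed to the requested language tag.
    $capabilities = @(Get-WindowsCapability -Online |
        Where-Object { $_.Name -like "Language.OCR*$LanguageTag*" })

    if ($capabilities.Count -eq 0) {
        Write-Warning "No OCR language pack found for '$LanguageTag'."
        return
    }

    foreach ($capability in $capabilities) {
        if ("$($capability.State)" -eq 'Installed') {
            Write-Verbose "$($capability.Name) is already installed."
            continue
        }
        # Install only packs that are not present yet.
        $capability | Add-WindowsCapability -Online
    }
}

# Usage (hypothetical helper defined above):
# Install-OcrLanguagePack -LanguageTag 'en-US'
```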

How to remove an OCR language pack

The following commands remove the OCR pack for "en-US":

$Capability = Get-WindowsCapability -Online | Where-Object { $_.Name -Like 'Language.OCR*en-US*' }
$Capability | Remove-WindowsCapability -Online

Troubleshooting

This section lists possible errors and their solutions.

"No Possible OCR languages are installed."

This message is shown when there are no available languages for recognition.

If an OCR pack is supported and installed but still not available, and your system drive (X:) is not "C:", copy the X:/Windows/OCR folder to C:/Windows/OCR to fix the issue.
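The copy step could be scripted along these lines. This is a sketch under assumptions: Copy-OcrFolder is a hypothetical helper, the source defaults to the Windows\OCR folder on the drive named by the SystemDrive environment variable, and writing under C:/Windows requires an elevated session.

```powershell
function Copy-OcrFolder {
    param(
        # Assumed default: the OCR folder on the actual system drive.
        [string]$Source = "$env:SystemDrive\Windows\OCR",
        [string]$Destination = 'C:\Windows\OCR'
    )

    if (-not (Test-Path -LiteralPath $Source -PathType Container)) {
        throw "Source folder not found: $Source"
    }

    # Create the destination if it is missing, then copy the folder contents.
    # Copying 'Source\*' avoids ending up with a nested OCR\OCR folder
    # when the destination already exists.
    New-Item -ItemType Directory -Path $Destination -Force | Out-Null
    Copy-Item -Path (Join-Path $Source '*') -Destination $Destination -Recurse -Force
}

# Usage (hypothetical helper defined above), from an elevated session:
# Copy-OcrFolder
```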