OCR - Optical Character Recognition

OCR, or Optical Character Recognition, is also referred to as text recognition or text extraction. Machine-learning-based OCR techniques allow you to extract printed or handwritten text from images, such as posters, street signs, and product labels, as well as from documents like articles, reports, forms, and invoices. The text is typically extracted as words, text lines, and paragraphs or text blocks, giving you access to a digital version of the scanned text. This eliminates or significantly reduces the need for manual data entry.

Intelligent Document Processing (IDP) uses OCR as its foundational technology to additionally extract structure, relationships, key-values, entities, and other document-centric insights with an advanced machine-learning based AI service like Form Recognizer. Form Recognizer includes a document-optimized version of Read as its OCR engine while delegating to other models for higher-end insights. If you are extracting text from scanned and digital documents, use Form Recognizer Read OCR.

OCR engine

Microsoft's Read OCR engine is composed of multiple advanced machine-learning-based models supporting global languages. This allows it to extract printed and handwritten text, including mixed languages and writing styles. Read is available as a cloud service and as an on-premises container for deployment flexibility. With the latest preview, it's also available as a synchronous API for single, non-document, image-only scenarios, with performance enhancements that make it easier to implement OCR-assisted user experiences.

Warning

The Computer Vision legacy ocr and RecognizeText operations are no longer supported and should not be used.

OCR (Read) editions

Important

Select the Read edition that best fits your requirements.

Input | Examples | Read edition | Benefit
Images: General, in-the-wild images | labels, street signs, and posters | Computer Vision v4.0 preview | Optimized for general, non-document images with a performance-enhanced synchronous API that makes it easier to embed OCR in your user experience scenarios.
Documents: Digital and scanned, including images | books, articles, and reports | Form Recognizer | Optimized for text-heavy scanned and digital documents with an asynchronous API to help automate intelligent document processing at scale.

About Computer Vision v3.2 GA Read

Looking for the most recent Computer Vision v3.2 GA Read? Note that all future Read OCR enhancements will be part of the two services listed above. There will be no further updates to Computer Vision v3.2. To continue, see the Computer Vision v3.2 GA Read overview and quickstart.

How to use OCR

Try out OCR by using Vision Studio. Then follow the link to the Read edition that best meets your requirements.

Screenshot: Read OCR demo in Vision Studio.

OCR supported languages

Both Read versions available today in Computer Vision support several languages for printed and handwritten text. OCR for printed text includes support for English, French, German, Italian, Portuguese, Spanish, Chinese, Japanese, Korean, Russian, Arabic, Hindi, and other international languages that use Latin, Cyrillic, Arabic, and Devanagari scripts. OCR for handwritten text includes support for English, Chinese Simplified, French, German, Italian, Japanese, Korean, Portuguese, and Spanish languages.

Refer to the full list of OCR-supported languages.

OCR common features

The Read OCR model is available in both Computer Vision and Form Recognizer. The two services share common baseline capabilities, and each is optimized for its respective scenarios. The following list summarizes the common features:

  • Printed and handwritten text extraction in supported languages
  • Pages, text lines and words with location and confidence scores
  • Support for mixed languages and mixed mode (printed and handwritten)
  • Available as a Distroless Docker container for on-premises deployment
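
To illustrate the hierarchy of pages, text lines, and words listed above, here is a minimal Python sketch that walks a Read-style result and flags low-confidence words. It assumes the result follows the JSON layout of the Computer Vision v3.2 Read response (analyzeResult, readResults, lines, words, boundingBox, confidence); other Read editions may name these fields differently. The sample payload is hand-written for illustration, and the helper names iter_words and low_confidence_words, as well as the 0.8 threshold, are hypothetical.

```python
from typing import Iterator, List, NamedTuple


class Word(NamedTuple):
    page: int            # 1-based page number reported by the service
    text: str            # recognized word
    confidence: float    # score between 0 and 1
    bounding_box: list   # eight numbers: x, y for each of the four corners


def iter_words(result: dict) -> Iterator[Word]:
    """Yield every word in a Read-style result, page by page, line by line."""
    for page in result["analyzeResult"]["readResults"]:
        for line in page["lines"]:
            for word in line["words"]:
                yield Word(
                    page["page"],
                    word["text"],
                    word["confidence"],
                    word["boundingBox"],
                )


def low_confidence_words(result: dict, threshold: float = 0.8) -> List[Word]:
    """Return the words whose confidence is below the (assumed) threshold."""
    return [w for w in iter_words(result) if w.confidence < threshold]


# Hand-written sample for illustration only; not real service output.
SAMPLE_RESULT = {
    "status": "succeeded",
    "analyzeResult": {
        "readResults": [
            {
                "page": 1,
                "lines": [
                    {
                        "text": "Total due",
                        "boundingBox": [10, 10, 120, 10, 120, 30, 10, 30],
                        "words": [
                            {
                                "text": "Total",
                                "confidence": 0.98,
                                "boundingBox": [10, 10, 60, 10, 60, 30, 10, 30],
                            },
                            {
                                "text": "due",
                                "confidence": 0.61,
                                "boundingBox": [70, 10, 120, 10, 120, 30, 70, 30],
                            },
                        ],
                    }
                ],
            }
        ]
    },
}

if __name__ == "__main__":
    for w in low_confidence_words(SAMPLE_RESULT):
        print(f"page {w.page}: {w.text!r} ({w.confidence:.2f}) at {w.bounding_box}")
```

A filter like this is one way to route uncertain words to manual review instead of re-keying the whole document.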

Use the OCR cloud APIs or deploy on-premises

The cloud APIs are the preferred option for most customers because of their ease of integration and fast productivity out of the box. Azure and the Computer Vision service handle scale, performance, data security, and compliance needs while you focus on meeting your customers' needs.
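
As a rough sketch of what calling the cloud API can look like, the following Python example targets the Computer Vision v3.2 GA Read operation, which is asynchronous: you submit an image, then poll the URL returned in the Operation-Location header until the status is succeeded or failed. The endpoint, key, and image URL are placeholders you must supply, the function names are hypothetical, and the polling interval and limit are arbitrary assumptions. Only the Python standard library is used here; the official SDKs or the quickstart may be a better starting point for production code.

```python
import json
import time
import urllib.request


def build_analyze_request(endpoint: str, key: str, image_url: str) -> urllib.request.Request:
    """Build the POST request that submits an image URL to v3.2 Read."""
    url = endpoint.rstrip("/") + "/vision/v3.2/read/analyze"
    body = json.dumps({"url": image_url}).encode("utf-8")
    headers = {
        "Ocp-Apim-Subscription-Key": key,
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def read_text(endpoint: str, key: str, image_url: str,
              poll_seconds: float = 1.0, max_polls: int = 30) -> dict:
    """Submit an image and poll until the Read operation finishes.

    poll_seconds and max_polls are illustrative defaults, not service guidance.
    """
    request = build_analyze_request(endpoint, key, image_url)
    with urllib.request.urlopen(request) as response:
        # The service accepts the job and returns where to fetch the result.
        operation_url = response.headers["Operation-Location"]

    poll = urllib.request.Request(
        operation_url, headers={"Ocp-Apim-Subscription-Key": key}
    )
    for _ in range(max_polls):
        with urllib.request.urlopen(poll) as response:
            result = json.load(response)
        if result.get("status") in ("succeeded", "failed"):
            return result
        time.sleep(poll_seconds)
    raise TimeoutError("Read operation did not finish in time")


# Example call (placeholders, requires a real resource and key):
# result = read_text("https://<your-resource>.cognitiveservices.azure.com",
#                    "<your-key>", "https://example.com/sign.jpg")
```

Separating request construction from the network call keeps the URL and header logic easy to check without a live resource.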

For on-premises deployment, the Read Docker container (preview) enables you to deploy the Computer Vision v3.2 generally available OCR capabilities in your own local environment. Containers are a good fit when you have specific security and data governance requirements.

OCR data privacy and security

As with all of the Cognitive Services, developers using the Computer Vision service should be aware of Microsoft's policies on customer data. See the Cognitive Services page on the Microsoft Trust Center to learn more.

Next steps