What is Azure Form Recognizer?
This article applies to: Form Recognizer v3.0. Earlier version: Form Recognizer v2.1
Azure Form Recognizer is a cloud-based Azure Applied AI Service that enables you to build intelligent document processing solutions. Massive amounts of data, spanning a wide variety of data types, are stored in forms and documents. Form Recognizer helps you keep pace with the rate at which that data is collected and processed, which supports improved operations and informed, data-driven decisions.
| ✔️ Document analysis models | ✔️ Prebuilt models | ✔️ Custom models | ✔️ Gated preview models |
Document analysis models
Document analysis models extract text from forms and documents and return structured, business-ready content that your organization can act on.
Read | Extract printed and handwritten text.
Layout | Extract text and document structure.
General document | Extract text, structure, and key-value pairs.
Prebuilt models
Prebuilt models enable you to add intelligent document processing to your apps and flows without having to train and build your own models.
Invoice | Extract customer and vendor details.
Receipt | Extract sales transaction details.
Identity | Extract identification and verification details.
🆕 Insurance card | Extract health insurance details.
W-2 | Extract taxable compensation details.
Business card | Extract business contact details.
Contract | Extract agreement and party details.
Custom models
Custom models are trained using your labeled datasets to extract distinct data from forms and documents, specific to your use cases. Standalone custom models can be combined to create composed models.
Extraction models
Custom extraction models are trained to extract labeled fields from documents.
Custom template | Extract data from static layouts.
Custom neural | Extract data from mixed-type documents.
Custom composed | Extract data using a collection of models.
Classification model
Custom classifiers analyze input documents to identify document types prior to invoking an extraction model.
Custom classifier | Identify designated document types (classes) prior to invoking an extraction model.
Gated preview models
Form Recognizer Studio preview features are currently in gated preview. Features, approaches, and processes may change before General Availability (GA), based on user feedback. Complete and submit the Form Recognizer private preview request form to request access.
US Tax 1098-E form | Extract student loan interest details.
US Tax 1098 form | Extract mortgage interest details.
US Tax 1098-T form | Extract qualified tuition details.
Models and development options
Note
The following document understanding models and development options are supported by the Form Recognizer service v3.0.
You can use Form Recognizer to automate document processing in applications and workflows, enhance data-driven strategies, and enrich document search capabilities. Use the links in the table to learn more about each model and browse development options.
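As a rough illustration of the REST API development option, the sketch below assembles (but does not send) an analyze request for one of the prebuilt models. The API version string, the URL pattern, the header name, and the model IDs are assumptions based on the v3.0 REST API as commonly documented and aren't stated in this article; the function name `build_analyze_request` is hypothetical. Verify each value against the REST API reference before relying on it.

```python
from urllib.parse import quote, urlencode

# Assumed v3.0 API version and prebuilt model IDs; confirm in the REST API reference.
API_VERSION = "2022-08-31"
PREBUILT_MODEL_IDS = {
    "read": "prebuilt-read",
    "layout": "prebuilt-layout",
    "general document": "prebuilt-document",
    "invoice": "prebuilt-invoice",
    "receipt": "prebuilt-receipt",
    "identity": "prebuilt-idDocument",
    "business card": "prebuilt-businessCard",
}


def build_analyze_request(endpoint, key, model_name, document_url):
    """Return (url, headers, body) for an analyze call. Nothing is sent."""
    model_id = PREBUILT_MODEL_IDS[model_name.lower()]
    query = urlencode({"api-version": API_VERSION})
    url = (
        f"{endpoint.rstrip('/')}/formrecognizer/documentModels/"
        f"{quote(model_id)}:analyze?{query}"
    )
    headers = {
        "Ocp-Apim-Subscription-Key": key,  # resource key from the Azure portal
        "Content-Type": "application/json",
    }
    body = {"urlSource": document_url}  # publicly reachable document URL
    return url, headers, body
```

The returned tuple can be passed to any HTTP client. Keeping request construction separate from sending makes it easy to inspect the request before any call reaches the service.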
Read
About | Description | Automation use cases | Development options |
---|---|---|---|
Read OCR model | ● Extract text from documents. ● Data and field extraction | ● Contract processing. ● Financial or medical report processing. | ● Form Recognizer Studio ● REST API ● C# SDK ● Python SDK ● Java SDK ● JavaScript |
Layout
About | Description | Automation use cases | Development options |
---|---|---|---|
Layout analysis model | ● Extract text and layout information from documents. ● Data and field extraction ● Layout API has been updated to a prebuilt model. | ● Document indexing and retrieval by structure. ● Preprocessing prior to OCR analysis. | ● Form Recognizer Studio ● REST API ● C# SDK ● Python SDK ● Java SDK ● JavaScript |
General document
About | Description | Automation use cases | Development options |
---|---|---|---|
General document model | ● Extract text, layout, and key-value pairs from documents. ● Data and field extraction | ● Key-value pair extraction. ● Form processing. ● Survey data collection and analysis. | ● Form Recognizer Studio ● REST API ● C# SDK ● Python SDK ● Java SDK ● JavaScript |
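To make key-value pair extraction more concrete, here is a minimal sketch that flattens the key-value pairs of an analyze result into a dictionary. It assumes the result is JSON with an `analyzeResult.keyValuePairs` array whose items carry `key.content`, `value.content`, and `confidence`; that shape isn't described in this article, so treat it as an assumption. The function name and the sample data are hypothetical and hand-written, not real service output.

```python
def extract_key_value_pairs(result, min_confidence=0.0):
    """Return {key text: value text} from an analyze result dictionary."""
    pairs = {}
    for pair in result.get("analyzeResult", {}).get("keyValuePairs", []):
        if pair.get("confidence", 0.0) < min_confidence:
            continue  # skip low-confidence detections
        key_text = pair.get("key", {}).get("content", "")
        # A key may be detected without a value (for example, an empty field).
        value_text = (pair.get("value") or {}).get("content", "")
        pairs[key_text] = value_text
    return pairs


# Illustrative, hand-written sample in the assumed shape.
sample = {
    "analyzeResult": {
        "keyValuePairs": [
            {"key": {"content": "Name"}, "value": {"content": "Contoso"}, "confidence": 0.95},
            {"key": {"content": "Signature"}, "confidence": 0.60},
        ]
    }
}

print(extract_key_value_pairs(sample, min_confidence=0.8))
```

A confidence threshold like `min_confidence` is one simple way to decide which extracted fields go straight into a workflow and which are held for human review.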
Invoice
About | Description | Automation use cases | Development options |
---|---|---|---|
Invoice model | ● Extract key information from invoices. ● Data and field extraction | ● Accounts payable processing. ● Automated tax recording and reporting. | ● Form Recognizer Studio ● REST API ● C# SDK ● Python SDK ● Java SDK ● JavaScript |
Receipt
About | Description | Automation use cases | Development options |
---|---|---|---|
Receipt model | ● Extract key information from receipts. ● Data and field extraction ● Receipt model v3.0 supports processing of single-page hotel receipts. | ● Expense management. ● Consumer behavior data analysis. ● Customer loyalty program. ● Merchandise return processing. ● Automated tax recording and reporting. | ● Form Recognizer Studio ● REST API ● C# SDK ● Python SDK ● Java SDK ● JavaScript |
Identity (ID)
About | Description | Automation use cases | Development options |
---|---|---|---|
Identity document (ID) model | ● Extract key information from passports and ID cards. ● Document types ● Extract endorsements, restrictions, and vehicle classifications from US driver's licenses. | ● Know your customer (KYC) financial services guidelines compliance. ● Medical account management. ● Identity checkpoints and gateways. ● Hotel registration. | ● Form Recognizer Studio ● REST API ● C# SDK ● Python SDK ● Java SDK ● JavaScript |
Health insurance card
About | Description | Automation use cases | Development options |
---|---|---|---|
Health insurance card | ● Extract key information from US health insurance cards. ● Data and field extraction | ● Coverage and eligibility verification. ● Predictive modeling. ● Value-based analytics. | ● Form Recognizer Studio ● REST API ● C# SDK ● Python SDK ● Java SDK ● JavaScript |
W-2
About | Description | Automation use cases | Development options |
---|---|---|---|
W-2 Form | ● Extract key information from IRS US W-2 tax forms (tax years 2018-2021). ● Data and field extraction | ● Automated tax document management. ● Mortgage loan application processing. | ● Form Recognizer Studio ● REST API ● C# SDK ● Python SDK ● Java SDK ● JavaScript |
Business card
About | Description | Automation use cases | Development options |
---|---|---|---|
Business card model | ● Extract key information from business cards. ● Data and field extraction | ● Sales lead and marketing management. | ● Form Recognizer Studio ● REST API ● C# SDK ● Python SDK ● Java SDK ● JavaScript |
Custom model overview
About | Description | Automation use cases | Development options |
---|---|---|---|
Custom model | Extracts information from forms and documents into structured data, based on a model created from a representative set of training documents. | Extract distinct data from forms and documents specific to your business and use cases. | ● Form Recognizer Studio ● REST API ● C# SDK ● Java SDK ● JavaScript SDK ● Python SDK |
Custom template
Note
To train a custom template model, set the buildMode property to template.
For more information, see Training a template model.
About | Description | Automation use cases | Development options |
---|---|---|---|
Custom Template model | The custom template model extracts labeled values and fields from structured and semi-structured documents. | Extract key data from highly structured documents with defined visual templates or common visual layouts, such as forms. | ● Form Recognizer Studio ● REST API ● C# SDK ● Python SDK ● Java SDK ● JavaScript SDK |
Custom neural
Note
To train a custom neural model, set the buildMode property to neural.
For more information, see Training a neural model.
About | Description | Automation use cases | Development options |
---|---|---|---|
Custom Neural model | The custom neural model is used to extract labeled data from structured (surveys, questionnaires), semi-structured (invoices, purchase orders), and unstructured documents (contracts, letters). | Extract text data, checkboxes, and tabular fields from structured and unstructured documents. | ● Form Recognizer Studio ● REST API ● C# SDK ● Java SDK ● JavaScript SDK ● Python SDK |
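The buildMode notes for custom template and custom neural models can be sketched as a small helper that prepares the JSON body of a build request. Only the `buildMode` property and its two values come from this article. The other field names (`modelId`, `azureBlobSource`, `containerUrl`) are assumptions about the v3.0 build request, and the function name, model ID, and container URL are hypothetical.

```python
import json

# The two build modes named in this article.
VALID_BUILD_MODES = ("template", "neural")


def build_model_request_body(model_id, build_mode, container_url):
    """Return the JSON body for a custom model build request (assumed shape)."""
    if build_mode not in VALID_BUILD_MODES:
        raise ValueError(
            f"buildMode must be one of {VALID_BUILD_MODES}, got {build_mode!r}"
        )
    return {
        "modelId": model_id,  # assumed field: name for the new model
        "buildMode": build_mode,  # "template" or "neural"
        "azureBlobSource": {"containerUrl": container_url},  # assumed field names
    }


body = build_model_request_body(
    "my-template-model",  # hypothetical model ID
    "template",
    "https://example.blob.core.windows.net/training",  # hypothetical container URL
)
print(json.dumps(body, indent=2))
```

Validating the mode before sending the request surfaces a typo such as `"Template"` locally instead of as a service error.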
Custom composed
About | Description | Automation use cases | Development options |
---|---|---|---|
Composed custom models | A composed model is created by taking a collection of custom models and assigning them to a single model built from your form types. | Useful when you've trained several models and want to group them to analyze similar form types like purchase orders. | ● Form Recognizer Studio ● REST API ● C# SDK ● Java SDK ● JavaScript SDK ● Python SDK |
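As a sketch of how a composed model groups existing custom models, the helper below builds a request body that lists the component models under a single new model ID. The field names `modelId` and `componentModels` are assumptions about the v3.0 compose request and aren't given in this article; the function name and all model IDs are hypothetical.

```python
def compose_request_body(composed_model_id, component_model_ids):
    """Return the JSON body for a compose request (assumed shape)."""
    if not component_model_ids:
        raise ValueError("A composed model needs at least one component model.")
    return {
        "modelId": composed_model_id,
        # Each component is referenced by the ID of an already trained custom model.
        "componentModels": [{"modelId": mid} for mid in component_model_ids],
    }


body = compose_request_body(
    "purchase-orders",  # hypothetical ID for the composed model
    ["po-supplier-a", "po-supplier-b"],  # hypothetical trained custom models
)
```

With a composed model, callers send every purchase order to one model ID rather than choosing among the component models themselves.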
Custom classification model
About | Description | Automation use cases | Development options |
---|---|---|---|
Custom classification model | Custom classification models combine layout and language features to detect, identify, and classify documents within an input file. | ● A loan application package containing an application form, payslip, and bank statement. ● A collection of scanned invoices. | ● Form Recognizer Studio ● REST API |
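The classify-then-extract pattern described for custom classifiers might look like the routing sketch below. It assumes the classifier returns a document type and a confidence score for each document it finds. The document types, model IDs, threshold, and function name are all hypothetical, and the calls to the service are left out.

```python
# Hypothetical mapping from classifier document types to extraction model IDs.
ROUTES = {
    "application-form": "loan-application-model",
    "payslip": "payslip-model",
    "bank-statement": "bank-statement-model",
}


def pick_extraction_model(doc_type, confidence, routes=ROUTES, min_confidence=0.5):
    """Return the extraction model ID for a classified document, or None.

    None means the document should not be sent to an extraction model,
    either because the classification is uncertain or the type is unknown.
    """
    if confidence < min_confidence:
        return None
    return routes.get(doc_type)
```

For the loan application package example, each detected document is classified first, and only documents with a confident, known type are passed on to the matching extraction model.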
Contract model (preview)
About | Development options |
---|---|
Extract contract agreement and party details. | ● Form Recognizer Studio ● REST API |
US tax 1098 form (preview)
About | Development options |
---|---|
Extract mortgage interest information and details. | ● Form Recognizer Studio ● REST API |
US tax 1098-E form (preview)
About | Development options |
---|---|
Extract student loan information and details. | ● Form Recognizer Studio ● REST API |
US tax 1098-T form (preview)
About | Development options |
---|---|
Extract tuition information and details. | ● Form Recognizer Studio ● REST API |
Azure Form Recognizer is a cloud-based Azure Applied AI Service for developers to build intelligent document processing solutions. Form Recognizer applies machine-learning-based optical character recognition (OCR) and document understanding technologies to extract text, tables, structure, and key-value pairs from documents. You can also label and train custom models to automate data extraction from structured, semi-structured, and unstructured documents. To learn more about each model, see the Concepts articles:
Model type | Model name |
---|---|
Document analysis model | ● Layout analysis model |
Prebuilt models | ● Invoice model ● Receipt model ● Identity document (ID) model ● Business card model |
Custom models | ● Custom model ● Composed model |
This article applies to: Form Recognizer v2.1. Later version: Form Recognizer v3.0
Form Recognizer models and development options
Tip
- For an enhanced experience and advanced model quality, try the Form Recognizer v3.0 Studio.
- The v3.0 Studio supports any model trained with v2.1 labeled data.
- You can refer to the API migration guide for detailed information about migrating from v2.1 to v3.0.
Note
The following models and development options are supported by the Form Recognizer service v2.1.
Use the links in the table to learn more about each model and browse the API references:
Model | Description | Development options |
---|---|---|
Layout analysis | Extraction and analysis of text, selection marks, tables, and bounding box coordinates from forms and documents. | ● Form Recognizer labeling tool ● REST API ● Client-library SDK ● Form Recognizer Docker container |
Custom model | Extraction and analysis of data from forms and documents specific to distinct business data and use cases. | ● Form Recognizer labeling tool ● REST API ● Sample Labeling Tool ● Form Recognizer Docker container |
Invoice model | Automated data processing and extraction of key information from sales invoices. | ● Form Recognizer labeling tool ● REST API ● Client-library SDK ● Form Recognizer Docker container |
Receipt model | Automated data processing and extraction of key information from sales receipts. | ● Form Recognizer labeling tool ● REST API ● Client-library SDK ● Form Recognizer Docker container |
Identity document (ID) model | Automated data processing and extraction of key information from US driver's licenses and international passports. | ● Form Recognizer labeling tool ● REST API ● Client-library SDK ● Form Recognizer Docker container |
Business card model | Automated data processing and extraction of key information from business cards. | ● Form Recognizer labeling tool ● REST API ● Client-library SDK ● Form Recognizer Docker container |
Data privacy and security
As with all AI services, developers using the Form Recognizer service should be aware of Microsoft policies on customer data. See our Data, privacy, and security for Form Recognizer page.
Next steps
Try processing your own forms and documents with the Form Recognizer Studio.
Complete a Form Recognizer quickstart and get started creating a document processing app in the development language of your choice.
Try processing your own forms and documents with the Form Recognizer Sample Labeling tool.