What's new in Azure AI Document Intelligence
This content applies to: v4.0 (preview) v3.1 (GA) v3.0 (GA) v2.1 (GA)
Document Intelligence service is updated on an ongoing basis. Bookmark this page to stay up to date with release notes, feature enhancements, and our newest documentation.
Important
Preview API versions are retired once the GA API is released. The 2023-02-28-preview API version is being retired. If you are still using the preview API or the associated SDK versions, update your code to target the latest GA API version, 2023-07-31.
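When auditing an application for the retirement described above, a small lookup can flag version strings that need to change. The following is a minimal sketch, not an official tool: the `RETIRED_VERSIONS` table and the `replacement_for` helper are hypothetical names, and the only mapping included is the one stated in the notice above.

```python
# Hypothetical audit helper. The single mapping below comes from the notice
# above; extend the table yourself for other retired preview versions.
RETIRED_VERSIONS = {
    "2023-02-28-preview": "2023-07-31",
}


def replacement_for(api_version: str) -> str:
    """Return the GA version to migrate to, or the input if no change is listed."""
    return RETIRED_VERSIONS.get(api_version.strip(), api_version)


if __name__ == "__main__":
    for version in ("2023-02-28-preview", "2023-07-31"):
        print(version, "->", replacement_for(version))
```

Keeping the mapping in one table means a configuration check or unit test can call `replacement_for` instead of scattering version strings across the code base.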
August 2024
The Document Intelligence 2024-07-31-preview REST API is now available. This preview API introduces new and updated capabilities:
Public preview version 2024-07-31-preview is currently available only in the following Azure regions. The new document field extraction model in AI Studio is only available in North Central US region:
East US
West US2
West Europe
North Central US
Document field extraction (custom generative) model
- Use Generative AI to extract fields from documents and forms. Document Intelligence now offers a new document field extraction model that utilizes large language models (LLMs) to extract fields from unstructured documents or structured forms with a wide variety of visual templates. With grounded values and confidence scores, the new Generative AI based extraction fits into your existing processes.
Model compose with custom classifiers
- Document Intelligence now adds support for composing models with an explicit custom classification model. Learn more about the benefits of using the new compose capability.
- Custom classification model now supports updating the model in-place as well.
- Custom classification model adds support for model copy operation to enable backup and disaster recovery.
- Custom classification model now supports explicitly specifying pages to be classified from an input document.
- Extract information from Appraisal (Form 1004).
- Extract information from Validation of Employment (Form 1005).
- Extract payee, amount, date, and other relevant information from checks.
- New prebuilt model that processes pay stubs to extract wages, hours, deductions, net pay, and more.
- New prebuilt model that extracts account information, including beginning and ending balances and transaction details, from bank statements.
- New unified US tax model that can extract from forms such as W-2, 1098, 1099, and 1040.
Searchable PDF. The prebuilt read model now supports PDF output, so you can download PDFs with embedded text from the extraction results and use them in scenarios such as searching and copying content.
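As a rough illustration of the searchable PDF flow, the sketch below only builds the two request URLs involved: one to start a read analysis with PDF output and one to download the generated PDF. The route shapes and the `output=pdf` query parameter are assumptions based on the 2024-07-31-preview REST conventions, and `build_read_pdf_urls` is a hypothetical helper, so verify both against the REST reference before relying on them.

```python
from urllib.parse import urlencode

API_VERSION = "2024-07-31-preview"


def build_read_pdf_urls(endpoint: str, result_id: str) -> tuple:
    """Return (analyze_url, pdf_download_url) for the prebuilt read model.

    Assumption: the analyze call accepts output=pdf and the PDF is served
    from .../analyzeResults/{result_id}/pdf. Neither is confirmed here.
    """
    base = endpoint.rstrip("/") + "/documentintelligence/documentModels/prebuilt-read"
    analyze_query = urlencode({"api-version": API_VERSION, "output": "pdf"})
    download_query = urlencode({"api-version": API_VERSION})
    analyze_url = base + ":analyze?" + analyze_query
    download_url = base + "/analyzeResults/" + result_id + "/pdf?" + download_query
    return analyze_url, download_url


if __name__ == "__main__":
    for url in build_read_pdf_urls("https://example.invalid", "RESULT_ID"):
        print(url)
```

The `result_id` value would come from the analyze operation you start with the first URL; the placeholder above is illustrative only.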
Layout model now supports improved figure detection where figures from documents can now be downloaded as an image file to be used for further figure understanding. The layout model also features improvements to the OCR model for scanned text targeting improvements for single characters, boxed text, and dense text documents.
- Document Intelligence now supports a batch analysis operation for analyzing a set of documents, which simplifies the developer experience and improves efficiency.
- Query fields AI quality of extraction is improved with the latest model.
May 2024
The Document Intelligence Studio adds support for Microsoft Entra (formerly Azure Active Directory) authentication. For more information, see Document Intelligence Studio overview.
February 2024
The Document Intelligence 2024-02-29-preview REST API is now available. This preview API introduces new and updated capabilities:
Public preview version 2024-02-29-preview is currently available only in the following Azure regions:
- East US
- West US2
- West Europe
Layout model now supports figure detection and hierarchical document structure analysis (sections and subsections). The AI quality of reading order and logical roles detection is also improved.
- Custom extraction models now support cell, row, and table level confidence scores. Learn more about table, row, and cell confidence.
- Custom extraction models have AI quality improvements for field extraction.
- Custom template extraction model now supports extracting overlapping fields. Learn more about overlapping fields and how you use them.
- Custom classification model now supports incremental training for scenarios where you need to update the classifier model with added samples or classes. Learn more about incremental training.
- Custom classification model adds support for Office document types (.docx, .pptx, and .xls). Learn more about expanded document type support.
- Support for new locales:

| Locale | Code |
|---|---|
| Arabic | ar |
| Bulgarian | bg |
| Greek | el |
| Hebrew | he |
| Macedonian | mk |
| Russian | ru |
| Serbian Cyrillic | sr-cyrl |
| Ukrainian | uk |
| Thai | th |
| Turkish | tr |
| Vietnamese | vi |

- Support for new currency codes:

| Currency | Locale | Code |
|---|---|---|
| BAM | Bosnian Convertible Mark | ba |
| BGN | Bulgarian Lev | bg |
| ILS | Israeli New Shekel | il |
| MKD | Macedonian Denar | mk |
| RUB | Russian Ruble | ru |
| THB | Thai Baht | th |
| TRY | Turkish Lira | tr |
| UAH | Ukrainian Hryvnia | ua |
| VND | Vietnamese Dong | vn |

- Tax items support expansion for Germany (de), Spain (es), Portugal (pt), and English Canada (en-CA).
- Expanded field support for European Union IDs and driver license.
- Extract information from Uniform Residential Loan Application (Form 1003).
- Extract information from Uniform Underwriting and Transmittal Summary or Form 1008.
- Extract information from mortgage closing disclosure.
- Extract information from bank cards.
- New prebuilt to extract information from marriage certificates.
December 2023
The Document Intelligence client libraries targeting REST API 2023-10-31-preview are now available for use!
November 2023
The Document Intelligence 2023-10-31-preview REST API is now available. This preview API introduces new and updated capabilities:
Public preview version 2023-10-31-preview is currently only available in the following Azure regions:
- East US
- West US2
- West Europe
- Language expansion for handwriting: Russian (ru), Arabic (ar), Thai (th).
- Cyber Executive Order (EO) compliance.
- Support office and HTML files.
- Markdown output support.
- Table extraction, reading order, and section heading detection improvements.
- With the Document Intelligence 2023-10-31-preview, the general document model (prebuilt-document) is deprecated. Going forward, to extract key-value pairs from documents, use the prebuilt-layout model with the optional query string parameter features=keyValuePairs enabled.
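To make the migration from prebuilt-document concrete, here is a minimal sketch that builds the analyze request URL for prebuilt-layout with key-value pair extraction switched on. The model name and the `features=keyValuePairs` parameter come from the note above; the route shape under `/documentintelligence/documentModels/` and the `build_layout_analyze_url` helper are assumptions for illustration, so check them against the REST reference for your API version.

```python
from urllib.parse import urlencode


def build_layout_analyze_url(endpoint: str,
                             api_version: str = "2023-10-31-preview",
                             key_value_pairs: bool = True) -> str:
    """Build the analyze URL for prebuilt-layout.

    Assumption: the route shape below matches the preview REST API.
    """
    query = {"api-version": api_version}
    if key_value_pairs:
        # Optional feature that replaces the deprecated prebuilt-document model.
        query["features"] = "keyValuePairs"
    base = endpoint.rstrip("/")
    return (base + "/documentintelligence/documentModels/prebuilt-layout:analyze?"
            + urlencode(query))


if __name__ == "__main__":
    print(build_layout_analyze_url("https://example.invalid"))
```

Building the query with `urlencode` rather than string concatenation keeps the parameters correctly escaped if more optional features are added later.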
- Now extracts currency for all price-related fields.
- New field support for Medicare and Medicaid information.
- New 1099 tax model. Supports base 1099 form and the following variations: A, B, C, CAP, DIV, G, H, INT, K, LS, LTC, MISC, NEC, OID, PATR, Q, QA, R, S, SA, SB.
- Support for KVK field.
- Support for BPAY field.
- Numerous field refinements.
- Support for multi-language documents.
- New page splitting options: autosplit, always split by page, no split.
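The three page splitting options listed above could be selected per request along these lines. This is a sketch under stated assumptions: the `split` query parameter, its values (`auto`, `perPage`, `none`), the classifier route, and the `build_classify_url` helper are not confirmed by these release notes and should be verified against the REST reference.

```python
from urllib.parse import urlencode

# Assumed mapping from the option names in the release notes to query values.
SPLIT_VALUES = {
    "autosplit": "auto",
    "always split by page": "perPage",
    "no split": "none",
}


def build_classify_url(endpoint: str, classifier_id: str, split_option: str,
                       api_version: str = "2023-10-31-preview") -> str:
    """Build a classifier analyze URL with the chosen splitting behavior."""
    if split_option not in SPLIT_VALUES:
        raise ValueError(f"unknown split option: {split_option!r}")
    query = urlencode({"api-version": api_version,
                       "split": SPLIT_VALUES[split_option]})
    base = endpoint.rstrip("/")
    return (base + "/documentintelligence/documentClassifiers/"
            + classifier_id + ":analyze?" + query)


if __name__ == "__main__":
    print(build_classify_url("https://example.invalid", "my-classifier", "autosplit"))
```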
- Query fields are available with the 2023-10-31-preview release.
- Add-on capabilities are available within all models excluding the Read model.
Note
With the 2022-08-31 API general availability (GA) release, the associated preview APIs are being deprecated. If you are using the 2021-09-30-preview, the 2022-01-30-preview, or the 2022-06-30-preview API versions, update your applications to target the 2022-08-31 API version. There are a few minor changes involved. For more information, see the migration guide.
July 2023
Note
Form Recognizer is now Azure AI Document Intelligence!
- Azure AI services encompass all of what were previously known as Cognitive Services and Azure Applied AI Services.
- There are no changes to pricing.
- The names Cognitive Services and Azure Applied AI continue to be used in Azure billing, cost analysis, price list, and price APIs.
- There are no breaking changes to application programming interfaces (APIs) or client libraries.
- Some platforms are still awaiting the renaming update. All mention of Form Recognizer or Document Intelligence in our documentation refers to the same Azure service.
Document Intelligence v3.1 (GA)
The Document Intelligence version 3.1 API is now generally available (GA)! The API version corresponds to 2023-07-31.
The v3.1 API introduces new and updated capabilities:
- Document Intelligence APIs are now more modular and with support for optional features. You can now customize the output to specifically include the features you need. Learn more about the optional parameters.
- Document classification API for splitting a single file into individual documents. Learn more about document classification.
- Prebuilt contract model.
- Prebuilt US tax form 1098 model.
- Support for Office file types with Read API.
- Barcode recognition in documents.
- Formula recognition add-on capability.
- Font recognition add-on capability.
- Support for high resolution documents.
- Custom neural models now require a single labeled sample to train.
- Custom neural models language expansion. Train a neural model for documents in 30 languages. See language support for the complete list of supported languages.
- Prebuilt health insurance card model.
- Prebuilt invoice model locale expansion.
- Prebuilt receipt model language and locale expansion with more than 100 languages supported.
- Prebuilt ID model now supports European IDs.
Document Intelligence Studio UX Updates
Analyze Options
Document Intelligence now supports more sophisticated analysis capabilities and the Studio allows one entry point (Analyze options button) for configuring the add-on capabilities with ease.
Depending on the document extraction scenario, configure the analysis range, document page range, optional detection, and premium detection features.
Note
Font extraction is not visualized in Document Intelligence Studio. However, you can check the styles section of the JSON output for the font detection results.
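Because font results are only visible in the JSON, a short script can pull them out of the styles section. The sketch below runs against a small hand-written payload: the sample values are illustrative rather than real service output, and the property names (`similarFontFamily`, `fontWeight`, `spans`) are assumptions about the response shape that you should confirm against your own results.

```python
import json

# Illustrative payload only; not real service output.
SAMPLE = """
{"analyzeResult": {"content": "Hello World", "styles": [
  {"similarFontFamily": "Arial, sans-serif",
   "spans": [{"offset": 0, "length": 5}], "confidence": 0.95},
  {"fontWeight": "bold",
   "spans": [{"offset": 6, "length": 5}], "confidence": 0.9}
]}}
"""


def font_styles(result: dict) -> list:
    """Pair each style entry with the text its spans cover."""
    analyze_result = result["analyzeResult"]
    content = analyze_result["content"]
    pairs = []
    for style in analyze_result.get("styles", []):
        text = "".join(
            content[span["offset"]:span["offset"] + span["length"]]
            for span in style.get("spans", [])
        )
        # Everything except spans and confidence describes the font itself.
        attributes = {k: v for k, v in style.items()
                      if k not in ("spans", "confidence")}
        pairs.append((text, attributes))
    return pairs


if __name__ == "__main__":
    for text, attributes in font_styles(json.loads(SAMPLE)):
        print(repr(text), attributes)
```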
Auto labeling documents with prebuilt models or one of your own models
On the custom extraction model labeling page, you can now auto label your documents using one of the Document Intelligence Service prebuilt models or models you previously trained.
For some documents, there can be duplicate labels after running auto label. Make sure to modify the labels so that there are no duplicate labels in the labeling page afterwards.
Auto labeling tables
On the custom extraction model labeling page, you can now auto label the tables in the document without having to label the tables manually.
Add test files directly to your training dataset
Once you train a custom extraction model, make use of the test page to improve your model quality by uploading test documents to the training dataset if needed.
If a low confidence score is returned for some labels, make sure your labels are correct. If not, add them to the training dataset and relabel to improve the model quality.
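If you review test results programmatically rather than in the Studio, the same check can be scripted. This is a minimal sketch: the `low_confidence_fields` helper, the 0.8 threshold, and the sample field values are hypothetical, and the per-field `confidence` property is an assumption about the analyze result shape.

```python
def low_confidence_fields(fields: dict, threshold: float = 0.8) -> list:
    """Return the names of fields whose confidence is below the threshold.

    A field without a confidence value is treated as 0.0 and is flagged.
    """
    return sorted(
        name for name, field in fields.items()
        if field.get("confidence", 0.0) < threshold
    )


if __name__ == "__main__":
    # Illustrative values only; not real service output.
    sample_fields = {
        "VendorName": {"content": "Contoso", "confidence": 0.97},
        "InvoiceTotal": {"content": "$110.00", "confidence": 0.42},
    }
    print(low_confidence_fields(sample_fields))
```

Documents that repeatedly surface in this list are good candidates to relabel and add to the training dataset.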
Make use of the document list options and filters in custom projects
On the custom extraction model labeling page, you can now navigate through your training documents with ease by using the search, filter, and sort features.
Utilize the grid view to preview documents or use the list view to scroll through the documents more easily.
Project sharing
- Share custom extraction projects with ease. For more information, see Project sharing with custom models.
May 2023
Introducing refreshed documentation for Build 2023
Document Intelligence Overview has enhanced navigation, structured access points, and enriched images.
Choose a Document Intelligence model provides guidance for choosing the best Document Intelligence solution for your projects and workflows.
April 2023
Announcing the latest Document Intelligence client-library public preview release
Document Intelligence REST API Version 2023-02-28-preview supports the public preview release client libraries. This release includes the following new features and capabilities available for .NET/C# (4.1.0-beta-1), Java (4.1.0-beta-1), JavaScript (4.1.0-beta-1), and Python (3.3.0b.1) client libraries:
For more information, see Document Intelligence SDK (public preview) and the March 2023 release notes.
March 2023
Important
2023-02-28-preview capabilities are currently only available in the following regions:
- West Europe
- West US2
- East US
- Custom classification model is a new capability within Document Intelligence starting with the 2023-02-28-preview API.
- Query fields capabilities, added to the General Document model, use Azure OpenAI models to extract specific fields from documents. Try the General documents with query fields feature using the Document Intelligence Studio. Query fields are currently only active for resources in the East US region.
- Add-on capabilities:
- Font extraction is now recognized with the 2023-02-28-preview API.
- Formula extraction is now recognized with the 2023-02-28-preview API.
- High resolution extraction is now recognized with the 2023-02-28-preview API.
- Custom extraction model updates:
- Custom neural model now supports added languages for training and analysis. Train neural models for Dutch, French, German, Italian, and Spanish.
- Custom template model now has an improved signature detection capability.
- Document Intelligence Studio updates:
- In addition to support for all the new features like classification and query fields, the Studio now enables project sharing for custom model projects.
- New model additions in gated preview: Vaccination cards, Contracts, US Tax 1098, US Tax 1098-E, and US Tax 1098-T. To request access to gated preview models, complete and submit the Document Intelligence private preview request form.
- Receipt model updates:
- Receipt model adds support for thermal receipts.
- Receipt model now adds language support for 18 languages and three regional languages (English, French, Portuguese).
- Receipt model now supports TaxDetails extraction.
- Layout model now improves table recognition.
- Read model now adds improvement for single-digit character recognition.
February 2023
Select Document Intelligence containers for v3.0 are now available for use!
Currently Read v3.0 and Layout v3.0 containers are available.
For more information, see Install and run Document Intelligence containers.
January 2023
Prebuilt receipt model - added languages supported. The receipt model now supports these added languages and locales
- Japanese - Japan (ja-JP)
- French - Canada (fr-CA)
- Dutch - Netherlands (nl-NL)
- English - United Arab Emirates (en-AE)
- Portuguese - Brazil (pt-BR)
Prebuilt invoice model - added languages supported. The invoice model now supports these added languages and locales
- English - United States (en-US), Australia (en-AU), Canada (en-CA), United Kingdom (en-UK), India (en-IN)
- Spanish - Spain (es-ES)
- French - France (fr-FR)
- Italian - Italy (it-IT)
- Portuguese - Portugal (pt-PT)
- Dutch - Netherlands (nl-NL)
Prebuilt invoice model - added fields recognized. The invoice model now recognizes these added fields
- Currency code
- Payment options
- Total discount
- Tax items (en-IN only)
Prebuilt ID model - added document types supported. The ID model now supports these added document types
- US Military ID
Tip
All January 2023 updates are available with REST API version 2022-08-31 (GA).
Prebuilt receipt model: additional language support
The prebuilt receipt model adds support for the following languages:
- English - United Arab Emirates (en-AE)
- Dutch - Netherlands (nl-NL)
- French - Canada (fr-CA)
- German - Germany (de-DE)
- Italian - Italy (it-IT)
- Japanese - Japan (ja-JP)
- Portuguese - Brazil (pt-BR)
Prebuilt invoice model: additional language support and field extractions
The prebuilt invoice model adds support for the following languages:
- English - Australia (en-AU), Canada (en-CA), United Kingdom (en-UK), India (en-IN)
- Portuguese - Brazil (pt-BR)
The prebuilt invoice model now adds support for the following field extractions:
- Currency code
- Payment options
- Total discount
- Tax items (en-IN only)
Prebuilt ID document model: additional document types support
The prebuilt ID document model now adds support for the following document types:
- Driver's license expansion supporting India, Canada, United Kingdom, and Australia
- US military ID cards and documents
- India ID cards and documents (PAN and Aadhaar)
- Australia ID cards and documents (photo card, Key-pass ID)
- Canada ID cards and documents (identification card, Maple card)
- United Kingdom ID cards and documents (national/regional identity card)
December 2022
Document Intelligence Studio updates
The December Document Intelligence Studio release includes the latest updates to Document Intelligence Studio. There are significant improvements to user experience, primarily with custom model labeling support.
Page range. The Studio now supports analyzing specified pages from a document.
Custom model labeling:
Run Layout API automatically. You can opt to run the Layout API for all documents automatically in your blob storage during the setup process for custom model.
Search. The Studio now includes search functionality to locate words within a document. This improvement allows for easier navigation while labeling.
Navigation. You can select labels to target labeled words within a document.
Auto table labeling. After you select the table icon within a document, you can opt to autolabel the extracted table in the labeling view.
Label subtypes and second-level subtypes. The Studio now supports subtypes for table columns, table rows, and second-level subtypes for types such as dates and numbers.
Building custom neural models is now supported in the US Gov Virginia region.
Preview API versions 2022-01-30-preview and 2021-09-30-preview will be retired January 31, 2023. Update to the 2022-08-31 API version to avoid any service disruptions.
November 2022
- Announcing the latest stable release of Azure AI Document Intelligence libraries
- This release includes important changes and updates for .NET, Java, JavaScript, and Python client libraries. For more information, see Azure SDK DevBlog.
- The most significant enhancements are the introduction of two new clients, the DocumentAnalysisClient and the DocumentModelAdministrationClient.
October 2022
Document Intelligence versioned content
Document Intelligence documentation is updated to present a versioned experience. Now, you can choose to view content targeting the v3.0 GA experience or the v2.1 GA experience. The v3.0 experience is the default.
Document Intelligence Studio Sample Code
- Sample code for the Document Intelligence Studio labeling experience is now available on GitHub. Customers can develop and integrate Document Intelligence into their own UX or build their own new UX using the Document Intelligence Studio sample code.
Language expansion
- With the latest preview release, Document Intelligence's Read (OCR), Layout, and Custom template models support 134 new languages. These language additions include Greek, Latvian, Serbian, Thai, Ukrainian, and Vietnamese, along with several Latin, and Cyrillic languages. Document Intelligence now has a total of 299 supported languages across the most recent GA and new preview versions. Refer to the supported languages page to see all supported languages.
- Use the REST API parameter api-version=2022-06-30-preview when using the API or the corresponding SDK to support the new languages in your applications.
New Prebuilt Contract model
- A new prebuilt that extracts information from contracts such as parties, title, contract ID, execution date, and more. The contracts model is currently in preview; request access here.
Region expansion for training custom neural models
- Training custom neural models is now supported in these added regions:
- East US
- East US2
- US Gov Arizona
September 2022
Note
Starting with version 4.0.0, a new set of clients has been introduced to leverage the newest features of the Document Intelligence service.
SDK version 4.0.0 GA release includes the following updates:
- Version 4.0.0 GA (2022-09-08)
- Supports REST API v3.0 and v2.0 clients
Region expansion: training custom neural models is now supported in six new regions:
- Australia East
- Central US
- East Asia
- France Central
- UK South
- West US2
For a complete list of regions where training is supported, see custom neural models.
Document Intelligence SDK version 4.0.0 GA release:
- Document Intelligence client libraries version 4.0.0 (.NET/C#, Java, JavaScript) and version 3.2.0 (Python) are generally available and ready for use in production applications!
- For more information on Document Intelligence client libraries, see the SDK overview.
- Update your applications using your programming language's migration guide.
August 2022
Document Intelligence SDK beta August 2022 preview release includes the following updates:
Version 4.0.0-beta.5 (2022-08-09)
Document Intelligence v3.0 generally available
- Document Intelligence REST API v3.0 is now generally available and ready for use in production applications! Update your applications with REST API version 2022-08-31.
Document Intelligence Studio updates
- Next steps. Under each model page, the Studio now has a next steps section. Users can quickly reference sample code, troubleshooting guidelines, and pricing information.
- Custom models. The Studio now includes the ability to reorder labels in custom model projects to improve labeling efficiency.
- Copy models. Custom models can be copied across Document Intelligence services from within the Studio. The operation enables the promotion of a trained model to other environments and regions.
- Delete documents. The Studio now supports deleting documents from labeled dataset within custom projects.
Document Intelligence service updates
- prebuilt-read. Read OCR model is now also available in Document Intelligence with paragraphs and language detection as the two new features. Document Intelligence Read targets advanced document scenarios aligned with the broader document intelligence capabilities in Document Intelligence.
- prebuilt-layout. The Layout model extracts paragraphs and whether the extracted text is a paragraph, title, section heading, footnote, page header, page footer, or page number.
- prebuilt-invoice. The TotalVAT and Line/VAT fields now resolve to the existing fields TotalTax and Line/Tax respectively.
- prebuilt-idDocument. Data extraction support for US state ID, social security, and green cards. Support for passport visa information.
- prebuilt-receipt. Expanded locale support for French (fr-FR), Spanish (es-ES), Portuguese (pt-PT), Italian (it-IT) and German (de-DE).
- prebuilt-businessCard. Address parse support to extract subfields for address components like address, city, state, country/region, and zip code.
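For client code that still looks up the older invoice field names mentioned in the prebuilt-invoice item above, a small alias table is one way to bridge the change. This is a sketch only: `FIELD_ALIASES` and `canonical_field_name` are hypothetical names, and the mapping contains just the two renames stated in these notes.

```python
# Hypothetical alias table built from the two renames stated in these notes.
FIELD_ALIASES = {
    "TotalVAT": "TotalTax",
    "Line/VAT": "Line/Tax",
}


def canonical_field_name(name: str) -> str:
    """Translate a legacy invoice field name to the name it now resolves to."""
    return FIELD_ALIASES.get(name, name)


if __name__ == "__main__":
    for legacy in ("TotalVAT", "Line/VAT", "InvoiceTotal"):
        print(legacy, "->", canonical_field_name(legacy))
```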
AI quality improvements
- prebuilt-read. Enhanced support for single characters, handwritten dates, amounts, names, other key data commonly found in receipts and invoices and improved processing of digital PDF documents.
- prebuilt-layout. Support for better detection of cropped tables, borderless tables, and improved recognition of long spanning cells.
- prebuilt-document. Improved value and check box detection.
- custom-neural. Improved accuracy for table detection and extraction.
June 2022
- Document Intelligence SDK beta June 2022 preview release includes the following updates:
Version 4.0.0-beta.4 (2022-06-08)
Document Intelligence Studio June release is the latest update to the Document Intelligence Studio. There are considerable user experience and accessibility improvements addressed in this update:
- Code sample for JavaScript and C#. The Studio code tab now adds JavaScript and C# code samples in addition to the existing Python one.
- New document upload UI. Studio now supports uploading a document with drag & drop into the new upload user interface.
- New feature for custom projects. Custom projects now support creating storage account and blobs when configuring the project. In addition, custom project now supports uploading training files directly within the Studio and copying the existing custom model.
Document Intelligence v3.0 2022-06-30-preview release presents extensive updates across the feature APIs:
- Layout extends structure extraction. Layout now includes added structure elements including sections, section headers, and paragraphs. This update enables finer grain document segmentation scenarios. For a complete list of structure elements identified, see enhanced structure.
- Custom neural model tabular fields support. Custom document models now support tabular fields. Tabular fields by default are also multi page. To learn more about tabular fields in custom neural models, see tabular fields.
- Custom template model tabular fields support for cross page tables. Custom form models now support tabular fields across pages. To learn more about tabular fields in custom template models, see tabular fields.
- Invoice model output now includes general document key-value pairs. Where invoices contain required fields beyond the fields included in the prebuilt model, the general document model supplements the output with key-value pairs. See key value pairs.
- Invoice language expansion. The invoice model includes expanded language support. See supported languages.
- Prebuilt business card now includes Japanese language support. See supported languages.
- Prebuilt ID document model. The ID document model now extracts DateOfIssue, Height, Weight, EyeColor, HairColor, and DocumentDiscriminator from US driver's licenses. See field extraction.
- Read model now supports common Microsoft Office document types. Document types like Word (docx), Excel (xlsx), and PowerPoint (pptx) are now supported with the Read API. See Read data extraction.
February 2022
Version 4.0.0-beta.3 (2022-02-10)
Document Intelligence v3.0 preview release introduces several new features, capabilities, and enhancements:
- Custom neural model or custom document model is a new custom model to extract text and selection marks from structured forms, semi-structured and unstructured documents.
- W-2 prebuilt model is a new prebuilt model to extract fields from W-2 forms for tax reporting and income verification scenarios.
- Read API extracts printed text lines, words, text locations, detected languages, and handwritten text, if detected.
- General document pretrained model is now updated to support selection marks in addition to API text, tables, structure, and key-value pairs from forms and documents.
- Invoice API Invoice prebuilt model expands support to Spanish invoices.
- Document Intelligence Studio adds new demos for Read, W2, Hotel receipt samples, and support for training the new custom neural models.
- Language Expansion Document Intelligence Read, Layout, and Custom Form add support for 42 new languages including Arabic, Hindi, and other languages using Arabic and Devanagari scripts to expand the coverage to 164 languages. Handwritten language support expands to Japanese and Korean.
Get started with the new v3.0 preview API.
Document Intelligence model data extraction:
Model, Text extraction, Key-Value pairs, Selection Marks, Tables, Signatures
Read ✓
General document ✓ ✓ ✓ ✓
Layout ✓ ✓ ✓
Invoice ✓ ✓ ✓ ✓
Receipt ✓ ✓ ✓
ID document ✓ ✓
Business card ✓ ✓
Custom template ✓ ✓ ✓ ✓ ✓
Custom neural ✓ ✓ ✓ ✓
Document Intelligence SDK beta preview release includes the following updates:
Custom Document models and modes:
- Custom template (formerly custom form).
- Custom neural.
- Custom modelβbuild mode.
W-2 prebuilt model (prebuilt-tax.us.w2).
Read prebuilt model (prebuilt-read).
Invoice prebuilt model (Spanish) (prebuilt-invoice).
Next steps
Try processing your own forms and documents with the Document Intelligence Studio.
Complete a Document Intelligence quickstart and get started creating a document processing app in the development language of your choice.
Try processing your own forms and documents with the Document Intelligence Sample Labeling tool.