What's new in Azure AI Document Intelligence

This content applies to: v4.0 (preview), v3.1 (GA), v3.0 (GA), v2.1 (GA)

Document Intelligence service is updated on an ongoing basis. Bookmark this page to stay up to date with release notes, feature enhancements, and our newest documentation.

Important

Preview API versions are retired once the GA API is released. The 2023-02-28-preview API version is being retired. If you are still using the preview API or the associated SDK versions, update your code to target the latest GA API version, 2023-07-31.

August 2024

The Document Intelligence 2024-07-31-preview REST API is now available. This preview API introduces new and updated capabilities:

  • Public preview version 2024-07-31-preview is currently available only in the following Azure regions. The new document field extraction model in AI Studio is available only in the North Central US region:

    • East US
    • West US2
    • West Europe
    • North Central US

  • 🆕 Document field extraction (custom generative) model

    • Use Generative AI to extract fields from documents and forms. Document Intelligence now offers a new document field extraction model that utilizes large language models (LLMs) to extract fields from unstructured documents or structured forms with a wide variety of visual templates. With grounded values and confidence scores, the new Generative AI based extraction fits into your existing processes.
  • 🆕 Model compose with custom classifiers

    • Document Intelligence now supports composing models with an explicit custom classification model. Learn more about the benefits of using the new compose capability.
  • Custom classification model

    • Custom classification model now supports updating the model in-place as well.
    • Custom classification model adds support for model copy operation to enable backup and disaster recovery.
    • Custom classification model now supports explicitly specifying pages to be classified from an input document.
  • 🆕 Mortgage documents model

    • Extract information from Appraisal (Form 1004).
    • Extract information from Validation of Employment (Form 1005).
  • 🆕 Check model

    • Extract payee, amount, date, and other relevant information from checks.
  • 🆕 Pay Stub model

    • New prebuilt model that processes pay stubs to extract wages, hours, deductions, net pay, and more.
  • 🆕 Bank statement model

    • New prebuilt model that extracts account information from bank statements, including beginning and ending balances and transaction details.
  • 🆕 US Tax model

    • New unified US tax model that can extract from forms such as W-2, 1098, 1099, and 1040.
  • 🆕 Searchable PDF. The prebuilt read model now supports PDF output. You can download PDFs with embedded text from the extraction results and use them in scenarios such as searching and copying content.

  • Layout model now supports improved figure detection. Figures from documents can be downloaded as image files and used for further figure understanding. The layout model also improves OCR for scanned text, targeting single characters, boxed text, and dense text documents.

  • 🆕 Batch API

    • Document Intelligence now supports a batch analysis operation that analyzes a set of documents in one request, which simplifies the developer experience and improves efficiency.
  • Add-on capabilities

    • The AI quality of query field extraction is improved with the latest model.
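
The custom classification update above mentions explicitly specifying the pages to classify. The following is a minimal sketch of how a client might validate such a page selection before sending it. It assumes the value uses a comma-separated range syntax such as 1-3,5,7-9; the helper name parse_page_ranges is hypothetical and not part of any SDK, so check the REST reference for the exact parameter format.

```python
def parse_page_ranges(spec: str) -> list[int]:
    """Expand a page-range string such as "1-3,5,7-9" into sorted 1-based page numbers.

    Hypothetical client-side helper; the range syntax is an assumption.
    """
    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            raise ValueError(f"empty range in {spec!r}")
        start_text, sep, end_text = part.partition("-")
        start = int(start_text)
        # A single page such as "5" has no "-" separator.
        end = int(end_text) if sep else start
        if start < 1 or end < start:
            raise ValueError(f"invalid range {part!r}")
        pages.update(range(start, end + 1))
    return sorted(pages)


print(parse_page_ranges("1-3,5,7-9"))
```

Validating the string locally catches malformed ranges before a request is made.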

May 2024

The Document Intelligence Studio adds support for Microsoft Entra (formerly Azure Active Directory) authentication. For more information, see Document Intelligence Studio overview.

February 2024

The Document Intelligence 2024-02-29-preview REST API is now available. This preview API introduces new and updated capabilities:

  • Public preview version 2024-02-29-preview is currently available only in the following Azure regions:

    • East US
    • West US2
    • West Europe
  • Layout model now supports figure detection and hierarchical document structure analysis (sections and subsections). The AI quality of reading order and logical roles detection is also improved.

  • Custom extraction models

  • Custom classification model

    • Custom classification model now supports incremental training for scenarios where you need to update the classifier model with added samples or classes. Learn more about incremental training.
    • Custom classification model adds support for Office document types (.docx, .pptx, and .xls). Learn more about expanded document type support.
  • Invoice model

    • Support for new locales:

      Locale              Code
      Arabic              ar
      Bulgarian           bg
      Greek               el
      Hebrew              he
      Macedonian          mk
      Russian             ru
      Serbian Cyrillic    sr-cyrl
      Ukrainian           uk
      Thai                th
      Turkish             tr
      Vietnamese          vi
    • Support for new currency codes:

      Currency                        Locale code
      BAM Bosnian Convertible Mark    ba
      BGN Bulgarian Lev               bg
      ILS Israeli New Shekel          il
      MKD Macedonian Denar            mk
      RUB Russian Ruble               ru
      THB Thai Baht                   th
      TRY Turkish Lira                tr
      UAH Ukrainian Hryvnia           ua
      VND Vietnamese Dong             vn
    • Tax items support is expanded to Germany (de), Spain (es), Portugal (pt), and English Canada (en-CA).
  • ID model

  • 🆕 Mortgage documents

    • Extract information from Uniform Residential Loan Application (Form 1003).
    • Extract information from Uniform Underwriting and Transmittal Summary (Form 1008).
    • Extract information from mortgage closing disclosure.
  • 🆕 Credit/Debit card model

    • Extract information from bank cards.
  • 🆕 Marriage certificate

    • New prebuilt to extract information from marriage certificates.

December 2023

The Document Intelligence client libraries targeting REST API 2023-10-31-preview are now available for use!

November 2023

The Document Intelligence 2023-10-31-preview REST API is now available. This preview API introduces new and updated capabilities:

  • Public preview version 2023-10-31-preview is currently only available in the following Azure regions:

    • East US
    • West US2
    • West Europe
  • Read model

    • Language expansion for handwriting: Russian (ru), Arabic (ar), Thai (th).
    • Cyber Executive Order (EO) compliance.
  • Layout model

    • Support for Office and HTML files.
    • Markdown output support.
    • Table extraction, reading order, and section heading detection improvements.
    • With the Document Intelligence 2023-10-31-preview, the general document model (prebuilt-document) is deprecated. Going forward, to extract key-value pairs from documents, use the prebuilt-layout model with the optional query string parameter features=keyValuePairs enabled.
  • Receipt model

    • Now extracts currency for all price-related fields.
  • Health Insurance Card model

    • New field support for Medicare and Medicaid information.
  • US Tax Document models

    • New 1099 tax model. Supports base 1099 form and the following variations: A, B, C, CAP, DIV, G, H, INT, K, LS, LTC, MISC, NEC, OID, PATR, Q, QA, R, S, SA, SB.
  • Invoice model

    • Support for KVK field.
    • Support for BPAY field.
    • Numerous field refinements.
  • Custom Classification

    • Support for multi-language documents.
    • New page splitting options: autosplit, always split by page, no split.
  • Add-on capabilities

    • Query fields are available with the 2023-10-31-preview release.
    • Add-on capabilities are available within all models excluding the Read model.
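
The Layout model notes above say that key-value pairs now come from prebuilt-layout with the optional query string parameter features=keyValuePairs. The following is a minimal sketch of how such a request URL could be assembled. The function name build_analyze_url and the example endpoint are hypothetical, and the /documentintelligence/documentModels/{modelId}:analyze path and the outputContentFormat parameter are assumptions to verify against the REST API reference for your API version.

```python
from urllib.parse import urlencode


def build_analyze_url(endpoint, model_id, api_version,
                      features=(), output_content_format=None):
    """Build an analyze request URL (hypothetical helper; path is an assumption)."""
    query = {"api-version": api_version}
    if features:
        # Add-on features are passed as a comma-separated list.
        query["features"] = ",".join(features)
    if output_content_format:
        # Assumed parameter name for requesting Markdown output from Layout.
        query["outputContentFormat"] = output_content_format
    base = endpoint.rstrip("/")
    return (f"{base}/documentintelligence/documentModels/"
            f"{model_id}:analyze?{urlencode(query)}")


url = build_analyze_url(
    "https://example-resource.cognitiveservices.azure.com",  # placeholder endpoint
    "prebuilt-layout",
    "2023-10-31-preview",
    features=["keyValuePairs"],
)
print(url)
```

The sketch only constructs the URL; sending the request still requires your resource key or a Microsoft Entra token.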

Note

With the 2022-08-31 API general availability (GA) release, the associated preview APIs are being deprecated. If you are using the 2021-09-30-preview, 2022-01-30-preview, or 2022-06-30-preview API versions, update your applications to target the 2022-08-31 API version. There are a few minor changes involved. For more information, see the migration guide.
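
The retirement notices on this page pair each retired preview version with a GA replacement. As a rough illustration, a configuration check could map a pinned api-version to the GA version named in those notices. The table and function names below are hypothetical; the version pairs are taken only from the notices on this page.

```python
# Retired preview versions and the GA versions this page recommends instead.
RETIRED_PREVIEW_TO_GA = {
    "2021-09-30-preview": "2022-08-31",
    "2022-01-30-preview": "2022-08-31",
    "2022-06-30-preview": "2022-08-31",
    "2023-02-28-preview": "2023-07-31",
}


def recommended_api_version(current: str) -> str:
    """Return the GA replacement for a retired preview version, else the input."""
    return RETIRED_PREVIEW_TO_GA.get(current, current)


for pinned in ("2022-06-30-preview", "2023-07-31"):
    print(pinned, "->", recommended_api_version(pinned))
```

Changing the version string is only the first step; the migration guide covers the request and response differences.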

July 2023

Note

Form Recognizer is now Azure AI Document Intelligence!

  • Azure AI services encompass all of what were previously known as Cognitive Services and Azure Applied AI Services.
  • There are no changes to pricing.
  • The names Cognitive Services and Azure Applied AI continue to be used in Azure billing, cost analysis, price list, and price APIs.
  • There are no breaking changes to application programming interfaces (APIs) or client libraries.
  • Some platforms are still awaiting the renaming update. All mentions of Form Recognizer or Document Intelligence in our documentation refer to the same Azure service.

Document Intelligence v3.1 (GA)

The Document Intelligence version 3.1 API is now generally available (GA)! The API version corresponds to 2023-07-31. The v3.1 API introduces new and updated capabilities:

Document Intelligence Studio UX Updates

✔️ Analyze Options

  • Document Intelligence now supports more sophisticated analysis capabilities and the Studio allows one entry point (Analyze options button) for configuring the add-on capabilities with ease.

  • Depending on the document extraction scenario, configure the analysis range, document page range, optional detection, and premium detection features.

    Animated screenshot showing use of the analyze-options button to configure options in Studio.

    Note

    Font extraction is not visualized in Document Intelligence Studio. However, you can check the styles section of the JSON output for the font detection results.
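
Because font results appear only in the styles section of the JSON output, a short script can list them. The following is a minimal sketch that assumes a response where each style entry carries optional similarFontFamily, fontStyle, and fontWeight properties plus spans (offset and length) into the top-level content string. The sample payload is invented for illustration and the function name font_styles is hypothetical.

```python
import json

# Hypothetical, trimmed-down response used only for illustration.
sample = json.loads("""
{
  "analyzeResult": {
    "content": "Total due",
    "styles": [
      {"similarFontFamily": "Arial", "fontWeight": "bold",
       "spans": [{"offset": 0, "length": 5}], "confidence": 0.95},
      {"isHandwritten": true,
       "spans": [{"offset": 6, "length": 3}], "confidence": 0.8}
    ]
  }
}
""")


def font_styles(result: dict) -> list[dict]:
    """Collect font-related style entries together with the text they cover."""
    analyze_result = result.get("analyzeResult", {})
    content = analyze_result.get("content", "")
    found = []
    for style in analyze_result.get("styles", []):
        font = {key: style[key]
                for key in ("similarFontFamily", "fontStyle", "fontWeight")
                if key in style}
        if not font:
            continue  # for example, a handwriting-only style entry
        text = "".join(content[span["offset"]:span["offset"] + span["length"]]
                       for span in style.get("spans", []))
        found.append({"text": text, **font})
    return found


print(font_styles(sample))
```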

✔️ Auto labeling documents with prebuilt models or one of your own models

  • On the custom extraction model labeling page, you can now auto label your documents using one of the Document Intelligence service prebuilt models or models you previously trained.

    Animated screenshot showing auto labeling in Studio.

  • For some documents, there can be duplicate labels after running auto label. Make sure to modify the labels so that there are no duplicate labels in the labeling page afterwards.

    Screenshot showing duplicate label warning after auto labeling.

✔️ Auto labeling tables

  • On the custom extraction model labeling page, you can now auto label the tables in the document without having to label the tables manually.

    Animated screenshot showing auto table labeling in Studio.

✔️ Add test files directly to your training dataset

  • Once you train a custom extraction model, use the test page to improve your model quality by uploading test documents to the training dataset if needed.

  • If a low confidence score is returned for some labels, check that your labels are correct. If they aren't, add the affected documents to the training dataset and relabel them to improve the model quality.

Animated screenshot showing how to add test files to training dataset.
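
The guidance above about low confidence scores can also be applied outside the Studio. The following is a minimal sketch that lists fields whose confidence falls below a cutoff, assuming an analyze response whose documents carry a fields object with a per-field confidence value. The sample payload, the 0.8 threshold, and the function name low_confidence_fields are illustrative assumptions, not recommended values.

```python
def low_confidence_fields(result: dict, threshold: float = 0.8) -> list[tuple[str, float]]:
    """Return (field name, confidence) pairs below the threshold, lowest first."""
    flagged = []
    for document in result.get("analyzeResult", {}).get("documents", []):
        for name, field in document.get("fields", {}).items():
            confidence = field.get("confidence")
            if confidence is not None and confidence < threshold:
                flagged.append((name, confidence))
    return sorted(flagged, key=lambda item: item[1])


# Hypothetical response fragment used only for illustration.
sample = {
    "analyzeResult": {
        "documents": [
            {
                "docType": "custom:example",
                "fields": {
                    "VendorName": {"type": "string", "content": "Contoso", "confidence": 0.97},
                    "InvoiceTotal": {"type": "string", "content": "110.00", "confidence": 0.42},
                    "DueDate": {"type": "date", "content": "2023-07-31", "confidence": 0.65},
                },
            }
        ]
    }
}

print(low_confidence_fields(sample))
```

Documents that repeatedly produce flagged fields are good candidates to add to the training dataset and relabel.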

✔️ Make use of the document list options and filters in custom projects

  • On the custom extraction model labeling page, you can now navigate through your training documents with ease by using the search, filter, and sort features.

  • Utilize the grid view to preview documents or use the list view to scroll through the documents more easily.

    Screenshot showing document list view options and filters.

✔️ Project sharing

May 2023

Introducing refreshed documentation for Build 2023

April 2023

Announcing the latest Document Intelligence client-library public preview release

March 2023

Important

2023-02-28-preview capabilities are currently only available in the following regions:

  • West Europe
  • West US2
  • East US

  • Custom classification model is a new capability within Document Intelligence starting with the 2023-02-28-preview API.
  • Query fields capabilities are added to the General Document model. They use Azure OpenAI models to extract specific fields from documents. Try the General documents with query fields feature using the Document Intelligence Studio. Query fields are currently only active for resources in the East US region.
  • Add-on capabilities:
  • Custom extraction model updates:
    • Custom neural model now supports added languages for training and analysis. Train neural models for Dutch, French, German, Italian, and Spanish.
    • Custom template model now has an improved signature detection capability.
  • Document Intelligence Studio updates:
    • In addition to support for all the new features like classification and query fields, the Studio now enables project sharing for custom model projects.
    • New model additions in gated preview: Vaccination cards, Contracts, US Tax 1098, US Tax 1098-E, and US Tax 1098-T. To request access to gated preview models, complete and submit the Document Intelligence private preview request form.
  • Receipt model updates:
    • Receipt model adds support for thermal receipts.
    • Receipt model now adds language support for 18 languages and three regional languages (English, French, Portuguese).
    • Receipt model now supports TaxDetails extraction.
  • Layout model now improves table recognition.
  • Read model now adds improvement for single-digit character recognition.

February 2023


January 2023

  • Prebuilt receipt model - added languages supported. The receipt model now supports these added languages and locales:

    • Japanese - Japan (ja-JP)
    • French - Canada (fr-CA)
    • Dutch - Netherlands (nl-NL)
    • English - United Arab Emirates (en-AE)
    • Portuguese - Brazil (pt-BR)
  • Prebuilt invoice model - added languages supported. The invoice model now supports these added languages and locales:

    • English - United States (en-US), Australia (en-AU), Canada (en-CA), United Kingdom (en-UK), India (en-IN)
    • Spanish - Spain (es-ES)
    • French - France (fr-FR)
    • Italian - Italy (it-IT)
    • Portuguese - Portugal (pt-PT)
    • Dutch - Netherlands (nl-NL)
  • Prebuilt invoice model - added fields recognized. The invoice model now recognizes these added fields:

    • Currency code
    • Payment options
    • Total discount
    • Tax items (en-IN only)
  • Prebuilt ID model - added document types supported. The ID model now supports these added document types:

    • US Military ID

Tip

All January 2023 updates are available with REST API version 2022-08-31 (GA).

  • Prebuilt receipt model—additional language support:

    The prebuilt receipt model adds support for the following languages:

    • English - United Arab Emirates (en-AE)
    • Dutch - Netherlands (nl-NL)
    • French - Canada (fr-CA)
    • German - (de-DE)
    • Italian - (it-IT)
    • Japanese - Japan (ja-JP)
    • Portuguese - Brazil (pt-BR)
  • Prebuilt invoice model—additional language support and field extractions

    The prebuilt invoice model adds support for the following languages:

    • English - Australia (en-AU), Canada (en-CA), United Kingdom (en-UK), India (en-IN)
    • Portuguese - Brazil (pt-BR)

    The prebuilt invoice model now adds support for the following field extractions:

    • Currency code
    • Payment options
    • Total discount
    • Tax items (en-IN only)
  • Prebuilt ID document model—additional document types support

    The prebuilt ID document model now adds support for the following document types:

    • Driver's license expansion supporting India, Canada, United Kingdom, and Australia
    • US military ID cards and documents
    • India ID cards and documents (PAN and Aadhaar)
    • Australia ID cards and documents (photo card, Key-pass ID)
    • Canada ID cards and documents (identification card, Maple card)
    • United Kingdom ID cards and documents (national/regional identity card)

December 2022

  • Document Intelligence Studio updates

    The December Document Intelligence Studio release includes the latest updates to Document Intelligence Studio. There are significant improvements to user experience, primarily with custom model labeling support.

    • Page range. The Studio now supports analyzing specified pages from a document.

    • Custom model labeling:

      • Run Layout API automatically. You can opt to run the Layout API for all documents automatically in your blob storage during the setup process for custom model.

      • Search. The Studio now includes search functionality to locate words within a document. This improvement allows for easier navigation while labeling.

      • Navigation. You can select labels to target labeled words within a document.

      • Auto table labeling. After you select the table icon within a document, you can opt to autolabel the extracted table in the labeling view.

      • Label subtypes and second-level subtypes. The Studio now supports subtypes for table columns, table rows, and second-level subtypes for types such as dates and numbers.

  • Building custom neural models is now supported in the US Gov Virginia region.

  • Preview API versions 2022-01-30-preview and 2021-09-30-preview will be retired on January 31, 2023. Update to the 2022-08-31 API version to avoid any service disruptions.


November 2022

  • Announcing the latest stable release of Azure AI Document Intelligence libraries
    • This release includes important changes and updates for .NET, Java, JavaScript, and Python client libraries. For more information, see Azure SDK DevBlog.
    • The most significant enhancements are the introduction of two new clients, the DocumentAnalysisClient and the DocumentModelAdministrationClient.

October 2022

  • Document Intelligence versioned content

    • Document Intelligence documentation is updated to present a versioned experience. Now, you can choose to view content targeting the v3.0 GA experience or the v2.1 GA experience. The v3.0 experience is the default.

      Screenshot of the Document Intelligence landing page denoting the version dropdown menu.

  • Document Intelligence Studio Sample Code

    • Sample code for the Document Intelligence Studio labeling experience is now available on GitHub. Customers can develop and integrate Document Intelligence into their own UX or build their own new UX using the Document Intelligence Studio sample code.
  • Language expansion

    • With the latest preview release, Document Intelligence's Read (OCR), Layout, and Custom template models support 134 new languages. These language additions include Greek, Latvian, Serbian, Thai, Ukrainian, and Vietnamese, along with several Latin and Cyrillic languages. Document Intelligence now has a total of 299 supported languages across the most recent GA and new preview versions. Refer to the supported languages pages to see all supported languages.
    • Use the REST API parameter api-version=2022-06-30-preview when using the API or the corresponding SDK to support the new languages in your applications.
  • New Prebuilt Contract model

    • A new prebuilt model that extracts information from contracts, such as parties, title, contract ID, execution date, and more. The contracts model is currently in preview; request access to use it.
  • Region expansion for training custom neural models

    • Training custom neural models is now supported in these added regions:
      • East US
      • East US2
      • US Gov Arizona

September 2022

Note

Starting with version 4.0.0, a new set of clients has been introduced to leverage the newest features of the Document Intelligence service.

SDK version 4.0.0 GA release includes the following updates:

  • Version 4.0.0 GA (2022-09-08)
  • Supports REST API v3.0 and v2.0 clients

Package (NuGet)

Changelog/Release History

Migration guide

ReadMe

Samples

  • Training custom neural models is now supported in six new regions:

    • Australia East
    • Central US
    • East Asia
    • France Central
    • UK South
    • West US2
    • For a complete list of regions where training is supported, see custom neural models.

    • Document Intelligence SDK version 4.0.0 GA release:

      • Document Intelligence client libraries version 4.0.0 (.NET/C#, Java, JavaScript) and version 3.2.0 (Python) are generally available and ready for use in production applications!
      • For more information on Document Intelligence client libraries, see the SDK overview.
      • Update your applications using your programming language's migration guide.

August 2022

Document Intelligence SDK beta August 2022 preview release includes the following updates:

Version 4.0.0-beta.5 (2022-08-09)

Changelog/Release History

Package (NuGet)

SDK reference documentation

  • Document Intelligence v3.0 generally available

    • Document Intelligence REST API v3.0 is now generally available and ready for use in production applications! Update your applications with REST API version 2022-08-31.
  • Document Intelligence Studio updates

    • Next steps. Under each model page, the Studio now has a next steps section. Users can quickly reference sample code, troubleshooting guidelines, and pricing information.
    • Custom models. The Studio now includes the ability to reorder labels in custom model projects to improve labeling efficiency.
    • Copy models. Custom models can be copied across Document Intelligence services from within the Studio. The operation enables the promotion of a trained model to other environments and regions.
    • Delete documents. The Studio now supports deleting documents from labeled dataset within custom projects.
  • Document Intelligence service updates

    • prebuilt-read. Read OCR model is now also available in Document Intelligence with paragraphs and language detection as the two new features. Document Intelligence Read targets advanced document scenarios aligned with the broader document intelligence capabilities in Document Intelligence.
    • prebuilt-layout. The Layout model extracts paragraphs and identifies whether the extracted text is a paragraph, title, section heading, footnote, page header, page footer, or page number.
    • prebuilt-invoice. The TotalVAT and Line/VAT fields now resolve to the existing fields TotalTax and Line/Tax, respectively.
    • prebuilt-idDocument. Data extraction support for US state ID, social security, and green cards. Support for passport visa information.
    • prebuilt-receipt. Expanded locale support for French (fr-FR), Spanish (es-ES), Portuguese (pt-PT), Italian (it-IT) and German (de-DE).
    • prebuilt-businessCard. Address parse support to extract subfields for address components like address, city, state, country/region, and zip code.
  • AI quality improvements

    • prebuilt-read. Enhanced support for single characters, handwritten dates, amounts, names, other key data commonly found in receipts and invoices and improved processing of digital PDF documents.
    • prebuilt-layout. Support for better detection of cropped tables, borderless tables, and improved recognition of long spanning cells.
    • prebuilt-document. Improved value and check box detection.
    • custom-neural. Improved accuracy for table detection and extraction.
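
The prebuilt-invoice note above says that TotalVAT and Line/VAT resolve to TotalTax and Line/Tax. Downstream code that still looks up the older names may need a small alias table. The following is a minimal sketch under that assumption; the mapping, the get_field helper, and the sample fields are hypothetical, and the line-item name Tax is inferred from the Line/Tax wording.

```python
# Older field names mapped to the names the service now returns (assumed mapping).
LEGACY_FIELD_NAMES = {
    "TotalVAT": "TotalTax",  # document-level field
    "VAT": "Tax",            # line-item field (Line/VAT -> Line/Tax)
}


def get_field(fields: dict, name: str):
    """Look up a field by its current name, accepting a legacy name as an alias."""
    return fields.get(LEGACY_FIELD_NAMES.get(name, name))


# Hypothetical extracted fields used only for illustration.
invoice_fields = {"TotalTax": {"content": "10.00"}}
print(get_field(invoice_fields, "TotalVAT"))
```

Keeping the alias table in one place makes it easy to remove once callers use the current names.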

June 2022

  • Document Intelligence SDK beta June 2022 preview release includes the following updates:

February 2022

  • Document Intelligence v3.0 preview release introduces several new features, capabilities, and enhancements:

    • Custom neural model or custom document model is a new custom model to extract text and selection marks from structured forms, semi-structured and unstructured documents.
    • W-2 prebuilt model is a new prebuilt model to extract fields from W-2 forms for tax reporting and income verification scenarios.
    • Read API extracts printed text lines, words, text locations, detected languages, and handwritten text, if detected.
    • General document pretrained model is now updated to support selection marks in addition to API text, tables, structure, and key-value pairs from forms and documents.
    • Invoice API. The Invoice prebuilt model expands support to Spanish invoices.
    • Document Intelligence Studio adds new demos for Read, W2, Hotel receipt samples, and support for training the new custom neural models.
    • Language expansion. Document Intelligence Read, Layout, and Custom Form add support for 42 new languages, including Arabic, Hindi, and other languages using Arabic and Devanagari scripts, expanding coverage to 164 languages. Handwritten language support expands to Japanese and Korean.
  • Get started with the new v3.0 preview API.

  • Document Intelligence model data extraction:

    Model Text extraction Key-Value pairs Selection Marks Tables Signatures
    Read
    General document
    Layout
    Invoice
    Receipt
    ID document
    Business card
    Custom template
    Custom neural
  • Document Intelligence SDK beta preview release includes the following updates:

