Custom template document model

This article applies to: Form Recognizer v3.0.

Custom template (formerly custom form) is an easy-to-train document model that accurately extracts labeled key-value pairs, selection marks, tables, regions, and signatures from documents. Template models use layout cues to extract values and are suited to extracting fields from highly structured documents with defined visual templates.

Custom template models share the same labeling format and strategy as custom neural models, with support for more field types and languages.

Model capabilities

Custom template models support key-value pairs, selection marks, tables, signature fields, and selected regions.

Form fields | Selection marks | Tabular fields (Tables) | Signature | Selected regions
Supported   | Supported       | Supported               | Supported | Supported

Build mode

The build custom model operation now supports both template and neural custom models. Previous versions of the REST API and SDKs supported only a single build mode, which is now known as template mode.

Template models only accept documents that have the same basic page structure (a uniform visual appearance) or the same relative positioning of elements within the document. For more information, see Custom model build mode.

Tabular fields

With API version 2022-06-30-preview and later, custom template models support cross-page tabular fields (tables):

  • To label a table that spans multiple pages, label each row of the table across the different pages in a single table.
  • As a best practice, ensure that your dataset contains a few samples of the expected variations. For example, include samples where the entire table is on a single page and where tables span two or more pages if you expect to see those variations in documents.
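The best practice above can be checked with a small script before training. The following is a minimal sketch, assuming each labeled table is available as a list of rows tagged with a page number; the `samples` structure, the file names, and both helper functions are hypothetical and are not part of the Form Recognizer label format.

```python
from collections import Counter


def pages_spanned(table_rows):
    """Return the sorted page numbers touched by one labeled table."""
    return sorted({row["page"] for row in table_rows})


def variation_coverage(samples):
    """Count samples whose table sits on one page vs. spans several pages."""
    counts = Counter()
    for rows in samples.values():
        kind = "single_page" if len(pages_spanned(rows)) == 1 else "multi_page"
        counts[kind] += 1
    return counts


# Hypothetical dataset: every row belongs to the same labeled table,
# even when the rows sit on different pages.
samples = {
    "invoice_a.pdf": [
        {"page": 1, "cells": {"Item": "Widget", "Qty": "2"}},
        {"page": 1, "cells": {"Item": "Gadget", "Qty": "1"}},
    ],
    "invoice_b.pdf": [
        {"page": 1, "cells": {"Item": "Widget", "Qty": "5"}},
        {"page": 2, "cells": {"Item": "Bolt", "Qty": "40"}},
    ],
}

coverage = variation_coverage(samples)
# Variations that the dataset does not yet cover.
missing = [kind for kind in ("single_page", "multi_page") if coverage[kind] == 0]
```

If `missing` is not empty, the dataset lacks one of the expected variations and more samples of that kind may be worth labeling.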

Tabular fields are also useful when extracting repeating information within a document that isn't recognized as a table. For example, a repeating section of work experiences in a resume can be labeled and extracted as a tabular field.

Dealing with variations

Template models rely on a defined visual template, so changes to the template result in lower accuracy. In those instances, split your training dataset to include at least five samples of each template and train a model for each of the variations. You can then compose the models into a single endpoint. For subtle variations, like digital PDF documents and images, it's best to include at least five examples of each type in the same training dataset.
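The splitting step can be sketched in a few lines. This is an illustrative example only: the template names, file names, and helper functions are hypothetical, and the compose request body assumes the v3.0 shape of a `modelId` plus a `componentModels` list, which should be verified against the REST API reference.

```python
from collections import defaultdict

MIN_SAMPLES = 5  # at least five samples per template, as recommended above


def split_by_template(samples):
    """Group (file_name, template_name) pairs into one dataset per template."""
    groups = defaultdict(list)
    for file_name, template in samples:
        groups[template].append(file_name)
    return dict(groups)


def undersized(groups, minimum=MIN_SAMPLES):
    """Return the templates that do not yet have enough samples."""
    return sorted(t for t, files in groups.items() if len(files) < minimum)


def compose_body(composed_id, component_ids):
    """Build a compose request body (assumed v3.0 shape)."""
    return {
        "modelId": composed_id,
        "componentModels": [{"modelId": model_id} for model_id in component_ids],
    }


# Hypothetical dataset: five files of one layout, three of another.
samples = [(f"form_2021_{i}.pdf", "layout_2021") for i in range(5)]
samples += [(f"form_2023_{i}.pdf", "layout_2023") for i in range(3)]

groups = split_by_template(samples)
too_small = undersized(groups)
body = compose_body("forms-composed", [f"forms-{name}" for name in sorted(groups)])
```

Here `too_small` flags `layout_2023`, which needs two more samples before a separate model is trained for it.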

Training a model

Template models are generally available in the v3.0 API and the v2.1 API. If you're starting a new project or have an existing labeled dataset, work with the v3 API and Form Recognizer Studio to train a custom template model.

Model           | REST API            | SDK                 | Label and Test Models
Custom template | Form Recognizer 3.0 | Form Recognizer SDK | Form Recognizer Studio
Custom template | Form Recognizer 2.1 | Form Recognizer SDK | Form Recognizer Sample labeling tool

On the v3 API, the build operation that trains a model supports a new buildMode property. To train a custom template model, set buildMode to template.

https://{endpoint}/formrecognizer/documentModels:build?api-version=2022-08-31

{
  "modelId": "string",
  "description": "string",
  "buildMode": "template",
  "azureBlobSource":
  {
    "containerUrl": "string",
    "prefix": "string"
  }
}
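As an illustration, the request above could be assembled with the Python standard library. This is a minimal sketch, not an official client: the endpoint host, key placeholder, model ID, container URL, and the `make_build_request` helper are hypothetical, and the `Ocp-Apim-Subscription-Key` header is assumed to be the authentication method in use.

```python
import json
import urllib.request

API_VERSION = "2022-08-31"


def make_build_request(endpoint, key, model_id, container_url, prefix=""):
    """Prepare (but do not send) a build request for a custom template model."""
    url = (
        f"https://{endpoint}/formrecognizer/documentModels:build"
        f"?api-version={API_VERSION}"
    )
    body = {
        "modelId": model_id,
        "description": "Custom template model",
        "buildMode": "template",  # selects the template build mode
        "azureBlobSource": {"containerUrl": container_url, "prefix": prefix},
    }
    headers = {
        "Content-Type": "application/json",
        "Ocp-Apim-Subscription-Key": key,  # assumed key-based authentication
    }
    return urllib.request.Request(
        url, data=json.dumps(body).encode("utf-8"), headers=headers, method="POST"
    )


# Placeholder values; replace them with your own resource details.
request = make_build_request(
    endpoint="example.cognitiveservices.azure.com",
    key="<your-key>",
    model_id="my-template-model",
    container_url="https://example.blob.core.windows.net/training-data",
)
# To submit the request: urllib.request.urlopen(request)
```

Keeping request construction separate from sending makes it possible to inspect the URL and payload before anything is submitted to the service.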

Next steps

Learn to create and compose custom models: