Create a document processing custom model

After you review the requirements, you can get started creating your document processing model.

Sign in to AI Builder

  1. Sign in to Power Apps or Power Automate.

  2. On the left pane, select ... More > AI hub.

  3. Under Discover an AI capability, select AI models.

    (Optional) To keep AI models permanently on the menu for easy access, select the pin icon.

  4. Select Extract custom information from documents.

  5. Select Create custom model.

  6. A step-by-step wizard walks you through the process by asking you to list all data you want to extract from your document. If you want to create your model by using your own documents, make sure you have at least five examples that use the same layout. Otherwise, you can use sample data to create the model.

  7. Select Train.

  8. Test the model by selecting Quick test.

Select the type of document

On the Choose document type step, select the type of document you want to build an AI model to automate data extraction. There are three options:

  • Fixed-template documents: Previously known as Structured, this option is ideal when, for a given layout, the fields, tables, checkboxes, and other items can be found in similar places. You can teach this model to extract data from structured documents that have different layouts. This model has a quick training time.

  • General documents: Previously known as Unstructured, this option is ideal for any kind of documents, especially when there's no set structure, or when the format is complex. You can teach this model to extract data from structured or unstructured documents that have different layouts. This model is powerful, but has long training time.

  • Invoices: Augment the behaviors of the prebuilt invoice processing model by adding new fields to be extracted in addition to the ones by default, or samples of documents not properly extracted.

    Screenshot of the tiles under 'Select the type of documents your model will process'.

Define information to extract

On the Choose information to extract screen, define the fields, tables, and checkboxes you want to teach your model to extract. Select the +Add button to start defining these.

Screenshot of the step in the document processing wizard where you define the fields, tables, and checkboxes you want the AI model to extract.

  1. For each Text field, provide a name for the field to use in the model.

  2. For each Number field, provide a name for the field to use in the model.

    Also, define the format dot (.) or comma (,) as a decimal separator.

  3. For each Date field, provide a name for the field to use in the model.

    Also, define the format (Year, Month, Day), or (Monthly, Day, Year), or (Day, Month, Year).

  4. For each Checkbox, provide a name for the checkbox to use in the model.

    Define separate checkboxes for each item that can be checked in a document.

  5. For each Table, provide the name for the table.

    Also, define the different columns that the model should extract.

Note

The custom invoices model comes with default fields that can’t be edited.

Group documents by collections

A collection is a group of documents that share the same layout. Create as many collections as document layouts that you want your model to process. For example, if you're building an AI model to process invoices from two different vendors, each having their own invoice template, create two collections.

Animation of creating collections.

For each collection that you create, you need to upload at least five sample documents per collection. Files with formats JPG, PNG, and PDF files are currently accepted.

Animation of uploading documents.

Note

You can create up to 200 collections per model.

Next step

Tag documents in a document processing model

See also

Training: Process custom documents with AI Builder (module)