Document intelligence architecture (preview)

In the ever-evolving financial services industry, efficient and accurate document management is crucial and is often a regulatory or legal requirement, especially in onboarding and other business processes. The Document intelligence control uses the latest technological developments to help onboarding agents handle documents seamlessly and efficiently. The component streamlines document processing, including document verification and AI-driven extraction of the data points that the documents provide.

The predefined Document intelligence flows help the business admin to set up a smart and highly automated process for each document definition within a business scenario. The smart automated process eases the work of the agents by providing them with automatic insights generation, automatic document data extraction, and extra supporting information for downstream process use based on the configuration of the respective flow. Generated insights and outputs are displayed in the Document management user interface control.

Architecture

The Document intelligence building block consists of the following solution components:

  • Data models: Document Core and Document Intelligence data models, including solution-aware entities that are represented as a component in a solution.
  • Power Apps component framework (PCF) controls: Visualize document requests and the intelligence details associated with a single document.
  • Power Automate flows: Automated document processing framework that supports a hierarchical set of document processing flows (Document intelligence core flows, custom pipeline flows, and custom pipeline step flows).
  • Out-of-the-box AI models: Native integration with AI Builder built-in models, such as Identity Document Reader to extract data from identity documents and Document Processing models for data extraction.

You can experience these Document intelligence controls using the Loan Onboarding Sample Application (preview). This sample application is a model-driven app to show how Onboarding essentials and Document intelligence controls can be configured and extended to support a sample loan onboarding scenario.

The following diagram shows the Document Intelligence solution architecture with built-in workflows, data model, loan onboarding sample application, and integration to AI Builder.

A diagram showing the solution components of document intelligence building block.

Download a printable PDF of this solution architecture diagram.

The rest of this article discusses the layers that make up the solution architecture.

User interface

The Document intelligence toolkit allows you to create model-driven app components that interface with end users and integrate with apps that need document processing, such as onboarding apps in the financial services context. These model-driven app components include the following built-in Power Apps component framework (PCF) controls that have configurable parameters:

  • Document Intelligence Control
    Description: The Document intelligence management control allows you to show the document request entities as a card list grouped by statuses. You can either embed the control to a form subgrid with a document-related context (the form entity) or to the Documents main grid without context.
    Configuration capabilities: You can configure the document-related context. The document-related context table must have a relationship with the polymorphic Regarding field of document request table.

  • Document Intelligence Detail Control
    Description: The Document intelligence details control enables you to show the intelligence details associated with a single document. You can either embed the control to a document request form or to a document request lookup in a different form.
    Configuration capabilities:
    - Set the referenced document request id field and associated document table
    - Show/hide description in document request
    - Set the selected tab in the dialog (Review status or Extracted details)

For more information on extending the user interface, see Design best practices for Document Intelligence and Set up your Onboarding model-driven app. The articles explain how to create an onboarding application using Onboarding essentials and Document intelligence controls.

Data layer

You can deploy the Document intelligence data model using the Onboarding essentials solution package. The solution package includes different data model solutions that layer on top of each other to construct specific capabilities. The Document Core and Document Intelligence data models, listed in the Architecture section, form the Document intelligence capabilities.

The following figure illustrates the data models and tables involved in the Document intelligence solution, and their relationships and key fields for implementation.

A diagram showing the relationships and key fields of document intelligence implementation.

You can download the entity-relationship diagram (ERD) as a Visio file.

Some of the Document intelligence tables are solution-aware. You can add a record of that table as a component in the solution. Adding the record helps to package the configurations within solution files and easily promote the configuration between environments. For more information, you can refer to solution-aware entities.

Data models are deployed in the Dataverse environment database. For extensibility, you can add new fields to existing tables in the data models, or you can create relationships to new custom tables or existing tables. Some of those relationships are already defined. For example, the Onboarding essentials data model is related to the entities of the loan onboarding sample application through the Document request and Document definition tables, which supports document intelligence for onboarding applications.

For more information on extending the data layer, such as adding a new document-related context table, see Design best practices for Document intelligence.

AI models

The solution comes with configuration data that lets you use the prebuilt AI models in AI Builder, such as ID reader. You can take a configure-first approach by using these prebuilt models before customizing. However, you need to validate that the AI model supports your language, document format and size, throttling limits, and scope. For example, ID reader supports only passports and some valid US identity documents.

For more information on using existing AI models for new document types or using your own custom AI models, see Design best practices for Document Intelligence.

Business logic

The Document intelligence workflow comprises a series of flows that enable the business administrator to design a customized document journey for each document definition. The workflow incorporates multi-tiered verification steps, such as categorization, data extraction, and a document status recommendation and verification (either full or partial), as shown in the following image:

A diagram showing the document intelligence flow

The Document intelligence component includes three levels of workflows: Document intelligence core flows, custom pipeline flows, and custom pipeline step flows.

Level-1 Document intelligence core flows

Core flows are the main workflows placed as part of the Document intelligence core solution. These workflows aren't customizable and are responsible for the core steps through which every document passes. Core flows also have specific flows (Document AI builder step, Format-extracted data AI builder) to execute and format AI Builder model outputs. The core flows include child flows to trigger the custom pipeline by document definition. Refer to Document intelligence workflows for details on core flows.

Level-2 Custom pipeline flow

You need to define a custom pipeline flow and custom pipeline step flows when you extend the solution with a new document type that uses automation. Document types with no automation don't require any flow steps.

An update to the Trigger custom pipeline on column of the Document pipeline table must trigger the custom pipeline flow. The column is updated after each document upload. The custom pipeline flow should include actions that call the child custom pipeline step flows (level 3), such as enrichment and verification steps. Refer to Configure a document definition for how to create a custom pipeline flow and custom pipeline steps.

Level-3 Custom pipeline step flow

Custom pipeline steps are defined in the Pipeline step definition table and are associated with the document type definition that you add to the Document definition table.

You need to create a custom pipeline step flow for the Enrichment and Other steps configured in the Pipeline step definition table. However, you don't need to create a custom flow for the Extraction step if you've configured it to use prebuilt AI models and it's set as the initial step in the custom pipeline step definition (with an order of 0).
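The rule above can be expressed as a small check. The following Python sketch is illustrative only: `StepDefinition`, its field names, and the step names are hypothetical stand-ins for rows in the Pipeline step definition table, not the actual Dataverse schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StepDefinition:
    """Hypothetical stand-in for a row in the Pipeline step definition table."""
    name: str
    step_type: str                    # assumed values: "Extraction", "Enrichment", "Other"
    order: int                        # position in the pipeline; 0 is the initial step
    uses_prebuilt_model: bool = False


def needs_custom_flow(step: StepDefinition) -> bool:
    """Return True when a custom pipeline step flow has to be created for the step."""
    # The built-in core flow covers only an Extraction step that uses a
    # prebuilt AI model and runs first (order 0).
    handled_by_core_flow = (
        step.step_type == "Extraction"
        and step.uses_prebuilt_model
        and step.order == 0
    )
    return not handled_by_core_flow


# Example step definitions (made-up names).
example_steps = [
    StepDefinition("Extract ID data", "Extraction", 0, uses_prebuilt_model=True),
    StepDefinition("Add supporting information", "Enrichment", 1),
    StepDefinition("Custom check", "Other", 2),
]

custom_flow_steps = [s.name for s in example_steps if needs_custom_flow(s)]
print(custom_flow_steps)
```

In this sketch, an Extraction step that isn't first, or that doesn't use a prebuilt model, is treated as needing a custom flow. That is one reading of the rule and might not match every configuration.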

The built-in Document AI builder step core flow handles the extraction step and extracts data from the document. The Format-extracted data AI builder core flow then formats the AI Builder output and writes it to the Output and Raw Output fields in the Document pipeline step table.

The status of the extraction step is determined by the thresholds (Success, Failure) configured in the Pipeline step definition table. These thresholds are compared against the confidence score generated for the collection type attribute and the confidence scores of the extracted attributes. You can then retrieve this output within the custom pipeline flow by using the built-in Get pipeline details flow.
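As a rough illustration of the threshold comparison, consider the following Python sketch. The exact comparison that the core flow performs isn't described here, so the sketch makes several assumptions: it compares the lowest confidence score against both thresholds, it uses scores between 0 and 1, and it returns hypothetical status names, including an "Unclear" status for scores that fall between the two thresholds.

```python
from typing import Mapping


def extraction_status(
    collection_confidence: float,
    attribute_confidences: Mapping[str, float],
    success_threshold: float,
    failure_threshold: float,
) -> str:
    """Classify an extraction result against Success and Failure thresholds.

    Assumption: the weakest confidence score decides the outcome.
    """
    if failure_threshold > success_threshold:
        raise ValueError("failure_threshold must not exceed success_threshold")

    # Lowest score across the collection type and all extracted attributes.
    lowest = min([collection_confidence, *attribute_confidences.values()])

    if lowest >= success_threshold:
        return "Success"
    if lowest < failure_threshold:
        return "Failure"
    return "Unclear"  # hypothetical in-between status


# Made-up confidence scores for illustration.
status = extraction_status(
    collection_confidence=0.95,
    attribute_confidences={"FirstName": 0.92, "DocumentNumber": 0.90},
    success_threshold=0.85,
    failure_threshold=0.50,
)
print(status)
```

Using the minimum score is a conservative choice: a single weakly extracted attribute keeps the step from being marked as successful. An average or a per-attribute rule would be equally plausible readings.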

The Loan onboarding sample app solution ships sample custom pipeline flows, custom pipeline steps, and the configuration of the Document intelligence flow for the Identity document (supported by ID reader). These sample flows can help you understand the extension path for new document types. The Document intelligence flow for the Identity document includes three steps: Extraction step, Support information (Enrichment) step, and Verification step. These steps are defined in the Pipeline step definition table and are associated with the Identification record added to the Document definition table. The following custom pipeline flow and custom pipeline step flows are shared in the sample application:

  • Identification document pipeline: An update to the Trigger custom pipeline on column of the Document pipeline table, for a document definition of Identity Document, triggers the Identification document pipeline flow. This flow executes the overall enrichment and verification steps for the document. It uses the output of the data extraction step (confidence score) for verification, and updates the document status accordingly in the Document pipeline step (pipeline step document state field), Document pipeline (pipeline document state field), and Document Request (State field) tables.

  • Identification document enrichment step: This flow is triggered from the Identification document pipeline flow. It brings in credit score information from the Know Your Customer (KYC) table via calculated fields in the Related Party table. The flow passes this information to the post-pipeline step, which records it in the Output and Raw Output fields in the Document Pipeline Step table. You can extend this flow to bring in other supporting information from within Dataverse or from your master systems, or to send the information to other systems. Refer to Support information (Enrichment) step for details.

  • Identification document type verification step: This flow checks the confidence score of the data extraction output and verifies whether the document is an ID document. For more information, see Verification step.
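The way the three sample flows cooperate can be sketched in Python as follows. This is a simplified model, not the Power Automate implementation: the function name, dictionary keys, state values ("Verified", "Needs review"), the 0.85 cutoff, and the sample credit score are all hypothetical, and the sketch assumes that the same state is written to all three tables.

```python
from typing import Any, Callable, Dict


def run_identification_pipeline(
    extraction_output: Dict[str, Any],
    enrich: Callable[[Dict[str, Any]], Dict[str, Any]],
    verify: Callable[[float], bool],
) -> Dict[str, Any]:
    """Run the enrichment and verification steps on an extraction result."""
    # Enrichment step: gather supporting information for the reviewer.
    supporting_info = enrich(extraction_output["fields"])

    # Verification step: decide from the extraction confidence score.
    is_id_document = verify(extraction_output["confidence"])
    state = "Verified" if is_id_document else "Needs review"

    # Assumption: the resulting state is propagated to all three records.
    return {
        "pipeline_step_document_state": state,
        "pipeline_document_state": state,
        "document_request_state": state,
        "supporting_info": supporting_info,
    }


# Made-up inputs for illustration.
result = run_identification_pipeline(
    extraction_output={"confidence": 0.93, "fields": {"LastName": "Doe"}},
    enrich=lambda fields: {"credit_score": 710, "matched_on": fields["LastName"]},
    verify=lambda confidence: confidence >= 0.85,
)
print(result["document_request_state"])
```

Passing the enrichment and verification steps as callables mirrors how the pipeline flow calls child step flows, so a step can be replaced without changing the pipeline itself.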

For more information on additional automation extension scenarios, see Design best practices for Document Intelligence.

Integration

You might need to integrate the Document Intelligence solution with your existing system of record for onboarding applications and with your document management systems to synchronize the information. See Operational data estate for design guidance on integrating with other operational systems.

The built-in solution exposes the following custom APIs for deleting and uploading documents. These APIs can be used to integrate with other systems that originate the document request.

A diagram showing the two custom APIs (UploadDocument and DeleteDocument) that can be used for integration to submit and delete documents

Download a printable PDF of this diagram.

Custom API | Bound to entity | Description
Upload Document (msfsi_UploadDocument) | Document Request (msfsi_documentrequest) | Uploads a new document to an existing document request, with the file body, file name, and file type.
Delete Document (msfsi_DeleteDocument) | Document Request (msfsi_documentrequest) | Deletes the records related to a document, such as the document request.
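Because these custom APIs are bound to the Document Request table, an external system would typically call them as bound actions through the Dataverse Web API. The following Python sketch only builds such a request and doesn't send it. Several details are assumptions that you need to check against your environment: the entity set name `msfsi_documentrequests`, the parameter names `FileBody`, `FileName`, and `FileType`, the Web API version, and the placeholder organization URL, record ID, and token.

```python
import base64
import json
from urllib.request import Request


def build_upload_request(
    org_url: str,
    document_request_id: str,
    file_name: str,
    file_type: str,
    content: bytes,
    access_token: str,
) -> Request:
    """Build (but don't send) a POST request for the msfsi_UploadDocument API."""
    # Bound actions are addressed on the record they are bound to.
    # Assumption: the entity set name is "msfsi_documentrequests".
    url = (
        f"{org_url.rstrip('/')}/api/data/v9.2/"
        f"msfsi_documentrequests({document_request_id})/"
        "Microsoft.Dynamics.CRM.msfsi_UploadDocument"
    )
    # Assumption: hypothetical parameter names; the file is sent as base64 text.
    body = {
        "FileBody": base64.b64encode(content).decode("ascii"),
        "FileName": file_name,
        "FileType": file_type,
    }
    headers = {
        "Authorization": f"Bearer {access_token}",
        "Content-Type": "application/json",
        "OData-MaxVersion": "4.0",
        "OData-Version": "4.0",
    }
    return Request(url, data=json.dumps(body).encode("utf-8"), headers=headers, method="POST")


# Placeholder values for illustration only.
upload_request = build_upload_request(
    org_url="https://contoso.crm.dynamics.com",
    document_request_id="00000000-0000-0000-0000-000000000001",
    file_name="passport.pdf",
    file_type="application/pdf",
    content=b"%PDF-1.7 sample",
    access_token="<access-token>",
)
print(upload_request.get_method(), upload_request.full_url)
```

To send the request, you could pass it to `urllib.request.urlopen` after acquiring a valid access token. Inspect the custom API definition in your environment for the actual request parameters before relying on the names used here.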
