Document intelligence workflows

The Document intelligence component includes three levels of workflows: Document intelligence core flows, custom pipeline flows, and custom pipeline step flows.

Document intelligence flows.

Document intelligence core workflows

Core workflows are the main workflows deployed as part of the Document intelligence core solution. These workflows are not customizable and handle the core steps that every document passes through. They also include child flows and helpers for building custom workflows and steps.

The following table lists the core workflows:

Order | Task name | Description | Dependency
1 | Pre-pipeline | Initiates the data in Dataverse as part of the main pipeline (for internal use) | -
2 | Post-pipeline | Saves the output data to Dataverse as part of the custom pipeline implementation | -
3 | Post-pipeline step | Saves the output data to Dataverse as part of the step implementation | -
4 | Pre-pipeline step | Initiates the data in Dataverse as part of the step implementation | -
5 | Document AI builder step | An out-of-the-box step that is used for running an AI Builder model (the model should be configured in the step definition entity) | Format-extracted data AI builder; Pre-pipeline step; Post-pipeline step
6 | Format-extracted data AI builder | Format the AI Builder output (for internal use) | -
7 | Get pipeline details | Get the pipeline details such as the file, the extraction output, and so on. | -
8 | Main pipeline | Core implementation of the Document intelligence solution | Post-pipeline; Pre-pipeline; Document AI builder step

Custom pipelines (sub-workflow)

Custom pipelines are flows configured to run automatically for a specific document definition.

The business admin defines the document definition. A document definition is a record that is set per document type and its automatic workflow. The business admin can define each document type multiple times as different document definition records if the document type requires different automatic workflows in multiple business scenarios.

The administrator must define the sub-workflow for each document type. This sub-workflow includes the following: the document journey for a specific document type, the steps (custom pipeline steps) that the document passes through, and the business recommendation logic that is used for the document status recommendation. Each business recommendation can be automatic or semi-automatic. If the recommendation is automatic, it automatically modifies the document status without human intervention. If the recommendation is semi-automatic, the pipeline document status appears in the Document tab as a recommendation to assist the agent with the journey insights.
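The difference between automatic and semi-automatic recommendations can be sketched as follows. This is an illustration only: the type names, property names, and status values are hypothetical and are not part of the Document intelligence data model.

```typescript
// Hypothetical types; the product stores this data in Dataverse tables.
type RecommendationMode = "automatic" | "semi-automatic";

interface Recommendation {
  mode: RecommendationMode;
  recommendedStatus: string; // illustrative value, for example "Approved"
}

interface DocumentRecord {
  status: string;
  recommendedStatus?: string; // shown to the agent, not applied
}

// Automatic: the document status changes without human intervention.
// Semi-automatic: the status is only surfaced as a recommendation.
function applyRecommendation(doc: DocumentRecord, rec: Recommendation): DocumentRecord {
  if (rec.mode === "automatic") {
    return { ...doc, status: rec.recommendedStatus };
  }
  return { ...doc, recommendedStatus: rec.recommendedStatus };
}
```

In the semi-automatic case the original status is left untouched, so the agent still makes the final decision.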

Custom pipeline steps

Custom pipeline steps are flows that are configured to trigger manually and are used in the custom pipelines (sub-workflow).

Each step must use the Pre-pipeline step and Post-pipeline step core flows to indicate its status, using the Step definition id and the Pipeline id from the parent flow. The custom pipelines flow that runs these steps gathers the results of all the steps and determines the complete pipeline status using the Post-pipeline core flow.
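The call order inside a custom step can be sketched as below. In the product, Pre-pipeline step and Post-pipeline step are Power Automate child flows; here they are modeled as plain functions so that the sequence is easy to follow. All interface names, function names, and status values are hypothetical.

```typescript
// Hypothetical stand-ins for the Pre-pipeline step and Post-pipeline step
// child flows; in the product these are Power Automate flows, not functions.
interface StepContext {
  stepDefinitionId: string;
  pipelineId: string;
}

interface StepResult {
  status: "Succeeded" | "Failed"; // illustrative values, not the product's
  rawOutput: string;
}

interface CoreFlows {
  prePipelineStep(ctx: StepContext): void;
  postPipelineStep(ctx: StepContext, result: StepResult): void;
}

// Runs the step logic and converts a thrown error into a failed result.
function execute(logic: () => string): StepResult {
  try {
    return { status: "Succeeded", rawOutput: logic() };
  } catch (err) {
    return { status: "Failed", rawOutput: String(err) };
  }
}

// Pre-pipeline step first, then the step's own logic, then Post-pipeline step.
function runCustomStep(core: CoreFlows, ctx: StepContext, logic: () => string): StepResult {
  core.prePipelineStep(ctx);
  const result = execute(logic);
  core.postPipelineStep(ctx, result);
  return result;
}
```

Reporting the result through the post step even when the logic fails keeps the parent flow able to collect a status for every step it ran.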

Screenshot of pipeline results visualization.

Out-of-the-box AI Builder step for extraction

The Document AI builder step is part of the core flows and can run an AI Builder model.

Using the Document AI builder step is a faster way to define a step that hosts an AI Builder model as part of a document journey, because you don't need to define the custom pipeline step manually. You only need to configure the Model definition.

If the AI model is an extraction model, you can use the Document AI builder step and also skip the Custom pipelines flow configuration.

Note

Ensure that the AI model is configured in the Pipeline step definition table for a specific step. The AI result is returned and saved in the Raw output field.

Using the out-of-the-box AI Builder step for extraction

If the step definition type is Extraction and the AI model chosen by the document is a model supported for formatting (currently, pre-built Identity document reader and Custom document processing), the flow returns a structured output in the Output field. In addition, a Pipeline state is also returned based on the thresholds defined in the step. The result of the formatted output is shown in the user interface under the Extracted details tab.

Note

  • If the extraction step is defined as the first step for the document definition using the Order column, the step is run automatically by the main pipeline. You need not add the step to the custom pipeline as it will be configured in the custom pipelines flow.
  • Each document definition can include only one extraction step.

Screenshot of the Extracted information section.

Supporting information

Supporting information is defined in the step definition table as a step of the Enrichment type. The business admin can enrich the extracted data to ease the agent verification process by adding calculated fields or supporting information derived from the data. For example, information extracted from a payslip document can help classify the salary as high, medium, or low. You can add the salary classification to the extracted data display, or add information such as a credit score to the extracted data for further use.
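The payslip example could look like the following sketch of an enrichment step's logic. The salary bands, the field identifier `salary_classification`, and the choice to store the extracted salary text in `originalValue` are all assumptions made for illustration; the output shape follows the enrichment output schema described in this article.

```typescript
// Shape of the enrichment output described in this article.
interface EnrichmentOutput {
  fields: { [fieldExternalId: string]: { value: string; originalValue: string } };
}

// Hypothetical salary bands; real thresholds are a business decision.
function classifySalary(monthlySalary: number): "high" | "medium" | "low" {
  if (monthlySalary >= 8000) return "high";
  if (monthlySalary >= 3000) return "medium";
  return "low";
}

// Builds an enrichment output from the salary text that was extracted earlier.
function enrichPayslip(extractedSalary: string): EnrichmentOutput {
  // Strip currency symbols and thousands separators before parsing.
  const cleaned = extractedSalary.replace(/[^0-9.]/g, "");
  const amount = Number(cleaned);
  if (cleaned === "" || isNaN(amount)) {
    throw new Error("Salary value is not numeric: " + extractedSalary);
  }
  return {
    fields: {
      // "salary_classification" is a hypothetical field external id.
      salary_classification: {
        value: classifySalary(amount),
        originalValue: extractedSalary, // assumption: keep the source text
      },
    },
  };
}
```

The field identifier used in the output would need a matching entry in the Step field definition table before it can be displayed.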

Note

Only one step for each document type can be defined as an enrichment step, and an enrichment step must always accompany an extraction step. After a step is defined as Enrichment, the step output can be displayed in the Document management UI control as part of Document preview in the Extracted details tab, under Supporting information.

Visualizing the supporting information output

The business admin must perform the following steps for visualizing the supporting information output:

  1. Define the fields required to display in the user interface using the Step field definition table.

  2. Configure the Custom pipeline steps.

  3. Configure the step in the custom pipeline, as specified in Custom pipelines (sub-workflow).

  4. Post the step results with the structured output.

The following image shows the Extracted details tab with the Supporting information section:

Screenshot of the Supporting information section.

Structured output

Custom steps can produce a structured output, which is a consistent representation of the raw output corresponding to the step field definitions. The structured output enables the user interface and other processes to make use of the fields in an easy and organized manner.
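As a rough illustration of how a custom step might derive a structured output from its raw output, the sketch below keeps only the values that have a matching step field definition. The `StepFieldDefinition` shape, the `rawKey` property, and the trimming rule are hypothetical; the actual raw output format depends on the step.

```typescript
// Hypothetical view of a step field definition: which raw key feeds which field.
interface StepFieldDefinition {
  externalId: string;
  rawKey: string;
}

interface StructuredOutput {
  fields: { [fieldExternalId: string]: { value: string; originalValue: string } };
}

// Keeps only the raw values that have a field definition and normalizes them.
function toStructuredOutput(
  raw: Record<string, unknown>,
  definitions: StepFieldDefinition[]
): StructuredOutput {
  const fields: StructuredOutput["fields"] = {};
  for (const def of definitions) {
    const rawValue = raw[def.rawKey];
    if (rawValue === undefined || rawValue === null) {
      continue; // nothing was produced for this field
    }
    const text = String(rawValue);
    fields[def.externalId] = { value: text.trim(), originalValue: text };
  }
  return { fields };
}
```

Raw keys without a definition are dropped, which keeps the structured output aligned with what the user interface is configured to display.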

Note

  • The current release supports the extraction of fields. Table extraction is not yet supported.

  • Each step can post a structured output in addition to the raw output. However, the user interface shows only a single extraction output and a single enrichment output.

Enrichment output JSON schema

TypeScript

export interface IEnrichmentOutput{ 
    fields: { 
        [field_external_id: string]: { 
            value: string; 
            originalValue: string; 
        } 
    } 
} 

Power Automate

{ 
    "type": "object", 
    "properties": { 
        "fields": { 
            "type": "object", 
            "properties": { 
                "[field_external_id]": { 
                    "type": "object", 
                    "properties": { 
                        "value": { 
                            "type": "string" 
                        }, 
                        "originalValue": { 
                            "type": "string" 
                        } 
                    } 
                } 
            } 
        } 
    } 
} 
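Before posting an enrichment output, a custom step might check that a parsed JSON value matches the shape above. The following runtime check is a minimal sketch; the function name is hypothetical and the check covers only the properties listed in the schema.

```typescript
interface EnrichmentOutput {
  fields: { [fieldExternalId: string]: { value: string; originalValue: string } };
}

// Returns true when the value has a "fields" object whose entries each
// carry string "value" and "originalValue" properties.
function isEnrichmentOutput(candidate: unknown): candidate is EnrichmentOutput {
  if (typeof candidate !== "object" || candidate === null) return false;
  const fields = (candidate as { fields?: unknown }).fields;
  if (typeof fields !== "object" || fields === null || Array.isArray(fields)) return false;
  const entries = fields as Record<string, unknown>;
  return Object.keys(entries).every((id) => {
    const entry = entries[id];
    if (typeof entry !== "object" || entry === null) return false;
    const field = entry as Record<string, unknown>;
    return typeof field.value === "string" && typeof field.originalValue === "string";
  });
}
```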

Extraction output JSON schema

TypeScript

export interface IExtractionOutput { 
    pageCount: number; 
    collection: string; 
    collectionConfidence: number; 
    fields: { 
        [field_external_id: string]: { 
            value: string; 
            originalValue: string; 
            confidence: number; 
        }; 
    }; 
} 

Power Automate

{ 
    "type": "object", 
    "properties": { 
        "pageCount": { 
            "type": "integer" 
        }, 
        "collection": { 
            "type": "string" 
        }, 
        "collectionConfidence": { 
            "type": "number" 
        }, 
        "fields": { 
            "type": "object", 
            "properties": { 
                "[field_external_id]": { 
                    "type": "object", 
                    "properties": { 
                        "value": { 
                            "type": "string" 
                        }, 
                        "originalValue": { 
                            "type": "string" 
                        }, 
                        "confidence": { 
                            "type": "number" 
                        } 
                    } 
                } 
            } 
        } 
    } 
}
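A consumer of the extraction output might want to list the fields whose confidence falls below a chosen threshold, for example to highlight them for agent review. The sketch below assumes confidence values between 0 and 1 and a caller-supplied threshold; how the product itself maps step thresholds to a Pipeline state is not described here, and the function name is hypothetical.

```typescript
interface ExtractionOutput {
  pageCount: number;
  collection: string;
  collectionConfidence: number;
  fields: {
    [fieldExternalId: string]: {
      value: string;
      originalValue: string;
      confidence: number;
    };
  };
}

// Returns the external ids of fields whose confidence is below the threshold,
// sorted alphabetically so the result is stable.
function lowConfidenceFields(output: ExtractionOutput, threshold: number): string[] {
  return Object.keys(output.fields)
    .filter((id) => {
      const field = output.fields[id];
      return field !== undefined && field.confidence < threshold;
    })
    .sort();
}
```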

See also

Document intelligence
Main data model entities for Document intelligence
Configure a document definition
Support
Deploy Microsoft Cloud for Financial Services
What is Microsoft Cloud for Financial Services?