Document Processor (preview)

[This article is prerelease documentation and is subject to change.]

Publisher: Microsoft

The Document Processor managed agent is an off-the-shelf, packaged solution for end-to-end document processing, including extraction, validation, human monitoring, and exporting to downstream apps. Users don't need to label data or train custom models. Instead, an agent maker can upload a relevant sample document and then configure what to extract in a guided experience. The agent also keeps users informed of the processing status and what data the agent extracts, which can be viewed and manually verified in the Validation Station. Makers can review the steps the agent takes through the Activity page. You can also publish the agent to Microsoft Teams, where users can interact with and upload documents to the Document Processor agent.

You can also customize the agent after installation.

If the agent trigger receives a document that is different from the type used during configuration (for example, the agent was installed and configured with an invoice, but the agent receives a contract), the agent extracts data from the document, but doesn't process it further.

Important

This feature is in preview, and available in English only. This article contains Microsoft Copilot Studio preview documentation and is subject to change.

Preview features aren't meant for production use and may have restricted functionality. These features are available before an official release so that you can get early access and provide feedback. Review the Responsible AI FAQ before using this feature.

If you're building a production-ready agent, see Microsoft Copilot Studio Overview.

Prerequisites

Limitations

Documents to be processed must meet the following requirements:

  • Each upload for processing must be less than 25 MB.
  • Files must be in PNG, JPG, JPEG, or PDF format.
  • Documents must have 50 pages or fewer.
  • Large documents with many data fields can take longer to process and require more manual verification, especially for tables.
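
If you want to screen files before they reach the agent, the size, format, and page limits above can be expressed as a small pre-check. The following is only a sketch, not part of the Document Processor agent: the function name check_document and its parameters are hypothetical, and it assumes "25 MB" means 25 × 1024 × 1024 bytes and that you already know the page count.

```python
from pathlib import Path
from typing import List

# Assumption: "25 MB" is interpreted as 25 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 25 * 1024 * 1024
MAX_PAGES = 50
ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".pdf"}


def check_document(file_name: str, size_bytes: int, page_count: int) -> List[str]:
    """Return the limitations a file violates; an empty list means it passes.

    Hypothetical helper for illustration; the agent does not expose this function.
    """
    problems = []
    if size_bytes >= MAX_UPLOAD_BYTES:
        problems.append("upload must be less than 25 MB")
    if Path(file_name).suffix.lower() not in ALLOWED_EXTENSIONS:
        problems.append("file must be PNG, JPG, JPEG, or PDF")
    if page_count > MAX_PAGES:
        problems.append("document must have 50 pages or fewer")
    return problems
```

For example, check_document("invoice.pdf", 1_000_000, 3) returns an empty list, while a 30 MB, 51-page .docx file reports all three problems.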

AI-generated content can have mistakes, so don't forget to make sure it's accurate and appropriate. Review the Supplemental Terms.

Set up your agent

Install and set up connections

Before you can install the Document Processor agent, you need to create and authenticate connections with the required services:

  • Microsoft Dataverse
  • Microsoft Copilot Studio
  • Power Apps for Admins

  1. On the Home or Create page of Copilot Studio, select Document Processor from the Managed agents list.

    Screenshot showing how to create a Document Processor agent.

  2. Select Install.

  3. Authenticate with the required services. If you see all green checkmarks, you're good to go. If not, select More to sign in. If you don't have access to Dataverse with your Copilot Studio account, contact your admin for help.

  4. Select Next. Copilot Studio installs your new agent.

  5. To configure your new agent right away, wait for installation, then select Configure. Otherwise, select Close and configure the agent later.

Configure data fields to extract

In this part of the configuration process, you can edit what the Document Processor agent extracts and how it extracts it. The agent determines which fields and data to extract by using a sample of the document you expect it to process.

  1. In your Document Processor agent, select Upload. Select your sample document and upload it to your agent.
  2. Your agent shows a list of the fields and the data for each field it detected. You can refine what fields and data you want your agent to extract:
    • Deselect any fields you don't want your agent to extract.
    • Select Advanced to edit the prompt your agent follows.
    • Select Upload another document and compare the results between two documents.
    • For any detected tables, select View table to view and deselect table data.

    Screenshot showing a detected data field with a table to view.

  3. When you're finished configuring the fields for extraction, select Next to add validation rules.

Create and test validation rules

Validation rules are the checks you want the Document Processor agent to run on the data it extracts. For example, the agent can validate that the client ID in a sales form already exists in a Dataverse table, or that the date on an invoice follows a regulatory format. If the data doesn't follow the rules, the agent sends a notification to a manual reviewer for validation.

Screenshot showing the validation rules list.

If you don't want to add any validation rules, delete or turn off the sample rules, then select Next.

To add validation rules:

  1. Select Add rule.
  2. In the text field, describe the rule you want to add. For example, Dates must be in ISO 8601 format.
  3. Select Add.

    Screenshot showing how to add a validation rule.

  4. To add more rules, select Add rule again.
  5. After you add all your rules, test them. Select Advanced.
  6. Enter or edit your rules in the Instructions textbox, then select Test.
  7. Review the test results in the Model response textbox. If they don't match your expectations, modify and retest the rules.
  8. Return to the basic validation rules editor. Here, you can modify each rule, turn individual rules off or on, or delete them.
  9. When you're finished, select Next.

You can also add rules using natural language:

  1. Select Advanced.
  2. Describe your rules with your own words in the Instructions textbox.
  3. Select Test.
  4. Review the test results in the Model response textbox. If they don't match your expectations, modify and retest the rules.
  5. Return to the basic validation rules editor. Here, you can modify each rule, turn individual rules off or on, or delete them.
  6. When you're finished, select Next.
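
To make the example date rule concrete, the following sketch shows one way to express the same check in code. It isn't how the agent evaluates rules, which you describe in natural language. It assumes the rule means the ISO 8601 extended calendar date form YYYY-MM-DD, and the function name is_iso8601_date is hypothetical.

```python
import re
from datetime import datetime

# Assumption: the rule targets the extended calendar date form YYYY-MM-DD only.
_ISO_DATE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def is_iso8601_date(value: str) -> bool:
    """Return True if value is a real calendar date written as YYYY-MM-DD."""
    # The pattern enforces zero-padded fields, which strptime alone doesn't.
    if not _ISO_DATE.fullmatch(value):
        return False
    try:
        # strptime rejects impossible dates such as February 30.
        datetime.strptime(value, "%Y-%m-%d")
    except ValueError:
        return False
    return True
```

A check like this can help you choose sample values when you test a rule: a date such as 2024-01-31 should pass, while 31/01/2024 or 2024-02-30 should be flagged.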

Assign a reviewer

The reviewer receives a notification when a document fails a validation rule. The notification contains a link to the Validation Station, where the reviewer can view the document, data, and validation results.

  1. Reviewing requires the Approvals connector, authenticated with your Outlook credentials, to send the email notification. A green checkmark means the authentication is set up. If the connector doesn't authenticate, select More.

    Screenshot showing a successful Approvals connection.

  2. Add the recipient of the review notification. You can also add a message, such as context or next steps, to include in the notification.

Choose how your agent receives documents

Users don't need to manually upload documents directly to the Document Processor agent. Instead, you set a trigger that prompts the agent to process and validate documents.

For more information on triggers, see Triggers overview.

  1. Select the trigger for how you want your agent to receive documents.
  2. Authenticate the trigger and add the required details.
  3. Select Add trigger. To have your agent triggered from multiple sources, add more triggers.
  4. When you're finished adding triggers, select Next.

On the final page, you can see the extracted data and its location by selecting Check data.

To finish setting up your agent, select Complete.

Test your agent

You can test your agent by sending I want to process a document or something similar in the agent's test chat and uploading a sample document to process. It's a good idea to complete some test runs after making any changes to your agent.

For more information on testing agents, see Test your agent.

Monitor processing

The Document Processor agent includes the Document Processing Monitoring and Validation Station. This Power Apps application lets users monitor document processing and view the outcome of each processing run.

To open the app, select monitoring app in the agent's introduction message in the test chat, or open the link in a reviewer notification.

Screenshot showing the monitoring app link in the test chat.

Validate data

When a document fails a validation rule during processing, the agent sends an email notification to the reviewer designated in the agent configuration. The email shows the data that failed validation and the applicable rule.

The reviewer can approve the data for extraction, or reject the data and prevent the data from being extracted. They can do so in the email itself, or open the link in the notification to open the Document Processing Monitoring and Validation Station to see more details about the data.

From the main page, select one of the files.

Screenshot showing the main page of the monitoring app.

After opening the file from the main page or the verification notification, review the extracted data against the submitted document, then select Approve or Reject.

Screenshot showing the Approve and Reject button on the manual validation page.
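
The review flow described in this section can be summarized as a small decision function. This is a sketch for illustration only, not the agent's implementation: the Status names, the review_status function, and the "approve" and "reject" strings are hypothetical, and it assumes a document that fails no rule needs no review.

```python
from enum import Enum
from typing import List, Optional


class Status(Enum):
    PASSED = "passed"                    # assumption: no failed rule means no review
    AWAITING_REVIEW = "awaiting review"  # reviewer notified, no decision yet
    APPROVED = "approved"                # reviewer approved the data for extraction
    REJECTED = "rejected"                # reviewer prevented the data from being extracted


def review_status(failed_rules: List[str], decision: Optional[str] = None) -> Status:
    """Map validation results and an optional reviewer decision to a status."""
    if not failed_rules:
        return Status.PASSED
    if decision is None:
        return Status.AWAITING_REVIEW
    if decision == "approve":
        return Status.APPROVED
    if decision == "reject":
        return Status.REJECTED
    raise ValueError(f"unknown reviewer decision: {decision!r}")
```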

Talk to the agent over Teams

When you publish your Document Processor agent to Microsoft Teams, users can talk to it, get updates from it, and upload documents for processing.

Customize agent

After setting up your Document Processor agent, you can further customize, modify, and test it.

To make changes to your agent's setup, open the configuration wizard link in the agent's introduction message in the test chat.

Screenshot showing the location of the intro message in the test chat.

However, some parts of the agent can't be changed or removed, such as the Data Processing Event knowledge source that powers document processing. Changing other parts, such as the actions automatically added to your agent, could affect your agent's performance. Use caution when changing any pieces that were part of the initial agent creation and pieces that determine agent behavior.

Note

If you update a managed agent to a new version, only the uncustomized components receive updates. Customized components don't update to newer versions. For example, if you modify an action included in the managed agent, then update the agent to a new version, the action doesn't receive any changes related to the new version. Adding new components to the agent doesn't affect the ability to receive updates.
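
The update behavior in this note can be modeled in a few lines. This is only a sketch of the rule as stated, not how Copilot Studio stores or updates components: the Component class, its fields, and apply_update are hypothetical names.

```python
from dataclasses import dataclass, replace
from typing import Dict


@dataclass(frozen=True)
class Component:
    version: str
    managed: bool = True      # shipped with the managed agent
    customized: bool = False  # modified by the maker after installation


def apply_update(installed: Dict[str, Component], new_version: str) -> Dict[str, Component]:
    """Return the components after an update to new_version.

    Only managed, uncustomized components move to the new version.
    Customized components and components the maker added stay as they are.
    """
    updated = {}
    for name, component in installed.items():
        if component.managed and not component.customized:
            updated[name] = replace(component, version=new_version)
        else:
            updated[name] = component
    return updated
```

In this model, a customized action stays at its installed version after the update, an untouched action moves to the new version, and an action the maker added is left alone without blocking the update.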

For information on customizing your agent, such as changing the name and description, see Create an agent.

You can also add or modify triggers and actions.

AI disclosures

It's a best practice to communicate to users and stakeholders that the agent uses artificial intelligence. Instruct your agent or configure relevant actions to include a message in emails and other media to inform users that the content was produced by an AI. Conversational communication should include the default message: "Just so you are aware, I sometimes use AI to answer your questions."