Issue with Document Intelligence Custom Model for BOE Extraction and Alternative Solutions

Niket Kumar Singh 300 Reputation points
2024-07-07T07:59:14.26+00:00

I'm facing issues with a custom Document Intelligence model configured to extract fields from Bills of Entry (BOEs) presented in tables. Specifically, the model:

  • Fails to extract all fields accurately from BOEs in table format.
  • Does not label all fields correctly when processing multiple pages of BOEs.
  • My data is in PDF and I am extracting it to Excel. The trained model works fine with single-page PDFs but not with multi-page ones.

Could you please provide guidance on optimizing the model settings or any specific considerations for handling multi-page documents in Document Intelligence? Additionally, I'm interested in learning about alternative services or workarounds that could achieve similar functionality.

Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.
Azure AI Document Intelligence
An Azure service that turns documents into usable data. Previously known as Azure Form Recognizer.

1 answer

  1. Azar 21,960 Reputation points MVP
    2024-07-07T19:04:24.5633333+00:00

    Hi there, Niket Kumar Singh

    Thanks for using the Q&A platform.

    It sounds like your custom Document Intelligence model is having trouble extracting fields from multi-page BOEs.

    Make sure the model is trained on a diverse dataset that includes multi-page documents, and check whether the model configuration has any settings specific to handling multi-page documents.

    • Split the PDF into individual pages before processing and then merge the extracted data.
    • Use a pre-processing step to label and segment data across pages consistently.
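
The split-then-merge workaround above could be sketched roughly as follows. This is only an illustration under assumptions that are not part of this thread: it uses the third-party `pypdf` package to split the file, and `analyze_page` is a hypothetical callable that you would write yourself to send one single-page PDF to your custom model and return its table rows as dictionaries. The helper names `split_pdf_pages`, `merge_page_results`, `rows_to_csv`, and `extract_boe` are likewise made up for this example.

```python
import csv
import io
from typing import Callable, Dict, Iterable, List

Row = Dict[str, str]


def split_pdf_pages(pdf_path: str) -> List[bytes]:
    """Return each page of the PDF as a standalone single-page PDF (bytes)."""
    # Assumption: pypdf is installed (pip install pypdf). Imported lazily so
    # the merge helpers below can be used without it.
    from pypdf import PdfReader, PdfWriter

    pages = []
    for page in PdfReader(pdf_path).pages:
        writer = PdfWriter()
        writer.add_page(page)
        buffer = io.BytesIO()
        writer.write(buffer)
        pages.append(buffer.getvalue())
    return pages


def merge_page_results(page_results: Iterable[List[Row]]) -> List[Row]:
    """Concatenate per-page rows, tagging each row with its 1-based page number."""
    merged = []
    for page_number, rows in enumerate(page_results, start=1):
        for row in rows:
            merged.append({"page": str(page_number), **row})
    return merged


def rows_to_csv(rows: List[Row]) -> str:
    """Serialize merged rows to CSV text, which Excel can open directly."""
    fieldnames: List[str] = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


def extract_boe(pdf_path: str, analyze_page: Callable[[bytes], List[Row]]) -> str:
    """Split the PDF, run the hypothetical analyze_page on each page, merge to CSV."""
    pages = split_pdf_pages(pdf_path)
    return rows_to_csv(merge_page_results(analyze_page(p) for p in pages))
```

Keeping the page number on every row makes it easier to see which pages the model labels badly, which in turn tells you which kinds of pages to add to the training set.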

    As for an alternative, I'd suggest Form Recognizer (the previous name of Azure AI Document Intelligence), which is good for extracting structured data from forms and tables.

    If this helps, kindly accept the answer. Thanks much.