Unable to analyze all the mappers in the Document Intelligence custom model.

Niket Kumar Singh 655 Reputation points
2024-10-29T05:11:51.35+00:00

We have built a custom model for the BOE entity, and the model trained successfully. It works correctly with single-page PDFs, where all mappers are labeled correctly. However, with multi-page BOEs, not all entries are labeled.


Azure AI Document Intelligence
An Azure service that turns documents into usable data. Previously known as Azure Form Recognizer.

1 answer

  1. Deepanshukatara-6769 15,920 Reputation points Moderator
    2024-10-29T08:18:30.6766667+00:00

    Hello Niket, welcome to MS Q&A.

    When processing multi-page PDFs with a custom model in Azure AI Document Intelligence, keep the following limits and requirements in mind:

    1. Page Limit: For PDF documents, you can process up to 2,000 pages. However, if you are using a free tier subscription, only the first two pages will be processed.
    2. Training Data: When training a custom model, the maximum number of pages for training data is 500 for a custom template extraction model and 50,000 for a custom neural model. Ensure that your training documents are structured consistently to improve accuracy.
    3. File Size: The file size for analyzing documents is limited to 500 MB for paid tiers and 4 MB for free tiers.
    4. Visual Consistency: Ensure that the documents you use for training present a consistent visual template. This helps in accurately extracting the BOE entity and other labeled data.
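
The analysis-time limits in points 1 and 3 can be checked before a document is submitted. The snippet below is a minimal sketch that uses only the numbers quoted in this answer. The names `TierLimits`, `LIMITS`, and `check_document` are hypothetical helpers for illustration, not part of any Azure SDK, and the tier labels "free" and "paid" are assumed shorthand for the subscription types.

```python
from dataclasses import dataclass

MB = 1024 * 1024


@dataclass(frozen=True)
class TierLimits:
    """Per-tier analysis limits, as quoted in this answer (not read from Azure)."""
    max_file_bytes: int
    max_pages_analyzed: int


# Assumption: "free" and "paid" are illustrative labels for the two tiers.
LIMITS = {
    "free": TierLimits(max_file_bytes=4 * MB, max_pages_analyzed=2),
    "paid": TierLimits(max_file_bytes=500 * MB, max_pages_analyzed=2000),
}


def check_document(file_bytes: int, page_count: int, tier: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means no limit is hit."""
    limits = LIMITS[tier]
    problems = []
    if file_bytes > limits.max_file_bytes:
        problems.append(
            f"file is {file_bytes / MB:.1f} MB, above the "
            f"{limits.max_file_bytes // MB} MB limit for the {tier} tier"
        )
    if page_count > limits.max_pages_analyzed:
        problems.append(
            f"document has {page_count} pages, but only the first "
            f"{limits.max_pages_analyzed} are processed on the {tier} tier"
        )
    return problems


if __name__ == "__main__":
    # A 5-page, 1 MB PDF on the free tier: pages 3 to 5 would not be analyzed.
    for problem in check_document(1 * MB, 5, "free"):
        print(problem)
```

If a multi-page BOE triggers the page-count message on the free tier, the entries missing from the output may simply sit on pages that were never processed.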

    By following these guidelines, you can effectively manage multiple-page PDFs in your custom model for BOE entity extraction.

    Please check and let us know if you have any further questions.

    Kindly accept the answer if it helps.

    Thanks
    Deepanshu

