Should I use one custom model or multiple custom models for varying tax deduction forms?

Shahar Spencer 40 Reputation points

I have a use case for tax deduction forms that have significant variation among them. They might be multi-page and have different structures. The general layout is a table with some possible extra information around it.

There is a list of columns that should be in each of them, but many of the documents lack some columns or have the information elsewhere in the document.

I am wondering whether I should train multiple template models since there is such high variation, or only train one model and give as many possible variations in that single model. Accuracy is crucial, and in case of less than 100% accuracy, I will want to send the document for manual review.

One of my considerations is how much post-processing I will have to do. For example, when all rows share the same value for some column, that value might appear once at the top of the document rather than as a column of its own; in that case, I have to recognize that this happened and handle it in post-processing. If I have a separate model detecting this, it might be easier.

Thank you!

Azure AI Document Intelligence
An Azure service that turns documents into usable data. Previously known as Azure Form Recognizer.

Accepted answer
  1. Konstantinos Passadis 17,201 Reputation points

    Hello @Shahar Spencer

    Welcome to Microsoft QnA!

    Deciding whether to train multiple template models or a single model to handle variations in tax deduction forms is a common challenge in document processing, especially when dealing with highly variable formats. Here are some considerations to help you decide the best approach for your use case:

    1. Variability of Forms: If the forms differ significantly in layout and structure, training a separate model for each type of form might yield higher accuracy. This approach works best when the documents can be easily categorized into distinct types. If the variation is mostly about the presence or absence of certain fields rather than entirely different layouts, a single, more robust model might be sufficient.
    2. Post-Processing Efforts: Training multiple models could potentially reduce the amount of post-processing needed, as each model would be tailored to a specific form type. However, this comes at the cost of increased complexity in maintaining and updating multiple models. A single model might require more sophisticated post-processing logic to handle the variations, but it simplifies the model management aspect.
    3. Training Data Availability: More models mean you need enough representative data for each type of form. Ensure you have sufficient and diverse data samples for each category. A single model requires diverse data that covers all variations and combinations seen in the forms.
    4. Accuracy and Manual Review: If accuracy is crucial, you might lean towards multiple models, as this could provide more precise results for each form type. Whichever approach you choose, use the confidence scores returned with the extracted fields: documents that contain low-confidence fields can be flagged for manual review.
    5. Scalability and Maintenance: Multiple models can be challenging to scale and maintain, especially if new form types are introduced or existing forms are updated. A single model is easier to update and scale, but it must be robust enough to handle all variations.
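
    As an illustration of the manual-review idea in point 4, here is a minimal sketch in plain Python. It assumes you have already copied each extracted field's value and confidence out of the service response into a small structure; `ExtractedField`, `needs_manual_review`, the field names, and the 0.9 threshold are all hypothetical choices for this example, not part of any SDK.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExtractedField:
    """One extracted field: its text value and the model's confidence (0.0 to 1.0)."""
    value: Optional[str]
    confidence: float


def needs_manual_review(fields, required, threshold=0.9):
    """Return the reasons a document should go to manual review (empty list if none)."""
    reasons = []
    # A required field that is absent or empty always triggers a review.
    for name in required:
        field = fields.get(name)
        if field is None or field.value is None:
            reasons.append(f"missing required field: {name}")
    # Any extracted value below the (assumed) threshold also triggers a review.
    for name, field in fields.items():
        if field.value is not None and field.confidence < threshold:
            reasons.append(f"low confidence for {name}: {field.confidence:.2f}")
    return reasons


# Hypothetical extraction result for one document.
fields = {
    "employee_name": ExtractedField("A. Example", 0.98),
    "deduction_amount": ExtractedField("1200.00", 0.71),
}
reasons = needs_manual_review(
    fields, required=["employee_name", "deduction_amount", "tax_year"]
)
print(reasons)
```

    A document is accepted automatically only when the returned list is empty. The threshold is something you would tune against documents you have reviewed by hand.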


    I hope this helps

    The answer or portions of it may have been assisted by AI Source: ChatGPT Subscription

    Kindly mark the answer as Accepted and upvote it if it helped, or post your feedback.



1 additional answer

  1. VasaviLankipalle-MSFT 13,686 Reputation points

    Hello @Shahar Spencer, thanks for using the Microsoft Q&A platform.

    Based on your use case, it looks like your tax deduction forms vary in structure. Custom neural models support extracting fields from structured, semi-structured, and unstructured documents, so you can choose a neural model if that fits your use case.

    If you use template models, it is recommended to train multiple models, each on a specific structure. This ensures that each model is optimized for one type of form, which can improve extraction accuracy. Training a single template model on forms with very different layouts can result in lower accuracy.

    To address this, you can split your training dataset into different variations of the template, with at least five samples of each variation. Once you have trained the individual models, you can then compose them into a single endpoint.
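
    Before training, it can help to check that every variation actually has enough samples. The following is a minimal sketch in plain Python; the file names, the variation labels, and the helper functions are hypothetical, and it assumes you have already assigned each sample document to a variation by hand.

```python
from collections import defaultdict

MIN_SAMPLES_PER_VARIATION = 5  # minimum suggested in the answer above


def group_by_variation(samples):
    """Group (file_path, variation_label) pairs into {variation: [paths]}."""
    groups = defaultdict(list)
    for path, variation in samples:
        groups[variation].append(path)
    return dict(groups)


def underrepresented(groups, minimum=MIN_SAMPLES_PER_VARIATION):
    """Return the variations that have fewer than `minimum` samples, sorted by name."""
    return sorted(v for v, paths in groups.items() if len(paths) < minimum)


# Hypothetical labeled samples: six of one layout, three of another.
samples = [(f"form_a_{i}.pdf", "table_only") for i in range(6)]
samples += [(f"form_b_{i}.pdf", "value_in_header") for i in range(3)]

groups = group_by_variation(samples)
short = underrepresented(groups)
print(short)  # variations that still need more samples before training
```

    Any variation listed in `short` needs more samples before it gets its own model.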

    Regarding post-processing, it will totally depend on the specific details of your use case.
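
    For the specific case described in the question, where a value shared by all rows appears once at the top of the document instead of as a column, the post-processing step could look roughly like this. It is a sketch in plain Python that assumes the table rows have already been converted to dictionaries and that the document-level value was extracted as a separate field; `fill_missing_column` and the column names are hypothetical.

```python
def fill_missing_column(rows, column, document_value):
    """Copy a document-level value into every row that lacks `column`.

    Returns (new_rows, changed). The input rows are not modified.
    """
    if document_value is None:
        # Nothing to propagate; return copies unchanged.
        return [dict(row) for row in rows], False
    filled = []
    changed = False
    for row in rows:
        new_row = dict(row)
        # Treat a missing key and an empty value the same way.
        if not new_row.get(column):
            new_row[column] = document_value
            changed = True
        filled.append(new_row)
    return filled, changed


# Hypothetical rows: the first lacks "tax_year", the second already has it.
rows = [
    {"description": "Donation", "amount": "100.00"},
    {"description": "Pension", "amount": "250.00", "tax_year": "2022"},
]
filled, changed = fill_missing_column(rows, "tax_year", "2023")
print(filled, changed)
```

    The `changed` flag lets you record that a value was propagated, which may itself be a reason to send the document for manual review.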

    Try to work with the latest API version and see if that helps.

    I hope this helps.



    -Please kindly accept the answer and vote 'yes' if you feel helpful to support the community, thanks.
