Clarification Needed: Retaining Labeled Data Across Versions in Custom Extraction Model
I’m working on a project using Azure Document Intelligence (Custom Extraction model), and I have a question about how labeled data is handled across model versions.
We initially trained a custom extraction model using a set of labeled documents and were able to test it successfully. Later, based on our testing results, we found that we needed to include a few additional document types, so we labeled them using the same labeling project and layout.
However, after completing the new labeling and initiating a new training session, we noticed two things:
- Only the newly labeled documents appeared in the label layout; the previously labeled documents from the original training set were no longer visible.
- We were required to enter a new model name each time we trained, which created a completely new model rather than an updated version of the existing one.
It appears that the model is being retrained only on the new set of documents, not on the full accumulated data, even though we are working within the same labeling project.
Since we expect to regularly update the model with new document formats over time, we were hoping there would be a way to keep the labeled data consolidated and continuously build upon the existing model, rather than creating disconnected versions each time.
Could you please clarify:
- Is there a supported way to continuously update or extend an existing model with new labeled documents?
- How can we ensure that the model gets trained on the entire labeled dataset, not just the newly added documents?
- Is there a way to view or manage the full labeled dataset across training versions within the same project?
- What’s the recommended best practice if we want to maintain and grow a single evolving model over time?
We’d appreciate your guidance, as this is a key part of our use case.
Thank you in advance for your support.