Document Intelligence Studio - How to label automatically? Custom extraction model

Anonymous
2024-09-23T10:59:19.2333333+00:00

I'm currently working on a custom extraction model to classify different parts of a press release text.

When I used Language Studio for custom text classification, I could label text automatically by uploading a JSON file that contained the respective labels, which sped up the labeling of files.

Is there a way to do that with the custom extraction model as well, with an API for instance?

I have only found the option of auto labeling a document with a prebuilt model.

Azure AI Document Intelligence
An Azure service that turns documents into usable data. Previously known as Azure Form Recognizer.

1 answer

  1. santoshkc 15,325 Reputation points Microsoft External Staff Moderator
    2024-09-26T10:22:51.9433333+00:00

    Hi @Pavith Vickneswararajah,

    Thank you for reaching out to Microsoft Q&A forum!

    In Azure Document Intelligence Studio, auto-labeling with a custom extraction model works differently from what you used in Azure Language Studio. Here’s how you can enable it for your custom extraction model:

    1. Initial Manual Labeling: Start by manually labeling a small set of documents in Document Intelligence Studio. This step is crucial because it gives the model examples to learn from.
    2. Training the Model: After you have labeled a few documents, train the custom extraction model on them. Training lets the model learn the data structures and labels specific to your documents.
    3. Utilizing Auto-Labeling: Once your custom model is trained, you can use it to label new documents automatically. The model applies the labels it learned from the patterns and structures it saw during training.
    4. Automation with APIs: Currently, Azure Document Intelligence does not provide an API for fully automatic labeling comparable to the JSON import in Azure Language Studio. However, you can streamline labeling by writing scripts that format your labeled data and feed it into the custom model training workflow.
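
As an illustration of the scripting approach in step 4, here is a minimal Python sketch that converts labels you already have into one JSON document per file. The function name `build_labels`, the input shape, and the output keys (`document`, `labels`, `label`, `value`, `page`, `text`, `boundingBoxes`) are assumptions for illustration, loosely modeled on the label files the Studio keeps next to training documents. Compare the output with a file produced by your own project before relying on this layout.

```python
import json
from typing import Any, Dict, List


def build_labels(document_name: str, fields: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Turn {field_name: {"text", "page", "box"}} into one labels document.

    The output keys are an assumed layout, not a confirmed schema.
    """
    labels: List[Dict[str, Any]] = []
    for field_name in sorted(fields):  # sorted for stable, diff-friendly output
        field = fields[field_name]
        labels.append(
            {
                "label": field_name,
                "value": [
                    {
                        "page": field["page"],
                        "text": field["text"],
                        # One box per value; coordinates are passed through unchanged.
                        "boundingBoxes": [list(field["box"])],
                    }
                ],
            }
        )
    return {"document": document_name, "labels": labels}


# Hypothetical labels for one press release (all values are made up).
existing = {
    "release_date": {
        "text": "23 September 2024",
        "page": 1,
        "box": [0.10, 0.10, 0.40, 0.10, 0.40, 0.13, 0.10, 0.13],
    },
    "headline": {
        "text": "Example headline",
        "page": 1,
        "box": [0.10, 0.20, 0.90, 0.20, 0.90, 0.25, 0.10, 0.25],
    },
}

labels_doc = build_labels("press_release_01.pdf", existing)
print(json.dumps(labels_doc, indent=2))
```

A script like this only prepares the data. Whether the Studio picks up the resulting files depends on them matching the format it expects, so treat the layout above as a starting point to adapt.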

    I hope this helps. Thank you.
