Microsoft AI for contract document entity extraction

Sooraj Sudhakaran 1 Reputation point
2022-02-01T18:11:09.447+00:00

We are trying to use Microsoft AI for custom entity extraction from contract documents. For example, when a contract document is uploaded, we need to extract the Party Name, Address, Effective Date, and so on.
Any help in identifying the right API is highly appreciated.
Is there a trained model available for contract document extraction?

Azure AI services
A group of Azure services, SDKs, and APIs designed to make apps more intelligent, engaging, and discoverable.

1 answer

  1. YutongTie-MSFT 46,986 Reputation points
    2022-02-02T01:14:08.913+00:00

    @Sooraj Sudhakaran

    Thanks for reaching out to us. As I understand it, you are looking for a trained model that extracts this content directly. I would recommend trying Form Recognizer or Named Entity Recognition (NER).

    Named Entity Recognition differs from Custom Entity Extraction in that it uses the default prebuilt model, so you don't need to train anything: https://learn.microsoft.com/en-us/azure/cognitive-services/language-service/named-entity-recognition/overview
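
    As a rough illustration of calling the prebuilt NER model, here is a minimal sketch that builds the REST request with only the Python standard library and groups the returned entities by category. The API version, the helper names `build_ner_request` and `entities_by_category`, and the endpoint and key values are my assumptions for illustration, so please check them against the documentation linked above. Note that the prebuilt model returns general categories (such as Organization or DateTime), not contract roles like "Party Name" or "Effective Date", so you would still need to map them yourself.

```python
import json
import urllib.request

# Assumed API version; check which version your Language resource supports.
API_VERSION = "2022-05-01"


def build_ner_request(endpoint: str, key: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a prebuilt NER request for a single document."""
    body = {
        "kind": "EntityRecognition",
        "parameters": {"modelVersion": "latest"},
        "analysisInput": {
            "documents": [{"id": "1", "language": "en", "text": text}]
        },
    }
    url = f"{endpoint.rstrip('/')}/language/:analyze-text?api-version={API_VERSION}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def entities_by_category(response: dict) -> dict:
    """Group entity texts from an analyze-text response by their category."""
    grouped = {}
    for doc in response.get("results", {}).get("documents", []):
        for ent in doc.get("entities", []):
            grouped.setdefault(ent["category"], []).append(ent["text"])
    return grouped


# To actually send the request (needs a real endpoint and key):
# with urllib.request.urlopen(build_ner_request(endpoint, key, text)) as resp:
#     print(entities_by_category(json.load(resp)))
```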

    For Form Recognizer, I recommend the general document model. This preview model combines powerful Optical Character Recognition (OCR) capabilities with deep learning models to extract key-value pairs and entities from documents. The general document model is only available with the preview (v3.0) API.
    https://learn.microsoft.com/en-us/azure/applied-ai-services/form-recognizer/concept-general-document
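
    To show what you might do with the key-value pairs once the general document model has analyzed a contract, here is a small sketch that flattens them into a dictionary. The field names (`keyValuePairs`, `key.content`, `value.content`, `confidence`) reflect my understanding of the v3.0 analyze result and should be verified against the page above. The `sample` input is hand-written for illustration, not real service output, and the `min_confidence` threshold is an arbitrary assumption.

```python
def key_value_pairs(analyze_result: dict, min_confidence: float = 0.5) -> dict:
    """Flatten key-value pairs from an analyze result into {key: value}.

    Pairs below min_confidence are skipped. A key detected without a
    value is kept with an empty string as its value.
    """
    pairs = {}
    for kv in analyze_result.get("keyValuePairs", []):
        key = kv.get("key", {}).get("content", "").strip()
        value = (kv.get("value") or {}).get("content", "").strip()
        if key and kv.get("confidence", 0.0) >= min_confidence:
            pairs[key] = value
    return pairs


# Hand-written sample in the assumed response shape (not real output).
sample = {
    "keyValuePairs": [
        {
            "key": {"content": "Effective Date:"},
            "value": {"content": "1 January 2022"},
            "confidence": 0.92,
        },
        # A low-confidence key with no detected value.
        {"key": {"content": "Party:"}, "confidence": 0.30},
    ]
}

print(key_value_pairs(sample))
```

    Filtering on confidence is a design choice: contracts often contain labels the model only half recognizes, and it is usually safer to review those manually than to store them as extracted fields.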

    If you need better performance, custom NER, which you mentioned, should be a better choice. It lets you build custom AI models that extract domain-specific entities from unstructured text, such as contracts or financial documents. By creating a Custom NER project, developers can iteratively tag data, train, evaluate, and improve model performance before making the model available for consumption. The quality of the tagged data greatly impacts model performance.
    https://learn.microsoft.com/en-us/azure/cognitive-services/language-service/custom-named-entity-recognition/overview
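
    Once a Custom NER model is trained and deployed, extraction is submitted as an asynchronous job. The sketch below only builds the job body. As far as I know it is posted to `{endpoint}/language/analyze-text/jobs` with an `api-version` query parameter, and the service replies with an `operation-location` header to poll, but please confirm this in the documentation above. The function name `build_custom_ner_job`, the display and task names, and the project and deployment names in the usage line are hypothetical.

```python
import json


def build_custom_ner_job(project_name: str, deployment_name: str, documents: list) -> dict:
    """Build the JSON body for a Custom NER extraction job.

    project_name and deployment_name must match a model you have already
    trained and deployed in your own Language resource.
    """
    return {
        "displayName": "Extract contract entities",  # free-text label, assumed
        "analysisInput": {
            "documents": [
                {"id": str(i), "language": "en", "text": text}
                for i, text in enumerate(documents, start=1)
            ]
        },
        "tasks": [
            {
                "kind": "CustomEntityRecognition",
                "taskName": "contract-entities",  # hypothetical task name
                "parameters": {
                    "projectName": project_name,
                    "deploymentName": deployment_name,
                },
            }
        ],
    }


# Hypothetical project and deployment names, for illustration only.
job = build_custom_ner_job("contracts", "production", ["This Agreement is made on ..."])
print(json.dumps(job, indent=2))
```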

    [Image: development-lifecycle.png]

    In practice, most of the work on your side is tagging your data. Please try the options above to see which one works best for your business.

    Hope this helps. Please let us know if you need further assistance.

    Please accept the answer if you found it helpful. Thank you!

    Regards,
    Yutong