Entity Extraction - Form Recognizer vs. NER in Cognitive Services?

Mani Wadhwa 1 Reputation point
2022-03-11T11:22:05.113+00:00

Hi,

I have a requirement to identify different entities in a set of documents. I came across two Azure services that seem useful for our use case: Form Recognizer and NER in Cognitive Services.

The latest version (2021-09-30-preview) of Azure Form Recognizer provides DocumentAnalysisClient, which can be used to extract entities from a given PDF or image. I tried this feature by following one of the Python samples, which uses the "prebuilt-document" model to analyze general documents: https://github.com/Azure/azure-sdk-for-python/blob/23decbe4b61626b6a37f1f23dcf18514a2f445a5/sdk/formrecognizer/azure-ai-formrecognizer/samples/v3.2-beta/sample_analyze_general_documents.py#L48
https://pypi.org/project/azure-ai-formrecognizer/3.2.0b3/

I know this feature is currently in beta (pre-release), but I must say the results are very good. What I like best about this service is that it takes care of extracting the content from PDFs with its OCR capabilities, and it returns the extracted entities together with the page numbers where they were found.
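To make the "entities with page numbers" point concrete, here is a minimal sketch of how the result could be grouped by page. It assumes the result object has the shape used in the linked 3.2.0b3 sample (an `entities` list whose items expose `category`, `content`, `confidence` and `bounding_regions` with a `page_number`); other SDK versions may differ. The helper names `entities_by_page` and `analyze_file` are my own, not part of the SDK.

```python
from collections import defaultdict


def entities_by_page(result):
    """Group entities from a prebuilt-document result by page number.

    Assumption: `result` looks like the AnalyzeResult returned by
    DocumentAnalysisClient.begin_analyze_document(...).result() in
    azure-ai-formrecognizer 3.2.0b3 (it has an `entities` list).
    """
    pages = defaultdict(list)
    for entity in result.entities or []:
        # An entity can span several regions, possibly on different pages.
        for region in entity.bounding_regions or []:
            pages[region.page_number].append(
                (entity.category, entity.content, entity.confidence)
            )
    return dict(pages)


def analyze_file(path, endpoint, key):
    """Run the prebuilt-document model on a local file (needs Azure access)."""
    # Imported here so entities_by_page stays usable without the SDK installed.
    from azure.ai.formrecognizer import DocumentAnalysisClient
    from azure.core.credentials import AzureKeyCredential

    client = DocumentAnalysisClient(endpoint, AzureKeyCredential(key))
    with open(path, "rb") as f:
        poller = client.begin_analyze_document("prebuilt-document", f)
    return entities_by_page(poller.result())
```

Calling `analyze_file("contract.pdf", endpoint, key)` (file name, endpoint and key are placeholders) would return a dict mapping each page number to the entities found on that page.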

There is another service, NER in Cognitive Services, that can also be used to extract entities. I tried it in Python as well, using TextAnalyticsClient, but found that it takes text as input and restricts the length of each input document to 5120 characters. With this limitation in NER, I find entity extraction in Form Recognizer the better fit.
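For anyone who wants to stay with NER despite the length limit, one possible workaround is to split the text into chunks before sending it. The sketch below is only an illustration: `split_for_ner` is a hypothetical helper of mine, and it counts Python characters, which may not match exactly how the service measures length for text containing emoji or combining characters.

```python
def split_for_ner(text, limit=5120):
    """Split text into chunks of at most `limit` characters.

    Prefers to break at the last whitespace inside each window so that
    words (and most entities) are not cut in half.
    """
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + limit, len(text))
        if end < len(text):
            # Look for the last space or newline inside the window.
            cut = max(text.rfind(" ", start, end), text.rfind("\n", start, end))
            if cut > start:
                end = cut + 1  # keep the whitespace with the left chunk
        chunks.append(text[start:end])
        start = end
    return chunks
```

The chunks could then be passed to `TextAnalyticsClient.recognize_entities(documents=chunks)`. Keep in mind that the service also limits how many documents go into one request, that entity offsets are relative to each chunk, and that an entity straddling a chunk boundary can still be missed.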

I also found slight differences between the confidence scores that Form Recognizer and NER in Cognitive Services report for the identified entities.

What I would like to understand is the following:

  • Is the new Document Analysis (entity extraction) API in Azure Form Recognizer based on NER in Cognitive Services?
  • Is the algorithm behind entity extraction the same in Form Recognizer and NER?
  • Which is better for entity recognition: Form Recognizer or NER in Cognitive Services? Is there any documentation explaining the difference between these two choices?
  • Why was an entity recognition feature introduced in Form Recognizer when a somewhat similar service was already available in the form of NER in Cognitive Services?

I believe the answers to these questions will help us choose the right service for our product.

Azure AI Language
An Azure service that provides natural language capabilities including sentiment analysis, entity extraction, and automated question answering.
Azure AI Document Intelligence
An Azure service that turns documents into usable data. Previously known as Azure Form Recognizer.
Azure AI services
A group of Azure services, SDKs, and APIs designed to make apps more intelligent, engaging, and discoverable.

1 answer

  1. YutongTie-MSFT 46,996 Reputation points
    2022-03-15T18:13:41.233+00:00

    Hello @Mani Wadhwa

    Thank you for your patience. I just got confirmation from the PM of Form Recognizer: the Entity Extraction function actually leverages the NER feature of the Language service:
    https://learn.microsoft.com/en-us/azure/applied-ai-services/form-recognizer/concept-general-document#entities

    To answer your questions clearly:

    Is New Document Analysis(Entities Extraction) API in Azure Form Recognizer based upon NER in Cognitive?
    The Entity Extraction function of Form Recognizer leverages the NER feature of the Language service.

    Is the algorithm behind Entities Extraction the same in Form Recognizer and NER?
    Yes, it is the same.

    Which seems to be better when it comes to entity recognition - Form Recognizer or NER in Cognitive? Is there any documentation available explaining the difference between these two choices?
    The functions themselves are the same, so the right product depends on your scenario. If you are doing form analysis, Form Recognizer is clearly the better choice, since it integrates a lot of functionality under the hood.

    Why entity recognition feature is introduced in Form Recognizer when there was already a somewhat similar service available in form of NER in Cognitive?
    The feature is brought over from the Language service for the form recognition scenario. We want customers who focus on forms to be able to complete their workflow in one place.

    Hope this helps!

    Regards,
    Yutong

    -Please kindly accept the answer if you find it helpful.