Document Classification

Patrick Gonzalez 6 Reputation points
2022-09-27T01:54:04.98+00:00

What would be the most straightforward Azure service(s) to use in order to classify documents, such as 'Contract', 'Invoice', 'Insurance Policy', 'Court Document'? Suppose I wanted to classify documents (mainly PDFs), but did not necessarily need to extract data from them. Would I first need to go through an OCR process and then analyze the text, or is there an Azure AI service that can identify and classify the documents directly?

Azure AI Document Intelligence
An Azure service that turns documents into usable data. Previously known as Azure Form Recognizer.

1 answer

  1. Ramr-msft 17,826 Reputation points
    2022-09-27T11:15:29.213+00:00

    @Patrick Gonzalez Thanks for the question. Currently LUIS document classification is in preview.

    There’s a solution accelerator for Knowledge Mining that could fit your case: https://github.com/microsoft/Accelerator-AzureML_CognitiveSearch
    It combines two steps: extracting text from documents with Azure Search’s OCR capabilities (as suggested below), then training and deploying a Natural Language Processing model with Azure Machine Learning.
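
    To make the "OCR first, then classify the text" idea concrete, here is a minimal sketch of the classification step only. It is not an Azure API and not the accelerator's code: it assumes OCR has already produced plain text, and it uses a small Naive Bayes model written with the Python standard library as a stand-in for whatever model you would train in Azure Machine Learning. The class name, function names, and training snippets are all hypothetical and invented for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesDocClassifier:
    """Hypothetical helper: multinomial Naive Bayes with add-one smoothing."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter(labels)        # label -> number of documents
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        # Words never seen in training carry no information, so skip them.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for t in tokens:
                score += math.log((counts[t] + 1) / denom)  # smoothed log likelihood
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up stand-ins for OCR output; a real model needs many documents per class.
train_texts = [
    "invoice number amount due payment terms total tax",
    "invoice bill to total amount due remit payment",
    "this agreement is made between the parties hereto terms and conditions",
    "contract agreement parties obligations termination governing law",
    "policy holder coverage premium deductible insured claim",
    "insurance policy coverage limits premium insured",
    "in the court of plaintiff defendant case number order judge",
    "court plaintiff defendant motion hearing judge ordered",
]
train_labels = [
    "Invoice", "Invoice",
    "Contract", "Contract",
    "Insurance Policy", "Insurance Policy",
    "Court Document", "Court Document",
]

classifier = NaiveBayesDocClassifier().fit(train_texts, train_labels)
print(classifier.predict("Please remit payment for the total amount due on this invoice"))
# prints: Invoice
```

    In practice the text passed to `predict` would come from the OCR step, and the toy model would be replaced by one trained on a labeled set of your own documents.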

    In the accelerator’s example, the model performs Named Entity Recognition rather than classification, but you could replace it with a classification model. Reference materials I can think of:

