@Patrick Gonzalez Thanks for the question. Currently LUIS document classification is in preview.
There’s a solution accelerator for Knowledge Mining that could fit your case: https://github.com/microsoft/Accelerator-AzureML_CognitiveSearch
It combines two steps: extracting text from documents with Azure Search’s OCR capabilities (as suggested below), and training and deploying a Natural Language Processing model with Azure Machine Learning.
In the example, the model performs Named Entity Recognition rather than classification, but you could replace it with a classification model. Reference materials I can think of:
- Automatically train a classification model using AutoML in AML: https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/automated-machine-learning/classification-text-dnn/auto-ml-classification-text-dnn.ipynb
- Training a custom classification model: check NLP recipes https://github.com/microsoft/nlp-recipes/tree/master/examples/text_classification
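
To make the "swap NER for classification" idea concrete, here is a minimal sketch of what a document classifier does: it takes the text extracted from a document and returns a single label. This is not the accelerator's code or the AutoML notebook; it is a tiny multinomial Naive Bayes written with the Python standard library only, and the class name, labels, and training sentences are all made up for illustration. In practice you would train a much stronger model with the resources above.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesTextClassifier:
    """Toy multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.doc_counts = Counter(labels)        # number of documents per label
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        # Words never seen during training carry no information, so skip them.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for t in tokens:
                score += math.log((counts[t] + 1) / denom)  # smoothed log likelihood
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training data: text as it might come out of the OCR step.
train_texts = [
    "Invoice number 1042 total amount due in 30 days",
    "Payment received for invoice thank you",
    "This agreement is made between the parties hereto",
    "The contractor shall indemnify the client under this agreement",
]
train_labels = ["invoice", "invoice", "contract", "contract"]

model = NaiveBayesTextClassifier().fit(train_texts, train_labels)
predicted = model.predict("Amount due on the attached invoice")
print(predicted)
```

The point is the interface (text in, label out): any model with that shape, whether from AutoML or the NLP recipes, can take the place of the NER model in the pipeline.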
Alternatively, have a look at the JFK Files demo (jfk-demo.azurewebsites.net).
It uses Azure Cognitive Search together with Key Phrase Extraction (Azure Text Analytics service) to group the data.
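
The grouping idea can be sketched roughly as follows. This is not how the JFK Files demo is implemented; it is only an illustration, assuming the key phrases for each document have already been returned by the Key Phrase Extraction service. The function name, document IDs, and phrases are hypothetical.

```python
from collections import defaultdict


def group_by_key_phrase(doc_phrases):
    """Build an inverted index: key phrase -> sorted list of document IDs.

    Only phrases shared by at least two documents are kept, since a phrase
    that appears in a single document does not group anything.
    """
    index = defaultdict(set)
    for doc_id, phrases in doc_phrases.items():
        for phrase in phrases:
            # Normalize case and whitespace so "Field Office" matches "field office".
            index[phrase.strip().lower()].add(doc_id)
    return {p: sorted(ids) for p, ids in index.items() if len(ids) > 1}


# Hypothetical output of a key phrase extraction step, keyed by document ID.
doc_phrases = {
    "doc-1": ["Budget Report", "field office"],
    "doc-2": ["Field Office", "travel request"],
    "doc-3": ["budget report", "annual audit"],
}

groups = group_by_key_phrase(doc_phrases)
print(groups)
```

Documents that share a phrase land in the same group, which gives a rough, unsupervised way to organize files when you have no labeled training data.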