Build and deploy custom document processing models on Azure

Azure AI Document Intelligence
Azure AI services
Azure Logic Apps
Azure Machine Learning studio
Azure Storage

This article describes Azure solutions for building, training, deploying, and using custom document processing models. These Azure services also provide user interface (UI) capabilities for labeling or tagging text during processing.

Architecture

Diagram that shows several alternatives for a custom document processing model build and deployment process.

Download a Visio file of this architecture.

Dataflow

The following dataflow corresponds to the previous diagram:

  1. Orchestrators like Azure Logic Apps, Azure Data Factory, or Azure Functions ingest messages and attachments from email servers and files from file transfer protocol servers or web applications.

    • Functions and Logic Apps enable serverless workloads. The service that you choose depends on your preference for service capabilities like development, connectors, management, and operational context. For more information, see Compare Functions and Logic Apps.

    • Consider using Azure Data Factory to move data in bulk.

  2. The orchestrators send ingested data to Azure Blob Storage or Azure Data Lake Storage. They organize the data within these stores based on characteristics like file extensions or customer details.

  3. You can use the following Azure services, either independently or in combination, to label training documents and build custom models for various use cases.

    • Document Intelligence Studio: If you need to extract key-value pairs or create a custom table from an image or PDF, use Document Intelligence Studio to tag the data and train the custom model. If you need to identify the type of document, called document classification, before you invoke the correct extraction model, use Document Intelligence Studio to label the documents and build the classification model.

    • Language Studio: For document classification based on content, or for domain-specific entity extraction, you can train a custom text classification or named entity recognition (NER) model in Language Studio.

    • Azure Machine Learning studio: To label data for text classification or entity extraction for use with open-source frameworks like PyTorch or TensorFlow, use Machine Learning studio, the Python SDK, the Azure CLI, or the REST API. Machine Learning studio provides a model catalog of foundation models. These foundation models have fine-tuning capabilities for various tasks like text classification, question answering, and summarization. To fine-tune foundation models, use the Machine Learning studio UI or code.

    • Azure OpenAI Service: To fine-tune Azure OpenAI models on your own data or domain for various tasks like text summarization and question answering, use the Azure AI Foundry portal, the Python SDK, or the REST API.

  4. To deploy the custom models and use them for inferencing:

    • Azure AI Document Intelligence has built-in model deployment. You run inference against custom models by using the SDKs or the document models REST API. The modelId, or model name, that you specify during model creation is included in the request URL for document analysis, so Document Intelligence doesn't require any further deployment steps.

    • Language Studio provides an option to deploy custom language models. To get the REST endpoint prediction URL, select the model for deployment. You can run inference by using either the REST endpoint or the Azure SDK client libraries.

    • Machine Learning deploys custom models to online or batch Machine Learning managed endpoints. You can also use the Machine Learning SDK to deploy to Azure Kubernetes Service (AKS) as a web service. You can deploy fine-tuned foundation models from the model catalog via managed compute or a serverless API. Models deployed through managed compute can be served by managed endpoints, which include online endpoints for real-time inference and batch endpoints for batch inference.

    • Azure AI Foundry provides options to deploy fine-tuned Azure OpenAI models. You can also deploy fine-tuned Azure OpenAI models by using the Python SDK or REST API.
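Because the custom model's modelId is part of the Document Intelligence analysis request URL, inference needs no separate deployment step. The following sketch builds that URL; the endpoint, model name, path shape, and API version are illustrative assumptions (older service versions used a formrecognizer-based path), so check the current API reference before relying on them:

```python
# Hypothetical endpoint and API version, for illustration only.
ENDPOINT = "https://contoso-docintel.cognitiveservices.azure.com"
API_VERSION = "2024-11-30"


def analyze_url(endpoint: str, model_id: str, api_version: str) -> str:
    """Build a Document Intelligence analyze request URL.

    The custom model's modelId, chosen at training time, is part of
    the path, which is why no further deployment step is needed.
    """
    return (
        f"{endpoint.rstrip('/')}/documentintelligence/documentModels/"
        f"{model_id}:analyze?api-version={api_version}"
    )


print(analyze_url(ENDPOINT, "my-custom-invoice-model", API_VERSION))
```

A POST to this URL with the document body (or a source URL) starts the analysis; results are then polled from the operation location returned in the response headers.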
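For custom language models deployed from Language Studio, the REST prediction endpoint accepts a task-based JSON body. This sketch builds one for custom single-label classification; the project and deployment names are placeholders, and the body shape is an assumption based on the Language analyze-text jobs API:

```python
import json


def classification_request(project: str, deployment: str, docs: list[str]) -> str:
    """Build a request body for custom single-label classification.

    projectName and deploymentName identify the model deployed from
    Language Studio; documents carry ids so results can be matched up.
    """
    body = {
        "analysisInput": {
            "documents": [
                {"id": str(i + 1), "language": "en", "text": text}
                for i, text in enumerate(docs)
            ]
        },
        "tasks": [
            {
                "kind": "CustomSingleLabelClassification",
                "parameters": {
                    "projectName": project,
                    "deploymentName": deployment,
                },
            }
        ],
    }
    return json.dumps(body)


print(classification_request("doc-classifier", "production", ["Invoice from Contoso."]))
```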
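Machine Learning managed online endpoints score requests through a scoring script that follows an init()/run() contract: init() runs once at deployment startup to load the model, and run() is called per request. This is a minimal local sketch with a stand-in model; a real script would load the model from the deployment's model directory and match your model's input schema:

```python
import json

model = None


def init():
    """Called once when the deployment starts; load the model here.

    Stand-in classifier for illustration; a real script deserializes
    the trained model from the deployment's model directory.
    """
    global model
    model = lambda texts: [
        "positive" if "good" in t.lower() else "negative" for t in texts
    ]


def run(raw_data: str) -> str:
    """Called for each scoring request with the request body as a string."""
    texts = json.loads(raw_data)["texts"]
    return json.dumps({"predictions": model(texts)})


init()
print(run(json.dumps({"texts": ["Good service", "Slow response"]})))
# → {"predictions": ["positive", "negative"]}
```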

Components

  • Logic Apps is part of Azure Integration Services. Logic Apps creates automated workflows that integrate apps, data, services, and systems. You can use managed connectors for services like Azure Storage and Microsoft 365 to trigger workflows when a file arrives in the storage account or an email is received.

  • Azure Data Factory is a managed cloud extract, transform, and load service for data integration and transformation. Azure Data Factory pipelines can include transformation activities such as invoking a REST endpoint or running a notebook on the ingested data.

  • Functions is a serverless compute service that can host event-driven workloads that have short-lived processes.

  • Blob Storage is the object storage solution for raw files in this scenario. Blob Storage supports libraries for multiple languages, such as .NET, Node.js, and Python. Applications can access files on Blob Storage via HTTP or HTTPS. Blob Storage has hot, cool, and archive access tiers to support cost optimization for storing large amounts of data.

  • Data Lake Storage is a set of capabilities built on Blob Storage for big data analytics. Data Lake Storage maintains the cost effectiveness of Blob Storage and provides features like file-level security and file system semantics with a hierarchical namespace.

  • Document Intelligence is a component of Azure AI services. Document Intelligence has built-in document analysis capabilities for extracting printed and handwritten text, tables, and key-value pairs. Document Intelligence has prebuilt models for extracting data from invoices, documents, receipts, ID cards, and business cards. Document Intelligence also has a custom template form model and a custom neural document model that you can use to train and deploy custom models.

  • Document Intelligence Studio provides an interface to explore Document Intelligence features and models. It also enables you to build, tag, train, and deploy custom models.

  • Azure AI Language consolidates the Azure natural language processing (NLP) services. The suite provides prebuilt and customizable options.

  • Language Studio provides a UI that you can use to explore and analyze Language features. It also provides options for building, tagging, training, and deploying custom models.

  • Azure Machine Learning is a managed machine learning platform for model development and deployment at scale.

    • Machine Learning studio provides data labeling options for images and text.

    • You can export labeled data as COCO files or Machine Learning datasets and use these datasets to train and deploy models in Machine Learning notebooks.

  • Azure OpenAI provides powerful language models and multimodal models as REST APIs that you can use to perform various tasks. Specific models can be fine-tuned to improve performance on data that was missing or underrepresented when the base model was originally trained.
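A COCO export like the one mentioned for Machine Learning data labeling ties images, categories, and annotations together by id. This sketch parses a minimal, hypothetical export back into label names per image; the file names, ids, and bounding box are invented for illustration:

```python
# Minimal COCO-style export with invented values, for illustration.
coco = {
    "images": [
        {"id": 1, "file_name": "invoice_001.png", "width": 850, "height": 1100}
    ],
    "categories": [{"id": 1, "name": "signature"}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1, "bbox": [40.0, 900.0, 200.0, 60.0]}
    ],
}


def labels_per_image(doc: dict) -> dict[str, list[str]]:
    """Map each image file name to the label names annotated on it."""
    categories = {c["id"]: c["name"] for c in doc["categories"]}
    images = {i["id"]: i["file_name"] for i in doc["images"]}
    out: dict[str, list[str]] = {name: [] for name in images.values()}
    for annotation in doc["annotations"]:
        out[images[annotation["image_id"]]].append(categories[annotation["category_id"]])
    return out


print(labels_per_image(coco))
# → {'invoice_001.png': ['signature']}
```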
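Fine-tuning an Azure OpenAI model on your own data typically starts from a JSONL training file of chat-format examples, one JSON object per line. The following sketch builds such a file from an invented ticket-summarization example; treat the exact schema requirements (roles, minimum example counts) as something to verify against the current fine-tuning documentation:

```python
import json

# Invented training example in chat-format JSONL, for illustration.
examples = [
    {
        "messages": [
            {"role": "system", "content": "You summarize support tickets."},
            {"role": "user", "content": "Printer offline since Monday on floor 3."},
            {"role": "assistant", "content": "Printer connectivity outage, floor 3."},
        ]
    },
]

# One JSON object per line; every line must round-trip as valid JSON.
jsonl = "\n".join(json.dumps(example) for example in examples)
for line in jsonl.splitlines():
    assert "messages" in json.loads(line)

print(jsonl)
```

The resulting file is uploaded as the training (and optionally validation) data when you create a fine-tuning job.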

Alternatives

You can add more workflows to this scenario based on specific use cases.

Scenario details

Document processing covers a wide range of tasks. It can be difficult to meet all your document processing needs by using the prebuilt models available in Language and Document Intelligence. You might need to build custom models to automate document processing for different applications and domains.

Major challenges in model customization include:

  • Labeling or tagging text data with relevant key-value pair entities to classify text for extraction.

  • Managing training infrastructure, such as compute and storage, and their integrations.

  • Deploying models securely at scale for seamless integration with consuming applications.

Potential use cases

The following use cases can take advantage of custom models for document processing:

  • Build custom NER and text classification models based on open-source frameworks.

  • Extract custom key values from documents for various industry verticals like insurance and healthcare.

  • Tag and extract specific domain-dependent entities beyond the prebuilt NER models for domains like security or finance.

  • Create custom tables from documents.

  • Extract signatures.

  • Label and classify emails or other documents based on content.

  • Summarize documents or create custom question-and-answer models based on your data.

Considerations

These considerations implement the pillars of the Azure Well-Architected Framework, which is a set of guiding tenets that you can use to improve the quality of a workload. For more information, see Well-Architected Framework.

For this example workload, implementing each pillar depends on optimally configuring and using each component Azure service.

Reliability

Reliability helps ensure that your application can meet the commitments that you make to your customers. For more information, see Design review checklist for Reliability.

Availability

Resiliency

Security

Security provides assurances against deliberate attacks and the misuse of your valuable data and systems. For more information, see Design review checklist for Security.

Implement data protection, identity and access management, and network security recommendations for Blob Storage, AI services for Document Intelligence and Language Studio, Machine Learning, and Azure OpenAI.

Cost Optimization

Cost Optimization focuses on ways to reduce unnecessary expenses and improve operational efficiencies. For more information, see Design review checklist for Cost Optimization.

The total cost of implementing this solution depends on the pricing of the services that you choose.

The major costs for this solution include:

For more information about pricing for specific components, see the following resources:

Use the Azure pricing calculator to add the component options that you choose and estimate the overall cost of the solution.

Performance Efficiency

Performance Efficiency refers to your workload's ability to scale to meet user demands efficiently. For more information, see Design review checklist for Performance Efficiency.

Scalability

Contributors

Microsoft maintains this article. The following contributors wrote it.

Principal author:


Next steps