Choose an Azure AI targeted language processing technology

Azure AI services help developers and organizations rapidly create intelligent, cutting-edge, market-ready, and responsible applications with out-of-the-box, prebuilt, and customizable APIs and models.

This article covers Azure AI services that offer targeted language processing capabilities such as natural language processing (NLP), text analytics, language understanding, translation, and document data extraction. Azure AI Language is one of the broadest categories in Azure AI services. You can use the APIs in your workload to incorporate language features like named entity recognition, sentiment analysis, language detection, and text summarization.

Services

The following services provide targeted language processing capabilities for Azure AI services:

  • Azure AI Language provides natural language processing for text analysis.

    • Use the Azure AI Language service when you need to work with structured or unstructured documents for the wide array of language-related tasks described in this article.
    • Don't use the Language service if you need to search documents with chat, check them for content safety, or translate them.
  • Azure AI Translator is a machine translation service. It can perform real-time text translation, batch and single file document translation, and custom translations that allow you to incorporate specialized terminology or industry-specific language for your scenario. It supports many languages.

    • Use the Translator service when translation is the specific task. General-purpose foundation language models can also translate, but Translator's targeted translation models can be more reliable and more cost effective for this purpose.
    • Don't use the Translator service if you need to engage with chat, analyze content for sentiment, or moderate content. For sentiment analysis, use the Language service instead. For content moderation, use the Content Safety service.
  • Azure AI Document Intelligence is a service that can convert images directly into electronic forms. You specify the expected fields, and the service searches the images you provide to capture those fields without human intervention. The service hosts many prebuilt models and also lets you build custom form models of your own.

    • Use Document Intelligence service when you know exactly which fields you need to extract from scanned documents to fill electronic forms appropriately.
    • Use Document Intelligence to identify key structures (headers, footers, chapter breaks, and so on) in diverse document corpuses to further programmatically interact with the document, such as in a retrieval augmented generation (RAG) implementation.
    • Don't use Document Intelligence service as a real-time search API.
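
The guidance above can be sketched as a small lookup. This is an illustrative sketch only: the task labels and the `choose_service` function are hypothetical names invented for this example, not part of any Azure SDK. The mapping simply restates the recommendations in the list.

```python
from enum import Enum


class Service(Enum):
    LANGUAGE = "Azure AI Language"
    TRANSLATOR = "Azure AI Translator"
    DOCUMENT_INTELLIGENCE = "Azure AI Document Intelligence"
    CONTENT_SAFETY = "Azure AI Content Safety"


# Hypothetical task labels mapped to the service the guidance points to.
_TASK_TO_SERVICE = {
    "sentiment_analysis": Service.LANGUAGE,
    "named_entity_recognition": Service.LANGUAGE,
    "language_detection": Service.LANGUAGE,
    "summarization": Service.LANGUAGE,
    "translation": Service.TRANSLATOR,
    "form_field_extraction": Service.DOCUMENT_INTELLIGENCE,
    "document_structure_extraction": Service.DOCUMENT_INTELLIGENCE,
    "content_moderation": Service.CONTENT_SAFETY,
}


def choose_service(task: str) -> Service:
    """Return the service suggested for a task label, or raise ValueError."""
    try:
        return _TASK_TO_SERVICE[task]
    except KeyError:
        raise ValueError(f"No targeted language service listed for task: {task!r}") from None


if __name__ == "__main__":
    print(choose_service("translation").value)
```

A task that isn't in the mapping, such as searching documents with chat, raises `ValueError`, which mirrors the "don't use" guidance: those needs are outside the services covered here.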

Azure AI Language

Azure AI Language is a cloud-based service that provides Natural Language Processing (NLP) features for understanding and analyzing text. Use this service to help build intelligent applications using the web-based Language Studio, REST APIs, and client libraries.
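
As a rough illustration of the REST option, the following sketch builds (but doesn't send) a request to the Language `analyze-text` operation by using only the Python standard library. The endpoint, key, and helper name `build_analyze_text_request` are placeholders for this example, and the API version shown is an assumption; check the REST reference for the version that your resource supports. The per-document `language` property applies to kinds such as `SentimentAnalysis`, `KeyPhraseExtraction`, and `EntityRecognition`.

```python
import json
import urllib.request

# Assumption: a generally available api-version; verify against the REST reference.
API_VERSION = "2023-04-01"


def build_analyze_text_request(endpoint, key, kind, texts, language="en"):
    """Build a POST request for the Language analyze-text operation.

    `endpoint` and `key` are placeholders for your own Language resource values.
    """
    documents = [
        {"id": str(i), "language": language, "text": text}
        for i, text in enumerate(texts, start=1)
    ]
    body = {
        "kind": kind,  # for example "SentimentAnalysis" or "KeyPhraseExtraction"
        "parameters": {"modelVersion": "latest"},
        "analysisInput": {"documents": documents},
    }
    url = f"{endpoint.rstrip('/')}/language/:analyze-text?api-version={API_VERSION}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Ocp-Apim-Subscription-Key": key,
        },
        method="POST",
    )


if __name__ == "__main__":
    request = build_analyze_text_request(
        "https://<your-resource>.cognitiveservices.azure.com",
        "<your-key>",
        "SentimentAnalysis",
        ["The rooms were clean, but check-in was slow."],
    )
    print(request.full_url)
    print(request.data.decode("utf-8"))
```

To actually call the service, pass the request to `urllib.request.urlopen` with a real endpoint and key. In production code, the client libraries are usually a more convenient choice than hand-built requests.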

Capabilities

The following table provides a list of capabilities available in Azure AI Language service.

Capability | Description
Custom question answering | Finds the most appropriate answer for inputs from your users, and is commonly used to build conversational client applications, such as social media applications, chat bots, and speech-enabled desktop applications.
Custom text classification | Use to build custom AI models to classify unstructured text documents into custom classes you define.
Conversational language understanding (CLU) | Use to build custom natural language understanding models to predict the overall intention of an incoming utterance and extract important information from it.
Entity linking | Disambiguates the identity of entities (words or phrases) found in unstructured text and returns links to Wikipedia.
Language detection | Detects the language a document is written in, and returns a language code for a wide range of languages, variants, dialects, and some regional/cultural languages.
Key phrase extraction | Evaluates the main concepts in unstructured text and returns them as a list.
Named entity recognition (NER) | Categorizes entities (words or phrases) in unstructured text across several predefined category groups. For example: people, events, places, dates, and more.
Orchestration workflow | Use to connect Conversational Language Understanding (CLU) and question answering applications.
Personally identifiable information (PII) and protected health information (PHI) detection | Identifies, categorizes, and redacts sensitive information in both unstructured text documents and conversation transcripts. For example: phone numbers, email addresses, forms of identification, and more.
Sentiment analysis and opinion mining | Helps you find out what people think of your brand or topic by mining text for clues about positive or negative sentiment, and can associate them with specific aspects of the text.
Summarization | Uses extractive text summarization to produce a summary of documents and conversation transcriptions. It extracts sentences that collectively represent the most important or relevant information within the original content.
Text analysis for health | Extracts and labels relevant medical information from unstructured texts such as doctor's notes, discharge summaries, clinical documents, and electronic health records. When designing your workload, evaluate the processing location and data residency of this cloud-hosted feature to ensure it aligns with your compliance expectations. Some workloads might be restricted in their capacity to send healthcare data to a cloud-hosted platform. You can use this API as a Docker container to host in your own compute in the cloud or on-premises, which might help address compliance concerns involving PaaS. For more information, see Use Text Analytics for health containers.

Use cases

The following table provides a list of possible use cases for Azure AI Language service.

Use case | Customizable*
Predict the intention of user inputs and extract information from them. | Yes
Identify and/or redact sensitive information such as PII. | -
Identify the language that a text was written in. | -
Extract medical information from clinical/medical documents, without building a model. | -
Extract medical information from clinical/medical documents using a model that's trained on your data. | Yes
Extract categories of information without creating a custom model. | -
Extract categories of information using a model specific to your data. | Yes
Extract main topics and important phrases. | -
Summarize a document. | -
Classify text by using sentiment analysis. | Yes
Classify text by using custom classes. | Yes
Classify items into categories provided at inference time. | -
Link an entity with knowledge base articles. | -
Understand questions and answers (generic). | Yes
Build a conversational application that responds to user inputs. | -
Connect apps from conversational language understanding and question answering. | Yes

*If a feature is customizable, you can train an AI model using our tools to fit your data specifically. Otherwise a feature is preconfigured, meaning the AI models it uses cannot be changed. You just send your data, and use the feature's output in your applications.

Azure AI Translator

Azure AI Translator is a machine translation service that is part of the Azure AI services family. Translator powers many Microsoft products and services.

Capabilities

The following table provides a list of capabilities available in Azure AI Translator service.

Capability | Description
Text Translation | Translates text between supported source and target languages in real time. You can also supply a dynamic dictionary and mark content that shouldn't be translated by using the Translator API.
Document Translation | Asynchronous batch translation: Translate batch and complex files while preserving the structure and format of the original documents. The batch translation process requires an Azure Blob storage account with containers for your source and translated documents. Synchronous single file translation: Translate a single document file alone or with a glossary file while preserving the structure and format of the original document. The file translation process doesn't require an Azure Blob storage account. The final response contains the translated document and is returned directly to the calling client.
Custom Translator | Build customized models to translate domain- and industry-specific language, terminology, and style. Create a dictionary (phrase or sentence) for custom translations.
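
The text translation capability can be sketched as follows. The example builds (but doesn't send) a request to the Translator v3.0 `translate` operation and shows the inline markup for a dynamic dictionary entry and for content that shouldn't be translated. The helper names are hypothetical, the key and region are placeholders, and the markup helpers assume that the inputs contain no quotes or markup of their own. The `notranslate` markup is honored when the request sets `textType=html`.

```python
import json
import urllib.parse
import urllib.request

TRANSLATE_URL = "https://api.cognitive.microsofttranslator.com/translate"


def with_dictionary(phrase, translation):
    """Wrap a phrase in dynamic dictionary markup so it is translated as given."""
    return f'<mstrans:dictionary translation="{translation}">{phrase}</mstrans:dictionary>'


def no_translate(text):
    """Mark text as not to be translated (requires text_type='html' on the request)."""
    return f'<span class="notranslate">{text}</span>'


def build_translate_request(key, region, texts, to, source=None, text_type=None):
    """Build a POST request for the Translator v3.0 translate operation."""
    params = [("api-version", "3.0")]
    if source:
        params.append(("from", source))  # omit to let the service detect the language
    params.extend(("to", language) for language in to)
    if text_type:
        params.append(("textType", text_type))
    url = f"{TRANSLATE_URL}?{urllib.parse.urlencode(params)}"
    body = [{"Text": text} for text in texts]
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Ocp-Apim-Subscription-Key": key,
            # Needed for regional and multi-service resources.
            "Ocp-Apim-Subscription-Region": region,
        },
        method="POST",
    )


if __name__ == "__main__":
    sentence = f"The word {with_dictionary('wordomatic', 'wordomatic')} is a dictionary entry."
    request = build_translate_request(
        "<your-key>", "<your-region>", [sentence], to=["de", "fr"], source="en"
    )
    print(request.full_url)
    print(request.data.decode("utf-8"))
```

Repeating the `to` parameter requests several target languages in one call. Batch document translation uses a different, asynchronous operation and isn't covered by this sketch.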

Use cases

The following table provides a list of possible use cases for Azure AI Translator service.

Use case | Documentation
Translate industry-specific text. | AI Services Custom Translator
Translate generic text that isn't specific to an industry. | What is Azure Text Translation

Azure AI Document Intelligence

Azure AI Document Intelligence is a cloud-based service that uses optical character recognition (OCR) and machine learning models to extract text, key-value pairs, tables, and structure from documents and images. Use prebuilt models for common document types, or build custom models for your own forms, by using the web-based Document Intelligence Studio, REST APIs, and client libraries.

Capabilities

The following table provides a list of some of the capabilities available in Azure AI Document Intelligence.

Capability | Description
Business card extraction | The Document Intelligence business card model combines Optical Character Recognition (OCR) capabilities with deep learning models to analyze and extract data from business card images. The API analyzes printed business cards; extracts key information such as first name, surname, company name, email address, and phone number; and returns a structured JSON data representation.
Contract model extraction | The Document Intelligence contract model uses Optical Character Recognition (OCR) capabilities to analyze and extract key fields and line items from a select group of important contract entities. Contracts can be of various formats and quality including phone-captured images, scanned documents, and digital PDFs. The API analyzes document text; extracts key information such as Parties, Jurisdictions, Contract ID, and Title; and returns a structured JSON data representation. The model currently supports English-language document formats.
Credit card extraction | The Document Intelligence credit/debit card model uses Optical Character Recognition (OCR) capabilities to analyze and extract key fields from credit and debit cards. Credit cards and debit cards can be of various formats and quality including phone-captured images, scanned documents, and digital PDFs. The API analyzes document text; extracts key information such as Card Number, Issuing Bank, and Expiration Date; and returns a structured JSON data representation. The model currently supports English-language document formats.
Health insurance card extraction | The Document Intelligence health insurance card model combines Optical Character Recognition (OCR) capabilities with deep learning models to analyze and extract key information from US health insurance cards. A health insurance card is a key document for care processing and can be digitally analyzed for patient onboarding, financial coverage information, cashless payments, and insurance claim processing. The health insurance card model analyzes health card images; extracts key information such as insurer, member, prescription, and group number; and returns a structured JSON representation. Health insurance cards can be presented in various formats and quality including phone-captured images, scanned documents, and digital PDFs.
US tax document extraction | The Document Intelligence US tax document models use Optical Character Recognition (OCR) capabilities to analyze and extract key fields and line items from a select group of tax documents. Tax documents can be of various formats and quality including phone-captured images, scanned documents, and digital PDFs. The API analyzes document text; extracts key information such as customer name, billing address, due date, and amount due; and returns a structured JSON data representation. The models currently support certain English tax document formats.
Many more... | Azure AI Document Intelligence supports a wide variety of models that enable you to add intelligent document processing to your apps and flows. You can use a prebuilt domain-specific model or train a custom model tailored to your specific business need and use cases. Document Intelligence can be used with the REST API or Python, C#, Java, and JavaScript client libraries.

To learn more about how to choose a model that works for your scenario, see Which model should I choose?
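
The prebuilt models described above return structured JSON. The following sketch shows one way to pull field values out of such a result while ignoring low-confidence fields. The sample response is hand-written for illustration and abbreviated; the field names follow the credit card description above, but treat the exact names, the nesting under `analyzeResult`, and the `extract_fields` helper as assumptions to verify against the REST reference for the API version that you use.

```python
# Hand-written, abbreviated sample shaped like an analyze result (illustrative values only).
SAMPLE_RESPONSE = {
    "status": "succeeded",
    "analyzeResult": {
        "documents": [
            {
                "docType": "creditCard",
                "fields": {
                    "CardNumber": {"type": "string", "content": "4000 0000 0000 0000", "confidence": 0.99},
                    "IssuingBank": {"type": "string", "content": "Contoso Bank", "confidence": 0.95},
                    "ExpirationDate": {"type": "date", "content": "01/30", "confidence": 0.31},
                },
            }
        ]
    },
}


def extract_fields(result, min_confidence=0.5):
    """Return {field name: extracted text} for the first analyzed document.

    Fields below `min_confidence`, or without a `content` value, are skipped
    so that a person can review them instead.
    """
    documents = result.get("analyzeResult", {}).get("documents", [])
    if not documents:
        return {}
    extracted = {}
    for name, field in documents[0].get("fields", {}).items():
        if "content" in field and field.get("confidence", 0.0) >= min_confidence:
            extracted[name] = field["content"]
    return extracted


if __name__ == "__main__":
    for name, value in extract_fields(SAMPLE_RESPONSE).items():
        print(f"{name}: {value}")
```

Filtering on confidence is a design choice for this sketch: it keeps unattended form filling conservative and routes uncertain fields, such as the expiration date in the sample, to manual review.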

Next steps