What is document and conversation summarization (preview)?

Important

  • To use conversation summarization, you must submit an online request and have it approved.
  • Conversation summarization is only available through Language resources in the following regions:
    • North Europe
    • East US
    • UK South
  • Conversation summarization is only available using:
    • REST API
    • Python
    • C#

Summarization is one of the features offered by Azure Cognitive Service for Language, a collection of machine learning and AI algorithms in the cloud for developing intelligent applications that involve written language. Use this article to learn more about this feature, and how to use it in your applications.

This documentation contains the following article types:

  • Quickstarts are getting-started instructions to guide you through making requests to the service.
  • How-to guides contain instructions for using the service in more specific or customized ways.

Text summarization is a broad topic, consisting of several approaches to represent relevant information in text. The document summarization feature described in this documentation enables you to use extractive text summarization to produce a summary of a document. It extracts sentences that collectively represent the most important or relevant information within the original content. This feature is designed to shorten content that could be considered too long to read. For example, it can condense articles, papers, or documents to key sentences.

As an example, consider the following paragraph of text:

"We’re delighted to announce that Cognitive Service for Language service now supports extractive summarization! In general, there are two approaches for automatic document summarization: extractive and abstractive. This feature provides extractive summarization. Document summarization is a feature that produces a text summary by extracting sentences that collectively represent the most important or relevant information within the original content. This feature is designed to shorten content that could be considered too long to read. Extractive summarization condenses articles, papers, or documents to key sentences."

The document summarization feature would simplify the text into the following key sentences:

[Image: A simple example of the document summarization feature.]

Key features

Document summarization supports the following features:

  • Extracted sentences: These sentences collectively convey the main idea of the document. They’re original sentences extracted from the input document’s content.
  • Rank score: The rank score indicates how relevant a sentence is to a document's main topic. Document summarization ranks extracted sentences, and you can determine whether they're returned in the order they appear, or according to their rank.
  • Maximum sentences: Set the maximum number of sentences to return. For example, if you request a three-sentence summary, document summarization returns the three highest-scored sentences.
  • Positional information: The start position and length of extracted sentences.
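
The way these features interact can be sketched locally, without calling the service. The following minimal Python example is an illustration only: `ExtractedSentence`, `make_sentence`, `select_summary`, and `locate` are hypothetical names rather than SDK types, the rank scores are made-up values, and offsets are assumed to be plain Python string indexes (the service may count positions differently).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ExtractedSentence:
    """Hypothetical container for one extracted sentence (not an SDK type)."""
    text: str
    rank_score: float  # relevance to the document's main topic
    offset: int        # start position in the source document
    length: int        # number of characters in the sentence


def make_sentence(document: str, text: str, rank_score: float) -> ExtractedSentence:
    """Build mock data: find the sentence in the document to get its position."""
    return ExtractedSentence(text, rank_score, document.find(text), len(text))


def select_summary(sentences: List[ExtractedSentence],
                   max_sentences: int = 3,
                   order_by: str = "offset") -> List[ExtractedSentence]:
    """Keep the highest-ranked sentences, then order them for display."""
    if max_sentences < 1:
        raise ValueError("max_sentences must be at least 1")
    # Maximum sentences: keep only the top-scored sentences.
    top = sorted(sentences, key=lambda s: s.rank_score, reverse=True)[:max_sentences]
    if order_by == "rank":
        return top
    if order_by == "offset":
        # Order of appearance in the original document.
        return sorted(top, key=lambda s: s.offset)
    raise ValueError("order_by must be 'offset' or 'rank'")


def locate(document: str, sentence: ExtractedSentence) -> str:
    """Positional information: recover the sentence from the source text."""
    return document[sentence.offset:sentence.offset + sentence.length]


document = "Summaries save time. Lunch was fine. Extraction keeps original sentences."
sentences = [
    make_sentence(document, "Summaries save time.", 0.6),        # made-up score
    make_sentence(document, "Lunch was fine.", 0.1),             # made-up score
    make_sentence(document, "Extraction keeps original sentences.", 0.9),
]

for s in select_summary(sentences, max_sentences=2, order_by="rank"):
    print(f"{s.rank_score:.1f}  {locate(document, s)}")
```

Ordering by rank puts the most relevant sentence first, while ordering by offset keeps the summary in reading order, which is usually easier to follow when the sentences are shown together.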

Get started with summarization

To use this feature, you submit raw unstructured text for analysis and handle the API output in your application. Analysis is performed as-is, with no additional customization to the model used on your data. There are two ways to use summarization:

Development option | Description | Links
Language Studio | A web-based platform that enables you to try document summarization without needing to write code. | Language Studio website; Quickstart: Use Language Studio
REST API or Client library (Azure SDK) | Integrate document summarization into your applications using the REST API, or the client library available in a variety of languages. | Quickstart: Use document summarization
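
For the REST API route, submitting text amounts to wrapping your documents in a JSON request body. The following Python sketch only builds such a body and sends nothing. The function name `build_extractive_request` is hypothetical, and the field names (`analysisInput`, `tasks`, `kind`, `sentenceCount`, `sortBy`) are assumptions modeled on one preview version of the API; this article doesn't specify them, so check the REST API documentation for the version you target.

```python
import json
from typing import Dict, List


def build_extractive_request(texts: List[str],
                             sentence_count: int = 3,
                             sort_by: str = "Offset",
                             language: str = "en") -> Dict:
    """Build a request body for extractive summarization (assumed schema)."""
    if sort_by not in ("Offset", "Rank"):
        raise ValueError("sort_by must be 'Offset' or 'Rank'")
    # Each document gets a string ID so results can be matched back to inputs.
    documents = [
        {"id": str(i), "language": language, "text": text}
        for i, text in enumerate(texts, start=1)
    ]
    return {
        "analysisInput": {"documents": documents},
        "tasks": [
            {
                "kind": "ExtractiveSummarization",  # assumed task name
                "parameters": {
                    "sentenceCount": sentence_count,  # maximum sentences
                    "sortBy": sort_by,                # document order or rank
                },
            }
        ],
    }


body = build_extractive_request(["Raw, unstructured text to summarize."],
                                sentence_count=2)
print(json.dumps(body, indent=2))
```

Keeping payload construction in its own function lets you inspect the request before sending it with the HTTP client of your choice.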

Input requirements and service limits

  • Summarization takes raw unstructured text for analysis. See Data and service limits in the how-to guide for more information.
  • Summarization works with a variety of written languages. See language support for more information.

Reference documentation and code samples

As you use document summarization in your applications, see the following reference documentation and samples for Azure Cognitive Service for Language:

Development option / language | Reference documentation | Samples
REST API | REST API documentation |
C# | C# documentation | C# samples
Java | Java documentation | Java samples
JavaScript | JavaScript documentation | JavaScript samples
Python | Python documentation | Python samples

Responsible AI

An AI system includes not only the technology, but also the people who will use it, the people who will be affected by it, and the environment in which it’s deployed. Read the transparency note for summarization to learn about responsible AI use and deployment in your systems. You can also see the following articles for more information: