What is document and conversation summarization?

Article
01/31/2024

Important

Our preview region, Sweden Central, showcases our latest and continually evolving LLM fine tuning techniques based on GPT models. You are welcome to try them out with a Language resource in the Sweden Central region.

Conversation summarization is only available using:

REST API
Python
C#

Summarization is one of the features offered by Azure AI Language, a collection of machine learning and AI algorithms in the cloud for developing intelligent applications that involve written language. Use this article to learn more about this feature, and how to use it in your applications.

Though the services are labeled document and conversation summarization, document summarization only accepts plain text blocks, and conversation summarization accept various speech artifacts in order for the model to learn more. If you want to process a conversation but only care about text, you can use document summarization for that scenario.

Document summarization
Conversation summarization

This documentation contains the following article types:

Quickstarts are getting-started instructions to guide you through making requests to the service.
How-to guides contain instructions for using the service in more specific or customized ways.

Document summarization uses natural language processing techniques to generate a summary for documents. There are three supported API approaches to automatic summarization: extractive, abstractive and query-focused.

Extractive summarization extracts sentences that collectively represent the most important or relevant information within the original content. Abstractive summarization generates a summary with concise, coherent sentences or words that aren't verbatim extract sentences from the original document. These features are designed to shorten content that could be considered too long to read.

Native document support

A native document refers to the file format used to create the original document such as Microsoft Word (docx) or a portable document file (pdf). Native document support eliminates the need for text preprocessing prior to using Azure AI Language resource capabilities. Currently, native document support is available for both AbstractiveSummarization and ExtractiveSummarization capabilities.

Currently Document Summarization supports the following native document formats:

File type	File extension	Description
Text	`.txt`	An unformatted text document.
Adobe PDF	`.pdf`	A portable document file formatted document.
Microsoft Word	`.docx`	A Microsoft Word document file.

For more information, see Use native documents for language processing

Key features

There are the aspects of document summarization this API provides:

Extractive summarization: Produces a summary by extracting salient sentences within the document.
- Multiple extracted sentences: These sentences collectively convey the main idea of the document. They're original sentences extracted from the input document's content.
- Rank score: The rank score indicates how relevant a sentence is to a document's main topic. Document summarization ranks extracted sentences, and you can determine whether they're returned in the order they appear, or according to their rank.
- Multiple returned sentences: Determine the maximum number of sentences to be returned. For example, if you request a three-sentence summary extractive summarization returns the three highest scored sentences.
- Positional information: The start position and length of extracted sentences.
Abstractive summarization: Generates a summary that doesn't use the same words as in the document, but captures the main idea.
- Summary texts: Abstractive summarization returns a summary for each contextual input range within the document. A long document can be segmented so multiple groups of summary texts can be returned with their contextual input range.
- Contextual input range: The range within the input document that was used to generate the summary text.
Query-focused summarization: Generates a summary based on a query

As an example, consider the following paragraph of text:

"At Microsoft, we are on a quest to advance AI beyond existing techniques, by taking a more holistic, human-centric approach to learning and understanding. As Chief Technology Officer of Azure AI services, I have been working with a team of amazing scientists and engineers to turn this quest into a reality. In my role, I enjoy a unique perspective in viewing the relationship among three attributes of human cognition: monolingual text (X), audio or visual sensory signals, (Y) and multilingual (Z). At the intersection of all three, there's magic—what we call XYZ-code as illustrated in Figure 1—a joint representation to create more powerful AI that can speak, hear, see, and understand humans better. We believe XYZ-code enables us to fulfill our long-term vision: cross-domain transfer learning, spanning modalities and languages. The goal is to have pretrained models that can jointly learn representations to support a broad range of downstream AI tasks, much in the way humans do today. Over the past five years, we achieve human performance on benchmarks in conversational speech recognition, machine translation, conversational question answering, machine reading comprehension, and image captioning. These five breakthroughs provided us with strong signals toward our more ambitious aspiration to produce a leap in AI capabilities, achieving multi-sensory and multilingual learning that is closer in line with how humans learn and understand. I believe the joint XYZ-code is a foundational component of this aspiration, if grounded with external knowledge sources in the downstream AI tasks."

The document summarization API request is processed upon receipt of the request by creating a job for the API backend. If the job succeeded, the output of the API is returned. The output is available for retrieval for 24 hours. After this time, the output is purged. Due to multilingual and emoji support, the response can contain text offsets. For more information, see how to process offsets.

If we use the above example, the API might return these summarized sentences:

Extractive summarization:

"At Microsoft, we are on a quest to advance AI beyond existing techniques, by taking a more holistic, human-centric approach to learning and understanding."
"We believe XYZ-code enables us to fulfill our long-term vision: cross-domain transfer learning, spanning modalities and languages."
"The goal is to have pretrained models that can jointly learn representations to support a broad range of downstream AI tasks, much in the way humans do today."

Abstractive summarization:

"Microsoft is taking a more holistic, human-centric approach to learning and understanding. We believe XYZ-code enables us to fulfill our long-term vision: cross-domain transfer learning, spanning modalities and languages. Over the past five years, we achieved human performance on benchmarks in conversational speech recognition."

Important

Conversation summarization is only available in English.

This documentation contains the following article types:

Quickstarts are getting-started instructions to guide you through making requests to the service.
How-to guides contain instructions for using the service in more specific or customized ways.

Key features

Conversation summarization supports the following features:

Issue/resolution summarization: A call center specific feature that gives a summary of issues and resolutions in conversations between customer-service agents and your customers.
Chapter title summarization: Segments a conversation into chapters based on the topics discussed in the conversation, and gives suggested chapter titles of the input conversation.
Recap: Summarizes a conversation into a brief paragraph.
Narrative summarization: Generates detail call notes, meeting notes or chat summaries of the input conversation.
Follow-up tasks: Gives a list of follow-up tasks discussed in the input conversation.

When to use issue and resolution summarization

When there are aspects of an "issue" and "resolution" such as:
- The reason for a service chat/call (the issue).
- That resolution for the issue.
You only want a summary that focuses on related information about issues and resolutions.
When there are two participants in the conversation, and you want to summarize what each had said.

As an example, consider the following example conversation:

Agent: "Hello, you're chatting with Rene. How may I help you?"

Customer: "Hi, I tried to set up wifi connection for Smart Brew 300 espresso machine, but it didn't work."

Agent: "I'm sorry to hear that. Let's see what we can do to fix this issue. Could you push the wifi connection button, hold for 3 seconds, then let me know if the power light is slowly blinking?"

Customer: "Yes, I pushed the wifi connection button, and now the power light is slowly blinking."

Agent: "Great. Thank you! Now, please check in your Contoso Coffee app. Does it prompt to ask you to connect with the machine?"

Customer: "No. Nothing happened."

Agent: "I see. Thanks. Let's try if a factory reset can solve the issue. Could you please press and hold the center button for 5 seconds to start the factory reset."

Customer: "I've tried the factory reset and followed the above steps again, but it still didn't work."

Agent: "I'm very sorry to hear that. Let me see if there's another way to fix the issue. Please hold on for a minute."

Conversation summarization feature would simplify the text as follows:

Example summary	Format	Conversation aspect
Customer wants to use the wifi connection on their Smart Brew 300. But it didn't work.	One or two sentences	issue
Checked if the power light is blinking slowly. Checked the Contoso coffee app. It had no prompt. Tried to do a factory reset.	One or more sentences, generated from multiple lines of the transcript.	resolution

Get started with summarization

To use summarization, you submit for analysis and handle the API output in your application. Analysis is performed as-is, with no added customization to the model used on your data. There are two ways to use summarization:

Document summarization
Conversation summarization

Development option	Description
Language studio	Language Studio is a web-based platform that lets you try entity linking with text examples without an Azure account, and your own data when you sign up. For more information, see the Language Studio website or language studio quickstart.
REST API or Client library (Azure SDK)	Integrate document summarization into your applications using the REST API, or the client library available in various languages. For more information, see the summarization quickstart.

Development option	Description	Links
REST API	Integrate conversation summarization into your applications using the REST API.	Quickstart: Use conversation summarization

Custom Summarization enables users to build custom AI models to summarize unstructured text, such as contracts or novels. By creating a Custom Summarization project, developers can iteratively label data, train, evaluate, and improve model performance before making it available for consumption. The quality of the labeled data greatly impacts model performance. To simplify building and customizing your model, the service offers a custom web portal that can be accessed through the Language studio. You can easily get started with the service by following the steps in this quickstart.

Summarization takes text for analysis. For more information, see Data and service limits in the how-to guide.
Summarization works with various written languages. For more information, see language support.

Reference documentation and code samples

As you use document summarization in your applications, see the following reference documentation and samples for Azure AI Language:

Development option / language	Reference documentation	Samples
C#	C# documentation	C# samples
Java	Java documentation	Java Samples
JavaScript	JavaScript documentation	JavaScript samples
Python	Python documentation	Python samples

Responsible AI

An AI system includes not only the technology, but also the people who use it, the people affected by it, and the deployment environment. Read the transparency note for summarization to learn about responsible AI use and deployment in your systems. For more information, see the following articles:

What is document and conversation summarization?

Native document support

Key features

Key features

When to use issue and resolution summarization

Get started with summarization

Input requirements and service limits

Reference documentation and code samples

Responsible AI

Feedback

Additional resources