The ai.summarize function summarizes text from one column or across all columns in each row.
Note
- This article covers ai.summarize with pandas. For PySpark, see Use ai.summarize with PySpark.
- For all AI Functions and prerequisites, see AI Functions overview.
- Change default configuration for AI Functions with pandas.
Overview
The ai.summarize function extends the pandas Series class. Call it on a text column of a pandas DataFrame to summarize each row's value in that column alone. You can also call it on an entire DataFrame to summarize each row's values across all columns.
The function returns a pandas Series that contains summaries, which can be stored in a new DataFrame column.
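The accessor pattern described above can be illustrated with the public pandas extension API. This is a minimal sketch, not the AI Functions implementation: the demo_ai namespace and its first-sentence summarizer are hypothetical stand-ins, chosen so the snippet runs without an AI endpoint.

```python
import pandas as pd


def _first_sentence(text: str) -> str:
    # Stand-in for a model call (assumption): keep only the first sentence.
    cleaned = " ".join(text.split())
    head, sep, _ = cleaned.partition(". ")
    return head + "." if sep else cleaned


@pd.api.extensions.register_series_accessor("demo_ai")
class DemoSeriesAccessor:
    """Hypothetical accessor: adds .demo_ai.summarize() to every Series."""

    def __init__(self, series: pd.Series):
        self._series = series

    def summarize(self) -> pd.Series:
        # One result per row, aligned with the input index.
        return self._series.map(_first_sentence, na_action="ignore")


@pd.api.extensions.register_dataframe_accessor("demo_ai")
class DemoFrameAccessor:
    """Hypothetical accessor: adds .demo_ai.summarize() to every DataFrame."""

    def __init__(self, frame: pd.DataFrame):
        self._frame = frame

    def summarize(self) -> pd.Series:
        # Join all columns of each row into one text, then summarize that text.
        joined = self._frame.astype(str).agg(" | ".join, axis=1)
        return joined.map(_first_sentence)


demo = pd.DataFrame({
    "product": ["Fabric", "Teams"],
    "description": ["Unifies analytics. Runs as SaaS.", "Chat and meetings. In one place."],
})
per_column = demo["description"].demo_ai.summarize()  # one column only
per_row = demo.demo_ai.summarize()                    # across all columns
```

Both calls return a Series with the same index as the input, which is why the result can be assigned directly to a new column.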
Syntax
df["summaries"] = df["text"].ai.summarize()
Parameters
| Name | Description |
|---|---|
| instructions (Optional) | A string that provides more context for the AI model, such as output length, tone, audience, or focus. More precise instructions produce better results. |
Returns
The function returns a pandas Series that contains summaries for each input text row. If the input text is null, the result is null.
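Because null inputs produce null results, it can be useful to count them before summarizing. The following sketch uses only pandas. The sample data and the placeholder function are illustrative assumptions and are not part of AI Functions.

```python
import pandas as pd

reviews = pd.DataFrame(
    {"text": ["Battery lasts all day.", None, "Screen cracked in a week."]}
)

# Rows with null text are the rows that will have null summaries.
null_rows = reviews["text"].isna()
print(f"{null_rows.sum()} of {len(reviews)} rows have no text to summarize")


def placeholder(text: str) -> str:
    # Stand-in for the summarizer (assumption), used only to show null pass-through.
    return text.upper()


# na_action="ignore" skips null values, so they stay null in the result.
result = reviews["text"].map(placeholder, na_action="ignore")
```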
Example
# This code uses AI. Always review output for mistakes.
import pandas as pd

df = pd.DataFrame([
    ("Microsoft Teams", "2017",
     """
     The ultimate messaging app for your organization—a workspace for real-time
     collaboration and communication, meetings, file and app sharing, and even the
     occasional emoji! All in one place, all in the open, all accessible to everyone.
     """),
    ("Microsoft Fabric", "2023",
     """
     An enterprise-ready, end-to-end analytics platform that unifies data movement,
     data processing, ingestion, transformation, and report building into a seamless,
     user-friendly SaaS experience. Transform raw data into actionable insights.
     """)
], columns=["product", "release_year", "description"])
df["summaries"] = df["description"].ai.summarize()
display(df)
Customize summaries with instructions
Use the instructions parameter to control the tone, length, audience, or focus of generated summaries without changing the source text.
# This code uses AI. Always review output for mistakes.
df["executive_summary"] = df["description"].ai.summarize(
    instructions="Write one concise sentence for a business executive. Focus on product value and avoid marketing language."
)
display(df)
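When several summary styles are needed, composing the instructions string from separate requirements keeps the wording consistent. This is a minimal sketch: build_instructions is a hypothetical helper, not part of AI Functions, and the call to ai.summarize is left as a comment because it needs a Fabric notebook.

```python
def build_instructions(audience: str, max_sentences: int, focus: str,
                       tone: str = "neutral") -> str:
    """Compose one instructions string from separate requirements (hypothetical helper)."""
    sentence_word = "sentence" if max_sentences == 1 else "sentences"
    parts = [
        f"Write at most {max_sentences} {sentence_word} for {audience}.",
        f"Focus on {focus}.",
        f"Use a {tone} tone.",
    ]
    return " ".join(parts)


executive = build_instructions("a business executive", 1, "product value")
print(executive)

# In a Fabric notebook, the string would then be passed as:
# df["executive_summary"] = df["description"].ai.summarize(instructions=executive)
```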
Multimodal input
To summarize images, PDFs, or text files, set column_type="path" when the input column contains file path strings. For setup, see Use multimodal input with AI Functions.
# This code uses AI. Always review output for mistakes.
custom_df["summary"] = custom_df["file_path"].ai.summarize(
    instructions="Summarize this file in one sentence for a support analyst.",
    column_type="path",
)
display(custom_df)
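The example above assumes that custom_df already has a file_path column of path strings. One way to build such a column is sketched below. The collect_paths helper, the extension filter, and the temporary demo folder are assumptions for illustration. Check the multimodal setup article for the path format and file types that AI Functions accepts.

```python
import tempfile
from pathlib import Path

import pandas as pd


def collect_paths(folder: str,
                  extensions: tuple = (".pdf", ".png", ".txt")) -> pd.DataFrame:
    """Return a DataFrame with one file path string per matching file (hypothetical helper)."""
    files = sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in extensions
    )
    return pd.DataFrame({"file_path": [str(p) for p in files]})


# Demo with a throwaway folder so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("notes.txt", "invoice.pdf", "skip.csv"):
        (Path(tmp) / name).write_text("placeholder")
    custom_df = collect_paths(tmp)

file_names = [Path(p).name for p in custom_df["file_path"]]
print(file_names)
```

The resulting file_path column holds plain strings, which is the input shape that column_type="path" expects according to the text above.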
Related content
- Use ai.summarize with PySpark.
- Learn more about AI Functions.
- Use multimodal input with AI Functions.
- Change default configuration for AI Functions with pandas.
- Understand billing for AI Functions.