Use ai.summarize with PySpark

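The ai.summarize function uses generative AI to produce summaries of input text with a single line of code. The function can summarize either the values from one column of a DataFrame or the values across all of its columns.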

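Note

This article covers using ai.summarize with PySpark. To use ai.summarize with pandas, see this article.
See other AI functions in this overview article.
Learn how to customize the configuration of AI functions.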

Overview

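The ai.summarize function is also available for Spark DataFrames. If you specify the name of an existing input column as a parameter, the function summarizes each value from that column alone. Otherwise, the function summarizes the values across all columns of the DataFrame, row by row.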

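The function returns a new DataFrame in which an output column stores the summary for each input row, taken either from a single column or from all columns.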

Syntax

Summarize values from a single column:

df.ai.summarize(input_col="text", output_col="summaries")

Summarize values across all columns:

df.ai.summarize(output_col="summaries")

Parameters

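Name	Description
`input_col` Optional	A string that contains the name of an existing column with input text values to summarize. If you don't set this parameter, the function summarizes values across all columns in the DataFrame instead of values from a specific column.
`instructions` Optional	A string that contains more context for the AI model, such as the desired output length or tone. More precise instructions yield better results.
`error_col` Optional	A string that contains the name of a new column to store any OpenAI errors that result from processing each input text row. If you don't set this parameter, a default name is generated for the error column. If an input row has no errors, the value in this column is `null`.
`output_col` Optional	A string that contains the name of a new column to store the summary for each input text row. If you don't set this parameter, a default name is generated for the output column.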
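As an illustration of how the optional parameters combine, the following is a minimal sketch rather than part of the AI functions library. The helper name `summarize_with_options` is hypothetical; it only forwards the parameters listed in the table above to `df.ai.summarize` and leaves out any that are not set, so the function's own defaults apply.

```python
def summarize_with_options(df, output_col="summaries", input_col=None,
                           instructions=None, error_col=None):
    """Hypothetical helper: call df.ai.summarize with only the parameters that are set.

    `df` is assumed to be a Spark DataFrame in an environment where the
    `ai` accessor is available (for example, a Fabric notebook).
    """
    kwargs = {"output_col": output_col}
    if input_col is not None:
        # Summarize a single column instead of all columns, row by row.
        kwargs["input_col"] = input_col
    if instructions is not None:
        # Extra context for the model, such as output length or tone.
        kwargs["instructions"] = instructions
    if error_col is not None:
        # Name of the column that stores per-row errors.
        kwargs["error_col"] = error_col
    return df.ai.summarize(**kwargs)
```

In a notebook, a call might look like `summarize_with_options(df, input_col="description", instructions="One sentence, neutral tone.", error_col="errors")`, where the instruction text and the `errors` column name are example values.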

Returns

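The function returns a Spark DataFrame that includes a new column containing the summarized text for each input text row. If the input text is null, the result is null. If no input column is specified, the function summarizes values across all columns in the DataFrame.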
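Because errors are reported per row in the error column, it can be useful to separate rows that were processed cleanly from rows that failed. The sketch below is one possible approach, not an official pattern: `split_by_error` is a hypothetical helper, and it assumes the error column was named `errors` through the `error_col` parameter. It relies only on `DataFrame.filter` with a SQL expression string.

```python
def split_by_error(result, error_col="errors"):
    """Hypothetical helper: split a summarized DataFrame by error status.

    Returns a (succeeded, failed) pair. A row counts as failed when its
    value in `error_col` is not null.
    """
    quoted = f"`{error_col}`"  # backticks guard column names with special characters
    succeeded = result.filter(f"{quoted} IS NULL")
    failed = result.filter(f"{quoted} IS NOT NULL")
    return succeeded, failed
```

A null summary does not by itself indicate a failure, since null input text also produces a null result; checking the error column is what distinguishes the two cases.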

Example

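The two examples that follow cover both modes: the first summarizes values from a single column, and the second summarizes values across all columns.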

# This code uses AI. Always review output for mistakes.

df = spark.createDataFrame([
        ("Microsoft Teams", "2017",
        """
        The ultimate messaging app for your organization—a workspace for real-time 
        collaboration and communication, meetings, file and app sharing, and even the 
        occasional emoji! All in one place, all in the open, all accessible to everyone.
        """,),
        ("Microsoft Fabric", "2023",
        """
        An enterprise-ready, end-to-end analytics platform that unifies data movement, 
        data processing, ingestion, transformation, and report building into a seamless, 
        user-friendly SaaS experience. Transform raw data into actionable insights.
        """,)
    ], ["product", "release_year", "description"])

summaries = df.ai.summarize(input_col="description", output_col="summaries")
display(summaries)

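This example code cell displays the input DataFrame with an added `summaries` column that holds a summary of each `description` value.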

# This code uses AI. Always review output for mistakes.

df = spark.createDataFrame([
        ("Microsoft Teams", "2017",
        """
        The ultimate messaging app for your organization—a workspace for real-time 
        collaboration and communication, meetings, file and app sharing, and even the 
        occasional emoji! All in one place, all in the open, all accessible to everyone.
        """,),
        ("Microsoft Fabric", "2023",
        """
        An enterprise-ready, end-to-end analytics platform that unifies data movement, 
        data processing, ingestion, transformation, and report building into a seamless, 
        user-friendly SaaS experience. Transform raw data into actionable insights.
        """,)
    ], ["product", "release_year", "description"])

summaries = df.ai.summarize(output_col="summaries")
display(summaries)

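This example code cell displays the input DataFrame with an added `summaries` column that holds a summary of each row's values across all columns.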

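Related content

Use ai.summarize with pandas.
Detect sentiment with ai.analyze_sentiment.
Categorize text with ai.classify.
Generate vector embeddings with ai.embed.
Extract entities with ai.extract.
Fix grammar with ai.fix_grammar.
Answer custom user prompts with ai.generate_response.
Calculate similarity with ai.similarity.
Translate text with ai.translate.
Learn more about the full set of AI functions.
Customize the configuration of AI functions.
Did we miss a feature you need? Suggest it on the Fabric Ideas forum.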


Last updated on 2025-11-21
