Use ai.generate_response with PySpark

The ai.generate_response function uses generative AI to produce custom text responses based on your own instructions, with a single line of code.

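Note

This article covers using ai.generate_response with PySpark. To use ai.generate_response with pandas, see the corresponding pandas article.
See the overview article for the other AI functions.
Learn how to customize the configuration of AI functions.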

Overview

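The ai.generate_response function is available for Spark DataFrames. You must specify a string-based prompt. You can also pass a Boolean that indicates whether the prompt should be treated as a format string; in that case, the prompt names the input columns it uses between curly braces.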

The function returns a new DataFrame with a custom response for each input row, stored in an output column.

Tip

Learn how to write more effective prompts and get higher-quality responses by following OpenAI's prompting tips for gpt-4.1.

Syntax

df.ai.generate_response(prompt="Instructions for a custom response based on all column values", output_col="response")

df.ai.generate_response(prompt="Instructions for a custom response based on specific {column1} and {column2} values", is_prompt_template=True, output_col="response")
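The two call forms differ in how the prompt is combined with row data. The following is a minimal pure-Python sketch of that difference, not the library's implementation: the helper names `referenced_columns` and `build_prompt` and the way the context is joined are assumptions made for illustration only.

```python
import string

def referenced_columns(prompt):
    """Return the column names that appear between curly braces in the prompt."""
    return [field for _, field, _, _ in string.Formatter().parse(prompt) if field]

def build_prompt(prompt, row, is_prompt_template=False):
    """Illustrative only: combine a prompt with one row's values."""
    if is_prompt_template:
        # Only the columns named in the template are used; other columns are ignored.
        values = {name: row[name] for name in referenced_columns(prompt)}
        return prompt.format(**values)
    # Otherwise, every column value is attached as context for the row.
    context = ", ".join(f"{name}={value}" for name, value in row.items())
    return f"{prompt}\nContext: {context}"

row = {"id": "001", "product": "Scarves", "product_rec": "Boots", "yr_introduced": "2021"}

templated = build_prompt(
    "Write a subject line for a winter sale on the {product}.",
    row,
    is_prompt_template=True,
)
plain = build_prompt("Write a subject line for a winter sale.", row)

print(templated)
print(plain)
```

In the templated case only the `product` value reaches the prompt, whereas the plain case carries every column along as context.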

Parameters

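Name	Description
`prompt` Required	A string that contains prompt instructions. These instructions are applied to the input text values to produce custom responses.
`is_prompt_template` Optional	A Boolean that indicates whether the prompt is a format string or a literal string. If this parameter is set to `True`, the function considers only the specific row values from each column that appears in the format string. In this case, those column names must appear between curly braces, and other columns are ignored. If this parameter is set to its default value of `False`, the function considers all column values as context for each input row.
`output_col` Optional	A string that contains the name of a new column to store the custom response for each row of input text. If you don't set this parameter, a default name is generated for the output column.
`error_col` Optional	A string that contains the name of a new column to store any OpenAI errors that result from processing each row of input text. If you don't set this parameter, a default name is generated for the error column. If an input row has no errors, the value in this column is `null`.
`response_format` Optional	A string or dictionary that specifies the expected structure of the model's response. The string value can be "text" for free-form text or "json_object" to ensure that the output is a valid JSON object. Alternatively, pass a dictionary whose `type` field is set to "json_schema", together with a custom JSON Schema, to enforce a specific response structure. If you don't provide this parameter, the response is returned as plain text.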
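The dictionary form of `response_format` is verbose to write by hand. As a sketch, a small helper can assemble it from a mapping of field names to JSON Schema types. The helper name `build_json_schema_format` is hypothetical and not part of the AI functions library. It assumes every field is required and that extra properties are disallowed, mirroring the strict style of the response format example in this article.

```python
def build_json_schema_format(name, fields):
    """Build a response_format dictionary of type "json_schema".

    `fields` maps each property name to a JSON Schema type such as "string".
    """
    return {
        "type": "json_schema",
        "json_schema": {
            "name": name,
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {key: {"type": kind} for key, kind in fields.items()},
                # Assumption: every declared field is required.
                "required": list(fields),
                "additionalProperties": False,
            },
        },
    }

player_card_format = build_json_schema_format(
    "player_card_schema",
    {"name": "string", "age": "integer", "sport": "string"},
)
print(player_card_format["json_schema"]["schema"]["required"])
```

The resulting dictionary can then be passed as the `response_format` argument in place of a hand-written literal.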

Returns

The function returns a Spark DataFrame that includes a new column containing a custom text response to the prompt for each input text row.

Example

The two code cells below generate responses with a simple prompt and with a template prompt, respectively.

# This code uses AI. Always review output for mistakes. 

df = spark.createDataFrame([
        ("Scarves",),
        ("Snow pants",),
        ("Ski goggles",)
    ], ["product"])

responses = df.ai.generate_response(prompt="Write a short, punchy email subject line for a winter sale.", output_col="response")
display(responses)

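This example code cell displays the resulting DataFrame, which includes a generated `response` value for each `product`.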

# This code uses AI. Always review output for mistakes. 

df = spark.createDataFrame([
        ("001", "Scarves", "Boots", "2021"),
        ("002", "Snow pants", "Sweaters", "2010"),
        ("003", "Ski goggles", "Helmets", "2015")
    ], ["id", "product", "product_rec", "yr_introduced"])

responses = df.ai.generate_response(prompt="Write a short, punchy email subject line for a winter sale on the {product}.", is_prompt_template=True, output_col="response")
display(responses)

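This example code cell displays the resulting DataFrame, in which each `response` is generated from the row's `product` value only.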

Response format example

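The following example shows how to use the response_format parameter to specify different response formats, including plain text, a JSON object, and a custom JSON schema.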

# This code uses AI. Always review output for mistakes.

df = spark.createDataFrame([
        ("Alex Rivera is a 24-year-old soccer midfielder from Barcelona who scored 12 goals last season.",),
        ("Jordan Smith, a 29-year-old basketball guard from Chicago, averaged 22 points per game.",),
        ("William O'Connor is a 22-year-old tennis player from Dublin who won 3 ATP titles this year.",)
    ], ["bio"])

# response_format : text
df = df.ai.generate_response(
        prompt="Create a player card with the player's details and a motivational quote",
        response_format="text",
        output_col="card_text"
)

# response_format : json object
df = df.ai.generate_response(
        prompt="Create a player card with the player's details and a motivational quote in JSON",
        response_format="json_object", # Requires "json" in the prompt
        output_col="card_json_object"
)

# response_format : specified json schema
df = df.ai.generate_response(
        prompt="Create a player card with the player's details and a motivational quote",
        response_format={
            "type": "json_schema",
            "json_schema": {
                "name": "player_card_schema",
                "strict": True,
                "schema": {
                    "type": "object",
                    "properties": {
                        "name": {"type": "string"},
                        "age": {"type": "integer"},
                        "sport": {"type": "string"},
                        "position": {"type": "string"},
                        "hometown": {"type": "string"},
                        "stats": {"type": "string", "description": "Key performance metrics or achievements"},
                        "motivational_quote": {"type": "string"},
                    },
                    "required": ["name", "age", "sport", "position", "hometown", "stats", "motivational_quote"],
                    "additionalProperties": False,
                },
            }
        },
        output_col="card_json_schema"
)

display(df)

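This example code cell displays the resulting DataFrame with the `card_text`, `card_json_object`, and `card_json_schema` columns added.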
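Structured responses are usually parsed before further use. The sketch below assumes that each value in the `card_json_schema` column arrives as a JSON string. The sample value is hypothetical and not real model output, and `parse_player_card` is an illustrative helper rather than part of the library.

```python
import json

# Hypothetical value, shaped like one cell of the "card_json_schema" column.
sample_response = (
    '{"name": "Alex Rivera", "age": 24, "sport": "soccer", '
    '"position": "midfielder", "hometown": "Barcelona", '
    '"stats": "12 goals last season", '
    '"motivational_quote": "Keep moving forward."}'
)

REQUIRED_KEYS = {
    "name", "age", "sport", "position", "hometown", "stats", "motivational_quote",
}

def parse_player_card(raw):
    """Parse one JSON response string; return None for missing or malformed values."""
    if raw is None:
        return None
    try:
        card = json.loads(raw)
    except json.JSONDecodeError:
        return None
    # Reject values that are not objects or that lack an expected field.
    if not isinstance(card, dict) or not REQUIRED_KEYS.issubset(card):
        return None
    return card

card = parse_player_card(sample_response)
print(card["name"], card["age"])
```

Returning `None` for unusable values keeps downstream code simple, since rows that failed (and therefore have a value in the error column) can be filtered out in the same pass.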

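Related content

Use ai.generate_response with pandas.
Detect sentiment with ai.analyze_sentiment.
Categorize text with ai.classify.
Generate vector embeddings with ai.embed.
Extract entities with ai.extract.
Fix grammar with ai.fix_grammar.
Calculate similarity with ai.similarity.
Summarize text with ai.summarize.
Translate text with ai.translate.
Learn more about the full set of AI functions.
Customize the configuration of AI functions.
Did we miss a feature you need? Suggest it on the Fabric Ideas forum.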


Last updated on 2025-11-21
