Analyze customer reviews with ai_generate_text() and OpenAI

Important

This feature is in Public Preview.

Warning

The AI function, ai_generate_text() is deprecated. Databricks recommends using ai_query with external models.

This article illustrates how to use the built-in Databricks SQL function, ai_generate_text() to examine customer reviews and determine if a response needs to be generated. See AI Functions on Azure Databricks for more detail about the function.

The dataset and commands in this guide are from the Databricks demo Action Customer Reviews at Scale with Databricks SQL AI Functions. The demo uses fake data generated by OpenAI that mimics customer reviews for grocery products submitted to an e-commerce website.

Customer reviews

This example steps you through:

  • Breaking down free-form customer review text into its constituent entities.
  • For each entity, determining sentiment and whether a response is required back to the customer.
  • Generating a response mentioning alternative products that may satisfy the customer.

ai_generate_text data flow

Prerequisites

Prompt design

The keys to getting useful results back from a GPT model are:

  • Asking it a well-formed question.
  • Being specific about the type of answer that you are expecting.

In order to get results in a form that you can easily store in a table, you can ask the model to return the result in a string that reflects JSON representation, and specify the expected schema.

The following is the example prompt for this scenario:

A customer left a review. Follow up with anyone who appears unhappy.

Extract all entities mentioned. For each entity:

  • Classify sentiment as [“POSITIVE”,”NEUTRAL”,”NEGATIVE”]
  • Whether customer requires a follow-up: Y or N
  • Reason for requiring followup

Return JSON ONLY. No other text outside the JSON.

JSON format:

{
"entities": [{
    "entity_name": "entity_name",
    "entity_type": "entity_type",
    "entity_sentiment": "entity_sentiment",
    "followup": "Y or N for follow up",
    "followup_reason": "reason for followup"
  }]
}

Review:

<’insert_review_text_here’>

Create SQL functions

Databricks recommends decomposing your questions into granular SQL functions so that they can be reused for other scenarios within your organization.

Throughout this section, you create SQL functions in order to abstract away the details of the ai_generate_text() call from the end users, and use those functions as your interface for interacting with Azure OpenAI.

Handle calls to Azure OpenAI

The following wrapper function, prompt_handler(), handles all your calls to Azure OpenAI. The Azure OpenAI API key is stored in a Databricks secret, and you can reference it with the secret() function. You can also pass it the Azure OpenAI resource name (resourceName) and the model’s deployment name (deploymentName).

CREATE OR REPLACE FUNCTION PROMPT_HANDLER(prompt STRING)
RETURNS STRING
RETURN AI_GENERATE_TEXT(prompt,
  "azure_openai/gpt-35-turbo",
  "apiKey", SECRET("tokens", "azure-openai"),
  "temperature", CAST(0.0 AS DOUBLE),
  "deploymentName", "llmbricks",
  "apiVersion", "2023-03-15-preview",
  "resourceName", "llmbricks"
);

Analyze customer review data

The annotate_review() function annotates your review with entities, entity sentiments, and whether a follow-up is required and why. Notice the prompt returns a well-formed json representation, so you can instruct the function to return a struct type for easier querying downstream, such as inserting it into a Delta table.

CREATE OR REPLACE FUNCTION ANNOTATE_REVIEW(review STRING)
RETURNS STRUCT<entities: ARRAY<STRUCT<entity_name: STRING, entity_type: STRING, entity_sentiment: STRING, followup: STRING, followup_reason: STRING>>>
RETURN FROM_JSON(
  PROMPT_HANDLER(CONCAT(
    'A customer left a review. Follow up with anyone who appears unhappy.
     Extract all entities mentioned. For each entity:
      - classify sentiment as ["POSITIVE","NEUTRAL","NEGATIVE"]
      - whether customer requires a follow-up: Y or N
      - reason for requiring followup

    Return JSON ONLY. No other text outside the JSON. JSON format:
    {
        entities: [{
            "entity_name": <entity name>,
            "entity_type": <entity type>,
            "entity_sentiment": <entity sentiment>,
            "followup": <Y or N for follow up>,
            "followup_reason": <reason for followup>
        }]
    }

    Review:
    ', review)),
  "STRUCT<entities: ARRAY<STRUCT<entity_name: STRING, entity_type: STRING, entity_sentiment: STRING, followup: STRING, followup_reason: STRING>>>"
);

You can pass in data from the customer reviews dataset to see how the annotate_review() function classifies freeform customer reviews.

SELECT review_body,
  ANNOTATE_REVIEW(review_body) AS review_annotated
FROM dbdemos.openai_demo.fake_reviews
WHERE product_category = "Grocery"
LIMIT 3;

Generate responses with recommendations

After reviewing the customer responses, you can use the generate_response() function to generate a response to a customer based on their complaint and include recommendations for alternative products to try.

CREATE OR REPLACE FUNCTION GENERATE_RESPONSE(product STRING, entity STRING, reason STRING)
RETURNS STRING
RETURN PROMPT_HANDLER(
  CONCAT("What alternative products can you recommend for ", product,
    " when a customer had a complaint about ", entity, " because ", reason,
    "Give me a response in the tone of an empathetic message back to the customer; only provide the body")
);

The following generates a sample message response for a customer’s review on Country Choice Snacking Cookies.

SELECT GENERATE_RESPONSE("Country Choice Snacking Cookies", "cookies", "Quality issue") AS customer_response

Adhoc queries

You can also create ad-hoc queries using your newly created prompt_handler() function.

For example, you might be interested in understanding whether a review discusses beverages.

SELECT review_id,
  PROMPT_HANDLER(
    CONCAT(
      "Does this review discuss beverages? Answer Y or N only, no explanations or notes. Review: ", review_body)
  ) AS discusses_beverages,
  review_body
FROM dbdemos.openai_demo.fake_reviews
WHERE review_id IN ("R9LEFDWWXPDEY", "R27UON10EV9FSV", "R299ZTEFIAHRQD")
ORDER BY discusses_beverages DESC;