Tutorial: Customer segmentation with Genie Code

In this tutorial, you use Genie Code to run end-to-end customer segmentation directly inside a Databricks notebook. Starting from a raw marketing campaign data set, Genie Code handles data profiling, feature engineering, K-means clustering, and persona generation — all from a single prompt.

Customer segmentation analysis with Genie Code

Requirements

Step 1: Get the data set

For this tutorial, you use a marketing campaign data set.

  1. Download the Marketing Campaign data set from Kaggle.
  2. Click New > Add or upload data.
  3. Click Create or modify a table.
  4. Click browse or drag and drop the downloaded file onto the drop zone.
  5. Select the target catalog and schema in Unity Catalog.
  6. (Optional) Edit the table name.
  7. Click Create table.

Step 2: Open a notebook

  1. In the sidebar, click New and select Notebook.
  2. Name the notebook Marketing Campaign Data.
  3. Attach the notebook to compute or use serverless compute.

Step 3: Launch Genie Code in Agent mode

Genie Code in Agent mode can plan and run multi-step tasks autonomously — it reads cell outputs, fixes errors, and adapts its approach based on results.

  1. In the upper-right corner of the notebook, click the assistant icon to open the Genie Code pane.
  2. In the mode selector at the bottom of the Genie Code pane, select Agent.

Step 4: Submit your segmentation prompt

Segmentation analysis is commonly performed by grouping together customers who have similar purchase patterns. For example, segments might be based on income, demographics, or specific purchasing behaviors. One common approach is K-means clustering, a technique that automatically groups similar customers into distinct segments called "clusters."
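To make the idea concrete, the following is a minimal sketch of basic K-means (Lloyd's algorithm) in plain Python. It is not the code Genie Code generates, which would more likely rely on a library. The `kmeans` function, the `customers` list, and its two features (income and yearly spend) are hypothetical and chosen only for illustration.

```python
import random
from math import dist  # Euclidean distance (Python 3.8+)


def kmeans(points, k, iterations=100, seed=0):
    """Group points into k clusters with basic K-means (Lloyd's algorithm)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen points
    labels = None
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(
                    sum(col) / len(members) for col in zip(*members)
                )
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
    return labels, centroids


# Hypothetical customers: (income in $1000s, yearly spend in $100s).
customers = [(20, 5), (22, 7), (25, 6), (80, 40), (85, 45), (90, 42)]
labels, centroids = kmeans(customers, k=2)
```

In this toy data the first three customers end up in one cluster and the last three in the other. Which cluster receives label 0 depends on the random starting points.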

Enter the following prompt, then press Enter or click the Send icon:

Help me cluster my customers from my marketing campaign to profile them. I want to identify interesting segments that may be useful for marketing purposes.

Genie Code breaks down the prompt into steps and runs them:

  1. Understands context: Genie Code reads your prompt and the current state of the notebook.
  2. Finds relevant data: Genie Code searches Unity Catalog for relevant data assets and loads them for analysis.
  3. Generates and runs code: Genie Code edits notebook cells following a standard data science workflow: importing libraries, preprocessing data, training the model, and visualizing results.
  4. Summarizes results: Genie Code provides a plain-language summary of what it found.
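As an illustration of the preprocessing step, one transformation commonly applied before K-means is standardization. K-means relies on distances, so a feature with a large numeric range (such as income in dollars) would otherwise dominate a feature with a small range. The sketch below assumes this is the kind of preprocessing involved; the code Genie Code actually generates may differ. The `standardize` helper and the sample rows are hypothetical.

```python
from statistics import mean, pstdev


def standardize(rows):
    """Rescale each column to zero mean and unit variance (z-scores)."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # A constant column has zero spread; divide by 1.0 instead of 0.
    stds = [pstdev(col) or 1.0 for col in columns]
    return [
        tuple((value - m) / s for value, m, s in zip(row, means, stds))
        for row in rows
    ]


# Hypothetical rows: (annual income in dollars, purchases per year).
raw = [(20000, 5), (40000, 15), (60000, 25)]
scaled = standardize(raw)
```

After scaling, both columns are on the same scale, so neither feature outweighs the other when distances are computed.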

Genie Code asks for your approval before running code. Review each step and click Allow. You can also select Allow in this thread to approve all steps in the current conversation, or Always allow to skip future approval prompts.

Step 5: Review the results

After Genie Code finishes, review the generated notebook cells and the summary in the Genie Code pane. The summary describes each identified customer segment, including demographic characteristics, purchase behavior, and suggestions for how to engage each group.

For example, Genie Code might identify segments like Premium Loyalists (high-income, frequent buyers) and Bargain Seekers (price-sensitive, promotion-driven).
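Segment descriptions like these are typically derived by comparing average feature values across clusters. The following sketch shows one way to compute such a profile. The `profile_clusters` helper, the feature names, and the sample values are hypothetical and are not taken from the tutorial data set.

```python
from collections import defaultdict
from statistics import mean


def profile_clusters(rows, labels, feature_names):
    """Return {cluster label: {"size": n, feature name: mean value, ...}}."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    profiles = {}
    for label, members in sorted(grouped.items()):
        averages = {
            name: mean(col) for name, col in zip(feature_names, zip(*members))
        }
        profiles[label] = {"size": len(members), **averages}
    return profiles


# Hypothetical customers: (income in $1000s, purchases per year, deals redeemed).
rows = [(85, 30, 1), (90, 34, 0), (30, 8, 6), (28, 10, 5), (35, 9, 7)]
labels = [0, 0, 1, 1, 1]  # cluster assignments, for example from K-means
profiles = profile_clusters(rows, labels, ["income_k", "purchases", "deals"])
```

In this toy example, cluster 0 has high income and frequent purchases, while cluster 1 has lower income and redeems more deals. That contrast is the kind of signal that suggests persona names.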

Customer segmentation analysis results in Genie Code.

Step 6: Refine with follow-up prompts

Use follow-up prompts to dig deeper into the analysis:

  • Are there any other clustering techniques we should consider?
  • What happens if we increase the number of clusters?
  • Filter to customers who have made a purchase in the last 90 days.

Each follow-up prompt builds on the previous results without starting over.
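For reference, the 90-day filter in the last prompt amounts to comparing each customer's most recent purchase date with a cutoff. The sketch below assumes a hypothetical `last_purchase` field; the column names in your table may differ, and Genie Code may express the filter differently.

```python
from datetime import date, timedelta


def purchased_recently(customers, today, window_days=90):
    """Keep customers whose last purchase falls within the window."""
    cutoff = today - timedelta(days=window_days)
    return [c for c in customers if c["last_purchase"] >= cutoff]


# Hypothetical records; the real table's column names may differ.
customers = [
    {"id": 1, "last_purchase": date(2024, 5, 20)},
    {"id": 2, "last_purchase": date(2024, 1, 10)},
    {"id": 3, "last_purchase": date(2024, 3, 20)},
]
recent = purchased_recently(customers, today=date(2024, 6, 1))
```

Here customers 1 and 3 are kept, and customer 2 is dropped because the last purchase is more than 90 days before the reference date.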

Marketing campaign data in a dashboard.

Additional resources