Use ai.embed with pandas

The ai.embed function uses generative AI to convert text into vector embeddings. These vectors let AI understand relationships between texts, so you can search, group, and compare content based on meaning rather than exact wording. With a single line of code, you can generate vector embeddings from a column in a DataFrame.

Overview

The ai.embed function extends the pandas Series class.

To generate a vector embedding for each input row, call the function on either a pandas Series or a text column of a pandas DataFrame.

The function returns a pandas Series that contains embeddings, which can be stored in a new DataFrame column.

Syntax

df["embed"] = df["col1"].ai.embed()

Parameters

None.

Returns

The function returns a pandas Series that contains one embedding per input text row, each stored as a NumPy array of float32 values. The number of elements in each array depends on the embedding model's dimensions, which are configurable in AI functions.
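To work with the returned Series as a whole, it can help to stack the per-row arrays into a single matrix. The following is a minimal sketch that does not call ai.embed: the `embeddings` Series is a hand-made stand-in with invented values and an assumed length of 4, so the real dimension count will differ.

```python
import numpy as np
import pandas as pd

# Stand-in for the result of df["col1"].ai.embed(): one float32 array per row.
# These values and the vector length (4) are made up for illustration only.
embeddings = pd.Series([
    np.array([0.1, 0.2, 0.3, 0.4], dtype=np.float32),
    np.array([0.0, 0.5, 0.1, 0.9], dtype=np.float32),
    np.array([0.7, 0.1, 0.2, 0.3], dtype=np.float32),
])

# Stack the per-row arrays into one 2-D matrix of shape (rows, dimensions).
matrix = np.stack(embeddings.tolist())

# The second axis is the embedding dimension count.
n_rows, n_dims = matrix.shape
print(n_rows, n_dims, matrix.dtype)
```

A 2-D matrix like this is usually the more convenient shape for downstream steps such as clustering or nearest-neighbor search.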

Example

# This code uses AI. Always review output for mistakes.

import pandas as pd

# Assumes the ai accessor is already available in the notebook session.
df = pd.DataFrame([
    "This duvet, lovingly hand-crafted from all-natural fabric, is perfect for a good night's sleep.",
    "Tired of friends judging your baking? With these handy-dandy measuring cups, you'll create culinary delights.",
    "Enjoy this *BRAND NEW CAR!* A compact SUV perfect for the professional commuter!"
], columns=["descriptions"])

# Generate one embedding per description and store it in a new column.
df["embed"] = df["descriptions"].ai.embed()
display(df)

This example code cell provides the following output:

Screenshot of a data frame with columns 'descriptions' and 'embed'. The 'embed' column contains embeddings for the descriptions.
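Once a DataFrame has an embeddings column, one common way to compare content by meaning is cosine similarity. The sketch below is illustrative only and does not call ai.embed: the three-element vectors, the `query` vector, and the `cosine_similarity` helper are all hypothetical stand-ins, and real embeddings have many more dimensions.

```python
import numpy as np
import pandas as pd

def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors (hypothetical helper)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Stand-in data: short labels with made-up 3-element "embeddings".
df = pd.DataFrame({
    "descriptions": ["duvet", "measuring cups", "compact SUV"],
    "embed": [
        np.array([0.9, 0.1, 0.0], dtype=np.float32),
        np.array([0.1, 0.9, 0.1], dtype=np.float32),
        np.array([0.0, 0.2, 0.9], dtype=np.float32),
    ],
})

# Made-up embedding of a search query; in practice it would come from
# the same embedding model as the column it is compared against.
query = np.array([0.8, 0.2, 0.1], dtype=np.float32)

# Score each row against the query and sort the closest matches first.
df["similarity"] = df["embed"].apply(lambda v: cosine_similarity(v, query))
ranked = df.sort_values("similarity", ascending=False)
print(ranked[["descriptions", "similarity"]])
```

Cosine similarity ignores vector length and compares direction only, which is why it is a frequent choice for ranking text embeddings.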