Use multimodal input with AI Functions (Preview)

Important

This feature is in preview.

AI Functions apply one-line, LLM-powered transformations to large pandas or PySpark DataFrames with high concurrency by default. With multimodal input, you can also process images, PDFs, and text files to classify documents, summarize PDFs, extract information from images, and more.

Use this table to jump to multimodal examples and detailed documentation.

Function | Description | Detailed documentation
ai.analyze_sentiment | Detect sentiment in files. Example. | pandas, PySpark
ai.classify | Classify files by using your labels. Example. | pandas, PySpark
ai.extract | Extract fields from files. Example. | pandas, PySpark
ai.fix_grammar | Correct spelling, grammar, and punctuation in files. Example. | pandas, PySpark
ai.generate_response | Generate responses grounded in file content. Example. | pandas, PySpark
ai.summarize | Summarize file content. Example. | pandas, PySpark
ai.translate | Translate file content. Example. | pandas, PySpark
aifunc.load | Load files from a folder into a structured table. Example. | Syntax and parameters
aifunc.list_file_paths | Get file paths from a folder. Example. | Syntax and parameters
ai.infer_schema | Infer an extraction schema from file contents. Example. | Syntax and parameters

Supported file types

Multimodal AI Functions support the following file types:

  • Images: jpg, jpeg, png, static gif, webp
  • Documents: pdf
  • Text files: md, txt, csv, tsv, json, xml, py, and other text files

Note

  • Multimodal calls with file-path inputs work with the responses API, which is the default. Don't set api_type to chat_completions for file-path inputs.
  • Office file formats (such as .docx, .pptx, and .xlsx) aren't currently supported.
  • You can convert .docx and .pptx files to PDF and .xlsx files to CSV before using them with multimodal AI Functions.
  • Each input file is limited to 50 MB in size.
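
The limits above can be checked locally before any file is sent to an AI function. The following is a minimal sketch, not part of the AI Functions library: check_file, SUPPORTED_EXTENSIONS, and MAX_FILE_BYTES are hypothetical names, the allow-list covers only the extensions named above (not the "other text files" category), and it assumes 1 MB means 1024 * 1024 bytes.

```python
from pathlib import Path

# Extensions named under "Supported file types". The documentation also allows
# "other text files", which this hypothetical allow-list does not cover.
SUPPORTED_EXTENSIONS = {
    "jpg", "jpeg", "png", "gif", "webp",             # images (gif must be static)
    "pdf",                                           # documents
    "md", "txt", "csv", "tsv", "json", "xml", "py",  # text files
}
MAX_FILE_BYTES = 50 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes


def check_file(path):
    """Return None if the local file looks usable, else a short reason string."""
    p = Path(path)
    ext = p.suffix.lower().lstrip(".")
    if ext not in SUPPORTED_EXTENSIONS:
        return f"unsupported extension: .{ext}"
    if p.stat().st_size > MAX_FILE_BYTES:
        return "larger than 50 MB"
    return None
```

A check like this only applies to local paths. It can't tell a static gif from an animated one, so treat a passing result as a first filter rather than a guarantee.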

Supported URL protocols

Multimodal inputs are strings in one of the following forms:

  • local file paths
  • http(s)
  • wasbs
  • abfs(s)
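
To see which category an input string falls into, you can inspect its scheme with the Python standard library. This is an illustrative sketch only: input_kind and SUPPORTED_SCHEMES are hypothetical names, "abfs(s)" is read as both abfs and abfss, and any string without a scheme is assumed to be a local path. How AI Functions treat other schemes, such as file://, isn't stated here, so the sketch reports them as unsupported.

```python
from urllib.parse import urlparse

# Schemes taken from the list above; "abfs(s)" is read as abfs and abfss.
SUPPORTED_SCHEMES = {"http", "https", "wasbs", "abfs", "abfss"}


def input_kind(value):
    """Classify an input string as 'local', 'remote', or 'unsupported'."""
    scheme = urlparse(value).scheme.lower()
    if scheme in SUPPORTED_SCHEMES:
        return "remote"
    # No scheme, or a one-letter Windows drive such as "C:", is assumed to be
    # a local file path.
    if scheme == "" or len(scheme) == 1:
        return "local"
    return "unsupported"


print(input_kind("/lakehouse/default/Files/cv.pdf"))  # local
print(input_kind("https://example.com/cat.png"))      # remote
print(input_kind("ftp://example.com/cat.png"))        # unsupported
```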

Prerequisites

Multimodal AI Functions share the same prerequisites as text-based AI Functions. For the full list, see Prerequisites.

Set up your files

Organize your files in a folder that can be referenced by a path or a glob-style string.
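
Before you pass a folder path or pattern to an AI function, it can help to preview what it matches. The sketch below uses Python's glob module on local paths. It assumes that glob-style strings behave like Python's glob patterns and that a folder path means the files directly inside that folder. Neither assumption is confirmed here, and preview_matches is a hypothetical helper.

```python
import glob
import os


def preview_matches(folder_or_pattern):
    """List the local files that a folder path or glob-style pattern matches."""
    if os.path.isdir(folder_or_pattern):
        # Assumption: a folder path stands for the files directly inside it.
        pattern = os.path.join(folder_or_pattern, "*")
    else:
        pattern = folder_or_pattern
    # Keep regular files only; subfolders are skipped.
    return sorted(p for p in glob.glob(pattern) if os.path.isfile(p))
```

For example, preview_matches("/lakehouse/default/Files/*.pdf") would return only the PDF files in that folder.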

Tip

Use the AI Functions Starter Notebooks for end-to-end examples that cover every AI Function. There's one starter notebook for pandas and one for PySpark.

Example

You can store files in a Lakehouse attached to your notebook.

folder_path = "/lakehouse/default/Files"

Load your files

To use AI Functions with multimodal input, you can either load the file contents into a structured table or reference the file paths directly in your DataFrame. The following examples show both approaches.

Load files into a table

Use the aifunc.load function to read files from a folder and generate a structured table. By default, the function infers the table structure on its own. You can also provide a prompt to guide the extraction, or a schema to keep the structure consistent. This approach is useful when you want the AI to extract specific information from the files and present it in a structured format.

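import synapse.ml.aifunc as aifunc  # AI Functions module in Fabric notebooks

# Let the AI infer the table structure
df, schema = aifunc.load(folder_path)
# or guide the extraction with a prompt
df, schema = aifunc.load(folder_path, prompt="Give me candidate's name and the most recent company they worked for.")
display(df)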

Load file paths into a column

Alternatively, you can use aifunc.list_file_paths to get a list of file paths from a folder and load them into a DataFrame column. This approach is useful when you want to run AI Functions across each file.

Note

Most multimodal functions accept file paths with column_type="path" in pandas or input_col_type/col_types="path" in PySpark.

import pandas as pd

# Collect the file paths, then place them in a single DataFrame column
file_path_series = aifunc.list_file_paths(folder_path)
df = pd.DataFrame({"file_path": file_path_series}).reset_index(drop=True)
display(df)

Important

When your file paths are stored as string URLs in a DataFrame column, you must explicitly tell the AI function to treat the values as file paths rather than plain text.

For Series-level AI Functions (operating on a single column), set the column_type parameter:

df["result"] = df["file_path"].ai.analyze_sentiment(column_type="path")

For DataFrame-level AI Functions (operating on multiple columns), use the column_type_dict parameter:

df["result"] = df.ai.generate_response(
    prompt="Describe the content.",
    column_type_dict={"file_path": "path"},
)

Note

If you use aifunc.list_file_paths() to create your file path column, the returned yarl.URL objects are automatically detected as file paths. You only need to specify column_type="path" when your column contains plain string URLs.
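
The rule above comes down to one check: does the column hold plain strings? The sketch below shows that check using only built-in Python. needs_path_hint is a hypothetical helper, not an AI Functions API. It relies on yarl.URL not being a subclass of str, so a column of yarl.URL objects returns False.

```python
def needs_path_hint(values):
    """Return True if any value is a plain str, so column_type="path" is needed."""
    return any(isinstance(v, str) for v in values)


# Plain string URLs need the explicit hint.
print(needs_path_hint(["https://example.com/a.png"]))  # True
```

In a notebook, you could pass df["file_path"] to this helper and add column_type="path" to the AI function call only when it returns True.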

New multimodal functions

aifunc.load: Load files into a table

The aifunc.load function reads all files from a folder path and generates a structured table from their contents. You can optionally provide a prompt to guide the extraction, or a schema for consistent structure.

Syntax

df, schema = aifunc.load(folder_path, prompt=None, schema=None)

Parameters

Name | Description
folder_path | (Required) A string path to a folder or a glob-style pattern matching files.
prompt | (Optional) A string that guides the table generation process. Use it to specify which fields to extract from the files.
schema | (Optional) A schema object (returned by a previous load call) that defines the table structure. When provided, the function uses this schema directly.

Returns

A tuple of (DataFrame, schema). The DataFrame contains the structured data extracted from the files. The schema can be reused in subsequent load calls for consistent results.
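
One way to reuse the returned schema is to load the first folder normally and pass its schema to every later call. The sketch below is a hedged illustration rather than a library feature: load_with_shared_schema is a hypothetical helper, and the loader is passed in as a parameter so the pattern can be tried without a Fabric session. It assumes only the signature documented above, load(folder_path, prompt=None, schema=None), and the (DataFrame, schema) return value.

```python
def load_with_shared_schema(folders, load, prompt=None):
    """Load several folders so that every table uses the first folder's schema.

    `load` is any callable with the documented signature
    load(folder_path, prompt=None, schema=None) -> (DataFrame, schema).
    """
    tables = []
    schema = None
    for folder in folders:
        if schema is None:
            # First call: infer the structure, optionally guided by the prompt.
            table, schema = load(folder, prompt=prompt)
        else:
            # Later calls: pass the schema back for a consistent structure.
            table, _ = load(folder, schema=schema)
        tables.append(table)
    return tables, schema
```

In a Fabric notebook, you would pass aifunc.load as the load argument along with your own list of folder paths.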

Example

# This code uses AI. Always review output for mistakes.

# Basic load – let the AI infer the table structure
df, schema = aifunc.load(folder_path)
display(df)

# This code uses AI. Always review output for mistakes.

# Guided load – provide a prompt to specify what to extract
guided_df, guided_schema = aifunc.load(
    folder_path,
    prompt="Give me candidate's name and the most recent company they worked for.",
)
display(guided_df)

aifunc.list_file_paths: List files

The aifunc.list_file_paths function fetches all valid file paths from a specified folder. You can use the returned file paths as input to any multimodal AI function. The function also supports glob-style patterns.

Syntax

file_path_series = aifunc.list_file_paths(folder_path)

Parameters

Name | Description
folder_path | (Required) A string path to a folder or a glob-style pattern matching files.

Returns

A pandas Series of yarl.URL objects, indexed by their string representations. These yarl.URL objects are automatically treated as file paths by AI Functions, so you don't need to specify column_type="path".

Example

# This code uses AI. Always review output for mistakes.

file_path_series = aifunc.list_file_paths(folder_path)
custom_df = pd.DataFrame({"file_path": file_path_series}).reset_index(drop=True)
display(custom_df)

ai.infer_schema: Infer schema from files

The ai.infer_schema function infers a common schema from file contents. The inferred schema is represented as a list of aifunc.ExtractLabel objects that you can pass directly to ai.extract for structured data extraction.

Syntax

schema = df["file_path"].ai.infer_schema(column_type="path")

Parameters

Name | Description
prompt | (Optional) A string to guide schema inference. If not provided, the function infers the schema from the file contents alone.
n_samples | (Optional) An integer specifying how many items to sample for inference. Default is 3.
column_type | (Optional) Set to "path" to treat column values as file paths.

Returns

A list of aifunc.ExtractLabel objects that describe the inferred schema. You can pass this list to ai.extract to extract structured data from files.

Example

# This code uses AI. Always review output for mistakes.

# Infer a schema from file contents
schema = df["file_path"].ai.infer_schema(column_type="path")
for label in schema:
    print(label)

# Use the inferred schema with ai.extract
extracted_df = df["file_path"].ai.extract(*schema, column_type="path")
display(extracted_df)

Use multimodal input with existing AI Functions

The following examples show how to use multimodal input with each of the supported AI Functions.

ai.analyze_sentiment: Detect sentiment from files

For full parameters, see pandas or PySpark.

# This code uses AI. Always review output for mistakes.

animal_urls = [
    "<image-url-golden-retriever>",  # Replace with URL to an image of a golden retriever
    "<image-url-giant-panda>",  # Replace with URL to an image of a giant panda
    "<image-url-bald-eagle>",  # Replace with URL to an image of a bald eagle
]
animal_df = pd.DataFrame({"file_path": animal_urls})

animal_df["sentiment"] = animal_df["file_path"].ai.analyze_sentiment(column_type="path")
display(animal_df)

ai.classify: Classify files

For full parameters, see pandas or PySpark.

# This code uses AI. Always review output for mistakes.

custom_df["highest_degree"] = custom_df["file_path"].ai.classify(
    "Master", "PhD", "Bachelor", "Other",
    column_type="path",
)
display(custom_df)

ai.extract: Extract entities from files

For full parameters, see pandas or PySpark.

# This code uses AI. Always review output for mistakes.

extracted = custom_df["file_path"].ai.extract(
    aifunc.ExtractLabel(
        "name",
        description="The full name of the candidate, first letter capitalized.",
        max_items=1,
    ),
    "companies_worked_for",
    aifunc.ExtractLabel(
        "year_of_experience",
        description="The total years of professional work experience the candidate has, excluding internships.",
        type="integer",
        max_items=1,
    ),
    column_type="path",
)
display(extracted)

ai.fix_grammar: Fix grammar in files

For full parameters, see pandas or PySpark.

# This code uses AI. Always review output for mistakes.

custom_df["corrections"] = custom_df["file_path"].ai.fix_grammar(column_type="path")
display(custom_df)

ai.generate_response: Apply custom prompts to files

For full parameters, see pandas or PySpark.

# This code uses AI. Always review output for mistakes.

# Series-level: generate a response from each file
animal_df["animal_name"] = animal_df["file_path"].ai.generate_response(
    prompt="What type of animal is in this image? Give me only the animal's common name.",
    column_type="path",
)
display(animal_df)

# This code uses AI. Always review output for mistakes.

# DataFrame-level: use all columns as context
animal_df["description"] = animal_df.ai.generate_response(
    prompt="Describe this animal's natural habitat and one interesting fact about it.",
    column_type_dict={"file_path": "path"},
)
display(animal_df)

ai.summarize: Summarize files

For full parameters, see pandas or PySpark.

# This code uses AI. Always review output for mistakes.

# Summarize file content from a single column
custom_df["summary"] = custom_df["file_path"].ai.summarize(
    instructions="Summarize this file in one sentence for a support analyst.",
    column_type="path",
)
display(custom_df)

You can summarize values across all columns in a DataFrame by omitting the input column and specifying file path columns with column_type_dict (pandas) or col_types (PySpark):

# This code uses AI. Always review output for mistakes.

custom_df["summary"] = custom_df.ai.summarize(
    column_type_dict={"file_path": "path"},
)
display(custom_df)

ai.translate: Translate files

For full parameters, see pandas or PySpark.

# This code uses AI. Always review output for mistakes.

custom_df["chinese_version"] = custom_df["file_path"].ai.translate(
    "Chinese",
    column_type="path",
)
display(custom_df)

Evaluate output quality

Use the AI Functions Eval Notebooks for structured workflows that use LLM-as-a-Judge to assess multimodal outputs and compute metrics such as accuracy, precision, recall, F1, coherence, consistency, and relevance. You can use these workflows to validate the quality of classification, extraction, summarization, and other AI function results before moving to production.
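
To make the classification metrics concrete, the sketch below computes accuracy, precision, recall, and F1 by comparing predicted labels with hand-labeled ground truth. It isn't the Eval Notebooks' implementation. classification_metrics is a hypothetical helper, the labels are made-up sample data, and precision, recall, and F1 are computed for a single label treated as the positive class.

```python
def classification_metrics(predicted, expected, positive_label):
    """Accuracy, plus precision, recall, and F1 for one label treated as positive."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    pairs = list(zip(predicted, expected))
    tp = sum(1 for p, e in pairs if p == positive_label and e == positive_label)
    fp = sum(1 for p, e in pairs if p == positive_label and e != positive_label)
    fn = sum(1 for p, e in pairs if p != positive_label and e == positive_label)
    correct = sum(1 for p, e in pairs if p == e)
    accuracy = correct / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up sample: labels from ai.classify compared with hand-checked labels.
predicted = ["PhD", "Master", "PhD", "Other"]
expected = ["PhD", "PhD", "Master", "Other"]
print(classification_metrics(predicted, expected, positive_label="PhD"))
# {'accuracy': 0.5, 'precision': 0.5, 'recall': 0.5, 'f1': 0.5}
```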

Monitor cost and capacity usage

See Billing for AI Functions for details on costs, runtime usage, and capacity monitoring.