LlmInputHelper Class

Definition

Converts AnalysisResult objects into LLM-friendly text.

public static class LlmInputHelper
type LlmInputHelper = class
Public Module LlmInputHelper
Inheritance
LlmInputHelper

Methods

Name Description
ToLlmInput(AnalysisResult, IDictionary<String,Object>, LlmInputOptions)

Converts a Content Understanding analysis result into LLM-friendly text.

Produces a YAML front matter block followed by a markdown body, suitable for injecting into an LLM prompt, storing in a vector database, or passing as tool output.

The YAML front matter (delimited by ---) may include: contentType (document, image, audio, video), pages (page range), timeRange (media time span), category (classification label), fields (extracted structured fields as YAML), rai_warnings (content safety flags), and any caller-supplied metadata entries.
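As a consumer-side illustration, the following minimal sketch separates the front matter from the markdown body. FrontMatterSplitter and Split are hypothetical names and are not part of the SDK. The sketch assumes that the text starts with a line containing only --- and that the front matter ends at the next such line; input that does not match is returned unchanged as the body.

```csharp
using System;

// Hypothetical consumer-side helper; not part of the SDK.
public static class FrontMatterSplitter
{
    // Splits helper output into (front matter, markdown body).
    // Assumption: the front matter is opened and closed by lines that
    // contain only "---" (trailing whitespace is tolerated).
    public static (string FrontMatter, string Body) Split(string llmInput)
    {
        // Normalize line endings so the split works for CRLF and LF input.
        string text = llmInput.Replace("\r\n", "\n");
        string[] lines = text.Split('\n');

        // No opening delimiter: treat the whole text as the body.
        if (lines[0].TrimEnd() != "---")
        {
            return (string.Empty, text);
        }

        for (int i = 1; i < lines.Length; i++)
        {
            // Only unindented "---" lines count as the closing delimiter.
            if (lines[i].TrimEnd() == "---")
            {
                string frontMatter = string.Join("\n", lines, 1, i - 1);
                string body = string.Join("\n", lines, i + 1, lines.Length - i - 1);
                return (frontMatter, body.TrimStart('\n'));
            }
        }

        // Opening delimiter without a closing one: leave the text untouched.
        return (string.Empty, text);
    }
}
```

The returned front matter can then be handed to a YAML parser of your choice, while the body is passed on to chunking or prompt assembly.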

The markdown body contains the extracted text, with page-break markers (<!-- InputPageNumber: N -->) inserted at page boundaries so that downstream consumers can locate content by page number. N is the original 1-based page number in the source document (i.e., the page index in the analyzed PDF), not a counter that restarts at 1 for each call. The distinction matters when the analyze request specifies a ContentRange (e.g., "2-3,5"): the markers in the output read InputPageNumber: 2, 3, and 5, not 1, 2, and 3. Downstream consumers such as RAG indexers and page-citation prompts can therefore rely on the marker value to cite the correct source page, even when only a subset of pages was analyzed. If the service markdown already contains markers that begin with <!-- InputPageNumber:, the helper passes the markdown through unchanged to avoid inserting duplicates.
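A downstream consumer might use the markers to split the body into per-page chunks, as in the following sketch. PageMarkerSplitter and SplitByPage are hypothetical names and are not part of the SDK. The sketch assumes that each marker precedes the content of the page it names, which the description above does not state explicitly; text that appears before the first marker is ignored.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical consumer-side helper; not part of the SDK.
public static class PageMarkerSplitter
{
    // Matches markers of the form <!-- InputPageNumber: N -->.
    // Tolerating extra whitespace inside the comment is an assumption.
    private static readonly Regex Marker = new Regex(
        @"<!--\s*InputPageNumber:\s*([0-9]{1,9})\s*-->",
        RegexOptions.Compiled);

    // Returns (page number, page text) pairs in document order.
    // Assumption: the text that follows a marker belongs to that page.
    public static List<(int Page, string Text)> SplitByPage(string markdownBody)
    {
        var pages = new List<(int Page, string Text)>();
        MatchCollection matches = Marker.Matches(markdownBody);

        for (int i = 0; i < matches.Count; i++)
        {
            Match current = matches[i];

            // Page text runs from the end of this marker to the next marker
            // or, for the last marker, to the end of the body.
            int start = current.Index + current.Length;
            int end = i + 1 < matches.Count ? matches[i + 1].Index : markdownBody.Length;

            int page = int.Parse(current.Groups[1].Value, CultureInfo.InvariantCulture);
            pages.Add((page, markdownBody.Substring(start, end - start).Trim()));
        }

        return pages;
    }
}
```

Because the marker carries the original page number, a body produced for a ContentRange of "2-3,5" would yield chunks labeled 2, 3, and 5, which can be stored alongside each chunk for page citations.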

Internal telemetry messages such as LLMStats: ... are filtered out of the rai_warnings entry rendered in the front matter.

Applies to