ServerEventConversationItemInputAudioTranscriptionCompleted interface

This event carries the transcription of user audio written to the input audio buffer. Transcription begins when the input audio buffer is committed by the client, or by the server in server_vad mode. Transcription runs asynchronously with Response creation, so this event may arrive before or after the Response events. VoiceLive API models accept audio natively, so input transcription is a separate process that runs on a separate ASR (Automatic Speech Recognition) model. The transcript may diverge somewhat from the model's interpretation and should be treated as a rough guide.
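
Because this event can arrive before or after the Response events for the same turn, a client may prefer to store transcripts by item ID instead of relying on arrival order. The following is a minimal sketch under stated assumptions: the interface is declared locally to mirror the properties documented on this page, and the TranscriptStore class, its method names, and the sample item ID are hypothetical, not part of the SDK.

```typescript
// Local mirror of the documented event shape, declared here so the sketch is
// self-contained. In real code the type would come from the SDK.
interface TranscriptionCompletedEvent {
  type: "conversation.item.input_audio_transcription.completed";
  eventId?: string;
  itemId: string;
  contentIndex: number;
  transcript: string;
}

// Hypothetical helper: keeps the latest transcript per (itemId, contentIndex),
// independent of when Response events arrive.
class TranscriptStore {
  private readonly transcripts = new Map<string, string>();

  private static key(itemId: string, contentIndex: number): string {
    return `${itemId}:${contentIndex}`;
  }

  record(event: TranscriptionCompletedEvent): void {
    const key = TranscriptStore.key(event.itemId, event.contentIndex);
    this.transcripts.set(key, event.transcript);
  }

  // Returns undefined while the transcription has not arrived yet.
  lookup(itemId: string, contentIndex = 0): string | undefined {
    return this.transcripts.get(TranscriptStore.key(itemId, contentIndex));
  }
}

const store = new TranscriptStore();

// A Response event handler may run first and find no transcript yet.
const pending = store.lookup("item_001");

// Later (or earlier) the transcription event arrives and is recorded.
store.record({
  type: "conversation.item.input_audio_transcription.completed",
  itemId: "item_001", // made-up sample ID
  contentIndex: 0,
  transcript: "What is the weather today?",
});

const resolved = store.lookup("item_001");
```

Keying on both itemId and contentIndex keeps the lookup unambiguous if a single user message item holds more than one audio content part.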

Extends

ServerEvent

Properties

contentIndex

The index of the content part containing the audio.

itemId

The ID of the user message item containing the audio.

logprobs

The log probabilities of the transcription tokens.

phrases

The transcription phrases with timing information.

transcript

The transcribed text.

type

The event type. Must be conversation.item.input_audio_transcription.completed.

Inherited Properties

eventId

Property Details

contentIndex

The index of the content part containing the audio.

contentIndex: number

Property Value

number

itemId

The ID of the user message item containing the audio.

itemId: string

Property Value

string

logprobs

The log probabilities of the transcription tokens.

logprobs?: LogProbProperties[]

Property Value

LogProbProperties[]
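
Token log probabilities can be folded into a rough confidence score for the transcript. The sketch below is illustrative only: this page does not list the fields of LogProbProperties, so the AssumedLogProb shape (a token string and a numeric logprob) is an assumption, and averageTokenProbability is a hypothetical helper, not an SDK function.

```typescript
// Assumed shape: this page names LogProbProperties but does not list its
// fields, so the two fields below are an assumption made for illustration.
interface AssumedLogProb {
  token: string;
  logprob: number;
}

// Geometric mean of the token probabilities: exp(mean(logprob)).
// Returns undefined when logprobs is absent or empty, since the property
// is optional on the event.
function averageTokenProbability(logprobs?: AssumedLogProb[]): number | undefined {
  if (!logprobs || logprobs.length === 0) {
    return undefined;
  }
  const total = logprobs.reduce((sum, entry) => sum + entry.logprob, 0);
  return Math.exp(total / logprobs.length);
}

// Made-up sample values: token probabilities of 0.9 and 0.4.
const sample: AssumedLogProb[] = [
  { token: "hello", logprob: Math.log(0.9) },
  { token: " world", logprob: Math.log(0.4) },
];

// sqrt(0.9 * 0.4), which is approximately 0.6
const score = averageTokenProbability(sample);
```

A low score is only a hint that the transcript may be unreliable; it says nothing about how the conversational model itself interpreted the audio.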

phrases

The transcription phrases with timing information.

phrases?: TranscriptionPhrase[]

Property Value

TranscriptionPhrase[]

transcript

The transcribed text.

transcript: string

Property Value

string

type

The event type. Must be conversation.item.input_audio_transcription.completed.

type: "conversation.item.input_audio_transcription.completed"

Property Value

"conversation.item.input_audio_transcription.completed"
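
Because type is a string literal, it can serve as a discriminant when narrowing a generic server event to this interface. The sketch below assumes locally declared shapes that mirror the documented properties; BaseServerEvent, isTranscriptionCompleted, describe, and the placeholder event type "some.other.event" are hypothetical names used for illustration.

```typescript
// Minimal local shapes; only the fields used by the sketch are declared.
interface BaseServerEvent {
  type: string;
  eventId?: string;
}

interface TranscriptionCompletedEvent extends BaseServerEvent {
  type: "conversation.item.input_audio_transcription.completed";
  itemId: string;
  contentIndex: number;
  transcript: string;
}

// Type guard: narrows on the literal value of `type`.
function isTranscriptionCompleted(
  event: BaseServerEvent,
): event is TranscriptionCompletedEvent {
  return event.type === "conversation.item.input_audio_transcription.completed";
}

// Inside the guarded branch, itemId, contentIndex, and transcript are typed.
function describe(event: BaseServerEvent): string {
  if (isTranscriptionCompleted(event)) {
    return `transcript for ${event.itemId}[${event.contentIndex}]: ${event.transcript}`;
  }
  return `ignored event of type ${event.type}`;
}

// Made-up sample events.
const completed: TranscriptionCompletedEvent = {
  type: "conversation.item.input_audio_transcription.completed",
  itemId: "item_001",
  contentIndex: 0,
  transcript: "hello",
};
const other: BaseServerEvent = { type: "some.other.event" };

const completedText = describe(completed);
const otherText = describe(other);
```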

Inherited Property Details

eventId

eventId?: string

Property Value

string

Inherited From ServerEvent.eventId