ServerEventInputAudioBufferSpeechStarted interface

Sent by the server in server_vad mode to indicate that speech has been detected in the audio buffer. This can happen any time audio is added to the buffer, unless speech has already been detected. The client may want to use this event to interrupt audio playback or provide visual feedback to the user. The client should expect an input_audio_buffer.speech_stopped event when speech stops. The item_id property is the ID of the user message item that will be created when speech stops; it is also included in the input_audio_buffer.speech_stopped event, unless the client manually commits the audio buffer during VAD activation.
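As an illustration of the playback-interruption use case described above, here is a minimal sketch of a handler. The SpeechStartedEvent interface is a local stand-in that mirrors the properties listed on this page; it is not imported from any SDK. The Playback interface and the onSpeechStarted function are hypothetical names introduced only for this example.

```typescript
// Local stand-in mirroring the properties documented on this page;
// it is not imported from any SDK.
interface SpeechStartedEvent {
  type: "input_audio_buffer.speech_started";
  audioStartInMs: number;
  itemId: string;
  eventId?: string;
}

// Hypothetical playback abstraction; a real app would wrap its own audio output.
interface Playback {
  isPlaying: boolean;
  stop(): void;
}

// Stops assistant audio if it is playing and returns a short status message
// that could drive visual feedback in the UI.
function onSpeechStarted(event: SpeechStartedEvent, playback: Playback): string {
  const interrupted = playback.isPlaying;
  if (interrupted) {
    playback.stop();
  }
  return (
    `${event.itemId}: speech at ${event.audioStartInMs} ms` +
    (interrupted ? " (playback interrupted)" : "")
  );
}
```

How the handler is registered depends on the client library in use, so the wiring is intentionally left out.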

Extends

ServerEvent

Properties

audioStartInMs

Offset in milliseconds, measured from the start of all audio written to the buffer during the session, at which speech was first detected. It corresponds to the beginning of the audio sent to the model and therefore includes the prefix_padding_ms configured in the Session.

itemId

The ID of the user message item that will be created when speech stops.

type

The event type; must be input_audio_buffer.speech_started.

Inherited Properties

eventId

Property Details

audioStartInMs

Offset in milliseconds, measured from the start of all audio written to the buffer during the session, at which speech was first detected. It corresponds to the beginning of the audio sent to the model and therefore includes the prefix_padding_ms configured in the Session.

audioStartInMs: number

Property Value

number
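
To make the relationship between audioStartInMs and prefix_padding_ms concrete, the sketch below shows two small helpers. Both function names are hypothetical. The default sample rate (24000 Hz) and sample width (2 bytes, mono) are assumptions about the audio format, not values stated on this page. The onset estimate rests on one reading of the description above: the reported offset already sits prefix_padding_ms before the point where speech was detected.

```typescript
// Converts a millisecond offset into a byte offset within raw mono PCM audio.
// sampleRateHz and bytesPerSample are assumptions about the input format.
function msToByteOffset(ms: number, sampleRateHz = 24000, bytesPerSample = 2): number {
  const samples = Math.floor((ms * sampleRateHz) / 1000);
  return samples * bytesPerSample;
}

// Estimates where speech itself was detected by adding the padding back.
// Assumes audioStartInMs already includes (is shifted earlier by) the padding.
function estimateSpeechOnsetMs(audioStartInMs: number, prefixPaddingMs: number): number {
  return audioStartInMs + prefixPaddingMs;
}
```

For example, with audioStartInMs of 1200 and a padding of 300 ms, the estimated onset is 1500 ms, which under the assumed format is byte offset 72000 in the session's audio.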

itemId

The ID of the user message item that will be created when speech stops.

itemId: string

Property Value

string
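
Because the same item ID is also carried by the input_audio_buffer.speech_stopped event, a client can pair the two events. The following is a minimal sketch under stated assumptions: SpeechStarted is a local stand-in for the interface on this page, while SpeechStopped and its audioEndInMs property are assumed for illustration, since this page does not document that event's shape. SpeechSegmentTracker is a hypothetical helper.

```typescript
// Local stand-in for the interface documented on this page.
interface SpeechStarted {
  type: "input_audio_buffer.speech_started";
  itemId: string;
  audioStartInMs: number;
}

// Assumed shape: speech_stopped is not documented here, and audioEndInMs
// is a hypothetical property name used only for this sketch.
interface SpeechStopped {
  type: "input_audio_buffer.speech_stopped";
  itemId: string;
  audioEndInMs: number;
}

class SpeechSegmentTracker {
  // Maps itemId to the start offset of a segment that has not stopped yet.
  private readonly starts = new Map<string, number>();

  onStarted(event: SpeechStarted): void {
    this.starts.set(event.itemId, event.audioStartInMs);
  }

  // Returns the segment duration in ms, or undefined if no matching start was seen.
  onStopped(event: SpeechStopped): number | undefined {
    const start = this.starts.get(event.itemId);
    if (start === undefined) {
      return undefined;
    }
    this.starts.delete(event.itemId);
    return event.audioEndInMs - start;
  }
}
```

Returning undefined for an unmatched stop covers the case noted in the interface description, where a manual commit during VAD activation means the IDs may not line up.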

type

The event type; must be input_audio_buffer.speech_started.

type: "input_audio_buffer.speech_started"

Property Value

"input_audio_buffer.speech_started"
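
Since type is a string literal, it can serve as a discriminant when narrowing a generic server event. Below is a minimal sketch of a type guard. SpeechStartedEvent is again a local stand-in for the interface on this page, and isSpeechStarted is a hypothetical helper, not part of any SDK.

```typescript
// Local stand-in mirroring the properties documented on this page.
interface SpeechStartedEvent {
  type: "input_audio_buffer.speech_started";
  audioStartInMs: number;
  itemId: string;
  eventId?: string;
}

// Narrows any object with a string `type` to SpeechStartedEvent
// by checking the literal event type.
function isSpeechStarted(event: { type: string }): event is SpeechStartedEvent {
  return event.type === "input_audio_buffer.speech_started";
}
```

Inside an `if (isSpeechStarted(event))` branch, the compiler then allows access to itemId and audioStartInMs without a cast. The guard only checks the type string, so it trusts the sender to supply the remaining properties.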

Inherited Property Details

eventId

eventId?: string

Property Value

string

Inherited From ServerEvent.eventId