AzureSemanticVad interface

Server Speech Detection (Azure semantic VAD, default variant).

Extends

Properties

autoTruncate

Whether to automatically truncate the audio buffer when speech stops.

createResponse

Whether to automatically create a response when speech stops.

endOfUtteranceDetection

Configuration for end-of-utterance detection.

interruptResponse

Whether to allow the user's speech to interrupt the assistant's response.

languages

List of BCP-47 language codes for speech detection.

prefixPaddingInMs

Amount of audio to include before speech is detected, in milliseconds.

removeFillerWords

Whether to remove filler words (e.g., 'um', 'uh') from transcription.

silenceDurationInMs

Duration of silence required to end speech detection, in milliseconds.

speechDurationInMs

Minimum speech duration in milliseconds to trigger detection.

threshold

Activation threshold for voice activity detection (VAD). Range: 0.0 to 1.0.

type

Property Details

autoTruncate

Whether to automatically truncate the audio buffer when speech stops.

autoTruncate?: boolean

Property Value

boolean

createResponse

Whether to automatically create a response when speech stops.

createResponse?: boolean

Property Value

boolean

endOfUtteranceDetection

Configuration for end-of-utterance detection.

endOfUtteranceDetection?: EouDetectionUnion

Property Value

EouDetectionUnion

interruptResponse

Whether to allow the user's speech to interrupt the assistant's response.

interruptResponse?: boolean

Property Value

boolean

languages

List of BCP-47 language codes for speech detection.

languages?: string[]

Property Value

string[]
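As a minimal sketch of preparing values for `languages`, the standard `Intl.getCanonicalLocales` function can canonicalize BCP-47 tags and reject structurally invalid ones. The helper name `normalizeLanguages` is hypothetical and not part of the SDK, and this check says nothing about which languages the service actually supports.

```typescript
// Hypothetical helper, not part of the SDK: canonicalizes BCP-47 tags and
// rejects structurally invalid ones before they are assigned to `languages`.
function normalizeLanguages(tags: string[]): string[] {
  // Intl.getCanonicalLocales throws a RangeError for malformed tags
  // and drops duplicate entries from the result.
  return Intl.getCanonicalLocales(tags);
}

// Example input with inconsistent casing (illustrative values only).
const languages: string[] = normalizeLanguages(["en-us", "FR-fr"]);
console.log(languages);
```

A tag written with an underscore, such as `en_US`, is not valid BCP-47 and would make the helper throw.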

prefixPaddingInMs

Amount of audio to include before speech is detected, in milliseconds.

prefixPaddingInMs?: number

Property Value

number

removeFillerWords

Whether to remove filler words (e.g., 'um', 'uh') from transcription.

removeFillerWords?: boolean

Property Value

boolean

silenceDurationInMs

Duration of silence required to end speech detection, in milliseconds.

silenceDurationInMs?: number

Property Value

number

speechDurationInMs

Minimum speech duration in milliseconds to trigger detection.

speechDurationInMs?: number

Property Value

number

threshold

Activation threshold for voice activity detection (VAD). Range: 0.0 to 1.0.

threshold?: number

Property Value

number
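Because the property type is a plain `number`, the documented 0.0 to 1.0 range is not enforced by the type system. A caller might guard it with something like the sketch below; `isValidThreshold` is a hypothetical helper, not an SDK function, and treating both endpoints as valid is an assumption.

```typescript
// Hypothetical guard, not part of the SDK: checks the documented range.
// Assumes both endpoints (0.0 and 1.0) are acceptable values.
function isValidThreshold(value: number): boolean {
  return Number.isFinite(value) && value >= 0.0 && value <= 1.0;
}

console.log(isValidThreshold(0.5)); // inside the documented range
console.log(isValidThreshold(1.5)); // outside the documented range
```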

type

type: "azure_semantic_vad"

Property Value

"azure_semantic_vad"
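The properties above can be put together as in the following sketch. To stay self-contained it declares a local interface that mirrors the documented members instead of importing the SDK; `OtherTurnDetection`, `TurnDetection`, and `summarize` are hypothetical names added for illustration, and all values are examples rather than service defaults.

```typescript
// Local mirror of the documented properties so the sketch is self-contained.
// In real code the interface would come from the SDK package instead.
interface AzureSemanticVad {
  type: "azure_semantic_vad";
  autoTruncate?: boolean;
  createResponse?: boolean;
  endOfUtteranceDetection?: unknown; // EouDetectionUnion in the SDK; shape not covered here
  interruptResponse?: boolean;
  languages?: string[];
  prefixPaddingInMs?: number;
  removeFillerWords?: boolean;
  silenceDurationInMs?: number;
  speechDurationInMs?: number;
  threshold?: number;
}

// Hypothetical second variant, present only to demonstrate narrowing on `type`.
interface OtherTurnDetection {
  type: "other";
}

type TurnDetection = AzureSemanticVad | OtherTurnDetection;

// Illustrative values only; these are not documented defaults.
const vad: AzureSemanticVad = {
  type: "azure_semantic_vad",
  threshold: 0.5,
  prefixPaddingInMs: 300,
  silenceDurationInMs: 500,
  languages: ["en-US"],
  removeFillerWords: true,
};

// The literal `type` acts as a discriminator, so the compiler narrows the union.
function summarize(td: TurnDetection): string {
  if (td.type === "azure_semantic_vad") {
    return `semantic VAD, silence=${td.silenceDurationInMs ?? "unset"} ms`;
  }
  return "other variant";
}

console.log(summarize(vad));
```

Only `type` is required; every other member is optional and can be omitted.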