microsoft-cognitiveservices-speech-sdk package

Classes

ActivityReceivedEventArgs

Defines contents of received message/events.

AudioConfig

Represents audio input configuration used for specifying what type of input to use (microphone, file, stream).

AudioInputStream

Represents audio input stream used for custom audio input configurations.

AudioOutputStream

Represents audio output stream used for custom audio output configurations.

AudioStreamFormat

Represents audio stream format used for custom audio input configurations.
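To make concrete what an audio stream format pins down (sample rate, bits per sample, channel count), here is a standalone sketch that builds a RIFF/WAV header for 16 kHz, 16-bit, mono PCM, a common speech-input format. No SDK calls are made; treating that format as the default is an assumption for illustration.

```javascript
// Standalone sketch: build the 44-byte RIFF/WAV header for raw PCM,
// the kind of format an audio stream format object describes.
function wavHeader(sampleRate, bitsPerSample, channels, dataBytes) {
  const blockAlign = channels * bitsPerSample / 8;
  const byteRate = sampleRate * blockAlign;
  const buf = Buffer.alloc(44);
  buf.write("RIFF", 0);                   // chunk ID
  buf.writeUInt32LE(36 + dataBytes, 4);   // chunk size
  buf.write("WAVE", 8);
  buf.write("fmt ", 12);
  buf.writeUInt32LE(16, 16);              // fmt sub-chunk size (PCM)
  buf.writeUInt16LE(1, 20);               // audio format 1 = PCM
  buf.writeUInt16LE(channels, 22);
  buf.writeUInt32LE(sampleRate, 24);
  buf.writeUInt32LE(byteRate, 28);
  buf.writeUInt16LE(blockAlign, 32);
  buf.writeUInt16LE(bitsPerSample, 34);
  buf.write("data", 36);
  buf.writeUInt32LE(dataBytes, 40);
  return buf;
}

// 16 kHz, 16-bit, mono — a typical speech-input format.
const header = wavHeader(16000, 16, 1, 0);
console.log(header.length);           // 44
console.log(header.readUInt32LE(28)); // byte rate: 32000
```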

AutoDetectSourceLanguageConfig

Language auto-detection configuration.

AutoDetectSourceLanguageResult

Defines the result of automatic source language detection.

AvatarConfig

Defines the talking avatar configuration.

AvatarEventArgs

Defines content for talking avatar events.

AvatarSynthesizer

Defines the avatar synthesizer.

AvatarVideoFormat

Defines the avatar output video format.

AvatarWebRTCConnectionResult

Defines the avatar WebRTC connection result.

BaseAudioPlayer

Base audio player class. Currently plays only PCM audio.

BotFrameworkConfig

Class that defines configurations for the dialog service connector object for using a Bot Framework backend.

CancellationDetails

Contains detailed information about why a result was canceled.

CancellationDetailsBase

Contains detailed information about why a result was canceled.

Connection

Connection is a proxy class for managing the connection to the speech service of the specified Recognizer. By default, a Recognizer autonomously manages its connection to the service when needed. The Connection class provides additional methods for users to explicitly open or close a connection and to subscribe to connection status changes. Use of Connection is optional, mainly for scenarios where fine-tuning of application behavior based on connection status is needed. Users can optionally call Open() to manually set up a connection in advance, before starting recognition on the Recognizer associated with this Connection. If the Recognizer needs to connect to or disconnect from the service, it sets up or shuts down the connection independently; in that case the Connection is notified of the change in connection status via the Connected/Disconnected events. Added in version 1.2.1.
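The explicit open/close and Connected/Disconnected notification flow described above can be sketched as a tiny standalone event source. No SDK import is used, and the class and callback names here are illustrative, not the SDK's exact API surface.

```javascript
// Minimal standalone sketch of the Connected/Disconnected notification
// pattern: callers may open explicitly, and subscribers are notified of
// every connection status change. Names are illustrative only.
class ConnectionSketch {
  constructor() {
    this.connected = null;    // callback for connected events
    this.disconnected = null; // callback for disconnected events
    this.isOpen = false;
  }
  open() {  // explicit open in advance, as the description above allows
    if (!this.isOpen) {
      this.isOpen = true;
      if (this.connected) this.connected({ sessionId: "demo" });
    }
  }
  close() { // explicit close; subscribers see the status change
    if (this.isOpen) {
      this.isOpen = false;
      if (this.disconnected) this.disconnected({ sessionId: "demo" });
    }
  }
}

const events = [];
const conn = new ConnectionSketch();
conn.connected = (e) => events.push("Connected:" + e.sessionId);
conn.disconnected = (e) => events.push("Disconnected:" + e.sessionId);
conn.open();
conn.close();
console.log(events.join(", ")); // Connected:demo, Disconnected:demo
```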

ConnectionEventArgs

Defines payload for connection events like Connected/Disconnected. Added in version 1.2.0.

ConnectionMessage

ConnectionMessage represents implementation specific messages sent to and received from the speech service. These messages are provided for debugging purposes and should not be used for production use cases with the Azure Cognitive Services Speech Service. Messages sent to and received from the Speech Service are subject to change without notice. This includes message contents, headers, payloads, ordering, etc. Added in version 1.11.0.

ConnectionMessageEventArgs
Conversation
ConversationExpirationEventArgs

Defines content of the conversation expiration event.

ConversationParticipantsChangedEventArgs

Defines content of the conversation participants changed event.

ConversationTranscriber

Performs speech recognition with speaker separation from microphone, file, or other audio input streams, and gets transcribed text as result.

ConversationTranscriptionCanceledEventArgs

Defines content of a conversation transcription canceled event.

ConversationTranscriptionEventArgs

Defines contents of conversation transcribed/transcribing event.

ConversationTranscriptionResult

Defines result of conversation transcription.

ConversationTranslationCanceledEventArgs
ConversationTranslationEventArgs

Defines payload for conversation translation events.

ConversationTranslationResult

Translation text result.

ConversationTranslator

Join, leave or connect to a conversation.

Coordinate

Defines a coordinate in 2D space.

CustomCommandsConfig

Class that defines configurations for the dialog service connector object for using a CustomCommands backend.

Diagnostics

Defines the diagnostics API for managing console output. Added in version 1.21.0.

DialogServiceConfig

Class that defines base configurations for the dialog service connector.

DialogServiceConnector

Dialog Service Connector

IntentRecognitionCanceledEventArgs

Defines payload of intent recognition canceled result events.

IntentRecognitionEventArgs

Intent recognition result event arguments.

IntentRecognitionResult

Intent recognition result.

IntentRecognizer

Intent recognizer.

KeywordRecognitionModel

Represents a keyword recognition model for recognizing when the user says a keyword to initiate further speech recognition.

LanguageUnderstandingModel

Language understanding model

Meeting
MeetingTranscriber
MeetingTranscriptionCanceledEventArgs

Defines content of a MeetingTranscriptionCanceledEvent.

MeetingTranscriptionEventArgs

Defines contents of meeting transcribed/transcribing event.

NoMatchDetails

Contains detailed information for NoMatch recognition results.

Participant

Represents a participant in a conversation. Added in version 1.4.0

PhraseListGrammar

Allows additions of new phrases to improve speech recognition.

Phrases added to the recognizer take effect at the start of the next recognition, or the next time the Speech SDK must reconnect to the speech service.

PronunciationAssessmentConfig

Pronunciation assessment configuration.

PronunciationAssessmentResult

Pronunciation assessment results.

PropertyCollection

Represents collection of properties and their values.
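The getProperty/setProperty contract of such a collection can be sketched standalone. This is an illustrative shape only (the SDK's class additionally accepts PropertyId keys); the property name used below mirrors a real speech property id, but the class here is a hypothetical stand-in.

```javascript
// Standalone sketch of a property collection: string keys, string
// values, and a getProperty that falls back to a supplied default.
class PropertyCollectionSketch {
  constructor() {
    this.props = new Map();
  }
  setProperty(name, value) {
    this.props.set(String(name), String(value));
  }
  getProperty(name, def) {
    // Return the stored value, or the default when the key is absent.
    return this.props.has(String(name)) ? this.props.get(String(name)) : def;
  }
}

const props = new PropertyCollectionSketch();
props.setProperty("SpeechServiceConnection_RecoLanguage", "en-US");
console.log(props.getProperty("SpeechServiceConnection_RecoLanguage")); // en-US
console.log(props.getProperty("missing", "fallback"));                  // fallback
```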

PullAudioInputStream

Represents audio input stream used for custom audio input configurations.

PullAudioInputStreamCallback

An abstract base class that defines callback methods (read() and close()) for custom audio input streams.
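The read()/close() contract can be sketched without the SDK: the caller hands read() an ArrayBuffer to fill and expects back the number of bytes written, with 0 signaling end of stream. The class below is a standalone illustration of that shape, not a subclass of the SDK's base class.

```javascript
// Standalone sketch of a pull-style audio input callback backed by an
// in-memory byte buffer. read() fills the caller's ArrayBuffer and
// returns the byte count; 0 means the stream is exhausted.
class BufferPullCallback {
  constructor(pcmBytes) {
    this.data = pcmBytes; // Uint8Array of raw PCM samples
    this.pos = 0;
  }
  read(dataBuffer) {
    const out = new Uint8Array(dataBuffer);
    const n = Math.min(out.length, this.data.length - this.pos);
    out.set(this.data.subarray(this.pos, this.pos + n));
    this.pos += n;
    return n; // 0 => end of stream
  }
  close() {
    this.pos = this.data.length; // release the source
  }
}

const cb = new BufferPullCallback(new Uint8Array([1, 2, 3, 4, 5]));
const chunk = new ArrayBuffer(4);
console.log(cb.read(chunk)); // 4
console.log(cb.read(chunk)); // 1
console.log(cb.read(chunk)); // 0 (exhausted)
```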

PullAudioOutputStream

Represents a memory-backed pull audio output stream used for custom audio output configurations.

PushAudioInputStream

Represents a memory-backed push audio input stream used for custom audio input configurations.

PushAudioOutputStream

Represents audio output stream used for custom audio output configurations.

PushAudioOutputStreamCallback

An abstract base class that defines callback methods (write() and close()) for custom audio output streams.
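The push-style counterpart can be sketched the same way: synthesized audio chunks arrive via write(), and close() fires when the audio ends. The collector below is a standalone illustration of that write()/close() shape, not a subclass of the SDK's base class.

```javascript
// Standalone sketch of a push-style audio output callback that
// collects every chunk it receives into memory.
class CollectingPushCallback {
  constructor() {
    this.chunks = [];
    this.closed = false;
  }
  write(dataBuffer) {
    // Copy the incoming chunk; the caller may reuse its buffer.
    this.chunks.push(new Uint8Array(dataBuffer).slice());
  }
  close() {
    this.closed = true; // no more audio will arrive
  }
  totalBytes() {
    return this.chunks.reduce((n, c) => n + c.length, 0);
  }
}

const sink = new CollectingPushCallback();
sink.write(new Uint8Array([0, 1, 2]).buffer);
sink.write(new Uint8Array([3, 4]).buffer);
sink.close();
console.log(sink.totalBytes()); // 5
console.log(sink.closed);       // true
```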

RecognitionEventArgs

Defines payload for session events like Speech Start/End Detected.

RecognitionResult

Defines result of speech recognition.

Recognizer

Defines the base class Recognizer which mainly contains common event handlers.

ServiceEventArgs

Defines payload for any service message event. Added in version 1.9.0.

SessionEventArgs

Defines content for session events like SessionStarted/Stopped, SoundStarted/Stopped.

SourceLanguageConfig

Source Language configuration.

SpeakerAudioDestination

Represents the speaker playback audio destination, which works only in browsers. Note: the SDK will try to use Media Source Extensions to play audio. The MP3 format has better support on Microsoft Edge, Chrome, and Safari (desktop), so it is better to specify MP3 format for playback.

SpeakerIdentificationModel

Defines the SpeakerIdentificationModel class for speaker recognition. The model contains a set of profiles against which to identify speaker(s).

SpeakerRecognitionCancellationDetails
SpeakerRecognitionResult

Defines the result of a speaker recognition operation.

SpeakerRecognizer

Defines the SpeakerRecognizer class for speaker recognition. Handles speaker identification and verification operations against voice profiles.

SpeakerVerificationModel

Defines the SpeakerVerificationModel class for speaker recognition. The model contains a profile against which to verify a speaker.

SpeechConfig

Speech configuration.

SpeechConfigImpl
SpeechRecognitionCanceledEventArgs
SpeechRecognitionEventArgs

Defines contents of speech recognizing/recognized event.

SpeechRecognitionResult

Defines result of speech recognition.

SpeechRecognizer

Performs speech recognition from microphone, file, or other audio input streams, and gets transcribed text as result.

SpeechSynthesisBookmarkEventArgs

Defines contents of speech synthesis bookmark event.

SpeechSynthesisEventArgs

Defines contents of speech synthesis events.

SpeechSynthesisResult

Defines result of speech synthesis.

SpeechSynthesisVisemeEventArgs

Defines contents of speech synthesis viseme event.

SpeechSynthesisWordBoundaryEventArgs

Defines contents of speech synthesis word boundary event.

SpeechSynthesizer

Defines the class SpeechSynthesizer for text to speech. Updated in version 1.16.0

SpeechTranslationConfig

Speech translation configuration.

SynthesisResult

Base class for synthesis results

SynthesisVoicesResult

Defines the result of a speech synthesis voices list request.

Synthesizer
TranslationRecognitionCanceledEventArgs

Defines payload of translation recognition canceled result events.

TranslationRecognitionEventArgs

Translation text result event arguments.

TranslationRecognitionResult

Translation text result.

TranslationRecognizer

Translation recognizer

TranslationSynthesisEventArgs

Translation Synthesis event arguments

TranslationSynthesisResult

Defines translation synthesis result, i.e. the voice output of the translated text in the target language.

Translations

Represents collection of parameters and their values.

TurnStatusReceivedEventArgs

Defines contents of a received turn status message.

User
VoiceInfo

Information about a speech synthesis voice. Added in version 1.20.0.

VoiceProfile

Defines the VoiceProfile class for speaker recognition.

VoiceProfileCancellationDetails
VoiceProfileClient

Defines the VoiceProfileClient class for speaker recognition. Handles voice profile operations for the user (e.g. createProfile, deleteProfile).

VoiceProfileEnrollmentCancellationDetails
VoiceProfileEnrollmentResult

Defines the result of a voice profile enrollment operation.

VoiceProfilePhraseResult

Defines the result of a voice profile activation phrases request.

VoiceProfileResult

Defines the result of a voice profile operation (e.g. creation, deletion).

Interfaces

CancellationEventArgs
ConversationInfo
IParticipant

Represents a participant in a conversation. Added in version 1.4.0

IPlayer

Represents audio player interface to control the audio playback, such as pause, resume, etc.

IVoiceJson
MeetingInfo
VoiceSignature

Enums

AudioFormatTag
CancellationErrorCode

Defines error code in case that CancellationReason is Error. Added in version 1.1.0.

CancellationReason

Defines the possible reasons a recognition result might be canceled.

LanguageIdMode

Language identification mode.

LogLevel
NoMatchReason

Defines the possible reasons a recognition result might not be recognized.

OutputFormat

Defines speech recognizer output formats.

ParticipantChangedReason
ProfanityOption

Profanity option. Added in version 1.7.0.

PronunciationAssessmentGradingSystem

Defines the point system for pronunciation score calibration; default value is FivePoint. Added in version 1.15.0

PronunciationAssessmentGranularity

Defines the pronunciation evaluation granularity; default value is Phoneme. Added in version 1.15.0

PropertyId

Defines speech property ids.

ResultReason

Defines the possible reasons a recognition result might be generated.

ServicePropertyChannel

Defines channels used to pass property settings to service. Added in version 1.7.0.

SpeakerRecognitionResultType
SpeechSynthesisBoundaryType

Defines the boundary type of speech synthesis boundary event.

SpeechSynthesisOutputFormat

Defines speech synthesis audio output formats. Updated in version 1.17.0.

VoiceProfileType

Defines the type of a voice profile.