Enum SPXPropertyId

Defines property ids.

Changed in version 1.4.0

Name	Description
SPXSpeechServiceConnectionKey	The Cognitive Services Speech Service subscription key. Under normal circumstances, you shouldn't have to use this property directly. Instead, use SPXSpeechConfiguration.initWithSubscription.
SPXSpeechServiceConnectionEndpoint	The Cognitive Services Speech Service endpoint (url). Under normal circumstances, you shouldn't have to use this property directly. Instead, use SPXSpeechConfiguration.initWithEndpoint.
SPXSpeechServiceConnectionRegion	The Cognitive Services Speech Service region. Under normal circumstances, you shouldn't have to use this property directly. Instead, use SPXSpeechConfiguration.initWithEndpoint, SPXSpeechConfiguration.initWithHost, or SPXSpeechConfiguration.initWithAuthorizationToken.
SPXSpeechServiceAuthorizationToken	The Cognitive Services Speech Service authorization token (aka access token). Under normal circumstances, you shouldn't have to use this property directly. Instead, use SPXSpeechConfiguration.initWithAuthorizationToken, SPXSpeechRecognizer.authorizationToken, or SPXTranslationRecognizer.authorizationToken.
SPXSpeechServiceAuthorizationType	The Cognitive Services Speech Service authorization type. Currently unused.
SPXSpeechServiceConnectionEndpointId	The Cognitive Services Custom Speech or Custom Voice Service endpoint id.
SPXSpeechServiceConnectionHost	The Cognitive Services Speech Service host (url). Under normal circumstances, you shouldn't have to use this property directly. Instead, use SPXSpeechConfiguration.initWithHost.
SPXSpeechServiceConnectionProxyHostName	The host name of the proxy server. Not implemented yet.
SPXSpeechServiceConnectionProxyPort	The port of the proxy server. Not implemented yet.
SPXSpeechServiceConnectionProxyUserName	The user name of the proxy server. Not implemented yet.
SPXSpeechServiceConnectionProxyPassword	The password of the proxy server. Not implemented yet.
SPXSpeechServiceConnectionUrl	The URL string built from the speech configuration. This property is intended to be read-only. The SDK uses it internally.
SPXSpeechServiceConnectionProxyHostBypass	Specifies the list of hosts for which proxies should not be used. This setting overrides all other configurations. Hostnames are separated by commas and are matched in a case-insensitive manner. Wildcards are not supported.
SPXSpeechServiceConnectionTranslationToLanguages	The comma-separated list of languages (BCP-47 format) used as target translation languages. Under normal circumstances, you shouldn't have to use this property directly. Instead, use SPXSpeechTranslationConfiguration.addTargetLanguage and the read-only SPXSpeechTranslationConfiguration.targetLanguages and SPXTranslationRecognizer.targetLanguages collections.
SPXSpeechServiceConnectionTranslationVoice	The name of the Cognitive Services Text to Speech Service voice. Under normal circumstances, you shouldn't have to use this property directly. Instead, use SPXSpeechTranslationConfiguration.voiceName.
SPXSpeechServiceConnectionTranslationFeatures	Translation features. For internal use.
SPXSpeechServiceConnectionRecognitionMode	The Cognitive Services Speech Service recognition mode. Can be "INTERACTIVE", "CONVERSATION", or "DICTATION". This property is intended to be read-only. The SDK uses it internally.
SPXSpeechServiceConnectionRecognitionLanguage	The spoken language to be recognized (in BCP-47 format).
SPXSpeechSessionId	The session id. This id is a universally unique identifier (aka UUID) representing a specific binding of an audio input stream and the underlying speech recognition instance to which it is bound. Under normal circumstances, you shouldn't have to use this property directly. Instead, use SPXSessionEventArgs.sessionId.
SPXSpeechServiceConnectionSynthesisLanguage	The spoken language to be synthesized (e.g. en-US).
SPXSpeechServiceConnectionSynthesisVocie	The name of the voice to be used for speech synthesis.
SPXSpeechServiceConnectionSynthesisOutputFormat	The string that specifies the speech synthesis output audio format (e.g. riff-16khz-16bit-mono-pcm).
SPXSpeechServiceConnectionSynthesisEnableCompressedAudioTransmission	Indicates whether to use a compressed audio format for speech synthesis audio transmission. This property only takes effect when SPXSpeechServiceConnectionSynthesisOutputFormat is set to a PCM format. If this property is not set and GStreamer is available, the SDK uses a compressed format for synthesized audio transmission and decodes it. You can set this property to "false" to use raw PCM format for transmission on the wire. Added in version 1.16.0.
SPXSpeechServiceConnectionSynthBackend	The string that specifies the TTS backend; valid options are online and offline. Under normal circumstances, you shouldn't have to use this property directly. Instead, use SPXEmbeddedSpeechConfig.initWithPath or SPXEmbeddedSpeechConfig.initWithPaths to set the synthesis backend to offline. Added in version 1.19.0.
SPXSpeechServiceConnectionSynthOfflineDataPath	The data file path(s) for the offline synthesis engine; only valid when the synthesis backend is offline. Under normal circumstances, you shouldn't have to use this property directly. Instead, use SPXEmbeddedSpeechConfig.initWithPath or SPXEmbeddedSpeechConfig.initWithPaths. Added in version 1.19.0.
SPXSpeechServiceConnectionSynthOfflineVoice	The name of the offline TTS voice to be used for speech synthesis. Added in version 1.19.0
SPXSpeechServiceConnectionVoicesListEndpoint	The Cognitive Services Speech Service voices list API endpoint (url). Under normal circumstances, you don't need to specify this property; the SDK constructs it from the region, host, or endpoint of SPXSpeechConfig. Added in version 1.16.0.
SPXSpeechServiceConnectionInitialSilenceTimeoutMs	The initial silence timeout value (in milliseconds) used by the service. Added in version 1.5.0
SPXSpeechServiceConnectionEndSilenceTimeoutMs	This property is deprecated. For current information about silence timeouts, please visit https://aka.ms/csspeech/timeouts.
SPXSpeechServiceConnectionEnableAudioLogging	A boolean value specifying whether audio logging is enabled in the service or not. Audio and content logs are stored either in Microsoft-owned storage, or in your own storage account linked to your Cognitive Services subscription (Bring Your Own Storage (BYOS) enabled Speech resource). Added in version 1.5.0
SPXSpeechServiceConnectionLanguageIdMode	The speech service connection language identifier mode. Can be "AtStart" (the default) or "Continuous". For more details, see the Language Identification document at https://aka.ms/speech/lid?pivots=programming-language-objectivec. Added in version 1.25.0.
SPXSpeechServiceConnectionTranslationCategoryId	The speech service connection translation categoryId.
SPXSpeechServiceConnectionAutoDetectSourceLanguages	The source language candidates used for auto language detection. Added in version 1.12.0.
SPXSpeechServiceConnectionAutoDetectSourceLanguageResult	The auto language detection result. Added in version 1.12.0.
SPXSpeechServiceResponseRequestDetailedResultTrueFalse	The requested Cognitive Services Speech Service response output format (simple or detailed). Not implemented yet.
SPXSpeechServiceResponseRequestProfanityFilterTrueFalse	The requested Cognitive Services Speech Service response output profanity level. Currently unused.
SPXSpeechServiceResponseProfanityOption	The requested Cognitive Services Speech Service response output profanity setting. Allowed values are "masked", "removed", and "raw". Added in version 1.5.0.
SPXSpeechServiceResponsePostProcessingOption	A string value specifying which post processing option should be used by service. Allowed values are "TrueText". Added in version 1.5.0
SPXSpeechServiceResponseRequestWordLevelTimestamps	A boolean value specifying whether to include word-level timestamps in the response result. Added in version 1.5.0
SPXSpeechServiceResponseStablePartialResultThreshold	The number of times a word has to be in partial results to be returned. Added in version 1.5.0
SPXSpeechServiceResponseOutputFormatOption	A string value specifying the output format option in the response result. Internal use only. Added in version 1.5.0.
SPXSpeechServiceResponseRequestSnr	A boolean value specifying whether to include SNR (signal to noise ratio) in the response result. Added in version 1.18.0.
SPXSpeechServiceResponseTranslationRequestStablePartialResult	A boolean value specifying whether to stabilize partial translation results by omitting words at the end. Added in version 1.5.0.
SPXSpeechServiceResponseRequestWordBoundary	A boolean value specifying whether to request WordBoundary events. Added in version 1.21.0.
SPXSpeechServiceResponseRequestPunctuationBoundary	A boolean value specifying whether to request punctuation boundary in WordBoundary Events. Default is true. Added in version 1.21.0.
SPXSpeechServiceResponseRequestSentenceBoundary	A boolean value specifying whether to request sentence boundary in WordBoundary Events. Default is false. Added in version 1.21.0.
SPXSpeechServiceResponseSynthesisEventsSyncToAudio	A boolean value specifying whether the SDK should synchronize synthesis metadata events (e.g. word boundary, viseme) to the audio playback. This only takes effect when the audio is played through the SDK. Default is true. If set to false, the SDK fires the events as they arrive from the service, which may be out of sync with the audio playback. Added in version 1.31.0.
SPXSpeechServiceResponseJsonResult	The Cognitive Services Speech Service response output (in JSON format). This property is available on recognition result objects only.
SPXSpeechServiceResponseJsonErrorDetails	The Cognitive Services Speech Service error details (in JSON format). Under normal circumstances, you shouldn't have to use this property directly. Instead, use SPXCancellationDetails.errorDetails.
SPXSpeechServiceResponseRecognitionLatencyMs	The recognition latency in milliseconds. Read-only, available on final speech/translation results. This measures the latency between when an audio input is received by the SDK, and the moment the final result is received from the service. The SDK computes the time difference between the last audio fragment from the audio input that is contributing to the final result, and the time the final result is received from the speech service. Added in version 1.3.0.
SPXSpeechServiceResponseSynthesisFirstByteLatencyMs	The speech synthesis first byte latency in milliseconds. Read-only, available on final speech synthesis results. This measures the latency between when synthesis processing starts and the moment the first byte of audio is available. Added in version 1.17.0.
SPXSpeechServiceResponseSynthesisFinishLatencyMs	The speech synthesis all bytes latency in milliseconds. Read-only, available on final speech synthesis results. This measures the latency between when synthesis processing starts and the moment the whole audio is synthesized. Added in version 1.17.0.
SPXSpeechServiceResponseSynthesisUnderrunTimeMs	The underrun time for speech synthesis in milliseconds. Read-only, available on results in SynthesisCompleted events. This measures the total underrun time from when the playback buffer (of length SPXAudioConfigPlaybackBufferLengthInMs) is filled until synthesis is completed. Added in version 1.17.0.
SPXSpeechServiceResponseSynthesisConnectionLatencyMs	The speech synthesis connection latency in milliseconds. Read-only, available on final speech synthesis results. This measures the latency between when synthesis processing starts and the moment the HTTP/WebSocket connection is established. Added in version 1.26.0.
SPXSpeechServiceResponseSynthesisNetworkLatencyMs	The speech synthesis network latency in milliseconds. Read-only, available on final speech synthesis results. This measures the network round trip time. Added in version 1.26.0
SPXSpeechServiceResponseSynthesisServiceLatencyMs	The speech synthesis service latency in milliseconds. Read-only, available on final speech synthesis results. This measures the service processing time to synthesize the first byte of audio. Added in version 1.26.0
SPXSpeechServiceResponseSynthesisBackend	Indicates which backend finished the synthesis. Read-only, available on speech synthesis results, except for the result in the SynthesisStarted event. Added in version 1.17.0.
SPXSpeechServiceResponseDiarizeIntermediateResults	Determines if intermediate results contain speaker identification. Allowed values are "true" or "false". If set to "true", the intermediate results will contain speaker identification. The default value if unset or set to an invalid value is "false". This is currently only supported for scenarios using the ConversationTranscriber. Added in version 1.40.
SPXCancellationDetailsReason	The cancellation reason. Currently unused.
SPXCancellationDetailsReasonText	The cancellation text. Currently unused.
SPXCancellationDetailsReasonDetailedText	The cancellation detailed text. Currently unused.
SPXAudioConfigDeviceNameForRender	The device name for audio render. Under normal circumstances, you shouldn't have to use this property directly. Instead, use SPXAudioConfiguration.initWithSpeakerOutput. Added in version 1.17.0
SPXAudioConfigPlaybackBufferLengthInMs	Playback buffer length in milliseconds, default is 50 milliseconds. Added in version 1.17.0.
SPXSpeechLogFilename	The file name to write logs.
SPXSpeechSegmentationSilenceTimeoutMs	A duration of detected silence, measured in milliseconds, after which speech-to-text will determine a spoken phrase has ended and generate a final Recognized result. Configuring this timeout may be helpful in situations where spoken input is significantly faster or slower than usual and default segmentation behavior consistently yields results that are too long or too short. Segmentation timeout values that are inappropriately high or low can negatively affect speech-to-text accuracy; this property should be carefully configured and the resulting behavior should be thoroughly validated as intended.
SPXSpeechSegmentationMaximumTimeMs	The maximum length of a spoken phrase when using the "Time" segmentation strategy. SPXSpeechSegmentationSilenceTimeoutMs must be set in order to use this setting. As the length of a spoken phrase approaches this value, the SPXSpeechSegmentationSilenceTimeoutMs value is progressively reduced until either the phrase silence timeout is hit or the phrase reaches the maximum length. The value must be in the range [20000, 70000] milliseconds.
SPXSpeechSegmentationStrategy	The strategy used to determine when a spoken phrase has ended and a final recognized result should be generated. Allowed values are "Default", "Time", and "Semantic".
SPXSpeechStartEventSensitivity	Controls how quickly the system signals a potential speech start after detecting voice activity. This setting does not alter the underlying voice activity detection algorithm. It only adjusts the timing criteria for raising a SpeechStartDetected event.
SPXDataBuffer_TimeStamp	The timestamp associated with a data buffer written by the client when using Pull/Push audio mode streams. The timestamp is a 64-bit value with a resolution of 90 kHz, the same as the presentation timestamp in an MPEG transport stream. See https://en.wikipedia.org/wiki/Presentation_timestamp. Added in version 1.13.0.
SPXDataBuffer_UserId	The user id associated with a data buffer written by the client when using Pull/Push audio mode streams.
SPXPronunciationAssessment_ReferenceText	The reference text of the audio for pronunciation evaluation. For this and the following pronunciation assessment parameters, see the table Pronunciation assessment parameters. Under normal circumstances, you shouldn't have to use this property directly.
SPXPronunciationAssessment_GradingSystem	The point system for pronunciation score calibration (FivePoint or HundredMark). Under normal circumstances, you shouldn't have to use this property directly.
SPXPronunciationAssessment_Granularity	The pronunciation evaluation granularity (Phoneme, Word, or FullText). Under normal circumstances, you shouldn't have to use this property directly.
SPXPronunciationAssessment_EnableMiscue	Specifies whether to enable miscue calculation. With this enabled, the pronounced words are compared to the reference text and marked with omission/insertion based on the comparison. The default setting is False. Under normal circumstances, you shouldn't have to use this property directly.
SPXPronunciationAssessment_PhonemeAlphabet	The pronunciation evaluation phoneme alphabet. The valid values are "SAPI" (default) and "IPA". Under normal circumstances, you shouldn't have to use this property directly. Instead, use SPXPronunciationAssessmentConfiguration.phonemeAlphabet.
SPXPronunciationAssessment_NBestPhonemeCount	The pronunciation evaluation nbest phoneme count. Under normal circumstances, you shouldn't have to use this property directly. Instead, use SPXPronunciationAssessmentConfiguration.nbestPhonemeCount.
SPXPronunciationAssessment_EnableProsodyAssessment	Whether to enable prosody assessment. Under normal circumstances, you shouldn't have to use this property directly. Instead, use SPXPronunciationAssessmentConfiguration.enableProsodyAssessment.
SPXPronunciationAssessment_Json	The JSON string of pronunciation assessment parameters. Under normal circumstances, you shouldn't have to use this property directly.
SPXPronunciationAssessment_Params	Pronunciation assessment parameters. This property is intended to be read-only. The SDK uses it internally.
SPXSpeechSynthesis_FrameTimeoutInterval	The timeout interval in milliseconds between synthesized speech audio frames. The greater of this and 10 seconds is used as a hard frame timeout. A speech synthesis timeout occurs if a) the time passed since the latest frame exceeds this timeout interval and the Real-Time Factor (RTF) exceeds its maximum value, or b) the time passed since the latest frame exceeds the hard frame timeout.
SPXSpeechSynthesis_RtfTimeoutThreshold	The maximum Real-Time Factor (RTF) for speech synthesis. The RTF is calculated as RTF = f(d)/d where f(d) is the time taken to synthesize speech audio of duration d.
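
The timeout rule described for SPXSpeechSynthesis_FrameTimeoutInterval and SPXSpeechSynthesis_RtfTimeoutThreshold can be sketched in a few lines. This is an illustrative Python model of the rule as worded in the table, not SDK code: the function and parameter names are hypothetical, and the numeric values in the usage note are arbitrary examples rather than SDK defaults.

```python
# Hypothetical model of the synthesis timeout rule described in the table.
# None of these names come from the Speech SDK.

HARD_TIMEOUT_FLOOR_MS = 10_000  # "the greater of this and 10 seconds"


def real_time_factor(synthesis_time_ms: float, audio_duration_ms: float) -> float:
    """RTF = f(d) / d, where f(d) is the time taken to synthesize audio of duration d."""
    if audio_duration_ms <= 0:
        raise ValueError("audio duration must be positive")
    return synthesis_time_ms / audio_duration_ms


def synthesis_timed_out(
    ms_since_last_frame: float,
    rtf: float,
    frame_timeout_interval_ms: float,
    rtf_timeout_threshold: float,
) -> bool:
    """Return True if either timeout condition from the table holds."""
    # The hard frame timeout is the greater of the interval and 10 seconds.
    hard_timeout_ms = max(frame_timeout_interval_ms, HARD_TIMEOUT_FLOOR_MS)

    # (a) the frame interval is exceeded AND the RTF exceeds its maximum value
    soft_timeout = (
        ms_since_last_frame > frame_timeout_interval_ms
        and rtf > rtf_timeout_threshold
    )
    # (b) the hard frame timeout is exceeded, regardless of the RTF
    hard_timeout = ms_since_last_frame > hard_timeout_ms

    return soft_timeout or hard_timeout
```

With an example interval of 3000 ms and an example RTF threshold of 2.0, a 4000 ms gap times out only if the RTF is above 2.0, while a gap longer than 10000 ms times out whatever the RTF is.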

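The 90 kHz resolution stated for SPXDataBuffer_TimeStamp means one second corresponds to 90000 timestamp ticks. The following Python sketch shows the conversion in both directions. The helper names are hypothetical, and treating the 64-bit value as unsigned is an assumption, since the table does not state its signedness.

```python
# Hypothetical helpers for 90 kHz timestamps; not part of the Speech SDK.

TICKS_PER_SECOND = 90_000  # 90 kHz resolution, as stated in the table
MAX_TICKS = 2**64 - 1      # assumption: the 64-bit value is unsigned


def seconds_to_ticks(seconds: float) -> int:
    """Convert a time offset in seconds to 90 kHz ticks."""
    ticks = round(seconds * TICKS_PER_SECOND)
    if not 0 <= ticks <= MAX_TICKS:
        raise ValueError("timestamp does not fit in the assumed 64-bit range")
    return ticks


def ticks_to_seconds(ticks: int) -> float:
    """Convert 90 kHz ticks back to seconds."""
    return ticks / TICKS_PER_SECOND
```

For example, a buffer that starts half a second into the stream corresponds to 45000 ticks.
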
Last updated on 2026-02-09
