Call Media - Recognize

Service: Communication

API Version: 2025-05-15

Recognize media from a call.

POST {endpoint}/calling/callConnections/{callConnectionId}:recognize?api-version=2025-05-15

URI Parameters

Name	In	Required	Type	Description
callConnectionId	path	True	string	The call connection ID.
endpoint	path	True	string (url)	The endpoint of the Azure Communication resource.
api-version	query	True	string	Version of the API to invoke.
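The three URI parameters above combine into a single request URL. A minimal Python sketch of that assembly follows; the helper name `build_recognize_url` is hypothetical, and percent-encoding the connection ID is a defensive assumption rather than something this reference requires.

```python
from urllib.parse import quote, urlencode


def build_recognize_url(endpoint: str, call_connection_id: str,
                        api_version: str = "2025-05-15") -> str:
    """Assemble the Recognize URL from endpoint, callConnectionId and api-version."""
    base = endpoint.rstrip("/")  # avoid a double slash when the endpoint ends with "/"
    # Assumption: encode the path segment in case the ID holds reserved characters.
    connection = quote(call_connection_id, safe="")
    query = urlencode({"api-version": api_version})
    return f"{base}/calling/callConnections/{connection}:recognize?{query}"


url = build_recognize_url(
    "https://contoso.communications.azure.com",
    "18dea47f-b081-4107-9a5c-4300819d2c6c",
)
print(url)
```

With the values from the sample request, the result matches the URL shown in the example section.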

Request Header

Name	Required	Type	Description
Authorization	True	string	An Azure Communication Services user access token.

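Request Body

Name	Required	Type	Description
recognizeInputType	True	RecognizeInputType	Determines the type of the recognition.
recognizeOptions	True	RecognizeOptions	Defines options for recognition.
interruptCallMediaOperation		boolean	If set, recognize can barge into other existing queued or currently processing requests.
operationCallbackUri		string	Sets a callback URI that overrides the default callback URI set by CreateCall/AnswerCall for this operation. This setup is per action. If it is not set, the default callback URI set by CreateCall/AnswerCall is used.
operationContext		string	The value to identify the context of the operation.
playPrompt		PlaySource	The source of the audio to be played for recognition.
playPrompts		PlaySource[]	The sources of the audio to be played for recognition.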

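Responses

Name	Type	Description
202 Accepted		The service has accepted the recognize request and will begin processing it. A RecognizeCompleted or RecognizeFailed event is delivered to the specified callback URI to report the status of the request.
Other Status Codes	CommunicationErrorResponse	Error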
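A 202 response only acknowledges the request; the outcome arrives later at the callback URI as a RecognizeCompleted or RecognizeFailed event. The Python sketch below sorts incoming events by name. This reference does not describe the event payload, so the `type` field and the dotted prefix (for example `Microsoft.Communication.RecognizeCompleted`) are assumptions, and `classify_recognize_event` is a hypothetical helper.

```python
def classify_recognize_event(event: dict) -> str:
    """Return 'completed', 'failed' or 'other' for one callback event.

    Assumption: the event is a JSON object whose "type" field ends with
    the event name after the last dot.
    """
    event_type = str(event.get("type", ""))
    name = event_type.rsplit(".", 1)[-1]  # keep only the part after the last dot
    if name == "RecognizeCompleted":
        return "completed"
    if name == "RecognizeFailed":
        return "failed"
    return "other"


# Hypothetical payloads, for illustration only.
events = [
    {"type": "Microsoft.Communication.RecognizeCompleted"},
    {"type": "Microsoft.Communication.RecognizeFailed"},
]
outcomes = [classify_recognize_event(e) for e in events]
print(outcomes)
```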

Security

Authorization

An Azure Communication Services user access token.

Type: apiKey
In: header

Examples

CallMedia_Recognize

Sample request

HTTP

POST https://contoso.communications.azure.com/calling/callConnections/18dea47f-b081-4107-9a5c-4300819d2c6c:recognize?api-version=2025-05-15

{
  "recognizeInputType": "dtmf",
  "playPrompt": {
    "kind": "file",
    "file": {
      "uri": "https://some.file.azure.com/sample.wav"
    }
  },
  "recognizeOptions": {
    "interruptPrompt": true,
    "initialSilenceTimeoutInSeconds": 5,
    "targetParticipant": {
      "kind": "communicationUser",
      "communicationUser": {
        "id": "8:acs:b9614373-fd0b-480c-8fd2-cb58b70eab9f_da7be3a9-8788-42a6-85c6-56b2cf784fce"
      }
    },
    "dtmfOptions": {
      "interToneTimeoutInSeconds": 3,
      "maxTonesToCollect": 5,
      "stopTones": [
        "pound"
      ]
    }
  },
  "operationCallbackUri": "https://app.contoso.com/callback"
}
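The same request can be prepared from Python with only the standard library. This is a sketch under stated assumptions: `build_recognize_request` is a hypothetical helper, `<access-token>` is a placeholder, the token is passed into the Authorization header verbatim, and the `Content-Type` header is assumed rather than documented here. Nothing is sent until the request is passed to `urllib.request.urlopen`.

```python
import json
import urllib.request

URL = ("https://contoso.communications.azure.com/calling/callConnections/"
       "18dea47f-b081-4107-9a5c-4300819d2c6c:recognize?api-version=2025-05-15")

# Same body as the sample request above.
body = {
    "recognizeInputType": "dtmf",
    "playPrompt": {
        "kind": "file",
        "file": {"uri": "https://some.file.azure.com/sample.wav"},
    },
    "recognizeOptions": {
        "interruptPrompt": True,
        "initialSilenceTimeoutInSeconds": 5,
        "targetParticipant": {
            "kind": "communicationUser",
            "communicationUser": {
                "id": "8:acs:b9614373-fd0b-480c-8fd2-cb58b70eab9f_da7be3a9-8788-42a6-85c6-56b2cf784fce"
            },
        },
        "dtmfOptions": {
            "interToneTimeoutInSeconds": 3,
            "maxTonesToCollect": 5,
            "stopTones": ["pound"],
        },
    },
    "operationCallbackUri": "https://app.contoso.com/callback",
}


def build_recognize_request(url: str, access_token: str,
                            payload: dict) -> urllib.request.Request:
    """Build (but do not send) the POST request for Recognize."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": access_token,       # token placed in the header as-is
            "Content-Type": "application/json",  # assumption, not stated in this reference
        },
        method="POST",
    )


request = build_recognize_request(URL, "<access-token>", body)
print(request.get_method(), request.full_url)
```

Sending the request and checking for status 202 is left to the caller, since that step needs a live resource and a valid token.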

Sample response

Status code: 202

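Definitions

Name	Description
Choice
DtmfOptions	Options for DTMF recognition
FileSource
PlaySource
PlaySourceType	Defines the type of the play source
RecognizeInputType	Determines the type of the recognition.
RecognizeOptions
RecognizeRequest
SpeechOptions	Options for continuous speech recognition
SsmlSource
TextSource
Tone
VoiceKind	Voice kind type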

Choice

Object

Name	Type	Description
label	string	Identifier for a given choice
phrases	string[]	List of phrases to recognize
tone	Tone
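As a rough illustration of how Choice objects fit into `recognizeOptions.choices`, the Python sketch below builds two entries. The helper `make_choice` and the labels and phrases are hypothetical; only the field names come from the table above.

```python
from typing import Iterable, Optional


def make_choice(label: str, phrases: Iterable[str],
                tone: Optional[str] = None) -> dict:
    """Build one Choice object; tone is included only when given."""
    if not label:
        raise ValueError("label must be a non-empty string")
    choice = {"label": label, "phrases": list(phrases)}
    if tone is not None:
        choice["tone"] = tone  # a Tone enum value such as "one"
    return choice


# Hypothetical menu: say "yes" or press 1 to confirm, say "no" or press 2 to cancel.
choices = [
    make_choice("confirm", ["yes", "confirm"], tone="one"),
    make_choice("cancel", ["no", "cancel"], tone="two"),
]
recognize_options = {"choices": choices}
print(recognize_options)
```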

DtmfOptions

Object

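Options for DTMF recognition

Name	Type	Description
interToneTimeoutInSeconds	integer (int32) minimum: 1 maximum: 60	Time to wait between DTMF inputs before stopping recognition.
maxTonesToCollect	integer (int32)	Maximum number of DTMF tones to collect.
stopTones	Tone[]	List of tones that stop recognition.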
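The constraints in this table can be checked on the client before a request is sent. The Python sketch below is one way to do that; `make_dtmf_options` is a hypothetical helper, and the tone set is copied from the Tone enumeration in this reference.

```python
VALID_TONES = frozenset({
    "zero", "one", "two", "three", "four", "five", "six", "seven",
    "eight", "nine", "a", "b", "c", "d", "pound", "asterisk",
})


def make_dtmf_options(inter_tone_timeout_s=None, max_tones_to_collect=None,
                      stop_tones=()):
    """Build a DtmfOptions object, rejecting values outside the documented limits."""
    options = {}
    if inter_tone_timeout_s is not None:
        # Documented range: minimum 1, maximum 60.
        if not 1 <= inter_tone_timeout_s <= 60:
            raise ValueError("interToneTimeoutInSeconds must be between 1 and 60")
        options["interToneTimeoutInSeconds"] = inter_tone_timeout_s
    if max_tones_to_collect is not None:
        options["maxTonesToCollect"] = max_tones_to_collect
    unknown = [tone for tone in stop_tones if tone not in VALID_TONES]
    if unknown:
        raise ValueError(f"unknown tones: {unknown}")
    if stop_tones:
        options["stopTones"] = list(stop_tones)
    return options


dtmf_options = make_dtmf_options(3, 5, ["pound"])
print(dtmf_options)
```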

FileSource

Object

Name	Type	Description
uri	string	URI of the audio file to be played

PlaySource

Object

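Name	Type	Description
file	FileSource	Defines the file source info to be used for play
kind	PlaySourceType	Defines the type of the play source
playSourceCacheId	string	Defines the identifier to be used for caching related media
ssml	SsmlSource	Defines the SSML (Speech Synthesis Markup Language) source info to be used for play
text	TextSource	Defines the text source info to be used for play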

PlaySourceType

Enumeration

Defines the type of the play source

Value	Description
file
text
ssml

RecognizeInputType

Enumeration

Determines the type of the recognition.

Value	Description
dtmf
speech
speechOrDtmf
choices

RecognizeOptions

Object

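Name	Type	Description
choices	Choice[]	Defines IVR choices for recognition.
dtmfOptions	DtmfOptions	Defines configurations for DTMF.
initialSilenceTimeoutInSeconds	integer (int32) minimum: 0 maximum: 300	Time to wait for the first input after the prompt, if any.
interruptPrompt	boolean	Determines whether to interrupt the prompt and start recognizing.
speechLanguage	string	Speech language to be recognized. Defaults to en-US if not set.
speechOptions	SpeechOptions	Defines continuous speech recognition options.
speechRecognitionModelEndpointId	string	Endpoint where the custom model was deployed.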
targetParticipant	CommunicationIdentifierModel

RecognizeRequest

Object

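Name	Type	Description
interruptCallMediaOperation	boolean	If set, recognize can barge into other existing queued or currently processing requests.
operationCallbackUri	string	Sets a callback URI that overrides the default callback URI set by CreateCall/AnswerCall for this operation. This setup is per action. If it is not set, the default callback URI set by CreateCall/AnswerCall is used.
operationContext	string	The value to identify the context of the operation.
playPrompt	PlaySource	The source of the audio to be played for recognition.
playPrompts	PlaySource[]	The sources of the audio to be played for recognition.
recognizeInputType	RecognizeInputType	Determines the type of the recognition.
recognizeOptions	RecognizeOptions	Defines options for recognition.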

SpeechOptions

Object

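Options for continuous speech recognition

Name	Type	Description
endSilenceTimeoutInMs	integer (int64)	The length of end silence, in milliseconds, between the moment the user stops speaking and the moment Cognitive Services sends its response.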

SsmlSource

Object

Name	Type	Description
customVoiceEndpointId	string	Endpoint where the custom voice was deployed.
ssmlText	string	SSML string for Cognitive Services to play

TextSource

Object

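Name	Type	Description
customVoiceEndpointId	string	Endpoint where the custom voice was deployed.
sourceLocale	string	Source language locale to be played. See the available locales here:
text	string	Text for Cognitive Services to play
voiceKind	VoiceKind	Voice kind type
voiceName	string	Voice name to be played. See the available text-to-speech voices here: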

Tone

Enumeration

Value	Description
zero
one
two
three
four
five
six
seven
eight
nine
a
b
c
d
pound
asterisk
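When logging or comparing collected input, it can help to turn Tone values back into the keypad characters they stand for. The Python sketch below does that; `tones_to_keys` is a hypothetical helper, and the input list is illustrative because this reference does not describe how collected tones are returned.

```python
KEYPAD = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
    "a": "A", "b": "B", "c": "C", "d": "D",
    "pound": "#", "asterisk": "*",
}


def tones_to_keys(tones):
    """Convert a sequence of Tone enum values into a keypad string."""
    unknown = [tone for tone in tones if tone not in KEYPAD]
    if unknown:
        raise ValueError(f"unknown tones: {unknown}")
    return "".join(KEYPAD[tone] for tone in tones)


keys = tones_to_keys(["one", "two", "asterisk", "pound"])
print(keys)
```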

VoiceKind

Enumeration

Voice kind type

Value	Description
male
female
