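Migrate code from v3.1 to v3.2 of the REST API

Article
04/15/2024

The Speech to text REST API is used for batch transcription and custom speech. This article describes the changes from version 3.1 to version 3.2.

Important

Speech to text REST API v3.2 is available in preview. Speech to text REST API v3.1 is generally available. Speech to text REST API v3.0 will be retired on April 1, 2026. For more information, see the Speech to text REST API v3.0 to v3.1 and v3.1 to v3.2 migration guides.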

Base path

You must update the base path in your code from /speechtotext/v3.1 to /speechtotext/v3.2-preview.2. For example, to get base models in the eastus region, use https://eastus.api.cognitive.microsoft.com/speechtotext/v3.2-preview.2/models/base instead of https://eastus.api.cognitive.microsoft.com/speechtotext/v3.1/models/base.
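
If request URLs are built in many places, a small helper can make the base path change mechanical. The following is a minimal sketch rather than an official utility: the function name migrate_url and the constants are hypothetical, and only the two base paths and the eastus example URL come from this guide.

```python
from urllib.parse import urlsplit, urlunsplit

OLD_BASE = "/speechtotext/v3.1"
NEW_BASE = "/speechtotext/v3.2-preview.2"


def migrate_url(url: str) -> str:
    """Rewrite a v3.1 request URL so that it targets v3.2-preview.2."""
    parts = urlsplit(url)
    path = parts.path
    # Only rewrite paths that sit under the old base path; leave others alone.
    if path == OLD_BASE or path.startswith(OLD_BASE + "/"):
        path = NEW_BASE + path[len(OLD_BASE):]
    # Host, query string, and fragment are preserved unchanged.
    return urlunsplit(parts._replace(path=path))


old = "https://eastus.api.cognitive.microsoft.com/speechtotext/v3.1/models/base"
print(migrate_url(old))
```

URLs that don't start with the v3.1 base path are returned unchanged, so the helper can be applied to every outgoing request without special cases.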

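For more information, see Operation IDs later in this guide.

Batch transcription

Important

New pricing is in effect for batch transcription via Speech to text REST API v3.2. For more information, see the pricing guide.

Backwards compatibility limitations

Don't use Speech to text REST API v3.0 or v3.1 to retrieve a transcription that was created via Speech to text REST API v3.2. You might see an error message such as: "This API version cannot be used to access this transcription. Please use API v3.2 or higher."

Language identification mode

LanguageIdentificationMode is added to LanguageIdentificationProperties as a sibling of candidateLocales and speechModelMapping. The modes available for language identification are Continuous and Single. Continuous language identification is the default. For more information, see Language identification.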
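
As an illustration, the language identification properties of a transcription request could be assembled as in the sketch below. It makes one assumption that this guide doesn't confirm: that the JSON key carrying the LanguageIdentificationMode value is named mode. The function name is hypothetical; candidateLocales, speechModelMapping, and the two mode values come from the text above.

```python
from typing import Optional

ALLOWED_MODES = ("Continuous", "Single")


def language_identification(candidate_locales, mode: str = "Continuous",
                            speech_model_mapping: Optional[dict] = None) -> dict:
    """Build a LanguageIdentificationProperties-shaped dictionary."""
    if mode not in ALLOWED_MODES:
        raise ValueError(f"mode must be one of {ALLOWED_MODES}, got {mode!r}")
    props = {
        "candidateLocales": list(candidate_locales),
        # Assumption: the property key is "mode"; check the v3.2 API reference.
        "mode": mode,
    }
    if speech_model_mapping:
        props["speechModelMapping"] = speech_model_mapping
    return props


print(language_identification(["en-US", "de-DE"], mode="Single"))
```

Defaulting the parameter to Continuous mirrors the documented service default, so callers that omit it get the same behavior either way.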

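Whisper models

Azure AI Speech now supports OpenAI's Whisper model via Speech to text REST API v3.2. To learn more, see the Create a batch transcription guide.

Note

Azure OpenAI Service also supports OpenAI's Whisper model for speech to text through a synchronous REST API. To learn more, see the quickstart. See What is the Whisper model? to learn more about when to use Azure AI Speech versus Azure OpenAI Service.

Custom speech

Important

You're charged for custom speech model training if the base model was created on October 1, 2023 or later. You aren't charged for training if the base model was created before October 2023. For more information, see Azure AI Speech pricing.

To programmatically determine whether a model was created before or after October 1, 2023, use the chargeForAdaptation property that's new in version 3.2.

Custom display text formatting

To support model adaptation with custom display text formatting data, the Datasets_Create operation supports the OutputFormatting data kind. For more information, see Upload datasets.

Added a definition for OutputFormatType with the Lexical and Display enum values.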

"OutputFormatType": {
    "title": "OutputFormatType",
    "enum": [
        "Lexical",
        "Display"
    ],
    "type": "string",
    "x-ms-enum": {
        "name": "OutputFormatType",
        "modelAsString": true,
        "values": [
            {
                "value": "Lexical",
                "description": "Model provides the transcription output without formatting."
            },
            {
                "value": "Display",
                "description": "Model supports display formatting transcriptions output or endpoints."
            }
        ]
    }
},

The OutputFormattingData enum value is added to FileKind (the type of input data).

The supportedOutputFormat property is added to BaseModelFeatures. This property is within the BaseModel definition.

"BaseModelFeatures": {
    "title": "BaseModelFeatures",
    "description": "Features supported by the model.",
    "type": "object",
    "allOf": [
        {
            "$ref": "#/definitions/SharedModelFeatures"
        }
    ],
    "properties": {
        "supportsAdaptationsWith": {
            "description": "Supported dataset kinds to adapt the model.",
            "type": "array",
            "items": {
                "$ref": "#/definitions/DatasetKind"
            },
            "readOnly": true
        },
        "supportedOutputFormat": {
            "description": "Supported output formats.",
            "type": "array",
            "items": {
                "$ref": "#/definitions/OutputFormatType"
            },
            "readOnly": true
        }
    }
},
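
Before uploading display formatting data, it can be useful to check whether a base model reports the Display output format. The sketch below assumes the base model JSON nests these values under properties.features, which follows from the BaseModelProperties and BaseModelFeatures definitions in this guide. The function name and the sample model are hypothetical.

```python
def supports_output_format(base_model: dict, output_format: str) -> bool:
    """Return True if the base model lists the given OutputFormatType."""
    features = base_model.get("properties", {}).get("features", {})
    return output_format in features.get("supportedOutputFormat", [])


# Hypothetical response fragment, shaped after the schema above.
sample_model = {
    "properties": {
        "features": {
            "supportsAdaptationsWith": ["Language", "OutputFormatting"],
            "supportedOutputFormat": ["Lexical", "Display"],
        }
    }
}

print(supports_output_format(sample_model, "Display"))
```

A model that omits supportedOutputFormat is treated as not supporting the format, which is the conservative choice when the property is missing.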

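Adaptation charge

The chargeForAdaptation property is added to BaseModelProperties. This property is within the BaseModel definition.

Important

If the value of chargeForAdaptation is true, you're charged for training the model. If the value is false, you aren't charged for training the model. Use the chargeForAdaptation property instead of the created date to programmatically determine whether you're charged for training a model.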

"BaseModelProperties": {
    "title": "BaseModelProperties",
    "type": "object",
    "properties": {
        "deprecationDates": {
            "$ref": "#/definitions/BaseModelDeprecationDates"
        },
        "features": {
            "$ref": "#/definitions/BaseModelFeatures"
        },
        "chargeForAdaptation": {
            "description": "A value indicating whether model adaptation is charged.",
            "type": "boolean",
            "readOnly": true
        }
    }
},
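
A client can read chargeForAdaptation straight from the base model JSON instead of comparing creation dates. This is a minimal sketch with a hypothetical function name and hypothetical sample payloads; it assumes the property sits under the model's properties object, as the BaseModelProperties definition suggests.

```python
from typing import Optional


def is_training_charged(base_model: dict) -> Optional[bool]:
    """Return the chargeForAdaptation flag, or None if it's absent.

    None signals "unknown", for example when the model JSON was fetched
    through an API version that doesn't return the property.
    """
    return base_model.get("properties", {}).get("chargeForAdaptation")


# Hypothetical response fragments for illustration only.
charged_model = {"properties": {"chargeForAdaptation": True}}
free_model = {"properties": {"chargeForAdaptation": False}}

for model in (charged_model, free_model, {}):
    print(is_training_charged(model))
```

Returning None rather than False for a missing property keeps "not charged" distinct from "not reported".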

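Text normalization

The textNormalizationKind property is added to DatasetProperties.

Entity definition for TextNormalizationKind: the kind of text normalization.

Default: Default text normalization (for example, "2 to 3" is replaced by "two to three" in en-US).
None: No text normalization is applied to the input text. This value is an override option that should only be used when the text is normalized before the upload.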
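
The choice between the two kinds can be expressed as a small helper. This sketch assumes the enum values are sent as the strings "Default" and "None"; the function name and the pre_normalized flag are hypothetical, and only the textNormalizationKind property name comes from this guide.

```python
def dataset_properties(pre_normalized: bool) -> dict:
    """Build a DatasetProperties fragment with textNormalizationKind set."""
    # Assumption: the wire values are the strings "Default" and "None".
    # "None" is an override for text that was normalized before upload.
    kind = "None" if pre_normalized else "Default"
    return {"textNormalizationKind": kind}


print(dataset_properties(pre_normalized=False))
print(dataset_properties(pre_normalized=True))
```

Note that "None" here is a string value, not Python's None object.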

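Evaluation properties

Added token count and token error properties to EvaluationProperties:

correctTokenCount1: The number of tokens correctly recognized by model1.
tokenCount1: The number of tokens processed by model1.
tokenDeletionCount1: The number of tokens recognized by model1 that are deletions.
tokenErrorRate1: The token error rate of recognition with model1.
tokenInsertionCount1: The number of tokens recognized by model1 that are insertions.
tokenSubstitutionCount1: The number of tokens recognized by model1 that are substitutions.
correctTokenCount2: The number of tokens correctly recognized by model2.
tokenCount2: The number of tokens processed by model2.
tokenDeletionCount2: The number of tokens recognized by model2 that are deletions.
tokenErrorRate2: The token error rate of recognition with model2.
tokenInsertionCount2: The number of tokens recognized by model2 that are insertions.
tokenSubstitutionCount2: The number of tokens recognized by model2 that are substitutions.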
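
Because the properties come in matching pairs with a 1 or 2 suffix, they can be read generically and compared. The sketch below is illustrative: the function names are hypothetical and the sample numbers are made up, not real evaluation results.

```python
from typing import Optional

TOKEN_FIELDS = (
    "correctTokenCount",
    "tokenCount",
    "tokenDeletionCount",
    "tokenErrorRate",
    "tokenInsertionCount",
    "tokenSubstitutionCount",
)


def token_stats(evaluation_properties: dict, model_index: int) -> dict:
    """Collect the token properties for model1 or model2 without the suffix."""
    suffix = str(model_index)
    return {name: evaluation_properties.get(name + suffix) for name in TOKEN_FIELDS}


def lower_error_model(evaluation_properties: dict) -> Optional[int]:
    """Return 1 or 2 for the model with the lower token error rate.

    Returns None when a rate is missing or both rates are equal.
    """
    rate1 = evaluation_properties.get("tokenErrorRate1")
    rate2 = evaluation_properties.get("tokenErrorRate2")
    if rate1 is None or rate2 is None or rate1 == rate2:
        return None
    return 1 if rate1 < rate2 else 2


# Made-up values for illustration only.
sample = {"tokenCount1": 200, "tokenErrorRate1": 11.0,
          "tokenCount2": 200, "tokenErrorRate2": 7.5}

print(token_stats(sample, 2))
print(lower_error_model(sample))
```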

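Model copy

The following changes apply to the scenario where you copy a model.

Added the new Models_Copy operation. Here's the schema in the new copy operation: "$ref": "#/definitions/ModelCopyAuthorization"
Deprecated the Models_CopyTo operation. Here's the schema in the deprecated copy operation: "$ref": "#/definitions/ModelCopy"
Added the new Models_AuthorizeCopy operation that returns "$ref": "#/definitions/ModelCopyAuthorization". The returned entity can be used in the new Models_Copy operation.

Added a new entity definition for ModelCopyAuthorization: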

"ModelCopyAuthorization": {
    "title": "ModelCopyAuthorization",
    "required": [
        "expirationDateTime",
        "id",
        "sourceResourceId",
        "targetResourceEndpoint",
        "targetResourceId",
        "targetResourceRegion"
    ],
    "type": "object",
    "properties": {
        "targetResourceRegion": {
            "description": "The region (aka location) of the target speech resource (e.g., westus2).",
            "minLength": 1,
            "type": "string"
        },
        "targetResourceId": {
            "description": "The Azure Resource ID of the target speech resource.",
            "minLength": 1,
            "type": "string"
        },
        "targetResourceEndpoint": {
            "description": "The endpoint (base url) of the target resource (with custom domain name when it is used).",
            "minLength": 1,
            "type": "string"
        },
        "sourceResourceId": {
            "description": "The Azure Resource ID of the source speech resource.",
            "minLength": 1,
            "type": "string"
        },
        "expirationDateTime": {
            "format": "date-time",
            "description": "The expiration date of this copy authorization.",
            "type": "string"
        },
        "id": {
            "description": "The ID of this copy authorization.",
            "minLength": 1,
            "type": "string"
        }
    }
},

Added a new entity definition for ModelCopyAuthorizationDefinition:

"ModelCopyAuthorizationDefinition": {
    "title": "ModelCopyAuthorizationDefinition",
    "required": [
        "sourceResourceId"
    ],
    "type": "object",
    "properties": {
        "sourceResourceId": {
            "description": "The Azure Resource ID of the source speech resource.",
            "minLength": 1,
            "type": "string"
        }
    }
},
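
The two definitions above can be checked on the client before a copy is attempted. The following sketch builds a ModelCopyAuthorizationDefinition body and validates a ModelCopyAuthorization against the required fields listed in its schema. All function names and sample values are hypothetical placeholders, and the date parsing assumes a timestamp without fractional seconds.

```python
from datetime import datetime, timezone
from typing import Optional

REQUIRED_FIELDS = (
    "expirationDateTime", "id", "sourceResourceId",
    "targetResourceEndpoint", "targetResourceId", "targetResourceRegion",
)


def authorization_request(source_resource_id: str) -> dict:
    """Build a ModelCopyAuthorizationDefinition body."""
    if not source_resource_id:
        raise ValueError("sourceResourceId is required and can't be empty")
    return {"sourceResourceId": source_resource_id}


def missing_fields(auth: dict) -> list:
    """List required ModelCopyAuthorization fields that are absent or empty."""
    return [f for f in REQUIRED_FIELDS
            if not isinstance(auth.get(f), str) or not auth[f]]


def is_expired(auth: dict, now: Optional[datetime] = None) -> bool:
    """Check expirationDateTime against the current (or supplied) time."""
    text = auth["expirationDateTime"].replace("Z", "+00:00")
    expires = datetime.fromisoformat(text)
    if expires.tzinfo is None:
        # Assumption: timestamps without an offset are in UTC.
        expires = expires.replace(tzinfo=timezone.utc)
    return (now or datetime.now(timezone.utc)) >= expires


# Placeholder values, not real resource identifiers.
sample_auth = {
    "targetResourceRegion": "westus2",
    "targetResourceId": "<target-resource-id>",
    "targetResourceEndpoint": "https://target.example.invalid",
    "sourceResourceId": "<source-resource-id>",
    "expirationDateTime": "2024-05-01T00:00:00Z",
    "id": "<authorization-id>",
}

print(missing_fields(sample_auth))
print(is_expired(sample_auth, now=datetime(2024, 4, 15, tzinfo=timezone.utc)))
```

Checking the expiration locally avoids submitting an authorization to Models_Copy that the service would reject anyway.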

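CustomModelLinks copy properties

Added a new copy property.

copyTo URI: The location of the obsolete model copy action. See the Models_CopyTo operation for more details.
copy URI: The location of the model copy action. See the Models_Copy operation for more details.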

"CustomModelLinks": {
    "title": "CustomModelLinks",
    "type": "object",
    "properties": {
      "copyTo": {
        "format": "uri",
        "description": "The location to the obsolete model copy action. See operation \"Models_CopyTo\" for more details.",
        "type": "string",
        "readOnly": true
      },
      "copy": {
        "format": "uri",
        "description": "The location to the model copy action. See operation \"Models_Copy\" for more details.",
        "type": "string",
        "readOnly": true
      },
      "files": {
        "format": "uri",
        "description": "The location to get all files of this entity. See operation \"Models_ListFiles\" for more details.",
        "type": "string",
        "readOnly": true
      },
      "manifest": {
        "format": "uri",
        "description": "The location to get a manifest for this model to be used in the on-prem container. See operation \"Models_GetCustomModelManifest\" for more details.",
        "type": "string",
        "readOnly": true
      }
    },
    "readOnly": true
},
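
Client code that follows these links can prefer the new copy URI and fall back to the obsolete copyTo URI only when necessary. This is a sketch with a hypothetical function name and placeholder URIs. It returns the operation name alongside the URI because the two operations use different schemas (ModelCopyAuthorization versus ModelCopy).

```python
from typing import Optional, Tuple


def pick_copy_action(links: dict) -> Optional[Tuple[str, str]]:
    """Choose the copy link to follow from a CustomModelLinks object.

    Returns (operation_name, uri), or None when neither link is present.
    """
    if links.get("copy"):
        return ("Models_Copy", links["copy"])
    if links.get("copyTo"):
        # Obsolete action; the deprecated copy operation uses the ModelCopy schema.
        return ("Models_CopyTo", links["copyTo"])
    return None


# Placeholder URIs for illustration only.
sample_links = {
    "copyTo": "https://source.example.invalid/obsolete-copy-action",
    "copy": "https://source.example.invalid/copy-action",
}

print(pick_copy_action(sample_links))
```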
