Migrate code from v3.1 to v3.2 of the REST API

Article
10/16/2024

The Speech to text REST API is used for batch transcription and custom speech. This article describes the changes from version 3.1 to version 3.2.

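Important

Speech to text REST API v3.2 is the latest version and is generally available. Preview versions 3.2-preview.1 and 3.2-preview.2 are removed in September 2024. Speech to text REST API v3.1 will be retired on a date to be announced. Speech to text REST API v3.0 will be retired on April 1, 2026.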

Base path

You must update the base path in your code from /speechtotext/v3.1 to /speechtotext/v3.2. For example, to get base models in the eastus region, use https://eastus.api.cognitive.microsoft.com/speechtotext/v3.2/models/base instead of https://eastus.api.cognitive.microsoft.com/speechtotext/v3.1/models/base.

For more information, see Operation IDs later in this guide.
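
As a minimal sketch of the base path change, the following Python helper rewrites a v3.1 URL to its v3.2 equivalent. The function name to_v32 is hypothetical and is not part of any SDK; it only performs the string substitution described above.

```python
import re


def to_v32(url: str) -> str:
    """Rewrite the /speechtotext/v3.1 base path to /speechtotext/v3.2.

    Hypothetical helper: it only edits the URL string. The lookahead keeps
    unrelated segments such as /speechtotext/v3.10 untouched.
    """
    return re.sub(r"/speechtotext/v3\.1(?=/|$)", "/speechtotext/v3.2", url)


if __name__ == "__main__":
    old = "https://eastus.api.cognitive.microsoft.com/speechtotext/v3.1/models/base"
    print(to_v32(old))
```

URLs that are already on v3.2, or that don't contain the v3.1 base path, pass through unchanged.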

Batch transcription

Important

New pricing is in effect for batch transcription via Speech to text REST API v3.2. For more information, see the pricing guide.

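Backward compatibility limitations

Don't use Speech to text REST API v3.0 or v3.1 to retrieve a transcription that was created via Speech to text REST API v3.2. If you do, you see an error message such as the following: "The API version cannot be used to access this transcription. Please use API version v3.2 or higher."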

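Language identification mode

LanguageIdentificationMode is added to LanguageIdentificationProperties as a sibling of candidateLocales and speechModelMapping. The modes available for language identification are Continuous and Single. Continuous language identification is the default. For more information, see Language identification.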
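
The following Python sketch builds a language identification object with an explicit mode. It assumes that the mode is carried in a JSON key named mode next to candidateLocales; that key name is an assumption, because this article names only the LanguageIdentificationMode type. The function name is hypothetical.

```python
VALID_MODES = ("Continuous", "Single")


def language_identification(candidate_locales, mode="Continuous"):
    """Build a language identification properties object.

    Assumption: the JSON key for the mode is "mode". Continuous is used as
    the default here because the article states it is the service default.
    """
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    if not candidate_locales:
        raise ValueError("at least one candidate locale is required")
    return {"candidateLocales": list(candidate_locales), "mode": mode}


if __name__ == "__main__":
    print(language_identification(["en-US", "de-DE"], mode="Single"))
```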

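Whisper models

Azure AI Speech now supports OpenAI's Whisper model via Speech to text REST API v3.2. To learn more, see the Create a batch transcription guide.

Note

Azure OpenAI Service also supports OpenAI's Whisper model for speech to text through a synchronous REST API. For more information, see the quickstart. See What is the Whisper model? to learn more about when to use Azure AI Speech versus Azure OpenAI Service.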

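Custom speech

Important

You're charged for custom speech model training if the base model was created on October 1, 2023 or later. You're not charged for training if the base model was created before October 1, 2023. For more information, see Azure AI Speech pricing.

To programmatically determine whether a model was created before or after October 1, 2023, use the chargeForAdaptation property that's new in version 3.2.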

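Custom display text formatting

To support model adaptation with custom display text formatting data, the Datasets_Create operation supports the OutputFormatting data kind. For more information, see Upload datasets.

A definition for OutputFormatType is added, with Lexical and Display enum values.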

"OutputFormatType": {
    "title": "OutputFormatType",
    "enum": [
        "Lexical",
        "Display"
    ],
    "type": "string",
    "x-ms-enum": {
        "name": "OutputFormatType",
        "modelAsString": true,
        "values": [
            {
                "value": "Lexical",
                "description": "Model provides the transcription output without formatting."
            },
            {
                "value": "Display",
                "description": "Model supports display formatting transcriptions output or endpoints."
            }
        ]
    }
},

The OutputFormattingData enum value is added to FileKind (the type of input data).

The supportedOutputFormat property is added to BaseModelFeatures. This property is within the BaseModel definition.

"BaseModelFeatures": {
    "title": "BaseModelFeatures",
    "description": "Features supported by the model.",
    "type": "object",
    "allOf": [
        {
            "$ref": "#/definitions/SharedModelFeatures"
        }
    ],
    "properties": {
        "supportsAdaptationsWith": {
            "description": "Supported dataset kinds to adapt the model.",
            "type": "array",
            "items": {
                "$ref": "#/definitions/DatasetKind"
            },
            "readOnly": true
        },
        "supportedOutputFormat": {
            "description": "Supported output formats.",
            "type": "array",
            "items": {
                "$ref": "#/definitions/OutputFormatType"
            },
            "readOnly": true
        }
    }
},
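
As a rough sketch, the following Python code checks whether a base model reports support for Display formatting. It assumes the base model JSON nests the value under properties.features.supportedOutputFormat, which follows from the BaseModelProperties and BaseModelFeatures definitions; the function names and the sample dictionary are hypothetical.

```python
def supported_output_formats(base_model: dict) -> list:
    """Return the output formats listed for a base model, or an empty list.

    Assumption: the value lives at properties.features.supportedOutputFormat.
    """
    features = base_model.get("properties", {}).get("features", {})
    return list(features.get("supportedOutputFormat", []))


def supports_display_formatting(base_model: dict) -> bool:
    """True when the model lists the Display value of OutputFormatType."""
    return "Display" in supported_output_formats(base_model)


if __name__ == "__main__":
    # Hypothetical sample, not a real service response.
    sample = {"properties": {"features": {"supportedOutputFormat": ["Lexical", "Display"]}}}
    print(supports_display_formatting(sample))
```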

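Charge for adaptation

The chargeForAdaptation property is added to BaseModelProperties. This property is within the BaseModel definition.

Important

If the value of chargeForAdaptation is true, you're charged for training the model. If the value is false, you're not charged for training the model. Use the chargeForAdaptation property, rather than the creation date, to programmatically determine whether you're charged for training a model.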

"BaseModelProperties": {
    "title": "BaseModelProperties",
    "type": "object",
    "properties": {
        "deprecationDates": {
            "$ref": "#/definitions/BaseModelDeprecationDates"
        },
        "features": {
            "$ref": "#/definitions/BaseModelFeatures"
        },
        "chargeForAdaptation": {
            "description": "A value indicating whether model adaptation is charged.",
            "type": "boolean",
            "readOnly": true
        }
    }
},
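
A minimal sketch of reading this flag in Python follows. It assumes the base model JSON exposes the value at properties.chargeForAdaptation, as the BaseModelProperties definition suggests. The function name and the sample dictionaries are hypothetical, and returning None for a missing flag is a design choice of this sketch, not documented service behavior.

```python
from typing import Optional


def is_training_charged(base_model: dict) -> Optional[bool]:
    """Return the chargeForAdaptation flag of a base model.

    Returns None when the flag is absent (for example, in a response from an
    older API version), so that callers don't mistake "unknown" for "free".
    """
    value = base_model.get("properties", {}).get("chargeForAdaptation")
    return value if isinstance(value, bool) else None


if __name__ == "__main__":
    # Hypothetical samples, not real service responses.
    print(is_training_charged({"properties": {"chargeForAdaptation": True}}))
    print(is_training_charged({"properties": {}}))
```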

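Text normalization

The textNormalizationKind property is added to DatasetProperties.

Entity definition for TextNormalizationKind: the kind of text normalization.

Default: Default text normalization (for example, "two to three" replaces "2 to 3" in en-US).
None: No text normalization is applied to the input text. This value is an override option that should be used only when the text is normalized before upload.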
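
The following Python sketch picks a textNormalizationKind value for a dataset's properties. The function name and its pre_normalized parameter are hypothetical. Note that the value None here is the JSON string "None", not Python's None object.

```python
def dataset_properties(pre_normalized: bool) -> dict:
    """Build a DatasetProperties fragment with textNormalizationKind set.

    "None" skips normalization and is meant only for text that was already
    normalized before upload; otherwise "Default" is used.
    """
    kind = "None" if pre_normalized else "Default"
    return {"textNormalizationKind": kind}


if __name__ == "__main__":
    print(dataset_properties(pre_normalized=False))
    print(dataset_properties(pre_normalized=True))
```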

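Evaluation properties

Token count and token error properties are added to the EvaluationProperties properties:

correctTokenCount1: The number of tokens correctly recognized by model1.
tokenCount1: The number of tokens processed by model1.
tokenDeletionCount1: The number of tokens recognized by model1 that are deletions.
tokenErrorRate1: The token error rate of recognition with model1.
tokenInsertionCount1: The number of tokens recognized by model1 that are insertions.
tokenSubstitutionCount1: The number of tokens recognized by model1 that are substitutions.
correctTokenCount2: The number of tokens correctly recognized by model2.
tokenCount2: The number of tokens processed by model2.
tokenDeletionCount2: The number of tokens recognized by model2 that are deletions.
tokenErrorRate2: The token error rate of recognition with model2.
tokenInsertionCount2: The number of tokens recognized by model2 that are insertions.
tokenSubstitutionCount2: The number of tokens recognized by model2 that are substitutions.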
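
As an illustration of how these properties might be consumed, the following Python sketch compares the two models by their reported token error rates. It also includes the conventional error rate formula, (substitutions + deletions + insertions) / token count. Whether the service computes tokenErrorRate1 and tokenErrorRate2 exactly this way, and on what scale it reports them, is an assumption that this article doesn't confirm. Both function names are hypothetical.

```python
from typing import Optional


def conventional_error_rate(substitutions: int, deletions: int,
                            insertions: int, token_count: int) -> float:
    """Conventional error rate as a fraction: (S + D + I) / N.

    Assumption: shown for orientation only; the service's own formula and
    scale for tokenErrorRate are not stated in this article.
    """
    if token_count <= 0:
        raise ValueError("token_count must be positive")
    return (substitutions + deletions + insertions) / token_count


def better_model(evaluation_properties: dict) -> Optional[str]:
    """Return "model1" or "model2", whichever has the lower reported rate.

    Returns None when the two reported rates are equal.
    """
    rate1 = evaluation_properties["tokenErrorRate1"]
    rate2 = evaluation_properties["tokenErrorRate2"]
    if rate1 == rate2:
        return None
    return "model1" if rate1 < rate2 else "model2"


if __name__ == "__main__":
    # Hypothetical values, not real evaluation results.
    print(better_model({"tokenErrorRate1": 7.5, "tokenErrorRate2": 5.0}))
```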

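Model copy

The following changes apply to the scenario where you copy a model.

The new Models_Copy operation is added. Here's the schema in the new copy operation: "$ref": "#/definitions/ModelCopyAuthorization"
The Models_CopyTo operation is deprecated. Here's the schema in the deprecated copy operation: "$ref": "#/definitions/ModelCopy"
The new Models_AuthorizeCopy operation is added. It returns "$ref": "#/definitions/ModelCopyAuthorization". You can use this returned entity in the new Models_Copy operation.

A new entity definition for ModelCopyAuthorization is added: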

"ModelCopyAuthorization": {
    "title": "ModelCopyAuthorization",
    "required": [
        "expirationDateTime",
        "id",
        "sourceResourceId",
        "targetResourceEndpoint",
        "targetResourceId",
        "targetResourceRegion"
    ],
    "type": "object",
    "properties": {
        "targetResourceRegion": {
            "description": "The region (aka location) of the target speech resource (e.g., westus2).",
            "minLength": 1,
            "type": "string"
        },
        "targetResourceId": {
            "description": "The Azure Resource ID of the target speech resource.",
            "minLength": 1,
            "type": "string"
        },
        "targetResourceEndpoint": {
            "description": "The endpoint (base url) of the target resource (with custom domain name when it is used).",
            "minLength": 1,
            "type": "string"
        },
        "sourceResourceId": {
            "description": "The Azure Resource ID of the source speech resource.",
            "minLength": 1,
            "type": "string"
        },
        "expirationDateTime": {
            "format": "date-time",
            "description": "The expiration date of this copy authorization.",
            "type": "string"
        },
        "id": {
            "description": "The ID of this copy authorization.",
            "minLength": 1,
            "type": "string"
        }
    }
},

A new entity definition for ModelCopyAuthorizationDefinition is added:

"ModelCopyAuthorizationDefinition": {
    "title": "ModelCopyAuthorizationDefinition",
    "required": [
        "sourceResourceId"
    ],
    "type": "object",
    "properties": {
        "sourceResourceId": {
            "description": "The Azure Resource ID of the source speech resource.",
            "minLength": 1,
            "type": "string"
        }
    }
},
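
Before passing a ModelCopyAuthorization to Models_Copy, it can be useful to check that the entity is complete and hasn't expired. The following Python sketch does that using only the required fields listed in the definition above. The function names are hypothetical, the check runs locally without calling the service, and the sample values are placeholders.

```python
from datetime import datetime, timezone
from typing import Optional

REQUIRED_FIELDS = (
    "expirationDateTime", "id", "sourceResourceId",
    "targetResourceEndpoint", "targetResourceId", "targetResourceRegion",
)


def missing_fields(authorization: dict) -> list:
    """List required ModelCopyAuthorization fields that are absent or empty."""
    return [name for name in REQUIRED_FIELDS if not authorization.get(name)]


def is_usable(authorization: dict, now: Optional[datetime] = None) -> bool:
    """True when all required fields are present and the entity is unexpired."""
    if missing_fields(authorization):
        return False
    now = now or datetime.now(timezone.utc)
    # A trailing "Z" is rewritten so older Python versions can parse it.
    text = authorization["expirationDateTime"].replace("Z", "+00:00")
    expires = datetime.fromisoformat(text)
    if expires.tzinfo is None:
        # Assumption of this sketch: a timestamp without an offset is UTC.
        expires = expires.replace(tzinfo=timezone.utc)
    return now < expires


if __name__ == "__main__":
    sample = {  # Placeholder values, not a real authorization.
        "expirationDateTime": "2030-01-01T00:00:00Z", "id": "example-id",
        "sourceResourceId": "example-source", "targetResourceId": "example-target",
        "targetResourceEndpoint": "https://example.invalid",
        "targetResourceRegion": "westus2",
    }
    print(is_usable(sample))
```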

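CustomModelLinks copy properties

A new copy property is added.

copyTo URI: The location of the obsolete model copy action. For more details, see the Models_CopyTo operation.
copy URI: The location of the model copy action. For more details, see the Models_Copy operation.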

"CustomModelLinks": {
    "title": "CustomModelLinks",
    "type": "object",
    "properties": {
      "copyTo": {
        "format": "uri",
        "description": "The location to the obsolete model copy action. See operation \"Models_CopyTo\" for more details.",
        "type": "string",
        "readOnly": true
      },
      "copy": {
        "format": "uri",
        "description": "The location to the model copy action. See operation \"Models_Copy\" for more details.",
        "type": "string",
        "readOnly": true
      },
      "files": {
        "format": "uri",
        "description": "The location to get all files of this entity. See operation \"Models_ListFiles\" for more details.",
        "type": "string",
        "readOnly": true
      },
      "manifest": {
        "format": "uri",
        "description": "The location to get a manifest for this model to be used in the on-prem container. See operation \"Models_GetCustomModelManifest\" for more details.",
        "type": "string",
        "readOnly": true
      }
    },
    "readOnly": true
},
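
The following Python sketch selects the copy action URI from a CustomModelLinks object, preferring the new copy link and falling back to the obsolete copyTo link. The function name and the sample URIs are hypothetical. The fallback is a design choice of this sketch for responses that lack the new link, not documented service behavior.

```python
from typing import Optional


def copy_action_uri(links: dict) -> Optional[str]:
    """Return the "copy" URI if present, otherwise the obsolete "copyTo" URI.

    Returns None when neither link is available.
    """
    return links.get("copy") or links.get("copyTo")


if __name__ == "__main__":
    # Placeholder URIs, not real service links.
    links = {
        "copyTo": "https://example.invalid/models/1:copyto",
        "copy": "https://example.invalid/models/1:copy",
    }
    print(copy_action_uri(links))
```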
