Implement language identification

Language identification is used to identify languages spoken in audio when compared against a list of supported languages.

Language identification (LID) use cases include:

  • Speech to text recognition, for when you need to identify the language in an audio source and then transcribe it to text.
  • Speech translation, for when you need to identify the language in an audio source and then translate it to another language.

For speech recognition, the initial latency is higher with language identification. You should only include this optional feature as needed.

Set configuration options

Whether you use language identification with speech to text or with speech translation, there are some common concepts and configuration options to understand.

Then you make a recognize once or continuous recognition request to the Speech service.

Important

The language identification APIs were simplified with Speech SDK version 1.25 and later. The SpeechServiceConnection_SingleLanguageIdPriority and SpeechServiceConnection_ContinuousLanguageIdPriority properties were removed. A single property, SpeechServiceConnection_LanguageIdMode, replaces them. You no longer need to prioritize between low latency and high accuracy. For continuous speech recognition or translation, you only need to select whether to run language identification at start or continuously.

This article provides code snippets to describe the concepts. Links to complete samples for each use case are provided.

Candidate languages

You provide candidate languages with the AutoDetectSourceLanguageConfig object. You expect that at least one of the candidates is in the audio. You can include up to four languages for at-start LID, or up to 10 languages for continuous LID. The Speech service returns one of the candidate languages provided even if those languages weren't in the audio. For example, if fr-FR (French) and en-US (English) are provided as candidates, but German is spoken, the service returns either fr-FR or en-US.

You must provide the full locale with a dash (-) separator, but language identification only uses one locale per base language. Don't include multiple locales (for example, "en-US" and "en-GB") for the same language.

var autoDetectSourceLanguageConfig =
    AutoDetectSourceLanguageConfig.FromLanguages(new string[] { "en-US", "de-DE", "zh-CN" });
auto autoDetectSourceLanguageConfig = 
    AutoDetectSourceLanguageConfig::FromLanguages({ "en-US", "de-DE", "zh-CN" });
auto_detect_source_language_config = \
    speechsdk.languageconfig.AutoDetectSourceLanguageConfig(languages=["en-US", "de-DE", "zh-CN"])
AutoDetectSourceLanguageConfig autoDetectSourceLanguageConfig =
    AutoDetectSourceLanguageConfig.fromLanguages(Arrays.asList("en-US", "de-DE", "zh-CN"));
var autoDetectSourceLanguageConfig = SpeechSDK.AutoDetectSourceLanguageConfig.fromLanguages(["en-US", "de-DE", "zh-CN"]);
NSArray *languages = @[@"en-US", @"de-DE", @"zh-CN"];
SPXAutoDetectSourceLanguageConfiguration* autoDetectSourceLanguageConfig = \
    [[SPXAutoDetectSourceLanguageConfiguration alloc]init:languages];

For more information, see supported languages.

At-start and continuous language identification

Speech supports both at-start and continuous language identification (LID).

Note

Continuous language identification is only supported with Speech SDKs in C#, C++, Java (for speech to text only), JavaScript (for speech to text only), and Python.

  • At-start LID identifies the language once within the first few seconds of audio. Use at-start LID if the language in the audio won't change. With at-start LID, a single language is detected and returned in less than 5 seconds.
  • Continuous LID can identify multiple languages during the audio. Use continuous LID if the language in the audio could change. Continuous LID doesn't support changing languages within the same sentence. For example, if you're primarily speaking Spanish and insert some English words, it doesn't detect the language change per word.

You implement at-start LID or continuous LID by calling methods for recognize once or continuous recognition. Continuous LID is only supported with continuous recognition.

Recognize once or continuous

Language identification is completed with recognition objects and operations. Make a request to the Speech service for recognition of audio.

Note

Don't confuse recognition with identification. Recognition can be used with or without language identification.

Either call the recognize once method, or the start and stop continuous recognition methods. You choose from these options:

  • Recognize once with at-start LID. Continuous LID isn't supported for recognize once.
  • Continuous recognition with at-start LID.
  • Continuous recognition with continuous LID.

The SpeechServiceConnection_LanguageIdMode property is only required for continuous LID. Without it, the Speech service defaults to at-start LID. The supported values are AtStart for at-start LID or Continuous for continuous LID.

// Recognize once with At-start LID. Continuous LID isn't supported for recognize once.
var result = await recognizer.RecognizeOnceAsync();

// Start and stop continuous recognition with At-start LID
await recognizer.StartContinuousRecognitionAsync();
await recognizer.StopContinuousRecognitionAsync();

// Start and stop continuous recognition with Continuous LID
speechConfig.SetProperty(PropertyId.SpeechServiceConnection_LanguageIdMode, "Continuous");
await recognizer.StartContinuousRecognitionAsync();
await recognizer.StopContinuousRecognitionAsync();
// Recognize once with At-start LID. Continuous LID isn't supported for recognize once.
auto result = recognizer->RecognizeOnceAsync().get();

// Start and stop continuous recognition with At-start LID
recognizer->StartContinuousRecognitionAsync().get();
recognizer->StopContinuousRecognitionAsync().get();

// Start and stop continuous recognition with Continuous LID
speechConfig->SetProperty(PropertyId::SpeechServiceConnection_LanguageIdMode, "Continuous");
recognizer->StartContinuousRecognitionAsync().get();
recognizer->StopContinuousRecognitionAsync().get();
// Recognize once with At-start LID. Continuous LID isn't supported for recognize once.
SpeechRecognitionResult result = recognizer.recognizeOnceAsync().get();

// Start and stop continuous recognition with At-start LID
recognizer.startContinuousRecognitionAsync().get();
recognizer.stopContinuousRecognitionAsync().get();

// Start and stop continuous recognition with Continuous LID
speechConfig.setProperty(PropertyId.SpeechServiceConnection_LanguageIdMode, "Continuous");
recognizer.startContinuousRecognitionAsync().get();
recognizer.stopContinuousRecognitionAsync().get();
# Recognize once with At-start LID. Continuous LID isn't supported for recognize once.
result = recognizer.recognize_once()

# Start and stop continuous recognition with At-start LID
recognizer.start_continuous_recognition()
recognizer.stop_continuous_recognition()

# Start and stop continuous recognition with Continuous LID
speech_config.set_property(property_id=speechsdk.PropertyId.SpeechServiceConnection_LanguageIdMode, value='Continuous')
recognizer.start_continuous_recognition()
recognizer.stop_continuous_recognition()

Use speech to text

Use speech to text recognition when you need to identify the language in an audio source and then transcribe it to text. For more information, see the speech to text overview.

Note

Speech to text recognition with at-start language identification is supported with Speech SDKs in C#, C++, Python, Java, JavaScript, and Objective-C. Speech to text recognition with continuous language identification is only supported with Speech SDKs in C#, C++, Java, JavaScript, and Python.

Currently, for speech to text recognition with continuous language identification, you must create a SpeechConfig from the wss://{region}.stt.speech.microsoft.com/speech/universal/v2 endpoint string, as shown in the code examples. In a future SDK release, you won't need to set it.
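
For example, a SpeechConfig created from the v2 endpoint might look like this minimal C# sketch; the region and key values are placeholders:

// A minimal sketch, assuming placeholder region and key values.
var region = "YourServiceRegion";
var v2Endpoint = new Uri($"wss://{region}.stt.speech.microsoft.com/speech/universal/v2");
var speechConfig = SpeechConfig.FromEndpoint(v2Endpoint, "YourSubscriptionKey");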

See more examples of speech to text recognition with language identification on GitHub.

using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

var speechConfig = SpeechConfig.FromSubscription("YourSubscriptionKey","YourServiceRegion");

var autoDetectSourceLanguageConfig =
    AutoDetectSourceLanguageConfig.FromLanguages(
        new string[] { "en-US", "de-DE", "zh-CN" });

using var audioConfig = AudioConfig.FromDefaultMicrophoneInput();
using (var recognizer = new SpeechRecognizer(
    speechConfig,
    autoDetectSourceLanguageConfig,
    audioConfig))
{
    var speechRecognitionResult = await recognizer.RecognizeOnceAsync();
    var autoDetectSourceLanguageResult =
        AutoDetectSourceLanguageResult.FromResult(speechRecognitionResult);
    var detectedLanguage = autoDetectSourceLanguageResult.Language;
}

See more examples of speech to text recognition with language identification on GitHub.

using namespace std;
using namespace Microsoft::CognitiveServices::Speech;
using namespace Microsoft::CognitiveServices::Speech::Audio;

auto speechConfig = SpeechConfig::FromSubscription("YourSubscriptionKey","YourServiceRegion");

auto autoDetectSourceLanguageConfig =
    AutoDetectSourceLanguageConfig::FromLanguages({ "en-US", "de-DE", "zh-CN" });

auto recognizer = SpeechRecognizer::FromConfig(
    speechConfig,
    autoDetectSourceLanguageConfig
    );

auto speechRecognitionResult = recognizer->RecognizeOnceAsync().get();
auto autoDetectSourceLanguageResult =
    AutoDetectSourceLanguageResult::FromResult(speechRecognitionResult);
auto detectedLanguage = autoDetectSourceLanguageResult->Language;

See more examples of speech to text recognition with language identification on GitHub.

AutoDetectSourceLanguageConfig autoDetectSourceLanguageConfig =
    AutoDetectSourceLanguageConfig.fromLanguages(Arrays.asList("en-US", "de-DE"));

SpeechRecognizer recognizer = new SpeechRecognizer(
    speechConfig,
    autoDetectSourceLanguageConfig,
    audioConfig);

Future<SpeechRecognitionResult> future = recognizer.recognizeOnceAsync();
SpeechRecognitionResult result = future.get(30, TimeUnit.SECONDS);
AutoDetectSourceLanguageResult autoDetectSourceLanguageResult =
    AutoDetectSourceLanguageResult.fromResult(result);
String detectedLanguage = autoDetectSourceLanguageResult.getLanguage();

recognizer.close();
speechConfig.close();
autoDetectSourceLanguageConfig.close();
audioConfig.close();
result.close();

See more examples of speech to text recognition with language identification on GitHub.

auto_detect_source_language_config = \
        speechsdk.languageconfig.AutoDetectSourceLanguageConfig(languages=["en-US", "de-DE"])
speech_recognizer = speechsdk.SpeechRecognizer(
        speech_config=speech_config, 
        auto_detect_source_language_config=auto_detect_source_language_config, 
        audio_config=audio_config)
result = speech_recognizer.recognize_once()
auto_detect_source_language_result = speechsdk.AutoDetectSourceLanguageResult(result)
detected_language = auto_detect_source_language_result.language
NSArray *languages = @[@"en-US", @"de-DE", @"zh-CN"];
SPXAutoDetectSourceLanguageConfiguration* autoDetectSourceLanguageConfig = \
        [[SPXAutoDetectSourceLanguageConfiguration alloc]init:languages];
SPXSpeechRecognizer* speechRecognizer = \
        [[SPXSpeechRecognizer alloc] initWithSpeechConfiguration:speechConfig
                           autoDetectSourceLanguageConfiguration:autoDetectSourceLanguageConfig
                                              audioConfiguration:audioConfig];
SPXSpeechRecognitionResult *result = [speechRecognizer recognizeOnce];
SPXAutoDetectSourceLanguageResult *languageDetectionResult = [[SPXAutoDetectSourceLanguageResult alloc] init:result];
NSString *detectedLanguage = [languageDetectionResult language];
var autoDetectSourceLanguageConfig = SpeechSDK.AutoDetectSourceLanguageConfig.fromLanguages(["en-US", "de-DE"]);
var speechRecognizer = SpeechSDK.SpeechRecognizer.FromConfig(speechConfig, autoDetectSourceLanguageConfig, audioConfig);
speechRecognizer.recognizeOnceAsync((result: SpeechSDK.SpeechRecognitionResult) => {
        var languageDetectionResult = SpeechSDK.AutoDetectSourceLanguageResult.fromResult(result);
        var detectedLanguage = languageDetectionResult.language;
},
{});

Speech to text custom models

Note

Language detection with custom models can only be used with real-time speech to text and speech translation. Batch transcription only supports language detection for default base models.

This sample shows how to use language detection with a custom endpoint. If the detected language is en-US, the example uses the default model. If the detected language is fr-FR, the example uses the custom model endpoint. For more information, see Deploy a custom speech model.

var sourceLanguageConfigs = new SourceLanguageConfig[]
{
    SourceLanguageConfig.FromLanguage("en-US"),
    SourceLanguageConfig.FromLanguage("fr-FR", "The Endpoint Id for custom model of fr-FR")
};
var autoDetectSourceLanguageConfig =
    AutoDetectSourceLanguageConfig.FromSourceLanguageConfigs(
        sourceLanguageConfigs);
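
For illustration, here's a minimal sketch of how this configuration might be passed to a recognizer; the speechConfig and audioConfig variables are assumed to be created as in the earlier examples:

// A minimal usage sketch. Assumes speechConfig and audioConfig exist as in the earlier examples.
using (var recognizer = new SpeechRecognizer(
    speechConfig,
    autoDetectSourceLanguageConfig,
    audioConfig))
{
    // The service routes the request to the default model for en-US
    // or the custom endpoint for fr-FR, based on the detected language.
    var result = await recognizer.RecognizeOnceAsync();
    var detectedLanguage = AutoDetectSourceLanguageResult.FromResult(result).Language;
}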

This sample shows how to use language detection with a custom endpoint. If the detected language is en-US, the example uses the default model. If the detected language is fr-FR, the example uses the custom model endpoint. For more information, see Deploy a custom speech model.

std::vector<std::shared_ptr<SourceLanguageConfig>> sourceLanguageConfigs;
sourceLanguageConfigs.push_back(
    SourceLanguageConfig::FromLanguage("en-US"));
sourceLanguageConfigs.push_back(
    SourceLanguageConfig::FromLanguage("fr-FR", "The Endpoint Id for custom model of fr-FR"));

auto autoDetectSourceLanguageConfig =
    AutoDetectSourceLanguageConfig::FromSourceLanguageConfigs(
        sourceLanguageConfigs);

This sample shows how to use language detection with a custom endpoint. If the detected language is en-US, the example uses the default model. If the detected language is fr-FR, the example uses the custom model endpoint. For more information, see Deploy a custom speech model.

List<SourceLanguageConfig> sourceLanguageConfigs = new ArrayList<>();
sourceLanguageConfigs.add(
    SourceLanguageConfig.fromLanguage("en-US"));
sourceLanguageConfigs.add(
    SourceLanguageConfig.fromLanguage("fr-FR", "The Endpoint Id for custom model of fr-FR"));

AutoDetectSourceLanguageConfig autoDetectSourceLanguageConfig =
    AutoDetectSourceLanguageConfig.fromSourceLanguageConfigs(
        sourceLanguageConfigs);

This sample shows how to use language detection with a custom endpoint. If the detected language is en-US, the example uses the default model. If the detected language is fr-FR, the example uses the custom model endpoint. For more information, see Deploy a custom speech model.

en_language_config = speechsdk.languageconfig.SourceLanguageConfig("en-US")
fr_language_config = speechsdk.languageconfig.SourceLanguageConfig("fr-FR", "The Endpoint Id for custom model of fr-FR")
auto_detect_source_language_config = speechsdk.languageconfig.AutoDetectSourceLanguageConfig(
    sourceLanguageConfigs=[en_language_config, fr_language_config])

This sample shows how to use language detection with a custom endpoint. If the detected language is en-US, the example uses the default model. If the detected language is fr-FR, the example uses the custom model endpoint. For more information, see Deploy a custom speech model.

SPXSourceLanguageConfiguration* enLanguageConfig = [[SPXSourceLanguageConfiguration alloc]init:@"en-US"];
SPXSourceLanguageConfiguration* frLanguageConfig = \
        [[SPXSourceLanguageConfiguration alloc]initWithLanguage:@"fr-FR"
                                                     endpointId:@"The Endpoint Id for custom model of fr-FR"];
NSArray *languageConfigs = @[enLanguageConfig, frLanguageConfig];
SPXAutoDetectSourceLanguageConfiguration* autoDetectSourceLanguageConfig = \
        [[SPXAutoDetectSourceLanguageConfiguration alloc]initWithSourceLanguageConfigurations:languageConfigs];
var enLanguageConfig = SpeechSDK.SourceLanguageConfig.fromLanguage("en-US");
var frLanguageConfig = SpeechSDK.SourceLanguageConfig.fromLanguage("fr-FR", "The Endpoint Id for custom model of fr-FR");
var autoDetectSourceLanguageConfig = SpeechSDK.AutoDetectSourceLanguageConfig.fromSourceLanguageConfigs([enLanguageConfig, frLanguageConfig]);

Run speech translation

Use speech translation when you need to identify the language in an audio source and then translate it to another language. For more information, see the speech translation overview.

Note

Speech translation with language identification is only supported with Speech SDKs in C#, C++, JavaScript, and Python. Currently, for speech translation with language identification, you must create a SpeechConfig from the wss://{region}.stt.speech.microsoft.com/speech/universal/v2 endpoint string, as shown in the code examples. In a future SDK release, you won't need to set it.

See more examples of speech translation with language identification on GitHub.

using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;
using Microsoft.CognitiveServices.Speech.Translation;

public static async Task RecognizeOnceSpeechTranslationAsync()
{
    var region = "YourServiceRegion";
    // Currently the v2 endpoint is required. In a future SDK release you won't need to set it.
    var endpointString = $"wss://{region}.stt.speech.microsoft.com/speech/universal/v2";
    var endpointUrl = new Uri(endpointString);

    var speechTranslationConfig = SpeechTranslationConfig.FromEndpoint(endpointUrl, "YourSubscriptionKey");

    // Source language is required, but currently ignored. 
    string fromLanguage = "en-US";
    speechTranslationConfig.SpeechRecognitionLanguage = fromLanguage;

    speechTranslationConfig.AddTargetLanguage("de");
    speechTranslationConfig.AddTargetLanguage("fr");

    var autoDetectSourceLanguageConfig = AutoDetectSourceLanguageConfig.FromLanguages(new string[] { "en-US", "de-DE", "zh-CN" });

    using var audioConfig = AudioConfig.FromDefaultMicrophoneInput();

    using (var recognizer = new TranslationRecognizer(
        speechTranslationConfig, 
        autoDetectSourceLanguageConfig,
        audioConfig))
    {

        Console.WriteLine("Say something or read from file...");
        var result = await recognizer.RecognizeOnceAsync().ConfigureAwait(false);

        if (result.Reason == ResultReason.TranslatedSpeech)
        {
            var lidResult = result.Properties.GetProperty(PropertyId.SpeechServiceConnection_AutoDetectSourceLanguageResult);

            Console.WriteLine($"RECOGNIZED in '{lidResult}': Text={result.Text}");
            foreach (var element in result.Translations)
            {
                Console.WriteLine($"    TRANSLATED into '{element.Key}': {element.Value}");
            }
        }
    }
}

See more examples of speech translation with language identification on GitHub.

auto region = "YourServiceRegion";
// Currently the v2 endpoint is required. In a future SDK release you won't need to set it.
auto endpointString = std::format("wss://{}.stt.speech.microsoft.com/speech/universal/v2", region);
auto config = SpeechTranslationConfig::FromEndpoint(endpointString, "YourSubscriptionKey");

auto autoDetectSourceLanguageConfig = AutoDetectSourceLanguageConfig::FromLanguages({ "en-US", "de-DE" });

// Sets source and target languages.
// The source language is detected by the language identification feature.
// However, SpeechRecognitionLanguage still needs to be set with a locale string, but it isn't used as the source language.
// This will be fixed in a future version of the Speech SDK.
auto fromLanguage = "en-US";
config->SetSpeechRecognitionLanguage(fromLanguage);
config->AddTargetLanguage("de");
config->AddTargetLanguage("fr");

// Creates a translation recognizer using microphone as audio input.
auto recognizer = TranslationRecognizer::FromConfig(config, autoDetectSourceLanguageConfig);
cout << "Say something...\n";

// Starts translation, and returns after a single utterance is recognized. The end of a
// single utterance is determined by listening for silence at the end or until a maximum of 15
// seconds of audio is processed. The task returns the recognized text as well as the translation.
// Note: Since RecognizeOnceAsync() returns only a single utterance, it is suitable only for single
// shot recognition like command or query.
// For long-running multi-utterance recognition, use StartContinuousRecognitionAsync() instead.
auto result = recognizer->RecognizeOnceAsync().get();

// Checks result.
if (result->Reason == ResultReason::TranslatedSpeech)
{
    cout << "RECOGNIZED: Text=" << result->Text << std::endl;

    for (const auto& it : result->Translations)
    {
        cout << "TRANSLATED into '" << it.first.c_str() << "': " << it.second.c_str() << std::endl;
    }
}
else if (result->Reason == ResultReason::RecognizedSpeech)
{
    cout << "RECOGNIZED: Text=" << result->Text << " (text could not be translated)" << std::endl;
}
else if (result->Reason == ResultReason::NoMatch)
{
    cout << "NOMATCH: Speech could not be recognized." << std::endl;
}
else if (result->Reason == ResultReason::Canceled)
{
    auto cancellation = CancellationDetails::FromResult(result);
    cout << "CANCELED: Reason=" << (int)cancellation->Reason << std::endl;

    if (cancellation->Reason == CancellationReason::Error)
    {
        cout << "CANCELED: ErrorCode=" << (int)cancellation->ErrorCode << std::endl;
        cout << "CANCELED: ErrorDetails=" << cancellation->ErrorDetails << std::endl;
        cout << "CANCELED: Did you set the speech resource key and region values?" << std::endl;
    }
}

See more examples of speech translation with language identification on GitHub.

import azure.cognitiveservices.speech as speechsdk
import time
import json

speech_key, service_region = "YourSubscriptionKey","YourServiceRegion"
weatherfilename="en-us_zh-cn.wav"

# set up translation parameters: source language and target languages
# Currently the v2 endpoint is required. In a future SDK release you won't need to set it. 
endpoint_string = "wss://{}.stt.speech.microsoft.com/speech/universal/v2".format(service_region)
translation_config = speechsdk.translation.SpeechTranslationConfig(
    subscription=speech_key,
    endpoint=endpoint_string,
    speech_recognition_language='en-US',
    target_languages=('de', 'fr'))
audio_config = speechsdk.audio.AudioConfig(filename=weatherfilename)

# Specify the AutoDetectSourceLanguageConfig, which defines the number of possible languages
auto_detect_source_language_config = speechsdk.languageconfig.AutoDetectSourceLanguageConfig(languages=["en-US", "de-DE", "zh-CN"])

# Creates a translation recognizer using an audio file as input.
recognizer = speechsdk.translation.TranslationRecognizer(
    translation_config=translation_config, 
    audio_config=audio_config,
    auto_detect_source_language_config=auto_detect_source_language_config)

# Starts translation, and returns after a single utterance is recognized. The end of a
# single utterance is determined by listening for silence at the end or until a maximum of 15
# seconds of audio is processed. The task returns the recognition text as result.
# Note: Since recognize_once() returns only a single utterance, it is suitable only for single
# shot recognition like command or query.
# For long-running multi-utterance recognition, use start_continuous_recognition() instead.
result = recognizer.recognize_once()

# Check the result
if result.reason == speechsdk.ResultReason.TranslatedSpeech:
    print("""Recognized: {}
    German translation: {}
    French translation: {}""".format(
        result.text, result.translations['de'], result.translations['fr']))
elif result.reason == speechsdk.ResultReason.RecognizedSpeech:
    print("Recognized: {}".format(result.text))
    detectedSrcLang = result.properties[speechsdk.PropertyId.SpeechServiceConnection_AutoDetectSourceLanguageResult]
    print("Detected Language: {}".format(detectedSrcLang))
elif result.reason == speechsdk.ResultReason.NoMatch:
    print("No speech could be recognized: {}".format(result.no_match_details))
elif result.reason == speechsdk.ResultReason.Canceled:
    print("Translation canceled: {}".format(result.cancellation_details.reason))
    if result.cancellation_details.reason == speechsdk.CancellationReason.Error:
        print("Error details: {}".format(result.cancellation_details.error_details))

Run and use a container

Speech containers provide websocket-based query endpoint APIs that are accessed through the Speech SDK and Speech CLI. By default, the Speech SDK and Speech CLI use the public Speech service. To use the container, you need to change the initialization method. Use a container host URL instead of key and region.

When you run language ID in a container, use the SourceLanguageRecognizer object instead of SpeechRecognizer or TranslationRecognizer.
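
As a minimal C# sketch of this initialization, assuming a container listening on ws://localhost:5000 (a placeholder host):

// A minimal sketch, assuming a Speech container is listening on ws://localhost:5000.
var speechConfig = SpeechConfig.FromHost(new Uri("ws://localhost:5000"));
var autoDetectSourceLanguageConfig =
    AutoDetectSourceLanguageConfig.FromLanguages(new string[] { "en-US", "de-DE" });
using var audioConfig = AudioConfig.FromDefaultMicrophoneInput();
using var recognizer = new SourceLanguageRecognizer(
    speechConfig, autoDetectSourceLanguageConfig, audioConfig);
var result = await recognizer.RecognizeOnceAsync();
var detectedLanguage = AutoDetectSourceLanguageResult.FromResult(result).Language;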

For more information about containers, see the language identification speech containers how-to guide.

Implement speech to text batch transcription

To identify languages with the batch transcription REST API, use the languageIdentification property in the body of your Transcriptions_Create request.

Warning

Batch transcription only supports language identification for default base models. If both language identification and a custom model are specified in the transcription request, the service falls back to using the base models for the specified candidate languages. This might result in unexpected recognition results.

If your speech to text scenario requires both language identification and custom models, use real-time speech to text instead of batch transcription.

The following example shows the usage of the languageIdentification property with four candidate languages. For more information about request properties, see Create a batch transcription.

{
    <...>

    "properties": {
        <...>

        "languageIdentification": {
            "candidateLocales": [
                "en-US",
                "ja-JP",
                "zh-CN",
                "hi-IN"
            ]
        },
        <...>
    }
}
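
For illustration, here's a minimal C# sketch of sending this property with a Transcriptions_Create request. It assumes the Speech to text REST API v3.1; the key, region, display name, and content URL are placeholders:

// An illustrative sketch, assuming the Speech to text REST API v3.1.
// The key, region, display name, and content URL are placeholders.
using System;
using System.Net.Http;
using System.Text;

var client = new HttpClient();
client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", "YourSubscriptionKey");

var body = @"{
    ""displayName"": ""My transcription"",
    ""locale"": ""en-US"",
    ""contentUrls"": [ ""https://example.com/audio.wav"" ],
    ""properties"": {
        ""languageIdentification"": {
            ""candidateLocales"": [ ""en-US"", ""ja-JP"", ""zh-CN"", ""hi-IN"" ]
        }
    }
}";

// Creates the transcription job; the response contains its status URL.
var response = await client.PostAsync(
    "https://YourServiceRegion.api.cognitive.microsoft.com/speechtotext/v3.1/transcriptions",
    new StringContent(body, Encoding.UTF8, "application/json"));
Console.WriteLine(response.StatusCode);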