Language identification is used to identify languages spoken in audio when compared against a list of supported languages.
Language identification (LID) use cases include:
- Speech to text recognition, when you need to identify the language in an audio source and then transcribe it to text.
- Speech translation, when you need to identify the language in an audio source and then translate it to another language.
For speech recognition, the initial latency is higher with language identification. You should only include this optional feature if you need it.
Set configuration options
Whether you use language identification with speech to text or with speech translation, there are some common concepts and configuration options.
Then you make a recognize once or continuous recognition request to the Speech service.
This article provides code snippets to describe the concepts. Links to complete samples for each use case are provided.
Candidate languages
You provide candidate languages with the AutoDetectSourceLanguageConfig object, expecting that at least one of the candidates is in the audio. You can include up to four languages for at-start LID, or up to 10 languages for continuous LID. The Speech service returns one of the candidate languages provided even if those languages weren't in the audio. For example, if fr-FR (French) and en-US (English) are provided as candidates, but German is spoken, the service returns either fr-FR or en-US.
You must provide the full locale with a dash (-) separator, but language identification only uses one locale per base language. Don't include multiple locales for the same language, for example en-US and en-GB.
var autoDetectSourceLanguageConfig =
AutoDetectSourceLanguageConfig.FromLanguages(new string[] { "en-US", "de-DE", "zh-CN" });
auto autoDetectSourceLanguageConfig =
AutoDetectSourceLanguageConfig::FromLanguages({ "en-US", "de-DE", "zh-CN" });
auto_detect_source_language_config = \
speechsdk.languageconfig.AutoDetectSourceLanguageConfig(languages=["en-US", "de-DE", "zh-CN"])
AutoDetectSourceLanguageConfig autoDetectSourceLanguageConfig =
AutoDetectSourceLanguageConfig.fromLanguages(Arrays.asList("en-US", "de-DE", "zh-CN"));
var autoDetectSourceLanguageConfig = SpeechSDK.AutoDetectSourceLanguageConfig.fromLanguages(["en-US", "de-DE", "zh-CN"]);
NSArray *languages = @[@"en-US", @"de-DE", @"zh-CN"];
SPXAutoDetectSourceLanguageConfiguration* autoDetectSourceLanguageConfig = \
[[SPXAutoDetectSourceLanguageConfiguration alloc]init:languages];
For more information, see supported languages.
At-start and continuous language identification
Speech supports both at-start and continuous language identification (LID).
Note
Continuous language identification is only supported with the Speech SDK in C#, C++, Java (for speech to text only), JavaScript (for speech to text only), and Python.
- At-start LID identifies the language once within the first few seconds of audio. Use at-start LID if the language in the audio doesn't change. With at-start LID, a single language is detected and returned in less than 5 seconds.
- Continuous LID can identify multiple languages during the audio. Use continuous LID if the language in the audio could change. Continuous LID doesn't support changing languages within the same sentence. For example, if you're primarily speaking Spanish and insert some English words, it doesn't detect the language change per word.
You implement at-start LID or continuous LID by calling methods to recognize once or continuously. Continuous LID is only supported with continuous recognition.
Recognize once or continuous
Language identification is completed with recognition objects and operations. You make a request to the Speech service for recognition of audio.
Note
Don't confuse recognition with identification. Recognition can be used with or without language identification.
Either call the recognize once method, or call the start and stop continuous recognition methods. You choose from:
- Recognize once with at-start LID. Continuous LID isn't supported for recognize once.
- Use continuous recognition with at-start LID.
- Use continuous recognition with continuous LID.
The SpeechServiceConnection_LanguageIdMode property is only required for continuous LID. Without it, the Speech service defaults to at-start LID. The supported values are AtStart for at-start LID or Continuous for continuous LID.
// Recognize once with At-start LID. Continuous LID isn't supported for recognize once.
var result = await recognizer.RecognizeOnceAsync();
// Start and stop continuous recognition with At-start LID
await recognizer.StartContinuousRecognitionAsync();
await recognizer.StopContinuousRecognitionAsync();
// Start and stop continuous recognition with Continuous LID
speechConfig.SetProperty(PropertyId.SpeechServiceConnection_LanguageIdMode, "Continuous");
await recognizer.StartContinuousRecognitionAsync();
await recognizer.StopContinuousRecognitionAsync();
// Recognize once with At-start LID. Continuous LID isn't supported for recognize once.
auto result = recognizer->RecognizeOnceAsync().get();
// Start and stop continuous recognition with At-start LID
recognizer->StartContinuousRecognitionAsync().get();
recognizer->StopContinuousRecognitionAsync().get();
// Start and stop continuous recognition with Continuous LID
speechConfig->SetProperty(PropertyId::SpeechServiceConnection_LanguageIdMode, "Continuous");
recognizer->StartContinuousRecognitionAsync().get();
recognizer->StopContinuousRecognitionAsync().get();
// Recognize once with At-start LID. Continuous LID isn't supported for recognize once.
SpeechRecognitionResult result = recognizer.recognizeOnceAsync().get();
// Start and stop continuous recognition with At-start LID
recognizer.startContinuousRecognitionAsync().get();
recognizer.stopContinuousRecognitionAsync().get();
// Start and stop continuous recognition with Continuous LID
speechConfig.setProperty(PropertyId.SpeechServiceConnection_LanguageIdMode, "Continuous");
recognizer.startContinuousRecognitionAsync().get();
recognizer.stopContinuousRecognitionAsync().get();
# Recognize once with At-start LID. Continuous LID isn't supported for recognize once.
result = recognizer.recognize_once()
# Start and stop continuous recognition with At-start LID
recognizer.start_continuous_recognition()
recognizer.stop_continuous_recognition()
# Start and stop continuous recognition with Continuous LID
speech_config.set_property(property_id=speechsdk.PropertyId.SpeechServiceConnection_LanguageIdMode, value='Continuous')
recognizer.start_continuous_recognition()
recognizer.stop_continuous_recognition()
Use speech to text
You use speech to text recognition when you need to identify the language in an audio source and then transcribe it to text. For more information, see the speech to text overview.
Note
Speech to text recognition with at-start language identification is supported with the Speech SDK in C#, C++, Python, Java, JavaScript, and Objective-C. Speech to text recognition with continuous language identification is only supported with the Speech SDK in C#, C++, Java, JavaScript, and Python.
Currently, for speech to text recognition with continuous language identification, you must create a SpeechConfig from an endpoint, as shown in the code example.
See more examples of speech to text recognition with language identification on GitHub.
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;
var speechConfig = SpeechConfig.FromEndpoint(new Uri("YourSpeechEndpoint"), "YourSpeechKey");
var autoDetectSourceLanguageConfig =
AutoDetectSourceLanguageConfig.FromLanguages(
new string[] { "en-US", "de-DE", "zh-CN" });
using var audioConfig = AudioConfig.FromDefaultMicrophoneInput();
using (var recognizer = new SpeechRecognizer(
speechConfig,
autoDetectSourceLanguageConfig,
audioConfig))
{
var speechRecognitionResult = await recognizer.RecognizeOnceAsync();
var autoDetectSourceLanguageResult =
AutoDetectSourceLanguageResult.FromResult(speechRecognitionResult);
var detectedLanguage = autoDetectSourceLanguageResult.Language;
}
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;
var config = SpeechConfig.FromEndpoint(new Uri("YourSpeechEndpoint"), "YourSpeechKey");
// Set the LanguageIdMode (Optional; Either Continuous or AtStart are accepted; Default AtStart)
config.SetProperty(PropertyId.SpeechServiceConnection_LanguageIdMode, "Continuous");
var autoDetectSourceLanguageConfig = AutoDetectSourceLanguageConfig.FromLanguages(new string[] { "en-US", "de-DE", "zh-CN" });
var stopRecognition = new TaskCompletionSource<int>();
using (var audioInput = AudioConfig.FromWavFileInput(@"en-us_zh-cn.wav"))
{
using (var recognizer = new SpeechRecognizer(config, autoDetectSourceLanguageConfig, audioInput))
{
// Subscribes to events.
recognizer.Recognizing += (s, e) =>
{
if (e.Result.Reason == ResultReason.RecognizingSpeech)
{
Console.WriteLine($"RECOGNIZING: Text={e.Result.Text}");
var autoDetectSourceLanguageResult = AutoDetectSourceLanguageResult.FromResult(e.Result);
Console.WriteLine($"DETECTED: Language={autoDetectSourceLanguageResult.Language}");
}
};
recognizer.Recognized += (s, e) =>
{
if (e.Result.Reason == ResultReason.RecognizedSpeech)
{
Console.WriteLine($"RECOGNIZED: Text={e.Result.Text}");
var autoDetectSourceLanguageResult = AutoDetectSourceLanguageResult.FromResult(e.Result);
Console.WriteLine($"DETECTED: Language={autoDetectSourceLanguageResult.Language}");
}
else if (e.Result.Reason == ResultReason.NoMatch)
{
Console.WriteLine($"NOMATCH: Speech could not be recognized.");
}
};
recognizer.Canceled += (s, e) =>
{
Console.WriteLine($"CANCELED: Reason={e.Reason}");
if (e.Reason == CancellationReason.Error)
{
Console.WriteLine($"CANCELED: ErrorCode={e.ErrorCode}");
Console.WriteLine($"CANCELED: ErrorDetails={e.ErrorDetails}");
Console.WriteLine($"CANCELED: Did you set the speech resource key and endpoint values?");
}
stopRecognition.TrySetResult(0);
};
recognizer.SessionStarted += (s, e) =>
{
Console.WriteLine("\n Session started event.");
};
recognizer.SessionStopped += (s, e) =>
{
Console.WriteLine("\n Session stopped event.");
Console.WriteLine("\nStop recognition.");
stopRecognition.TrySetResult(0);
};
// Starts continuous recognition. Uses StopContinuousRecognitionAsync() to stop recognition.
await recognizer.StartContinuousRecognitionAsync().ConfigureAwait(false);
// Waits for completion.
// Use Task.WaitAny to keep the task rooted.
Task.WaitAny(new[] { stopRecognition.Task });
// Stops recognition.
await recognizer.StopContinuousRecognitionAsync().ConfigureAwait(false);
}
}
See more examples of speech to text recognition with language identification on GitHub.
using namespace std;
using namespace Microsoft::CognitiveServices::Speech;
using namespace Microsoft::CognitiveServices::Speech::Audio;
auto speechConfig = SpeechConfig::FromEndpoint("YourServiceEndpoint", "YourSpeechResoureKey");
auto autoDetectSourceLanguageConfig =
AutoDetectSourceLanguageConfig::FromLanguages({ "en-US", "de-DE", "zh-CN" });
auto recognizer = SpeechRecognizer::FromConfig(
speechConfig,
autoDetectSourceLanguageConfig
);
auto speechRecognitionResult = recognizer->RecognizeOnceAsync().get();
auto autoDetectSourceLanguageResult =
AutoDetectSourceLanguageResult::FromResult(speechRecognitionResult);
auto detectedLanguage = autoDetectSourceLanguageResult->Language;
// Creates an instance of a speech config with specified subscription key and service region.
// Note: For multi-lingual speech recognition with language id, it only works with speech v2 endpoint,
// you must use FromEndpoint api in order to use the speech v2 endpoint.
// Replace YourServiceRegion with your region, for example "westus", and
// replace YourSubscriptionKey with your own speech key.
string speechv2Endpoint = "wss://YourServiceRegion.stt.speech.microsoft.com/speech/universal/v2";
auto speechConfig = SpeechConfig::FromEndpoint(speechv2Endpoint, "YourSubscriptionKey");
// Set the mode of input language detection to either "AtStart" (the default) or "Continuous".
// Please refer to the documentation of Language ID for more information.
// https://aka.ms/speech/lid?pivots=programming-language-cpp
speechConfig->SetProperty(PropertyId::SpeechServiceConnection_LanguageIdMode, "Continuous");
// Define the set of languages to detect
auto autoDetectSourceLanguageConfig = AutoDetectSourceLanguageConfig::FromLanguages({ "en-US", "zh-CN" });
// Creates a speech recognizer using file as audio input.
// Replace with your own audio file name.
auto audioInput = AudioConfig::FromWavFileInput("en-us_zh-cn.wav");
auto recognizer = SpeechRecognizer::FromConfig(speechConfig, autoDetectSourceLanguageConfig, audioInput);
// promise for synchronization of recognition end.
promise<void> recognitionEnd;
// Subscribes to events.
recognizer->Recognizing.Connect([](const SpeechRecognitionEventArgs& e)
{
auto lidResult = AutoDetectSourceLanguageResult::FromResult(e.Result);
cout << "Recognizing in " << lidResult->Language << ": Text =" << e.Result->Text << std::endl;
});
recognizer->Recognized.Connect([](const SpeechRecognitionEventArgs& e)
{
if (e.Result->Reason == ResultReason::RecognizedSpeech)
{
auto lidResult = AutoDetectSourceLanguageResult::FromResult(e.Result);
cout << "RECOGNIZED in " << lidResult->Language << ": Text=" << e.Result->Text << "\n"
<< " Offset=" << e.Result->Offset() << "\n"
<< " Duration=" << e.Result->Duration() << std::endl;
}
else if (e.Result->Reason == ResultReason::NoMatch)
{
cout << "NOMATCH: Speech could not be recognized." << std::endl;
}
});
recognizer->Canceled.Connect([&recognitionEnd](const SpeechRecognitionCanceledEventArgs& e)
{
cout << "CANCELED: Reason=" << (int)e.Reason << std::endl;
if (e.Reason == CancellationReason::Error)
{
cout << "CANCELED: ErrorCode=" << (int)e.ErrorCode << "\n"
<< "CANCELED: ErrorDetails=" << e.ErrorDetails << "\n"
<< "CANCELED: Did you update the subscription info?" << std::endl;
recognitionEnd.set_value(); // Notify to stop recognition.
}
});
recognizer->SessionStopped.Connect([&recognitionEnd](const SessionEventArgs& e)
{
cout << "Session stopped.";
recognitionEnd.set_value(); // Notify to stop recognition.
});
// Starts continuous recognition. Uses StopContinuousRecognitionAsync() to stop recognition.
recognizer->StartContinuousRecognitionAsync().get();
// Waits for recognition end.
recognitionEnd.get_future().get();
// Stops recognition.
recognizer->StopContinuousRecognitionAsync().get();
See more examples of speech to text recognition with language identification on GitHub.
AutoDetectSourceLanguageConfig autoDetectSourceLanguageConfig =
AutoDetectSourceLanguageConfig.fromLanguages(Arrays.asList("en-US", "de-DE"));
SpeechRecognizer recognizer = new SpeechRecognizer(
speechConfig,
autoDetectSourceLanguageConfig,
audioConfig);
Future<SpeechRecognitionResult> future = recognizer.recognizeOnceAsync();
SpeechRecognitionResult result = future.get(30, TimeUnit.SECONDS);
AutoDetectSourceLanguageResult autoDetectSourceLanguageResult =
AutoDetectSourceLanguageResult.fromResult(result);
String detectedLanguage = autoDetectSourceLanguageResult.getLanguage();
recognizer.close();
speechConfig.close();
autoDetectSourceLanguageConfig.close();
audioConfig.close();
result.close();
// Shows how to do continuous speech recognition on a multilingual audio file with continuous language detection. Here, we assume the
// spoken language in the file can alternate between English (US), Spanish (Mexico) and German.
// If specified, speech recognition will use the custom model associated with the detected language.
public static void continuousRecognitionFromFileWithContinuousLanguageDetectionWithCustomModels() throws InterruptedException, ExecutionException, IOException, URISyntaxException
{
// Creates an instance of a speech config with specified
// subscription key and endpoint URL. Replace with your own subscription key
// and endpoint URL.
SpeechConfig speechConfig = SpeechConfig.fromEndpoint(new URI("YourEndpointUrl"), "YourSubscriptionKey");
// Change the default from at-start language detection to continuous language detection, since the spoken language in the audio
// may change.
speechConfig.setProperty(PropertyId.SpeechServiceConnection_LanguageIdMode, "Continuous");
// Define a set of expected spoken languages in the audio, with an optional custom model endpoint ID associated with each.
// Update the below with your own languages. Please see https://docs.microsoft.com/azure/cognitive-services/speech-service/language-support
// for all supported languages.
// Update the below with your own custom model endpoint IDs, or omit it if you want to use the standard model.
List<SourceLanguageConfig> sourceLanguageConfigs = new ArrayList<SourceLanguageConfig>();
sourceLanguageConfigs.add(SourceLanguageConfig.fromLanguage("en-US", "YourEnUsCustomModelID"));
sourceLanguageConfigs.add(SourceLanguageConfig.fromLanguage("es-MX", "YourEsMxCustomModelID"));
sourceLanguageConfigs.add(SourceLanguageConfig.fromLanguage("de-DE"));
// Creates an instance of AutoDetectSourceLanguageConfig with the above 3 source language configurations.
AutoDetectSourceLanguageConfig autoDetectSourceLanguageConfig = AutoDetectSourceLanguageConfig.fromSourceLanguageConfigs(sourceLanguageConfigs);
// We provide a WAV file with English and Spanish utterances as an example. Replace with your own multilingual audio file name.
AudioConfig audioConfig = AudioConfig.fromWavFileInput( "es-mx_en-us.wav");
// Creates a speech recognizer using file as audio input and the AutoDetectSourceLanguageConfig
SpeechRecognizer speechRecognizer = new SpeechRecognizer(speechConfig, autoDetectSourceLanguageConfig, audioConfig);
// Semaphore used to signal the call to stop continuous recognition (following either a session ended or a cancelled event)
final Semaphore doneSemaphone = new Semaphore(0);
// Subscribes to events.
/* Uncomment this to see intermediate recognition results. Since this is verbose and the WAV file is long, it is commented out by default in this sample.
speechRecognizer.recognizing.addEventListener((s, e) -> {
AutoDetectSourceLanguageResult autoDetectSourceLanguageResult = AutoDetectSourceLanguageResult.fromResult(e.getResult());
String language = autoDetectSourceLanguageResult.getLanguage();
System.out.println(" RECOGNIZING: Text = " + e.getResult().getText());
System.out.println(" RECOGNIZING: Language = " + language);
});
*/
speechRecognizer.recognized.addEventListener((s, e) -> {
AutoDetectSourceLanguageResult autoDetectSourceLanguageResult = AutoDetectSourceLanguageResult.fromResult(e.getResult());
String language = autoDetectSourceLanguageResult.getLanguage();
if (e.getResult().getReason() == ResultReason.RecognizedSpeech) {
System.out.println(" RECOGNIZED: Text = " + e.getResult().getText());
System.out.println(" RECOGNIZED: Language = " + language);
}
else if (e.getResult().getReason() == ResultReason.NoMatch) {
if (language == null || language.isEmpty() || language.toLowerCase().equals("unknown")) {
System.out.println(" NOMATCH: Speech Language could not be detected.");
}
else {
System.out.println(" NOMATCH: Speech could not be recognized.");
}
}
});
speechRecognizer.canceled.addEventListener((s, e) -> {
System.out.println(" CANCELED: Reason = " + e.getReason());
if (e.getReason() == CancellationReason.Error) {
System.out.println(" CANCELED: ErrorCode = " + e.getErrorCode());
System.out.println(" CANCELED: ErrorDetails = " + e.getErrorDetails());
System.out.println(" CANCELED: Did you update the subscription info?");
}
doneSemaphone.release();
});
speechRecognizer.sessionStarted.addEventListener((s, e) -> {
System.out.println("\n Session started event.");
});
speechRecognizer.sessionStopped.addEventListener((s, e) -> {
System.out.println("\n Session stopped event.");
doneSemaphone.release();
});
// Starts continuous recognition and wait for processing to end
System.out.println(" Recognizing from WAV file... please wait");
speechRecognizer.startContinuousRecognitionAsync().get();
doneSemaphone.tryAcquire(30, TimeUnit.SECONDS);
// Stop continuous recognition
speechRecognizer.stopContinuousRecognitionAsync().get();
// These objects must be closed in order to dispose underlying native resources
speechRecognizer.close();
speechConfig.close();
audioConfig.close();
for (SourceLanguageConfig sourceLanguageConfig : sourceLanguageConfigs)
{
sourceLanguageConfig.close();
}
autoDetectSourceLanguageConfig.close();
}
See more examples of speech to text recognition with language identification on GitHub.
auto_detect_source_language_config = \
speechsdk.languageconfig.AutoDetectSourceLanguageConfig(languages=["en-US", "de-DE"])
speech_recognizer = speechsdk.SpeechRecognizer(
speech_config=speech_config,
auto_detect_source_language_config=auto_detect_source_language_config,
audio_config=audio_config)
result = speech_recognizer.recognize_once()
auto_detect_source_language_result = speechsdk.AutoDetectSourceLanguageResult(result)
detected_language = auto_detect_source_language_result.language
import azure.cognitiveservices.speech as speechsdk
import time
import json
speech_key, endpoint_string = "YourSpeechResoureKey","YourServiceEndpoint"
weatherfilename="en-us_zh-cn.wav"
speech_config = speechsdk.SpeechConfig(subscription=speech_key, endpoint=endpoint_string)
audio_config = speechsdk.audio.AudioConfig(filename=weatherfilename)
# Set the LanguageIdMode (Optional; Either Continuous or AtStart are accepted; Default AtStart)
speech_config.set_property(property_id=speechsdk.PropertyId.SpeechServiceConnection_LanguageIdMode, value='Continuous')
auto_detect_source_language_config = speechsdk.languageconfig.AutoDetectSourceLanguageConfig(
languages=["en-US", "de-DE", "zh-CN"])
speech_recognizer = speechsdk.SpeechRecognizer(
speech_config=speech_config,
auto_detect_source_language_config=auto_detect_source_language_config,
audio_config=audio_config)
done = False
def stop_cb(evt):
"""callback that signals to stop continuous recognition upon receiving an event `evt`"""
print('CLOSING on {}'.format(evt))
nonlocal done
done = True
# Connect callbacks to the events fired by the speech recognizer
speech_recognizer.recognizing.connect(lambda evt: print('RECOGNIZING: {}'.format(evt)))
speech_recognizer.recognized.connect(lambda evt: print('RECOGNIZED: {}'.format(evt)))
speech_recognizer.session_started.connect(lambda evt: print('SESSION STARTED: {}'.format(evt)))
speech_recognizer.session_stopped.connect(lambda evt: print('SESSION STOPPED {}'.format(evt)))
speech_recognizer.canceled.connect(lambda evt: print('CANCELED {}'.format(evt)))
# stop continuous recognition on either session stopped or canceled events
speech_recognizer.session_stopped.connect(stop_cb)
speech_recognizer.canceled.connect(stop_cb)
# Start continuous speech recognition
speech_recognizer.start_continuous_recognition()
while not done:
time.sleep(.5)
speech_recognizer.stop_continuous_recognition()
NSArray *languages = @[@"en-US", @"de-DE", @"zh-CN"];
SPXAutoDetectSourceLanguageConfiguration* autoDetectSourceLanguageConfig = \
[[SPXAutoDetectSourceLanguageConfiguration alloc]init:languages];
SPXSpeechRecognizer* speechRecognizer = \
[[SPXSpeechRecognizer alloc] initWithSpeechConfiguration:speechConfig
autoDetectSourceLanguageConfiguration:autoDetectSourceLanguageConfig
audioConfiguration:audioConfig];
SPXSpeechRecognitionResult *result = [speechRecognizer recognizeOnce];
SPXAutoDetectSourceLanguageResult *languageDetectionResult = [[SPXAutoDetectSourceLanguageResult alloc] init:result];
NSString *detectedLanguage = [languageDetectionResult language];
var autoDetectSourceLanguageConfig = SpeechSDK.AutoDetectSourceLanguageConfig.fromLanguages(["en-US", "de-DE"]);
var speechRecognizer = SpeechSDK.SpeechRecognizer.FromConfig(speechConfig, autoDetectSourceLanguageConfig, audioConfig);
speechRecognizer.recognizeOnceAsync((result: SpeechSDK.SpeechRecognitionResult) => {
var languageDetectionResult = SpeechSDK.AutoDetectSourceLanguageResult.fromResult(result);
var detectedLanguage = languageDetectionResult.language;
},
{});
Speech to text custom models
Note
Language detection with custom models can only be used with real-time speech to text and speech translation. Batch transcription only supports language detection for default base models.
This sample shows how to use language detection with a custom endpoint. If the detected language is en-US, the example uses the default model. If the detected language is fr-FR, the example uses the custom model endpoint. For more information, see Deploy a custom speech model.
var sourceLanguageConfigs = new SourceLanguageConfig[]
{
SourceLanguageConfig.FromLanguage("en-US"),
SourceLanguageConfig.FromLanguage("fr-FR", "The Endpoint Id for custom model of fr-FR")
};
var autoDetectSourceLanguageConfig =
AutoDetectSourceLanguageConfig.FromSourceLanguageConfigs(
sourceLanguageConfigs);
This sample shows how to use language detection with a custom endpoint. If the detected language is en-US, the example uses the default model. If the detected language is fr-FR, the example uses the custom model endpoint. For more information, see Deploy a custom speech model.
std::vector<std::shared_ptr<SourceLanguageConfig>> sourceLanguageConfigs;
sourceLanguageConfigs.push_back(
SourceLanguageConfig::FromLanguage("en-US"));
sourceLanguageConfigs.push_back(
SourceLanguageConfig::FromLanguage("fr-FR", "The Endpoint Id for custom model of fr-FR"));
auto autoDetectSourceLanguageConfig =
AutoDetectSourceLanguageConfig::FromSourceLanguageConfigs(
sourceLanguageConfigs);
This sample shows how to use language detection with a custom endpoint. If the detected language is en-US, the example uses the default model. If the detected language is fr-FR, the example uses the custom model endpoint. For more information, see Deploy a custom speech model.
List<SourceLanguageConfig> sourceLanguageConfigs = new ArrayList<SourceLanguageConfig>();
sourceLanguageConfigs.add(
SourceLanguageConfig.fromLanguage("en-US"));
sourceLanguageConfigs.add(
SourceLanguageConfig.fromLanguage("fr-FR", "The Endpoint Id for custom model of fr-FR"));
AutoDetectSourceLanguageConfig autoDetectSourceLanguageConfig =
AutoDetectSourceLanguageConfig.fromSourceLanguageConfigs(
sourceLanguageConfigs);
This sample shows how to use language detection with a custom endpoint. If the detected language is en-US, the example uses the default model. If the detected language is fr-FR, the example uses the custom model endpoint. For more information, see Deploy a custom speech model.
en_language_config = speechsdk.languageconfig.SourceLanguageConfig("en-US")
fr_language_config = speechsdk.languageconfig.SourceLanguageConfig("fr-FR", "The Endpoint Id for custom model of fr-FR")
auto_detect_source_language_config = speechsdk.languageconfig.AutoDetectSourceLanguageConfig(
sourceLanguageConfigs=[en_language_config, fr_language_config])
This sample shows how to use language detection with a custom endpoint. If the detected language is en-US, the example uses the default model. If the detected language is fr-FR, the example uses the custom model endpoint. For more information, see Deploy a custom speech model.
SPXSourceLanguageConfiguration* enLanguageConfig = [[SPXSourceLanguageConfiguration alloc]init:@"en-US"];
SPXSourceLanguageConfiguration* frLanguageConfig = \
[[SPXSourceLanguageConfiguration alloc]initWithLanguage:@"fr-FR"
endpointId:@"The Endpoint Id for custom model of fr-FR"];
NSArray *languageConfigs = @[enLanguageConfig, frLanguageConfig];
SPXAutoDetectSourceLanguageConfiguration* autoDetectSourceLanguageConfig = \
[[SPXAutoDetectSourceLanguageConfiguration alloc]initWithSourceLanguageConfigurations:languageConfigs];
var enLanguageConfig = SpeechSDK.SourceLanguageConfig.fromLanguage("en-US");
var frLanguageConfig = SpeechSDK.SourceLanguageConfig.fromLanguage("fr-FR", "The Endpoint Id for custom model of fr-FR");
var autoDetectSourceLanguageConfig = SpeechSDK.AutoDetectSourceLanguageConfig.fromSourceLanguageConfigs([enLanguageConfig, frLanguageConfig]);
Use speech translation
You use speech translation when you need to identify the language in an audio source and then translate it to another language. For more information, see the speech translation overview.
Note
Speech translation with language identification is only supported with the Speech SDK in C#, C++, JavaScript, and Python.
See more examples of speech translation with language identification on GitHub.
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;
using Microsoft.CognitiveServices.Speech.Translation;
public static async Task RecognizeOnceSpeechTranslationAsync()
{
var endpointUrl = new Uri("YourSpeechResoureEndpoint");
var speechTranslationConfig = SpeechTranslationConfig.FromEndpoint(endpointUrl, "YourSpeechResoureKey");
// Source language is required, but currently ignored.
string fromLanguage = "en-US";
speechTranslationConfig.SpeechRecognitionLanguage = fromLanguage;
speechTranslationConfig.AddTargetLanguage("de");
speechTranslationConfig.AddTargetLanguage("fr");
var autoDetectSourceLanguageConfig = AutoDetectSourceLanguageConfig.FromLanguages(new string[] { "en-US", "de-DE", "zh-CN" });
using var audioConfig = AudioConfig.FromDefaultMicrophoneInput();
using (var recognizer = new TranslationRecognizer(
speechTranslationConfig,
autoDetectSourceLanguageConfig,
audioConfig))
{
Console.WriteLine("Say something or read from file...");
var result = await recognizer.RecognizeOnceAsync().ConfigureAwait(false);
if (result.Reason == ResultReason.TranslatedSpeech)
{
var lidResult = result.Properties.GetProperty(PropertyId.SpeechServiceConnection_AutoDetectSourceLanguageResult);
Console.WriteLine($"RECOGNIZED in '{lidResult}': Text={result.Text}");
foreach (var element in result.Translations)
{
Console.WriteLine($" TRANSLATED into '{element.Key}': {element.Value}");
}
}
}
}
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;
using Microsoft.CognitiveServices.Speech.Translation;
public static async Task MultiLingualTranslation()
{
var endpointUrl = new Uri("YourSpeechResoureEndpoint");
var config = SpeechTranslationConfig.FromEndpoint(endpointUrl, "YourSpeechResoureKey");
// Source language is required, but currently ignored.
string fromLanguage = "en-US";
config.SpeechRecognitionLanguage = fromLanguage;
config.AddTargetLanguage("de");
config.AddTargetLanguage("fr");
// Set the LanguageIdMode (Optional; Either Continuous or AtStart are accepted; Default AtStart)
config.SetProperty(PropertyId.SpeechServiceConnection_LanguageIdMode, "Continuous");
var autoDetectSourceLanguageConfig = AutoDetectSourceLanguageConfig.FromLanguages(new string[] { "en-US", "de-DE", "zh-CN" });
var stopTranslation = new TaskCompletionSource<int>();
using (var audioInput = AudioConfig.FromWavFileInput(@"en-us_zh-cn.wav"))
{
using (var recognizer = new TranslationRecognizer(config, autoDetectSourceLanguageConfig, audioInput))
{
recognizer.Recognizing += (s, e) =>
{
var lidResult = e.Result.Properties.GetProperty(PropertyId.SpeechServiceConnection_AutoDetectSourceLanguageResult);
Console.WriteLine($"RECOGNIZING in '{lidResult}': Text={e.Result.Text}");
foreach (var element in e.Result.Translations)
{
Console.WriteLine($" TRANSLATING into '{element.Key}': {element.Value}");
}
};
recognizer.Recognized += (s, e) => {
if (e.Result.Reason == ResultReason.TranslatedSpeech)
{
var lidResult = e.Result.Properties.GetProperty(PropertyId.SpeechServiceConnection_AutoDetectSourceLanguageResult);
Console.WriteLine($"RECOGNIZED in '{lidResult}': Text={e.Result.Text}");
foreach (var element in e.Result.Translations)
{
Console.WriteLine($" TRANSLATED into '{element.Key}': {element.Value}");
}
}
else if (e.Result.Reason == ResultReason.RecognizedSpeech)
{
Console.WriteLine($"RECOGNIZED: Text={e.Result.Text}");
Console.WriteLine($" Speech not translated.");
}
else if (e.Result.Reason == ResultReason.NoMatch)
{
Console.WriteLine($"NOMATCH: Speech could not be recognized.");
}
};
recognizer.Canceled += (s, e) =>
{
Console.WriteLine($"CANCELED: Reason={e.Reason}");
if (e.Reason == CancellationReason.Error)
{
Console.WriteLine($"CANCELED: ErrorCode={e.ErrorCode}");
Console.WriteLine($"CANCELED: ErrorDetails={e.ErrorDetails}");
Console.WriteLine($"CANCELED: Did you set the speech resource key and endpoint values?");
}
stopTranslation.TrySetResult(0);
};
recognizer.SpeechStartDetected += (s, e) => {
Console.WriteLine("\nSpeech start detected event.");
};
recognizer.SpeechEndDetected += (s, e) => {
Console.WriteLine("\nSpeech end detected event.");
};
recognizer.SessionStarted += (s, e) => {
Console.WriteLine("\nSession started event.");
};
recognizer.SessionStopped += (s, e) => {
Console.WriteLine("\nSession stopped event.");
Console.WriteLine($"\nStop translation.");
stopTranslation.TrySetResult(0);
};
// Starts continuous recognition. Uses StopContinuousRecognitionAsync() to stop recognition.
Console.WriteLine("Start translation...");
await recognizer.StartContinuousRecognitionAsync().ConfigureAwait(false);
Task.WaitAny(new[] { stopTranslation.Task });
await recognizer.StopContinuousRecognitionAsync().ConfigureAwait(false);
}
}
}
See more examples of speech translation with language identification on GitHub.
auto endpointString = "YourSpeechResoureEndpoint";
auto config = SpeechTranslationConfig::FromEndpoint(endpointString, "YourSpeechResoureKey");
auto autoDetectSourceLanguageConfig = AutoDetectSourceLanguageConfig::FromLanguages({ "en-US", "de-DE" });
// Sets source and target languages
// The source language will be detected by the language detection feature.
// However, the SpeechRecognitionLanguage still need to set with a locale string, but it will not be used as the source language.
// This will be fixed in a future version of Speech SDK.
auto fromLanguage = "en-US";
config->SetSpeechRecognitionLanguage(fromLanguage);
config->AddTargetLanguage("de");
config->AddTargetLanguage("fr");
// Creates a translation recognizer using microphone as audio input.
auto recognizer = TranslationRecognizer::FromConfig(config, autoDetectSourceLanguageConfig);
cout << "Say something...\n";
// Starts translation, and returns after a single utterance is recognized. The end of a
// single utterance is determined by listening for silence at the end or until a maximum of 15
// seconds of audio is processed. The task returns the recognized text as well as the translation.
// Note: Since RecognizeOnceAsync() returns only a single utterance, it is suitable only for single
// shot recognition like command or query.
// For long-running multi-utterance recognition, use StartContinuousRecognitionAsync() instead.
auto result = recognizer->RecognizeOnceAsync().get();
// Checks result.
if (result->Reason == ResultReason::TranslatedSpeech)
{
cout << "RECOGNIZED: Text=" << result->Text << std::endl;
for (const auto& it : result->Translations)
{
cout << "TRANSLATED into '" << it.first.c_str() << "': " << it.second.c_str() << std::endl;
}
}
else if (result->Reason == ResultReason::RecognizedSpeech)
{
cout << "RECOGNIZED: Text=" << result->Text << " (text could not be translated)" << std::endl;
}
else if (result->Reason == ResultReason::NoMatch)
{
cout << "NOMATCH: Speech could not be recognized." << std::endl;
}
else if (result->Reason == ResultReason::Canceled)
{
auto cancellation = CancellationDetails::FromResult(result);
cout << "CANCELED: Reason=" << (int)cancellation->Reason << std::endl;
if (cancellation->Reason == CancellationReason::Error)
{
cout << "CANCELED: ErrorCode=" << (int)cancellation->ErrorCode << std::endl;
cout << "CANCELED: ErrorDetails=" << cancellation->ErrorDetails << std::endl;
cout << "CANCELED: Did you set the speech resource key and endpoint values?" << std::endl;
}
}
using namespace std;
using namespace Microsoft::CognitiveServices::Speech;
using namespace Microsoft::CognitiveServices::Speech::Audio;
using namespace Microsoft::CognitiveServices::Speech::Translation;
void MultiLingualTranslation()
{
auto config = SpeechTranslationConfig::FromEndpoint("YourSpeechResoureEndpoint", "YourSpeechResoureKey");
// Set the LanguageIdMode (Optional; Either Continuous or AtStart are accepted; Default AtStart)
config->SetProperty(PropertyId::SpeechServiceConnection_LanguageIdMode, "Continuous");
auto autoDetectSourceLanguageConfig = AutoDetectSourceLanguageConfig::FromLanguages({ "en-US", "de-DE", "zh-CN" });
promise<void> recognitionEnd;
// Source language is required, but currently ignored.
auto fromLanguage = "en-US";
config->SetSpeechRecognitionLanguage(fromLanguage);
config->AddTargetLanguage("de");
config->AddTargetLanguage("fr");
auto audioInput = AudioConfig::FromWavFileInput("whatstheweatherlike.wav");
auto recognizer = TranslationRecognizer::FromConfig(config, autoDetectSourceLanguageConfig, audioInput);
recognizer->Recognizing.Connect([](const TranslationRecognitionEventArgs& e)
{
std::string lidResult = e.Result->Properties.GetProperty(PropertyId::SpeechServiceConnection_AutoDetectSourceLanguageResult);
cout << "Recognizing in Language = "<< lidResult << ":" << e.Result->Text << std::endl;
for (const auto& it : e.Result->Translations)
{
cout << " Translated into '" << it.first.c_str() << "': " << it.second.c_str() << std::endl;
}
});
recognizer->Recognized.Connect([](const TranslationRecognitionEventArgs& e)
{
if (e.Result->Reason == ResultReason::TranslatedSpeech)
{
std::string lidResult = e.Result->Properties.GetProperty(PropertyId::SpeechServiceConnection_AutoDetectSourceLanguageResult);
cout << "RECOGNIZED in Language = " << lidResult << ": Text=" << e.Result->Text << std::endl;
}
else if (e.Result->Reason == ResultReason::RecognizedSpeech)
{
cout << "RECOGNIZED: Text=" << e.Result->Text << " (text could not be translated)" << std::endl;
}
else if (e.Result->Reason == ResultReason::NoMatch)
{
cout << "NOMATCH: Speech could not be recognized." << std::endl;
}
for (const auto& it : e.Result->Translations)
{
cout << " Translated into '" << it.first.c_str() << "': " << it.second.c_str() << std::endl;
}
});
recognizer->Canceled.Connect([&recognitionEnd](const TranslationRecognitionCanceledEventArgs& e)
{
cout << "CANCELED: Reason=" << (int)e.Reason << std::endl;
if (e.Reason == CancellationReason::Error)
{
cout << "CANCELED: ErrorCode=" << (int)e.ErrorCode << std::endl;
cout << "CANCELED: ErrorDetails=" << e.ErrorDetails << std::endl;
cout << "CANCELED: Did you set the speech resource key and endpoint values?" << std::endl;
recognitionEnd.set_value();
}
});
recognizer->Synthesizing.Connect([](const TranslationSynthesisEventArgs& e)
{
auto size = e.Result->Audio.size();
cout << "Translation synthesis result: size of audio data: " << size
<< (size == 0 ? "(END)" : "");
});
recognizer->SessionStopped.Connect([&recognitionEnd](const SessionEventArgs& e)
{
cout << "Session stopped.";
recognitionEnd.set_value();
});
// Starts continuous recognition. Use StopContinuousRecognitionAsync() to stop recognition.
recognizer->StartContinuousRecognitionAsync().get();
recognitionEnd.get_future().get();
recognizer->StopContinuousRecognitionAsync().get();
}
See more examples of speech translation with language identification on GitHub.
import azure.cognitiveservices.speech as speechsdk
import time
import json
speech_key, service_endpoint = "YourSpeechResoureKey","YourServiceEndpoint"
weatherfilename="en-us_zh-cn.wav"
# set up translation parameters: source language and target languages
translation_config = speechsdk.translation.SpeechTranslationConfig(
subscription=speech_key,
endpoint=service_endpoint,
speech_recognition_language='en-US',
target_languages=('de', 'fr'))
audio_config = speechsdk.audio.AudioConfig(filename=weatherfilename)
# Specify the AutoDetectSourceLanguageConfig, which defines the number of possible languages
auto_detect_source_language_config = speechsdk.languageconfig.AutoDetectSourceLanguageConfig(languages=["en-US", "de-DE", "zh-CN"])
# Creates a translation recognizer using and audio file as input.
recognizer = speechsdk.translation.TranslationRecognizer(
translation_config=translation_config,
audio_config=audio_config,
auto_detect_source_language_config=auto_detect_source_language_config)
# Starts translation, and returns after a single utterance is recognized. The end of a
# single utterance is determined by listening for silence at the end or until a maximum of 15
# seconds of audio is processed. The task returns the recognition text as result.
# Note: Since recognize_once() returns only a single utterance, it is suitable only for single
# shot recognition like command or query.
# For long-running multi-utterance recognition, use start_continuous_recognition() instead.
result = recognizer.recognize_once()
# Check the result
if result.reason == speechsdk.ResultReason.TranslatedSpeech:
print("""Recognized: {}
German translation: {}
French translation: {}""".format(
result.text, result.translations['de'], result.translations['fr']))
elif result.reason == speechsdk.ResultReason.RecognizedSpeech:
print("Recognized: {}".format(result.text))
detectedSrcLang = result.properties[speechsdk.PropertyId.SpeechServiceConnection_AutoDetectSourceLanguageResult]
print("Detected Language: {}".format(detectedSrcLang))
elif result.reason == speechsdk.ResultReason.NoMatch:
print("No speech could be recognized: {}".format(result.no_match_details))
elif result.reason == speechsdk.ResultReason.Canceled:
print("Translation canceled: {}".format(result.cancellation_details.reason))
if result.cancellation_details.reason == speechsdk.CancellationReason.Error:
print("Error details: {}".format(result.cancellation_details.error_details))
import azure.cognitiveservices.speech as speechsdk
import time
import json
speech_key, service_endpoint = "YourSpeechResoureKey","YourServiceEndpoint"
weatherfilename="en-us_zh-cn.wav"
# Currently the v2 endpoint is required. In a future SDK release you won't need to set it.
translation_config = speechsdk.translation.SpeechTranslationConfig(
subscription=speech_key,
endpoint=service_endpoint,
speech_recognition_language='en-US',
target_languages=('de', 'fr'))
audio_config = speechsdk.audio.AudioConfig(filename=weatherfilename)
# Set the LanguageIdMode (Optional; Either Continuous or AtStart are accepted; Default AtStart)
translation_config.set_property(property_id=speechsdk.PropertyId.SpeechServiceConnection_LanguageIdMode, value='Continuous')
# Specify the AutoDetectSourceLanguageConfig, which defines the number of possible languages
auto_detect_source_language_config = speechsdk.languageconfig.AutoDetectSourceLanguageConfig(languages=["en-US", "de-DE", "zh-CN"])
# Creates a translation recognizer using and audio file as input.
recognizer = speechsdk.translation.TranslationRecognizer(
translation_config=translation_config,
audio_config=audio_config,
auto_detect_source_language_config=auto_detect_source_language_config)
def result_callback(event_type, evt):
"""callback to display a translation result"""
print("{}: {}\n\tTranslations: {}\n\tResult Json: {}".format(
event_type, evt, evt.result.translations.items(), evt.result.json))
done = False
def stop_cb(evt):
"""callback that signals to stop continuous recognition upon receiving an event `evt`"""
print('CLOSING on {}'.format(evt))
nonlocal done
done = True
# connect callback functions to the events fired by the recognizer
recognizer.session_started.connect(lambda evt: print('SESSION STARTED: {}'.format(evt)))
recognizer.session_stopped.connect(lambda evt: print('SESSION STOPPED {}'.format(evt)))
# event for intermediate results
recognizer.recognizing.connect(lambda evt: result_callback('RECOGNIZING', evt))
# event for final result
recognizer.recognized.connect(lambda evt: result_callback('RECOGNIZED', evt))
# cancellation event
recognizer.canceled.connect(lambda evt: print('CANCELED: {} ({})'.format(evt, evt.reason)))
# stop continuous recognition on either session stopped or canceled events
recognizer.session_stopped.connect(stop_cb)
recognizer.canceled.connect(stop_cb)
def synthesis_callback(evt):
"""
callback for the synthesis event
"""
print('SYNTHESIZING {}\n\treceived {} bytes of audio. Reason: {}'.format(
evt, len(evt.result.audio), evt.result.reason))
if evt.result.reason == speechsdk.ResultReason.RecognizedSpeech:
print("RECOGNIZED: {}".format(evt.result.properties))
if evt.result.properties.get(speechsdk.PropertyId.SpeechServiceConnection_AutoDetectSourceLanguageResult) == None:
print("Unable to detect any language")
else:
detectedSrcLang = evt.result.properties[speechsdk.PropertyId.SpeechServiceConnection_AutoDetectSourceLanguageResult]
jsonResult = evt.result.properties[speechsdk.PropertyId.SpeechServiceResponse_JsonResult]
detailResult = json.loads(jsonResult)
startOffset = detailResult['Offset']
duration = detailResult['Duration']
if duration >= 0:
endOffset = duration + startOffset
else:
endOffset = 0
print("Detected language = " + detectedSrcLang + ", startOffset = " + str(startOffset) + " nanoseconds, endOffset = " + str(endOffset) + " nanoseconds, Duration = " + str(duration) + " nanoseconds.")
global language_detected
language_detected = True
# connect callback to the synthesis event
recognizer.synthesizing.connect(synthesis_callback)
# start translation
recognizer.start_continuous_recognition()
while not done:
time.sleep(.5)
recognizer.stop_continuous_recognition()
Run and use a container
Speech containers provide websocket-based query endpoint APIs that are accessed through the Speech SDK and Speech CLI. By default, the Speech SDK and Speech CLI use the public Speech service. To use the container, you need to change the initialization method: use a container host URL instead of a key and endpoint.
When you run language identification in a container, use the SourceLanguageRecognizer object instead of SpeechRecognizer or TranslationRecognizer.
For more information about containers, see the language identification speech containers how-to guide.
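The following minimal C# sketch illustrates the change, assuming a language identification container listening at ws://localhost:5003 (the host URL and port are assumptions; replace them with your own container's address). The configuration is created from the container host rather than from a key and endpoint, and a SourceLanguageRecognizer performs standalone language detection.
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;
// Assumption: the language identification container is reachable at ws://localhost:5003.
// Create the config from the container host instead of a key and endpoint.
var speechConfig = SpeechConfig.FromHost(new Uri("ws://localhost:5003"));
var autoDetectSourceLanguageConfig =
    AutoDetectSourceLanguageConfig.FromLanguages(new string[] { "en-US", "de-DE", "zh-CN" });
using var audioConfig = AudioConfig.FromDefaultMicrophoneInput();
// SourceLanguageRecognizer detects the spoken language only; it doesn't transcribe or translate.
using var recognizer = new SourceLanguageRecognizer(speechConfig, autoDetectSourceLanguageConfig, audioConfig);
var result = await recognizer.RecognizeOnceAsync();
var detectedLanguage = AutoDetectSourceLanguageResult.FromResult(result).Language;
Console.WriteLine($"Detected language: {detectedLanguage}");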
Implement speech to text batch transcription
To identify languages with the Batch transcription REST API, use the languageIdentification property in the body of the Transcriptions - Submit request.
Warning
Batch transcription only supports language identification for default base models. If both language identification and a custom model are specified in the transcription request, the service falls back to using the base models for the specified candidate languages. This might result in unexpected recognition results.
If your speech to text scenario requires both language identification and custom models, use real-time speech to text instead of batch transcription.
The following example shows how to use the languageIdentification property with four candidate languages. For more information about request properties, see Create a batch transcription.
{
  <...>
  "properties": {
    <...>
    "languageIdentification": {
      "candidateLocales": [
        "en-US",
        "ja-JP",
        "zh-CN",
        "hi-IN"
      ]
    },
    <...>
  }
}
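A minimal sketch of submitting such a request follows, written in C#. It assumes the Transcriptions - Submit operation of the 2024-11-15 Speech to text REST API, an East US endpoint, and a publicly reachable sample audio URL; treat the exact route, API version, endpoint, key, and content URL as assumptions to replace with your own values from the current REST reference.
using System.Net.Http;
using System.Text;
using System.Text.Json;
// Assumptions: 2024-11-15 API version, East US endpoint, sample audio URL.
var requestBody = new
{
    contentUrls = new[] { "https://crbn.us/whatstheweatherlike.wav" },
    locale = "en-US", // Fallback locale; language identification selects the actual language.
    displayName = "Batch transcription with language identification",
    properties = new
    {
        languageIdentification = new
        {
            candidateLocales = new[] { "en-US", "ja-JP", "zh-CN", "hi-IN" }
        }
    }
};
using var client = new HttpClient();
client.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", "YourSpeechKey");
var content = new StringContent(JsonSerializer.Serialize(requestBody), Encoding.UTF8, "application/json");
var response = await client.PostAsync(
    "https://eastus.api.cognitive.microsoft.com/speechtotext/transcriptions:submit?api-version=2024-11-15",
    content);
// The response body describes the created transcription, including its status URL.
Console.WriteLine(await response.Content.ReadAsStringAsync());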
Related content