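Asynchronous conversation transcription multichannel diarization
Note
This feature is currently in public preview. This preview is provided without a service-level agreement, and is not recommended for production workloads. Certain features might not be supported or might have constrained capabilities. For more information, see the Supplemental Terms of Use for Microsoft Azure Previews.
This article demonstrates asynchronous conversation transcription multichannel diarization by using the RemoteMeetingTranscriptionClient API. If you have configured conversation transcription for asynchronous transcription and have a meetingId, you can use the RemoteMeetingTranscriptionClient API to obtain the transcription associated with that meetingId.
Important
Conversation transcription multichannel diarization (preview) is retiring on March 28, 2025. For more information about migrating to other speech to text features, see Migrate away from conversation transcription multichannel diarization.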
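Asynchronous vs. real-time + asynchronous
With asynchronous transcription, you stream the meeting audio, but you don't need the transcription returned in real time. Instead, after the audio is sent, you use the meetingId of the Meeting to query the status of the asynchronous transcription. When the asynchronous transcription is ready, you get a RemoteMeetingTranscriptionResult.
With real-time plus asynchronous, you get the transcription in real time, and you can also get the transcription by querying with the meetingId (similar to the asynchronous scenario).
Asynchronous transcription takes two steps. The first step is to upload the audio, choosing either asynchronous only or real-time plus asynchronous. The second step is to get the transcription results.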
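Querying by meetingId until the transcription is ready is a poll-until-done loop. The following is a minimal Java sketch of that pattern alone, using only the standard library. The class name, method name, and status strings are hypothetical and are not part of the Speech SDK; a fake status source stands in for the service query.

```java
import java.time.Duration;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical sketch: none of these names come from the Speech SDK.
final class StatusPollingSketch {

    // Statuses after which polling stops. The strings are illustrative placeholders.
    private static final Set<String> TERMINAL = Set.of("Succeeded", "Failed");

    // Calls statusSource until it reports a terminal status or maxAttempts is reached,
    // sleeping for the given interval between attempts. Returns the last status seen.
    static String pollUntilDone(Supplier<String> statusSource, Duration interval, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        String status = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            status = statusSource.get();
            boolean finished = status != null && TERMINAL.contains(status);
            if (finished || attempt == maxAttempts) {
                break;
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and stop polling early.
                Thread.currentThread().interrupt();
                break;
            }
        }
        return status;
    }

    public static void main(String[] args) {
        // Fake status source standing in for a status query by meetingId.
        Iterator<String> fake = List.of("Running", "Running", "Succeeded").iterator();
        String result = pollUntilDone(fake::next, Duration.ofMillis(10), 5);
        System.out.println(result);
    }
}
```

The attempt cap keeps a caller from waiting forever if the status never becomes terminal. In the sample transcription code later in this article, the SDK's own operation and poller types do the waiting.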
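Upload the audio (C#)
The first step for asynchronous transcription is to send the audio to the conversation transcription service by using the Speech SDK.
This example code shows how to use conversation transcription in asynchronous-only mode. To stream audio to the transcriber, you need to add audio streaming code derived from the real-time conversation transcription quickstart.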
async Task CompleteContinuousRecognition(MeetingTranscriber recognizer, string meetingId)
{
    var finishedTaskCompletionSource = new TaskCompletionSource<int>();
    recognizer.SessionStopped += (s, e) =>
    {
        finishedTaskCompletionSource.TrySetResult(0);
    };
    recognizer.Canceled += (s, e) =>
    {
        Console.WriteLine($"CANCELED: Reason={e.Reason}");
        if (e.Reason == CancellationReason.Error)
        {
            Console.WriteLine($"CANCELED: ErrorCode={e.ErrorCode}");
            Console.WriteLine($"CANCELED: ErrorDetails={e.ErrorDetails}");
            Console.WriteLine($"CANCELED: Did you update the subscription info?");
            throw new System.ApplicationException($"{e.ErrorDetails}");
        }
        finishedTaskCompletionSource.TrySetResult(0);
    };
    await recognizer.StartTranscribingAsync().ConfigureAwait(false);
    // Waits for completion.
    // Use Task.WaitAny to keep the task rooted.
    Task.WaitAny(new[] { finishedTaskCompletionSource.Task });
    await recognizer.StopTranscribingAsync().ConfigureAwait(false);
}
async Task<List<string>> GetRecognizerResult(MeetingTranscriber recognizer, string meetingId)
{
    List<string> recognizedText = new List<string>();
    recognizer.Transcribed += (s, e) =>
    {
        if (e.Result.Text.Length > 0)
        {
            recognizedText.Add(e.Result.Text);
        }
    };
    await CompleteContinuousRecognition(recognizer, meetingId);
    recognizer.Dispose();
    return recognizedText;
}
async Task UploadAudio()
{
    // Create the speech config object
    // Substitute real information for "YourSubscriptionKey" and "Region"
    SpeechConfig speechConfig = SpeechConfig.FromSubscription("YourSubscriptionKey", "Region");
    speechConfig.SetProperty("ConversationTranscriptionInRoomAndOnline", "true");
    // Set the property for asynchronous transcription
    speechConfig.SetServiceProperty("transcriptionMode", "async", ServicePropertyChannel.UriQueryParameter);
    // Alternatively: set the property for real-time plus asynchronous transcription
    // speechConfig.SetServiceProperty("transcriptionMode", "RealTimeAndAsync", ServicePropertyChannel.UriQueryParameter);
    // Create an audio stream from a wav file or from the default microphone if you want to stream live audio from the supported devices
    // Replace with your own audio file name and Helper class which implements AudioConfig using PullAudioInputStreamCallback
    PullAudioInputStreamCallback wavfilePullStreamCallback = Helper.OpenWavFile("16kHz16Bits8channelsOfRecordedPCMAudio.wav");
    // Create an audio stream format assuming the file used above is 16kHz, 16 bits and 8 channel pcm wav file
    AudioStreamFormat audioStreamFormat = AudioStreamFormat.GetWaveFormatPCM(16000, 16, 8);
    // Create an input stream
    AudioInputStream audioStream = AudioInputStream.CreatePullStream(wavfilePullStreamCallback, audioStreamFormat);
    // Ensure the meetingId for a new meeting is a truly unique GUID
    String meetingId = Guid.NewGuid().ToString();
    // Create a Meeting
    using (var meeting = await Meeting.CreateMeetingAsync(speechConfig, meetingId))
    {
        using (var meetingTranscriber = new MeetingTranscriber(AudioConfig.FromStreamInput(audioStream)))
        {
            await meetingTranscriber.JoinMeetingAsync(meeting);
            // Helper function to get the real-time transcription results
            var result = await GetRecognizerResult(meetingTranscriber, meetingId);
        }
    }
}
If you want real-time plus asynchronous, comment and uncomment the appropriate lines of code as follows:
// Set the property for asynchronous transcription
// speechConfig.SetServiceProperty("transcriptionMode", "async", ServicePropertyChannel.UriQueryParameter);
// Alternatively: set the property for real-time plus asynchronous transcription
speechConfig.SetServiceProperty("transcriptionMode", "RealTimeAndAsync", ServicePropertyChannel.UriQueryParameter);
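Get transcription results (C#)
Install Microsoft.CognitiveServices.Speech.Remotemeeting version 1.13.0 or later via NuGet.
Sample transcription code
After you have the meetingId, create a remote meeting transcription client, RemoteMeetingTranscriptionClient, in the client application to query the status of the asynchronous transcription. Create a RemoteMeetingTranscriptionOperation object to get a long-running Operation object. You can check the status of the operation or wait for it to complete.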
// Create the speech config
SpeechConfig config = SpeechConfig.FromSubscription("YourSpeechKey", "YourSpeechRegion");
// Create the speech client
RemoteMeetingTranscriptionClient client = new RemoteMeetingTranscriptionClient(config);
// Create the remote operation
RemoteMeetingTranscriptionOperation operation =
    new RemoteMeetingTranscriptionOperation(meetingId, client);
// Wait for operation to finish
await operation.WaitForCompletionAsync(TimeSpan.FromSeconds(10), CancellationToken.None);
// Get the result of the long running operation
var val = operation.Value.MeetingTranscriptionResults;
// Print the fields from the results
foreach (var item in val)
{
    Console.WriteLine($"{item.Text}, {item.ResultId}, {item.Reason}, {item.UserId}, {item.OffsetInTicks}, {item.Duration}");
    Console.WriteLine($"{item.Properties.GetProperty(PropertyId.SpeechServiceResponse_JsonResult)}");
}
Console.WriteLine("Operation completed");
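Upload the audio (Java)
Before asynchronous conversation transcription can be performed, you need to send the audio to the conversation transcription service by using the Speech SDK.
This example code shows how to use conversation transcription in asynchronous-only mode. To stream audio to the transcriber, you need to add audio streaming code derived from the real-time conversation transcription quickstart. See the limitations section of that topic for the supported platforms and language APIs.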
// Create the speech config object
// Substitute real information for "YourSubscriptionKey" and "Region"
SpeechConfig speechConfig = SpeechConfig.fromSubscription("YourSubscriptionKey", "Region");
speechConfig.setProperty("ConversationTranscriptionInRoomAndOnline", "true");
// Set the property for asynchronous transcription
speechConfig.setServiceProperty("transcriptionMode", "async", ServicePropertyChannel.UriQueryParameter);
// Set the property for real-time plus asynchronous transcription
//speechConfig.setServiceProperty("transcriptionMode", "RealTimeAndAsync", ServicePropertyChannel.UriQueryParameter);
// pick a meeting Id that is a GUID.
String meetingId = UUID.randomUUID().toString();
// Create a Meeting
Future<Meeting> meetingFuture = Meeting.createMeetingAsync(speechConfig, meetingId);
Meeting meeting = meetingFuture.get();
// Create an audio stream from a wav file or from the default microphone if you want to stream live audio from the supported devices
// Replace with your own audio file name and Helper class which implements AudioConfig using PullAudioInputStreamCallback
PullAudioInputStreamCallback wavfilePullStreamCallback = Helper.OpenWavFile("16kHz16Bits8channelsOfRecordedPCMAudio.wav");
// Create an audio stream format assuming the file used above is 16kHz, 16 bits and 8 channel pcm wav file
AudioStreamFormat audioStreamFormat = AudioStreamFormat.getWaveFormatPCM((long)16000, (short)16,(short)8);
// Create an input stream
AudioInputStream audioStream = AudioInputStream.createPullStream(wavfilePullStreamCallback, audioStreamFormat);
// Create a meeting transcriber
MeetingTranscriber transcriber = new MeetingTranscriber(AudioConfig.fromStreamInput(audioStream));
// join a meeting
transcriber.joinMeetingAsync(meeting);
// Add the event listener for the real-time events
transcriber.transcribed.addEventListener((o, e) -> {
    System.out.println("Meeting transcriber Recognized:" + e.toString());
});
transcriber.canceled.addEventListener((o, e) -> {
    System.out.println("Meeting transcriber canceled:" + e.toString());
    try {
        transcriber.stopTranscribingAsync().get();
    } catch (InterruptedException ex) {
        ex.printStackTrace();
    } catch (ExecutionException ex) {
        ex.printStackTrace();
    }
});
transcriber.sessionStopped.addEventListener((o, e) -> {
    System.out.println("Meeting transcriber stopped:" + e.toString());
    try {
        transcriber.stopTranscribingAsync().get();
    } catch (InterruptedException ex) {
        ex.printStackTrace();
    } catch (ExecutionException ex) {
        ex.printStackTrace();
    }
});
// start the transcription.
Future<?> future = transcriber.startTranscribingAsync();
...
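The C# listing in this article blocks on a TaskCompletionSource until SessionStopped or Canceled fires. A comparable wait in Java could be built on CompletableFuture, as in the minimal sketch below. It uses only the standard library, and the class and method names are hypothetical: they are not Speech SDK APIs, and the sketch does not reconstruct the code elided from the listing above. A background thread simulates the event being raised.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch: a one-shot signal that event listeners can complete.
final class StopSignalSketch {

    // Completed by whichever handler runs first; later completions are ignored.
    private final CompletableFuture<String> done = new CompletableFuture<>();

    // Intended to be called from a sessionStopped listener.
    void onSessionStopped() {
        done.complete("stopped");
    }

    // Intended to be called from a canceled listener.
    void onCanceled() {
        done.complete("canceled");
    }

    // Blocks until a handler has run or the timeout elapses.
    String await(long timeout, TimeUnit unit) {
        try {
            return done.get(timeout, unit);
        } catch (TimeoutException e) {
            return "timeout";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        } catch (ExecutionException e) {
            return "failed";
        }
    }

    public static void main(String[] args) {
        StopSignalSketch signal = new StopSignalSketch();
        // Simulate the event being raised from another thread.
        new Thread(signal::onSessionStopped).start();
        System.out.println(signal.await(5, TimeUnit.SECONDS));
    }
}
```

A timeout is included so that the caller doesn't wait indefinitely if neither event arrives.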
If you want real-time plus asynchronous, comment and uncomment the appropriate lines of code as follows:
// Set the property for asynchronous transcription
//speechConfig.setServiceProperty("transcriptionMode", "async", ServicePropertyChannel.UriQueryParameter);
// Set the property for real-time plus asynchronous transcription
speechConfig.setServiceProperty("transcriptionMode", "RealTimeAndAsync", ServicePropertyChannel.UriQueryParameter);
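Instead of commenting and uncommenting lines, the mode could be chosen at run time. The sketch below is one possible way to do that, not part of the SDK: the enum and its names are hypothetical, and only the two string values ("async" and "RealTimeAndAsync") come from the listings above. The SDK call appears only in a comment, so the sketch compiles without the Speech SDK on the classpath.

```java
// Hypothetical sketch: maps a caller's choice to the transcriptionMode value.
final class TranscriptionModeSketch {

    enum Mode {
        ASYNC_ONLY("async"),
        REAL_TIME_AND_ASYNC("RealTimeAndAsync");

        private final String serviceValue;

        Mode(String serviceValue) {
            this.serviceValue = serviceValue;
        }

        // The string passed as the value of the "transcriptionMode" service property.
        String serviceValue() {
            return serviceValue;
        }
    }

    // Chooses the mode from a simple flag.
    static Mode fromFlag(boolean wantRealTime) {
        return wantRealTime ? Mode.REAL_TIME_AND_ASYNC : Mode.ASYNC_ONLY;
    }

    public static void main(String[] args) {
        Mode mode = fromFlag(args.length > 0 && args[0].equals("--realtime"));
        // With the Speech SDK available, the value would be passed as in the listings above:
        // speechConfig.setServiceProperty("transcriptionMode", mode.serviceValue(), ServicePropertyChannel.UriQueryParameter);
        System.out.println("transcriptionMode=" + mode.serviceValue());
    }
}
```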
Get transcription results (Java)
For the code shown here, you need remote-meeting version 1.8.0, which is supported only for Java (1.8.0 or later) on Windows and Linux.
Obtain the asynchronous meeting client SDK
You can obtain remote-meeting by editing your pom.xml file as follows.
At the end of the file, before the closing tag </project>, create a repositories element that references the Maven repository for the Speech SDK:
<repositories>
  <repository>
    <id>maven-cognitiveservices-speech</id>
    <name>Microsoft Cognitive Services Speech Maven Repository</name>
    <url>https://azureai.azureedge.net/maven/</url>
  </repository>
</repositories>
Also add a dependencies element, with remotemeeting-client-sdk 1.8.0 as a dependency:
<dependencies>
  <dependency>
    <groupId>com.microsoft.cognitiveservices.speech.remotemeeting</groupId>
    <artifactId>remote-meeting</artifactId>
    <version>1.8.0</version>
  </dependency>
</dependencies>
Save the changes.
Sample transcription code
After you have the meetingId, create a remote meeting transcription client, RemoteMeetingTranscriptionClient, in the client application to query the status of the asynchronous transcription. Use the GetTranscriptionOperation method of RemoteMeetingTranscriptionClient to get a PollerFlux object. The PollerFlux object carries information about the remote operation status, RemoteMeetingTranscriptionOperation, and the final result, RemoteMeetingTranscriptionResult. After the operation finishes, call getFinalResult on a SyncPoller to get the RemoteMeetingTranscriptionResult. This code prints the contents of the result to system output.
// Create the speech config object
SpeechConfig speechConfig = SpeechConfig.fromSubscription("YourSubscriptionKey", "Region");
// Create a remote Meeting Transcription client
RemoteMeetingTranscriptionClient client = new RemoteMeetingTranscriptionClient(speechConfig);
// Get the PollerFlux for the remote operation
PollerFlux<RemoteMeetingTranscriptionOperation, RemoteMeetingTranscriptionResult> remoteTranscriptionOperation = client.GetTranscriptionOperation(meetingId);
// Subscribe to PollerFlux to get the remote operation status
remoteTranscriptionOperation.subscribe(
    pollResponse -> {
        System.out.println("Poll response status : " + pollResponse.getStatus());
        System.out.println("Poll response status : " + pollResponse.getValue().getServiceStatus());
    }
);
// Obtain the blocking operation using getSyncPoller
SyncPoller<RemoteMeetingTranscriptionOperation, RemoteMeetingTranscriptionResult> blockingOperation = remoteTranscriptionOperation.getSyncPoller();
// Wait for the operation to finish
blockingOperation.waitForCompletion();
// Get the final result response
RemoteMeetingTranscriptionResult resultResponse = blockingOperation.getFinalResult();
// Print the result
if (resultResponse != null) {
    if (resultResponse.getMeetingTranscriptionResults() != null) {
        for (int i = 0; i < resultResponse.getMeetingTranscriptionResults().size(); i++) {
            MeetingTranscriptionResult result = resultResponse.getMeetingTranscriptionResults().get(i);
            System.out.println(result.getProperties().getProperty(PropertyId.SpeechServiceResponse_JsonResult.name()));
            System.out.println(result.getProperties().getProperty(PropertyId.SpeechServiceResponse_JsonResult));
            System.out.println(result.getOffset());
            System.out.println(result.getDuration());
            System.out.println(result.getUserId());
            System.out.println(result.getReason());
            System.out.println(result.getResultId());
            System.out.println(result.getText());
            System.out.println(result.toString());
        }
    }
}
System.out.println("Operation finished");
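The offset and duration values printed above are raw numbers. The sketch below shows one way to turn them into readable timestamps. It assumes the values are expressed in 100-nanosecond ticks, as the C# field name OffsetInTicks suggests, and that they arrive as BigInteger; both are assumptions to verify against the SDK reference. The class and method names are hypothetical.

```java
import java.math.BigInteger;
import java.time.Duration;
import java.util.Locale;

// Hypothetical sketch: converts tick counts to java.time.Duration and formats them.
final class TickConversionSketch {

    // Assumption: one tick is 100 nanoseconds.
    private static final long NANOS_PER_TICK = 100L;

    // Converts a tick count to a Duration. Throws ArithmeticException if the value does not fit in a long.
    static Duration ticksToDuration(BigInteger ticks) {
        long nanos = Math.multiplyExact(ticks.longValueExact(), NANOS_PER_TICK);
        return Duration.ofNanos(nanos);
    }

    // Formats a non-negative duration as HH:mm:ss.SSS.
    static String formatTimestamp(Duration d) {
        long totalMillis = d.toMillis();
        long hours = totalMillis / 3_600_000;
        long minutes = (totalMillis / 60_000) % 60;
        long seconds = (totalMillis / 1_000) % 60;
        long millis = totalMillis % 1_000;
        return String.format(Locale.ROOT, "%02d:%02d:%02d.%03d", hours, minutes, seconds, millis);
    }

    public static void main(String[] args) {
        // 36,615,000,000 ticks is 3661.5 seconds under the 100 ns assumption.
        BigInteger offsetTicks = BigInteger.valueOf(36_615_000_000L);
        System.out.println(formatTimestamp(ticksToDuration(offsetTicks))); // prints 01:01:01.500
    }
}
```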