How to reduce the delay in transcribed text using Azure Communication Services and Cognitive Services?
We are using Azure Communication Services and Cognitive Services AI in the same eastus region to make outbound calls. However, the transcribed text arrives 3 to 4 seconds after the user speaks on the phone. The code uses CallMediaRecognizeSpeechOptions and handles the RecognizeCompleted event. How can we reduce this delay?
Azure Communication Services
-
ajkuma 22,401 Reputation points • Microsoft Employee
2024-02-08T10:37:48.9266667+00:00 To better assist you: is this issue confined to your resources in the eastus region? How exactly are you measuring the latency? Does it work fine locally?
-
Prasanth Krishnan 20 Reputation points
2024-02-08T12:02:31.5733333+00:00 We see the same delay locally as well. Code that calls StartRecognizingAsync:
var playSource = new FileSource(new Uri(callbackUrl + "/audio/" + audioFile));
var recognizeOptions = new CallMediaRecognizeSpeechOptions(participants.Value[0].Identifier)
{
    Prompt = playSource,
    EndSilenceTimeout = TimeSpan.FromMilliseconds(1000),
    InitialSilenceTimeout = TimeSpan.FromSeconds(10),
    InterruptPrompt = false,
    OperationContext = "OpenQuestionSpeechOrDtmf",
};
var recognizeResult = await callAutomationClient.GetCallConnection(parsedEvent.CallConnectionId)
    .GetCallMedia()
    .StartRecognizingAsync(recognizeOptions);
RecognizeCompleted event handler (the event fires after 3 seconds):
if (parsedEvent is RecognizeCompleted recognizeCompleted)
{
    switch (recognizeCompleted.RecognizeResult)
    {
        case SpeechResult speechResult:
            var text = speechResult.Speech;
            break;
    }
}
-
ajkuma 22,401 Reputation points • Microsoft Employee
2024-02-08T18:57:57.6633333+00:00 Thanks for additional details. I'm checking on this internally and will get back soon.
-
Chintan Kalkura 5 Reputation points
2024-02-13T14:10:12.8233333+00:00 We are having the same issue: the speech or audio played on the call is delayed by at least 4 seconds. Once the user receives the call, there is sometimes 4 to 6 seconds of silence before the speech or audio starts.
We are using CallAutomationClient from the Call Automation Java SDK. I play the audio after the CallConnected event has been received on the callback. I tried disabling callIntelligenceOptions with the audio file as PlaySource, but there was still a delay. Can you please help? It is critical for us not to have any delay.
PlaySource playAudio = new FileSource().setUrl("uri for audio file");
PlayOptions playOptions = new PlayOptions(playAudio, new ArrayList<>(List.of(target)))
    .setOperationContext(context);
client.getCallConnection(patientcallConnectionId).getCallMedia()
    .playWithResponse(playOptions, Context.NONE);
-
ajkuma 22,401 Reputation points • Microsoft Employee
2024-02-14T08:34:46.6166667+00:00 Chintan Kalkura, thanks for sharing info about the issue you experienced. I'm discussing this with our product team and will post back as soon as I hear from them.
-
Prasanth Krishnan 20 Reputation points
2024-02-14T08:53:12.1766667+00:00 https://github.com/DSRC-ACS/PSTN-Calls/tree/main We have created a sample to reproduce these issues. Can someone help us sort this out?
-
Prasanth Krishnan 20 Reputation points
2024-02-14T09:05:20.5366667+00:00 https://github.com/DSRC-ACS/PSTN-Calls/tree/main I have uploaded the sample code to reproduce the issues we are facing.
-
ajkuma 22,401 Reputation points • Microsoft Employee
2024-02-16T04:11:46.1066667+00:00 Prasanth, thanks for sharing the additional info. I have reached out privately for additional resource details.
-
Ghaith Rawi 0 Reputation points
2024-04-06T01:03:43.8533333+00:00 We are facing the same problem with play media action. This happens with or without caching. We get around 5 seconds delay for each media file we play to callers (IVR call flow logic). This is affecting our solution design (contact centre) and if we cannot get a fix for it, we are definitely walking away from Azure communication services.
play_source = FileSource(url=audioUri, play_source_cache_id="<playSourceId>")
play_to = [target_participant]
call_automation_client.get_call_connection(call_connection_id).play_media(
    play_source=play_source, play_to=play_to
)
-
ajkuma 22,401 Reputation points • Microsoft Employee
2024-04-12T19:31:23.46+00:00 Ghaith, could you please share your ACS resource ID and ACS name here, if that's fine?
See: Troubleshooting in Azure Communication Services
Note: Kindly do not share any PII data on the public forum.
-
Ghaith Rawi 0 Reputation points
2024-04-18T08:02:19.59+00:00 Thanks @ajkuma
As requested.
Name: communicationservicesMain
ID: /subscriptions/xxxxxxxxxxxxx/resourceGroups/contact_centre/providers/Microsoft.Communication/CommunicationServices/communicationservicesMain
I have tried everything and MS support keeps saying it is normal but it is not. I hosted the media files locally and tried again as advised by support but no changes. I checked the storage blob metrics and the delay from storage is less than 500ms. So, it must be something wrong with Azure communication service. Sadly I started giving up on ACS :(
-
ajkuma 22,401 Reputation points • Microsoft Employee
2024-04-18T12:28:20.2266667+00:00 Thanks for sharing the requested details.
Apologies for any inconvenience with this issue. Could you please share the support request (SR) number? I'll also track this internally.
-
ajkuma 22,401 Reputation points • Microsoft Employee
2024-04-18T18:21:59.62+00:00 Ghaith Rawi, thanks for sharing the resource details. Could you also please share the latest call ID?
See: Troubleshooting in Azure Communication Services
-
Ghaith Rawi 0 Reputation points
2024-04-18T23:58:52.81+00:00 Hi @ajkuma,
Thanks for taking the time to look at this. Really appreciate that. We would love to build our solution on top of ACS but this issue is a red line for our management/clients.
below is the ticket number
TrackingID#2404070030000289
Sample calls
'correlationId': '83809fe4-7d95-4cc9-bace-dc143f9d691b'
'correlationId': '8b14f79d-a007-46e9-a782-13b62d520b0b'
Thanks again
-
Ghaith Rawi 0 Reputation points
2024-04-19T06:22:56.6533333+00:00 Another test is below ('correlationId': '23afef4f-0b55-4005-99d4-6cffc4023896').
I ran the code while hosting the media files on the storage (in the same tenant as ACS) and recorded the times (see below).
The code initiates the play media action immediately once the call is connected (a fraction of a second = 33ms after the call connected event is received). I recorded the time we initiated playing a media file and the time we got the PlayCompleted event. After subtracting the media file length from the elapsed time, there is a delay of roughly 2 seconds for each play media action.
[2024-04-19T04:09:54.408Z] Call connected. 'correlationId': '23afef4f-0b55-4005-99d4-6cffc4023896'
[2024-04-19T04:09:54.777Z] Start playing Media File (media file length = 2.6 seconds) (https://contactcentredata.blob.core.windows.net/prompts/welcome_msg.wav)
[2024-04-19T04:09:59.742Z] Call media played successfully. 'correlationId': '23afef4f-0b55-4005-99d4-6cffc4023896'
[2024-04-19T04:09:59.784Z] Start playing Media File (media file length = 2.9 seconds) (https://contactcentredata.blob.core.windows.net/prompts/welcome.wav)
[2024-04-19T04:10:04.361Z] Call media played successfully. 'correlationId': '23afef4f-0b55-4005-99d4-6cffc4023896'
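The subtraction described above can be reproduced directly from the logged timestamps. The following is a minimal sketch, not part of the original post: the helper name `play_overhead_seconds` and the `PLAYS` table are hypothetical, and the values are copied from the log lines above (with the trailing `Z` dropped so older Python versions can parse them). Note that the result covers everything between the play request and the arrival of PlayCompleted, so it includes file download and callback delivery, not only time spent inside the service.

```python
from datetime import datetime

# (file name, play requested at, PlayCompleted received at, media length in seconds)
# Values copied from the log lines above.
PLAYS = [
    ("welcome_msg.wav", "2024-04-19T04:09:54.777", "2024-04-19T04:09:59.742", 2.6),
    ("welcome.wav", "2024-04-19T04:09:59.784", "2024-04-19T04:10:04.361", 2.9),
]


def play_overhead_seconds(started, completed, media_length):
    """Elapsed time from play request to PlayCompleted, minus the audio length."""
    elapsed = datetime.fromisoformat(completed) - datetime.fromisoformat(started)
    return round(elapsed.total_seconds() - media_length, 3)


# Overhead per play action, keyed by file name.
OVERHEADS = {
    name: play_overhead_seconds(started, completed, length)
    for name, started, completed, length in PLAYS
}

if __name__ == "__main__":
    for name, overhead in OVERHEADS.items():
        print(f"{name}: {overhead:.3f} s of overhead")
```

For these two plays the overhead works out to about 2.4 and 1.7 seconds, which matches the "roughly 2 seconds" figure.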
Thanks
-
ajkuma 22,401 Reputation points • Microsoft Employee
2024-04-19T12:40:48.1666667+00:00 Ghaith, thanks for sharing additional info. I'm discussing this internally and will get back soon.
-
ajkuma 22,401 Reputation points • Microsoft Employee
2024-04-24T05:25:59.26+00:00 Ghaith, our product engineering team is working on this. I'll post back as soon as I have more updates. We appreciate your patience!
-
ajkuma 22,401 Reputation points • Microsoft Employee
2024-05-01T09:54:31.8933333+00:00 To provide an update: based on the call ID shared, the download time is around 700 ms, and there is no extra delay between the request being received and the file being downloaded, played, and completed. We notice that the HTTP call triggered for PlayCompleted takes about 1.5 seconds on the Contoso side, which could also be affecting the experience.
For additional investigation, please share the call IDs and recordings of the actual affected calls (where the latency is observed) so that we can correlate the experience and identify the delay you are experiencing.
Note: Please do not share any PII data on the public forum.
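The reply above mentions that the PlayCompleted HTTP callback takes about 1.5 seconds on the application side. One general way to keep a webhook handler short is to acknowledge the request right away and do the slow work on a background thread. The sketch below uses only the Python standard library and is framework-agnostic; `handle_callback`, `process_event`, and the event dictionaries are hypothetical stand-ins, not part of the ACS SDK, and the thread does not confirm that this would remove the delay reported here.

```python
import queue
import threading
import time

event_queue = queue.Queue()
processed = []  # records what the worker handled, for illustration only


def process_event(event):
    """Hypothetical slow business logic (logging, database writes, and so on)."""
    time.sleep(0.05)  # stand-in for slow work
    processed.append(event["type"])


def worker():
    """Drain the queue on a background thread so the HTTP handler never waits."""
    while True:
        event = event_queue.get()
        try:
            if event is None:  # sentinel that stops the worker
                break
            process_event(event)
        finally:
            event_queue.task_done()


def handle_callback(events):
    """Enqueue the received events and return an HTTP status code immediately."""
    for event in events:
        event_queue.put(event)
    return 200


threading.Thread(target=worker, daemon=True).start()

start = time.perf_counter()
status = handle_callback([{"type": "PlayCompleted"}, {"type": "RecognizeCompleted"}])
ack_seconds = time.perf_counter() - start  # time spent before acknowledging

event_queue.join()  # wait here only so the example can inspect `processed`
```

If the next call action (for example, playing the next prompt) is issued from `process_event`, it should be the first thing that function does, since anything ahead of it still delays what the caller hears.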