NikitaKuzmin-8526 asked (0 votes):

Azure Communication Services - Access the audio stream and pass it to the Speech-to-Text service in real time

Hi!
I've started investigating the Azure Communication Services SDK for .NET. I'm trying to figure out whether it is possible to create a video call between two people and get the remote participant's audio stream, in order to pass it to the Azure Speech service, generate text in real time, and then run some analysis on it.
So the main question is: how do I get the remote participant's audio stream during a video call? Is it possible? I don't see any entities, fields, or properties connected with audio in Call, CallAgent, LocalVideoStream, or the other video call entities. If you have examples of something similar, that would be very helpful. Thank you!

azure-communication-services

brtrach-MSFT answered (2 votes):

@NikitaKuzmin-8526 Thank you for your question about getting the raw audio from a video call.

We verified with the product group that this capability is not currently available. They did confirm that a feature allowing access to the raw audio and video streams is being looked into, but no ETA is available at this time.

Closed captioning is another feature being worked on, also with no ETA.

ACS was released less than a year ago, and the team is hard at work adding features; the roadmap is very bright. Features can be announced at any time, but a lot of products tie their announcements to the //Build or Ignite conferences, so keep an eye out for news especially around then. Keep an eye here for any updates.

P.S. Thank you for the verified answer and 5 star survey on the other thread. This feedback was recognized by my manager and helps us as engineers. We appreciate your feedback. The other thread you asked for assistance with is owned by my co-worker and they are in discussion with the product group right now so you should hopefully receive an update on that thread shortly.

Let us know if you have any further questions or concerns regarding this topic. Otherwise I hope you have a great weekend.


Got it! Thank you for the detailed response; it saves us investigation time!


Thanks for the details. How is Dynamics 365 Omnichannel for Customer Service able to access the stream and convert speech to text in real time? The team claims that they are using ACS.

jeyapandian-6329 answered (0 votes):

Hi,

I'd like to add another request: real-time translation delivered as audio rather than text.

If you provide access to the audio stream, we would also need the ability to mute a particular person.

Say we run a global meeting where the key speaker presents in English and translators provide real-time translation into multiple languages. A person who wants to listen in their native language would mute the key speaker's audio and listen to the real-time translator's audio instead.

Is it possible?


