Conversation Transcription support for mono audio streams

I know that the Conversation Transcription feature of the Speech Service is still in preview (https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/conversation-transcription), but I would like to know if there is any planned support (even if only partial) for mono audio streams. Currently, it appears that this feature is only supported with 8-channel audio streams provided by Speech Services SDK microphone arrays. I'd like to be able to get this feature working with voice streams provided by end-user devices, such as phones, tablets, and laptops.
My end goal here is to provide both real-time speech transcription and speaker identification. I know that this can currently be achieved with a combination of Continuous Recognition during the conversation and Batch Transcription after the recording is completed. However, the drawbacks are that Batch Transcription isn't supported on free-tier instances, and running Continuous Recognition plus Batch Transcription would double the cost of each transcription session.
The real-time + async Conversation Transcription feature seems to have everything I'm looking for, but the lack of support for anything other than 8-channel mic array audio is really limiting.
So to summarize, I really have only 2 questions:
- Does Conversation Transcription have any planned support for mono audio streams in the future?
- Will the Batch Transcription API ever be available for free-tier Speech Services?
Thanks!
Quick follow-up. Yes, we will formally support mono channel audio for real-time diarization and speaker identification in the near future. We have a private workaround for customers. If you are interested, let me know and I'd be glad to connect you with the product team. Thanks.
Thanks for the quick response - yes I'd love to get in touch regarding the private workaround. Our app is still in alpha, so we don't mind doing some hacks to get it to work for now with the knowledge that it will eventually be fully supported.
Thanks!
-Brett
Hi @GiftA-MSFT - just checking in to see if you've had an opportunity to find someone on the product team that can help me with the private workaround on this one. Thanks!
-Brett
Hi @GiftA-MSFT I'm attempting to do this too. It seems that support for mono channel is still not available publicly. Please can you share details of the workaround with me?
Hi @GiftA-MSFT - I'm interested in this as well.
@GiftA-MSFT - I am still not able to find mono audio stream support for real-time conversation transcription. Can you please help me with this?
Please review my response in the private message via the comments above. Once you provide contact details, I will connect you with the product team. Thanks.