This question is similar to this one, but that post is from 2020, so I decided to ask again. Here's my scenario:
I have a collection of mono wav files of research interviews, and I have been using Azure's speech-to-text to transcribe them via asynchronous file uploads. Azure's conversation transcription feature seems useful to me, mainly for its speaker diarization ability. However, the Azure docs still indicate that 8-channel audio is required for async conversation transcription. Is there a workaround for mono wav files?
The previous post mentioned a private workaround, but unfortunately that will not work for me, as I will be working with my clients' individual Azure accounts, so any functionality would need to be publicly accessible. I am working in JavaScript, and any code demos would be greatly appreciated as well.
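For reference, here's a minimal sketch of the kind of request I'd want to make, using the batch transcription REST API (v3.1) with its `diarizationEnabled` property. The key, region, and blob URL are all placeholders, and I'm assuming diarization on a single-channel file is even allowed here:

```javascript
// Build the request body for a batch transcription job with diarization.
// The diarizationEnabled flag asks the service to label speakers in the result.
function buildTranscriptionRequest(audioUrl) {
  return {
    displayName: "Interview transcription",
    locale: "en-US",
    contentUrls: [audioUrl], // SAS URL to the mono wav file in blob storage
    properties: {
      diarizationEnabled: true,
      wordLevelTimestampsEnabled: true,
    },
  };
}

// Submitting the job would look roughly like this (placeholder key/region):
// const region = "eastus";
// const key = "<speech-resource-key>";
// const res = await fetch(
//   `https://${region}.api.cognitive.microsoft.com/speechtotext/v3.1/transcriptions`,
//   {
//     method: "POST",
//     headers: {
//       "Ocp-Apim-Subscription-Key": key,
//       "Content-Type": "application/json",
//     },
//     body: JSON.stringify(buildTranscriptionRequest("<sas-url-to-wav>")),
//   }
// );
```

This works fine for plain transcription of my mono files, but it's unclear to me whether this path gives the same diarization quality as the conversation transcription feature, or whether conversation transcription itself can accept mono input at all.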
Also, the pricing page lists multichannel conversation transcription at a higher rate than standard transcription ($2.10 vs $1.00 per hour), so if a single-channel conversation transcription feature exists, which pricing tier would it fall under?