Text to Speech fails to output some formats with background audio

Leo Malfray 65 Reputation points
2024-01-10T04:51:58.45+00:00

I'm using the Speech service to do Text to Speech (called from a JavaScript Azure Function).

I use SSML with the mstts:backgroundaudio tag where the source (src) is an mp3 audio file.
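For reference, here is a minimal sketch of the kind of SSML document described above. The helper names (`escapeXml`, `buildSsml`), the voice name, and the audio URL are placeholders chosen for illustration, not the values from the actual function; only the `mstts:backgroundaudio` element and its `src` attribute come from the setup described here.

```javascript
// Escape characters that are special in XML text and attribute values.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Build an SSML document with a background audio track.
// Hypothetical helper: the parameter names are assumptions for this sketch.
function buildSsml({ text, voice, backgroundSrc, lang = "en-US" }) {
  return [
    `<speak version="1.0" xml:lang="${escapeXml(lang)}"`,
    `  xmlns="http://www.w3.org/2001/10/synthesis"`,
    `  xmlns:mstts="http://www.w3.org/2001/mstts">`,
    // The background audio element is placed before the voice element.
    `  <mstts:backgroundaudio src="${escapeXml(backgroundSrc)}" />`,
    `  <voice name="${escapeXml(voice)}">${escapeXml(text)}</voice>`,
    `</speak>`,
  ].join("\n");
}

// Example call with placeholder values.
const exampleSsml = buildSsml({
  text: "Hello, thanks for calling.",
  voice: "en-US-JennyNeural",
  backgroundSrc: "https://example.com/background.mp3",
});
```

The resulting string is what gets passed to the synthesizer; the only thing that changes between the working and failing cases is the output format.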

When I output as MP3 (Audio16Khz32KBitRateMonoMp3) or PCM WAV (Riff16Khz16BitMonoPcm) using SpeechSynthesisOutputFormat, there is no problem; it works as expected.

However, when I choose a mu-law or A-law output format (Raw8Khz8BitMonoALaw, Raw8Khz8BitMonoMULaw, Riff8Khz8BitMonoALaw, Riff8Khz8BitMonoMULaw), I get an internal server error on your side.

The same formats work fine without the background audio tag.

I was able to work around this by generating PCM and converting it to mu-law with an external library, but that adds an extra step that should not be necessary if the service worked as expected.
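To illustrate the conversion step, here is a rough sketch of a 16-bit PCM to mu-law encoder in plain JavaScript. It is not the external library mentioned above; it is the widely used G.711 mu-law formula written out by hand. It assumes the input is headerless, little-endian, 16-bit mono PCM that is already at 8 kHz (for example, audio requested in an 8 kHz 16-bit raw PCM output format, if your SDK version offers one). The function names `linearToMuLaw` and `pcm16ToMuLaw` are made up for this example.

```javascript
const MULAW_BIAS = 0x84;  // 132, added before finding the segment
const MULAW_CLIP = 32635; // largest magnitude that still fits after the bias

// Encode one signed 16-bit PCM sample as one 8-bit mu-law byte.
function linearToMuLaw(sample) {
  const sign = (sample >> 8) & 0x80; // 0x80 for negative samples, else 0
  if (sign !== 0) sample = -sample;
  if (sample > MULAW_CLIP) sample = MULAW_CLIP;
  sample += MULAW_BIAS;

  // Find the segment (exponent): position of the highest set bit, 7 down to 0.
  let exponent = 7;
  for (let mask = 0x4000; (sample & mask) === 0 && exponent > 0; mask >>= 1) {
    exponent--;
  }
  const mantissa = (sample >> (exponent + 3)) & 0x0f;

  // mu-law bytes are stored with all bits inverted.
  return ~(sign | (exponent << 4) | mantissa) & 0xff;
}

// Convert a Buffer of little-endian 16-bit PCM samples to a Buffer of mu-law bytes.
// Assumption: no WAV header is present and the sample rate is already 8 kHz.
function pcm16ToMuLaw(pcm) {
  const sampleCount = Math.floor(pcm.length / 2);
  const out = Buffer.alloc(sampleCount);
  for (let i = 0; i < sampleCount; i++) {
    out[i] = linearToMuLaw(pcm.readInt16LE(i * 2));
  }
  return out;
}
```

This sketch does not resample, so PCM produced at 16 kHz would need to be downsampled to 8 kHz first, and any WAV header would need to be stripped before conversion.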

I'm not sure whether there is a better place to report this issue.

Azure AI Speech