How to save a text-to-speech audio file with Node.js to an external GCS bucket?

Rafa Torres 46 Reputation points
2023-09-07T16:04:40.19+00:00

We are trying to create a text-to-speech audio file and upload it to an external bucket storage (GCS in our case).

Our problem is converting the binary data that Azure returns into a Buffer. The data seems to be corrupted: the response starts with "RIFF�~\u0004\u0000WAVEfmt...(continues)" and has "������������������\u0000\u0000\u0000\u0000" in the middle. Is that a sign of corruption?

That causes the audio file we created to be empty (I guess).
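
For context, "�" characters like these are what binary audio looks like after it has been decoded as text. The sketch below is only an illustration, assuming the response body was decoded as a UTF-8 string before it reached Buffer.from; the sample bytes and variable names are made up for the example and are not taken from the actual response.

```javascript
// Hypothetical sample: a RIFF tag followed by a few raw bytes.
// 0xF0 (here) and 0xFF are not valid UTF-8 sequences.
const original = Buffer.from([0x52, 0x49, 0x46, 0x46, 0xf0, 0x7e, 0x04, 0x00, 0xff, 0xff]);

// Decoding as UTF-8 replaces every invalid byte with U+FFFD ("�").
const asText = original.toString('utf8');

// Re-encoding cannot restore the bytes: each U+FFFD is written as EF BF BD.
const roundTripped = Buffer.from(asText, 'utf8');

console.log(asText.includes('\uFFFD'));     // true
console.log(original.equals(roundTripped)); // false
console.log(original.length, roundTripped.length);
```

If that assumption holds, the original bytes are already lost by the time the string is turned back into a Buffer.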

We use Node.js with JavaScript. The base API client is created with Axios:

const azureLongAudio = axios.create({
    baseURL: DEFAULTPATH + apiVersion,
    headers: {
        'Ocp-Apim-Subscription-Key': API_KEY,
        'content-type': 'application/json',
        'X-Microsoft-OutputFormat': 'riff-24khz-16bit-mono-pcm',
        'Content-Type': 'application/ssml+xml'
    }
});

Then we use the following function to call Azure, create a Buffer from the returned binary data, write a WAV file, and upload it to an external bucket:

const createAudioFromText = (text) => {
    const version = '1.0';
    const language = supportedLanguages[0];
    const voiceGender = 'Male';
    const voiceName = 'es-ES-EliasNeural';
    const textContent = `
    <speak version='${version}' xml:lang='${language}'>
        <voice xml:lang='${language}' xml:gender='${voiceGender}' name='${voiceName}'>
            ${text}
        </voice>
    </speak>`;
    return new Promise(async (resolve, reject) => {
        const bucket = gcs.getBucket();
        const filename = 'test-01.wav';
        const file = bucket.file(`recordings/${filename}`);

        const { data: audioData } = await azureLongAudio.post('/', textContent).catch((err) => {
            reject(err.response);
        });

        const audioBuffer = Buffer.from(audioData);

        const writer = new wav.FileWriter(filename, {
            sampleRate: 24000, 
            channels: 1, 
            bitDepth: 16, 
            audioFormat: 1 
        });

        writer.write(audioBuffer);
        writer.end();

        writer.on('finish', async () => {
            const writeStream = file.createWriteStream({
                resumable: true,
                contentType: 'audio/wav'
            });

            writer.pipe(writeStream);

            writeStream.on('error', (err) => {
                console.error('Failed to save into GCS:', err);
                reject(err);
            });

            writeStream.on('finish', () => {
                console.log(`Saved in GCS: ${filename}`);
                resolve(audioData);
            });
        });
    });
};

I'd appreciate any response.

Azure AI Speech
An Azure service that integrates speech processing into apps and services.

1 answer

  1. rdfdsd 0 Reputation points
    2023-10-08T08:48:38.0733333+00:00

    To save text-to-speech audio to a file and upload it to Google Cloud Storage (GCS) using Node.js, you can follow these general steps:

    1. Set up the Google Cloud SDK: Make sure the Google Cloud SDK is installed and configured on your machine, and create the GCS bucket where you want to store the audio file.

    2. Install the required Node.js packages: Install the packages for text-to-speech and GCS interaction. You can use the @google-cloud/text-to-speech library for text-to-speech and @google-cloud/storage for interacting with GCS.
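
    Applied to the Azure-based code in the question, the same idea (fetch the audio as bytes, then write those bytes to the bucket) might look like the sketch below. It is not a definitive implementation. The function name synthesizeToBucket and its parameter names are hypothetical; `http` is assumed to be an axios instance such as azureLongAudio, and `bucket` a Bucket object from @google-cloud/storage. The key assumption is that the garbled output comes from the response being decoded as text, which requesting an `arraybuffer` response avoids.

    ```javascript
    // Sketch only: `http` is assumed to be an axios instance and `bucket` a
    // @google-cloud/storage Bucket. Both are passed in, so no imports are needed here.
    async function synthesizeToBucket({ http, bucket, ssml, objectName }) {
        // Ask for raw bytes; otherwise the client may decode the body as text.
        const response = await http.post('/', ssml, {
            responseType: 'arraybuffer',
            headers: {
                'Content-Type': 'application/ssml+xml',
                'X-Microsoft-OutputFormat': 'riff-24khz-16bit-mono-pcm'
            }
        });

        const audioBuffer = Buffer.from(response.data);

        // The response in the question already starts with a RIFF/WAVE header,
        // so check for it and upload the bytes unchanged instead of re-wrapping them.
        if (audioBuffer.subarray(0, 4).toString('latin1') !== 'RIFF') {
            throw new Error('Response does not start with a RIFF header');
        }

        // File.save uploads the whole buffer and resolves when the upload is done.
        await bucket.file(objectName).save(audioBuffer, {
            contentType: 'audio/wav',
            resumable: false
        });

        return audioBuffer.length;
    }
    ```

    With the objects from the question, a call could look like synthesizeToBucket({ http: azureLongAudio, bucket: gcs.getBucket(), ssml: textContent, objectName: 'recordings/test-01.wav' }). Passing the client and bucket in as arguments keeps the function easy to try out with stand-in objects before pointing it at real services.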
