Audio + transcript failing to export from Editor to Training and Testing Dataset in Azure Speech Studio

Amy Highton 0 Reputation points

I'm having a really hard time uploading audio and transcripts in Speech Studio. I tried many times to upload audio + human-labelled transcripts and kept getting these errors:

"Your Custom Speech data failed to upload. Check the supported data formats and try again later." AND "Zero transcriptions could be parsed from the given input. Error: invalid input line format..."

I then tried to use the Editor tab instead and upload only the audio, and have Azure automatically transcribe it. After editing the results I tried to export it back to the Training and Testing Dataset tab and got this message:

"Error: normalized text is empty."

I'm sure I have met all of the file requirements, but nothing seems to be working - any ideas?

Azure AI Speech
Azure Machine Learning

1 answer

  1. santoshkc 745 Reputation points Microsoft Vendor

    Hi @Amy Highton ,

    Thank you for your response.

    I tried to reproduce the error that you received.

    To fix this kind of issue, follow the steps below:

    • Use good-quality audio. The uploaded file should be a compressed .zip file that contains the audio (.wav) files and the transcript (.txt) file.
    • In the transcript (.txt) file, each line should be the audio file name, followed by a tab (the separator the Custom Speech documentation specifies) and then the transcript text. For example, a line for the audio file audio1.wav starts with audio1.wav.
    • Upload your data in the Training and testing datasets tab. The Editor tab is used only to import data.
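As a rough illustration of the steps above, here is a minimal Python sketch that writes one tab-separated line per audio file, checks an existing transcript for lines that do not match that shape, and packs everything into a single .zip. The function names, the transcript file name `trans.txt`, and the UTF-8 encoding are assumptions made for this example, not something Speech Studio is confirmed here to require, so check them against the current data-format documentation.

```python
import io
import zipfile


def build_transcript(pairs):
    """Return transcript text with one 'name<TAB>text' line per audio file."""
    lines = []
    for name, text in pairs:
        # Collapse runs of whitespace so the text holds no stray tabs or newlines.
        text = " ".join(text.split())
        if not text:
            raise ValueError(f"empty transcript for {name}")
        lines.append(f"{name}\t{text}")
    return "\n".join(lines) + "\n"


def find_bad_lines(transcript):
    """Return 1-based numbers of lines that are not 'name<TAB>non-empty text'."""
    bad = []
    for number, line in enumerate(transcript.splitlines(), start=1):
        name, sep, text = line.partition("\t")
        if not (name.strip() and sep and text.strip()):
            bad.append(number)
    return bad


def build_zip(audio, transcript):
    """Pack audio files and the transcript at the top level of an in-memory zip."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, data in audio.items():
            archive.writestr(name, data)
        # 'trans.txt' is a hypothetical file name chosen for this sketch.
        archive.writestr("trans.txt", transcript.encode("utf-8"))
    return buffer.getvalue()
```

Running `find_bad_lines` on an existing transcript before uploading may help locate lines that use a plain space instead of a tab, or that have no text after the file name.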

    Hope this helps. Thank you!

    If this answers your question, please click Accept Answer and select Yes for "Was this answer helpful". If you have any further questions, let us know.