Custom Speech Audio + Transcript

dddd 61 Reputation points
2022-04-09T20:02:23.543+00:00

191517-image.png

I see that an audio snippet cannot exceed 60 seconds.
I uploaded data that is longer than 60 seconds, and the custom speech model still appeared to train.
What is happening with the rest of my data?
Is it being truncated?
Would it be better to stay under the 60 seconds?

Thank you.

Azure AI Speech
An Azure service that integrates speech processing into apps and services.
Azure AI services
A group of Azure services, SDKs, and APIs designed to make apps more intelligent, engaging, and discoverable.

1 answer

  1. romungi-MSFT 49,101 Reputation points Microsoft Employee Moderator
    2022-04-11T10:18:00.563+00:00

    @dddd I answered the same question earlier this year, about how the trained model behaves when the audio is longer than 60 seconds. Please refer to the complete thread and conversation here.

    To summarize, using files longer than the 60-second limit mentioned will not compromise the training. The guidance is based on training the acoustic model, which needs only up to 60 seconds of data.
    The audio files help train the model on the audio quality and background conditions you will probably have in your future files. So with audio longer than 60 seconds, the rest of the data is not ignored. The transcript text files should be accurate, so that the model is trained correctly for audio of any length.
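
If you still want to see which of your files exceed the 60-second guidance before uploading, a minimal sketch along these lines may help. It assumes the training audio is stored as PCM WAV files in a single local folder and uses only Python's standard `wave` module. The function names and the `MAX_RECOMMENDED_SECONDS` constant are illustrative, not part of any Azure SDK.

```python
import wave
from pathlib import Path

# Assumption: 60 seconds is the guidance figure quoted in the question.
MAX_RECOMMENDED_SECONDS = 60.0


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def files_over_limit(folder, limit=MAX_RECOMMENDED_SECONDS):
    """Return (file name, duration) pairs for WAV files longer than `limit`."""
    long_files = []
    for path in sorted(Path(folder).glob("*.wav")):
        duration = wav_duration_seconds(path)
        if duration > limit:
            long_files.append((path.name, round(duration, 2)))
    return long_files
```

Calling `files_over_limit("training_audio")` (a hypothetical folder name) lists the files that run past the guidance. As noted above, longer files are not ignored, so this is only a check against the documented figure, not a requirement to trim anything.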


