Recognizing multiple audio files and obtaining each transcription result

Question

Wayne Lee 51

Hi,

I have several audio files and I want to get the transcription of each one. What is the best way to call Azure spx to do this?
Is there any example for the Speech CLI for this?

Thanks!

Answer 1

GiftA-MSFT 11,176

You can try the following:

Option 1: spx recognize --files C:\path\*.wav --output batch file C:\path\"batch-{id}.json"
Option 2 (using command prompt): for %i in (C:\path\*.wav) do spx recognize --files %i --output batch file %i.results.json
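
If you would rather script the per-file loop than type it at the command prompt, Option 2 can be sketched in Python as below. This is only an illustration under assumptions: `build_command` and `transcribe_folder` are hypothetical helper names, `spx` is assumed to be on the PATH, and the flags are copied from the single-file command quoted in this thread rather than verified against the Speech CLI reference. Unlike Option 2, it names the output `<name>.json` instead of `<name>.wav.results.json`.

```python
import subprocess
from pathlib import Path


def build_command(wav_path: Path) -> list:
    """Build one spx call that writes <name>.json next to the audio file.

    The flags mirror the single-file command quoted in this thread;
    they are an assumption here, not a verified CLI reference.
    """
    json_path = wav_path.with_suffix(".json")
    return [
        "spx", "recognize",
        "--file", str(wav_path),
        "--output", "batch", "file", str(json_path),
    ]


def transcribe_folder(folder: str) -> None:
    """Run spx once per .wav file in the folder (hypothetical helper)."""
    for wav_path in sorted(Path(folder).glob("*.wav")):
        # check=True stops the loop if spx exits with an error.
        subprocess.run(build_command(wav_path), check=True)
```

Calling `transcribe_folder(r"C:\path")` would then start one spx process per audio file, just as the `for %i` loop does.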

--please don't forget to Accept Answer if the reply is helpful. Thanks.--

Wayne Lee 51 Reputation points

2022-04-20T23:29:34.87+00:00

Thank you very much for investigating this. I look forward to your confirmation.

GiftA-MSFT 11,176 Reputation points

2022-04-20T23:36:10.647+00:00

Please check updated response above.

Answer 2

GiftA-MSFT 11,176

Hi, thanks for reaching out. The QuickStart document provides steps for recognizing and converting speech to text. Here's the CLI command: spx recognize --file whatstheweatherlike.wav. If you want to transcribe a large amount of audio in storage, consider using the batch transcription API.

--please don't forget to Accept Answer if the reply is helpful. Thanks.--

Answer 3

I need the final .tsv file and also the .json files that save the partial results.

I used the following command to get the file MyAudioFileName.wav transcribed and the results saved in MyAudioFileName.json.
spx recognize --file MyAudioFileName.wav --output batch file MyAudioFileName.json

I have a lot of audio files, and I saw the batch command:
spx recognize --files C:\your_wav_file_dir\*.wav --output file C:\output_dir\speech_output.tsv --threads 10

For each audio file, how can I get its corresponding .json file in this batch mode? I could not find this in the documentation for batch processing.
Speech CLI batch operations: https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/spx-batch-operations

Could you please tell me how to get the .json files for all the audio files in the batch mode?

Thanks.

Answer 4

Thanks for your confirmation. That's very helpful!
I have been using the 2nd option. I think it may differ from the batch processing mode, because for each file we call spx separately, and the connection is closed once that file is done. If batch mode also opens and closes a connection for each file, then it is effectively the same as the 2nd option. If not, batch mode may have the advantage of keeping one connection open across multiple files. I am not sure; this is just a guess.
I will try the 1st option and see if it works as I expect.
Many thanks for your help!

Answer 5

Wayne Lee 51

I tested the 1st option. After the recognition of all the audio files in the directory, it did not create any .json file.
The output.xxxx.tsv contains the results of all the audio files in the directory, and each line in it is in the format: audioFileName transcription_results
Thanks.
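
One possible workaround, if only the combined .tsv is produced, is to split it into one small .json per audio file afterwards. The sketch below assumes each line is exactly `audioFileName<TAB>transcription_results`, as described above, with no header row. `split_tsv` is a hypothetical helper, and the JSON it writes contains only the file name and final text, not the detailed partial results that spx itself would save.

```python
import csv
import json
from pathlib import Path, PureWindowsPath


def split_tsv(tsv_path: str, out_dir: str) -> list:
    """Write one <audioFileName>.json per TSV row; return the paths written."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    with open(tsv_path, newline="", encoding="utf-8") as handle:
        # QUOTE_NONE keeps quotation marks in transcripts as literal text.
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            if len(row) < 2:
                continue  # skip blank or malformed lines
            name, text = row[0], row[1]
            # PureWindowsPath accepts both "\" and "/" separators in the name.
            target = out / (PureWindowsPath(name).stem + ".json")
            target.write_text(
                json.dumps({"file": name, "text": text}), encoding="utf-8"
            )
            written.append(target)
    return written
```

For example, `split_tsv(r"C:\output_dir\speech_output.tsv", r"C:\output_dir\json")` would write one file per row into the `json` folder.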
