Recognizing multiple audio files and obtaining each transcription result

Wayne Lee 51 Reputation points
2022-04-19T18:24:24.423+00:00

Hi,

I have several audio files and I want to get the transcription of each one. What is the best way to call Azure spx to implement this?
Is there any example for the Speech CLI for this?

Thanks!

Azure AI Speech
An Azure service that integrates speech processing into apps and services.

Accepted answer
  1. GiftA-MSFT 11,176 Reputation points
    2022-04-20T19:25:10.453+00:00

    You can try the following:

    Option 1: spx recognize --files C:\path\*.wav --output batch file C:\path\"batch-{id}.json"
    Option 2 (using command prompt): for %i in (C:\path\*.wav) do spx recognize --files %i --output batch file %i.results.json

    --please don't forget to Accept Answer if the reply is helpful. Thanks.--
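For anyone who prefers a script to the command-prompt loop in Option 2, here is a minimal Python sketch of the same per-file approach. It assumes `spx` is on the PATH and already configured with a key and region. The helper names `build_spx_command` and `transcribe_all` are hypothetical, and the `--file ... --output batch file ...` arguments are copied from the commands quoted in this thread rather than checked against the CLI reference. Each .wav file gets a .json file with the same base name.

```python
import subprocess
from pathlib import Path


def build_spx_command(wav_path: Path) -> list[str]:
    """Build the spx call for one audio file (hypothetical helper).

    The JSON output is written next to the audio file, e.g.
    MyAudioFileName.wav -> MyAudioFileName.json.
    """
    json_path = wav_path.with_suffix(".json")
    return [
        "spx", "recognize",
        "--file", str(wav_path),
        # Output arguments taken from the commands quoted in this thread.
        "--output", "batch", "file", str(json_path),
    ]


def transcribe_all(audio_dir: str, runner=subprocess.run) -> list[list[str]]:
    """Call spx once per .wav file in audio_dir and return the commands used.

    `runner` defaults to subprocess.run; pass a stand-in to do a dry run
    without calling the Speech service.
    """
    commands = []
    for wav_path in sorted(Path(audio_dir).glob("*.wav")):
        command = build_spx_command(wav_path)
        runner(command, check=True)  # raises if spx exits with an error
        commands.append(command)
    return commands
```

Calling `transcribe_all(r"C:\path")` would run spx once for every .wav file in that folder. Like Option 2, this starts a separate spx process per file, so it does not use the CLI's `--threads` batch mode.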

4 additional answers

  1. GiftA-MSFT 11,176 Reputation points
    2022-04-19T19:45:55.59+00:00

    Hi, thanks for reaching out. The QuickStart document provides steps for recognizing and converting speech to text. Here's the CLI command: spx recognize --file whatstheweatherlike.wav. If you want to transcribe a large amount of audio in storage, consider using the batch transcription API.

    --please don't forget to Accept Answer if the reply is helpful. Thanks.--

  2. Wayne Lee 51 Reputation points
    2022-04-19T22:32:42.883+00:00

    I need the final .tsv file and also the .json file that saves the partial results.

    I used the following command to get the file MyAudioFileName.wav transcribed and the results saved in MyAudioFileName.json.
    spx recognize --file MyAudioFileName.wav --output batch file MyAudioFileName.json

    I have a lot of audio files, and I saw the batch command:
    spx recognize --files C:\your_wav_file_dir\*.wav --output file C:\output_dir\speech_output.tsv --threads 10

    For each audio file, how can I get its corresponding .json file in this batch mode? I could not find this in the documentation for batch processing.
    Speech CLI batch operations: https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/spx-batch-operations

    Could you please tell me how to get the .json files for all the audio files in the batch mode?

    Thanks.

  3. Wayne Lee 51 Reputation points
    2022-04-21T01:00:11.71+00:00

    Thanks for your confirmation. That's very helpful!
    I have been using the 2nd option. I think it may differ from the batch processing mode, because we call spx once per file and the connection is closed after each file completes. If batch mode also opens and closes a connection for each file, then it is exactly the same as the 2nd option. If not, batch mode may have the advantage of keeping the connection open across multiple files. I am not sure, just guessing.
    I will try the 1st option and see if it works as I expected.
    Many thanks for your help!

  4. Wayne Lee 51 Reputation points
    2022-04-21T04:36:06.267+00:00

    I tested the 1st option. After the recognition of all the audio files in the directory, it did not create any .json file.
    The output.xxxx.tsv contains the results of all the audio files in the directory, and each line in it is in the format: audioFileName transcription_results
    Thanks.
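For working with the combined .tsv described above, here is a small Python sketch that maps each audio file name to its transcription. It assumes each line is `audioFileName<TAB>transcription`, as reported in this post. The function name `read_transcriptions` is hypothetical, and a header row, if the file has one, is not treated specially. This only recovers the final text per file, not the per-file .json with partial results.

```python
def read_transcriptions(tsv_text: str) -> dict[str, str]:
    """Map audio file name -> transcription from tab-separated text.

    Assumes one "audioFileName<TAB>transcription" pair per line; lines
    without a tab (for example blank lines) are skipped.
    """
    results = {}
    for line in tsv_text.splitlines():
        name, sep, text = line.partition("\t")
        if not sep:
            continue  # no tab on this line, nothing to split
        results[name] = text
    return results


# Example with made-up content in the reported format.
sample = "a.wav\thello world\n\nb.wav\twhat's the weather like\n"
transcripts = read_transcriptions(sample)
```

To use it on a real file, read the .tsv with `open(path, encoding="utf-8").read()` and pass the text to `read_transcriptions`.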
