spx translate not storing results

User Sense 1 Reputation point
2022-05-20T14:15:56.687+00:00

Command

spx translate --once --file audio.mp3 --format mp3 --source de-DE --target nl-NL  --output batch json --output batch file test.json

Result on screen

Connection CONNECTED...
TRANSLATING into 'nl': OK (from 'Okay')
TRANSLATING into 'nl': Oké, ik (from 'Okay, ich')
TRANSLATING into 'nl': Oké, ik ben met (from 'Okay, ich bin bei')
TRANSLATING into 'nl': Oké, ik ben met lampen (from 'Okay, ich bin bei Lampen')
TRANSLATING into 'nl': Oké, ik ben bij lampen en gloeit (from 'Okay, ich bin bei Lampen und leucht')
TRANSLATING into 'nl': Oké, ik hou van lampen en leuchter.cn (from 'Okay, ich bin bei Lampen und leuchter.cn')
TRANSLATING into 'nl': Oké, ik ben met lampen en leuchter.cn me (from 'Okay, ich bin bei Lampen und leuchter.cn ich')
TRANSLATED into 'nl': Oké, ik ben met lampen en leuchter.cn ik zou doen. (from 'Okay, ich bin bei Lampen und leuchter.cn, ich würde.')

SESSION STOPPED: 2f088491dd914ec9bbf27a594c35a8f1

Result in json file

{
  "AudioFileResults": [
    {
      "AudioFileName": "audio.mp3",
      "AudioFileUrl": "audio.mp3",
      "AudioLengthInSeconds": 7.3,
      "CombinedResults": [
        {
          "ChannelNumber": "0",
          "Lexical": "okay ich bin bei lampen und leuchter cn ich würde",
          "ITN": "Okay, ich bin bei Lampen und leuchter.cn, ich würde.",
          "MaskedITN": "Okay, ich bin bei Lampen und leuchter.cn, ich würde.",
          "Display": "Okay, ich bin bei Lampen und leuchter.cn, ich würde."
        }
      ],
      "SegmentResults": [
        {
          "RecognitionStatus": "Success",
          "ChannelNumber": "0",
          "SpeakerId": null,
          "Offset": 23700000,
          "Duration": 53000000,
          "OffsetInSeconds": 2.37,
          "DurationInSeconds": 5.3,
          "NBest": []
        }
      ]
    }
  ]
}

The translated text is not present in the test.json file. What am I missing?
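
A quick way to confirm what the file does and does not contain is to search it for any key that mentions a translation. The sketch below is only a minimal check under stated assumptions: `BATCH_JSON` is a trimmed copy of the test.json shown above, and `find_keys` is a hypothetical helper written for this example, not part of the Speech CLI.

```python
import json

# Trimmed copy of the test.json shown above (only the fields needed here).
BATCH_JSON = """
{
  "AudioFileResults": [
    {
      "AudioFileName": "audio.mp3",
      "CombinedResults": [
        {"ChannelNumber": "0",
         "Display": "Okay, ich bin bei Lampen und leuchter.cn, ich würde."}
      ],
      "SegmentResults": [
        {"RecognitionStatus": "Success",
         "Offset": 23700000,
         "Duration": 53000000,
         "NBest": []}
      ]
    }
  ]
}
"""


def find_keys(node, needle, path=""):
    """Return the paths of all keys whose name contains `needle` (case-insensitive)."""
    hits = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}" if path else key
            if needle.lower() in key.lower():
                hits.append(child)
            hits.extend(find_keys(value, needle, child))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            hits.extend(find_keys(item, needle, f"{path}[{index}]"))
    return hits


data = json.loads(BATCH_JSON)
translation_keys = find_keys(data, "translat")
segment = data["AudioFileResults"][0]["SegmentResults"][0]

print("keys mentioning a translation:", translation_keys)
print("NBest entries:", len(segment["NBest"]))
# Offset and Duration are in 100-nanosecond ticks, matching the *InSeconds fields.
print("segment start (s):", segment["Offset"] / 10_000_000)
```

For the file above, the search finds no translation keys and the NBest list is empty, so the file holds only the German recognition result and its timing.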

Azure AI Speech
An Azure service that integrates speech processing into apps and services.

3 answers

  1. romungi-MSFT 48,911 Reputation points Microsoft Employee Moderator
    2022-05-24T05:46:00.18+00:00

    @User Sense I can confirm that this behavior is expected. The --output option "batch json FILENAME" used in the command does not support translation; it writes exactly the JSON schema of the batch REST API. The command did, however, create a .tsv file in the same directory, and the translation information is available in that file. To get the translated text in a JSON file, use this command format:

    spx translate --once --microphone --source en-US --target de --output all file type json --output all file my.output3.json  
    

    Basically, use all instead of batch.
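
Applied to the file-based command from the question, that suggestion could look like the sketch below. It only assembles and prints the argument list. Every flag is copied from commands in this thread, but combining --file input with the "all" output options is an assumption that has not been verified here, and `build_translate_args` is a hypothetical helper, not part of the Speech CLI.

```python
import shlex


def build_translate_args(audio_file, source, target, output_json):
    """Build an spx argument list; every flag is taken from commands in this thread."""
    return [
        "spx", "translate", "--once",
        "--file", audio_file, "--format", "mp3",
        "--source", source, "--target", target,
        # "all" instead of "batch", as suggested in the answer above.
        "--output", "all", "file", "type", "json",
        "--output", "all", "file", output_json,
    ]


args = build_translate_args("audio.mp3", "de-DE", "nl-NL", "test.json")
print(shlex.join(args))
# To actually run it (requires the Speech CLI on PATH):
# import subprocess; subprocess.run(args, check=True)
```

Keeping the arguments in a list rather than one string avoids shell-quoting problems if the file names contain spaces.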

    More details about the output flags for printing specific information are available in the tool's built-in help:

    spx help translate output  
    spx help translate output all  
    spx help translate output each  
    spx help translate output examples  
    

    If an answer is helpful, please accept it or upvote it, which might help other community members reading this thread.


  2. User Sense 1 Reputation point
    2022-05-24T14:09:07.047+00:00

    @romungi-MSFT We are planning to provide our customers with a translated transcription based on the audio file, in combination with the original transcription in the language of the audio file (spx recognize).

    Just the translation without a time reference isn't very helpful.

    It is odd that the translated text cannot be stored in a (JSON) file. Neither the batch approach nor the file approach contains the translation, even though translating is the whole purpose of the translate command. Even the tsv file is missing the translation.

    /root/.dotnet/tools/spx translate --once --file audio.mp3 --format mp3 --source de-DE --target nl-NL  
    

    On screen

    SESSION STARTED: ffa365b781f549df829df745562518be  
      
    Connection CONNECTED...  
    TRANSLATING into 'nl': OK (from 'Okay')  
    ...  
    TRANSLATING into 'nl': Oké, ik ben met lampen en leuchter.cn me (from 'Okay, ich bin bei Lampen und leuchter.cn ich')  
    TRANSLATING into 'nl': Oké, ik ben met lampen en leuchter.cn ik zou doen (from 'Okay, ich bin bei Lampen und leuchter.cn, ich würde')  
    TRANSLATED into 'nl': Oké, ik ben met lampen en leuchter.cn ik zou doen. (from 'Okay, ich bin bei Lampen und leuchter.cn, ich würde.')  
      
    SESSION STOPPED: ffa365b781f549df829df745562518be  
    

    tsv file

    audio.input.id  recognizer.session.started.sessionid    recognizer.recognized.result.text  
    audio   ffa365b781f549df829df745562518be        Okay, ich bin bei Lampen und leuchter.cn, ich würde.  
    

    Only the original transcription, no translation. Even the commands that should work do not work as expected.
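
The same observation can be checked programmatically by reading the header row of the .tsv file. This is a minimal sketch, assuming the columns are tab-separated as the extension suggests; `TSV_TEXT` is a copy of the content shown above, and the rule that a translation column would mention "translat" in its name is an assumption made for this example.

```python
import csv
import io

# Copy of the .tsv content shown above (assumed tab-separated, one header row).
TSV_TEXT = (
    "audio.input.id\trecognizer.session.started.sessionid\trecognizer.recognized.result.text\n"
    "audio\tffa365b781f549df829df745562518be\tOkay, ich bin bei Lampen und leuchter.cn, ich würde.\n"
)

reader = csv.DictReader(io.StringIO(TSV_TEXT), delimiter="\t")
columns = reader.fieldnames
rows = list(reader)

# Any column carrying translated text would be expected to mention "translat".
translation_columns = [name for name in columns if "translat" in name.lower()]

print("columns:", columns)
print("translation columns:", translation_columns)
print("recognized text:", rows[0]["recognizer.recognized.result.text"])
```

With the three columns shown above, the list of translation columns comes back empty, which matches what the file shows.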


  3. User Sense 1 Reputation point
    2022-05-24T14:24:29.96+00:00

    I guess I will stick to spx translate --once --source de-DE --target nl --file audio.mp3 --format any --output vtt file caption.vtt for now.

    This results in a valid caption.nl.vtt file.


    Another issue with this command is that

    spx translate --once --source de-DE --target nl-NL --file audio.mp3 --format any --output vtt file caption.vtt
    

    results in the file caption.nl-NL.vtt, but that file is empty. The translation is shown on screen but not exported :-(
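
When comparing the two variants, it can help to check automatically whether a generated caption file actually contains cues. The sketch below counts WebVTT cue timing lines, which always contain `-->`. The helper names and `SAMPLE_VTT` are hypothetical, and the sample timings are illustrative rather than taken from the real caption file.

```python
from pathlib import Path


def count_cues(vtt_text):
    """Count WebVTT cues by counting cue timing lines, which contain '-->'."""
    return sum(1 for line in vtt_text.splitlines() if "-->" in line)


def has_captions(path):
    """Return True if the file exists and holds at least one cue."""
    file = Path(path)
    return file.is_file() and count_cues(file.read_text(encoding="utf-8")) > 0


# Hypothetical sample; the timings are illustrative, not taken from the real file.
SAMPLE_VTT = """WEBVTT

00:00:02.370 --> 00:00:07.670
Oké, ik ben met lampen en leuchter.cn ik zou doen.
"""

print(count_cues(SAMPLE_VTT))  # one cue in the sample
print(count_cues(""))          # an empty file has no cues
for name in ("caption.nl.vtt", "caption.nl-NL.vtt"):
    print(name, has_captions(name))
```

Run in the directory where spx wrote its output, the loop reports which of the two caption files contains at least one cue.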

