Get Duration and/or Size of Neural-TTS File Sent By Azure

Anonymous
2022-08-19T06:31:27.777+00:00

Hello, I receive the neural TTS file from Azure, but I do not want to store it in the system right away. I first want to know the duration and/or the size of the file so I can decide where to store it. Can I get this data from the response Azure sends, or in any other way?

Azure AI services
A group of Azure services, SDKs, and APIs designed to make apps more intelligent, engaging, and discoverable.

1 answer

  1. romungi-MSFT 41,961 Reputation points Microsoft Employee
    2022-08-19T11:50:43.897+00:00

    @Anonymous This is possible if you use the Azure TTS long audio API.

    https://<region>.customvoice.api.speech.microsoft.com/api/texttospeech/v3.0/longaudiosynthesis  
    

    This is an asynchronous operation, so when the request is submitted you will receive a 202 Accepted status code along with a URL that you can use to look up the result of the processing.

    When you send a GET request to this URL and the processing was successful, the response contains details of the request, including the duration of the audio. A sample response:

    response.status_code: 200  
    {  
      "models": [  
        {  
          "voiceName": "en-US-AriaNeural"  
        }  
      ],  
      "properties": {  
        "outputFormat": "riff-16khz-16bit-mono-pcm",  
        "concatenateResult": false,  
        "totalDuration": "PT5M57.252S",  
        "billableCharacterCount": 3048  
      },  
      ...  
    }  
    

    The totalDuration field above is the duration of the audio. Its value uses the ISO 8601 duration format PnDTnHnMn.nS, so PT5M57.252S means 5 minutes and 57.252 seconds.
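
    As a rough illustration of parsing that format, here is a minimal Python sketch. It only handles the day, hour, minute, and second components named above, and the helper name parse_duration is made up for this example, not part of any Azure SDK.

```python
import re
from datetime import timedelta

# Covers only the PnDTnHnMn.nS shape (days, hours, minutes, seconds).
# Years, months, and weeks are deliberately not supported in this sketch.
_DURATION_RE = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?"
    r"(?:(?P<minutes>\d+)M)?"
    r"(?:(?P<seconds>\d+(?:\.\d+)?)S)?)?$"
)


def parse_duration(value: str) -> timedelta:
    """Convert a duration string such as 'PT5M57.252S' to a timedelta."""
    match = _DURATION_RE.match(value)
    # Reject non-matching strings and empty durations such as 'P' or 'PT'.
    if match is None or not any(match.groupdict().values()):
        raise ValueError(f"unsupported duration: {value!r}")
    # Keep only the components that are present and convert them to numbers.
    parts = {k: float(v) for k, v in match.groupdict().items() if v is not None}
    return timedelta(**parts)


if __name__ == "__main__":
    duration = parse_duration("PT5M57.252S")
    print(duration.total_seconds())
```

    Calling total_seconds() on the result gives a plain number that is easy to compare against a threshold when deciding where to store the file.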

    To get the files that were actually created, take the URL from the initial response and append /files to it:

    https://<region>.customvoice.api.speech.microsoft.com/api/texttospeech/v3.0/longaudiosynthesis/<GUID>/files  
    

    The response to a GET request on this URL contains a size property for each file, which should tell you the size of the file in KB.

    Example:

    response.status_code: 200  
    {  
      "values": [  
        {  
          "name": "2779f2aa-4e21-4d13-8afb-6b3104d6661a.txt",  
          "kind": "LongAudioSynthesisScript",  
          "properties": {  
            "size": 4200  
          },  
          ...  
        }  
      ]  
    }  
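
    To tie the two steps together, here is a hedged Python sketch that builds the /files URL and reads the sizes out of an already-parsed response. It assumes the response has the shape shown in the example above. The helper names (files_url, file_sizes, pick_storage), the storage labels, and the threshold are all hypothetical, and no HTTP call is made here.

```python
def files_url(synthesis_url: str) -> str:
    """Append /files to the URL returned by the initial request."""
    return synthesis_url.rstrip("/") + "/files"


def file_sizes(files_response: dict) -> dict:
    """Map each file name to its reported size.

    Assumes the 'values' / 'properties' / 'size' layout from the sample
    response; entries without a size are skipped.
    """
    sizes = {}
    for entry in files_response.get("values", []):
        size = entry.get("properties", {}).get("size")
        if size is not None:
            sizes[entry["name"]] = size
    return sizes


def pick_storage(size: int, threshold: int = 10_000) -> str:
    """Hypothetical decision rule: larger files go to 'bulk' storage."""
    return "bulk" if size > threshold else "fast"


if __name__ == "__main__":
    # Parsed JSON body, trimmed to the fields shown in the sample above.
    sample = {
        "values": [
            {
                "name": "2779f2aa-4e21-4d13-8afb-6b3104d6661a.txt",
                "kind": "LongAudioSynthesisScript",
                "properties": {"size": 4200},
            }
        ]
    }
    for name, size in file_sizes(sample).items():
        print(name, size, pick_storage(size))
```

    Keeping the parsing separate from the HTTP request means the same functions work with whichever client you use to call the API.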
    

    Note that these details are not available if you use the short audio API.
