@Anonymous This is possible if you use the Azure TTS long audio API.
https://<region>.customvoice.api.speech.microsoft.com/api/texttospeech/v3.0/longaudiosynthesis
This is an asynchronous operation, so when the request is submitted you will receive a 202 Accepted
status code along with a URL that you can use to look up the result of the processing.
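As a rough sketch of that lookup step, the snippet below polls the returned URL until the job finishes. It assumes the response body carries a top-level "status" field with values such as "Succeeded" and "Failed", and that the key is sent in the Ocp-Apim-Subscription-Key header; the function names, the polling interval, and the attempt limit are hypothetical choices for illustration.

```python
import json
import time
import urllib.request


def fetch_json(url, subscription_key):
    """GET a URL with the Speech resource key and decode the JSON body."""
    request = urllib.request.Request(
        url, headers={"Ocp-Apim-Subscription-Key": subscription_key}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_synthesis(status_url, fetch, interval_seconds=10.0, max_attempts=60):
    """Poll status_url until the job reports a terminal status.

    `fetch` is any callable that takes a URL and returns the decoded JSON dict.
    The "status" field and its "Succeeded"/"Failed" values are assumptions.
    """
    for attempt in range(max_attempts):
        body = fetch(status_url)
        status = body.get("status")
        if status == "Succeeded":
            return body
        if status == "Failed":
            raise RuntimeError(f"synthesis failed: {body}")
        # Sleep between attempts, but not after the last one.
        if attempt < max_attempts - 1:
            time.sleep(interval_seconds)
    raise TimeoutError(f"no terminal status after {max_attempts} attempts")
```

In practice you would call it as wait_for_synthesis(url, lambda u: fetch_json(u, key)), where url is the one returned with the initial response and key is your own resource key.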
When you run a GET request against the URL returned by the initial request, the response contains details of the processed files, along with the duration of the audio, provided the processing succeeded. A sample response is shown below:
response.status_code: 200
{
  "models": [
    {
      "voiceName": "en-US-AriaNeural"
    }
  ],
  "properties": {
    "outputFormat": "riff-16khz-16bit-mono-pcm",
    "concatenateResult": false,
    "totalDuration": "PT5M57.252S",
    "billableCharacterCount": 3048
  },
  ...
}
The totalDuration field in the sample response is the duration of the audio file. Its value uses the ISO-8601 duration format PnDTnHnMn.nS. Please see this page for an example of how to parse this format.
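For reference, here is a minimal sketch of parsing such a value with only the Python standard library. It handles just the PnDTnHnMn.nS shape mentioned above (days, hours, minutes, fractional seconds), not the full ISO-8601 duration grammar, and the name parse_duration is a hypothetical helper.

```python
import re
from datetime import timedelta

# Matches only the PnDTnHnMn.nS shape; years, months and weeks are not handled.
_DURATION = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?"
    r"(?:(?P<seconds>\d+(?:\.\d+)?)S)?)?$"
)


def parse_duration(value):
    """Convert a duration string such as 'PT5M57.252S' into a timedelta."""
    match = _DURATION.match(value)
    if match is None:
        raise ValueError(f"unsupported duration: {value!r}")
    # Keep only the components that are actually present in the string.
    parts = {
        name: float(text)
        for name, text in match.groupdict().items()
        if text is not None
    }
    if not parts:
        raise ValueError(f"empty duration: {value!r}")
    return timedelta(**parts)


duration = parse_duration("PT5M57.252S")
print(duration.total_seconds())
```

For the sample value, 5 minutes and 57.252 seconds works out to about 357.25 seconds.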
To get the files that were actually created, take the URL from the initial response and append /files:
https://<region>.customvoice.api.speech.microsoft.com/api/texttospeech/v3.0/longaudiosynthesis/<GUID>/files
The GET response from the above URL contains a size property for each file, which should tell you the size of the file in KB.
For example:
response.status_code: 200
{
  "values": [
    {
      "name": "2779f2aa-4e21-4d13-8afb-6b3104d6661a.txt",
      "kind": "LongAudioSynthesisScript",
      "properties": {
        "size": 4200
      },
      ...
    }
  ]
}
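To pull the sizes out of a response like the one above, something along these lines should work. It is only a sketch: files_url and list_file_sizes are hypothetical helper names, and the code assumes every entry under "values" follows the name/kind/properties.size layout of the sample.

```python
def files_url(job_url):
    """Append /files to the job URL returned by the initial request."""
    return job_url.rstrip("/") + "/files"


def list_file_sizes(files_response):
    """Return (name, kind, size) for each entry under "values".

    Missing fields come back as None rather than raising, since the
    sample response elides part of each entry.
    """
    return [
        (
            item.get("name"),
            item.get("kind"),
            item.get("properties", {}).get("size"),
        )
        for item in files_response.get("values", [])
    ]


sample = {
    "values": [
        {
            "name": "2779f2aa-4e21-4d13-8afb-6b3104d6661a.txt",
            "kind": "LongAudioSynthesisScript",
            "properties": {"size": 4200},
        }
    ]
}
for name, kind, size in list_file_sizes(sample):
    print(name, kind, size)
```

The dictionary passed in is the decoded JSON body of the GET request against the /files URL.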
These details will not be available if you use the short audio API though.