Change Sampling Rate to 44.1 kHz in Text to Speech via CLI (default is 16 kHz)

Ed Czelada 21 Reputation points

Is there a way to change the default audio format in the Azure Cognitive Services Text to Speech CLI from 16 kHz to something higher, e.g. 44.1 kHz?

Azure AI Speech
An Azure service that integrates speech processing into apps and services.
1,156 questions

Accepted answer
  1. romungi-MSFT 39,951 Reputation points Microsoft Employee

    @Ed Czelada I believe you can set the format of the output file using the --format option with the spx synthesize command.

    spx synthesize [...] --format FORMAT    
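
As a rough illustration of where the --format option sits in a full command, the sketch below assembles one in Python and prints it. Treat it as a sketch under assumptions: the helper name build_synthesize_command, the sample text, and the output file name are hypothetical, and the --text and --audio output options are assumed from typical spx usage rather than taken from this thread. Only --format comes from the answer above.

```python
import shlex


def build_synthesize_command(text, output_path, audio_format):
    """Assemble an spx synthesize invocation as an argument list.

    Assumption: --text and --audio output are the options used to supply
    the input text and the output file; --format is the option named in
    the answer above.
    """
    return [
        "spx", "synthesize",
        "--text", text,
        "--audio", "output", output_path,
        "--format", audio_format,
    ]


# Hypothetical example values; the format name is one of the wav formats
# reported by "spx help synthesize wav".
command = build_synthesize_command(
    "Testing synthesis at 48 kHz",
    "sample-48khz.wav",
    "riff-48khz-16bit-mono-pcm",
)

# Print a shell-quoted version that can be pasted into a terminal.
print(shlex.join(command))
```

The same list could be passed to subprocess.run(command, check=True) on a machine where spx is installed and configured with a key and region.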

    You can see which formats are supported by running help on the format option.

    spx help synthesize format  

    This should list the supported format families. You can then see the specific formats in each family by running help on it.

    spx help synthesize wav  
    spx help synthesize mp3  
    spx help synthesize ogg  
    spx help synthesize opus  
    spx help synthesize webm  
    spx help synthesize siren  
    spx help synthesize raw  

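    Set the required format in the command and check whether it works. For example, the default for wav is riff-16khz-16bit-mono-pcm. The other formats supported for wav are shown in the help output below. Note that this list has no 44.1 kHz entry; the higher sample rates it offers are 24 kHz and 48 kHz.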

    spx help synthesize wav  

    USAGE: spx synthesize [...] --format FORMAT

      WHERE: FORMAT is wav
         OR: FORMAT is riff-16khz-16bit-mono-pcm
         OR: FORMAT is riff-24khz-16bit-mono-pcm
         OR: FORMAT is riff-48khz-16bit-mono-pcm
         OR: FORMAT is riff-8khz-16bit-mono-pcm
         OR: FORMAT is riff-8khz-8bit-mono-alaw
         OR: FORMAT is riff-8khz-8bit-mono-mulaw
         OR: FORMAT is riff-16khz-16kbps-mono-siren

      NOTE: The default wav format is riff-16khz-16bit-mono-pcm
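
To check that a chosen format actually took effect, one option is to read the sample rate back from the header of the generated file. The sketch below uses Python's standard wave module; the function names are hypothetical, and it assumes the output is a PCM WAV file (the alaw, mulaw, and siren variants in the list above are not PCM and will likely be rejected by the wave module).

```python
import wave


def wav_sample_rate(path):
    """Return the sample rate in Hz recorded in a PCM WAV file's header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()


def matches_expected_rate(path, expected_hz):
    """Return True if the file's header reports the expected sample rate."""
    return wav_sample_rate(path) == expected_hz
```

For a file synthesized with riff-48khz-16bit-mono-pcm, calling matches_expected_rate with the output path and 48000 should return True if the format option was honored.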

    If an answer is helpful, please accept it or upvote it, which might help other community members reading this thread.
