How can I insert text in the text-to-speech tool that will not be spoken?

Mike Niedert 0 Reputation points
2024-03-12T22:54:43.3466667+00:00

If I wanted to include written prompts or headings which would help me keep the script organized, but I don't want the prompts or headings to be read...what can I do?

Azure AI Speech
An Azure service that integrates speech processing into apps and services.

1 answer

  1. navba-MSFT 16,935 Reputation points Microsoft Employee
    2024-03-13T03:18:26.44+00:00

    @Mike Niedert Welcome to the Microsoft Q&A Forum. Thank you for posting your query here!

    Azure Speech Service doesn’t directly support marking text in the text-to-speech tool as not to be spoken. As a workaround, you can structure your script so that the headings and prompts are kept separate from the spoken text and are skipped programmatically before the text is sent for synthesis.

    Are you using a Python script? If yes, here is a sample Python script that performs the text-to-speech operation and ignores the prompt headings:

    # Define your script with prompts or headings
    script = {
        "Section 1": {
            "prompt": "Introduction",
            "content": "Welcome to our presentation. Today we will be discussing..."
        },
        "Section 2": {
            "prompt": "Main Topic",
            "content": "The main topic of our discussion is..."
        }
    }
    
    # Function to convert text to speech
    def text_to_speech(text):
        # Your Azure Speech text-to-speech code here
        pass
    
    # Use only the 'content' for text-to-speech
    for section in script.values():
        text_to_speech(section["content"])
    
    

    More info about the text to speech REST API is here.
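If the script lives in a plain text file rather than a dictionary, the same idea can be applied by filtering lines before synthesis. The following is only a minimal sketch: the square-bracket convention for headings and the names `NOTE_LINE` and `spoken_text` are assumptions made for illustration, not part of Azure AI Speech.

```python
import re

# Hypothetical convention (an assumption, not an Azure feature): a line that is
# wrapped in square brackets is a heading or prompt for the writer only.
NOTE_LINE = re.compile(r"^\s*\[.*\]\s*$")


def spoken_text(script: str) -> str:
    """Return the script with bracketed note lines and blank lines removed."""
    kept = [
        line.strip()
        for line in script.splitlines()
        if line.strip() and not NOTE_LINE.match(line)
    ]
    return " ".join(kept)


script = """\
[Introduction]
Welcome to our presentation.
[Main Topic: slow down here]
The main topic of our discussion is speech synthesis.
"""

# Only the unbracketed lines remain; pass this string to your text-to-speech call.
print(spoken_text(script))
```

The headings stay in the file for organization, and only the string returned by `spoken_text` would be sent to the speech service.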

    Alternatively, you could try Speech Synthesis Markup Language (SSML), which gives you more control over the text-to-speech output. SSML tags can modify the pronunciation, volume, and speed of the speech output. With SSML, you would place only the text you want read aloud inside the "speak" element and leave the headings and prompts out of it.

    More info here and here.
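As a rough illustration of the SSML approach, the sketch below builds an SSML document from the "content" fields only, so the prompts never appear inside the "speak" element. The helper name `build_ssml` and the voice name `en-US-JennyNeural` are assumptions chosen for this example; substitute whichever voice your Speech resource uses.

```python
from xml.sax.saxutils import escape

# Same structure as the sample script in the answer: prompts plus spoken content.
script = {
    "Section 1": {
        "prompt": "Introduction",
        "content": "Welcome to our presentation. Today we will be discussing...",
    },
    "Section 2": {
        "prompt": "Main Topic",
        "content": "The main topic of our discussion is...",
    },
}


def build_ssml(sections, voice="en-US-JennyNeural", lang="en-US"):
    """Build an SSML string that contains only each section's 'content' text."""
    # escape() protects characters such as & and < that would break the XML.
    paragraphs = "".join(
        f"<p>{escape(section['content'])}</p>" for section in sections.values()
    )
    return (
        f'<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        f'xml:lang="{lang}">'
        f'<voice name="{voice}">{paragraphs}</voice>'
        "</speak>"
    )


ssml = build_ssml(script)
print(ssml)
```

The resulting string is what you would hand to the SSML entry point of your text-to-speech code instead of plain text.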

    Hope this helps. If you have any follow-up questions, please let me know. I would be happy to help.


    Please do not forget to "Accept the answer" and "up-vote" wherever the information provided helps you; this can be beneficial to other community members.
