Hi @Nathalie Froissart, thanks for using the Microsoft Q&A platform.
Yes, you can use Azure Cognitive Services Speech to read text entered at the command prompt and configure the voice to emphasize the first word of a sentence or to raise the pitch after a "?" token. For information on how to input the text, see this speech synthesis sample code on GitHub.
Scenario 1: Emphasize a word
For example, you can use the following SSML code to emphasize the first word in a sentence:
<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xmlns:mstts="https://www.w3.org/2001/mstts" xml:lang="en-US">
<voice name="en-US-GuyNeural">
<emphasis level="strong">Hello</emphasis> Good Morning.
</voice>
</speak>
In this example, the word "Hello" is emphasized because it is wrapped in the emphasis tag with level="strong". To emphasize the first word of any sentence, wrap that word in the same tag.
Note that the emphasis tag is supported only by certain neural voices. You can find the list of supported voices and more information about the emphasis tag in the following documentation: https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/speech-synthesis-markup-voice#adjust-emphasis
Scenario 2: Higher pitch after "?"
You can use the Speech Synthesis Markup Language (SSML) to control the prosody of the speech output. SSML tags allow you to specify the pitch, rate, volume, and pronunciation of the speech.
For example, you can use the following SSML code to raise the pitch of the text that follows a "?" token:
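<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">
<voice name="en-US-GuyNeural">
What is your name? <prosody pitch="high">Please tell me.</prosody>
</voice>
</speak>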
In your code, you can use an if-else statement to choose which SSML to build for the input text. For example, if the input text contains a "?" token, use the SSML with the higher pitch; otherwise, use the SSML with the emphasized first word.
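As a rough illustration of that if-else logic, here is a minimal Python sketch that only builds the SSML string. The helper name build_ssml and the constants SSML_OPEN and SSML_CLOSE are hypothetical names chosen for this example and are not part of the Speech SDK. The sketch also assumes the en-US-GuyNeural voice from the examples above and splits on the first "?" only.

```python
from xml.sax.saxutils import escape

# Wrapper elements shared by both scenarios (voice name taken from the examples above).
SSML_OPEN = (
    '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
    'xml:lang="en-US"><voice name="en-US-GuyNeural">'
)
SSML_CLOSE = "</voice></speak>"


def build_ssml(text: str) -> str:
    """Build SSML for `text`. Hypothetical helper, not an SDK function."""
    text = text.strip()
    if "?" in text:
        # Scenario 2: split at the first "?" and raise the pitch of what follows.
        head, tail = text.split("?", 1)
        body = escape(head) + "?"
        if tail.strip():
            body += ' <prosody pitch="high">' + escape(tail.strip()) + "</prosody>"
    else:
        # Scenario 1: emphasize the first word and leave the rest unchanged.
        first, _, rest = text.partition(" ")
        body = '<emphasis level="strong">' + escape(first) + "</emphasis>"
        if rest:
            body += " " + escape(rest)
    return SSML_OPEN + body + SSML_CLOSE
```

To try it from the command prompt, you could call print(build_ssml(input("Text to speak: "))). The returned string could then be passed to the Speech SDK's SSML synthesis call, for example speak_ssml_async in the Python SDK. The input is XML-escaped so that characters such as "&" or "<" do not break the markup.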
I hope this helps. Let me know if you are looking for any additional information.
Regards,
Vasavi