Thank you for reaching out to the Microsoft Q&A forum!
Microsoft's Neural TTS service uses neural network architectures to deliver high-quality, natural-sounding speech synthesis. Although the exact architecture used in the service may not be publicly disclosed, Microsoft uses models such as FastSpeech.
For academic purposes, you can reference the general approach taken by Microsoft, which focuses on end-to-end speech synthesis using parallelized models for fast and accurate text-to-speech conversion.
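To illustrate what "parallelized" means here, below is a minimal sketch of the length regulator idea described in the published FastSpeech paper: each phoneme's hidden state is repeated according to a predicted duration, so all output frames can then be generated in one pass rather than one at a time. This is only an illustration under my own assumptions, not the production Neural TTS implementation. The function name `length_regulate` and the toy inputs are hypothetical, and in a real model the durations come from a learned duration predictor.

```python
from typing import List, Sequence


def length_regulate(
    phoneme_states: Sequence[Sequence[float]],
    durations: Sequence[int],
) -> List[List[float]]:
    """Expand per-phoneme hidden states to per-frame states.

    Hypothetical helper for illustration: each phoneme state is repeated
    `durations[i]` times, so the output length equals sum(durations).
    """
    if len(phoneme_states) != len(durations):
        raise ValueError("need exactly one duration per phoneme state")

    frames: List[List[float]] = []
    for state, duration in zip(phoneme_states, durations):
        if duration < 0:
            raise ValueError("durations must be non-negative")
        # Copy the state once per frame so frames do not share one list.
        frames.extend(list(state) for _ in range(duration))
    return frames


# Toy example: three phonemes with 1-dimensional hidden states.
states = [[1.0], [2.0], [3.0]]
frames = length_regulate(states, [2, 0, 3])
print(len(frames))
```

Because the total number of frames is known up front, a decoder can process all of them in parallel, which is where the speed advantage over autoregressive models comes from.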
For more specific or unpublished information about the exact architecture used in the Neural TTS service, consider contacting Microsoft support or exploring their official research publications.
I hope this helps. If you have any further questions, please let us know.
If this answers your query, please click "Accept Answer" and "Yes" for "Was this answer helpful?".