Text to Speech numbers Normalization rules

Question

Hi all!

I have a sentence to be converted into speech:

Insgesamt wurde laut Landesamt im Nordosten bisher bei 45646 Menschen eine Corona-Infektion nachgewiesen, 43609 Menschen gelten als genesen.

When Azure TTS reads this text in German, the first number is read as a normal number (Fünfundvierzigtausendsechshundertsechsundvierzig), and the second one as a sequence of digits (vier-drei-sechs-null-neun).

What are the rules for number normalization in general? Why is the first number read normally while the second is not?

EDIT: I could reproduce the same behavior in English:

According to the state office in the northeast, a total of 45646 people have so far been found to have a corona infection, 43609 are considered to have recovered.

Answer

Hi, I'm not able to reproduce this issue for TTS. When using the demo sample page, I'm getting 'dreiundvierzigtausendsechshundertneun' for '43609'. If you're still getting inconsistent results, please share a sample of your request so we can investigate further. Thanks!

--- *Kindly Accept Answer if the information helps. Thanks.*

Answer

Hi, following up. How TTS reads digits depends on context: the engine makes a "smart" decision based on the surrounding text, and that decision will not be correct 100% of the time. However, you can fix such cases with the SSML say-as element.
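
As a rough illustration of the say-as approach, here is a minimal Python sketch that wraps every run of digits in a `say-as` element with `interpret-as="cardinal"`, so the number is read as a whole number rather than digit by digit. The helper names `wrap_numbers` and `build_ssml` are made up for this example, and the voice `de-DE-KatjaNeural` is only an assumed choice; substitute whichever voice you actually use. The sketch only builds the SSML string; sending it to the service is left to your existing request code.

```python
import re
from xml.sax.saxutils import escape


def wrap_numbers(text: str) -> str:
    """Escape XML special characters, then wrap each digit run in say-as."""
    escaped = escape(text)  # handles &, < and > so the result stays valid XML
    return re.sub(
        r"[0-9]+",
        lambda m: f'<say-as interpret-as="cardinal">{m.group(0)}</say-as>',
        escaped,
    )


def build_ssml(text: str, lang: str = "de-DE",
               voice: str = "de-DE-KatjaNeural") -> str:
    """Build a complete SSML document (voice name is an assumption)."""
    return (
        f'<speak version="1.0" '
        f'xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="{lang}">'
        f'<voice name="{voice}">{wrap_numbers(text)}</voice>'
        f'</speak>'
    )


if __name__ == "__main__":
    sentence = (
        "Insgesamt wurde laut Landesamt im Nordosten bisher bei 45646 Menschen "
        "eine Corona-Infektion nachgewiesen, 43609 Menschen gelten als genesen."
    )
    print(build_ssml(sentence))
```

Note that this wraps every digit run indiscriminately, so phone numbers, years, or decimals in the same text would need their own handling (for example a different `interpret-as` value).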

--- *Kindly Accept Answer if the information helps. Thanks.*
