I've been experimenting with the pronunciation assessment service. My use case involves scripted assessment in Canadian French (i.e., `fr-CA`). So far, I've had the most success with the configuration where `enableMiscue` is set to `false`, as I have found the results to be easier to interpret and more consistent than when the option is turned on. Ideally, I would like to get phoneme granularity, but from what I've read in the docs, that option is only available for `en-US`.
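For reference, here is roughly how I'm building the assessment configuration. This is a minimal sketch against the Speech-to-Text REST short-audio endpoint, where the assessment parameters go in a base64-encoded JSON `Pronunciation-Assessment` header; the reference text is just a placeholder, and I'm assuming the PascalCase field names from the docs:

```python
import base64
import json

def build_pa_header(reference_text: str) -> str:
    """Build the base64-encoded value for the Pronunciation-Assessment
    request header used by the Speech-to-Text REST (short audio) API."""
    params = {
        "ReferenceText": reference_text,
        "GradingSystem": "HundredMark",
        "Granularity": "Word",        # "Phoneme" is documented for en-US only
        "Dimension": "Comprehensive",
        "EnableMiscue": False,        # the setting that has worked best for me
    }
    return base64.b64encode(json.dumps(params).encode("utf-8")).decode("ascii")

# Placeholder reference text; the real requests go to the fr-CA endpoint, e.g.
# .../speech/recognition/conversation/cognitiveservices/v1?language=fr-CA
header_value = build_pa_header("Il fait froid dehors.")
print(header_value)
```

The same options map onto `PronunciationAssessmentConfig` if you use the Speech SDK instead of raw REST calls.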
I've also experimented with the `en-US` dialect, and I must say that I'm impressed with the results. Unfortunately, I can't say the same for `fr-CA`. It's not that the latter is "bad," but from what I've seen, it's not quite ready for prime time. While it performs well overall, it still fails in quite a few situations. At some point, I considered that there might be something wrong with my French, but I double-checked using AI-synthesized audio, and the service fails at the same places. There are numerous examples, and my goal is not to list them all here, but the service systematically fails to detect the word *dehors*. In other cases, the service fails for plural words (e.g., *parents* fails, while *parent* works fine).
That being said, I would really like to use the pronunciation assessment service for my use case, so I'm looking for workarounds. I read about the custom speech service, and I was wondering whether it would be possible to train custom speech models to be used in the context of pronunciation assessment. At the very least, it would be nice to have a way to report inaccuracies in the returned assessments so that the `fr-CA` model could be updated. I'm also open to any other recommendations that could make the `fr-CA` model work better for me.
Thank you very much for your time and support.
Francis