[Pronunciation Assessment] Is there a way to improve the results using a custom model?

Question

Francis W 0

I've been experimenting with the pronunciation assessment service. My use case involves scripted assessment in Canadian French (i.e., fr-CA). So far, I've had the most success with the configuration where enableMiscue is set to false, as I have found the results to be easier to interpret and more consistent than when the option is turned on. Ideally, I would like to get phoneme granularity, but from what I've read in the docs, that option is only available for en-US.
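
For readers following along, here is a minimal sketch of the configuration described above, expressed as the base64-encoded JSON that the Speech REST API for short audio accepts in its `Pronunciation-Assessment` header. The helper name `build_pronunciation_header` and the sample sentence are hypothetical. The parameter names follow the documented request parameters as best I understand them, and whether `EnableMiscue` should be sent as a JSON boolean or as a string is an assumption worth checking against the current docs.

```python
import base64
import json


def build_pronunciation_header(reference_text, enable_miscue=False, granularity="Word"):
    """Return a base64-encoded JSON string for the Pronunciation-Assessment header.

    Hypothetical helper: the field names mirror the documented request
    parameters, but verify them against the current REST API reference.
    """
    params = {
        "ReferenceText": reference_text,   # the script the speaker reads
        "GradingSystem": "HundredMark",    # scores on a 0-100 scale
        "Granularity": granularity,        # "Word" here; phoneme-level output is locale-limited
        "Dimension": "Comprehensive",
        "EnableMiscue": enable_miscue,     # assumption: sent as a JSON boolean
    }
    # ensure_ascii=False keeps accented French characters readable before encoding.
    payload = json.dumps(params, ensure_ascii=False).encode("utf-8")
    return base64.b64encode(payload).decode("ascii")


# Illustrative sentence only; the locale (fr-CA) is passed separately
# as the `language` query parameter of the recognition request.
header_value = build_pronunciation_header("Je vais dehors avec mes parents.")
```

The resulting string would be sent as the value of the `Pronunciation-Assessment` request header alongside the audio.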

I've also experimented with the en-US dialect, and I must say that I'm impressed with the results. Unfortunately, I can't say the same for fr-CA. It's not that the latter is "bad", but from what I've seen, it's not quite ready for "prime time." While it performs well overall, it fails in a lot of situations. At some point, I considered that there might be something wrong with my French, but I double-checked using AI-synthesized audio, and the service fails at the same places. There are numerous examples, and my goal is not to list them all here, but the service systematically fails to detect the word dehors. In other cases, the service fails for plural words (e.g., parents fails, while parent works fine).
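
One way to document systematic failures like these is to tally which words repeatedly score low across several assessment runs. The sketch below assumes each result has been parsed from the detailed JSON into a dict shaped like `{"NBest": [{"Words": [{"Word": ..., "PronunciationAssessment": {"AccuracyScore": ...}}]}]}`. The function name `low_scoring_words`, the threshold of 60, and the sample payloads are all made up for illustration and are not real service output.

```python
from collections import Counter


def low_scoring_words(results, threshold=60.0):
    """Count how often each word scores below `threshold` across result dicts.

    Assumes the detailed-result shape described above; entries that lack
    the expected keys are skipped rather than raising.
    """
    counts = Counter()
    for result in results:
        nbest = result.get("NBest") or []
        if not nbest:
            continue
        # Only the top hypothesis is inspected.
        for word in nbest[0].get("Words", []):
            score = word.get("PronunciationAssessment", {}).get("AccuracyScore")
            if score is not None and score < threshold:
                counts[word["Word"].lower()] += 1
    return counts


# Made-up payloads for illustration only (not real service output).
sample_runs = [
    {"NBest": [{"Words": [
        {"Word": "dehors", "PronunciationAssessment": {"AccuracyScore": 12.0}},
        {"Word": "parent", "PronunciationAssessment": {"AccuracyScore": 95.0}},
    ]}]},
    {"NBest": [{"Words": [
        {"Word": "Dehors", "PronunciationAssessment": {"AccuracyScore": 8.0}},
        {"Word": "parents", "PronunciationAssessment": {"AccuracyScore": 30.0}},
    ]}]},
]

flagged = low_scoring_words(sample_runs)
```

Running the same script through both human and synthesized audio and comparing the tallies helps separate speaker issues from model issues, and gives concrete examples to attach to a feedback report.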

That being said, I would really like to use the pronunciation assessment service for my use case, so I'm looking for workarounds. I read about the custom speech service, and I was wondering whether it would be possible to train custom speech models to be used in the context of pronunciation assessment. At the very least, it would be nice to have a way to report inaccuracies in the returned assessments so that the fr-CA model could be updated. I'm also open to any other recommendations that could make the fr-CA model work better for me.

Thank you very much for your time and support.

Francis

SriLakshmi C 6,250 Reputation points Microsoft External Staff Moderator

2024-10-04T09:54:22.7666667+00:00

Hello Francis W,

We are reaching out to the internal team to get more information related to your query and will get back to you as soon as we have an update.

Thank you!
SriLakshmi C 6,250 Reputation points Microsoft External Staff Moderator

2024-10-05T05:56:41.4+00:00

Hello Francis W, Greetings!

To improve the results from Azure's pronunciation assessment service for Canadian French (fr-CA), consider creating custom speech models using recordings from native speakers. This allows you to address specific pronunciation inconsistencies, such as with words like "dehors" and plural forms.

Experimenting with the pronunciation assessment API's parameters can also help. Since disabling the enableMiscue option yielded clearer results for you, continue testing other configuration settings, such as sensitivity levels.

Keep an eye on Azure's documentation and community forums for improvements or new features that could help. You could also consider other Azure services, such as Text-to-Speech or Language, to build a more comprehensive solution for your pronunciation assessment needs.

Please let us know if you have any further questions. Thank you!
SriLakshmi C 6,250 Reputation points Microsoft External Staff Moderator

2024-10-07T05:33:09.9666667+00:00

Hello Francis W,

Following up to see if the above response was helpful. Thank you!
Francis W 0 Reputation points

2024-10-07T11:34:44.12+00:00

Hello @SriLakshmi C ,

Thank you for your response.

You write:

To improve the results from Azure's pronunciation assessment service for Canadian French (fr-CA), consider creating custom speech models using recordings from native speakers.

By that, do you mean that there is a way to create a custom model and to later use that model in the context of the pronunciation assessment service?

Thank you for clarifying.

Francis
Francis W 0 Reputation points

2024-10-09T11:32:37.03+00:00

@SriLakshmi C ,

Thank you for clarifying.

Thank you very much for your time and support! :)

Francis
Francis W 0 Reputation points

2024-10-09T11:34:59.9833333+00:00

@SriLakshmi C (Quadrant Resource LLC) ,

Understood! Thanks again for your time and support! :)

Francis
SriLakshmi C 6,250 Reputation points Microsoft External Staff Moderator

2024-10-14T10:25:28.27+00:00

Hello Francis W,

I'm glad to hear that my response was helpful to you, and thanks for sharing the information, which might benefit other community members reading this thread as a solution. Because the Microsoft Q&A community has a policy that "The question author cannot accept their own answer. They can only accept answers by others", I'll convert the previous response to an answer in case you'd like to accept it. This will help other users who may have a similar query find the solution more easily. If you have any further questions or concerns, please don't hesitate to ask. We're always here to help.

Thank you!
SriLakshmi C 6,250 Reputation points Microsoft External Staff Moderator

2024-10-15T09:21:53.73+00:00

Hi Francis W,

Did you get a chance to check the response provided below?

Thank you!

Answer 1

Hello Francis W,

Unfortunately, you cannot currently create a custom speech model and integrate it directly into the pronunciation assessment service for Canadian French (fr-CA). The pronunciation assessment service does not support custom speech models for improving the accuracy of phoneme-level or word-level assessments. The custom speech service is useful for improving speech-to-text transcription across dialects and languages, but it does not extend to fine-tuning the pronunciation assessment feature itself.

However, you can use the custom speech service to build models that better recognize and transcribe Canadian French for your specific use case. While this won't directly improve the pronunciation assessment, it could give you better transcription and recognition accuracy elsewhere in your language processing pipeline.

Thank you!
