Hello @Alex Del Giudice,
In pronunciation assessment, we usually have a "reference/target text". After recognition, a sequence-matching algorithm (edit distance) and other steps are applied to compare the recognized words with the reference text and compute the insertion and deletion errors.
Only the words that the sequence-matching algorithm tags as matched are then assigned a "mispronunciation" or "None" tag.
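To illustrate the matching step described above, here is a minimal Python sketch of a word-level edit-distance alignment. It is not the service's actual implementation: the function name `align_words`, the label strings, and the case-insensitive comparison are all assumptions made for this example. It only shows how an alignment can separate inserted and deleted words from matched ones, which are the words that would go on to receive a "mispronunciation" or "None" tag.

```python
def align_words(reference, recognized):
    """Align two word lists with Levenshtein edit distance.

    Hypothetical helper for illustration only. Returns a list of
    (reference_word, recognized_word, label) tuples.
    """
    ref = [w.lower() for w in reference]
    hyp = [w.lower() for w in recognized]
    n, m = len(ref), len(hyp)

    # dist[i][j] = edit distance between ref[:i] and hyp[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i
    for j in range(m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # reference word was not spoken
                dist[i][j - 1] + 1,         # extra word was spoken
                dist[i - 1][j - 1] + cost,  # same word, or a different word
            )

    # Walk back from the end of the table to recover one optimal alignment.
    ops = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            if dist[i][j] == dist[i - 1][j - 1] + cost:
                label = "matched" if cost == 0 else "substitution"
                ops.append((reference[i - 1], recognized[j - 1], label))
                i, j = i - 1, j - 1
                continue
        if i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            ops.append((reference[i - 1], None, "deletion"))
            i -= 1
        else:
            ops.append((None, recognized[j - 1], "insertion"))
            j -= 1
    ops.reverse()
    return ops


if __name__ == "__main__":
    reference = "the quick brown fox".split()
    recognized = "the very quick brown fox".split()
    for ref_word, hyp_word, label in align_words(reference, recognized):
        print(ref_word, hyp_word, label)
```

In this sketch only the tuples labeled "matched" pair a reference word with the same recognized word. How the service treats a word that aligns with a different word (labeled "substitution" here) is not covered in this answer, so that label is purely illustrative.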
I hope this helps. We appreciate your time and patience throughout this issue. Regards,
Vasavi
- Please accept the answer and vote 'Yes' if you found it helpful, to support the community. Thanks.