Word COM-Addin document sentence detection is wrong.

Ajay Pandirkar 1 Reputation point

In our Word add-in, we use document.Sentences from the Microsoft.Office.Interop.Word namespace to get the sentences in a document. The sentence detection is inaccurate: it splits a single complete sentence into several sentences. For some paragraphs it also drops the part of the sentence that comes before a delimiter (',' or '.'), which makes the detected sentences wrong. Example paragraph:


        On the downside, a lack of trust between female employees and leaders as the outcome of intra-gender micro-violence can lead to increased stress and isolation of both female managers and employees (O’Neil et al., 2018, p. 337). Furthermore, according to Derks et al.    

Paragraph Split Results:         

"sentence 1": " ,"

"sentence 2": "2018, p."

"sentence 3": "337)."

"sentence 4": "Furthermore, according to Derks et al."


Is there another way to get correct sentence detection?

Office Development

1 answer

  1. Oskar Shon 866 Reputation points MVP

    First, loop over the Paragraphs in your document.
    Then, inside each paragraph, loop or split into an array on ". " (a dot followed by a space), because a single dot does not reliably mark the end of a sentence.
    If you want to be more certain, also check that the next sentence starts with a capital letter, a symbol, or a number.
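
    The steps above could be sketched roughly as follows. This is only an illustration in Python, not tested against Word itself: it assumes the text of each paragraph (for example from Paragraph.Range.Text) has already been read into plain strings. The names split_sentences, split_document and ABBREVIATIONS are hypothetical, and the abbreviation list is an extra assumption that is not part of the steps above; it is there because "p. 337" would otherwise still be split, since a number follows the dot and space.

```python
import re

# Hypothetical abbreviation list (an assumption added for illustration):
# a candidate boundary is ignored when the word before it is one of these.
ABBREVIATIONS = {"p.", "pp.", "al.", "e.g.", "i.e."}

# A dot, then whitespace, then a character that may start a sentence
# (ASCII capital letter, digit, opening bracket, or quote).
BOUNDARY = re.compile(r"\.\s+(?=[A-Z0-9(\[\"'])")


def split_sentences(paragraph: str) -> list[str]:
    """Split one paragraph into sentences using the dot-and-space heuristic."""
    sentences = []
    start = 0
    for match in BOUNDARY.finditer(paragraph):
        end = match.start() + 1  # keep the dot with its sentence
        candidate = paragraph[start:end].strip()
        if not candidate:
            continue
        # Last whitespace-separated word, without leading brackets.
        last_word = candidate.rsplit(None, 1)[-1].lstrip("([").lower()
        if last_word in ABBREVIATIONS:
            continue  # e.g. "p. 337" is not treated as a boundary
        sentences.append(candidate)
        start = match.end()
    tail = paragraph[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences


def split_document(paragraphs: list[str]) -> list[str]:
    """Apply split_sentences to every paragraph, in document order."""
    result = []
    for paragraph in paragraphs:
        result.extend(split_sentences(paragraph))
    return result
```

    With the paragraph from the question, this sketch gives two sentences, one ending in "p. 337)." and one starting with "Furthermore". It is still a heuristic: a sentence that really ends in a listed abbreviation, such as "et al." followed by a new sentence, would not be split, so the list would need tuning for your documents.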

