Read the paragraphs of a Word file as quickly as possible

zequion 151 Reputation points
2021-10-31T04:16:00.333+00:00

I have a C# function that reads paragraphs from .doc/.docx files using the standard Microsoft approach. The problem is that reading a 20 MB file takes 1 hour, and reading a 100 MB file takes all day, during which I can't use the PC for anything else.

C#
Office Development
Word Management

2 answers

  1. Charles Kenyon 2,561 Reputation points
    2021-11-01T18:05:26.56+00:00

    The basic problem is that Word does not know what a page is.
    https://wordmvp.com/Mac/PagesInWord.html

    It does know what a paragraph is (anything followed by a paragraph mark).
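
The paragraph-mark definition above can be sketched in code. This is a minimal illustration, not the asker's function: it assumes the whole document text has already been fetched as one string (for example with a single `doc.Content.Text` call through Microsoft.Office.Interop.Word, where paragraph marks appear as `'\r'`), and it then splits that string in memory. The class name `ParagraphSplitter` is hypothetical, and whether the asker uses Interop at all is an assumption.

```csharp
using System;
using System.Collections.Generic;

public static class ParagraphSplitter
{
    // Splits text in which every paragraph ends with a paragraph mark ('\r').
    // Assumption: the text came from one bulk read of the document, e.g.
    //   string text = doc.Content.Text;   // Interop, requires Word installed
    public static List<string> Split(string text)
    {
        var paragraphs = new List<string>();
        int start = 0;
        for (int i = 0; i < text.Length; i++)
        {
            if (text[i] != '\r') continue;
            // Everything since the previous mark is one paragraph (may be empty).
            paragraphs.Add(text.Substring(start, i - start));
            start = i + 1;
        }
        // Keep trailing text that is not followed by a paragraph mark.
        if (start < text.Length) paragraphs.Add(text.Substring(start));
        return paragraphs;
    }

    public static void Main()
    {
        // Sample text standing in for a document's content.
        List<string> result = Split("First\rSecond\r\rFourth\r");
        for (int i = 0; i < result.Count; i++)
            Console.WriteLine(i + ": [" + result[i] + "]");
    }
}
```

Fetching the text once and splitting it locally avoids one cross-process call per paragraph, which may account for much of the running time on large files. Text taken from table cells can carry extra cell-end characters, so real documents may need additional cleanup.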

    1 person found this answer helpful.

  2. Stefan Blom 2,061 Reputation points MVP
    2021-11-05T17:56:57.667+00:00

    What does the code do with the content it retrieves from the Word document? Perhaps there is a simpler way than going through paragraph by paragraph.
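
One possible simpler route, sketched under assumptions: a .docx file is a ZIP package whose main text lives in the `word/document.xml` part, where each paragraph is a `w:p` element and its text sits in `w:t` elements. The sketch below streams that part with `ZipArchive` and `XmlReader` without starting Word. It does not handle the older binary .doc format, and it ignores headers, footnotes, and paragraphs nested inside text boxes. The class name `DocxParagraphReader` is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Text;
using System.Xml;

public static class DocxParagraphReader
{
    private const string W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Lazily yields the text of each w:p element in word/document.xml.
    public static IEnumerable<string> ReadParagraphs(Stream docx)
    {
        using (var zip = new ZipArchive(docx, ZipArchiveMode.Read, leaveOpen: true))
        {
            ZipArchiveEntry entry = zip.GetEntry("word/document.xml");
            if (entry == null) yield break;
            using (Stream part = entry.Open())
            using (XmlReader xml = XmlReader.Create(part))
            {
                var sb = new StringBuilder();
                bool inText = false;
                while (xml.Read())
                {
                    bool isW = xml.NamespaceURI == W;
                    switch (xml.NodeType)
                    {
                        case XmlNodeType.Element:
                            if (isW && xml.LocalName == "t" && !xml.IsEmptyElement) inText = true;
                            else if (isW && xml.LocalName == "tab") sb.Append('\t');
                            else if (isW && xml.LocalName == "p" && xml.IsEmptyElement) yield return "";
                            break;
                        case XmlNodeType.Text:
                        case XmlNodeType.Whitespace:
                        case XmlNodeType.SignificantWhitespace:
                            if (inText) sb.Append(xml.Value);
                            break;
                        case XmlNodeType.EndElement:
                            if (isW && xml.LocalName == "t") inText = false;
                            else if (isW && xml.LocalName == "p")
                            {
                                yield return sb.ToString();
                                sb.Clear();
                            }
                            break;
                    }
                }
            }
        }
    }

    public static void Main()
    {
        // Build a tiny in-memory package holding only the part this reader needs
        // (not a complete .docx that Word would open).
        string body = "<w:document xmlns:w=\"" + W + "\"><w:body>" +
            "<w:p><w:r><w:t>Hello </w:t></w:r><w:r><w:t>world</w:t></w:r></w:p>" +
            "<w:p/>" +
            "<w:p><w:r><w:t>Second</w:t></w:r></w:p>" +
            "</w:body></w:document>";
        var ms = new MemoryStream();
        using (var zip = new ZipArchive(ms, ZipArchiveMode.Create, leaveOpen: true))
        using (var writer = new StreamWriter(zip.CreateEntry("word/document.xml").Open()))
            writer.Write(body);
        ms.Position = 0;
        foreach (string p in ReadParagraphs(ms))
            Console.WriteLine("[" + p + "]");
    }
}
```

For a file on disk, the same method could be called with `File.OpenRead(path)`. Because the reader is forward-only and yields one paragraph at a time, memory use should stay roughly flat even for large documents, though actual timings would need to be measured on the asker's files.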
