How to fix conversion error line breaks in Microsoft Word?

Question

TJS 40

I have a batch of Microsoft Word documents that were converted from other formats, including WordPerfect and PDF, several years ago. Some of these have many short lines in a row. For example, the text might look like

Bananas

Oranges

Apples

After each of these lines, I would expect to see a paragraph mark (hard return), which would make them parse as separate lines when converted to HTML. Instead, Word has converted them as ending in spaces with a sharp right indent, so when the document is converted to HTML, the text becomes:

Bananas Oranges Apples

which is not workable.

Is there a way that I can identify only spaces that are at the ends of these short lines, and replace them with paragraph marks? In the shorter files I can do it by hand, but some of these are hundreds of pages long. The only other thing I can think to do is to convert them to PDF and back to Word, which seems to fix the line breaks, but I'm not sure if that will cause other issues that are less visible.

Any thoughts appreciated!

John Korchok 231.4K Reputation points Volunteer Moderator

2026-01-08T22:46:08.4+00:00

If the bot's reply does not help you, please click on the backwards P icon on the Home tab to show the hidden formatting marks. Then post a screen shot to show us what you are seeing.
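
As a complement to showing the formatting marks, a small macro can report which break characters the document actually contains. This is only a rough diagnostic sketch: `CountChar` and `ReportBreakCharacters` are hypothetical names, and the output goes to the Immediate window of the VBA editor (Ctrl+G).

```vba
' Hypothetical helper: counts how often a single character occurs in a string.
Private Function CountChar(ByVal txt As String, ByVal ch As String) As Long
    CountChar = Len(txt) - Len(Replace(txt, ch, ""))
End Function

' Prints a tally of the characters that usually end or separate lines in Word.
Sub ReportBreakCharacters()
    Dim txt As String
    txt = ActiveDocument.Content.Text

    Debug.Print "Paragraph marks (Chr 13):     " & CountChar(txt, vbCr)
    Debug.Print "Manual line breaks (Chr 11):  " & CountChar(txt, Chr(11))
    Debug.Print "Tabs (Chr 9):                 " & CountChar(txt, vbTab)
    Debug.Print "Nonbreaking spaces (ChrW 160): " & CountChar(txt, ChrW(160))
End Sub
```

If the short lines really end in ordinary spaces, the manual line break count should be at or near zero, and the fix has to target spaces, not break characters.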

Answer accepted by question author

Answer 1

Thanks for posting. See if the following works for you. I asked Copilot:

"Word" VBA regex find space followed by uppercase character and add return between space and character

and it gave me the following VBA:

Sub AddReturnBetweenSpaceAndUppercase()
    Dim regEx As Object
    Dim matches As Object
    Dim match As Object
    Dim docRange As Range
    
    ' Create RegExp object
    Set regEx = CreateObject("VBScript.RegExp")
    regEx.Pattern = " ([A-Z])"   ' Space followed by uppercase letter
    regEx.Global = True
    regEx.IgnoreCase = False     ' Case-sensitive (only uppercase)
    
    ' Work on the whole document
    Set docRange = ActiveDocument.Content
    
    ' Run regex replace: replace space+uppercase with paragraph mark + uppercase.
    ' VBScript.RegExp refers to a captured group in the replacement string as $1, not \1.
    docRange.Text = regEx.Replace(docRange.Text, vbCr & "$1")
    
    MsgBox "Done! Paragraph breaks inserted.", vbInformation
End Sub

Here are the notes accompanying the code:

How It Works

Pattern:

" ([A-Z])" → matches a space followed by an uppercase letter (A–Z).

The uppercase letter is captured in group 1 ([A-Z]).

Replacement:

vbCr & "$1" → inserts a paragraph break (vbCr) before the captured uppercase letter.

Scope:

ActiveDocument.Content → applies to the entire document.

Important Notes

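This replaces all matches in the document at once, including spaces before capitalized words in the middle of a line (for example, the space in "Microsoft Word"), so try it on a copy of the document first.
If you want to preserve formatting, use Word's Find with wildcards instead of replacing .Text directly, because assigning to .Text removes formatting.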
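
Because the replacement touches every match at once, a dry run can show how many places would change before anything is modified. The following is a minimal sketch, not part of the original reply: `PreviewSpaceBeforeUppercase` is a hypothetical name, and the 10-match limit and 40-character context window are arbitrary choices.

```vba
' Dry run: lists matches of " ([A-Z])" without changing the document.
Sub PreviewSpaceBeforeUppercase()
    Dim regEx As Object
    Dim matches As Object
    Dim txt As String
    Dim snippet As String
    Dim startPos As Long
    Dim i As Long

    Set regEx = CreateObject("VBScript.RegExp")
    regEx.Pattern = " ([A-Z])"   ' Same pattern as the macro above
    regEx.Global = True
    regEx.IgnoreCase = False

    txt = ActiveDocument.Content.Text
    Set matches = regEx.Execute(txt)

    Debug.Print matches.Count & " match(es) found"

    ' Show up to 10 matches with some surrounding text (arbitrary limits).
    For i = 0 To matches.Count - 1
        If i >= 10 Then Exit For
        startPos = matches(i).FirstIndex - 20   ' FirstIndex is zero-based
        If startPos < 0 Then startPos = 0
        snippet = Mid$(txt, startPos + 1, 40)   ' Mid$ is one-based
        Debug.Print Replace(snippet, vbCr, "<P>")
    Next i
End Sub
```

The results appear in the Immediate window (Ctrl+G in the VBA editor). If many of the listed snippets are ordinary sentences, the pattern is too broad for that document.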

Formatting-Safe Alternative (Word Wildcards)

Sub AddReturnFormattingSafe()
    With Selection.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Text = " ([A-Z])"
        .Replacement.Text = "^p\1"
        .Forward = True
        .Wrap = wdFindContinue
        .Format = False
        .MatchWildcards = True
        .Execute Replace:=wdReplaceAll
    End With
    MsgBox "Done! Paragraph breaks inserted.", vbInformation
End Sub

Explanation:

" ([A-Z])" with wildcards matches space + uppercase.
^p\1 inserts a paragraph break before the uppercase letter.
This method keeps formatting intact.
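
Since the wildcard search also matches spaces before capitalized words inside normal sentences, it may be safer to try it on a selected passage first. The variant below is a sketch under that assumption: `AddReturnInSelectionOnly` is a hypothetical name, and it relies on searching `Selection.Range` with `wdFindStop` so that the replacement should stay inside the selected text.

```vba
' Runs the same wildcard replacement, but only inside the current selection.
Sub AddReturnInSelectionOnly()
    Dim rng As Range

    ' Do nothing if no text is selected (the selection is just an insertion point).
    If Selection.Type = wdSelectionIP Then
        MsgBox "Select some text first.", vbExclamation
        Exit Sub
    End If

    Set rng = Selection.Range
    With rng.Find
        .ClearFormatting
        .Replacement.ClearFormatting
        .Text = " ([A-Z])"
        .Replacement.Text = "^p\1"
        .Forward = True
        .Wrap = wdFindStop          ' Do not continue past the end of the range
        .Format = False
        .MatchWildcards = True
        .Execute Replace:=wdReplaceAll
    End With
End Sub
```

Select a block of the short lines, run the macro, and check the result with formatting marks shown; Undo reverts it if the result is not what you wanted.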

Answer 2

Conversion from other programs is almost always imperfect.

It may look OK, but the underlying structure will be very different from what it would be if the document had been created in Word. Documents converted from PDF (or really any other format) to Word can be tough to edit because the conversion process never maps formatting one-to-one under the hood. As a result, a converted document seldom uses Word's features well. One example is multiple section breaks used to change margins, where in Word you would simply change the paragraph indent (see "Margins and Indents in Word"). Another is that text formatting in Word is best done with Styles, and a converted document will not use them; it will all be direct formatting. That can make a huge difference in how easy the document is to edit (see "The Importance of Styles in Microsoft Word").

With PDF files, if possible, find the file from which the PDF was created and edit that file, using the program that created it. Then, if you need it in Word format and it is not, convert it directly to Word. This cuts out one conversion step and makes for fewer editing problems.

When I really need the document in Word format and intend to do much editing, I create a new Word file and paste the content into it as plain text. Then I format it to match the original using Styles for the formatting as much as possible. This takes time; for me, it is worth it and saves a lot of frustration.

Answer 3

TJS 40

User's image
