How to import an HTML file into MS Word and have Word recognize the H tags?

Anonymous
2019-02-02T01:31:48+00:00

When you open a simple HTML file in MS Word (e.g., one with H1 and H2 tags), Word will import it.

But it doesn't seem to fully recognize the H tags as MS Word headings.

Does anyone know how to fix this?

Answer accepted by question author
  1. Doug Robbins - MVP - Office Apps and Services 322.1K Reputation points MVP Volunteer Moderator
    2019-02-04T02:33:00+00:00

    If you open the HTML file in Google Docs and then, instead of copying and pasting into Word, go to File > Download as, select Microsoft Word (.docx), and open the downloaded document in Word, the appropriate Heading # styles will have been applied.

    It takes fewer steps, however, if you simply paste the text with tags into Word and run the macro that I provided.


14 additional answers

  1. Doug Robbins - MVP - Office Apps and Services 322.1K Reputation points MVP Volunteer Moderator
    2019-02-02T03:14:42+00:00

    What is the format of the tags?

    If that is known, it is possible to create a macro that will go through the document and delete the tags, applying the appropriate style to each paragraph to which the tags have been applied.

  2. Anonymous
    2019-02-02T03:39:51+00:00

    The tags are just the ones in a raw HTML text file.

    e.g. <H1>Chapter one</H1>

    What's weird is that MS Word sees the <H> tags and correctly arranges the headings in the Navigation pane on the left.

    But when you actually click in the heading text, the style is always set to Normal.

    Major bummer.

    See image:

  3. Doug Robbins - MVP - Office Apps and Services 322.1K Reputation points MVP Volunteer Moderator
    2019-02-02T06:51:26+00:00

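    Running a macro containing the following code will remove the tags and apply the appropriate styles. It assumes each heading is a single paragraph that starts with a tag such as <H1> and ends with the matching </H1>:

    Sub ApplyHeadingStylesFromTags()   ' the macro name is arbitrary
        Dim rng As Range
        Dim lngStyle As Long
        Dim i As Long
        ' Start searching from the top of the document.
        Selection.HomeKey wdStory
        With Selection.Find
            ' Wildcard search for "H" followed by one or more digits, e.g. the "H1" in "<H1>".
            Do While .Execute(FindText:="H[0-9]{1,}", Forward:=True, _
                    MatchWildcards:=True, Wrap:=wdFindStop, MatchCase:=True) = True
                Set rng = Selection.Range
                ' The digits after the "H" give the heading level.
                lngStyle = Val(Mid(rng.Text, 2))
                ' Step back over the "<" and work on the whole paragraph.
                Selection.MoveStart wdCharacter, -1
                Selection.Collapse wdCollapseStart
                Set rng = Selection.Paragraphs(1).Range
                ' Delete the 4-character opening tag, e.g. "<H1>".
                For i = 4 To 1 Step -1
                    rng.Characters(i).Delete
                Next i
                ' Delete the 5-character closing tag, e.g. "</H1>", just before the paragraph mark.
                For i = rng.Characters.Count - 1 To rng.Characters.Count - 5 Step -1
                    rng.Characters(i).Delete
                Next i
                rng.Style = "Heading " & lngStyle
            Loop
        End With
    End Sub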

  4. Anonymous
    2019-02-02T13:37:21+00:00

    Hi, thanks.

    This script searches for H characters in the text.

    But check out my image above. MS Word already removes the heading tags from the HTML (after you properly import the HTML file into MS Word).

    I have placed my raw html online here http://snippi.com/s/3evb07n

    The goal is to turn this raw text into an MS Word file, with Word's Heading 1 and Heading 2 styles assigned to the H1 and H2 elements in the HTML.
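
    Since the imported headings appear in the Navigation pane while their style reads Normal, one possible explanation is that Word gave those paragraphs an outline level without applying a Heading style. That is an assumption, not something confirmed in this thread. Under it, a minimal sketch of a macro that maps each paragraph's outline level to the matching built-in style could look like the following. The macro name is hypothetical, and the "Heading N" style names assume an English-language Word installation.

    ```vba
    Sub ApplyHeadingStylesFromOutlineLevel()   ' hypothetical name
        Dim para As Paragraph
        Dim lvl As Long

        For Each para In ActiveDocument.Paragraphs
            ' OutlineLevel is 1 to 9 for heading levels and 10 for body text.
            lvl = para.OutlineLevel
            If lvl >= wdOutlineLevel1 And lvl <= wdOutlineLevel9 Then
                ' Assumption: the built-in styles are named "Heading 1" ... "Heading 9".
                para.Style = "Heading " & lvl
            End If
        Next para
    End Sub
    ```

    Run it on a copy of the imported document first. If the paragraphs turn out to have the body-text outline level, the macro changes nothing and the cause lies elsewhere.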
