How do I view HTML source in Word 2016?

Anonymous
2017-09-17T02:57:34+00:00

I'm not a web designer or a computer programmer. I'm just a simple college educator who likes building all my docs in Word. But in the last few years I've been adapting to a very powerful LMS called Canvas. Canvas's text editor does not have all the cool fonts and features that Word does, so I got in the habit of just cutting and pasting the HTML from my old Word docs into the Canvas page. But now I've upgraded to 2016, and I've spent hours trying to find someone to help with this. You'd think it would be the most basic feature to have. Is there no way to view the HTML source of a Word 2016 doc? I find it hard to believe. Just about anywhere I go on my computer I simply right-click and select Inspect, and there it is. What possible reason could MS have for hiding the HTML code?

The world wonders!

Microsoft 365 and Office | Word | For home | Windows

Locked Question. This question was migrated from the Microsoft Support Community. You can vote on whether it's helpful, but you can't add comments or replies or follow the question.


3 answers

  1. Anonymous
    2017-09-18T04:32:12+00:00

    As Jay said, there is no HTML in Word documents.

    You say you "cut and paste the HTML from old word docs". Exactly how did you do that, and in what version of Word?

    I just thought of something. Although DOCX files are structured as XML, you do have the option of using File > Save As to save them in HTML format. I prefer "Web Page, Filtered"; it removes a lot of the redundant markup that Word would otherwise put in the straight HTML file.

    After that you could copy HTML to paste elsewhere.
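If the saved page is large, picking out the part to paste can be scripted. This is only a minimal sketch in Python, not anything Word provides: the file name `handout.htm` is hypothetical, and the `cp1252` encoding is an assumption (check the `charset` in the saved file's `<meta>` tag). It returns whatever sits between `<body>` and `</body>`.

```python
import re
from pathlib import Path


def body_html(page: str) -> str:
    """Return the markup between <body ...> and </body>.

    Falls back to the whole page when no body element is found.
    """
    match = re.search(
        r"<body[^>]*>(.*?)</body>", page, flags=re.IGNORECASE | re.DOTALL
    )
    return match.group(1).strip() if match else page.strip()


if __name__ == "__main__":
    # Hypothetical file saved from Word as "Web Page, Filtered".
    path = Path("handout.htm")
    if path.exists():
        # Assumption: the file is Windows-1252 encoded.
        print(body_html(path.read_text(encoding="cp1252", errors="replace")))
```

The printed markup still contains Word's class names (for example `MsoNormal`), so the result may look different once pasted into another editor.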

    Let's take a high-level look at some Word history.

    The old "DOC" format files were "binary". A loose equivalent to a compiled program. If you opened one in a text editor, it was unreadable.  Even if you opened it in a hex editor, the file was still essentially unreadable.

    In 2007 MS switched to the DOCX format. As Jay pointed out, a DOCX is actually a ZIP archive with a different extension. It contains a set of plain-text XML files that any text editor (e.g. WordPad) can read. The easiest way I've found to open it is to ADD the .zip extension to the file name (rather than replacing .docx with .zip). Inside the archive are several standard folders containing the various bits and bobs. The most relevant folder is /word. Inside it, if the document has any pictures, is the /media folder that holds them. The most relevant file is "document.xml". It contains the body text, but the text is buried inside XML that starts like this:

    <?xml version="1.0" encoding="UTF-8" standalone="yes"?> <w:document xmlns:wpc="http://schemas.microsoft.com/office/word/2010/wordprocessingCanvas" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:wp14="http://schemas.microsoft.com/office/word/2010/wordprocessingDrawing" xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing" xmlns:w10="urn:schemas-microsoft-com:office:word" xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" xmlns:w14

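    XML looks similar to HTML because both are tag-based markup languages, but XML is not a variation of HTML, and a browser or the Canvas editor will not treat Word's XML as a web page.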

    Most (99.99999%) Office users don't care about the underlying XML, so MS has not bothered to provide direct access to it from inside Word. If you need the XML, you can unzip the file and get at it. The format is publicly documented (it is standardized as Office Open XML, ECMA-376), but few users ever hear about it.

    The point is, DOCX files are complex structures of files and subfolders. There is no simple way to directly access XML (or HTML) inside of Word.
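Outside Word, though, the XML is easy to reach with a short script. The following is a rough Python sketch using only the standard library. The function name `docx_paragraphs` is made up for illustration. It opens the archive, reads `word/document.xml`, and joins the text runs of each paragraph. It ignores formatting, tables, headers and footers.

```python
import zipfile
import xml.etree.ElementTree as ET

# Main WordprocessingML namespace (the xmlns:w value in document.xml).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_paragraphs(source) -> list:
    """Return the plain text of each <w:p> paragraph in word/document.xml.

    `source` may be a path to a .docx file or a binary file-like object.
    """
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(f"{{{W_NS}}}p"):
        # Text lives in <w:t> elements inside runs (<w:r>).
        runs = [t.text or "" for t in para.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(runs))
    return paragraphs
```

Calling `docx_paragraphs("lecture.docx")` (a hypothetical file name) would return a list with one string per paragraph. This gives plain text, not HTML, so it is no substitute for Save As when the formatting matters.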

  2. Jay Freedman 207.6K Reputation points Volunteer Moderator
    2017-09-18T03:27:53+00:00

    A document in the .docx format (native since Office 2007) doesn't have HTML source; it's a zipped archive of mostly XML. If you want HTML, open the Save As dialog and set the Save As Type to Web Page or Web Page, Filtered. Both are HTML; the Filtered version has less Word-specific coding.

    If you want to see the XML, take a copy of the document and change its extension to .zip. Then you can open the archive directly or extract some or all of its contents. You may want to use an XML editor such as XML Notepad.
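The copy-and-extract steps can also be scripted. This is a hedged Python sketch, with `unpack_docx` being a name invented here: it copies the document, gives the copy a .zip extension, extracts it into a `contents` subfolder, and returns the names of the parts it found.

```python
import shutil
import zipfile
from pathlib import Path


def unpack_docx(docx_path, out_dir):
    """Copy a .docx, extract the copy, and return the sorted part names."""
    docx_path, out_dir = Path(docx_path), Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Work on a copy so the original document is never touched.
    copy_path = out_dir / (docx_path.name + ".zip")
    shutil.copyfile(docx_path, copy_path)

    with zipfile.ZipFile(copy_path) as archive:
        archive.extractall(out_dir / "contents")
        return sorted(archive.namelist())
```

For a typical document the returned list would include `word/document.xml`, which is the file to open in an XML editor.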

  3. Anonymous
    2017-09-18T23:39:24+00:00

    I wondered what you were using before to display Word HTML. Perhaps the old Script Editor? I believe that was "unlinked" from the product after Word 2003 and I think ditched altogether when Word 2010 came out.

    It is possible to display *something* by copying the document or selection to the clipboard, then fetching the HTML encoding from the clipboard and displaying it in a dialog box.

    I have posted a template that shows how that might be done at https://goo.gl/RU58LT . However, that's all it is. It may not display the same HTML as Word saves when you Save As in HTML format. I do not have a lot of time right now either to do more on this or even to provide a thorough description of how to use it. I suspect there must be similar utilities out there that have been better designed, written, tested and packaged.

    The template uses the clipboard routines posted by Leigh Webber on the MSDN site a few years ago (the relevant links are within the VBA code).

    To use it, you would need to download the template (it is a .dotm) and either copy it to your Word startup folder or add it to your document templates via the Developer tab. (You can enable the Developer tab via Word->File->Options->Customize Ribbon.)

    You should then be able to add buttons or assign keystrokes to the two main routines in the template, which are called displaySelectionHTMLWithHeader and DisplayHTMLWithHeader, also using the Word->File->Options->Customize Ribbon option. You should be able to list the macros by selecting "Macros" under the "Choose commands from" dropdown, and add them to a new group within a new tab in the ribbon.

    Then select some content in a document, and click the relevant button in the Ribbon.

    The dialog box should display HTML, except that there is a short non-HTML header that shows the start and end points of the HTML and the name of the file it came from.
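The header described above sounds like the one in the Windows "HTML Format" clipboard data, which records byte offsets such as `StartHTML` and `EndHTML` plus a `SourceURL` line. That is an assumption, since the template itself is not shown here. Under that assumption, a minimal Python sketch for separating the header from the markup could look like this (it works on raw bytes already fetched from the clipboard; fetching them is not shown, and the function names are invented):

```python
import re


def _field(data: bytes, name: str):
    """Return the raw value of one 'Name:value' header line, or None."""
    pattern = rb"^" + name.encode("ascii") + rb":(.*?)\r?$"
    match = re.search(pattern, data, flags=re.MULTILINE)
    return match.group(1) if match else None


def split_clipboard_html(data: bytes) -> dict:
    """Split an 'HTML Format' payload into source URL, full HTML and fragment."""
    offsets = {}
    for name in ("StartHTML", "EndHTML", "StartFragment", "EndFragment"):
        raw = _field(data, name)
        offsets[name] = int(raw.decode("ascii")) if raw is not None else -1

    def piece(start: int, end: int) -> str:
        # Offsets are byte positions counted from the start of the payload.
        if start < 0 or end < start:
            return ""
        return data[start:end].decode("utf-8", errors="replace")

    source = _field(data, "SourceURL")
    return {
        "source_url": source.decode("utf-8", errors="replace") if source else None,
        "html": piece(offsets["StartHTML"], offsets["EndHTML"]),
        "fragment": piece(offsets["StartFragment"], offsets["EndFragment"]),
    }
```

The `fragment` entry is the part that corresponds to the selection, so it is probably the piece worth pasting elsewhere. As noted above, it may not match what Save As produces.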
