
Custom XML Data

Anonymous
2013-10-13T04:03:47+00:00

I recently upgraded to Word 2013. When I opened a few files that had been created in Word 2007, I noticed that under "Inspect Document" it said custom XML data was found. I saved the files under new names and let Word remove the custom XML, but I'm a little concerned about how it got there in the first place as I did not (intentionally, at least) add any XML data. 

Is the presence of XML data any sort of danger or security risk to anyone who might receive or open the file? And how could it have gotten in there without my knowing?


(Sorry if this sounds paranoid. I had an issue last weekend where Word was throwing a false positive and triggering a security alert on one of my documents. Both a Microsoft support agent and a response from someone on the community assured me it was nothing to worry about, but it left me a bit jumpy, so when I saw the XML warning on multiple documents, I was concerned.)


Locked Question. This question was migrated from the Microsoft Support Community. You can vote on whether it's helpful, but you can't add comments or replies or follow the question.

Answer accepted by question author
  1. Jay Freedman 207.5K Reputation points Volunteer Moderator
    2013-10-13T15:16:21+00:00

    Like the false positive security alert, this is not something to be concerned about.

    A little background will help: Documents that are saved in the .docx format actually consist almost entirely of XML (which stands for eXtensible Markup Language). Most of that XML is standardized to mark things like headings and tables.
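
    To make the background above concrete: a .docx file is a ZIP package whose parts are mostly XML files. Here is a minimal Python sketch that lists those parts. The function names `list_package_parts` and `xml_share` are made up for this illustration; only the standard `zipfile` module is used.

```python
import zipfile


def list_package_parts(docx_path):
    """Return the names of all parts stored inside a .docx package."""
    # A .docx file is a ZIP archive, so the standard zipfile module can read it.
    with zipfile.ZipFile(docx_path) as package:
        return package.namelist()


def xml_share(part_names):
    """Return the fraction of parts whose names end in .xml or .rels."""
    if not part_names:
        return 0.0
    # .rels files are also XML; images and other media are not.
    xml_like = [n for n in part_names if n.lower().endswith((".xml", ".rels"))]
    return len(xml_like) / len(part_names)
```

    For a typical document the list includes names such as word/document.xml and [Content_Types].xml, plus any embedded pictures.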

    In the initial release of Word 2007, there was also the ability to include special ("custom") pieces of XML for whatever purpose was desired. That capability wasn't used very much, but when it was used it was usually by add-ins or macros, not by end users.

    In 2009 a company named i4i won a lawsuit claiming that Word's custom XML feature infringed its patent, and the court required Microsoft to remove the feature from Word. Further, from that date onward, any version of Word that opened a document in which custom XML was already present would be required to remove the custom XML from the file.

    So the bottom line is that custom XML in a Word document is not a security risk to anyone. It's just a very expensive headache for Microsoft and for anyone who might write an add-in that would benefit from using the forbidden custom XML.

    100+ people found this answer helpful.

8 additional answers

  1. Jay Freedman 207.5K Reputation points Volunteer Moderator
    2013-10-20T19:40:33+00:00

    The problem here is that Microsoft has apparently used the term "custom XML" for two different things.

    If a document created by a very early copy of Word 2007 from before the lawsuit was decided is opened in any later copy of Word, and it contains the "bad" custom XML (that is, patent-infringing, not dangerous to you), that stuff will automatically be removed. You don't have any choice in the matter, but you also don't have to run the Document Inspector to find out about it.

    Later versions of Word may place certain information into "good" custom XML. The only thing I've ever seen there is information about bibliography references and styles (for example, whether the document uses APA 6th edition or Chicago style, etc.). Not only is that stuff harmless when it's there, it might be a problem if you remove it and lose the information it contains.

    If you want to take a look at it before (or instead of) deleting it, here's how:

    • In Windows Explorer, go to the folder containing the document file.
    • Make a copy of the file (the quick way is to hold down Ctrl while dragging the file's icon and dropping it on a blank space in the file list).
    • With the copy selected, press F2 or right-click and choose Rename. Change the file's extension to .zip and press Enter.
    • Double-click the zip file to open it like a folder.
    • Double-click the subfolder named customXml.
    • Double-click each of the files (with extension .xml) in the subfolder. They'll open in whatever program you have assigned to open text files, probably Notepad.

    A lot of what you see will be XML tags within brackets like < > and what look like Internet addresses (although they don't usually correspond to real web pages). Some of it will be recognizable. Here's a sample:

    <b:Sources SelectedStyle="\APASixthEditionOfficeOnline.xsl" StyleName="APA" Version="6" xmlns:b="http://schemas.openxmlformats.org/officeDocument/2006/bibliography" xmlns="http://schemas.openxmlformats.org/officeDocument/2006/bibliography"></b:Sources>

    When you're satisfied, close the XML files and delete the zip file.
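
    If you'd rather not rename and unzip anything, the same inspection can be sketched in a few lines of Python, which reads the .docx directly as a ZIP package. This is only an illustration: the function name `read_custom_xml` and the file name in the usage comment are made up, and the sketch assumes the custom XML parts are UTF-8 text (bytes that can't be decoded are replaced instead of raising an error).

```python
import zipfile


def read_custom_xml(docx_path):
    """Return {part name: text} for every .xml file under customXml/."""
    parts = {}
    with zipfile.ZipFile(docx_path) as package:
        for name in package.namelist():
            # Skips the .rels files kept in customXml/_rels/.
            if name.startswith("customXml/") and name.endswith(".xml"):
                # Assumes UTF-8; undecodable bytes are replaced, not fatal.
                parts[name] = package.read(name).decode("utf-8", errors="replace")
    return parts


# Usage (hypothetical file name):
#     for name, text in read_custom_xml("copy-of-report.docx").items():
#         print("---", name)
#         print(text)
```

    The script only reads the file, so the document itself is left untouched. A document with no customXml folder simply gives an empty result.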

    One other thing that may make you more at ease: I've never heard of malware of any kind that could be transported in a document's custom XML. The worst risk it poses is that it might contain some personal information that you don't want to share with anyone else.

    20+ people found this answer helpful.
  2. Anonymous
    2014-12-29T09:55:16+00:00

    In addition to the things that Jay has mentioned, three other things will result in your document having Custom XML Data. Two of them may be becoming more widespread as more people use "the cloud" to store documents. However, I do not think you need to be too concerned about them, as explained below.

     a. If the document you are working on was created on, has been opened from, or ever passed through a SharePoint site, it will probably have custom XML containing "SharePoint properties".

     b. If the document you are working on was created on, has been opened from, or ever passed through a OneDrive for Business site, it will probably have custom XML containing "properties".

     c. If anyone has ever added one or more of a particular set of document properties, Word will create a custom XML part to contain their values. These are the "Cover Page Properties", which in the English language version of Word are called "Abstract", "Company Address", "Company E-mail", "Company Fax" and "Company Phone". You would typically insert them from Insert->Quick Parts->Document Property...

    However,

    The "ordinary" personal version of OneDrive (formerly "SkyDrive") does not create Custom XML, unlike the OneDrive for Business version.

    Word's document inspector does not detect the "Cover Page Properties" - presumably because they are known to Word and regarded as being like any other property.

    Even the SharePoint and OneDrive custom XML parts are primarily there to allow users to insert additional "document property information", over and above the standard built-in properties such as Author, Title and so on. The mechanism Word provides to let users do that is "content controls". Even if you have inserted those property values using content controls, as far as I know if you remove the custom XML, all that will happen is that changes to one content control will not result in changes to any others: whatever property values last appeared in the document should remain unchanged.

    7 people found this answer helpful.
  3. Jay Freedman 207.5K Reputation points Volunteer Moderator
    2015-06-23T20:21:33+00:00

    [Post from user TT.thomastan has been split off to this new thread.]

    5 people found this answer helpful.
  4. Anonymous
    2013-10-14T02:01:08+00:00

    Thank you so much, Jay!

    1 person found this answer helpful.