Create a rich Word document based on your own custom XML (without the need for XSLT)


I hope everyone had a great new year. Sorry I've taken so much time off from blogging. I was pretty busy last week just getting caught up on e-mail. For those of you who posted comments, or sent comments to me directly, I'll try to get to them all (sorry it's taking so long). Last month was such a busy month with all the traveling for our work in Ecma and family time for the holidays that I quickly fell behind. Beta 1 of Office has been out for a couple months now, and I haven't posted much content to help people use some of the new XML functionality in Office 12. Today, I want to post an example Word document that leverages the new storage we provide for custom XML and the integration of that XML store with a new feature called content controls. Anyone who has Beta 1 should be able to try this out.

There were a large number of scenarios we looked at when we first started our move towards strong custom XML support back towards the end of Office XP. Some of them were around making document generation much easier and more reliable. Other scenarios were around making Office documents integrate in richer ways with business processes. There were a number of different exciting scenarios here, but this first example I'm going to show is really more around document generation. We often see people use the mail merge functionality in Word for more than just creating letters. It allows you to import data to create a document driven by that data. We've also seen people do this in Word 2003 using the XML file format in combination with XSLTs. We had been a bit naive in thinking that there would be a lot of folks out there building XSLTs for transforming their data into a rich Word document. There are plenty of people willing to do this, but it's a lot of work, and often too advanced for the majority of people trying to build a solution.

Generate a rich document based on Custom XML without an XSLT!

I have an example I've demo'd at a number of conferences that I wanted everyone to get a chance to play around with. If you grab this ZIP file, you'll see a Word document and an XML file called "item1.xml". Go ahead and open the Word document in Beta 1 and take a look. I have a couple things I'd like you to try:

  1. Close down the file in Word, and make a copy of the Word document. For the new copy, rename the extension of the file to ".zip"
  2. Crack open the file and navigate to the "customXML" folder. Notice the part called "item1.xml". If you open that you'll see an XML file with a number of custom XML tags that I created, but they are all empty.
  3. Open the "item1.xml" file that was in the original ZIP file you downloaded. Notice that it's in the same namespace as the XML file you looked at in step 2, but it has values for each XML node.
  4. Delete the item1.xml file from the Word document from step 2, and replace it with the extra one from step 3.
  5. Now, change the extension of the Word document back to .docx and open it in Word. Notice that the document now has all the values from that new item1.xml file displayed directly inline in the Word document (you can open the original Word document as well if you want to compare the differences).
  6. Make some changes to those values and save the file again. Change the extension back to ZIP and go to the "item1.xml" part again and you'll see that the XML file has the updated values based on the changes you made.
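
If you'd rather not rename files by hand, the swap in steps 1 through 5 can be scripted. Here's a minimal Python sketch that copies a package and replaces one part along the way. The function name `replace_part` is just something I made up for illustration, and the part name is passed in because the exact folder casing ("customXML" vs. "customXml") depends on what you actually see inside your package.

```python
import io
import zipfile


def replace_part(package_bytes: bytes, part_name: str, new_content: bytes) -> bytes:
    """Return a copy of a .docx (ZIP) package with one part's bytes swapped out."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        names = src.namelist()
        if part_name not in names:
            # ZIP entry names are case sensitive, so show what is really there.
            raise KeyError(f"{part_name!r} not found; parts are: {names}")
        for info in src.infolist():
            # Copy every part unchanged except the one being replaced.
            data = new_content if info.filename == part_name else src.read(info.filename)
            dst.writestr(info.filename, data)
    return out.getvalue()
```

To use it, read the Word document and the filled-in item1.xml as bytes, call `replace_part(docx_bytes, "customXml/item1.xml", xml_bytes)` (that part name is an assumption, so check your own package), and write the result out with a .docx extension.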

This is new functionality that leverages a couple of new features. Content controls, the custom XML store, and the ability to map the content controls to nodes in the custom XML store all combine to give you this powerful separation of data and view.

Content Controls

Even without the XML mapping, the new set of features in Word called content controls make it much easier to structure a rich Word solution. Go ahead and open the original document you'd downloaded again. Notice that in the 2nd paragraph, you can only edit within specific regions. In that 2nd paragraph, there are a number of "content controls", and then the entire paragraph has been "grouped". By grouping the 2nd paragraph when I created the document, I made it so that the look and boilerplate text couldn't be changed, and instead only the content of the controls could be edited. Some of the controls are just plain text, but notice that there are other types of controls as well. The date for example, has a calendar control that will drop down:

Developer Tools

There are a number of available content controls:

  1. Plain Text - The name is somewhat misleading. This control will take on the formatting that is applied to it while in design mode, so the template author can set up the look, and the end user can only edit the contents.
  2. Picture - This control can only contain a picture. When the user clicks on it, the "insert picture" dialog appears.
  3. Drop Down List - This one behaves similarly to the plain text control, since you can first set up what formatting you want applied, but in addition, you can also specify a list of values that the user is allowed to choose from.
  4. Calendar - The user will be given a calendar control to pick the date. You have a number of options here for how the date is formatted (M/d/yyyy; dddd, MMMM dd, yyyy; etc.).
  5. Combo Box - Just like a Drop Down List, except that the user can type in their own values as well as choose from a list you define.
  6. Rich Text - Behaves just like any other text in Word.
  7. Building Blocks - This is another new feature that I'll talk about later since it really deserves its own post(s).
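
To make the two date patterns mentioned for the calendar control a bit more concrete, here's a rough Python equivalent of each. This only illustrates what the patterns produce; the function names are hypothetical and this isn't how Word formats dates internally.

```python
from datetime import date


def format_short(d: date) -> str:
    # Pattern "M/d/yyyy": month and day without leading zeros, four-digit year.
    return f"{d.month}/{d.day}/{d.year:04d}"


def format_long(d: date) -> str:
    # Pattern "dddd, MMMM dd, yyyy": full weekday and month names, two-digit day.
    # strftime names follow the process locale (English in the default "C" locale).
    return d.strftime("%A, %B %d, %Y")
```

For January 9, 2006, the first gives "1/9/2006" and the second gives "Monday, January 09, 2006".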

These new controls, and the new "grouping" functionality, make it really easy to design a template where you have some structured islands of information you want the user to fill out. Each control has its own independent settings as to whether it's editable and whether or not it can be deleted. You can also specify placeholder text to be displayed when the control is empty.

If you are building a solution, the controls are also really helpful because they can be given unique names that you can use to easily address them in the Object Model. That also makes it really easy to get at them in the file format, since each control will be marked with XML structure. The part that I find most exciting about the controls though, is that you can map these controls to XML nodes in your own schema as we saw in this example.
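
Here's a small Python sketch of what "getting at them in the file format" can look like. It assumes the markup as it was later standardized in ECMA-376 (`w:sdt` elements whose `w:sdtPr` holds a `w:alias` name and, when mapped, a `w:dataBinding` with an XPath); the Beta 1 markup and namespace may well differ, so treat the element names and the namespace constant as assumptions. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# WordprocessingML namespace from the standardized format (assumption for Beta 1).
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def list_content_controls(document_xml: str):
    """Return (alias, xpath) pairs for each content control in the markup."""
    root = ET.fromstring(document_xml)
    found = []
    for sdt_pr in root.iter(f"{{{W}}}sdtPr"):
        alias = sdt_pr.find(f"{{{W}}}alias")
        binding = sdt_pr.find(f"{{{W}}}dataBinding")
        # A control may have no friendly name or no XML mapping; report None then.
        name = alias.get(f"{{{W}}}val") if alias is not None else None
        xpath = binding.get(f"{{{W}}}xpath") if binding is not None else None
        found.append((name, xpath))
    return found
```

Feed it the text of the main document part and you get back one pair per control, with `None` for the XPath on any control that isn't mapped to the custom XML store.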

Insert your own content controls

While I'll need to cover this in more detail later, I did want to quickly explain how you can insert your own content controls. The first thing you'll need to do is make sure that you have the "Developer" tab showing in the ribbon. You can do this by going to File -> Word Options, and under the view settings choose "Developer Tools":

Developer Tools

Now, click on the "Developer" tab, and you'll see a chunk called "content controls":

Developer Tools

With this, you can insert new content controls, as well as modify the properties of existing ones. Go ahead and play around with that a bit, and I'll post some more information later on ways to work with the controls. Some of the other topics I'll try to cover in the future in this area are:

  1. Using XML mapping and schema to drive the content for drop down controls. If you have a schema restriction, we can automatically use the values in that restriction to populate the dropdown list.
  2. Using locking and groups to structure the document.
  3. Using building blocks to generate rich, structured document fragments that can be easily inserted into a document and automatically bind to the custom XML already present.
  4. Binding content controls to document properties and SharePoint data. Have you ever had a document library in SharePoint and wanted the ability to map the column values directly into the content of the document? Well, now you can set it up so that if the values are changed in SharePoint they will be reflected directly in the document, and if they are changed in the document, they will be reflected in the SharePoint library.
  5. Programmatic access to the custom XML store. You can set up all the mappings with the content controls, and then just program directly against the XML data. Anytime the user changes the values of one of the controls, it's automatically pushed back into the node it's mapped to, and an event is thrown. If you make a change to a node programmatically, then any content control mapped to that node will be automatically updated. This allows you to write your solution directly against the data, instead of against Word's objects.
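
As a taste of the first topic in that list, here's what a "schema restriction" boils down to: an XSD simple type with a list of `xs:enumeration` values. The Python sketch below pulls those values out of a schema, which is roughly the list you'd expect to see in the dropdown. This is only an illustration, not how Word does it internally, and the function name and the type name you pass in are hypothetical.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"


def enumeration_values(xsd_text: str, type_name: str):
    """Return the xs:enumeration values of the named simple type, in order."""
    root = ET.fromstring(xsd_text)
    for simple_type in root.iter(f"{{{XS}}}simpleType"):
        if simple_type.get("name") == type_name:
            # Each xs:enumeration facet carries one allowed value.
            return [e.get("value") for e in simple_type.iter(f"{{{XS}}}enumeration")]
    return []  # No simple type with that name in this schema.
```

Given a schema that restricts a made-up `StatusType` to "Draft", "Review" and "Final", the function returns those three strings in schema order.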


(I almost forgot... go Seahawks!)