Walkthrough: Connect an Item in the Data Store to an Excel Workbook
Word Developer Reference |
Microsoft Office documents are often used as repositories of data. In the past, this meant that you had to start the Office application to get to the data. However, Microsoft Office Word 2007 documents and Microsoft Office Excel 2007 workbooks enable you to store data in custom XML document parts, and retrieve it using the Office Open XML file formats. You can use Excel workbooks as a data source and build Word 2007 document templates that pull data from Excel workbooks.
Word 2007 enables you to build data-driven, document-generation solutions. You can create a document template that includes a custom XML part and use content controls to bind to custom XML data using XML mapping. Then you can create a managed application to build a new document based on the template. The managed application opens the document template, retrieves data from a Microsoft Office Access database to build a new custom XML part, replaces the template's custom XML part , and saves as a new document.
This walkthrough explains how to build a template and how to create a server-side application that generates documents that display data stored in an Excel workbook. To build this application, you will do the following:
- Create the source Excel workbook.
- Create a Word 2007 document template.
- Create a server-side application that pulls data from an Access database and generates new documents based on the Word 2007 document template.
Objects used in this sample:
- ContentControl
- ContentControls
- CustomXMLPart (Microsoft Office system core object model)
- CustomXMLParts (Microsoft Office system core object model)
- XMLMapping
For more information about content controls, see Working with Content Controls.
Business Scenario: Project Cost Report
To demonstrate how to build a Word 2007 template and connect an item in the data store to an Excel workbook, you first build a new Excel workbook that contains project data, then you create a project cost report document template with content controls that map to an XML file. Next, you create a document-generation application that enables you to select an Excel file, sheet name, and cell numbers that store project data and generate a custom report. The application retrieves customer data from the Excel workbook and uses the project cost report document template to build a new document that displays project data based on a user selection. The document displays the following information:
- Company Name
- Project Name
- Budget
Create a Source Excel Workbook
Follow these steps to create an Excel workbook that contains project data:
- Start Excel.
- Create an Excel workbook.
- Type
Company Name
into cell A1. - Type
Contoso Traders
into cell B1. - Type
Project Name
into cell A2. - Type
Marketing Plan
into cell B2. - Type
Budget
into cell A3. - Type
1500
into cell B3. - On the File menu, click Save.
- Save the file as C:\Budget.xlsx.
- Close Excel.
Create a Word Template and Define the XML Mappings for Each Content Control
Follow these steps to create a Word 2007 template:
- Open Word 2007.
- Create a document.
- Create content controls to bind to a node in the data store.
Content controls are predefined pieces of content. There are several types of content controls, including text blocks, drop-down menus, combo boxes, calendar controls, and pictures. You can map these content controls to an element in an XML file. Using XPath expressions, you can programmatically map content in an XML file to a content control. This enables you to write a simple and short application to manipulate and modify data in a document. - Set the XML mapping on the content control.
XML mapping is a feature of Word 2007 that enables you to create a link between a document and an XML file. This creates true data/view separation between the document formatting and the custom XML data. To load a custom XML part, you must first add a new data store to a Document object by using the Add method of the CustomXMLParts collection. This appends a new, empty data store to the document. Because it is empty, you cannot use it yet. Next, you must load a custom XML part from an XML file into the data store, by calling the Load method of the CustomXMLPart object, using a valid path to an XML file as the parameter. - Add three plain-text content controls, in the following order:
- Content control for Company Name
- Content control for Project Name
- Content control for Budget
- Save the template document as C:\ProjectReportTemplate.docx.
To Set XML Mapping on a Content Control
Note |
---|
In this procedure, you map the content control to a sample custom XML file. You create a valid custom XML file, save it to your hard disk drive, and add a data store to the document that contains the information to which you want to map. |
- Create a text file and save it as C:\ProjectData.xml.
- Copy the following into the text file and save it:
<?xml version="1.0" standalone="yes"?> <Project> <CompanyName>Northwind Traders</CompanyName> <ProjectName>Finance Plan</ProjectName> <Budget>1000</Budget> </Project>
, \ , \ , and \ nodes of the previous custom XML part. 4. Open the Microsoft Visual Basic editor and run the following Visual Basic for Applications (VBA) code to add a data store to your template document. This sample code demonstrates how to attach an XML file to a document, so that it becomes an available data store item. ' Load CustomerData.xml file ActiveDocument.CustomXMLParts.Add ActiveDocument.CustomXMLParts(4).Load ("c:\ProjectData.xml")
Dim strXPath1 As String strXPath1 = "/Project/CompanyName" ActiveDocument.ContentControls(1).XMLMapping.SetMapping strXPath1
Dim strXPath2 As String strXPath2 = "/Project/ProjectName" ActiveDocument.ContentControls(2).XMLMapping.SetMapping strXPath2
Dim strXPath3 As String strXPath3 = "/Project/Budget" ActiveDocument.ContentControls(3).XMLMapping.SetMapping strXPath3
Create a Server-Side Application That Pulls Data from an Excel Workbook and Generates the New Document
In this section, you create a Web-based application that enables users to select an Excel workbook file, the sheet name, and the cell numbers that contain the values that store project data and generate a custom project report. The Web-based application retrieves customer data from an Excel workbook, opens the project report document template, and creates a new document that displays customer data based on a user selection. This application does not require the use of Word 2007, Excel 2007, or VBA. You can use your favorite managed code (Microsoft Visual Basic .NET or C#) language to build this application. To build this application, do the following:
- Open Microsoft Visual Studio 2005 or Microsoft Visual Web Developer 2005.
- Create an ASP.NET Web site and name it ExcelToWord.
- Add the ProjectReportTemplate.docx to the App_Data folder.
- Download and install the Microsoft .NET Framework 3.0 (formerly Microsoft WinFX).
- Configure the assembly in the Web.config file as follows:
<compilation debug="false"> <assemblies> <add assembly="WindowsBase, Version=3.0.0.0, Culture=neutral, PublicKeyToken=31BF3856AD364E35"/> </assemblies> </compilation>
Sample Code
Visual Basic .NET
The following sample demonstrates how to connect to an Excel workbook to retrieve data based on a customer selection and create a new document based on the ProjectReportTemplate.docx.
<%@ Page Language="VB" AutoEventWireup="true" Debug="true" %>
<%@ Import Namespace="System.IO" %> <%@ Import Namespace="System.IO.Packaging" %> <%@ Import Namespace="System.Xml" %> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head id="Head1" runat="server"> <title>Data-Driven Document Generation - Excel Sample</title> </head>
<script runat='server'>
Private Const documentRelationshipType As String = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument" Private Const worksheetSchema As String = "http://schemas.openxmlformats.org/spreadsheetml/2006/5/main" Private Const sharedStringsRelationshipType As String = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/sharedStrings" Private Const sharedStringSchema As String = "http://schemas.openxmlformats.org/spreadsheetml/2006/5/main" Protected Sub btnCreateDocument_Click(ByVal sender As Object, ByVal e As EventArgs) CreateDocument End Sub Private Sub CreateDocument() ' Get the template file and create a stream from it Const TemplateFile As String = "~/App_Data/ProjectReportTemplate.docx" ' Read the file into memory Dim buffer() As Byte = File.ReadAllBytes(Server.MapPath(TemplateFile)) Dim memoryStream As MemoryStream = New MemoryStream(buffer, true) buffer = Nothing ' Open the document in the stream and replace the custom XML part Dim pkgFile As Package = Package.Open(memoryStream, FileMode.Open, FileAccess.ReadWrite) Dim pkgrcOfficeDocument As PackageRelationshipCollection = pkgFile.GetRelationshipsByType(documentRelationshipType) For Each pkgr As PackageRelationship In pkgrcOfficeDocument If (pkgr.SourceUri.OriginalString = "/") Then ' Get the root part Dim pkgpRoot As PackagePart = pkgFile.GetPart(New Uri(("/" + pkgr.TargetUri.ToString), UriKind.Relative)) ' Add a custom XML part to the package Dim uriData As Uri = New Uri("/customXML/item1.xml", UriKind.Relative) If pkgFile.PartExists(uriData) Then ' Delete template "/customXML/item1.xml" part pkgFile.DeletePart(uriData) End If ' Load the custom XML data Dim pkgprtData As PackagePart = pkgFile.CreatePart(uriData, "application/xml") GetDataFromExcel(pkgprtData.GetStream) End If Next ' Close the file pkgFile.Close ' Return the result Response.ClearContent Response.ClearHeaders Response.AddHeader("content-disposition", "attachment; filename=document.docx") Response.ContentEncoding = System.Text.Encoding.UTF8 memoryStream.WriteTo(Response.OutputStream) memoryStream.Close Response.End End Sub Private Sub GetDataFromExcel(ByVal stream As Stream) 'Connect to an Excel workbook and get data Dim writer As XmlWriter = XmlWriter.Create(stream) writer.WriteStartElement("Project") writer.WriteElementString("CompanyName", XLGetCellValue(tbFileName.Text, tbSheetName.Text, tbCustomerCell.Text)) writer.WriteElementString("ProjectName", XLGetCellValue(tbFileName.Text, tbSheetName.Text, tbProjectCell.Text)) writer.WriteElementString("Budget", XLGetCellValue(tbFileName.Text, tbSheetName.Text, tbBudgetCell.Text)) writer.WriteEndElement writer.Close End Sub Public Function XLGetCellValue(ByVal fileName As String, ByVal sheetName As String, ByVal addressName As String) As String Dim cellValue As String = Nothing ' Retrieve the stream containing info from the requested ' worksheet. Dim documentPart As PackagePart = Nothing Dim documentUri As Uri = Nothing Using xlPackage As Package = Package.Open(fileName, FileMode.Open, FileAccess.Read) ' Get the main document part (workbook.xml). For Each relationship As PackageRelationship In xlPackage.GetRelationshipsByType(documentRelationshipType) ' There should be only one document part in the package. documentUri = PackUriHelper.ResolvePartUri(New Uri("/", UriKind.Relative), relationship.TargetUri) documentPart = xlPackage.GetPart(documentUri) ' There should be only one instance, but get out no matter what. Exit For Next If documentPart IsNot Nothing Then ' Load the contents of the workbook. Dim doc As XmlDocument = New XmlDocument() doc.Load(documentPart.GetStream()) ' Create a NamespaceManager to handle the default namespace. ' Create a prefix for the default namespace. Dim nt As NameTable = New NameTable() Dim nsManager As New XmlNamespaceManager(nt) nsManager.AddNamespace("d", worksheetSchema) nsManager.AddNamespace("s", sharedStringSchema) Dim searchString As String = String.Format("//d:sheet[@name='{0}']", sheetName) Dim sheetNode As XmlNode = doc.SelectSingleNode(searchString, nsManager) If sheetNode IsNot Nothing Then ' Get the relId attribute. Dim relationAttribute As XmlAttribute = sheetNode.Attributes("r:id") If relationAttribute IsNot Nothing Then Dim relId As String = relationAttribute.Value Dim sheetRelation As PackageRelationship = documentPart.GetRelationship(relId) Dim sheetUri As Uri = PackUriHelper.ResolvePartUri(documentUri, sheetRelation.TargetUri) Dim sheetPart As PackagePart = xlPackage.GetPart(sheetUri) ' Load the contents of the sheet into an XML document. Dim sheetDoc As New XmlDocument() sheetDoc.Load(sheetPart.GetStream()) Dim cellNode As XmlNode = sheetDoc.SelectSingleNode(String.Format("//d:sheetData/d:row/d:c[@r='{0}']", addressName), nsManager) If cellNode IsNot Nothing Then ' Retrieve the value. The value may be stored within ' this element. If the "t" attribute contains "s", then ' the cell contains a shared string, and you must look ' up the value individually. Dim typeAttr As XmlAttribute = cellNode.Attributes("t") Dim cellType As String = String.Empty If typeAttr IsNot Nothing Then cellType = typeAttr.Value End If Dim valueNode As XmlNode = cellNode.SelectSingleNode("d:v", nsManager) If valueNode IsNot Nothing Then cellValue = valueNode.InnerText End If ' Check the cell type. At this point, this code checks only ' for Boolean and string values individually. If cellType = "b" Then If cellValue = "1" Then cellValue = "TRUE" Else cellValue = "FALSE" End If ElseIf cellType = "s" Then ' Go retrieve the actual string from the associated string file. For Each stringRelationship As PackageRelationship In documentPart.GetRelationshipsByType(sharedStringsRelationshipType) ' There should be only one shared string reference, so exit this loop immediately. Dim sharedStringsUri As Uri = PackUriHelper.ResolvePartUri(documentUri, stringRelationship.TargetUri) Dim stringPart As PackagePart = xlPackage.GetPart(sharedStringsUri) If stringPart IsNot Nothing Then ' Load the contents of the shared strings. Dim stringDoc As New XmlDocument(nt) stringDoc.Load(stringPart.GetStream()) ' Add the string schema to the namespace manager. nsManager.AddNamespace("s", sharedStringSchema) Dim requestedString As Integer = Convert.ToInt32(cellValue) Dim strSearch As String = String.Format("//s:sst/s:si[{0}]", requestedString + 1) Dim stringNode As XmlNode = stringDoc.SelectSingleNode(strSearch, nsManager) If stringNode IsNot Nothing Then cellValue = stringNode.InnerText End If End If Next End If End If End If End If End If End Using Return cellValue End Function
</script>
<body> <form id="form1" runat="server"> <div> <strong>Create Budget Report based on Excel Data<br /> </strong> <br /> <table> <tr> <td colspan="2"> <strong>Financial data (Excel source file)</strong></td> </tr> <tr> <td style="width: 100px"> File name:</td> <td style="width: 419px"> <asp:TextBox ID="tbFileName" runat="server" Width="400px">C:\budget.xlsx</asp:TextBox></td> </tr> <tr> <td style="width: 100px"> Sheet name:</td> <td style="width: 419px"> <asp:TextBox ID="tbSheetName" runat="server" Width="400px">Sheet1</asp:TextBox></td> </tr> <tr> <td style="width: 100px"> Customer name(cell number):</td> <td style="width: 419px"> <asp:TextBox ID="tbCustomerCell" runat="server" Width="400px">B1</asp:TextBox></td> </tr> <tr> <td style="width: 100px"> Project name(cell number):</td> <td style="width: 419px"> <asp:TextBox ID="tbProjectCell" runat="server" Width="400px">B2</asp:TextBox></td> </tr> <tr> <td style="width: 100px"> Budget (cell number):</td> <td style="width: 419px"> <asp:TextBox ID="tbBudgetCell" runat="server" Width="400px">B3</asp:TextBox></td> </tr> <tr> <td style="width: 100px; height: 26px"> </td> <td style="width: 419px; height: 26px"> <asp:Button ID="btnCreateDocument" runat="server" OnClick="btnCreateDocument_Click" Text="Create Document" /></td> </tr> </table> </div> </form> </body> </html>
C#
The following sample demonstrates how to connect to an Excel workbook to retrieve data based on a customer selection and create a new document based on the ProjectReportTemplate.docx.
<%@ Page Language="C#" AutoEventWireup="true" %>
<%@ Import Namespace="System.IO" %> <%@ Import Namespace="System.IO.Packaging" %> <%@ Import Namespace="System.Xml" %> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head id="Head1" runat="server"> <title>Data-Driven Document Generation - Excel Sample</title> </head>
<script language="C#" runat="server">
const string documentRelationshipType = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument"; const string worksheetSchema = "http://schemas.openxmlformats.org/spreadsheetml/2006/5/main"; const string sharedStringsRelationshipType = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/sharedStrings"; const string sharedStringSchema = "http://schemas.openxmlformats.org/spreadsheetml/2006/5/main"; protected void btnCreateDocument_Click(object sender, EventArgs e) { CreateDocument(); } private void CreateDocument() { // Get the template file and create a stream from it const string TemplateFile = @"~/App_Data/ProjectReportTemplate.docx"; // Read the file into memory byte[] buffer = File.ReadAllBytes(Server.MapPath(TemplateFile)); MemoryStream memoryStream = new MemoryStream(buffer, true); buffer = null; // Open the document in the stream and replace the custom XML part Package pkgFile = Package.Open(memoryStream, FileMode.Open, FileAccess.ReadWrite); PackageRelationshipCollection pkgrcOfficeDocument = pkgFile.GetRelationshipsByType(documentRelationshipType); foreach (PackageRelationship pkgr in pkgrcOfficeDocument) { if (pkgr.SourceUri.OriginalString == "/") { // Get the root part PackagePart pkgpRoot = pkgFile.GetPart(new Uri("/" + pkgr.TargetUri.ToString(), UriKind.Relative)); // Add a custom XML part to the package Uri uriData = new Uri("/customXML/item1.xml", UriKind.Relative); if (pkgFile.PartExists(uriData)) { // Delete template "/customXML/item1.xml" part pkgFile.DeletePart(uriData); } // Load the custom XML data PackagePart pkgprtData = pkgFile.CreatePart(uriData, "application/xml"); GetDataFromExcel(pkgprtData.GetStream()); } } // Close the file pkgFile.Close(); // Return the result Response.ClearContent(); Response.ClearHeaders(); Response.AddHeader("content-disposition", "attachment; filename=document.docx"); Response.ContentEncoding = System.Text.Encoding.UTF8; memoryStream.WriteTo(Response.OutputStream); memoryStream.Close(); Response.End(); } private void GetDataFromExcel(Stream stream) { //Connect to an Excel workbook and get data XmlWriter writer = XmlWriter.Create(stream); writer.WriteStartElement("Project"); writer.WriteElementString("CompanyName", XLGetCellValue(tbFileName.Text, tbSheetName.Text, tbCustomerCell.Text)); writer.WriteElementString("ProjectName", XLGetCellValue(tbFileName.Text, tbSheetName.Text, tbProjectCell.Text)); writer.WriteElementString("Budget", XLGetCellValue(tbFileName.Text, tbSheetName.Text, tbBudgetCell.Text)); writer.WriteEndElement(); writer.Close(); } public string XLGetCellValue(string fileName, string sheetName, string addressName) { // Return the value of the specified cell. string cellValue = null; // Retrieve the stream containing the requested // worksheet's info: using (Package xlPackage = Package.Open(fileName, FileMode.Open, FileAccess.Read)) { PackagePart documentPart = null; Uri documentUri = null; // Get the main document part (workbook.xml). foreach (System.IO.Packaging.PackageRelationship relationship in xlPackage.GetRelationshipsByType(documentRelationshipType)) { // There should be only one document part in the package. documentUri = PackUriHelper.ResolvePartUri(new Uri("/", UriKind.Relative), relationship.TargetUri); documentPart = xlPackage.GetPart(documentUri); // There should be only one instance, but get out no matter what. break; } if (documentPart != null) { // Load the contents of the workbook. XmlDocument doc = new XmlDocument(); doc.Load(documentPart.GetStream()); // Create a namespace manager, so you can search. // Add a prefix (d) for the default namespace. NameTable nt = new NameTable(); XmlNamespaceManager nsManager = new XmlNamespaceManager(nt); nsManager.AddNamespace("d", worksheetSchema); nsManager.AddNamespace("s", sharedStringSchema); string searchString = string.Format("//d:sheet[@name='{0}']", sheetName); XmlNode sheetNode = doc.SelectSingleNode(searchString, nsManager); if (sheetNode != null) { // Get the relId attribute: XmlAttribute relationAttribute = sheetNode.Attributes["r:id"]; if (relationAttribute != null) { string relId = relationAttribute.Value; // First, get the relation between the document and the sheet. PackageRelationship sheetRelation = documentPart.GetRelationship(relId); Uri sheetUri = PackUriHelper.ResolvePartUri(documentUri, sheetRelation.TargetUri); PackagePart sheetPart = xlPackage.GetPart(sheetUri); // Load the contents of the workbook. XmlDocument sheetDoc = new XmlDocument(nt); sheetDoc.Load(sheetPart.GetStream()); XmlNode cellNode = sheetDoc.SelectSingleNode(string.Format("//d:sheetData/d:row/d:c[@r='{0}']", addressName), nsManager); if (cellNode != null) { // Retrieve the value. The value may be stored within // this element. If the "t" attribute contains "s", then // the cell contains a shared string, and you must look // up the value individually. XmlAttribute typeAttr = cellNode.Attributes["t"]; string cellType = string.Empty; if (typeAttr != null) { cellType = typeAttr.Value; } XmlNode valueNode = cellNode.SelectSingleNode("d:v", nsManager); if (valueNode != null) { cellValue = valueNode.InnerText; } // Check the cell type. At this point, this code checks only // for Booleans and strings individually. if (cellType == "b") { if (cellValue == "1") { cellValue = "TRUE"; } else { cellValue = "FALSE"; } } else if (cellType == "s") { // Go retrieve the actual string from the associated string file. foreach (System.IO.Packaging.PackageRelationship stringRelationship in documentPart.GetRelationshipsByType(sharedStringsRelationshipType)) { // There should be only one shared string reference, so you'll exit this loop immediately. Uri sharedStringsUri = PackUriHelper.ResolvePartUri(documentUri, stringRelationship.TargetUri); PackagePart stringPart = xlPackage.GetPart(sharedStringsUri); if (stringPart != null) { // Load the contents of the shared strings. XmlDocument stringDoc = new XmlDocument(nt); stringDoc.Load(stringPart.GetStream()); // Add the string schema to the namespace manager: nsManager.AddNamespace("s", sharedStringSchema); int requestedString = Convert.ToInt32(cellValue); string strSearch = string.Format("//s:sst/s:si[{0}]", requestedString + 1); XmlNode stringNode = stringDoc.SelectSingleNode(strSearch, nsManager); if (stringNode != null) { cellValue = stringNode.InnerText; } } } } } } } } } return cellValue; }
</script>
<body> <form id="form1" runat="server"> <div> <strong>Create Budget Report based on Excel Data<br /> </strong> <br /> <table> <tr> <td colspan="2"> <strong>Financial data (Excel source file)</strong></td> </tr> <tr> <td style="width: 100px"> File name:</td> <td style="width: 419px"> <asp:TextBox ID="tbFileName" runat="server" Width="400px">C:\budget.xlsx</asp:TextBox></td> </tr> <tr> <td style="width: 100px"> Sheet name:</td> <td style="width: 419px"> <asp:TextBox ID="tbSheetName" runat="server" Width="400px">Sheet1</asp:TextBox></td> </tr> <tr> <td style="width: 100px"> Customer name(cell number):</td> <td style="width: 419px"> <asp:TextBox ID="tbCustomerCell" runat="server" Width="400px">B1</asp:TextBox></td> </tr> <tr> <td style="width: 100px"> Project name(cell number):</td> <td style="width: 419px"> <asp:TextBox ID="tbProjectCell" runat="server" Width="400px">B2</asp:TextBox></td> </tr> <tr> <td style="width: 100px"> Budget (cell number):</td> <td style="width: 419px"> <asp:TextBox ID="tbBudgetCell" runat="server" Width="400px">B3</asp:TextBox></td> </tr> <tr> <td style="width: 100px; height: 26px"> </td> <td style="width: 419px; height: 26px"> <asp:Button ID="btnCreateDocument" runat="server" OnClick="btnCreateDocument_Click" Text="Create Document" /></td> </tr> </table> </div> </form> </body> </html>
For more information about working with ASP.NET 2.0, read the ASP.NET QuickStart Tutorials.