Using the Open XML SDK
Open XML Packages
To follow this tutorial, you don't need to delve into all of the details of working with packages. This topic presents a small chunk of code that you can use as boilerplate: it opens a Word document and retrieves the main document part, the style part, and the comments part. It uses LINQ to XML to count the XML nodes in the three parts, and prints the counts to the console.
The boilerplate code uses the Open XML SDK, a set of managed classes for .NET that provides more convenient access to Open XML documents. Using the SDK, you can get the main part of the document and navigate to related parts more easily, which cuts down your code by quite a bit. A separate blog post summarizes the differences between the classes in System.IO.Packaging and the classes in the Open XML SDK. This example uses the Open XML SDK v1.0. Another blog post gives lots of information about the Open XML SDK, including where to download it.
Before attempting to compile, don't forget to:
· Add a reference to the WindowsBase assembly.
· Download and install the Open XML SDK.
· Add a reference to the DocumentFormat.OpenXml assembly.
For the interested:
Just a few points about packages. Various parts in the package are related. You never rely on absolute paths to retrieve a part, even if you know the path. Instead, you start from the main part, and use relationships to navigate to the other parts. As mentioned, many of these parts are XML documents, including files that specify the relationships between parts. You can access the parts and the relationship files using any conformant XML parser and a library that can open and read from ZIP files. However, the classes in the namespace System.IO.Packaging (in the WindowsBase assembly) allow you to work with packages in a more convenient way. You can see a quick summary of how to use relationships to navigate from part to part here.
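For comparison, here is a minimal sketch of that relationship-based navigation using only System.IO.Packaging, without the Open XML SDK. It is not code from this tutorial. The class name, the helper names, and the choice to take the first relationship of each type are assumptions made for illustration, and the relationship type URIs are the ones used by transitional Open XML documents (strict documents use different URIs).

```csharp
using System;
using System.IO;
using System.IO.Packaging;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

public static class PackagingExample
{
    // Relationship types used by transitional Open XML documents.
    public const string OfficeDocumentRelType =
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument";
    public const string StylesRelType =
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships/styles";

    // Hypothetical helper: finds the main part through the package-level relationship.
    public static PackagePart GetMainPart(Package package)
    {
        PackageRelationship rel =
            package.GetRelationshipsByType(OfficeDocumentRelType).FirstOrDefault();
        if (rel == null)
            return null;
        // Package-level targets are resolved against the package root.
        Uri partUri = PackUriHelper.ResolvePartUri(new Uri("/", UriKind.Relative), rel.TargetUri);
        return package.GetPart(partUri);
    }

    // Hypothetical helper: follows the first relationship of the given type from a part.
    public static PackagePart GetRelatedPart(Package package, PackagePart source, string relType)
    {
        PackageRelationship rel = source.GetRelationshipsByType(relType).FirstOrDefault();
        if (rel == null)
            return null;
        // Part-level targets are resolved relative to the source part's URI.
        Uri partUri = PackUriHelper.ResolvePartUri(source.Uri, rel.TargetUri);
        return package.GetPart(partUri);
    }

    // Loads the XML content of a package part into a LINQ to XML document.
    public static XDocument LoadXDocument(PackagePart part)
    {
        using (Stream stream = part.GetStream(FileMode.Open, FileAccess.Read))
        using (XmlReader reader = XmlReader.Create(stream))
            return XDocument.Load(reader);
    }

    public static void Main(string[] args)
    {
        string filename = args.Length > 0 ? args[0] : "SampleDoc.docx";
        using (Package package = Package.Open(filename, FileMode.Open, FileAccess.Read))
        {
            PackagePart mainPart = GetMainPart(package);
            if (mainPart == null)
            {
                Console.WriteLine("No main document part found.");
                return;
            }
            Console.WriteLine("Main part: {0}", mainPart.Uri);

            PackagePart stylePart = GetRelatedPart(package, mainPart, StylesRelType);
            if (stylePart != null)
                Console.WriteLine("Style part: {0} ({1} nodes)", stylePart.Uri,
                    LoadXDocument(stylePart).DescendantNodes().Count());
        }
    }
}
```

Note that the code never hard-codes a path such as /word/styles.xml. It asks the package, and then the main part, for a relationship of a known type and resolves the target from there.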
The following code is attached to this page. Here is the boilerplate code, which uses the Open XML SDK:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using System.Xml;
using System.Xml.Linq;
using DocumentFormat.OpenXml.Packaging;

class Program
{
    // Loads the XML content of a part into a LINQ to XML document.
    public static XDocument LoadXDocument(OpenXmlPart part)
    {
        XDocument xdoc;
        using (StreamReader streamReader = new StreamReader(part.GetStream()))
            xdoc = XDocument.Load(XmlReader.Create(streamReader));
        return xdoc;
    }

    static void Main(string[] args)
    {
        const string filename = "SampleDoc.docx";

        // The second argument (true) opens the document for editing; pass false for read-only access.
        using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(filename, true))
        {
            // Start from the main part and navigate to the related parts.
            // The style and comments parts are null if the document does not contain them.
            MainDocumentPart mainPart = wordDoc.MainDocumentPart;
            StyleDefinitionsPart styleDefinitionsPart = mainPart.StyleDefinitionsPart;
            WordprocessingCommentsPart commentsPart = mainPart.CommentsPart;

            XDocument xDoc = LoadXDocument(mainPart);
            XDocument styleDoc = LoadXDocument(styleDefinitionsPart);
            XDocument commentsDoc = LoadXDocument(commentsPart);

            // DescendantNodes() includes elements, text nodes, and comments.
            Console.WriteLine("The main document part has {0} nodes.", xDoc.DescendantNodes().Count());
            Console.WriteLine("The style part has {0} nodes.", styleDoc.DescendantNodes().Count());
            Console.WriteLine("The comments part has {0} nodes.", commentsDoc.DescendantNodes().Count());
        }
    }
}
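The listing above assumes that SampleDoc.docx contains both a style part and a comments part. A document with no comments has no comments part, in which case the part property returns null and LoadXDocument fails. The following is a minimal sketch of one way to guard against that. The class and helper names (OptionalPartExample, LoadXDocumentOrNull, CountNodes) are hypothetical and not part of the SDK, and the part property names follow the listing in this post (SDK v1.0), so they may differ in other SDK versions.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Linq;
using DocumentFormat.OpenXml.Packaging;

public static class OptionalPartExample
{
    // Hypothetical helper: returns the part's XML, or null when the part is absent.
    public static XDocument LoadXDocumentOrNull(OpenXmlPart part)
    {
        if (part == null)
            return null;
        using (Stream stream = part.GetStream())
        using (XmlReader reader = XmlReader.Create(stream))
            return XDocument.Load(reader);
    }

    // Hypothetical helper: counts descendant nodes, treating a missing document as zero.
    public static int CountNodes(XDocument doc)
    {
        return doc == null ? 0 : doc.DescendantNodes().Count();
    }

    public static void Main(string[] args)
    {
        string filename = args.Length > 0 ? args[0] : "SampleDoc.docx";

        // false opens the document read-only, which is enough for counting nodes.
        using (WordprocessingDocument wordDoc = WordprocessingDocument.Open(filename, false))
        {
            MainDocumentPart mainPart = wordDoc.MainDocumentPart;

            // Property names follow the listing in this post (SDK v1.0);
            // other SDK versions may name them differently.
            XDocument xDoc = LoadXDocumentOrNull(mainPart);
            XDocument styleDoc = LoadXDocumentOrNull(mainPart.StyleDefinitionsPart);
            XDocument commentsDoc = LoadXDocumentOrNull(mainPart.CommentsPart);

            Console.WriteLine("The main document part has {0} nodes.", CountNodes(xDoc));
            Console.WriteLine("The style part has {0} nodes.", CountNodes(styleDoc));
            Console.WriteLine("The comments part has {0} nodes.", CountNodes(commentsDoc));
        }
    }
}
```

Reporting zero nodes for a missing part is a design choice for this sketch. Depending on what your program does next, printing a message that the part is absent may be more appropriate.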
Comments
Anonymous
May 21, 2008
I had to change CommentsPart to WordprocessingCommentsPart, else it would not find CommentsPart in Apr08 version of the SDK.
Anonymous
May 21, 2008
You're right. This type's name changed with the Apr08 CTP. I've updated the code in the post. Thanks.
Anonymous
July 10, 2008
After adding a reference to the OpenXML SDK, I had to change the using statement to: using DocumentFormat.OpenXml.Packaging; Also, the code in this section has not yet been updated to WordProcessingCommentsPart. Am I studying a copy that has been superseded?
Anonymous
August 26, 2008
So all that the SDK does is help with handling the file packaging. You can use it to do pretty much nothing in changing the contents of OpenXML files.
Anonymous
August 27, 2008
Yes, that's all that this version of the SDK does. The next version will do more.
See this post for an approach to using the current SDK: http://blogs.msdn.com/ericwhite/archive/2008/07/09/open-xml-sdk-and-linq-to-xml.aspx
This post summarizes the differences: http://blogs.msdn.com/ericwhite/archive/2007/12/20/what-is-the-difference-between-the-system-io-packaging-and-microsoft-office-documentformat-openxml-packaging-namespaces.aspx
-Eric