Working with In-Memory Open XML Documents

Sometimes you want to work with Open XML documents in memory.  There are two scenarios that I know of:

  • When working with document libraries in SharePoint, you retrieve a document from the document library as a byte array.  You can then modify it as necessary and put it back into the document library, either as a new document or replacing the original.  This post shows how to do this.

  • In a web application, you may want to fabricate Open XML documents on the fly and serve them up to remote users.  You don’t want to serialize such temporary documents to the file system.  After creating them, you want to send them directly to the end user of the web application.

This blog post presents a bit of code that shows how to work with in-memory documents as a MemoryStream.  The code works with either Open XML SDK V1 or CTP1 of the Open XML SDK V2.

There is one important point to make about using the Open XML SDK with MemoryStream objects.  There is a MemoryStream constructor that takes a byte array as an argument.  However, we can’t use that constructor because it creates a non-resizable instance of the MemoryStream class, and the Open XML SDK needs a resizable memory stream, as parts may change in size when serialized back into the Open XML package.  Instead, we use the constructor that takes no parameters.  This creates a resizable MemoryStream.  We can then write the byte array to the MemoryStream, and then open the Open XML package from the MemoryStream (using the WordprocessingDocument class in this example).
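The difference between the two constructors is easy to check without the Open XML SDK at all.  Here is a minimal sketch; the class name MemoryStreamGrowth and the method CanGrow are made up for this illustration and are not part of any library.

```csharp
using System;
using System.IO;

public static class MemoryStreamGrowth
{
    // Hypothetical helper for illustration: tries to append one byte past the
    // current end of the stream and reports whether the stream could grow.
    public static bool CanGrow(MemoryStream stream)
    {
        stream.Seek(0, SeekOrigin.End);
        try
        {
            stream.WriteByte(0);
            return true;
        }
        catch (NotSupportedException)
        {
            // Thrown when the stream wraps a fixed-size byte array.
            return false;
        }
    }
}
```

Calling CanGrow(new MemoryStream(byteArray)) should return false, because that stream is pinned to the size of the array.  A stream created with new MemoryStream() and then filled with Write should return true, which is the behavior the Open XML SDK relies on.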

After opening the WordprocessingDocument, we can work with the document as normal using the Open XML SDK.  Disposing the WordprocessingDocument writes its changes back to the underlying stream, so after leaving the scope of the ‘using’ statement that opens the document, the memory stream will contain the new, modified document.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using System.Xml;
using System.Xml.Linq;
using DocumentFormat.OpenXml.Packaging;

public static class LocalExtensions
{
    public static XDocument GetXDocument(this OpenXmlPart part)
    {
        XDocument xdoc = part.Annotation<XDocument>();
        if (xdoc != null)
            return xdoc;
        using (StreamReader sr = new StreamReader(part.GetStream()))
        using (XmlReader xr = XmlReader.Create(sr))
            xdoc = XDocument.Load(xr);
        part.AddAnnotation(xdoc);
        return xdoc;
    }

    public static void PutXDocument(this OpenXmlPart part)
    {
        XDocument xdoc = part.GetXDocument();
        if (xdoc != null)
        {
            // Serialize the XDocument object back to the package.
            using (XmlWriter xw = XmlWriter.Create(
                part.GetStream(FileMode.Create, FileAccess.Write)))
            {
                xdoc.Save(xw);
            }
        }
    }

    public static string StringConcatenate(
        this IEnumerable<string> source)
    {
        return source.Aggregate(
            new StringBuilder(),
            (s, i) => s.Append(i),
            s => s.ToString());
    }
}

class Program
{
    static void Main(string[] args)
    {
        byte[] byteArray = File.ReadAllBytes("Test.docx");
        using (MemoryStream mem = new MemoryStream())
        {
            mem.Write(byteArray, 0, byteArray.Length);
            using (WordprocessingDocument wordDoc =
                WordprocessingDocument.Open(mem, true))
            {
                // The WordprocessingML namespace URI uses http, not https.
                XNamespace w =
                    "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

                // modify the document as necessary
                // for this example, we'll convert the first paragraph to upper case
                XDocument doc = wordDoc.MainDocumentPart.GetXDocument();
                XElement firstParagraph = doc
                    .Element(w + "document")
                    .Element(w + "body")
                    .Element(w + "p");
                if (firstParagraph != null)
                {
                    // Selecting w:t alone also picks up text inside tracked
                    // insertions (w:ins) without counting it twice.
                    string text = firstParagraph
                        .Descendants(w + "t")
                        .Select(n => (string)n)
                        .StringConcatenate();
                    firstParagraph.ReplaceWith(
                        new XElement(w + "p",
                            new XElement(w + "r",
                                new XElement(w + "t", text.ToUpper()))));
                    // write the XDocument back into the Open XML document
                    wordDoc.MainDocumentPart.PutXDocument();
                }
            }
            // at this point, the MemoryStream contains the modified document.
            // We could write it back to a SharePoint document library or serve
            // it from a web server.

            // in this example, we'll serialize back to the file system to verify
            // that the code worked properly.  FileMode.Create overwrites the
            // output of a previous run instead of throwing.
            using (FileStream fileStream = new FileStream("Test2.docx",
                FileMode.Create))
            {
                mem.WriteTo(fileStream);
            }
        }
    }
}
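
For the SharePoint and web scenarios you usually want the modified bytes rather than a file on disk.  One way to package the pattern above is a small helper that takes the original byte array, lets the caller edit the stream, and returns the result.  This is only a sketch: InMemoryEdit and Apply are hypothetical names introduced here, not part of the Open XML SDK, and the helper knows nothing about the document format.

```csharp
using System;
using System.IO;

public static class InMemoryEdit
{
    // Hypothetical helper (not part of the Open XML SDK): copies the bytes into
    // a resizable MemoryStream, hands the stream to the caller for editing,
    // and returns the stream's full contents afterwards.
    public static byte[] Apply(byte[] original, Action<MemoryStream> edit)
    {
        using (MemoryStream mem = new MemoryStream())
        {
            mem.Write(original, 0, original.Length);
            mem.Position = 0;
            edit(mem);
            // ToArray returns the whole buffer contents regardless of Position.
            return mem.ToArray();
        }
    }
}
```

The edit callback would open the package with WordprocessingDocument.Open(mem, true) inside its own ‘using’ statement, exactly as in Main above, so that the document is disposed before Apply calls ToArray.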

Code is attached.

Program.cs