Refactoring using a Pure Function
It would be useful to refactor this example to clean up the code that determines the style of each paragraph. We can write a function with no side effects that returns the style name:
public static string GetParagraphStyle(XElement para)
{
    return (string)para.Elements(w + "pPr")
                       .Elements(w + "pStyle")
                       .Attributes(w + "val")
                       .FirstOrDefault();
}
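To see how the function behaves on its own, here is a minimal sketch that calls it on two hand-built paragraphs rather than on a document loaded from a package. The StyleDemo class and the sample elements are assumptions made for illustration; they are not part of the original example.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class StyleDemo
{
    static readonly XNamespace w =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Same logic as GetParagraphStyle: it reads the paragraph and changes nothing.
    public static string GetParagraphStyle(XElement para)
    {
        return (string)para.Elements(w + "pPr")
                           .Elements(w + "pStyle")
                           .Attributes(w + "val")
                           .FirstOrDefault();
    }

    public static void Main()
    {
        // Hypothetical paragraph with an explicit style.
        XElement styled =
            new XElement(w + "p",
                new XElement(w + "pPr",
                    new XElement(w + "pStyle",
                        new XAttribute(w + "val", "Heading1"))));

        // Hypothetical paragraph with no w:pPr element at all.
        XElement plain = new XElement(w + "p");

        Console.WriteLine(GetParagraphStyle(styled));         // Heading1
        Console.WriteLine(GetParagraphStyle(plain) == null);  // True
    }
}
```

When the paragraph has no style, FirstOrDefault yields null and the cast from a null XAttribute to string also yields null, so the function never throws for unstyled paragraphs.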
Now, the query is as follows:
var paragraphs =
    mainPartDoc.Root
        .Element(w + "body")
        .Descendants(w + "p")
        .Select(p =>
            new
            {
                ParagraphNode = p,
                Style = GetParagraphStyle(p)
            }
        );
The version that uses a query expression can be rewritten in the same way:
var paragraphs =
    from p in mainPartDoc.Root.Element(w + "body").Descendants(w + "p")
    let style = GetParagraphStyle(p)
    select new
    {
        ParagraphNode = p,
        Style = style,
    };
The entire example follows. The code is attached to this page.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using System.Xml;
using System.Xml.Linq;
using DocumentFormat.OpenXml.Packaging;

public static class LocalExtensions
{
    public static string GetPath(this XElement el)
    {
        return
            el
            .AncestorsAndSelf()
            .Aggregate("", (seed, i) => i.Name.LocalName + "/" + seed);
    }
}

class Program
{
    readonly static XNamespace w =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    public static XDocument LoadXDocument(OpenXmlPart part)
    {
        XDocument xdoc;
        using (StreamReader streamReader = new StreamReader(part.GetStream()))
            xdoc = XDocument.Load(XmlReader.Create(streamReader));
        return xdoc;
    }

    public static string GetParagraphStyle(XElement para)
    {
        return (string)para.Elements(w + "pPr")
                           .Elements(w + "pStyle")
                           .Attributes(w + "val")
                           .FirstOrDefault();
    }

    static void Main(string[] args)
    {
        const string filename = "SampleDoc.docx";

        using (WordprocessingDocument wordDoc =
            WordprocessingDocument.Open(filename, true))
        {
            MainDocumentPart mainPart = wordDoc.MainDocumentPart;
            XDocument mainPartDoc = LoadXDocument(mainPart);

            var paragraphs =
                mainPartDoc.Root
                    .Element(w + "body")
                    .Descendants(w + "p")
                    .Select(p =>
                        new
                        {
                            ParagraphNode = p,
                            Style = GetParagraphStyle(p)
                        }
                    );

            foreach (var p in paragraphs)
                Console.WriteLine("{0} {1}",
                    p.Style != null ?
                        p.Style.PadRight(12) :
                        "".PadRight(12),
                    p.ParagraphNode.GetPath());
        }
    }
}
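The query is now easier to read, because the details of locating the style live in GetParagraphStyle rather than in the projection.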
Because we wrote the GetParagraphStyle function without side effects, we were free to use it without worrying about how it would impact the execution of our query.
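One reason this matters is that LINQ queries are evaluated lazily: the projection runs each time the query is enumerated, not when the query is defined. The sketch below is only an illustration of that point. PurityDemo, Describe, and DescribeAndCount are hypothetical names, and the impure variant exists purely to show what a side effect would do.

```csharp
using System;
using System.Linq;

public static class PurityDemo
{
    static int callCount = 0;

    // Pure: the result depends only on the argument.
    public static string Describe(int n)
    {
        return n % 2 == 0 ? "even" : "odd";
    }

    // Impure (hypothetical): also mutates shared state on every call.
    public static string DescribeAndCount(int n)
    {
        callCount++;
        return Describe(n);
    }

    public static void Main()
    {
        int[] numbers = { 1, 2, 3 };

        // Defining the query does not call the projection yet.
        var query = numbers.Select(n => DescribeAndCount(n));
        Console.WriteLine(callCount);   // 0

        // Each enumeration re-runs the projection for every element.
        query.ToList();
        query.ToList();
        Console.WriteLine(callCount);   // 6
    }
}
```

With a pure function such as Describe, or GetParagraphStyle in the example above, it makes no difference when or how often the query is enumerated; the results are the same every time.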