The Final Results - VB
We need one more query to retrieve the comments:
Dim defaultStyle As String = _
    CStr( _
        ( _
            From style In styleDoc.Root _
                .Elements(w + "style") _
            Where ( _
                CStr(style.Attribute(w + "type")) = "paragraph" And _
                CStr(style.Attribute(w + "default")) = "1") _
        ) _
        .First() _
        .Attribute(w + "styleId") _
    )
Dim paragraphs = _
    mainPartDoc.Root _
        .Element(w + "body") _
        .Descendants(w + "p") _
        .Select(Function(p) _
            New With { _
                .ParagraphNode = p, _
                .Style = GetParagraphStyle(p, defaultStyle) _
            } _
        )
Dim r As XName = w + "r"
Dim ins As XName = w + "ins"
Dim paragraphsWithText = _
    paragraphs.Select(Function(p) _
        New With { _
            .ParagraphNode = p.ParagraphNode, _
            .Style = p.Style, _
            .Text = p.ParagraphNode _
                .Elements() _
                .Where(Function(z) z.Name = r Or z.Name = ins) _
                .Descendants(w + "t") _
                .StringConcatenate(Function(s) CStr(s)) _
        } _
    )
Dim groupedCodeParagraphs = paragraphsWithText _
    .GroupAdjacent(Function(p) p.Style) _
    .Where(Function(g) g.Key = "Code")
Dim groupedCodeWithComments = _
    groupedCodeParagraphs.Select(Function(g) New With { _
        .ParagraphGroup = g, _
        .Comment = GetComment(commentsDoc, g.First().ParagraphNode) _
    } _
    )
And here is the GetComment function that we call from this query:
Public Function GetComment(ByVal commentsDoc As XDocument, ByVal p As XElement) As String
    Dim w As XNamespace = _
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
    ' Find the id of the first comment that starts in this paragraph.
    Dim id = _
        CStr(p.Elements(w + "commentRangeStart") _
            .First() _
            .Attribute(w + "id"))
    ' Locate the comment with that id in the comments part.
    Dim commentNode = commentsDoc.Root() _
        .Elements(w + "comment") _
        .Where(Function(c) CStr(c.Attribute(w + "id")) = id) _
        .First()
    ' Assemble the comment text, one line per paragraph.
    Dim comment = commentNode _
        .Elements(w + "p") _
        .StringConcatenate(Function(node) node _
            .Descendants(w + "t") _
            .Select(Function(t) CStr(t)) _
            .StringConcatenate() & vbLf)
    Return comment
End Function
In the C# version of this query, I used a statement lambda expression. Visual Basic 9.0 doesn’t have statement lambda expressions – no big deal – just call out to another function from the lambda in the Select call.
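To illustrate the workaround in isolation, here is a minimal sketch. The Describe helper and the sample numbers are hypothetical and are not part of the tutorial's code; the point is only that the multi-statement logic lives in a named function, and the lambda body is a single expression that calls it.

```vb
Imports System
Imports System.Linq

Module LambdaSketch
    ' Hypothetical helper: holds the multi-statement logic that an
    ' expression lambda cannot contain in Visual Basic 9.0.
    Function Describe(ByVal n As Integer) As String
        If n Mod 2 = 0 Then
            Return n.ToString() & " is even"
        End If
        Return n.ToString() & " is odd"
    End Function

    Sub Main()
        Dim numbers() As Integer = {1, 2, 3}
        ' The lambda body is a single expression: a call to the helper.
        Dim descriptions = numbers.Select(Function(n) Describe(n))
        For Each d In descriptions
            Console.WriteLine(d)
        Next
    End Sub
End Module
```

GetComment plays the same role as Describe in the query above: the Select lambda simply calls it.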
The first query in the GetComment function takes the first paragraph node and uses the Elements method to find its child elements named commentRangeStart:
Dim id = _
    CStr(p.Elements(w + "commentRangeStart") _
        .First() _
        .Attribute(w + "id"))
This might seem like it is doing too much work, but because we use the First extension method, the query short-circuits as soon as the first commentRangeStart element is found. For all practical purposes, this query just follows a few links in a linked list. It does a fair amount of other work to set up the query, but that work is the same every time, and is not too bad.
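If you want to see the short circuit for yourself, the following sketch counts how many child elements the query actually touches. Everything here is an assumption made for the demonstration: the Visit helper and visited counter are hypothetical, and the paragraph is a toy element with no namespace rather than real WordprocessingML.

```vb
Imports System
Imports System.Linq
Imports System.Xml.Linq

Module FirstSketch
    ' Hypothetical counter: how many elements the query pulled.
    Private visited As Integer = 0

    ' Hypothetical pass-through that records each element it sees.
    Function Visit(ByVal e As XElement) As XElement
        visited += 1
        Return e
    End Function

    Sub Main()
        ' Toy paragraph: the first child is the element we are looking for.
        Dim p As XElement = _
            <p>
                <commentRangeStart id="0"/>
                <r/>
                <r/>
                <commentRangeStart id="1"/>
            </p>
        Dim firstStart = p.Elements() _
            .Select(Function(e) Visit(e)) _
            .Where(Function(e) e.Name.LocalName = "commentRangeStart") _
            .First()
        Console.WriteLine(CStr(firstStart.Attribute("id")))
        ' Because the query is lazy, First stops pulling elements after
        ' the first match, so the later children are never visited.
        Console.WriteLine(visited)
    End Sub
End Module
```

Since the match is the first child, the counter should stay at 1 even though the paragraph has four children.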
Once we have the id of the comment, we can write the query to find the comment node in the comments part. This is a straightforward query; you have seen everything in this query before.
Dim commentNode = commentsDoc.Root() _
    .Elements(w + "comment") _
    .Where(Function(c) CStr(c.Attribute(w + "id")) = id) _
    .First()
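Here is a small, self-contained sketch of that lookup that you can run without a document. The comments part is a hypothetical, heavily trimmed stand-in built in code; a real comments part carries more attributes and markup on each comment.

```vb
Imports System
Imports System.Linq
Imports System.Xml.Linq

Module CommentLookupSketch
    Sub Main()
        Dim w As XNamespace = _
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
        ' Hypothetical, trimmed comments part with two comments.
        Dim commentsDoc As New XDocument( _
            New XElement(w + "comments", _
                New XElement(w + "comment", New XAttribute(w + "id", "0"), _
                    New XElement(w + "p", New XElement(w + "r", _
                        New XElement(w + "t", "first comment")))), _
                New XElement(w + "comment", New XAttribute(w + "id", "1"), _
                    New XElement(w + "p", New XElement(w + "r", _
                        New XElement(w + "t", "second comment"))))))
        ' Stand-in for the id read from commentRangeStart.
        Dim id As String = "1"
        Dim commentNode = commentsDoc.Root _
            .Elements(w + "comment") _
            .Where(Function(c) CStr(c.Attribute(w + "id")) = id) _
            .First()
        ' Value concatenates all descendant text of the matched comment.
        Console.WriteLine(commentNode.Value)
    End Sub
End Module
```

Note that First throws if no comment has the requested id. That is fine for this exercise, where every code block is expected to carry a comment.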
Finally, once we have the comment node, the following query assembles the text of the comment. Text in comments is represented in the XML much as it is in the main document part. Under the comment node are paragraph elements (w:p), under the paragraphs are run elements (w:r), and under the run elements are text elements (w:t):
Dim comment = commentNode _
    .Elements(w + "p") _
    .StringConcatenate(Function(node) node _
        .Descendants(w + "t") _
        .Select(Function(t) CStr(t)) _
        .StringConcatenate() & vbLf)
The outer StringConcatenate concatenates the lines, appending a newline to each. The inner StringConcatenate assembles the text of each line of the comment. It's a little more complicated, but this is how I naturally wrote it.
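To make the nesting concrete, here is a sketch that applies the same inner and outer concatenation to plain string arrays. The two StringConcatenate overloads are my assumption about the shape of the helper for the purposes of this sketch, not the tutorial's actual listing, and the sample data is made up.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports System.Runtime.CompilerServices
Imports System.Text
Imports Microsoft.VisualBasic

Module StringConcatenateSketch
    ' Assumed overload: joins the strings with no separator.
    <Extension()> _
    Public Function StringConcatenate( _
            ByVal source As IEnumerable(Of String)) As String
        Dim sb As New StringBuilder()
        For Each s In source
            sb.Append(s)
        Next
        Return sb.ToString()
    End Function

    ' Assumed overload: projects each item to a string, then joins.
    <Extension()> _
    Public Function StringConcatenate(Of T)( _
            ByVal source As IEnumerable(Of T), _
            ByVal projection As Func(Of T, String)) As String
        Dim sb As New StringBuilder()
        For Each item In source
            sb.Append(projection(item))
        Next
        Return sb.ToString()
    End Function

    Sub Main()
        ' Each inner array stands in for the text elements of one paragraph.
        Dim lines()() As String = { _
            New String() {"<Test ", "SnipId=""0002""", "/>"}, _
            New String() {"second ", "line"}}
        ' Inner call assembles one line; outer call appends a line feed to each.
        Dim text = lines.StringConcatenate( _
            Function(runs) runs.StringConcatenate() & vbLf)
        Console.Write(text)
    End Sub
End Module
```

The inner call never sees the line feed; it is added by the lambda passed to the outer call, which is why each paragraph of the comment ends up on its own line.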
When we run this program, we see:
Code Block
==========
using System;
class Program {
public static void Main(string[] args) {
Console.WriteLine("Hello World");
}
}
Meta Data
=========
<Test SnipId="0001">
 <Output SnipId="0002"/>
</Test>
Code Block
==========
Hello World
Meta Data
=========
<Test SnipId="0002"/>
This accomplishes the goal of our exercise.
The complete listing follows in the next topic. It is ~230 lines long, but if we consider the extension methods to be part of a reusable library, and just consider the queries, the query code is only ~100 lines long, and there is a fair amount of white space. It does quite a bit of work for a little example.