Delete comments by all or a specific author in a word processing document

This topic shows how to use the classes in the Open XML SDK for Office to programmatically delete comments by all or a specific author in a word processing document, without having to load the document into Microsoft Word. It contains an example DeleteComments method to illustrate this task.


DeleteComments Method

You can use the DeleteComments method to delete all of the comments from a word processing document, or only those written by a specific author. As shown in the following code, the method accepts two parameters that indicate the name of the document to modify (string) and, optionally, the name of the author whose comments you want to delete (string). If you supply an author name, the code deletes comments written by the specified author. If you do not supply an author name, the code deletes all comments.

    // Delete comments by a specific author. Pass an empty string for the 
    // author to delete all comments, by all authors.
    public static void DeleteComments(string fileName, 
        string author = "")

Calling the DeleteComments Method

To call the DeleteComments method, provide the required parameters as shown in the following code. To delete the comments by every author, omit the second argument or pass an empty string.

    DeleteComments(@"C:\Users\Public\Documents\DeleteComments.docx",
    "David Jones");

How the Code Works

The following code starts by opening the document with the WordprocessingDocument.Open method, passing true as the final argument so that the document is opened for read/write access. Next, the code gets the main document part from the MainDocumentPart property of the word processing document, and from it retrieves a reference to the comments part through the WordprocessingCommentsPart property. If the comments part is missing, there cannot be any comments to delete, so the method returns without doing anything further.

    // Get an existing Wordprocessing document.
    using (WordprocessingDocument document =
      WordprocessingDocument.Open(fileName, true))
    {
        // Set commentPart to the document WordprocessingCommentsPart, 
        // if it exists.
        WordprocessingCommentsPart commentPart =
          document.MainDocumentPart.WordprocessingCommentsPart;

        // If no WordprocessingCommentsPart exists, there can be no 
        // comments. Stop execution and return from the method.
        if (commentPart == null)
        {
            return;
        }
        // Code removed here…
    }
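
The missing-part check described above can be tried without a file on disk. The following is a minimal sketch, assuming a document built in a MemoryStream; HasCommentsPart is a hypothetical helper introduced only for this illustration and is not part of the SDK or of the sample in this topic.

```csharp
using System;
using System.IO;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

bool before;
bool after;

// Build a minimal document in memory (assumption: no file is needed for the demo).
using MemoryStream stream = new MemoryStream();
using (WordprocessingDocument document =
    WordprocessingDocument.Create(stream, WordprocessingDocumentType.Document))
{
    MainDocumentPart mainPart = document.AddMainDocumentPart();
    mainPart.Document = new Document(new Body(new Paragraph()));

    // A newly created main document part has no comments part yet.
    before = HasCommentsPart(document);

    // After a comments part is added, the same check succeeds.
    WordprocessingCommentsPart commentsPart =
        mainPart.AddNewPart<WordprocessingCommentsPart>();
    commentsPart.Comments = new Comments();
    after = HasCommentsPart(document);
}

Console.WriteLine($"Before: {before}, after: {after}");

// Hypothetical helper: true only when both the main document part
// and its comments part exist.
static bool HasCommentsPart(WordprocessingDocument document) =>
    document.MainDocumentPart?.WordprocessingCommentsPart is not null;
```

The null-conditional operator also covers a package that has no main document part at all, a case the snippet above does not guard against.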

Creating the List of Comments

The code next performs two tasks: creating a list of all the comments to delete, and creating a list of comment IDs that correspond to the comments to delete. Given these lists, the code can both delete the comments from the comments part that contains them, and delete the references to the comments from the document part.

The following code starts by retrieving a list of Comment elements. To retrieve the list, it calls the Elements method on the Comments property of the commentPart variable and converts the result into a list of Comment objects.

    List<Comment> commentsToDelete =
        commentPart.Comments.Elements<Comment>().ToList();

So far, the list of comments contains all of the comments. If the author parameter is not an empty string, the following code limits the list to only those comments where the Author property matches the parameter you supplied.

    if (!String.IsNullOrEmpty(author))
    {
        commentsToDelete = commentsToDelete.
        Where(c => c.Author == author).ToList();
    }

Before deleting any comments, the code retrieves a list of comment ID values, so that it can later delete matching elements from the document part. The call to the Select method projects the list of comments into an IEnumerable<T> of strings that contains all the comment ID values.

    IEnumerable<string> commentIds = 
        commentsToDelete.Select(r => r.Id.Value);
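
The filtering and projection steps above can be exercised on an in-memory Comments element. This is only a sketch: the comment IDs and the second author name are made up, and collecting the IDs into a HashSet<string> is a variation on the IEnumerable<string> used in this topic, chosen so that the IDs are evaluated once and comments without an Id attribute are skipped.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml.Wordprocessing;

// In-memory stand-in for commentPart.Comments (IDs and authors are illustrative).
Comments comments = new Comments(
    new Comment { Id = "0", Author = "David Jones" },
    new Comment { Id = "1", Author = "Another Author" },
    new Comment { Id = "2", Author = "David Jones" });

string author = "David Jones";

// Keep every comment when no author is given, otherwise only that author's.
List<Comment> commentsToDelete = comments.Elements<Comment>()
    .Where(c => string.IsNullOrEmpty(author) || c.Author?.Value == author)
    .ToList();

// Materialize the IDs once, dropping comments that have no Id attribute.
HashSet<string> commentIds = commentsToDelete
    .Select(c => c.Id?.Value)
    .OfType<string>()
    .ToHashSet();

Console.WriteLine(string.Join(", ", commentIds.OrderBy(id => id))); // prints "0, 2"
```

A set also makes each later Contains lookup a hash lookup instead of a scan over the whole sequence, which can matter in documents with many comments.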

Deleting Comments and Saving the Part

Given the commentsToDelete collection, the following code loops through all the comments that require deleting and performs the deletion. The code then saves the comments part.

    // Delete each comment in commentsToDelete from the 
    // Comments collection.
    foreach (Comment c in commentsToDelete)
    {
        c.Remove();
    }

    // Save the comment part changes.
    commentPart.Comments.Save();

Deleting Comment References in the Document

Although the code has successfully removed all the comments by this point, that is not enough. The code must also remove references to the comments from the document part. This action requires three steps because the comment reference includes the CommentRangeStart, CommentRangeEnd, and CommentReference elements, and the code must remove all three for each comment. Before performing any deletions, the code first retrieves a reference to the root element of the main document part, as shown in the following code.

    Document doc = document.MainDocumentPart.Document;

Given a reference to the document element, the following code performs its deletion loop three times, once for each of the different elements it must delete. In each case, the code looks for all descendants of the correct type (CommentRangeStart, CommentRangeEnd, or CommentReference) and limits the list to those whose Id property value is contained in the list of comment IDs to be deleted. Given the list of elements to be deleted, the code removes each element in turn. Finally, the code completes by saving the document.

    // Delete CommentRangeStart for each
    // deleted comment in the main document.
    List<CommentRangeStart> commentRangeStartToDelete =
        doc.Descendants<CommentRangeStart>().
        Where(c => commentIds.Contains(c.Id.Value)).ToList();
    foreach (CommentRangeStart c in commentRangeStartToDelete)
    {
        c.Remove();
    }

    // Delete CommentRangeEnd for each deleted comment in the main document.
    List<CommentRangeEnd> commentRangeEndToDelete =
        doc.Descendants<CommentRangeEnd>().
        Where(c => commentIds.Contains(c.Id.Value)).ToList();
    foreach (CommentRangeEnd c in commentRangeEndToDelete)
    {
        c.Remove();
    }

    // Delete CommentReference for each deleted comment in the main document.
    List<CommentReference> commentRangeReferenceToDelete =
        doc.Descendants<CommentReference>().
        Where(c => commentIds.Contains(c.Id.Value)).ToList();
    foreach (CommentReference c in commentRangeReferenceToDelete)
    {
        c.Remove();
    }

    // Save changes back to the MainDocumentPart part.
    doc.Save();
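
The three loops above differ only in the element type, so they can be folded into one generic helper. The sketch below is an illustration rather than part of the sample in this topic: RemoveById is a hypothetical helper, the paragraph content is made up, and the tree is built in memory so that no file is needed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Wordprocessing;

// In-memory paragraph with one commented range whose comment ID is "0".
Body body = new Body(
    new Paragraph(
        new CommentRangeStart { Id = "0" },
        new Run(new Text("Commented text")),
        new CommentRangeEnd { Id = "0" },
        new Run(new CommentReference { Id = "0" })));

HashSet<string> commentIds = new HashSet<string> { "0" };

// One call per element type; the lambda tells the helper where the Id lives.
int removed =
    RemoveById<CommentRangeStart>(body, commentIds, e => e.Id) +
    RemoveById<CommentRangeEnd>(body, commentIds, e => e.Id) +
    RemoveById<CommentReference>(body, commentIds, e => e.Id);

Console.WriteLine(removed); // prints 3

// Hypothetical helper: removes every descendant of type T whose Id is in ids
// and returns the number of elements removed.
static int RemoveById<T>(OpenXmlElement root, ISet<string> ids,
    Func<T, StringValue?> getId) where T : OpenXmlElement
{
    // Snapshot the matches first, so the tree is not modified
    // while Descendants<T>() is still being enumerated.
    List<T> matches = root.Descendants<T>()
        .Where(e => getId(e)?.Value is string id && ids.Contains(id))
        .ToList();

    foreach (T element in matches)
    {
        element.Remove();
    }

    return matches.Count;
}
```

As with the code above, the run that held the CommentReference element is left in place, now empty. The selector parameter is used because the helper cannot assume that the three element types share a common base class that exposes Id.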

Sample Code

The following is the complete code sample in C#.

using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;
using System;
using System.Collections.Generic;
using System.Linq;

DeleteComments(args[0], args[1]);

// Delete comments by a specific author. Pass an empty string for the 
// author to delete all comments, by all authors.
static void DeleteComments(string fileName, string author = "")
{
    // Get an existing Wordprocessing document.
    using (WordprocessingDocument document = WordprocessingDocument.Open(fileName, true))
    {

        // A word processing document without a main document part
        // cannot be processed.
        if (document.MainDocumentPart is null)
        {
            throw new InvalidOperationException("The document does not contain a MainDocumentPart.");
        }

        // Set commentPart to the document WordprocessingCommentsPart, 
        // if it exists.
        WordprocessingCommentsPart? commentPart =
            document.MainDocumentPart.WordprocessingCommentsPart;

        // If no WordprocessingCommentsPart exists, there can be no 
        // comments. Stop execution and return from the method.
        if (commentPart is null)
        {
            return;
        }

        // Create a list of comments by the specified author, or
        // if the author name is empty, all authors.
        List<Comment> commentsToDelete =
            commentPart.Comments.Elements<Comment>().ToList();
        if (!String.IsNullOrEmpty(author))
        {
            commentsToDelete = commentsToDelete.
            Where(c => c.Author == author).ToList();
        }
        IEnumerable<string?> commentIds =
            commentsToDelete.Where(r => r.Id is not null && r.Id.HasValue).Select(r => r.Id?.Value);

        // Delete each comment in commentsToDelete from the 
        // Comments collection.
        foreach (Comment c in commentsToDelete)
        {
            c.Remove();
        }

        // Save the comment part change.
        commentPart.Comments.Save();

        Document doc = document.MainDocumentPart.Document;

        // Delete CommentRangeStart for each
        // deleted comment in the main document.
        List<CommentRangeStart> commentRangeStartToDelete =
            doc.Descendants<CommentRangeStart>().
            Where(c => c.Id is not null && c.Id.HasValue && commentIds.Contains(c.Id.Value)).ToList();
        foreach (CommentRangeStart c in commentRangeStartToDelete)
        {
            c.Remove();
        }

        // Delete CommentRangeEnd for each deleted comment in the main document.
        List<CommentRangeEnd> commentRangeEndToDelete =
            doc.Descendants<CommentRangeEnd>().
            Where(c => c.Id is not null && c.Id.HasValue && commentIds.Contains(c.Id.Value)).ToList();
        foreach (CommentRangeEnd c in commentRangeEndToDelete)
        {
            c.Remove();
        }

        // Delete CommentReference for each deleted comment in the main document.
        List<CommentReference> commentRangeReferenceToDelete =
            doc.Descendants<CommentReference>().
            Where(c => c.Id is not null && c.Id.HasValue && commentIds.Contains(c.Id.Value)).ToList();
        foreach (CommentReference c in commentRangeReferenceToDelete)
        {
            c.Remove();
        }

        // Save changes back to the MainDocumentPart part.
        doc.Save();
    }
}

See also