XML SelectNodes and RemoveChild

oserin 21 Reputation points
2021-09-21T15:04:05.64+00:00

Hello, I am fairly new to C# and the .NET environment. I am not looking for a solution so much as trying to understand the following behavior of SelectNodes and RemoveChild.

I have a generated XML file that contains duplicates, and I would like to remove them.
The original XML is:

134318-example.xml

I would like to output this new file:

134367-example-out-correct.xml

For that purpose I created a Windows Forms application that runs on .NET Core 3.1, with the following core program:

XmlDocument doc = new XmlDocument();  
doc.Load(rmvDupe.filePath);  
  
XmlNode root = doc.FirstChild;  
XmlNodeList nodeList = root.SelectNodes("/descendant::*"); //I wish I could use root.SelectNodes("/descendant::UniqueID") but this does not work in C#  
List<string> UniqueIDs = new List<string>() { };  
  
foreach (XmlNode node in nodeList)  
{  
    if (node.Name == "UniqueID")  
    {  
        if (!UniqueIDs.Contains(node.LastChild.Value))  
        {  
            UniqueIDs.Add(node.LastChild.Value);  
        }  
        else  
        {     
            node.ParentNode.ParentNode.RemoveChild(node.ParentNode);  
              
        }  
  
    }  
  
}  
  
doc.Save(outfilepath);  

This code removes only the first duplicate (134324-example-out-incorrect.xml). When I debug it in Visual Studio Pro with a breakpoint at line 26, I can see that nodeList is modified while the loop runs. Stranger still, if I set a breakpoint where nodeList is created (line 5), step over once, and then continue, the program generates the correct result. Why?

Kind Regards

C#

Accepted answer
  1. Timon Yang-MSFT 9,581 Reputation points
    2021-09-23T01:43:33.737+00:00

    I tested it with the file you provided, and it did produce wrong results. The error appeared on line 1042:

       <UniqueID>CA_9445df0f3a544807af6c1527ee67f90b</UniqueID>  
    

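    Using the Cast method alone is not enough, because Cast<XmlNode>() is evaluated lazily and still reads from the original nodeList while the loop is removing nodes. We also need to call ToList after it, which copies the nodes into a separate list before the loop starts.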

              foreach (XmlNode node in nodeList.Cast<XmlNode>().ToList())  
    

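    After adding this, the generated result is the same as the correct sample file. I checked this by comparing the two files line by line and printing the number of the first line that differs: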

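              string[] lines = File.ReadAllLines(@"C:\...\example-out-correct.xml");
              string[] lines1 = File.ReadAllLines(@"C:\...\Desktop\myResult.xml");
              // Compare only the lines both files have, so a shorter file cannot cause an out-of-range index.
              for (int i = 0; i < Math.Min(lines.Length, lines1.Length); i++)
              {
                  if (lines[i] != lines1[i])
                  {
                      // Print the 1-based number of the first differing line.
                      Console.WriteLine(i + 1);
                      break;
                  }
              }
              Console.ReadKey();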
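
A minimal, self-contained sketch of the fix applied to a loop like the one in the question. The sample XML and the `Item` element name are made up for illustration (the structure of the real file is not shown in this thread), and a `HashSet<string>` stands in for the `List<string>` used in the question. My understanding is that the list returned by SelectNodes is filled lazily as it is enumerated, which would also explain why inspecting nodeList in the debugger before the loop changes the result, but treat that as an assumption rather than documented behavior.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml;

class Program
{
    static void Main()
    {
        // Hypothetical sample data: only the UniqueID element name comes from the question.
        var doc = new XmlDocument();
        doc.LoadXml(
            "<Items>" +
            "<Item><UniqueID>A</UniqueID></Item>" +
            "<Item><UniqueID>B</UniqueID></Item>" +
            "<Item><UniqueID>A</UniqueID></Item>" +
            "<Item><UniqueID>B</UniqueID></Item>" +
            "<Item><UniqueID>A</UniqueID></Item>" +
            "</Items>");

        XmlNodeList nodeList = doc.SelectNodes("//UniqueID");
        var seen = new HashSet<string>();

        // ToList() copies the nodes up front, so removing elements below
        // cannot affect the sequence being iterated.
        foreach (XmlNode node in nodeList.Cast<XmlNode>().ToList())
        {
            // HashSet<T>.Add returns false when the value is already present.
            if (!seen.Add(node.InnerText))
            {
                XmlNode item = node.ParentNode;
                item.ParentNode.RemoveChild(item);
            }
        }

        // Number of Item elements left after removing the duplicates.
        Console.WriteLine(doc.SelectNodes("//Item").Count); // prints 2
    }
}
```

With the snapshot in place, all three duplicate `Item` elements are removed and only the first occurrence of each ID remains.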
    


1 additional answer

  1. Bruce (SqlWork.com) 61,181 Reputation points
    2021-09-21T15:36:55.633+00:00

    You should never add or delete objects in a collection while you are iterating over it. Copy the collection first and iterate over the copy:

    foreach (XmlNode node in nodeList.Cast<XmlNode>().ToList())
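
The same rule can also be followed without copying the whole node list: collect the elements to delete during a read-only pass, then remove them once the loop has finished. This is only a sketch of that variant, not code from the thread. The sample XML, the `Item` element name, and the `toRemove` list are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

class Program
{
    static void Main()
    {
        // Hypothetical sample data: only the UniqueID element name comes from the question.
        var doc = new XmlDocument();
        doc.LoadXml(
            "<Items>" +
            "<Item><UniqueID>A</UniqueID></Item>" +
            "<Item><UniqueID>A</UniqueID></Item>" +
            "<Item><UniqueID>B</UniqueID></Item>" +
            "<Item><UniqueID>A</UniqueID></Item>" +
            "</Items>");

        var seen = new HashSet<string>();
        var toRemove = new List<XmlNode>();

        // Pass 1: read only. The document is not modified while nodes are enumerated.
        foreach (XmlNode node in doc.SelectNodes("//UniqueID"))
        {
            if (!seen.Add(node.InnerText))
            {
                toRemove.Add(node.ParentNode);
            }
        }

        // Pass 2: remove the collected duplicates after enumeration is complete.
        foreach (XmlNode item in toRemove)
        {
            item.ParentNode.RemoveChild(item);
        }

        // Number of Item elements left under the root.
        Console.WriteLine(doc.DocumentElement.ChildNodes.Count); // prints 2
    }
}
```

Either approach keeps the removal step away from the live enumeration. The two-pass version just makes that separation explicit.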