moondaddy-8531 asked DanielZhang-MSFT commented

C#, HtmlAgilityPack: Can someone please explain why this XPath grabs the top of the page and not a child element?

Using the attached webpage, I'm trying to grab the text highlighted in dark blue below.
However, because I will also be grabbing all the other data below it, I first grab the element highlighted in yellow as the starting point, which is:

69693-image.png

 //h3 [text()='Business Details']/../following-sibling::div


If I add this to the above xpath I can get the text I want:

 /div[1]/div/div [text()] 

69771-image.png



However, if I do this in two steps in C#, the same path grabs something at the top of the page. Here's my code:

     static void BusDetailsTest()
     {
         HtmlDocument doc = new HtmlDocument();
         doc.Load(@"D:\Apps\VSOCD\LeadaRator\Trunk\V01\WpfBrowserTest\SampleSourceFiles\5-5.2 Number19.txt");
         // Get the outer element that can be used as a starting point for all the other pieces of data to scrape.
         var h3 = doc.DocumentNode.SelectSingleNode("//h3 [text()='Business Details']/../following-sibling::div");
         // Why does this grab something at the top of the page rather than some nested child divs?
         var divBusDesc = h3.SelectSingleNode("//div[1]/div/div");
         Console.WriteLine(divBusDesc.InnerText);
     }


Can someone please explain what I'm doing wrong and how to correct it?

Thank you

I tried to upload a text file with the page source, but this site is really buggy today and won't allow attaching text files.

69777-image.png

69747-image.png


dotnet-csharp


Maybe you should write h3.SelectSingleNode("/div[1]/div/div[text()]")?


Thanks @Viorel-1 but nothing seems to be working here.
h3.SelectSingleNode("/div") is null, but h3 has plenty of html as shown below:

69923-image.png


Unfortunately for some reason I can't attach text files or post blocks of code like this.


Hi @moondaddy-8531,
There is more discussion of this question in this thread.
You can try har07's answer there.
Best Regards,
Daniel Zhang
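
For anyone landing here later, the behavior described above is consistent with how XPath treats a leading slash: a path that starts with "/" or "//" is evaluated from the document root, even when SelectSingleNode is called on a child node, while a path that starts with "./" (or with no slash at all) is evaluated relative to that node. Below is a minimal sketch of the difference. The markup is a hypothetical stand-in, since the real page source was never attached, and the variable names (start, absolute, rootChild, relative) are only for illustration.

```csharp
using System;
using HtmlAgilityPack;

// Hypothetical, simplified markup standing in for the real page.
const string html = @"<html><body>
<div><div><div>Top of page</div></div></div>
<div>
  <div><h3>Business Details</h3></div>
  <div>
    <div><div><div>Family-owned bakery</div></div></div>
    <div>Other data</div>
  </div>
</div>
</body></html>";

var doc = new HtmlDocument();
doc.LoadHtml(html);

// Step 1: the same starting point as in the question
// (the div that follows the parent of the h3).
HtmlNode start = doc.DocumentNode.SelectSingleNode(
    "//h3[text()='Business Details']/../following-sibling::div");

// A leading "//" searches the whole document and ignores 'start'.
HtmlNode absolute = start.SelectSingleNode("//div[1]/div/div");

// A leading "/" also starts at the root; the root has no div child
// (its element child is <html>), so nothing matches.
HtmlNode rootChild = start.SelectSingleNode("/div");

// A leading "./" is evaluated relative to 'start'.
HtmlNode relative = start.SelectSingleNode("./div[1]/div/div[text()]");

Console.WriteLine(absolute.InnerText);   // expected: Top of page
Console.WriteLine(rootChild == null);    // expected: True
Console.WriteLine(relative.InnerText);   // expected: Family-owned bakery
```

If this matches the real page, changing the second call in the question to "./div[1]/div/div" (or simply "div[1]/div/div") should return the nested div instead of something at the top of the page.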



0 Answers