Reading data from investing.com with HtmlAgilityPack

Malik Asad Mahmood 126 Reputation points
2021-03-22T18:35:50.77+00:00

Hi,
I need help reading an HTML table (the EPS earnings table, see 80343-tabledata.jpg) from the following URL:
https://www.investing.com/equities/wrldcal-teleco-earnings
and inserting it into a DataTable. My code below connects and returns data from investing.com, but I have no experience using HtmlAgilityPack.

My code is as follows:
Private Sub Button2_Click(sender As Object, e As EventArgs) Handles Button2.Click
    Using client As New WebClient()
        client.Headers.Add("user-agent", "karen payne")
        ServicePointManager.SecurityProtocol = CType(3072, SecurityProtocolType)
        ServicePointManager.DefaultConnectionLimit = 9999
        Dim page As String = client.DownloadString(siteAddress)
        ' TextBox1.Text = htmlCode

        '  Dim page As String = WebClient.DownloadString("https://www.investing.com/equities/wrldcal-teleco")  
        Dim doc As HtmlAgilityPack.HtmlDocument = New HtmlAgilityPack.HtmlDocument()  
        doc.LoadHtml(page)  

        Dim lstNode As List(Of HtmlNode) = New List(Of HtmlNode)  
        Dim lstName As List(Of String) = New List(Of String)  
        Dim lstTable As List(Of DataTable) = New List(Of DataTable)  



        For Each thed As HtmlNode In doc.DocumentNode.SelectNodes("//thead")  
            lstNode.Add(thed.ParentNode)  
            Dim text = thed.SelectSingleNode("tr").SelectSingleNode("th").InnerText  
            '  Console.WriteLine(text)  
            MsgBox(text)  

            MsgBox(text)  
            lstName.Add(text)  
        Next  


        For i As Integer = 0 To lstNode.Count - 1  
            Dim dt As DataTable = New DataTable  
            dt.TableName = lstName(i)  
            For Each trNode As HtmlNode In lstNode(i).SelectNodes("tr")  
                If trNode.Attributes("class") Is Nothing Then  
                    For Each colNode As HtmlNode In trNode.SelectNodes("td")  
                        dt.Columns.Add(colNode.InnerText)  
                    Next  
                Else  
                    Dim j As Integer = 0  
                    Dim row As DataRow = dt.NewRow()  
                    For Each rowNode As HtmlNode In trNode.SelectNodes("td")  
                        row(j) = rowNode.InnerText  
                        j += 1  
                    Next  
                    dt.Rows.Add(row)  
                    '  MsgBox("test")  
                End If  
            Next  
            lstTable.Add(dt)  
            DataGridView1.DataSource = dt  

        Next  
    End Using  

    MsgBox("final finished...")  



End Sub  
 
VB

Accepted answer
  1. Xingyu Zhao-MSFT 5,366 Reputation points
    2021-03-24T02:38:04.923+00:00

    Hi @Malik Asad Mahmood,
    You need to select the specific table with Html Agility Pack rather than every table on the page.
    Take a look at the following code:

                ...  
                Dim lstName As List(Of String) = New List(Of String)  
                Dim dt As DataTable = New DataTable  
      
                For Each table As HtmlNode In doc.DocumentNode.SelectNodes("//table[contains(@id,'earningsHistory')]")  
                    For Each columnNameNode As HtmlNode In table.SelectNodes(".//tr/th")  
                        Dim columnName As String = columnNameNode.InnerText  
                        If columnName.Contains("&nbsp") Then  
                            Dim name = columnName.Replace("&nbsp;", " ")  
                            lstName(lstName.Count - 1) += name  
                        Else  
                            lstName.Add(columnName)  
                        End If  
                    Next  
                    For Each colName As String In lstName  
                        dt.Columns.Add(colName)  
                    Next  
                    Dim i As Integer  
                    Dim row As DataRow = dt.NewRow()  
                    For Each itemNode As HtmlNode In table.SelectNodes(".//tr/td")  
      
                        If i = dt.Columns.Count Then  
                            i = 0  
                            dt.Rows.Add(row)  
                            row = dt.NewRow()  
                        End If  
                        If itemNode.InnerText.Contains("&nbsp") Then  
                            row(i) += itemNode.InnerText  
                        Else  
                            row(i) = itemNode.InnerText  
                            i += 1  
                        End If  
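                    Next  
                    ' Add the final row if it was filled completely but not yet added  
                    If i > 0 AndAlso i = dt.Columns.Count Then dt.Rows.Add(row)  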
                Next  
      
                DataGridView1.DataSource = dt  
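
    The `&nbsp` checks in the code above suggest that the cell text still contains raw HTML entities. As a minimal sketch, assuming it is acceptable to decode entities before storing the text, a small helper could normalize each header or cell value. `CleanCellText` is a hypothetical name and is not part of the code above; `WebUtility.HtmlDecode` comes from `System.Net`.

```vb
Imports System.Net
Imports System.Text.RegularExpressions

Module CellTextHelper
    ' Hypothetical helper (an assumption, not part of the answer's code):
    ' decodes HTML entities such as &nbsp; and collapses whitespace runs.
    Public Function CleanCellText(raw As String) As String
        If String.IsNullOrEmpty(raw) Then Return String.Empty

        ' Turn entities like &nbsp; or &amp; into their characters
        Dim decoded As String = WebUtility.HtmlDecode(raw)

        ' &nbsp; decodes to the non-breaking space U+00A0; use a plain space instead
        decoded = decoded.Replace(ChrW(160), " "c)

        ' Collapse tabs, line breaks and repeated spaces, then trim the ends
        Return Regex.Replace(decoded, "\s+", " ").Trim()
    End Function
End Module
```

    For example, `CleanCellText("EPS&nbsp;/&nbsp;Forecast")` returns `EPS / Forecast`, so it could be applied to `columnNameNode.InnerText` or `itemNode.InnerText` before the value is stored.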
    

    Result of my test:
    80888-1.png

    Best Regards,
    Xingyu Zhao


2 additional answers

  1. Malik Asad Mahmood 126 Reputation points
    2021-03-23T08:20:00.213+00:00

    Thank you for the reply. From the following URL, https://www.investing.com/equities/wrldcal-teleco-earnings, I want to read the EPS income table. A snapshot of the HTML table is attached: 80581-tabledata.jpg


  2. Malik Asad Mahmood 126 Reputation points
    2021-03-24T06:11:56.54+00:00

    Thank you for your continued support; it is working fine. However, when I change the URL to get data for another product, the HTML table name changes while the class remains the same. For example, reading the same table for a different product does not work with the following URLs:

    https://www.investing.com/equities/taha-spinning-earnings

    https://www.investing.com/equities/tri-star-poly-earnings

    Thank you.
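
    Since the class is reported to stay the same across product pages while the table name changes, one option is to select the table by its class instead of its id. The following is only a sketch under that assumption: it assumes the HtmlAgilityPack package is referenced, `TableByClassXPath` and `FindTable` are hypothetical helper names, and the class value has to be taken from the page source because it is not given in this thread.

```vb
Imports HtmlAgilityPack

Module EarningsTableLookup
    ' Hypothetical helper: builds an XPath 1.0 expression that matches a <table>
    ' whose class attribute contains className as a whole, space-separated token.
    Public Function TableByClassXPath(className As String) As String
        Return "//table[contains(concat(' ', normalize-space(@class), ' '), ' " &
               className & " ')]"
    End Function

    ' Hypothetical helper: loads the downloaded HTML and returns the first
    ' matching table, or Nothing when no table carries that class.
    Public Function FindTable(html As String, className As String) As HtmlNode
        Dim doc As New HtmlDocument()
        doc.LoadHtml(html)
        Return doc.DocumentNode.SelectSingleNode(TableByClassXPath(className))
    End Function
End Module
```

    A call such as `FindTable(page, "yourTableClass")` (with the placeholder replaced by the real class) would return the table node, which can then be read with the same `.//tr/th` and `.//tr/td` loops as in the accepted answer. Checking the result for `Nothing` before looping avoids a `NullReferenceException` when the page layout differs.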
