What is the Fastest Way to Update Large Amounts of data in SQL Using VB.Net

Alan Liddle 96 Reputation points
2021-10-25T03:55:48.61+00:00

Hi All,

I have to update 9.5 million records in a table in SQL server using a VB.net Windows Forms app. What is the fastest way to do it?

Regards

Alan Liddle

Windows Forms
Windows Forms
A set of .NET Framework managed libraries for developing graphical user interfaces.
1,820 questions
SQL Server
SQL Server
A family of Microsoft relational database management and analysis systems for e-commerce, line-of-business, and data warehousing solutions.
12,639 questions
VB
VB
An object-oriented programming language developed by Microsoft that is implemented on the .NET Framework. Previously known as Visual Basic .NET.
2,564 questions
Transact-SQL
Transact-SQL
A Microsoft extension to the ANSI SQL language that includes procedural programming, local variables, and various support functions.
4,547 questions
{count} votes

Accepted answer
  1. Alan Liddle 96 Reputation points
    2021-11-05T10:38:30.827+00:00

    Hi All,

    I have to admit to have been terrified by the numbers involved in this exercise. It was actually 10.6 million records to update rather than 9.5 million, but close enough. Nobody actually told me how to do it. I was pointed at stuff that really didn't provide answers, so I finally decided to do it the old fashioned way....... What the hell was I worried about? I cut some code in about 30 minutes and it's currently about one hour into execution and about a quarter of the way through. Given that I never have to do this again, I consider this exceptionally fast. Here is my VB code (simples!):

    Private Sub ReadShareDays()
    Dim Files As Long
    Dim theFile() As String
    Dim fb As Integer
    Dim s1 As String

    lblProgress.Visible = True
    ReDim theFile(0 To 10000)
    Files = 0
    s1 = Dir("E:\Alan & Jill\Shares\Share Market Analysis\Daily Prices\*.txt",)
    Do While s1 <> ""
      Files += 1
      theFile(Files - 1) = s1
      s1 = Dir()
    Loop
    ReDim Preserve theFile(0 To Files - 1)
    PB1.Maximum = Files
    PB1.Step = 10
    PB1.Value = 0
    For i1 = 1 To Files
      ShareDays = 0
      ReDim ShareDay(0 To 4000)
      s1 = theFile(i1 - 1)
      lblProgress.Text = "Reading Share Days - " & GetField(s1, ".")
      fb = FreeFile()
      FileOpen(fb, "E:\Alan & Jill\Shares\Share Market Analysis\Daily Prices\" & theFile(i1 - 1), OpenMode.Input)
      Do While EOF(fb) <> True
        s1 = LineInput(fb)
        If s1 <> "" Then
          ShareDays += 1
          With ShareDay(ShareDays - 1)
            .theShare = GetField(s1, ",")
            .theDate = MakeDate(GetField(s1, ","))
            .theOpen = Val(GetField(s1, ","))
            .theHigh = Val(GetField(s1, ","))
            .theLow = Val(GetField(s1, ","))
            .theClose = Val(GetField(s1, ","))
            .theVolume = Val(GetField(s1, ","))
          End With
        End If
      Loop
      FileClose(fb)
      SaveShareDays()
      PB1.Value += 1
      Application.DoEvents()
    Next
    ReDim Preserve theShare(0 To Shares - 1)
    lblProgress.Visible = False
    Application.DoEvents()
    

    End Sub

    Private Function MakeDate(theDate As String) As Date
    Dim theYear As String
    Dim theMonth As String
    Dim theDay As String

    theYear = Mid(theDate, 1, 4)
    theMonth = Mid(theDate, 5, 2)
    theDay = Mid(theDate, 7, 2)
    MakeDate = CDate(theDay & "/" & theMonth & "/" & theYear)
    

    End Function

    Private Sub SaveShareDays()
    Dim DBCon As New sqlData
    Dim DBCon2 As New sqlData
    Dim ShareRec As Long
    Dim fb As Integer

    fb = FreeFile()
    FileOpen(fb, "E:\Alan & Jill\Shares\Share Market Analysis\Daily Prices\ShareDayErrors.csv", OpenMode.Append)
    For Each Share In ShareDay
      ShareRec = 0
      With DBCon
        .Query("SELECT RecNum FROM Shares WHERE theCode = '" & Share.theShare & "'")
        If .RecordCount > 0 Then
          ShareRec = .DBDT.Rows(0)("RecNum")
        End If
      End With
      If ShareRec > 0 Then
        With DBCon
          .Query("SELECT * FROM ShareDays WHERE theShare = " & CStr(ShareRec) & " AND theDate = '" & Format(Share.theDate, "yyyy-MM-dd") & "'")
          If .RecordCount > 0 Then
            If .DBDT.Rows(0)("theClose") > 0 Then
              Share.theClose = .DBDT.Rows(0)("theClose")
              ShareRec = .DBDT.Rows(0)("theShare")
            End If
            With DBCon2
              .Query("UPDATE ShareDays SET theOpen = " & Share.theOpen.ToString & ", theHigh = " & Share.theHigh.ToString & ", theLow = " & Share.theLow & ", theClose = " & Share.theClose & ", Volume = " & Share.theVolume &
                     "WHERE theDate = '" & Format(Share.theDate, "yyyy-MM-dd") & "' AND theShare = " & CStr(ShareRec))
            End With
          Else
            PrintLine(fb, Format(Share.theDate, "dd/MM/yyyy") & "," & Share.theShare & "," & CStr(Share.theOpen) & "," & CStr(Share.theHigh) & "," & CStr(Share.theLow) & "," & CStr(Share.theClose) & "," & CStr(Share.theVolume))
          End If
        End With
      End If
    Next
    FileClose(fb)
    

    End Sub


4 additional answers

Sort by: Most helpful
  1. Karen Payne MVP 35,031 Reputation points
    2021-10-25T11:08:00.147+00:00

    Seems with that many rows this would be done best done via a scheduled job. Dependent on the update logic, consider the advice here or look at using merge.

    1 person found this answer helpful.
    0 comments No comments

  2. Erland Sommarskog 100.8K Reputation points MVP
    2021-10-26T07:12:17.96+00:00

    I want to be able to:

    >

    Read all records into memory

    >

    Read CSV data into memory

    >

    Update each SQL data record in memory with CSV data

    >

    Save all records at once

    I sense that there is one important thing you have not really grasped. Your VB .Net program and SQL Server are two separate processes. (And this is no different from VB6.) So you must read the data from the file and send it to SQL Server.

    And there is not really any reason to read all records into memory at once. Rather you can read the file and stream it to SQL Server through a table-valued parameter. The you would do the update as:

    UPDATE tbl
    SET      col1 = tvp.col1,
         col2 = tvp.col2,
         ...
    FROM  tbl
    JOIN  @tvp tvp ON ...
    

    As it happens, my article Using Table-Valued Parameters in .NET, includes an example with reading a CSV file and streaming it to SQL Server.

    1 person found this answer helpful.

  3. Olaf Helper 40,656 Reputation points
    2021-10-25T07:30:27.677+00:00

    What is the fastest way to do it?

    Hello Alan,
    There is just one way: Doing it.
    You can do it batch wise, e.g. updating 1 mio rows per batch. But at the end of the day it is: Doing it.

    For more you have to provide more details. E.g. disabling index may effected by update statement & enabling iut afterwards can speep it up.

    0 comments No comments

  4. Erland Sommarskog 100.8K Reputation points MVP
    2021-10-25T21:47:26.547+00:00

    The problem with this sort of question is that we don't know your current level. We may be give answers that go over your head. Or we may give you answers like "I already know that".

    Anyway, the first thing is that you need send all data at once, or possibly in batches of, say, one million rows. You cannot send one row at a time. There are a couple of devices to use:

    1. The SqlBulkCopy class.
    2. Use a table-valued parameter. I written about these here
    3. Pack all data in an XML or JSON document.

    Once you have the data in SQL Server, there are just too many "it depends" for it to be meaningful when I know so little.