Comparing data from two SQL tables

Lance James 371 Reputation points
2022-02-24T00:04:11.967+00:00

I have two SQL tables to compare.

My approach is to create a DataTable for each SQL table and then do the comparison. Find items in table 2 that don't exist in table 1 and add them to table 1.

Each table has over 10,000 entries so looking for efficiency as this is a daily process.

Did some reading on DataTable.Merge and DataTable.GetChanges. If this is a solution, I am not finding the proper use.

Regards,
Lance

C#
C#
An object-oriented and type-safe programming language that has its roots in the C family of languages and includes support for component-oriented programming.
10,839 questions
0 comments No comments
{count} votes

Accepted answer
  1. Jack J Jun 24,501 Reputation points Microsoft Vendor
    2022-02-24T06:51:20.747+00:00

    @Lance James , based on my test, I find a method to compare two datatbales and add items in table 2 that don't exist in table 1 to table1.

    First of all , based on my test, DataTable.Merge can not get your goal, because it may contains the duplicate item.

    For example:

            DataTable table1=new DataTable();  
            table1.Columns.Add("Name");  
            table1.Columns.Add("Age");  
            table1.Columns.Add("Id");  
             
            table1.Rows.Add("test1", 22, 1002);  
    
            DataTable table2 = new DataTable();  
            table2.Columns.Add("Name");  
            table2.Columns.Add("Age");  
            table2.Columns.Add("Id");  
            table1.Merge(table2);  
    

    I set the same datarow for the two datatables, but the result is the following:

    177425-image.png

    Obviously, it is not you wanted data. The result should be only one row instead of two duplicated rows.

    Therefore, I recommend that you could use the following method to compare two datatables.

    Usage:

    var comparer = new CustomComparer();  
    DataTable dtUnion = table1.AsEnumerable()  
                      .Union(table2.AsEnumerable(), comparer).CopyToDataTable<DataRow>();  
    

    Extension Class:

    class CustomComparer : IEqualityComparer<DataRow>  
        {  
            #region IEqualityComparer<DataRow> Members  
      
            public bool Equals(DataRow x, DataRow y)  
            {  
                return ((string)x["Name"]).Equals((string)y["Name"]);  
            }  
      
            public int GetHashCode(DataRow obj)  
            {  
                return ((string)obj["Name"]).GetHashCode();  
            }  
      
            #endregion  
        }  
    

    Hope this could help you.

    Best Regards,
    Jack


    If the answer is the right solution, please click "Accept Answer" and upvote it.If you have extra questions about this answer, please click "Comment".

    Note: Please follow the steps in our documentation to enable e-mail notifications if you want to receive the related email notification for this thread.


1 additional answer

Sort by: Most helpful
  1. Lance James 371 Reputation points
    2022-02-24T23:43:59.277+00:00

    Here is a case of unintended consequences.

    Table 1 starts with the 1st day of data load.

    Table 2 contains the next days data. It reloads daily. This is a rolling list of data. The oldest data date rolls off and the new data date enters. There is approximately 120 days of data in this table with 119 days of old data and a day of new data. However, the data added isn't necessary identified by the last current date of data. It can contain data that wasn't posted previously a week ago. Hence, I have to check all data in Table 2.

    The plan for this thread was to compare Table 2 with Table 1 and identify the data in Table 2 that is not in Table 1 and add it to Table 1.

    The SQL solution I thought would works does for day 1 only. As the oldest data date rolls out of table 2 the Union SQL statement detects that in Table 1 as a difference and includes the Table 1 difference data it in the Union. Therefore, the amount of data in the Union increases everyday since Table 1 has all the data (rolled off dates).

    Ouch. So much time wasted automating a process that is flawed. So I return to JackJJun-MSFT reply above using the extension class.

    Before I try to learn this code method and where to stuff it, will it only identify New items in Table 2 that don't exist in Table 1?

    Thanks everyone for my need to reopen my issue.

    Regards,
    Lance

    0 comments No comments

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.