How to speed up DataTable updates for a high volume of data

T.Zacks 3,996 Reputation points
2021-05-06T09:57:47.173+00:00

I am reading data from an XML file and loading it into a DataTable. I cannot load the data into a List<T> because the DataTable has some dynamic columns whose names are not known at compile time.

I have one List<T> and one DataTable: _AllCommentsData is the List<T> and dtData is the DataTable.

More or less, _AllCommentsData has 10,000 to 20,000 items, while the dtData DataTable has a high volume of data: approximately 7,000,000 (7 million) rows. Traversing this high volume of data and updating one column value takes a long time, so please guide me on how to refactor or restructure the code so that updating one column value across a high volume of data takes much less time. Does any workaround exist for this situation?

I could join my DataTable and List together and update the field value, but I cannot do so because a few column names in the DataTable are dynamic, so I cannot mention them in a LINQ select. That is why I query the DataTable in a loop and update the data.

Can I use Parallel.For() or AsParallel()? Please guide me with some sample code that I can use to speed up the update when there is a high volume of data in the DataTable. Thanks.

My Sample code

    foreach (var item in _AllCommentsData.Where(a => a.CommentType == "Only LI"))
    {
        if (item.Comments != "")
        {
            tmpData = dtData.AsEnumerable().Where(a => a.Field<string>("Section") == item.section
                && a.Field<string>("LineItem") == item.LineItem
                && (a.Field<int?>("EarningID") ?? 0) == item.EarningID);

            foreach (var x in tmpData)
            {
                x["LI_Comment"] = 1;
            }
        }
    }
Developer technologies C#

Accepted answer
  1. Viorel 122.6K Reputation points
    2021-05-07T17:29:10.53+00:00

    Check some of these ideas (but also fix any remaining implementation issues):

    var section_col = dtData.Columns["Section"];
    var lineitem_col = dtData.Columns["LineItem"];
    var earningid_col = dtData.Columns["EarningID"];
    var li_comment_col = dtData.Columns["LI_Comment"];
    . . .
    foreach (var item in _AllCommentsData.Where(a => a.Comments.Length != 0 && a.CommentType == "Only LI"))
    {
       foreach (var x in dtData.AsEnumerable().Where(a =>
                  a.Field<string>(section_col) == item.section &&
                  a.Field<string>(lineitem_col) == item.LineItem &&
                  (a.Field<int?>(earningid_col) ?? 0) == item.EarningID))
       {
          x[li_comment_col] = 1;
       }
    }
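    Caching the DataColumn references avoids repeated column-name lookups, but the nested loops still scan all 7 million rows once per comment item (O(n·m)). A further sketch, with hypothetical Comment type and column names matching the question, inverts the work: index the small list by key once, then make a single pass over the big table (O(n + m)).

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Data;
    using System.Linq;

    // Hypothetical shape of the items in _AllCommentsData, inferred from the question.
    class Comment
    {
        public string CommentType;
        public string Comments;
        public string section;
        public string LineItem;
        public int EarningID;
    }

    static class LookupDemo
    {
        public static void Apply(List<Comment> allCommentsData, DataTable dtData)
        {
            // Build a hash set of (Section, LineItem, EarningID) keys once
            // from the small list (10,000-20,000 items).
            var keys = new HashSet<(string, string, int)>(
                allCommentsData
                    .Where(a => a.CommentType == "Only LI" && a.Comments != "")
                    .Select(a => (a.section, a.LineItem, a.EarningID)));

            // Single pass over the large table; each membership test is O(1).
            foreach (DataRow row in dtData.Rows)
            {
                var key = (row.Field<string>("Section"),
                           row.Field<string>("LineItem"),
                           row.Field<int?>("EarningID") ?? 0);
                if (keys.Contains(key))
                    row["LI_Comment"] = 1;
            }
        }
    }
    ```

    On the asker's Parallel.For / AsParallel question: DataTable is documented as safe for multithreaded reads but not writes, so parallelizing the row updates is unsafe without synchronization; reducing the algorithmic cost as above is usually the bigger win anyway.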
    

1 additional answer

  1. Daniel Zhang-MSFT 9,651 Reputation points
    2021-05-07T06:18:41.763+00:00

    Hi TZacks-2728,
    As karenpayneoregon said, I also suggest that you put your DataTable data into a database, where you could use a DataReader to validate each row. The DataReader is a good choice when you are retrieving large amounts of data because the data is not cached in memory.
    For more details, please refer to the following link:
    Retrieve data using a DataReader
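    As an aside on the reader pattern described here: a minimal sketch (hypothetical column name "Section") using DataTable.CreateDataReader(), which gives the same forward-only, one-row-at-a-time access style in memory. The loop shape ports directly to SqlDataReader once the data lives in a database.

    ```csharp
    using System;
    using System.Data;

    static class ReaderDemo
    {
        // Count rows whose Section column equals the given value,
        // streaming through the table instead of materializing a result set.
        public static int CountMatches(DataTable table, string section)
        {
            int count = 0;
            using (DataTableReader reader = table.CreateDataReader())
            {
                int ordinal = reader.GetOrdinal("Section");
                while (reader.Read())
                {
                    if (!reader.IsDBNull(ordinal) && reader.GetString(ordinal) == section)
                        count++;
                }
            }
            return count;
        }
    }
    ```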
    Best Regards,
    Daniel Zhang


    If the response is helpful, please click "Accept Answer" and upvote it.


