How to Optimize IEnumerable to DataTable?

Spideregg 140 Reputation points
2023-12-13T04:00:33.78+00:00

I'm trying to create a function to convert IEnumerable<T> to a DataTable.

However, when dealing with large datasets, the process becomes very slow.

How can I optimize it for better performance? Thank you.

public static DataTable ToDataTable<T>(this IEnumerable<T> enumerable, bool trim = false)
{
    DataTable dataTable = new DataTable(typeof(T).Name);

    using (var reader = ObjectReader.Create(enumerable)) // ObjectReader comes from the FastMember library
        dataTable.Load(reader);

    foreach (DataColumn col in dataTable.Columns)
    {
        col.AllowDBNull = true;
    }

    if (trim)
        return dataTable.Trim();
    else
        return dataTable;
}
Developer technologies | C#

Answer accepted by question author
  1. Sabyasachi Samaddar 235 Reputation points Microsoft Employee
    2023-12-13T05:08:40.53+00:00

    Hi @Spideregg , Welcome to Microsoft Q&A,

    You can try to optimize the code by using reflection, but to be honest, if you are working with a very large dataset it would be better to change your approach and either use a database or switch to more efficient data structures. A bit of explanation regarding this:

    Using a Database: When dealing with large datasets, it can be more efficient to use a database to store and manipulate your data. Databases are designed to handle large amounts of data and can perform operations like sorting, filtering, and aggregating data much faster than you could by manually coding these operations in C#. There are many types of databases, such as SQL Server, MySQL, and MongoDB, and the best one to use depends on your specific needs.

    Using More Efficient Data Structures: Data structures are a way of organizing and storing data so that they can be accessed and worked with efficiently. They define the relationship between the data, and the operations that can be performed on the data. Some data structures are more efficient than others for certain types of operations. For example, if you frequently need to look up values based on a key, a Dictionary<TKey, TValue> can be more efficient than a List<T> because it can look up values in constant time.
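    The lookup difference above can be sketched quickly (the product codes and counts here are made up for illustration):

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    class LookupDemo
    {
        static void Main()
        {
            // Hypothetical data: map product codes to names.
            var pairs = Enumerable.Range(0, 100_000)
                .Select(i => (Code: $"P{i}", Name: $"Product {i}"))
                .ToList();

            // List<T>: each lookup scans the list -> O(n) per lookup.
            var byScan = pairs.First(p => p.Code == "P99999").Name;

            // Dictionary<TKey, TValue>: hash lookup -> O(1) on average.
            var dict = pairs.ToDictionary(p => p.Code, p => p.Name);
            var byHash = dict["P99999"];

            Console.WriteLine(byScan == byHash); // prints True
        }
    }
    ```

    Both approaches find the same item, but with 100,000 lookups the scan version does on the order of 100,000 comparisons per lookup while the dictionary does roughly one hash probe each.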

    If the answer is the right solution, please click "Accept Answer" and kindly upvote it. If you have extra questions about this answer, please click "Comment". 

    Note: Please follow the steps in our documentation to enable e-mail notifications if you want to receive the related email notification for this thread.


4 additional answers

  1. Anonymous
    2023-12-13T05:35:58.2966667+00:00

    Hi @Spideregg , Welcome to Microsoft Q&A,

    Your code uses ObjectReader.Create to create an IDataReader and loads it into the DataTable through the Load method. Because this performs reflection-based member access on each object, it can become expensive on large datasets.

    It is recommended that you create the DataTable manually and add data to it row by row instead of using ObjectReader.Create. Fetching the PropertyInfo array once up front avoids repeated per-row type inspection and can improve performance.

    using System;
    using System.Collections.Generic;
    using System.Data;
    using System.Reflection;

    public static DataTable ToDataTable<T>(this IEnumerable<T> enumerable, bool trim = false)
    {
        DataTable dataTable = new DataTable(typeof(T).Name);

        // Get all public properties of the type once, up front
        PropertyInfo[] properties = typeof(T).GetProperties();

        // Create the DataTable columns, unwrapping Nullable<T> so each column gets a concrete type
        foreach (PropertyInfo property in properties)
        {
            dataTable.Columns.Add(property.Name, Nullable.GetUnderlyingType(property.PropertyType) ?? property.PropertyType);
        }

        // Add data to the DataTable row by row, mapping null values to DBNull
        foreach (T item in enumerable)
        {
            DataRow row = dataTable.NewRow();
            foreach (PropertyInfo property in properties)
            {
                row[property.Name] = property.GetValue(item, null) ?? DBNull.Value;
            }
            dataTable.Rows.Add(row);
        }

        if (trim)
            return dataTable.Trim();
        else
            return dataTable;
    }
    
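    If the per-cell `property.GetValue` calls in the loop above still dominate, one further refinement (not from the answer above, just a common technique) is to compile each property getter once with expression trees and reuse the delegates for every row. The `Person` class below is a hypothetical example type:

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Data;
    using System.Linq.Expressions;
    using System.Reflection;

    public static class FastToDataTable
    {
        public static DataTable ToDataTableCompiled<T>(this IEnumerable<T> enumerable)
        {
            DataTable dataTable = new DataTable(typeof(T).Name);
            PropertyInfo[] properties = typeof(T).GetProperties();

            // Compile one Func<T, object> per property, once,
            // instead of calling PropertyInfo.GetValue for every cell.
            var getters = new Func<T, object>[properties.Length];
            for (int i = 0; i < properties.Length; i++)
            {
                PropertyInfo p = properties[i];
                dataTable.Columns.Add(p.Name,
                    Nullable.GetUnderlyingType(p.PropertyType) ?? p.PropertyType);

                var item = Expression.Parameter(typeof(T), "item");
                var body = Expression.Convert(Expression.Property(item, p), typeof(object));
                getters[i] = Expression.Lambda<Func<T, object>>(body, item).Compile();
            }

            // Reuse one buffer; Rows.Add(object[]) copies the values into the new row.
            object[] values = new object[properties.Length];
            foreach (T item in enumerable)
            {
                for (int i = 0; i < properties.Length; i++)
                    values[i] = getters[i](item) ?? DBNull.Value;
                dataTable.Rows.Add(values);
            }
            return dataTable;
        }
    }

    public class Person
    {
        public int Id { get; set; }
        public string Name { get; set; }
        public int? Age { get; set; }
    }

    class Demo
    {
        static void Main()
        {
            var people = new List<Person>
            {
                new Person { Id = 1, Name = "Ann", Age = 30 },
                new Person { Id = 2, Name = "Bob", Age = null }, // null becomes DBNull
            };
            DataTable table = people.ToDataTableCompiled();
            Console.WriteLine($"{table.Rows.Count} rows, {table.Columns.Count} cols");
        }
    }
    ```

    The compilation cost is paid once per property rather than once per cell, so this only pays off when the row count is large relative to the column count.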

    Best Regards,

    Jiale




  2. Sabyasachi Samaddar 235 Reputation points Microsoft Employee
    2023-12-13T06:17:56.3733333+00:00

    Can you let me know the load you are pushing and the data schema?


  3. Deleted

    This answer has been deleted due to a violation of our Code of Conduct. The answer was manually reported or identified through automated detection before action was taken. Please refer to our Code of Conduct for more information.


    Comments have been turned off.

  4. Karen Payne MVP 35,596 Reputation points Volunteer Moderator
    2023-12-15T10:05:21.07+00:00

    Not knowing what you mean by dealing with large datasets, I have to assume over, say, 100,000 items, and I have no clue how many properties are in your class, but here is something to try.

    Use Chunk to break the original list into smaller parts, convert each part to a DataTable, process the data, and move on to the next part. In short, perform the large operation in batches. If the intention is to push the data to a database, consider using a temp table while chunking until all data is processed, then push the rows to the target database table.
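    The chunking idea can be sketched like this; the source data and batch size are made up, and the per-batch processing (e.g. the `ToDataTable` extension from the question) is left as a comment:

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    class ChunkDemo
    {
        static void Main()
        {
            IEnumerable<int> source = Enumerable.Range(0, 250_000);

            int batches = 0;
            // Enumerable.Chunk (.NET 6+) yields arrays of up to 50,000 items each.
            foreach (int[] batch in source.Chunk(50_000))
            {
                // Convert just this batch (e.g. batch.ToDataTable()) and
                // process it before moving on, keeping memory bounded.
                batches++;
            }

            Console.WriteLine(batches); // prints 5
        }
    }
    ```

    Because `Chunk` streams the source lazily, only one batch's worth of rows needs to be materialized at a time.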

    Also, if possible, do what many companies do: perform the batch operations off-hours using, say, a Windows service that is triggered by a timer or by the arrival of a file.

