Why massive inserts using SubmitChanges perform poorly
Bulk inserting using LINQ to SQL, why is it so slow?
Well, the short answer is that it is not designed to do bulk inserts.
But out of curiosity, why is it slow?
Well, let us do it by example (my preferred way). First create a table in SQL Server like so (this will basically represent a person):
create table OurLinqPerson (cid int primary key, fname nvarchar(10), lname nvarchar(10), age int)
--drop table OurLinqPerson
Then create a new C# console project in Visual Studio.
Right-click the project, add a new item of type “LINQ to SQL Classes”, and call it Person.dbml.
Then use the Server Explorer to find the table (OurLinqPerson) and drag it to the designer surface and “Save”.
Then add the following code:
// Requires: using System; using System.Diagnostics;
static void Main(string[] args)
{
    int noOfPersonsToInsert = 10000;
    long calc = 0; // Elapsed ms at the previous checkpoint.
    try
    {
        // Dispose the DataContext (and its connection) when we are done.
        using (PersonDataContext pdc = new PersonDataContext())
        {
            Stopwatch sw = Stopwatch.StartNew();
            for (int i = 0; i < noOfPersonsToInsert; i++)
            {
                OurLinqPerson olp = new OurLinqPerson { cid = i, fname = "Peter", lname = "Peterson", age = 33 };
                // Queue the object for insertion; nothing is sent to the database yet.
                pdc.OurLinqPersons.InsertOnSubmit(olp);
                // Report the total time and the time for this batch every 500 objects.
                if (i % 500 == 0)
                {
                    Console.WriteLine("Time (ms): Total / This batch {0, 3} / {1, 3}, Objects using InsertOnSubmit: {2}", sw.ElapsedMilliseconds, sw.ElapsedMilliseconds - calc, i);
                    calc = sw.ElapsedMilliseconds;
                }
            }
            Console.WriteLine("All InsertOnSubmits done. Total elapsed time (ms): {0}. \n\nNow calling SubmitChanges...", sw.ElapsedMilliseconds);
            calc = sw.ElapsedMilliseconds;
            // Now submit the queued changes to the database.
            pdc.SubmitChanges();
            Console.WriteLine("SubmitChanges done, time for this (ms): {0}", sw.ElapsedMilliseconds - calc);
            sw.Stop();
            Console.WriteLine("\n\nTotal elapsed time (ms): {0}", sw.ElapsedMilliseconds);
        }
    }
    catch (Exception ex)
    {
        Console.WriteLine(ex);
    }
}
The code is hopefully self-explanatory. We want to insert 10000 rows into the OurLinqPerson table, so for each row we create an instance of the OurLinqPerson class.
(Basically, a row in the table in the database is an instance of a class representing a row, the columns are properties on that class).
We then add that instance to the table in the DataContext by using it as an argument to the InsertOnSubmit method.
We write out the time elapsed and the time it took for doing this for every 500 rows.
Finally we submit the changes to the database by calling SubmitChanges.
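To make the "a row is an instance of a class" idea concrete, here is a rough, hand-written approximation of the kind of class the designer generates for the table. This is only a sketch: the real generated class is larger (it is partial and raises change notifications), the name OurLinqPersonSketch is made up for this illustration, and the code assumes a .NET Framework project that references System.Data.Linq.

```csharp
using System;
using System.Data.Linq.Mapping;

// Hypothetical, simplified stand-in for the designer-generated OurLinqPerson class.
[Table(Name = "dbo.OurLinqPerson")]
public class OurLinqPersonSketch
{
    // cid is the primary key in the table, so it cannot be null.
    [Column(IsPrimaryKey = true)]
    public int cid { get; set; }

    [Column]
    public string fname { get; set; }

    [Column]
    public string lname { get; set; }

    // The age column allows NULL, so the property is a nullable int.
    [Column]
    public int? age { get; set; }
}
```

Each property maps to one column, which is why the object initializer in the example sets cid, fname, lname and age.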
So, run the application and the output should be something like this:
Time (ms): Total / This batch 23 / 23, Objects using InsertOnSubmit: 0
Time (ms): Total / This batch 27 / 4, Objects using InsertOnSubmit: 500
Time (ms): Total / This batch 30 / 3, Objects using InsertOnSubmit: 1000
Time (ms): Total / This batch 33 / 3, Objects using InsertOnSubmit: 1500
Time (ms): Total / This batch 44 / 11, Objects using InsertOnSubmit: 2000
Time (ms): Total / This batch 47 / 3, Objects using InsertOnSubmit: 2500
Time (ms): Total / This batch 50 / 3, Objects using InsertOnSubmit: 3000
Time (ms): Total / This batch 53 / 3, Objects using InsertOnSubmit: 3500
Time (ms): Total / This batch 58 / 5, Objects using InsertOnSubmit: 4000
Time (ms): Total / This batch 61 / 3, Objects using InsertOnSubmit: 4500
Time (ms): Total / This batch 65 / 3, Objects using InsertOnSubmit: 5000
Time (ms): Total / This batch 73 / 8, Objects using InsertOnSubmit: 5500
Time (ms): Total / This batch 76 / 3, Objects using InsertOnSubmit: 6000
Time (ms): Total / This batch 80 / 4, Objects using InsertOnSubmit: 6500
Time (ms): Total / This batch 83 / 3, Objects using InsertOnSubmit: 7000
Time (ms): Total / This batch 87 / 4, Objects using InsertOnSubmit: 7500
Time (ms): Total / This batch 90 / 3, Objects using InsertOnSubmit: 8000
Time (ms): Total / This batch 94 / 4, Objects using InsertOnSubmit: 8500
Time (ms): Total / This batch 97 / 3, Objects using InsertOnSubmit: 9000
Time (ms): Total / This batch 100 / 3, Objects using InsertOnSubmit: 9500
All InsertOnSubmits done. Total elapsed time (ms): 104.
Now calling SubmitChanges...
SubmitChanges done, time for this (ms): 3396
Total elapsed time (ms): 3500
So it is pretty obvious that queuing the rows/objects with InsertOnSubmit is quick, around 3 to 5 ms per 500 of them. The call to SubmitChanges, however, takes 3396 ms out of a total of 3500 ms, roughly 97% of the run.
So, back to the original question: why is it slow?
Well, when SubmitChanges is called, every object/row that the DataContext is tracking for the table (OurLinqPersons) has to be inspected in order to decide what action to take on that particular row.
Once the object state has been determined for an instance, LINQ to SQL creates the appropriate SQL for that state and then calls the database with this SQL.
The rows can be in several states, for example ToBeInserted, Unchanged, Deleted etc.
In this case every row is in the ToBeInserted state, so the following SQL is created and run once for each row. The statements are not combined into a batch, which means 10000 objects result in 10000 separate INSERT commands sent to the server:
INSERT INTO [dbo].[OurLinqPerson]([cid], [fname], [lname], [age]) VALUES (@p0, @p1, @p2, @p3)
So in short, it takes time because every row/object has to be inspected in order to decide what action to take on it, and then that action has to be executed against the database as its own SQL statement.
It doesn’t matter if the row/object is unchanged; it still has to be inspected.
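If you want to watch this happen, the DataContext can report its pending changes and echo the SQL it generates. The following is a minimal sketch that uses DataContext.GetChangeSet and the DataContext.Log property; the class and method names (ChangeSetInspection, Summarize, SubmitWithLogging) are made up for this illustration, and it assumes a .NET Framework project that references System.Data.Linq.

```csharp
using System;
using System.Data.Linq;

// Hypothetical helper for this sketch; not part of LINQ to SQL itself.
static class ChangeSetInspection
{
    // Summarizes the pending work that SubmitChanges would carry out.
    public static string Summarize(DataContext dc)
    {
        ChangeSet pending = dc.GetChangeSet();
        return string.Format("Inserts: {0}, Updates: {1}, Deletes: {2}",
            pending.Inserts.Count, pending.Updates.Count, pending.Deletes.Count);
    }

    // Echoes every generated SQL command to the console while submitting.
    public static void SubmitWithLogging(DataContext dc)
    {
        Console.WriteLine(Summarize(dc));
        dc.Log = Console.Out;
        dc.SubmitChanges();
    }
}
```

In the example above, calling ChangeSetInspection.SubmitWithLogging(pdc) in place of pdc.SubmitChanges() should report 10000 pending inserts and then print one INSERT statement per object. Writing to the console adds its own overhead, so timings taken with logging switched on are not comparable to the numbers shown earlier.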
If you want to insert a large number of rows, pure ADO.NET (for example SqlBulkCopy) is a better option.
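As a rough idea of what the SqlBulkCopy route could look like for the same table, here is a minimal sketch. It is not a measured or tuned implementation: the names BulkInsertSketch, BuildPersonTable and BulkInsert are invented for this illustration, the connection string is something you have to supply yourself, and the column names are taken from the create table statement at the top of this post.

```csharp
using System;
using System.Data;
using System.Data.SqlClient;

// Hypothetical helper for this sketch.
static class BulkInsertSketch
{
    // Builds an in-memory table whose columns match dbo.OurLinqPerson.
    public static DataTable BuildPersonTable(int rowCount)
    {
        var table = new DataTable("OurLinqPerson");
        table.Columns.Add("cid", typeof(int));
        table.Columns.Add("fname", typeof(string));
        table.Columns.Add("lname", typeof(string));
        table.Columns.Add("age", typeof(int));
        for (int i = 0; i < rowCount; i++)
        {
            // Same test data as the LINQ to SQL example.
            table.Rows.Add(i, "Peter", "Peterson", 33);
        }
        return table;
    }

    // Sends the whole table to SQL Server in a single bulk copy operation.
    // connectionString is an assumption: pass the one for your own database.
    public static void BulkInsert(string connectionString, DataTable table)
    {
        using (var bulk = new SqlBulkCopy(connectionString))
        {
            bulk.DestinationTableName = "dbo.OurLinqPerson";
            // Map columns by name so the copy does not depend on column order.
            foreach (DataColumn column in table.Columns)
            {
                bulk.ColumnMappings.Add(column.ColumnName, column.ColumnName);
            }
            bulk.WriteToServer(table);
        }
    }
}
```

A call such as BulkInsertSketch.BulkInsert(connectionString, BulkInsertSketch.BuildPersonTable(10000)) would then replace both the InsertOnSubmit loop and the SubmitChanges call. Note that this bypasses the DataContext entirely, so no change tracking takes place for these rows.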
"Table<TEntity>.InsertOnSubmit Method"
https://msdn.microsoft.com/en-us/library/bb763516.aspx
"DataContext.SubmitChanges Method "
https://msdn.microsoft.com/en-us/library/system.data.linq.datacontext.submitchanges.aspx
"Object States and Change-Tracking (LINQ to SQL)"
https://msdn.microsoft.com/en-us/library/bb386982.aspx
"Insert, Update, and Delete Operations (LINQ to SQL)"
https://msdn.microsoft.com/en-us/library/bb386931.aspx
Comments
Anonymous
November 14, 2010
Why is the inspection for every row/object so expensive? Is a call to the SQL server made? It would be nice to be able to turn this inspection off when necessary.
Anonymous
April 22, 2011
In short - just use a transaction and performance will skyrocket.