C#: How does SQL bulk copy work for large data inserts?

Sudip Bhatt 2,281 Reputation points
2021-01-10T12:51:07.557+00:00

1) What is the default BatchSize of a SqlBulkCopy operation if I do not specify a value?
2) I read a post where someone mentioned that a BatchSize of 5000 gave very good results. Should I use a batch size of 5000?
How do I know what batch size would be good for my scenario?

3) When SqlBulkCopy inserts data in multiple batches, does it maintain a transaction? Suppose two batches insert successfully but the last batch hits an error; will all the inserted data be removed from the table or not?

4) Does SqlBulkCopy lock the full table when inserting data? I am inserting data into the same table from multiple threads using SqlBulkCopy. If the table is locked, parallel inserts will not be possible, so please suggest the best way to insert large amounts of data into the same table from multiple threads.

Please guide me. Thanks

Developer technologies C#

Accepted answer
  1. Alberto Poblacion 1,571 Reputation points
    2021-01-10T21:13:50.573+00:00

    1) The default is to process the whole operation in a single batch. So if, for example, you give a DataTable with 10,000 rows to SqlBulkCopy for inserting into the database, it will insert all 10,000 rows in a single batch. By adjusting the BatchSize property you can tell it to send several smaller batches. You will need to do a bit of benchmarking to find the optimal value, because it depends on your data size and on the characteristics of the server and network. A batch size of 4000 is typical, but your results may vary.
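    As a starting point for that benchmarking, a minimal sketch of a bulk copy with an explicit BatchSize might look like this (the connection string, the `dbo.Orders` table name, and the value 4000 are placeholder assumptions, not values from the question):

    ```csharp
    using System.Data;
    using Microsoft.Data.SqlClient;

    static void BulkInsert(DataTable table, string connectionString)
    {
        using var connection = new SqlConnection(connectionString);
        connection.Open();

        using var bulkCopy = new SqlBulkCopy(connection)
        {
            DestinationTableName = "dbo.Orders", // placeholder table name
            BatchSize = 4000,       // 0 (the default) sends everything in one batch
            BulkCopyTimeout = 60    // seconds; raise this for very large loads
        };

        bulkCopy.WriteToServer(table);
    }
    ```

    Benchmark a few batch sizes (e.g. 1000, 4000, 10000) against your own server before settling on one.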

    2) Yes, transactions work with bulk inserts, so if you roll back the transaction, the inserts, as well as anything else you did within the transaction, will be rolled back. EDIT: But you need to use an explicit transaction for this to work. There is a SqlBulkCopyOptions value called UseInternalTransaction, but it only covers each individual batch: earlier batches will not be rolled back if a later insert fails, unless you started an explicit transaction before starting the bulk copy.
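    A sketch of the explicit-transaction approach, passing the transaction into the SqlBulkCopy constructor so every batch participates in it (table name and connection string are again placeholders):

    ```csharp
    using System.Data;
    using Microsoft.Data.SqlClient;

    static void BulkInsertAtomic(DataTable table, string connectionString)
    {
        using var connection = new SqlConnection(connectionString);
        connection.Open();
        using var transaction = connection.BeginTransaction();

        try
        {
            // Passing the external transaction makes all batches part of it.
            using var bulkCopy = new SqlBulkCopy(
                connection, SqlBulkCopyOptions.Default, transaction)
            {
                DestinationTableName = "dbo.Orders", // placeholder table name
                BatchSize = 4000
            };
            bulkCopy.WriteToServer(table);
            transaction.Commit();   // all batches become visible together
        }
        catch
        {
            transaction.Rollback(); // undoes every batch, not just the failing one
            throw;
        }
    }
    ```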

    Be aware, however, that bulk inserts behave differently from standard insert statements in some respects. For instance, if the database recovery model is set to bulk-logged, the bulk inserts are not logged individually; triggers are not fired; block allocation in tables uses larger blocks (which could affect database growth if you are using bulk insert for smallish amounts of data); and so on. On the other hand, bulk inserts are significantly faster than individual inserts, if you really need the speed.

    3) Table locking is affected by the number of rows inserted into the table. Initially row locks are used, but the server escalates them to page locks and then table locks as needed for efficiency. This happens regardless of whether you are using bulk insert or individual insert statements.

    Note that, depending on the version of SQL Server, if the table has indexes, inserting data from multiple threads can perform horribly due to locking on the indexes. This has been greatly alleviated in SQL Server 2019, but it can hurt performance if you try to insert lots of rows from multiple threads on an earlier version.
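    For the multi-threaded case, one common pattern (a sketch under the assumption that the target table is a heap with no indexes) is to give each loader SqlBulkCopyOptions.TableLock: each then takes a bulk-update (BU) lock, and BU locks are compatible with one another, so several threads can stream into the same table concurrently. Names here are placeholders:

    ```csharp
    using System.Data;
    using System.Linq;
    using System.Threading.Tasks;
    using Microsoft.Data.SqlClient;

    static Task ParallelBulkInsert(DataTable[] chunks, string connectionString)
    {
        // One bulk copy per chunk, each on its own connection and thread.
        return Task.WhenAll(chunks.Select(chunk => Task.Run(() =>
        {
            using var connection = new SqlConnection(connectionString);
            connection.Open();

            using var bulkCopy = new SqlBulkCopy(
                connection, SqlBulkCopyOptions.TableLock, null)
            {
                DestinationTableName = "dbo.Orders" // placeholder table name
            };
            bulkCopy.WriteToServer(chunk);
        })));
    }
    ```

    If the table does have a clustered index, TableLock will instead serialize the loaders, so in that case it is usually better to load into a heap staging table in parallel and move the data afterwards.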

    1 person found this answer helpful.

1 additional answer

  1. Sushant Bagul 1 Reputation point
    2021-12-02T11:15:01.807+00:00

    Hi @Alberto Poblacion ,
    Along the same lines, we have developed a .NET application on the 4.5.2 framework with Oracle 19c as the database, using Oracle.DataAccess (4.122.19.1). We have a file with more than 26 lakh (2.6 million) records that we load into the database. The same code worked with DDTek.Oracle ODP.NET against Oracle 12c, but we have migrated the database to 19c, which only supports Oracle.DataAccess, and now we get "Attempted to read or write protected memory. This is often an indication that memory is corrupt" for more than 6.5 lakh records.
    So it appears that OracleBulkCopy allows about 6.5 lakh records to be loaded at a time, but no more than that.

    Could you please suggest the best approach so that we can load all records in one go, using C# with Oracle 19c?

