Microsoft SQL Server Bulk Insert NOT failing when inserting bigint values into int column

Sees, William T 25 Reputation points
2023-01-11T19:14:45.4966667+00:00

We recently had an identifier column move from int to bigint values. An ETL process that loads these values was not updated. That process uses BULK INSERT, and we are seeing incorrect values in the destination table. I would have expected a hard failure.

Note that this process had been running successfully for a long time before this issue.

Can anyone tell me what the heck SQL Server is doing here!?

I've been able to reproduce this issue in SQL Server 2017, 2019, and 2022, all with the most recent CUs.

Simplified example CSV file contents:

310067463717
310067463718
310067463719

Example SQL:

create table #t (t int);

bulk insert #t from 'c:\temp\test.csv'
with (datafiletype = 'char', fieldterminator = '|');

select * from #t;

Resulting data:

829818405
829818406
829818407
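Out of curiosity I checked the arithmetic: the "incorrect" values are exactly the input values taken modulo 2^32, i.e. the low 32 bits of each bigint. That suggests the conversion is silently truncating to 32 bits. A quick Python check of that observation (my own sanity check, not part of the original repro):

```python
# Observation: the bad values in the destination table appear to be the
# CSV values truncated to their low 32 bits (value mod 2**32).
inputs = [310067463717, 310067463718, 310067463719]
observed = [829818405, 829818406, 829818407]

truncated = [v % 2**32 for v in inputs]  # keep only the low 32 bits
assert truncated == observed  # matches the data BULK INSERT produced
```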

Interestingly, I tried adding some smaller values, and for those I do see an error.

Example CSV file contents (2147483647 is the largest int value in SQL Server):

310067463717
310067463718
310067463719
2147483647
2147483648

Running the same SQL code as above, I get an error for one row:

Msg 4867, Level 16, State 1, Line 4
Bulk load data conversion error (overflow) for row 5, column 1 (t).

The resulting data looks like this:

829818405
829818406
829818407
2147483647

I also tried just now with a much higher value, 31006746371945654, and that threw the same overflow error as 2147483648.

Lastly, I confirmed that if I create the table with the column defined as bigint, the data is inserted correctly.


create table #t (t bigint);

bulk insert #t from 'c:\temp\test.csv'
with (datafiletype = 'char', fieldterminator = '|');

select * from #t;

Resulting data:

2147483647
2147483648
310067463717
310067463718
310067463719


Answer accepted by question author
  1. Erland Sommarskog 128.9K Reputation points MVP Volunteer Moderator
    2023-01-12T20:00:00.4733333+00:00

    That's absolutely a bug, and very easy to reproduce. You should report it at https://feedback.azure.com/d365community/forum/04fe6ee0-3b25-ec11-b6e6-000d3a4f0da0. However, if this is a blocking issue for you and you need a fix, you will need to open a support case. The feedback site is more to let Microsoft know, and they may see it and fix it at some point.

    Some more musings about this error: I found that if I take out one zero, so that I get 31067463717, BULK INSERT still swallows the bait. But if I strip out both, i.e. 3167463717, BULK INSERT reports an error.
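That pattern looks consistent with 32-bit truncation: a value seems to slip through silently when its low 32 bits still fit in a signed int, and triggers the overflow error only when they don't. A quick Python sketch of that guess (an inference from the numbers in this thread, not confirmed behavior):

```python
# Hypothesis: BULK INSERT raises the overflow error only when the low
# 32 bits of the value overflow a signed 32-bit int; otherwise it
# silently inserts the truncated value.
INT_MAX = 2**31 - 1

def low32(v):
    """Return the low 32 bits of v (v mod 2**32)."""
    return v % 2**32

assert low32(31067463717) <= INT_MAX       # 1002692645: silently inserted
assert low32(3167463717) > INT_MAX         # raises the overflow error
assert low32(2147483648) > INT_MAX         # raises the overflow error
assert low32(31006746371945654) > INT_MAX  # raises the overflow error
```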

    I also tested this with the command-line tool BCP, and BCP does the right thing, at least with the numbers in the repro.

    1 person found this answer helpful.

2 additional answers

  1. Jingyang Li 5,901 Reputation points Volunteer Moderator
    2023-01-11T22:34:26.26+00:00

    I can reproduce your sample.

    I don't know how those wrong values ended up being inserted instead of raising an overflow error.

    Wrapping the BULK INSERT in an explicit transaction should let you catch the overflow error and roll back, so that no wrong data is inserted.


  2. Seeya Xi-MSFT 16,676 Reputation points
    2023-01-12T07:45:22.83+00:00

    Hi Sees, William T,

    Thanks for sharing. The int data type is the primary integer data type in SQL Server. The bigint data type is intended for use when integer values might exceed the range supported by int; their ranges are different.

    When you bulk insert, the data goes into the table you created. If a value does not match the column type or overflows it, SQL Server should report an error.

    Please read this document to check the ranges: int, bigint, smallint, and tinyint (Transact-SQL)
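For reference, those ranges can be double-checked with a short Python snippet (an illustration of the documented ranges, not taken from the linked document):

```python
# SQL Server integer ranges (4-byte int vs 8-byte bigint).
int_min, int_max = -2**31, 2**31 - 1        # -2147483648 .. 2147483647
bigint_min, bigint_max = -2**63, 2**63 - 1  # roughly +/- 9.2 quintillion

# The values from the question exceed the int range but fit in bigint:
assert not (int_min <= 310067463717 <= int_max)
assert bigint_min <= 310067463717 <= bigint_max
```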


    Best regards,

    Seeya

