We recently had an identifier column move from int values to bigint values. An ETL process that loads these values was not updated. That process uses BULK INSERT, and we are now seeing incorrect values in the destination table. I would have expected a hard failure.
Please note this process had been running successfully for a long time before this issue.
Can anyone tell me what the heck SQL Server is doing here!?
I've been able to re-create this issue in SQL Server 2017, 2019, and 2022, each with the most recent CU applied.
Simplified example CSV file contents:
310067463717
310067463718
310067463719
Example SQL:
create table #t (t int);

bulk insert #t
from 'c:\temp\test.csv'
with (
    datafiletype = 'char',
    fieldterminator = '|'
);

select * from #t;
Resulting data:
829818405
829818406
829818407
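One pattern I noticed (my own observation, nothing I've found documented): the bad values are not random. Each one is exactly the source value modulo 2^32 (4294967296), as if only the low 32 bits of each value survive. A quick check:

-- each corrupted value equals the source value mod 2^32
select cast(310067463717 as bigint) % cast(4294967296 as bigint); -- 829818405
select cast(310067463718 as bigint) % cast(4294967296 as bigint); -- 829818406
select cast(310067463719 as bigint) % cast(4294967296 as bigint); -- 829818407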
Interestingly, when I add smaller values around the int boundary, I do get an error, but only for one of them. Example CSV file contents (2147483647 is the largest int value in SQL Server):
310067463717
310067463718
310067463719
2147483647
2147483648
Running the same SQL code as above, I get an error for one row:
Msg 4867, Level 16, State 1, Line 4
Bulk load data conversion error (overflow) for row 5, column 1 (t).
The resulting data looks like this:
829818405
829818406
829818407
2147483647
I also tried a much larger value, 31006746371945654, and it threw the same overflow error as 2147483648.
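If the low-32-bits observation above is right, the error pattern is at least consistent: the two values that do raise the overflow error both have low 32 bits greater than 2147483647, while the silently wrapped values do not (my arithmetic below, not documented behavior):

-- both error-raising values have low 32 bits that still don't fit in a signed int
select cast(2147483648 as bigint) % cast(4294967296 as bigint);        -- 2147483648
select cast(31006746371945654 as bigint) % cast(4294967296 as bigint); -- 3072586934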
And last, I confirmed that if I create the table with the column defined as bigint, the inserted data is correct.
create table #t (t bigint);

bulk insert #t
from 'c:\temp\test.csv'
with (
    datafiletype = 'char',
    fieldterminator = '|'
);

select * from #t;
Resulting data:
2147483647
2147483648
310067463717
310067463718
310067463719
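For completeness, a quick way to verify the bigint load against the example file (values hard-coded from the CSV above):

-- rows loaded that are not in the source list; returns no rows
select t from #t
except
select v
from (values (cast(310067463717 as bigint)),
             (310067463718),
             (310067463719),
             (2147483647),
             (2147483648)) as src(v);

So with an int target column, BULK INSERT silently wraps some out-of-range values and raises an overflow error for others, while a bigint target loads everything correctly. Why does the int case wrap instead of failing for every out-of-range row? Is this behavior documented anywhere?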