SQL Server Bulk Insert with Primary Key Id

Fabio Coatis 21 Reputation points
2022-01-31T17:16:41.577+00:00

I have a table with 20 million rows:

CREATE TABLE [dbo].[Hist]  
(  
    [CNPJ_ID] [INT] IDENTITY(1,1) PRIMARY KEY,  
    [TP_FUNDO] [text] NULL,  
    [CNPJ_FUNDO] [nvarchar](max) NULL,  
    [DT_COMPTC] [varchar](50) NOT NULL,  
    [VL_TOTAL] [varchar](50) NOT NULL,  
    [VL_QUOTA] [varchar](50) NOT NULL,  
    [VL_PATRIM_LIQ] [varchar](50) NOT NULL,  
    [CAPTC_DIA] [varchar](50) NOT NULL,  
    [RESG_DIA] [varchar](50) NOT NULL,  
    [NR_COTST] [int] NULL     
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]  

I created a format file with:

 bcp CVM.dbo.Hist format nul -c -x -f  C:\Hist.xml -T  

and deleted the pk record and edited to look like:

<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">  
<RECORD>  
<FIELD ID="1" xsi:type="CharTerm" TERMINATOR=";" MAX_LENGTH="50" COLLATION="Latin1_General_CI_AS"/>  
<FIELD ID="2" xsi:type="CharTerm" TERMINATOR=";" MAX_LENGTH="50" COLLATION="Latin1_General_CI_AS"/>  
<FIELD ID="3" xsi:type="CharTerm" TERMINATOR=";" MAX_LENGTH="50" COLLATION="Latin1_General_CI_AS"/>  
<FIELD ID="4" xsi:type="CharTerm" TERMINATOR=";" MAX_LENGTH="50" COLLATION="Latin1_General_CI_AS"/>  
<FIELD ID="5" xsi:type="CharTerm" TERMINATOR=";" MAX_LENGTH="50" COLLATION="Latin1_General_CI_AS"/>  
<FIELD ID="6" xsi:type="CharTerm" TERMINATOR=";" MAX_LENGTH="50" COLLATION="Latin1_General_CI_AS"/>  
<FIELD ID="7" xsi:type="CharTerm" TERMINATOR=";" MAX_LENGTH="50" COLLATION="Latin1_General_CI_AS"/>  
<FIELD ID="8" xsi:type="CharTerm" TERMINATOR=";" MAX_LENGTH="50" COLLATION="Latin1_General_CI_AS"/>  
<FIELD ID="9" xsi:type="CharTerm" TERMINATOR="\r\n" MAX_LENGTH="12"/>  
</RECORD>  
<ROW>  
<COLUMN SOURCE="1" NAME="TP_FUNDO" xsi:type="SQLCHAR"/>  
<COLUMN SOURCE="2" NAME="CNPJ_FUNDO" xsi:type="SQLNVARCHAR"/>  
<COLUMN SOURCE="3" NAME="DT_COMPTC" xsi:type="SQLVARYCHAR"/>  
<COLUMN SOURCE="4" NAME="VL_TOTAL" xsi:type="SQLVARYCHAR"/>  
<COLUMN SOURCE="5" NAME="VL_QUOTA" xsi:type="SQLVARYCHAR"/>  
<COLUMN SOURCE="6" NAME="VL_PATRIM_LIQ" xsi:type="SQLVARYCHAR"/>  
<COLUMN SOURCE="7" NAME="CAPTC_DIA" xsi:type="SQLVARYCHAR"/>  
<COLUMN SOURCE="8" NAME="RESG_DIA" xsi:type="SQLVARYCHAR"/>  
<COLUMN SOURCE="9" NAME="NR_COTST" xsi:type="SQLINT"/>  
</ROW>  
</BCPFORMAT>  

When I tried:

INSERT INTO dbo.Hist ([TP_FUNDO], [CNPJ_FUNDO], [DT_COMPTC], [VL_TOTAL], [VL_QUOTA], [VL_PATRIM_LIQ], [CAPTC_DIA], [RESG_DIA], [NR_COTST])  
SELECT * FROM  OPENROWSET(  
BULK 'C:\Docs\2022.csv',  
FIRSTROW =2,  
FORMATFILE='C:\Docs\format_file.xml') as t;  

I got this error:

Msg 515, Level 16, State 2, Line 69
Cannot insert the value NULL into column 'CNPJ_ID', table 'CVM.dbo.Hist'; column does not allow nulls. INSERT fails.
The statement has been terminated.

my csv file is like:
169838-2022.txt

Any comments and suggestions will be appreciated.

Developer technologies | Transact-SQL
Developer technologies | Transact-SQL
A Microsoft extension to the ANSI SQL language that includes procedural programming, local variables, and various support functions.
SQL Server | Other
0 comments No comments
{count} votes

Answer accepted by question author
  1. Yitzhak Khabinsky 27,091 Reputation points
    2022-01-31T18:45:22.93+00:00

    Hi @Fabio Coatis ,

    Thanks for providing a minimal reproducible example.

    I tried to use it as-is, and got the following error:

    Msg 4864, Level 16, State 1, Line 21
    Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 1, column 9 (NR_COTST).

    I checked your source file, and noticed that its first row contains column headers.

    So, I just slightly modified the T-SQL by adding FIRSTROW = 2 in comparison with the stackoverflow solution.
    And it worked without any problem.

    SQL

    USE tempdb;  
    GO  
      
    DROP TABLE IF EXISTS dbo.Hist;  
      
    CREATE TABLE dbo.Hist  
     (  
         CNPJ_ID INT IDENTITY(1,1) PRIMARY KEY,  
         TP_FUNDO text NULL,  
         CNPJ_FUNDO nvarchar(max) NULL,  
         DT_COMPTC varchar(50) NOT NULL,  
         VL_TOTAL varchar(50) NOT NULL,  
         VL_QUOTA varchar(50) NOT NULL,  
         VL_PATRIM_LIQ varchar(50) NOT NULL,  
         CAPTC_DIA varchar(50) NOT NULL,  
         RESG_DIA varchar(50) NOT NULL,  
         NR_COTST int NULL     
     ) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY];  
      
    INSERT INTO dbo.Hist (TP_FUNDO, CNPJ_FUNDO, DT_COMPTC, VL_TOTAL, VL_QUOTA, VL_PATRIM_LIQ, CAPTC_DIA, RESG_DIA, NR_COTST)  
    SELECT *   
    FROM OPENROWSET(  
       BULK 'e:\Temp\FabioCoatis-0345\2022.csv',  
       FORMATFILE = 'e:\Temp\FabioCoatis-0345\hist_format_file.xml',  
       FIRSTROW = 2  
    ) AS t;  
      
    -- review results  
    SELECT * FROM dbo.Hist;  
    

    Please see below.

    169945-fabio.jpg

    P.S. Please connect with me on LinkedIn.


1 additional answer

Sort by: Most helpful
  1. CathyJi-MSFT 22,401 Reputation points Microsoft External Staff
    2022-02-01T07:34:28.98+00:00

    Hi @Fabio Coatis ,

    Quote from MS document Keep identity values when bulk importing data (SQL Server);

    >If the data file does not contain values for the identifier column in the table, use a format file to specify that the identifier column in the table should be skipped when importing data.

    Please reading above MS document to get more information.


    If the answer is helpful, please click "Accept Answer" and kindly upvote it. If you have extra questions about this answer, please click "Comment".

    0 comments No comments

Your answer

Answers can be marked as 'Accepted' by the question author and 'Recommended' by moderators, which helps users know the answer solved the author's problem.