What are my options for storing data that contains html tags?

CharlieLor 551 Reputation points
2023-02-02T18:14:40.24+00:00

I'm having difficulty inserting string texts that contains HTML tags. I'm using VACHAR as data type. I'm doing bulk insert from a CSV file into a database table. I keep getting these errors.

Msg 4864, Level 16, State 1, Line 294

Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 2, column 1 (id).

Msg 4864, Level 16, State 1, Line 294

Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 3, column 1 (id).

Msg 4864, Level 16, State 1, Line 294

Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 4, column 1 (id).

Msg 4832, Level 16, State 1, Line 294

Bulk load: An unexpected end of file was encountered in the data file.

Msg 7399, Level 16, State 1, Line 294

The OLE DB provider "BULK" for linked server "(null)" reported an error. The provider did not give any information about the error.

Msg 7330, Level 16, State 2, Line 294

Cannot fetch a row from OLE DB provider "BULK" for linked server "(null)".

This is what I have.

Bulk Insert AllAbstract
From 'C:\Temp\MyAbstracts.csv'
With (
	Firstrow = 2,
	Datafiletype = 'char',
	Fieldterminator = '","',
	Rowterminator = '0x0a'
	)
Go
SQL Server
SQL Server
A family of Microsoft relational database management and analysis systems for e-commerce, line-of-business, and data warehousing solutions.
12,692 questions
{count} votes

7 answers

Sort by: Most helpful
  1. Erland Sommarskog 100.9K Reputation points MVP
    2023-02-02T22:15:59.9966667+00:00

    I would store HTML data in nvarchar(MAX). But if you think that there will only be characters in your code page, or you have a UTF-8 collation varchar(MAX) should also work.

    In any case, the errors you are getting are not due to the data type as such, but because the file does not comply to the options you have set. You have set that each field is terminated by the sequence ",", which seems unlikely to me. Furthermore you have specified that the sole row terminator is linefeed, which is what we can expect if it is a Unix file, but not if it is a Windows file.

    Anyway, HTML or not, this is a sample of a line complying with the format you have specified:

    ABC","DEF","88989","XYZ
    

    If you think there are quotes missing, you are right. But this is what you have specified.

    As for what you should specify - as other have said, we need to see a sample file as well as the CREATE TABLE statement for the table. We also need to know the output of SELECT @@version.


  2. PercyTang-MSFT 12,426 Reputation points Microsoft Vendor
    2023-02-03T06:33:51.7533333+00:00

    Hi @CharlieLor

    Maybe you can refer to these posts.

    https://stackoverflow.com/questions/33259735/bulk-load-data-conversion-error-type-mismatch-or-invalid-character-for-the-spec

    https://stackoverflow.com/questions/12902110/bulk-insert-partially-quoted-csv-file-in-sql-server

    https://stackoverflow.com/questions/96448/sql-bulk-import-from-csv

    Best regards,

    Percy Tang


    If the answer is the right solution, please click "Accept Answer". If you have extra questions about this answer, please click "Comment".

    Note: Please follow the steps in our Documentation to enable e-mail notifications if you want to receive the related email notification for this thread.

    0 comments No comments

  3. CharlieLor 551 Reputation points
    2023-02-03T16:33:43.8133333+00:00

    Seems like I can't upload .csv file in here so I have rename it to test3.txt. This is my test data.

    Here's the error

    Msg 4832, Level 16, State 1, Line 294

    Bulk load: An unexpected end of file was encountered in the data file.

    Msg 7301, Level 16, State 2, Line 294

    Cannot obtain the required interface ("IID_IColumnsInfo") from OLE DB provider "BULK" for linked server "(null)".

    Completion time: 2023-02-03T10:26:54.1549588-06:00

    Bulk Insert AllAbstract
    From 'E:\Temp\test3.csv'
    With (
    	Firstrow = 2,
    	Datafiletype = 'char',
    	Fieldterminator = '","',
    	Rowterminator = '0x0a'
    	)
    Go
    
    0 comments No comments

  4. Bruce (SqlWork.com) 55,601 Reputation points
    2023-02-03T16:41:06.3733333+00:00

    csv files are a poor format to import html. html commonly contains quotes, commas, tabs, line feed etc.

    this makes it very hard to define the column and line separators for csv. you can preprocess the html, but then you have to post process after import. writing your own import program makes more sense.

    0 comments No comments

  5. Deleted

    This answer has been deleted due to a violation of our Code of Conduct. The answer was manually reported or identified through automated detection before action was taken. Please refer to our Code of Conduct for more information.


    Comments have been turned off. Learn more