How to ignore illegal characters in an XML file when reading it in SQL with OPENROWSET?

Mummana Venkatesh Babu 0 Reputation points
2023-04-05T13:21:44.7633333+00:00

When we use the code below to read the XML file, we get an "illegal xml character" error. The illegal character in the XML is shown in the attached image.

Below is the error message.

(0 rows affected)
Msg 9420, Level 16, State 1, Line 3
XML parsing: line 6612766, character 142, illegal xml character

Below is the code.


INSERT INTO [KM].[XML_XML] ([FILE_NAME], [BATCH_CODE], [XML])
SELECT 'Billrun_KPNRetail_20230302_20230262_PRODUCTION_1002_KPN.xml' AS [FILE_NAME],
       'KM_230302' AS [BATCH_CODE],
       CAST(BulkColumn AS XML) AS [XML]
FROM OPENROWSET(
         BULK 'importfiles/KM/XML/Billrun_KPNRetail_20230302_20230262_PRODUCTION_1002_KPN.xml',
         DATA_SOURCE = 'BlobStorage',
         CODEPAGE = '65001',
         SINGLE_BLOB
     ) AS ImportFile;
SQL Server | Other

1 answer

  1. LiHongMSFT-4306 31,566 Reputation points
    2023-04-06T07:02:53.3866667+00:00

    Hi @Mummana Venkatesh Babu, you can use an SSIS package to pull in the data and cleanse it: either route the entire record to a trash table, or filter out the character and ensure the data is encoded to UTF-8 standards.

    Also, you could pull in the record as a string, strip the character with a REPLACE and then convert it into XML.
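
    As a rough sketch of that string-and-REPLACE approach (an illustration under stated assumptions, not a tested fix for this file): the batch below reinterprets a blob as varchar(max), strips one control character, and only then casts to XML. The sample document is hypothetical, and CHAR(26) is purely a placeholder, since the thread does not identify the actual offending character. The binary collation Latin1_General_BIN2 is also an assumption, chosen so that REPLACE matches the exact byte value. It further assumes the file is UTF-8 and the character is a single-byte control character, so removing it cannot split a multi-byte sequence.

    ```sql
    -- Hypothetical sample standing in for the file contents: a small document
    -- containing a control character (0x1A) that XML 1.0 does not allow.
    DECLARE @blob varbinary(max) =
        CAST('<root><item>bad' + CHAR(26) + 'value</item></root>' AS varbinary(max));

    -- Reinterpret the bytes as a string. Using varchar(max) keeps REPLACE
    -- from truncating its result at 8,000 bytes.
    DECLARE @text varchar(max) = CAST(@blob AS varchar(max));

    -- CHAR(26) is a placeholder: substitute the code of the character
    -- actually found in the file. The binary collation is an assumption
    -- made here so the comparison is on the exact byte value.
    SET @text = REPLACE(@text COLLATE Latin1_General_BIN2, CHAR(26), '');

    -- The cast that raised Msg 9420 should now succeed for this sample.
    SELECT CAST(@text AS xml) AS CleanXml;
    ```

    In the original statement, the same expression would take the place of CAST(BulkColumn AS XML), for example CAST(REPLACE(CAST(BulkColumn AS varchar(max)) COLLATE Latin1_General_BIN2, CHAR(26), '') AS XML). If several different control characters occur, the REPLACE calls can be nested, one per character.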

    Best regards,

    Cosmog Hong



