Additional columns created in CSV files while loading to synapse

Question

HdfsBridge::recordReaderFillBuffer - Unexpected error encountered filling record reader buffer: HadoopExecutionException: Not enough columns in this line.

My CSV file contain columns in which there are multiple values in a cell separated by comma, when I am trying to load this in synapse, it is throwing this error. I am sharing the external file format settings below:

CREATE EXTERNAL FILE FORMAT ArcdummyFileFormat
WITH
(
FORMAT_TYPE = DELIMITEDTEXT,
FORMAT_OPTIONS
(
FIELD_TERMINATOR = ',',
STRING_DELIMITER = '"',
FIRST_ROW = 2,
USE_TYPE_DEFAULT = True,
Encoding = 'UTF8'
)
)

What changes should i make in the file format or is there any other way to resolve this issue ?

Answer

Hello @kshitijjain-0490 and welcome to Microsoft Q&A.

If I understand you correctly, your data looks like:

ColumnX, ColumnY, ColumnZ
DataX, DataYpart1,DataYpart2,DataYpart3, DataZ

If this is the case, it is not a valid CSV. You should try with a different delimiter such as | or ; or tab. Then it would look like:

ColumnX | ColumnY | ColumnZ
DataX | DataYpart1,DataYpart2,DataYpart3 | DataZ

Does this make sense?

Share via

Additional columns created in CSV files while loading to synapse

1 answer