How can I ingest Synapse Link export data (CDM folders) as an external table in Synapse Analytics?

Mehmet BAŞERDEM 41 Reputation points
2022-07-17T10:19:24.247+00:00

Hello all,

We are using Synapse Link to export Dynamics CRM data to ADLS Gen2. The CRM entities are exported as CSV files in CDM folders. The files have no header row, and the records for each entity are split across multiple CSV files. I am not sure whether all the CSVs for the same entity follow the same table structure and contain every column listed in the model.json file.
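A minimal sketch of how that structure check could be done outside Synapse, assuming the model.json lists each entity under "entities" with its columns under "attributes" (each having a "name"). The function names and the sample data are hypothetical and only illustrate the idea; real files would be read from the ADLS Gen2 container instead of inline strings.

```python
import csv
import io
import json


def entity_columns(model_json_text, entity_name):
    """Return the attribute names listed for one entity in a CDM model.json.

    Assumes the layout entities -> attributes -> name.
    """
    model = json.loads(model_json_text)
    for entity in model.get("entities", []):
        if entity.get("name") == entity_name:
            return [attr["name"] for attr in entity.get("attributes", [])]
    raise KeyError(f"entity {entity_name!r} not found in model.json")


def rows_with_wrong_width(csv_text, columns):
    """Return 1-based record numbers whose field count differs from the model."""
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    return [n for n, row in enumerate(reader, start=1) if len(row) != len(columns)]


# Hypothetical sample data standing in for a real model.json and CSV partition.
SAMPLE_MODEL = json.dumps({
    "name": "cdm",
    "entities": [
        {
            "name": "account",
            "attributes": [{"name": "Id"}, {"name": "name"}, {"name": "city"}],
        }
    ],
})
SAMPLE_CSV = "a1,Contoso,Istanbul\na2,Fabrikam\n"

columns = entity_columns(SAMPLE_MODEL, "account")
bad_rows = rows_with_wrong_width(SAMPLE_CSV, columns)
```

Running the same check over every CSV partition of an entity would show whether any file deviates from the column list in model.json.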

There is also a subfolder called "Snapshot". I am not sure whether we should include it in our external table location definition. If we shouldn't, I don't know how to exclude it.

When I try to query the data, I receive the error below. I need to confirm whether my external file format is appropriate for reading data from CDM folders.

Msg 107090, Level 16, State 1, Line 1
HdfsBridge::recordReaderFillBuffer - Unexpected error encountered filling record reader buffer: HadoopExecutionException: No closing string delimiter.

Completion time: 2022-07-17T13:33:46.6246351+03:00

CREATE EXTERNAL FILE FORMAT [CDMCSVFileFormat] WITH (
    FORMAT_TYPE = DELIMITEDTEXT,
    FORMAT_OPTIONS (
        FIELD_TERMINATOR = N',',
        STRING_DELIMITER = N'"',
        FIRST_ROW = 1,
        USE_TYPE_DEFAULT = FALSE
    )
);

Thanks,
Mehmet Başerdem

Azure Synapse Analytics

1 answer

  1. Mehmet BAŞERDEM 41 Reputation points
    2022-07-18T13:59:18.307+00:00

    While checking the source data, we found double-quote characters inside text fields. They appear to be what makes the query fail.

    I think Synapse Link does not fully follow the CSV format. It breaks rule 7 in
    https://datatracker.ietf.org/doc/html/rfc4180#section-2
    which requires a double quote inside a quoted field to be escaped by preceding it with another double quote.
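To illustrate the rule, here is a small Python sketch that flags lines containing a bare double quote inside a quoted field and shows the escaping that RFC 4180 expects. It uses the standard csv module in strict mode. The function names and sample rows are hypothetical, and the check works line by line, so a field that legitimately spans several lines would also be flagged.

```python
import csv


def escape_field(value):
    """Quote a field per RFC 4180: wrap in quotes and double any inner quote."""
    return '"' + value.replace('"', '""') + '"'


def find_malformed_rows(lines):
    """Return 1-based numbers of lines that a strict CSV parser rejects.

    Each line is parsed on its own, so this is only a rough check:
    quoted fields with embedded newlines are reported as malformed too.
    """
    bad = []
    for number, line in enumerate(lines, start=1):
        try:
            # strict=True makes the parser raise csv.Error on bad input
            # instead of silently guessing at the field boundaries.
            list(csv.reader([line], strict=True))
        except csv.Error:
            bad.append(number)
    return bad


# Hypothetical sample rows: the second one has unescaped inner quotes.
sample_lines = [
    '1,"Acme ""Best"" Ltd",NY',
    '2,"Acme "Best" Ltd",NY',
    '3,plain,TX',
]

malformed = find_malformed_rows(sample_lines)
escaped = escape_field('Acme "Best" Ltd')
```

A check like this could be run over the exported files to find the affected records before loading them through the external table.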

