Share via

Error while trying to read data from ADLS to SQL Server using Polybase External table

Palash Aich 21 Reputation points
2021-04-08T13:49:51.387+00:00

Hello there,

I have a CSV file available in ADLS Gen 2 location. While trying to query the data from SQL server On-Prem, I am getting below error. I have another file, and it is working fine. What should I modify in the file format to get rid of this error.

Error:
HdfsBridge::recordReaderFillBuffer - Unexpected error encountered filling record reader buffer: HadoopExecutionException: No closing string delimiter.

File Format:
FORMAT_TYPE = DELIMITEDTEXT
, FORMAT_OPTIONS (FIELD_TERMINATOR = N',' , STRING_DELIMITER = N'"', DATE_FORMAT = N'yyyy-MM-dd HH:mm'
, USE_TYPE_DEFAULT = False)

CSV File Data:
b7b99200-b790-eb11-b1ac-000d3a9e5194,"4/7/2021 9:02:58 AM","4/7/2021 9:02:58 AM",0,1,,,,,,1,1,,1,1,,1,1,,1,1,,1,,1,False,False,False,,False,False,False,False,False,True,False,False,False,False,,,,,,,,,,,,,dba4ce79-af8c-eb11-b1ac-000d3a8efdd4,"systemuser",,,dba4ce79-af8c-eb11-b1ac-000d3a8efdd4,"systemuser",,,,,a7fd3dd7-fb36-eb11-a813-000d3a8b6b66,"businessunit",,,,,c788a9b1-2137-eb11-a813-000d3a8b6b66,"transactioncurrency",,,,,,,,,dba4ce79-af8c-eb11-b1ac-000d3a8efdd4,"systemuser",,,,,dba4ce79-af8c-eb11-b1ac-000d3a8efdd4,"systemuser",,2782500.0,,,,2782500.0,,,,,,,,,,,,,,,"2021-03-29T17:54:48Z",,,,,,,,,,,"2021-04-07T02:35:30Z",,,,,,,,"Corporation",,,,,,,1,,,,,,"2021-03-29T17:48:44.0000000+00:00",,1,,,,"S Raikar",,,,,,,,,,,,,63d61000-742e-46a1-a30f-e6ed720b1ff5,,,,,,1,,,,,,,,,,,,,,,,,,,,,,,,,,,,,"Raikar",,,,"2021-04-07T04:35:33Z",fe62b29c-e53b-4aac-b240-deba1f0f9350,,,,,,,,"001a000001GDSFjAAP",,,,7605457,b7b99200-b790-eb11-b1ac-000d3a9e5194,,,,,,"S Raikar",,,,,,1

Azure Data Lake Storage
Azure Data Lake Storage

An Azure service that provides an enterprise-wide hyper-scale repository for big data analytic workloads and is integrated with Azure Blob Storage.

Developer technologies | Transact-SQL
Developer technologies | Transact-SQL

A Microsoft extension to the ANSI SQL language that includes procedural programming, local variables, and various support functions.

SQL Server | Other
SQL Server | Other

Additional SQL Server features and topics not covered by specific categories


5 answers

Sort by: Most helpful
  1. Mehmet BAŞERDEM 41 Reputation points
    2022-07-18T14:19:19.167+00:00

    I have a similar issue in my case. In one of the rows of the source data, textfield has doublequote character inside. It looks like Polybase external table can't handle these type of escape-character scenarios. We are going to prune the quote characters before we use it with polybase

    Was this answer helpful?

    0 comments No comments

  2. Erland Sommarskog 134.7K Reputation points MVP Volunteer Moderator
    2021-04-14T21:18:55.583+00:00

    Is there any other way I can provide you file please?

    1. Upload it elsewhere (OneDrive, Dropbox etc) and post the link.
    2. Change the extension of the file to a permitted type - and inform us that we need to change it back.

    But my guess from the error message is simply that the file is malformed.

    Rather than uploading the file to have other people to look at it, you could do some troubleshooting on your own. Divide the file into two files. If one of these loads, then slash the the bad half in two etc.

    Or load the file to a one-column table with one row for each file. Then do:

    SELECT col FROM file WHERE len(replace(file, '"', '')) - len(file) % 2 = 1
    

    Troubleshooting things like this is part of the programmer's trade.

    Was this answer helpful?

    0 comments No comments

  3. Palash Aich 21 Reputation points
    2021-04-14T09:29:16.937+00:00

    Hi @AnuragSharma-MSFT

    I am unable to attach file as it is more than 7 MB. Also, I compressed the file which is of less size, however, zip file is not supporting to attach here.
    Is there any other way I can provide you file please?

    Thanks,
    Palash

    Was this answer helpful?

    0 comments No comments

  4. Anurag Sharma 17,636 Reputation points
    2021-04-12T15:07:01.65+00:00

    Hi @Palash Aich , thanks for replying back. I tried using the same scripts and data you provided and was able to query the csv data.

    Firstly I uploaded the CSV file in Blob storage configured as Data Lake Storage Gen2. Then I created the scoped credentials and used the same for querying the data as below:

    CREATE MASTER KEY ENCRYPTION BY PASSWORD='password@123';  
      
    CREATE DATABASE SCOPED CREDENTIAL AzureStorageCredential  
    WITH  
      IDENTITY = 'XYZ',  
      SECRET = '<your storage key>' ;  
      
    CREATE EXTERNAL DATA SOURCE AzureStorage  
    WITH   
    (  
        TYPE = HADOOP,  
            LOCATION = 'wasbs://******@youstorageacc.blob.core.windows.net',  
        CREDENTIAL = AzureStorageCredential  
    );  
      
    CREATE EXTERNAL FILE FORMAT TextFileFormat   
    WITH (   
        FORMAT_TYPE = DELIMITEDTEXT,   
        FORMAT_OPTIONS (   
            FIELD_TERMINATOR = N',' , STRING_DELIMITER = N'"', DATE_FORMAT = N'yyyy-MM-dd HH:mm',USE_TYPE_DEFAULT = False  
        )   
    );  
    

    Also attaching the csv file and create external table script provided by you:
    86916-data.txt
    87000-externaltablescript.txt

    Could you please try it once and let me know if this works for you?

    Was this answer helpful?


  5. Palash Aich 21 Reputation points
    2021-04-12T06:44:53.913+00:00

    Hi @AnuragSharma-MSFT ,

    Below is the table create script.

    CREATE EXTERNAL TABLE [dbo].[Account_CDM_Formatted]
    (
    [C1] varchar NULL,
    [C2] varchar NULL,
    [C3] varchar NULL,
    [C4] varchar NULL,
    [C5] varchar NULL,
    [C6] varchar NULL,
    [C7] varchar NULL,
    [C8] varchar NULL,
    [C9] varchar NULL,
    [C10] varchar NULL,
    [C11] varchar NULL,
    [C12] varchar NULL,
    [C13] varchar NULL,
    [C14] varchar NULL,
    [C15] varchar NULL,
    [C16] varchar NULL,
    [C17] varchar NULL,
    [C18] varchar NULL,
    [C19] varchar NULL,
    [C20] varchar NULL,
    [C21] varchar NULL,
    [C22] varchar NULL,
    [C23] varchar NULL,
    [C24] varchar NULL,
    [C25] varchar NULL,
    [C26] varchar NULL,
    [C27] varchar NULL,
    [C28] varchar NULL,
    [C29] varchar NULL,
    [C30] varchar NULL,
    [C31] varchar NULL,
    [C32] varchar NULL,
    [C33] varchar NULL,
    [C34] varchar NULL,
    [C35] varchar NULL,
    [C36] varchar NULL,
    [C37] varchar NULL,
    [C38] varchar NULL,
    [C39] varchar NULL,
    [C40] varchar NULL,
    [C41] varchar NULL,
    [C42] varchar NULL,
    [C43] varchar NULL,
    [C44] varchar NULL,
    [C45] varchar NULL,
    [C46] varchar NULL,
    [C47] varchar NULL,
    [C48] varchar NULL,
    [C49] varchar NULL,
    [C50] varchar NULL,
    [C51] varchar NULL,
    [C52] varchar NULL,
    [C53] varchar NULL,
    [C54] varchar NULL,
    [C55] varchar NULL,
    [C56] varchar NULL,
    [C57] varchar NULL,
    [C58] varchar NULL,
    [C59] varchar NULL,
    [C60] varchar NULL,
    [C61] varchar NULL,
    [C62] varchar NULL,
    [C63] varchar NULL,
    [C64] varchar NULL,
    [C65] varchar NULL,
    [C66] varchar NULL,
    [C67] varchar NULL,
    [C68] varchar NULL,
    [C69] varchar NULL,
    [C70] varchar NULL,
    [C71] varchar NULL,
    [C72] varchar NULL,
    [C73] varchar NULL,
    [C74] varchar NULL,
    [C75] varchar NULL,
    [C76] varchar NULL,
    [C77] varchar NULL,
    [C78] varchar NULL,
    [C79] varchar NULL,
    [C80] varchar NULL,
    [C81] varchar NULL,
    [C82] varchar NULL,
    [C83] varchar NULL,
    [C84] varchar NULL,
    [C85] varchar NULL,
    [C86] varchar NULL,
    [C87] varchar NULL,
    [C88] varchar NULL,
    [C89] varchar NULL,
    [C90] varchar NULL,
    [C91] varchar NULL,
    [C92] varchar NULL,
    [C93] varchar NULL,
    [C94] varchar NULL,
    [C95] varchar NULL,
    [C96] varchar NULL,
    [C97] varchar NULL,
    [C98] varchar NULL,
    [C99] varchar NULL,
    [C100] varchar NULL,
    [C101] varchar NULL,
    [C102] varchar NULL,
    [C103] varchar NULL,
    [C104] varchar NULL,
    [C105] varchar NULL,
    [C106] varchar NULL,
    [C107] varchar NULL,
    [C108] varchar NULL,
    [C109] varchar NULL,
    [C110] varchar NULL,
    [C111] varchar NULL,
    [C112] varchar NULL,
    [C113] varchar NULL,
    [C114] varchar NULL,
    [C115] varchar NULL,
    [C116] varchar NULL,
    [C117] varchar NULL,
    [C118] varchar NULL,
    [C119] varchar NULL,
    [C120] varchar NULL,
    [C121] varchar NULL,
    [C122] varchar NULL,
    [C123] varchar NULL,
    [C124] varchar NULL,
    [C125] varchar NULL,
    [C126] varchar NULL,
    [C127] varchar NULL,
    [C128] varchar NULL,
    [C129] varchar NULL,
    [C130] varchar NULL,
    [C131] varchar NULL,
    [C132] varchar NULL,
    [C133] varchar NULL,
    [C134] varchar NULL,
    [C135] varchar NULL,
    [C136] varchar NULL,
    [C137] varchar NULL,
    [C138] varchar NULL,
    [C139] varchar NULL,
    [C140] varchar NULL,
    [C141] varchar NULL,
    [C142] varchar NULL,
    [C143] varchar NULL,
    [C144] varchar NULL,
    [C145] varchar NULL,
    [C146] varchar NULL,
    [C147] varchar NULL,
    [C148] varchar NULL,
    [C149] varchar NULL,
    [C150] varchar NULL,
    [C151] varchar NULL,
    [C152] varchar NULL,
    [C153] varchar NULL,
    [C154] varchar NULL,
    [C155] varchar NULL,
    [C156] varchar NULL,
    [C157] varchar NULL,
    [C158] varchar NULL,
    [C159] varchar NULL,
    [C160] varchar NULL,
    [C161] varchar NULL,
    [C162] varchar NULL,
    [C163] varchar NULL,
    [C164] varchar NULL,
    [C165] varchar NULL,
    [C166] varchar NULL,
    [C167] varchar NULL,
    [C168] varchar NULL,
    [C169] varchar NULL,
    [C170] varchar NULL,
    [C171] varchar NULL,
    [C172] varchar NULL,
    [C173] varchar NULL,
    [C174] varchar NULL,
    [C175] varchar NULL,
    [C176] varchar NULL,
    [C177] varchar NULL,
    [C178] varchar NULL,
    [C179] varchar NULL,
    [C180] varchar NULL,
    [C181] varchar NULL,
    [C182] varchar NULL,
    [C183] varchar NULL,
    [C184] varchar NULL,
    [C185] varchar NULL,
    [C186] varchar NULL,
    [C187] varchar NULL,
    [C188] varchar NULL,
    [C189] varchar NULL,
    [C190] varchar NULL,
    [C191] varchar NULL,
    [C192] varchar NULL,
    [C193] varchar NULL,
    [C194] varchar NULL,
    [C195] varchar NULL,
    [C196] varchar NULL,
    [C197] varchar NULL,
    [C198] varchar NULL,
    [C199] varchar NULL,
    [C200] varchar NULL,
    [C201] varchar NULL,
    [C202] varchar NULL,
    [C203] varchar NULL,
    [C204] varchar NULL,
    [C205] varchar NULL,
    [C206] varchar NULL,
    [C207] varchar NULL,
    [C208] varchar NULL,
    [C209] varchar NULL,
    [C210] varchar NULL,
    [C211] varchar NULL,
    [C212] varchar NULL,
    [C213] varchar NULL,
    [C214] varchar NULL,
    [C215] varchar NULL,
    [C216] varchar NULL,
    [C217] varchar NULL,
    [C218] varchar NULL,
    [C219] varchar NULL,
    [C220] varchar NULL,
    [C221] varchar NULL,
    [C222] varchar NULL
    )
    WITH (DATA_SOURCE = [AzureStorage],LOCATION = N'/account/2021-04.csv',FILE_FORMAT = [CustomFormat],REJECT_TYPE = VALUE,REJECT_VALUE = 0)
    GO

    Was this answer helpful?

    0 comments No comments

Your answer

Answers can be marked as 'Accepted' by the question author and 'Recommended' by moderators, which helps users know the answer solved the author's problem.