Problem querying a CSV File with Synapse SQL Serverless Pool

Silvano Paxia 101 Reputation points
2020-12-07T17:24:07.48+00:00

Hello,

trying to query a very simple CSV file, Strings have Double quote (").

SELECT
    TOP 100 *
FROM
    OPENROWSET(
        BULK 'https://xxxx.dfs.core.windows.net/sample/csv/backlog-20201207-2.csv',
        FORMAT = 'CSV',
        PARSER_VERSION='2.0',
firstrow = 2,
FIELDQUOTE = '"'
) AS [result]

Keep getting this error:

Error handling external file: 'Quotes ' ' must be inside quoted fields at [byte: 206]. '. File: 'https://dfs.core.windows.net/sample/csv/backlog-20201207-2.csv'.

Any help?

thank you so much

Azure Synapse Analytics
Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.
5,373 questions
0 comments No comments
{count} votes

6 answers

Sort by: Most helpful
  1. Silvano Paxia 101 Reputation points
    2020-12-11T01:27:12.033+00:00

    Hello Himanshu,

    Thank you so much!

    It works also for me but there is an issue.

    My field quote for strings is the double quote “

    As you can see now the strings have double quotes.

    I thought I could have just specify as Field quote

    FIELDQUOTE = '”'

    But it is not working.

    Is there a way I can get the string without double quote.

    If I filter for example with a WHERE condition like

    WHERE C5 = '1'

    I get the following error in the messages

    Potential conversion error while reading VARCHAR column 'C5' from UTF8 encoded text. Change database collation to a UTF8 collation or specify explicit column schema in WITH clause and assign UTF8 collation to VARCHAR columns. )
    Error handling external file: 'Quotes ' ' must be inside quoted fields at [byte: 59782]. '. File: 'https://myblob.dfs.core.windows.net/file.csv'.

    Thank you so much

    --silvano

    1 person found this answer helpful.

  2. PRADEEPCHEEKATLA 90,641 Reputation points Moderator
    2020-12-08T10:22:20.43+00:00

    Hello @Silvano Paxia ,

    Welcome to the Microsoft Q&A platform.

    As per my repro, I'm able to successfully run the query using a serverless SQL pool in Azure Synapse Analytics.

    46233-image.png

    In order to investigate further, could you please share the sample data to test on your dataset?

    Meanwhile, you can go through How to use OPENROWSET using serverless SQL pool in Azure Synapse Analytics.

    Hope this helps. Do let us know if you any further queries.

    ------------

    • Please accept an answer if correct. Original posters help the community find answers faster by identifying the correct answer. Here is how.
    • Want a reminder to come back and check responses? Here is how to subscribe to a notification.

  3. HimanshuSinha-msft 19,486 Reputation points Microsoft Employee Moderator
    2020-12-10T23:44:37.997+00:00

    Hello @Silvano Paxia ,

    The below works for me .

    SELECT  
        TOP 100 *  
    FROM  
        OPENROWSET(  
            BULK 'https://myblob.blob.core.windows.net/himanshu/File2/backlog-20201207-2.csv',  
            FORMAT = 'CSV',  
            PARSER_VERSION='2.0',  
           FIELDTERMINATOR ='   ',    
            firstrow = 2,         
            FIELDQUOTE = ''''  
    ) AS [result]  
    

    Let me know how it goes .

    Thanks
    Himanshu

    0 comments No comments

  4. Silvano Paxia 101 Reputation points
    2020-12-21T21:40:10.88+00:00

    Hello @HimanshuSinha-msft

    can you please help me? I sent you an email last week.

    Thanks


  5. Frederico Camargo 1 Reputation point
    2021-03-23T13:42:45.137+00:00

    Have you checked if this header is OK?
    It seems it is trying to load the header anyway and it has problems.
    Set FIRSTROW=1 for trying to avoid reading the first row as a header.


Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.