Share via

SQL example for 'querying JSON' seems incorrect in the learn module

Karthik Narayanan 20 Reputation points
2024-01-02T01:36:31.1733333+00:00

The SQL example given under the heading "Querying JSON files" seems to be incorrect on the following module.

https://learn.microsoft.com/en-us/training/modules/query-data-lake-using-azure-synapse-serverless-sql-pools/3-query-files

SELECT doc
FROM
    OPENROWSET(
        BULK 'https://mydatalake.blob.core.windows.net/data/files/*.json',
        FORMAT = 'csv',
        FIELDTERMINATOR ='0x0b',
        FIELDQUOTE = '0x0b',
        ROWTERMINATOR = '0x0b'
    ) WITH (doc NVARCHAR(MAX)) as rows

It seems to be written for CSV files. Not JSON.

Azure Synapse Analytics
Azure Synapse Analytics

An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.


Answer accepted by question author

Bruno Gomes 111 Reputation points
2024-01-02T01:56:34.98+00:00

Hi Karthikaeya,

I agree with you that the SQL example given under the heading "Querying JSON files" in the learn module is incorrect. The example is using the OPENROWSET() function with the FORMAT = 'csv' option, which is only valid for CSV files.

To query JSON files, you should use the FORMAT = 'json' option. The following is a corrected version of the example:

SQL


SELECT doc

FROM

  OPENROWSET(

    BULK 'https://mydatalake.blob.core.windows.net/data/files/*.json',

    FORMAT = 'json'

  ) WITH (doc NVARCHAR(MAX)) as rows

This example will correctly load the JSON data from the specified Azure Blob Storage container and return the doc column, which contains the entire JSON document as a single string.

Here is a more detailed explanation of the changes I made:

  • I changed the FORMAT option from 'csv' to 'json'. This tells the OPENROWSET() function to expect JSON data.
  • I removed the FIELDTERMINATOR and FIELDQUOTE options. These options are only used for CSV data.
  • I added the ROWTERMINATOR option with the value '0x0b'. This tells the OPENROWSET() function to use the carriage return (\r) character as a row terminator.

I hope this helps! Let me know if you have any other questions.

Sincerely,

Bruno Gomes.

Was this answer helpful?

0 comments No comments

0 additional answers

Sort by: Most helpful

Your answer

Answers can be marked as 'Accepted' by the question author and 'Recommended' by moderators, which helps users know the answer solved the author's problem.