Import large json file (7GB+) into sql server

moondaddy 916 Reputation points
2021-07-06T23:02:57.617+00:00

I have a large json file which is irregularly structured.

*All top level elements are unique names like this:

112230-image.png

I tried this:

SELECT *  
INTO Json1  
FROM OPENROWSET (BULK 'D:\SomeData.json', SINGLE_CLOB) as j  

but get the exception: "Attempting to grow LOB beyond maximum allowed size of 2147483647 bytes."

However, this would work if I could import each of the elements you see into it's own row. I think my code above it trying to insert all data into one row, one column.

Then once each element is in a separate row I think I can read the json from there to finish the processing. Below is a sample of what the data looks like:

112323-image.png

Can someone please tell me what the sql syntax is to import this file into a table where each top level element lands in a new row?

Thank you.

Developer technologies Transact-SQL
{count} votes

3 answers

Sort by: Most helpful
  1. MelissaMa-MSFT 24,221 Reputation points
    2021-07-07T02:07:43.083+00:00

    Hi @moondaddy ,

    Instead of reading an entire JSON file as a single value, you may want to parse it and return the books in the file and their properties in rows and columns.

     insert into Json1  
     SELECT value  
     FROM OPENROWSET (BULK 'D:\SomeData.json', SINGLE_CLOB) as j  
     CROSS APPLY OPENJSON(BulkColumn)  
    

    The preceding OPENROWSET reads a single text value from the file. OPENROWSET returns the value as a BulkColumn, and passes BulkColumn to the OPENJSON function. OPENJSON iterates through the array of JSON objects in the BulkColumn array, and returns one book in each row.

    The OPENJSON function can parse the JSON content and transform it into a table or a result set. The following example loads the content, parses the loaded JSON, and returns the five fields as columns:

    insert into Json1  
    SELECT book.*  
     FROM OPENROWSET (BULK 'C:\JSON\Books\books.json', SINGLE_CLOB) as j  
     CROSS APPLY OPENJSON(BulkColumn)  
     WITH( id nvarchar(100), name nvarchar(100), price float,  
     pages_i int, author nvarchar(100)) AS book  
    

    You could refer more details from this link.

    Best regards,
    Melissa


    If the answer is helpful, please click "Accept Answer" and upvote it.
    Note: Please follow the steps in our documentation to enable e-mail notifications if you want to receive the related email notification for this thread.


  2. MelissaMa-MSFT 24,221 Reputation points
    2021-07-07T02:35:13.81+00:00

    Hi @moondaddy ,

    Thanks for your update.

    I think filestream feature may help you if you are having SQL server 2008 and on.

    Or you could have a try to split the JSON file into small files, convert them to CSV files and imported these CSV files into SQL Server as mentioned in this article.

    Best regards,
    Melissa


    If the answer is helpful, please click "Accept Answer" and upvote it.
    Note: Please follow the steps in our documentation to enable e-mail notifications if you want to receive the related email notification for this thread.


  3. MelissaMa-MSFT 24,221 Reputation points
    2021-07-07T07:27:58.037+00:00

    Hi @moondaddy ,

    Thanks for your update.

    I did a lot of research but failed to find any effective solution using OPENROWSET ,OPENJSON or other methods in SQL Server side due to your schema of JSON was not fixed.

    You could also try to use Newtonsoft.Json to read each top-level element into a JToken, convert the JToken into a string and store it in a Datatable, and then use SqlBulkCopy Class which could be able to achieve your requirement.

    If you still encounter any difficulties or unintelligible errors while coding, welcome to post a new question and add C# related tag 'dotnet-csharp'.

    JsonSerializer serializer = new JsonSerializer();  
    using (FileStream s = File.Open(@"C:\...\test.json", FileMode.Open))  
    using (StreamReader sr = new StreamReader(s))  
    using (JsonReader reader = new JsonTextReader(sr))  
    {  
    while (reader.Read())  
    {  
    // deserialize only when there's "{" character in the stream  
    if (reader.TokenType == JsonToken.StartObject)  
    {  
    JToken jObject = JToken.Load(reader);  
    }  
    }  
    }  
    

    Best regards,
    Melissa


    If the answer is helpful, please click "Accept Answer" and upvote it.
    Note: Please follow the steps in our documentation to enable e-mail notifications if you want to receive the related email notification for this thread.


Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.