Read a JSON file generated from a RavenDB export that has duplicate columns

Shailendra Kad 11 Reputation points
2021-11-03T13:08:00.36+00:00

Hi Team,

I want to load a JSON file generated from a RavenDB export.
It is a rather complex file with a lot of arrays and strings in it.
The only issue is that two of its columns are duplicated.
I mean, ideally this JSON would not be considered valid, since these two columns appear in the file multiple times.
The sample structure is as follows:
Docs[]
Attachments
Docs[]
Attachments
Indexes[]
Transformers[]
Docs[]

As you can see, the Docs column is repeated multiple times.
Docs is the important column; it is an array of documents.
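
To make the repetition concrete, here is a minimal Python sketch that counts how often each top-level key occurs. The `sample` string is made-up data that only mimics the structure above, and `count_top_level_keys` is a hypothetical helper name; it assumes the file's top level is a single JSON object that fits in memory.

```python
import json
from collections import Counter

def count_top_level_keys(text):
    """Count occurrences of each top-level key in a JSON object."""
    # object_pairs_hook hands every object over as a list of (key, value)
    # pairs, so repeated keys are preserved instead of being collapsed
    # into a dict where the last value wins.
    pairs = json.loads(text, object_pairs_hook=list)
    return Counter(key for key, _ in pairs)

# Made-up sample that mimics the export layout (not real RavenDB output).
sample = (
    '{"Docs": [{"Id": 1}], "Attachments": [], '
    '"Docs": [{"Id": 2}], "Attachments": [], '
    '"Indexes": [], "Transformers": [], "Docs": [{"Id": 3}]}'
)

counts = count_top_level_keys(sample)
duplicates = sorted(key for key, n in counts.items() if n > 1)
print(duplicates)
```

Note that a plain `json.loads(sample)` would not raise an error here; it would keep only the last `Docs` array, so the earlier documents would be lost without any warning.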

In the data flow source, I get a duplicate column error:
{"message":"Job failed due to reason: at Source 'Json': org.apache.spark.sql.AnalysisException: Found duplicate column(s) in the data schema: Attachments, Docs;.

I am also trying to read the file as a delimited file to see whether I can remove the duplicates that way.
Do you have a solution for how I can process it?

Or is there any other way I can load it?
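
One pre-processing idea that might work, sketched in Python: parse the file once, concatenate the arrays of any repeated key, and write out a cleaned file with unique keys for the data flow to read. This is only a sketch under assumptions: the export fits in memory, repeated keys hold arrays that are safe to concatenate, and the names `merge_duplicate_keys`, `load_merged`, and the sample data are hypothetical.

```python
import json

def merge_duplicate_keys(pairs):
    """Build a dict from (key, value) pairs, concatenating repeated arrays."""
    merged = {}
    for key, value in pairs:
        if (key in merged
                and isinstance(merged[key], list)
                and isinstance(value, list)):
            # Repeated array key (e.g. Docs): append the new items.
            merged[key].extend(value)
        else:
            # First occurrence, or a non-array value: last value wins.
            merged[key] = value
    return merged

def load_merged(text):
    # The hook runs for every object in the document, nested ones included.
    return json.loads(text, object_pairs_hook=merge_duplicate_keys)

# Made-up sample that mimics the export layout (not real RavenDB output).
sample = (
    '{"Docs": [{"Id": 1}], "Attachments": [], '
    '"Docs": [{"Id": 2}], "Attachments": [], '
    '"Indexes": [], "Transformers": [], "Docs": [{"Id": 3}]}'
)

cleaned = load_merged(sample)
cleaned_text = json.dumps(cleaned)  # valid JSON with unique keys
print(cleaned_text)
```

The cleaned text could then be written to storage and used as the source instead of the raw export. For an export too large to hold in memory, a streaming parser would be needed instead of `json.loads`.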

Azure Synapse Analytics
Azure Data Factory