"DF-Executor-OutOfMemoryError" in Azure Synapse
I have a JSON file exported from RavenDB which is not valid JSON because it contains duplicate keys.
So my first step is to clean the JSON and, wherever a key is duplicated, write out a separate JSON file for each occurrence.
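For reference, this is a minimal sketch of the kind of split I am doing (the file path, the output names, and the assumption that Docs is the only repeated top-level key are illustrative; the real 10 GB backup would have to be streamed rather than read in one go):

```python
SRC = "ravendb_backup.json"   # illustrative path to the exported backup
DUP_KEY = '"Docs"'            # the repeated top-level key; casing may differ in the export

def split_duplicate_key(text, dup_key=DUP_KEY):
    """Find every occurrence of dup_key at nesting depth 1 and cut the
    root object into one piece per occurrence."""
    offsets = []
    depth = 0
    in_string = False
    escaped = False
    for i, ch in enumerate(text):
        if in_string:                          # skip over string contents
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"':
            if depth == 1 and text.startswith(dup_key, i):
                offsets.append(i)              # a top-level duplicate key starts here
            in_string = True
        elif ch in "{[":
            depth += 1
        elif ch in "}]":
            depth -= 1

    # Assume the root object holds only the repeated key, so each piece runs
    # from one occurrence to the next; the last piece ends at the final "}".
    bounds = offsets + [len(text) - 1]
    pieces = []
    for start, end in zip(bounds, bounds[1:]):
        segment = text[start:end].rstrip().rstrip(",")
        pieces.append("{" + segment + "}")
    return pieces

if __name__ == "__main__":
    with open(SRC, encoding="utf-8") as f:
        text = f.read().rstrip()               # fine for small samples; 10 GB needs streaming
    for n, piece in enumerate(split_duplicate_key(text), start=1):
        with open(f"docs_part_{n}.json", "w", encoding="utf-8") as out:
            out.write(piece)                   # each part is now a standalone JSON object
```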
I was able to do this for a sample file and it ran successfully.
Then I tried a 12 MB file and it also worked.
But when I tried the full DB backup file, which is 10 GB in size, it gives an error.
This 10 GB file generates 3 separate JSON files because the Docs key appears 3 times.
The first file is 9.6 GB and the other 2 files are small, around 120 MB and 10 KB.
When I try to load the first (9.6 GB) file into the Synapse DWH, I get the error below.
Job failed due to reason: Cluster ran into out of memory issue during execution. Also, Please note that the dataflow has one or more custom partitioning schemes. The transformation(s) using custom partition schemes: Json,Select1,FlattenDocsCS,Flatten2,Filter1,ChangeDataTypesDateColumns,CstomsShipment. 1. Please retry using an integration runtime with bigger core count and/or memory optimized compute type. 2. Please retry using different partitioning schemes and/or number of partitions.
I published the pipeline and triggered it, so that I am not running it in debug mode on a small cluster.
I changed the cluster size to 32 cores and tried every partitioning scheme available in the Optimize tab (my current compute settings are sketched below).
But I am still getting the error.
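For context, the compute settings on the Execute Data Flow activity now look roughly like this (the activity and data flow names are illustrative):

```json
{
    "name": "Execute CleanRavenDbJson",
    "type": "ExecuteDataFlow",
    "typeProperties": {
        "dataflow": {
            "referenceName": "CleanRavenDbJson",
            "type": "DataFlowReference"
        },
        "compute": {
            "coreCount": 32,
            "computeType": "MemoryOptimized"
        }
    }
}
```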
Kindly help.