Getting an out-of-memory exception in ADF when copying a table from Azure SQL to a Parquet file.

Inma 66 Reputation points
2022-09-07T14:24:25.087+00:00

I am getting the error below from an ADF pipeline when I copy a table containing a large varbinary column from Azure SQL to Parquet.
{
"errorCode": "2200",
"message": "ErrorCode=ParquetJavaInvocationException,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=An error occurred when invoking java, message: java.lang.OutOfMemoryError:Unable to retrieve Java exception.\ntotal entry:19\r\nsun.misc.Unsafe.allocateMemory(Native Method)\r\njava.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:127)\r\njava.nio.ByteBuffer.allocateDirect(ByteBuffer.java:311)\r\norg.apache.parquet.hadoop.codec.SnappyCompressor.setInput(SnappyCompressor.java:99)\r\norg.apache.parquet.hadoop.codec.NonBlockedCompressorStream.write(NonBlockedCompressorStream.java:48)\r\norg.apache.parquet.bytes.CapacityByteArrayOutputStream.writeToOutput(CapacityByteArrayOutputStream.java:219)\r\norg.apache.parquet.bytes.CapacityByteArrayOutputStream.writeTo(CapacityByteArrayOutputStream.java:239)\r\norg.apache.parquet.bytes.BytesInput$CapacityBAOSBytesInput.writeAllTo(BytesInput.java:392)\r\norg.apache.parquet.bytes.BytesInput$SequenceBytesIn.writeAllTo(BytesInput.java:283)\r\norg.apache.parquet.hadoop.CodecFactory$HeapBytesCompressor.compress(CodecFactory.java:165)\r\norg.apache.parquet.hadoop.ColumnChunkPageWriteStore$ColumnChunkPageWriter.writePage(ColumnChunkPageWriteStore.java:98)\r\norg.apache.parquet.column.impl.ColumnWriterV1.writePage(ColumnWriterV1.java:148)\r\norg.apache.parquet.column.impl.ColumnWriterV1.flush(ColumnWriterV1.java:236)\r\norg.apache.parquet.column.impl.ColumnWriteStoreV1.flush(ColumnWriteStoreV1.java:122)\r\norg.apache.parquet.hadoop.InternalParquetRecordWriter.flushRowGroupToStore(InternalParquetRecordWriter.java:169)\r\norg.apache.parquet.hadoop.InternalParquetRecordWriter.checkBlockSizeReached(InternalParquetRecordWriter.java:143)\r\norg.apache.parquet.hadoop.InternalParquetRecordWriter.write(InternalParquetRecordWriter.java:125)\r\norg.apache.parquet.hadoop.ParquetWriter.write(ParquetWriter.java:292)\r\ncom.microsoft.datatransfer.bridge.parquet.ParquetBatchWriter.addRows(ParquetBatchWriter.java:61)\r\n.,Source=Microsoft.DataTransfer.Richfile.ParquetTransferPlugin,''Type=Microsoft.DataTransfer.Richfile.JniExt.JavaBridgeException,Message=,Source=Microsoft.DataTransfer.Richfile.HiveOrcBridge,'",
"failureType": "UserError",
"target": "ExportToParquet",
"details": []
}


Accepted answer
  1. AnnuKumari-MSFT 31,151 Reputation points Microsoft Employee
    2022-09-08T10:14:10.427+00:00

    Hi @Inma ,
    Thank you for using the Microsoft Q&A platform and for posting your question here.
    As I understand your issue, you are trying to copy data from an Azure SQL table to a .parquet file in ADLS. Since the table content is huge, the copy is throwing the above error. Please let me know if my understanding is incorrect.

    You can consider loading the data into multiple partitioned files instead of one single file.

    To do that, you need to take care of the points below:

    1. Leave the file name field of the sink dataset blank instead of hardcoding it.
    2. In the copy activity sink settings, provide an integer value (for example, 10000) in the 'Max rows per file' option to write 10,000 rows of data per partitioned file. A JSON sketch of these settings follows the screenshots below.

    [Screenshots: sink dataset with the file name left blank, and the copy activity sink settings with 'Max rows per file' set]
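    For reference, here is a minimal sketch of how those two settings might look in the copy activity and sink dataset JSON. This is an assumption based on the general ADF copy activity schema rather than something taken from the screenshots: the dataset, linked service, file system, and folder names (AzureSqlSourceTable, ParquetSinkDataset, AdlsGen2LinkedService, "data", "export/parquet") are placeholders, and the 'Max rows per file' UI option corresponds, as far as I am aware, to the maxRowsPerFile property under ParquetWriteSettings.

    Copy activity sink (names are placeholders; maxRowsPerFile splits the output into multiple files):

    {
        "name": "ExportToParquet",
        "type": "Copy",
        "inputs": [ { "referenceName": "AzureSqlSourceTable", "type": "DatasetReference" } ],
        "outputs": [ { "referenceName": "ParquetSinkDataset", "type": "DatasetReference" } ],
        "typeProperties": {
            "source": { "type": "AzureSqlSource" },
            "sink": {
                "type": "ParquetSink",
                "storeSettings": { "type": "AzureBlobFSWriteSettings" },
                "formatSettings": {
                    "type": "ParquetWriteSettings",
                    "maxRowsPerFile": 10000,
                    "fileNamePrefix": "export_part"
                }
            }
        }
    }

    Sink dataset (note that "fileName" is deliberately omitted from "location", matching point 1 above):

    {
        "name": "ParquetSinkDataset",
        "properties": {
            "type": "Parquet",
            "linkedServiceName": { "referenceName": "AdlsGen2LinkedService", "type": "LinkedServiceReference" },
            "typeProperties": {
                "location": {
                    "type": "AzureBlobFSLocation",
                    "fileSystem": "data",
                    "folderPath": "export/parquet"
                },
                "compressionCodec": "snappy"
            }
        }
    }

    With this sketch, each output file holds at most 10,000 rows, so no single Parquet row group has to buffer the entire table in memory.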

    Hope this helps. Please let us know if you have any further queries.

    1 person found this answer helpful.

0 additional answers
