Dataflow in ADF is not able to read parquet files exported from ADX

Nahuel Diaz Lederhos 0 Reputation points

I have defined a continuous export to Parquet files in ADX and then use these files to perform some operations in a data flow in ADF. The data flow ran without errors for more than six months, but for the past week it has been unable to read some files, giving the following error:

java.lang.UnsupportedOperationException: Unsupported encoding: DELTA_BYTE_ARRAY

The properties of the continuous export are as follows:

  "WriteNativeParquetV2": true,
  "ParquetDatetimePrecision": 1,
  "isDisabled": false,
  "ReportDeltaLogResults": false

How should I handle this situation?

Azure Data Explorer
Azure Data Factory

1 answer

  1. Amira Bedhiafi 8,716 Reputation points

    As the error message indicates, the files contain columns encoded with DELTA_BYTE_ARRAY, a Parquet V2 encoding that ADX emits for string/binary columns when WriteNativeParquetV2 is enabled.

    The vectorized Parquet reader used by the data flow's Spark runtime does not support this encoding, so it throws the UnsupportedOperationException as soon as it decodes an affected column.

    You have two options: disable the vectorized Parquet reader where you control the Spark configuration, or recreate the continuous export with WriteNativeParquetV2 set to false so that ADX writes Parquet V1 files, which only use encodings every reader supports.
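    To confirm which exported files are affected before changing anything, you can inspect the Parquet metadata directly. A minimal sketch, assuming pyarrow is installed (the file path and function name are illustrative, not part of any ADX or ADF API):

    ```python
    # Sketch: report which columns in an exported Parquet file use the
    # DELTA_BYTE_ARRAY encoding that the vectorized reader cannot decode.
    import pyarrow.parquet as pq

    def delta_encoded_columns(path):
        """Return the set of column names that use DELTA_BYTE_ARRAY
        in any row group of the file at `path`."""
        meta = pq.ParquetFile(path).metadata
        hits = set()
        for rg in range(meta.num_row_groups):
            for col in range(meta.num_columns):
                chunk = meta.row_group(rg).column(col)
                if "DELTA_BYTE_ARRAY" in chunk.encodings:
                    hits.add(chunk.path_in_schema)
        return hits
    ```

    Files for which this returns a non-empty set are the ones the data flow will fail on; files where it returns an empty set should still read fine.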

    1 person found this answer helpful.