Parquet to SQL External Table HdfsBridge::recordReaderFillBuffer - Unexpected error encountered filling record reader buffer: ClassCastException:

Swathi Garudasu 1 Reputation point
2022-04-05T05:58:32.52+00:00

The CREATE EXTERNAL TABLE script and the Parquet file schema are listed below. I am unable to query this table after creating it. The query fails with the error: HdfsBridge::recordReaderFillBuffer - Unexpected error encountered filling record reader buffer: ClassCastException:

I have tried to follow the data type guidance here - https://learn.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/design-elt-data-loading#define-the-tables
I also tried a couple of things, none of which worked:

  • Used Float Datatype for Qty Columns
  • Used DateTime DataType for Date Columns
  • Used VARCHAR Datatype for String Columns

CREATE EXTERNAL TABLE ext.FactClear_Current
( SnapshotDate DATE ,
DateID DATE,
ProductID NVARCHAR (100),
ProgramConfigurationID NVARCHAR(100),
ClearQty REAL,
CummulativeClearQty REAL,
ETLInsertDtTm NVARCHAR(100),
ETLUpDATEDtTm NVARCHAR(100),
ExtractDateTime NVARCHAR(100),
DeltaActionCode NVARCHAR(100)
)
WITH
(
LOCATION = '/source/IDataFact/FactClearToBuild',
DATA_SOURCE = [DataLakeStore],
FILE_FORMAT = [FileFormat_PARQUET]
);

Parquet File Schema
|-- Snapshotdate: date (nullable = true)
|-- DateID: date (nullable = true)
|-- MaterialID: string (nullable = true)
|-- ProgramConfigurationId: integer (nullable = true)
|-- ClearQty: float (nullable = true)
|-- CumulativeClearQty: float (nullable = true)
|-- etlinsertdttm: string (nullable = true)
|-- etlupdatedttm: string (nullable = true)
|-- extractdatetime: string (nullable = true)
|-- deltaactioncode: string (nullable = true)

Please help

Azure Synapse Analytics

1 answer

  1. ShaikMaheer-MSFT 37,566 Reputation points Microsoft Employee
    2022-04-06T13:35:18.757+00:00

    Hi @Swathi Garudasu,

    Thank you for posting your query on the Microsoft Q&A platform.

    As I understand it, you created an external table on top of a Parquet file, but querying that table fails with the error above. Please correct me if I am wrong.

    This kind of error usually occurs when the Parquet file's data types are not mapped correctly to the SQL data types of the external table. As you mentioned, the link below gives a clear picture of these data type mappings.
    https://learn.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/design-elt-data-loading#define-the-tables

    Could you please try the steps below to make sure the data types match, and see if that helps?

    Create an integration dataset for your Parquet file, open its Schema tab, and note the Parquet data types shown there. Then look up the corresponding SQL data types in the documentation above and define the external table columns accordingly.
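
    For illustration, comparing the two listings in the question, one visible mismatch is ProgramConfigurationId: it is integer in the Parquet schema but NVARCHAR(100) in the external table. Below is a minimal sketch of a corrected definition. It assumes that this column is what triggers the ClassCastException (the full error message has not been shared, so this is not confirmed) and that the Parquet integer is a 32-bit value that maps to INT. All other columns are kept exactly as in the question.

    ```sql
    -- An external table's columns cannot be altered in place, so drop and recreate it.
    DROP EXTERNAL TABLE ext.FactClear_Current;

    CREATE EXTERNAL TABLE ext.FactClear_Current
    (
        SnapshotDate            DATE,
        DateID                  DATE,
        ProductID               NVARCHAR(100),  -- Parquet: MaterialID (string)
        ProgramConfigurationID  INT,            -- Parquet: integer; was NVARCHAR(100) (assumed cause)
        ClearQty                REAL,
        CummulativeClearQty     REAL,
        ETLInsertDtTm           NVARCHAR(100),
        ETLUpDATEDtTm           NVARCHAR(100),
        ExtractDateTime         NVARCHAR(100),
        DeltaActionCode         NVARCHAR(100)
    )
    WITH
    (
        LOCATION = '/source/IDataFact/FactClearToBuild',
        DATA_SOURCE = [DataLakeStore],
        FILE_FORMAT = [FileFormat_PARQUET]
    );

    -- Read a few rows to check whether the ClassCastException is gone.
    SELECT TOP 10 * FROM ext.FactClear_Current;
    ```

    If the error persists after this change, the remaining columns would be the next ones to compare against the dataset's Schema tab.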

    Please also check the Q&A thread below, which discusses a similar issue. There, too, the Parquet data types were taken from the dataset's Schema tab and then checked against the documentation.
    https://learn.microsoft.com/en-us/answers/questions/240046/synapse-error-hdfsbridgerecordreaderfillbuffer-ext.html

    Please let us know how it goes.

    By the way, you have not shared the full error message. Please share it if possible. Thank you.
