Share via

Error while loading data in SQL Pool Table in Azure Synapse Analytics

sgarg 0 Reputation points
2024-10-02T16:02:27.1866667+00:00

While following the step "Load the NYC Taxi Data into SQLPOOL1" from tutorial at https://learn.microsoft.com/en-us/azure/synapse-analytics/get-started-analyze-sql-pool

Got the following error:

110802;An internal DMS error occurred that caused this operation to fail. Details: Exception: Microsoft.SqlServer.DataWarehouse.DataMovement.Common.ExternalAccess.HdfsAccessException, Message: Error occurred while accessing HDFS external file[/NYCTripSmall.parquet][0]: Java exception raised on call to HdfsBridge_CreateRecordReader_V2. Java exception message: HdfsBridge::CreateRecordReader - Unexpected error encountered creating the record reader: TProtocolException: Required field 'codec' was not present! Struct: ColumnMetaData(type:INT32, encodings:[PLAIN, RLE, RLE_DICTIONARY], path_in_schema:[VendorID], codec:null, num_values:51837, total_uncompressed_size:7488, total_compressed_size:5087, data_page_offset:35, dictionary_page_offset:4)

The table gets created but there is no data in it and I cannot proceed with the Synapse analytics tutorial until this error is resolved. Please revert back asap. Thanks

Azure Synapse Analytics
Azure Synapse Analytics

An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.

0 comments No comments

2 answers

Sort by: Most helpful
  1. AnnuKumari-MSFT 34,571 Reputation points Microsoft Employee Moderator
    2024-10-08T06:49:31.7366667+00:00

    Hi @sgarg ,

    Welcome to Microsoft Q&A platform and thanks for posting your query here.

    You might need to add the following command to the COPY sentence:

    AUTO_CREATE_TABLE='ON'

    COPY INTO dbo.NYCTaxiTripSmall (VendorID 1, store_and_fwd_flag 4, RatecodeID 5, PULocationID 6 , DOLocationID 7, passenger_count 8,trip_distance 9, fare_amount 10, extra 11, mta_tax 12, tip_amount 13, tolls_amount 14, ehail_fee 15, improvement_surcharge 16, total_amount 17, payment_type 18, trip_type 19, congestion_surcharge 20 ) 
    FROM '
    WITH ( FILE_TYPE = 'PARQUET' ,MAXERRORS = 0 ,IDENTITY_INSERT = 'OFF' ,AUTO_CREATE_TABLE ='ON' )
    

    Kindly try it and let us know if it worked. Hope it helps. Please accept the answer if it was helpful. Thankyou

    Was this answer helpful?

    0 comments No comments

  2. Amira Bedhiafi 42,941 Reputation points MVP Volunteer Moderator
    2024-10-02T19:56:37.9766667+00:00

    The specific message you’re seeing indicates that the codec field is missing or improperly defined in the Parquet file metadata, which is likely causing a failure during the file read operation.

    Possible Solutions:

    1. Check the Parquet File:
      • Ensure that the Parquet file (NYCTripSmall.parquet) is correctly formatted and not corrupted.
      • The error mentions the absence of the codec, which could mean that the Parquet file was either generated incorrectly or contains unsupported encoding types.
    2. Re-upload the File:
      • Re-upload the NYCTripSmall.parquet file into your Azure Data Lake or Blob Storage and retry the load operation. Sometimes, network errors during file upload can cause issues with file metadata.
    3. Test with Another Parquet File:
      • Try loading another Parquet file (if available) to ensure that the issue is not specific to this file.
    4. Verify Compatibility:
      • Ensure that the version of the tools you're using to generate or upload the Parquet file is compatible with Azure Synapse. Some older or non-standard Parquet file formats may not be fully supported.
    5. Check Data Movement Service (DMS) Configurations:
      • The error also references an internal DMS issue. If you are using Azure Data Lake Storage (ADLS), ensure that your storage account and permissions are set correctly for the SQL pool to access the file.
      • You may want to try accessing the file directly from ADLS or Blob Storage to verify connectivity.

    Let me know if you'd like assistance with any of these steps or further debugging!

    Was this answer helpful?

    0 comments No comments

Your answer

Answers can be marked as 'Accepted' by the question author and 'Recommended' by moderators, which helps users know the answer solved the author's problem.