Query timestamp_micros value of parquet file using Azure Synapse

srigowri 26 Reputation points
2022-10-04T05:50:51.96+00:00
  • I have a Parquet file, uploaded to an Azure Synapse workspace, in which one of the timestamp fields has the following Parquet schema:
    {
    file_name: 'changesschema.parquet',
    name: 'ctime',
    type: 'INT64',
    type_length: '0',
    repetition_type: 'OPTIONAL',
    num_children: 0,
    converted_type: 'TIMESTAMP_MICROS',
    scale: 0,
    precision: 0,
    field_id: 0,
    logical_type: 'TimestampType(isAdjustedToUTC=0, unit=TimeUnit(MILLIS=<null>, MICROS=MicroSeconds(), NANOS=<null>))'
    }
  • I have defined a view in Synapse as follows:
    CREATE OR ALTER VIEW snapshots
    AS SELECT *
    FROM
    OPENROWSET(
        BULK '/changes/workingenvironmentid=/volumeid=/snapshotid=*/**',
        DATA_SOURCE = 'CATALOG_DATA_SOURCE',
        FORMAT = 'PARQUET'
    )
    WITH (
        ctime datetime2
    )
    AS results
  • However, when I query the data, I get the following error (a conversion sketch follows after this list):
    Column 'ctime' of type 'DATETIME2' is not compatible with external data type 'Parquet physical type: INT64', please try with 'BIGINT'. File/External table name:
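  • One possible workaround, following the error's own hint, would be to read the column as BIGINT and convert it to datetime2 inside the view. Below is a minimal sketch (an assumption, not verified against this dataset: it presumes ctime stores microseconds since the Unix epoch, per the TIMESTAMP_MICROS annotation):
    -- Sketch: read ctime as BIGINT and convert microseconds-since-epoch to datetime2.
    -- The CASTs to int are required by DATEADD; the seconds value fits in an int until 2038.
    CREATE OR ALTER VIEW snapshots
    AS SELECT
        DATEADD(MICROSECOND, CAST(ctime % 1000000 AS int),
            DATEADD(SECOND, CAST(ctime / 1000000 AS int),
                CAST('1970-01-01' AS datetime2))) AS ctime
    FROM
    OPENROWSET(
        BULK '/changes/workingenvironmentid=/volumeid=/snapshotid=*/**',
        DATA_SOURCE = 'CATALOG_DATA_SOURCE',
        FORMAT = 'PARQUET'
    )
    WITH (
        ctime BIGINT
    )
    AS results;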

1 answer

  1. KranthiPakala-MSFT 46,422 Reputation points Microsoft Employee
    2022-10-05T20:11:53.92+00:00

    Hello @srigowri ,

    Thanks for the question and using MS Q&A platform.

    Could you please confirm if you are using Dedicated SQL or Serverless?
    I downloaded the file you provided and tried it with Serverless; it seems to work without any issue, and I don't see any errors.

    [screenshot attached: 247820-image.png]

    In a dedicated SQL pool, however, it seems this datatype mapping is not supported. As a workaround, would it be possible for you to use the COPY INTO command in a stored procedure? Using COPY INTO seems to overcome the error message (see the sketch below).
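    A minimal sketch of that workaround (illustrative only: the target table name, storage URL, and credential are assumptions, not taken from your setup):

    -- Sketch: load the Parquet data into a staging table in the dedicated pool.
    -- dbo.snapshots_staging is a hypothetical table with a ctime datetime2 column;
    -- replace the URL and credential with your own.
    COPY INTO dbo.snapshots_staging
    FROM 'https://<storageaccount>.dfs.core.windows.net/<container>/changes/'
    WITH (
        FILE_TYPE = 'PARQUET',
        CREDENTIAL = (IDENTITY = 'Managed Identity')
    );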

    Hope this info helps. Please let us know how it goes.

    Thank you