The Ticks value for the datetime column must be between valid datetime ticks range - 0000-12-30 00:00:00

Mazhar Iqbal 30 Reputation points
2024-09-09T16:59:28.1866667+00:00

Question asked elsewhere too.

We use the OData web services to ingest data from Dynamics Business Central with Azure Data Factory.

Data is first ingested into Parquet files in ADLS Gen2 and then copied from there to Azure SQL Database tables.

Since the weekend, the copy activity from ADLS Gen2 to Azure SQL Database has been failing with the following error. It affects multiple Business Central table objects, not just one.

"ErrorCode=ParquetDateTimeExceedLimit,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=The Ticks value '-621357696000000000' for the datetime column must be between valid datetime ticks range -621355968000000000 and 2534022144000000000.,Source=Microsoft.DataTransfer.Richfile.ParquetTransferPlugin,'"

-621357696000000000 translates to "0000-12-30 00:00:00"
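
The arithmetic behind that translation can be checked with a short Python sketch. It assumes that the ticks in the error message are 100-nanosecond units counted from 1970-01-01T00:00:00 UTC, which is what the two quoted range bounds suggest. The constant and function names are illustrative only and are not part of ADF.

```python
from datetime import datetime, timedelta

TICKS_PER_SECOND = 10_000_000      # one tick = 100 ns
SECONDS_PER_DAY = 86_400
MIN_TICKS = -621355968000000000    # lower bound quoted in the error message
MAX_TICKS = 2534022144000000000    # upper bound quoted in the error message
FAILING_TICKS = -621357696000000000
EPOCH = datetime(1970, 1, 1)       # assumed tick origin (UTC, kept naive here)


def ticks_to_datetime(ticks: int) -> datetime:
    """Convert epoch-relative ticks to a datetime (whole microseconds only)."""
    return EPOCH + timedelta(microseconds=ticks // 10)


def days_below_minimum(ticks: int) -> float:
    """How many days a tick value lies before the lower bound."""
    return (MIN_TICKS - ticks) / TICKS_PER_SECOND / SECONDS_PER_DAY


print(ticks_to_datetime(MIN_TICKS))       # lower bound as a date
print(ticks_to_datetime(MAX_TICKS))       # upper bound as a date
print(days_below_minimum(FAILING_TICKS))  # distance of the failing value
```

Under that assumption the bounds correspond to 0001-01-01 and 9999-12-31, and the failing value sits exactly two days before the lower bound, which is 0000-12-30 in the proleptic Gregorian calendar. Python's own `datetime` has the same minimum, so it cannot represent the failing value either.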

Looking at the Parquet files, we can see that we do indeed have some duff datetime data.

Using Preview from the Dataset, the duff data is shown as (0001-01-01T00:00:00).

However, when viewing the data from a data flow, the duff data is shown as 0000-12-30 00:00:00, which is the tick value causing the error.

The source is Business Central data that is neither visible to nor editable by end users.

I've tried some dynamic TabularTranslator mapping and a computed column in the target table to try and mitigate this but I still get the error.

ADF TabularTranslator

    {
        "source": {
            "name": "LastModifiedDateTime"
        },
        "sink": {
            "name": "str_LastModifiedDateTime"
        }
    },

SQL table

[str_LastModifiedDateTime] [varchar](50) NULL,  
[LastModifiedDateTime]  AS 
(TRY_CAST([str_LastModifiedDateTime] AS [datetime2])),

I'm now at a loss.
How can I build a different way to either ingest the data or read it from Parquet without these Java Gregorian/Julian calendar shenanigans getting in the way?


                "fileSystem": {
                    "value": "@dataset().containername",
                    "type": "Expression"
                }
            },
            "compressionCodec": "snappy"
        },
        "schema": []
    },
    "type": "Microsoft.DataFactory/factories/datasets"
}
Azure Data Factory

An Azure service for ingesting, preparing, and transforming data at scale.


Answer accepted by question author

  1. Chris Smith-Kirk 80 Reputation points
    2024-09-19T05:42:29.93+00:00

    We had a similar problem and it started last weekend.

    Microsoft advised:

    Add the following JSON property to our Parquet integration dataset: "useParquetV2": true

    e.g.

                "compressionCodec": "snappy",
                "useParquetV2": true
            },
    
    

    Root cause (as explained by Microsoft):

    Our recent upgrade of the underlying Parquet library to address CVEs seems to be the culprit. Parquet V1 and Spark use the same Parquet-MR Java library for reading and writing Parquet files.

    Spark 3.0 introduced a breaking change in which the default system calendar was switched to the Proleptic Gregorian calendar, but INT96 timestamps require the Julian calendar. For timestamp values before 1900-01-01T00:00:00Z, which are ambiguous between the two calendars, Parquet-MR therefore automatically shifts the value to rebase it from the Proleptic Gregorian calendar to the Julian calendar, accounting for the difference between them. This affects ADF's Parquet V1 as well.
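
The size of the calendar shift described above can be illustrated with Julian Day Numbers. The sketch below is not Parquet-MR's or Spark's rebase code. It only uses the standard integer formulas for converting a calendar date to a day count, and the function names are hypothetical.

```python
def jdn_from_gregorian(year: int, month: int, day: int) -> int:
    """Julian Day Number of a proleptic Gregorian calendar date."""
    a = (14 - month) // 12        # 1 for January/February, else 0
    y = year + 4800 - a
    m = month + 12 * a - 3        # March-based month index, 0..11
    return (day + (153 * m + 2) // 5 + 365 * y
            + y // 4 - y // 100 + y // 400 - 32045)


def jdn_from_julian(year: int, month: int, day: int) -> int:
    """Julian Day Number of a Julian calendar date."""
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    return day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083


# Same date label, two calendars: the day counts differ by two days.
gap = jdn_from_gregorian(1, 1, 1) - jdn_from_julian(1, 1, 1)
print(gap)

# Julian 0001-01-01 and proleptic Gregorian 0000-12-30 are the same day.
print(jdn_from_julian(1, 1, 1) == jdn_from_gregorian(0, 12, 30))
```

For the label 0001-01-01 the two calendars are two days apart, and Julian 0001-01-01 falls on proleptic Gregorian 0000-12-30. That is consistent with the out-of-range value reported in the question, although the exact code path inside the library is not shown here.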

2 additional answers

  1. Smaran Thoomu 35,125 Reputation points Microsoft External Staff Moderator
    2024-09-11T15:44:17.1366667+00:00

    Hi @Mazhar Iqbal
    I'm glad that you were able to resolve your issue, and thank you for posting your solution so that others experiencing the same thing can easily reference it. Since the Microsoft Q&A community has a policy that "The question author cannot accept their own answer. They can only accept answers by others," I'll repost your solution in case you'd like to accept the answer.

    Solution: The solution I ended up with was adding another Copy Data activity immediately before the one that reads data from the Parquet file into the SQL table. The new activity copies the data from the Parquet file into a new CSV file. I then amended the existing Copy Data activity to read from the CSV instead of the Parquet file.

    Strangely enough, the duff 0000-12-30 00:00:00 transformed itself to the valid 0001-01-01 00:00:00 timestamp in the CSV without any intervention from me.

  2. Mazhar Iqbal 30 Reputation points
    2024-09-11T15:11:47.29+00:00

    The solution I ended up with was adding another Copy Data activity immediately before the one that reads data from the Parquet file into the SQL table. The new activity copies the data from the Parquet file into a new CSV file. I then amended the existing Copy Data activity to read from the CSV instead of the Parquet file.

    Strangely enough, the duff 0000-12-30 00:00:00 transformed itself to the valid 0001-01-01 00:00:00 timestamp in the CSV without any intervention from me.
