Share via

Excel to Parquet - Dates are null for few columns

Santhi Dhanuskodi 66 Reputation points
2022-10-11T05:11:23.563+00:00

Hi,

I am getting null values for date columns when I am converting from excel to parquet.
My source excel contains proper dates. I dont see any empty or null values in date columns.

And surprisingly I see these null values only for fewer records.

For example, if i have 10 records with date values '2021-01-01'.... I get '2021-01-01' for 5 records.
And I get null values for another 5 records.

I am using toDate(Date, 'yyyy-mm-dd')... even then this problem occurs. Please help me on this

Azure Synapse Analytics
Azure Synapse Analytics

An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.

Azure Data Factory
Azure Data Factory

An Azure service for ingesting, preparing, and transforming data at scale.


2 answers

Sort by: Most helpful
  1. Santhi Dhanuskodi 66 Reputation points
    2022-10-18T04:55:11.363+00:00

    We found out that some other columns which contain numbers, decimals were not in correct format.
    We had to edit source excel files , those numeric columns formats. Then it worked out.

    But it is surprising how Azure behaves on these kind of issues.
    It is tiresome investigation to root cause such data format issues.


  2. KranthiPakala-MSFT 46,822 Reputation points Microsoft Employee Moderator
    2022-10-11T15:28:38.377+00:00

    The only situation which results in such scenario is when your source records had inconsistent date formats.

    1. For example: If your source has inconsistent date format column (few records of format dd-MM-yyyy, MM-dd-yyyy, yyyy-MM-dd ), and if you use the toDate function as toDate(Date, 'yyyy-MM-dd'), then the function expects your source to pass the value of that column to be of format yyyy-MM-dd and if any of your source records contain a different format like dd-MM-yyyy or MM-dd-yyyy then those records will be treated as null. 249300-image.png

    I would suggest looking at the sink records that have null for date column and verify those records with your source to make sure that the date format is consistent as per the format defined in the toDate() function

    Hope this info helps. Please do keep us posted on your findings and if this doesn't help then please share a sample file with some good and bad (null value) records.


Your answer

Answers can be marked as 'Accepted' by the question author and 'Recommended' by moderators, which helps users know the answer solved the author's problem.