Copy pipeline of parquet files from ADLSGen2 to Kusto failed

Subhiksha Ranganathan 1 Reputation point Microsoft Employee
2022-11-11T22:49:25.58+00:00

I'm trying to copy all the Parquet files in a folder, using the wildcard path <folder/subfolder/*.parquet>, from ADLSGen2 to a Kusto table.
(The Parquet files total about 6 GB.)

The copy job fails with the error below, but on some days it succeeds, seemingly at random.

I have tried changing the time of the copy job to rule out resource contention. That works at times, but it is not consistent.

I can't fully understand the error message, as it doesn't seem straightforward to me. Any ideas on why this is happening?

259692-image.png

259663-screenshot-20221111-023930.png

Azure Synapse Analytics

1 answer

  1. Subhiksha Ranganathan 1 Reputation point Microsoft Employee
    2022-11-17T07:34:31.11+00:00

    Hello @HimanshuSinha-msft,
    Thanks for getting back. I looked into the various ways this pipeline could fail and found a data quality issue: a + sign in a date column, in one record out of millions.
    Fixing that record let the copy pipeline succeed.
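
    A check along these lines can surface such a record before the copy runs. This is only a minimal sketch, assuming the date column has already been read as strings in YYYY-MM-DD form; the function name `find_bad_dates`, the format, and the sample values are hypothetical, and loading the column from the Parquet files is left out.

```python
from datetime import datetime


def find_bad_dates(values, fmt="%Y-%m-%d"):
    """Return (index, value) pairs for entries that do not parse strictly as fmt.

    Assumption: dates are stored as strings in the given format.
    """
    bad = []
    for i, v in enumerate(values):
        if v is None:
            # Nulls are treated as acceptable here; adjust for non-nullable columns.
            continue
        try:
            # strptime rejects stray characters such as a leading or trailing '+'.
            datetime.strptime(v, fmt)
        except (ValueError, TypeError):
            bad.append((i, v))
    return bad


# Hypothetical sample: two values carry a stray '+' sign.
sample = ["2022-11-10", "2022-11-11+", None, "+2022-11-12", "2022-11-13"]
print(find_bad_dates(sample))
```

    Running the same check over each file's date column narrows the failure down to the offending file and row, which the pipeline error alone did not do.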

    The trickiest part of this finding is that the Synapse pipeline error was very misleading: "Timespan overflowed because the duration is too long."
    Also, the error didn't show up while writing the dataset to ADLSGen2. The pipeline threw it and failed only when copying to Kusto.

    The copy pipeline is succeeding now, so I guess we're good. Thanks for the info; I will refer to it if this issue comes up again.

    Thanks,
    Subhi