Data lake table updates are slow

Abhishek Gaikwad 196 Reputation points
2021-11-17T11:23:16.177+00:00

We have a data lake table with billions of records stored in Parquet format. Any SQL query we run against this table is slow.
We are facing an issue when updating certain columns to null values in this table. However, when we copy the table to a new Parquet file and run the same update against the new file, the update is faster. Can you please confirm what could be the reason the updates are faster after copying the data to a new file, while updates against the old file take a long time?

Azure Data Lake Storage

An Azure service that provides an enterprise-wide hyper-scale repository for big data analytic workloads and is integrated with Azure Blob Storage.

