Optimize Delta Lake

Ryan Abbey 1,186 Reputation points
2021-08-02T08:41:35.883+00:00

Can a Synapse delta lake (Spark 2.4, delta 0.6) table be optimised in a similar vein to a Databricks delta table?

If not, what optimisation options are available? We've loaded a lot of small files and found the deterioration quite significant (given we're only talking about 80 files, the deterioration is pretty appalling really!)

Azure Synapse Analytics

An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.

1 answer

  1. MartinJaffer-MSFT 26,161 Reputation points
    2021-08-03T08:39:07.073+00:00

    Hello again @Ryan Abbey and welcome back.

    The Databricks optimization documentation describes two main functions: Z-ordering and compaction.

    Compaction is part of the base "Delta" package and not unique to Databricks (see here), so it should also be available in Synapse.

    I am less certain about Z-ordering. It may be something Databricks adds on top of Delta.

    Aside from that, partitioning and file-size control are available on both Synapse and Databricks.

    Ahh, I just found another optimize option: auto-optimize during write. Can you confirm which optimize you are asking about?
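
The compaction mentioned in the answer above can be sketched roughly as follows. This is a minimal sketch, not a tested Synapse recipe: it assumes the open-source Delta reader/writer is available on the Spark pool, that the `dataChange` write option is honoured by the installed Delta version, and that `spark` is the session a notebook already provides. The helper names `target_file_count` and `compact_delta_table`, and the 128 MB target size, are illustrative choices and not part of any library.

```python
def target_file_count(total_bytes, target_file_bytes=128 * 1024 * 1024):
    """Estimate how many files to write so each is roughly target_file_bytes.

    Hypothetical helper: the 128 MB default is an assumption, not a Delta setting.
    """
    if target_file_bytes <= 0:
        raise ValueError("target_file_bytes must be positive")
    # Integer ceiling division, with a floor of one output file.
    return max(1, -(-total_bytes // target_file_bytes))


def compact_delta_table(spark, path, num_files):
    """Rewrite the Delta table at `path` into `num_files` larger files.

    `spark` is an existing SparkSession; `path` is the table location.
    """
    (spark.read
        .format("delta")
        .load(path)
        .repartition(num_files)          # shuffle rows into fewer, larger files
        .write
        .option("dataChange", "false")   # mark the commit as a layout-only rewrite
        .format("delta")
        .mode("overwrite")
        .save(path))
```

In a notebook this might be called as `compact_delta_table(spark, table_path, target_file_count(table_size_bytes))`, where `table_path` and `table_size_bytes` are placeholders for your own values. The small files are only removed from the table's current version; they stay on storage until the table is vacuumed.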
