How many partitions is needed with 90 billions table row count to benefit the CCI?

Anthony Marasigan 21 Reputation points
2023-01-30T15:30:29.5666667+00:00

Hi,

Say for example I have a large fact hash-distributed (CCI) table partitioned monthly with 90 billions table row count (current):

  1. How many partition is needed to create to benefit from a clustered columnstore index?
  2. Based from question 1, is it ok/normal to have that partition count even if succeeding partitions doesn't have data yet?

I know that this is already on the MS site but advise/inputs from experts with real-world experience will make it more clear.

Thanks in advance!

Azure SQL Database
Azure Synapse Analytics
Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.
4,362 questions
0 comments No comments
{count} votes

1 answer

Sort by: Most helpful
  1. Oury Ba-MSFT 16,081 Reputation points Microsoft Employee
    2023-01-31T16:56:18.5666667+00:00

    Hi Anthony Marasigan Thank you for reaching out.

    If you are using dedicated SQL pool before partitioning, dedicated SQL pool by default already divides your data into 60 databases. Partitioning further divides your data. If you partition your data, then consider that each partition needs at least 1 million rows to benefit from a clustered columnstore index. If you partition your table into 100 partitions, then your table needs at least 6 billion rows to benefit from a clustered columnstore index (60 distributions 100 partitions 1 million rows). If your 100-partition table does not have 6 billion rows, either reduce the number of partitions or consider using a heap table instead.

    Please let me know if you need additional information.

    Indexes on dedicated SQL pool tables in Azure Synapse Analytics

    Regards,

    Oury

    1 person found this answer helpful.