Rebuilding Clustered ColumnStore Index created segments with rows less than 102_400

pantonis 30 Reputation points
2023-11-06T10:52:55.26+00:00

I imported data in an SQL Server 2022 database into an Ordered (Timestamp column) Clustered ColumnStoreIndex table in batches of 1,048,576 items. After the import was completed I ran the following query

SELECT
     tables.name AS table_name,
     indexes.name AS index_name,
     columns.name AS column_name,
     partitions.partition_number,
     column_store_segments.segment_id,
     column_store_segments.min_data_id,
     column_store_segments.max_data_id,
     column_store_segments.row_count
FROM sys.column_store_segments
INNER JOIN sys.partitions
ON column_store_segments.hobt_id = partitions.hobt_id
INNER JOIN sys.indexes
ON indexes.index_id = partitions.index_id
AND indexes.object_id = partitions.object_id
INNER JOIN sys.tables
ON tables.object_id = indexes.object_id
INNER JOIN sys.columns
ON tables.object_id = columns.object_id
AND column_store_segments.column_id = columns.column_id
WHERE tables.name = 'MyTable'
AND columns.name = 'Timestamp'
ORDER BY tables.name, columns.name, col
umn_store_segments.segment_id;

and I saw that there was not good segment elimination. So I ran

ALTER INDEX CCI_MyTable ON dbo.MyTable REBUILD;

after it was finished I ran the first query again and I noticed even worse results. Before index rebuild I had 19 segments each one having 1,048,576 now I have 30 segments with numbers ranging from 40,000 rows up to 1,048,576. Segment elimination also is much much worse now.

What am I doing wrong?

SQL Server
SQL Server
A family of Microsoft relational database management and analysis systems for e-commerce, line-of-business, and data warehousing solutions.
14,148 questions
0 comments No comments
{count} votes

2 answers

Sort by: Most helpful
  1. Bjoern Peters 8,886 Reputation points
    2023-11-06T18:13:18.86+00:00

    Hello Pantonis,

    Welcome to the Q&A Forum; this is a great place to get support, answers, and tips.

    Thank you for posting your question; I'll be more than glad to help you out.

    You didn't write anything on the details of your table structure, how you load the data and the process; my answer might be a guess...

    I find this article helpful in understanding the loading mechanism and how a column store index is filtered/ordered:

    https://www.red-gate.com/simple-talk/databases/sql-server/t-sql-programming-sql-server/hands-on-with-columnstore-indexes-part-2-best-practices-and-guidelines/

    Maybe it is helpful for you, too.

    Your

    Bjoern Peters

    If the reply was helpful, please upvote and/or accept it as an answer, as this helps others in the community with similar questions. Thanks!


  2. LiHongMSFT-4306 29,191 Reputation points
    2023-11-08T02:05:05.9566667+00:00

    Hi @pantonis

    Generally, based on the percent of fragmentation (avg_fragmentation_in_percent column from sys.dm_db_index_physical_stats)

    • Fragmentation is less than 10% – no de-fragmentation is required. It is generally accepted that in majority of environments index fragmentation less than 10% in negligible and its performance impact on the SQL Server is minimal.

    • Fragmentation is between 10-30% – it is suggested to perform index reorganization

    • Fragmentation is higher than 30% – it is suggested to perform index rebuild

    Best regards,

    Cosmog Hong


    If the answer is the right solution, please click "Accept Answer" and kindly upvote it. If you have extra questions about this answer, please click "Comment".

    Note: Please follow the steps in our Documentation to enable e-mail notifications if you want to receive the related email notification for this thread.

    0 comments No comments

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.