Column Store Index taking more space than row store

DhanashreePrasun 61 Reputation points
2022-03-23T08:20:10.387+00:00

Hi,

I have a table with rowstore index and about 428M records.
I had read that Column Store index has better performance. So I created a replica of the table and made it as columnstore index table with one column as primary key and started copying records from the original table in batches. The batches were in Millions so definitely more than the minimum required(100K+) for column store index to compress and perform. I just copied about 261M records.

But when I checked the space used for the original and the column store indexed table, the row store index table has index size 357312 KB, but the column store is 9379560 KB. Way too much in storage.

Any idea what could have been done wrong here. Is it worth moving ahead and changing the rowstore to columnstore? Appreciate any insights on this.

Thanks

Azure SQL Database
SQL Server
SQL Server
A family of Microsoft relational database management and analysis systems for e-commerce, line-of-business, and data warehousing solutions.
12,892 questions
{count} votes

Accepted answer
  1. CathyJi-MSFT 21,096 Reputation points Microsoft Vendor
    2022-03-24T07:50:39.857+00:00

    Hi @DDDPPPP-7114,

    >Is it worth moving ahead and changing the rowstore to columnstore?

    Quote from MS document;

    The nonclustered column index contains a copy of part or all of the rows and columns in the underlying table.

    There are some recommendations for best practice usage for each of the index types(row index or column index), hope this could help you.

    •Use columnstore indexes on large tables (with at least a few million records), that are not being updated/deleted frequently
    •Columnstore indexes perform best on static data, such as in OLAP workloads, with a lot of queries that simply reads the data from the tables, or bulk loading new data periodically
    •Columnstore indexes excel in scanning and performing aggregations on big data ranges (doing SUM, AVG, COUNT, etc), because they are able to process around 900 rows in one batch, while traditional B-tree index process one-by-one (up until SQL Server 2019, which added a batch mode for row-based workload)
    •Use B-tree indexes on highly transactional workloads, when your table is being frequently modified (updates, deletes, inserts)
    •B-tree indexes will usually perform better in queries with high selectivity, for example, when you are returning a single value or small number of values, or if you are querying a small range of values (SEEKing for a value)

    Refer to the blog Rows or Columns — where should I put my index on?


    If the answer is helpful, please click "Accept Answer" and kindly upvote it. If you have extra questions about this answer, please click "Comment".

    0 comments No comments

1 additional answer

Sort by: Most helpful
  1. Alberto Morillo 32,991 Reputation points MVP
    2022-03-23T10:51:56.8+00:00

    The table with columnstore consumes more space because you created a non-clustered columnstore index. This means you have a duplicated copy of the data of the table on the columnstore index. So even though you have all that great compression with columnstore, it did not save any space because is a second much smaller copy of your data.

    The Columnstore index includes most or all of the columns in your table. Remember when you created teh columnstore index you did not have the option to specify which columns will be part of the columnstore index.

    Starting SQL Server 2014, we have the ability to make a clustered Columnstore index which became our primary data store and we can reduce then the amount of disk space being taken up by those indexes