Partitioning vs index

Wild Bill 1 Reputation point
2021-07-02T15:01:07.363+00:00

I'm building a table to store files from a trading partner. The table will stage the data, which is then transformed and moved to a separate database. My thought is to use a bit column 'IsActive' to designate which rows are currently in the queue and which have already been processed. I expect this table to grow by ~5000 rows per day. I'm not sure an index would be the best choice because, to my understanding, indexes work best on columns with many distinct values, and the 'IsActive' column will only have 2 values. My next thought was to use horizontal partitioning. I've never used it before, so I thought I would see if you all had any recommendations.

SQL Server

2 answers

  1. Erland Sommarskog 120.2K Reputation points MVP
    2021-07-02T21:33:44.373+00:00

    I would use a filtered index:

    CREATE INDEX ActiveIx ON tbl(IsActive) WHERE IsActive = 1
    

    to keep down the size of the index. Of course, this index can be extended or changed to be on other columns. One thing to keep in mind is that if you use other key columns, you still need to have IsActive as an included column:

    CREATE INDEX ActiveIx ON tbl(SomeOtherKey) INCLUDE (IsActive) WHERE IsActive = 1
    

    It may seem redundant, and indeed it is, but the optimizer is not able to figure this out on its own.
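
    As a rough sketch of how this could look for the staging table in the question: the table name dbo.StagedFile, its columns, and the Id value 42 below are all hypothetical and only there to make the example complete. The queue query repeats the index filter as a literal predicate; a predicate on a variable or parameter generally will not match a filtered index unless the statement is recompiled.

    ```sql
    -- Hypothetical staging table; every name here is an assumption.
    CREATE TABLE dbo.StagedFile (
        StagedFileId int IDENTITY(1,1) NOT NULL PRIMARY KEY,
        ReceivedAt   datetime2(3) NOT NULL DEFAULT SYSUTCDATETIME(),
        IsActive     bit NOT NULL DEFAULT 1,
        Payload      nvarchar(max) NULL
    );

    -- Filtered index that holds only the rows still in the queue.
    CREATE INDEX ActiveIx ON dbo.StagedFile (ReceivedAt)
        INCLUDE (IsActive)
        WHERE IsActive = 1;

    -- Read the queue, oldest first. The literal IsActive = 1 matches the filter.
    SELECT StagedFileId, ReceivedAt
    FROM dbo.StagedFile
    WHERE IsActive = 1
    ORDER BY ReceivedAt;

    -- Mark one row as processed; it then drops out of the filtered index.
    UPDATE dbo.StagedFile
    SET IsActive = 0
    WHERE StagedFileId = 42
      AND IsActive = 1;
    ```

    With roughly 5000 new rows per day and most of them processed soon after arrival, the index should stay small no matter how large the table grows.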

    Partitioning can certainly be an option, but queries that do not filter on IsActive will need to look in both partitions.
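
    For comparison, a minimal sketch of what partitioning on IsActive might look like. All object names (pfIsActive, psIsActive, dbo.StagedFilePartitioned) are made up for illustration, and placing both partitions on the PRIMARY filegroup is an assumption to keep the example short.

    ```sql
    -- Two partitions: IsActive = 0 goes to partition 1, IsActive = 1 to partition 2.
    CREATE PARTITION FUNCTION pfIsActive (bit)
        AS RANGE LEFT FOR VALUES (0);

    -- Assumption: both partitions live on the PRIMARY filegroup.
    CREATE PARTITION SCHEME psIsActive
        AS PARTITION pfIsActive ALL TO ([PRIMARY]);

    -- The partitioning column has to be part of the unique clustered key.
    CREATE TABLE dbo.StagedFilePartitioned (
        StagedFileId int IDENTITY(1,1) NOT NULL,
        ReceivedAt   datetime2(3) NOT NULL,
        IsActive     bit NOT NULL,
        Payload      nvarchar(max) NULL,
        CONSTRAINT PK_StagedFilePartitioned
            PRIMARY KEY CLUSTERED (StagedFileId, IsActive)
    ) ON psIsActive (IsActive);

    -- Show how many rows sit in each partition.
    SELECT $PARTITION.pfIsActive(IsActive) AS PartitionNumber,
           COUNT(*)                        AS RowCnt
    FROM dbo.StagedFilePartitioned
    GROUP BY $PARTITION.pfIsActive(IsActive);
    ```

    Note that in this layout, flipping IsActive from 1 to 0 moves the row from one partition to the other, and a lookup on StagedFileId alone cannot be limited to a single partition.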


  2. CathyJi-MSFT 22,376 Reputation points Microsoft External Staff
    2021-07-05T07:47:07.63+00:00

    Hi @Wild Bill ,

    I suggest you read the blog post Filtered Indexes vs. Table Partitioning, which has examples that explain the difference between them. Which one to choose depends on your requirements.

    When it comes to reading the data, if you have a filtered index for a specific value, and the query optimizer can use that index, you can often reduce I/O by orders of magnitude because less of the data is stored in the index itself. With partitioning, SQL Server has to determine which partition the data is stored in, then access it. This can be helpful on very, very large tables, but the care and time needed to implement it and the ongoing upkeep mean it must be considered very carefully.
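
    One way to check the I/O difference on your own data is sketched below. It assumes a hypothetical table dbo.StagedFile with an IsActive bit column and columns StagedFileId and ReceivedAt; replace these with your own names. Run the query with and without the filtered index in place and compare the logical reads reported on the Messages tab.

    ```sql
    -- Report logical reads per table for the statements that follow.
    SET STATISTICS IO ON;

    SELECT StagedFileId, ReceivedAt
    FROM dbo.StagedFile      -- hypothetical table name
    WHERE IsActive = 1;

    SET STATISTICS IO OFF;

    -- Compare the size of each index on the table, in pages and rows.
    SELECT i.name                  AS IndexName,
           SUM(ps.used_page_count) AS UsedPages,
           SUM(ps.row_count)       AS RowCnt
    FROM sys.dm_db_partition_stats AS ps
    JOIN sys.indexes AS i
      ON i.object_id = ps.object_id
     AND i.index_id  = ps.index_id
    WHERE ps.object_id = OBJECT_ID(N'dbo.StagedFile')
    GROUP BY i.name;
    ```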

