Why do index scans slow down as table size increases (constant result set)?

Abhishek Mali 0 Reputation points
2025-05-29T11:42:05.1766667+00:00

We tested SELECT queries on four Azure SQL tables with different row counts (100K to 4.2M), but each query always returned ~100K rows (±5K). We ran the queries under three configurations (sketched below):

  • Enforce index usage (via adding WITH (INDEX(NAME_OF_INDEX)))
  • Enforce a table scan (via adding WITH (INDEX(0)))
  • Let Azure SQL decide the plan (default behaviour)
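
For reference, the three variants looked roughly like this (dbo.YourTable, NAME_OF_INDEX and the WHERE clause are placeholders, not our actual objects):

    -- 1. Force the non-clustered index
    SELECT * FROM dbo.YourTable WITH (INDEX(NAME_OF_INDEX))
    WHERE Entity = 'E100';

    -- 2. Force a scan (0 = clustered index or heap)
    SELECT * FROM dbo.YourTable WITH (INDEX(0))
    WHERE Entity = 'E100';

    -- 3. No hint: let the optimizer choose
    SELECT * FROM dbo.YourTable
    WHERE Entity = 'E100';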

Key observations:

  • Full table scans were consistently faster than index scans, especially on smaller tables.
  • Enforcing index usage degraded performance as table size increased — even though the result set size stayed constant.
  • In default mode, the query planner sometimes chose a slower plan, favouring index usage when a scan was clearly faster.
  • We also observed a "fast/slow/fast/slow" pattern in execution times across table sizes, which didn’t follow an intuitive trend.

We’re looking for clarification on:

  1. What causes this counterintuitive performance behaviour?
  2. Are there known table sizes or conditions where index scans become less efficient?
  3. Could we be missing something in test setup — e.g., caching or a control we didn’t clear?

Happy to share test queries if needed. Any insights or recommended strategies for consistent performance would be greatly appreciated!

Azure SQL Database

1 answer

  1. Erland Sommarskog 121.4K Reputation points MVP Volunteer Moderator
    2025-06-01T20:57:55.5366667+00:00

    Full table scans were consistently faster than index scans, especially on smaller tables. Enforcing index usage degraded performance as table size increased — even though the result set size stayed constant.

    This is perfectly to be expected with the queries that you have. You are using SELECT *, that is, you are returning all columns. Furthermore, for all queries but the first, the conditions in the WHERE clause do not seem to be very selective. My guess is that those NOT IN conditions qualify most rows, and the conditions on Scenario, Currency and Measure qualify a good deal of the rows, more than 20 %. (Obviously, I am only speculating from the column names.)
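
    If you want to verify that speculation, you can measure the selectivity of each condition yourself. A rough sketch, where the table name, column values and NOT IN list are only my guesses:

        SELECT COUNT(*) AS total_rows,
               SUM(CASE WHEN Scenario = 'Actual' THEN 1 ELSE 0 END) AS scenario_rows,
               SUM(CASE WHEN Currency = 'USD' THEN 1 ELSE 0 END) AS currency_rows,
               SUM(CASE WHEN Measure NOT IN ('A', 'B') THEN 1 ELSE 0 END) AS measure_rows
        FROM   dbo.YourTable;

    If each of those counts is a large fraction of total_rows, no index on a single one of these columns can make the query much faster.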

    When you are going to retrieve all columns, a scan of the clustered index will always be faster than a scan of a non-clustered index, since in the latter case there have to be key lookups to get the data. The case where a scan of a non-clustered index shines is when the index covers the query, that is, when all columns involved in the query are index keys or included columns.
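
    For example, with made-up index and column names, the first query below can be answered from the index alone, while the second one cannot:

        CREATE NONCLUSTERED INDEX IX_YourTable_Scenario
            ON dbo.YourTable (Scenario)
            INCLUDE (Currency, Measure, Amount);

        -- Covered: every column the query touches is a key or an included column.
        SELECT Scenario, Currency, Measure, Amount
        FROM   dbo.YourTable
        WHERE  Scenario = 'Actual';

        -- Not covered: SELECT * needs all columns, so if this index is used
        -- (or forced), every qualifying row requires a key lookup.
        SELECT *
        FROM   dbo.YourTable
        WHERE  Scenario = 'Actual';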

    Forcing an index can help if a seek on the index is possible and useful, which is the case when the WHERE clause qualifies a small number of rows. Again, there is a cost for the key lookups, so there is a point at which a table scan becomes faster; that point typically lies below 10% selectivity.
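
    You can see that cost directly by comparing logical reads for the two forced plans, for instance like this (placeholder names again; the numbers appear on the Messages tab):

        SET STATISTICS IO ON;
        SET STATISTICS TIME ON;

        -- Forced index: index access plus one key lookup per qualifying row.
        SELECT * FROM dbo.YourTable WITH (INDEX(NAME_OF_INDEX))
        WHERE Entity = 'E100';

        -- Forced scan: reads the whole table once, but no key lookups.
        SELECT * FROM dbo.YourTable WITH (INDEX(0))
        WHERE Entity = 'E100';

        SET STATISTICS IO OFF;
        SET STATISTICS TIME OFF;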

    I don't know how these indexes were defined, but you say that they are not composite. For the first query, an index on (Entity, Account) might be a good idea. For the remaining queries, I don't think any useful indexes can be defined. (Based on my speculation above.)
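
    As a sketch, assuming the first query filters on both Entity and Account, that index could look like this:

        CREATE NONCLUSTERED INDEX IX_YourTable_Entity_Account
            ON dbo.YourTable (Entity, Account);

    Whether it helps still depends on how selective the combination of the two conditions is.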

    In default mode, the query planner sometimes chose a slower plan, favouring index usage when a scan was clearly faster.

    The optimizer computes a plan based on statistics that typically have been sampled. From this information the optimizer estimates which plan is the best. Sometimes it gets it right, sometimes not. Since I don't see the queries or the plans where this occurred, I can't tell. But if you look at the execution plans in SQL Server Management Studio (which is likely to be more apt for the task than DBeaver), you can compare estimated and actual values.
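
    If you suspect that the estimates are off because of sampled statistics, you can check how the statistics were built and refresh them with a full scan. A sketch, with the table name as a placeholder:

        -- How recent are the statistics, and how many rows were sampled?
        SELECT s.name AS stats_name,
               sp.last_updated,
               sp.rows,
               sp.rows_sampled
        FROM   sys.stats AS s
        CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
        WHERE  s.object_id = OBJECT_ID(N'dbo.YourTable');

        -- Rebuild the statistics from all rows instead of a sample.
        UPDATE STATISTICS dbo.YourTable WITH FULLSCAN;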

