Why do index scans slow down as table size increases (constant result set)?

Abhishek Mali 0 Reputation points
2025-05-29T11:42:05.1766667+00:00

We tested SELECT queries on four Azure SQL tables with different row counts (100K to 4.2M), but each query always returned ~100K rows (±5K). We ran the queries under three configurations (sketched below):

  • Enforce index usage (via adding WITH (INDEX(NAME_OF_INDEX)))
  • Enforce a table scan (via adding WITH (INDEX(0)))
  • Let Azure SQL decide the plan (default behaviour)
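
For reference, the three variants looked roughly like this (dbo.YourTable, NAME_OF_INDEX and the WHERE clause are placeholders, not our actual objects):

    -- 1. Force the non-clustered index
    SELECT * FROM dbo.YourTable WITH (INDEX(NAME_OF_INDEX))
    WHERE Entity = 'E100';

    -- 2. Force a scan (0 = clustered index or heap)
    SELECT * FROM dbo.YourTable WITH (INDEX(0))
    WHERE Entity = 'E100';

    -- 3. No hint: let the optimizer choose
    SELECT * FROM dbo.YourTable
    WHERE Entity = 'E100';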

Key observations:

  • Full table scans were consistently faster than index scans, especially on smaller tables.
  • Enforcing index usage degraded performance as table size increased — even though the result set size stayed constant.
  • In default mode, the query planner sometimes chose a slower plan, favouring index usage when a scan was clearly faster.
  • We also observed a "fast/slow/fast/slow" pattern in execution times across table sizes, which didn’t follow an intuitive trend.

We’re looking for clarification on:

  1. What causes this counterintuitive performance behaviour?
  2. Are there known table sizes or conditions where index scans become less efficient?
  3. Could we be missing something in test setup — e.g., caching or a control we didn’t clear?

Happy to share test queries if needed. Any insights or recommended strategies for consistent performance would be greatly appreciated!

Azure SQL Database

1 answer

  1. Erland Sommarskog 121.4K Reputation points MVP Volunteer Moderator
    2025-06-01T20:57:55.5366667+00:00

    Full table scans were consistently faster than index scans, especially on smaller tables. Enforcing index usage degraded performance as table size increased — even though the result set size stayed constant.

    This is perfectly to be expected with the queries that you have. You are using SELECT *, that is, you are returning all columns. Furthermore, for all queries but the first, the conditions in the WHERE clause do not seem to be very selective. My guess is that those NOT IN conditions qualify most rows, and the conditions on Scenario, Currency and Measure qualify a good deal of the rows, more than 20 %. (Obviously, I am only speculating from the column names.)
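
    If you want to verify that speculation, you can measure the selectivity of each condition yourself. A rough sketch, where the table name, column values and NOT IN list are only my guesses:

        SELECT COUNT(*) AS total_rows,
               SUM(CASE WHEN Scenario = 'Actual' THEN 1 ELSE 0 END) AS scenario_rows,
               SUM(CASE WHEN Currency = 'USD' THEN 1 ELSE 0 END) AS currency_rows,
               SUM(CASE WHEN Measure NOT IN ('A', 'B') THEN 1 ELSE 0 END) AS measure_rows
        FROM   dbo.YourTable;

    If each of those counts is a large fraction of total_rows, no index on a single one of these columns can make the query much faster.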

    When you are going to retrieve all columns, a scan of the clustered index will always be faster than a scan of a non-clustered index, since in the latter case there have to be key lookups to get the data. The case where a scan of a non-clustered index shines is when the index covers the query, that is, when all columns involved in the query are index keys or included columns.
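
    For example, with made-up index and column names, the first query below can be answered from the index alone, while the second one cannot:

        CREATE NONCLUSTERED INDEX IX_YourTable_Scenario
            ON dbo.YourTable (Scenario)
            INCLUDE (Currency, Measure, Amount);

        -- Covered: every column the query touches is a key or an included column.
        SELECT Scenario, Currency, Measure, Amount
        FROM   dbo.YourTable
        WHERE  Scenario = 'Actual';

        -- Not covered: SELECT * needs all columns, so if this index is used
        -- (or forced), every qualifying row requires a key lookup.
        SELECT *
        FROM   dbo.YourTable
        WHERE  Scenario = 'Actual';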

    Forcing an index can help if a seek on the index is possible and useful, which is the case when the WHERE clause qualifies a small number of rows. Again, there is a cost for the key lookups, so there is a point at which a table scan becomes faster; that point typically lies below 10% selectivity.
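
    You can see that cost directly by comparing logical reads for the two forced plans, for instance like this (placeholder names again; the numbers appear on the Messages tab):

        SET STATISTICS IO ON;
        SET STATISTICS TIME ON;

        -- Forced index: index access plus one key lookup per qualifying row.
        SELECT * FROM dbo.YourTable WITH (INDEX(NAME_OF_INDEX))
        WHERE Entity = 'E100';

        -- Forced scan: reads the whole table once, but no key lookups.
        SELECT * FROM dbo.YourTable WITH (INDEX(0))
        WHERE Entity = 'E100';

        SET STATISTICS IO OFF;
        SET STATISTICS TIME OFF;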

    I don't know how these indexes were defined, but you say that they are not composite. For the first query, an index on (Entity, Account) might be a good idea. For the remaining queries, I don't think any useful indexes can be defined. (Based on my speculation above.)
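
    As a sketch, assuming the first query filters on both Entity and Account, that index could look like this:

        CREATE NONCLUSTERED INDEX IX_YourTable_Entity_Account
            ON dbo.YourTable (Entity, Account);

    Whether it helps still depends on how selective the combination of the two conditions is.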

    In default mode, the query planner sometimes chose a slower plan, favouring index usage when a scan was clearly faster.

    The optimizer computes a plan based on statistics that typically have been sampled. From this information the optimizer estimates which plan is the best. Sometimes it gets it right, sometimes not. Since I don't see the queries or the plans where this occurred, I can't tell. But if you look at the execution plans in SQL Server Management Studio (which is likely to be more apt for the task than DBeaver), you can compare estimated and actual values.
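
    If you suspect that the estimates are off because of sampled statistics, you can check how the statistics were built and refresh them with a full scan. A sketch, with the table name as a placeholder:

        -- How recent are the statistics, and how many rows were sampled?
        SELECT s.name AS stats_name,
               sp.last_updated,
               sp.rows,
               sp.rows_sampled
        FROM   sys.stats AS s
        CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
        WHERE  s.object_id = OBJECT_ID(N'dbo.YourTable');

        -- Rebuild the statistics from all rows instead of a sample.
        UPDATE STATISTICS dbo.YourTable WITH FULLSCAN;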

