@S-A You're correct that using multiple indexers allows indexing the same data source in parallel, accelerating the process.
The Azure AI Search (formerly Azure Cognitive Search) docs provide some additional details on how this works:
- The doc on indexing large data sets in Azure AI Search covers strategies for this scenario. It mentions that if two indexers retrieve the same item, the indexer that completes last will overwrite the existing indexed document.
- Indexers track state about what they've already indexed using a high water mark. This prevents re-indexing the full source on every run.
- If one indexer crashes partway through a large item, the next indexer will start over on that item rather than continuing where it left off.
So in summary:
- Multiple indexers pull data in parallel to accelerate indexing
- They track state to avoid re-indexing existing docs
- Last write wins if two indexers index the same document
- An item is re-indexed from the start if an indexer crashes midway through it
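To make the high water mark and last-write-wins behavior concrete, here is a small toy simulation in plain Python. It does not call the Azure SDK; the `Indexer` class, the `modified` field, and the dictionary standing in for the index are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Indexer:
    """Toy stand-in for an indexer (hypothetical, not the Azure SDK)."""
    name: str
    high_water_mark: int = 0  # highest 'modified' value already processed

    def run(self, source, index):
        """Index only rows changed since the last run, then advance the mark."""
        changed = [row for row in source if row["modified"] > self.high_water_mark]
        for row in sorted(changed, key=lambda r: r["modified"]):
            # Documents are written by key, so a later write replaces an earlier one.
            index[row["id"]] = {"content": row["content"], "indexed_by": self.name}
            self.high_water_mark = row["modified"]
        return len(changed)


source = [
    {"id": "doc1", "content": "v1", "modified": 1},
    {"id": "doc2", "content": "v1", "modified": 2},
]
index = {}
a = Indexer("indexer-a")
b = Indexer("indexer-b")

first = a.run(source, index)    # indexer-a processes both documents
overlap = b.run(source, index)  # indexer-b covers the same documents and overwrites them
rerun = a.run(source, index)    # nothing changed since a's mark, so nothing is re-indexed
```

After this runs, every document in `index` is attributed to `indexer-b` because it finished last, and the second run of `indexer-a` processes zero rows because its high water mark is already past every row.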
Using multiple indexers is a good approach to scale out indexing throughput. The tradeoff is index consistency: if indexers overlap on the same documents, the final content depends on which one finishes last.
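One way to avoid overlap is to give each indexer a disjoint slice of the source. The sketch below is only an illustration of that idea in plain Python; `assign_partition` and the `doc-N` keys are hypothetical names, and in a real service I'd expect the split to be expressed in each data source definition (for example, a separate folder or query per data source) rather than with a hash like this.

```python
import zlib


def assign_partition(doc_key: str, indexer_count: int) -> int:
    """Map a document key to exactly one indexer slot, stably across runs."""
    # crc32 is deterministic across processes, unlike Python's built-in hash().
    return zlib.crc32(doc_key.encode("utf-8")) % indexer_count


indexer_count = 4
keys = [f"doc-{i}" for i in range(1000)]  # hypothetical document keys

partitions = {slot: [] for slot in range(indexer_count)}
for key in keys:
    partitions[assign_partition(key, indexer_count)].append(key)
```

Because each key lands in exactly one slot, no two indexers would ever write the same document, so the last-write-wins question does not come up.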
Hope that helps. Let us know if you have additional questions about using multiple indexers.
-Grace