When the LLM itself handles search (Scenario 2), it must iteratively "talk to itself": it drafts a query, reads the results, refines the query, and eventually produces a final answer. Each of those steps is another turn in the conversation, so every query the model writes and every chunk of retrieved text it reads is sent through the model again, consuming tokens in both directions (input and output) at every step.
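A rough sketch of that loop is below. It assumes hypothetical `llm_complete` and `search` helpers (stand-ins for your chat-completion call and your Azure AI Search query) and a toy token counter rather than a real tokenizer; the point is only that the growing `messages` list is re-sent on every round, so earlier queries and earlier results are billed again each time.

```python
# Sketch of Scenario 2: the LLM drives retrieval itself.
# llm_complete and search are hypothetical stubs; token counts are
# illustrative, not measured.

def count_tokens(text: str) -> int:
    # Rough stand-in for a real tokenizer such as tiktoken.
    return len(text.split())

def llm_complete(messages: list[dict]) -> str:
    # Placeholder: call your chat model here.
    return "NEXT_QUERY: widget pricing 2024"

def search(query: str) -> str:
    # Placeholder: call Azure AI Search here.
    return "Top documents about " + query

messages = [{"role": "user", "content": "Answer using the attached data source."}]
total_tokens = 0

for step in range(3):  # each pass = one query/read cycle
    # Every call re-sends the *entire* conversation so far, so earlier
    # queries and earlier search results are billed again as input tokens.
    prompt_tokens = sum(count_tokens(m["content"]) for m in messages)
    reply = llm_complete(messages)                       # output tokens
    total_tokens += prompt_tokens + count_tokens(reply)

    results = search(reply)
    messages.append({"role": "assistant", "content": reply})
    messages.append({"role": "user", "content": results})

print(total_tokens)
```

In Scenario 1 there is a single call instead of this loop, so the retrieved context and the question are counted once as input and the answer once as output.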
Why did the LLM token count increase when I directly attached the data source to the LLM model?
trinadh maddimsetti
The question presents two scenarios involving AI search and an LLM:
- Scenario 1:
  - AI search is done separately.
  - The search results are passed as context to the LLM.
  - The LLM processes this context, consuming X tokens in total.
- Scenario 2:
  - The LLM itself performs the AI search.
  - It generates search queries, processes results, and formulates a response.
  - This approach consumes 2X tokens in total.

Key Question: Why does Scenario 2 consume twice as many tokens as Scenario 1?
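To make the comparison concrete, here is a back-of-the-envelope accounting with made-up numbers; the figures are assumptions and only the shape of the calculation matters: Scenario 1 pays for the retrieved context once, while Scenario 2 re-sends the growing conversation on every retrieval round, which is what the answer above describes.

```python
# Illustrative token accounting for the two scenarios (all numbers made up).

question, answer = 50, 300           # tokens in the question / final answer
results_per_round, query = 500, 30   # tokens of search results / of each query
rounds = 3                           # retrieval rounds before answering

# Scenario 1: search runs outside the model; all results go into one call.
scenario_1 = (question + rounds * results_per_round) + answer

# Scenario 2: the model writes a query, reads results, repeats, then answers.
# Each call re-sends the whole conversation, so earlier queries and results
# are billed again every round.
scenario_2 = 0
history = question
for _ in range(rounds):
    scenario_2 += history + query          # input so far + the query it writes
    history += query + results_per_round   # that query and its results join the history
scenario_2 += history + answer             # final call: full history in, answer out

print(scenario_1, scenario_2)  # 1850 vs 3770 -- roughly a 2x difference
```

With these assumed numbers the iterative pattern comes out at roughly twice the single-pass cost; the exact ratio depends on how many rounds the model takes and how much text each round returns.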
Tags: Azure AI Search, Azure OpenAI Service, Azure AI services