How can I improve the efficiency of file search without significantly affecting performance?

Rahul kumar 15 Reputation points
2025-01-16T12:04:57.3533333+00:00


I uploaded a .json file and asked a simple query to get any two carriers' details, and the input token usage for this simple query was over 17k. Why is this happening, and how can I improve file search? I want to feed the assistant model our data, which is in CSV, XLSX, or JSON, without consuming that many tokens.

Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.

1 answer

  1. Saideep Anchuri 1,950 Reputation points Microsoft Vendor
    2025-01-17T04:59:30.3533333+00:00

    Hi Rahul kumar

    Welcome to Microsoft Q&A Forum, thank you for posting your query here!

    Input token usage is very high because of the internal operations file search performs to get accurate answers. These include splitting complex queries into simpler ones and running both keyword and semantic search across the vector store. See: how-it-works

    You can try adjusting the parameters below to reduce token usage:

    1. the chunk size and overlap settings,

    2. the maximum number of chunks to be added,

    3. the number of results returned, and

    4. optimizing your queries to be more concise, reducing the need to split them into simpler queries.
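To see why the chunk settings matter, here is a rough back-of-the-envelope sketch (plain Python, not the service's actual chunker) of how chunk size and overlap determine how much text gets indexed. With the documented file search defaults of 800-token chunks and a 400-token overlap, roughly twice the document's tokens end up indexed:

```python
def chunk_count(doc_tokens: int, chunk_size: int, overlap: int) -> int:
    """Number of chunks a document splits into: each chunk after the
    first advances by (chunk_size - overlap) tokens."""
    if doc_tokens <= chunk_size:
        return 1
    step = chunk_size - overlap
    return 1 + -(-(doc_tokens - chunk_size) // step)  # ceiling division

doc = 16_000  # illustrative token count for an uploaded file
for size, overlap in [(800, 400), (800, 100), (1600, 100)]:
    n = chunk_count(doc, size, overlap)
    print(f"chunk_size={size} overlap={overlap} -> {n} chunks, "
          f"~{n * size} tokens indexed")
```

Halving the overlap or enlarging the chunks sharply reduces the total indexed tokens, and therefore how much each retrieved result costs in the prompt.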

    A detailed guide on chunking and overlap can be found here: chunking-examples

    Sample code for adjusting the maximum number of results:

        # Assumes an initialized client, e.g.: from openai import OpenAI; client = OpenAI()
        assistant = client.beta.assistants.create(
            name="Financial Analyst Assistant",
            instructions="You are an expert financial analyst.",
            model="gpt-4-turbo",
            tools=[{"type": "file_search", "file_search": {"max_num_results": 2}}],
        )

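Similarly, the chunk size and overlap can be passed as a `chunking_strategy` when the files are added to the vector store. A minimal sketch, assuming the OpenAI Python SDK (the store name and values here are illustrative, not from the original question):

```python
# Smaller chunks with less overlap mean fewer tokens per retrieved result.
# Constraint: chunk_overlap_tokens must not exceed half of
# max_chunk_size_tokens (the defaults are 800 and 400 respectively).
chunking_strategy = {
    "type": "static",
    "static": {
        "max_chunk_size_tokens": 400,  # default: 800
        "chunk_overlap_tokens": 100,   # default: 400
    },
}

# Applied when creating the vector store (call commented out; requires
# an initialized client and previously uploaded files):
# vector_store = client.beta.vector_stores.create(
#     name="carrier-data",
#     chunking_strategy=chunking_strategy,
# )
```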

    Thank You.

