How to create a chatbot that dynamically selects between database and PDF search based on user input?
I have developed two separate chatbots: one that searches data from my database and another that searches from PDF documents. Now, I want to create a unified chatbot that intelligently decides where to search based on the user's input. Could someone please suggest approaches or algorithms for implementing this decision-making functionality in the chatbot? For example:
User Question --> Do some analysis --> (Search from PDF OR Search from DB) --> Provide output.
Azure AI Search
-
Divakarkumar-3696 • 375 Reputation points
2024-02-19T19:35:41.75+00:00 Hi, For your case, you would require function calling to support database fetch and search from PDF documents.
Please refer here https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/function-calling
Sample example: https://gist.github.com/pamelafox/a3fdea186b687509c02cb186ca203328
Please 'Accept as answer' if it helped so that it can help others in the community looking for help on similar topics.
-
Paritosh Raval • 5 Reputation points
2024-02-20T04:48:13.4433333+00:00 Thanks, @Divakarkumar-3696. However, how will it determine which case to call for database data and which case to call for PDF documents?
-
Divakarkumar-3696 • 375 Reputation points
2024-02-20T05:20:16.53+00:00 Hi,
We should define the functions with good descriptions, so that the model is able to determine which function to call.
Note: Not all models are capable of function calling. Please refer here for the list of models (the latest versions of gpt-35-turbo and gpt-4) that support function calling.
- First, define the list of functions. In your case you need one function for fetching records from the database and another for searching the PDFs. For each function, provide a description of what it does along with the parameters to be passed to it.
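tools = [
    {
        "type": "function",
        "function": {
            "name": "search_database",
            "description": "Retrieve data from database",
            "parameters": {"type": "object", "properties": {}, "required": []},
        },
    },
    {
        "type": "function",
        "function": {
            "name": "search_pdf",
            "description": "Search from PDF files",
            "parameters": {"type": "object", "properties": {}, "required": []},
        },
    },
]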
- Then, in the completion call, pass these functions in the tools parameter and set tool_choice to auto, so that the model determines which function to call.
response = client.chat.completions.create(
    model="<REPLACE_WITH_YOUR_MODEL_DEPLOYMENT_NAME>",
    messages=messages,
    tools=tools,  # YOUR LIST OF FUNCTIONS
    tool_choice="auto",  # DEFAULT - YOU CAN ALSO BE EXPLICIT HERE
)
-
Paritosh Raval • 5 Reputation points
2024-02-20T10:31:05.3166667+00:00 Thanks again @Divakarkumar-3696 for the detailed answer.
I get how the code works, but I'm wondering how it knows which function to use for the JSON response.
Here's an example:
- I keep info about products in a database.
- But rules and regulations, policies are in PDFs.
The database is in one place, like an SQL server, while the PDFs are somewhere else, like blob storage or on my computer. So, if I ask something like "How many days off can an employee take in a month?", how does it know to use the "search_pdf" function? Do I have to set a description or properties for it? If I do, adding lots of details might get complicated.
-
Divakarkumar-3696 • 375 Reputation points
2024-02-20T12:01:48.48+00:00 The latest models (gpt-3.5-turbo-0125 and gpt-4-turbo-preview) have been trained to both detect when a function should be called (depending on the input) and to respond with JSON that adheres to the function signature more closely than previous models.
Reference: https://platform.openai.com/docs/guides/function-calling
As you stated, it is important to give the functions and their properties meaningful descriptions so that the model can choose between them more reliably. Not sure if you got a chance to look at this sample: https://gist.github.com/pamelafox/a3fdea186b687509c02cb186ca203328. That example has two functions: one retrieves sources from the Azure Cognitive Search index and the other retrieves Azure SDK related issues from GitHub.
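One way to keep the descriptions manageable is to describe what kind of information each source holds rather than listing sample questions. The sketch below assumes the split mentioned earlier in the thread (product and order data in the database, rules and policies in the PDFs). The helper `make_search_tool`, the description wording, and the `search_query` parameter are illustrative assumptions, not something taken from the linked sample.

```python
def make_search_tool(name, description):
    """Build one function-tool entry that takes a single free-text query.

    Hypothetical helper: it only assembles the dictionary passed in `tools`.
    """
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": {
                    "search_query": {
                        "type": "string",
                        "description": "The user's question rephrased as a search query",
                    }
                },
                "required": ["search_query"],
            },
        },
    }


# Descriptions summarise the domain of each source (assumed wording).
tools = [
    make_search_tool(
        "search_database",
        "Look up structured business records: products, orders, "
        "shipments, carriers and tracking numbers.",
    ),
    make_search_tool(
        "search_pdf",
        "Search company documents: rules, regulations and policies, "
        "for example leave and HR policies.",
    ),
]
```

With descriptions like these, a question about monthly leave falls under "policies" without that exact question having to appear anywhere in the description. How reliably a given model follows them still needs to be checked against your own questions.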
PS: The model does not execute the function itself; it only determines which function to call based on the input. Making the actual function call is our responsibility.
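A minimal sketch of that last step, assuming the tool call returned by the model exposes `function.name` and `function.arguments` (a JSON string), as in the OpenAI Python client. The two search functions are placeholders and `fake_call` is a stand-in object so the snippet runs without calling the service.

```python
import json
from types import SimpleNamespace


def search_database(search_query=""):
    # Placeholder: the real SQL lookup would go here.
    return f"[database results for: {search_query}]"


def search_pdf(search_query=""):
    # Placeholder: the real document search would go here.
    return f"[PDF results for: {search_query}]"


# Map the function names declared in `tools` to local Python callables.
HANDLERS = {"search_database": search_database, "search_pdf": search_pdf}


def run_tool_call(tool_call):
    """Execute the function the model selected and return its result."""
    handler = HANDLERS.get(tool_call.function.name)
    if handler is None:
        raise ValueError(f"Unknown function: {tool_call.function.name}")
    # The arguments arrive as a JSON string and must be parsed by us.
    arguments = json.loads(tool_call.function.arguments or "{}")
    return handler(**arguments)


# Stand-in for response.choices[0].message.tool_calls[0] (assumed shape).
fake_call = SimpleNamespace(
    id="call_1",
    type="function",
    function=SimpleNamespace(
        name="search_pdf",
        arguments='{"search_query": "monthly leave allowance"}',
    ),
)
result = run_tool_call(fake_call)
```

A dictionary lookup keeps the routing in one place, so adding a third source only means declaring one more tool and one more handler.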
-
Paritosh Raval • 5 Reputation points
2024-02-21T12:17:45.14+00:00 Thanks @Divakarkumar-3696. The problem is that I have 20+ PDFs and a large database, so I cannot put everything in the description or in the properties.
For example, I created a few questions and put them in the description, but if I add any new question it is predicting wrong.
tools = [
    {
        "type": "function",
        "function": {
            "name": "search_database",
            "description": "Retrieve data from database",
            "parameters": {"type": "object", "properties": {}} ,
            "required": []
        }
    },
    {
        "type": "function",
        "function": {
            "name": "search_pdf",
            "description": """Retrieve data from PDF files
                Query string to retrieve documents from pdf eg: 'question1', 'question2' """,
            "search_query": {
                "type": "string",
                "description": "Query string to retrieve documents from azure search eg: 'How to debug compute issues'",
            },
            # "parameters": {"type": "object", "properties": {}} , "required": ["search_query"]
        }
    }
]
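One thing that may be worth checking in the definition above: as far as I know, the `function` object only takes `name`, `description`, `parameters` and optionally `strict`, with argument schemas and `required` nested inside `parameters`. As I read the posted snippet, `search_query` and `required` sit one level too high. The sketch below uses a hypothetical helper, `misplaced_keys`, to flag such keys, and shows one possible corrected `search_pdf` entry; it is not a confirmed cause of the wrong predictions.

```python
# Keys expected directly under "function" (assumption based on the tools format).
ALLOWED_FUNCTION_KEYS = {"name", "description", "parameters", "strict"}


def misplaced_keys(tool):
    """Return keys found directly under "function" that belong elsewhere."""
    return sorted(set(tool["function"]) - ALLOWED_FUNCTION_KEYS)


# The two entries as posted (descriptions shortened).
posted_database = {
    "type": "function",
    "function": {
        "name": "search_database",
        "description": "Retrieve data from database",
        "parameters": {"type": "object", "properties": {}},
        "required": [],
    },
}
posted_pdf = {
    "type": "function",
    "function": {
        "name": "search_pdf",
        "description": "Retrieve data from PDF files",
        "search_query": {"type": "string", "description": "Query string"},
    },
}

# One possible corrected form: the argument lives under parameters.properties.
fixed_pdf = {
    "type": "function",
    "function": {
        "name": "search_pdf",
        "description": "Retrieve data from PDF files",
        "parameters": {
            "type": "object",
            "properties": {
                "search_query": {"type": "string", "description": "Query string"}
            },
            "required": ["search_query"],
        },
    },
}

for tool in (posted_database, posted_pdf, fixed_pdf):
    print(tool["function"]["name"], misplaced_keys(tool))
```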
-
Divakarkumar-3696 • 375 Reputation points
2024-02-23T06:49:17.98+00:00 Hi, sorry for the delay in responding. As you mentioned, you don't need to put everything in the description; it is the actual function that does the work. When you say "if I add any new question it is predicting wrong.", were you getting a wrong response? Could you share the code you have defined for search_pdf, along with a sample question and the answer you got from the model, so that we can help you better?
-
Paritosh Raval • 5 Reputation points
2024-02-23T07:15:47.7166667+00:00
import json
import os
from openai import AzureOpenAI

def searchFromDatabase():
    print("Searching from database...")

def searchFromDocuments():
    print("Searching from documents...")

client = AzureOpenAI(
    api_key="my_key",
    api_version="version",
    azure_endpoint="my_endpoint"
)

messages= [
    {"role": "user", "content": "How many leaves an employee can take in one month?"}
]

tools = [
    {
        "type": "function",
        "function": {
            "name": "search_database",
            "description": """Retrieve data from database eg.
                who requested order number xyz?
                Which products are shipped in Order Number xyz?
                give me the carrier & tracking number for order number related information """,
            "parameters": {"type": "object", "properties": {}} ,
            "required": []
        }
    },
    {
        "type": "function",
        "function": {
            "name": "search_pdf",
            "description": """Retrive data from PDF files
                Query string to retrieve documents from pdf eg:
                'What are the potential consequences of violating the xyz Policy?',
                'What actions can the company take if an employee refuses to cooperate with xyz?'
                'What is xyz Policy within the company?' """,
            "search_query": {
                "type": "string",
                "description": "Query string to retrieve documents ",
            }
            # "parameters": {"type": "object", "properties": {}} , "required": ["search_query"]
        }
    }
]

response = client.chat.completions.create(
    model="my-model-name",
    messages= messages,
    tools= tools,
    tool_choice="auto",
)
print(response.choices[0].message.model_dump_json(indent=2))

tool_call = response.choices[0].message.tool_calls[0]
print(tool_call)
if tool_call.type == 'function':
    # Check the name of the function
    if tool_call.function.name == 'search_database':
        searchFromDatabase()
    elif tool_call.function.name == 'search_pdf':
        searchFromDocuments()
    else:
        print("Unknown function")
else:
    print("Unknown tool call type")
@Divakarkumar-3696, I have added the code above.
So now my logic for searching the PDFs in blob storage will go into the search_pdf function, and the logic for retrieving information from the database will go into the search_database function.
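Two details may be worth handling once the real search logic is in place. With tool_choice set to "auto" the model can answer directly, in which case `tool_calls` is empty and indexing `[0]` would fail. And the search result usually has to be sent back to the model as a message with role "tool" so that it can write the final answer. The sketch below assumes the message shape of the OpenAI Python client and uses stand-in objects instead of a live response; `pick_function` and `tool_result_message` are hypothetical helper names.

```python
from types import SimpleNamespace


def pick_function(message):
    """Return the name of the function the model chose, or None if it answered directly."""
    if not message.tool_calls:
        return None
    return message.tool_calls[0].function.name


def tool_result_message(tool_call_id, content):
    """Build the follow-up message that carries a tool's output back to the model."""
    return {"role": "tool", "tool_call_id": tool_call_id, "content": content}


# Stand-ins for response.choices[0].message (assumed shape).
direct_answer = SimpleNamespace(content="Hello!", tool_calls=None)
routed = SimpleNamespace(
    content=None,
    tool_calls=[
        SimpleNamespace(
            id="call_1",
            type="function",
            function=SimpleNamespace(name="search_pdf", arguments="{}"),
        )
    ],
)

print(pick_function(direct_answer))
print(pick_function(routed))
print(tool_result_message("call_1", "Leave policy excerpt"))
```

In a real conversation, the assistant message containing the tool call is appended to `messages` first, then the tool result message, and `client.chat.completions.create` is called again with the extended list.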