Using Vector Search With An Existing Index In Azure OpenAI In Python

Bao, Jeremy (Cognizant) 105 Reputation points
2024-02-24T00:30:43.9633333+00:00

I created an index containing an embedding field in my Azure AI Search resource using the "Import and vectorize data" feature, as described here: https://learn.microsoft.com/en-us/azure/search/search-get-started-portal-import-vectors#prepare-sample-data

I currently have the following Python code (more or less), which uses RAG to give a chatbot the context it needs to handle user input.

import os

from dotenv import load_dotenv
from openai import AzureOpenAI

load_dotenv()
azure_oai_endpoint = os.getenv("AZURE_OAI_ENDPOINT")
azure_oai_key = os.getenv("AZURE_OAI_KEY")
azure_oai_model = os.getenv("AZURE_OAI_MODEL")
azure_search_endpoint = os.getenv("AZURE_SEARCH_ENDPOINT")
azure_search_key = os.getenv("AZURE_SEARCH_KEY")
azure_search_index = os.getenv("AZURE_SEARCH_INDEX")

client = AzureOpenAI(
    base_url=
    api_key=azure_oai_key,
    api_version=
)

prompt = {

ourMessages = []

# Currently stateless
message_history_length = 0

extension_config = dict(
    dataSources = [
        { 
            "type": "AzureCognitiveSearch", 
            "parameters": { 
                "endpoint":azure_search_endpoint, 
                "key": azure_search_key, 
                "indexName": azure_search_index
            }
        }
    ]
)

while True:
    text = input('\nEnter a question, or q to quit:\n')

    if (text.lower() == 'q'):
        break

    ourMessages.append({"role": "user", "content": text})

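    # Send the system prompt plus the most recent messages
    # (only the latest user message while message_history_length is 0).
    conversation = [prompt] + ourMessages[-(message_history_length + 1):]
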
    response = client.chat.completions.create(
        model = azure_oai_model,
        temperature = 0.5,
        max_tokens = 1600,
        messages = conversation,
        extra_body = extension_config
    )

This code seems to work when the index does not include any embedding scheme: the relevant information is usually retrieved and used to construct the answer.

How should I modify it to use an index that has an embedding vector field (called vector) and a vector profile? In the Azure OpenAI Studio chat playground, you can tick a box labelled "Add vector search to this search resource" when adding the index as a data source. How can I do the same thing in the Python SDK? Do I need to modify extension_config somehow? I am finding it difficult to locate the relevant documentation.

Azure OpenAI Service

Accepted answer
  1. Debarchan Sarkar - MSFT 1,131 Reputation points Microsoft Employee
    2024-02-25T01:51:45.0233333+00:00

    While I don't have exact details or a specific example for your scenario, you can add the vector search information to the extension_config variable. Your existing configuration dictionary has a "parameters" section where you specify the Azure Cognitive Search endpoint and key. You can extend this section with vector search parameters. Here is an example of how you might modify the extension_config dictionary:

    
    extension_config = dict(
        dataSources = [
            {
                "type": "AzureCognitiveSearch",
                "parameters": {
                    "endpoint": azure_search_endpoint,
                    "key": azure_search_key,
                    "indexName": azure_search_index,
                    "vectorProfile": "your_vector_profile_name",  # Specify your vector profile name here
                    "vectorField": "vector"  # Field in your index containing vectors
                }
            }
        ]
    )

    This code specifies both the vector field and the vector profile used in your index. Replace "your_vector_profile_name" with the actual name of your vector profile, that is, the profile that was used when you imported and vectorized your sample data. Keep the "vectorField" value as "vector" if the vector field in your index is named vector. As a next step, apply these modifications to your Python code and test whether it now uses the embedding vector for search. This is a relatively new feature and the official documentation might not cover every aspect in depth, so if you run into issues, consider reaching out to Azure support or the user community forums for more assistance.
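The answer offers its parameter names as a suggestion rather than a confirmed schema, so it is worth checking them against the Azure OpenAI "On Your Data" reference for the API version in use. As far as I recall, the preview request shape used in this thread ("dataSources" with type "AzureCognitiveSearch") enables vector search through a "queryType" value, an embedding model reference, and a "fieldsMapping" entry, not through "vectorProfile" or "vectorField". The sketch below builds such a payload under that assumption. The helper name build_vector_extension_config, the "embeddingDeploymentName" key, the "fieldsMapping"/"vectorFields" keys, and all placeholder values are assumptions that do not come from the thread. Later API versions appear to use a snake_case shape ("data_sources" with type "azure_search"), so the exact keys depend on the api_version passed to the client.

```python
def build_vector_extension_config(
    search_endpoint: str,
    search_key: str,
    index_name: str,
    embedding_deployment: str,
    vector_field: str = "vector",
    query_type: str = "vector",
) -> dict:
    """Build an extra_body payload for the preview "dataSources" request shape.

    ASSUMPTION: "queryType", "embeddingDeploymentName" and
    "fieldsMapping.vectorFields" are recalled from the On Your Data
    reference, not confirmed by the thread. Verify them for your api_version.
    """
    return {
        "dataSources": [
            {
                "type": "AzureCognitiveSearch",
                "parameters": {
                    "endpoint": search_endpoint,
                    "key": search_key,
                    "indexName": index_name,
                    # "vector" asks for pure vector retrieval (assumed value).
                    "queryType": query_type,
                    # Embedding model deployment used to embed the user query;
                    # it should match the model that produced the index vectors.
                    "embeddingDeploymentName": embedding_deployment,
                    # Tell the service which index field(s) hold the vectors.
                    "fieldsMapping": {"vectorFields": [vector_field]},
                },
            }
        ]
    }


# Hypothetical placeholder values, for illustration only.
extension_config = build_vector_extension_config(
    search_endpoint="https://example-search.search.windows.net",
    search_key="<search-key>",
    index_name="example-index",
    embedding_deployment="example-embedding-deployment",
)
```

The resulting dictionary can be passed as extra_body to client.chat.completions.create in the same way as the original extension_config. Building it in a function keeps the assumed key names in one place, which makes them easy to adjust once they have been checked against the reference.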

    1 person found this answer helpful.
