Are Vector Databases needed for Azure OpenAI?

WMorath.SA 20 Reputation points
2023-11-03T14:22:45.48+00:00

I'm currently using Azure OpenAI to build an in-house chatbot, as many others are doing. One of the questions I was asked while building the framework to support this is: are Vector Databases needed?

I'm primarily working with unstructured data such as PDFs. I guess my question really is, are they needed? I see there is an option to use Vector Search if you use the Ada embedding model.

My assumption is that the Vector Search option in Azure OpenAI is a way to avoid building a Vector Database yourself. Is that right?
Also, if keyword search is already giving users the right information, is Vector Search even needed?

Thanks,

Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.

Accepted answer
  1. Pramod Valavala 19,356 Reputation points Microsoft Employee
    2023-11-03T14:53:26.0666667+00:00

    @WMorath.SA Vector Search is what enables searching large troves of unstructured data by semantic meaning rather than by exact wording. While it is not strictly necessary, in many cases you may see better results, especially when your users' prompts do not contain the same keywords that appear in your data.
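
To make the keyword limitation concrete, here is a minimal sketch of naive keyword matching. The `keyword_score` function, the sample document, and the two queries are all made up for illustration; a real keyword engine does more (stemming, ranking, synonyms), so treat this only as a picture of the failure mode, not of how any search service works.

```python
import re


def keyword_score(query, document):
    """Count distinct lowercase words shared by the query and the document."""
    query_words = set(re.findall(r"[a-z0-9]+", query.lower()))
    document_words = set(re.findall(r"[a-z0-9]+", document.lower()))
    return len(query_words & document_words)


# Hypothetical chunk of text extracted from a PDF.
doc = "Employees accrue 20 days of annual leave per year."

# The query reuses the document's wording, so keyword matching finds it.
exact_wording = keyword_score("annual leave days", doc)

# The query asks the same thing in different words and shares no terms.
paraphrase = keyword_score("How much time off do I get?", doc)

print(exact_wording, paraphrase)
```

If your users mostly phrase questions the way your documents do, the first case dominates and keyword search may be enough. The second case is where embeddings tend to help.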

    Vector Databases still rely on the embeddings produced by models like ada, but they provide an index that speeds up searches compared to manually computing the distance to every chunk of your data set on each query.
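
As a rough sketch of the "manually computing distances" approach, the code below scores every chunk against the query on each search. The 3-dimensional vectors are invented toy values standing in for real embeddings (an embedding model such as ada returns far longer vectors), and `brute_force_search` is a hypothetical helper, not part of any Azure SDK.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def brute_force_search(query_vec, chunks, top_k=2):
    """Score every chunk against the query and return the top_k best matches."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Toy (text, embedding) pairs; the vectors are assumptions, not model output.
chunks = [
    ("Vacation policy: employees accrue 20 days of leave per year.", [0.9, 0.1, 0.0]),
    ("Expense reports are due by the 5th of each month.", [0.1, 0.9, 0.1]),
    ("VPN setup guide for remote workers.", [0.0, 0.2, 0.9]),
]

# Hypothetical embedding of the question "How much time off do I get?"
query_vec = [0.8, 0.2, 0.1]

results = brute_force_search(query_vec, chunks)
for score, text in results:
    print(f"{score:.3f}  {text}")
```

This linear scan touches every chunk on every query, which is fine for a few thousand chunks. A vector index avoids the full scan, typically by using approximate nearest-neighbour structures, and that is the main thing a Vector Database adds.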

    Azure OpenAI on your data uses Azure Cognitive Search as the Vector Database, but you are free to build your own solution on any store that suits your requirements. Even Redis can be used here, for example.

    And in your case, Azure Cognitive Search is a good option since it includes the necessary processing pipelines to automatically handle PDFs, saving time on implementing your own pipeline if you don't already have one.

    Overall, you don't need a Vector Database for every scenario, but for knowledge base scenarios it is usually worth considering.
