Vector Index Lookup

Important

The Vector, Vector DB, and Faiss Index Lookup tools are deprecated and will be retired soon. Migrate to the new Index Lookup tool (preview).

Vector Index Lookup is a tool tailored for querying within an Azure Machine Learning vector index. It empowers users to extract contextually relevant information from a domain knowledge base.

Prerequisites

  • Follow the instructions in the sample flow "Bring your own Data QnA" to prepare a vector index as an input.

  • Depending on where you store your vector index, the identity used by the prompt flow runtime must be granted the appropriate role. See the steps to assign an Azure role.

    Location                                          Role
    Workspace datastores or workspace default blob    AzureML Data Scientist
    Other blobs                                       Storage Blob Data Reader

Note

If you encounter the error "'embeddingstore.tool.vector_index_lookup.search' is not found" after legacy tools switch to code-first mode, see the troubleshooting guidance.

Inputs

The tool accepts the following inputs:

Name     Type                  Required  Description
path     string                Yes       Blob, Azure Machine Learning asset, or Azure Machine Learning datastore URL for the vector index. See the URL formats that follow.
query    string, list[float]   Yes       Text to be queried, or the target vector to be queried, which the LLM tool can generate.
top_k    integer               No        Count of top-scored entities to return. Default value is 3.

URL formats for path:

  • Blob URL format:
    https://<account_name>.blob.core.windows.net/<container_name>/<path_and_folder_name>

  • Azure Machine Learning asset URL format:
    azureml://subscriptions/<your_subscription>/resourcegroups/<your_resource_group>/workspaces/<your_workspace>/data/<asset_name and optional version/label>

  • Azure Machine Learning datastore URL format:
    azureml://subscriptions/<your_subscription>/resourcegroups/<your_resource_group>/workspaces/<your_workspace>/datastores/<your_datastore>/paths/<data_path>
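
As a rough illustration of the inputs above, the following Python sketch checks a path against the three documented URL formats and assembles the inputs as a plain dictionary. The helper names classify_index_path and build_lookup_inputs are hypothetical and are not part of the tool or any SDK, and the rule that top_k must be at least 1 is an assumption rather than documented behavior.

```python
import re

# Patterns mirror the three documented URL formats; each placeholder
# segment (account, container, subscription, ...) accepts any non-empty name.
_WORKSPACE = r"azureml://subscriptions/[^/]+/resourcegroups/[^/]+/workspaces/[^/]+"
_PATH_PATTERNS = {
    "blob": re.compile(r"https://[^/]+\.blob\.core\.windows\.net/[^/]+/.+"),
    "asset": re.compile(_WORKSPACE + r"/data/.+"),
    "datastore": re.compile(_WORKSPACE + r"/datastores/[^/]+/paths/.+"),
}


def classify_index_path(path):
    """Hypothetical helper: return 'blob', 'asset', or 'datastore'."""
    for kind, pattern in _PATH_PATTERNS.items():
        if pattern.fullmatch(path):
            return kind
    raise ValueError(f"Unrecognized vector index path: {path!r}")


def build_lookup_inputs(path, query, top_k=3):
    """Hypothetical helper: validate the three inputs and return them as a dict."""
    classify_index_path(path)  # raises ValueError on an unknown URL shape
    is_text = isinstance(query, str)
    is_vector = (
        isinstance(query, list)
        and len(query) > 0
        and all(isinstance(x, float) for x in query)
    )
    if not (is_text or is_vector):
        raise TypeError("query must be a string or a list of floats")
    # Assumption: top_k must be a positive integer (the default of 3 is documented).
    if isinstance(top_k, bool) or not isinstance(top_k, int) or top_k < 1:
        raise ValueError("top_k must be a positive integer")
    return {"path": path, "query": query, "top_k": top_k}


inputs = build_lookup_inputs(
    "https://myaccount.blob.core.windows.net/mycontainer/indexes/faq",
    "How do I reset my password?",
)
print(inputs["top_k"])  # 3
```

Checking the URL shape up front surfaces a malformed path before the flow runs. It does not confirm that the index exists or that the runtime identity can read it.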

Outputs

The tool returns a JSON response that contains the top-k scored entities. Each entity follows the generic vector search result schema provided by the promptflow-vectordb SDK. For the Vector Index Search, the following fields are populated, and an example response follows the field descriptions:

Field Name        Type     Description
text              string   Text of the entity.
score             float    Depends on the index type defined in the vector index. If the index type is Faiss, the score is L2 distance. If the index type is Azure AI Search, the score is cosine similarity.
metadata          dict     Customized key-value pairs provided by the user when creating the index.
original_entity   dict     Depends on the index type defined in the vector index. The original response JSON from the search REST API.

[
  {
    "text": "sample text #1",
    "vector": null,
    "score": 0.0,
    "original_entity": null,
    "metadata": {
      "link": "http://sample_link_1",
      "title": "title1"
    }
  },
  {
    "text": "sample text #2",
    "vector": null,
    "score": 0.07032840698957443,
    "original_entity": null,
    "metadata": {
      "link": "http://sample_link_2",
      "title": "title2"
    }
  },
  {
    "text": "sample text #0",
    "vector": null,
    "score": 0.08912381529808044,
    "original_entity": null,
    "metadata": {
      "link": "http://sample_link_0",
      "title": "title0"
    }
  }
]
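
Because the meaning of score depends on the index type, downstream code may need to order entities differently: a smaller L2 distance means a closer match, while a larger cosine similarity means a closer match. The following Python sketch parses an abridged copy of the sample response and ranks it either way. The index type labels "faiss" and "azure_ai_search" and the helper names rank_entities and format_context are hypothetical, chosen only for this illustration.

```python
import json

# Hypothetical labels for the two index types described in the field table.
# True means a smaller score is a better match.
LOWER_IS_BETTER = {"faiss": True, "azure_ai_search": False}


def rank_entities(entities, index_type):
    """Return the entities ordered from best match to worst."""
    lower_is_better = LOWER_IS_BETTER[index_type]
    return sorted(entities, key=lambda e: e["score"], reverse=not lower_is_better)


def format_context(entities):
    """Join entity text and metadata into one string, for example for a prompt."""
    lines = []
    for entity in entities:
        metadata = entity.get("metadata") or {}
        title = metadata.get("title", "untitled")
        link = metadata.get("link", "")
        lines.append(f"{title}: {entity['text']} ({link})")
    return "\n".join(lines)


# Abridged copy of the sample response shown above.
response_json = """
[
  {"text": "sample text #1", "vector": null, "score": 0.0,
   "original_entity": null,
   "metadata": {"link": "http://sample_link_1", "title": "title1"}},
  {"text": "sample text #2", "vector": null, "score": 0.07032840698957443,
   "original_entity": null,
   "metadata": {"link": "http://sample_link_2", "title": "title2"}},
  {"text": "sample text #0", "vector": null, "score": 0.08912381529808044,
   "original_entity": null,
   "metadata": {"link": "http://sample_link_0", "title": "title0"}}
]
"""
entities = json.loads(response_json)

print(rank_entities(entities, "faiss")[0]["text"])  # sample text #1
print(format_context(rank_entities(entities, "faiss")[:1]))
```

The sample scores rise from 0.0 upward, which is consistent with an L2 distance ordering in which the first entity is the closest match.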

Deploying to an online endpoint

When you deploy a flow containing the vector index lookup tool to an online endpoint, there's an extra step to set up permissions. During deployment through the web pages, you choose between the System-assigned and User-assigned identity types. Either way, use the Azure portal (or similar functionality) to add the "AzureML Data Scientist" role of Azure Machine Learning studio to the identity assigned to the endpoint.