How to Configure Cognitive Search Index Vector Fields

Jacob Thomas 55 Reputation points
2023-12-06T19:56:43.6233333+00:00

Hi, I had configured a custom vector embedding pipeline that iterates through documents, extracts text, and generates vectors based on the extracted text. Until recently everything was working fine, but then Microsoft deprecated my indexes and I had to create new ones using the new search service API in order to continue using vector search.

My main issue is that after following the guidelines for migration, and creating the new indexes, I am unable to make a request using these indexes with "queryType" set to vector. When doing so I get the below response:

An error occurred when calling Azure Cognitive Search: Azure Search: Please assign a proper column/field for vector search. It should be of type Collection(Edm.Single)

I double-checked the index definition (included below) and it all looks right, so I think it's possible that I'm making the request improperly. I'm also including the data source properties from the request body. Any assistance would be greatly appreciated.

Request body data source properties:

"dataSources": [
  {
    "type": "AzureCognitiveSearch",
    "parameters": {
      "endpoint": "https://my-search-service.search.windows.net",
      "key": "my-search-service-key",
      "embeddingEndpoint": "https://my-openai-service.openai.azure.com/openai/deployments/text-embedding-ada-deployment-name/embeddings?api-version=2023-06-01-preview",
      "embeddingKey": "my-embedding-key",
      "indexName": "vector-index-name",
      "queryType": "vector",
      "inScope": "false"
    }
  }
]

Index definition:

{
  "@odata.context": "https://my-search-service.search.windows.net/$metadata#indexes/$entity",
  "@odata.etag": "\"XXXXXXXXX\"",
  "name": "vector-index-name",
  "defaultScoringProfile": null,
  "fields": [
    {
      "name": "id",
      "type": "Edm.String",
      "searchable": false,
      "filterable": true,
      "retrievable": true,
      "sortable": true,
      "facetable": true,
      "key": true,
      "indexAnalyzer": null,
      "searchAnalyzer": null,
      "analyzer": null,
      "normalizer": null,
      "dimensions": null,
      "vectorSearchProfile": null,
      "synonymMaps": []
    },
    {
      "name": "title",
      "type": "Edm.String",
      "searchable": true,
      "filterable": true,
      "retrievable": true,
      "sortable": true,
      "facetable": false,
      "key": false,
      "indexAnalyzer": null,
      "searchAnalyzer": null,
      "analyzer": null,
      "normalizer": null,
      "dimensions": null,
      "vectorSearchProfile": null,
      "synonymMaps": []
    },
    {
      "name": "content",
      "type": "Edm.String",
      "searchable": true,
      "filterable": true,
      "retrievable": true,
      "sortable": false,
      "facetable": false,
      "key": false,
      "indexAnalyzer": null,
      "searchAnalyzer": null,
      "analyzer": null,
      "normalizer": null,
      "dimensions": null,
      "vectorSearchProfile": null,
      "synonymMaps": []
    },
    {
      "name": "titleVector",
      "type": "Collection(Edm.Single)",
      "searchable": true,
      "filterable": false,
      "retrievable": true,
      "sortable": false,
      "facetable": false,
      "key": false,
      "indexAnalyzer": null,
      "searchAnalyzer": null,
      "analyzer": null,
      "normalizer": null,
      "dimensions": 1536,
      "vectorSearchProfile": "vector-config",
      "synonymMaps": []
    },
    {
      "name": "contentVector",
      "type": "Collection(Edm.Single)",
      "searchable": true,
      "filterable": false,
      "retrievable": true,
      "sortable": false,
      "facetable": false,
      "key": false,
      "indexAnalyzer": null,
      "searchAnalyzer": null,
      "analyzer": null,
      "normalizer": null,
      "dimensions": 1536,
      "vectorSearchProfile": "vector-config",
      "synonymMaps": []
    }
  ],
  "scoringProfiles": [],
  "corsOptions": null,
  "suggesters": [],
  "analyzers": [],
  "normalizers": [],
  "tokenizers": [],
  "tokenFilters": [],
  "charFilters": [],
  "encryptionKey": null,
  "similarity": {
    "@odata.type": "#Microsoft.Azure.Search.BM25Similarity",
    "k1": null,
    "b": null
  },
  "semantic": null,
  "vectorSearch": {
    "algorithms": [
      {
        "name": "hnsw-vector-config",
        "kind": "hnsw",
        "hnswParameters": {
          "metric": "cosine",
          "m": 4,
          "efConstruction": 400,
          "efSearch": 500
        },
        "exhaustiveKnnParameters": null
      }
    ],
    "profiles": [
      {
        "name": "vector-config",
        "algorithm": "hnsw-vector-config",
        "vectorizer": null
      }
    ],
    "vectorizers": []
  }
}

Accepted answer
  1. brtrach-MSFT 15,531 Reputation points Microsoft Employee
    2023-12-12T04:47:06.51+00:00

    @Jacob Thomas I'm glad that you were able to resolve your issue, and thank you for posting your solution so that others experiencing the same thing can easily reference it! Since the Microsoft Q&A community has a policy that "The question author cannot accept their own answer. They can only accept answers by others", I'll repost your solution in case you'd like to "Accept" the answer.

    Issue: Your environment was working fine but some of your indexes were deprecated by Microsoft. After this, you were unable to make a request using indexes with "queryType" set to vector. When doing so you get the below response:

    An error occurred when calling Azure Cognitive Search: Azure Search: Please assign a proper column/field for vector search. It should be of type Collection(Edm.Single)

    Solution: "I just figured it out. It seems that the "fieldsMapping" property is required for vector embeddings, at least with how my vectors are configured. If I had to guess, it might have to do with the fact that I'm using two vector fields, but according to the documentation the property is only supposed to be required for Azure Cosmos DB/MongoDB. Below is the property format I used to get the request working."

    "fieldsMapping": {
        "vectorFields": [
            "titleVector",
            "contentVector"
        ]
    }

    We are always here to assist you. Please reach out if you require further assistance in the future. We would appreciate your consideration in accepting this post as the solution so this can be marked as resolved. Thank you for your understanding.


2 additional answers

  1. brtrach-MSFT 15,531 Reputation points Microsoft Employee
    2023-12-08T05:46:28.5366667+00:00

    @Jacob Thomas Based on the index definition you provided, it looks like you have already defined the "titleVector" and "contentVector" fields as Collection(Edm.Single) type, which is correct. However, I noticed that you have set the "vectorSearchProfile" property of these fields to "vector-config".

    Have you defined the "vector-config" vector search profile in your index? If not, you need to define it first before you can use it as the value of the "vectorSearchProfile" property.

    Here is an example of how to define a vector search profile in an index:

    "vectorSearch": {
        "algorithms": [
            {
                "name": "my-algorithm",
                "kind": "hnsw",
                "hnswParameters": {
                    "metric": "cosine",
                    "m": 4,
                    "efConstruction": 400,
                    "efSearch": 500
                },
                "exhaustiveKnnParameters": null
            }
        ],
        "profiles": [
            {
                "name": "my-profile",
                "algorithm": "my-algorithm",
                "vectorizer": null
            }
        ],
        "vectorizers": []
    }

    You can replace "my-algorithm" and "my-profile" with your own names and adjust the algorithm parameters. The "kind" must be a supported algorithm, such as "hnsw" or "exhaustiveKnn", paired with the matching parameters object. Once you have defined the vector search profile, you can use its name as the value of the "vectorSearchProfile" property of your vector fields.
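The check described in this answer can also be done programmatically. Below is a minimal Python sketch, assuming the index definition has already been loaded as a dictionary (for example from the JSON in the question). The function name `check_vector_fields` and the `example_index` data are hypothetical and only illustrate the idea; they are not part of any Azure SDK.

```python
def check_vector_fields(index_definition):
    """Return a list of problems found in the vector field configuration."""
    vector_search = index_definition.get("vectorSearch") or {}
    # Names of the algorithms and profiles declared in the index.
    algorithms = {a["name"] for a in vector_search.get("algorithms") or []}
    profiles = {
        p["name"]: p.get("algorithm")
        for p in vector_search.get("profiles") or []
    }

    problems = []
    for field in index_definition.get("fields", []):
        # Only vector fields are of interest here.
        if field.get("type") != "Collection(Edm.Single)":
            continue
        name = field["name"]
        if not field.get("dimensions"):
            problems.append(f"{name}: 'dimensions' is not set")
        profile = field.get("vectorSearchProfile")
        if profile not in profiles:
            problems.append(f"{name}: profile {profile!r} is not defined")
        elif profiles[profile] not in algorithms:
            problems.append(
                f"{name}: profile {profile!r} references an undefined algorithm"
            )
    return problems


# Trimmed-down version of the index from the question (illustrative only).
example_index = {
    "name": "vector-index-name",
    "fields": [
        {"name": "id", "type": "Edm.String"},
        {
            "name": "titleVector",
            "type": "Collection(Edm.Single)",
            "dimensions": 1536,
            "vectorSearchProfile": "vector-config",
        },
        {
            "name": "contentVector",
            "type": "Collection(Edm.Single)",
            "dimensions": 1536,
            "vectorSearchProfile": "vector-config",
        },
    ],
    "vectorSearch": {
        "algorithms": [{"name": "hnsw-vector-config", "kind": "hnsw"}],
        "profiles": [
            {"name": "vector-config", "algorithm": "hnsw-vector-config"}
        ],
    },
}

print(check_vector_fields(example_index))
```

An empty list means every vector field has dimensions set and points at a profile whose algorithm exists, which is the case for the index posted in the question.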


  2. Jacob Thomas 55 Reputation points
    2023-12-11T15:08:14.58+00:00

    I just figured it out. It seems that the "fieldsMapping" property is required for vector embeddings, at least with how my vectors are configured. If I had to guess, it might have to do with the fact that I'm using two vector fields, but according to the documentation the property is only supposed to be required for Azure Cosmos DB/MongoDB. Below is the property format I used to get the request working.

    "fieldsMapping": {
        "vectorFields": [
            "titleVector",
            "contentVector"
        ]
    }
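
For anyone assembling the request in code, here is a minimal Python sketch of the fix, assuming the data source shape from the question and the "fieldsMapping" format from this answer. The helper `build_data_source` is hypothetical, the endpoint and key values are the placeholders from the question, and "inScope" is kept as the string used there. The sketch collects every field of type Collection(Edm.Single) from the index definition and lists it under "vectorFields".

```python
import json


def build_data_source(index_definition, endpoint, key,
                      embedding_endpoint, embedding_key):
    """Build one "dataSources" entry with vector fields mapped explicitly."""
    # Collect the names of all vector fields declared in the index.
    vector_fields = [
        field["name"]
        for field in index_definition["fields"]
        if field.get("type") == "Collection(Edm.Single)"
    ]
    return {
        "type": "AzureCognitiveSearch",
        "parameters": {
            "endpoint": endpoint,
            "key": key,
            "embeddingEndpoint": embedding_endpoint,
            "embeddingKey": embedding_key,
            "indexName": index_definition["name"],
            "queryType": "vector",
            "inScope": "false",  # kept as in the question
            # The property that resolved the error in this thread.
            "fieldsMapping": {"vectorFields": vector_fields},
        },
    }


# Trimmed-down index definition (illustrative only).
index_definition = {
    "name": "vector-index-name",
    "fields": [
        {"name": "id", "type": "Edm.String"},
        {"name": "title", "type": "Edm.String"},
        {"name": "titleVector", "type": "Collection(Edm.Single)"},
        {"name": "contentVector", "type": "Collection(Edm.Single)"},
    ],
}

data_source = build_data_source(
    index_definition,
    endpoint="https://my-search-service.search.windows.net",
    key="my-search-service-key",
    embedding_endpoint="https://my-openai-service.openai.azure.com/openai/deployments/text-embedding-ada-deployment-name/embeddings?api-version=2023-06-01-preview",
    embedding_key="my-embedding-key",
)
print(json.dumps({"dataSources": [data_source]}, indent=2))
```

Deriving the list from the index definition keeps the request in sync if vector fields are added or renamed later.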