Add Vectors / Hybrid Search with Cognitive Search on (NOSQL) Cosmos DB Data

Adrien O'Hana 15 Reputation points
2023-09-21T15:00:10.0966667+00:00

Hello,

I am trying to set up semantic search on an Azure Cosmos DB (NoSQL) database.

A) How can I add vectors to Cosmos DB content? The simplest way for me would be to compute them myself and add a "vector" key to each database item. Is there another way? Could I automate an Azure AI pipeline that builds a vector store alongside the database?

B) How should I handle multiple vectors per item? Ideally, I would like a Cognitive Search indexer that updates an index based on the sentences of each database item. Is that feasible?

Thanks for any help.

Best,

Adrien

Azure AI Search
Azure Cosmos DB

3 answers

  1. Azar 29,520 Reputation points MVP Volunteer Moderator
    2023-09-25T19:54:42.5266667+00:00

    Hi @Adrien O'Hana

    • Compute vectors for your text data and add a "vector" field to each Cosmos DB item. This field stores the vector representation of the item's text. You can automate this step with an Azure AI pipeline.
    • Regarding the Cognitive Search indexer: Azure Cognitive Search was primarily designed for full-text indexing and search. Vector search is a recent addition to it, and the indexer may not natively produce multiple vectors per item. To achieve semantic search with multiple vectors, you might need to implement a custom solution that uses the computed vectors and performs the similarity search.
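The custom similarity search mentioned above could be sketched roughly as follows. This is a minimal in-memory illustration, not an Azure API: the item shape `{ id, vectors }` and the function names are assumptions, and scoring an item by its best-matching vector is only one possible choice.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) throw new Error("Vectors must have the same length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // zero vectors have no direction
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score each item by its best-matching vector and return the top k items.
// Assumed (hypothetical) item shape: { id, vectors: [[...], [...]] }.
function topKItems(queryVector, items, k) {
  return items
    .map((item) => ({
      id: item.id,
      score: Math.max(...item.vectors.map((v) => cosineSimilarity(queryVector, v))),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Toy data: item "a" carries two vectors, item "b" carries one.
const items = [
  { id: "a", vectors: [[1, 0], [0, 1]] },
  { id: "b", vectors: [[-1, 0]] },
];
const results = topKItems([1, 0], items, 1);
console.log(results);
```

A brute-force scan like this is only practical for small collections; for larger ones a dedicated vector index would normally take over the search step.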

    Find below the official documentation for more info

    If you find this answer helpful, kindly accept it. Thanks very much.


  2. Janarthanan S 700 Reputation points
    2023-10-08T06:12:14.3566667+00:00

    Hi @Adrien O'Hana

    To add vectors to your database's collection, you first need to create the embeddings by using your own model, Azure OpenAI Embeddings, or another API (such as Hugging Face on Azure).

    db.exampleCollection.insertMany([
      {name: "Eugenia Lopez", bio: "Eugenia is the CEO of AdventureWorks.", vectorContent: [0.51, 0.12, 0.23]},
      {name: "Cameron Baker", bio: "Cameron Baker CFO of AdventureWorks.", vectorContent: [0.55, 0.89, 0.44]},
      {name: "Jessie Irwin", bio: "Jessie Irwin is the former CEO of AdventureWorks and now the director of the Our Planet initiative.", vectorContent: [0.13, 0.92, 0.85]},
      {name: "Rory Nguyen", bio: "Rory Nguyen is the founder of AdventureWorks and the president of the Our Planet initiative.", vectorContent: [0.91, 0.76, 0.83]},
    ]);
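As a rough sketch of the step that comes before `insertMany`, the snippet below attaches a `vectorContent` array to each document. `fakeEmbed` is a hypothetical placeholder, not a real API: it returns a tiny deterministic vector so the sketch runs without any service, and you would swap it for your own model or embedding service. `withEmbeddings` is likewise an illustrative name.

```javascript
// Hypothetical stand-in for a real embedding call.
// Returns [character count, word count] so the example needs no network access.
async function fakeEmbed(text) {
  const words = text.trim().split(/\s+/).length;
  return [text.length, words];
}

// Return copies of the documents with a `vectorContent` field computed from `bio`.
// `embed` is any async function that maps a string to an array of numbers.
async function withEmbeddings(docs, embed) {
  return Promise.all(
    docs.map(async (doc) => ({ ...doc, vectorContent: await embed(doc.bio) }))
  );
}

const people = [
  { name: "Eugenia Lopez", bio: "Eugenia is the CEO of AdventureWorks." },
  { name: "Cameron Baker", bio: "Cameron Baker CFO of AdventureWorks." },
];

withEmbeddings(people, fakeEmbed).then((docs) => {
  console.log(docs);
  // With a connected shell or driver: db.exampleCollection.insertMany(docs)
});
```

The input documents are copied rather than mutated, so the same list can be re-embedded later with a different model.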
    
    
    

    Querying the index:

    To perform a vector search in Azure Cosmos DB for MongoDB vCore, use the $search aggregation pipeline stage in a MongoDB query. To use the cosmosSearch index, use the new cosmosSearch operator.

    {
      "$search": {
        "cosmosSearch": {
          "vector": <vector_to_search>,
          "path": "<path_to_property>",
          "k": <num_results_to_return>
        },
        "returnStoredSource": true
      }
    },
    {
      "$project": {
        "<custom_name_for_similarity_score>": { "$meta": "searchScore" },
        "document": "$$ROOT"
      }
    }
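For illustration, the template above can be filled in by a small helper. This is only a sketch: `buildVectorSearchPipeline` and the default score field name `similarityScore` are hypothetical, while the stage layout is taken from the template itself.

```javascript
// Build the two-stage aggregation pipeline for a cosmosSearch vector query.
// queryVector: array of numbers, path: name of the vector field, k: number of results.
function buildVectorSearchPipeline(queryVector, path, k, scoreField = "similarityScore") {
  return [
    {
      $search: {
        cosmosSearch: { vector: queryVector, path: path, k: k },
        returnStoredSource: true,
      },
    },
    {
      $project: {
        // Expose the similarity score under a custom field name.
        [scoreField]: { $meta: "searchScore" },
        document: "$$ROOT",
      },
    },
  ];
}

const pipeline = buildVectorSearchPipeline([0.52, 0.28, 0.12], "vectorContent", 2);
console.log(JSON.stringify(pipeline, null, 2));
// With a connected shell or driver: db.exampleCollection.aggregate(pipeline)
```

The query vector must have the same number of dimensions as the vectors stored under `path`.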
    

    You can find the detailed documentation:

    https://learn.microsoft.com/en-us/azure/cosmos-db/mongodb/vcore/vector-search

    Please comment below if you need any assistance on the same. Happy to help!

    Please accept the answer and vote 'Yes' if you found it helpful, to support the community. Thank you for the wonderful and interesting question.

    Regards,

    Janarthanan S


  3. Adrien O'Hana 15 Reputation points
    2023-11-16T11:59:29.0233333+00:00

    Still looking for a solution.

    Cognitive Search (now called AI Search) seems to have a cognitive skills section where I could split by sentence or paragraph and embed with semantic vectors, but I cannot find an example anywhere (the only example notebook was deleted).

    https://learn.microsoft.com/en-gb/samples/azure-samples/azure-search-power-skills/azure-openai-embeddings-generator/
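To make the target concrete, the splitting step could be sketched as below. This is a plain illustration under assumptions, not an existing indexer feature: the sentence splitter is deliberately naive, and the output field names (`id`, `parentId`, `content`) are hypothetical. Each resulting document would then be embedded separately, which gives several vectors per database item.

```javascript
// Naive sentence splitter: breaks after ".", "!" or "?" followed by whitespace.
function splitSentences(text) {
  return text
    .split(/(?<=[.!?])\s+/)
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}

// Turn one database item into one search document per sentence.
// `parentId` keeps the link back to the original item.
function toSentenceDocuments(item, textField) {
  return splitSentences(item[textField]).map((sentence, index) => ({
    id: `${item.id}-${index}`,
    parentId: item.id,
    content: sentence,
  }));
}

const sentenceDocs = toSentenceDocuments(
  { id: "42", bio: "Rory founded the company. He now leads the initiative!" },
  "bio"
);
console.log(sentenceDocs);
```

Abbreviations such as "e.g." would be split incorrectly by this regular expression, so a real pipeline would likely need a more robust splitter.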

    Does anyone have an example of an automated text splitting and embedding pipeline from Cosmos DB to a Cognitive Search index?

