Unexpected internal error when creating a vector index in Azure Cosmos DB for MongoDB vCore

Junheok Cheon 100 Reputation points
2024-12-20T10:06:34.9466667+00:00

Hello,
For context, here is a brief description of my current setup.

I have stored about 8,000 documents in Azure Cosmos DB for MongoDB vCore.
I am currently on the free tier.

I used the Azure OpenAI Service to create an embedding for each document with the model 'text-embedding-3-large', so the embedding dimension is currently 3072.

I am trying to create a vector index on this embedding field using the MongoDB CLI.

I ran the command provided here to create the vector index.
However, I get the error: MongoServerError: [ActivityId=ef7ba639-09ba-4fdf-ac8e-df42246e1e88] An unexpected internal error has occurred.

When I ran sample code that uses a key that does not exist and sets dimensions to 3, it worked fine. Is this problem due to a limitation of the free tier, or am I doing something wrong?

Thanks.

db.runCommand({
  createIndexes: 'myContainer',
  indexes: [
    {
      name: 'vectorSearchIndexDev',
      key: {
        "embedding": "cosmosSearch"
      },
      cosmosSearchOptions: {
        kind: 'vector-ivf',
        numLists: 3,
        similarity: 'COS',
        dimensions: 3072
      }
    }
  ]
});
Azure OpenAI Service

Accepted answer
  1. romungi-MSFT 48,431 Reputation points Microsoft Employee
    2024-12-20T12:26:52.32+00:00

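    @Junheok Cheon I am not familiar with MongoDB, but looking at your approach, I believe you are trying something similar to what is mentioned here.

    The documentation lists support for indexing vectors of up to 2,000 dimensions and recommends setting numLists to documentCount/1000 for up to 1 million documents. Your embeddings have 3,072 dimensions, which is above that limit, and with about 8,000 documents the recommended numLists would be 8 rather than 3.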

    Features and limitations

    • Supported distance metrics: L2 (Euclidean), inner product, and cosine.
    • Supported indexing methods: IVFFLAT, HNSW, and DiskANN (Preview)
    • Indexing vectors up to 2,000 dimensions in size.
    • Indexing applies to only one vector per path.
    • Only one index can be created per vector path.

    Do you think setting both parameters as recommended would work? I did not see a documented limitation on the free tier, but I have seen free tier limitations with Azure AI Search when it is integrated with Azure OpenAI BYOD. Try the recommendation above and check whether it works. If it still fails, you might want to upgrade to the standard tier or pass the request ID to support to understand what could be causing the internal error. Thanks!
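As a rough illustration of the two recommendations above, the sketch below builds the createIndexes command from a document count and a vector size. The helper name buildVectorIndexCommand, the generated index name, and the 1536 placeholder dimension are assumptions made for this example and are not part of any MongoDB or Azure API. It also assumes the embeddings have been regenerated at a size of 2,000 or less, since the stored 3,072-dimension vectors would be rejected by the check.

```javascript
// Hypothetical helper, not part of any MongoDB or Azure SDK.
// It applies the two rules quoted in this answer:
//   1. indexed vectors may have at most 2,000 dimensions
//   2. numLists = documentCount / 1000 for up to 1 million documents
const MAX_INDEXED_DIMENSIONS = 2000;

function buildVectorIndexCommand(collection, field, dimensions, documentCount) {
  if (dimensions > MAX_INDEXED_DIMENSIONS) {
    throw new RangeError(
      `dimensions=${dimensions} exceeds the ${MAX_INDEXED_DIMENSIONS}-dimension indexing limit`
    );
  }
  if (documentCount > 1000000) {
    throw new RangeError(
      'the documentCount/1000 rule quoted here only covers up to 1 million documents'
    );
  }
  // Round documentCount / 1000 and never go below one list.
  const numLists = Math.max(1, Math.round(documentCount / 1000));
  return {
    createIndexes: collection,
    indexes: [
      {
        name: `${field}VectorIndex`, // illustrative name only
        key: { [field]: 'cosmosSearch' },
        cosmosSearchOptions: {
          kind: 'vector-ivf',
          numLists,
          similarity: 'COS',
          dimensions,
        },
      },
    ],
  };
}

// 8,000 documents with a placeholder vector size of 1536.
const command = buildVectorIndexCommand('myContainer', 'embedding', 1536, 8000);
console.log(JSON.stringify(command.indexes[0].cosmosSearchOptions));

// In mongosh, the object could then be passed to db.runCommand(command).
```

Calling the helper with dimensions set to 3072 throws a RangeError before anything is sent to the server, which makes the limit visible earlier than the generic internal error does.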

    If this answers your query, please click Accept Answer and Yes for "Was this answer helpful". If you have any further questions, do let us know.
