Open AI Text Embedding Dimensions

Robert Bowden 0 Reputation points
2023-03-24T09:56:05.91+00:00

I am using text embeddings for vector search using ElasticSearch's hybrid search (BM25 + KNN). Not looking to use a separate vector database at this time as the hybrid has been working well.

The problem is that Elastic's max dimension size for vector fields is 1024. This works with ADA-001, but I would like to take advantage of the later version(s) - but they are 1536 dimensions.

Is there, or will there be, a way to limit the output dimensions for text embeddings?

In the meantime, will ADA-001 remain available, and accessible via the Azure Open AI API?

Azure OpenAI Service

1 answer

  1. YutongTie-MSFT 46,646 Reputation points
    2023-03-25T22:13:06.4633333+00:00

    Hello @Robert Bowden

    Thanks for reaching out to us with this question. I understand you are asking about the fixed output dimensions of the text embedding models in Azure OpenAI and about the availability of ada-001.

    For your first question: these APIs output fixed-length vectors, and their dimensionality is not adjustable. I am sorry for the inconvenience. I have forwarded this feedback to the product team to see whether there is any potential solution.

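    However, there are some potential workarounds you could consider, although they are not easy to implement. One option is to perform dimensionality reduction on the vectors after the APIs generate them. A technique such as principal component analysis (PCA) can reduce the dimensionality of the vectors while preserving most of their variance. t-SNE is another dimensionality reduction technique, but standard t-SNE is designed for visualization in two or three dimensions and does not give you a projection you can reuse on new query vectors, so PCA is the more practical choice for search. This could allow you to reduce the vectors from 1536 to 1024 dimensions, which would be compatible with ElasticSearch's maximum dimension size for vector fields. Note that the same fitted projection has to be applied to both the indexed documents and every query vector.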
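
    As a rough illustration of the PCA workaround, here is a minimal sketch using NumPy only. The function names `fit_pca` and `project` are hypothetical, the random vectors stand in for real embeddings, and the dimensions are shrunk (32 to 16) so it runs quickly. For the real case you would fit on more than 1024 of your own 1536-dimension embeddings with `n_components=1024`, and you should check retrieval quality afterwards, since some information is lost.

    ```python
    import numpy as np


    def fit_pca(embeddings: np.ndarray, n_components: int):
        """Fit PCA on an (n_samples, n_dims) array; return (mean, components)."""
        mean = embeddings.mean(axis=0)
        centered = embeddings - mean
        # Rows of vt are the principal directions, ordered by decreasing variance.
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        return mean, vt[:n_components]


    def project(vectors: np.ndarray, mean: np.ndarray, components: np.ndarray) -> np.ndarray:
        """Project vectors onto the fitted components and L2-normalise each row."""
        projected = (vectors - mean) @ components.T
        norms = np.linalg.norm(projected, axis=1, keepdims=True)
        return projected / np.clip(norms, 1e-12, None)


    # Stand-in data (assumption): random vectors instead of real embeddings.
    rng = np.random.default_rng(0)
    corpus = rng.normal(size=(200, 32))

    # Fit once on the corpus, then reuse the same projection for queries.
    mean, components = fit_pca(corpus, n_components=16)
    reduced_docs = project(corpus, mean, components)
    reduced_query = project(rng.normal(size=(1, 32)), mean, components)

    print(reduced_docs.shape, reduced_query.shape)
    ```

    The fitted `mean` and `components` need to be stored alongside the index, because every future query embedding has to go through the same `project` step before it is sent to the KNN search.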

    For your second question: there is currently no deprecation plan for ada-001 in Azure OpenAI. It also has the advantage of being faster and cheaper.

    I hope my answer is helpful. Please let me know if you have more questions.

    Regards,

    Yutong
