Why is the response unstable when I use text-embedding-ada-002?

Question

zcc 20 Reputation points

I am using the text-embedding-ada-002 model provided by Azure OpenAI to generate embeddings, but its response time is unstable, with about a 40% chance of a call being very slow. A call usually finishes within 2 seconds, but sometimes it takes 20 seconds or even longer. What could be the reason for this? Here is my code:

```python
response = client.embeddings.create(
    input=input_text,
    model=vector_engine,
)
```
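To see whether the delay sits in individual requests, it can help to time each call separately. The following is a minimal sketch, not the measurement code actually used in this thread: `time_calls` and `fake_embed` are hypothetical names, and `fake_embed` stands in for the real request so the example runs without network access.

```python
import time
from typing import Callable, Iterable, List


def time_calls(embed: Callable[[str], object], texts: Iterable[str]) -> List[float]:
    """Return the wall-clock duration in seconds of embed(text) for each text."""
    durations = []
    for text in texts:
        start = time.perf_counter()  # monotonic clock suited to short intervals
        embed(text)
        durations.append(time.perf_counter() - start)
    return durations


def fake_embed(text: str) -> list:
    """Hypothetical stand-in for the real embeddings request."""
    time.sleep(0.01)
    return [0.0] * 4


durations = time_calls(fake_embed, ["first text", "second text", "third text"])
print([round(d, 3) for d in durations])
```

With the real client, `embed` could be something like `lambda text: client.embeddings.create(input=text, model=vector_engine)`, reusing the names from the snippet above.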

AshokPeddakotla-MSFT 35,971 Reputation points Moderator

2024-02-06T10:37:59.68+00:00

@zcc Greetings!

Can you confirm whether you are using text-embedding-ada-002 (Version 2)?

There could be several reasons for the unstable response time of the text-embedding-ada-002 model. One possible reason could be the size of your input text.

The maximum input length depends on the model version: it is 8,191 tokens for text-embedding-ada-002 (Version 2) and about 2,048 tokens for Version 1. Input that exceeds the limit is rejected with an error, and long inputs close to the limit can take longer to embed.

You can try optimizing your input text to ensure it does not exceed the maximum length limit. Did you try using a different model or engine to see if it improves the response time?
zcc 20 Reputation points

2024-02-06T11:00:12.8566667+00:00

@AshokPeddakotla-MSFT I have confirmed that I am using text-embedding-ada-002 (Version 2). I ran multiple tests on some test texts, and the delays were unpredictable. My inputs did not reach 2048 tokens. Another point is that I made continuous API calls within a minute, but when I checked the limits of the model I deployed, I was far from reaching them.

zcc 20 Reputation points

2024-02-06T11:14:09.3633333+00:00

This is the data from my three tests; they come from the same set of text sequences.

# first
result_1 = [0.7453436851501465, 0.5204393863677979, 0.16672658920288086, 0.17472267150878906, 0.17267441749572754, 0.18297243118286133, 0.1684587001800537, 15.79664945602417, 17.768128156661987, 0.17497968673706055, 0.17261981964111328, 0.17259693145751953]

# second
result_2 = [0.7472083568572998, 0.17005181312561035, 20.200116395950317, 0.17545866966247559, 0.17123007774353027, 19.909905433654785, 0.1689450740814209, 0.16972017288208008, 20.07014489173889, 0.17378568649291992, 0.1733248233795166, 0.2687239646911621]

# third
result_3 = [0.7977395057678223, 0.1676774024963379, 0.16760969161987305, 18.680288791656494, 0.17368364334106445, 0.17961454391479492, 0.16588783264160156, 0.1701958179473877, 0.17095065116882324, 0.1715998649597168, 0.18040966987609863, 0.17617559432983398]

Note that each list contains the response times of the individual calls, in seconds. As you can see, the response time for the same text can vary significantly.
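The three runs can be summarized to show how often the slow calls occur. This is only a sketch: the values are the ones posted above, rounded to three decimals, and the 5-second cutoff for a "slow" call is an assumption chosen for illustration, not something stated in the thread.

```python
from statistics import median

# Response times in seconds, copied from the three runs above (rounded to 3 decimals).
runs = {
    "result_1": [0.745, 0.520, 0.167, 0.175, 0.173, 0.183,
                 0.168, 15.797, 17.768, 0.175, 0.173, 0.173],
    "result_2": [0.747, 0.170, 20.200, 0.175, 0.171, 19.910,
                 0.169, 0.170, 20.070, 0.174, 0.173, 0.269],
    "result_3": [0.798, 0.168, 0.168, 18.680, 0.174, 0.180,
                 0.166, 0.170, 0.171, 0.172, 0.180, 0.176],
}

SLOW_THRESHOLD_S = 5.0  # assumed cutoff for a "slow" call, not from the thread


def summarize(times, threshold=SLOW_THRESHOLD_S):
    """Return the median latency and the number of calls slower than threshold."""
    slow = [t for t in times if t > threshold]
    return {"median": median(times), "slow_calls": len(slow), "total_calls": len(times)}


summary = {name: summarize(times) for name, times in runs.items()}
for name, stats in summary.items():
    print(name, stats)
```

With this cutoff, the runs contain 2, 3, and 1 slow calls out of 12 each, while the median of every run stays below 0.2 seconds.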

zcc 20 Reputation points

2024-02-07T03:42:59.2266667+00:00

Can someone answer my question?
AshokPeddakotla-MSFT 35,971 Reputation points Moderator

2024-02-08T05:08:46.5+00:00

@zcc Sorry for the delayed response. Finding the root cause would require further investigation. For a deeper investigation and immediate assistance with this issue, please file a support request at https://aka.ms/azsupt.

Answer 1

45711212 0

Is there a solution to this problem? I'm facing the same issue. The embedded texts are only a few words long. On average a call takes 600 ms, but sometimes it takes up to 8 s for no apparent reason. We only make about 100 calls per day, so we are far from the rate limit.
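No root cause was identified in this thread, so the following is a client-side workaround sketch rather than a confirmed fix: since the slow calls appear sporadic, one option is to give up on an attempt that exceeds a deadline and retry it. `call_with_deadline` and `flaky_embed` are hypothetical names, and `flaky_embed` simulates a request whose first attempt is slow. As far as I know, the `openai` Python client also accepts `timeout` and `max_retries` settings, which may be a simpler route.

```python
import concurrent.futures
import time


def call_with_deadline(fn, deadline_s=2.0, attempts=3):
    """Run fn() up to `attempts` times, abandoning any attempt slower than deadline_s."""
    for attempt in range(1, attempts + 1):
        pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
        future = pool.submit(fn)
        try:
            return future.result(timeout=deadline_s)
        except concurrent.futures.TimeoutError:
            print(f"attempt {attempt} exceeded {deadline_s}s, retrying")
        finally:
            # Do not wait for the abandoned call; it keeps running in its thread.
            pool.shutdown(wait=False)
    raise TimeoutError(f"no attempt finished within {deadline_s}s")


calls = {"count": 0}


def flaky_embed():
    """Hypothetical stand-in: the first call is slow, later calls are fast."""
    calls["count"] += 1
    if calls["count"] == 1:
        time.sleep(0.5)
    return [0.1, 0.2, 0.3]


result = call_with_deadline(flaky_embed, deadline_s=0.1)
print(result)
```

An abandoned attempt is not cancelled on the server side, so each retry is an additional request and counts toward the deployment's rate limit.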
