Azure OpenAI with Azure AI Search responds with only 5 rows even though there are 80 rows in the result

Manoj Prasad 0 Reputation points
2024-06-17T08:43:52.2866667+00:00

I am using Azure OpenAI to fetch data via Azure AI Search. When spinning up a web app through the Azure portal, the maximum response for a question is 20 rows.

When accessing it via the Web API, the maximum response is 5 rows.

I should be getting 80 rows for this question.

Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.
2,611 questions

2 answers

  1. Adharsh Santhanam 2,630 Reputation points
    2024-06-19T05:29:55.74+00:00

    Hello Manoj Prasad, first run the same query directly in Azure AI Search and confirm that it returns 80 rows. If it does not, the mismatch is in the search index itself rather than in Azure OpenAI: you are expecting 80 rows while AI Search returns a different number, and you need to iron out that difference first. If AI Search does return 80 rows as expected and you still get 20 rows via the Azure portal and 5 rows via the API, the limit is most likely one of the configurable parameters. In the Chat playground, increasing the "Maximum length" parameter (or, when calling the API, increasing "max_tokens") should increase the number of rows returned by the service.
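    The first check above can be sketched against the Azure AI Search REST API. This is a minimal illustration, not the asker's actual setup: the endpoint, index name, query text, and API version are placeholder assumptions. The point is that "top" caps how many documents the search service itself returns, so it must be at least 80 here.

    ```python
    import json

    # Placeholder values -- substitute your own service details.
    SEARCH_ENDPOINT = "https://<your-service>.search.windows.net"
    INDEX_NAME = "<your-index>"
    API_VERSION = "2023-11-01"  # assumed REST API version

    # Body for POST {endpoint}/indexes/{index}/docs/search?api-version=...
    search_request = {
        "search": "<your query text>",
        "top": 100,     # raise above 80 so all matching rows can come back
        "count": True,  # response then includes @odata.count with the total
    }

    url = f"{SEARCH_ENDPOINT}/indexes/{INDEX_NAME}/docs/search?api-version={API_VERSION}"
    print(url)
    print(json.dumps(search_request, indent=2))
    ```

    If @odata.count in the response is already below 80, the index (not Azure OpenAI) is where the rows are being lost.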

    Please do not forget to "Accept the answer" and "up-vote" wherever the information provided helps you; this can be beneficial to other community members.


  2. navba-MSFT 20,530 Reputation points Microsoft Employee
    2024-06-19T05:31:16.98+00:00

    @Manoj Prasad Welcome to the Microsoft Q&A forum, and thank you for posting your query here!


    Before deploying to the web app, could you please test from the Azure OpenAI Studio chat playground and check whether you see the same issue? Do you get the complete response there?


    One way to control the length of a model's response is with the max_tokens parameter. You can try increasing the max_tokens value to avoid truncated responses. The maximum allowed max_tokens varies by model; you can see the limits here.
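    As an illustration of that parameter, here is a hedged sketch of a chat completions request body with max_tokens raised. The value 4000 is an assumption, not a value from the original question; check your model's own limit before picking a number.

    ```python
    import json

    # Body for POST {endpoint}/openai/deployments/{deployment}/chat/completions
    # ?api-version=...  -- max_tokens caps the length of the generated reply,
    # so a low value truncates long answers such as an 80-row listing.
    chat_request = {
        "messages": [
            {"role": "user", "content": "List all matching rows from the index."},
        ],
        "max_tokens": 4000,  # assumed value; raise it if responses are cut off
        "temperature": 0,
    }

    print(json.dumps(chat_request, indent=2))
    ```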



    Hope this helps. If you have any follow-up questions, please let me know. I would be happy to help.
