GPT-4 consumes more tokens compared to GPT-3 and ignores the max token parameter in the API call.

lakshmi 746 Reputation points
2024-09-21T05:17:08.1733333+00:00

We have upgraded our Azure OpenAI deployment to GPT-4. Previously, we configured Azure Blob Storage to retrieve answers from private documents, which worked well with GPT-3 and continues to work with GPT-4. But in GPT-4, we only need short, concise answers rather than lengthy explanations.

Despite setting the max token count to 1000 and including instructions in both the system message and role information, the API is still returning long answers.

Are there any additional options to control the token usage and ensure more precise and concise responses?

Instruction we provided: Please answer using retrieved documents only. The answer should be in English. Answer in as few words as possible. Your answer should be concise and accurate. If the information you need is not present in the documents, then respond with, 'The requested information is not available in the retrieved documents'.

Screenshot of answer in GPT4:User's image

Screenshot of answer in GPT 3: User's image

Azure OpenAI Service
Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.
3,132 questions
{count} votes

2 answers

Sort by: Most helpful
  1. Luc MAURETTE 5 Reputation points
    2024-10-04T11:57:50.1666667+00:00

    Hi,
    maybe you can try to lower the temperature setting. A lower temperature value (e.g., 0.2 or 0.3) will produce more focused and deterministic responses, which can help prevent excessive elaboration.

    You can also experiment with top_p, which controls the diversity of the generated text. Lowering it (e.g., 0.5) might help reduce the length by making the model focus more on higher-probability responses.

    Regards,
    Luc.

    1 person found this answer helpful.
    0 comments No comments

  2. AshokPeddakotla-MSFT 34,111 Reputation points
    2024-09-23T02:35:08.9566667+00:00

    lakshmi Greetings!

    Instead of Words, can you try specifying Tokens?

    Please answer using retrieved documents only. The answer should be in English and use tokens up to 1000. Your answer should be concise and accurate. If the information you need is not present in the documents, then respond with, 'The requested information is not available in the retrieved documents'.

    Do let me know if that help or have any further queries.


Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.