GPT-4 consumes more tokens compared to GPT-3 and ignores the max token parameter in the API call.

Question

We have upgraded our Azure OpenAI deployment to GPT-4. Previously, we configured Azure Blob Storage to retrieve answers from private documents, which worked well with GPT-3 and continues to work with GPT-4. But in GPT-4, we only need short, concise answers rather than lengthy explanations.

Despite setting the max token count to 1000 and including instructions in both the system message and role information, the API is still returning long answers.

Are there any additional options to control the token usage and ensure more precise and concise responses?

Instruction we provided: Please answer using retrieved documents only. The answer should be in English. Answer in as few words as possible. Your answer should be concise and accurate. If the information you need is not present in the documents, then respond with, 'The requested information is not available in the retrieved documents'.

Screenshot of answer in GPT4: User's image

Screenshot of answer in GPT 3: User's image

Answer

Hi,
maybe you can try to lower the temperature setting. A lower temperature value (e.g., 0.2 or 0.3) will produce more focused and deterministic responses, which can help prevent excessive elaboration.

You can also experiment with top_p, which controls the diversity of the generated text. Lowering it (e.g., 0.5) might help reduce the length by making the model focus more on higher-probability responses.

Regards,
Luc.

Answer

lakshmi Greetings!

Instead of Words, can you try specifying Tokens?

Please answer using retrieved documents only. The answer should be in English and use tokens up to 1000. Your answer should be concise and accurate. If the information you need is not present in the documents, then respond with, 'The requested information is not available in the retrieved documents'.

Do let me know if that help or have any further queries.

Share via

GPT-4 consumes more tokens compared to GPT-3 and ignores the max token parameter in the API call.

2 answers

Your answer