We have upgraded our Azure OpenAI deployment to GPT-4. Previously, we configured Azure Blob Storage to retrieve answers from private documents, which worked well with GPT-3 and continues to work with GPT-4. But in GPT-4, we only need short, concise answers rather than lengthy explanations.
Despite setting the max token count to 1000 and including instructions in both the system message and role information, the API is still returning long answers.
Are there any additional options to control the token usage and ensure more precise and concise responses?
Instruction we provided: Please answer using retrieved documents only. The answer should be in English. Answer in as few words as possible. Your answer should be concise and accurate. If the information you need is not present in the documents, then respond with, 'The requested information is not available in the retrieved documents'.
Screenshot of answer in GPT4:
Screenshot of answer in GPT 3: