Inconsistent Response Quality Between Chat Interface and API Using the Same Instructions

Question

Mohammad Norouzifard 20 Reputation points

Hi Support Team,

I hope you are doing well.

I am reaching out to ask why we are experiencing differences in response quality when using the same instructions in the chat interface versus the API (via assistant ID). Below is an example to illustrate the issue:

Example using chat:

Q: When did the incident happen? A: Two weeks ago.
Q: Can you please specify the exact time and date of the incident, including the time, day, month, and year? (The response is always consistent.) A: 9:00 AM on 20 Oct 2024.

Example using API:

Q: When did the incident happen? A: Two weeks ago.
Q: [No relevant follow-up question is generated; unrelated content is returned.]

Could you please help clarify why there’s a difference in behaviour between these two approaches?

Your assistance in resolving this would be greatly appreciated. Thanks

Accepted answer

Answer 1

Hi Mohammad Norouzifard,

Welcome to the Microsoft Q&A Platform. Thank you for reaching out, and I hope you are doing well.

I’d like to clarify a few key points that might help explain the behavior you’re experiencing.

Response Consistency: In the chat interface, responses are often more consistent due to its design to handle conversational context effectively. The chat model is optimized for maintaining context over multiple turns of conversation, which allows it to generate relevant follow-up questions based on prior user inputs.
API Limitations: In contrast, when using the API, you may notice variability in responses. One notable limitation is that the Assistants API does not provide model controls for parameters such as top_p and temperature. These parameters significantly influence the creativity and variability of the responses. Without the ability to fine-tune these settings, the API may return responses that seem unrelated or less coherent compared to those generated in the chat interface.
Model Training and Optimization: The models used in the chat interface are specifically trained and optimized for interactive dialogue, while API responses may lack some of the conversational nuances that the chat interface can provide. This difference in optimization can lead to variations in response quality.

Hope this helps. Do let us know if you have any further queries.

If this answers your query, please click Accept Answer and select Yes for "Was this answer helpful".

Answer 2

Mohammad Norouzifard 20 Reputation points

How can I fix these differences between the Chat Playground and API performance?

Will the API take into account the instructions that I have added to the assistant, or not?

I am also happy to have a meeting to share the details for my issue.

Pavankumar Purilla 8,570 Reputation points Microsoft External Staff Moderator

2024-10-24T10:13:06.2133333+00:00
Hi Mohammad Norouzifard,
Hope you are doing well.
Here’s an explanation for the differences you’re noticing between the chat interface and the API, along with steps to improve API performance:

Chat Interface:

The chat interface automatically tracks the conversation context across multiple turns. This allows it to remember the dialogue history and nuances, resulting in consistent and relevant responses.

It handles session management automatically, ensuring the model understands the sequence and context of interactions.

API Usage:

When using the API, you need to manage the conversation context explicitly. If the API call doesn’t include the full conversation history or if the context is not properly passed, the assistant might not generate appropriate follow-up responses.

Unlike the chat interface, with the API you must handle session management manually. This means maintaining and passing the session ID or conversation history with each request to ensure the assistant retains context.

Improving API Response Quality:

Providing context and history within the messages array can greatly improve response quality. By maintaining the full conversation history, the model will have a clearer understanding of the context, leading to more accurate and relevant responses. Just make sure the structure is clear, like in the example below:

{
  "model": "gpt-3.5-turbo",
  "messages": [
    {"role": "system", "content": "You are an AI assistant."},
    {"role": "user", "content": "When did the incident happen?"},
    {"role": "assistant", "content": "Two weeks ago."},
    {"role": "user", "content": "Can you please specify the exact time and date of the incident, including the time, day, month, and year?"}
  ]
}
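As a rough illustration of keeping the history on the client side, here is a minimal Python sketch. The Conversation class and its method names are hypothetical helpers made up for this example, not part of any SDK; the sketch only builds the request body shown above and leaves the actual API call to whichever client you use.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Conversation:
    """Hypothetical helper that stores the running history for one session."""
    system_prompt: str
    model: str = "gpt-3.5-turbo"  # model name taken from the example above
    messages: list = field(default_factory=list)

    def __post_init__(self):
        # The system instructions go first so every request carries them.
        self.messages.append({"role": "system", "content": self.system_prompt})

    def add_user(self, text: str) -> None:
        self.messages.append({"role": "user", "content": text})

    def add_assistant(self, text: str) -> None:
        # Store each model reply too, otherwise the next turn loses context.
        self.messages.append({"role": "assistant", "content": text})

    def build_payload(self) -> dict:
        # Send the whole history on every call, not just the latest message.
        return {"model": self.model, "messages": list(self.messages)}


conv = Conversation(system_prompt="You are an AI assistant.")
conv.add_user("When did the incident happen?")
conv.add_assistant("Two weeks ago.")
conv.add_user("Can you please specify the exact time and date of the incident?")
payload = conv.build_payload()
print(json.dumps(payload, indent=2))
```

After each API response, you would call add_assistant with the returned text before adding the next user message, so the following request again contains every earlier turn.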

Key Tips:

Maintain Sequential Order: Ensure the messages are in the correct sequence. The model depends on this order to understand the conversation flow.

Provide Sufficient Context: Include all relevant history to give the model the information it needs to respond accurately.
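One way to catch ordering mistakes before sending a request is a small sanity check over the messages list. The sketch below is only an illustration: check_message_order is a hypothetical helper, and the rules it applies (system message first, no two consecutive messages from the same role, last message from the user) are assumptions chosen for this example rather than requirements enforced by the API.

```python
def check_message_order(messages):
    """Return a list of problems found in a chat history (empty means none).

    Hypothetical helper: the rules are illustrative assumptions only.
    """
    if not messages:
        return ["history is empty"]

    problems = []
    if messages[0]["role"] != "system":
        problems.append("first message is not the system instruction")

    # Look only at the user/assistant turns, in the order they will be sent.
    turns = [m["role"] for m in messages if m["role"] != "system"]
    for i in range(1, len(turns)):
        if turns[i] == turns[i - 1]:
            problems.append(
                f"consecutive '{turns[i]}' messages at turns {i - 1} and {i}"
            )

    if turns and turns[-1] != "user":
        problems.append("last message is not from the user")
    return problems


good_history = [
    {"role": "system", "content": "You are an AI assistant."},
    {"role": "user", "content": "When did the incident happen?"},
    {"role": "assistant", "content": "Two weeks ago."},
    {"role": "user", "content": "Can you give the exact date and time?"},
]
bad_history = [
    {"role": "user", "content": "When did the incident happen?"},
    {"role": "user", "content": "Can you give the exact date and time?"},
]

print(check_message_order(good_history))
print(check_message_order(bad_history))
```

Running a check like this on the history just before each request makes it easier to notice a dropped assistant reply or a missing system instruction, which are common reasons for unrelated follow-up responses.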

I hope this helps.

Could you please take a moment to retake the survey on the above response? Your feedback is greatly appreciated.
Pavankumar Purilla 8,570 Reputation points Microsoft External Staff Moderator

2024-10-25T09:14:49.1066667+00:00

Hi Mohammad Norouzifard,
Greetings of the day!

I would like to follow up with the thread.

Just checking in to see if you had a chance to see my response to your question. Please tell us if it was helpful and feel free to reach out to us if you have any queries.

Could you please take a moment to retake the survey on the above response? Your feedback is greatly appreciated.
Pavankumar Purilla 8,570 Reputation points Microsoft External Staff Moderator

2024-10-28T06:32:13.3666667+00:00

Hi Mohammad Norouzifard,
Greetings of the day!
I would like to follow up with the thread.

Just checking in to see if you had a chance to see my response to your question. Please tell us if it was helpful and feel free to reach out to us if you have any queries.
Mohammad Norouzifard 20 Reputation points

2024-11-01T02:37:42.2166667+00:00

Thank you very much for your comment.
