Assistants and GPT Model change

Mohamed Hussein 380 Reputation points
2024-11-28T06:07:16.9466667+00:00

Good Day,

On Azure OpenAI Assistants, I've exchanged around 200 messages in a thread, and responses have become quite slow, taking more than a minute to arrive.

Also, when I asked which model it is, the assistant responded that it is GPT-3.5 Turbo, even though the assistant is configured with

"model": "gpt-4o-mini",

That said, newly created threads on this assistant report the correct model.

Finally, the aforementioned request to the assistant consumed more than 100K tokens. Is that normal?

The truncation strategy is set to auto.

Attached are the assistant response and the run details.


{
    "object": "list",
    "data": [
        {
            "id": "msg_deS7a1vmjSTE8RClsqClwTZA",
            "object": "thread.message",
            "created_at": 1732772590,
            "assistant_id": "asst_3rY50WWj2cvOXU5jjealB7qy",
            "thread_id": "thread_7j2hL1FVVVLQfi9plDS714to",
            "run_id": "run_O5hnIUgbdwm0B1k4wcETOybk",
            "role": "assistant",
            "content": [
                {
                    "type": "text",
                    "text": {
                        "value": "I am based on OpenAI's GPT-3.5-turbo model. If you have any questions or need assistance with anything specific, feel free to ask!",
                        "annotations": []
                    }
                }
            ],
            "attachments": [],
            "metadata": {}
        }
    ],
    "first_id": "msg_deS7a1vmjSTE8RClsqClwTZA",
    "last_id": "msg_deS7a1vmjSTE8RClsqClwTZA",
    "has_more": true
}


    "content": {
        "id": "run_O5hnIUgbdwm0B1k4wcETOybk",
        "object": "thread.run",
        "created_at": 1732772575,
        "assistant_id": "asst_3rY50WWj2cvOXU5jjealB7qy",
        "thread_id": "thread_7j2hL1FVVVLQfi9plDS714to",
        "status": "completed",
        "started_at": 1732772577,
        "expires_at": null,
        "cancelled_at": null,
        "failed_at": null,
        "completed_at": 1732772593,
        "required_action": null,
        "last_error": null,
        "model": "gpt-4o-mini",
        "instructions": "You are an AI assistant",
        "tools": [
            {
                "type": "code_interpreter"
            },
            {
                "type": "file_search",
                "file_search": {
                    "ranking_options": {
                        "ranker": "default_2024_08_21",
                        "score_threshold": 0
                    }
                }
            }
        ],
        "tool_resources": {},
        "metadata": {},
        "temperature": 1,
        "top_p": 0.94,
        "max_completion_tokens": null,
        "max_prompt_tokens": null,
        "truncation_strategy": {
            "type": "auto",
            "last_messages": null
        },
        "incomplete_details": null,
        "usage": {
            "prompt_tokens": 113403,
            "completion_tokens": 35,
            "total_tokens": 113438
        },
        "response_format": "auto",
        "tool_choice": "auto",
        "parallel_tool_calls": true
    }
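
For reference, the run was created along these lines, with the truncation strategy left at auto. This is a minimal sketch assuming the Python openai SDK with an AzureOpenAI client; the endpoint, key, and API version are placeholders, and the IDs are the ones from the run details above:

import os
from openai import AzureOpenAI

# Placeholder credentials/endpoint; these are assumptions, not values from the run above.
client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-05-01-preview",
)

run = client.beta.threads.runs.create(
    thread_id="thread_7j2hL1FVVVLQfi9plDS714to",
    assistant_id="asst_3rY50WWj2cvOXU5jjealB7qy",
    # Current setting: let the service decide how much thread history to send.
    truncation_strategy={"type": "auto"},
    # A capped alternative that bounds prompt tokens, e.g. only the last 10 messages:
    # truncation_strategy={"type": "last_messages", "last_messages": 10},
)
print(run.id, run.status)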

1 answer

  1. navba-MSFT 26,480 Reputation points Microsoft Employee
    2024-11-28T14:13:57.6033333+00:00

    @Mohamed Hussein Thanks for your reply.

    I see that the issue you are encountering is already documented:

    https://learn.microsoft.com/en-us/azure/ai-services/openai/faq#when-i-ask-gpt-4-which-model-it-s-running--it-tells-me-it-s-running-gpt-3--why-does-this-happen-

    More info below:

    When I ask GPT-4 which model it's running, it tells me it's running GPT-3. Why does this happen?

    Azure OpenAI models (including GPT-4) being unable to correctly identify what model is running is expected behavior.

    Why does this happen?

    Ultimately, the model is performing next token prediction in response to your question. The model doesn't have any native ability to query what model version is currently being run to answer your question. To answer this question, you can always go to Azure AI Foundry > Management > Deployments and consult the model name column to confirm what model is currently associated with a given deployment name.
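
    As an aside, the same deployment-to-model mapping can also be checked programmatically. This is a minimal sketch assuming the azure-identity and azure-mgmt-cognitiveservices packages; the subscription, resource group, and resource names are placeholders:

    from azure.identity import DefaultAzureCredential
    from azure.mgmt.cognitiveservices import CognitiveServicesManagementClient

    # Placeholders: substitute your own subscription, resource group and Azure OpenAI resource.
    subscription_id = "<subscription-id>"
    resource_group = "<resource-group>"
    account_name = "<azure-openai-resource>"

    client = CognitiveServicesManagementClient(DefaultAzureCredential(), subscription_id)

    # Each deployment shows which underlying model (and version) a deployment name maps to.
    for dep in client.deployments.list(resource_group, account_name):
        print(dep.name, "->", dep.properties.model.name, dep.properties.model.version)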


    The questions "What model are you running?" or "What is the latest model from OpenAI?" produce similar quality results to asking the model what the weather will be today. It might return the correct result, but purely by chance. On its own, the model has no real-world information other than what was part of its training data. In the case of GPT-4, as of August 2023 the underlying training data goes only up to September 2021. GPT-4 wasn't released until March 2023, so barring OpenAI releasing a new version with updated training data, or a new version that is fine-tuned to answer those specific questions, it's expected behavior for GPT-4 to respond that GPT-3 is the latest model release from OpenAI.


    Workaround 1:

    If you want to help a GPT-based model answer the question "What model are you running?" accurately, you need to provide that information to the model through techniques like prompt engineering of the model's system message, Retrieval Augmented Generation (RAG), which is the technique used by Azure OpenAI On Your Data where up-to-date information is injected into the system message at query time, or fine-tuning, where you could fine-tune specific versions of the model to answer that question in a certain way based on the model version.


    Workaround 2:

    Since you are using the gpt-4o-mini model, you could try updating the system prompt as below:

    You are a helpful AI assistant with knowledge cutoff of Oct 2023
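
    A minimal sketch of applying that change through the API, assuming the Python openai SDK; the endpoint, key, and API version are placeholders, and the assistant ID is the one from your run details. New runs, including runs on existing threads, should then pick up the updated instructions:

    import os
    from openai import AzureOpenAI

    # Placeholder credentials/endpoint; adjust to your resource.
    client = AzureOpenAI(
        azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
        api_key=os.environ["AZURE_OPENAI_API_KEY"],
        api_version="2024-05-01-preview",
    )

    # Overwrite the assistant's instructions so it can answer the "which model are you" question.
    assistant = client.beta.assistants.update(
        assistant_id="asst_3rY50WWj2cvOXU5jjealB7qy",
        instructions=(
            "You are a helpful AI assistant with a knowledge cutoff of Oct 2023. "
            "You are running on the gpt-4o-mini model."
        ),
    )
    print(assistant.instructions)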


    Hope this answers your question.

    1 person found this answer helpful.
