Understanding Internal ChatGPT Deployment

Jai-6363 245 Reputation points
2023-10-17T01:26:34.8066667+00:00

Does the ChatGPT web application behind the AI assistant have no view of the model it is running on? The completion from the chatbot is "As an AI assistant, I'm based on OpenAI GPT-3." The team deployed an internal ChatGPT for employees and used the GPT-4 model after deploying the web application from the wizard in Azure OpenAI Studio.

Azure OpenAI Service
An Azure service that provides access to OpenAI’s GPT-3 models with enterprise capabilities.

Accepted answer
  1. Ramr-msft 17,826 Reputation points
    2023-10-25T04:56:13.6333333+00:00

    Thanks. Ultimately, the model is performing next-token prediction in response to your question. It has no native ability to query which model version is currently being run to answer you. To answer this question yourself, go to Azure OpenAI Studio > Management > Deployments and consult the model name column to confirm which model is currently associated with a given deployment name.
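    As a rough sketch of what that lookup gives you, here is how an app could map deployment names to their underlying models. The JSON below is an illustrative example of the kind of data the Deployments blade (or `az cognitiveservices account deployment list`) returns; the deployment and model names are assumptions, not real resources.

    ```python
    import json

    # Hypothetical deployment-list output for an Azure OpenAI resource.
    # The deployment name ("chat") is what the app targets; the model
    # ("gpt-4") is what actually serves the requests.
    sample = json.loads("""
    [
      {"name": "chat", "properties": {"model": {"name": "gpt-4", "version": "0613"}}},
      {"name": "embeddings", "properties": {"model": {"name": "text-embedding-ada-002", "version": "2"}}}
    ]
    """)

    def deployment_models(deployments):
        """Map each deployment name to the underlying model it serves."""
        return {d["name"]: d["properties"]["model"]["name"] for d in deployments}

    print(deployment_models(sample))
    # → {'chat': 'gpt-4', 'embeddings': 'text-embedding-ada-002'}
    ```

    The point is that the authoritative model name lives in the deployment metadata, not in anything the model itself can report.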

    The questions, "What model are you running?" or "What is the latest model from OpenAI?" produce similar quality results to asking the model what the weather will be today. It might return the correct result, but purely by chance. On its own, the model has no real-world information other than what was part of its training/training data. In the case of GPT-4, as of August 2023 the underlying training data goes only up to September 2021. GPT-4 was not released until March 2023, so barring OpenAI releasing a new version with updated training data, or a new version that is fine-tuned to answer those specific questions, it's expected behavior for GPT-4 to respond that GPT-3 is the latest model release from OpenAI.

    If you wanted to help a GPT-based model accurately respond to the question "What model are you running?", you would need to provide that information to the model yourself. Techniques include prompt engineering of the model's system message; Retrieval Augmented Generation (RAG), the technique used by Azure OpenAI On Your Data, where up-to-date information is injected into the system message at query time; or fine-tuning, where you could fine-tune specific versions of the model to answer that question in a certain way based on model version.


1 additional answer

  1. Ramr-msft 17,826 Reputation points
    2023-10-18T03:26:37.7466667+00:00

    @Jai-6363 Thanks for the question. You can ground your models with vector databases, web APIs, and function calling, as well as establish metaprompts (system messages) that mitigate this.

    Here is a blog post on the same topic.
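    As an illustration of the function-calling option mentioned above, the app can expose a function the model may call to fetch real deployment information instead of guessing. The function name, tool schema, and hard-coded values below are illustrative assumptions, not part of any official API.

    ```python
    # Tool definition advertised to the model; the name and schema are hypothetical.
    tools = [
        {
            "type": "function",
            "function": {
                "name": "get_deployment_info",
                "description": "Return the Azure OpenAI model behind this deployment.",
                "parameters": {"type": "object", "properties": {}},
            },
        }
    ]

    def get_deployment_info() -> dict:
        """Return the real deployment details.

        In production this would query the management API or app config;
        the values are hard-coded here for the sketch.
        """
        return {"deployment": "chat", "model": "gpt-4", "version": "0613"}

    # When the model emits a tool call for "get_deployment_info", the app runs
    # the function and feeds the result back to the model as a "tool" message.
    tool_result = get_deployment_info()
    ```

    This way the answer to "What model are you running?" comes from live application data rather than the model's training set.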

