We are switching from the Azure OpenAI text-davinci-003 model to the gpt-35-turbo-instruct (0914) model (Standard tier).
Region: East US
We created a resource in Azure OpenAI Service and deployed the gpt-35-turbo-instruct model.
The Microsoft documentation states that gpt-35-turbo-instruct (0914) accepts a maximum of 4,097 tokens:
https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models#gpt-35-models
However, when we use this model, maxTokens values above 4,096 are accepted (requests with more than 8,000 tokens go through).
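For reference, the documented limit covers prompt tokens plus completion tokens combined, so the largest maxTokens value that should fit shrinks as the prompt grows. A minimal sketch of that arithmetic (the 4,097 figure is taken from the documentation above; the characters-per-token estimate is our own rough heuristic, not the model's real tokenizer):

```java
public class TokenBudget {
    // Context window of gpt-35-turbo-instruct (0914), per the documentation above.
    static final int CONTEXT_LIMIT = 4097;

    // Rough estimate: ~4 characters per token for English text.
    // This heuristic is an assumption, not an SDK call.
    static int estimatePromptTokens(String prompt) {
        return Math.max(1, prompt.length() / 4);
    }

    // Largest completion budget that still fits alongside the prompt.
    static int maxCompletionTokens(int promptTokens) {
        return Math.max(0, CONTEXT_LIMIT - promptTokens);
    }

    public static void main(String[] args) {
        int promptTokens = estimatePromptTokens("what is tree");
        // prints 4094: the remaining budget after the (estimated) 3-token prompt
        System.out.println(maxCompletionTokens(promptTokens));
    }
}
```

By this arithmetic, a maxTokens value above 4,096 should never fit in the context window, which is why we expected the service to reject such requests.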
Sample code used to test this model:
String azureKey = "t#######";
// deployment ID created in the Azure portal (gpt-35-turbo-instruct) under Azure OpenAI Service
String deploymentOrModelId = "deploymentID";
String endpoint = "https://wyz.openai.azure.com/";

OpenAIClient client = new OpenAIClientBuilder()
        .endpoint(endpoint)
        .credential(new AzureKeyCredential(azureKey))
        .buildClient();

List<String> prompt = new ArrayList<>();
prompt.add("what is tree");

CompletionsOptions options = new CompletionsOptions(prompt);
options.setMaxTokens(800);
options.setPresencePenalty(0.0);
options.setFrequencyPenalty(0.0);
options.setTemperature(1.0);
options.setTopP(0.5);

Completions completions = client.getCompletions(deploymentOrModelId, options);
Dependencies used:
<dependency>
    <groupId>com.azure</groupId>
    <artifactId>azure-sdk-bom</artifactId>
    <version>1.2.15</version>
    <type>pom</type>
    <scope>import</scope>
</dependency>
<dependency>
    <groupId>com.azure</groupId>
    <artifactId>azure-ai-openai</artifactId>
    <version>1.0.0-beta.2</version>
</dependency>
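For completeness, a BOM import with scope "import" only takes effect inside a dependencyManagement section; regular dependencies go in the separate dependencies section. A minimal POM sketch under that assumption, with the version numbers copied from above:

```xml
<dependencyManagement>
    <dependencies>
        <!-- BOM: pins versions for com.azure artifacts; must live here, not in <dependencies> -->
        <dependency>
            <groupId>com.azure</groupId>
            <artifactId>azure-sdk-bom</artifactId>
            <version>1.2.15</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>

<dependencies>
    <!-- azure-ai-openai is not covered by the BOM while in beta, so its version is explicit -->
    <dependency>
        <groupId>com.azure</groupId>
        <artifactId>azure-ai-openai</artifactId>
        <version>1.0.0-beta.2</version>
    </dependency>
</dependencies>
```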
Could you please suggest how to resolve this? Is there any setting that needs to be checked in the Azure portal?