Chat in Azure AI Studio fails when max tokens is set to 5000

matsuo_basho 10 Reputation points
2024-07-25T21:55:22.5366667+00:00

I'm using the Chat playground in Azure AI Studio, running Meta Llama 3.1 70B Instruct on a serverless compute deployment.

When I set the max tokens to 500, it works fine. However, when I set it to 5000, I get an error:
"Request failed with status code 500. Clear the output to start a new dialog."

According to Meta's announcement, the model's context length is 128K, so why is there an issue?


Please let me know if this is some sort of Microsoft quota issue and, if so, whether it's something I have to request an increase for and then wait. Perhaps the problem is that I'm running this on serverless compute.
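For reference, a direct call like the following should exercise the same max_tokens setting outside the playground. This is only a rough sketch using the azure-ai-inference package; the endpoint URL, API key, and prompt are placeholders, not my actual deployment details.

```python
# Rough sketch: call the serverless deployment directly with the same
# max_tokens value used in the playground.
from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import SystemMessage, UserMessage
from azure.core.credentials import AzureKeyCredential

client = ChatCompletionsClient(
    endpoint="https://<my-llama-deployment>.<region>.models.ai.azure.com",  # placeholder
    credential=AzureKeyCredential("<my-api-key>"),  # placeholder
)

response = client.complete(
    messages=[
        SystemMessage(content="You are a helpful assistant."),
        UserMessage(content="Summarize the plot of Oku no Hosomichi."),  # placeholder prompt
    ],
    max_tokens=5000,  # works at 500, fails with HTTP 500 at 5000 in the playground
)

print(response.choices[0].message.content)
```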

Azure AI services