Azure OpenAI API - Internal Server Error (500) on chat.completions.parse with DeepSeek-R1 and Meta-Llama-3.1-8B-Instruct

Anonymous
2025-02-21T04:43:06.47+00:00

Issue Summary:

I am encountering an Internal Server Error (500) when calling client.beta.chat.completions.parse with the Azure OpenAI API. This occurs for multiple models, including DeepSeek-R1 and Meta-Llama-3.1-8B-Instruct.

openai.InternalServerError: Error code: 500 - {'error': {'code': 'InternalServerError', 'message': 'Backend returned unexpected response. Please contact Microsoft for help.'}}

Code snippets:

from pydantic import BaseModel
from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint="https://<your-resource-name>.services.ai.azure.com/",
    api_key="<your-api-key>",
    api_version="2024-10-21",
)

class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]

completion = client.beta.chat.completions.parse(
    model="DeepSeek-R1",  # Also fails with Meta-Llama-3.1-8B-Instruct
    messages=[
        {"role": "system", "content": "Extract the event information."},
        {"role": "user", "content": "Alice and Bob are going to a science fair on Friday."},
    ],
    response_format=CalendarEvent,
)

event = completion.choices[0].message.parsed
print(event)
print(completion.model_dump_json(indent=2))


1 answer

  1. SriLakshmi C · Microsoft External Staff, Moderator
    2025-02-21T10:50:19.6733333+00:00

    Hello Ayush,

    Greetings and Welcome to Microsoft Q&A!

    I understand that you're encountering an Internal Server Error (500) with the Azure OpenAI API when calling the chat.completions.parse endpoint with the DeepSeek-R1 and Meta-Llama-3.1-8B-Instruct models.

    The error may be due to regional dependencies: certain models are not fully supported or available in every Azure region, and a request routed to such a deployment can fail with a backend error. It is also possible that these models do not support the OpenAI structured-outputs feature (the response_format handling that client.beta.chat.completions.parse relies on), in which case the backend cannot fulfill the request.
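    If structured output is the goal, one workaround is to ask for JSON in the prompt and validate it client-side with Pydantic, since a plain chat.completions.create call does not depend on the parse machinery. Below is a minimal sketch of that idea, reusing your endpoint placeholders and CalendarEvent model; the <think>-stripping step assumes DeepSeek-R1's habit of emitting its reasoning before the final answer:

    from openai import AzureOpenAI
    from pydantic import BaseModel, ValidationError

    client = AzureOpenAI(
        azure_endpoint="https://<your-resource-name>.services.ai.azure.com/",
        api_key="<your-api-key>",
        api_version="2024-10-21",
    )

    class CalendarEvent(BaseModel):
        name: str
        date: str
        participants: list[str]

    # Plain create() avoids the structured-outputs feature behind parse();
    # the JSON contract is stated in the prompt and enforced with Pydantic.
    response = client.chat.completions.create(
        model="DeepSeek-R1",
        messages=[
            {"role": "system", "content": (
                "Extract the event information. Reply with only a JSON object "
                'with keys "name", "date", and "participants" (a list of strings).'
            )},
            {"role": "user", "content": "Alice and Bob are going to a science fair on Friday."},
        ],
    )

    content = response.choices[0].message.content
    # DeepSeek-R1 may wrap its chain of thought in <think>...</think> before
    # the final answer; keep only the part after the closing tag.
    if "</think>" in content:
        content = content.split("</think>", 1)[1]

    try:
        event = CalendarEvent.model_validate_json(content.strip())
        print(event)
    except ValidationError as exc:
        print("Model did not return valid JSON:", exc)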

    You can also try the Azure AI Inference SDK instead of the OpenAI client; the snippet below creates a client:

    import os
    from azure.ai.inference import ChatCompletionsClient
    from azure.core.credentials import AzureKeyCredential

    # The endpoint URL and key for your deployment, read here from
    # environment variables rather than hard-coded in the script.
    client = ChatCompletionsClient(
        endpoint=os.environ["AZURE_INFERENCE_ENDPOINT"],
        credential=AzureKeyCredential(os.environ["AZURE_INFERENCE_CREDENTIAL"]),
    )
    

    For details, kindly refer to Create a client to consume the model.
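    Continuing from the client created above, a minimal sketch of consuming the model through that SDK could look like the following; the model keyword selects the deployment when the endpoint hosts several models, and can be omitted on a single-model serverless endpoint:

    from azure.ai.inference.models import SystemMessage, UserMessage

    # complete() sends a chat request to the endpoint behind the client.
    response = client.complete(
        messages=[
            SystemMessage(content="Extract the event information."),
            UserMessage(content="Alice and Bob are going to a science fair on Friday."),
        ],
        model="DeepSeek-R1",
    )
    print(response.choices[0].message.content)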

    I hope this helps. If you have any further queries, do let us know.

    Thank you!

