Learn how to use JSON mode

JSON mode allows you to set the models response format to return a valid JSON object as part of a chat completion. While generating valid JSON was possible previously, there could be issues with response consistency that would lead to invalid JSON objects being generated.

JSON mode support

JSON mode is only currently supported with the following models:

Supported models

  • gpt-35-turbo (1106)
  • gpt-35-turbo (0125)
  • gpt-4 (1106-Preview)
  • gpt-4 (0125-Preview)

API support

Support for JSON mode was first added in API version 2023-12-01-preview

Example

import os
from openai import AzureOpenAI

client = AzureOpenAI(
  azure_endpoint = os.getenv("AZURE_OPENAI_ENDPOINT"), 
  api_key=os.getenv("AZURE_OPENAI_API_KEY"),  
  api_version="2024-03-01-preview"
)

response = client.chat.completions.create(
  model="gpt-4-0125-Preview", # Model = should match the deployment name you chose for your 0125-Preview model deployment
  response_format={ "type": "json_object" },
  messages=[
    {"role": "system", "content": "You are a helpful assistant designed to output JSON."},
    {"role": "user", "content": "Who won the world series in 2020?"}
  ]
)
print(response.choices[0].message.content)

Output

{
  "winner": "Los Angeles Dodgers",
  "event": "World Series",
  "year": 2020
}

There are two key factors that need to be present to successfully use JSON mode:

  • response_format={ "type": "json_object" }
  • We told the model to output JSON as part of the system message.

Including guidance to the model that it should produce JSON as part of the messages conversation is required. We recommend adding instruction as part of the system message. According to OpenAI failure to add this instruction can cause the model to "generate an unending stream of whitespace and the request could run continually until it reaches the token limit."

Failure to include "JSON" within the messages returns:

Output

BadRequestError: Error code: 400 - {'error': {'message': "'messages' must contain the word 'json' in some form, to use 'response_format' of type 'json_object'.", 'type': 'invalid_request_error', 'param': 'messages', 'code': None}}

Other considerations

You should check finish_reason for the value length before parsing the response. The model might generate partial JSON. This means that output from the model was larger than the available max_tokens that were set as part of the request, or the conversation itself exceeded the token limit.

JSON mode produces JSON that is valid and parses without error. However, there's no guarantee for output to match a specific schema, even if requested in the prompt.