Query a model with the Open Responses API

This article explains how to query foundation models using the Open Responses API and describes the provider-specific behavior to account for when you do.

The Open Responses API is an open, multi-provider implementation of the responses-style request format. It uses an input field instead of messages and returns a structured output array. Send requests to the /serving-endpoints/open-responses path with the model serving endpoint name in the model field of the request body.

Note

For OpenAI models, use the OpenAI Responses API directly. That path is a native passthrough and supports the full set of OpenAI Responses parameters and tools. This article covers the Open Responses API, which works across providers but supports a focused feature set.

Query examples

The following example queries a foundation model endpoint with the Open Responses API.

curl \
  -u "token:$DATABRICKS_TOKEN" \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "model": "databricks-claude-sonnet-4-5",
    "input": [
      {
        "role": "user",
        "content": "What is a mixture of experts model?"
      }
    ],
    "max_output_tokens": 256
  }' \
  https://<workspace_host>.databricks.com/serving-endpoints/open-responses

The response is a response object with an output array. For streaming requests (stream: true), the response is a text/event-stream where each event is a response chunk.
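
The same request can be sketched in Python with only the standard library. This is a minimal sketch, not an official client: build_request and output_text are hypothetical helper names, the host is a placeholder, and the Bearer authorization header is used in place of the -u option shown above. The text extraction assumes that message items in the output array carry content parts of type output_text, as in the general responses-style format; this article doesn't spell out that shape.

```python
import json
import urllib.request


def build_request(host: str, token: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request to the Open Responses path."""
    return urllib.request.Request(
        url=f"https://{host}/serving-endpoints/open-responses",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def output_text(response: dict) -> str:
    """Concatenate the text parts of message items in the output array.

    Assumed item shape (not confirmed by this article):
    {"type": "message", "content": [{"type": "output_text", "text": "..."}]}
    """
    parts = []
    for item in response.get("output", []):
        if item.get("type") != "message":
            continue  # skip reasoning, function_call, and other item types
        for part in item.get("content", []):
            if part.get("type") == "output_text":
                parts.append(part.get("text", ""))
    return "".join(parts)


payload = {
    "model": "databricks-claude-sonnet-4-5",
    "input": [{"role": "user", "content": "What is a mixture of experts model?"}],
    "max_output_tokens": 256,
}
# Placeholder host and token; replace with your workspace values.
request = build_request("workspace.example.com", "example-token", payload)
# To send the request:
#   with urllib.request.urlopen(request) as resp:
#       print(output_text(json.load(resp)))
```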

Provider-specific behavior

Databricks translates the Open Responses request to each provider's native format. Behavior is consistent for most requests, but the following provider-specific differences apply.

All providers

Conversations are stateless. previous_response_id and server-side conversation storage aren't supported. Send the full conversation in the input field on each turn.
Some OpenAI-specific fields are accepted but ignored on non-OpenAI providers. Fields such as user, safety_identifier, metadata, and truncation are returned in the response for portability but don't change provider behavior.
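
Because conversations are stateless, the client has to carry the history itself. The following sketch shows one way to assemble the input for the next turn. next_turn_input is a hypothetical helper, and replaying the previous response's output items unchanged is an assumption based on the responses-style format.

```python
def next_turn_input(history: list, response_output: list, user_message: str) -> list:
    """Return the full input list for the next request.

    history: the input list sent on the previous turn.
    response_output: the output array of the previous response, replayed as-is.
    user_message: the new user turn to append.
    """
    return [*history, *response_output, {"role": "user", "content": user_message}]


history = [{"role": "user", "content": "Name a prime number."}]
# Illustrative output item; a real response supplies this array.
previous_output = [
    {
        "type": "message",
        "role": "assistant",
        "content": [{"type": "output_text", "text": "7"}],
    }
]
history = next_turn_input(history, previous_output, "Name a larger one.")
```

Sending the resulting list as input on every request keeps the model's view of the conversation complete without previous_response_id.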

Databricks-hosted (open source) models

Feature support is per model. Function calling, reasoning, structured output, and image input are enabled per model. A request that uses a feature the model doesn't support returns an error. For example, a model that supports reasoning might not support image input.
Image input must be a URL or data URI. Provide images through image_url as an https URL or a data: URI. File references (file_id) and document inputs (input_file) aren't supported.
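
As an illustration of the image rule above, the following sketch encodes image bytes as a data: URI and rejects references that are neither an https URL nor a data: URI. to_data_uri and image_part are hypothetical helpers, and the input_image part type is an assumption taken from the responses-style format; only the image_url field is named in this article.

```python
import base64


def to_data_uri(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a base64 data: URI."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def image_part(image_url: str) -> dict:
    """Build an image content part, accepting only https URLs or data: URIs."""
    if not image_url.startswith(("https://", "data:")):
        raise ValueError("image_url must be an https URL or a data: URI")
    # "input_image" is an assumed part type, not confirmed by this article.
    return {"type": "input_image", "image_url": image_url}


part = image_part(to_data_uri(b"abc"))
```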

Anthropic Claude models

Temperature uses a 0–2 scale. Claude's native range is 0–1, so Databricks halves the value you send: temperature: 1.0 behaves like a native Claude temperature of 0.5.
Reasoning round-trips across turns. To let the model reason over its prior thinking in a multi-turn conversation, send the returned reasoning items—with their encrypted_content unchanged—back in the next request's input. See Query reasoning models.
Image and document inputs must be base64 data URIs. Provide images through image_url as a base64 data: URI and documents through file_data as a base64 data: URI. https URLs and file_id references aren't supported.
Structured output has constraints. text.format of type json_schema is supported, but json_object isn't and returns an error. Structured output can't be combined with streaming or with reasoning, and you can't pin tool_choice to a specific tool when using it. See Structured outputs on Azure Databricks.
Reasoning tokens are included in usage.output_tokens rather than reported separately.
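
The temperature rescaling and the structured output constraints above can be checked on the client before a request is sent. The sketch below is illustrative only: both function names are hypothetical, the range check is an added assumption, and the request fields stream, reasoning, and tool_choice (as an object when pinned to one tool) are assumed to follow the responses-style format.

```python
def effective_claude_temperature(temperature: float) -> float:
    """Return the native Claude temperature for a 0-2 scale value (halved)."""
    if not 0.0 <= temperature <= 2.0:
        # Assumed validation; the article doesn't describe out-of-range handling.
        raise ValueError("temperature must be between 0 and 2")
    return temperature / 2.0


def claude_structured_output_problems(request: dict) -> list:
    """List the structured output constraints a Claude request would violate."""
    fmt_type = request.get("text", {}).get("format", {}).get("type")
    problems = []
    if fmt_type == "json_object":
        problems.append("json_object isn't supported; use json_schema")
    if fmt_type == "json_schema":
        if request.get("stream"):
            problems.append("structured output can't be combined with streaming")
        if request.get("reasoning"):
            problems.append("structured output can't be combined with reasoning")
        if isinstance(request.get("tool_choice"), dict):
            problems.append("tool_choice can't be pinned to a specific tool")
    return problems


problems = claude_structured_output_problems(
    {"text": {"format": {"type": "json_schema"}}, "stream": True}
)
```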

Google Gemini models

Temperature uses a 0–2 scale. Gemini uses a native 0–1 range, so Databricks rescales the value by halving it—temperature: 1.0 behaves like 0.5.
Reasoning round-trips across turns. To let the model reason over its prior thinking in a multi-turn conversation, send the returned reasoning items—with their encrypted_content unchanged—back in the next request's input. See Query reasoning models.
Image input accepts both https URLs and base64 data URIs.
Reasoning tokens are reported in usage.output_tokens_details.reasoning_tokens.
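
Because reasoning tokens are reported differently per provider, code that reads usage should tolerate a missing details object. The following sketch uses a hypothetical reasoning_tokens helper and illustrative usage values; it returns None when the count isn't reported separately, as with Claude models.

```python
def reasoning_tokens(usage: dict):
    """Return the separately reported reasoning token count, or None if absent."""
    details = usage.get("output_tokens_details") or {}
    return details.get("reasoning_tokens")


# Illustrative usage objects, not real measurements.
gemini_usage = {"output_tokens": 120, "output_tokens_details": {"reasoning_tokens": 80}}
claude_usage = {"output_tokens": 120}

gemini_count = reasoning_tokens(gemini_usage)
claude_count = reasoning_tokens(claude_usage)
```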

Important

Multi-turn tool calls with Gemini require preserving encrypted_content. Gemini returns an encrypted_content value on each function_call item it produces. When you send the tool result back for the next turn, you must include the original function_call item with its encrypted_content field unchanged. Agent frameworks that reconstruct tool calls from only name, arguments, and call_id drop this field, which causes the follow-up request to be rejected.

The following example preserves the function_call item (with its encrypted_content) when returning the tool result:

{
  "model": "databricks-gemini-2-5-pro",
  "input": [
    { "role": "user", "content": "What's the weather in San Francisco?" },
    {
      "type": "function_call",
      "call_id": "call_abc123",
      "name": "get_weather",
      "arguments": "{\"city\": \"San Francisco\"}",
      "encrypted_content": "<opaque-provider-signature>"
    },
    {
      "type": "function_call_output",
      "call_id": "call_abc123",
      "output": "{\"temp_f\": 64}"
    }
  ]
}
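
In client code, the safest way to meet this requirement is to copy each function_call item from the response as a whole instead of rebuilding it field by field. The following is a minimal sketch under that assumption. tool_result_input is a hypothetical helper, and it handles only function_call items, mirroring the example above.

```python
import copy
import json


def tool_result_input(prior_input: list, response_output: list, results: dict) -> list:
    """Build the next input, replaying function_call items verbatim.

    results maps each call_id to a JSON-serializable tool result.
    """
    next_input = list(prior_input)
    for item in response_output:
        if item.get("type") != "function_call":
            continue
        # Copy the whole item so encrypted_content and any other
        # provider fields are sent back unchanged.
        next_input.append(copy.deepcopy(item))
        next_input.append(
            {
                "type": "function_call_output",
                "call_id": item["call_id"],
                "output": json.dumps(results[item["call_id"]]),
            }
        )
    return next_input


prior = [{"role": "user", "content": "What's the weather in San Francisco?"}]
output = [
    {
        "type": "function_call",
        "call_id": "call_abc123",
        "name": "get_weather",
        "arguments": "{\"city\": \"San Francisco\"}",
        "encrypted_content": "<opaque-provider-signature>",
    }
]
next_input = tool_result_input(prior, output, {"call_abc123": {"temp_f": 64}})
```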

Tools

The Open Responses API supports function-type tools across providers. For details and the supported models, see Function calling on Azure Databricks. For the web search built-in tool, see Web search on Azure Databricks.

Other built-in and custom tool types (for example custom, apply_patch, image_generation, and mcp) are available only through the OpenAI Responses API.
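
As a rough illustration, a function-type tool and a pre-flight check for the OpenAI-only tool types named above might look like the following. The flat tool shape (type, name, description, parameters) is an assumption taken from the responses-style format, and get_weather_tool and openai_only_tools are hypothetical names; see the function calling article for the authoritative schema.

```python
# Tool types this article lists as available only through the OpenAI Responses API.
OPENAI_ONLY_TOOL_TYPES = {"custom", "apply_patch", "image_generation", "mcp"}

# Assumed function tool shape; the parameters value is a JSON Schema object.
get_weather_tool = {
    "type": "function",
    "name": "get_weather",
    "description": "Look up the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}


def openai_only_tools(tools: list) -> list:
    """Return the types of any tools that the Open Responses API doesn't accept."""
    return [t["type"] for t in tools if t.get("type") in OPENAI_ONLY_TOOL_TYPES]


rejected = openai_only_tools([get_weather_tool, {"type": "mcp"}])
```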

Supported models

The Open Responses API is available across Databricks foundation models, including Anthropic Claude, Google Gemini, and Databricks-hosted open models, and new models are supported as they become available. For the current list of available models, see Foundation model types.

Feature support, such as function calling, reasoning, structured output, and image input, depends on the underlying model. See Provider-specific behavior.

Supported input types

Input support depends on the model and provider. Text input is supported by all models. For image input, see the per-provider notes in Provider-specific behavior and the format and size requirements in Query vision models. For per-model input types, see Databricks-hosted foundation models available in Foundation Model APIs.

Last updated on 2026-06-29