Learn how to send requests to agents deployed to Databricks Apps or Model Serving endpoints. Databricks provides multiple query methods to fit different use cases and integration needs.
Select the query approach that best fits your use case:
| Method | Key benefits |
|---|---|
| Databricks OpenAI Client (Recommended) | Native integration, full feature support, streaming capabilities |
| REST API | OpenAI-compatible, language-agnostic, works with existing tools |
| AI Functions: ai_query | OpenAI-compatible, queries legacy agents hosted on Model Serving endpoints only |
Databricks recommends the Databricks OpenAI Client for new applications. Choose the REST API when integrating with platforms that expect OpenAI-compatible endpoints.
Databricks OpenAI Client (Recommended)
Databricks recommends that you use the Databricks OpenAI Client to query a deployed agent. Depending on the API of your deployed agent, you will use either the responses client or the chat completions client:
Agents deployed to Apps
Use the following example for agents hosted on Databricks Apps following the ResponsesAgent interface, which is the recommended approach for building agents. You must use a Databricks OAuth token to query agents hosted on Databricks Apps.
from databricks.sdk import WorkspaceClient
from databricks_openai import DatabricksOpenAI
input_msgs = [{"role": "user", "content": "What does Databricks do?"}]
app_name = "<agent-app-name>" # TODO: update this with your app name
# The WorkspaceClient must be configured with OAuth authentication
# See: https://docs.databricks.com/aws/en/dev-tools/auth/oauth-u2m.html
w = WorkspaceClient()
client = DatabricksOpenAI(workspace_client=w)
# Run for non-streaming responses. Calls the "invoke" method
# Include the "apps/" prefix in the model name
response = client.responses.create(model=f"apps/{app_name}", input=input_msgs)
print(response)
# Include stream=True for streaming responses. Calls the "stream" method
# Include the "apps/" prefix in the model name
streaming_response = client.responses.create(
    model=f"apps/{app_name}", input=input_msgs, stream=True
)
for chunk in streaming_response:
    print(chunk)
If you want to pass in custom_inputs, you can add them with the extra_body param:
streaming_response = client.responses.create(
    model=f"apps/{app_name}",
    input=input_msgs,
    stream=True,
    extra_body={
        "custom_inputs": {"id": 5},
    },
)
for chunk in streaming_response:
    print(chunk)
Agents on Model Serving
Use the following example for legacy agents hosted on Model Serving following the ResponsesAgent interface. You can use either a Databricks OAuth token or a Personal Access Token (PAT) to query agents hosted on Model Serving.
from databricks_openai import DatabricksOpenAI
input_msgs = [{"role": "user", "content": "What does Databricks do?"}]
endpoint = "<agent-endpoint-name>" # TODO: update this with your endpoint name
client = DatabricksOpenAI()
# Run for non-streaming responses. Invokes `predict`
response = client.responses.create(model=endpoint, input=input_msgs)
print(response)
# Include stream=True for streaming responses. Invokes `predict_stream`
streaming_response = client.responses.create(model=endpoint, input=input_msgs, stream=True)
for chunk in streaming_response:
    print(chunk)
If you want to pass in custom_inputs or databricks_options, you can add them with the extra_body param:
streaming_response = client.responses.create(
    model=endpoint,
    input=input_msgs,
    stream=True,
    extra_body={
        "custom_inputs": {"id": 5},
        "databricks_options": {"return_trace": True},
    },
)
for chunk in streaming_response:
    print(chunk)
Use the following example for legacy agents hosted on Model Serving that follow the ChatAgent or ChatModel interfaces.
from databricks.sdk import WorkspaceClient
messages = [{"role": "user", "content": "What does Databricks do?"}]
endpoint = "<agent-endpoint-name>" # TODO: update this with your endpoint name
ws_client = WorkspaceClient()
client = ws_client.serving_endpoints.get_open_ai_client()
# Run for non-streaming responses. Invokes `predict`
response = client.chat.completions.create(model=endpoint, messages=messages)
print(response)
# Include stream=True for streaming responses. Invokes `predict_stream`
streaming_response = client.chat.completions.create(model=endpoint, messages=messages, stream=True)
for chunk in streaming_response:
    print(chunk)
If you want to pass in custom_inputs or databricks_options, you can add them with the extra_body param:
streaming_response = client.chat.completions.create(
    model=endpoint,
    messages=messages,
    stream=True,
    extra_body={
        "custom_inputs": {"id": 5},
        "databricks_options": {"return_trace": True},
    },
)
for chunk in streaming_response:
    print(chunk)
REST API
The Databricks REST API provides OpenAI-compatible endpoints for deployed models. This lets you use Databricks agents in applications that require OpenAI interfaces.
This approach is ideal for:
- Language-agnostic applications that use HTTP requests
- Integrating with third-party platforms that expect OpenAI-compatible APIs
- Migrating from OpenAI to Databricks with minimal code changes
Authenticate with the REST API using a Databricks OAuth token. Refer to the Databricks Authentication Documentation for more options and information.
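As a minimal sketch, assuming the Databricks Python SDK is installed and configured, you can let the SDK resolve OAuth credentials and reuse the resulting headers for plain HTTP calls:

from databricks.sdk import WorkspaceClient

# The SDK resolves OAuth credentials from the environment or ~/.databrickscfg
w = WorkspaceClient()
auth_headers = w.config.authenticate()  # e.g. {"Authorization": "Bearer <OAuth token>"}
print(w.config.host)  # workspace URL to use as the base for REST calls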
Agents deployed to Apps
Use the following example for agents hosted on Databricks Apps following the ResponsesAgent interface, which is the recommended approach for building agents. You must use a Databricks OAuth token to query agents hosted on Databricks Apps.
curl --request POST \
--url https://<app-url>.databricksapps.com/responses \
--header 'Authorization: Bearer <OAuth token>' \
--header 'content-type: application/json' \
--data '{
"input": [{ "role": "user", "content": "hi" }],
"stream": true
}'
If you want to pass in custom_inputs, you can add them to the request body:
curl --request POST \
--url https://<app-url>.databricksapps.com/responses \
--header 'Authorization: Bearer <OAuth token>' \
--header 'content-type: application/json' \
--data '{
"input": [{ "role": "user", "content": "hi" }],
"stream": true,
"custom_inputs": { "id": 5 }
}'
Agents on Model Serving
Use the following example for legacy agents hosted on Model Serving following the ResponsesAgent interface. You can use either a Databricks OAuth token or a Personal Access Token (PAT) to query agents hosted on Model Serving. The REST API call is equivalent to:
- Using the Databricks OpenAI Client with `responses.create`.
- Sending a POST request to the specific endpoint's URL (for example, https://<host.databricks.com>/serving-endpoints/<model-name>/invocations). For more information, see your endpoint's Model Serving page and the Model Serving Documentation.
curl --request POST \
--url https://<host.databricks.com>/serving-endpoints/responses \
--header 'Authorization: Bearer <OAuth token>' \
--header 'content-type: application/json' \
--data '{
"model": "\<model-name\>",
"input": [{ "role": "user", "content": "hi" }],
"stream": true
}'
If you want to pass in custom_inputs or databricks_options, you can add them to the request body:
curl --request POST \
--url https://<host.databricks.com>/serving-endpoints/responses \
--header 'Authorization: Bearer <OAuth token>' \
--header 'content-type: application/json' \
--data '{
"model": "\<model-name\>",
"input": [{ "role": "user", "content": "hi" }],
"stream": true,
"custom_inputs": { "id": 5 },
"databricks_options": { "return_trace": true }
}'
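The same request can also be sent from Python without the OpenAI client. This is an illustrative sketch of a direct POST to the endpoint's invocations URL; the endpoint name is a placeholder and the requests library is an assumed dependency, not part of the Databricks SDK:

import requests
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()  # resolves OAuth or PAT credentials automatically

# POST to the endpoint's invocations URL; the endpoint name is a placeholder
url = f"{w.config.host}/serving-endpoints/<agent-endpoint-name>/invocations"
response = requests.post(
    url,
    headers=w.config.authenticate(),
    json={"input": [{"role": "user", "content": "hi"}]},
)
print(response.json())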
Use the following for agents created with legacy ChatAgent or ChatModel interfaces. This is equivalent to:
- Using the Databricks OpenAI Client with `chat.completions.create`.
- Sending a POST request to the specific endpoint's URL (for example, https://<host.databricks.com>/serving-endpoints/<model-name>/invocations). For more information, see your endpoint's Model Serving page and the Model Serving Documentation.
curl --request POST \
--url https://<host.databricks.com>/serving-endpoints/chat/completions \
--header 'Authorization: Bearer <OAuth token>' \
--header 'content-type: application/json' \
--data '{
"model": "\<model-name\>",
"messages": [{ "role": "user", "content": "hi" }],
"stream": true
}'
If you want to pass in custom_inputs or databricks_options, you can add them to the request body:
curl --request POST \
--url https://<host.databricks.com>/serving-endpoints/chat/completions \
--header 'Authorization: Bearer <OAuth token>' \
--header 'content-type: application/json' \
--data '{
"model": "\<model-name\>",
"messages": [{ "role": "user", "content": "hi" }],
"stream": true,
"custom_inputs": { "id": 5 },
"databricks_options": { "return_trace": true }
}'
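The equivalent Python sketch for these chat-style agents changes only the payload key from `input` to `messages` (again, the endpoint name is a placeholder and requests is an assumed dependency):

import requests
from databricks.sdk import WorkspaceClient

w = WorkspaceClient()

# Same invocations URL pattern, but with a chat-completions-style "messages" payload
url = f"{w.config.host}/serving-endpoints/<agent-endpoint-name>/invocations"
response = requests.post(
    url,
    headers=w.config.authenticate(),
    json={"messages": [{"role": "user", "content": "hi"}]},
)
print(response.json())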
AI Functions: ai_query
You can use ai_query to query a deployed agent hosted on Model Serving using SQL. See ai_query function for SQL syntax and parameter definitions.
SELECT ai_query(
"<model name>", question
) FROM (VALUES ('what is MLflow?'), ('how does MLflow work?')) AS t(question);
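You can also run the same statement from Python. This is a minimal sketch assuming the databricks-sql-connector package and an available SQL warehouse; the host, HTTP path, and token values are placeholders:

from databricks import sql  # pip package: databricks-sql-connector

# Placeholder connection details for a SQL warehouse
with sql.connect(
    server_hostname="<host.databricks.com>",
    http_path="/sql/1.0/warehouses/<warehouse-id>",
    access_token="<token>",
) as conn:
    with conn.cursor() as cur:
        cur.execute(
            "SELECT ai_query('<model name>', question) "
            "FROM (VALUES ('what is MLflow?')) AS t(question)"
        )
        print(cur.fetchall())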