Important
This page covers the new AI Gateway (visible in the left navigation of the UI), which is currently in Beta. Account admins can enable access to this feature on the Previews page of the account console. See Manage Azure Databricks previews.
For details on the previous version of AI Gateway (not Unity AI Gateway), see AI Gateway for serving endpoints (legacy).
Note
Unity AI Gateway is not supported on AWS GovCloud or Azure Government.
What is Unity AI Gateway?
Unity AI Gateway is the enterprise control plane for governing model services, agents, and coding tools. Use it to analyze usage, configure permissions, and manage capacity across providers.
With Unity AI Gateway, you can:
- Analyze how LLMs, agents, and coding tools are used in your organization
- Govern access to Azure Databricks-hosted and external models
- Log model traffic across all model services to Unity Catalog
- Monitor model service health and provider availability
- Enforce rate limits and guardrails at the model service, user, or group level
- Attribute costs to specific model services, users, and teams
- Route traffic intelligently across providers for reliability and load balancing
- Split traffic across multiple model backends for scalability
- Switch providers and models without code changes

Supported features
The following table describes the available Unity AI Gateway features:
| Feature | Description |
|---|---|
| Permissions | Control who has access to your model services. |
| Usage tracking | Monitor usage and costs using system tables. |
| Inference tables | Monitor and audit requests and responses in Unity Catalog Delta tables. |
| Operational metrics | Monitor usage in real time. |
| Rate limits | Enforce consumption limits at the model service, user, or group level. |
| Guardrails | Apply content filtering, sensitive data protection, and custom policies. |
| Cost observability | Analyze Azure Databricks cost for model services, destination models, principals, and tags in the billable usage system table and the usage dashboard. |
| Fallbacks | Increase reliability by routing to multiple providers when failures occur. |
| Traffic splitting | Distribute traffic across multiple model backends for better scalability and load balancing. |
| Custom APIs | Govern custom and external APIs with the same access controls, rate limits, and logging as model services. |
Note
Unity AI Gateway features don't incur charges during Beta.
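When a rate limit like those in the table above is exceeded, a client usually needs to wait and retry rather than fail outright. The following is a minimal client-side sketch of exponential backoff with jitter. The helper name `call_with_backoff` and its parameters are hypothetical and are not part of any Databricks API, and how the gateway signals a rate-limited request (for example, an HTTP 429 response) is an assumption to verify against your workspace.

```python
import random
import time


def call_with_backoff(fn, retry_on, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call fn(); if it raises retry_on, wait and try again (hypothetical helper).

    retry_on is the exception type (or tuple of types) treated as "rate limited".
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Exponential backoff with jitter:
            # base_delay * 2^attempt, scaled by a random factor in [0.5, 1.0].
            delay = base_delay * (2 ** attempt) * random.uniform(0.5, 1.0)
            sleep(delay)
```

With the OpenAI Python client, one option is to pass `openai.RateLimitError` as `retry_on` and wrap the request in a zero-argument function or lambda. The `sleep` parameter exists only so the wait can be replaced in tests.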
Use Unity AI Gateway
Azure Databricks provides model services for popular LLMs. You can create new model services to govern agents, coding tools, and other applications.
To get started, see Configure Unity AI Gateway endpoints (legacy). To query model services, see Query Unity AI Gateway endpoints (legacy). To integrate coding agents like Cursor, Gemini CLI, Codex CLI, and Claude Code, see coding agent integration. To route LLM calls from agents you author and deploy on Databricks Apps through Unity AI Gateway, see Step 4. Govern LLM usage from your agents on Databricks Apps with Unity AI Gateway.
Query quickstart
Tip
Tell Genie Code (Agent mode) to do this for you:
Create a new notebook that queries a model service using Python and the OpenAI client.
The following example shows how to query a model service using Python and the OpenAI client:
from openai import OpenAI
import os

# To get a Databricks token, see https://docs.databricks.com/dev-tools/auth/pat
DATABRICKS_TOKEN = os.environ.get('DATABRICKS_TOKEN')

# Point the OpenAI client at the Unity AI Gateway endpoint for your workspace.
client = OpenAI(
    api_key=DATABRICKS_TOKEN,
    base_url="https://<workspace-url>/ai-gateway/mlflow/v1"
)

# Send a short multi-turn conversation to the model service.
chat_completion = client.chat.completions.create(
    messages=[
        {"role": "user", "content": "Hello!"},
        {"role": "assistant", "content": "Hello! How can I assist you today?"},
        {"role": "user", "content": "What is Databricks?"},
    ],
    model="databricks-gpt-5-2",
    max_tokens=256
)

print(chat_completion.choices[0].message.content)
Replace <workspace-url> with your Azure Databricks workspace URL.
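To avoid hard-coding the workspace URL and model name, the client settings can be read from configuration instead. The following is a minimal sketch, not an official pattern: the helper names `gateway_base_url` and `gateway_settings` are hypothetical, and the use of the `DATABRICKS_HOST` and `GATEWAY_MODEL` environment variables is an assumption made for illustration. Only the `/ai-gateway/mlflow/v1` path and the `databricks-gpt-5-2` model name come from the example above.

```python
import os

# Path suffix taken from the quickstart example above.
GATEWAY_PATH = "/ai-gateway/mlflow/v1"


def gateway_base_url(workspace_url: str) -> str:
    """Build the gateway base URL from a workspace URL (hypothetical helper)."""
    host = workspace_url.strip().rstrip("/")
    if not host:
        raise ValueError("workspace URL must not be empty")
    # Accept workspace URLs given with or without a scheme.
    if not host.startswith(("https://", "http://")):
        host = "https://" + host
    return host + GATEWAY_PATH


def gateway_settings(env=os.environ) -> dict:
    """Collect client settings from environment-style variables.

    DATABRICKS_HOST and GATEWAY_MODEL are assumed names chosen for this sketch.
    """
    return {
        "base_url": gateway_base_url(env["DATABRICKS_HOST"]),
        "api_key": env.get("DATABRICKS_TOKEN"),
        # Fall back to the model service used in the quickstart.
        "model": env.get("GATEWAY_MODEL", "databricks-gpt-5-2"),
    }
```

The resulting `base_url` and `api_key` values can be passed to `OpenAI(...)`, and `model` to `client.chat.completions.create(...)`, so that pointing an application at a different model service becomes a configuration change.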