This article summarizes the limitations and region availability for Azure Databricks Model Serving and supported endpoint types.
Resource and payload limits
Model Serving imposes default limits to ensure reliable performance. If you have feedback on these limits, reach out to your Databricks account team.
The limits in this section apply to custom model and AI agent endpoints only. For Foundation Model APIs and external model resource and payload limits, see Foundation Model APIs rate limits and quotas.
Custom models and AI agents
| Feature | Granularity | Limit |
|---|---|---|
| Endpoints | Per workspace | 1000. Reach out to your Databricks account team to increase. |
| Queries per second (QPS) | Per endpoint | 300,000 using route optimization. If 1024 concurrency is not enough, reach out to your Databricks account team to increase. |
| Queries per second (QPS) | Per workspace | 300,000 using route optimization. 200 without route optimization, which is recommended only for small development use cases. |
| Provisioned concurrency | Per model | 1024 with custom option and route optimization. Reach out to your Databricks account team to increase. |
| Provisioned concurrency | Per workspace | 4096. Reach out to your Databricks account team to increase. |
| Create/update operations | Per workspace | 50 in 5 minutes. |
| Payload size | Per request | 16 MB. For AI agent endpoints the limit is 4 MB. |
| Request/response size | Per request | Requests or responses larger than 1 MB are not logged. |
| Model execution duration | Per request | 297 seconds |
| CPU endpoint model memory usage | Per endpoint | 4 GB |
| GPU endpoint model memory usage | Per endpoint | Depends on GPU type |
| Environment variables | Per served model | 30. Reach out to your Databricks account team to increase. |
| Overhead latency | Per request | Less than 20 milliseconds with route optimization. |
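The payload and logging limits in the table can be checked on the client before a request is sent. The following is a minimal sketch, not an official client: the function name `check_payload` and the returned keys are hypothetical, and it assumes 1 MB means 1024 × 1024 bytes and that the request is sent as UTF-8 encoded JSON.

```python
import json

# Limits from the table above. Treating 1 MB as 1024 * 1024 bytes is an assumption.
MB = 1024 * 1024
CUSTOM_MODEL_PAYLOAD_LIMIT = 16 * MB
AI_AGENT_PAYLOAD_LIMIT = 4 * MB
LOGGING_SIZE_LIMIT = 1 * MB


def check_payload(payload: dict, is_agent_endpoint: bool = False) -> dict:
    """Hypothetical helper: measure a JSON request body against the documented limits."""
    body = json.dumps(payload).encode("utf-8")
    limit = AI_AGENT_PAYLOAD_LIMIT if is_agent_endpoint else CUSTOM_MODEL_PAYLOAD_LIMIT
    return {
        "size_bytes": len(body),
        # True when the request body fits the per-request payload limit.
        "within_payload_limit": len(body) <= limit,
        # True when the request body is small enough to be logged.
        "request_loggable": len(body) <= LOGGING_SIZE_LIMIT,
    }
```

This sketch only measures the request body. The response size counts toward the logging limit as well and cannot be known before the call.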
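Automation that creates or updates many endpoints can stay under the limit of 50 create/update operations in 5 minutes by throttling itself. The sketch below is one possible client-side guard. The class `SlidingWindowLimiter` is hypothetical, and because this article does not state how the service measures the 5-minute window, a sliding window is used here as a conservative assumption.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Hypothetical client-side guard: at most max_calls in any window_seconds span."""

    def __init__(self, max_calls: int = 50, window_seconds: float = 300.0,
                 clock=time.monotonic):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self._clock = clock          # injectable so the limiter can be tested
        self._timestamps = deque()   # times of the calls still inside the window

    def try_acquire(self) -> bool:
        """Record a call and return True if it fits the window, else return False."""
        now = self._clock()
        # Drop calls that have aged out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        if len(self._timestamps) < self.max_calls:
            self._timestamps.append(now)
            return True
        return False
```

A caller would check `try_acquire()` before each create or update request and wait or queue the operation when it returns `False`.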
Networking and security limitations
- Model Serving endpoints are protected by access control and respect networking-related ingress rules configured on the workspace, like IP allowlists and Private Link.
- Private connectivity (such as Azure Private Link) is only supported for model serving endpoints that use provisioned throughput or endpoints that serve custom models.
- By default, Model Serving does not support Private Link to external endpoints (such as Azure OpenAI). Support for this functionality is evaluated and implemented on a per-region basis. Reach out to your Azure Databricks account team for more information.
- Model Serving does not provide security patches to existing model images because of the risk of destabilization to production deployments. A new model image created from a new model version will contain the latest patches. Reach out to your Databricks account team for more information.
Compliance security profile standards: CPU workloads
The following table lists the compliance standards that the compliance security profile supports for core Model Serving functionality on CPU workloads.
Note
These compliance standards require served containers to have been built within the last 30 days. Databricks automatically rebuilds outdated containers on your behalf. However, if this automated job fails, an event log message like the following appears and provides guidance on how to keep your endpoints within compliance requirements:
"Databricks couldn't complete a scheduled compliance check for model $servedModelName. This can happen if the system can't apply a required update. To resolve, try relogging your model. If the issue persists, contact support@databricks.com."
| Region | Location | HIPAA | HITRUST | PCI-DSS | IRAP | CCCS Medium (Protected B) | UK Cyber Essentials Plus |
|---|---|---|---|---|---|---|---|
| australiacentral | AustraliaCentral | | | | | | |
| australiacentral2 | AustraliaCentral2 | | | | | | |
| australiaeast | AustraliaEast | ✓ | ✓ | ✓ | | | |
| australiasoutheast | AustraliaSoutheast | | | | | | |
| brazilsouth | BrazilSouth | ✓ | ✓ | ✓ | | | |
| canadacentral | CanadaCentral | ✓ | ✓ | ✓ | | | |
| canadaeast | CanadaEast | | | | | | |
| centralindia | CentralIndia | ✓ | ✓ | ✓ | | | |
| centralus | CentralUS | ✓ | ✓ | ✓ | | | |
| chinaeast2 | ChinaEast2 | | | | | | |
| chinaeast3 | ChinaEast3 | | | | | | |
| chinanorth2 | ChinaNorth2 | | | | | | |
| chinanorth3 | ChinaNorth3 | | | | | | |
| eastasia | EastAsia | ✓ | ✓ | ✓ | | | |
| eastus | EastUS | ✓ | ✓ | ✓ | | | |
| eastus2 | EastUS2 | ✓ | ✓ | ✓ | | | |
| francecentral | FranceCentral | ✓ | ✓ | ✓ | | | |
| germanywestcentral | GermanyWestCentral | ✓ | ✓ | ✓ | | | |
| japaneast | JapanEast | ✓ | ✓ | ✓ | | | |
| japanwest | JapanWest | | | | | | |
| koreacentral | KoreaCentral | ✓ | ✓ | ✓ | | | |
| mexicocentral | MexicoCentral | | | | | | |
| northcentralus | NorthCentralUS | ✓ | ✓ | ✓ | | | |
| northeurope | NorthEurope | ✓ | ✓ | ✓ | | | |
| norwayeast | NorwayEast | | | | | | |
| qatarcentral | QatarCentral | | | | | | |
| southafricanorth | SouthAfricaNorth | | | | | | |
| southcentralus | SouthCentralUS | ✓ | ✓ | ✓ | | | |
| southeastasia | SoutheastAsia | ✓ | ✓ | ✓ | | | |
| southindia | SouthIndia | | | | | | |
| swedencentral | SwedenCentral | ✓ | ✓ | ✓ | | | |
| switzerlandnorth | SwitzerlandNorth | ✓ | ✓ | ✓ | | | |
| switzerlandwest | SwitzerlandWest | | | | | | |
| uaenorth | UAENorth | ✓ | ✓ | ✓ | | | |
| uksouth | UKSouth | ✓ | ✓ | ✓ | ✓ | | |
| ukwest | UKWest | | | | | | |
| westcentralus | WestCentralUS | | | | | | |
| westeurope | WestEurope | ✓ | ✓ | ✓ | | | |
| westindia | WestIndia | | | | | | |
| westus | WestUS | ✓ | ✓ | ✓ | | | |
| westus2 | WestUS2 | ✓ | ✓ | ✓ | | | |
| westus3 | WestUS3 | ✓ | ✓ | ✓ | | | |
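The 30-day container requirement described in the note before this table amounts to a simple date comparison. The sketch below shows that check under stated assumptions: the function `container_is_current` is hypothetical, and the way you obtain a container's build timestamp is not covered by this article, so it is passed in as a plain timezone-aware `datetime`.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# The note above requires served containers to be built within the last 30 days.
MAX_CONTAINER_AGE = timedelta(days=30)


def container_is_current(built_at: datetime, now: Optional[datetime] = None) -> bool:
    """Hypothetical helper: True if a container built at built_at is at most 30 days old.

    Both datetimes are expected to be timezone-aware.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now - built_at <= MAX_CONTAINER_AGE
```

A result of `False` corresponds to the situation in which the event log message suggests relogging the model.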
Foundation Model APIs limits
For detailed information about Foundation Model APIs, including resource and payload limits for foundation and external models, see Foundation Model APIs rate limits and quotas.
Region availability
Note
If you require an endpoint in an unsupported region, reach out to your Azure Databricks account team.
If your workspace is deployed in a region that supports model serving but is served by a control plane in an unsupported region, the workspace does not support model serving. If you attempt to use model serving in such a workspace, you see an error message stating that your workspace is not supported. Reach out to your Azure Databricks account team for more information.
For more information on regional availability of each Model Serving feature, see Model serving features availability.
For Databricks-hosted foundation model region availability, see Foundation models hosted on Databricks.