
Databricks model serving

SateeshPula 0 Reputation points
2026-03-17T16:18:11.9333333+00:00

I keep getting the error below when I try to create a serving endpoint. I have confirmed that the workspace is on the Premium tier, serverless is enabled, and GPU quota is available. I am not sure whether a backend entitlement also has to be enabled by Databricks.

Provisioned throughput is not enabled for this workspace.

Azure Databricks


1 answer

  1. AI answer

    2026-03-17T16:18:32.2733333+00:00

    The error indicates that the provisioned throughput feature is not enabled in the workspace, but the endpoint being created is configured to use provisioned throughput (a mode used mainly for Foundation Model APIs and fine‑tuned foundation models).

    To resolve this, use one of these approaches depending on what is being deployed:

    1. If deploying a custom model (own MLflow model):
      • Create a custom model serving endpoint, not a provisioned throughput endpoint.
      • In the Serving UI:
        1. Go to Serving in the sidebar.
        2. Click Create serving endpoint.
        3. In Served entities → Entity, select My models – Unity Catalog or My models – Model Registry and choose the registered model and version.
        4. Configure Compute Type (CPU or GPU) and Compute Scale-out (Small/Medium/Large) instead of any provisioned throughput options.
      • Do not set min_provisioned_throughput / max_provisioned_throughput in REST/SDK calls; instead, use the standard concurrency settings (min_provisioned_concurrency, max_provisioned_concurrency) if needed.
    2. If deploying a fine‑tuned foundation model with provisioned throughput:
      • Provisioned throughput endpoints are a specific feature of Foundation Model APIs.
      • In the Serving UI, provisioned throughput is only available for eligible models from Unity Catalog; when such a model is selected, the UI shows a Provisioned Throughput screen where the Up to dropdown is used to set tokens per second.
      • If the workspace does not have provisioned throughput enabled, this mode cannot be used. In that case, either:
        • Use the default pay‑per‑token Foundation Model API endpoints that appear at the top of the Serving list (no custom provisioned throughput endpoint required), or
        • Contact the Databricks account team to enable provisioned throughput for the workspace.
    3. If private connectivity (Azure Private Link) is required:
      • Private connectivity is only supported for provisioned throughput endpoints or custom model endpoints. Ensure the endpoint type matches this requirement.
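
    As a rough illustration of option 1, the sketch below builds a request body for a custom model endpoint and checks that it carries no provisioned throughput fields. The field names (served_entities, entity_name, entity_version, workload_type, workload_size, scale_to_zero_enabled) reflect the serving-endpoints REST API as I understand it and should be verified against the current API reference; the endpoint name, model name, and helper functions are hypothetical.

```python
import json

# Fields that request provisioned throughput, as named in the answer above.
PROVISIONED_THROUGHPUT_KEYS = {
    "min_provisioned_throughput",
    "max_provisioned_throughput",
}


def build_custom_endpoint_payload(endpoint_name, model_name, model_version,
                                  workload_type="CPU", workload_size="Small"):
    """Return a request body for a custom model endpoint.

    Assumption: the body is sent to the serving-endpoints create API.
    No provisioned throughput fields are included.
    """
    return {
        "name": endpoint_name,
        "config": {
            "served_entities": [
                {
                    "entity_name": model_name,
                    "entity_version": str(model_version),
                    "workload_type": workload_type,
                    "workload_size": workload_size,
                    "scale_to_zero_enabled": True,
                }
            ]
        },
    }


def find_provisioned_throughput_fields(payload):
    """List (entity_name, field) pairs that would request provisioned throughput."""
    hits = []
    for entity in payload.get("config", {}).get("served_entities", []):
        for key in sorted(PROVISIONED_THROUGHPUT_KEYS.intersection(entity)):
            hits.append((entity.get("entity_name"), key))
    return hits


if __name__ == "__main__":
    # Hypothetical endpoint and Unity Catalog model names.
    payload = build_custom_endpoint_payload("my-endpoint", "main.default.my_model", 1)
    assert find_provisioned_throughput_fields(payload) == []
    print(json.dumps(payload, indent=2))
```

    Running find_provisioned_throughput_fields on an existing endpoint configuration before submitting it may help confirm whether a leftover provisioned throughput setting is what triggers the error.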

    In summary, the error is not about GPU quota or Premium tier, but about trying to use provisioned throughput in a workspace where that feature is not enabled. Use a standard custom model serving endpoint (no provisioned throughput settings) or work with the Databricks account team to enable provisioned throughput if specifically required.


