How to run a Lakeflow Connect pipeline in Databricks?

Santhosh Jayarao 0 Reputation points
2025-11-28T09:53:47.72+00:00

I am using Azure Databricks Lakeflow Connect (Public Preview) to ingest CDC data from Azure SQL. My subscription is Pay-As-You-Go (0d339ce2-e789-41a0-8ca5-35d7dde18867).

Lakeflow Connect creates a Gateway that requires serverless compute clusters to be provisioned under the hood by the AzureDatabricks_RP resource provider.

The issue is:

❌ The Lakeflow Gateway serverless cluster cannot launch in ANY region

I have tried Central India, East US, West US, Central US, East US 2, and West Europe.

Even after increasing quotas for:

Standard FSv2 family (up to 16 vCPUs)

Standard EDv4 family (up to 16 vCPUs)

Standard Dv4 family

Standard Dsv4 family

Standard DCasv5 family

The compute provisioning still consistently fails with:

The VM size you are specifying is not available.
SkuNotAvailable: The requested VM size for resource 'Following SKUs have failed for Capacity Restrictions: Standard_F4s'

QuotaExceeded: Operation could not be completed as it results in exceeding approved standardEDv4Family Cores quota.
Current Limit: 0, Required: 4

QuotaExceeded: Standard_Dv4 / Standard_Dsv4 / Standard_DCasv5 Families

databricks_error_message: Encountered Quota Exhaustion issue in your account
CLUSTER_LAUNCH_CLIENT_ERROR

❗ Key Point:

These errors occur even though quota has been successfully increased, and across multiple regions. Databricks support documentation notes that this is caused by a missing backend permission for the Databricks RP to allocate serverless compute on new PAYG subscriptions.


1 answer
  1. VRISHABHANATH PATIL 1,820 Reputation points Microsoft External Staff Moderator
    2025-12-01T09:43:02.2666667+00:00

    Hi @Santhosh Jayarao

    Thank you for reaching out to Microsoft Q&A. Below is a detailed analysis of your concern, along with steps to resolve the issue.

    What’s actually happening (in plain English)

    When you use Lakeflow Connect for Azure SQL, two things spin up:

    • A Gateway that runs on your Azure compute (classic/customer compute).
    • An ingestion pipeline that runs on Databricks serverless.

    So you need Azure VM capacity + per‑family quotas for the Gateway in the region you pick, and you need your Databricks workspace to be eligible for serverless so the pipeline can run. This is why you’re seeing both SkuNotAvailable and QuotaExceeded on the Gateway side, even though you raised quotas.
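
    If you want to confirm which half is failing, note that the Gateway is itself a pipeline object whose spec carries the classic-compute cluster request. A quick way to inspect it, assuming the unified Databricks CLI and jq are installed (the pipeline ID is a placeholder):

    Shell

    # dump the Gateway's cluster spec to see exactly which VM sizes it is requesting
    databricks pipelines get <gateway-pipeline-id> | jq '.spec.clusters'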


    Make sure your workspace is ready for serverless (so pipelines will run)

    Note: New/just‑created accounts and PAYG subscriptions sometimes need a back‑end enablement period before serverless fully works. If the workspace is brand new, factor that in.
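
    For a quick sanity check from the CLI, you can at least confirm the workspace tier, since serverless features require Premium. This assumes the az databricks extension is installed; the resource group and workspace names are placeholders:

    Shell

    # requires: az extension add --name databricks
    az databricks workspace show \
      --resource-group <your-rg> \
      --name <your-workspace> \
      --query sku.name --output tsv   # expect: premium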

    Fix the Gateway launch: deal with SKU capacity and per‑family quotas

    Your errors point to two separate (but related) Azure issues:

    1. SkuNotAvailable for Standard_F4s: that VM size is at capacity in the region/zone you tried. Pick a different SKU, or a different region where it's available. Quick way to check:

    Shell

    az vm list-skus --location eastus2 --size Standard_F --all --output table
    az vm list-skus --location centralus --size Standard_E --all --output table

    You're looking for lines where Restrictions = None.

    Helpful threads:
    https://learn.microsoft.com/en-us/answers/questions/1372856/skunotavailable-while-creating-and-starting-the-az
    https://community.databricks.com/t5/get-started-discussions/need-help-in-creating-compute-cluster/td-p/130468

    2. QuotaExceeded for EDv4 / Dv4 / Dsv4 / DCasv5 families: increase the per‑family vCPU quotas and the Total Regional vCPUs quota in the exact region you'll use. Don't just raise global quotas; raise family quotas for:
      • Fsv2 (covers Standard_F4s workers)
      • EDv4 (covers the recommended Gateway driver)
      • Others only if your policy defaults to them (Dv4/Dsv4/DCasv5)
    You can verify current limits from the CLI, as shown below.
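
    To see current versus approved quotas in the target region (the region here is just an example):

    Shell

    # per-family vCPU usage and limits; look for the FSv2, EDv4, and Total Regional rows
    az vm list-usage --location eastus2 --output table | grep -iE "FSv2|EDv4|Regional"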

    Pin the Gateway to the known‑good SKUs with a compute policy

    Databricks documents the recommended sizes for the SQL Server connector Gateway:

    • Driver: Standard_E64d_v4
    • Worker: Standard_F4s
    • Minimum: 8 cores

    Create a compute policy that pins these, and apply it to the Gateway task so you don’t get surprise families that hit quota limits:

    Policy (JOB_COMPUTE family)

    JSON

    {
      "name": "lakeflow-gateway-edv4-f4s",
      "definition": {
        "cluster_type": { "type": "fixed", "value": "dlt" },
        "num_workers": { "type": "unlimited", "defaultValue": 1, "isOptional": true },
        "runtime_engine": { "type": "fixed", "value": "STANDARD", "hidden": true },
        "driver_node_type_id": { "type": "fixed", "value": "Standard_E64d_v4" },
        "node_type_id": { "type": "fixed", "value": "Standard_F4s" }
      },
      "policy_family_id": "JOB_COMPUTE"
    }

    Apply the policy to the Gateway (Pipeline API fragment)

    JSON

    {
      "clusters": [
        {
          "label": "lakeflow_gateway",
          "policy_id": "abc-def-123",
          "apply_policy_default_values": true
        }
      ]
    }
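
    To push that fragment onto an existing Gateway, one option is a PUT against the Pipelines API. The workspace URL and pipeline ID are placeholders, and abc-def-123 above should be the policy_id from the previous step. Note that PUT replaces the whole spec, so gateway-update.json must contain your full pipeline settings, not just the clusters block:

    Shell

    # update the Gateway pipeline spec with the policy-pinned cluster definition
    curl -X PUT "https://<workspace-url>/api/2.0/pipelines/<gateway-pipeline-id>" \
      -H "Authorization: Bearer $DATABRICKS_TOKEN" \
      -H "Content-Type: application/json" \
      -d @gateway-update.json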

    Community experience with the same SKUs: https://community.databricks.com/t5/get-started-discussions/issue-with-the-sql-server-ingestion-in-databricks-lakflow/td-p/122226

    Pick a region that actually has capacity

    If East US is tight, try East US 2 or South Central US (or another region that shows no restrictions for Standard_F4s and Standard_E*d_v4). Validate with the az vm list-skus check above.
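
    To scan several candidate regions in one pass (the region list is just an example):

    Shell

    # "None" in the Restrictions column means the SKU is usable in that region
    for region in eastus2 southcentralus westeurope; do
      echo "== $region =="
      az vm list-skus --location $region --size Standard_F4s --output table
    done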

    If your PAYG subscription is new and this still won’t launch

    This matches the known pattern where AzureDatabricks_RP (the resource provider) needs backend enablement on new PAYG subscriptions for serverless scenarios. Open two support tickets in the Azure portal:

    Ticket #1 – Compute quotas (Microsoft.Compute)

    Increase Per‑VM family vCPU and Total Regional vCPU quotas in <region> for Fsv2 (to use Standard_F4s) and EDv4 (driver). We need at least +8 vCPU each for the Databricks Lakeflow SQL Server connector Gateway. If Standard_F4s is capacity‑restricted, please remove/adjust that restriction.

    Ticket #2 – Azure Databricks RP / serverless enablement

    We’re on a new PAYG subscription and the Lakeflow Connect Gateway consistently fails with CLUSTER_LAUNCH_CLIENT_ERROR, SkuNotAvailable: Standard_F4s, and QuotaExceeded despite quota increases. Please confirm AzureDatabricks_RP backend permissions and ensure our workspace is fully enabled for serverless.
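
    Before filing, it's worth confirming the Databricks resource provider is at least registered on the subscription. Registration alone doesn't prove serverless enablement, but an Unregistered state would explain a lot:

    Shell

    # check the provider state; register it if the first command prints anything but "Registered"
    az provider show --namespace Microsoft.Databricks --query registrationState --output tsv
    az provider register --namespace Microsoft.Databricks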


    End‑to‑end checklist (do these in order)

    1. Workspace prerequisites: Unity Catalog enabled; serverless‑supported region; Premium plan as needed. https://learn.microsoft.com/en-us/azure/databricks/compute/serverless/ https://learn.microsoft.com/en-us/azure/databricks/admin/sql/serverless
    2. Region validation: Use az vm list-skus to confirm no restrictions for Standard_F4s and Standard_E*d_v4 in your chosen region.
    3. Quota increases: Raise Fsv2 and EDv4 family quotas plus Total Regional vCPU for that region (not just global).
    4. Compute policy: Create/apply the policy to pin Standard_E64d_v4 (driver) and Standard_F4s (worker), ≥8 cores. https://learn.microsoft.com/en-us/azure/databricks/ingestion/lakeflow-connect/sql-server-pipeline
    5. Gateway deploy: Redeploy the Gateway with the policy attached.
    6. Pipeline run: Lakeflow pipeline runs on serverless; configure schedule/alerts. https://learn.microsoft.com/en-us/azure/databricks/ingestion/lakeflow-connect/
    7. If blocked: Open the two Azure support tickets (Compute quotas + Databricks RP/serverless enablement).
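
    For steps 5 and 6, once the policy is attached you can trigger both pipelines from the unified Databricks CLI (the pipeline IDs are placeholders):

    Shell

    # kick the Gateway first, then the serverless ingestion pipeline
    databricks pipelines start-update <gateway-pipeline-id>
    databricks pipelines start-update <ingestion-pipeline-id>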


    TL;DR

    • Pin the Gateway to Standard_E64d_v4 (driver) + Standard_F4s (worker) with a compute policy.
    • Pick a region where those SKUs show no restrictions and raise per‑family + regional vCPU quotas there.
    • If the subscription/workspace is new PAYG, ask Azure support to confirm Databricks RP/serverless enablement.
    • Then deploy the Gateway and run the pipeline—job done.
