The VM size you are specifying is not available. [details] QuotaExceeded: Operation could not be completed as it results in exceeding approved Total Regional Cores quota. Additional details - Deployment Model: Resource Manager, Location: CentralIndia, Cur
I'm trying to create a cluster in Azure Databricks. I have tried different regions and different memory and core sizes, but all I see is SKU errors or (The VM size you are specifying is not available. [details] QuotaExceeded: Operation could not be…
Azure Databricks
Guidance on Partition-Level Retry Strategy for Catch-Up CDC Ingestion and Reconciliation
In our ongoing healthcare data migration project, we are ingesting data from IBM DB2 (via IBM InfoSphere CDC) into Kafka (on GCP), and then further processing it through Databricks into Azure SQL Hyperscale. Here’s the specific situation for which we…
Azure Databricks
My Databricks workspace (adb-3634739564378269.9.azuredatabricks.net) is not loading after login.
My Databricks workspace (adb-3634739564378269.9.azuredatabricks.net) is not loading after login. Login succeeds and redirects to the workspace, but it shows a blank screen or freezes. Error from browser DevTools: ChunkLoadError: Loading chunk 46442…
Azure Databricks
Databricks automate
Hello, I have scheduled automated Databricks emails for data quality checks on 15-20 datasets, which send DQ reports at scheduled times. Since all these emails go to the business as separate mailers, the business wants them combined into a single email. How to do…
Azure Databricks
ADF vs Databricks for Load in ETL into Hyperscale
We are working on a highly sensitive healthcare data migration project involving: Source: IBM DB2 (on-prem) with partitioned tables (up to 18 TB in size). CDC: IBM InfoSphere CDC → Kafka Topics (on GCP). Target: Azure SQL Hyperscale. There are two…
Azure Databricks
When to Use MERGE INTO vs APPLY CHANGES INTO in Databricks CDC Pipelines
Background: In our CDC pipeline, we use Databricks to process Kafka CDC data (I/U/D events) into Delta tables. We’re evaluating whether to continue using MERGE INTO or shift to APPLY CHANGES INTO. ❓ Questions for Microsoft: When should we prefer APPLY…
Azure Databricks
CDC Merge Hyperscale Options
In our current project, we have already completed a historical load of ~80 TB into Azure SQL Hyperscale, and the table content in Hyperscale is in sync with our "branch" Delta Lake table in Databricks. For catch-up CDC ingestion, incremental…
Azure Databricks
Kafka CDC merge ordering
In our CatchUp architecture, we are consuming CDC data from IBM InfoSphere CDC (FirstWare) into Kafka topics. For a given primary key, it is possible that we get a sequence of operations like Insert followed by one or more Updates. These events are…
Azure Databricks
Best Practices for Reconciling Kafka CDC Operations (I/U/D) in Azure Databricks During Catch-Up
Background: In our healthcare-sensitive project, we are performing large-scale historical migration from IBM DB2 to Azure. The catch-up phase handles all CDC changes (Insert/Update/Delete) that occurred after the historical snapshot. These changes are…
Azure Databricks
Auto loader in detail
Hello, my task is to provide a cost estimate for Auto Loader, and it should be as accurate as possible. Please advise how to do that. Thanks
Azure Databricks
Retry and Failure Handling Strategy for CDC Merge Pipeline from Kafka to Databricks and Hyperscale
In our CDC ingestion architecture, we are processing incremental changes (3,000-30,000 events/sec) across 800 topics for 800 tables from IBM DB2 using Kafka topics (via IBM InfoSphere CDC), with the following two stages: Kafka to Databricks Silver Layer: We…
Azure Databricks
Kafka Partitioning vs DB Partitions
We are working on a large-scale CDC ingestion pipeline after completing a one-time historical migration, in which we have already imported 80 TB of data via ADF into the bronze layer, where: Source: IBM DB2 (on-prem) CDC Tool: IBM InfoSphere CDC publishes to…
Azure Databricks
Guidance on Connecting Azure Databricks to External Kafka Cluster (GCP-Hosted) for Structured Streaming Ingestion
We are implementing a real-time ingestion pipeline where Azure Databricks (in our tenant) consumes CDC data directly from a Kafka cluster hosted on GCP (external to Azure). The Kafka topics are populated by IBM InfoSphere CDC and are available in…
Azure Databricks
Best Practices for Handling Kafka Load Spikes in Structured Streaming Without Autoscaling
❓Question for Microsoft/Databricks Team: We are working on a stateful real-time CDC ingestion pipeline using Azure Databricks Structured Streaming, where: Kafka (CDC topics from on-prem DB2 via IBM CDC) is our source. Azure Databricks reads these topics…
Azure Databricks
Impact of Kafka Partition Size on Databricks Streaming Performance When Writing to Azure SQL Hyperscale
In our project, we are using Databricks (not ADF) for both catch-up and real-time CDC ingestion from Kafka topics and writing the output directly to Azure SQL Hyperscale via JDBC. Some of our source Kafka topics (originating from DB2 CDC) may have large…
Azure Databricks
Hash calculation strategy for data type mismatches
In our current project, we are migrating data from an on-premises IBM DB2 system to Azure SQL Hyperscale, using Azure Databricks for transformation and reconciliation. This includes both batch and CDC-based pipelines. Our project requirement is not just…
Azure Databricks
Unable to create or bring up the cluster on Azure Databricks. - Failed to perform resource identity operation
Hi, we have set up an Azure Databricks service along with supporting services, and it was working fine until the changes below were made to the Azure subscription. Details of changes: --> The subscription was moved to a different directory. Post this change,…
Azure Databricks
I can't get Databricks to talk to my storage account. Error 403
I can't get Databricks to mount my Data Lake storage. I get error 403 no matter what I do.
Azure Databricks
I am not able to create a cluster. I am trying to create a single node cluster. However, when I try to select a node type, all of them seem disabled. I also get a message that "cluster cannot be created because no node is enabled for this subscription".
I am trying to create a single node cluster using my free trial account. I have selected West India as my region while creating the resource group. When I try to create an "all purpose cluster" and select a node type, all VM…
Azure Databricks
Guidance on Designing Control Tables Across Historical Migration, Catch-up, and Streaming Phases
In our project, we are handling data ingestion in three phases: One-time historical migration (via ADF) Catch-up CDC (Kafka from IBM CDC) Real-time streaming (Structured Streaming with Databricks) We have already designed separate control tables for…