gRPC connection issue in serverless databricks-connect

Question

Subray Hegde 0

SUMMARY:

Queries executed via Databricks Connect v17 (Spark Connect / gRPC) on serverless compute COMPLETE SUCCESSFULLY on the server side (Spark tasks finish, results are produced), but the Spark Connect gRPC channel FAILS TO DELIVER results back to the client application. The client receives nothing, waits, and eventually cancels the query after its timeout.

This issue is 100% exclusive to Spark Connect. The Databricks SQL Connector (poll-based HTTP) on the same data, same network, same user has ZERO cancellations.

ENVIRONMENT:

• databricks-connect version: 17 (latest)

• Client: External Python application via Databricks Connect

• Compute: Serverless (SERVERLESS_COMPUTE)

• Protocol: SPARK_CONNECT (gRPC / HTTP2)

EXACT FAILURE FLOW:

1. Client app sends query via Databricks Connect (gRPC) → serverless
2. Serverless executes query: Spark tasks complete, results produced
3. *** Server FAILS to stream results back via gRPC *** (result_fetch_duration_ms = 0; result delivery never starts)
4. Client waits... receives nothing... hits app timeout
5. Client cancels query/session
6. Query recorded as CANCELED in query history
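
One way to pin down the waiting step from the client side is to put an explicit deadline around the blocking call and record how long the client actually waited. The following is only a sketch using the Python standard library: `run_with_deadline` is a hypothetical helper name, and in a real test the callable passed in would be whatever blocking fetch the application performs (for example, collecting a DataFrame obtained through Databricks Connect).

```python
import concurrent.futures
import time


def run_with_deadline(fn, timeout_s):
    """Run fn() in a worker thread and return (status, result, elapsed_seconds).

    status is "ok" if fn returned within timeout_s, "timeout" otherwise.
    On timeout the worker thread is left running; cancelling the underlying
    query is still the caller's responsibility.
    """
    start = time.monotonic()
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(fn)
    try:
        result = future.result(timeout=timeout_s)
        status = "ok"
    except concurrent.futures.TimeoutError:
        result = None
        status = "timeout"
    finally:
        # Do not block on the worker here; a stalled read may never return.
        pool.shutdown(wait=False)
    return status, result, time.monotonic() - start
```

Logging the returned status and elapsed time next to the query ID makes it easier to line up client-side timeouts with the CANCELED entries in query history.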

Answer 1

@Subray Hegde Hey there! It sounds like you’re running into a current limitation with Spark Connect (gRPC) on serverless compute—your queries finish on the server side but the gRPC channel never actually streams the results back to your client. Here’s what’s likely going on and how you can work around it:

Spark Connect support is only GA on all-purpose clusters right now
• Serverless compute (the SQL-warehouse-style endpoints) doesn’t yet support the bidirectional HTTP/2 streaming that Spark Connect relies on.
• That’s why your Databricks SQL Connector (HTTP-polling) works fine, but gRPC stalls as soon as it tries to send rows back.
Workarounds
• If you need Spark Connect today, spin up an all-purpose cluster instead of a serverless endpoint. Make sure you’ve got spark.databricks.service.server.enabled=true on the cluster and you’re on a Databricks Runtime that matches your databricks-connect client version.
• If you absolutely need serverless compute, continue using the SQL Connector (poll/HTTP based), which is fully supported.
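
For the SQL Connector route, a minimal sketch might look like the block below. It assumes the `databricks-sql-connector` package is installed and that the connection details live in three environment variables; the variable names and the `fetch_rows` helper are illustrative assumptions, not something taken from the original post.

```python
import os


def fetch_rows(query, env=os.environ):
    """Run query through the Databricks SQL Connector and return all rows."""
    # Assumed variable names; adjust to however your app stores credentials.
    hostname = env["DATABRICKS_SERVER_HOSTNAME"]
    http_path = env["DATABRICKS_HTTP_PATH"]  # HTTP path of a SQL warehouse
    token = env["DATABRICKS_TOKEN"]

    # Imported lazily so this module still loads where the connector
    # (pip install databricks-sql-connector) is not installed.
    from databricks import sql

    with sql.connect(
        server_hostname=hostname,
        http_path=http_path,
        access_token=token,
    ) as connection:
        with connection.cursor() as cursor:
            cursor.execute(query)
            return cursor.fetchall()
```

Calling `fetch_rows("SELECT 1")` with the three variables set is a quick way to confirm the HTTP path works from the same machine and network as the failing gRPC client.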
Next steps / follow-up questions
To nail down exactly what’s happening, can you share a bit more info?
• Have you confirmed whether Spark Connect works against an all-purpose cluster in your workspace?
• Are there any HTTP/2 proxies or corporate gateways between your client and Databricks? (gRPC streaming can break if a proxy doesn’t support HTTP/2.)
• Can you grab driver/executor logs from your serverless session to see if any gRPC errors show up?
• Which exact Spark Runtime and databricks-connect client version are you matching?
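
To help answer the proxy question, here is a small sketch that lists proxy-related environment variables visible to the client process. The lower-case names are the ones gRPC documents for proxy selection; checking the upper-case variants as well is an assumption made here because many tools set those instead. `proxy_settings` is a hypothetical helper name.

```python
import os

# Proxy-related variables to look for in the client environment.
PROXY_VARS = ("grpc_proxy", "https_proxy", "http_proxy", "no_grpc_proxy", "no_proxy")


def proxy_settings(env=os.environ):
    """Return the proxy-related variables that are set to a non-empty value."""
    found = {}
    for name in PROXY_VARS:
        # Check both spellings, since tools differ on which case they set.
        for candidate in (name, name.upper()):
            value = env.get(candidate)
            if value:
                found[candidate] = value
    return found
```

An empty result from `proxy_settings()` does not rule out a transparent corporate gateway, but a non-empty one is a strong hint that gRPC traffic is being routed through a proxy worth checking for HTTP/2 support.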

Reference docs you might find helpful:

• Troubleshooting Databricks Connect for Python:

https://learn.microsoft.com/azure/databricks/dev-tools/databricks-connect/python/troubleshooting

• Troubleshooting Databricks Connect for Scala:

https://learn.microsoft.com/azure/databricks/dev-tools/databricks-connect/scala/troubleshooting

• Advanced usage of Databricks Connect (gRPC over HTTP/2, custom headers, proxies):

https://learn.microsoft.com/azure/databricks/dev-tools/databricks-connect/advanced#use-spark-connect-server-with-databricks-connect

• Serverless compute networking & connectivity guide:

https://learn.microsoft.com/azure/databricks/security/network/serverless-network-security

Hope this sheds some light. Let me know the answers to those follow-up questions, and we’ll dig deeper!

Note: This content was drafted with the help of an AI system. Please verify the information before relying on it for decision-making.
