@29577539 - Thanks for the question and for using the MS Q&A platform.
The error message you are seeing indicates that there is a communication issue between Databricks and your on-premises Impala instance. Here are some steps you can take to troubleshoot and resolve the issue:
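- Check network connectivity: Ensure that there is network connectivity between Databricks and your on-premises Impala instance. Run the test from the Databricks cluster itself rather than from your local machine. Note that ping only confirms that the host answers ICMP, while telnet against the Impala port confirms that the port itself is reachable over TCP.
- Check firewall settings: Ensure that the necessary ports are open in the firewall settings for your on-premises Impala instance. The default port for Impala JDBC/ODBC (HiveServer2 protocol) connections is 21050, but this may vary depending on your configuration.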
- Check Impala configuration: Ensure that Impala is configured to allow connections from Databricks. You may need to add the IP address or hostname of the Databricks cluster to the Impala configuration.
- Check JDBC driver version: Ensure that you are using the correct version of the JDBC driver for your Impala instance. You can download the latest version of the Cloudera Impala JDBC driver from the Cloudera website.
- Check Databricks cluster configuration: Ensure that the Databricks cluster is configured to use the correct JDBC driver and connection settings for your Impala instance. You can set these in the cluster configuration, with the driver JAR installed as a cluster library.
- Check Impala logs: Check the Impala logs for any errors or warnings related to the connection issue. This may provide additional information on the cause of the issue.
By following these steps, you should be able to identify and resolve the communication issue between Databricks and your on-premises Impala instance.
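As a quick way to run the connectivity and firewall checks above from a Databricks notebook, here is a minimal sketch using only the Python standard library. The helper name `can_reach` and the host name in the usage comment are placeholders I made up for illustration; replace them with your own values. It only tells you whether a TCP connection to the port can be opened, not whether Impala will accept the JDBC session.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within `timeout` seconds."""
    try:
        # create_connection resolves the host name and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


# Example usage (hypothetical host name, default Impala JDBC port):
# print(can_reach("impala.example.internal", 21050))
```

If this returns False from the cluster but the same check succeeds from a machine inside your on-premises network, the problem is most likely in routing or firewall rules between the two networks rather than in Impala or the JDBC driver.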
Also, check out the MS Q&A thread "How can i connect to my on premise Impala system from Azure databricks using python/pyspark code", which addresses a similar issue.
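For reference, a PySpark JDBC read against Impala could look roughly like the sketch below. Treat the details as assumptions to verify against your driver's documentation: the helper `build_impala_jdbc_options` is a hypothetical name, the host, database, and table are placeholders, and the driver class `com.cloudera.impala.jdbc.Driver` and the `jdbc:impala://` URL prefix are what recent Cloudera Impala JDBC drivers use, which may differ for your driver version. Authentication properties are left out because they depend on your setup.

```python
def build_impala_jdbc_options(host: str, port: int, database: str, table: str) -> dict:
    """Build the option dict for spark.read.format("jdbc") against Impala.

    Assumption: Cloudera Impala JDBC driver; check the class name and URL
    format in the documentation shipped with your driver version.
    """
    return {
        "url": f"jdbc:impala://{host}:{port}/{database}",
        "driver": "com.cloudera.impala.jdbc.Driver",  # assumed driver class
        "dbtable": table,
    }


# Placeholder values for illustration only.
opts = build_impala_jdbc_options("impala.example.internal", 21050, "default", "my_table")

# In a Databricks notebook, where `spark` is already defined and the driver
# JAR is installed on the cluster, the read would then be:
# df = spark.read.format("jdbc").options(**opts).load()
# df.show(5)
```

If the read fails with a class-not-found error, the driver JAR is not on the cluster; if it fails with a communication or timeout error, go back to the connectivity and firewall checks.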
Hope this helps. Do let us know if you have any further queries.
If this answers your query, do click Accept Answer and Yes for "Was this answer helpful".