Azure Databricks fails to install GeoSpark libraries from Maven

Anuj, Singh (Cognizant) 5 Reputation points
2024-04-15T06:24:17.8033333+00:00

Hi Team, I am attempting to add the two GeoSpark Maven libraries shown in the attached screenshot (Geospark_Library) to my Azure Databricks interactive cluster with Runtime Version 14.3 LTS.

However, I am getting the error below:

Library installation attempted on the driver node of cluster 0311-204237-y518gt4u and failed.

Please refer to the following error message to fix the library or contact Databricks support.

Error Code: DRIVER_LIBRARY_INSTALLATION_FAILURE.

Error Message: Library resolution failed because unresolved dependency: org.datasyslab:geospark-sql_2.3:1.3.1: not found unresolved dependency: org.datasyslab:geospark:1.3.1: not found

Azure Databricks

1 answer

  1. PRADEEPCHEEKATLA-MSFT 77,516 Reputation points Microsoft Employee
    2024-04-15T07:06:28.4366667+00:00

    @Anuj, Singh (Cognizant) - Thanks for the question and for using the MS Q&A platform.

    The error message you are seeing indicates that Databricks could not resolve the GeoSpark libraries from Maven: both coordinates are reported as "not found". This could be due to a network issue or a problem with the Maven repository.

    Here are some steps you can take to resolve this issue:

    1. Check your network connectivity: Ensure that your network connection is stable and that you are able to access the internet. You can try pinging the Maven repository to see if you are able to connect to it.
    2. Check the Maven repository: Verify that the Maven repository is up and running and that the GeoSpark library is available. You can try downloading the library manually from the repository to see if it is accessible.
    3. Check the Databricks cluster configuration: Ensure that the cluster is configured to use the correct Maven repository and that the repository is accessible from the cluster. You can check the cluster configuration by going to the cluster settings in the Databricks workspace.
    4. Try using a different Maven repository: If the issue persists, you can try using a different Maven repository to see if that resolves the issue. You can configure the cluster to use a different repository by updating the Maven settings in the cluster configuration.
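
Steps 1 and 2 above can be checked from a notebook cell on the cluster, or from any machine with Python. The following is a minimal sketch, not an official Databricks tool. It assumes the libraries are resolved from Maven Central at `https://repo1.maven.org/maven2` and that the repository uses the standard Maven layout (`group/artifact/version/artifact-version.ext`). The helper names `maven_artifact_url` and `artifact_is_reachable` are hypothetical and exist only for this illustration.

```python
import urllib.request

# Assumption: the cluster resolves libraries from Maven Central.
MAVEN_CENTRAL = "https://repo1.maven.org/maven2"


def maven_artifact_url(coordinate, ext="jar", repo_base=MAVEN_CENTRAL):
    """Turn 'group:artifact:version' into the URL of the artifact file,
    following the standard Maven repository layout."""
    group, artifact, version = coordinate.split(":")
    group_path = group.replace(".", "/")
    base = repo_base.rstrip("/")
    return f"{base}/{group_path}/{artifact}/{version}/{artifact}-{version}.{ext}"


def artifact_is_reachable(coordinate, timeout=10):
    """Return True if a HEAD request for the artifact's POM succeeds.

    Any network failure or HTTP error (for example 404) yields False.
    """
    url = maven_artifact_url(coordinate, ext="pom")
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:  # URLError, HTTPError and timeouts are all OSError subclasses
        return False
```

Calling `artifact_is_reachable("org.datasyslab:geospark:1.3.1")` and `artifact_is_reachable("org.datasyslab:geospark-sql_2.3:1.3.1")` from a notebook attached to the cluster shows whether the driver can reach each artifact. A `False` result from the cluster combined with `True` from a local machine would point to a network restriction on the cluster rather than a missing artifact.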

    In my repro, I installed the Maven libraries (org.datasyslab:geospark:1.3.1 and org.datasyslab:geospark-sql_2.3:1.3.1) on a cluster with Databricks Runtime Version 14.3 LTS (includes Apache Spark 3.5.0, Scala 2.12), and both installed successfully without any issue.

    If you are still experiencing the same issue, please share the steps you are following to install the libraries. You can also try downloading the JAR files directly from the Maven site, https://mvnrepository.com/artifact/org.datasyslab, and then uploading and installing them manually.

    Hope this helps. Do let us know if you have any further queries.
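
The manual download can also be scripted. The following is a minimal sketch with a hypothetical helper, `download_file`, that streams a file from a URL into a local directory. The direct JAR URL in the trailing comment is an assumption based on the standard Maven Central layout and does not come from the thread. Uploading the downloaded file and attaching it to the cluster is still done through the cluster's Libraries tab.

```python
import shutil
import urllib.request
from pathlib import Path


def download_file(url, dest_dir="."):
    """Stream the file at `url` into `dest_dir` and return the local path.

    The local file name is taken from the last segment of the URL.
    """
    dest = Path(dest_dir) / url.rsplit("/", 1)[-1]
    with urllib.request.urlopen(url, timeout=60) as response, open(dest, "wb") as out:
        shutil.copyfileobj(response, out)
    return dest


# Example call (URL is an assumption, see the note above):
# download_file(
#     "https://repo1.maven.org/maven2/org/datasyslab/geospark/1.3.1/geospark-1.3.1.jar"
# )
```

If the download succeeds from a local machine but the cluster still reports "not found", that supports the network explanation given earlier in this answer.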


    If this answers your query, please click "Accept Answer" and "Yes" for "Was this answer helpful".