notebook-scoped lib failing to find the dependent cluster-scoped lib

Tri Tran 1 Reputation point Microsoft Employee
2020-10-07T21:45:54.727+00:00

Our app recently hit a ModuleNotFoundError with Databricks Runtime 6.4, and it looks like a Databricks issue, so I wanted to ask about it.

The problem can happen as follows:

  1. Two libraries A.B and A.C, where A.C depends on A.B.
  2. Install A.B as a cluster-scoped library; it goes to the default location, e.g. /databricks/python/lib/python3.7/site-packages/A/B.
  3. Install A.C as a notebook-scoped library via dbutils.library.installPyPI; this goes to the virtual-environment location, e.g. /local_disk0/pythonVirtualEnvDirs/virtualEnv-ead9164f-61e5-4c26-9374-374bfb80263c/lib/python3.7/site-packages/A/C. A.B is not installed into the virtual environment because the same version is already installed at cluster scope.
  4. When we import A.C in the notebook, A.B needs to be imported as well, but this fails with ModuleNotFoundError because A.B is not found in the virtual environment.
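One way to see where each module resolves from, and whether a dependency is visible on the current sys.path at all, is to check module specs with importlib. This is a minimal diagnostic sketch (not Databricks-specific; the missing-module name is made up):

```python
import importlib.util

def locate(module_name):
    """Return the filesystem origin of a module, or None if it is not importable."""
    spec = importlib.util.find_spec(module_name)
    return spec.origin if spec is not None else None

# A stdlib module resolves to a real path under the active environment.
print(locate("json"))

# A module that is on no sys.path entry (e.g. a cluster-scoped dependency
# that never made it into the notebook's virtual env) resolves to None,
# which is what later surfaces as ModuleNotFoundError on import.
print(locate("module_that_does_not_exist_anywhere"))  # prints None
```

Running this inside the notebook session for the dependency (here, azure.core) shows whether the virtual environment can see it before the import ever fails.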

We hit this issue with A.B = "azure-core==1.8.2" and A.C = "azure-kusto-data==0.0.24". Our current workaround is to install both packages together, either at cluster scope or in the notebook session.
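The notebook-side workaround can be sketched as the following cell. Note that dbutils exists only inside a Databricks notebook session, so the call is guarded here; the pinned versions are the ones from the report above:

```python
# Install both the package and its dependency in the same notebook-scoped
# virtual environment, so A.C never has to fall back to the cluster copy of A.B.
packages = ["azure-core==1.8.2", "azure-kusto-data==0.0.24"]

# dbutils is defined only inside a Databricks notebook.
if "dbutils" in globals():
    for spec in packages:
        name, version = spec.split("==")
        dbutils.library.installPyPI(name, version=version)  # noqa: F821
```

Installing both at cluster scope instead (via the cluster Libraries UI) works for the same reason: both packages end up in one site-packages directory.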

Could you let us know whether this is indeed a Databricks issue, and whether there is a plan to fix it? Thank you.

Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.

1 answer

  1. HimanshuSinha-msft 19,381 Reputation points Microsoft Employee
    2020-10-14T21:35:03.173+00:00

    Hello @Tri Tran ,

    We have discussed the same ask here: https://github.com/MicrosoftDocs/azure-docs/issues/64000

    Thanks
    Himanshu
