How to create a global init script to install a .whl library on all clusters?

Cristina Santana Souza 61 Reputation points
2021-06-10T18:11:37.85+00:00

Hello,

I created a global init script to install the package on all clusters at startup. The script runs successfully and I can import the functions from the package, but the installed library doesn't appear in the Libraries tab of the cluster. Am I doing something wrong?

  • Script

104344-image.png

  • Runtime

104372-image.png

  • Logs

104373-image.png

  • Libraries

No library is displayed. This is a problem because those who use the cluster will not know that the library has been installed.

104300-image.png

Best regards,
Cristina

Azure Databricks

1 answer

  1. PRADEEPCHEEKATLA 90,641 Reputation points Moderator
    2021-06-11T09:56:00.293+00:00

    Hello @Cristina Santana Souza ,

    Thanks for the question and using MS Q&A platform.

    Unfortunately, libraries installed using global or init scripts do not show in the Libraries section of the cluster.

    Note: Libraries installed from notebooks are not shown in the Libraries section of the cluster either.

    Example: I installed a library called openpyxl via global init script, init script, and notebook, and in each case it does not show in the Libraries section of the cluster.

    Global Init Script:

    104620-image.png

    Event Log confirming the INIT_SCRIPTS_FINISHED:

    104743-image.png

    Cluster shows: No library is displayed.

    104764-image.png

    The installed library openpyxl can be used successfully in notebooks:

    104772-image.png

    How to verify whether a package is installed?

    You can use the command below to list the libraries that are installed:

    import pkg_resources

    # Print every distribution (name and version) visible to this Python environment
    for d in pkg_resources.working_set:
        print(d)
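
    As a possible alternative, newer setuptools releases mark pkg_resources as deprecated, and the standard library module importlib.metadata (Python 3.8 and later) can produce the same kind of listing. The following is only a sketch; the helper name installed_packages is made up for illustration and is not part of any Databricks API.

```python
from importlib.metadata import distributions


def installed_packages():
    """Return {name: version} for every distribution visible to this interpreter.

    Hypothetical helper for illustration; not a Databricks or setuptools API.
    """
    result = {}
    for dist in distributions():
        name = dist.metadata["Name"]
        # Skip entries with broken metadata that carry no name
        if name:
            result[name] = dist.version
    return result


packages = installed_packages()
# Print one "name==version" line per package, sorted case-insensitively
for name in sorted(packages, key=str.lower):
    print(f"{name}=={packages[name]}")
```

    Running this in a notebook cell attached to the cluster should list packages installed by the init script along with everything else in the environment.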
    

    To check the installed version of openpyxl on the cluster:

    import pkg_resources

    # Returns the installed version string; raises DistributionNotFound if openpyxl is missing
    pkg_resources.get_distribution('openpyxl').version
    

    104765-image.png
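
    If you would rather get a clear answer than an exception when the package is missing, a small check along these lines may help. This is a sketch that assumes Python 3.8 or later (for importlib.metadata); the function name installed_version is hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version of package_name, or None if it is not installed.

    Hypothetical helper for illustration only.
    """
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Prints the version string if the init script installed openpyxl, otherwise None
print(installed_version("openpyxl"))
```

    Since the Libraries tab will not list the package, a cell like this at the top of a shared notebook is one way to let cluster users confirm that it is available.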

    Hope this helps. Do let us know if you have any further queries.

