How to use the whl file in the Spark pool in Azure Synapse Analytics?

Sanket Kelkar 0 Reputation points
2023-03-29T07:08:20.9133333+00:00

Hi All,

I want to use a .whl file in a Spark pool in Azure Synapse Analytics. I have tried three approaches so far:

A. From the Azure portal: manually adding the .whl file to the workspace packages and then to the Spark pool packages. This method is too slow and takes approx. 30 minutes to complete.

B. From the Azure CLI (in PowerShell), by running the commands below:

  • Upload the .whl file to the workspace
   az synapse workspace-package upload --workspace-name myWorkSpace --package "C:\Users\Test\Documents\GitHub\TestGitHub\dist\my_etl-0.0.1-py3-none-any.whl"
  • Attach the package to the Spark pool
   az synapse spark pool update --name mySparkPoolName --workspace-name myWorkSpace --resource-group myRG --package-action Add --package my_etl-0.0.1-py3-none-any.whl

This method is also slow and takes approx. 20 mins to complete.
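For anyone who wants to script the two CLI steps from method B, here is a minimal Python sketch. It only assembles the same `az` commands shown above and, by default, prints them instead of running them. The function names (`build_upload_command`, `build_attach_command`, `run_commands`) are hypothetical helpers made up for this example, and the values are the placeholders from my commands. It does not make the update any faster; it just wraps the same calls.

```python
import shutil
import subprocess
from pathlib import PureWindowsPath
from typing import List


def build_upload_command(workspace: str, wheel_path: str) -> List[str]:
    """Build the 'az synapse workspace-package upload' command from step 1."""
    return [
        "az", "synapse", "workspace-package", "upload",
        "--workspace-name", workspace,
        "--package", wheel_path,
    ]


def build_attach_command(pool: str, workspace: str, resource_group: str,
                         wheel_path: str) -> List[str]:
    """Build the 'az synapse spark pool update' command from step 2."""
    # The attach step takes only the package file name, not the local path.
    wheel_name = PureWindowsPath(wheel_path).name
    return [
        "az", "synapse", "spark", "pool", "update",
        "--name", pool,
        "--workspace-name", workspace,
        "--resource-group", resource_group,
        "--package-action", "Add",
        "--package", wheel_name,
    ]


def run_commands(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print the commands (dry run) or execute them in order."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
            continue
        # Resolve the real executable (az.cmd on Windows) before running.
        az_path = shutil.which(cmd[0])
        if az_path is None:
            raise FileNotFoundError("Azure CLI ('az') was not found on PATH")
        subprocess.run([az_path] + cmd[1:], check=True)


if __name__ == "__main__":
    wheel = r"C:\Users\Test\Documents\GitHub\TestGitHub\dist\my_etl-0.0.1-py3-none-any.whl"
    run_commands([
        build_upload_command("myWorkSpace", wheel),
        build_attach_command("mySparkPoolName", "myWorkSpace", "myRG", wheel),
    ])
```

Calling `run_commands(..., dry_run=False)` would execute both commands in sequence and stop at the first failure, assuming the Azure CLI is installed and already logged in.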

C. From the storage account that is linked to the Spark pool:

I uploaded the .whl file to the following location:

/mySynapseLinkedStorage/synapse/workspaces/myWorkSpace/sparkpools/mySparkPool/libraries/python/my_etl-0.0.1-py3-none-any.whl

My question is: how do I link this file, which is now in my storage account, to the Spark pool?

Also, is there a faster way to attach the .whl file to the Spark pool?

Thank you!

Sanket Kelkar

Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.