Install Packages on Azure Synapse DEP-enabled workspaces

Abhiram Duvvuru 231 Reputation points Microsoft Employee
2024-06-26T02:17:08.77+00:00

Hi

I have a Synapse workspace with DEP enabled. Since PyPI libraries cannot be installed directly in the Spark pool, could you please advise on how to install the azure-mgmt-kusto, azure-kusto-data, and azure-keyvault-secrets packages on the pool? I couldn't find the corresponding jar files for these packages.

Thanks,

Abhiram

Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.

Accepted answer
  1. phemanth 8,645 Reputation points Microsoft Vendor
    2024-06-26T06:54:19.89+00:00

    @Abhiram Duvvuru

    Welcome to Microsoft Q&A platform and thanks for posting your question.

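    In a Synapse workspace with data exfiltration protection (DEP) enabled, the Spark pool can’t install Python libraries directly from PyPI. The three packages you listed are Python libraries, so there are no jar files to look for; instead, you can build wheel files for them and install those: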

    1. Create a wheel file for each package: On a machine that has internet access, create a wheel (.whl) file for each of the packages (azure-mgmt-kusto, azure-kusto-data, azure-keyvault-secrets). First, install the wheel package using pip:
         pip install wheel
      
      Then, you can create a wheel file for each package:
         pip wheel --wheel-dir=./ azure-mgmt-kusto
         pip wheel --wheel-dir=./ azure-kusto-data
         pip wheel --wheel-dir=./ azure-keyvault-secrets
      
      This creates a .whl file in the current directory for each package and for each of its dependencies. Because a DEP-enabled workspace cannot reach PyPI, the dependency wheels have to be uploaded along with the three packages themselves. Build the wheels with the same Python version as the Spark pool runtime, on Linux, so that any compiled dependencies are compatible.
    2. Upload the wheel files as workspace packages: In Synapse Studio, go to Manage > Workspace packages and upload the .whl files.
    3. Attach the packages to the Spark pool: In Manage > Apache Spark pools, open the Packages settings for your pool, select the uploaded workspace packages, and apply the change.

    Note that the spark.jars.packages configuration option is not suitable here: it expects Maven coordinates for JVM libraries and does not install Python wheels.

    Packages attached at the pool level are available to sessions that start on the pool after the update has been applied, so you do not need to repeat these steps for each Spark session. Added libraries can conflict with versions that are already part of the pool’s runtime, so it’s recommended to test your jobs thoroughly after installing these packages.
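    Since the pool cannot download anything from PyPI, it can help to review the wheel files before uploading them. The following is only a minimal sketch: it parses standard wheel filenames and flags any platform-specific wheel that does not look Linux-compatible. The helper names (parse_wheel_filename, is_linux_compatible, review_wheels) are made up for this example, and it assumes the Spark pool nodes run Linux.

    ```python
    from pathlib import Path
    from typing import List, NamedTuple


    class WheelInfo(NamedTuple):
        name: str
        version: str
        python_tag: str
        abi_tag: str
        platform_tag: str


    def parse_wheel_filename(filename: str) -> WheelInfo:
        """Split a wheel filename into its standard components.

        Wheel filenames follow the pattern
        name-version[-build]-python_tag-abi_tag-platform_tag.whl
        """
        if not filename.endswith(".whl"):
            raise ValueError(f"not a wheel file: {filename}")
        parts = filename[: -len(".whl")].split("-")
        if len(parts) not in (5, 6):
            raise ValueError(f"unexpected wheel filename: {filename}")
        python_tag, abi_tag, platform_tag = parts[-3:]
        return WheelInfo(parts[0], parts[1], python_tag, abi_tag, platform_tag)


    def is_linux_compatible(info: WheelInfo) -> bool:
        """Return True for pure-Python wheels ('any') or Linux platform tags."""
        # A platform tag may be a dot-separated set of several tags.
        tags = info.platform_tag.split(".")
        return any(t == "any" or t.startswith(("manylinux", "linux")) for t in tags)


    def review_wheels(wheel_dir: str) -> List[str]:
        """List wheel files in wheel_dir that look unsuitable for a Linux pool."""
        return sorted(
            p.name
            for p in Path(wheel_dir).glob("*.whl")
            if not is_linux_compatible(parse_wheel_filename(p.name))
        )
    ```

    Running review_wheels("./") in the directory used for pip wheel returns the files that would need to be rebuilt on a Linux machine; an empty list means nothing was flagged. The check only looks at the platform tag, so the Python version still has to be compared against the pool runtime by hand.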
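    After the packages have been installed, a quick check from a notebook cell can confirm that they are visible to the session. This is a minimal sketch using only the Python standard library; the function name installed_versions is made up for this example.

    ```python
    from importlib import metadata
    from typing import Dict, Iterable, Optional


    def installed_versions(distributions: Iterable[str]) -> Dict[str, Optional[str]]:
        """Map each distribution name to its installed version, or None if missing."""
        result: Dict[str, Optional[str]] = {}
        for dist in distributions:
            try:
                result[dist] = metadata.version(dist)
            except metadata.PackageNotFoundError:
                result[dist] = None
        return result


    # The three packages from the question.
    required = ["azure-mgmt-kusto", "azure-kusto-data", "azure-keyvault-secrets"]
    for name, version in installed_versions(required).items():
        print(f"{name}: {version if version else 'NOT INSTALLED'}")
    ```

    Any package reported as NOT INSTALLED points to a wheel that was not attached to the pool, or to a session that started before the pool update finished.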

    For more details, please refer to https://pypi.org/project/azure-mgmt-kusto/

    Hope this helps. Do let us know if you have any further queries.
