How to add the BeautifulSoup library in Azure Databricks

Verma, Manish Kumar 131 Reputation points

Hi all,
How do I add the BeautifulSoup library in Azure Databricks?

I want to run the code below in a PySpark notebook:

from bs4 import BeautifulSoup
import pandas as pd

table = BeautifulSoup(open('C:/age0.html','r').read()).find('table')
df = pd.read_html(table)

Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.

1 answer

  1. PRADEEPCHEEKATLA-MSFT 56,471 Reputation points Microsoft Employee

    Hello @Verma, Manish Kumar ,

    Welcome to the Microsoft Q&A platform.

    To install the BeautifulSoup library on Azure Databricks:

    pip install beautifulsoup4  


    Tested on:

    Databricks Runtime Version 7.0 (includes Apache Spark 3.0.0, Scala 2.12)

    Note: The file must be located in the Databricks File System (DBFS) or in a mounted Azure Storage account. You cannot use a local path such as C:/age0.html in Azure Databricks.
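To illustrate the note above, here is a minimal sketch of reading the first table from an HTML file stored in DBFS. It assumes the file has been uploaded to a location such as /dbfs/FileStore/tables/age0.html (a hypothetical path; adjust it to wherever your file actually lives), that the first table row holds the column names, and that the helper name read_first_table is only illustrative. It uses Python's built-in "html.parser", so no extra parser package is needed.

```python
from bs4 import BeautifulSoup
import pandas as pd


def read_first_table(path):
    """Parse the first <table> in an HTML file into a pandas DataFrame.

    Assumption: the first row of the table contains the column names.
    All cell values are returned as strings.
    """
    with open(path, "r", encoding="utf-8") as f:
        soup = BeautifulSoup(f.read(), "html.parser")

    table = soup.find("table")
    if table is None:
        raise ValueError(f"no <table> found in {path}")

    # Collect the text of every header/data cell, row by row.
    rows = [
        [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
        for tr in table.find_all("tr")
    ]
    rows = [r for r in rows if r]  # drop rows without any cells
    if not rows:
        raise ValueError(f"the first <table> in {path} has no rows")

    header, body = rows[0], rows[1:]
    return pd.DataFrame(body, columns=header)


# Hypothetical DBFS location; on the driver node, DBFS is exposed to
# local file APIs such as open() under the /dbfs/ prefix.
# df = read_first_table("/dbfs/FileStore/tables/age0.html")
```

Note that pd.read_html expects an HTML string, path, or file-like object rather than a BeautifulSoup tag, so passing the result of find('table') directly will not work. If lxml or html5lib is installed on the cluster, pd.read_html(io.StringIO(str(table))) may be a shorter alternative to the manual row loop.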


    Hope this helps. Do let us know if you have any further queries.


    Please click "Accept Answer" and upvote the post that helps you; this can be beneficial to other community members.