Using XML in a Synapse Spark pool

Ljubo Jurkovic 66 Reputation points
2022-07-27T14:10:39.577+00:00

Hi,
Can anybody help with using XML in a Synapse Spark pool with PySpark? I found some articles suggesting that code like this would load the XML into a DataFrame, but I get an error when trying it:
df = spark.read.format("com.databricks.spark.xml").option("rootTag", "Catalog").option("rowTag", "book").load("books.xml")
The error is this:
"java.lang.ClassNotFoundException: Failed to find data source: com.databricks.spark.xml"
Apparently, the package with the com.databricks.spark.xml format can be used in Synapse Analytics, but I don't know what I should list in the requirements.txt file in the Spark configuration to get it loaded.
If somebody can provide the steps it would be greatly appreciated.

LJ


Accepted answer
  1. Martin Cairney 2,241 Reputation points
    2022-07-28T01:20:51.17+00:00

    An alternative to the requirements.txt approach is to upload the JAR file as a workspace package.

    You can download the JAR from here (the com.databricks:spark-xml artifact on Maven Central).

    Once you have that, follow the instructions here (the Synapse documentation on managing workspace packages). Note specifically the step "You can also select additional workspace packages to add Jar or Wheel files to your pool.", which allows you to upload the JAR.
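
    Once the pool has finished applying the package, the original read should resolve the data source. A minimal sketch, assuming the JAR is attached to the pool as a workspace package; the ABFSS path is a placeholder for wherever books.xml actually lives:

        # Minimal sketch: assumes the spark-xml JAR is installed on the pool
        # as a workspace package. The storage path is a placeholder.
        df = (
            spark.read.format("com.databricks.spark.xml")
            .option("rootTag", "Catalog")  # outermost element of the document
            .option("rowTag", "book")      # each <book> element becomes a row
            .load("abfss://<container>@<account>.dfs.core.windows.net/books.xml")
        )
        df.printSchema()
        df.show(5, truncate=False)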


1 additional answer

  1. Ljubo Jurkovic 66 Reputation points
    2022-07-28T15:12:08.967+00:00

    Thanks Martin, this worked.
    I wonder why it doesn't work with requirements.txt. It could be that I didn't list the library name correctly, or it could be that requirements.txt only installs Python packages with pip, while spark-xml is a JVM library published to Maven, so pip has nowhere to install it from.
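
    For a session-scoped alternative that avoids the pool-level package entirely, a sketch, assuming the pool has outbound access to Maven Central (the version coordinate below is an example, not verified against this workspace):

        %%configure -f
        {
            "conf": {
                "spark.jars.packages": "com.databricks:spark-xml_2.12:0.15.0"
            }
        }

    Run this in the first cell, before the Spark session starts; Spark then pulls the Maven artifact itself rather than relying on pip.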
