Hi @Poornima Sreedhar,
Welcome to the Microsoft Q&A forum, and thanks for your query.
Databricks maintains an XML data source for Spark on GitHub: databricks/spark-xml (XML data source for Spark SQL and DataFrames). You can use this library on Synapse Spark.
The library is compatible with Spark 3.0 and later on Scala 2.12, and with Spark 3.2 and later on Scala 2.12 or 2.13. Support for Scala 2.11 and Spark 2 ended with version 0.13.0.
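As a rough sketch of what reading XML with this library could look like in a PySpark notebook: the helper name read_xml, the storage path, and the "book" row tag below are placeholders I made up for illustration, and the snippet assumes the spark-xml package has already been attached to your Spark pool.

```python
def read_xml(spark, path, row_tag):
    """Read XML files into a DataFrame using the spark-xml data source.

    `spark` is an existing SparkSession (Synapse notebooks provide one
    as `spark`). `row_tag` is the XML element that should become one row.
    """
    return (
        spark.read.format("com.databricks.spark.xml")
        .option("rowTag", row_tag)
        .load(path)
    )


# Hypothetical usage in a notebook; replace the path and row tag with your own:
# df = read_xml(spark, "abfss://<container>@<account>.dfs.core.windows.net/xml/", "book")
# df.printSchema()
```

The rowTag option tells the reader which element to treat as a record, so pick the element that repeats once per logical row in your files.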
Alternatively, you can parse the XML yourself in Python, Scala, or C# and load the result into a DataFrame, or implement a UDF that explodes the XML into rows.
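If you go the manual route in Python, a minimal sketch could look like the following. It uses only the standard library; the sample XML, the xml_to_rows helper, and the "book" tag are invented for illustration, and it assumes a flat record structure (attributes plus one level of child elements).

```python
import xml.etree.ElementTree as ET

# Made-up sample document; in practice you would read this text from your files.
SAMPLE_XML = """<catalog>
  <book id="bk101"><title>XML Basics</title><price>29.99</price></book>
  <book id="bk102"><title>Spark Notes</title><price>15.50</price></book>
</catalog>"""


def xml_to_rows(xml_text, row_tag):
    """Turn every <row_tag> element into a dict of its attributes and child values."""
    root = ET.fromstring(xml_text)
    rows = []
    for elem in root.iter(row_tag):
        row = dict(elem.attrib)          # element attributes become columns
        for child in elem:
            row[child.tag] = child.text  # direct children become columns (as strings)
        rows.append(row)
    return rows


rows = xml_to_rows(SAMPLE_XML, "book")
print(rows)

# In a Spark notebook, the rows could then be loaded into a DataFrame, e.g.:
# df = spark.createDataFrame(rows)
```

All values come back as strings here, so you would still need to cast columns such as price after creating the DataFrame. Nested or repeated child elements would need extra handling.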
Here is a thread where a user shared an example of how they used it: "synapse spark pool - pyspark load a subset of xml files from given folder".
Hope this info helps.