Thanks for using the Microsoft Q&A platform and for posting your query here.
I understand that you want to convert Excel/CSV files to XML using Azure Data Factory (ADF).
- You can use an Azure Function to run Python code that converts the Excel/CSV file to XML, and call that function from an ADF pipeline.
Kindly refer to the resources below:
- https://www.geeksforgeeks.org/how-to-convert-excel-to-xml-format-in-python/
- https://blog.groupdocs.cloud/conversion/convert-xml-to-excel-and-excel-to-xml-in-python/
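As a rough illustration of the Azure Function option, here is a minimal sketch of the conversion logic only. It uses just the Python standard library. The function name csv_to_xml, the tag names, and the sample data are hypothetical, and the Azure Function trigger and the blob read/write are left out. It also assumes every CSV header is already a valid XML element name.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_xml(csv_text, root_tag="rows", row_tag="row"):
    """Convert CSV text (first line = header) into an XML string.

    Hypothetical helper for illustration; not part of any Azure SDK.
    """
    root = ET.Element(root_tag)
    reader = csv.DictReader(io.StringIO(csv_text))
    for record in reader:
        row = ET.SubElement(root, row_tag)
        for column, value in record.items():
            # Assumption: each header is a valid XML element name
            # (no spaces, does not start with a digit).
            ET.SubElement(row, column).text = value
    # encoding="unicode" returns a str without an XML declaration
    return ET.tostring(root, encoding="unicode")


# Made-up sample data, one element per CSV column
sample = "id,author\n1,Jane\n2,Ravi\n"
xml_output = csv_to_xml(sample, root_tag="books", row_tag="book")
print(xml_output)
```

For Excel input, the same loop could run over rows loaded with pandas or openpyxl instead of csv.DictReader.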
- You can write PySpark code in an Azure Synapse notebook.
- To read data from an Excel file:
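import pandas as pd

# Storage account key used to authenticate against the ADLS Gen2 account
account_key_value = "your_storage_acc_key"
# Read the workbook into a pandas DataFrame
df = pd.read_excel('abfs://******@jadls2.dfs.core.windows.net/xl_files/sample.xlsx', storage_options={'account_key': account_key_value})
# Convert to a Spark DataFrame and preview it in the notebook
s_df = spark.createDataFrame(df)
display(s_df)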
Instead of the account key, you can also use one of these options:
storage_options = {'sas_token' : 'sas_token_value'}
storage_options = {'connection_string' : 'connection_string_value'}
storage_options = {'tenant_id': 'tenant_id_value', 'client_id' : 'client_id_value', 'client_secret': 'client_secret_value'}
- To write data to an XML file:
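# Read an existing XML file; each <book> element becomes one row
df = spark.read.format('xml').options(rowTag='book').load('books.xml')
# Write the selected columns back out as XML, with <books> as the root element
(df.select("author", "_id").write
    .options(rowTag='book', rootTag='books')
    .xml('newbooks.xml')
)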
Here is a resource you can refer to:
https://learn.microsoft.com/en-us/azure/databricks/query/formats/xml
Hope this helps. Kindly accept the answer by clicking the Accept answer button. Thank you.