Reading unstructured Excel files (xlsm, xlsx) from Azure Blob Storage and transforming them to a structured format via Databricks

Rakesh Reddy Badam 1 Reputation point
2022-04-21T11:31:27.927+00:00

Hi all, I have a requirement to read unstructured data from Excel files (xlsm, xlsx) stored in Azure Blob Storage. How can I read the Excel files and transform the data into a structured format using Databricks?

Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.

1 answer

  1. PRADEEPCHEEKATLA 90,651 Reputation points Moderator
    2022-04-22T07:42:03.757+00:00

    Hello @Rakesh Reddy Badam,

    Thanks for the question and for using the MS Q&A platform.

    Excel files come in several variations, starting with xls and xlsx; a workbook that contains macros is saved as xlsm, and so on.

    If the format becomes an issue, it might be easier to save a copy of the Excel file as CSV, read it into a DataFrame, process it, and write it back with the desired changes.
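As a rough illustration of that CSV route, here is a minimal sketch using only the Python standard library. It assumes the exported sheet has a few title or blank rows, then one header row, then data rows; the function names, the column names and the sample text are all hypothetical. On Databricks, the cleaned CSV could then presumably be loaded with something like `spark.read.csv(path, header=True)`.

```python
import csv
import io


def rows_to_records(csv_text, expected_columns):
    """Turn a loosely laid-out CSV export into a list of dicts.

    Assumption: title/blank rows come first, then a single header row that
    contains every name in expected_columns, then the data rows.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    # Locate the header: the first row containing all expected column names.
    header_idx = next(
        (i for i, row in enumerate(rows)
         if set(expected_columns) <= {cell.strip() for cell in row}),
        None,
    )
    if header_idx is None:
        raise ValueError("header row not found")
    header = [cell.strip() for cell in rows[header_idx]]

    records = []
    for row in rows[header_idx + 1:]:
        # Skip spacer rows that are empty or contain only blanks.
        if not any(cell.strip() for cell in row):
            continue
        record = {name: value.strip() for name, value in zip(header, row) if name}
        # Keep only the columns we care about, in a fixed order.
        records.append({col: record.get(col, "") for col in expected_columns})
    return records


def records_to_csv(records, columns):
    """Write the structured records back out as clean CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


# Hypothetical export of a messy sheet (title row, spacer rows, padded headers).
SAMPLE_EXPORT = """Quarterly report,,
,,
 region , units ,notes
North,10,ok
,,
South,7,
"""

records = rows_to_records(SAMPLE_EXPORT, ["region", "units"])
clean_csv = records_to_csv(records, ["region", "units"])
```

Locating the header by its column names, not by a fixed row number, is one way to cope with sheets whose layout shifts between files; real workbooks may need more handling (merged cells, multiple tables per sheet).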

    This SO thread suggests an approach that works for xlsx; however, I am not sure whether it applies to all xlsx files or to other Excel file types.

    Also, refer to Handling Excel Data in Azure Databricks.

    Hope this helps. Please let us know if you have any further queries.
