Synapse how to read excel with formula

Tamashevich, Tatsiana 40 Reputation points
2024-06-27T08:57:56.29+00:00

Hello,

in Synapse spark pool we need to read excel file that contains formula inside of cells (sometimes it is simple sum, sometimes complicated formula, sometimes it is reference to another spread sheet and cell).

Unfortunately, we observe that some formulas are eveluated correctly and some are not.

For instance, if cell has formula: =SUM(A2:A10), it does not take sum value as we see it in excel, it takes only value from A2.

So far, I have tried different approached to read excel file (All of them produced same result: instead of taking value as sum of A2:A10, only value from A2 is taken):

  • pandas
  • python openpyxl
  • com.crealytics.spark.excel

Would really appreciate if someone can give a hint how to resolve it.

Thank you very much in advance!

Azure Synapse Analytics
Azure Synapse Analytics
An Azure analytics service that brings together data integration, enterprise data warehousing, and big data analytics. Previously known as Azure SQL Data Warehouse.
4,597 questions
Excel
Excel
A family of Microsoft spreadsheet software with tools for analyzing, charting, and communicating data.
1,638 questions
{count} votes

1 answer

Sort by: Most helpful
  1. Sina Salam 6,341 Reputation points
    2024-06-27T11:20:50.2666667+00:00

    Hello Tamashevich, Tatsiana,

    Welcome to the Microsoft Q&A and thank you for posting your questions here.

    Problem

    I understand that you are having issue when reading Excel files containing formulas into a Synapse Spark pool, the formulas are not being evaluated correctly. Instead of returning the calculated results of the formulas, only partial or incorrect values are retrieved, and this issue persists across different libraries and methods used for reading the Excel files.

    Solution

    To solve this issue, you will need to understand that not all libraries evaluate the formulas correctly, and reading Excel files with formulas in Synapse Spark pool can be tricky. So, what come to my mind are the followings:

    1. You will need to evaluate formulas in Excel Before Reading.
    2. Use xlwings to Evaluate Formulas. https://docs.xlwings.org/en/stable/macro.html
    3. Use com.crealytics.spark.excel with Formula Evaluation.
    4. Lastly, use VBA Script to Pre-evaluate Formulas. https://docs.microsoft.com/en-us/office/vba/api/excel.workbook.saveas

    Accept Answer

    I hope this is helpful! Do not hesitate to let me know if you have any other questions.

    ** Please don't forget to close up the thread here by upvoting and accept it as an answer if it is helpful ** so that others in the community facing similar issues can easily find the solution.

    Best Regards,

    Sina Salam