Excel on Azure

 

I amended my open-source CsvTools with an Excel reader. Once I read the excel worksheet into a datatable, I can use all the data table operators from the core CsvTools, including enumeration, Linq over the rows, analysis, mutation, and saving back out as a CSV. So this gives be a Linq-to-Excel on Azure experience, which ought to win a buzzword bingo contest!

The excel reader uses the OpenXml SDK, and so it can run on Azure.  This is useful because Excel as a COM-object doesn’t run on servers, and so I couldn’t upload excel files to my ASP.Net projects without really fighting the security settings. With OpenXml, it’s easy since you’re just reading XML.

Here’s a little azure MVC test page that demonstrates uploading a xlsx file and displaying the contents in azure:

(side note: deploying MVC to Azure is super easy, courtesy of this great tutorial).

I also need to give a shout-out for Nuget! The dependency management here was great. I have one Nuget package for the core CsvTools (which is just the CSV reader with no dependencies) , and another package CsvTools.Excel (which has a dependency on CsvTools and the OpenXml SDK).

The excel reader is an extension method exposed off “DataTable.New”, so it’s easily discoverable.

Here’s a sample excel sheet, foo.xlsx:

image

And then the code to read it from C#:

 private static void TestExcel()
{
    var dt = DataTable.New.ReadExcel(@"c:\temp\foo.xlsx");
    var names = from row in dt.Rows where int.Parse(row["age"]) > 10 select row["Name"];
    foreach (var name in names)
    {
        Console.WriteLine(name);
    }            
}

This example just reads the first worksheet in the workbook, which is the common case for my usage scenarios where people are using excel as a CSV format.  It prints:

 EdJohn

There  are also some other overloads to give the whole list of worksheets.

 public static IList<MutableDataTable> ReadExcelAllSheets(this DataTableBuilder builder, string filename);
public static IList<MutableDataTable> ReadExcelAllSheets(this DataTableBuilder builder, Stream input);
  

The reader is intended for Excel workbooks that represent tabular data and is not hardened against weird or malformed input.

Anyway, I’m finding this useful for some experiments, and sharing in case somebody else finds it useful too. 

(Now I just need to throw in a WebAPI parameter binding for DataTables, use WebAPI’s query string support, and add some data table Azure helpers and I will be the buzzword bingo champion!)

Comments

  • Anonymous
    April 26, 2012
    The comment has been removed

  • Anonymous
    May 27, 2012
    The comment has been removed

  • Anonymous
    June 04, 2012
    @NuGet User Hi, the version of NuGet that Web Pages Administration uses is really old and this package uses features that require a newer version of it. The easiest way to get up to speed with the newest version of WebPages Administration (and NuGet) would be to get the latest version of WebMatrix. (www.microsoft.com/.../betafeatures.aspx) It's still in beta but it should work perfectly well. Plus it has better support for managing packages from inside the editor.

  • Anonymous
    June 14, 2012
    @Pranav I got round the issue by using the package manager in Visual Web Developer 2010 Express, rather that moving to the beta version of WebMatrix.

  • Anonymous
    June 14, 2012
    Hi Mike There's an issue with the CellToText function in the Excel extensions. theCell.CellValue can be null (if a cell is empty), but the return from the function in this case is theCell.CellValue.Text which causes a null reference exception.

  • Anonymous
    July 12, 2012
    @ Nugetuser - thanks for reporting that. I submited a fix and updated the nuget package.