Invoking Native Excel UDFs From Managed Code (Pt1)

As part of a little ongoing investigation, I wanted to communicate between a managed add-in and a native XLL UDF. As I was writing this up, it turned out to be a much longer post that I planned, so I've split it into two parts (I'll post the second part in the next week or so, mince pies permitting).

Obviously, you can build managed code to invoke methods exposed by the Excel COM object model. The question is, can you build managed code – in an add-in, say – to invoke unmanaged UDFs (which are not part of the COM OM). More specifically, can your managed code invoke unmanaged UDFs exposed not by a COM-based automation add-in but by a regular XLL (which is really little more than a regular native Windows DLL)?

Well, obviously, since I'm blogging it, the answer is "Yes". There are several techniques you could use, and of these, one or two that are actually practical:

· If the XLL additionally implements some COM interface, we could then use COM interop from the managed client to invoke UDFs either using a static typelib/interop assembly or using runtime reflection. We could get the XLL COM object from the Excel.AddIns collection. However, requiring an XLL to implement a COM interface is a pretty unnatural act, so we won't consider that further.

· The Excel C API includes the xlUDF identifer for the Excel4/Excel12 function: this is designed specifically to invoke UDFs. However, to use this in a managed client would require p/invoke to the Excel4/Excel12 function in xlcall32.lib. We could do this using DllImport, but this requires a DLL as its parameter, which would therefore require you to build xlcall32.lib into a DLL, so this is pretty unnatural also – and we won't consider this further either.

· The managed client could use the Excel COM OM, specifically Application.Run – which can be used to invoke any VBA macro or UDF. The benefits of this approach include that it is documented and supported, and that because it uses COM interop to talk to Excel, parameter and return type marshaling is done for you. The disadvantage is that because it uses COM interop, there is a perf overhead.

· Another option is to p/invoke to the Windows API functions LoadLibrary and GetProcAddress to get a direct pointer to the UDF. LoadLibrary will use the already-loaded XLL if it is already mapped into the process. We can then use Marshal.GetDelegateForFunctionPointer to map the UDF pointer returned by GetProcAddress to a managed delegate. The disadvantage of this is that we have to do all the marshaling manually. The advantage is that avoiding the COM interop layer may give us a perf boost.

Really, it was the last question that interested me: is it worth doing all the manual marshaling in order to avoid the COM interop layer? (Of course, this is actually a general question that also applies outside the context of Office programmability.) Also, how interesting is that marshaling, given that the Excel C API likes to use XLOPERs for parameter and return types?

To continue this investigation, I first created a simple native XLL, following the guidelines in the updated Excel 2007 XLL SDK. See my previous post for details of how to create a native XLL with VS 2008.

Next, I created a VSTO Excel 2007 add-in. I added a Ribbon (XML) item. I used a Ribbon (XML) item as opposed to a Ribbon (Visual Designer) item, because while the VSTO Ribbon visual designer is a wonderful RAD tool, I sometimes prefer to work with the XML directly and avoid the overhead of the VSTO Ribbon OM. In particular, it's easier to show the complete code in a blog post for an XML-based Ribbon than it is for a VSTO OM-based Ribbon. Here's my Ribbon XML:

<?xml version="1.0" encoding="UTF-8"?>

<customUI xmlns="https://schemas.microsoft.com/office/2006/01/customui" onLoad="Ribbon_Load" >

  <ribbon>

    <tabs>

      <tab idMso="TabAddIns">

        <group id="groupVstoInvoker"

               label="VstoInvoker">

          <button id="buttonComOm"

                  label="COM OM"

                  imageMso="AdpStoredProcedureQueryMakeTable"

                  size="large"

                  onAction="OnButtonComOm"/>

          <button id="buttonWinApi"

                  label="Win API"

                  imageMso="AdpStoredProcedureQueryUpdate"

                  size="large"

                  onAction="OnButtonWinApi"/>

          <editBox id="editBoxRepeat"

                   label="Repeat"

                   sizeString="wwwww"

                   onChange="OnEditBoxRepeat"/>

        </group>

      </tab>

    </tabs>

  </ribbon>

</customUI>

The screenshot below shows how it's rendered at runtime. There are 2 buttons, one for invoking a UDF via the Excel COM OM and a second button for invoking a UDF via the Windows API. The "Repeat" editbox allows the user to specify the number of times to invoke the UDF repeatedly (defaults to 1).

The various Ribbon event handlers are listed below. The onAction handler for the Win API button is currently just stubbed out (I'll implement this in Pt.2). The editbox onChange handler converts the editbox text to an integer and caches it as the UDF invocation repeat count. The onAction handler for the COM OM button does the work to invoke the target UDF via Excel's exposed COM OM. The UDF invocation deliberately includes a call to a UDF that takes and returns a simple type (char*) as well as one that uses an XLOPER.

private int repeatCount = 1;

public void OnEditBoxRepeat(Office.IRibbonControl control, string text)

{

    try

    {

        repeatCount = Convert.ToInt32(text);

    }

    catch (Exception ex)

    {

        Debug.WriteLine(ex.ToString());

    }

    if (repeatCount == 0)

    {

        repeatCount = 1;

    }

}

public void OnButtonComOm(Office.IRibbonControl control)

{

    for (int i = 0; i < repeatCount; i++)

    {

        try

        {

            // Call the GetChars UDF, which returns a simple char*.

            object m = Type.Missing;

            object o1 = Globals.ThisAddIn.Application.Run(

                "GetChars",

                m, m, m, m, m, m, m, m, m, m, m, m, m, m, m,

                m, m, m, m, m, m, m, m, m, m, m, m, m, m, m);

            // Call GetString (which returns an XLOPER).

            object o2 = Globals.ThisAddIn.Application.Run(

                "GetString",

                m, m, m, m, m, m, m, m, m, m, m, m, m, m, m,

                m, m, m, m, m, m, m, m, m, m, m, m, m, m, m);

            Globals.ThisAddIn.Application.ActiveCell.Value2 =

                String.Format("{0}, {1}", o1, o2);

        }

        catch (Exception ex)

        {

            Debug.WriteLine(ex.ToString());

        }

    }

}

As you can see, I can call both GetChars and GetString UDFs in the same way, even though one returns a double and the other an XLOPER – I don't have to worry about converting between XLOPERs and managed types, because this is all done for me by the COM interop layer. When I get the return value from the UDF, I'm simply setting it into the current active cell. So, this approach is pretty simple from the developer's perspective. That's one advantage. The other aspect I wanted to check was performance, so I layered on some simple timing code using the System.Diagnostics.Stopwatch class, as shown below.

private double msecPerTick = 1000.0 / Stopwatch.Frequency;

public void OnButtonComOm(Office.IRibbonControl control)

{

    long ticksAccumulated = 0;

    Stopwatch watch;

    for (int i = 0; i < repeatCount; i++)

    {

        // Perf-count this method call.

        watch = Stopwatch.StartNew();

        try

        {

            // Call the UDFs.

        }

        catch (Exception ex)

        {

            Debug.WriteLine(ex.ToString());

        }

        // Keep a running total of the elapsed call time.

        ticksAccumulated += watch.ElapsedTicks;

    }

    // Calculate and report the average elapsed time.

    Debug.WriteLine(String.Format(

        "Application.Run: average = {0} msec",

        (ticksAccumulated * msecPerTick) / repeatCount));

}

Note that I'm including the cell access operation as part of the timing instead of restricting the timing to the raw UDF call alone, because this is a fairly likely scenario, and I wanted to make sure that I included it so as not to miss any potential context switch that might take place between calling the UDF and accessing the Excel UI through the OM. To view the perf timing output, I used DebugView, downloadable here. Over a large number of invocations (10,000), repeated over 100 runs, calling my UDFs via Excel's COM OM averaged out at 1.9625 msec (on my HP Compaq NC8430 notebook, Centrino duo T7400 2.16 GHz, 3.37 Gb RAM, running Vista Ultimate 64bit). Next, I'll look at how to do the job using the Windows API.