Chapter 10: Client Support For Server UDFs (Beginning Excel Services)

This article is an excerpt from Professional Excel Services by ** from Wrox (ISBN 978-0-470-10486-6, copyright Wrox 2007, all rights reserved). No part of these chapters may be reproduced, stored in a retrieval system, or transmitted in any form or by any means—electronic, electrostatic, mechanical, photocopying, recording, or otherwise—without the prior written permission of the publisher, except in the case of brief quotations embodied in critical articles or reviews.

Chapter 8 introduced UDFs on the server. At the beginning of the chapter it was shown how the #NAME error is displayed inside Excel Workbooks. This happens because Excel does not natively support managed UDFs written for the server, which can have a negative impact on people authoring workbooks for the server. Because of this, it is sometimes useful to make server UDFs work on the client.

This chapter discusses two possible ways to add client support to UDFs written for the server. The first way involves adding functionality to the UDF code so that it will be usable on the server. After that, you learn how to develop a generic solution that can be used to utilize server UDFs on the client without any modification to their code. Finally, this chapter will suggest ways to allow UDFs to distinguish when they are executed on the client or on the server.

Contents

  • Modifying UDFs to Run on the Client

  • Add COM Support to a UDF Assembly

  • Registering the Assembly

  • Using the UDF inside Excel Client

  • Adding Client Support for GetMultiStockQuote

  • Conclusion

  • Generic Client UDF Support

  • Configuration Application

  • UdfProxy

  • CUdfAutomation

  • Final Steps

  • Improvements

  • Other Generic Solutions

  • Summary

Modifying UDFs to Run on the Client

Chapter 9 enumerated some of the mechanisms that exist in Excel for augmenting the list of available UDFs. This section will show how to add functionality to a server UDF to make it compliant with one of these mechanisms.

The extensibility point of choice for this will be Automation Servers. These have been around for a few versions now and are very simple to work with. They make use of COM registration for the discovery mechanism. Any COM class that is registered and has the Programmable key will be an automation server usable by Excel. The first time a user types a formula into a cell, Excel will instantiate the automation servers it uses, getting back a pointer to the IDispatch interface supported by the object. Excel will then try to search inside all created automation servers for each UDF that is typed in. When such a UDF is found, Excel will call into it and return the value into the cell.

The server UDF assembly that will be converted to work on the client will comprise two methods that were discussed in "Chapter 8: the GetLastWord() UDF and the GetMultiStockQuote() UDF."

Add COM Support to a UDF Assembly

Adding COM support to an assembly is easy. The appropriate attributes are added to the class or methods that need to be exposed to COM clients and .NET will do the rest.

In this example, the class is called MyUdfClass and originally looks like this (before modifying it for COM):

Note

The System.Runtime.InteropServices namespace is added here for the COM attributes that will be used from it. The Microsoft.Win32 namespace will be used later to write some entries in the registry using the Registry class.

using System;
using System.Collections.Generic;
using System.Text;
using System.Runtime.InteropServices;

using Microsoft.Win32;

using Microsoft.Office.Excel.Server.Udf;

namespace UdfWithClientSupport
{
     [UdfClass]
     public class MyUdfClass
     {
          [UdfMethod]
          public string GetLastWord(string phrase)
          {
               int lastSpace = phrase.Trim().LastIndexOf(" ");
               if (lastSpace == -1)
                    return phrase;

               string result = phrase.Substring(lastSpace + 1);
               return result;
          }
     }
}

Multiple attributes need to be added to tell .NET how this class is to be used through COM. The final class definition becomes:

[UdfClass]
[ProgId("UdfWithClientSupport.MyUdfClass")]
[ClassInterface(ClassInterfaceType.AutoDual)]
[Guid(MyUdfClass.Guid)]
[ComVisible(true)]
public class MyUdfClass
{
     public const string Guid = "<Generated GUID goes here>";
     // ...
}

The ProgId and Guid attributes are not strictly needed, but they make the class more predictable. The ClassInterface attribute is used to instruct .NET about what type of IDispatch implementation is expected. The ComVisible attribute implies that this class is exposed via COM.

Note

Excel expects COM servers used for UDFs to support a minimal set of interfaces—namely, IDispatch and ITypeInfo. Using the ClassInterfaceType.AutoDispatch as a parameter to the ClassInterface attribute would cause Excel to not recognize the functions that can be called on the object.

Adding the Programmable Key

As explained before, a COM object needs to have the Programmable key for Excel to be able to use a COM object as a UDF.

To do this, the class needs to execute custom code when it is being registered as a COM class. The .NET Framework supplies this functionality by allowing classes to have a method with an attribute called ComRegisterFunction. When this attribute is present on a method, .NET will make sure it gets called during the registration process. Similarly, the class will need to delete that key when it's being unregistered:

[ComRegisterFunction]
public static void RegistrationMethod(Type type)
{
     // Only add stuff to the registration 
     // if it’s this class that’s being registered.
     if (typeof(MyUdfClass) != type)
     {
          return;
     }

     // Add "Programmable" under our key.
     RegistryKey key = Registry.ClassesRoot.CreateSubKey(
          "CLSID\\{" + Guid + "}\\Programmable");
     key.Close();
}

[ComUnregisterFunction]
public static void UnregisterationMethod(Type type)
{
     // Only add stuff to the registration 
     // if it's this class that’s being registered.
     if (typeof(MyUdfClass) != type)
     {
          return;
     }

     // Add "Programmable" under our key.
     Registry.ClassesRoot.DeleteSubKey("CLSID\\{" + Guid + "}\\Programmable");
}

Now that this class is done, the next step is to register the assembly.

Registering the Assembly

.NET supplies a tool for registering assemblies as COM libraries. The tool is called RegAsm.exe and is located deep inside the Windows folder. The following command needs to be executed to register the class:

%windir%\Microsoft.NET\Framework\v2.0.50727\RegAsm.exe /codebase UDFWithClientSupport.dll

Executing this command will show a warning but will register the library all the same.

Note   It is not recommended to use the /codebase option with assemblies that are not strongly named. For production code, it is highly recommended that developers sign their assemblies.

During development, it is possible to use the Visual Studio 2005 project properties to get the assembly to automatically register.

Using the UDF inside Excel Client

Now that everything is ready, Excel needs to be told to actually use it. This is done by using the Addins section in the Excel Options dialog and clicking Go at its bottom where the combo box displays the Excel Add-ins text (see Figure 10-1).

Figure 10-1

Figure 10-1

Note

In the new Excel 2007, the Options dialog is hiding in the well-concealed File menu. The File menu is the circular office button at the top-left side of the Excel app window. Once this menu is opened, the options button is located at its bottom. This should bring up the Addins dialog box. In there, one of the buttons is called Automation and will bring up another dialog box, shown in Figure 10-2, which lists all the available automation servers in the system.

Clicking OK once the UDF automation server is checked will display an error saying that mscoree was not found. This error is ignorable, selecting No will dismiss it.

Figure 10-2

Figure 10-1

Using GetLastWord in a Formula

As shown in Figure 10-3, everything is lined up so that the UDF can be used in Excel.

Adding Client Support for GetMultiStockQuote

While modifying the GetLastWord() UDF to work on the client was relatively easy, the changes needed to make GetMultiStockQuote() work are somewhat more complex.

Excel 2007 does not recognize an object array as a valid parameter, so just calling the method like GetLastWord() does not work. Instead, a new method will be added that will be used by Excel client. This method will not have the UdfMethod attribute—Excel Services will not know about it.

The parameter this new method will receive will be an object instead of the object array that the server method expects. Excel will place one of two values inside this parameter. In cases where the parameter passed is a constant or the result of another UDF, Excel will pass in an object that represents that value.

Note

In the case of values such as doubles or Booleans, the value will be boxed inside the object parameter. In the case of a string, the object will be castable to a string.

When the parameter is a reference to a cell or a range, Excel will pass in a Range COM object. The easiest way to use this object is to reference the Excel 2007 PIA and cast the object and use it.

Figure 10-3

Figure 10-3

Note

PIA stands for Primary Interop Assembly and is a standardized way for software to expose COM interfaces to managed applications.

That way, however, requires that the assembly take a dependency on the Excel 2007 PIA. Instead, it is possible to access the Range object using a mechanism similar to reflection. Adding the ComVisible attribute set to false to the original GetMultiStockQuote() UDF ensures that Excel will not be able to call it:

[UdfMethod(IsVolatile = true)]
[ComVisible(false)]
public object[,] GetMultiStockQuote(object[,] symbols)
{
     // ... This code is exactly the same as the one in Chapter 8.
}

public object[,] GetMultiStockQuote(object symbols)
{
     // Get the Value2 property from the object.
     Type type = symbols.GetType();
     object[,] result = null;
     if (type.IsCOMObject)
     {
          object[,] range = (object[,])type.InvokeMember("Value2", 
               System.Reflection.BindingFlags.Instance | 
               System.Reflection.BindingFlags.Public | 
               System.Reflection.BindingFlags.GetProperty, 
               null, 
               symbols, 
               null);
          int rows = range.GetLength(0);
          int columns = range.GetLength(1);
          object[,] param = new object[rows, columns];
          Array.Copy(range, param, rows * columns);
          result = GetMultiStockQuote(param);
     }
     
     if (result == null)
     {
          result = new object[0, 0];
     }
     return result;
}

The new GetMultiStockQuote() method first uses the Invoke() method on the type to get back the Value2 property of the Range object. This will return an array from the COM API of Excel. Next, it allocates an array of the same size and copies the contents of the fetched array into it. Finally, it sets the resulting array into the local variable that will ultimately be returned to Excel.

Note

There's a seemingly unneeded copy operation in the code—the part that copies the array returned from invoking Value2 into another managed array. The Excel OM always returns arrays that are 1-based. While .NET languages know how to work with such arrays, it is not the "standard" behavior. If the original GetMultiStockQuote() method knew how to handle 1-based arrays, it would be possible to pass the range local variable into the original GetMultiStockQuote() method.

Conclusion

Making Excel Services UDFs work on the client is not a hard process. It is pretty simple and requires only a couple of steps per assembly. Some methods, as seen in the GetMultiStockQuote() example, may require some extra attention though.

The next section shows a generic solution that will allow server UDFs to work on the client with no modifications. This will keep the UDFs clean and will require significantly less code.

Generic Client UDF Support

The following solution mimics the way server UDFs work by creating a component that acts as a shim between Excel's way of doing UDFs and the server's way of doing it.

There are three parts to this solution. First, there's a simple application that configures what UDF assemblies will be used on the client (similar to the Excel Services UDF configuration page). Second, there's a component that behaves like an automation server that can be added to Excel (called ClientManagedUdfs). This component will be written using C++/CLI. Last, there's a component that is responsible for loading the UDF assemblies and invoking them which will be written in C# — its name is UdfProxy.

Note

C++/CLI is the new name for Managed C++ or Managed Extensions for C++.

The assembly that loads and invokes the UDF code could have been written as part of the C++/CLI assembly. However, since C++/CLI is still somewhat harder to debug and is actually more verbose than C# in many cases, I usually try to minimize the amount of code written in it.

Figure 10-4 shows the CUDFAutomation addin that is loaded by Excel and is used as any other automation server would be. It exposes an IDispatch interface that is dynamically built on startup. Each method in the IDispatch interface is wired to a method in one of the UDF assemblies the addin is configured to load. When CUDFAutomation detects that a call is made to one of these methods, it does the necessary work to call the correct method with the correct parameters.

Excel calls CUDFAutomation via standard COM interfaces. CUDFAutomation calls into the assembly via managed reflection. The interop occurs inside the implementation of IDispatch.

Figure 10-4

Figure 10-4

Configuration Application

The configuration application contains a single form with a ListView control. It allows the user to add or remove assemblies from the list, as shown in Figure 10-4.

When the configuration application saves the setting, it places it in the following directory:

%AppData%\ClientManagedUdf.settings

This file contains XML that encapsulates the configuration. For the case depicted in Figure 10-5, the XML will look like this:

<?xml version="1.0"?>
<Config xmlns:xsi=”http://www.w3.org/2001/XMLSchema-instance” 
xmlns:xsd=”http://www.w3.org/2001/XMLSchema”>
  <UdfAssemblyCollection>
    <UdfAssembly>
      <Name>C:\Temp\udfs\SampleUdf.dll</Name>
    </UdfAssembly>
    <UdfAssembly>
      <Name>X:\Book\Code Samples\UDFExample\UDFExample\bin\Debug\UDFExample.dll</Name>
    </UdfAssembly>
  </UdfAssemblyCollection>
</Config>

The configuration application uses the classes in the UdfProxy assembly to save this information—the actual logic inside the configurator utility is very simple.

Figure 10-5

Figure 10-5

UdfProxy

The UdfProxy assembly takes care of loading assemblies, iterating them to find the classes and methods that are actual UDFs. It also takes care of other details such as saving and loading the list of assemblies in the user profile.

Configuration Classes

The names of the assemblies that will be loaded as UDFs are managed by three classes: UdfAssembly, UdfAssemblyCollection, and Config.

UdfAssemblyCollection wraps a List<> instance that contains UdfAssembly instances.

public class UdfAssemblyCollection : IEnumerable<UdfAssembly>
{
     private List<UdfAssembly> m_assemblies = new List<UdfAssembly>();
     //...
}

The class exposes some very basic collection members such as Count, GetEnumerator (so it can be used in foreach statements), an indexer and add/remove methods.

UdfAssembly contains the name of an assembly that will be searched for viable UDFs:

public class UdfAssembly
{
     private string m_name;

     public string Name
     {
          get { return m_name; }
          set { m_name = value; }
     }
}

The Config classes make use of the UdfAssemblyCollection class. It also gives the caller save and load functionality:

public class Config
{
     private static Config m_current;
     private UdfAssemblyCollection m_assemblyCollection = new 
UdfAssemblyCollection();

     private static string ConfigFileName
     {
          get 
          { 
               return Path.Combine(
                    Environment.GetFolderPath(
                         Environment.SpecialFolder.ApplicationData),
                     "ClientManagedUdf.settings"); 
          }
     }

     public static Config Current
     {
          get
          {
               if (m_current == null)
               {
                    m_current = new Config();
                    m_current.Load();
               }
               return m_current;
          }
     }
// …

Use the Current property to access the Config instance, which checks to see if the configuration instance is already created. If it's not, it will create it.

The ConfigFileName property returns the name of the settings file.

Note

This pattern of configuration class will be repeated in the book for client applications.

Loading and saving of the instance is done by using the XmlSerializer class. This class takes care of everything, including the names of the nodes in the XML.

The methods for saving and loading are:

private  ()
{
     XmlSerializer s = new XmlSerializer(typeof(Config));
     return s;
}

private void Load()
{
     if (File.Exists(ConfigFileName))
     {
          try
          {
               using (FileStream s = File.Open(ConfigFileName, FileMode.Open))
               {
                    m_current = (Config)CreateSerializer().Deserialize(s);
               }
          }
          catch (Exception)
          {
          }
     }

     if (m_current == null)
     {
          m_current = new Config();
     }
}

public void Save()
{
     using (FileStream s = File.Create(ConfigFileName))
     {
          CreateSerializer().Serialize(s, Current);
     }
}

The XmlSerializer class created by the CreateSerializer() method takes care of turning the Config instance into XML.

UdfManager Class

The UdfManager class takes care of loading and preparing the actual UDFs for usage. It is the one that will go over the Config instance and load all the assemblies, extract the methods marked with the UdfMethod attribute and build a list of them. Each such method will be wrapped by the UdfMethodWrapper class.

public class UdfManager
{
     private StreamWriter m_logWriter;
     private List<UdfMethodWrapper> m_methodList = new List<UdfMethodWrapper>();
     private Dictionary<string, UdfMethodWrapper> m_methodDictionary = 
          new Dictionary<string, UdfMethodWrapper>
               (StringComparer.InvariantCultureIgnoreCase);

     internal List<UdfMethodWrapper> MethodList
     {
          get { return m_methodList; }
     }

     internal Dictionary<string, UdfMethodWrapper> MethodDictionary
     {
          get { return m_methodDictionary; }
     }
     // ...
}

The m_logWriter member is used to write a log file of the loading process — errors and warnings will go there. The log can be found at:

%AppData%\ClientManagedUdf.log

There are two collection classes used at the same time with UdfManager. Because the solution simulates IDispatch, every method needs to have both a number (a DISPID) and a name. The first collection (m_methodList) will keep an indexed list of the methods while the dictionary (m_methodDictionary) will keep a list keyed by the name of the method.

After creating an instance of the UdfManager class, callers should call the Load() method to have the manager actually load all the UDFs.

public void Load()
{
     using (LogWriter = File.CreateText(
          Path.Combine(
               Environment.GetFolderPath(
                    Environment.SpecialFolder.LocalApplicationData),                          
"ClientManagedUdf.log")))
     {
          Log("Loading assemblies...");
          foreach (UdfAssembly udfAssembly in Config.Current.UdfAssemblyCollection)
          {
               try
               {
                    LoadFromAssembly(udfAssembly);
               }
               catch (Exception e)
               {
                    Log("Error loading assembly '{0}'. Error was: {1}", 
                         udfAssembly.Name, e);
               }
          }
     }
     LogWriter = null;
}

private void LoadFromAssembly(UdfAssembly udfAssembly)
{
     Assembly assembly = Assembly.LoadFrom(udfAssembly.Name);
     LoadTypes(assembly);
}

The Load() method goes through each UdfAssembly in the configuration and tries to load it through the LoadFromAssembly() method. If the assembly load is successful, the UdfManager class will try to find all the types that are decorated with the UdfClass attribute:

private void LoadTypes(Assembly assembly)
{
     Log("Getting types from assembly...");
     Type[] types = assembly.GetTypes();
     Log("Searching through {0} types...", types.Length);
     foreach (Type type in types)
     {
          UdfClassAttribute attribute = 
               (UdfClassAttribute)Attribute.GetCustomAttribute(
                    type, typeof(UdfClassAttribute));
          if (attribute != null)
          {
               Log("Found class '{0}' marked as a Udf class.", type.Name);
               LoadMethods(type);
          }
     }
}

Once such a class is located, the LoadMethods() method is called for that type. This method has two parts. The first one for instantiating the class and the second for getting all the methods out of it:

private void LoadMethods(Type type)
{
// Create an instance of the class.
     Log("Trying to instantiate the type.");
     object instance = null;
     try
     {
          instance = Activator.CreateInstance(type);
     }
     catch (Exception)
     {
          Log("Unable to instantiate type '{0}'.", type.Name);
          throw;
     }

     // Find methods in the class.
     // ...
}

First, the Activator class is used to attempt to create an instance. If for any reason this method fails (in this case, because there is no parameterless constructor, or if constructor throws an exception), loading this UDF class will be considered a failure and a message will be logged.

Note

For each Excel process, there will be instances of the UDF classes. These instances will live for the entire lifetime of the Excel process. This is different from how they work on the server where they get attached to each opened session (and through that, for every opened workbook). For the most part, this is not a big issue because the UDFs don’t really have any way of knowing which workbook they are attached to on the server. However, it's worth keeping in mind as in some cases it may lead to differences in behavior.

Once the class itself is instantiated, the code looks for methods that are decorated with the UdfMethod attribute:

private void LoadMethods(Type type)
{
     // Create an instance of the class.
     // ...

     // Find methods in the class.
     Log("Loading methods from class.");
     MethodInfo[] methodInfos = type.GetMethods();
     foreach (MethodInfo methodInfo in methodInfos)
     {
          UdfMethodAttribute attribute = 
               (UdfMethodAttribute)Attribute.GetCustomAttribute(
                    methodInfo, typeof(UdfMethodAttribute));
          if (attribute != null)
          {
               Log("Found method '{0}' marked as a Udf method.", methodInfo.Name);
               if (MethodExists(methodInfo.Name))
               {
                    Log("A method named '{0}' already exists. " +  
                         "Ignoring this one.", methodInfo.Name);
               }
               else
               {
                    Log("Adding method to list.");
                    UdfMethodWrapper method = new UdfMethodWrapper(
                         methodInfo, methodInfo.Name, instance);
                    Add(method);
               }
          }
     }
}
private bool MethodExists(string name)
{
     return MethodDictionary.ContainsKey(name);
}

Using reflection, the code uses the GetMethods() method to get all the publicly available methods on the class. Each such method is checked against the current list of methods to see if a prior method with the same name already exists (that's what MethodExists() returns). If such a method does not exist, the code creates a UdfMethodWrapper class and passes the MethodInfo instance to it. It then adds the newly created UdfMethod to the list of available methods:

private void Add(UdfMethodWrapper method)
{
     int index = MethodList.Count;
     method.Index = index;

     MethodList.Add(method);
     MethodDictionary.Add(method.Name, method);
}

The Add() method takes the size of the array and uses it as the new index for the method. The index will be needed later when building the dynamic IDispatch interface implementation.

UdfManager also contains methods for getting UdfMethodWrapper instances by their name or index.

UdfMethodWrapper Class

Two jobs are accomplished by the UdfMethodWrapper class. The first is to cache information about the method that is attached to it and the second is the invocation of the method with the proper parameters:

public class UdfMethodWrapper
{
     private MethodInfo m_methodInfo;
     private int m_index;
     private string m_name;
     private object m_instance;
     private bool m_hasEllipsis;
     private int m_parameterCount;
     private ParameterInfo[] m_parameters;
     private bool m_isVolatile;
// ...
}

The members are used primarily for caching information about the method. All this information is accessible via reflection on the m_methodInfo member, however, since reflection is a relatively slow process, some caching is in order.

Note

In this case, doing this caching shaves roughly 10% off the overhead of executing the method.

The m_instance field points to an instance of the class that contains this UDF method.

private void GatherInfo()
{
     // Get the last parameter in the parameter list.
     ParameterInfo[] parameters = MethodInfo.GetParameters();

     m_parameters = parameters;
     m_parameterCount = parameters.Length;

     if (parameters.Length > 0)
     {
          ParameterInfo info = parameters[parameters.Length - 1];
          m_hasEllipsis = Attribute.GetCustomAttribute(info, 
typeof(ParamArrayAttribute)) != null;
     }

     UdfMethodAttribute att = 
(UdfMethodAttribute)Attribute.GetCustomAttribute(MethodInfo, 
typeof(UdfMethodAttribute));
     if (att != null)
     {
          m_isVolatile = att.IsVolatile;
     }
}

GatherInfo() is called from the constructor and is responsible for caching information about the method. It first takes the parameters and caches them in a member and then checks to see if the last parameter is decorated with the ParamArray parameter—if so it means the writer of the UDF expects the method to be called with a varying amount of parameters. Finally, the method checks to see if the method is marked as volatile.

The next step is the invocation method call:

public object Invoke(object[] args)
{
     args = AdjustArgs(args);
     object result = MethodInfo.Invoke(m_instance, args);
     return result;
}

Using reflection, the method is invoked and the m_instance parameter passed in. The args local variable is generated by calling the AdjustArgs() method which makes sure the arguments passed into the method fit with the ones the method expects.

For example, a method may take an Int32 as a parameter (int in C# and Integer in ). However, Excel cells cannot really contain integers—they only contain doubles. If a double is passed in, it needs to be cast to become an integer. The AdjustArgs() method goes over all the parameters and takes care of such conversions.

It also makes sure that the last parameters are packed into an array, which is passed as the last parameter of the method in the case that the method employs the ParamArray attribute:

private object[] AdjustArgs(object[] args)
{
     object[] realParams = new object[ParameterCount];
     int regularParametersCount = ParameterCount;
     Array paramArray = null;
     ParameterInfo ellipsisParameter = null;

     if (HasEllipsis)
     {
          regularParametersCount—;
          ellipsisParameter = m_parameters[ParameterCount - 1];
          paramArray = Array.CreateInstance(
               ellipsisParameter.ParameterType.GetElementType(), 
               args.Length - regularParametersCount);
          realParams[realParams.Length - 1] = paramArray;
     }

     // Copy/Convert the parameters.
     for (int i = 0; i < regularParametersCount; i++)
     {
          realParams[i] = ConvertParameter(args[i], 
               m_parameters[i].ParameterType);
     }

     if (HasEllipsis)
     {
          Type elementType = 
               ellipsisParameter.ParameterType.GetElementType();
          for (int i = 0; i < paramArray.Length; i++)
          {
               paramArray.SetValue(
                    ConvertParameter(args[regularParameters + i], elementType), 
                    i);
          }
     }

     return realParams;
}

The first part that initializes the local variables creates an array called realParams. This will be used to contain the parameters that will be eventually passed to the managed UDF method. The second part (inside the first check for HasEllipsis) creates the array that will be passed as the last parameter to the method in the case of a ParamArray method. The last two parts go over the parameters and make sure they are converted to the correct values (by calling ConvertParameter()).

private object ConvertParameter(object value, Type parameterType)
{
     object result = null;
     Type toType = parameterType;
     if (value == null)
     {
          result = GetDefaultValueForType(result, toType);
     }
     else
     {
          Type fromType = value.GetType();

          if (fromType == toType)
          {
               result = value;
          }
          else if (toType == typeof(object[]))
          {
               if (fromType == typeof(object[,]))
               {
                    throw new InvalidOperationException();
               }

               result = new object[] { value };
          }
          else if (toType == typeof(object[,]))
          {
               if (fromType == typeof(object[]))
               {
                    result = PromoteArrayRank(value, result);
               }
               else
               {
                    result = new object[,] { { value } };
               }
          }
          else
          {
               if (fromType.IsPrimitive && 
                    IsSupportedParameterType(fromType) &&
                    toType.IsPrimitive && 
                    IsSupportedParameterType(toType))
               {
                    result = Convert.ChangeType(value, toType);
               }
               else if (toType == typeof(DateTime) && fromType == typeof(double))
               {
                    result = DateTime.FromOADate((double)value);
               }
          }
     }
     return result;
}
internal static bool IsSupportedParameterType(Type type)
{
     bool result = false;
     if (type.IsPrimitive)
     {
          if (type != typeof(Int64) && type != typeof(UInt64))
          {
               result = true;
          }
     }
     else if (type == typeof(String))
     {
          result = true;
     }
     else if (type == typeof(DateTime))
     {
          result = true;
     }
     return result;
}

The ConvertParameter() method takes the value that needs to potentially be converted and the type it needs to be converted to (targetType). It first makes sure the value that needs to be converted is not null. If it is, it means the code needs to place some kind of default value in the variable—this is taken care of by the GetDefaultValueForType() method, which knows, by the type of the parameter, what the default has to be (for example, if the parameter is an integer, and there is an empty cell passed into it, this method will return zero since that’s what an empty cell evaluates to).

Note

This method works similarly to how the server does—it returns zero if the parameter is numeric, an empty string if it’s a string, and so on, or null otherwise.

Next, the method checks if the types are the same. If they are, it will not do anything to the value and just return it.

If the value is a two-dimensional array but the target type is a one-dimensional array, the function fails because this conversion is not supported on the server. On the other hand, if the value is one-dimensional and the target is two-dimensional, the method will promote the rank of the array by creating a new two-dimensional array and copying the values from the one-dimensional array.

Finally, the code makes sure that the parameter is supported and uses the Convert.ChangeType() method to coerce the value that was actually passed in to the one expected by the method. Any errors (such as a double source value overflowing a target Int32 parameter) would throw an exception, which would show an error in the cell (which is the current behavior of the server).

The DateTime parameter is somewhat special—when it's a parameter, the FromOADate() method is used to generate one out of a double (which is the form Excel uses to store dates).

CUdfAutomation

Excel will use the CUdfAutomation class to make UDF calls. It is an Active Template Library (ATL) class inside a C++ DLL that supports the common language runtime (CLR) (in other words, a C++/CLI assembly). Using C++/CLI gives this code the unique opportunity to do things that are either very hard for pure managed code to do or are downright impossible. In this case, the code dynamically generates an IDispatch interface that is used by Excel to make the appropriate UDF calls.

As with most ATL projects, there are many files that are needed for a successful compilation; however, only two code files are of actual interest and those are the UdfAutomation.h and UdfAutomation.cpp files. This section will first show the .h file, which contains the definition of the class and then the .cpp file, which contains the more complex implementation.

The CUdfAutomation class itself looks just like a regular ATL class with a few additions:

class ATL_NO_VTABLE CUdfAutomation :
     public CComObjectRootEx<CComSingleThreadModel>,
     public CComCoClass<CUdfAutomation, &CLSID_UdfAutomation>,
     public IUdfAutomation,
     public IUdfDispatch,
     public ITypeInfo,
     public IDispatchImpl<AddInDesignerObjects::IDTExtensibility2>
{
     //...
}

The interesting part of this class is the inheritance from IUdfDispatch and from IDispatchImpl<AddInDesginerObjects::IDTExtensibility2>.

The first inherited class is one that is defined at the top of the file and will allow CUdfAutomation to implement the dynamically built IDispatch interface (the dynamic IDispatch interface will have its methods prefixed with Udf_.)

Note   Because IDTExtensibility itself inherits from IDispatch, CUdfAutomation cannot just inherit from IDispatch and call it a day—there needs to be an extra layer there. In this case, the extra layer also changes the names of the methods so that there is no problem with the methods implemented in CUdfAutomation overriding the methods supplied for IDTExtensibility.

Throughout the rest of this section, the term dispmethods will be used to describe the "methods" that are exposed by the IDispatch interface using the Invoke mechanism.

CUdfAutomation has only two field members—both used to store references to managed instances. Because CUdfAutomation is a native class, it needs to store these references via some sort of bridge. In C++/CLI this bridge is implemented by the gcroot<> template class:

class ATL_NO_VTABLE CUdfAutomation :
     // ...
     private:
     gcroot<UdfProxy::UdfManager^> m_manager;
     gcroot<Excel::Application^> m_app;

While m_manager will hold a reference to the class that was described before—the one that manages all the UDFs that are loaded, the m_app field will hold a reference to the Excel Application object.

The FinalConstruct() method of CUdfAutomation is called when the class construction is done (when it's created via COM for example). That's where most initialization code should go:

HRESULT FinalConstruct()
{
     try
     {
          m_manager = gcnew UdfManager();
          m_manager->Load();
     }
     catch (System::OutOfMemoryException^)
     {
          return E_OUTOFMEMORY;
     }
     catch (System::Exception^)
     {
          return E_FAIL;
     }
     return S_OK;
}

In this case, the code creates a new UdfManager instance (which was imported from the UdfProxy assembly) and loads it.

Note

If you never worked with C++/CLI, the syntax may seem a bit weird. Trust me. It's much better than it was in the first version (Managed C++ in the 2003 release). It just takes a bit of getting used to.

The method also catches exceptions and translates them into errors that the caller of the class will recognize.

IDTExtensibility2 defines five methods, only one of which is utilized in this class—the OnConnection() method. This is called when Excel first creates the automation addin and allows it to properly set up for using Excel.

Note

It should have been possible to forgo implementing IDTExtensibility2 and instead use the Marshal::GetActiveObject() method to get the currently running Excel instance from the ROT (running object table). This approach has two distinct disadvantages. The first one is that it would not have worked as expected when more than one instance of Excel is running. The second is that Excel (and most Office apps for that matter) do not register themselves in the ROT until they lose focus. That means that at least in some cases, the code would not have been able to find the Application object until after the user switched to a different window.

In this case, all the code does is cache the reference to the Application object that Excel exposes so that it can later be used to set the volatility of UDFs:

STDMETHOD(OnConnection)(IDispatch * app, 
     AddInDesignerObjects::ext_ConnectMode ConnectMode,
     IDispatch * AddInInst,
     SAFEARRAY * * custom )
{
     m_app = (Excel::Application^)
          System::Runtime::InteropServices::Marshal::GetObjectForIUnknown(
               System::IntPtr(app));
     return S_OK;
}

Even though this method is only one line long, there are a few things worth noting. IDTExtensiblity2 is implemented natively. This means that the parameters that are passed into it are native, too. In this case, the IDispatch that is passed as the app parameter is a native COM pointer and as such cannot be transparently used by managed code. Since the implementation of the dynamic IDispatch interface will be done mostly in managed C++, the code will need to have a managed wrapper for the COM object representing the Excel application. That's where the call to Marshal::GetObjectForIUnknown() comes in. It takes the IDispatch pointer and turns it into a managed object, which is then cast into an Excel::Application object, which is a managed interop wrapper for the Excel Application COM instance.

The next part is the core implementation of the class that is located inside the UdfAutomation.cpp file. The three most important implementations here are GetFuncDesc(), UDF_GetIDsOfNames(), and UDF_Invoke(). The first method belongs to the ITypeInfo interface, which is used to divine information about various dispmethods. The second method is used by Excel to see if a named dispmethod is supported. The third one is used to invoke the dispmethod.

Note

Theoretically, IDispatch alone should have been enough for achieving the ability to invoke methods. Excel, however, requires both interfaces to be able to do the invocation. The only functionality it really uses from ITypeInfo is metadata about the supported IDispatch methods.

When Excel parses a formula and finds something it does not recognize, but that looks like a UDF call inside a formula, it will go to each of the automation servers registered and call GetIDsOfNames() to see if the server supports the UDF. If it does, the GetIDsOfNames() method returns a dispid (a numeric identifier) that represents the dispmethod corresponding to the name. Later, that dispid will be used to invoke the method:

STDMETHODIMP CUdfAutomation::UDF_GetIDsOfNames(
     REFIID riid, 
     OLECHAR **rgszNames, 
     UINT cNames, 
     LCID lcid, 
     DISPID *rgdispid)
{
     if (!rgdispid || !rgszNames)
     {
          return E_POINTER;
     }
     HRESULT retResult = S_OK;

     try
     {
          for (UINT i = 0; i < cNames; i++)
          {
               DISPID id;
               int intId = 0;
               UdfMethodWrapper^ method = 
                    m_manager->UdfMethodFromName(
                         gcnew System::String(rgszNames[i]));
               if (method == nullptr)
               {
                    id = DISPID_UNKNOWN;
                    retResult = DISP_E_UNKNOWNNAME;
               }
               else
               {
                    id = (DISPID)method->Index;
               }
               rgdispid[i] = id;
          }
     }
     catch (System::OutOfMemoryException^)
     {
          return E_OUTOFMEMORY;
     }
     catch (System::Exception^)
     {
          return E_FAIL;
     }
     return S_OK;
}

The GetIDsOfNames() method first checks to make sure the out parameter is valid. It then uses the UdfManager instance to fill in the requested information.

The rgszNames parameter contains an array of strings—each string representing a name of a dispmethod that is being sought.

Note   In Excel, there will only be one dispmethod name request per call to GetIDsOfNames(), but for completeness, this method knows how to handle calls with multiple dispids.

The out parameter rgdispid parameter is an array that needs to be filled with these dispids. UdfManager.UdfMethodFromName() is used to get back the method in question. If the name is not located in the UdfManager instance, the special value DISP_E_UNKNOWNNAME is placed in the array to signify that no such name exists. If a UdfMethod instance is located, its index will be used as the dispid and placed inside the array.

The GetFuncDesc() method is accessible through the ITypeInfo interface, which in turn is accessible by calling the GetTypeInfo() method of IDispatch (in this case, renamed to UDF_GetTypeInfo()):

STDMETHODIMP CUdfAutomation::UDF_GetTypeInfo(
     UINT itinfo, 
     LCID lcid, 
     ITypeInfo** pptinfo)
{
     *pptinfo = static_cast<ITypeInfo*>(this);
     (*pptinfo)->AddRef();
     return S_OK;
}

The method is very simple. Because the CUdfAutomation class implements the ITypeInfo interface as well as all the others, all it needs to do is cast itself to that interface and return it (remembering to call AddRef).

GetFuncDesc() is slightly more involved. Its job is to inform Excel about the dispmethod metadata:

STDMETHODIMP CUdfAutomation::GetFuncDesc(
     UINT index, 
     ::FUNCDESC **pppfuncdesc)
{
     ::FUNCDESC* pdesc = new ::FUNCDESC();
     if (!pdesc)
          return E_OUTOFMEMORY;

     ZeroMemory(pdesc, sizeof(::FUNCDESC));
     UdfMethodWrapper^ methodInfo;
     try
     {
          methodInfo = m_manager->UdfMethodFromId(index);
     }
     catch (System::Exception^)
     {
          return E_FAIL;
     }

     pdesc->memid = index;
     pdesc->funckind = ::FUNC_DISPATCH;
     pdesc->invkind = ::INVOKE_FUNC;
     pdesc->callconv = ::CC_STDCALL;
     int paramCount = methodInfo->ParameterCount;
     int optionalCount = 0;
     if (methodInfo->HasEllipsis)
     {
          optionalCount = 200 — paramCount - 1;
          paramCount = 200;
     }
     pdesc->cParams = paramCount;
     pdesc->cParamsOpt = optionalCount;
     *pppfuncdesc = pdesc;
     return S_OK;
}

The first part of the method prepares the memory that will be used to return the answer to the caller. That involves allocating an instance of the FUNCDESC structure, which is then filled with the relevant information.

In this case, the relevant information is pretty constant (it is all method calls, all of which use CC_STDCALL, etc.). It's just the number of potential arguments that may change. In this case, the limit will either be the actual amount of parameters a method has (if it does not have a param array) or 200 (an arbitrarily large number of possible arguments) if it does.

Note   The ITypeInfo::GetFuncDesc() method uses an indexed ID for methods rather than DispIds. In the simplistic case presented here, both are one and the same. If you were to elaborate and add extra methods to the IDispatch interface, some fixups of these numbers may be needed.

Because of the unique way in which ITypeInfo works, there’s also a method used to delete the allocated FUNCDESC structure:

STDMETHODIMP_(void) CUdfAutomation::ReleaseFuncDesc(::FUNCDESC *pfuncdesc)
{
     delete pfuncdesc;
}

There are a few more methods on the interface that are needed, namely, GetNames(), GetTypeAttr(), and ReleaseTypeAttr(). These provide more information about the available dispmethods.

The last part of the CUdfAutomation class is the actual Invoke() method (UDF_Invoke() in this case). This method takes the ID of the dispmethod that needs to be executed and an array of the parameters that are to be passed to the method.

The code for UDF_Invoke() looks like this:

#pragma managed(push,off)
STDMETHODIMP CUdfAutomation::UDF_Invoke(
                    DISPID dispidMember, REFIID riid, 
                    LCID lcid, WORD wFlags, 
                    ::DISPPARAMS *pdispparams, VARIANT *pvarResult, 
                    ::EXCEPINFO *pexcepinfo, UINT *puArgErr)
{
     return DoInvoke(
          dispidMember, riid, 
          lcid, wFlags, 
          pdispparams, pvarResult,
          pexcepinfo, puArgErr);
}
#pragma managed(pop)

The method simply calls into the DoInvoke() method, which takes the same exact set of parameters. The reason for this lies in the two #pragma directives wrapping the method. These instruct the C++/CLI compiler to make sure that this UDF_Invoke() method is compiled with no managed support. When it in turn calls into the mixed mode DoInvoke(), the compiler can do a better job of binding the param­eters, shaving a few percentages for the overall time it takes to actually call the Invoke() mechanism (from tests performed, a gain of roughly 10% was realized by doing this).

The DoInvoke() method contains the actual implementation of invoking the appropriate UDF. It has four stages:

  • Initialization

  • Excel Range to array transformation for those parameters that need it

  • Native to managed transformation for all parameters

  • Calling the UDF method, returning to Excel the value that was returned from the managed UDF and doing some cleanup

This section will show the method using these four stages:

HRESULT CUdfAutomation::DoInvoke(
               DISPID dispidMember, REFIID riid, 
               LCID lcid, WORD wFlags, 
               ::DISPPARAMS *pdispparams, VARIANT *pvarResult, 
               ::EXCEPINFO *pexcepinfo, UINT *puArgErr)
{
     // Stage 1
     if (!pdispparams || !pvarResult)
     {
          return E_POINTER;
     }
     if (pdispparams->cNamedArgs > 0)
     {
          return E_FAIL;
     }

     int paramCount = pdispparams->cArgs;
     VARIANT* params = new VARIANT[paramCount];
     bool* owned = new bool[paramCount];
     
     for (int i = 0; i < paramCount; i++)
     {
          VariantInit(&params[i]);
          owned[i] = false;
     }
     HRESULT hr;
     // Stage 2
     // ...
     // Stage 3
     // ...
     // Stage 4
     // ...
} 

The method prepares a secondary VARIANT array that will contain the parameters passed into the DoInvoke() method (inside the DISPPARAMS array.) These parameters may contain Excel Range objects, and these are not consumable by a server UDF. The method also prepares a bool array, which will be used during cleanup. Finally, this stage zeroes out the memory in these arrays.

The second stage looks for Excel Range objects among the passed in parameters and turns them into arrays instead:

HRESULT CUdfAutomation::DoInvoke(
               DISPID dispidMember, REFIID riid, 
               LCID lcid, WORD wFlags, 
               ::DISPPARAMS *pdispparams, VARIANT *pvarResult, 
               ::EXCEPINFO *pexcepinfo, UINT *puArgErr)
{
     // Stage 1
     // ...
     // Stage 2
     try
     {
          // Go over the list of arguments and transform the 
          // IDispatch type to be the Value2 property value.
          for (int i = 0; i < paramCount; i++)
          {
               VARIANT* current = &pdispparams->rgvarg[i];
               if (current->vt == VT_DISPATCH)
               {
                    VARIANT* pNewVariant = &params[i];
                    IDispatch* piDisp = current->pdispVal;

                    ::DISPPARAMS invokeParams = {0};
                    hr = piDisp->Invoke(0, IID_NULL, 
                         0, DISPATCH_PROPERTYGET, 
                         &invokeParams, pNewVariant, 
                         NULL, NULL);
                    if (FAILED(hr))
                    {
                         throw gcnew System::InvalidOperationException();
                    }
                    owned[i] = true;
               }
               else
               {
                    params[i] = *current;
               }
          }

          // Stage 3
          // ...
          // Stage 4
          // ...
     }
     // ...
     return S_OK;
}

Each dispmethod parameter is examined (inside the dispparams array) to see if its type is VT_DISPATCH. When that's the case, it means that Excel has passed in a VARIANT containing a reference to the Excel object model Range object. Since this code is only interested in the array of values contained inside that range, it will execute the default property on these Range objects to get back the array of data by calling Invoke() on the IDispatch pointer. The array value is placed inside the pNewVariant pointer (which actually resides inside the allocated array of VARIANT objects we allocated). When a range is retrieved, the code also sets an entry in the owned array to true, to signify that this VARIANT will need to be freed when the method is done. If the type of the parameter is not VT_DISPATCH, the code simply bit-copies the VARIANT into the array. Because the owned array will have false for that element, it will not be freed.

Each of these parameters now needs to be transformed to its managed equivalent so that the managed UDF will be able to comprehend it:

HRESULT CUdfAutomation::DoInvoke(
               DISPID dispidMember, REFIID riid, 
               LCID lcid, WORD wFlags, 
               ::DISPPARAMS *pdispparams, VARIANT *pvarResult, 
               ::EXCEPINFO *pexcepinfo, UINT *puArgErr)
{
     // Stage 1
     // ...
     // Stage 2
     try
     {
          // ...
          // Stage 3
          array<System::Object^>^ params = 
               GetObjectsForVariants(params, pdispparams->cArgs);

          // Reverse the array.
          array::Reverse(params);

          // Stage 4
          // ...
     }
     // ...
     return S_OK;
}

First, the method calls into the GetObjectsForVariants() method to get back a managed Object array (that's the funky array**<System::Object^>^** thingamajig). Because the original parameters passed to the UDF_Invoke() method are in reverse order, they also need to be straightened out to the correct order before being passed into the managed code.

Note

This reverse order has nothing to do with UDFs or Excel. It's just how IDispatch is designed. This is probably due to the fact that, historically, when parameters have been passed on the stack to regular functions, parameters have been reversed, and this simulates that for (almost) no good reason.

At the end of stage 3, the params local variable will contain correctly ordered managed references to the values passed by Excel into the UDF:

HRESULT CUdfAutomation::DoInvoke(
               DISPID dispidMember, REFIID riid, 
               LCID lcid, WORD wFlags, 
               ::DISPPARAMS *pdispparams, VARIANT *pvarResult, 
               ::EXCEPINFO *pexcepinfo, UINT *puArgErr)
{
     // Stage 1
     // ...
     // Stage 2
     try
     {
          // Stage 3
          // ...
          // Stage 4
          UdfMethodWrapper^ method = m_manager->UdfMethodFromId(dispidMember);
          
          // Set method to be volatile if needed.
          if (method->IsVolatile)
          {
               m_app->Volatile(System::Type::Missing);
          }
          System::Object^ resultObject = method->Invoke(params);
          Marshal::GetNativeVariantForObject(
               resultObject, System::IntPtr(pvarResult));
     }
     catch (System::OutOfMemoryException^)
     {
          return E_OUTOFMEMORY;
     }
     catch (System::Exception^)
     {
          return E_FAIL;
     }
     finally
     {
          for (int i = 0; i < paramCount; i++)
          {
               if (owned[i])
               {
                    VariantClear(&params[i]);
               }
          }
          delete[] params;
          delete[] owned;
     }
     return S_OK;
}

This last stage actually makes the call. It retrieves the UdfMethodWrapper instance from the UdfManager and checks if the method is volatile. If it is, it makes sure to call the Excel Application object and notify it that the method is volatile. The UdfMethodWrapper.Invoke() method is then called, with the params local variable passed as a parameter.

This is how volatility works in Excel. While running, UDFs need to make sure they call the Volatile method on the Application object. Excel detects that and marks them as volatile. Of course, a UDF has a return value. This value gets assigned to the resultObject managed reference, which is then transformed to its native equivalent and placed in a VARIANT so that it can be returned to Excel.

Finally, the code makes sure it cleans up after itself, clearing what VARIANT instances were allocated and deleting the allocated arrays.

There are four more methods in the UdfAutomation.cpp file responsible for transforming managed values to native ones and vice versa. These functions are pretty long and are beyond the scope of this book. Debugging them, though, should help you understand their functionality. The methods are:

GetObjectsForVariants()—Called from UDF_Invoke(), this method is used to take an array of VARIANT structures and transform them into an array of Object instances.

GetObjectForNonArrayVariant()—Used by GetObjectForVariant() as a helper method for transforming simple types of VARIANTs into their Object equivalents.

GetObjectForVariant()—Transforms a single VARIANT into an equivalent Object instance. Does this for arrays as well.

GetVariantForObject()—Used for the result of the UDF call—transforms an Object instance into a VARIANT that is consumable by Excel.

Final Steps

To complete the project, the UdfProxy class needs to be strong-named and placed in the GAC so that it can be used by the UdfAutomation object. If it were not placed in the GAC, it would have to be placed in the Excel 2007 directory.

The UdfAutomation DLL needs to be registered in the system, using the regsvr32.exe utility. That will ensure that Excel can find it in the list of automation servers.

Improvements

The following improvements can be made to the project, but are outside the scope of this book:

  • Easier configuration

It is possible to move the logic of the configuration utility into Excel itself by adding a ribbon button that will bring up the dialog box. Once that’s done, UdfProxy can be modified to dynamically add or remove UDF assemblies without having to reboot Excel.

  • Setup

    A deployment program can be made that takes care of placing the files in the correct location, doing the necessary registration and configuring Excel to automatically load the addin.

  • Detecting client

    Sometimes it might be useful to have different behavior in the same UDF class based on whether or not it’s running on the client. There are two ways of doing this. The first involves checking for the name of the process running the UDF. (See the System.Diagnostics.Process class for more information on how to do that.) If it's Excel.exe that means it’s the client. Otherwise, it probably means it’s the server.

Note

The name of the process for the server will be w3wp.exe; however, do not check for that as it may change in the future. Instead, if the name is not Excel.exe, assume that it’s the server running the UDF.

The second, more correct, way of doing this would be to add a base class inside the UdfProxy assembly. That class will have a property named IsClient, which will return true in the case where it’s being executed by Excel. Instead of checking the process name though, this property is set when the UdfManager class instantiates the UDF class. It can easily cast the produced object to the base class and if the cast is successful, set that property to true.

Other Generic Solutions

The solution described here is by no means the only one or even the best one. There are at least two more solutions—each with its advantages and disadvantages.

The first uses a similar outlook on the problem but uses a generic XLL instead of a generic COM automation server. This has the advantage of potentially performing better than the one described here.

Note

Actually, it may have much better performance than the automation addin, since it can take advantage of the Multithreaded Recalc engine—something automation addins cannot do in this version.

The disadvantage is that it is significantly more complex—XLLs are not as simple to use as automation addins and would require such things as custom code for pushing parameters onto the stack and handling the XLOPER structures in the XLL SDK.

The second possible solution is to use code generation to automatically create assemblies that look like the one described at the beginning of this chapter (instead of adding code to the existing UDF assembly, though, these assemblies would instantiate them and call into them from methods accessible through an automation server). The main advantage is that there is no need for any configuration—these assemblies can be individually used by the regular Excel mechanisms. The disadvantage is that they will probably perform worse than the solution presented here (due to the fact that COM interop, for the most part, is much slower than C++/CLI interop).

Summary

Regrettably, Excel Services UDFs do not work on the client. This chapter shows how to work around that limitation in two different ways. Both ways have their advantages and disadvantages—at the end of the day, it is up to the person deploying the solution to decide which way is best for them.