.NET

P/Invoke Revisited

Jason Clark

Code download available at: NETColumn0410.exe (121 KB)

Contents

Marshaling Structures
Object Lifetime and Pinning
Marshaling Value Types vs. Reference Types
StructLayoutAttribute
Non-Blittable Marshaling
A Word About Complexity

In the July 2003 installment of the .NET column I covered the basics of Win32® interoperation with the Microsoft® .NET Framework (P/Invoke). Based on reader feedback, this topic is worthy of further coverage, so I have decided to revisit P/Invoke in this column. It will build upon the information in the July 2003 issue, so if you are not familiar with basic .NET interop, I suggest reviewing that column before digging into this one.

This month I am going to delve into the interop details of marshaling data structures to native functions. In a future piece, I hope to address calls into native code that calls back into your managed code, and I'll work through an interop case study using Windows® Forms. So let's get started on marshaling structures.

Marshaling Structures

The common language runtime (CLR) is capable of marshaling references to managed memory to native APIs. In order to accomplish this, the runtime has to consider a number of differences between managed memory and native memory. Let's take a look at some of these differences:

  • The field layout of managed reference objects is reorganized by the runtime (by default)
  • Managed reference objects are garbage collected; native memory is not
  • Managed reference objects are moved around in physical memory by the system; native objects are not
  • Certain common native structure layouts, such as inline arrays, are difficult to accomplish using some .NET-compliant languages such as C# and Visual Basic® .NET

These are the major considerations that you and the CLR must account for when moving data between managed and native code. To begin addressing these differences, I'll start with an example. Figure 1 shows an excerpt from the winbase.h header file in the Windows Platform SDK. It contains the declaration of the GetSystemPowerStatus API and the definition of the native SYSTEM_POWER_STATUS structure type.

Figure 1 Native GetSystemPowerStatus

    typedef struct _SYSTEM_POWER_STATUS {
        BYTE  ACLineStatus;
        BYTE  BatteryFlag;
        BYTE  BatteryLifePercent;
        BYTE  Reserved1;
        DWORD BatteryLifeTime;
        DWORD BatteryFullLifeTime;
    } SYSTEM_POWER_STATUS, *LPSYSTEM_POWER_STATUS;

    BOOL WINAPI GetSystemPowerStatus(
        OUT LPSYSTEM_POWER_STATUS lpSystemPowerStatus
    );

Figure 2 shows C# code that makes a call to the native GetSystemPowerStatus API, and outputs information from the resulting data structure. Note that in the C# code it is necessary to redefine the native SYSTEM_POWER_STATUS type definition as SystemPowerStatus, a managed reference type. The type can have any name, but must have a field layout that matches that of the target data structure. It is also possible to marshal managed value types to native code, which I will cover in a moment. But for now, it is important to note that you need to define a managed type that exactly matches the in-memory layout of the native structure that is expected by the API.

Figure 2 GetSystemPowerStatus Call Using Class

    using System;
    using System.Runtime.InteropServices;

    class Interop {
       static void OutputBatteryLifeRefType() {
          SystemPowerStatus status = new SystemPowerStatus();
          GetSystemPowerStatusRef(status);
          Console.WriteLine(
             "Battery has {0} percent life remaining. State={1}",
             status._BatteryLifePercent, status._BatteryFlag);
       }

       [DllImport("Kernel32", EntryPoint="GetSystemPowerStatus")]
       static extern Boolean GetSystemPowerStatusRef(SystemPowerStatus sps);

       // In C#, 'class' defaults to automatic field layout
       [StructLayout(LayoutKind.Sequential)] // Required!
       class SystemPowerStatus {
          public ACLineStatus _ACLineStatus;
          public BatteryFlag  _BatteryFlag;
          public Byte         _BatteryLifePercent;
          public Byte         _Reserved1;
          public Int32        _BatteryLifeTime;
          public Int32        _BatteryFullLifeTime;
       }

       // Note: Underlying type of byte to match Win32 header
       enum ACLineStatus : byte {
          Offline = 0, Online = 1, Unknown = 255
       }

       enum BatteryFlag : byte {
          High = 1, Low = 2, Critical = 4, Charging = 8,
          NoSystemBattery = 128, Unknown = 255
       }
    }

The SystemPowerStatus class in Figure 2 is attributed with the StructLayoutAttribute class. Any time you use a managed reference type to marshal data to native code, you must apply the StructLayoutAttribute to the type definition and specify a layout kind of Sequential. The LayoutKind.Sequential setting tells the runtime to leave the fields in their defined order, which addresses the first difference in the preceding bulleted list. We will look at the StructLayoutAttribute in more detail in a moment, but first let's dig into the memory ramifications of the code in Figure 2.
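One way to sanity-check a managed definition against the native header is to ask the marshaler where it thinks each field lives. The following is a minimal sketch, not part of the column's download: the PowerStatusLayout and LayoutCheck names are hypothetical, and plain Byte fields stand in for the byte-sized enums of Figure 2 to keep it short. It relies only on Marshal.SizeOf and Marshal.OffsetOf, so it does not call any native API.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical managed mirror of SYSTEM_POWER_STATUS (assumption: Byte fields
// replace the byte-sized enums used in Figure 2).
[StructLayout(LayoutKind.Sequential)]
class PowerStatusLayout
{
    public Byte ACLineStatus;
    public Byte BatteryFlag;
    public Byte BatteryLifePercent;
    public Byte Reserved1;
    public Int32 BatteryLifeTime;
    public Int32 BatteryFullLifeTime;
}

static class LayoutCheck
{
    // Offset, in bytes, of a field in the marshaled (native) layout.
    public static int OffsetOf(string fieldName)
    {
        return Marshal.OffsetOf(typeof(PowerStatusLayout), fieldName).ToInt32();
    }

    // Total size, in bytes, of the marshaled layout.
    public static int NativeSize()
    {
        return Marshal.SizeOf(typeof(PowerStatusLayout));
    }

    static void Main()
    {
        // The native structure is four BYTEs followed by two DWORDs.
        Console.WriteLine("size = {0}", NativeSize());
        Console.WriteLine("BatteryLifePercent at {0}", OffsetOf("BatteryLifePercent"));
        Console.WriteLine("BatteryLifeTime at {0}", OffsetOf("BatteryLifeTime"));
        Console.WriteLine("BatteryFullLifeTime at {0}", OffsetOf("BatteryFullLifeTime"));
    }
}
```

If the reported size or offsets disagree with what the C header implies (12 bytes here, with the two DWORDs at offsets 4 and 8), the managed definition needs another look before you call the API.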

Object Lifetime and Pinning

Looking at the call to GetSystemPowerStatusRef in Figure 2, a question arises. What happens if a garbage collection kicks in during the call to the native function? If this happens, won't the object referred to by the status local variable potentially be moved around in memory by the collector? In fact, if status were not referred to by subsequent code, couldn't the collector clean up the object entirely while the native method is executing? The answer to these questions would be yes, were it not for the pinning feature of the CLR.

When the runtime marshaler sees that your code is passing to native code a reference to a managed reference object, it automatically pins the object. What this means is that the object is put in a special state where the garbage collector will neither move the object in memory nor remove the object from memory. Pinned objects hurt the performance of the garbage collector, but they assure that the memory remains intact for the life of the native call; this is critical to the proper functioning of the native code.

When the native function returns, the marshaled object is automatically unpinned. Automatic pinning is very convenient, but it raises another question. What happens if the native function caches the pointer for use later on? When the function returns, won't the collector be free to move the object? The answer is yes, and the solution for your code in such a situation is to manually pin the object using the System.Runtime.InteropServices.GCHandle type.

The following lines of code use the GCHandle type in order to pin and unpin a managed string:

    String s = "PinMe";
    GCHandle pinHandle = GCHandle.Alloc(s, GCHandleType.Pinned);
    try {
       •••
    }
    finally {
       pinHandle.Free();
    }

Manually pinning an object is not difficult. The tricky part is knowing when you should. The GetSystemPowerStatus API, for example, does not cache the pointer that is passed in as its only argument. And so, automatic pinning is sufficient for calls to this native API. However, an overlapped call to the native ReadFileEx function would require manual pinning of the second and fourth parameters to the call because the API will continue to utilize the referenced memory after the function returns. You need to know the behavior of the function you are calling well enough to decide whether automatic pinning is sufficient, or whether you will need to manually pin your managed object.
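To make the pin-then-use pattern a little more concrete, here is a small sketch that pins a byte array and writes to it through its raw address, which is roughly what native code holding the pointer would do. The PinDemo class and its method name are hypothetical, and Marshal.WriteByte stands in for the native side; no real native function is involved.

```csharp
using System;
using System.Runtime.InteropServices;

static class PinDemo
{
    // Pins a managed byte array, writes one byte through the pinned address
    // (standing in for native code that kept the pointer), then unpins.
    public static byte[] WriteThroughPinnedAddress()
    {
        byte[] buffer = new byte[4];
        GCHandle pinHandle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
        try
        {
            IntPtr address = pinHandle.AddrOfPinnedObject();

            // A collection here cannot move the array while it is pinned,
            // so the address obtained above remains valid.
            GC.Collect();

            Marshal.WriteByte(address, 2, 0x7F);
        }
        finally
        {
            // Always unpin; long-lived pins get in the collector's way.
            pinHandle.Free();
        }
        return buffer;
    }

    static void Main()
    {
        byte[] result = WriteThroughPinnedAddress();
        Console.WriteLine(result[2]); // prints 127
    }
}
```

In a real overlapped I/O scenario the handle would be freed only after the operation completes, not at the end of the method that started it.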

Pinning addresses the differences between managed and native memory in terms of moving memory around and collecting unreferenced objects. Now let's take a look at how the native call in Figure 2 differs if we use a value type rather than a reference type.

Marshaling Value Types vs. Reference Types

The code in Figure 3 functions similarly to the code in Figure 2 in that it makes a call to the native GetSystemPowerStatus function. However, in Figure 3 the SystemPowerStatus type is defined as a managed structure rather than a managed class. Said another way, we are marshaling a value type rather than a reference type.

Figure 3 GetSystemPowerStatus Call Using Struct

    using System;
    using System.Runtime.InteropServices;

    class Interop {
       static void OutputBatteryLifeValType() {
          SystemPowerStatus status;
          GetSystemPowerStatusVal(out status);
          Console.WriteLine(
             "Battery has {0} percent life remaining. State={1}",
             status._BatteryLifePercent, status._BatteryFlag);
       }

       [DllImport("Kernel32", EntryPoint="GetSystemPowerStatus")]
       static extern Boolean GetSystemPowerStatusVal(out SystemPowerStatus sps);

       // In C#, 'struct' defaults to sequential field layout
       struct SystemPowerStatus {
          public ACLineStatus _ACLineStatus;
          public BatteryFlag  _BatteryFlag;
          public Byte         _BatteryLifePercent;
          public Byte         _Reserved1;
          public Int32        _BatteryLifeTime;
          public Int32        _BatteryFullLifeTime;
       }

       // The ACLineStatus and BatteryFlag enums are defined as in Figure 2
    }

When your code defines a variable of a reference type, the variable is either a reference to an object or it is a null reference. In contrast, a variable of a value type is the actual data, and no reference is involved. This difference between the two data types in the managed type system affects marshaling to native code.

When you pass a reference variable to a native function, the runtime marshals the reference into a native pointer to the data structure. However, when you marshal a value to a native function, the runtime pushes a copy of the value object onto the thread's stack. Sometimes this is what you want, but more often a native API will expect a pointer to a structure (as is the case with GetSystemPowerStatus), and so you have to take steps to indicate to the system that a pointer to the variable should be passed, rather than a copy of the variable.

To do this, you can define the parameter to your static extern P/Invoke method as an out or ref parameter. Remember that this tells the runtime to pass a reference to the variable, rather than a copy of the variable's data on the stack.

Fundamentally, the difference between marshaling classes and structures is one of indirection. When you marshal a managed reference type, you are always passing either a pointer or a pointer to a pointer. When you marshal a managed value type, you are either passing the data wholesale on the stack or you are passing a pointer to the data. Thus, you can marshal value types with either zero or one level of indirection, while reference types can be marshaled using one or two levels of indirection to the native code.
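The zero-versus-one level of indirection for value types can be seen side by side in a pair of declarations. The sketch below is not from the column: it assumes two user32 functions, GetCursorPos (which takes a POINT pointer) and WindowFromPoint (which takes a POINT by value), and the CursorPoint and IndirectionDemo names are hypothetical. The native calls are attempted only on Windows.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical managed mirror of the Win32 POINT structure (two LONGs).
[StructLayout(LayoutKind.Sequential)]
struct CursorPoint
{
    public Int32 X;
    public Int32 Y;
}

static class IndirectionDemo
{
    // One level of indirection: native signature is BOOL GetCursorPos(LPPOINT),
    // so the value type is passed with 'out' and the callee gets a pointer.
    [DllImport("user32")]
    static extern Boolean GetCursorPos(out CursorPoint point);

    // Zero levels of indirection: native signature is HWND WindowFromPoint(POINT),
    // so a copy of the value itself is passed.
    [DllImport("user32")]
    static extern IntPtr WindowFromPoint(CursorPoint point);

    // Marshaled size of the structure, in bytes.
    public static int PointSize()
    {
        return Marshal.SizeOf(typeof(CursorPoint));
    }

    static void Main()
    {
        Console.WriteLine("POINT occupies {0} bytes", PointSize());

        if (Environment.OSVersion.Platform != PlatformID.Win32NT)
        {
            Console.WriteLine("Skipping native calls: not running on Windows.");
            return;
        }

        CursorPoint cursor;
        if (GetCursorPos(out cursor))
        {
            IntPtr window = WindowFromPoint(cursor);
            Console.WriteLine("Cursor at ({0},{1}), window handle {2}",
                cursor.X, cursor.Y, window);
        }
    }
}
```

Had CursorPoint been declared as a class instead, the plain parameter form would already pass a pointer, and adding ref or out would produce a pointer to a pointer.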

Surprisingly, you often find that marshaling managed value types is preferable to marshaling managed reference types. One reason is that managed structures are somewhat more similar to C structures than are managed classes. The runtime never reorganizes the field layouts of managed structures written in C# or Visual Basic .NET, and the garbage collector never moves unboxed values around in memory. Most structure marshaling can be performed using either value types or reference types. However, unless you have a compelling reason to use a class to marshal data to native code, you should generally use structures.

StructLayoutAttribute

You may have noticed that StructLayoutAttribute is absent from the structure definition in Figure 3. This is because the C# compiler emits one automatically for structures. However, you may still need to use StructLayoutAttribute explicitly with your value type definitions for its other features. StructLayoutAttribute allows you to specify four marshaling settings for your managed reference and value types. They are as follows: layout kind, character set, field packing, and size.
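If you want to see what layout the compiler actually chose for a type, reflection will tell you. The following sketch uses the IsLayoutSequential and IsAutoLayout properties of System.Type; the PlainStruct, PlainClass, and DefaultLayoutDemo names are hypothetical and exist only for illustration.

```csharp
using System;

// Neither type carries a StructLayoutAttribute in source.
struct PlainStruct
{
    public Byte A;
    public Int32 B;
}

class PlainClass
{
    public Byte A;
    public Int32 B;
}

static class DefaultLayoutDemo
{
    // True when the compiler marked the struct as sequential on our behalf.
    public static bool StructIsSequential()
    {
        return typeof(PlainStruct).IsLayoutSequential;
    }

    // True when the class was left with automatic layout.
    public static bool ClassIsAuto()
    {
        return typeof(PlainClass).IsAutoLayout;
    }

    static void Main()
    {
        Console.WriteLine("struct sequential: {0}", StructIsSequential());
        Console.WriteLine("class auto:        {0}", ClassIsAuto());
    }
}
```

Both lines should report True when compiled with the C# compiler, which matches the defaults described in this section.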

The LayoutKind enumerated value of StructLayoutAttribute can be any of three values: Auto, Sequential, and Explicit. We have already addressed Sequential, which is typically the setting you will use and which is the layout emitted by the C# and Visual Basic .NET compilers when compiling value types not explicitly attributed with a StructLayoutAttribute. Reference types not marked with StructLayoutAttribute default to using Auto layout, which is not a valid setting for code that interoperates with native functions. Finally, there is the Explicit value.

When you set LayoutKind to Explicit, you are telling the marshaler that you want to lay out the fields in your type on a field-by-field basis. Using the Explicit setting also requires you to apply FieldOffsetAttribute to each field in your data type, specifying the relative offset (in bytes) of each field. You do not usually need the full flexibility of LayoutKind.Explicit; however, it is useful for marshaling more advanced data structures, such as unions in C, where fields overlap one another.
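As an illustration of the union case, the sketch below overlays a Single and a UInt32 at the same offset, much as a C union of float and unsigned int would. The FloatBits and UnionDemo names are hypothetical and not taken from the column's download.

```csharp
using System;
using System.Runtime.InteropServices;

// Both fields start at offset 0, so they share the same four bytes.
[StructLayout(LayoutKind.Explicit)]
struct FloatBits
{
    [FieldOffset(0)] public Single AsSingle;
    [FieldOffset(0)] public UInt32 AsUInt32;
}

static class UnionDemo
{
    // Stores a Single and reads the same bytes back as a UInt32.
    public static UInt32 BitsOf(Single value)
    {
        FloatBits overlay = new FloatBits();
        overlay.AsSingle = value;
        return overlay.AsUInt32;
    }

    // Marshaled size of the union, in bytes.
    public static int NativeSize()
    {
        return Marshal.SizeOf(typeof(FloatBits));
    }

    static void Main()
    {
        Console.WriteLine("size = {0}", NativeSize());             // 4
        Console.WriteLine("1.0f = 0x{0:X8}", BitsOf(1.0f));        // 0x3F800000
    }
}
```

Because the fields overlap, the structure is only as large as its largest member, which is the behavior native code expects from a union.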

The CharSet property of StructLayoutAttribute lets you specify whether Char and String fields in your type are marshaled as one-byte ANSI values or two-byte Unicode values. The CharSet enumerated type includes Ansi, Unicode, and Auto settings. The default is Ansi and this is almost never what you want. Instead, when marshaling a structure containing text, you should generally use the Auto setting, which uses ANSI on Windows 95-based systems and Unicode on Windows NT®-based systems. We'll take a look at an example of marshaling a data structure containing text in just a moment.
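The effect of CharSet on the marshaled layout shows up directly in the size the marshaler reports. The sketch below declares the same fields twice, once with CharSet.Ansi and once with CharSet.Unicode, using an inline 128-character string of the kind discussed in the next section. The VersionInfoAnsi, VersionInfoUnicode, and CharSetDemo names are hypothetical, and the sizes in the comments are what I would expect on Windows.

```csharp
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Ansi)]
class VersionInfoAnsi
{
    public UInt32 OSVersionInfoSize, MajorVersion, MinorVersion, BuildNumber, PlatformId;
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 128)]
    public String CSDVersion;
}

[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
class VersionInfoUnicode
{
    public UInt32 OSVersionInfoSize, MajorVersion, MinorVersion, BuildNumber, PlatformId;
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 128)]
    public String CSDVersion;
}

static class CharSetDemo
{
    public static int AnsiSize()
    {
        return Marshal.SizeOf(typeof(VersionInfoAnsi));
    }

    public static int UnicodeSize()
    {
        return Marshal.SizeOf(typeof(VersionInfoUnicode));
    }

    static void Main()
    {
        // Five DWORDs (20 bytes) plus 128 one-byte characters: 148 on Windows.
        Console.WriteLine("Ansi:    {0} bytes", AnsiSize());
        // Five DWORDs (20 bytes) plus 128 two-byte characters: 276.
        Console.WriteLine("Unicode: {0} bytes", UnicodeSize());
    }
}
```

With CharSet.Auto the marshaler picks one of these two layouts based on the platform, which is why the same setting should also appear on the DllImportAttribute of the function that receives the structure.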

The Pack and Size properties of StructLayoutAttribute allow you to specify the packing boundary for your fields and the size of your structure, respectively. The default packing size is eight bytes, which is consistent with most of the type definitions in the Win32 SDK. The size of your data type in memory is usually based on the fields in the type definition. But you can use the Size property to manually set the memory footprint of your type. The Marshal.SizeOf method returns the size, in bytes, of the marshaled version of a managed object or type.
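A quick way to see Pack and Size at work is to compare what Marshal.SizeOf and Marshal.OffsetOf report for a few small structures. This is only a sketch with hypothetical type names (DefaultPacked, TightlyPacked, Padded, PackSizeDemo); it does not call any native code.

```csharp
using System;
using System.Runtime.InteropServices;

// Default packing: the Int32 is aligned on a 4-byte boundary.
struct DefaultPacked
{
    public Byte Tag;
    public Int32 Value;
}

// Pack = 1: no padding is inserted between fields.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
struct TightlyPacked
{
    public Byte Tag;
    public Int32 Value;
}

// Size = 16: the footprint is extended beyond the declared fields.
[StructLayout(LayoutKind.Sequential, Size = 16)]
struct Padded
{
    public Int32 Value;
}

static class PackSizeDemo
{
    public static int SizeOf(Type type)
    {
        return Marshal.SizeOf(type);
    }

    public static int OffsetOfValue(Type type)
    {
        return Marshal.OffsetOf(type, "Value").ToInt32();
    }

    static void Main()
    {
        Console.WriteLine("DefaultPacked: size {0}, Value at {1}",
            SizeOf(typeof(DefaultPacked)), OffsetOfValue(typeof(DefaultPacked)));
        Console.WriteLine("TightlyPacked: size {0}, Value at {1}",
            SizeOf(typeof(TightlyPacked)), OffsetOfValue(typeof(TightlyPacked)));
        Console.WriteLine("Padded:        size {0}", SizeOf(typeof(Padded)));
    }
}
```

The default-packed structure reports 8 bytes with Value at offset 4, the Pack = 1 version reports 5 bytes with Value at offset 1, and the Size = 16 version reports 16 bytes even though it declares a single Int32.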

Non-Blittable Marshaling

So far we have only looked at marshaling cases where the runtime is able to pass to native code a pointer that points directly to managed memory. Two things make this possible: pinning and a memory layout consistent with what native code expects. Sometimes, however, it is impractical or impossible for your managed memory to be laid out in the same arrangement as a native equivalent data structure.

In these cases, the runtime makes a copy of your managed object using native memory and native layout. A pointer to the copy is then passed to the native call. If the parameter to the extern method is also marked as [Out] using OutAttribute, then the marshaler will also copy the memory back into your managed object when the call returns. In these cases, you may have to use MarshalAsAttribute to help the marshaler understand what the native memory layout is supposed to look like. Structures and classes that require marshaling in this fashion are called non-blittable types, and they incur substantially more overhead from the runtime due to the memory allocations and copies. Note, though, that some non-blittable types do not need MarshalAs. There are also cases where you could apply MarshalAs (such as marking an integer with I4) even though the setting would have no effect. So while the presence of MarshalAsAttribute does not by itself tell you whether a type is blittable or non-blittable, the attribute is often useful or necessary for marshaling non-blittable types.

Let's look at an example. The code in Figure 4 is excerpted from header files in the Win32 Platform SDK. It shows the ANSI versions of the GetVersionEx API function and OSVERSIONINFO structure. The OSVERSIONINFO structure contains an inline array of 128 CHAR values. This field is interesting for two reasons. It is not directly expressible in C# as arrays are always reference types. Also, it is a character array and therefore could be a total of 128 bytes or 256 bytes depending on whether ANSI or Unicode is used. Figure 5 contains the C# code that calls this API.

Figure 4 Native GetVersionEx

    typedef struct _OSVERSIONINFOA {
        DWORD dwOSVersionInfoSize;
        DWORD dwMajorVersion;
        DWORD dwMinorVersion;
        DWORD dwBuildNumber;
        DWORD dwPlatformId;
        CHAR  szCSDVersion[ 128 ]; // Maintenance string for PSS usage
    } OSVERSIONINFOA, *POSVERSIONINFOA, *LPOSVERSIONINFOA;

    WINBASEAPI BOOL WINAPI GetVersionExA(
        IN OUT LPOSVERSIONINFOA lpVersionInformation
    );

Figure 5 Managed Non-Blittable Call to GetVersionEx

    using System;
    using System.Runtime.InteropServices;

    class Interop {
       static void OutputVersionInfo() {
          OSVersionInfo info = new OSVersionInfo();
          GetVersionEx(info);
          Console.WriteLine("Extra version info string = {0}",
             info.CSDVersion);
       }

       [DllImport("Kernel32", CharSet=CharSet.Auto)]
       static extern Boolean GetVersionEx(
          [Out][In]OSVersionInfo versionInformation);

       [StructLayout(LayoutKind.Sequential, CharSet = CharSet.Auto)]
       class OSVersionInfo {
          public UInt32 OSVersionInfoSize =
             (UInt32) Marshal.SizeOf(typeof(OSVersionInfo));
          public UInt32 MajorVersion = 0;
          public UInt32 MinorVersion = 0;
          public UInt32 BuildNumber = 0;
          public UInt32 PlatformId = 0;

          // Attribute used to indicate marshaling for String field
          [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 128)]
          public String CSDVersion = null;
       }
    }

Notice that the OSVersionInfo class definition in Figure 5 has replaced the inline CHAR array in the native structure with a String object reference in the managed object. Notice also that the field is attributed with MarshalAsAttribute, which specifies that the field should be marshaled as a ByValTStr with a size of 128 characters. I could have chosen to marshal this field as a managed Char[] as well; however, that would have the same memory layout as String, so I chose String for the sake of convenience. See Figure 6 for a picture of the memory layout differences between the native structure and the managed representation of the native structure.

Figure 6 Layout Differences in Native and Managed Memory

Figure 6 should make it clear why the runtime can't just pass a pointer to the managed object directly to the native API. The native API is expecting a memory layout similar to the block on the left, where the char array is part of the memory block. The native API would simply treat the pointer/reference field in the managed memory block as two or four char values and would continue to clobber memory after the managed object's block. This is why the runtime must make a second copy of the managed structure with a layout equivalent to the native structure.

In brief, when cases arise in which there is no way to directly express the memory layout of the target structure, you can use MarshalAsAttribute to specify alternate memory arrangements for the marshaled version of the data. In all cases like this, the CLR will have to copy the managed fields to a native memory buffer before calling the native API.
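The copy that the marshaler performs for a non-blittable type can be imitated by hand with Marshal.StructureToPtr and Marshal.PtrToStructure. The sketch below is an approximation of the idea rather than a description of the CLR's internals; the Label and CopyDemo names are hypothetical, and CharSet.Unicode is chosen only so that the inline characters are easy to read back as 16-bit values.

```csharp
using System;
using System.Runtime.InteropServices;

// Non-blittable: the managed String becomes an inline character array
// in the native layout.
[StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
class Label
{
    public Int32 Id;
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 8)]
    public String Text;
}

static class CopyDemo
{
    // Copies a Label into native memory, peeks at the first inline character,
    // and copies the native block back into a new managed object.
    public static string RoundTrip(string text, out char firstNativeChar)
    {
        Label managed = new Label();
        managed.Id = 42;
        managed.Text = text;

        IntPtr native = Marshal.AllocHGlobal(Marshal.SizeOf(typeof(Label)));
        try
        {
            // Managed fields are converted into the native layout here.
            Marshal.StructureToPtr(managed, native, false);

            // The characters sit inline, right after the 4-byte Id field.
            firstNativeChar = (char)Marshal.ReadInt16(native, 4);

            // Copy back, as happens for a parameter marked [Out].
            Label copy = (Label)Marshal.PtrToStructure(native, typeof(Label));
            return copy.Text;
        }
        finally
        {
            Marshal.FreeHGlobal(native);
        }
    }

    static void Main()
    {
        char first;
        string text = RoundTrip("hello", out first);
        Console.WriteLine("{0} / {1}", first, text); // prints "h / hello"
    }
}
```

Every call with a non-blittable argument pays for an allocation and at least one such conversion, which is where the extra overhead comes from.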

In native code, there is no equivalent to managed metadata. This is why you have to carefully define your managed types to match the resulting native memory expected by the target function. However, the native code's lack of knowledge about the passed-in buffer also allows you to make manual adjustments to meet your needs, for example to handle considerations such as performance.

For instance, an alternative implementation of the managed OSVersionInfo class may choose to ignore the string returned in the last field and just leave that field out of the class definition. Of course, in an implementation such as this, the Size property of the StructLayoutAttribute must be used to extend the size of the instance to leave space for the data; otherwise a call to the native function would cause memory corruption.
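A sketch of that idea follows. It is not the implementation from the column's download: the OSVersionInfoNoString and SizeReserveDemo names are hypothetical, I use a struct passed by ref rather than a class, and I bind explicitly to the ANSI entry point GetVersionExA so that the reserved size of 148 bytes (five DWORDs plus 128 CHARs) is unambiguous. The native call is attempted only on Windows.

```csharp
using System;
using System.Runtime.InteropServices;

// Only the numeric fields are declared. Size reserves room for the 128-byte
// szCSDVersion array that the native function still writes.
[StructLayout(LayoutKind.Sequential, Size = 148)]
struct OSVersionInfoNoString
{
    public UInt32 OSVersionInfoSize;
    public UInt32 MajorVersion;
    public UInt32 MinorVersion;
    public UInt32 BuildNumber;
    public UInt32 PlatformId;
}

static class SizeReserveDemo
{
    [DllImport("kernel32", EntryPoint = "GetVersionExA")]
    static extern Boolean GetVersionEx(ref OSVersionInfoNoString info);

    // Marshaled size of the structure, including the reserved tail.
    public static int NativeSize()
    {
        return Marshal.SizeOf(typeof(OSVersionInfoNoString));
    }

    static void Main()
    {
        Console.WriteLine("Reserved size: {0} bytes", NativeSize());

        if (Environment.OSVersion.Platform != PlatformID.Win32NT)
        {
            Console.WriteLine("Skipping native call: not running on Windows.");
            return;
        }

        OSVersionInfoNoString info = new OSVersionInfoNoString();
        info.OSVersionInfoSize = (UInt32)NativeSize();
        if (GetVersionEx(ref info))
        {
            Console.WriteLine("Version {0}.{1} build {2}",
                info.MajorVersion, info.MinorVersion, info.BuildNumber);
        }
    }
}
```

The type contains only integer fields, so it stays blittable; the trade-off is that the service pack string is written into the reserved bytes but is not exposed to managed code.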

Similarly, it is possible to define the last field as 128 individual char fields. Doing this, of course, is tedious. But this up-front effort pays off in back-end performance. The sample code for this column, available on the MSDN® Magazine Web site, includes an implementation of OSVersionInfo that takes this approach. Look for OSVersionInfoBlittable in the sources. The OSVersionInfoBlittable structure is both a value type and blittable. Calls to GetVersionExBlit will be substantially faster than calls to the counterpart function in Figure 5. Meanwhile, you can use a public property to provide friendly String access to the otherwise unfriendly block of Char fields. The get accessor method in the CSDVersion property uses unsafe C# syntax to create a String object using the address of the first Char in the block of Chars. This has the effect of deferring the conversion portion of the marshaling until code accesses the String. It also limits the conversion to the String portion of the data structure.

Also included in the download is a performance test that runs the blittable and non-blittable interop methods a million times while timing them. On my system the blittable call to GetVersionEx was over 10 times faster than the non-blittable counterpart.

A Word About Complexity

The P/Invoke facilities of the CLR are very complete and should allow for marshaling of many different kinds of data to native code. As the complexity of your data structures increases, however, you will find more use for unsafe code blocks in C#, the MarshalAsAttribute, and other detailed tools in the interop arsenal. Getting interop right can prove tedious as things get complex.

Part of what makes interop complicated using C# is the fact that C# does not know how to natively read C/C++ header files. If it did, it could help out significantly in your P/Invoke work. However, the C++ compiler is capable of building managed code, and can read standard header files directly. You may find that, for interoperation with certain native functions, a helper DLL written using managed C++ saves some trouble, compared to P/Invoking directly from C#.

That's it for this month's column. Browse over to https://www.pinvoke.net, a site created by Adam Nathan to allow developers to find, edit, and add P/Invoke signatures for public consumption. This has helped many people P/Invoke to Win32 and other APIs without having to battle the complexity alone.

Send your questions and comments for Jason to dot-net@microsoft.com.

Jason Clark provides training and consulting for Microsoft and Wintellect and is a former developer on the Windows NT and Windows 2000 Server team. He is the coauthor of Programming Server-side Applications for Microsoft Windows 2000 (Microsoft Press, 2000). You can get in touch with Jason at JClark@Wintellect.com.