Wednesday, December 22, 2010

Advanced Topics in PInvoke String Marshaling

Introduction

The .NET Platform Invoke tools, used through the DllImportAttribute, are a powerful and simple mechanism for interfacing with unmanaged DLLs. However, several subtleties arise around who owns a string buffer once it is passed to unmanaged code. This article covers some of the additional options besides the default MarshalAs(UnmanagedType.LPStr).

Background
So you've bought into .NET hook, line, and sinker, but you still have a bunch of native-code DLLs around that you want to make use of. Platform Invoke provides a mechanism to wrap those native DLLs, but unfortunately some Platform Invoke situations are not as simple as you might think from reading the basic Platform Invoke documentation.
What if the native code expects to own the string after the call? What if you'd like to marshal UTF8 strings instead of ANSI/ASCII strings? What if the parameter is really an out-buffer that the target is going to write to? These are just a few of the realities that exist when interfacing with native-code DLLs. While you'll learn that it's easy to handle any of these situations using PInvoke, they are all distinctly different and require different code.

Using the code

We're not going to cover the basic concepts of PInvoke here. For that we recommend you review one of the many excellent tutorials available elsewhere. Instead, we're going to consider some of the different ways one can use PInvoke to interact with a native-code DLL entry point declared as:

  void my_function(char *data);  

There are several possible contracts this C code could have with us over the character pointer data. Below are a few of those contracts. In all cases we assume a null-terminated ANSI/ASCII string.

1. data may be read-only during the lifetime of the call, and never stored by native code
2. data may be modified during the lifetime of the call, and never stored by native code
3. data may be adopted by native code as its own, where native code expects to free it later

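
To make the three contracts concrete, here is a minimal sketch of what each could look like on the native side. All function and variable names below (my_function_readonly, my_function_inplace, my_function_adopt, my_function_release, last_length, adopted) are hypothetical and exist only for illustration; a real DLL would simply document which contract its my_function follows. The adopt case assumes the caller allocated the string with malloc, since the native code later releases it with free.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Contract 1: data is only read during the call and the pointer is not kept. */
size_t last_length;                 /* hypothetical place to record a result */

void my_function_readonly(char *data)
{
    assert(data != NULL);
    last_length = strlen(data);     /* read, never written, never stored */
}

/* Contract 2: data is modified in place during the call; pointer is not kept. */
void my_function_inplace(char *data)
{
    assert(data != NULL);
    for (char *p = data; *p != '\0'; ++p)
        *p = (char)toupper((unsigned char)*p);
}

/* Contract 3: native code adopts the pointer and frees it later.
 * Assumption: the caller allocated data with malloc. */
char *adopted;

void my_function_adopt(char *data)
{
    free(adopted);                  /* release any previously adopted string */
    adopted = data;                 /* the caller must not free data now */
}

void my_function_release(void)
{
    free(adopted);
    adopted = NULL;
}
```

All three variants share the same parameter list, which is why the C declaration alone cannot tell the managed caller which marshaling strategy is safe.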
If you've worked through a PInvoke tutorial, you already know how to handle case #1 above. We simply declare the entry point and specify the built-in LPStr marshaler.

    [DllImport("mydllname")]
    static extern void my_function([MarshalAs(UnmanagedType.LPStr)] string data);

The attribute decorator is simple, and automatically handles case #1 above. Before the call, the built-in Marshaler allocates a fixed-location buffer for a null-terminated string, and copies an ANSI/ASCII compatible version of the managed string into the buffer.  After the call, the Marshaler automatically frees the buffer, making sure not to leak memory.
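
The allocate, copy, call, free sequence described above can be imitated in plain C to show why it only suits the read-only contract. This is an analogy under stated assumptions, not the CLR's actual implementation: call_with_temporary_copy, native_fn, record_length, overwrite_first, and observed_length are hypothetical names, and malloc stands in for whatever allocator the marshaler really uses.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef void (*native_fn)(char *data);

/* Imitates the default string marshaling lifecycle:
 * copy the source into a temporary buffer, call the native function
 * with that copy, then free the copy.
 * Returns 0 on success, -1 if the temporary buffer cannot be allocated. */
int call_with_temporary_copy(const char *source, native_fn fn)
{
    assert(source != NULL && fn != NULL);

    size_t size = strlen(source) + 1;   /* include the null terminator */
    char *buffer = malloc(size);
    if (buffer == NULL)
        return -1;

    memcpy(buffer, source, size);
    fn(buffer);                         /* callee sees only the copy */
    free(buffer);                       /* copy is gone after the call */
    return 0;
}

/* Hypothetical read-only callee: works as expected with a temporary copy. */
size_t observed_length;

void record_length(char *data)
{
    observed_length = strlen(data);
}

/* Hypothetical modifying callee: its change lands in the copy and is lost. */
void overwrite_first(char *data)
{
    if (data[0] != '\0')
        data[0] = 'X';
}
```

With this lifecycle, a callee that writes to the buffer has its changes discarded, and a callee that keeps the pointer is left holding freed memory, which is why the other two contracts need different handling.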

Read more: CodeProject