Specify a character set

The DllImportAttribute.CharSet field controls string marshalling and determines how platform invoke finds function names in a DLL. This topic describes both behaviors.

Some APIs export two versions of functions that take string arguments: narrow (ANSI) and wide (Unicode). The Windows API, for instance, includes the following entry-point names for the MessageBox function:

  • MessageBoxA

    Provides 1-byte character ANSI formatting, distinguished by an "A" appended to the entry-point name. Calls to MessageBoxA always marshal strings in ANSI format.

  • MessageBoxW

    Provides 2-byte character Unicode formatting, distinguished by a "W" appended to the entry-point name. Calls to MessageBoxW always marshal strings in Unicode format.
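
The size difference between the two formats can be illustrated with a small sketch. It assumes that Encoding.ASCII is an acceptable stand-in for a 1-byte narrow encoding (the actual ANSI code page depends on the system locale) and that Encoding.Unicode (UTF-16) represents the wide format. The CharWidthDemo class and its method names are hypothetical and are not part of any API discussed here.

```csharp
using System;
using System.Text;

internal static class CharWidthDemo
{
    // Assumption: Encoding.ASCII stands in for a 1-byte "narrow" encoding.
    // The real ANSI code page varies with the system locale.
    internal static int NarrowByteCount(string text)
    {
        return Encoding.ASCII.GetByteCount(text);
    }

    // Encoding.Unicode is UTF-16 little-endian: 2 bytes per UTF-16 code unit.
    internal static int WideByteCount(string text)
    {
        return Encoding.Unicode.GetByteCount(text);
    }
}
```

For the five-character string "Hello", NarrowByteCount returns 5 and WideByteCount returns 10. Neither count includes the null terminator that a native string carries.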

String marshalling and name matching

The CharSet field accepts the following values:

Ansi (default value)

  • String marshalling

    Platform invoke marshals strings from their managed format (Unicode) to ANSI format.

  • Name matching

    When the DllImportAttribute.ExactSpelling field is true, as it is by default in Visual Basic, platform invoke searches only for the name you specify. For example, if you specify MessageBox, platform invoke searches for MessageBox and fails when it cannot locate the exact spelling.

    When the ExactSpelling field is false, as it is by default in C++ and C#, platform invoke searches for the unmangled alias first (MessageBox), then the mangled name (MessageBoxA) if the unmangled alias is not found. Notice that ANSI name-matching behavior differs from Unicode name-matching behavior.

Unicode

  • String marshalling

    Platform invoke copies strings from their managed format (Unicode) to Unicode format.

  • Name matching

    When the ExactSpelling field is true, as it is by default in Visual Basic, platform invoke searches only for the name you specify. For example, if you specify MessageBox, platform invoke searches for MessageBox and fails if it cannot locate the exact spelling.

    When the ExactSpelling field is false, as it is by default in C++ and C#, platform invoke searches for the mangled name first (MessageBoxW), then the unmangled alias (MessageBox) if the mangled name is not found. Notice that Unicode name-matching behavior differs from ANSI name-matching behavior.

Auto

  • Platform invoke chooses between ANSI and Unicode formats at run time, based on the target platform. On current (Windows NT-based) versions of Windows, Auto resolves to Unicode.
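
The name-matching rules for Ansi and Unicode can be summarized as a small lookup. The following is only a sketch of the search order described above, not the runtime's actual implementation. EntryPointProbe and ProbeOrder are hypothetical names, and Auto is deliberately left unsupported because its behavior is decided at run time.

```csharp
using System;
using System.Runtime.InteropServices;

internal static class EntryPointProbe
{
    // Hypothetical helper: returns the entry-point names in the order
    // described in the text. It does not load or inspect any DLL.
    internal static string[] ProbeOrder(
        string name, CharSet charSet, bool exactSpelling)
    {
        if (exactSpelling)
        {
            // Only the name exactly as written is searched for.
            return new[] { name };
        }

        switch (charSet)
        {
            case CharSet.Ansi:
                // Unmangled alias first, then the "A" name.
                return new[] { name, name + "A" };
            case CharSet.Unicode:
                // "W" name first, then the unmangled alias.
                return new[] { name + "W", name };
            default:
                // Auto is resolved at run time, so it is not modeled here.
                throw new NotSupportedException(
                    "Only CharSet.Ansi and CharSet.Unicode are modeled.");
        }
    }
}
```

Under these assumptions, ProbeOrder("MessageBox", CharSet.Unicode, false) yields MessageBoxW followed by MessageBox, while the same call with CharSet.Ansi yields MessageBox followed by MessageBoxA.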

Specify a character set in Visual Basic

You can specify character-set behavior in Visual Basic by adding the Ansi, Unicode, or Auto keyword to the declaration statement. If you omit the character-set keyword, the DllImportAttribute.CharSet field defaults to the ANSI character set.

The following example declares the MessageBox function three times, each time with different character-set behavior. The first statement omits the character-set keyword, so the character set defaults to ANSI. The second and third statements explicitly specify a character set with a keyword.

Friend Class NativeMethods
    Friend Declare Function MessageBoxA Lib "user32.dll" (
        ByVal hWnd As IntPtr,
        ByVal lpText As String,
        ByVal lpCaption As String,
        ByVal uType As UInteger) As Integer

    Friend Declare Unicode Function MessageBoxW Lib "user32.dll" (
        ByVal hWnd As IntPtr,
        ByVal lpText As String,
        ByVal lpCaption As String,
        ByVal uType As UInteger) As Integer

    Friend Declare Auto Function MessageBox Lib "user32.dll" (
        ByVal hWnd As IntPtr,
        ByVal lpText As String,
        ByVal lpCaption As String,
        ByVal uType As UInteger) As Integer
End Class

Specify a character set in C# and C++

The DllImportAttribute.CharSet field identifies the underlying character set as ANSI or Unicode. The character set controls how string arguments to a method should be marshalled. Use one of the following forms to indicate the character set:

// C#
[DllImport("DllName", CharSet = CharSet.Ansi)]
[DllImport("DllName", CharSet = CharSet.Unicode)]
[DllImport("DllName", CharSet = CharSet.Auto)]

// C++
[DllImport("DllName", CharSet = CharSet::Ansi)]
[DllImport("DllName", CharSet = CharSet::Unicode)]
[DllImport("DllName", CharSet = CharSet::Auto)]

The following example shows three managed definitions of the MessageBox function, each attributed to specify a character set. The first definition omits the CharSet field, so it defaults to the ANSI character set.

using System;
using System.Runtime.InteropServices;

internal static class NativeMethods
{
    [DllImport("user32.dll")]
    internal static extern int MessageBoxA(
        IntPtr hWnd, string lpText, string lpCaption, uint uType);

    [DllImport("user32.dll", CharSet = CharSet.Unicode)]
    internal static extern int MessageBoxW(
        IntPtr hWnd, string lpText, string lpCaption, uint uType);

    [DllImport("user32.dll", CharSet = CharSet.Auto)]
    internal static extern int MessageBox(
        IntPtr hWnd, string lpText, string lpCaption, uint uType);
}

// C++
using namespace System;
using namespace System::Runtime::InteropServices;

typedef void* HWND;

// The declarations below are alternatives that show each character set in turn.

// ANSI (default): can use MessageBox or MessageBoxA.
[DllImport("user32")]
extern "C" int MessageBox(
    HWND hWnd, String* lpText, String* lpCaption, unsigned int uType);

// Unicode: can use MessageBox or MessageBoxW.
[DllImport("user32", CharSet = CharSet::Unicode)]
extern "C" int MessageBoxW(
    HWND hWnd, String* lpText, String* lpCaption, unsigned int uType);

// Auto: must use MessageBox.
[DllImport("user32", CharSet = CharSet::Auto)]
extern "C" int MessageBox(
    HWND hWnd, String* lpText, String* lpCaption, unsigned int uType);
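
When the name search described earlier is not wanted, one option is to name the export explicitly and turn on ExactSpelling. The following C# sketch assumes a runtime that provides RuntimeInformation.IsOSPlatform (for example, .NET Framework 4.7.1 or later). ExplicitNativeMethods and TryShow are hypothetical names introduced for illustration.

```csharp
using System;
using System.Runtime.InteropServices;

internal static class ExplicitNativeMethods
{
    // EntryPoint names the wide export directly; ExactSpelling = true
    // tells platform invoke not to search for any other spelling.
    [DllImport("user32.dll", EntryPoint = "MessageBoxW",
        CharSet = CharSet.Unicode, ExactSpelling = true)]
    private static extern int MessageBox(
        IntPtr hWnd, string lpText, string lpCaption, uint uType);

    // Hypothetical wrapper: returns false without calling into user32.dll
    // when the process is not running on Windows.
    internal static bool TryShow(string text, string caption)
    {
        if (!RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
        {
            return false;
        }

        const uint MB_OK = 0x00000000; // message box with a single OK button
        MessageBox(IntPtr.Zero, text, caption, MB_OK);
        return true;
    }
}
```

Calling ExplicitNativeMethods.TryShow("Hello", "Caption") on Windows displays a message box and blocks until it is dismissed. On other platforms the native function is never invoked, so user32.dll is never loaded.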
