Server-Side Programming with Unicode

To make a database Unicode-aware involves defining Unicode-aware client interactions in addition to using the nchar, nvarchar, and nvarchar(max) data types to define Unicode storage. You can define Unicode-aware client interactions by performing the following on the database server side:

  • Switch from non-Unicode data types to Unicode data types in table columns and in CONVERT() and CAST() operations.

  • Substitute using ASCII() and CHAR() functions with their Unicode equivalents, UNICODE() and NCHAR().

  • Define variables and parameters of stored procedures and triggers in Unicode.

  • Prefix Unicode character string constants with the letter N.

Using UNICODE(), NCHAR(), and Other Functions

The ASCII() function returns the non-Unicode character code of the character passed in. Therefore, use the counterpart UNICODE() function for Unicode strings where you would use the ASCII function on non-Unicode strings. The same is true of the CHAR function; NCHAR is its Unicode counterpart.

Because the SOUNDEX() function is defined based on English phonetic rules, it is not meaningful on Unicode strings unless the string contains only the Latin characters A through Z and a through z.

ASCII, CHAR, and SOUNDEX can be passed Unicode parameters, but these arguments are implicitly converted to non-Unicode strings. This could cause the possible loss of Unicode characters before processing, because these functions operate on non-Unicode strings by definition.

Besides the UNICODE() and NCHAR() functions, the following string manipulation functions support Unicode wherever possible: CHARINDEX(), LEFT(), LEN(), UPPER(), LOWER(), LTRIM(), RTRIM(), PATINDEX(), REPLACE(), QUOTENAME(), REPLICATE(), REVERSE(), STUFF(), SUBSTRING(), UNICODE(). These functions accept Unicode arguments, respect the 2-byte character boundaries of Unicode strings, and use Unicode sorting rules for string comparisons when the input parameters are Unicode.

Defining Parameters in Stored Procedures

Defining parameters with a Unicode data type guarantees that client requests or input are implicitly converted to Unicode on the server and not corrupted in the process. If the parameter is specified as an OUTPUT parameter, a Unicode type also minimizes the chance of corruption on its way back to the client.

In the following stored procedure, the variable is declared as a Unicode data type.

    @name nvarchar(40)
SELECT p.ListPrice, v.Name
    FROM Production.Product p 
        INNER JOIN Purchasing.ProductVendor pv
            ON p.ProductID = pv.ProductID  
        INNER JOIN Purchasing.Vendor v
            ON pv.VendorID = v.VendorID
WHERE p.Name = @name;

Using the N Prefix

Unicode string constants that appear in code executed on the server, such as in stored procedures and triggers, must be preceded by the capital letter N. This is true even if the column being referenced is already defined as Unicode. Without the N prefix, the string is converted to the default code page of the database. This may not recognize certain characters.

For example, the stored procedure created in the previous example can be executed on the server in the following way:

EXECUTE Product_Info @name = N'Chain'

The requirement to use the N prefix applies to both string constants that originate on the server and those sent from the client.