ODBC Driver 17 for SQL Server on Linux. Charset conversion problem with OTRS (bug?)

Marc Fauser 1 Reputation point
2021-09-21T20:07:20.137+00:00

I updated from ODBC Driver 13 for SQL Server to version 17.
With version 13, everything was working fine, with version 17, I cannot find what's wrong.

We have OTRS running which connects to a SQL Server database to retrieve data for customers.
E.g. we transfer if the customer has a maintenance contract or not.
Yes = thumb up emoji which I cannot post in this forum ( 4 byte character )
No = thumb down emoji which I cannot post in this forum ( 4 byte character )
This was working fine until version 17.
The select (simplified) was
SELECT N'<thumb up emoji>' as maintenance
I changed the select to
SELECT cast( N'<thumb up emoji>' as nvarchar(10) ) as maintenance
and it works again.
In my odbc.ini, I have no charset defined. In OTRS I have defined UTF-8 as a source and destination charset.
Any hints on what could be wrong or is it a bug in the driver as it was working before?

SQL Server
SQL Server
A family of Microsoft relational database management and analysis systems for e-commerce, line-of-business, and data warehousing solutions.
9,775 questions
{count} votes

2 answers

Sort by: Most helpful
  1. Olaf Helper 28,936 Reputation points
    2021-09-22T05:29:25.477+00:00

    Yes = thumb up emoji which I cannot post in this forum ( 4 byte character )

    4 byte char's are UTF-8, while MS SQL Server mainly supports 2 byte's char's = Unicode.
    Which data type do you use to store the text and which SQL Server version are you using?

    cast( N'<thumb up emoji>' as nvarchar(10) )

    And the type nvarchar here is Unicode, not UTF-8.


  2. YufeiShao-msft 6,986 Reputation points
    2021-09-22T07:16:27.127+00:00

    Hi @Marc Fauser ,

    Upon connection, the driver detects the current local of the process it is loaded in, if it used one of the support encoding, the driver uses that encoding for SQLCHAR data, otherwise, it defaults to UTF-8.
    Since all process start in the ‘C’locale by default (and cause the driver to default to UTF-8), if an application needs to use one of the encodings, it should use the setlocale function to set the locale appropriately before connecting

    In a typical Linux environment where the encoding is UTF-8, users of ODBC Driver 17 upgrading form 13 or 13.1 won’t observe any differences, however, applications that use a non-UTF-8 encoding need to use that encoding for data to/from the driver instead of UTF-8.
    https://learn.microsoft.com/en-us/sql/connect/odbc/linux-mac/programming-guidelines?view=sql-server-ver15

    So you should check the character sets/encodings of ODBC 17.

    and you should konw, UTF-8 is supported is SQL Server Version 2019, but not previous Versions, they support only ASCII and UniCode