Determining length of character in different code set

Brian Hart 21 Reputation points
2022-01-09T08:34:09.7+00:00

I encountered a situation yesterday I have never seen before in 18 years of using Access as middleware to import from Excel to DB2.

The DB2 ODBC driver will normally truncate an inbound string to match the field length when Access passes a string longer than the destination DB column length in an insert or update statement. This is good because I do not have to check the length of inbound strings in my code; I just let the ODBC driver truncate for me.

But today a spreadsheet included an en dash (ASCII 150, inserted using Alt-0150). This would not have been a problem had it not been a string of exactly 40 characters, matching the length of the destination field in the DB2 table. When my VBA code attempted to run the INSERT statement, it silently failed to create the record in the DB2 database. I had foreseen the possibility of some record not being inserted (although certainly not for this reason), and the only way we knew there was a failure was that my users' SOP includes checking the resulting totals in DB2 against the source spreadsheet after the import.

Diagnostics:

1. Testing the insert statement in my DB2 UI correctly generated a "too long" error.
2. My VBA code is this: CurrentDb.Execute "[SQL INSERT statement here]", dbFailOnError. The insert failed without generating the "too long" error I get in the DB2 UI. So although there was a DB2 error, it was apparnetly not passed back to Access.
3. Access: selecting the length of this string returns 40
4. UTF-8 encoded DB2 database: selecting the length of this string returns 42 (this is the live environment)
5. IBM-1252 encoded DB2 database: selecting the length of this string returns 40. So this is specific to UTF-8 destination database.
6. The ODBC driver does not attempt to truncate, so I infer that it sees the length as 40.
7. Testing in Notepad++ with just the en dash clarified one thing: with default UTF-8 encoding, the column and position before the en dash are both one, and after the en dash, column is 2 and position is 4. If I switch to ANSI, it gives me this: A–A.

While the underlying issue is DB2-related, I need to preempt failed inserts by recognizing strings that will silently overflow a DB2 field length before I run the SQL statement. Within VBA, how can I determine whether a specific character will be recognized as having a length greater than one in some other encoding, UTF-8, in this case?

Microsoft 365 and Office Access Development
0 comments No comments
{count} votes

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.