Maybe you should test the DATALENGTH(VARCHARTEST) instead of DATALENGTH(NVARCHARTEST).
SQL Server 2019 Developer Edition: Collation: Latin1_General_100_CI_AS_SC_UTF8: Varchar is taking 2 bytes for ASCII Characters. For example the upper case letter 'A'
Hi,
Problem as described in Question.
Steps followed:
Step 1:
SELECT name, collation_name
FROM sys.databases
WHERE name = N'MyOptionsTest2';
CREATE TABLE SCTEST4 (ID INT,
VARCHARTEST VARCHAR(30),
NVARCHARTEST NVARCHAR(60),
);
GO
INSERT INTO SCTEST4 VALUES(1,'A', N'A')
SELECT *,LEN(NVARCHARTEST) as LENVC, DATALENGTH(NVARCHARTEST) as DATALENVC,ASCII(VARCHARTEST) as ASCIIVARCHAR, UNICODE(VARCHARTEST) as UNICODEVARCHAR, LEN(NVARCHARTEST) as LENNVC, DATALENGTH(NVARCHARTEST) as DATALENNVC, ASCII(NVARCHARTEST) as ASCIINVARCHAR, UNICODE(NVARCHARTEST) as UNICODENVARCHAR from SCTEST4
Just to Cross verify the collation on the column VARCHARTEST:
SELECT
name,
collation_name
FROM sys.columns
WHERE name = N'VARCHARTEST';
It appears VARCHAR takes 2 bytes for ASCII characters too, ? (This is of course UTF-8 collation which seem to be introduced in SQL Server 2019)
Can anyone help answer this ?
4 additional answers
Sort by: Most helpful
-
Olaf Helper 45,371 Reputation points
2022-10-17T12:50:41.717+00:00 What have you expected by a UTF-8 codepage?
ASCII always uses 1 byte, while UTF-8 stores up to 4 bytes, see UTF-8 support => Storage -
CathyJi-MSFT 22,331 Reputation points Microsoft Vendor
2022-10-18T02:03:50.58+00:00 Hi @Aswin ,
Quote from MS document.
With UTF-8 encoding, characters in the ASCII range (000000–00007F) require 1 byte, code points 000080–0007FF require 2 bytes, code points 000800–00FFFF require 3 bytes, and code points 0010000–0010FFFF require 4 bytes.
If the answer is helpful, please click "Accept Answer" and kindly upvote it. If you have extra questions about this answer, please click "Comment".
-
-
Aswin 21 Reputation points
2022-10-18T08:19:53.737+00:00 Everything was fine but there was small typo ,mistake in the below:
/*******************
SELECT *,LEN(NVARCHARTEST) as LENVC, DATALENGTH(NVARCHARTEST) as DATALENVC,ASCII(VARCHARTEST) as ASCIIVARCHAR, UNICODE(VARCHARTEST) as UNICODEVARCHAR, LEN(NVARCHARTEST) as LENNVC, DATALENGTH(NVARCHARTEST) as DATALENNVC, ASCII(NVARCHARTEST) as ASCIINVARCHAR, UNICODE(NVARCHARTEST) as UNICODENVARCHAR from SCTEST4
*****************/I was looking at DATALENGTH(NVARCHARTEST) instead of DATALENGTH(VARCHARTEST) -- The second attribute in above query