Hi Glenn
For the character fields in SQL Server, the value in the parentheses is not the number of characters but the number of bytes available for that field. The number of bytes per character is then determined by the collation: a UTF-8 collation allows multi-byte characters to be stored in char/varchar columns, so LEN(column) (which counts characters) and DATALENGTH(column) (which counts bytes) may return different results for the same value.
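A quick way to see the difference (a minimal sketch; assumes SQL Server 2019 or later, where UTF-8 collations are available):

```sql
-- Sketch: with a UTF-8 collation, one character can occupy several bytes,
-- so LEN() (characters) and DATALENGTH() (bytes) diverge.
CREATE TABLE #utf8demo (s varchar(20) COLLATE Latin1_General_100_CI_AS_SC_UTF8);
INSERT INTO #utf8demo VALUES ('héllo');

SELECT LEN(s)        AS char_count,  -- 5 characters
       DATALENGTH(s) AS byte_count   -- 6 bytes ('é' takes 2 bytes in UTF-8)
FROM #utf8demo;

DROP TABLE #utf8demo;
```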
I'm not clear on what you mean by the amount of memory that can be used. Do you mean the space the value takes up on the data page? There is a difference here between char and varchar. A char column always reserves the number of bytes given in its definition, even when the actual value is NULL. For varchar, variable-length columns that are NULL don't consume space in the variable-length data portion of the row; they do, however, consume two bytes per column in the row's column offset array.
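To illustrate the fixed vs. variable reservation (a sketch using a throwaway temporary table):

```sql
-- Sketch: char(10) is padded out to its declared width;
-- varchar(10) stores only the bytes actually inserted.
CREATE TABLE #demo (fixed char(10), variable varchar(10));
INSERT INTO #demo VALUES ('abc', 'abc');

SELECT DATALENGTH(fixed)    AS fixed_bytes,    -- 10 (value plus trailing padding)
       DATALENGTH(variable) AS variable_bytes  -- 3
FROM #demo;

DROP TABLE #demo;
```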
varchar(max) is a different approach. When using MAX you can store up to 2 GB in the column. You should only plan to use MAX if you know that your data will have values greater than 8000 bytes (the largest size you can declare for a plain varchar).
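In a table definition that typically looks something like this (a sketch; the table and column names are made up for illustration):

```sql
-- Sketch: reserve MAX for columns that genuinely need more than 8000 bytes.
CREATE TABLE dbo.Notes (
    NoteId  int IDENTITY PRIMARY KEY,
    Summary varchar(250),   -- short, predictable text: declare a realistic size
    Body    varchar(max)    -- only because values may exceed 8000 bytes
);
```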
The way the actual data is stored on the pages depends upon the length of the individual rows. If all of the row's data is less than 8060 bytes in total, the row is stored in IN_ROW_DATA allocation units. If the length spills over this, one or more columns' data is moved to ROW_OVERFLOW_DATA allocation units. varchar(max) values are stored on separate LOB_DATA pages (although values small enough to fit are kept in-row by default).
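You can check where a table's data actually lives via the catalog views (a sketch, using the hypothetical dbo.Notes table from above):

```sql
-- Sketch: list the allocation unit types (IN_ROW_DATA, ROW_OVERFLOW_DATA,
-- LOB_DATA) a table uses and how many pages each holds.
SELECT au.type_desc, SUM(au.total_pages) AS total_pages
FROM sys.partitions AS p
JOIN sys.allocation_units AS au
     ON (au.type IN (1, 3) AND au.container_id = p.hobt_id)       -- in-row / row-overflow
     OR (au.type = 2       AND au.container_id = p.partition_id)  -- LOB
WHERE p.object_id = OBJECT_ID('dbo.Notes')
GROUP BY au.type_desc;
```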
You may also need to consider memory allocation when doing activities such as sorting. The memory grant required for the query is based on the defined size of the columns, not the actual data: SQL Server assumes that varchar columns are filled to 50% of their defined size.
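As a rough worked example: sorting 1,000,000 rows that include a varchar(4000) column is costed at about 1,000,000 × 2,000 bytes ≈ 2 GB of memory grant for that column alone, even if every actual value is only a few characters long; the same data declared as varchar(100) would be costed at roughly 50 MB.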
What this suggests is that you should really consider your data (and potential data) and size your columns appropriately. Minimise the risk of triggering ROW_OVERFLOW_DATA pages, as this has a negative performance impact. If you "know" that rows will be longer than 8060 bytes, then move some columns off to another table that you can JOIN to when you need all of the columns; this will actually perform better. A sketch of that kind of split follows.
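For example (a sketch with made-up names), splitting a wide, rarely-used column into its own table:

```sql
-- Sketch: keep the main table narrow; JOIN to the wide column only when needed.
CREATE TABLE dbo.Product (
    ProductId int PRIMARY KEY,
    Name      nvarchar(100)
);

CREATE TABLE dbo.ProductDescription (
    ProductId int PRIMARY KEY REFERENCES dbo.Product (ProductId),
    LongText  nvarchar(max)
);

-- Fetch the wide column only when it is actually required:
SELECT p.Name, d.LongText
FROM dbo.Product AS p
JOIN dbo.ProductDescription AS d ON d.ProductId = p.ProductId;
```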
SQL database concerns
I am currently making the change from Access to a SQL database. I am setting up a database and there is one thing that I am unable to confirm. For my text fields I have selected nvarchar(). When setting the value for the maximum field size (expressed in characters), does the number reflect the maximum amount of memory that can be used, with a minimum amount of memory set aside for the field? I am using field interchangeably with record, but I am sure that the difference can be easily discerned.

I have concerns about the size setting. I am confused as to exactly how much memory is set aside for an empty nvarchar() field. It is my understanding that the size you fill into the parentheses, e.g. nvarchar(250), sets the maximum size of the field to that value, without affecting the amount of memory set aside. There are some fields that I want to be nvarchar(MAX). I am especially concerned about the amount of memory that is set aside for this type of field. There will be a lot of them. I don't mind a memory-hog record if there is a lot of content, but I don't want to leave a lot of unused memory sitting on the table.
Azure SQL Database
Martin Cairney · 2,261 Reputation points · 2023-04-20
2 additional answers
Alberto Morillo · 34,671 Reputation points · MVP · Volunteer Moderator · 2023-04-20

To my knowledge, a character in an nvarchar field takes 2 bytes, so nvarchar(250) means up to 500 bytes of storage.
An nvarchar(MAX) column can store up to 2 GB of characters.
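A small sketch of what the byte counting means in practice:

```sql
-- Sketch: nvarchar stores UTF-16, so each (non-supplementary) character
-- occupies 2 bytes.
DECLARE @n nvarchar(250) = N'abc';

SELECT LEN(@n)        AS char_count,  -- 3 characters
       DATALENGTH(@n) AS byte_count;  -- 6 bytes
```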
Try to avoid nvarchar(max) where you can. Queries using nvarchar(n) are usually faster (as much as twice as fast) than the same queries using nvarchar(max). And if you put the same data into an nvarchar(250) column in one table and an nvarchar(max) column in another table (the rows being otherwise identical), the row size ends up about 50% bigger in the table using nvarchar(max).
Glenn Walker · 251 Reputation points · 2023-04-21

Thanks. That pretty much alleviates my concerns. I generally only use MAX for detail fields in tables that I use for comboboxes, since those never hold more than a handful of records. In retrospect, though, it is probably not a good idea for me to have a field that lets someone stuff 2 GB of data into a table that can have many, many records. A meg or two is OK, but when you start getting into gigs, it is probably something that should be discontinued. I understand that I am using characters and memory interchangeably here, but for what I wanted to understand, that is acceptable. I accidentally placed this response into a comment.