varchar(max) field on dedicate FG would be for best managing and move out to dedicated table if the varchar(max) is empty

Alen Cappelletti 1,047 Reputation points
2023-10-25T15:19:48.08+00:00

Hi all,

I got a table like this (see below) with roughly 180 000 records. It's big... close to 100 GB (2 varchar(max) fields).

I used PAGE compression for reduce space size... but... anyway it's great.

I would like to know your opinion, if could be better move out to another FG the 2 varchar(max) field.
Now are on the same PRIMARY filegroup. Naturally this operation will not reduce space but I could operate more accordingly when I need it for maintenance operations

I use to dedicated FG_user for all user tables, but I didn't create this DB from the beginning... also I've never separated this kind of TYPE on dedicated FG.

I noticed that one of the two fields [REPORT_XML] is almost always empty. It would be appropriate to reduce the weight by managing it in a dedicated external table like (CODICE_ID, REPORT_XML) to reduce the weight of the original one or even if it's empty it will minimally affect the size of the table

CREATE TABLE [dbo].[report](
	[CODICE_ID] [int] IDENTITY(1,1) NOT NULL,
	[CODICE_SOGGETTO_ID] [int] NULL,
	[TIPO_REPORT] [char](2) NULL,
	[REPORT_HTML] [varchar](max) NULL,
	[REPORT_XML] [varchar](max) NULL,
	[DATA_INSERIMENTO] [datetime] NULL,
	[STATO] [varchar](1) NOT NULL,
	[ORIGINE] [varchar](1) NOT NULL,
	[USER] [varchar](60) NOT NULL,
 CONSTRAINT [PK__CODICE_ID] PRIMARY KEY CLUSTERED 
(
	[CODICE_ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO

Thanks Alen.

SQL Server | Other
0 comments No comments
{count} votes

2 answers

Sort by: Most helpful
  1. Erland Sommarskog 122.7K Reputation points MVP Volunteer Moderator
    2023-10-25T21:34:04.06+00:00

    100 GB for 180 000 rows suggests an average row size of 550 KB.

    If you want to move the LOB columns to a different filegroup, you can do that with TEXTIMAGE_ON, but I don't think this is possible with ALTER TABLE, so you would need to create a new table and copy data over. Well, if you are to move the LOB to that other filegroup it has to be copied anyway. And if you roll your own, you can take 1000 rows at a time to avoid log explosions.

    Moving the LOB columns to separate tables? Could be an alternative, if you want to have the LOB data on a separate filegroup to keep down the size of the primary filegroup, to permit quicker recovery in cases of a disaster.

    It may also be worth investigating the actual length by running

    SELECT SUM(datalength(REPORT_HTML)), SUM(datalengtth(REPORT_XML)) 
    FROM dbo.report
    

    It could be that there is a lot of empty space. Out-or-row LOB data does not always behave well when you update.

    0 comments No comments

  2. Alen Cappelletti 1,047 Reputation points
    2023-10-26T09:30:04.7733333+00:00

    The main problem was the size of that table. I decided to use a varbinary(max) column by introducing the values into it with the COMPRESS function and then excluding the native varchar(max) column.

    The weight of the tables turned out to be incredibly small!

    The weight saving even on an unvalued varchar(max) column is ridiculous... it seems to me to be only 2 bytes overhead for column.

    So the problem was the content of the fields. A dedicated FG for 8 GB is more an academic question... but I don't think it will overwhelm my life in maintenance routines.

    Thanks ALEN

    SELECT TOP(1) CAST(DECOMPRESS([REPORT_HTML_COMP]) AS VARCHAR(MAX)) AS [REPORT_HTML] FROM [dbo].[REPORT]
    
    CREATE TABLE [dbo].[report](
    	[CODICE_ID] [int] IDENTITY(1,1) NOT NULL,
    	[CODICE_SOGGETTO_ID] [int] NULL,
    	[TIPO_REPORT] [char](2) NULL,
    	[REPORT_HTML] [varbinary](max) NULL,
    	[REPORT_XML] [varbinary](max) NULL,
    	[DATA_INSERIMENTO] [datetime] NULL,
    	[STATO] [varchar](1) NOT NULL,
    	[ORIGINE] [varchar](1) NOT NULL,
    	[USER] [varchar](60) NOT NULL,
     CONSTRAINT [PK__CODICE_ID] PRIMARY KEY CLUSTERED 
    (
    

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.