so I have a situation where our SharePoint DB grows very big which is double what shows up in the storage metric. I have re-run $web.RecalculateStorageMetrics() and it still show that much.
This is SharePoint Server 2016 and SQL Server 2016 Enterprise installation, it is used as a document repository.
- Most files are pdf and around 1/3 are DOCX.
- Just recently we implemented FS so we could reduce content DB size.
- After setup FS, we ran migrate command from PowerShell and after 3 weeks it is finally done.
This is where things get very confusing, our content database size currently at 2TB (from shrink database GUI) and 4% free space. Where in storage metric, our repository size is at 909.9GB (99.77% from total size). There is more than 1TB size that we didn't know who used it.
I thought that this is used by ex blob data that has been moved to FS, so I have tried several things:
- run DBCC CHECKDB (I didn't see any error).
- run DBCC CLEANTABLE to every 200 FileStream tables.
- execute sp_filestream_force_garbage_collection (return 0).
- rebuild index on filestream table.
- run a full backup every week and run an incremental backup every day.
- I have also done shrink with the option reorganize files before releasing.
- Running "Microsoft.Data.SqlRemoteBlobs.Maintainer.exe -ConnectionStringName RBSMaintainerConnection -Operation GarbageCollection ConsistencyCheck ConsistencyCheckForStores -GarbageCollectionPhases rdo -ConsistencyCheckMode r -TimeLimit 1440" (I didn't see anything got deleted).
After much searching using this query "select sum(case when blob_data_inline is null then 1 else 0 end) CountFS, sum(case when blob_data_inline is null then blob_size else 0 end) SizeFS, sum(case when blob_data_inline is not null then 1 else 0 end) CountInline, sum(case when blob_data_inline is not null then blob_size else 0 end) SizeInline from mssqlrbs_filestream_data_1.rbs_filestream_data_x", I have determined that many spaces are being occupied by rbs_filestream_data_x. There are 2 column varbinary(max), one with FILESTREAM column (blob_data_filestream) and another without it (blob_data_inline). Each table has a max of 200k rows with around 50-70k filestream column and the rest go to inline column. That inline column size is around 900GB, while FS size is around the same size with blob folder property.
That filestream table is what keeping this DB so large, is this the correct FS implementation? I thought that all blob data would be moved out from content DB. But in this case, it still holds data in the content DB.