Replication: Infinite Retention Period Causing Performance Issues

A little explanation before explaining how to resolve this issue:

The retention period for subscriptions specified for each publication controls how often the Publisher and Subscribers should synchronize. If subscriptions do not synchronize with the Publisher within the retention period, they are marked as 'expired' and will need to be reinitialized. This is to prevent old Subscriber data from synchronizing and uploading these changes to the Publisher. The default retention period for a publication is 14 days. Because the Merge Agent cleans up the publication and subscription databases based on this value, care must be taken to configure this value appropriate to the application.

Merge Meta Data Cleanup:
When there is a large amount of merge meta data in the system tables, cleaning up the meta data improves the performance of merge replication. Prior to SQL Server 2000 Service Pack 1 (SP1), meta data could be cleaned up only by running sp_mergecleanupmetadata. However, SQL Server 2000 SP1 and later includes retention-based meta data cleanup, which means that meta data can be automatically deleted from the following system tables:

MSmerge_contents

MSmerge_tombstone

MSmerge_genhistory

The default retention period for publications is 14 days. If an article belongs to several publications, there might be different retention periods. In that situation, the longest retention period is used to determine the earliest possible time that cleanup can occur.

Important If there are multiple publications on a database, and any one of those publications uses an infinite publication retention period (@retention=0), merge meta data for the database will not automatically be cleaned up. For this reason, use infinite publication retention with caution.

https://msdn.microsoft.com/en-us/library/aa256294(SQL.80).aspx

https://msdn.microsoft.com/en-us/library/ms178557.aspx
https://msdn.microsoft.com/en-us/library/ms151778.aspx

Question:

What to do when we find ourselves in a situation in which we have a large amount of data in the metadata tables, an infinite retention period and performance problems?

How does one go about cleaning all the metadata without causing even further issues?

Answer:

Most important thing to know is that one should never go from “infinite retention” to “a couple of days” in one step. If you do so, then the first cleanup will try to remove a lot of data. Depending on the number of rows in the system tables, this most likely will cause a problem - probably worse that the initial performance issue.

To resolve this issue, we need to do reduce the retention period gradually:

Evaluate the number of rows that will be deleted:

1. Use coldate from Msmerge_genshistory to get the list of generations that will be removed
2. With that list (generation) get the number of rows that will be deleted in other system tables

Example:

Select generation from dbo.MSMerge_genhistory where coldata < '[current data - retention period you desire]'

Example:

So, from today, 22nd of June 2009, i would like a retention of 14 days: < 8th of June 2009

With the list of generations generated, we can see how many rows of data will need to be deleted from the other system tables (MSmerge_contents / MSmerge_tombstone) as well. If the number is quite high, then we increase the datetime: (example) < 1st of May 2009

Depending on the amount of data needed to be deleted, one may need "one to many" intermediate steps before reaching the target retention desired.

The key to this being a success is estimating the effect of the deletes that needs to be executed (by default batch of 5000 rows).

Note:

Once again, its important to emphasize the care one must take when deciding to have the retention of a publication to infinite.