Secondary Replica not Synchronising in Basic Availability Group

SteveUK 116 Reputation points
2020-10-02T08:03:40.65+00:00

Morning All,

We have an issue with our secondary replica server in a Basic Availability group in that it stopped synchronising a while ago. There are three separate groups, each with one database (as there are BAGs). Each of the databases is in a similar state.

As the primary server can't truncate the logs when it takes a backup because the secondary is not up-to-date, the log drives keep filling up and the databases / backups are getting large to.

We tried rebooting the secondary as we understand that should re-initiate a sync but get the following error:

The remote copy of database <db name> has not been rolled forward to a point in time that is encompassed in the local copy of the database log.

Therefore, we are presuming that the secondary is so far out of date that the primary doesn't have the necessary log information to play into the secondary.

What's the best way to remedy this situation? We are presuming that we need to restore a fresh copy of the data but not sure the best way to do this with minimal downtime (the application does handle outages fairly well as it has a functional local copy of the data that it uses).

This article sort-of covers it but not quite our situation (it presumes that the database info is still available on the secondary) but we can't find any other articles that explain the necessary steps:
https://www.sqlpassion.at/archive/2017/09/25/adding-a-replica-back-into-a-sql-server-availability-group/

Would the best approach be to:

  1. Remove the database from the availability group on the secondary (not sure if this is possible) and then add back in / reseed
  2. Remove the database from the AG on the primary, copy data to the secondary and then add back in
  3. Just restore a recent backup of the database / logs from the primary to the secondary while still in AG (not sure if this is possible if it's still in the AG)
  4. Remove the replica server from the AG together, delete the database and then re-seed from Primary when adding back into AG.

Grateful for any advice!

Steve

SQL Server
SQL Server
A family of Microsoft relational database management and analysis systems for e-commerce, line-of-business, and data warehousing solutions.
13,204 questions
0 comments No comments
{count} votes

Accepted answer
  1. Cris Zhan-MSFT 6,611 Reputation points
    2020-10-02T10:03:07.663+00:00

    Hi @SteveUK ,

    Are you saying that the secondary database in 3 BAGs are all out of sync? (state: Not Synchronizing). Is there a problem with the network connection between the primary replica and the secondary replica?

    You can check DashBoard on the primary replica to view the health of the replicas and the databases, and check whether the Errorlog on each replica records any error message.

    If there are no network issues, try to manually suspend the data movement of the availability database on the secondary replica, and then resume.
    If it does not work, try to remove the database from the availability group on the secondary replica and rejoin it.


    If the answer is helpful, please click "Accept Answer" and upvote it.
    Note: Please follow the steps in our documentation to enable e-mail notifications if you want to receive the related email notification for this thread.

    0 comments No comments

2 additional answers

Sort by: Most helpful
  1. SteveUK 116 Reputation points
    2020-10-02T16:17:45.72+00:00

    Hi, thanks for your reply. We've managed to get all three databases back up and synchronising!

    You replied as we were half-way through the remedial action on the first database and that gave me more confidence that we were on the correct lines.

    All of the DBs were 'not synchronising' and they wouldn't re-sync; the error reported was that there was not enough info in the logs on the primary to play back on the secondary as mentioned in original post. I think as some point both have been primary in turn without having synchronised with the other and so had become out of step.

    What we did to recover was:

    1. Removed the secondary server as a replica in the group (on the primary - this set db to 'restoring' on the secondary)
    2. Backed up the DB and log files on the primary
    3. Restored these to the secondary with 'replace' and 'with norecovery'
    4. Added the replica back into the AG on the primary, which brought the db back online and started synchronising again.

    It then took a couple of attempts of backing up the transaction log for two of the databases and then shrinks to get the files back to a reasonable size (one was 15GB, one was 6GB - have learnt about VLFs today!)

    I know now that removing the secondary as a replica was probably a bit overkill but as we're using 3 x Basic AGs then it wasn't an issue and was very quick. I suppose it would be different if it were a standard AG with multiple databases and we perhaps should have used the 'remove database' on the secondary as you suggested.

    For reference, here are the Microsoft Learn that we followed:

    https://learn.microsoft.com/en-us/sql/database-engine/availability-groups/windows/remove-a-secondary-replica-from-an-availability-group-sql-server?view=sql-server-ver15

    https://learn.microsoft.com/en-us/sql/database-engine/availability-groups/windows/add-a-secondary-replica-to-an-availability-group-sql-server?view=sql-server-ver15

    https://learn.microsoft.com/en-us/sql/database-engine/availability-groups/windows/manually-prepare-a-secondary-database-for-an-availability-group-sql-server?view=sql-server-ver15

    https://learn.microsoft.com/en-us/sql/database-engine/availability-groups/windows/join-a-secondary-database-to-an-availability-group-sql-server?view=sql-server-ver15

    Best Regards, Steve

    0 comments No comments

  2. Sreekanth Madambath 86 Reputation points
    2020-10-04T16:11:42.007+00:00

    When you plan to backup restore as your option so you can do below step as well to speed up the process.
    You are not required remove secondary server, you can remove the DBs from the always on in primary.
    Once you remove, change the recovery model to Simple and shrink the log file.
    Then revert back the recovery to Full
    Take a fresh full , log and get sync with secondary.

    This will help to avoid multiple log backups to take after reconfiguring all to shrink.