SQL Server AOAG went into "Not Synchronizing/Pending Recovery" state

Heisenberg 261 Reputation points
2024-02-23T06:06:59.96+00:00

Hi folks. Last week we had maintenance on one of our two-node AOAG setups, a synchronous-commit, manual-failover availability group. The task was simple: restart the SQL Server services on the primary node. If I understand correctly, no failover is required in this situation, so we just restarted the SQL Server services on the primary node. However, when the services came back up, all databases showed the "Not Synchronizing/Pending Recovery" state. We waited 5 minutes, but the databases did not recover, so we failed over to the second node, where all databases showed a synchronized state. After that we tried to fail back to the original primary, which was in the norecovery state, but the databases there remained in "Not Synchronizing/Pending Recovery". We therefore failed over again to the original secondary, where the databases were accessible, and ended up rebuilding the AOAG on the original primary node. What might have happened? I am attaching the error log for reference. Also, are there any steps we could have taken in such a case to recover the databases on the original primary node? Thank you.

SQL Server

Accepted answer
  1. Erland Sommarskog 110.4K Reputation points MVP
    2024-02-27T22:17:58.4+00:00

    I found a lot of scary errors in the error log. Here are just a few:

    2024-02-14 22:43:15.25 spid30s A file activation error occurred. The physical file name 'L:\MSSQL\Log\SVData_Log1.ldf' may be incorrect. Diagnose and correct additional errors, and retry the operation.

    2024-02-14 22:43:15.25 spid32s Error: 5105, Severity: 16, State: 1.

    2024-02-14 22:43:15.25 spid32s A file activation error occurred. The physical file name 'L:\MSSQL\Log\ProcessesDB_Log1.ldf' may be incorrect. Diagnose and correct additional errors, and retry the operation.

    2024-02-14 22:43:15.25 spid30s Error: 945, Severity: 14, State: 2.

    2024-02-14 22:43:15.25 spid30s Database 'SVData' cannot be opened due to inaccessible files or insufficient memory or disk space. See the SQL Server errorlog for details.

    2024-02-14 22:43:15.25 spid32s Error: 945, Severity: 14, State: 2.

    2024-02-14 22:43:15.25 spid32s Database 'ProcessesDB' cannot be opened due to inaccessible files or insufficient memory or disk space. See the SQL Server errorlog for details.

    2024-02-14 22:43:15.25 spid40s Error: 829, Severity: 21, State: 1.

    2024-02-14 22:43:15.25 spid40s Database ID 19, Page (4:0) is marked RestorePending, which may indicate disk corruption. To recover from this state, perform a restore.

    2024-02-14 22:43:15.25 spid40s Error: 5105, Severity: 16, State: 1.

    2024-02-14 22:43:15.25 spid40s A file activation error occurred. The physical file name 'L:\MSSQL\Log\Client_Log1.ndf' may be incorrect. Diagnose and correct additional errors, and retry the operation.

    2024-02-14 22:43:15.25 spid40s Error: 945, Severity: 14, State: 2.

    2024-02-14 22:43:15.25 spid40s Database 'Client' cannot be opened due to inaccessible files or insufficient memory or disk space. See the SQL Server errorlog for details.

    I have no idea what is going on, but it seems that those L, J and other drives were not available when SQL Server started.
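One quick way to see which drives the file activation errors point at is to group the reported file paths by drive letter. The following is only a minimal Python sketch, assuming the error log has been saved as plain text; the function name `files_by_drive` and the regular expression are illustrative and not part of any SQL Server tooling.

```python
import re
from collections import defaultdict

# Matches the quoted Windows path in a "file activation error" message.
FILE_RE = re.compile(
    r"The physical file name '([A-Za-z]):\\([^']*)' may be incorrect"
)

def files_by_drive(log_text):
    """Group file paths named in file activation errors by drive letter."""
    drives = defaultdict(set)
    for match in FILE_RE.finditer(log_text):
        letter, rest = match.group(1).upper(), match.group(2)
        drives[letter].add(f"{letter}:\\{rest}")
    # Sorted lists make the summary stable and easy to read.
    return {letter: sorted(paths) for letter, paths in drives.items()}

# Sample lines taken from the error-log excerpt quoted in this thread.
SAMPLE_LOG = r"""
2024-02-14 22:43:15.25 spid30s A file activation error occurred. The physical file name 'L:\MSSQL\Log\SVData_Log1.ldf' may be incorrect. Diagnose and correct additional errors, and retry the operation.
2024-02-14 22:43:15.25 spid32s A file activation error occurred. The physical file name 'L:\MSSQL\Log\ProcessesDB_Log1.ldf' may be incorrect. Diagnose and correct additional errors, and retry the operation.
2024-02-14 22:43:15.25 spid40s A file activation error occurred. The physical file name 'L:\MSSQL\Log\Client_Log1.ndf' may be incorrect. Diagnose and correct additional errors, and retry the operation.
"""

if __name__ == "__main__":
    for letter, paths in files_by_drive(SAMPLE_LOG).items():
        print(f"Drive {letter}: {len(paths)} file(s) reported")
        for path in paths:
            print("   ", path)
```

If every reported file sits on the same one or two drive letters, that supports the idea that whole volumes were missing at startup rather than individual files being damaged.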

    As for why, I don't know. But I note that you seem to be on Amazon EC2, so you should probably talk to Amazon.


2 additional answers

  1. Debarchan Sarkar - MSFT 1,131 Reputation points Microsoft Employee
    2024-02-25T01:15:10.9033333+00:00

    I did not find any logs attached. Diagnosing the cause of an issue like this can be complex as it could be due to a variety of factors. However, based on your situation, here are a few potential reasons and solutions for the "Not Synchronizing/Pending Recovery" state you observed:

    Network or Connection Issues: If there were any temporary network interruptions or connection issues, the Always On availability groups might not have been able to communicate between the nodes, causing them to enter a Not Synchronizing/Pending Recovery state. You can verify this by checking the server error logs and cluster logs for any signs of network-related errors during that period.

    SQL Server Service Start-Up: Sometimes, the SQL Server service may take longer to start up, especially if there are many databases or if the databases are very large. During this time, the databases would appear in a recovery pending state. Usually, they should return to normal once the service is fully started; however, in your case, it's possible there was an unexpected prolongation of this state.

    Synchronization Issue: In synchronous commit mode, before transactions are committed on the primary replica, they must also be hardened on the log disk of the secondary replica. If there was an issue with data synchronization, it could put the database into the Not Synchronizing/Pending Recovery state.

    To recover from such situations, you can try the following steps. First, ensure there are no network issues preventing communication between the replicas. Next, check that the SQL Server services are running on both machines; if they are not, restart them. Finally, investigate the SQL Server error log for any messages related to Always On Availability Groups or database recovery.
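When investigating the error log, a tally of the `Error: N, Severity: S, State: T` lines can help decide where to look first. The following is a rough Python sketch under the assumption that the log is available as text; `summarize_errors` is a hypothetical helper name, and the sample lines are only illustrative entries in the usual error-log layout.

```python
import re
from collections import Counter

# Matches the header line SQL Server writes before each error message.
ERROR_RE = re.compile(r"Error: (\d+), Severity: (\d+), State: (\d+)")

def summarize_errors(log_text):
    """Return (error number, severity, count) rows, most severe first."""
    counts = Counter(
        (int(m.group(1)), int(m.group(2)))
        for m in ERROR_RE.finditer(log_text)
    )
    # Sort by severity descending, then error number for a stable order.
    return sorted(
        ((err, sev, n) for (err, sev), n in counts.items()),
        key=lambda row: (-row[1], row[0]),
    )

# Illustrative sample in the standard error-log format.
SAMPLE_LOG = """
2024-02-14 22:43:15.25 spid32s Error: 5105, Severity: 16, State: 1.
2024-02-14 22:43:15.25 spid30s Error: 945, Severity: 14, State: 2.
2024-02-14 22:43:15.25 spid32s Error: 945, Severity: 14, State: 2.
2024-02-14 22:43:15.25 spid40s Error: 829, Severity: 21, State: 1.
2024-02-14 22:43:15.25 spid40s Error: 5105, Severity: 16, State: 1.
2024-02-14 22:43:15.25 spid40s Error: 945, Severity: 14, State: 2.
"""

if __name__ == "__main__":
    for err, sev, n in summarize_errors(SAMPLE_LOG):
        print(f"error {err} (severity {sev}): {n} occurrence(s)")
```

Sorting by severity puts the most serious entries at the top, so they are reviewed before the more numerous lower-severity messages.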

    Without the actual error log, it's difficult to provide a more precise solution. However, these general guidelines should help you diagnose and potentially resolve the issue. In the future, before restarting SQL Server services or performing maintenance tasks, consider manually failing over the AOAG to the secondary node. This ensures continuity of service and mitigates the risk of encountering such issues.


  2. LiHongMSFT-4306 26,791 Reputation points
    2024-02-26T03:28:44.9166667+00:00

    Hi @Heisenberg

    You wrote: "I am attaching errorlog for the reference."

    I do not see the error log attached.

    Regarding this issue, check if this blog helps: Availability Group Database Reports Not Synchronizing / Recovery Pending After Database Log File Inaccessible.

    Best regards,

    Cosmog Hong



