Latency with hadr_capture_log_block in SQL Server AlwaysOn AG

Ramil_R 0 Reputation points
2023-02-16T14:51:28.4766667+00:00

We have a two-node Always On availability group (AG) in synchronous-commit mode. In the morning (09:00) we noticed that the queue on the application side started to grow, and SQL Server showed a high number of HADR_SYNC_COMMIT waits.
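For context, here is a minimal sketch of how the average HADR_SYNC_COMMIT wait per commit can be estimated from two snapshots of sys.dm_os_wait_stats (columns waiting_tasks_count and wait_time_ms). The WaitSnapshot type, the avg_wait_ms helper, and the counter values are hypothetical and only illustrate the arithmetic; they are not numbers from our system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WaitSnapshot:
    """One reading of the HADR_SYNC_COMMIT row in sys.dm_os_wait_stats."""
    waiting_tasks_count: int
    wait_time_ms: int


def avg_wait_ms(before: WaitSnapshot, after: WaitSnapshot) -> Optional[float]:
    """Average wait per task between two snapshots, or None if no new waits."""
    tasks = after.waiting_tasks_count - before.waiting_tasks_count
    if tasks <= 0:
        return None
    return (after.wait_time_ms - before.wait_time_ms) / tasks


# Hypothetical counter values, taken some minutes apart.
before = WaitSnapshot(waiting_tasks_count=1_000, wait_time_ms=5_000)
after = WaitSnapshot(waiting_tasks_count=1_500, wait_time_ms=130_000)
print(avg_wait_ms(before, after))
```

Using the difference between two snapshots rather than the raw counters avoids mixing in waits accumulated since the last restart.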

Following the article https://techcommunity.microsoft.com/t5/sql-server-blog/troubleshooting-high-hadr-sync-commit-wait-type-with-always-on/ba-p/385369, we configured an Extended Events session on the primary and on the secondary and gathered data for 10 minutes. We then switched the AG to asynchronous-commit mode, and the issue went away.

Here is what we got from the Extended Events sessions, analyzed as described in the article above.

On primary:

User's image

On secondary:

User's image

As you can see, the largest latency is on the primary, in hadr_capture_log_block between mode 2 and mode 3: about 249 ms.
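A minimal sketch of how such a per-log-block gap can be computed from exported Extended Events rows. It assumes each row has been exported as a dictionary with the keys name, log_block_id, mode, and timestamp; those key names, the mode_gap_ms helper, and the sample rows are assumptions for illustration, not the actual export format or our captured data.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def mode_gap_ms(events, start_mode=2, end_mode=3):
    """Return {log_block_id: milliseconds from start_mode to end_mode}."""
    seen = defaultdict(dict)
    for ev in events:
        if ev["name"] != "hadr_capture_log_block":
            continue
        # Keep the first timestamp seen for each (log block, mode) pair.
        seen[ev["log_block_id"]].setdefault(ev["mode"], ev["timestamp"])

    gaps = {}
    for block_id, modes in seen.items():
        if start_mode in modes and end_mode in modes:
            delta = modes[end_mode] - modes[start_mode]
            gaps[block_id] = delta / timedelta(milliseconds=1)
    return gaps


# Hypothetical rows; only the first log block has both modes captured.
t0 = datetime(2023, 2, 16, 9, 0, 0)
sample_events = [
    {"name": "hadr_capture_log_block", "log_block_id": 101, "mode": 2,
     "timestamp": t0},
    {"name": "hadr_capture_log_block", "log_block_id": 101, "mode": 3,
     "timestamp": t0 + timedelta(milliseconds=249)},
    {"name": "hadr_capture_log_block", "log_block_id": 102, "mode": 2,
     "timestamp": t0 + timedelta(milliseconds=300)},
    {"name": "log_flush_start", "log_block_id": 101, "mode": 0,
     "timestamp": t0},
]
gaps = mode_gap_ms(sample_events)
print(gaps)
```

Log blocks that are missing either mode are skipped rather than reported with a partial value, so an incomplete capture does not skew the averages.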

As far as I understand, the bottleneck was in the "Queue of DbMgrPartner", which was taking too long to process.
The question is: what is the root cause?

The network metrics in perfmon (bytes sent, bytes received) did not change after switching to asynchronous mode.

One interesting point in perfmon on the primary:

User's image

The three lines here are identical: Bytes Sent to Replica/sec, Bytes Sent to Transport/sec, and Log Bytes Flushed/sec.
10:35 is the time when we switched to asynchronous mode.

SQL Server | Other

1 answer

  1. Ronen Ariely 15,216 Reputation points
    2023-02-19T03:45:12.17+00:00

    Please check the following post on the Microsoft blog. It should cover what you need: it explains how to monitor and troubleshoot a high HADR_SYNC_COMMIT wait type with Always On availability groups.

    https://techcommunity.microsoft.com/t5/sql-server-blog/troubleshooting-high-hadr-sync-commit-wait-type-with-always-on/ba-p/385369

