Alwayon Leasetimeout

Sam 1,371 Reputation points
2020-12-29T08:19:52.513+00:00

Hi Experts,

Have some question related to Lease timeout behaviour in AG.

2020-12-28 02:14:39.540 Server Error: 19419, Severity: 16, State: 1.
2020-12-28 02:14:39.540 Server Windows Server Failover Cluster did not receive a process event signal from SQL Server hosting availability group 'SalesdwAG02' within the lease timeout period.
2020-12-28 02:14:39.550 Server Error: 19407, Severity: 16, State: 1.
2020-12-28 02:14:39.550 Server The lease between availability group 'SalesdwAG02' and the Windows Server Failover Cluster has expired. A connectivity issue occurred between the instance of SQL Server and the Windows Server Failover Cluster. To determine whether the availability group i
2020-12-28 02:14:39.550 Server Always On: The local replica of availability group 'SalesdwAG02' is going offline because either the lease expired or lease renewal failed. This is an informational message only. No user action is required.
2020-12-28 02:14:39.550 Server The state of the local availability replica in availability group 'SalesdwAG02' has changed from 'PRIMARY_NORMAL' to 'RESOLVING_NORMAL'. The state changed because the lease between the local availability replica and Windows Server Failover Clustering (WS

Questions

  1. Why and when Lease timeout happens?
  2. What value to be set for Lease timeout to start of with?
  3. When Lease timeout happended, we expected a Automatic failover to happen, but it didn't happen why? Any reason for this behaviour. We have configure AG with "Synchronous commit" , "Automatic failover".
  4. After changing the LeaseTimeout in windows failover cluster, do we need restart the nodes involved in AG to take into effect?
  5. In SSMS, If we go to AG properties , we have setting called "Session timeout" default is 10 seconds. Any recommendations of what value to be set?

Env:
SQL Server 2017 Enterprise Edition
Windows server 2019 datacenter

Regards,
Sam

SQL Server
SQL Server
A family of Microsoft relational database management and analysis systems for e-commerce, line-of-business, and data warehousing solutions.
12,808 questions
0 comments No comments
{count} votes

Accepted answer
  1. CathyJi-MSFT 21,096 Reputation points Microsoft Vendor
    2020-12-30T06:46:52.677+00:00

    Hi @Sam ,

    > Why and when Lease timeout happens?

    You need to check you SQL server error log and cluster log to find the reason and the time that this happened.

    > What value to be set for Lease timeout to start of with?

    You can set the values to 20000 milliseconds. The maximum value is 100,000 ms.

    > After changing the LeaseTimeout in windows failover cluster, do we need restart the nodes involved in AG to take into effect?

    No, you do not need to restart the nodes. Click the Apply tap to take it effect as below screenshot.

    52221-screenshot-2020-12-30-143530.jpg

    Please refer to MS document How It Works: SQL Server AlwaysOn Lease Timeout and Diagnose Unexpected Failover or Availability Group in RESOLVING State to get more information.

    > In SSMS, If we go to AG properties , we have setting called "Session timeout" default is 10 seconds. Any recommendations of what value to be set?

    We recommend that you keep the time-out period at 10 seconds or greater. Refer to MS document Modify the session-timeout period for an availability group replica to get more information.

    Best regards,
    Cathy


    If the response is helpful, please click "Accept Answer" and upvote it, thank you.
    Note: Please follow the steps in our documentation to enable e-mail notifications if you want to receive the related email notification for this thread.

    1 person found this answer helpful.

2 additional answers

Sort by: Most helpful
  1. Tom Phillips 17,716 Reputation points
    2020-12-29T13:44:13.467+00:00
    1 person found this answer helpful.
    0 comments No comments

  2. Anshul Farkya 111 Reputation points
    2020-12-30T07:22:45.027+00:00

    I have seen this scenario in my environment, and it was caused due to I/O issues on storage. It usually occurs while running maintenance jobs for large databases. It was resolved once we upgraded the storage to a higher performance one.
    As far as Automatic failover scenario is concerned, it is advised to have a Witness disk or file share (Cloud witness in case of Azure VM's) for quorum configuration, which can help to perform Automatic failover if communication between always-on nodes is lost.

    Also, please don't forget to change the lease-timeout values as recommended. :-)

    1 person found this answer helpful.