Connection Timeouts in Multi-subnet Availability Group
THE DEFINITION
One of the issues that generates a lot of call volume we see on the AlwaysOn team is dealing with connectivity issues to the availability group listener in multi-subnet environments.
A “multi-subnet” environment is defined when the OS cluster used as the backbone for AlwaysOn has server nodes that are located in multiple, different subnets. Usually there are only 2 subnets, however, there can be more.
When the availability group listener (AGL) is configured properly, it will have an IP address for each defined subnet and have an “OR” dependency on each of the IP addresses. By default, when it is brought online it will be registered in DNS by Windows Cluster. The cluster will submit all of the IP addresses that are in the dependency list and the DNS server will generally register an A record for each IP address. (If non Microsoft Windows DNS servers are used, the exact implementation can be different).
When a client operating system (OS) needs to resolve the AGL name to IP by querying the DNS server, the DNS server will return multiple IP addresses – one for each subnet. The listener IP address in the subnet currently hosting the availability group primary replica will be online. The other listener IP address(es) will be offline. Because not all of the IP addresses returned by DNS will be online, client applications can run into problems when attempting to connect to the listener.
THE PROBLEM
By default, the behavior of the SQL client libraries is to try all IP addresses returned by the DNS lookup - one after another (serially) until the all of the IP addresses have been exhausted and either a connection is made, or a connection timeout threshold has been reached. This can be problematic, because depending upon DNS configurations, the “correct” or “online” IP address may not be the first IP address returned. The default timeout for a TCP connection attempt is 21 seconds and if the first IP address attempted is not online, it will wait 21 seconds before attempting the next IP address. For each subsequent IP address, it will again have to wait 21 seconds before moving to the next IP address until the connection attempt times out or it establishes a connection to an IP address that responds.
The default connection timeout period for .NET client libraries is 15 seconds, therefore, some applications may experience intermittent connection timeouts – or delays in connecting – which can cause application delays or performance issues.
THE RESOLUTION
Beginning with the native client libraries for SQL 2012 as well as the .NET 4.5 libraries (earlier .NET libraries with hotfixes – see below in Appendix A), Microsoft added a new connection string parameter that can be added to change the connection behavior. This new parameter, MultiSubnetFailover, should be used and set to “TRUE.” When set to TRUE, the connection attempt behavior changes. It will no longer attempt all of the IP addresses serially, but in parallel. That is, all of the IP addresses that the availability group listener is dependent on will receive a SYN request at the TCP layer “in parallel” (technically one immediately after the other, but not waiting for acknowledgement – so effectively “in parallel”). This means that whichever IP address is online will be attempted immediately rather than waiting for any timeouts on IP addresses that are not online. The server will respond immediately and establish a connection, while the other IP addresses and their respective connection attempts will eventually timeout – but since the application is already connected it does not matter that those connection attempts timeout and fail.
THE PROBLEM 2 – THE SEQUEL
The client libraries by default do not enable this parameter (i.e. it is set to “FALSE”). The connection strings must be modified in order to ensure successful consistent and successful connections to a multi-subnet listener. Sometimes it isn’t possible to modify the connection strings – and so some applications will still encounter timeout issues when trying to connect. In some cases, the connection timeouts can be intermittent or they can be very consistent - depending upon the order in which IP addresses are returned.
RESOULTION 2 – THE SEQUEL
One option to resolve the issue if an application cannot use the MultiSubnetFailover parameter is to change the behavior of how the AGL is registered with DNS. This assumes that dynamic updating of DNS is allowed within the environment. If you are unsure, please check with your DNS administrators to determine if dynamic updating of DNS is allowed.
There are two parameters that affect how the AGL is registered with DNS. By modifying these parameters on the server we can transparently change the experience of the client OS in its name resolution caching.
The first parameter of interest is called RegisterAllProvidersIP. This parameter determines whether the Windows Cluster will register all of the IP addresses the AGL is dependent on, or only the one active IP address. When set to 1 (default if the AGL is created from SQL Server), the AGL clustered resource is created with all of the IP addresses the AGL is dependent on, registered in DNS. When set to 0, only the one active IP address is registered in DNS (the IP address in the subnet hosting the primary replica). (NOTE: if a Client Access Point is created using Windows Failover Cluster Manager, the RegisterAllProvidersIP parameter is set to 0 by default.)
The second parameter is called HostRecordTTL. This parameter governs how long (in seconds) before cached DNS entries on a client OS are expired, forcing the client OS to re-query the DNS server again to obtain the current IP address. By default, this value is 1200 (20 minutes). This means that after a client OS makes a call to the DNS server to resolve this name to an IP address, the client OS will cache that value for 20 minutes, only querying the DNS server again after that cached record expires. If this value is reduced to 120 or 60 for example, then the client OS cached copy will expire much more quickly.
This is important because during a failover in which the primary replica moves from one subnet to the other, the old IP address that was online is un-registered, and the new IP address that is brought online is registered. This updates DNS with the new IP address as soon as the AGL comes online, but client operating systems will not resolve the AGL name to the new IP address until the currently cached entry expires, which if it had just re-queried DNS immediately before the failover, the client OS would have to wait up to 20 minutes before expiring its cached copy and querying the DNS server again to get the new IP address. This causes the client OS to continue trying to connect to the OLD IP address until its cached copy has expired. By changing the HostRecordTTL parameter value to a much lower setting than 1200, it will cause the cached value to expire more quickly. So if set to 60, the client OS will only have to wait at most 60 seconds after a failover before acquiring the new IP address – allowing client operating systems to resolve to the new, correct IP address much sooner.
The drawback to setting the value to a lower number is how often the client OS will query the DNS server. If you have a handful of application servers, then changing the value from 1200 to 60 would probably have no perceptible impact on the DNS server(s). However, if there are thousands of client machines that all must resolve the AGL name to IP, this increases the load on the DNS server(s) and could cause problems.
A balance must be drawn between the lowest possible cache expiration time and the increased DNS server load.
The following PowerShell instructions show how to change the RegisterAllProvidersIP and HostRecordTTL settings. It is important to note that these settings cannot take effect until the AGL is brought offline and then online again, forcing it to re-register with DNS. Remember, the availability group is dependent on the AGL. If the AGL goes offline, so will the availability group and the databases. However, this dependency can be temporarily removed, allowing for the OFFLINE and re-ONLINE of the AGL without taking the availability group offline.
If there are applications or users that are actively using the AGL to connect to a replica (primary or secondary with read-only Routing), then the OFFLINE/ONLINE process will cause service interruption regardless of dependencies. Therefore, if the AGL is in use, perform the following steps during a maintenance window.
STEPS TO CHANGE
Availability group listener resource parameters: RegisterAllProvidersIP and HostRecordTTL
The following steps show how to turn off RegisterAllProvidersIP and reduce the Client OS DNS cache timeouts (HostRecordTTL) parameters, in the event that you cannot use the MultiSubnetFailover=True parameter on all connection strings.
If you prefer, there are two sample scripts (one TSQL and one PowerShell) in Appendix B at the end of this document that has all of the required commands already configured. These scripts can then be executed to perform all of the steps necessary.
NOTES:
- The “>” symbol at the beginning of each line represents the command prompt and should not be typed.
- Resource names are listed inside brackets (“<” and “>”) – do not include the brackets when typing the command for execution.
Before making any changes, it is necessary to get the PORT values for each listener defined in the AG. This is because temporarily removing the cluster resource dependencies for the AG resource on the AGL will eliminate the port assignment for the listener. If the port assigned was not 1433, the port needs to be specified again for the listener. Capturing the existing port assignments before making any changes will allow the restoration of the proper port assignments at the end of the script.
1. Capture the existing port assignment for each listener in the AG.
In SQL Server Management Studio, connect to the AG primary node execute the following TSQL and then keep the results to execute after all of the PowerShell commands have been completed.
SELECT '-- (1) Copy/Paste the results of this query '
+ ' into a query window.'
AS [Generated TSQL script:]
UNION
SELECT '-- (2) After all PowerShell scripts/command'
+ ' have been executed,'
UNION
SELECT '-- (3) Execute the following TSQL commands'
+ ' to restore PORT settings.'
UNION
SELECT 'ALTER AVAILABILITY GROUP '
+ ag.name + ' MODIFY LISTENER '''
+ agl.dns_name + ''' (PORT = '
+ CAST(ISNULL(agl.port,1433) AS VARCHAR(5)) + ');'
+ CHAR(13) + CHAR(10)
FROM sys.availability_group_listeners agl
INNER JOIN sys.availability_groups ag
ON agl.group_id = ag.group_id
The above script should yield one or more TSQL statements that can be copied and pasted into a query window in SSMS later to re-configure the port for each listener.
2. On any one of the nodes in the cluster, open an administrative PowerShell window.
3. Get the cluster resource name for the availability group resource and the availability group listener resource using the following commands:
- >Import-Module FailoverClusters
- >Get-ClusterResource
This will produce an output similar to the following:
In this list of resources, we will concern ourselves with three different columns:
- the Name (left most, with heading surrounded in light green box)
- the OwnerGroup (third column from left, with heading in orange box)
- the ResourceType (right most column with heading in yellow box)
The scripts below require the use of the resource name (left most column) for the resource on which we will make changes.
To get the correct resource, first find the name of your availability group in the third column (OwnerGroup) (light orange box in picture below). Once you have found the correct group, then find the resources that we need to change. We will look for two types: “SQL Server Availability Group” and “Network Name”. The resource types will be found in the right most column. In the picture below, the availability group resource is underlined in red, and the network name resource (for the listener) is underlined in yellow.
For the subsequent steps, use the following resource names:
- “TestAG_TestAGList_”, to substitute for <AG Listener Resource Name>
- “TestAG”, to substitute for <AG RESOURCE NAME>
4. Change the parameters with the following commands:
- >Get-ClusterResource <AG Listener Resource Name> | Set-ClusterParameter -Name HostRecordTTL -Value 120
- >Get-ClusterResource <AG Listener Resource Name> | Set-ClusterParameter -Name RegisterAllProvidersIP -Value 0
As can be seen in the example above, the resource name for the “Network Name” type is used, “TestAG_TestAGList”. After the command is executed, a yellow warning message is shown that indicates the parameter change will not take effect until the resource is taken offline and then brought back online. This can be done during a normal availability group cluster failover, or through PowerShell script (later in this document). Similarly, the second command above can be issued to change the parameter RegisterAllProvidersIP. It, too, will return a yellow warning message identical to the one shown – indicating the parameter change will not take effect until the resource is taken offline and brought back online.
5. Temporarily remove dependency between the availability group resource and the listener name resource.
Because the listener name resource has to be taken offline and back online for the above changes to take effect, and the fact that the availability group resource is dependent on the listener name, simply taking the listener name resource offline will also take the availability group (and its databases) offline in the process. To avoid taking the availability group resource offline, the dependency that the availability group has on the listener name can be temporarily removed and then re-applied. This can either be done by the Windows Failover Cluster Manager utility or through PowerShell commands.
To remove the dependency using Windows Failover Cluster Manager:
- Select the availability group resource.
- Right click and select properties.
- On the Properties dialog, navigate to the dependencies tab
- Select the resource and click the “Delete” button and then “OK” to close the dialog box.
To remove the dependency using PowerShell:
- >Remove-ClusterResourceDependency -Resource <AG RESOURCE NAME> -Provider <AG Listener Resource Name>
6. Offline and re-online the listener resource to force re-registration with DNS and complete the changes:
- >Stop-ClusterResource <AG Listener Resource Name>
- >Start-ClusterResource <AG Listener Resource Name>
To force updating DNS on Windows Server 2012 or 2012 R2:
- >Get-ClusterResource <AG Listener Resource Name> | Update-ClusterNetworkNameResource
To force updating DNS on Windows Server 2008 or 2008R2:
- >Cluster.exe RES <AG Listener Resource Name> /registerdns
7. Re-add the dependency of the AG resource on the Listener name resource. (The dependency should exist for proper function of the availability group and access to the databases within the availability group. Failure to re-add the dependency could cause unintended behavior of the availability group and database availability.)
To re-add the dependency using Windows Server Failover Cluster Manager:
- Select the availability group resource.
- Right click and select properties.
- On the Properties dialog, navigate to the dependencies tab
- Click the “drop down underneath “Resource” and select the listener name resource.
- Click the “apply” button, then “OK” to close the dialog box.
To re-add the dependency using PowerShell execute the following command:
- >Add-ClusterResourceDependency -Resource <AG RESOURCE NAME> -Provider <AG Listener Resource Name>
8. Verification – view the dependency to make sure it is re-applied, ensure all cluster resources for this AG are online, and review the parameters to make sure they’re set to the new value:
- >Get-ClusterResourceDependency <AG RESOURCE NAME>
- >Get-ClusterResource <AG Listener Resource Name> | Get-ClusterParameter HostRecordTTL, RegisterAllProvidersIP
- >Get-ClusterResource
9. Re-configure Listener PORT settings.
In step 1 above, a TSQL script was executed that generated additional TSQL commands that will restore the original PORT settings for the AG Listeners. Copy and paste the results from the query in step 1 into a query window in SSMS – connected to the primary and execute the TSQL. The TSQL generated from step one should look similar to:
Do not type the TSQL from the above image, use the TSQL that was generated in step 1 on your machine!
After pasting into a query window, the generated TSQL text should look something similar to the following with ALTER AVAILABILITY GROUP statements inside TRY/CATCH blocks.
Upon execution it should return successful completion:
Finally, the following query will return the list of listeners and their port settings – for all availability groups on the machine:
/* this script will obtain the ports defined
* for each availability group listener that
* exists.*/
SELECT ag.name AS [Availability Group],
agl.dns_name AS [Listener DNS Name],
agl.port AS [Port]
FROM sys.availability_group_listeners agl
INNER JOIN sys.availability_groups ag
ON agl.group_id = ag.group_id
ORDER BY ag.name, agl.dns_name
HELPFUL REFERENCES
- Availability Group Listeners, Client Connectivity, and Application Failover (SQL Server)
- MultiSubnetFailover parameter
- Requirements and Recommendations for a Multi-site Failover Cluster
- SQL Server Multi-Subnet Clustering (this was written for SQL failover cluster installations, but is also applicable for AlwaysOn)
- "Timeout expired" error when you try to connect to SQL Server 2012 through an AlwaysOn availability group listener
FINAL NOTES
For older operating systems such as Windows 7 and Windows Server 2008 R2, it is recommended the hotfixes referenced below be applied to ensure connection timeouts do not occur. Because of an issue with TDX/TDI filter drivers, timeouts can still occur when connecting to a name with multiple IP addresses – even when the correct client libraries are used and the MultisubnetFailover=True parameter is specified. These drivers are usually installed as part of older security systems such as anti-virus and intrusion detection. Ensuring these hotfixes have been applied will help prevent connectivity timeouts. Please note however, there is no hotfix for Windows Server 2008, only Windows Server 7 and Windows Server 2008 R2. If using Windows Server 2008 as a client, please refer to the articles below for more options in resolving timeout issues.
- Time-out error and you cannot connect to a SQL Server 2012 AlwaysOn availability group listener in a multi-subnet environment
- Connection times out when you use AlwaysOn availability group listener with MultisubnetFailover parameter
Two additional things to consider with respect to registering the listener name with DNS after making changes to either the HostRecordTTL or RegisterAllProvidersIP parameters --- DNS replication and previous settings.
The DNS server that is contacted by the OS cluster when registering or de-registering hostnames may not be the same DNS server that clients are using to resolve names to IP addresses. If this is the case, then it is possible to have additional delays in the client’s ability to get freshly updated information – simply because the client’s DNS server may not have the updated information. DNS replication topology and configuration settings can cause additional delays before all changes are replicated throughout an enterprise network. If significant delays are experienced either after a failover, or when changing these parameters, the network or DNS administrator should be contacted to investigate the DNS replication topology for the enterprise to determine if the time required to replicate across the entire organization can be reduced.
The other item to consider is that the “previous settings” are most likely already cached on client machines. If the parameter settings were “default” prior to making any changes, then any cached entries on client machines will still have the “old” expiration setting (TTL) – which is 20 minutes. That means, that even after changing the RegisterAllProvidersIP and HostRecordTTL settings – and taking the resource offline and back online to take effect – previously cached entries are not automatically expired. The client must wait the for the current TTL setting before it expires its cached copy. This means that it could still be up to 20 minutes before a client will get the new settings.
After the changes have been made and cached entries have been expired, the new settings will take effect and any subsequent TTL expirations will take place based upon the new setting (for example after 60 or 120 seconds) rather than the original default value of 20 minutes. This can be expedited on client machines if necessary by issuing an IPCONFIG /FLUSHDNS command from an elevated command prompt. This will cause the client to expire all cached entries and re-query the DNS server to obtain the new settings.
APPENDIX A
Section 5.7.1 Client-Connectivity For AlwaysOn Availability Groups from: SQL Server 2012 Release Notes.
The following table summarizes driver support for AlwaysOn Availability Groups:
Driver |
Multi-Subnet Failover |
Application Intent |
Read-Only Routing |
Multi-Subnet Failover: Faster Single Subnet Endpoint Failover |
Multi-Subnet Failover: Named Instance Resolution For SQL Clustered Instances |
SQL Native Client 11.0 ODBC |
Yes |
Yes |
Yes |
Yes |
Yes |
SQL Native Client 11.0 OLEDB |
No |
Yes |
Yes |
No |
No |
ADO.NET with .NET Framework 4.0 with connectivity patch * |
Yes |
Yes |
Yes |
Yes |
Yes |
ADO.NET with .NET Framework 3.5 SP1 with connectivity patch ** |
Yes |
Yes |
Yes |
Yes |
Yes |
Microsoft JDBC driver 4.0 for SQL Server |
Yes |
Yes |
Yes |
Yes |
Yes |
*Download the connectivity patch for ADO .NET with .NET Framework 4.0: https://support.microsoft.com/kb/2600211.
**Download the connectivity patch for ADO.NET with .NET Framework 3.5 SP1: https://support.microsoft.com/kb/2654347.
MultiSubnetFailover Keyword and Associated Features
MultiSubnetFailover is a new connection string keyword used to enable faster failover with AlwaysOn availability groups and AlwaysOn Failover Cluster Instances in SQL Server 2012. The following three sub-features are enabled when MultiSubnetFailover=True is set in connection string:
- Faster multi-subnet failover to a multi-subnet listener for an AlwaysOn Availability Group or Failover Cluster Instances.
- Named instance resolution to a multi-subnet AlwaysOn Failover Cluster Instance.
- Faster single subnet failover to a single subnet listener for an AlwaysOn Availability Group or Failover Cluster Instances.
- This feature is used when connecting to a listener that has a single IP in a single subnet. This performs more aggressive TCP connection retries to speed up single subnet failovers.
- Named instance resolution to a multi-subnet AlwaysOn Failover Cluster Instance.
- This is to add named instance resolution support for an AlwaysOn Failover Cluster Instances with multiple subnet endpoints.
MultiSubnetFailover=True Not Supported by NET Framework 3.5 or OLEDB
Issue: If your Availability Group or Failover Cluster Instance has a listener name (known as the network name or Client Access Point in the WSFC Cluster Manager) depending on multiple IP addresses from different subnets, and you are using either ADO.NET with .NET Framework 3.5SP1 or SQL Native Client 11.0 OLEDB, potentially, 50% of your client-connection requests to the availability group listener will hit a connection timeout.
APPENDIX B - SCRIPTS
This is the script that should be run first to collect the port assignments for the listeners and generate TSQL code to be executed after the PowerShell script to re-configure the port settings to their original values.
/* this script will obtain the ports defined
* for each availability group listener that
* exists. If no port is defined, it will
* assign it will use port 1433.
* The output will show the TSQL syntax
* to alter the listeners to apply the
* same port values later, should they
* need to be re-configured to the same
* ports.*/
DECLARE @CRLF CHAR(2) = CHAR(13) + CHAR(10)
DECLARE @EndTryCatch VARCHAR(max) = 'END TRY' + @CRLF +
'BEGIN CATCH' + @CRLF +
'IF (@@ERROR <> 19468)' + @CRLF +
'SELECT ERROR_NUMBER() AS ErrNum, ERROR_MESSAGE() AS ErrMsg' +
@CRLF + 'END CATCH' + @CRLF
SELECT '-- (1) Copy/Paste the results of this query '
+ ' into a query window.'
AS [Generated TSQL script:]
UNION
SELECT '-- (2) After all PowerShell scripts/command'
+ ' have been executed,'
UNION
SELECT '-- (3) Execute the following TSQL commands'
+ ' to restore PORT settings.'
UNION
SELECT 'BEGIN TRY' + @CRLF + 'ALTER AVAILABILITY GROUP '
+ ag.name + ' MODIFY LISTENER '''
+ agl.dns_name + ''' (PORT = '
+ CAST(ISNULL(agl.port,1433) AS VARCHAR(5)) + ');'
+ @CRLF + @EndTryCatch + @CRLF
FROM sys.availability_group_listeners agl
INNER JOIN sys.availability_groups ag
ON agl.group_id = ag.group_id
PowerShell script to change the HostRecordTTL and RegisterAllProvidersIP settings.
There are five variables that need to be changed before executing the script. They are located toward the top of the script underneath “CHANGE THESE VARIABLES”. It is recommended you become familiar with the script and its options in a test environment before attempting in production. The script is written such that it will affect ALL availability group listener resources for the specified availability group, since if the parameters need to be changed for one listener, then most likely, if the availability group has more than one listener, they should all be changed.
#**************************************************************************
#This script is provided "AS IS" with no warranties, and confers no rights.
# Use of included script samples are subject to the terms specified at
# https://www.microsoft.com/info/cpyright.htm
#**************************************************************************
#**************************************************************************
# VARIABLES
#$strAGName the name of the availability group
#$TTLValue the # of seconds for HostRecordTTL timeout value
#$AllIPs [0 | 1] 0 = only register one IP, 1 = register all IPs
#$RestartListener [0 | 1] 1 = restart listener / 0 = do not restart
#$RemoveDependencies [0 | 1] 1 = temporarily remove / 0 = leave alone
#**************************************************************************
#**************************************************************************
#
# CHANGE THESE VARIABLES
#Define Variables
$strAGName = "TestAG" #<<<<<<<<<<<<<<<<<<<<<<<<<
$TTLValue = "120" #<<<<<<<<<<<<<<<<<<<<<<<<<
$AllIPs = 0 #<<<<<<<<<<<<<<<<<<<<<<<<<
$RestartListener = 1 #<<<<<<<<<<<<<<<<<<<<<<<<<
$RemoveDependencies = 1 #<<<<<<<<<<<<<<<<<<<<<<<<<
#
#**************************************************************************
#**************************************************************************
#Notes:
# 1) Test this script in non-production environments first.
# 2) This script will change the parameters for _all_ listeners
# for the specified availability group
# 3) This script can optionally restart the listener(s)
# 4) if restaring listeners, it can optionally temporarily
# remove and restore the dependencies to take the
# listener(s) offline without taking the availability group
# itself offline. If choosing not to temporarily remove
# and restore dependencies, then when the listener(s) are
# taken offline, the availability group resource will also
# go offline - thus making the databases in the AG inaccessible.
# 5) if choosing to remove dependencies, the existing depenedencies
# are collected and restored after restaring the listener(s)
# 6) Windows Server 2012/2012R2 has a powershell command to
# re-register listener(s) with DNS. Server 2008/2008R2 does
# does not. there is logic to determine and use the CLUSTER.EXE
# command for Windows Server 2008/2008R2
#**************************************************************************
#no changes required below this point
#Get OS version
$OSMajor = ([System.Environment]::OSVersion.Version).Major
$OSMinor = ([System.Environment]::OSVersion.Version).Minor
#load cluster module
Import-Module FailoverClusters
#get the cluster role (group) object based on the AG name provided above
$objAGGroup = Get-ClusterGroup $strAGName -ErrorAction SilentlyContinue
if ($objAGGroup -eq $null)
{Write-Host "Error: Availability Group not found."}
else
{
#get the AG resource object in this cluster role (group)
$objAGRes = $objAGGroup | Get-ClusterResource |
Where-Object {$_.ResourceType -match "SQL Server Availability Group*"}
#get the listener(s) object(s) in this cluster role (group)
$objListener = $objAGGroup | Get-ClusterResource |
Where-Object {$_.ResourceType -match "Network Name*"}
#change the parameter settings: HostRecordTTL & RegisterAllProvidersIP
Write-Host "Making changes to Network Name:" $list.Name
$objListener | Set-ClusterParameter -Name HostRecordTTL -Value $TTLValue
$objListener | Set-ClusterParameter -Name RegisterAllProvidersIP -Value @AllIPs
$objListener | Get-ClusterParameter -Name HostRecordTTL
$objListener | Get-ClusterParameter -Name RegisterAllProvidersIP
if ($RestartListener -eq 1) {
if($RemoveDependencies -eq 1) {
#capture the dependency(ies) that the AG resource depends on
$DepStr = ($objAGRes | Get-ClusterResourceDependency).DependencyExpression
Write-Host "Removing dependecny for " $objAGRes.Name " on '" $DepStr "'"
Set-ClusterResourceDependency -Resource $objAGRes -Dependency $null
} #if remove dependencies
#restart the listener resource(es)
Write-Host "Restarting Network Name resource:" $list.Name
$objListener | Stop-ClusterResource
$objListener | Start-ClusterResource
#force re-registration in DNS
if ($OSMajor -ge 6 -and $OSMinor -ge 2) {
#Windows Server 2012 and up
$objListener | Update-ClusterNetworkNameResource -Verbose
}
else {
#for Windows Server 2008/2008R2
ForEach($list in $objListener) {
cluster.exe res $list.name /registerdns
}#foreach
}
if($RemoveDependencies -eq 1) {
#restore the dependency(ies) to previous setting
Write-Host "Reapplying dependencies for " $objAGRes.Name
Set-ClusterResourceDependency -Resource $objAGRes -Dependency $DepStr
#show dependency (so it can be compared) / show the settings
$objAGRes | Get-ClusterResourceDependency
} #if remove dependencies
else {
#if we chose not to remove dependencies we need to restart
#the availability group resource
$objAGRes | Start-ClusterResource
}
} #if restart
}#else - availability group found
Comments
Anonymous
October 13, 2014
That´s really a nice and very useful post. It helps a lot to troubleshoot common issues in AG Listener connectivity.Thanks!Anonymous
April 06, 2015
Fantastic article, thanks.Anonymous
May 06, 2015
thank you very much. great article, exactly what I was looking for.Anonymous
June 15, 2015
Thank you very much for this article.Anonymous
July 13, 2015
Saved my job! Thank you.Anonymous
July 13, 2015
Saved my job! thank you.Anonymous
August 05, 2015
Thank you so much for this post, very useful for our customers. you can also find an official KB from Microsoft website that can help on that issue: support.microsoft.com/.../2792139 Thanks again for the post. regards, Nick.Anonymous
August 19, 2015
Outstanding and truly comprehensive article. Thank you very much.Anonymous
August 25, 2015
The comment has been removedAnonymous
August 27, 2015
The comment has been removed- Anonymous
July 18, 2016
The comment has been removed
- Anonymous
Anonymous
November 23, 2015
Thanks so much for the article. I was able to get a third node for DR working in our remote office using the details provided.Anonymous
July 18, 2016
Is there a current kb article for the connectivity patch for ADO.NET with .NET Framework 3.5 SP1 on Windows 7? Your Final Notes recommends applying hotfixes on Windows 7, but then the versions seem a little garbled in the rest of that paragraph. However KB 2654347 only provides links for Windows XP SP3, Windows 2008, and Windows 2008 R2. Why ever would Microsoft provide a .Net patch for Windows XP SP3, but not for Windows 7?We have a client app using .Net 2.0 APIs on Windows 7 and we would like to move its database to a SQL Server Always On Cluster.- Anonymous
February 06, 2017
Paul, have you implemented this?
- Anonymous
Anonymous
July 20, 2016
Thanks Good One!!Anonymous
July 28, 2016
The comment has been removed- Anonymous
March 15, 2017
The comment has been removed
- Anonymous
Anonymous
January 30, 2017
Is it mandatory to temporarily remove dependency between the availability group resource and the listener name resource. Or can I do a cluster failover to achieve the same results?- Anonymous
March 15, 2017
Yes you can do a cluster failover and achieve the same results.
- Anonymous
Anonymous
February 06, 2017
We are going to implement AlwaysON so want want to foresee if we can face this issue on not . we are on SQL Server 2014 SP2 with windows server 2012 R2. Do we need to implement this ? HostRecordTTL Value is currently 1200 and RegisterAllProvidersIP Value is 1 . We are on multisubnet cluster environment . Does it matter which version of .net or OLEDB or JDBC is running on client to connect to DB. Because although we are on higher version on DB side , the application is running on lower .net and JDBC version.- Anonymous
March 15, 2017
The comment has been removed- Anonymous
May 19, 2017
My Listener is accessible only from the primary replica node itself. Do you have any idea how to resolve this?FYI, Always on has been setup in Amazon EC2 .Not working even i set up RegisterAllProvidersIP =0 and the changes you mentioned in the article.
- Anonymous
- Anonymous
Anonymous
June 23, 2017
This helped me! Thx. Here's my condensed version. https://dba2o.wordpress.com/2017/06/23/sql-server-alwayson-listener-returns-the-wrong-ip-after-flushdns-error-connection-timeout/Anonymous
June 28, 2017
The comment has been removed- Anonymous
October 05, 2017
you make so much sense..
- Anonymous
Anonymous
November 14, 2017
I had same issue for one of the SQL Always on cluster. This solution resolved my issue.Thanks you.Anonymous
February 21, 2018
The comment has been removedAnonymous
June 04, 2019
Questions;1, the HostRecordTTL property for network name 'Name: "my cluster nmae' is set to 1200 (20 minutes). For multi-site clusters the suggested value is 300 (5 minutes). this is the warning from the cluster vaslidation report.how to change this cluster parameter? please note that it is diferent from that of AG listener, which expains in this article. "my cluster name", for exmple, AG-LISTENER, is not listed in the outputs from get-clusterresource. off course, the paramenter of cluster cannot found from get-clusterparameter2. MultisubnetFailover parameter, please clarify if it would be the default parameter for SQL 2016 standard (using 2017 SSMS). In other words, should the MultisubnetFailover parameter=true be set in SQL 2016 standard if 2 subnets use on 2 nodes in the cluster?thanksJohn