Help fix: Error sql alwayson sometime is disconnect and can not open connect to secondary replica for read
Question
Saturday, April 8, 2017 1:55 PM
Hi all,
i have problem with sql server 2014 alwayson, i have two node windows server 2012 r2 failover cluster updated to service packs newest, and sql server 2014 sp2 updated to version 12.0.5540.0
my system running is normal, but sometime(two or three times a day) i have error alwayson group secondary replica(node_02) is disconnect with primary replica(node_01), i check ping is ok, ports is open, telnet to ports is ok, at the time i can't connect to node_02 to run read query(i use routing), no log for error sql services, windows services, primary replica is query read/write normal, few minutes late, system is normal.
this is log:
04/08/2017 09:10:02,spid35s,Unknown,The recovery LSN (137082:338000:1) was identified for the database with ID 7. This is an informational message only. No user action is required.
04/08/2017 09:10:00,spid35s,Unknown,AlwaysOn Availability Groups connection with primary database established for secondary database 'COMPANY_01' on the availability replica 'NODE_01' with Replica ID: {6ee79d8f-e575-46d8-9936-419c9d1f7af0}. This is an informational message only. No user action is required.
04/08/2017 09:10:00,spid40s,Unknown,The recovery LSN (136730:1251160:1) was identified for the database with ID 8. This is an informational message only. No user action is required.
04/08/2017 09:09:59,spid40s,Unknown,AlwaysOn Availability Groups connection with primary database established for secondary database 'COMPANY_01_TEST' on the availability replica 'NODE_01' with Replica ID: {6ee79d8f-e575-46d8-9936-419c9d1f7af0}. This is an informational message only. No user action is required.
04/08/2017 09:09:59,spid35s,Unknown,A connection for availability group 'TEST_AG' from availability replica 'NODE_02' with id [B45F451A-FF79-4969-B232-24BF484B37F7] to 'NODE_01' with id [6EE79D8F-E575-46D8-9936-419C9D1F7AF0] has been successfully established. This is an informational message only. No user action is required.
04/08/2017 09:09:56,spid35s,Unknown,AlwaysOn Availability Groups connection with primary database terminated for secondary database 'COMPANY_01_TEST' on the availability replica 'NODE_01' with Replica ID: {6ee79d8f-e575-46d8-9936-419c9d1f7af0}. This is an informational message only. No user action is required.
04/08/2017 09:09:54,spid40s,Unknown,AlwaysOn Availability Groups connection with primary database terminated for secondary database 'COMPANY_01' on the availability replica 'NODE_01' with Replica ID: {6ee79d8f-e575-46d8-9936-419c9d1f7af0}. This is an informational message only. No user action is required.
and more log begin from disconnect to normal:
04/08/2017 09:13:59,spid44s,Unknown,CHECKDB for database 'COMPANY_01' finished without errors on 2010-08-31 11:33:39.280 (local time). This is an informational message only; no user action is required.
04/08/2017 09:13:59,spid46s,Unknown,The recovery LSN (137082:338288:3) was identified for the database with ID 7. This is an informational message only. No user action is required.
04/08/2017 09:13:59,spid46s,Unknown,AlwaysOn Availability Groups connection with primary database established for secondary database 'COMPANY_01' on the availability replica 'NODE_01' with Replica ID: {6ee79d8f-e575-46d8-9936-419c9d1f7af0}. This is an informational message only. No user action is required.
04/08/2017 09:13:59,spid44s,Unknown,681 transactions rolled forward in database 'COMPANY_01' (7:0). This is an informational message only. No user action is required.
04/08/2017 09:13:59,spid44s,Unknown,The recovery LSN (137082:338288:3) was identified for the database with ID 7. This is an informational message only. No user action is required.
04/08/2017 09:13:59,spid44s,Unknown,AlwaysOn Availability Groups connection with primary database established for secondary database 'COMPANY_01' on the availability replica 'NODE_01' with Replica ID: {6ee79d8f-e575-46d8-9936-419c9d1f7af0}. This is an informational message only. No user action is required.
04/08/2017 09:13:59,spid44s,Unknown,Recovery of database 'COMPANY_01' (7) is 0% complete (approximately 761 seconds remain). Phase 1 of 3. This is an informational message only. No user action is required.
04/08/2017 09:13:59,spid44s,Unknown,Recovery of database 'COMPANY_01' (7) is 0% complete (approximately 762 seconds remain). Phase 1 of 3. This is an informational message only. No user action is required.
04/08/2017 09:13:58,spid44s,Unknown,Starting up database 'COMPANY_01'.
04/08/2017 09:13:58,spid44s,Unknown,State information for database 'COMPANY_01' - Hardended Lsn: '(137082:338256:1)' Commit LSN: '(137082:338248:3)' Commit Time: 'Apr 8 2017 9:13AM'
04/08/2017 09:13:58,spid46s,Unknown,AlwaysOn Availability Groups connection with primary database terminated for secondary database 'COMPANY_01' on the availability replica 'NODE_01' with Replica ID: {6ee79d8f-e575-46d8-9936-419c9d1f7af0}. This is an informational message only. No user action is required.
04/08/2017 09:13:58,spid44s,Unknown,State information for database 'COMPANY_01' - Hardended Lsn: '(137082:338256:1)' Commit LSN: '(137082:338248:3)' Commit Time: 'Apr 8 2017 9:13AM'
04/08/2017 09:13:58,spid44s,Unknown,Nonqualified transactions are being rolled back in database COMPANY_01 for an AlwaysOn Availability Groups state change. Estimated rollback completion: 100%. This is an informational message only. No user action is required.
04/08/2017 09:13:58,spid44s,Unknown,Availability database 'COMPANY_01'<c/> which is in the secondary role<c/> is being restarted to resynchronize with the current primary database. This is an informational message only. No user action is required.
04/08/2017 09:13:58,spid44s,Unknown,Error: 35278<c/> Severity: 17<c/> State: 1.
04/08/2017 09:13:57,spid44s,Unknown,Using the recovery LSN (137082:338232:1) stored in the metadata for the database with ID 7. This is an informational message only. No user action is required.
04/08/2017 09:13:57,spid44s,Unknown,Error: 35286<c/> Severity: 16<c/> State: 1.
04/08/2017 09:13:57,spid44s,Unknown,The recovery LSN (137082:338256:1) was identified for the database with ID 7. This is an informational message only. No user action is required.
04/08/2017 09:13:57,spid44s,Unknown,AlwaysOn Availability Groups connection with primary database established for secondary database 'COMPANY_01' on the availability replica 'NODE_01' with Replica ID: {6ee79d8f-e575-46d8-9936-419c9d1f7af0}. This is an informational message only. No user action is required.
04/08/2017 09:13:57,spid44s,Unknown,Recovery of database 'COMPANY_01' (7) is 0% complete (approximately 761 seconds remain). Phase 1 of 3. This is an informational message only. No user action is required.
04/08/2017 09:13:57,spid44s,Unknown,Recovery of database 'COMPANY_01' (7) is 0% complete (approximately 762 seconds remain). Phase 1 of 3. This is an informational message only. No user action is required.
04/08/2017 09:13:57,spid44s,Unknown,Starting up database 'COMPANY_01'.
04/08/2017 09:13:57,spid44s,Unknown,State information for database 'COMPANY_01' - Hardended Lsn: '(137082:338248:1)' Commit LSN: '(137082:338224:3)' Commit Time: 'Apr 8 2017 9:13AM'
04/08/2017 09:13:56,spid58s,Unknown,AlwaysOn Availability Groups connection with primary database terminated for secondary database 'COMPANY_01' on the availability replica 'NODE_01' with Replica ID: {6ee79d8f-e575-46d8-9936-419c9d1f7af0}. This is an informational message only. No user action is required.
04/08/2017 09:13:56,spid44s,Unknown,State information for database 'COMPANY_01' - Hardended Lsn: '(137082:338248:1)' Commit LSN: '(137082:338224:3)' Commit Time: 'Apr 8 2017 9:13AM'
04/08/2017 09:13:56,spid44s,Unknown,Nonqualified transactions are being rolled back in database COMPANY_01 for an AlwaysOn Availability Groups state change. Estimated rollback completion: 100%. This is an informational message only. No user action is required.
04/08/2017 09:13:56,spid35s,Unknown,Availability database 'COMPANY_01'<c/> which is in the secondary role<c/> is being restarted to resynchronize with the current primary database. This is an informational message only. No user action is required.
04/08/2017 09:13:56,spid35s,Unknown,Error: 35278<c/> Severity: 17<c/> State: 1.
04/08/2017 09:13:56,spid59s,Unknown,The recovery LSN (136730:1251160:1) was identified for the database with ID 8. This is an informational message only. No user action is required.
04/08/2017 09:13:56,spid59s,Unknown,AlwaysOn Availability Groups connection with primary database established for secondary database 'COMPANY_01_TEST' on the availability replica 'NODE_01' with Replica ID: {6ee79d8f-e575-46d8-9936-419c9d1f7af0}. This is an informational message only. No user action is required.
04/08/2017 09:13:49,spid49s,Unknown,A connection for availability group 'TEST_AG' from availability replica 'NODE_02' with id [B45F451A-FF79-4969-B232-24BF484B37F7] to 'NODE_01' with id [6EE79D8F-E575-46D8-9936-419C9D1F7AF0] has been successfully established. This is an informational message only. No user action is required.
04/08/2017 09:13:48,spid49s,Unknown,AlwaysOn Availability Groups connection with primary database terminated for secondary database 'COMPANY_01' on the availability replica 'NODE_01' with Replica ID: {6ee79d8f-e575-46d8-9936-419c9d1f7af0}. This is an informational message only. No user action is required.
04/08/2017 09:13:48,spid63s,Unknown,AlwaysOn Availability Groups connection with primary database terminated for secondary database 'COMPANY_01_TEST' on the availability replica 'NODE_01' with Replica ID: {6ee79d8f-e575-46d8-9936-419c9d1f7af0}. This is an informational message only. No user action is required.
04/08/2017 09:13:38,spid59s,Unknown,The recovery LSN (136730:1251160:1) was identified for the database with ID 8. This is an informational message only. No user action is required.
04/08/2017 09:13:37,spid59s,Unknown,AlwaysOn Availability Groups connection with primary database established for secondary database 'COMPANY_01_TEST' on the availability replica 'NODE_01' with Replica ID: {6ee79d8f-e575-46d8-9936-419c9d1f7af0}. This is an informational message only. No user action is required.
04/08/2017 09:13:37,spid35s,Unknown,The recovery LSN (137082:338232:1) was identified for the database with ID 7. This is an informational message only. No user action is required.
04/08/2017 09:13:33,spid35s,Unknown,AlwaysOn Availability Groups connection with primary database established for secondary database 'COMPANY_01' on the availability replica 'NODE_01' with Replica ID: {6ee79d8f-e575-46d8-9936-419c9d1f7af0}. This is an informational message only. No user action is required.
04/08/2017 09:13:32,spid35s,Unknown,A connection for availability group 'TEST_AG' from availability replica 'NODE_02' with id [B45F451A-FF79-4969-B232-24BF484B37F7] to 'NODE_01' with id [6EE79D8F-E575-46D8-9936-419C9D1F7AF0] has been successfully established. This is an informational message only. No user action is required.
04/08/2017 09:13:31,spid35s,Unknown,AlwaysOn Availability Groups connection with primary database terminated for secondary database 'COMPANY_01_TEST' on the availability replica 'NODE_01' with Replica ID: {6ee79d8f-e575-46d8-9936-419c9d1f7af0}. This is an informational message only. No user action is required.
04/08/2017 09:13:31,spid59s,Unknown,AlwaysOn Availability Groups connection with primary database terminated for secondary database 'COMPANY_01' on the availability replica 'NODE_01' with Replica ID: {6ee79d8f-e575-46d8-9936-419c9d1f7af0}. This is an informational message only. No user action is required.
04/08/2017 09:13:31,spid59s,Unknown,A connection timeout has occurred on a previously established connection to availability replica 'NODE_01' with id [6EE79D8F-E575-46D8-9936-419C9D1F7AF0]. Either a networking or a firewall issue exists or the availability replica has transitioned to the resolving role.
04/08/2017 09:12:38,spid49s,Unknown,The recovery LSN (136730:1251160:1) was identified for the database with ID 8. This is an informational message only. No user action is required.
04/08/2017 09:12:38,spid49s,Unknown,AlwaysOn Availability Groups connection with primary database established for secondary database 'COMPANY_01_TEST' on the availability replica 'NODE_01' with Replica ID: {6ee79d8f-e575-46d8-9936-419c9d1f7af0}. This is an informational message only. No user action is required.
04/08/2017 09:12:38,spid35s,Unknown,The recovery LSN (137082:338056:1) was identified for the database with ID 7. This is an informational message only. No user action is required.
04/08/2017 09:12:30,spid35s,Unknown,AlwaysOn Availability Groups connection with primary database established for secondary database 'COMPANY_01' on the availability replica 'NODE_01' with Replica ID: {6ee79d8f-e575-46d8-9936-419c9d1f7af0}. This is an informational message only. No user action is required.
04/08/2017 09:12:26,spid35s,Unknown,A connection for availability group 'TEST_AG' from availability replica 'NODE_02' with id [B45F451A-FF79-4969-B232-24BF484B37F7] to 'NODE_01' with id [6EE79D8F-E575-46D8-9936-419C9D1F7AF0] has been successfully established. This is an informational message only. No user action is required.
04/08/2017 09:12:03,Server,Unknown,Resource Monitor (0xc90) Worker 0x00000000026EC160 appears to be non-yielding on Node 0. Memory freed: 0 KB. Last wait: MISCELLANEOUS. Last clerk: type USERSTORE_TOKENPERM<c/> name ACRUserStore-56-4-1. Approx CPU Used: kernel 0 ms<c/> user 0 ms<c/> Interval: 67723.
04/08/2017 09:11:24,spid49s,Unknown,A connection timeout has occurred while attempting to establish a connection to availability replica 'NODE_01' with id [6EE79D8F-E575-46D8-9936-419C9D1F7AF0]. Either a networking or firewall issue exists<c/> or the endpoint address provided for the replica is not the database mirroring endpoint of the host server instance.
04/08/2017 09:11:18,spid35s,Unknown,The connection to the primary replica is not active. The command cannot be processed.
04/08/2017 09:11:18,spid35s,Unknown,Error: 35250<c/> Severity: 16<c/> State: 11.
04/08/2017 09:11:09,spid46s,Unknown,AlwaysOn Availability Groups connection with primary database terminated for secondary database 'COMPANY_01_TEST' on the availability replica 'NODE_01' with Replica ID: {6ee79d8f-e575-46d8-9936-419c9d1f7af0}. This is an informational message only. No user action is required.
04/08/2017 09:11:09,spid34s,Unknown,AlwaysOn Availability Groups connection with primary database terminated for secondary database 'COMPANY_01' on the availability replica 'NODE_01' with Replica ID: {6ee79d8f-e575-46d8-9936-419c9d1f7af0}. This is an informational message only. No user action is required.
04/08/2017 09:10:59,spid35s,Unknown,AlwaysOn Availability Groups connection with primary database established for secondary database 'COMPANY_01' on the availability replica 'NODE_01' with Replica ID: {6ee79d8f-e575-46d8-9936-419c9d1f7af0}. This is an informational message only. No user action is required.
04/08/2017 09:10:59,spid34s,Unknown,A connection for availability group 'TEST_AG' from availability replica 'NODE_02' with id [B45F451A-FF79-4969-B232-24BF484B37F7] to 'NODE_01' with id [6EE79D8F-E575-46D8-9936-419C9D1F7AF0] has been successfully established. This is an informational message only. No user action is required.
04/08/2017 09:10:50,spid35s,Unknown,AlwaysOn Availability Groups connection with primary database terminated for secondary database 'COMPANY_01_TEST' on the availability replica 'NODE_01' with Replica ID: {6ee79d8f-e575-46d8-9936-419c9d1f7af0}. This is an informational message only. No user action is required.
04/08/2017 09:10:50,spid49s,Unknown,AlwaysOn Availability Groups connection with primary database terminated for secondary database 'COMPANY_01' on the availability replica 'NODE_01' with Replica ID: {6ee79d8f-e575-46d8-9936-419c9d1f7af0}. This is an informational message only. No user action is required.
04/08/2017 09:10:24,spid35s,Unknown,The recovery LSN (136730:1251160:1) was identified for the database with ID 8. This is an informational message only. No user action is required.
04/08/2017 09:10:24,spid34s,Unknown,The recovery LSN (137082:338000:1) was identified for the database with ID 7. This is an informational message only. No user action is required.
04/08/2017 09:10:24,spid35s,Unknown,AlwaysOn Availability Groups connection with primary database established for secondary database 'COMPANY_01_TEST' on the availability replica 'NODE_01' with Replica ID: {6ee79d8f-e575-46d8-9936-419c9d1f7af0}. This is an informational message only. No user action is required.
04/08/2017 09:10:24,spid34s,Unknown,AlwaysOn Availability Groups connection with primary database established for secondary database 'COMPANY_01' on the availability replica 'NODE_01' with Replica ID: {6ee79d8f-e575-46d8-9936-419c9d1f7af0}. This is an informational message only. No user action is required.
04/08/2017 09:10:18,spid40s,Unknown,A connection for availability group 'TEST_AG' from availability replica 'NODE_02' with id [B45F451A-FF79-4969-B232-24BF484B37F7] to 'NODE_01' with id [6EE79D8F-E575-46D8-9936-419C9D1F7AF0] has been successfully established. This is an informational message only. No user action is required.
04/08/2017 09:10:12,spid40s,Unknown,AlwaysOn Availability Groups connection with primary database terminated for secondary database 'COMPANY_01_TEST' on the availability replica 'NODE_01' with Replica ID: {6ee79d8f-e575-46d8-9936-419c9d1f7af0}. This is an informational message only. No user action is required.
04/08/2017 09:10:12,spid34s,Unknown,AlwaysOn Availability Groups connection with primary database terminated for secondary database 'COMPANY_01' on the availability replica 'NODE_01' with Replica ID: {6ee79d8f-e575-46d8-9936-419c9d1f7af0}. This is an informational message only. No user action is required.
04/08/2017 09:10:02,spid35s,Unknown,The recovery LSN (137082:338000:1) was identified for the database with ID 7. This is an informational message only. No user action is required.
04/08/2017 09:10:00,spid35s,Unknown,AlwaysOn Availability Groups connection with primary database established for secondary database 'COMPANY_01' on the availability replica 'NODE_01' with Replica ID: {6ee79d8f-e575-46d8-9936-419c9d1f7af0}. This is an informational message only. No user action is required.
04/08/2017 09:10:00,spid40s,Unknown,The recovery LSN (136730:1251160:1) was identified for the database with ID 8. This is an informational message only. No user action is required.
04/08/2017 09:09:59,spid40s,Unknown,AlwaysOn Availability Groups connection with primary database established for secondary database 'COMPANY_01_TEST' on the availability replica 'NODE_01' with Replica ID: {6ee79d8f-e575-46d8-9936-419c9d1f7af0}. This is an informational message only. No user action is required.
04/08/2017 09:09:59,spid35s,Unknown,A connection for availability group 'TEST_AG' from availability replica 'NODE_02' with id [B45F451A-FF79-4969-B232-24BF484B37F7] to 'NODE_01' with id [6EE79D8F-E575-46D8-9936-419C9D1F7AF0] has been successfully established. This is an informational message only. No user action is required.
04/08/2017 09:09:56,spid35s,Unknown,AlwaysOn Availability Groups connection with primary database terminated for secondary database 'COMPANY_01_TEST' on the availability replica 'NODE_01' with Replica ID: {6ee79d8f-e575-46d8-9936-419c9d1f7af0}. This is an informational message only. No user action is required.
04/08/2017 09:09:54,spid40s,Unknown,AlwaysOn Availability Groups connection with primary database terminated for secondary database 'COMPANY_01' on the availability replica 'NODE_01' with Replica ID: {6ee79d8f-e575-46d8-9936-419c9d1f7af0}. This is an informational message only. No user action is required.
sorry for my english, can i help you, thanks.
All replies (13)
Saturday, April 8, 2017 2:14 PM
Hello,
share result of this..
SELECT
ar.replica_server_name,
adc.database_name,
ag.name AS ag_name,
drs.is_local,
drs.is_primary_replica,
drs.synchronization_state_desc,
--drs.is_commit_participant,
drs.synchronization_health_desc,
--drs.recovery_lsn,
--drs.truncation_lsn,
--drs.last_sent_lsn,
--drs.last_sent_time,
--drs.last_received_lsn,
--drs.last_received_time,
--drs.last_hardened_lsn,
--drs.last_hardened_time,
--drs.last_redone_lsn,
--drs.last_redone_time,
--drs.log_send_queue_size,
--drs.log_send_rate,
--drs.redo_queue_size,
--drs.redo_rate,
--drs.filestream_send_rate,
--drs.end_of_log_lsn,
--drs.last_commit_lsn,
drs.last_commit_time
FROM sys.dm_hadr_database_replica_states AS drs
INNER JOIN sys.availability_databases_cluster AS adc
ON drs.group_id = adc.group_id AND
drs.group_database_id = adc.group_database_id
INNER JOIN sys.availability_groups AS ag
ON ag.group_id = drs.group_id
INNER JOIN sys.availability_replicas AS ar
ON drs.group_id = ar.group_id AND
drs.replica_id = ar.replica_id
ORDER BY
ag.name,
ar.replica_server_name,
adc.database_name;
Please Mark it as Answered if it answered your question OR mark it as Helpful if it help you to solve your problem.
Saturday, April 8, 2017 3:14 PM
thank you, this is query, my account can not post link or image,
replica_server_name database_name ag_name is_local is_primary_replica synchronization_state_desc synchronization_health_desc last_commit_time
NODE_01 COMPANY_01 TEST_AG 1 1 SYNCHRONIZED HEALTHY 2017-04-08 21:44:19.303
NODE_01 COMPANY_02 TEST_AG 1 1 SYNCHRONIZED HEALTHY 2017-04-07 22:11:19.317
NODE_02 COMPANY_01 TEST_AG 0 0 SYNCHRONIZED HEALTHY 2017-04-08 21:44:19.303
NODE_02 COMPANY_02 TEST_AG 0 0 SYNCHRONIZED HEALTHY 2017-04-07 22:11:19.317
Saturday, April 8, 2017 3:37 PM
It looks like your AG was working and then you encountered a network interruption.
Can you configure your session time out to perhaps 1 minute.
Are you mirroring over a wan? You might need to do asynchronous instead of synchronous mode replication.
Saturday, April 8, 2017 3:46 PM
no, this is on LAN, and speed is 1Gbps, i configed time out from 10s default to 20s but error still occurs, i will try to 1 minute
thank.
Saturday, April 8, 2017 3:56 PM
Can you drill down on your always on high availability folder, Availability Groups, your ag, and then Availability replicas. Are any of your replicas red?
Saturday, April 8, 2017 4:11 PM
share dashboard report.
Please Mark it as Answered if it answered your question OR mark it as Helpful if it help you to solve your problem.
Saturday, April 8, 2017 5:09 PM
this is dashboard:
Saturday, April 8, 2017 5:15 PM
Can you drill down on your always on high availability folder, Availability Groups, your ag, and then Availability replicas. Are any of your replicas red?
secondary replica is red color when error disconnect occurs in few minute, later is normal
Saturday, April 8, 2017 5:39 PM
declare @xel_path varchar(1024);
declare @utc_adjustment int = datediff(hour, getutcdate(), getdate());
target event_file path retrieval
;with target_data_cte as
(
select
target_data =
convert(xml, target_data)
from sys.dm_xe_sessions s
inner join sys.dm_xe_session_targets st
on s.address = st.event_session_address
where s.name = 'alwayson_health'
and st.target_name = 'event_file'
),
full_path_cte as
(
select
full_path =
target_data.value('(EventFileTarget/File/@name)[1]', 'varchar(1024)')
from target_data_cte
)
select
@xel_path =
left(full_path, len(full_path) - charindex('\', reverse(full_path))) +
'\AlwaysOn_health*.xel'
from full_path_cte;
replica state change events
;with state_change_data as
(
select
object_name,
event_data =
convert(xml, event_data)
from sys.fn_xe_file_target_read_file(@xel_path, null, null, null)
)
select
object_name,
event_timestamp =
dateadd(hour, @utc_adjustment, event_data.value('(event/@timestamp)[1]', 'datetime')),
ag_name =
event_data.value('(event/data[@name = "availability_group_name"]/value)[1]', 'varchar(64)'),
previous_state =
event_data.value('(event/data[@name = "previous_state"]/text)[1]', 'varchar(64)'),
current_state =
event_data.value('(event/data[@name = "current_state"]/text)[1]', 'varchar(64)')
from state_change_data
where object_name = 'availability_replica_state_change'
order by event_timestamp desc;
share result.
Please Mark it as Answered if it answered your question OR mark it as Helpful if it help you to solve your problem.
Saturday, April 8, 2017 6:09 PM
this is query:
Sunday, April 9, 2017 1:58 AM
It looks like your AG was working and then you encountered a network interruption.
Can you configure your session time out to perhaps 1 minute.
Are you mirroring over a wan? You might need to do asynchronous instead of synchronous mode replication.
i changed session timeout to 1 minute and have not error now, i will monitor continues :D
Tuesday, April 11, 2017 3:54 PM
i incremented session timeout to 90s, but error timeout and disconnect still occur, one time in day
name timestamp
hadr_db_partner_set_sync_state 2017-04-10 13:39:19.3712866
hadr_db_partner_set_sync_state 2017-04-10 13:39:19.3779110
error_reported 2017-04-10 13:40:01.8038874
hadr_db_partner_set_sync_state 2017-04-10 13:40:05.8282818
hadr_db_partner_set_sync_state 2017-04-10 13:40:05.8330131
hadr_db_partner_set_sync_state 2017-04-10 15:38:30.5699689
hadr_db_partner_set_sync_state 2017-04-10 15:38:30.5774561
error_reported 2017-04-10 15:39:30.1759471
error_reported 2017-04-10 15:40:33.2092476
error_reported 2017-04-10 15:40:54.2407832
hadr_db_partner_set_sync_state 2017-04-10 15:40:58.2598532
hadr_db_partner_set_sync_state 2017-04-10 15:40:58.2757176
alwayson_ddl_executed 2017-04-10 19:25:42.4905323
alwayson_ddl_executed 2017-04-10 19:25:42.9152291
alwayson_ddl_executed 2017-04-10 19:25:42.9465747
alwayson_ddl_executed 2017-04-10 19:25:42.9538887
alwayson_ddl_executed 2017-04-10 19:25:43.4165960
alwayson_ddl_executed 2017-04-10 19:25:43.4345903
alwayson_ddl_executed 2017-04-10 19:25:43.4643749
alwayson_ddl_executed 2017-04-10 19:25:43.4711707
hadr_db_partner_set_sync_state 2017-04-11 01:44:03.6869960
hadr_db_partner_set_sync_state 2017-04-11 01:44:03.6936779
error_reported 2017-04-11 01:46:12.1862276
hadr_db_partner_set_sync_state 2017-04-11 01:46:12.2527787
hadr_db_partner_set_sync_state 2017-04-11 01:46:12.3307402
???
Tuesday, April 11, 2017 4:05 PM
I think what is key here is that when you get your disconnect your read only replica is red.
Have you configured a private network?
Any way you can share your cluster logs?