How to diagnose replication sp_replcounters transport-level error?

cdw 41 Reputation points
2022-10-03T20:59:20.093+00:00

I have a transactional replication publication / subscription set up but am receiving errors in the Log Reader Agent:

2022-10-03 18:41:45.133 OLE DB Publisher '<SERVER>': exec sp_replcounters N'<DATABASE>'
2022-10-03 18:42:04.556 Status: 2, code: 20011, text: 'The process could not execute 'sp_repldone/sp_replcounters' on '<SERVER>'.'.
2022-10-03 18:42:04.556 The process could not execute 'sp_repldone/sp_replcounters' on '<SERVER>'.
2022-10-03 18:42:04.556 Status: 2, code: 64, text: 'TCP Provider: The specified network name is no longer available.'.
2022-10-03 18:42:04.556 Status: 2, code: 64, text: 'Communication link failure'.

The distributor is co-located with the subscriber, across a WAN from the publisher.

This appears to be due to a misconfiguration of the SQL replication and not transient network issues because I am able to query the replication status from the publisher and query the publication database from the distributor but am unable to execute the sp_replcounters successfully from the distributor. Whenever EXEC sp_replcounters is run from the distributor it give the following error (matching the error in replication agent history):

Msg 64, Level 20, State 0, Line 0
A transport-level error has occurred when receiving results from the server. (provider: TCP Provider, error: 0 - The specified network name is no longer available.)

However, I am able to select data from the publisher database without issue and run the same EXEC sp_replcounters from the publication server. Also, both servers have the appropriate IP addresses configured in the C:\Windows\System32\drivers\etc\hosts file, so DNS should not be a factor.

What steps does sp_replcounters perform that could cause an network issue?

SQL Server | Other
0 comments No comments
{count} vote

Accepted answer
  1. Erland Sommarskog 121.4K Reputation points MVP Volunteer Moderator
    2022-10-03T22:00:19.187+00:00

    I don't know much about sp_replcounters, but you could run Profiler on the other machine to see what commands you can see from sp_replcounters. I would also check the SQL Server errorlog on the other machine for errors.

    However, this has the smell of a network issue, but of a different kind that just flaky switch. No, this since behaviour is consistent, this sounds like something which checks packets and says "oops, nasty stuff" and cuts the connection. No, I don't know of any product that behaves this way with sp_repldone. But recall that some ten years ago, people would occasionally have the connection cut as soon as they said RAISERROR. And that was indeed due to some security program. (Which acted that way because of an issue that was corrected in SQL 2000 SP2.)

    2 people found this answer helpful.

1 additional answer

Sort by: Most helpful
  1. Seeya Xi-MSFT 16,586 Reputation points
    2022-10-04T06:25:06.983+00:00

    Hi @cdw ,

    This appears to be a network issue. The ping-t command will show the network stability during the snapshot application.
    Here is a similar thread: https://dba.stackexchange.com/questions/149539/the-specified-network-name-is-no-longer-available
    You can also take a look at this document: https://learn.microsoft.com/en-us/sql/relational-databases/replication/troubleshoot-tran-repl-errors?view=sql-server-ver16

    Best regards,
    Seeya


    If the answer is the right solution, please click "Accept Answer" and kindly upvote it. If you have extra questions about this answer, please click "Comment".
    Note: Please follow the steps in our documentation to enable e-mail notifications if you want to receive the related email notification for this thread.

    1 person found this answer helpful.

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.