ME-6236 asked

Domain Controller Failover Issues: New DC with all FSMO roles never authenticates alone

The problem I am having is that if I take one of my two writable domain controllers offline, nobody seems to "fail over" to using the other domain controller like they're supposed to - applications we run within our network that use AD for authentication just keep asking for a username and password and never actually authenticate you, and external users reliant on a read-only DC on another network segment can't authenticate to our remote access website either.

I currently have three domain controllers in my domain: DC1, DC2, and RO1. DC1 and RO1 are Server 2019; DC2 is Server 2012 R2. Both of the writable DCs are AD-integrated DNS servers, with their network adapters configured to point at each other.

DC1 and DC2 are on the same subnet. RO1 is a read only controller out in a different network segment in order to support a remote access solution managed by the organization above me (who manages the general network I connect to).

In the past, if I took either of the local DCs offline, local users would fail over to whichever one was still running (as expected), as would remote users, since the RODC would reach whichever writable DC was active to authenticate them.

The current DC1 is a relatively new addition, replacing one called DC. DC1 was brought online and joined alongside DC and DC2, and everything seemed fine. I transferred all the FSMO roles that DC held over to its replacement, DC1 - netdom query fsmo shows all roles as being on the new DC1. We demoted DC and took it offline to retire it, since it was a Server 2012 machine and we're migrating away from those. I cleaned up a few errant DNS records that claimed the old DC was still around, but other than that everything chugged along as it had. During the last patch cycle, though, we had DC2 offline while DC1 and RO1 remained active, and discovered the authentication-related issues above. External users could not authenticate at all, and users who were already logged in found our AD-authenticating applications suddenly asking them to log in again (to no avail).

Unfortunately I'm not sure why this is. DC1, the new controller, is definitely recognized by the Domain. Replication happens fine - Repadmin /showrepl is successful, and /replsum has no errors reported. All involved internal machines can resolve their host names and ping each other. If I ping the domain, I can get either writable DC, same as if I tracert to the domain. I can make edits on DC1 and see them on DC2, and vice versa (and changes like group policy made on DC1 specifically definitely exist out in the greater network). I can take the RODC and tell it to load records from DC1 and DC2 without issue.

If I take DC2 offline, however, that's when things go sideways. Ping or Tracert to our domain fails, external users get denied access, and internal users see our AD-authenticated applications fail and constantly call for a username and password. The opposite does not happen, however - if I take the new DC1 offline, local users sometimes have a slight chugging delay as if their machine was trying to contact DC1 before failing over to DC2 and authenticating successfully, and external users come in just fine.
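One way to narrow down the failed ping to the domain name: each writable DC normally registers an A record for the domain name itself, so the name should still resolve to DC1's address while DC2 is off. The following is only a rough sketch of that check, assuming Python is available on a test client; the domain name and IP addresses in the usage note are placeholders, not values from this environment.

```python
import socket

def resolve_ipv4(name):
    """Return the sorted IPv4 addresses that `name` resolves to (empty list on failure)."""
    try:
        infos = socket.getaddrinfo(name, None, family=socket.AF_INET)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return sorted({info[4][0] for info in infos})

def compare_addresses(resolved, expected):
    """Report expected DC addresses that are missing and resolved addresses nobody expects."""
    resolved, expected = set(resolved), set(expected)
    return {
        "missing": sorted(expected - resolved),      # DCs with no A record for the domain name
        "unexpected": sorted(resolved - expected),   # stale records, e.g. a retired DC
    }
```

For example, `compare_addresses(resolve_ipv4("domain.local"), ["10.0.0.11", "10.0.0.12"])` (hypothetical name and addresses) would list DC1's address under "missing" if only DC2 had registered the domain-level A record, and a retired DC's address under "unexpected" if a stale record was left behind.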

Internal clients have been tested using the DC/DNS servers in question as the DNS server entries on their network adapters.

There's nothing super obvious in the Event Logs, and everything I can think of appears correctly configured. I'm not sure where to progress from here - has anyone had similar symptoms that they've been able to correct?

Tags: windows-server, windows-active-directory, windows-server-2019

ME-6236 answered

Ultimately it was a problem with the organization that manages the network we connect to - specifically, Firewall rules were not correctly configured.
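For anyone hitting similar symptoms, a quick reachability test from the affected segment can help show whether a firewall is in the way before digging into AD itself. The sketch below is an illustration only, not the procedure used here: it assumes Python is available on a test machine, it tests TCP only (UDP and the dynamic RPC port range are not covered), and the port list is the commonly documented set of AD service ports, which may need adjusting.

```python
import socket

# Commonly documented TCP ports for AD services (assumed list; adjust for your environment).
AD_TCP_PORTS = {
    53: "DNS",
    88: "Kerberos",
    135: "RPC endpoint mapper",
    389: "LDAP",
    445: "SMB",
    464: "Kerberos password change",
    636: "LDAPS",
    3268: "Global Catalog",
}

def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts, and name-resolution failures
        return False

def check_dc(host, ports=AD_TCP_PORTS, timeout=2.0):
    """Map each port number to True (reachable) or False (blocked, closed, or unresolvable)."""
    return {port: port_open(host, port, timeout) for port in ports}
```

Running `check_dc("dc1.domain.local")` (a hypothetical host name) from the RODC's segment and again from the local subnet, then comparing the two results, would show which ports are reachable from one side but not the other.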


DSPatrick answered ME-6236 commented

if I take one of my two writable domain controllers offline, nobody seems to "fail over" to using the other domain controller like they're supposed to

I'd check that the problem members have the addresses of the other healthy domain controllers listed as their DNS servers.

--please don't forget to upvote and Accept as answer if the reply is helpful--

Ah, I omitted that from the post - the clients that have tested this on-network do have both the DNS servers / Domain Controllers in question as their DNS entries on their network adapters.

DSPatrick answered DSPatrick commented

Please run the following, with each ipconfig command run on the machine its output file is named for:

Dcdiag /v /c /d /e /s:%computername% >C:\dcdiag.log
repadmin /showrepl >C:\repl.txt
ipconfig /all > C:\dc1.txt
ipconfig /all > C:\dc2.txt
ipconfig /all > C:\dc3.txt
ipconfig /all > C:\problemworkstation.txt

Then put the unzipped text files up on OneDrive and share a link.

Just checking if there's any progress or updates?


I haven't had an opportunity yet; I can do that, but I would need to spend time scrubbing out specific machine names for security reasons.


That shouldn't be necessary, as names are meaningless. Editing the files defeats the purpose, so it sounds like you'll be better off starting a case with product support here:
https://support.serviceshub.microsoft.com/supportforbusiness



Hello,

Attached is the DCDIAG log file for the DC that isn't picking up the slack. Unfortunately OneDrive is blocked here, as are most similar services.

fedcdcdiag.log (329.1 KiB)

Unfortunately one drive is blocked here, as are most similar services

More likely, your new Q&A profile is still untrusted. The workaround is to post the link as plain text or in a code block. https://onedrive.live.com/ If you cannot put up all the files, I'd suggest starting a case with product support here:
https://support.serviceshub.microsoft.com/supportforbusiness

LimitlessTechnology-2700 answered ME-6236 commented

Hello,

Thank you for your question.

I would suggest working through the troubleshooting steps below.

  1. Please check whether any DNS forwarder is configured in DNS, and review the DNS replication-related event logs.

  2. Also, can you give the client machines the DNS IPs of all three of your DCs?

  3. Please check again that AD replication is in a healthy state; you can run the Microsoft GUI tool below to check AD health.
    https://www.microsoft.com/en-in/download/details.aspx?id=30005

  4. When you take DC2 offline, please check the authentication-related event logs on the client machines and run tracert to your domain name.

  5. Please also check that time is in sync between your DCs and client computers.



If the reply was helpful, please don’t forget to upvote or accept as answer.


Hello,

No forwarders are configured.

There is some slight time drift but within 5 minutes.

AD replication is healthy. Only two of the DCs run DNS, and the clients have both IPs, but it doesn't seem to make a difference.


Hello,

If AD replication is healthy and everything is in sync, then it could be an issue with the DNS records, as login authentication requests are not being directed to the new DC.

Please check the DNS event logs for any clues.
Also, please check the tombstone lifetime.

ME-6236 (in reply to LimitlessTechnology-2700)

DNS issues had been my thought as well; I rooted through the zone and found few discrepancies. Both DCs are listed together everywhere they should be; the only folder with a single DC entry is the one for the PDC (which holds the one that won't take over). I did see that the _msdcs.domain.local folder had only a single entry, for my old, now long-gone DC, so I removed that and added the two active ones. This might have solved my tracert issue, where a tracert to the domain name only got the new DC responding when both were online, but it didn't fix the issue with authentication. Where precisely should I look for authentication-related records?

In your earlier post about forwarders, did you mean the forwarder node in DNS or the Forwarder tab in the DNS server's properties (where you can also do things like enable root hints)? I may have misunderstood.
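On where to look for authentication: clients find a DC through SRV records under the _msdcs zone, so those are the records worth querying against each DNS server in turn. As a rough sketch (assuming a single-domain forest, so the Global Catalog record lives under the same name; the domain name, site name, and server address in the usage note are placeholders), the helper below just builds the standard record names and matching nslookup commands to run by hand.

```python
def dc_locator_srv_names(domain, site=None):
    """Return the main SRV record names the DC locator queries for `domain`."""
    names = [
        f"_ldap._tcp.dc._msdcs.{domain}",      # every DC in the domain
        f"_kerberos._tcp.dc._msdcs.{domain}",  # every KDC in the domain
        f"_ldap._tcp.pdc._msdcs.{domain}",     # the PDC emulator only
        f"_ldap._tcp.gc._msdcs.{domain}",      # Global Catalogs (forest root name; assumed same here)
    ]
    if site:
        # Site-specific record that clients in that AD site prefer.
        names.append(f"_ldap._tcp.{site}._sites.dc._msdcs.{domain}")
    return names

def nslookup_commands(domain, dns_server, site=None):
    """Build nslookup command lines that query each SRV name against one specific DNS server."""
    return [
        f"nslookup -type=SRV {name} {dns_server}"
        for name in dc_locator_srv_names(domain, site)
    ]
```

For example, `nslookup_commands("domain.local", "10.0.0.11")` (hypothetical values) yields four commands. Running them against DC1 and then DC2 and comparing the answers should show whether both DCs appear in the dc, kerberos, and gc records on both servers; the pdc record is expected to list only the PDC emulator.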
