Why are LDAP queries using PrincipalContext very slow since upgrading to Windows Server 2019?

hbuelow 21 Reputation points
2020-12-24T20:38:26.067+00:00

We recently upgraded from Windows Server 2008 R2 to Windows Server 2019. Since the upgrade, the piece of code below takes over a minute to run; it previously took 1 to 2 seconds.

Using netmon, I can see that the server makes multiple DNS calls to locate the DC even though we specified the fully qualified name of the DC in the request. The network adapter on the server is configured to append 3 different DNS suffixes, and a call is made with each suffix to locate the DC even though that is not necessary. If I remove ContextOptions.Negotiate and use ContextOptions.SimpleBind, these additional calls are not made. Unfortunately, that is not a viable option.

I ran the same test on a Windows Server 2008 R2 server that also has the additional DNS suffixes configured; no calls are made to locate the DC, so the call completes quickly. I also ran tracelog on both the Windows Server 2008 R2 server and the Windows Server 2019 server and compared the logs. The main difference is that the log from the Windows Server 2019 server contains the text “LDAP connection 0x7e86118 attempting to resolve 'FULLQUALIFIEDDCNAME.COM' using DC locator.” Why is it trying to locate the DC when I specified its fully qualified name?

Dim oPrincipalContext As New PrincipalContext(ContextType.Domain, Me.activeDirectoryHost & ":636", sDefaultSearchUserOU, ContextOptions.ServerBind Or ContextOptions.Negotiate Or ContextOptions.SecureSocketLayer, sDomain & "\" & sServiceUser, sServicePassword)
Dim oUserContext As OurCustomUserPrincipalExtended = OurCustomUserPrincipalExtended.FindByIdentity(oPrincipalContext, sUserName)


Accepted answer
  1. Gary Nebbett 5,721 Reputation points
    2020-12-30T19:13:21.117+00:00

    Hello Helen (@hbuelow ),

    If the firewall problems between the Windows 2019 server and the DC cannot be investigated further, then it is perhaps worth investigating how the Windows 2008 R2 server manages to work.

    Here are three ideas, although none of them may be suitable in your environment:

    1. Don't use ContextOptions.SecureSocketLayer and TCP port 636; use plain LDAP on port 389 instead. Then make a new Network Monitor capture and run your code/script. Most of the connection will still be encrypted (by SASL), but the authentication mechanism should be visible in the trace.
    2. Follow the instructions in Event Tracing in LDAP Applications to capture ETW (Event Tracing for Windows) data for the connection (this way of capturing data can see the raw LDAP protocol without encryption, so you can continue to use the ContextOptions.SecureSocketLayer option).
    3. Just use ETW to trace the Microsoft-Windows-LDAP-Client provider. Without the registry modification of option 2, this will collect less information but it might still be enough to understand how the Windows 2008 R2 system is authenticating.

    The "klist purge" might not have had the desired effect because explicit credentials are used in the LDAP bind. There are two "messages" to the local Kerberos client that "purge" its cache: KerbPurgeTicketCacheMessage and KerbPurgeTicketCacheExMessage. "klist purge" probably uses the first message, but the second message might be necessary in this case.

    Gary


7 additional answers

  1. Gary Nebbett 5,721 Reputation points
    2020-12-30T12:43:45.527+00:00

    Hello Helen (@hbuelow ),

    The "Negotiate" authentication protocol "negotiates" a concrete authentication protocol (in practice, either "Kerberos" or "NTLM"). Typically, if the necessary conditions are met (such as being able to construct a "service principal name" (SPN)), "Kerberos" is tried first and "NTLM" is used as a fallback.

    Referring to a service with an IP address rather than a domain name means that no SPN can be constructed and NTLM is used.
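
The rule described above can be sketched in a few lines of Python. This is only a simplified model of the decision, not the actual Negotiate implementation: the function names `split_host_port` and `expected_mechanism` and the host name `dc01.example.com` are hypothetical, and real Windows behaviour depends on further conditions (reachable KDC, policy settings, and so on).

```python
import ipaddress
from typing import Optional, Tuple


def split_host_port(target: str) -> str:
    """Return the host part of 'host' or 'host:port' (hostnames and IPv4 only)."""
    host, sep, port = target.rpartition(":")
    if sep and host and port.isdigit() and ":" not in host:
        return host
    return target


def expected_mechanism(target: str, service: str = "ldap") -> Tuple[str, Optional[str]]:
    """Simplified model: an IP literal gives no SPN, so NTLM; a name gives an SPN, so Kerberos."""
    host = split_host_port(target)
    try:
        ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal: an SPN of the form service/host can be constructed.
        return ("Kerberos", f"{service}/{host}")
    # IP literal: no SPN, so Negotiate is expected to fall back to NTLM.
    return ("NTLM", None)


print(expected_mechanism("dc01.example.com:636"))
print(expected_mechanism("10.1.2.3:636"))
```

Under this model, passing the DC's fully qualified name leads to a Kerberos attempt (and therefore to KDC lookups), while passing its IP address does not.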

    The DNS queries in the 2nd network trace include queries for _kerberos._tcp...; these queries are trying to locate a Kerberos Key Distribution Centre (KDC) in the domain "sDomain" that will issue a Kerberos ticket for the "ldap" service, using the credentials [sDomain\sServiceUser, sServicePassword].

    Something (such as a firewall) is probably blocking the Kerberos requests - this would probably be visible in an unfiltered view of the network trace that you made. The next step is to identify which device is blocking the Kerberos traffic (some intermediate firewall device or firewall settings on the KDC).

    Gary


  2. hbuelow 21 Reputation points
    2020-12-30T13:19:23.977+00:00

    @Gary Nebbett
    Why don't we see any of these Kerberos requests in the netmon log from the Windows 2008 R2 servers? Also, one of the Kerberos requests does return success, and it returns the IPs of the DCs. In the netmon log, it queries for each DNS suffix, and it queries against the site as well. Is there any way to stop the queries against the site? See below. It repeats all of the calls below twice during a single bind.

    _kerberos._tcp.THESITE._sites.dc._msdcs.sDomain.DNSSuffix1 - Error
    _kerberos._tcp.THESITE._sites.dc._msdcs.sDomain.DNSSuffix2 - Error
    _kerberos._tcp.THESITE._sites.dc._msdcs.sDomain.DNSSuffix3 - Error
    _kerberos._tcp.dc._msdcs.sDomain.DNSSuffix1 - Error
    _kerberos._tcp.dc._msdcs.sDomain.DNSSuffix2 - Success

    Thanks,
    Helen
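
For reference, the query pattern listed above can be reproduced with a short Python sketch. It only mirrors the names seen in this trace (site-specific lookups first, then domain-wide lookups, each tried with every configured suffix); it is not the DC locator algorithm itself, and the function name `candidate_srv_names` is hypothetical.

```python
from typing import List


def candidate_srv_names(domain: str, site: str, suffixes: List[str]) -> List[str]:
    """List SRV names in the order seen in the trace: site-specific, then domain-wide."""
    prefixes = [
        f"_kerberos._tcp.{site}._sites.dc._msdcs",  # site-specific lookup
        "_kerberos._tcp.dc._msdcs",                 # domain-wide lookup
    ]
    # Each prefix is combined with the domain name and every DNS suffix in turn.
    return [f"{prefix}.{domain}.{suffix}" for prefix in prefixes for suffix in suffixes]


names = candidate_srv_names("sDomain", "THESITE", ["DNSSuffix1", "DNSSuffix2", "DNSSuffix3"])
for name in names:
    print(name)
```

The sketch lists six candidates; the trace shows only five because the lookup stopped at the first successful answer (the domain-wide query with DNSSuffix2).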


  3. Gary Nebbett 5,721 Reputation points
    2020-12-30T18:17:59.05+00:00

    Hello Helen (@hbuelow ),

    [For some unknown reason, "reply" would not work - hence a new "answer" message]

    The Kerberos packets in the trace are probably proof enough, but you could try using telnet, if that would give you more confidence.

    Obviously, "telnet" won't "work" but the nature of the reported failure would allow someone with enough experience to conclude whether the TCP connection (at the TCP protocol level) succeeded or failed.
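
If telnet is not installed, the same TCP-level check can be sketched in Python. This is a minimal illustration, assuming the KDC listens on the standard Kerberos TCP port 88; the function name `probe_tcp` and the host name in the usage comment are hypothetical.

```python
import socket


def probe_tcp(host: str, port: int = 88, timeout: float = 3.0) -> str:
    """Attempt a TCP connection and classify the outcome at the TCP level."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"   # the TCP handshake completed
    except socket.timeout:
        return "timeout"         # no reply at all, e.g. packets silently dropped
    except ConnectionRefusedError:
        return "refused"         # host reachable, but nothing accepted the connection
    except OSError as exc:
        return f"error: {exc}"   # name resolution failure, unreachable network, ...


# Usage (hypothetical DC name):
#   print(probe_tcp("dc01.example.com", 88))
```

A "timeout" result is the one that most strongly suggests a firewall dropping the traffic; "refused" means the host answered but nothing is listening on that port.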

    You could also try making a new network trace on your Windows 2008 R2 server "client" to observe Kerberos in action. First start Network Monitor on the client, then issue the command "klist purge" (to purge/empty the Kerberos cache), then run your code/script; finally, stop the capture and examine the captured data.

    If you don't know whether there are any firewall-capable devices on the path between your clients and the DC, and you can't investigate the settings or trace the behaviour of the DC, then we might not be able to proceed with Kerberos as the authentication protocol...

    Gary