Additional SQL Server features and topics not covered by specific categories
In an Active Directory–joined Always On Availability Group, the main additional moving parts compared to a workgroup configuration are:
- Authentication model (Kerberos vs certificates)
- Service accounts and endpoint permissions
- Name resolution and SPNs
- Cluster/AG metadata stored in AD
Any misconfiguration in these areas can cause the AG to be created successfully but block the secondary database from joining and synchronizing.
Below are the key AD-related factors and concrete troubleshooting steps, based on the documented behavior of AGs and their mirroring-style endpoints.
1. Endpoint configuration and connectivity
The join operation uses the HADR endpoint (DATABASE_MIRRORING endpoint) between replicas. If the secondary cannot establish an active connection to the primary’s endpoint, joining the database fails with errors like 35250.
From the documented troubleshooting steps for Msg 35250 The connection to the primary replica is not active. The command cannot be processed.:
- Ensure the endpoint is created and started on both replicas.
- Verify connectivity to the endpoint port (for example with Telnet) and confirm that no firewall rules block it.
- Confirm the endpoint definition matches the IP/port actually used by the AG.
- Check for name resolution issues.
In an AD environment, endpoints typically use FQDNs and Kerberos; in a workgroup lab, certificates and direct IPs are often used. A mismatch between endpoint URL/IP and how replicas resolve each other’s names can cause the secondary to fail to connect.
Concrete checks:
- On both replicas, query the endpoint definition and confirm:
- Same port on both sides
-
STATE = STARTED -
LISTENER_IPandLISTENER_PORTare reachable from the partner
- Use
telnet <partnerName> <endpointPort>(or equivalent) from each node to the other.
2. Service accounts and CONNECT permission on the endpoint
In an AD environment, SQL Server services usually run under domain accounts and use Windows authentication to the endpoint. The documented requirements are:
- If both instances run under the same domain account, security is simpler: the login exists automatically in both
masterdatabases. - If instances run under different accounts, each account must be created in
masteron the remote instance and grantedCONNECTpermission on the mirroring endpoint. - If using built-in or non-domain accounts, certificates must be used for endpoint authentication.
If the workgroup lab uses certificates correctly but the AD lab relies on domain accounts without proper endpoint permissions, the secondary will not be able to connect to the primary’s endpoint even though the AG object and cluster look healthy.
Concrete checks:
- Confirm what accounts the SQL Server services run under on both nodes.
- On each instance, in
master, verify that the partner’s service account exists as a login and hasCONNECTpermission on the endpoint. - If using built-in or non-domain accounts in AD, ensure certificates are configured as in the workgroup scenario.
3. Name resolution, SPNs, and Kerberos
In AD, HADR communication typically uses Kerberos over TCP. Kerberos requires:
- Correct DNS resolution of server names
- Correct Service Principal Names (SPNs) for the SQL Server service accounts
- No duplicate or missing SPNs
If Kerberos fails, connections may fall back to NTLM or fail, depending on policy. For the AG endpoint, this can manifest as the secondary being unable to establish the required session to the primary.
The documented guidance for name resolution includes:
- Use
nslookup <ServerName>andnslookup <IP_Address>on each node. - Use
Resolve-DnsName -Name <ServerName>andResolve-DnsName -Name <IP_Address>. - Check the local
HOSTSfile for incorrect entries.
Concrete checks:
- Ensure that the names used in the endpoint URLs resolve to the correct IPs on both nodes.
- Confirm there are no stale or conflicting entries in
C:\Windows\System32\drivers\etc\hosts. - Verify SPNs for the SQL Server service accounts (for example using
setspnor Kerberos tools) and correct any duplicates or missing entries.
4. Error details from DMVs and logs
The documented troubleshooting path for join failures recommends checking the dynamic management view sys.dm_hadr_availability_replica_states:
- Query
last_connect_error_numberand related columns on the secondary to see the exact error code when it attempts to connect to the primary. - Use that error number to focus on the specific cause (network, authentication, endpoint, etc.).
Also review:
- SQL Server error log on both replicas at the time of the join attempt
- Windows Event Logs (FailoverClustering and System) for related errors
5. Version/build issues
The documentation explicitly recommends ensuring SQL Server is on a recent build to avoid known bugs that can affect AG configuration and join operations.
Concrete step:
- Confirm both replicas run the same, up-to-date SQL Server 2022 build and apply the latest cumulative update if needed.
6. Why it can work in workgroup but fail in AD
In the workgroup lab, the configuration likely uses:
- Certificates for endpoint authentication
- Direct IPs or simple names in endpoints and hosts files
This avoids Kerberos, SPNs, and domain account complexities.
In the AD lab, the same high-level steps (backup/restore, AG creation) can succeed while the underlying HADR connection from secondary to primary fails due to:
- Missing or incorrect endpoint
CONNECTpermissions for the domain service accounts - Name resolution differences (FQDN vs short name) causing SPN mismatches
- Kerberos/SPN misconfiguration preventing successful authentication to the endpoint
The AG object and listener can be created by the cluster and SQL Server, but the database-level join requires a working, authenticated endpoint session, which is where AD-related issues surface.
Recommended troubleshooting sequence for the AD lab
- Attempt to join the secondary database and note the exact error message/number.
- On the secondary, query
sys.dm_hadr_availability_replica_statesforlast_connect_error_numberand related fields. - Verify endpoint configuration and state on both replicas (port, IP,
STATE = STARTED). - Test network connectivity to the endpoint port between nodes.
- Validate DNS and hosts file entries using
nslookup,Resolve-DnsName, andtype/Get-Contenton the hosts file. - Confirm SQL Server service accounts and ensure each has
CONNECTpermission on the partner’s endpoint if using different accounts. - Verify SPNs and Kerberos configuration for the SQL Server service accounts.
- Ensure both replicas are on a current SQL Server 2022 build.
These checks target the AD-specific layers (Kerberos, SPNs, domain accounts, name resolution) that do not exist or are handled differently in the workgroup lab and are the most likely reasons the secondary database cannot join and start synchronization.
References: