We have three servers set up as follows:
- Server1: Hyper-V host set up as RD Connection Broker, RD Web Access, RD Gateway, and RD Virtualization Host
- Server2: Storage host set up with a Cluster storage volume for the two Hyper-V hosts to use for central storage
- Server3: Hyper-V host set up as RD Virtualization Host
Both Hyper-V servers have been added to a failover cluster.
On multiple occasions, one or both of our Hyper-V hosts have lost access to the cluster even though they remain members.
For example, when trying to connect to the existing cluster on Server1, I get the following error:
An error occurred trying to display the cluster information.
One or more errors occurred.
Access is denied. (Exception from HRESULT: 0x80070005 (E_ACCESSDENIED))
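For reference, here is a minimal Python sketch of how that HRESULT breaks down, assuming the standard 32-bit HRESULT layout (failure flag in bit 31, 11-bit facility in bits 16-26, 16-bit code in bits 0-15). The function name decode_hresult is my own and not part of any Windows tooling.

```python
def decode_hresult(hresult: int) -> dict:
    """Split a 32-bit HRESULT into its failure flag, facility and code."""
    value = hresult & 0xFFFFFFFF  # accept signed (negative) representations too
    return {
        "failure": bool(value >> 31),        # bit 31: 1 means failure
        "facility": (value >> 16) & 0x7FF,   # bits 16-26: originating facility
        "code": value & 0xFFFF,              # bits 0-15: facility-specific code
    }


if __name__ == "__main__":
    print(decode_hresult(0x80070005))
```

Facility 7 is FACILITY_WIN32 and code 5 is ERROR_ACCESS_DENIED, so the error is a plain Win32 "access denied" wrapped in an HRESULT rather than anything cluster-specific.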
On the other hand, Server3 connects to the cluster just fine, and shows these errors under Server1:
Event ID 1196 Microsoft-Windows-FailoverClustering
Cluster network name resource 'Cluster Name' failed registration of one or more associated DNS name(s) for the following reason:
DNS bad key.
Failover Cluster Manager on Server3 also shows Server1 as Up, just like Server3.
The only error that is being logged on Server3 is the following:
Event ID 81 Microsoft-Windows-FailoverClustering-Client
LogExtendedErrorInformation (974): Extended RPC error information:
ProcessID is 7088
System time is: 16834/471/48251 0:0:12480:7516
Generating component is 2
Status is 5
Detection location is 501
Flags is 0
NumberOfParameters is 4
Unicode string: ncacn_ip_tcp
Unicode string: DEVOPS-HYPERV-3
Long val: -1182943054
Long val: 5
Extended RPC error information:
ProcessID is 7088
System time is: 16834/471/48251 0:0:12480:7516
Generating component is 2
Status is 5
Detection location is 1750
Flags is 0
NumberOfParameters is 1
Long val: 5
Another oddity: although Server3 has been added as a Virtualization Host, it behaves inconsistently. It shows as a server under the Remote Desktop Services section of Server Manager, but in the Overview tab, only Server1 is listed in the "Deployment Servers" section. If I try to add Server3 to the deployment servers as a Virtualization Host (as it should already be), I get the following error:
Failed:
Could not get the health information of the server <Server3> in the allocated time.
The user that is configuring/managing these servers is in the following groups:
- Local Administrators
- Hyper-V Administrators
- Remote Desktop Users
- Remote Management Users
The user account is not a Domain Admin, but it does have the following OU permissions:
- Create Computer Objects
- Read All Properties
If I run Get-ClusterAccess on Server1, it shows that my user account has full permissions:
IdentityReference                      AccessControlType ClusterRights
-----------------                      ----------------- -------------
NT AUTHORITY\SYSTEM                    Allow             Full
NT AUTHORITY\NETWORK SERVICE           Allow             Read
BUILTIN\Administrators                 Allow             Full
BUILTIN\Storage Replica Administrators Allow             Full
DOMAIN\MYUSERACCOUNT                   Allow             Full
NT SERVICE\MSDTC                       Allow             Full
NT SERVICE\VmHostAgent                 Allow             Full
NT SERVICE\smphost                     Allow             Full
Oddly, running the same command from Server3 (which can connect to the cluster in Failover Cluster Manager) fails:
PS C:\WINDOWS\system32> Get-ClusterAccess
Get-ClusterAccess : You do not have administrative privileges on the cluster. Contact your network administrator to
request access.
Access is denied
At line:1 char:1
+ Get-ClusterAccess
+ ~~~~~~~~~~~~~~~~~
+ CategoryInfo : AuthenticationError: (:) [Get-ClusterAccess], ClusterCmdletException
+ FullyQualifiedErrorId : ClusterAccessDenied,Microsoft.FailoverClusters.PowerShell.GetClusterAccessCommand
It might be of importance to note that Server1 and Server3 belong to the following groups on each server:
- RDS Endpoint Servers
- RDS Management Servers
- RDS Remote Access Servers
Typically, this issue can be fixed by removing both Server1 and Server3 from the domain, adding Server1 back first and then Server3. Sometimes this works on the first try, but other times it takes multiple repetitions of removing and re-adding before the servers can properly access the failover cluster again.
There are also times when Server3 cannot connect to the RD Connection Broker on Server1, but this usually coincides with the failover cluster issue and is fixed when the servers are removed from and re-added to the domain.
Clearly, there is some sort of communication/permission mismatch between the servers, or between the servers and our domain.
This current environment is a Proof of Concept for a VDI implementation, and these issues are extremely inconvenient. We cannot move forward with an implementation that continuously presents such issues without a clear cause. If anyone can help troubleshoot these issues, it would be greatly appreciated. Otherwise, we'll be forced to look at other options that are more stable and less error-prone.