I have found what appears to be an undocumented requirement: in Windows Server 2022 and later, Remote Desktop Services (RDS) Connection Brokers must have more vCPU and RAM than the session hosts in the RDS deployment.
I have been able to reproduce this behavior with both Windows Server 2022 and Windows Server 2025 using the marketplace images in Azure.
For an example deployment, I had a session broker VM with 2 vCPU and 8GB of RAM.
One session collection had hosts that had 4 vCPU and 16GB of RAM each.
For testing, I stood up another session collection whose hosts had 2 vCPU and 4GB of RAM.
When I tried to connect to the session collection where the hosts had 4 vCPU and 16GB of RAM, the connection failed and the following three events were logged on the broker. Connecting to the other session collection, where the hosts had less RAM than the broker, worked fine.
Log Name: Microsoft-Windows-TerminalServices-SessionBroker/Admin
Source: Microsoft-Windows-TerminalServices-SessionBroker
Date: 1/9/2025 11:47:18 AM
Event ID: 802
Task Category: RD Connection Broker processes connection request
Level: Error
Keywords:
User: NETWORK SERVICE
Computer: XXXXXXXXXXXX
Description:
RD Connection Broker failed to process the connection request for user XXXXXXXXX.
Error: Insufficient system resources exist to complete the requested service.
Log Name: Microsoft-Windows-TerminalServices-SessionBroker-Client/Operational
Source: Microsoft-Windows-TerminalServices-SessionBroker-Client
Date: 1/9/2025 11:47:18 AM
Event ID: 1296
Task Category: RD Connection Broker Client processes request from a user
Level: Error
Keywords:
User: NETWORK SERVICE
Computer: XXXXXXXXX
Description:
Remote Desktop Connection Broker Client failed while getting redirection packet from Connection Broker.
User : XXXXXXXXX
Error: Element not found.
Log Name: Microsoft-Windows-TerminalServices-SessionBroker-Client/Operational
Source: Microsoft-Windows-TerminalServices-SessionBroker-Client
Date: 1/9/2025 11:47:18 AM
Event ID: 1306
Task Category: RD Connection Broker Client processes request from a user
Level: Error
Keywords:
User: NETWORK SERVICE
Computer: XXXXXXXXXX
Description:
Remote Desktop Connection Broker Client failed to redirect the user XXXXXXXXXXXX.
Error: NULL
....
The broker had plenty of free CPU and RAM: CPU utilization was around 2% and RAM utilization was around 30%.
I then resized the working session hosts to 4 vCPU and 16GB of RAM. This caused connections to that collection to fail as well, with the same events logged on the broker.
I then resized the broker to 8 vCPU and 32GB of RAM.
This did not immediately resolve the problem; however, once I removed the session hosts from the deployment and added them back in, everything worked.
It seems that once the deployment gets into this "bad" state, you need to remove the hosts and add them back in to fix it.
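To sanity-check my theory against the tests above, here is a minimal Python sketch. The rule in `broker_covers_host` is only my inference from these four results, not anything Microsoft documents, and the names `Size`, `broker_covers_host`, and `OBSERVATIONS` are made up for illustration. I used "at least as much" rather than "strictly more" because the 2 vCPU hosts worked against the 2 vCPU broker; I did not test equal RAM, so that part is a guess.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Size:
    """VM size as used in the tests in this post."""
    vcpu: int
    ram_gb: int


def broker_covers_host(broker: Size, host: Size) -> bool:
    # Hypothetical rule inferred from the observations below.
    # Assumption: equal vCPU is fine (observed); equal RAM is untested.
    return broker.vcpu >= host.vcpu and broker.ram_gb >= host.ram_gb


# (broker size, session host size, did the connection succeed?)
OBSERVATIONS = [
    (Size(2, 8), Size(4, 16), False),   # first collection
    (Size(2, 8), Size(2, 4), True),     # second collection
    (Size(2, 8), Size(4, 16), False),   # second collection after host resize
    (Size(8, 32), Size(4, 16), True),   # after broker resize and re-adding hosts
]


def mismatches():
    """Return the observations that the inferred rule fails to explain."""
    return [
        (broker, host, ok)
        for broker, host, ok in OBSERVATIONS
        if broker_covers_host(broker, host) != ok
    ]


if __name__ == "__main__":
    for broker, host, ok in OBSERVATIONS:
        print(broker, host, "connected" if ok else "failed")
    print("unexplained observations:", mismatches())
```

The rule explains all four results, but four data points are not proof. The sketch also cannot model the "bad" state, where the hosts had to be removed and re-added after the broker was resized.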
This limitation is not documented, and frankly it seems crazy.
If I need to run session hosts with 64GB of RAM, do I need to pay for broker VMs in Azure with 128GB of RAM, even though I only really need 3GB for the actual workload on the brokers?
Is this some kind of an evil plot by Microsoft to get people to pay for unnecessarily large VMs in Azure or give up on running RDS and move to Azure Virtual Desktop?