Hi
We’re implementing a custom solution using the Azure Automation Account and Hybrid Runbook Worker nodes (on-premises) in a spoke Subscription / VNet as per the architecture below:

Basically, we have a hub VNet where the ExpressRoute circuit from on-prem terminates. We have no control over the hub and do not want to deploy any service belonging to our solution there.
The Automation Account needs to talk to on-prem services such as Microsoft SQL DB and AD DS, which is why the Hybrid Runbook Workers have been placed on-premises. The Automation Account is secured behind a Private Endpoint and has its Public Endpoint completely disabled.
To enable the Hybrid Runbook Worker nodes to resolve the Private Endpoint FQDNs (for the JRDS and Agentsvc services) of the Automation Account, the following measures have been taken:
- Deployed the DNS Private Resolver service with an Inbound Endpoint in a dedicated subnet (as it needs one), Subnet02, in the spoke VNet. I’ve read some blogs that confirm this is a valid architecture and works. Is any additional configuration (DNS zone linking, etc.) required on the Hub VNet?
- Configured two-way VNet Peering between the Hub and Spoke VNets
- Configured a DNS Conditional Forwarder on the on-premises AD-integrated DNS server as follows:
  - DNS Domain Name: privatelink.azure-automation.net
  - Forwarding IP: the Subnet02 IP assigned to the Inbound Endpoint of the DNS Private Resolver service.
- Ensured that the on-premises subnet ranges for the Domain Controllers (with DNS) and Hybrid Runbook Workers are advertised to Azure over the ExpressRoute Private Peering connection.
- Azure Arc agent deployed on the VM (Azure Arc Private Link Scope is not used - is this required in this scenario?)
- Appropriate firewall rules are in place to allow the following:
  - DNS (TCP and UDP 53) traffic from the on-prem AD-integrated DNS servers to the entire Subnet02 (where the DNS Private Resolver’s Inbound Endpoint is deployed). Do we need to allow DNS traffic for the whole Azure IP space, including the Hub VNet address space?
  - Outbound TCP 443 to the global URL of the Automation Account, i.e., *.azure-automation.net. This is as per the article: https://learn.microsoft.com/en-us/azure/automation/extension-based-hybrid-runbook-worker-install?tabs=windows%2Cbicep-template
There’s no proxy on the way from on-prem to Azure along the ExpressRoute to intercept the outbound TCP 443.
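For anyone who wants to reproduce the connectivity checks, the TCP side of the firewall rules above can be spot-checked from an on-prem node with a minimal Python sketch like the one below. It is only an illustration: `tcp_reachable` is a hypothetical helper name, the host and port are whatever you pass in (for example the Inbound Endpoint IP on port 53, or a Private Endpoint FQDN on port 443), and it does not exercise UDP 53.

```python
import socket


def tcp_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port completes within timeout.

    Hypothetical helper for illustration only. It checks the TCP handshake,
    not the application protocol (DNS or HTTPS) behind the port.
    """
    try:
        # create_connection resolves the host (if it is a name) and connects.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

A `True` result only shows that the handshake completed along the path; it says nothing about which DNS answer or certificate the far end would return.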
The issue I’m seeing is that when I push the extension-based (V2) Hybrid Runbook Worker agent to the nodes from the Azure portal, it fails with this error:
VERBOSE: [2023-05-17 05:04:56Z] Invoking HybridWorkerService Enable ...
WARNING: Error while reaching the Hybrid Worker server. Retrying it for : 1 time after waiting for 6 seconds
WARNING: Error while reaching the Hybrid Worker server. Retrying it for : 2 time after waiting for 12 seconds
WARNING: Error while reaching the Hybrid Worker server. Retrying it for : 3 time after waiting for 24 seconds
WARNING: Error while reaching the Hybrid Worker server. Retrying it for : 4 time after waiting for 48 seconds
WARNING: Error while reaching the Hybrid Worker server. Retrying it for : 5 time after waiting for 96 seconds
VERBOSE: [2023-05-17 05:09:04Z] Error encountered handling extension configuration...
VERBOSE: [2023-05-17 05:09:04Z] [ERROR] {"Message":"Authentication failed for private links"}
VERBOSE: [2023-05-17 05:09:04Z] {
"Exception": {
"Message": "{\"Message\":\"Authentication failed for private links\"}",
"Data": {
},
"InnerException": null,
"TargetSite": {
"Name": "ExtractErrorMessageAndThrow",
"DeclaringType":
"HybridRegistration.ExtensionWorker.ExtensionWorkerConnectUtil",
"ReflectedType":
"HybridRegistration.ExtensionWorker.ExtensionWorkerConnectUtil",
"MemberType": 8,
"MetadataToken": 100663378,
"Module": "HybridRegistration.ExtensionWorker.dll",
"IsSecurityCritical": true,
"IsSecuritySafeCritical": false,
"IsSecurityTransparent": false,
"MethodHandle":
"Attributes": 134,
"CallingConvention": 33,
"ReturnType": "void",
"ReturnTypeCustomAttributes": "Void ",
"ReturnParameter": "Void ",
"IsGenericMethod": false,
"IsGenericMethodDefinition": false,
"ContainsGenericParameters": false,
"MethodImplementationFlags": 0,
"IsPublic": true,
"IsPrivate": false,
"IsFamily": false,
"IsAssembly": false,
"IsFamilyAndAssembly": false,
"IsFamilyOrAssembly": false,
"IsStatic": false,
"IsFinal": false,
"IsVirtual": false,
"IsHideBySig": true,
"IsAbstract": false,
"IsSpecialName": false,
"IsConstructor": false,
"CustomAttributes": ""
},
"StackTrace": " at
HybridRegistration.ExtensionWorker.ExtensionWorkerConnectUtil.ExtractErrorMessageAndThrow(AggregateException
exception)\r\n at HybridRegistration.ExtensionWorker.ExtensionWorkerConnectUtil.RegisterGroupInDatabase(String
token)\r\n at HybridRegistration.ExtensionWorker.ExtensionWorkerConnectUtil.Connect(Uri endpoint, Int32 timeout,
When I run nslookup on the FQDNs of the Automation Account’s Private Endpoints from the Hybrid Runbook Worker nodes (which use the AD-integrated DNS servers as their DNS), I receive the correct response (the private IPs).
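The same resolution check can be scripted so it is repeatable on each worker node. The following is a rough Python sketch, not part of the actual setup: `is_rfc1918`, `resolve_ipv4`, and `check_host` are hypothetical helper names, and the hostnames passed in would be the JRDS and Agentsvc FQDNs of the Automation Account. It resolves a name through the node's configured DNS and reports whether each returned IPv4 address falls in an RFC 1918 private range.

```python
import ipaddress
import socket

# RFC 1918 private IPv4 ranges, where a Private Endpoint IP is expected to live.
RFC1918 = [
    ipaddress.ip_network(net)
    for net in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]


def is_rfc1918(address: str) -> bool:
    """Return True if the address is an IPv4 address in an RFC 1918 range."""
    ip = ipaddress.ip_address(address)
    return ip.version == 4 and any(ip in net for net in RFC1918)


def resolve_ipv4(hostname: str) -> list:
    """Resolve hostname with the system resolver and return unique IPv4 addresses."""
    infos = socket.getaddrinfo(
        hostname, 443, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    return sorted({info[4][0] for info in infos})


def check_host(hostname: str) -> dict:
    """Map each resolved address to True (private) or False (public).

    Returns an empty dict if the name does not resolve.
    """
    try:
        return {addr: is_rfc1918(addr) for addr in resolve_ipv4(hostname)}
    except socket.gaierror:
        return {}
```

Calling `check_host(...)` with each Private Endpoint FQDN should give only `True` values if the conditional forwarder chain is working. Any `False` value would mean the node is getting a public address for that name.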
If I enable the Public Endpoint of the Automation Account, the agent installation succeeds.
Even with the Private Endpoint enabled for the Automation Account, the extension-based Hybrid Runbook Worker agent still tries to connect to the Public Endpoints and fails. I confirmed this through the firewall logs.
Checked the 0.settings file on the worker node (on-premises) inside the directory C:\Packages\Plugins\Microsoft.Azure.Automation.HybridWorker.HybridWorkerForWindows<version>\RuntimeSettings and this is what it shows:

Tracert from on-prem to Azure goes as far as the Microsoft Edge. No issues are reported by the Network Watcher tests (Connection troubleshoot and IP flow verify) on the Azure side. I deployed a test VM in the same VNet as the DNS Private Resolver (in Subnet01, a different subnet), and that VM can resolve the Automation Account’s Private Endpoint FQDN to its private IP.
Any pointers / guidance will be highly appreciated.
Thanks