Thank you for reaching out and sharing your question on the Q&A portal. I appreciate the details you provided.
It sounds like you’re encountering an authentication error when trying to attach the VM, even though both the VM and the ML workspace are in the same private virtual network. Since the virtual network has no internet access, the error suggests the system might still be trying to reach the VM via its public IP, which isn’t what you want.
First, let’s confirm that the VM is properly configured for SSH access within the virtual network. Since you mentioned you can SSH into the machine via VPN using its public IP, the next step is to ensure the ML workspace can communicate with the VM using its private IP. You can find more details about configuring private networks for Azure ML in the Microsoft documentation.
When running the az ml compute attach command, make sure the SSH key and username are correct. Also, since the VM is in a private network, you might need to explicitly specify the private IP or ensure the command is executed in a way that routes traffic internally. The Azure ML compute attach documentation explains the parameters, but it doesn’t explicitly mention private network usage, so we might need to adjust the approach.
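To make the parameters easier to double-check, here is a minimal Python sketch that only assembles the attach command and prints it, without running anything. The flag names are what I believe the CLI v2 ml extension accepts, so please confirm them with az ml compute attach --help. The resource group, workspace, compute name, username, key path and resource ID are all placeholders I made up.

```python
import shlex


def build_attach_command(resource_group, workspace, compute_name,
                         vm_resource_id, admin_username, private_key_path,
                         ssh_port=22):
    """Assemble the `az ml compute attach` argument list without running it.

    Flag names are an assumption based on the CLI v2 `ml` extension;
    verify them against `az ml compute attach --help` before use.
    """
    return [
        "az", "ml", "compute", "attach",
        "--resource-group", resource_group,
        "--workspace-name", workspace,
        "--name", compute_name,
        "--type", "virtualmachine",
        "--resource-id", vm_resource_id,
        "--admin-username", admin_username,
        "--ssh-private-key-file", private_key_path,
        "--ssh-port", str(ssh_port),
    ]


if __name__ == "__main__":
    # Every value below is a placeholder, not taken from your environment.
    command = build_attach_command(
        resource_group="my-rg",
        workspace="my-workspace",
        compute_name="attached-vm",
        vm_resource_id="/subscriptions/<subscription-id>/resourceGroups/my-rg"
                       "/providers/Microsoft.Compute/virtualMachines/my-vm",
        admin_username="azureuser",
        private_key_path="/home/me/.ssh/id_rsa",
    )
    # Print a shell-quoted version so it can be reviewed before running.
    print(shlex.join(command))
```

Printing the command first lets you compare it side by side with the one that worked for your first VM.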
One thing to try is verifying the VM’s network security group (NSG) rules to ensure SSH traffic (port 22) is allowed from the ML workspace’s subnet. You can check NSG settings in the Azure Virtual Networks documentation.
If the issue persists, you could also try using the VM’s private IP directly in the command (if supported).
Let me know if this helps.
Best regards,
Alex
P.S. If my answer helps you, please accept it.
P.P.S. This is posted as an answer, not a comment.
https://ctrlaltdel.blog/
I’ve just seen your last comment, and it really helps clarify the situation. I understand now that your workspace and VMs are in a private virtual network, and you’re running into SSH issues when trying to attach a new VM, even though it’s in the same virtual network as the workspace.
From what you’ve described, it does seem like Azure ML might be trying to reach the VM over the public internet rather than through the internal Azure backbone network, even though both the workspace and the VM are in the same private virtual network. This would explain why SSH fails despite the VM being reachable from within the network.
Check the VM’s Network Security Group (NSG) Rules: Even though the VM is in the same virtual network, the NSG might be blocking SSH traffic from the Azure ML service’s internal IP range. Make sure port 22 (SSH) is allowed from the subnet where the ML workspace is deployed. You can find more about NSG rules in the Microsoft documentation.
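If it helps to reason about the rules offline, here is a rough Python sketch of how inbound NSG rules are evaluated: lowest priority number first, and the first matching rule decides. The rule dictionaries and their keys (priority, source, port, access) are a simplified model I made up for illustration. It ignores protocols, service tags and Azure’s built-in default rules, so treat it as a thinking aid rather than a replica of the real evaluation.

```python
import ipaddress


def port_matches(rule_port, port):
    """Match a port against "*", a single port ("22") or a range ("20-25")."""
    if rule_port == "*":
        return True
    if "-" in rule_port:
        low, high = rule_port.split("-")
        return int(low) <= port <= int(high)
    return int(rule_port) == port


def source_matches(rule_source, source_ip):
    """Match a source address against "*" or a CIDR prefix."""
    if rule_source == "*":
        return True
    return ipaddress.ip_address(source_ip) in ipaddress.ip_network(rule_source)


def is_inbound_allowed(rules, source_ip, port):
    """Return True if the first matching rule (lowest priority number) allows.

    Only the rules passed in are modelled; when nothing matches, this
    sketch returns False instead of applying Azure's default rules.
    """
    for rule in sorted(rules, key=lambda r: r["priority"]):
        if source_matches(rule["source"], source_ip) and port_matches(rule["port"], port):
            return rule["access"] == "Allow"
    return False


if __name__ == "__main__":
    # Hypothetical rules: allow SSH from one subnet, deny it from anywhere else.
    example_rules = [
        {"priority": 100, "source": "10.0.1.0/24", "port": "22", "access": "Allow"},
        {"priority": 200, "source": "*", "port": "22", "access": "Deny"},
    ]
    print(is_inbound_allowed(example_rules, "10.0.1.5", 22))
    print(is_inbound_allowed(example_rules, "10.0.2.5", 22))
```

The thing to look for in your own NSG is a deny rule with a lower priority number than the rule that is meant to allow SSH from the workspace subnet.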
Ensure the VM’s Private IP is Used for SSH: When attaching the VM via CLI, try explicitly specifying the private IP (if possible) or confirm that the --resource-id parameter correctly points to the VM in the internal network. The az ml compute attach command should ideally use the private network, but if it defaults to public, we might need a workaround.
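A quick way to sanity-check which kind of address you are dealing with is Python’s ipaddress module. This is only a small sketch: the address and the virtual network range below are placeholders, so substitute your VM’s IP and your virtual network’s address space.

```python
import ipaddress


def describe_address(address, vnet_cidr):
    """Report whether an address is private and whether it falls inside a CIDR range."""
    ip = ipaddress.ip_address(address)
    return {
        "is_private": ip.is_private,
        "in_vnet": ip in ipaddress.ip_network(vnet_cidr),
    }


if __name__ == "__main__":
    # Placeholder values: replace with the VM's address and your VNet range.
    print(describe_address("10.0.1.4", "10.0.0.0/16"))
```

If the address Azure ML reports in the error is not inside your virtual network range, that would support the idea that the attach is going out over the public IP.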
Verify the SSH Key and Credentials: Double-check that the SSH key and username provided in the command are correct. Sometimes, the key format or permissions can cause issues. The Azure ML compute attach docs mention the --ssh-private-key-file parameter; ensure the key is in the right format (PEM) and accessible.
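The key checks above can be partly automated. Below is a minimal sketch, assuming a Linux or macOS machine (the permission check is not meaningful on Windows). It only looks at the file’s permission bits and its first line, so it cannot tell you whether the key actually matches the VM. The function name and the messages are my own.

```python
import os
import stat


def check_private_key(path):
    """Return a list of likely problems with a private key file (empty if none found)."""
    if not os.path.isfile(path):
        return [f"{path} does not exist or is not a regular file"]

    problems = []

    # SSH clients usually refuse keys that group or others can access.
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode & 0o077:
        problems.append(f"permissions are {oct(mode)}; 0o600 or stricter is expected")

    # Private keys begin with an armored "-----BEGIN ... PRIVATE KEY-----" line.
    with open(path, "r", encoding="utf-8", errors="replace") as handle:
        first_line = handle.readline().strip()
    if not (first_line.startswith("-----BEGIN ") and first_line.endswith("PRIVATE KEY-----")):
        problems.append("first line does not look like a private key header")
    elif "OPENSSH" in first_line:
        problems.append("key is in OpenSSH format; convert it if traditional PEM is required")

    return problems


# Example (hypothetical path): print(check_private_key("/home/me/.ssh/id_rsa"))
```

An empty list only means nothing obvious is wrong with the file itself; the username and the public key installed on the VM still need to match.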
Test Internal Connectivity from Another VM: If possible, try SSHing into the new VM from another VM in the same virtual network (like an existing attached compute instance). This will confirm whether the SSH service itself is working correctly inside the network.
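If you would rather not do a full SSH login for this test, a plain TCP probe of port 22 already tells you whether the network path is open. Here is a minimal sketch to run from another VM in the same virtual network. The function name is my own, and 10.0.1.4 in the usage comment is a placeholder for your VM’s private IP.

```python
import socket


def can_reach(host, port=22, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


# Example (placeholder address): print(can_reach("10.0.1.4", 22))
```

A result of False points at the network path (NSG, routing, or the SSH service not listening), while True with a failing attach points more towards credentials or the address Azure ML is using.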
Check Azure ML Service Endpoints: Azure ML might use specific service tags or endpoints for internal communication. Ensure these are allowed in your NSG. The list of required service tags might help, even in a private network setup.
Temporary Workaround (Public IP with Restricted Access): If nothing else works, you could try assigning a public IP to the VM temporarily (with strict NSG rules to only allow Azure ML’s IP range) just for the attach process, then remove the public IP afterward. This is not ideal but might help confirm the behavior.
Since your first VM attached successfully, comparing its NSG rules, subnet configuration, and SSH setup with the new VM might reveal differences.
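For that comparison, it can help to write down the relevant settings of both VMs and diff them. The sketch below is plain Python with no Azure calls. The setting names and values are hypothetical, so fill them in by hand from the portal or CLI output.

```python
def diff_settings(working, failing):
    """Return {setting: (working_value, failing_value)} for every setting that differs."""
    keys = sorted(set(working) | set(failing))
    return {
        key: (working.get(key), failing.get(key))
        for key in keys
        if working.get(key) != failing.get(key)
    }


if __name__ == "__main__":
    # Hypothetical values, entered by hand for illustration only.
    first_vm = {"subnet": "ml-subnet", "nsg": "nsg-ml", "ssh_port": 22, "public_ip": False}
    new_vm = {"subnet": "ml-subnet", "nsg": "nsg-new", "ssh_port": 22, "public_ip": True}
    for setting, (good, bad) in diff_settings(first_vm, new_vm).items():
        print(f"{setting}: working VM = {good!r}, new VM = {bad!r}")
```

Any setting that shows up in the output is a candidate for what makes the new VM behave differently.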