Provisioning a virtual machine (VM) in Azure requires additional components besides the VM itself, including networking and storage resources. This article shows best practices for running a secure Windows VM on Azure.
Architecture
Download a Visio file of this architecture.
Workflow
Resource group
A resource group is a logical container that holds related Azure resources. In general, group resources based on their lifetime and who will manage them.
Put closely associated resources that share the same lifecycle into the same resource group. Resource groups allow you to deploy and monitor resources as a group and track billing costs by resource group. You can also delete resources as a set, which is useful for test deployments. Assign meaningful resource names to simplify locating a specific resource and understanding its role. For more information, see Recommended Naming Conventions for Azure Resources.
Virtual machine
You can provision a VM from a list of published images, or from a custom managed image or virtual hard disk (VHD) file uploaded to Azure Blob storage.
Azure offers many different virtual machine sizes. For more information, see Sizes for virtual machines in Azure. If you're moving an existing workload to Azure, start with the VM size that's the closest match to your on-premises servers. Then measure the performance of your actual workload in terms of CPU, memory, and disk input/output operations per second (IOPS), and adjust the size as needed.
Generally, choose an Azure region that is closest to your internal users or customers. Not all VM sizes are available in all regions. For more information, see Services by region. For a list of the VM sizes available in a specific region, run the following command from the Azure CLI:
az vm list-sizes --location <location>
For information about choosing a published VM image, see Find Windows VM images.
Disks
For best disk I/O performance, we recommend Premium Storage, which stores data on solid-state drives (SSDs). Cost is based on the capacity of the provisioned disk. IOPS and throughput also depend on disk size, so when you provision a disk, consider all three factors (capacity, IOPS, and throughput). Premium storage also features free bursting, combined with an understanding of workload patterns, offers an effective SKU selection and cost optimization strategy for IaaS infrastructure, enabling high performance without excessive over-provisioning and minimizing the cost of unused capacity.
Managed disks simplify disk management by handling the storage for you. Managed disks don't require a storage account. You specify the size and type of disk and it's deployed as a highly available resource. Managed disks also offer cost optimization by providing desired performance without the need for over-provisioning, accounting for fluctuating workload patterns, and minimizing unused provisioned capacity.
The OS disk is a VHD stored in Azure Storage, so it persists even when the host machine is down. We also recommend creating one or more data disks, which are persistent VHDs used for application data.
Ephemeral disks provide good performance at no extra cost, but come with the significant drawbacks of being non-persistent, having limited capacity, and being restricted to OS and temp disk use only. When possible, install applications on a data disk, not the OS disk. Some legacy applications might need to install components on the C: drive; in that case, you can resize the OS disk using PowerShell.
The VM is also created with a temporary disk (the D:
drive on Windows). This disk is stored on a physical drive on the host machine. It is not saved in Azure Storage and may be deleted during reboots and other VM lifecycle events. Use this disk only for temporary data, such as page or swap files.
Network
The networking components include the following resources:
Virtual network. Every VM is deployed into a virtual network that can be segmented into multiple subnets.
Network interface (NIC). The NIC enables the VM to communicate with the virtual network. If you need multiple NICs for your VM, a maximum number of NICs is defined for each VM size.
Public IP address. A public IP address is needed to communicate with the VM — for example, via Remote Desktop Protocol (RDP). The public IP address can be dynamic or static. The default is dynamic.
- Reserve a static IP address if you need a fixed IP address that doesn't change — for example, if you need to create a DNS 'A' record or add the IP address to a safe list.
- You can also create a fully qualified domain name (FQDN) for the IP address. You can then register a CNAME record in DNS that points to the FQDN. For more information, see Create a fully qualified domain name in the Azure portal.
Network security group (NSG). Network security groups are used to allow or deny network traffic to VMs. NSGs can be associated either with subnets or with individual VM instances.
- All NSGs contain a set of default rules, including a rule that blocks all inbound Internet traffic. The default rules can't be deleted, but other rules can override them. To enable Internet traffic, create rules that allow inbound traffic to specific ports — for example, port 80 for HTTP. To enable RDP, add an NSG rule that allows inbound traffic to TCP port 3389.
Azure NAT Gateway. Network Address Translation (NAT) gateways allow all instances in a private subnet to connect outbound to the internet while remaining fully private. Only packets that arrive as response packets to an outbound connection can pass through a NAT gateway. Unsolicited inbound connections from the internet aren't permitted.
Azure Bastion. Azure Bastion is a fully managed platform as a service solution that provides secure access to VMs via private IP addresses. With this configuration, VMs don't need a public IP address that exposes them to the internet, which increases their security posture. Azure Bastion provides secure RDP or Secure Shell (SSH) connectivity to your VMs directly over Transport Layer Security (TLS) through various methods, including the Azure portal or native SSH or RDP clients.
Operations
Diagnostics. Enable monitoring and diagnostics, including basic health metrics, diagnostics infrastructure logs, and boot diagnostics. Boot diagnostics can help you diagnose boot failure if your VM gets into a non-bootable state. Create an Azure Storage account to store the logs. A standard locally redundant storage (LRS) account is sufficient for diagnostic logs. For more information, see Enable monitoring and diagnostics.
Availability. Your VM might be affected by planned maintenance or unplanned downtime. You can use VM reboot logs to determine whether a VM reboot was caused by planned maintenance. For higher availability, deploy multiple VMs in an availability set or across availability zones in a region. Both of these configurations provide a higher service-level agreement (SLA).
Backups To protect against accidental data loss, use the Azure Backup service to back up your VMs to geo-redundant storage. Azure Backup provides application-consistent backups.
Stopping a VM. Azure makes a distinction between "stopped" and "deallocated" states. You are charged when the VM status is stopped, but not when the VM is deallocated. In the Azure portal, the Stop button deallocates the VM. If you shut down through the OS while logged in, the VM is stopped but not deallocated, so you will still be charged.
Deleting a VM. If you delete a VM, you have the option to delete or keep its disks. That means you can safely delete the VM without losing data. However, you will still be charged for the disks. You can delete managed disks just like any other Azure resource. To prevent accidental deletion, use a resource lock to lock the entire resource group or lock individual resources, such as a VM.
Considerations
These considerations implement the pillars of the Azure Well-Architected Framework, which is a set of guiding tenets that can be used to improve the quality of a workload. For more information, see Microsoft Azure Well-Architected Framework.
Cost optimization
Cost optimization is about looking at ways to reduce unnecessary expenses and improve operational efficiencies. For more information, see Overview of the cost optimization pillar.
There are various options for VM sizes depending on the usage and workload. The range includes most economical option of the Bs-series to the newest GPU VMs optimized for machine learning. For information about the available options, see Azure Windows VM pricing.
For predictable workloads, use Azure Reservations and Azure savings plan for compute with a one-year or three-year contract and receive significant savings off pay-as-you-go prices. For workloads with no predictable time of completion or resource consumption, consider the Pay as you go option.
Use Azure Spot VMs to run workloads the can be interrupted and do not require completion within a predetermined timeframe or an SLA. Azure deploys Spot VMs if there is available capacity and evicts when it needs the capacity back. Costs associated with Spot virtual machines are significantly lower. Consider Spot VMs for these workloads:
- High-performance computing scenarios, batch processing jobs, or visual rendering applications.
- Test environments, including continuous integration and continuous delivery workloads.
- Large-scale stateless applications.
Use the Azure Pricing Calculator to estimates costs.
For more information, see the cost section in Microsoft Azure Well-Architected Framework.
Security
Security provides assurances against deliberate attacks and the abuse of your valuable data and systems. For more information, see Overview of the security pillar.
Use Microsoft Defender for Cloud to get a central view of the security state of your Azure resources. Defender for Cloud monitors potential security issues and provides a comprehensive picture of the security health of your deployment. Defender for Cloud is configured per Azure subscription. Enable security data collection as described in Onboard your Azure subscription to Defender for Cloud Standard. When data collection is enabled, Defender for Cloud automatically scans any VMs created under that subscription.
Patch management. If enabled, Defender for Cloud checks whether any security and critical updates are missing. Use Group Policy settings on the VM to enable automatic system updates.
Antimalware. If enabled, Defender for Cloud checks whether antimalware software is installed. You can also use Defender for Cloud to install antimalware software from inside the Azure portal.
Access control. Use Azure role-based access control (Azure RBAC) to control access to Azure resources. Azure RBAC lets you assign authorization roles to members of your DevOps team. For example, the Reader role can view Azure resources but not create, manage, or delete them. Some permissions are specific to an Azure resource type. For example, the Virtual Machine Contributor role can restart or deallocate a VM, reset the administrator password, create a new VM, and so on. Other built-in roles that may be useful for this architecture include DevTest Labs User and Network Contributor.
Note
Azure RBAC does not limit the actions that a user logged into a VM can perform. Those permissions are determined by the account type on the guest OS.
Audit logs. Use audit logs to see provisioning actions and other VM events.
Data encryption. Use Azure Disk Encryption if you need to encrypt the OS and data disks.
Operational excellence
Operational excellence covers the operations processes that deploy an application and keep it running in production. For more information, see Overview of the operational excellence pillar.
Use infrastructure as Code (IaC) either by using a single Azure Resource Manager template for provisioning the Azure resources (declarative approach) or by using a single PowerShell script (imperative approach).Because all the resources are in the same virtual network, they're isolated in the same basic workload. That makes it easier to associate the workload's specific resources to a DevOps team, so that the team can independently manage all aspects of those resources. This isolation enables the DevOps Team and Services to perform continuous integration and continuous delivery (CI/CD).
Also, you can use different Azure Resource Manager templates and integrate them with Azure DevOps Services to provision different environments in minutes, for example to replicate production like scenarios or load testing environments only when needed, saving cost.
Consider using the Azure Monitor to Analyze and optimize the performance of your infrastructure, Monitor and diagnose networking issues without logging into your virtual machines.
Next steps
- To create a Windows VM, see Quickstart: Create a Windows virtual machine in the Azure portal
- To install NVIDIA drivers on a Windows VM, see Install NVIDIA GPU drivers on N-series VMs running Windows
- To install AMD drivers on a Windows VM, see Install AMD GPU drivers on N-series VMs running Windows
- To provision a Windows VM, see Create and Manage Windows VMs with Azure PowerShell
- Default outbound access in Azure