Best practices for virtual machine deployments on OpenShift Virtualization

This document provides guidance for optimizing performance and cost efficiency when deploying virtual machines (VMs) with OpenShift Virtualization on Azure Red Hat OpenShift. It also addresses common application performance concerns and provides actionable steps for successful deployments.

Approach to optimization

Note

GPU-dependent workloads are currently not supported on OpenShift Virtualization on Azure. Plan your deployments accordingly.

Optimizing VM deployments begins with understanding your application workloads and aligning infrastructure choices accordingly. Deploying OpenShift Virtualization on Azure Boost machines as the cluster's worker nodes introduces architectural overhead compared to native VM or pod deployments, so capacity and performance planning should account for this overhead.

Workload identification

Before provisioning VMs, categorize your workloads to determine their performance and resource requirements. Common workload types include:

  • General purpose: Web servers, application servers, content management systems.
  • Database: Relational and NoSQL databases requiring consistent IOPS and memory.
  • Real-time analytics: Low-latency data processing, operational dashboards.
  • AI/ML: Compute-intensive workloads requiring high CPU/GPU and memory.
  • Data streaming & messaging: High-throughput, low-latency event-driven architectures.
  • Batch processing: Periodic or on-demand jobs processing large data volumes.
  • High-performance computing (HPC): Scientific simulations, financial modeling.
  • Edge and IoT: Aggregating and processing data from distributed sensors.
  • Media processing: Video encoding/decoding, image transformation, streaming.
  • Dev/Test environments: Temporary environments for development and testing.

Each workload type has unique characteristics that influence VM sizing, storage configuration, and performance tuning strategies.

Right sizing your application workloads

Key considerations for right sizing

  • Minimum core requirement: OpenShift Virtualization requires a minimum of eight (8) core Azure VMs for OpenShift worker nodes.
  • Architectural overhead: Performance might vary depending on the architectural decisions taken while configuring the environment, including instance types, storage, and network characteristics.
  • Scaling out: For demanding workloads, scaling out your Azure Red Hat OpenShift cluster by adding more nodes can help overcome resource contention and maintain throughput.
  • Benchmark your workloads: Avoid relying solely on on-premises sizing references; benchmark your own workloads to inform right sizing.
  • Cost factors: Consider Azure compute costs, OpenShift licensing, VM licensing, and scalability requirements.
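
As an illustration of the minimum core requirement and the scale-out option above, the following is a minimal sketch of an Azure Red Hat OpenShift worker MachineSet sized at eight vCPUs. The name, zone, and values are placeholders, and a real `providerSpec` carries additional required fields (image, subnet, credentials, and so on) that are omitted here.

```yaml
# Sketch of an ARO worker MachineSet meeting the 8-core minimum.
# Names and sizes are placeholders; a real providerSpec has more required fields.
apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  name: example-worker-eastus1        # placeholder name
  namespace: openshift-machine-api
spec:
  replicas: 3                          # scale out by raising replicas
  template:
    spec:
      providerSpec:
        value:
          vmSize: Standard_D8s_v5      # 8 vCPUs, the minimum for OpenShift Virtualization workers
```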

Right sizing ensures that your VMs are provisioned with adequate resources to meet performance goals without overprovisioning. This process is critical in cloud environments, where resource efficiency directly impacts cost and performance.

Steps to right size workloads

  1. Define health metrics

    • CPU Utilization: Target 60–70% average usage.
    • Memory Pressure: Monitor swap usage, memory saturation, and page faults.
    • IO Strain: Measure disk latency, throughput, and queue depth.
  2. Set up monitoring

    • Use Prometheus and Grafana for real-time metric collection and visualization.
    • Enable KubeVirt metrics for VM-level insights.
    • Integrate with Azure Monitor, via Azure Arc, to correlate infrastructure-level metrics with application performance.
  3. Analyze historical data

    • Review performance trends over time.
    • Identify peak usage periods and resource saturation events.
    • Use historical baselines to guide future autoscaling decisions.
  4. Adjust VM specifications

    • Resize vCPU and memory allocations to match observed utilization.
    • Apply changes incrementally, and restart or live migrate VMs as needed for changes to take effect.

  5. Test and validate

    • Perform load testing using tools like Apache JMeter, Locust, or stress-ng.
    • Validate against defined health metrics and performance targets.
    • Iterate on configuration changes and retest to confirm improvements.
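
The health metrics defined in step 1 can be encoded as alerting rules so that deviations surface automatically. The following is a hedged sketch of a PrometheusRule; the alert names and exact thresholds are illustrative choices, and the queries assume the standard node-exporter metrics shipped with OpenShift monitoring.

```yaml
# Sketch: alerting on the right-sizing health metrics.
# Alert names and thresholds are illustrative, not defaults.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: vm-right-sizing-alerts        # placeholder name
  namespace: openshift-monitoring
spec:
  groups:
    - name: right-sizing
      rules:
        - alert: NodeCpuAboveTarget
          # Fires when average CPU utilization exceeds the 70% target band.
          expr: |
            100 * (1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[5m]))) > 70
          for: 15m
          labels:
            severity: warning
        - alert: NodeMemoryPressure
          # Less than 10% of memory available signals memory saturation.
          expr: |
            node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes < 0.10
          for: 15m
          labels:
            severity: warning
```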

Fine tuning your environment

Fine tuning your OpenShift Virtualization environment is essential to achieving optimal performance, especially for demanding workloads. The following best practices are derived from extensive benchmarking and real-world experience on Azure Boost VM series (Dsv5/Dsv6).

Performance optimization strategies

  • Scale out or up for demanding workloads: Add more nodes or upsize the nodes in your Azure Red Hat OpenShift cluster for high concurrency or resource-intensive applications.
  • Avoid strict resource limits: Set only guest memory for VMs; avoid strict resource limits unless required for governance.
  • Tune storage and network configurations: Select storage solutions and performance tiers that match your workload needs. For network-intensive workloads, tune settings such as NAPI and multiqueue, and monitor throughput and latency.
  • Monitor and benchmark regularly: Use Prometheus, Grafana, and Azure Monitor to track key metrics. Benchmark your own workloads to validate performance and guide further tuning.
  • Expect architectural overhead: Plan capacity and set expectations accordingly, especially for workloads with high I/O or network demands.
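
Two of the practices above, setting only guest memory and enabling multiqueue for network-intensive VMs, can be expressed directly in the VirtualMachine spec. The following is a minimal, hedged sketch; the VM name, image, and sizes are placeholders.

```yaml
# Sketch: guest memory set without strict resource limits,
# plus network multiqueue for network-intensive workloads.
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: example-vm                    # placeholder name
spec:
  runStrategy: Always
  template:
    spec:
      domain:
        cpu:
          cores: 4                    # illustrative size
        memory:
          guest: 8Gi                  # set guest memory; no limits block
        devices:
          networkInterfaceMultiqueue: true   # spread NIC queues across vCPUs
          disks:
            - name: rootdisk
              disk:
                bus: virtio
      volumes:
        - name: rootdisk
          containerDisk:
            image: quay.io/containerdisks/fedora:latest   # placeholder image
```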

VM overcommit tuning

The OpenShift Virtualization Operator allows you to adjust CPU and memory overcommit ratios, letting you allocate more virtual resources than are physically available. Overcommitting can improve density and resource utilization but might increase contention and affect performance.

Best practices for overcommit tuning:

  • Use conservative overcommit for production workloads.
  • Consider higher overcommit for dev/test environments.
  • Monitor resource usage and adjust ratios as needed.

For more information, see Configuring higher VM workload density.
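
In recent OpenShift Virtualization releases, memory overcommit can be configured centrally on the HyperConverged custom resource. The following is a hedged sketch; the percentage is illustrative, and you should verify that the field is available in your installed version before relying on it.

```yaml
# Sketch: raising memory overcommit on the HyperConverged CR.
# 150 means guests may be allocated up to 150% of physical memory; the value is illustrative.
apiVersion: hco.kubevirt.io/v1beta1
kind: HyperConverged
metadata:
  name: kubevirt-hyperconverged
  namespace: openshift-cnv
spec:
  higherWorkloadDensity:
    memoryOvercommitPercentage: 150
```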

Best practices based on benchmarking

  • Database workloads: Avoid setting both resource requests and limits for VMs. Monitor performance closely when using fast storage and high concurrency. Scale out cluster nodes for large database deployments.
  • Network workloads: Tune network settings for optimal throughput. Scale out as needed to achieve the desired network throughput.

Storage solution tuning

  • OpenShift Data Foundation (ODF): Use SSD-backed storage for low-latency access. Configure replication and erasure coding policies based on workload needs. To prevent competition for your application's compute resources, consider creating a separate worker pool for ODF on smaller Azure VM sizes (D16s_v5 is a good starting point), and use taints and tolerations to ensure ODF is the only workload scheduled there. Monitor storage performance and adjust replication factors as needed.
  • Azure NetApp Files (ANF): Choose performance tiers based on IOPS and throughput requirements. Ensure proper mount options and network configuration for optimal performance. Use volume snapshots and backups to support data protection and recovery strategies.
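
To dedicate a worker pool to ODF as described above, the pool's nodes can carry the storage taint that ODF components already tolerate. The following is a hedged sketch of the taint and label on a MachineSet template; the MachineSet name is a placeholder, and the taint key matches the one used in ODF documentation.

```yaml
# Sketch: tainting a dedicated ODF worker pool so that only
# ODF components (which tolerate this taint) are scheduled there.
apiVersion: machine.openshift.io/v1beta1
kind: MachineSet
metadata:
  name: example-odf-workers           # placeholder name
  namespace: openshift-machine-api
spec:
  template:
    spec:
      metadata:
        labels:
          cluster.ocs.openshift.io/openshift-storage: ""   # marks nodes for ODF
      taints:
        - key: node.ocs.openshift.io/storage
          value: "true"
          effect: NoSchedule
```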
