There are a few potential reasons why the Azure VMs may be performing slower than your on-premises machines:
- Virtualization overhead: Virtualization can introduce a performance penalty, but this is usually on the order of 5-15%. A 2x slowdown is far more than virtualization alone would explain, so other factors are likely at play.
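- Network performance: If the on-premises machines are connected via InfiniBand, a high-speed, low-latency interconnect used in HPC environments, and your workload communicates heavily between nodes, the network could be the bottleneck on the Azure VMs, which are likely connected via standard Ethernet. You can try running your workload on a VM size with a faster interconnect (e.g. an RDMA-capable HPC size such as the HB- or HC-series, which provide InfiniBand, or at least a VM with Accelerated Networking, which uses SR-IOV) to see if this improves performance. Note that Azure HPC Cache is a file-caching service and will not change inter-node network speed.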
- CPU performance: The on-premises machines may have faster CPUs in practice, even though they are an older generation and have a lower clock speed. You can run the same benchmark on the on-premises machines and the Azure VMs to compare their CPU performance and see if this could be a factor.
- Software and driver versions: It's also worth checking that you are using the same versions of the software and drivers on both the on-premises machines and the Azure VMs. Different versions can have different performance characteristics, so matching them helps ensure a fair comparison.
- Other factors: Differences in the workload itself or in the overall system configuration could also be at play. It may be helpful to run some additional tests and gather more detailed performance metrics to help identify the root cause of the performance difference.
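
As a starting point for the CPU comparison mentioned above, here is a minimal sketch you could run unchanged on both an on-premises machine and an Azure VM. It is only an illustration: the function names (`cpu_benchmark`, `time_workload`, `system_summary`) and the toy workload are my own assumptions, not anything from your setup, and a single-threaded Python loop only measures single-core interpreter throughput. It will not reflect memory bandwidth, interconnect latency, or how your real application scales across nodes, so treat it as a sanity check rather than a verdict.

```python
import os
import platform
import statistics
import time


def cpu_benchmark(n=200_000):
    """Hypothetical fixed CPU-bound workload: pure integer arithmetic, no I/O."""
    total = 0
    for i in range(n):
        total += (i * i) % 7
    return total


def time_workload(func, repeats=5):
    """Run func several times and report the minimum and median wall-clock time."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return {"min_s": min(timings), "median_s": statistics.median(timings)}


def system_summary():
    """Collect basic environment details so the two runs can be compared fairly."""
    return {
        "machine": platform.machine(),
        "processor": platform.processor(),
        "python": platform.python_version(),
        "logical_cpus": os.cpu_count(),
    }


if __name__ == "__main__":
    print(system_summary())
    print(time_workload(cpu_benchmark))
```

Run it with the same Python version on both sides and compare the `min_s` values. If the two are close, the CPU is probably not the main cause and the network or software configuration deserves a closer look; if they differ substantially, follow up with a benchmark that is closer to your actual workload.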