Storage for HPC in the finance sector
This article provides recommendations for implementing storage in HPC environments for the finance sector. Large-scale HPC workloads in finance environments create demands for data storage and access that exceed the capabilities of traditional cloud file systems.
Design considerations
To choose a storage solution, consider the following application requirements:
- Latency
- IOPS
- Throughput
- File size and file count
- Job runtime
- Associated costs
- Affinity for storage location: on-premises versus Azure
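Several of these requirements can be estimated from a job's I/O profile before you compare services. The following is a minimal sketch, not a sizing tool: the `JobProfile` type, its fields, and the sample numbers are hypothetical and only illustrate the arithmetic that links file size, file count, and job runtime to throughput.

```python
from dataclasses import dataclass


@dataclass
class JobProfile:
    """Hypothetical description of one HPC job's read phase."""
    file_count: int            # number of input files read per run
    avg_file_size_mib: float   # average file size in MiB
    runtime_seconds: float     # target wall-clock time for the read phase


def required_throughput_mib_s(job: JobProfile) -> float:
    """Sustained throughput needed to read every file within the runtime."""
    total_mib = job.file_count * job.avg_file_size_mib
    return total_mib / job.runtime_seconds


def required_file_opens_per_s(job: JobProfile) -> float:
    """Lower bound on file opens per second, assuming one open per file."""
    return job.file_count / job.runtime_seconds


# Illustrative values only: 200,000 files of 4 MiB read within 30 minutes.
job = JobProfile(file_count=200_000, avg_file_size_mib=4.0, runtime_seconds=1_800)
print(f"Throughput needed: {required_throughput_mib_s(job):.1f} MiB/s")
print(f"File opens needed: {required_file_opens_per_s(job):.1f} per second")
```

Many small files push the requirement toward IOPS and metadata performance, while fewer large files push it toward throughput.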
Design recommendations
Use Standard or Premium Azure Blob Storage for high-throughput, low-latency storage. It offers these benefits:
- It provides exabyte-scale, high-throughput, low-latency access where necessary, a familiar file system, and multi-protocol access (REST, HDFS, NFS).
- It's cost effective.
- You can mount Blob Storage as a file system by using BlobFuse. Doing so lets multiple nodes mount the same container for read-only scenarios.
- It supports NFS 3.0 at the blob service endpoint for high-throughput, read-heavy workloads.
- You can optimize costs by using lifecycle management to move data to cooler tiers. Customizable policies tier data based on last-modified or last-access time.
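The lifecycle management point can be illustrated with a policy document. The following is a minimal sketch, assuming the JSON rule layout used by Blob Storage lifecycle management policies. The function name, rule name, prefix, and day thresholds are hypothetical, so check the resulting JSON against the current policy schema before you apply it.

```python
import json


def build_lifecycle_policy(prefix: str, cool_after_days: int, archive_after_days: int) -> dict:
    """Build a policy that tiers block blobs under `prefix` by last-modified age."""
    if archive_after_days <= cool_after_days:
        raise ValueError("archive_after_days must be greater than cool_after_days")
    return {
        "rules": [
            {
                "enabled": True,
                "name": "tierDownStaleResults",  # hypothetical rule name
                "type": "Lifecycle",
                "definition": {
                    "filters": {
                        "blobTypes": ["blockBlob"],
                        "prefixMatch": [prefix],
                    },
                    "actions": {
                        "baseBlob": {
                            "tierToCool": {
                                "daysAfterModificationGreaterThan": cool_after_days
                            },
                            "tierToArchive": {
                                "daysAfterModificationGreaterThan": archive_after_days
                            },
                        }
                    },
                },
            }
        ]
    }


# Hypothetical container path and thresholds, for illustration only.
policy = build_lifecycle_policy("results/2023/", cool_after_days=30, archive_after_days=180)
print(json.dumps(policy, indent=2))
```

To tier on last access instead of last modification, the condition `daysAfterLastAccessTimeGreaterThan` can replace `daysAfterModificationGreaterThan`, provided that access time tracking is enabled on the storage account.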
Use Azure NetApp Files for ReadWriteMany (unique) or write-once, read-once applications. It provides these benefits:
- A wide choice of file protocols (NFSv3, NFSv4.1, SMB3).
- Performance that's comparable with on-premises performance, with multiple tiers (Ultra, Premium, Standard).
- Deployment in minutes, with a wide range of tiers and flexibility.
- Flexible capacity pool types and performance, where the QoS per volume is automatically assigned based on the tier of the pool and the volume quota.
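The automatic QoS assignment in the last point can be sketched as simple arithmetic: the volume's throughput limit is the pool tier's throughput per TiB multiplied by the volume quota. The per-TiB figures below are assumptions taken from the Azure NetApp Files service-level documentation at the time of writing, and the function name is hypothetical. The sketch ignores any service-side caps, so verify the values before you rely on them.

```python
# Assumed throughput per provisioned TiB for each service level, in MiB/s.
THROUGHPUT_MIB_S_PER_TIB = {
    "Standard": 16,
    "Premium": 64,
    "Ultra": 128,
}


def auto_qos_throughput_mib_s(service_level: str, quota_tib: float) -> float:
    """Estimate the throughput limit of a volume in an auto QoS capacity pool."""
    if quota_tib <= 0:
        raise ValueError("quota_tib must be positive")
    try:
        per_tib = THROUGHPUT_MIB_S_PER_TIB[service_level]
    except KeyError:
        raise ValueError(f"unknown service level: {service_level!r}") from None
    return per_tib * quota_tib


# Compare the same 2 TiB quota across the three tiers.
for level in ("Standard", "Premium", "Ultra"):
    limit = auto_qos_throughput_mib_s(level, quota_tib=2)
    print(f"{level}: {limit} MiB/s")
```

Because the limit scales with the quota, a volume that needs more throughput than its data size would justify can either be given a larger quota or be placed in a higher tier.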
The following table provides a comparison of Blob Storage, Azure Files, Azure Managed Lustre, and Azure NetApp Files.
| | Blob Storage | Azure Files | Azure Managed Lustre | Azure NetApp Files |
|---|---|---|---|---|
| Use cases | Best suited for large-scale read-heavy sequential access workloads where data is ingested once and minimally modified. Low total cost of ownership, if there's light maintenance. | A highly available service that's best suited for random access workloads. For NFS shares, Azure Files provides full POSIX file system support. The built-in CSI driver enables you to easily use it from container platforms like Azure Container Instances and Azure Kubernetes Service (AKS), in addition to VM-based platforms. | Azure Managed Lustre is a fully managed parallel file system best suited to medium to large HPC workloads. Enables HPC applications in the cloud without breaking application compatibility by providing familiar Lustre parallel file system functionality, behaviors, and performance, securing long-term application investments. | A fully managed file service in the cloud, powered by NetApp, with advanced management capabilities. Azure NetApp Files is suited for workloads that require random access. It provides broad protocol support and improved data protection. |
| Available protocols | NFS 3.0, REST, Azure Data Lake Storage | SMB, NFS 4.1 (No interoperability between either protocol.) | Lustre | NFS 3.0 and 4.1, SMB |
| Key features | Integrated with Azure HPC Cache for low-latency workloads. Integrated management, including lifecycle management, immutable blobs, data failover, and metadata index. | Zonally redundant for high availability. Consistent single-digit millisecond latency. Predictable performance and cost that scales with capacity. | High storage capacity up to 2.5 PB. Low (~2 ms) latency. Spin up new clusters in minutes. Supports containerized workloads with AKS. | Extremely low latency (as low as sub-millisecond). Rich NetApp ONTAP management capability, like SnapMirror Cloud. Consistent hybrid cloud experience. |
| Performance (per volume) | As much as 20,000 IOPS. As much as 100 GiB/s throughput. | As much as 100,000 IOPS. As much as 80 GiB/s throughput. | As much as 100,000 IOPS, up to 500 GiB/s throughput. | As much as 460,000 IOPS. As much as 36 GiB/s throughput. |
| Scale | As much as 2 PiB for a single volume. As much as ~4.75 TiB for a single file. No minimum capacity requirements. | As much as 100 TiB for a single volume. As much as 4 TiB for a single file. 100 GiB minimum capacity. | As much as 2.5 PiB for a single volume. As much as 32 PB for a single file. 4 TiB minimum capacity. | As much as 100 TiB for a single volume. As much as 16 TiB for a single file. Consistent hybrid cloud experience. |
| Pricing | Azure Blob Storage pricing | Azure Files pricing | Azure Managed Lustre pricing | Azure NetApp Files pricing |
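
One way to use the per-volume performance row is as a first-pass filter. The following is a minimal sketch that copies the table's IOPS and throughput ceilings into a dictionary and lists the services whose stated maximums cover a requirement. The function name and sample requirements are hypothetical. The figures are upper bounds from the table, not guaranteed performance, and the sketch leaves out protocol, latency, scale, and cost.

```python
# Per-volume ceilings copied from the comparison table above.
PER_VOLUME_LIMITS = {
    "Blob Storage": {"iops": 20_000, "throughput_gib_s": 100},
    "Azure Files": {"iops": 100_000, "throughput_gib_s": 80},
    "Azure Managed Lustre": {"iops": 100_000, "throughput_gib_s": 500},
    "Azure NetApp Files": {"iops": 460_000, "throughput_gib_s": 36},
}


def candidate_services(required_iops: int, required_gib_s: float) -> list[str]:
    """Return services whose stated per-volume maximums meet both requirements."""
    return [
        name
        for name, limits in PER_VOLUME_LIMITS.items()
        if limits["iops"] >= required_iops
        and limits["throughput_gib_s"] >= required_gib_s
    ]


# Hypothetical requirements for two different workloads.
print(candidate_services(required_iops=50_000, required_gib_s=50))
print(candidate_services(required_iops=200_000, required_gib_s=10))
```

A service that passes this filter still needs to be checked against the other design considerations, such as latency, file size limits, and storage location affinity.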
Next steps
The following articles provide guidance that you might find helpful at various points during your cloud adoption process. They can help you succeed in your cloud adoption scenario for HPC in the finance sector.
- Azure billing offers and Active Directory tenants for finance HPC
- Finance HPC Azure identity and access management
- Management for HPC in the finance sector
- Network topology and connectivity for HPC in the finance sector
- Platform automation and DevOps for HPC in the finance sector
- Resource organization for Azure HPC in the finance sector
- Governance for finance HPC
- Security for HPC in the finance sector
- Azure high-performance computing (HPC) landing zone accelerator