Formatting and mounting the local NVMe disks on Azure Machine Learning (AzureML) managed compute clusters, such as those built on Standard_NC80adis_H100_v5, is not explicitly supported during node preparation or provisioning. These VMs do include ephemeral local NVMe disks, but the documentation does not describe a way to format and mount them for AzureML jobs, so this may not be feasible within the managed compute environment.
- Support for Local NVMe Disks: The local NVMe disks are intended primarily for standalone Azure VMs, where you can format and mount them yourself. For AzureML managed compute clusters, there is no clear guidance on whether the disks can be used the same way, and the typical usage scenario for managed clusters does not include direct access to local NVMe disks during job execution.
- Access Limitations: Access to NVMe disks is generally more straightforward on attached VMs or other virtual machines that AzureML does not manage. On managed compute clusters the environment is abstracted and direct manipulation of the underlying VM disks is limited, which may explain the filesystem or superblock errors you encountered when attempting to mount the NVMe disks during job execution.
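
Before attempting a mount, it can help to confirm whether the NVMe devices are visible from inside the job at all. The following is a minimal diagnostic sketch, not an AzureML API: it assumes a Linux node where sysfs is readable at `/sys/block`, and the function name `list_nvme_devices` is hypothetical. It only reads device names and sizes and does not format or mount anything.

```python
from pathlib import Path


def list_nvme_devices(sys_block="/sys/block"):
    """Return [(name, size_bytes)] for NVMe block devices visible to this process."""
    root = Path(sys_block)
    if not root.is_dir():
        # sysfs is not available (non-Linux host or a restricted container).
        return []
    devices = []
    for entry in sorted(root.iterdir()):
        if not entry.name.startswith("nvme"):
            continue
        try:
            # sysfs reports the device size in 512-byte sectors.
            sectors = int((entry / "size").read_text().strip())
        except (OSError, ValueError):
            sectors = 0
        devices.append((entry.name, sectors * 512))
    return devices


if __name__ == "__main__":
    for name, size in list_nvme_devices():
        print(f"{name}: {size / 1e9:.1f} GB")
```

An empty result suggests the job environment does not expose the disks, in which case mount attempts from within the job are unlikely to succeed.
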
If your goal is to use the NVMe disks as fast local scratch storage during training jobs, you may need to consider compute that you manage yourself, where you have full control over the VM and can format and mount the NVMe disks as needed.
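
In the meantime, a job can at least pick the best writable local directory that the node already exposes. The sketch below is an illustration under stated assumptions: the helper `pick_scratch_dir` is hypothetical, and the candidate paths (`/mnt` and the system temp directory) are placeholders rather than documented AzureML locations, so replace them with paths you have verified on your nodes.

```python
import os
import shutil
import tempfile


def pick_scratch_dir(candidates, min_free_bytes):
    """Return the first existing, writable directory with enough free space, else None."""
    for path in candidates:
        # Skip paths that are missing or that this process cannot write to.
        if not os.path.isdir(path) or not os.access(path, os.W_OK):
            continue
        if shutil.disk_usage(path).free >= min_free_bytes:
            return path
    return None


if __name__ == "__main__":
    # "/mnt" is a hypothetical candidate; substitute the paths your nodes expose.
    candidates = ["/mnt", tempfile.gettempdir()]
    print(pick_scratch_dir(candidates, min_free_bytes=100 * 1024**3))
```

This does not give you NVMe performance, but it avoids hard-coding a scratch path that may not exist or may be too small on a given node.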