This article describes how to deploy GPU-enabled containerized workloads on a provisioned machine for small form factor deployments of Azure Local.
The Containerized workloads article establishes your container platform by verifying Docker or installing open-source K3s. This article builds on that foundation to enable NVIDIA GPU acceleration for the workloads that you deploy in module 5.
Docker is supported for single-node GPU workloads. If you want a lightweight Kubernetes environment for orchestrated GPU workloads, you can also use the open-source K3s distribution. To compare these options before you choose one, see Container orchestrators.
Important
This feature is currently in PREVIEW. See the Supplemental Terms of Use for Microsoft Azure Previews for legal terms that apply to Azure features that are in beta, preview, or otherwise not yet released into general availability.
Prerequisites
Before you begin, make sure that you:
- Have a provisioned machine that you can reach over SSH.
- Complete the steps in Connect a provisioned machine from the Azure portal.
- Complete the steps in Run containerized workloads on a provisioned machine.
- Have supported hardware with an NVIDIA GPU installed in the provisioned machine.
- Have NVIDIA GPU drivers installed on the host OS.
- Have a Windows PC on the same local network as the provisioned machine.
- Have installed Azure CLI and signed in.
- Have internet connectivity available to install packages and pull container images.
If you use Docker, make sure that:
- Docker is already available on the provisioned machine, as described in Run containerized workloads on a provisioned machine.
If you use K3s, make sure that:
- Open-source K3s is installed and running.
- kubectl access to the K3s cluster is configured, as described in Run containerized workloads on a provisioned machine.
Choose your approach
- Use Docker if you want the fastest way to run a GPU-enabled container on a single device.
- Use K3s if you want Kubernetes APIs, kubectl workflows, GPU scheduling, or lightweight orchestration capabilities.
Choose the same container platform that you prepared in Run containerized workloads on a provisioned machine. If you verified Docker, continue with the Docker path in this article. If you installed K3s and configured kubectl, continue with the K3s path.
How GPU-enabled workloads work
GPU-enabled container workloads rely on multiple layers working together correctly.
The following components must be configured:
- NVIDIA GPU drivers
- NVIDIA kernel modules and device nodes
- NVIDIA Container Toolkit
- Container runtime configuration
- GPU-enabled workload configuration
K3s workloads also require:
- NVIDIA Kubernetes device plugin
- Kubernetes RuntimeClass configuration
If any layer is missing or misconfigured, GPU workloads might fail to start or might not detect GPU resources correctly.
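As a rough illustration of the layers listed above, the following sketch reports whether each host-side layer appears to be present. The layer_status helper is hypothetical and isn't part of any NVIDIA or Azure tooling. It only tests for commands and device nodes, so a "present" result doesn't prove that the layer is configured correctly.

```shell
#!/bin/sh
# Hypothetical preflight helper (not part of NVIDIA or Azure tooling).
# Runs the given check command and reports whether the layer looks present.
layer_status() {
    label=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "$label: present"
    else
        echo "$label: MISSING"
    fi
}

layer_status "NVIDIA driver (nvidia-smi)" command -v nvidia-smi
layer_status "Device node /dev/nvidiactl" test -e /dev/nvidiactl
layer_status "NVIDIA Container Toolkit (nvidia-ctk)" command -v nvidia-ctk
layer_status "Docker CLI" command -v docker
```

A MISSING line points to the section of this article that covers that layer. The K3s-specific layers aren't covered by this sketch.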
Validate NVIDIA GPU access on the host
Confirm that the operating system can detect the NVIDIA GPU.
lspci | grep -i nvidia
Example output:
01:00.0 VGA compatible controller: NVIDIA Corporation Device
Validate that the NVIDIA drivers are functioning correctly:
nvidia-smi
Example output:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 550.xx.xx                                                        |
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
+-----------------------------------------------------------------------------+
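If you want a result that's easier to check in a script than the full nvidia-smi table, the following sketch might help. It uses the nvidia-smi --query-gpu and --format=csv,noheader options and formats one line per GPU. The summarize_gpus helper is a hypothetical name introduced for illustration.

```shell
#!/bin/sh
# Hypothetical helper: formats CSV lines of the form "name, driver_version"
# read from stdin, and fails when no GPU line is present.
summarize_gpus() {
    awk -F', ' '
        NF >= 2 { printf "GPU %d: %s (driver %s)\n", n, $1, $2; n++ }
        END     { if (n == 0) { print "No GPUs reported"; exit 1 } }
    '
}

if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name,driver_version --format=csv,noheader | summarize_gpus
else
    echo "nvidia-smi not found; install the NVIDIA driver first"
fi
```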
Troubleshoot nvidia-smi
If nvidia-smi fails, the NVIDIA kernel modules or device nodes might not be initialized correctly.
Load the required NVIDIA kernel modules:
sudo modprobe nvidia
sudo modprobe nvidia_uvm
Validate that the NVIDIA device nodes exist:
ls /dev/nvidia*
If the device nodes are missing, create them manually:
sudo mknod -m 666 /dev/nvidia0 c 195 0
sudo mknod -m 666 /dev/nvidiactl c 195 255
Run the command again:
nvidia-smi
Note
In production environments, NVIDIA device nodes should be managed through proper driver installation and udev rules rather than manual device creation.
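Before you load modules or create device nodes by hand, it can help to see exactly which pieces are missing. The following read-only sketch checks the two kernel modules and the two device nodes named in this section. The module_loaded and missing_nodes helpers are hypothetical names, and their optional path arguments exist only so that you can try them against scratch files.

```shell
#!/bin/sh
# Succeeds when the named kernel module appears in the module list file.
# $2 defaults to /proc/modules; it's overridable only to ease testing.
module_loaded() {
    grep -q "^$1 " "${2:-/proc/modules}" 2>/dev/null
}

# Prints each expected NVIDIA device node that is missing under directory $1.
missing_nodes() {
    dev_dir=${1:-/dev}
    for node in nvidia0 nvidiactl; do
        [ -e "$dev_dir/$node" ] || echo "$dev_dir/$node"
    done
}

for mod in nvidia nvidia_uvm; do
    module_loaded "$mod" || echo "kernel module not loaded: $mod"
done
missing_nodes /dev
```

No output means that both modules are loaded and both device nodes exist.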
Install the NVIDIA Container Toolkit
The NVIDIA Container Toolkit enables containers to access GPU devices from the host system.
Add the NVIDIA repository:
sudo curl -fsSL \
  https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo \
  -o /etc/yum.repos.d/nvidia-container-toolkit.repo
Update the repository configuration.
Due to a repository metadata signature validation issue on Azure Linux with tdnf, update the NVIDIA repository configuration before refreshing package metadata.
sudo sed -i 's|^repo_gpgcheck=1|repo_gpgcheck=0|' \
  /etc/yum.repos.d/nvidia-container-toolkit.repo
sudo sed -i 's|^gpgcheck=0|gpgcheck=1|' \
  /etc/yum.repos.d/nvidia-container-toolkit.repo
Refresh package metadata:
sudo tdnf clean all
sudo tdnf makecache
Install the toolkit:
sudo tdnf install -y nvidia-container-toolkit
Verify the installation:
nvidia-ctk --version
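To see what the two sed expressions in this section change, you can try them on a scratch file first. The following sketch applies the same expressions to a temporary file with made-up contents. The relax_repo_metadata_check helper and the example-repo section name are hypothetical, and the sketch doesn't touch the real repository file.

```shell
#!/bin/sh
# Applies the article's two sed expressions to the repo file given as $1:
# turn off repository metadata signature checks, keep package checks on.
relax_repo_metadata_check() {
    sed -i -e 's|^repo_gpgcheck=1|repo_gpgcheck=0|' \
           -e 's|^gpgcheck=0|gpgcheck=1|' "$1"
}

# Demonstrate on a scratch file with hypothetical contents.
scratch=$(mktemp)
printf '[example-repo]\nrepo_gpgcheck=1\ngpgcheck=0\n' > "$scratch"
relax_repo_metadata_check "$scratch"
grep -E '^(repo_)?gpgcheck=' "$scratch"
rm -f "$scratch"
```

The grep output shows repo_gpgcheck=0 and gpgcheck=1, so only the metadata signature check is relaxed while package signature checking stays enabled.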
Run a GPU-enabled workload
Docker workloads can access GPUs directly through the NVIDIA container runtime.
Use this path if you followed the Docker workflow in Run containerized workloads on a provisioned machine.
Configure the NVIDIA runtime for Docker:
sudo nvidia-ctk runtime configure --runtime=docker
Restart Docker:
sudo systemctl restart docker
Note
This article uses the NVIDIA CUDA sample image hosted in the NVIDIA GPU Cloud (NGC) catalog: NVIDIA CUDA Sample Container Image.
Run the sample workload:
sudo docker run --rm --gpus all \
  nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda12.5.0-ubi8
Note
If you run Docker commands without sudo, you may see a permission denied error when connecting to /var/run/docker.sock. Use sudo docker ... for the command, or configure Docker access for the current user.
Successful output resembles:
[Vector addition of 50000 elements]
Test PASSED
Done
The Test PASSED message confirms that:
- Docker successfully accessed the NVIDIA GPU.
- The NVIDIA runtime was configured correctly.
- The container successfully used the GPU.
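If you want to repeat this check from a script, one possible approach is to search the sample's output for the Test PASSED marker. The following sketch assumes the same image and docker run options shown in this section. The gpu_test_passed and run_gpu_smoke_test helpers are hypothetical names.

```shell
#!/bin/sh
# Image reference copied from the docker run command in this article.
IMAGE=nvcr.io/nvidia/k8s/cuda-sample:vectoradd-cuda12.5.0-ubi8

# Succeeds when the text on stdin contains the sample's success marker.
gpu_test_passed() {
    grep -q 'Test PASSED'
}

# Hypothetical wrapper: runs the sample and reports a one-line verdict.
run_gpu_smoke_test() {
    if sudo docker run --rm --gpus all "$IMAGE" 2>&1 | gpu_test_passed; then
        echo "GPU smoke test passed"
    else
        echo "GPU smoke test failed" >&2
        return 1
    fi
}
```

The sketch only defines the helpers. Call run_gpu_smoke_test on the provisioned machine to pull and run the sample image.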
Troubleshooting
nvidia-smi fails on the host
Verify that:
- NVIDIA drivers are installed.
- NVIDIA kernel modules are loaded.
- /dev/nvidia0 and /dev/nvidiactl exist.
GPU resources aren't visible in Kubernetes
Verify that:
- The NVIDIA device plugin is running.
- The NVIDIA runtime exists in the containerd configuration.
- K3s was restarted after runtime changes.
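The following sketch shows one way to check these items from the command line. It assumes that kubectl is configured for the K3s cluster and that K3s keeps its generated containerd configuration at its default location, /var/lib/rancher/k3s/agent/etc/containerd/config.toml. That path comes from K3s defaults, not from this article. The classify_gpu_allocatable and report_gpu_nodes helpers are hypothetical names.

```shell
#!/bin/sh
# Classifies the allocatable nvidia.com/gpu value that kubectl reports for a node.
classify_gpu_allocatable() {
    case $1 in
        ''|'<none>'|0) echo "no allocatable GPUs" ;;
        *[!0-9]*)      echo "unexpected value: $1" ;;
        *)             echo "$1 allocatable GPU(s)" ;;
    esac
}

# Hypothetical wrapper: prints one line per node with its GPU status.
report_gpu_nodes() {
    kubectl get nodes --no-headers --request-timeout=10s \
        -o 'custom-columns=NODE:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu' |
    while read -r node gpus; do
        echo "$node: $(classify_gpu_allocatable "$gpus")"
    done
}

# Assumed K3s default path; reading it might require root.
K3S_CONTAINERD_CFG=/var/lib/rancher/k3s/agent/etc/containerd/config.toml
if [ -r "$K3S_CONTAINERD_CFG" ]; then
    grep -q nvidia "$K3S_CONTAINERD_CFG" ||
        echo "no nvidia runtime entry in $K3S_CONTAINERD_CFG"
fi
```

Call report_gpu_nodes after the device plugin starts. A node that reports no allocatable GPUs usually points back to the device plugin or runtime items in the list above.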
Docker containers can't access the GPU
Verify that:
- Docker was restarted after runtime configuration.
- nvidia-container-toolkit is installed.
- The --gpus all flag is specified.
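To check whether the runtime configuration step took effect, you can look for an nvidia entry in the Docker daemon configuration. The following sketch assumes that the configuration lives at Docker's default Linux location, /etc/docker/daemon.json, which this article doesn't state. The docker_has_nvidia_runtime helper is a hypothetical name, and it does a plain text match rather than parsing the JSON.

```shell
#!/bin/sh
# Succeeds when the Docker daemon config ($1, default /etc/docker/daemon.json)
# is readable and mentions an "nvidia" entry. Text match only, not a JSON parse.
docker_has_nvidia_runtime() {
    cfg=${1:-/etc/docker/daemon.json}
    [ -r "$cfg" ] && grep -q '"nvidia"' "$cfg"
}

if docker_has_nvidia_runtime; then
    echo "nvidia runtime entry found in /etc/docker/daemon.json"
else
    echo "no nvidia runtime entry found; rerun: sudo nvidia-ctk runtime configure --runtime=docker"
fi
```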
Pods or jobs remain in Pending
This issue usually indicates:
- GPU resources are unavailable.
- nvidia.com/gpu isn't allocatable.
- The NVIDIA runtime isn't configured correctly.
- The workload requests more GPUs than are available on the node.
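To narrow down which of these causes applies, you can inspect the scheduling events for the pending pod. The following sketch assumes that the Kubernetes scheduler reports a GPU shortage with the text "Insufficient nvidia.com/gpu" in the pod's events. The pending_due_to_gpu and explain_pending_pod helpers are hypothetical names, and the pod name and namespace are values that you supply.

```shell
#!/bin/sh
# Succeeds when describe output on stdin contains the scheduler's
# GPU shortage message.
pending_due_to_gpu() {
    grep -qF 'Insufficient nvidia.com/gpu'
}

# Hypothetical wrapper: $1 = pod name, $2 = namespace (defaults to "default").
explain_pending_pod() {
    if kubectl describe pod "$1" -n "${2:-default}" --request-timeout=10s |
        pending_due_to_gpu; then
        echo "$1: scheduler reports insufficient nvidia.com/gpu"
    else
        echo "$1: no GPU shortage message found; check other scheduling events"
    fi
}
```

To find candidates, list pending pods with kubectl get pods -A --field-selector=status.phase=Pending, and then pass a pod name and its namespace to explain_pending_pod.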
Next steps
- Return to Deploy applications to your cluster to choose the workload path that you want to run next.