Deploy model to NVIDIA Triton Inference Server
NVIDIA Triton Inference Server is open-source, multi-framework software that is optimized for inference. It supports popular machine learning frameworks like TensorFlow, ONNX Runtime, PyTorch, NVIDIA TensorRT, and more, and it can serve both CPU and GPU workloads. In this module, you'll deploy your production model to NVIDIA Triton Inference Server to perform inference on a cloud-hosted virtual machine.
Learning objectives
In this module, you'll learn how to:
- Create an NVIDIA GPU-accelerated virtual machine
- Configure NVIDIA Triton Inference Server and related prerequisites
- Execute an inference workload on NVIDIA Triton Inference Server
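To give a feel for the last objective, here is a minimal sketch of what an inference request looks like from a client's point of view. Triton exposes the KServe v2 REST API, so a request can be sent with nothing but the Python standard library. The server URL, the model name `my_model`, and the input tensor name `input_0` are placeholder assumptions for illustration; a real deployment would use the names from your model's configuration (or the official `tritonclient` package).

```python
import json
import urllib.request

def build_infer_request(input_name, datatype, shape, data):
    """Build a KServe v2 inference request body, the JSON format
    accepted by Triton's HTTP endpoint."""
    return {
        "inputs": [
            {"name": input_name, "datatype": datatype,
             "shape": shape, "data": data}
        ]
    }

def infer(server_url, model_name, payload):
    """POST the request to Triton's /v2/models/<name>/infer endpoint
    and return the decoded JSON response."""
    req = urllib.request.Request(
        f"{server_url}/v2/models/{model_name}/infer",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# Hypothetical example: one 1x2x2 FP32 tensor named "input_0".
payload = build_infer_request("input_0", "FP32", [1, 2, 2],
                              [0.1, 0.2, 0.3, 0.4])
print(json.dumps(payload))
# With a Triton server running on its default HTTP port:
# result = infer("http://localhost:8000", "my_model", payload)
```

Later units in this module cover starting the server itself; the request format above stays the same regardless of which framework backend serves the model.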