Get Started with DirectML

For many developers, pairing DirectML with the ONNX Runtime is the most straightforward way to bring hardware-accelerated AI to their users at scale. These three steps are a general guide to using this powerful combination.

1. Convert

Converting your model to the ONNX format enables you to use ONNX Runtime with DirectML, which lets the same model run across the breadth of DirectX 12-capable Windows hardware.

To convert your model to the ONNX format, you can use ONNXMLTools or Olive.
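
For PyTorch models, the built-in torch.onnx exporter is another common route. Here's a minimal sketch; the torchvision model and input shape are placeholders for your own model:

```python
# Minimal sketch: exporting a PyTorch model to the ONNX format.
# The model and input shape below are illustrative placeholders.
import torch
import torchvision

model = torchvision.models.resnet18(weights=None).eval()
dummy_input = torch.randn(1, 3, 224, 224)  # example NCHW input

torch.onnx.export(
    model,
    dummy_input,
    "resnet18.onnx",
    input_names=["input"],
    output_names=["output"],
    dynamic_axes={"input": {0: "batch"}},  # keep batch size dynamic
)
```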

2. Optimize

Once you have an .onnx model, use Olive powered by DirectML to optimize it. You'll see significant performance improvements that you can deploy across the Windows hardware ecosystem.
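
As a rough sketch, Olive workflows can be driven from Python with olive.workflows.run. The config below is illustrative only; the config schema and pass names vary between Olive releases, so check the Olive documentation for your version:

```python
# Illustrative sketch only: the exact Olive config schema and pass names
# vary across releases; consult the Olive documentation for your version.
from olive.workflows import run as olive_run

config = {
    "input_model": {"type": "ONNXModel", "model_path": "resnet18.onnx"},
    "systems": {
        "local_system": {
            "type": "LocalSystem",
            "accelerators": [
                {"device": "gpu", "execution_providers": ["DmlExecutionProvider"]}
            ],
        }
    },
    # Example pass; choose passes that suit your model and target hardware.
    "passes": {"quantize": {"type": "OnnxDynamicQuantization"}},
    "output_dir": "optimized_model",
}

olive_run(config)
```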

3. Integrate

When your model is ready, it's time to bring hardware-accelerated inferencing to your app with ONNX Runtime and DirectML. For generative AI models, we recommend using the ONNX Runtime Generate() API.
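
As a minimal sketch, inference with the DirectML execution provider in Python looks like this (it assumes the onnxruntime-directml package and the illustrative resnet18.onnx from the earlier steps):

```python
# Minimal sketch: inference with ONNX Runtime's DirectML execution provider.
# Requires the onnxruntime-directml package; "resnet18.onnx" is illustrative.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession(
    "resnet18.onnx",
    # Try DirectML first, with CPU as a fallback if DirectML is unavailable.
    providers=["DmlExecutionProvider", "CPUExecutionProvider"],
)

input_name = session.get_inputs()[0].name
batch = np.random.rand(1, 3, 224, 224).astype(np.float32)
outputs = session.run(None, {input_name: batch})
print(outputs[0].shape)
```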

We've built samples that show how you can use DirectML and the ONNX Runtime.

DirectML and PyTorch

The DirectML backend for PyTorch enables high-performance, low-level access to GPU hardware while exposing a familiar PyTorch API to developers. More information on how to use PyTorch with DirectML can be found in the PyTorch with DirectML documentation.
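
As a small sketch, the torch-directml package exposes DirectML as a PyTorch device, so existing tensor code mostly just changes where it allocates:

```python
# Minimal sketch: running PyTorch tensor operations on a DirectML device.
# Requires the torch-directml package (pip install torch-directml).
import torch
import torch_directml

dml = torch_directml.device()  # selects the default DirectML adapter

a = torch.randn(2, 2).to(dml)
b = torch.randn(2, 2).to(dml)
print((a + b).cpu())  # move the result back to the CPU to print it
```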

DirectML for web applications (Preview)

The Web Neural Network API (WebNN) is an emerging web standard that allows web apps and frameworks to accelerate deep neural networks with on-device hardware such as GPUs and CPUs, or purpose-built AI accelerators such as NPUs. On Windows, the WebNN API leverages DirectML to access native hardware capabilities and optimize the execution of neural network models. More information on WebNN can be found in the WebNN overview documentation.