Windows ML walkthrough

2025-05-24

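This tutorial walks through using Windows ML to run the ResNet-50 image classification model on Windows, detailing how to acquire and preprocess the model. The implementation dynamically selects execution providers to optimize inference performance.

The ResNet-50 model is a PyTorch model for image classification.

In this tutorial, you'll acquire the ResNet-50 model from Hugging Face and convert it to QDQ ONNX format using the AI Toolkit.

You'll then load the model, prepare the input tensors, and run inference using the Windows ML APIs. Post-processing applies softmax to the raw output and retrieves the top predictions.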

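Acquire the model and preprocess it

You can acquire ResNet-50 from Hugging Face (the platform where the ML community collaborates on models, datasets, and apps). You'll convert ResNet-50 to QDQ ONNX format using the AI Toolkit (for more information, see Convert models to ONNX format).

The goal of this sample code is to let the Windows ML runtime do the heavy lifting.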

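The Windows ML runtime will:

Load the model.
Dynamically select the preferred IHV-provided execution provider (EP) for the model, and download that EP from the Microsoft Store on demand.
Run inference on the model using the EP.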

For API reference, see OrtSessionOptions and the Microsoft::Windows::AI::MachineLearning::Infrastructure class. Each step below is shown first in C# and then in C++.

// Create a new instance of EnvironmentCreationOptions
EnvironmentCreationOptions envOptions = new()
{
    logId = "ResnetDemo",
    logLevel = OrtLoggingLevel.ORT_LOGGING_LEVEL_ERROR
};

// Pass the options by reference to CreateInstanceWithOptions
OrtEnv ortEnv = OrtEnv.CreateInstanceWithOptions(ref envOptions);

// Use WinML to download and register Execution Providers
Microsoft.Windows.AI.MachineLearning.Infrastructure infrastructure = new();
Console.WriteLine("Ensure EPs are downloaded ...");
await infrastructure.DownloadPackagesAsync();
await infrastructure.RegisterExecutionProviderLibrariesAsync();

// Create the ONNX Runtime session options
Console.WriteLine("Creating session ...");
var sessionOptions = new SessionOptions();
// Set EP Selection Policy
sessionOptions.SetEpSelectionPolicy(ExecutionProviderDevicePolicy.MIN_OVERALL_POWER);

winrt::init_apartment();
// Initialize ONNX Runtime
Ort::Env env(ORT_LOGGING_LEVEL_ERROR, "CppConsoleDesktop");

// Use WinML to download and register Execution Providers
winrt::Microsoft::Windows::AI::MachineLearning::Infrastructure infrastructure;
infrastructure.DownloadPackagesAsync().get();
infrastructure.RegisterExecutionProviderLibrariesAsync().get();

// Set the auto EP selection policy
Ort::SessionOptions sessionOptions;
sessionOptions.SetEpSelectionPolicy(OrtExecutionProviderDevicePolicy_MIN_OVERALL_POWER);

EP compilation

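If your model isn't already compiled for the selected EP (which can vary by device), it first needs to be compiled against that EP. This is a one-time process. The sample code below handles it by compiling the model on the first run and storing the result locally. Subsequent runs pick up the compiled version and run that, which gives optimized, fast inference.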

For API reference, see the Ort::ModelCompilationOptions struct, the Ort::Status struct, and Ort::CompileModel.

// Prepare paths
string executableFolder = Path.GetDirectoryName(Assembly.GetEntryAssembly()!.Location)!;
string labelsPath = Path.Combine(executableFolder, "ResNet50Labels.txt");
string imagePath = Path.Combine(executableFolder, "dog.jpg");
            
// TODO: Please use AITK Model Conversion tool to download and convert Resnet, and paste the converted path here
string modelPath = @"";
string compiledModelPath = @"";

// Compile the model if not already compiled
bool isCompiled = File.Exists(compiledModelPath);
if (!isCompiled)
{
    Console.WriteLine("No compiled model found. Compiling model ...");
    using (var compileOptions = new OrtModelCompilationOptions(sessionOptions))
    {
        compileOptions.SetInputModelPath(modelPath);
        compileOptions.SetOutputModelPath(compiledModelPath);
        compileOptions.CompileModel();
        isCompiled = File.Exists(compiledModelPath);
        if (isCompiled)
        {
            Console.WriteLine("Model compiled successfully!");
        }
        else
        {
            Console.WriteLine("Failed to compile the model. Will use original model.");
        }
    }
}
else
{
    Console.WriteLine("Found precompiled model.");
}
var modelPathToUse = isCompiled ? compiledModelPath : modelPath;

// Prepare paths for model and labels
std::filesystem::path executableFolder = ResnetModelHelper::GetExecutablePath().parent_path();
std::filesystem::path labelsPath = executableFolder / "ResNet50Labels.txt";
std::filesystem::path dogImagePath = executableFolder / "dog.jpg";

// TODO: use AITK Model Conversion tool to get resnet and paste the path here
std::filesystem::path modelPath = L"";
std::filesystem::path compiledModelPath = L"";
bool isCompiledModelAvailable = std::filesystem::exists(compiledModelPath);

if (isCompiledModelAvailable)
{
    std::cout << "Using compiled model: " << compiledModelPath << std::endl;
}
else
{
    std::cout << "No compiled model found, attempting to create compiled model at " << compiledModelPath
                << std::endl;

    Ort::ModelCompilationOptions compile_options(env, sessionOptions);
    compile_options.SetInputModelPath(modelPath.c_str());
    compile_options.SetOutputModelPath(compiledModelPath.c_str());

    std::cout << "Starting compile, this may take a few moments..." << std::endl;
    Ort::Status compileStatus = Ort::CompileModel(env, compile_options);
    if (compileStatus.IsOK())
    {
        std::cout << "Model compiled successfully!" << std::endl;
        // Confirm the compiled model was actually written before relying on it
        isCompiledModelAvailable = std::filesystem::exists(compiledModelPath);
    }
    else
    {
        std::cerr << "Failed to compile model: " << compileStatus.GetErrorCode() << ", "
                    << compileStatus.GetErrorMessage() << std::endl;
        std::cerr << "Falling back to uncompiled model" << std::endl;
    }
}
std::filesystem::path modelPathToUse = isCompiledModelAvailable ? compiledModelPath : modelPath;
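
The decision both listings make (reuse a compiled model if one exists, otherwise compile once, otherwise fall back to the original model) can be isolated into a small function. The following is a minimal sketch rather than part of the sample: `ResolveModelPath` and the `CompileFn` callback type are hypothetical names introduced here for illustration, and the callback stands in for the call to `Ort::CompileModel`.

```cpp
#include <cassert>
#include <filesystem>
#include <functional>

namespace fs = std::filesystem;

// Hypothetical callback type: compiles `input` into `output` and reports success.
// In the tutorial, this role is played by Ort::CompileModel.
using CompileFn = std::function<bool(const fs::path& input, const fs::path& output)>;

// Returns the path the session should load: the compiled model when it exists
// (or can be produced now), otherwise the original model.
fs::path ResolveModelPath(const fs::path& modelPath,
                          const fs::path& compiledModelPath,
                          const CompileFn& compile)
{
    assert(compile && "a compile callback is required");

    // Later runs: reuse the artifact produced by the first run.
    if (fs::exists(compiledModelPath))
    {
        return compiledModelPath;
    }

    // First run: compile once, then confirm the artifact really landed on disk.
    if (compile(modelPath, compiledModelPath) && fs::exists(compiledModelPath))
    {
        return compiledModelPath;
    }

    // Compilation failed: fall back to the uncompiled model.
    return modelPath;
}
```

In the C++ listing, the callback would wrap the `Ort::CompileModel` call and return `compileStatus.IsOK()`. Keeping the decision separate from the ONNX Runtime calls makes the fallback behavior easy to test without a model file.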

Run inference

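The input image is converted to tensor data format, and inference then runs on that tensor. This is typical of any code that uses ONNX Runtime; the difference here is that ONNX Runtime is used directly through Windows ML. The only requirement in C++ is adding #include <win_onnxruntime_cxx_api.h> to the code.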
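
The image-to-tensor helpers used in the listings (`PreprocessImageAsync` in C# and `BindSoftwareBitmapAsTensor` in C++) aren't shown in this walkthrough. As a rough illustration of what such a conversion involves, the sketch below turns interleaved 8-bit RGB pixels into a planar float tensor in the [1, 3, height, width] layout that the listings pass to the session. The function name `RgbToNchw` is hypothetical, and the normalization constants are an assumption (the commonly used ImageNet per-channel mean and standard deviation); the sample's helpers may normalize differently.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed normalization constants: the widely used ImageNet mean and standard
// deviation per RGB channel. The tutorial's helpers may use different values.
constexpr std::array<float, 3> kMean = {0.485f, 0.456f, 0.406f};
constexpr std::array<float, 3> kStd  = {0.229f, 0.224f, 0.225f};

// Converts interleaved 8-bit RGB pixels (height x width x 3) into a planar
// float tensor laid out as [1, 3, height, width].
std::vector<float> RgbToNchw(const std::vector<std::uint8_t>& rgb,
                             std::size_t height, std::size_t width)
{
    assert(rgb.size() == height * width * 3);

    const std::size_t planeSize = height * width;
    std::vector<float> tensor(3 * planeSize);

    for (std::size_t y = 0; y < height; ++y)
    {
        for (std::size_t x = 0; x < width; ++x)
        {
            const std::size_t pixel = y * width + x;
            for (std::size_t c = 0; c < 3; ++c)
            {
                // Scale to [0, 1], then normalize with the assumed statistics.
                const float scaled = rgb[pixel * 3 + c] / 255.0f;
                // Channel c goes into its own contiguous plane.
                tensor[c * planeSize + pixel] = (scaled - kMean[c]) / kStd[c];
            }
        }
    }
    return tensor;
}
```

For a 224 x 224 image, the returned vector holds 3 * 224 * 224 floats, which matches the { 1, 3, 224, 224 } shape used when the input tensor is created.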

Also see Convert a model with AI Toolkit for VS Code.

For API reference, see the Ort::Session struct, the Ort::MemoryInfo struct, the Ort::Value struct, the Ort::AllocatorWithDefaultOptions struct, and the Ort::RunOptions struct.

using var session = new InferenceSession(modelPathToUse, sessionOptions);

Console.WriteLine("Preparing input ...");
// Load and preprocess image
var input = await PreprocessImageAsync(await LoadImageFileAsync(imagePath));
// Prepare input tensor
var inputName = session.InputMetadata.First().Key;
var inputTensor = new DenseTensor<float>(
    input.ToArray(),          // Copy the preprocessed pixel data into a float array backing the tensor
    new[] { 1, 3, 224, 224 }, // Tensor shape: batch, channels, height, width
    false                     // reverseStride: false keeps the default row-major layout
);

// Bind inputs and run inference
var inputs = new List<NamedOnnxValue>
{
    NamedOnnxValue.CreateFromTensor(inputName, inputTensor)
};

Console.WriteLine("Running inference ...");
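var results = session.Run(inputs);
// Repeat the run on the warmed-up session; dispose each earlier result before replacing it
for (int i = 0; i < 40; i++)
{
    results.Dispose();
    results = session.Run(inputs);
}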

// Extract output tensor
var outputName = session.OutputMetadata.First().Key;
var resultTensor = results.First(r => r.Name == outputName).AsEnumerable<float>().ToArray();

// Load labels and print results
var labels = LoadLabels(labelsPath);
PrintResults(labels, resultTensor);

Ort::Session session(env, modelPathToUse.c_str(), sessionOptions);
std::cout << "ResNet model loaded"<< std::endl;

// Load and Preprocess image
winrt::hstring imagePath{ dogImagePath.c_str()};
auto imageFrameResult = ResnetModelHelper::LoadImageFileAsync(imagePath);
auto inputTensorData = ResnetModelHelper::BindSoftwareBitmapAsTensor(imageFrameResult.get());

// Prepare input tensor
auto inputInfo = session.GetInputTypeInfo(0).GetTensorTypeAndShapeInfo();
auto inputType = inputInfo.GetElementType();

auto inputShape = std::array<int64_t, 4>{ 1, 3, 224, 224 };
auto memoryInfo = Ort::MemoryInfo::CreateCpu(OrtArenaAllocator, OrtMemTypeDefault);
std::vector<uint8_t> rawInputBytes;

if (inputType == ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT16)
{
    auto converted = ResnetModelHelper::ConvertFloat32ToFloat16(inputTensorData);
    rawInputBytes.assign(reinterpret_cast<uint8_t*>(converted.data()),
        reinterpret_cast<uint8_t*>(converted.data()) + converted.size() * sizeof(uint16_t));
}
else
{
    rawInputBytes.assign(reinterpret_cast<uint8_t*>(inputTensorData.data()),
        reinterpret_cast<uint8_t*>(inputTensorData.data()) +
        inputTensorData.size() * sizeof(float));
}

OrtValue* ortValue = nullptr;

Ort::ThrowOnError(Ort::GetApi().CreateTensorWithDataAsOrtValue(memoryInfo, rawInputBytes.data(),
    rawInputBytes.size(), inputShape.data(),
    inputShape.size(), inputType, &ortValue));
Ort::Value inputTensor{ ortValue };

const int iterations = 20;
std::cout << "Running inference for " << iterations << " iterations" << std::endl;
auto before = std::chrono::high_resolution_clock::now();
for (int i = 0; i < iterations; i++)
{
    //std::cout << "---------------------------------------------" << std::endl;
    //std::cout << "Running inference for " << i + 1 << "th time" << std::endl;
    //std::cout << "---------------------------------------------"<< std::endl;
    std::cout << ".";
    
    // Get input/output names
    Ort::AllocatorWithDefaultOptions allocator;
    auto inputName = session.GetInputNameAllocated(0, allocator);
    auto outputName = session.GetOutputNameAllocated(0, allocator);
    std::vector<const char*> inputNames = {inputName.get()};
    std::vector<const char*> outputNames = {outputName.get()};

    // Run inference
    auto outputTensors =
        session.Run(Ort::RunOptions{nullptr}, inputNames.data(), &inputTensor, 1, outputNames.data(), 1);

    // Extract results
    std::vector<float> results;
    if (inputType == ONNX_TENSOR_ELEMENT_DATA_TYPE_FLOAT16)
    {
        auto outputData = outputTensors[0].GetTensorMutableData<uint16_t>();
        size_t outputSize = outputTensors[0].GetTensorTypeAndShapeInfo().GetElementCount();
        std::vector<uint16_t> outputFloat16(outputData, outputData + outputSize);
        results = ResnetModelHelper::ConvertFloat16ToFloat32(outputFloat16);
    }
    else
    {
        auto outputData = outputTensors[0].GetTensorMutableData<float>();
        size_t outputSize = outputTensors[0].GetTensorTypeAndShapeInfo().GetElementCount();
        results.assign(outputData, outputData + outputSize);
    }

    if (i == iterations - 1)
    {
        // Load labels and print result
        std::cout << "\nOutput for the last iteration"<< std::endl;
        auto labels = ResnetModelHelper::LoadLabels(labelsPath);
        ResnetModelHelper::PrintResults(labels, results);
    }
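    // inputName and outputName are smart pointers that free their strings when they go out of scope.
    // Calling release() here would abandon the buffers and leak them on every iteration.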
}
std::cout << "---------------------------------------------" << std::endl;
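
The C++ listing calls `ResnetModelHelper::ConvertFloat32ToFloat16` when the model expects float16 input, but that helper isn't shown here. The sketch below illustrates one way such a conversion could work, producing IEEE 754 binary16 bit patterns with round-to-nearest-even. The names `FloatToHalfBits` and `ConvertFloat32ToFloat16Sketch` are hypothetical, and the sample's actual helper may be implemented differently.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Converts one float32 value to the bit pattern of an IEEE 754 binary16 value.
std::uint16_t FloatToHalfBits(float value)
{
    static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits");
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof(bits));

    const std::uint32_t sign = (bits >> 16) & 0x8000u;
    const std::uint32_t exponent = (bits >> 23) & 0xFFu;
    std::uint32_t mantissa = bits & 0x7FFFFFu;

    if (exponent == 0xFFu) // infinity or NaN
    {
        return static_cast<std::uint16_t>(sign | 0x7C00u | (mantissa ? 0x0200u : 0u));
    }

    const int halfExp = static_cast<int>(exponent) - 127 + 15; // rebias 8-bit to 5-bit
    if (halfExp >= 31) // too large for binary16: becomes infinity
    {
        return static_cast<std::uint16_t>(sign | 0x7C00u);
    }

    if (halfExp <= 0) // result is subnormal or zero in binary16
    {
        if (halfExp < -10)
        {
            return static_cast<std::uint16_t>(sign); // rounds to signed zero
        }
        mantissa |= 0x800000u; // restore the implicit leading 1
        const int shift = 14 - halfExp;
        std::uint32_t half = mantissa >> shift;
        const std::uint32_t remainder = mantissa & ((1u << shift) - 1u);
        const std::uint32_t halfway = 1u << (shift - 1);
        if (remainder > halfway || (remainder == halfway && (half & 1u)))
        {
            ++half; // round to nearest, ties to even
        }
        return static_cast<std::uint16_t>(sign | half);
    }

    std::uint32_t half = (static_cast<std::uint32_t>(halfExp) << 10) | (mantissa >> 13);
    const std::uint32_t remainder = mantissa & 0x1FFFu;
    if (remainder > 0x1000u || (remainder == 0x1000u && (half & 1u)))
    {
        ++half; // a carry into the exponent field still yields the correct value
    }
    return static_cast<std::uint16_t>(sign | half);
}

// Converts a whole buffer, mirroring how the listing uses its helper.
std::vector<std::uint16_t> ConvertFloat32ToFloat16Sketch(const std::vector<float>& input)
{
    std::vector<std::uint16_t> output;
    output.reserve(input.size());
    for (float v : input)
    {
        output.push_back(FloatToHalfBits(v));
    }
    assert(output.size() == input.size());
    return output;
}
```

The result is a vector of 16-bit values, which is consistent with the listing computing the byte count as `converted.size() * sizeof(uint16_t)`.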

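Post-processing

The softmax function is applied to the raw output returned by the model, and the label data is used to map the five highest probabilities to their class names and print them.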

private static void PrintResults(IList<string> labels, IReadOnlyList<float> results)
{
    // Apply softmax to the results
    float maxLogit = results.Max();
    var expScores = results.Select(r => MathF.Exp(r - maxLogit)).ToList(); // stability with maxLogit
    float sumExp = expScores.Sum();
    var softmaxResults = expScores.Select(e => e / sumExp).ToList();

    // Get top 5 results
    IEnumerable<(int Index, float Confidence)> topResults = softmaxResults
        .Select((value, index) => (Index: index, Confidence: value))
        .OrderByDescending(x => x.Confidence)
        .Take(5);

    // Display results
    Console.WriteLine("Top Predictions:");
    Console.WriteLine("-------------------------------------------");
    Console.WriteLine("{0,-32} {1,10}", "Label", "Confidence");
    Console.WriteLine("-------------------------------------------");

    foreach (var result in topResults)
    {
        Console.WriteLine("{0,-32} {1,10:P2}", labels[result.Index], result.Confidence);
    }

    Console.WriteLine("-------------------------------------------");
}

void PrintResults(const std::vector<std::string>& labels, const std::vector<float>& results) {
    // Apply softmax to the results  
    float maxLogit = *std::max_element(results.begin(), results.end());
    std::vector<float> expScores;
    float sumExp = 0.0f;

    for (float r : results) {
        float expScore = std::exp(r - maxLogit);
        expScores.push_back(expScore);
        sumExp += expScore;
    }

    std::vector<float> softmaxResults;
    for (float e : expScores) {
        softmaxResults.push_back(e / sumExp);
    }

    // Get top 5 results  
    std::vector<std::pair<int, float>> indexedResults;
    for (size_t i = 0; i < softmaxResults.size(); ++i) {
        indexedResults.emplace_back(static_cast<int>(i), softmaxResults[i]);
    }

    std::sort(indexedResults.begin(), indexedResults.end(), [](const auto& a, const auto& b) {
        return a.second > b.second;
        });

    indexedResults.resize(std::min<size_t>(5, indexedResults.size()));

    // Display results  
    std::cout << "Top Predictions:\n";
    std::cout << "-------------------------------------------\n";
    std::cout << std::left << std::setw(32) << "Label" << std::right << std::setw(10) << "Confidence\n";
    std::cout << "-------------------------------------------\n";

    for (const auto& result : indexedResults) {
        std::cout << std::left << std::setw(32) << labels[result.first]
            << std::right << std::setw(10) << std::fixed << std::setprecision(2) << (result.second * 100) << "%\n";
    }

    std::cout << "-------------------------------------------\n";
}
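
Both `PrintResults` listings subtract `maxLogit` before exponentiating. The shift doesn't change the softmax result mathematically, but it keeps every exponent at or below zero so `exp` can't overflow. The sketch below contrasts the two approaches on deliberately large logits; `NaiveSoftmax` and `StableSoftmax` are hypothetical names used only for this comparison.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Softmax exactly as the formula reads, with no shift. Shown only for contrast.
std::vector<float> NaiveSoftmax(const std::vector<float>& logits)
{
    std::vector<float> out(logits.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < logits.size(); ++i)
    {
        out[i] = std::exp(logits[i]); // overflows to +infinity for large logits
        sum += out[i];
    }
    for (float& v : out)
    {
        v /= sum; // infinity divided by infinity yields NaN
    }
    return out;
}

// Softmax with the max-logit shift used by the PrintResults listings.
std::vector<float> StableSoftmax(const std::vector<float>& logits)
{
    assert(!logits.empty());
    const float maxLogit = *std::max_element(logits.begin(), logits.end());

    std::vector<float> out(logits.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < logits.size(); ++i)
    {
        out[i] = std::exp(logits[i] - maxLogit); // every exponent is <= 0
        sum += out[i];
    }
    for (float& v : out)
    {
        v /= sum; // sum is at least 1 because the largest term is exp(0)
    }
    return out;
}
```

For the logits {1000, 1001, 1002}, the naive version returns NaN values, while the shifted version returns probabilities of roughly 0.09, 0.24, and 0.67.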

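Output

Here's an example of the kind of output to expect. The labels and confidence values depend on the input image.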

285, Egyptian cat with confidence of 0.904274
281, tabby with confidence of 0.0620204
282, tiger cat with confidence of 0.0223081
287, lynx with confidence of 0.00119624
761, remote control with confidence of 0.000487919

Full code samples

The complete code samples are available in the GitHub repository.
