Float16 vs Float WinML model performance?

venki 6 Reputation points
2022-07-22T06:27:40.617+00:00

As per the performance documentation, if the model is in float16 format, performance can improve. We converted our model from float32 to float16 to see whether GPU usage would drop.

After this change, we didn't observe any improvement. The model input is TensorFloat16Bit and the output is TensorFloat16Bit.

The system also supports float16. Are there any additional steps we are missing?


1 answer

  1. Limitless Technology 39,921 Reputation points
    2022-07-26T07:55:33.453+00:00

    Hi there,

    Did you convert to float16 using ONNXMLTools?

    For better performance and reduced model footprint, you can use ONNXMLTools to convert your model to float16. Once converted, all weights and inputs are float16.
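    As a rough illustration, the conversion step can be sketched as below. This is a minimal sketch assuming `onnx` and `onnxmltools` are installed (`pip install onnx onnxmltools`); the tiny Relu graph is a stand-in for a real model, which you would instead load with `onnx.load("model.onnx")`:

    ```python
    # Minimal sketch of a float32 -> float16 conversion with ONNXMLTools.
    # The trivial Relu graph below is only a stand-in for a real model.
    from onnx import TensorProto, helper
    from onnxmltools.utils.float16_converter import convert_float_to_float16

    # Build a trivial float32 model: Y = Relu(X).
    x = helper.make_tensor_value_info("X", TensorProto.FLOAT, [1, 4])
    y = helper.make_tensor_value_info("Y", TensorProto.FLOAT, [1, 4])
    graph = helper.make_graph([helper.make_node("Relu", ["X"], ["Y"])],
                              "demo", [x], [y])
    model = helper.make_model(graph)

    # Convert weights and (by default) the graph inputs/outputs to float16.
    model_fp16 = convert_float_to_float16(model)
    print(model_fp16.graph.input[0].type.tensor_type.elem_type
          == TensorProto.FLOAT16)
    ```

    After saving with `onnx.save(model_fp16, "model_fp16.onnx")`, the model's inputs and outputs are float16, which is why the WinML bindings then need to be TensorFloat16Bit. Also note that a float16 model only runs faster when the GPU and driver actually execute FP16 math at a higher rate; if they don't, the runtime can end up computing in FP32 internally, which is one possible reason for seeing no improvement.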

    Also note that each instance of LearningModel and LearningModelSession has a copy of the model in memory. With small models this might not be a concern, but with very large models it becomes important.

    Windows ML performance and memory

    https://learn.microsoft.com/en-us/windows/ai/windows-ml/performance-memory

    I hope this information helps. If you have any questions please let me know and I will be glad to help you out.

