Float16 vs Float WinML model performance?

venki 6 Reputation points
2022-07-22T06:27:40.617+00:00

As per the performance documentation, if the model is in float16 format, performance can improve. We converted our model from float32 to float16 to see whether GPU usage would drop.

After this change, we didn't observe any improvement. The model input is TensorFloat16Bit and the output is TensorFloat16Bit.

The system also supports float16. Are there any additional steps we are missing?


1 answer

  1. Limitless Technology 39,921 Reputation points
    2022-07-26T07:55:33.453+00:00

    Hi there,

    Did you convert to float16 using ONNXMLTools?

    For better performance and reduced model footprint, you can use ONNXMLTools to convert your model to float16. Once converted, all weights and inputs are float16.
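    As a rough illustration, the conversion step can be sketched as below. This is a minimal sketch assuming `onnx` and `onnxmltools` are installed (`pip install onnx onnxmltools`); the tiny Relu graph is a stand-in for a real model, which you would instead load with `onnx.load("model.onnx")`:

    ```python
    # Minimal sketch of a float32 -> float16 conversion with ONNXMLTools.
    # The trivial Relu graph below is only a stand-in for a real model.
    from onnx import TensorProto, helper
    from onnxmltools.utils.float16_converter import convert_float_to_float16

    # Build a trivial float32 model: Y = Relu(X).
    x = helper.make_tensor_value_info("X", TensorProto.FLOAT, [1, 4])
    y = helper.make_tensor_value_info("Y", TensorProto.FLOAT, [1, 4])
    graph = helper.make_graph([helper.make_node("Relu", ["X"], ["Y"])],
                              "demo", [x], [y])
    model = helper.make_model(graph)

    # Convert weights and (by default) the graph inputs/outputs to float16.
    model_fp16 = convert_float_to_float16(model)
    print(model_fp16.graph.input[0].type.tensor_type.elem_type
          == TensorProto.FLOAT16)
    ```

    After saving with `onnx.save(model_fp16, "model_fp16.onnx")`, the model's inputs and outputs are float16, which is why the WinML bindings then need to be TensorFloat16Bit. Also note that a float16 model only runs faster when the GPU and driver actually execute FP16 math at a higher rate; if they don't, the runtime can end up computing in FP32 internally, which is one possible reason for seeing no improvement.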

    Also note that each instance of LearningModel and LearningModelSession has a copy of the model in memory. With small models this might not be a concern, but with very large models it becomes important.

    Windows ML performance and memory

    https://learn.microsoft.com/en-us/windows/ai/windows-ml/performance-memory

    I hope this information helps. If you have any questions please let me know and I will be glad to help you out.

