Hi there,
Did you convert the model to float16 using ONNXMLTools?
For better performance and a smaller model footprint, you can use ONNXMLTools to convert your model to float16. Once converted, all weights and inputs in the model are float16.
Also note that each instance of LearningModel and LearningModelSession holds its own copy of the model in memory. With small models this may not matter, but with very large models it becomes important.
Windows ML performance and memory
https://learn.microsoft.com/en-us/windows/ai/windows-ml/performance-memory
I hope this information helps. If you have any questions, please let me know and I will be glad to help.
---------------------------------------------------------------------------------------------------------------------------------
--If the reply is helpful, please Upvote and Accept it as an answer--