Hi Karnik Kanojia,

Thanks for using the Q&A platform.
To optimize, you can create a custom Docker image with Intel oneAPI and use it as your base. Start with a Dockerfile built on the intel/oneapi-basekit image, build and push it to Azure Container Registry, and then configure your Azure ML endpoint to use that custom image.
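As a rough sketch, the Dockerfile could look like the one below. The `requirements.txt` file and the registry/image names are placeholders — adjust them to your own scoring script's dependencies and your ACR setup:

```dockerfile
# Base on Intel's oneAPI Base Toolkit image (tag is an assumption; pin a specific version in production)
FROM intel/oneapi-basekit:latest

# Install the Python dependencies your scoring script needs (requirements.txt is hypothetical)
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy your inference code into the image (path is an example)
COPY score.py /app/score.py
WORKDIR /app
```

You can then build and push it with `az acr build --registry <your-registry> --image oneapi-inference:v1 .` and reference the resulting image in the environment definition of your Azure ML endpoint.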
Also, beyond the base image, try model quantization or pruning to shrink the model, a more powerful VM SKU such as the NC series (GPU-backed) for the endpoint, and profiling your inference pipeline to remove pre/post-processing bottlenecks.
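To show what quantization buys you conceptually, here is a minimal pure-Python sketch of int8 affine quantization — each float32 weight (4 bytes) is mapped to a single signed byte plus a shared scale and zero point. In practice you would use a framework tool (e.g. PyTorch's dynamic quantization or Intel Neural Compressor) rather than hand-rolling this:

```python
def quantize_int8(weights):
    """Affine-quantize a list of floats to int8 values with a shared scale/zero point."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 255 or 1.0          # map the float range onto 256 int8 levels
    zero_point = round(-lo / scale) - 128   # int8 value that represents float 0-ish
    q = [max(-128, min(127, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float values from the int8 representation."""
    return [(v - zero_point) * scale for v in q]

weights = [0.0, 0.5, 1.0, -1.0]
q, scale, zp = quantize_int8(weights)
restored = dequantize(q, scale, zp)
# q holds one byte per weight; restored values are close to, but not exactly, the originals
```

The storage drops 4x (int8 vs float32) at the cost of a small reconstruction error, which is why quantized models are smaller and often faster at inference with only a minor accuracy hit.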
If this helps, kindly accept the response. Thanks!