Hi Karnik Kanojia,

Thanks for using the Q&A platform.
To optimize, you can create a custom Docker image with Intel oneAPI and use it as your base. Start with a Dockerfile built on the intel/oneapi-basekit image, build and push it to Azure Container Registry, and then configure your Azure ML endpoint to use that custom image.
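As a rough sketch, the Dockerfile could look like the one below. The `requirements.txt` file and the registry/image names are placeholders — adjust them to your own scoring script's dependencies and your ACR setup:

```dockerfile
# Base on Intel's oneAPI Base Toolkit image (tag is an assumption; pin a specific version in production)
FROM intel/oneapi-basekit:latest

# Install the Python dependencies your scoring script needs (requirements.txt is hypothetical)
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy your inference code into the image (path is an example)
COPY score.py /app/score.py
WORKDIR /app
```

You can then build and push it with `az acr build --registry <your-registry> --image oneapi-inference:v1 .` and reference the resulting image in the environment definition of your Azure ML endpoint.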
Also, beyond the base image, try model quantization or pruning to shrink the model, a more powerful VM SKU such as the NC series (GPU-backed) for the endpoint, and profiling your inference pipeline to remove pre/post-processing bottlenecks.
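To show what quantization buys you conceptually, here is a minimal pure-Python sketch of int8 affine quantization — each float32 weight (4 bytes) is mapped to a single signed byte plus a shared scale and zero point. In practice you would use a framework tool (e.g. PyTorch's dynamic quantization or Intel Neural Compressor) rather than hand-rolling this:

```python
def quantize_int8(weights):
    """Affine-quantize a list of floats to int8 values with a shared scale/zero point."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 255 or 1.0          # map the float range onto 256 int8 levels
    zero_point = round(-lo / scale) - 128   # int8 value that represents float 0-ish
    q = [max(-128, min(127, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float values from the int8 representation."""
    return [(v - zero_point) * scale for v in q]

weights = [0.0, 0.5, 1.0, -1.0]
q, scale, zp = quantize_int8(weights)
restored = dequantize(q, scale, zp)
# q holds one byte per weight; restored values are close to, but not exactly, the originals
```

The storage drops 4x (int8 vs float32) at the cost of a small reconstruction error, which is why quantized models are smaller and often faster at inference with only a minor accuracy hit.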
If this helps, kindly accept the response. Thanks!