Speeding up ML Studio batch inference startup time
I'm trying to use Azure ML Studio batch inference, but startup is very slow. I understand that the first startup will be slow (it takes about 8 minutes), since Azure has to provision a new compute instance and pull the Docker image. However, subsequent inference requests (made before the compute instance times out) are still very slow (over 3 minutes). Looking through the logs, this is because the (fairly large) Docker image is pulled from the registry every time rather than being cached locally, even though the environment and scoring script are identical. How can I solve this?
(While 3 minutes may not seem like long, I want to use fairly small batches, so it is a significant proportion of the total time. And I can't use always-on online endpoints, since I have multiple models requiring GPUs, and Azure doesn't allow sharing GPUs.)