Speeding up ML Studio batch inference startup time

Question

David-3633 131 Reputation points

I'm trying to use Azure ML Studio batch inference, but startup is very slow. I understand that the first startup will be slow (about 8 minutes), since Azure has to provision a new compute instance and pull the Docker image. However, subsequent inference requests (made before the compute instance times out) are still very slow (over 3 minutes). Looking through the logs, this is because it pulls the (fairly large) Docker image from the registry every time instead of caching it locally, even though the environment and scoring script are identical. How can I solve this?

(While 3 minutes may not seem long, I want to use fairly small batches, so it is a significant proportion of the total time. I also can't use always-on online endpoints, since I have multiple models requiring GPUs, and Azure doesn't allow GPUs to be shared between them.)

Ramr-msft 17,836 Reputation points

2022-06-20T15:50:00.857+00:00

@David-3633 Thanks for the question. We have forwarded this to the product team to check.

Ramr-msft 17,836 Reputation points

2022-06-22T03:31:11.763+00:00

@David-3633 Thanks for the feedback. Please share details of your experiment and the issue from the ml.azure.com portal so an engineer can look into it further. This option is available in the top right-hand corner of the portal by clicking the smiley face. Please select the option "Microsoft can email you about the feedback" and attach a screenshot so our service team can look up the issue and advise through email.

David-3633 131 Reputation points

2022-06-22T13:30:22.663+00:00

Hello @Ramr-msft, thanks for your reply. I've replicated the issue and sent feedback, including the link to this question.