Torch Model + Inference script in a container to Azure Kubernetes

Pedrojfb 41 Reputation points

Hello everyone,

I am somewhat inexperienced with cloud services and am currently trying to find the best way to deploy to Azure an ML model and inference script that I developed locally.

I have everything running in a container, which works as follows:

Script fetches the data sources from the DB server -> if there are new data sources, it creates
a new thread for each one -> each thread continuously performs inference on its data source
and continuously writes the results to another server
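
For readers following along, the flow above can be sketched roughly as below. This is only a minimal illustration, not the actual script: `SourceWatcher`, `list_sources`, and `infer_once` are hypothetical names, and the real DB query, Torch model call, and result upload are assumed to live inside the two callables passed in.

```python
import threading


class SourceWatcher:
    """Starts one long-running inference thread per data source (illustrative sketch)."""

    def __init__(self, list_sources, infer_once, poll_seconds=5.0):
        # list_sources: hypothetical callable returning the current source IDs from the DB.
        # infer_once: hypothetical callable that runs inference on one source and
        #             writes the result to the other server.
        self._list_sources = list_sources
        self._infer_once = infer_once
        self._poll_seconds = poll_seconds
        self._stop = threading.Event()
        self._workers = {}

    def _worker(self, source):
        # Keep running inference on this source until stop() is called.
        while not self._stop.is_set():
            self._infer_once(source)
            # Event.wait doubles as an interruptible sleep between iterations.
            self._stop.wait(self._poll_seconds)

    def poll_once(self):
        """Start a thread for every source not seen before; return the new sources."""
        started = []
        for source in self._list_sources():
            if source not in self._workers:
                thread = threading.Thread(
                    target=self._worker, args=(source,), daemon=True
                )
                self._workers[source] = thread
                thread.start()
                started.append(source)
        return started

    def stop(self):
        # Signal all workers to finish and wait for them to exit.
        self._stop.set()
        for thread in self._workers.values():
            thread.join()
```

In this shape, the main loop would just call `poll_once()` on a timer. One thing the sketch makes visible: every replica started by Kubernetes would see the same list of sources, so some way of dividing sources between replicas is probably needed before scaling out.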

I have successfully deployed this container to Container Instances and Container Apps (to experiment and verify that it works), and as a Pod inside a Kubernetes cluster. The main idea is to run it inside Kubernetes and scale it out as the number of data sources it needs to process grows.

My question is: is this the best approach for this scenario, deploying the container as a single image application and scaling it from there?

Thanks for your help in advance!

vipullag-MSFT 24,441 Reputation points

2022-12-06T15:46:44.377+00:00

@Pedrojfb

Apologies for the delayed response on this.

Can you please elaborate a bit more on your question?

"is this the best approach for this scenario, deploying the container as a single image application and scaling it from there?"

Scaling from where? And are you scaling manually?

vipullag-MSFT 24,441 Reputation points

2022-12-20T15:33:59.05+00:00

@Pedrojfb

Any update on this?

Pedrojfb 41 Reputation points

2022-12-21T23:59:22.49+00:00

I'm also sorry for the delayed response.

My main question was whether packaging the model and inference script together inside a container is the best approach, compared to, say, using the Azure Machine Learning services (which I am not very familiar with).

So far I think the solution I described (model + script inside the container) works well. I've created a Kubernetes service and I've already set up the autoscaler to create more replicas of this container based on CPU usage. I have yet to perform more tests though, especially when it comes to which VM size to choose for the nodes, how many vCPUs and how much memory I should reserve for each Pod, etc.
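
As a rough aid for those sizing tests, the replica count that the Horizontal Pod Autoscaler aims for can be estimated with the formula from the Kubernetes documentation: desired = ceil(current replicas × current metric / target metric). The sketch below is only an approximation of that rule. The function name, the default limits, and the example numbers are assumptions, and the real controller also applies stabilization windows and readiness checks that are not modelled here.

```python
import math


def desired_replicas(
    current_replicas,
    current_cpu_percent,
    target_cpu_percent,
    min_replicas=1,
    max_replicas=10,
    tolerance=0.1,
):
    """Estimate the replica count the HPA would target for a CPU utilization metric.

    CPU utilization is measured as a percentage of each Pod's CPU *request*,
    so the request chosen per Pod directly changes when scaling kicks in.
    """
    ratio = current_cpu_percent / target_cpu_percent
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current replica count.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured bounds (hypothetical defaults above).
    return max(min_replicas, min(max_replicas, desired))
```

For example, `desired_replicas(2, 90, 50)` gives 4: two replicas averaging 90% of their CPU request against a 50% target. Because the percentage is relative to the request, reserving too little CPU per Pod would make the autoscaler add replicas earlier than the actual load requires.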