This article presents a solution for real-time inferencing on Azure Kubernetes Service (AKS).
- A machine learning model is packaged into a container and published to Azure Container Registry.
- Azure Blob Storage hosts training data sets and the trained model.
- Kubeflow is used to deploy training jobs to AKS, including parameter servers and worker nodes.
- Kubeflow is used to make a production model available. This step promotes a consistent environment across testing, control, and production.
- AKS supports GPU-enabled virtual machines (VMs), which back the cluster that runs the model.
- Developers build features to query the model that runs in an AKS cluster.
Components
- Blob Storage is a service that's part of Azure Storage. Blob Storage offers optimized cloud object storage for large amounts of unstructured data.
- Container Registry builds, stores, and manages container images and can store containerized machine learning models.
- AKS is a highly available, secure, and fully managed Kubernetes service. AKS makes it easy to deploy and manage containerized applications.
- Azure Machine Learning is a cloud-based environment that you can use to train, deploy, automate, manage, and track machine learning models. You can use the models to forecast future behavior, outcomes, and trends.
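As a rough illustration of the step in which developers query the model that runs in the AKS cluster, the following Python sketch builds a JSON scoring request by using only the standard library. The endpoint URL and the `{"data": ...}` payload shape are hypothetical. This article doesn't specify the scoring interface, so adapt both to whatever your deployed model expects.

```python
import json
import urllib.request

# Hypothetical scoring URL: the article does not name an endpoint or a schema.
SCORING_URL = "http://model.example.internal/score"


def build_scoring_request(features, url=SCORING_URL):
    """Build (but do not send) a JSON POST request for a batch of feature rows."""
    # Assumed payload shape: {"data": [[...], [...]]}
    body = json.dumps({"data": features}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_scoring_response(raw_bytes):
    """Decode a JSON response body into Python objects."""
    return json.loads(raw_bytes.decode("utf-8"))


request = build_scoring_request([[0.1, 0.2, 0.3]])
print(request.get_method(), request.full_url)
```

To send the request, you could pass it to `urllib.request.urlopen(request, timeout=5)` and hand the bytes from `response.read()` to `parse_scoring_response`. Building the request separately from sending it keeps the payload logic testable without a running cluster.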
AKS is useful when you need high-scale production deployments of your machine learning models. A high-scale deployment involves a fast response time, autoscaling of the deployed service, and logging. For more information, see Deploy a model to an Azure Kubernetes Service cluster.
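To make the autoscaling requirement concrete, the following sketch reproduces the replica calculation that the Kubernetes Horizontal Pod Autoscaler documents: the desired replica count is the current count multiplied by the ratio of the observed metric to the target metric, rounded up. This article doesn't state which autoscaler the solution uses, so treating it as the Horizontal Pod Autoscaler is an assumption. The function name, the replica bounds, and the default values are illustrative.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Estimate the replica count an HPA-style autoscaler would request.

    Implements desired = ceil(current_replicas * current_metric / target_metric),
    skips scaling when the ratio is within `tolerance` of 1.0, and clamps the
    result to [min_replicas, max_replicas].
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        desired = current_replicas  # Close enough to target: no change.
    else:
        desired = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, desired))


# Four pods at 200 requests per second each, with a target of 100 per pod.
print(desired_replicas(4, 200, 100))  # → 8
```

The tolerance band avoids scaling on small fluctuations, and the upper bound matters on GPU-backed node pools, where each additional replica can require another GPU.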
This solution uses Kubeflow to manage the deployment to AKS. The machine learning models run on AKS clusters that are backed by GPU-enabled virtual machines (VMs).
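For a sense of what scheduling a model container onto a GPU-enabled node involves, the following sketch generates a plain Kubernetes Deployment manifest that requests a GPU through the `nvidia.com/gpu` resource limit. It's a generic Kubernetes example, not the Kubeflow-specific resources that this solution uses. The deployment name, the image path, and the port are hypothetical.

```python
import json


def gpu_deployment(name, image, replicas=2, gpus_per_pod=1, port=8080):
    """Return a Kubernetes Deployment manifest (as a dict) that requests GPUs."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": labels},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": image,
                        "ports": [{"containerPort": port}],
                        # GPUs are requested as an extended resource under limits.
                        "resources": {"limits": {"nvidia.com/gpu": gpus_per_pod}},
                    }]
                },
            },
        },
    }


# Hypothetical image in a container registry; replace with your own.
manifest = gpu_deployment("scoring", "example.azurecr.io/scoring-model:1.0")
print(json.dumps(manifest, indent=2))
```

Because `kubectl apply -f` accepts JSON as well as YAML, you could save the printed output to a file and apply it to a cluster whose node pool exposes GPUs.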
Potential use cases
This solution applies to scenarios that use AKS and GPU-enabled VMs for machine learning. Examples include:
- Image classification systems.
- Natural language processing algorithms.
- Predictive maintenance systems.
Related resources
- What is Azure Machine Learning?
- Azure Kubernetes Service (AKS)
- Deploy a model to an Azure Kubernetes Service cluster
- Kubeflow on Azure
- What is Azure Blob Storage?
- Introduction to container registries in Azure