Based on your requirements, Azure Batch is a suitable service for hosting your machine learning model. Azure Batch lets you run large-scale parallel and high-performance computing (HPC) workloads without having to provision or manage the underlying infrastructure yourself.
Here's how Azure Batch aligns with your needs:
Execution with high RAM requirements: Azure Batch runs tasks on pools of VMs whose size you choose, including memory-optimized sizes. You can pick a VM size with 32 GB or 128 GB of RAM depending on what the model needs.
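As a rough illustration, here is a minimal sketch using the azure-batch Python SDK to create a pool of memory-optimized nodes. The account name, key, URL, pool ID, and the Standard_E16s_v3 size (128 GB RAM; Standard_E4s_v3 would cover the 32 GB case) are placeholder assumptions to adapt to your environment:

```python
# pip install azure-batch
from azure.batch import BatchServiceClient, batch_auth
import azure.batch.models as batchmodels

# Placeholder account details -- replace with your Batch account values.
credentials = batch_auth.SharedKeyCredentials("mybatchaccount", "<account-key>")
batch_client = BatchServiceClient(
    credentials, batch_url="https://mybatchaccount.westeurope.batch.azure.com"
)

# VM image for the pool nodes (Ubuntu 20.04 here, purely as an example).
vm_config = batchmodels.VirtualMachineConfiguration(
    image_reference=batchmodels.ImageReference(
        publisher="canonical",
        offer="0001-com-ubuntu-server-focal",
        sku="20_04-lts",
        version="latest",
    ),
    node_agent_sku_id="batch.node.ubuntu 20.04",
)

# One memory-optimized node; Standard_E16s_v3 provides 128 GB of RAM.
pool = batchmodels.PoolAddParameter(
    id="ml-model-pool",
    vm_size="Standard_E16s_v3",
    virtual_machine_configuration=vm_config,
    target_dedicated_nodes=1,
)
batch_client.pool.add(pool)
```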
Ad hoc and infrequent requests: Azure Batch is designed for running parallel and batch processing workloads, which aligns well with your infrequent requests. You can submit jobs to Azure Batch when you receive a request, and it will handle the scheduling and execution of those jobs on the specified VMs.
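When a request comes in, you submit a job and a task against that pool, roughly like this (reusing `batch_client` from the sketch above; the job ID, task ID, and the `run_model.py` command line are hypothetical):

```python
import azure.batch.models as batchmodels

# batch_client: the authenticated BatchServiceClient from the previous sketch.
job = batchmodels.JobAddParameter(
    id="ml-model-job-001",
    pool_info=batchmodels.PoolInformation(pool_id="ml-model-pool"),
)
batch_client.job.add(job)

# One task per incoming request; the command line is whatever runs your model.
task = batchmodels.TaskAddParameter(
    id="run-model-001",
    command_line="/bin/bash -c 'python3 run_model.py --input request.json'",
)
batch_client.task.add(job_id="ml-model-job-001", task=task)
```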
Resource optimization: Azure Batch lets you specify the number and size of VMs provisioned for your tasks, and you can scale the pool up or down based on the RAM requirements and the number of requests. After the tasks complete, you can scale the pool down to zero nodes or delete it to free the resources and minimize costs.
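Once the work has finished, scaling back down might look like the following sketch (same hypothetical pool ID as above); resizing to zero keeps the pool definition around, while deleting removes it entirely:

```python
import azure.batch.models as batchmodels

# Scale the pool back to zero nodes once the job is done...
batch_client.pool.resize(
    pool_id="ml-model-pool",
    pool_resize_parameter=batchmodels.PoolResizeParameter(target_dedicated_nodes=0),
)

# ...or remove it entirely if requests are rare enough to recreate it each time.
# batch_client.pool.delete(pool_id="ml-model-pool")
```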
Cost efficiency: Azure Batch supports both on-demand (dedicated) VMs and low-priority (preemptible) VMs, which can significantly reduce costs for non-time-critical workloads. Low-priority VMs draw on surplus Azure capacity at a reduced price, with the trade-off that they can be preempted.
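To request low-priority capacity instead of dedicated nodes, only the node targets in the pool definition change; a sketch, reusing the `vm_config` and `batch_client` from the first example:

```python
# Same pool shape as before, but asking for low-priority capacity
# instead of dedicated nodes (cheaper, but subject to preemption).
pool = batchmodels.PoolAddParameter(
    id="ml-model-pool-lowpri",
    vm_size="Standard_E16s_v3",
    virtual_machine_configuration=vm_config,  # defined in the first sketch
    target_dedicated_nodes=0,
    target_low_priority_nodes=1,
)
batch_client.pool.add(pool)
```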
Considering your requirements and the need for a powerful machine that can be provisioned on-demand, Azure Batch is a suitable choice. It provides the flexibility to scale up resources when needed, run the model for several hours, and then deallocate the resources to minimize costs.