Hello Dragos,
Thanks for reaching out to us. There is no official Azure documentation on this topic, but you can check the Ray Serve resources below to see whether there is an example on their end.
Use case - https://docs.ray.io/en/latest/ray-overview/use-cases.html
Examples - https://docs.ray.io/en/latest/ray-overview/examples.html
Discussion forum - https://discuss.ray.io/?_gl=111dr9eq_gcl_au*MTE4MTA1ODc1MC4xNzI0ODA4OTI2
In general, to answer your question: yes, it is possible to deploy a custom image with Ray Serve for model parallelization for online inference. Ray Serve is designed to handle complex model serving scenarios, including model parallelization, and can be deployed using a custom Docker image. The overall steps are:
- Create a Custom Docker Image
First, you need to create a Docker image that includes your model, Ray Serve, and any other dependencies required for inference.
- Define Your Ray Serve Application
Ray Serve allows you to deploy models with parallelization and scaling capabilities. Create a Python script to define your Ray Serve application.
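As a minimal sketch of such a script (not an official Azure or Ray sample): the class name `ModelDeployment` matches the one referred to later in this answer, while `predict`, `build_app`, the `"inputs"` payload key, and the placeholder "model" are assumptions for illustration only. Ray is imported lazily so the inference logic can be tried out without a running cluster.

```python
from typing import Any, Dict, List


class ModelDeployment:
    """Wraps a model so Ray Serve can expose it over HTTP."""

    def __init__(self, scale: float = 2.0) -> None:
        # Placeholder "model" (assumption): replace with real model loading.
        self.scale = scale

    def predict(self, payload: Dict[str, Any]) -> Dict[str, List[float]]:
        # Plain-Python inference logic, kept separate from HTTP handling.
        inputs = payload.get("inputs", [])
        return {"outputs": [self.scale * float(x) for x in inputs]}

    async def __call__(self, request) -> Dict[str, List[float]]:
        # For HTTP traffic, Ray Serve passes a Starlette Request object.
        payload = await request.json()
        return self.predict(payload)


def build_app(num_replicas: int = 2):
    # Imported here so the class above stays usable without Ray installed.
    from ray import serve

    deployment = serve.deployment(num_replicas=num_replicas)(ModelDeployment)
    return deployment.bind()
```

With Ray installed in the image, calling `serve.run(build_app())` (with `serve` imported from `ray`) should start the application. Keeping `predict` separate from `__call__` is a design choice that makes the model logic easy to unit test.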
- Deploy Your Ray Serve Application
You can deploy your Ray Serve application using the custom Docker image you created.
Using Ray’s Docker Deployment:
- Push Your Docker Image: Push your Docker image to a container registry (e.g., Docker Hub). When launching a Ray cluster, you can then specify that image for the cluster nodes to use.
- Access and Customize the API
Ray Serve exposes HTTP endpoints for your model deployments. You can interact with these endpoints using standard HTTP requests. Customize your API by modifying the `__call__` method in your `ModelDeployment` class to handle different types of requests or add more functionality.
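To illustrate calling the endpoint, here is a small client sketch that uses only the Python standard library. The URL `http://localhost:8000/` (Serve's default HTTP address when run locally) and the `"inputs"` payload shape are assumptions; adjust both to whatever your `__call__` method expects. The helper names `build_request` and `query` are hypothetical.

```python
import json
import urllib.request
from typing import Any, Dict

# Assumed address: Ray Serve's default local HTTP endpoint.
SERVE_URL = "http://localhost:8000/"


def build_request(payload: Dict[str, Any], url: str = SERVE_URL) -> urllib.request.Request:
    # Encode the payload as JSON and prepare a POST request.
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def query(payload: Dict[str, Any], url: str = SERVE_URL, timeout: float = 10.0) -> Any:
    # Send the request and decode the JSON body returned by the deployment.
    with urllib.request.urlopen(build_request(payload, url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Once the deployment is running, `query({"inputs": [1, 2]})` would send the payload and return the parsed response.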
Monitoring: Use Ray’s monitoring tools to keep track of your deployment’s performance. Ray provides a dashboard where you can monitor your cluster and deployments.
Scaling: Adjust the number of replicas and resources allocated to your deployment as needed. With autoscaling configured, Ray Serve can scale the number of replicas up and down based on request load.
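As a rough sketch of the scaling settings, replica bounds and per-replica resources can be passed as deployment options. `min_replicas` and `max_replicas` are Ray Serve autoscaling fields and `num_cpus` is a Ray actor option; the helper names and the specific numbers are assumptions. Other autoscaling settings have changed names across Ray versions, so please check the documentation for the version in your image.

```python
from typing import Any, Dict


def scaling_options(min_replicas: int = 1, max_replicas: int = 4,
                    cpus_per_replica: float = 1.0) -> Dict[str, Any]:
    # Validate early so a bad config fails before it reaches the cluster.
    if not 0 <= min_replicas <= max_replicas:
        raise ValueError("need 0 <= min_replicas <= max_replicas")
    return {
        "autoscaling_config": {
            "min_replicas": min_replicas,
            "max_replicas": max_replicas,
        },
        "ray_actor_options": {"num_cpus": cpus_per_replica},
    }


def apply_scaling(deployment, **kwargs):
    # `deployment` is assumed to be a class already wrapped by
    # @serve.deployment; .options() returns a reconfigured copy.
    return deployment.options(**scaling_options(**kwargs))
```

Note that a deployment should use either a fixed replica count or an autoscaling config, not both at once.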
I hope this helps and your issue can be solved soon.
Regards,
Yutong
- Please kindly accept the answer if you found it helpful, to support the community. Thanks a lot.