Greetings, Khaitan, Suraj (Ext)!
Are these real-time endpoints in Azure AI serverless?
Yes, real-time endpoints in Azure AI are serverless. They scale automatically with incoming traffic and can handle multiple requests simultaneously, without you having to manage any infrastructure.
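To consume such an endpoint, you send an authenticated JSON POST to its scoring URL. As a minimal sketch, the helper below just assembles the headers and body for that request; the URL, key, and input field name (`question`) are placeholders, since the actual payload shape depends on the inputs your flow defines:

```python
import json

def build_scoring_request(endpoint_url, api_key, inputs):
    """Assemble the headers and body for a real-time endpoint call.
    The URL and key come from the endpoint's Consume tab; values
    used here are placeholders, not real credentials."""
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
    }
    body = json.dumps(inputs).encode("utf-8")
    return endpoint_url, headers, body

# Example with placeholder values; pass the result to any HTTP client.
url, headers, body = build_scoring_request(
    "https://my-endpoint.eastus.inference.ml.azure.com/score",  # placeholder
    "<api-key>",  # placeholder
    {"question": "What is a serverless endpoint?"},  # flow-specific inputs
)
```

From there you would POST `body` with `headers` to `url` using `urllib.request` or `requests`.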
Is there any option to deploy these prompt flows within a compute instance?
Yes, it is possible to run your prompt flows on your own compute. You can use Azure Machine Learning compute, either a single compute instance or a cluster of virtual machines, to run your prompt flows.
This gives you more control over the environment in which your flows run and allows you to customize the resources allocated to them.
To deploy your prompt flows this way, you can use the Azure Machine Learning SDK or CLI.
Also, see Deploy a flow as a managed online endpoint for real-time inference for more details.
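As a sketch of what the managed online endpoint route looks like with the Azure ML CLI (`az ml online-endpoint create -f endpoint.yml`, then `az ml online-deployment create -f deployment.yml`), the YAML definitions are along these lines. Every name, the model reference, and the instance type below are placeholders you would replace with your own values:

```yaml
# endpoint.yml — placeholder names throughout
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineEndpoint.schema.json
name: my-pf-endpoint
auth_mode: key

# deployment.yml (a separate file, shown here after the document separator)
---
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineDeployment.schema.json
name: blue
endpoint_name: my-pf-endpoint
model: azureml:my-registered-flow:1   # your prompt flow, registered as a model
instance_type: Standard_DS3_v2        # example size; pick one for your workload
instance_count: 1
```

The linked documentation covers how to register the flow as a model and which environment settings the deployment needs.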
I hope this helps!
Do let me know if you have any further questions.
If the response helped, please click Accept Answer and Yes for "Was this answer helpful?". Doing so helps other community members with a similar issue identify the solution. I highly appreciate your contribution to the community.