Consume serverless API endpoints from a different Azure AI Foundry project or hub
In this article, you learn how to configure an existing serverless API endpoint in a different project or hub than the one that was used to create the deployment.
Important
Models that are in preview are marked as preview on their model cards in the model catalog.
Certain models in the model catalog can be deployed as serverless APIs. This kind of deployment provides a way to consume models as an API without hosting them on your subscription, while keeping the enterprise security and compliance that organizations need. This deployment option doesn't require quota from your subscription.
The need to consume a serverless API endpoint in a different project or hub than the one that was used to create the deployment might arise in situations such as these:
You want to centralize your deployments in a given project or hub and consume them from different projects or hubs in your organization.
You need to deploy a model in a hub in a particular Azure region where serverless deployment for that model is available. However, you need to consume it from another region, where serverless deployment isn't available for that particular model.
Prerequisites
An Azure subscription with a valid payment method. Free or trial Azure subscriptions won't work. If you don't have an Azure subscription, create a paid Azure account to begin.
# Classes from the Azure Machine Learning SDK for Python (azure-ai-ml) used in this article
from azure.ai.ml import MLClient
from azure.ai.ml.entities import ServerlessEndpoint, ServerlessConnection
from azure.identity import InteractiveBrowserCredential
Create a serverless API endpoint connection
Follow these steps to create a connection:
Connect to the project or hub where the endpoint is deployed:
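A minimal sketch of this step with the Azure Machine Learning Python SDK follows. The tenant, subscription, resource group, and workspace values are placeholders that you replace with the details of the project or hub that hosts the deployment.

# Authenticate interactively and connect to the workspace (project or hub)
# that hosts the serverless deployment. The values in angle brackets are placeholders.
client = MLClient(
    credential=InteractiveBrowserCredential(tenant_id="<tenant-id>"),
    subscription_id="<subscription-id>",
    resource_group_name="<resource-group>",
    workspace_name="<project-or-hub-name>",
)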
Get the URL and credentials for the endpoint you want to connect to. In this example, you get the details for an endpoint named meta-llama3-8b-qwerty.
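As a sketch, assuming the client created in the previous step, you can read the endpoint and its keys through the SDK's serverless endpoint operations. The endpoint name is the example name from above.

# Retrieve the serverless endpoint and its access keys from the source project or hub.
endpoint_name = "meta-llama3-8b-qwerty"
serverless_endpoint = client.serverless_endpoints.get(endpoint_name)
endpoint_keys = client.serverless_endpoints.get_keys(endpoint_name)
endpoint_url = serverless_endpoint.scoring_uri

With those details, you can register a ServerlessConnection in the project or hub where you want to consume the endpoint. The sketch below assumes a second MLClient, named client_target here for illustration, that is connected to that target project or hub; the connection name is also illustrative.

# In the target project or hub, create a connection that points to the endpoint.
client_target.connections.create_or_update(
    ServerlessConnection(
        name="meta-llama3-8b-connection",  # illustrative connection name
        endpoint=endpoint_url,
        api_key=endpoint_keys.primary_key,
    )
)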