Azure AI Studio を使用して大規模言語モデルをデプロイする方法

[アーティクル]
05/21/2024

重要

この記事で説明する機能の一部は、プレビューでのみ使用できる場合があります。このプレビューはサービスレベルアグリーメントなしで提供されており、運用環境ではお勧めしません。特定の機能はサポート対象ではなく、機能が制限されることがあります。詳しくは、Microsoft Azure プレビューの追加使用条件に関するページをご覧ください。

大規模言語モデル (LLM) をデプロイすると、Web サイト、アプリケーション、またはその他の運用環境で使用できるようになります。デプロイには通常、サーバーまたはクラウドでモデルをホストし、ユーザーがモデルと対話するための API またはその他のインターフェイスを作成することが含まれます。 Chat や Copilot などの生成 AI アプリケーションのリアルタイム推論のためにデプロイを呼び出すことができます。

この記事では、Azure AI Studio で大規模言語モデルをデプロイする方法について説明します。モデルは、モデルカタログまたはプロジェクトからデプロイできます。 Azure Machine Learning SDK を使用してモデルをデプロイすることもできます。この記事では、デプロイされたモデルで推論を実行する方法についても説明します。

コードを使用してサーバーレス API モデルをデプロイして推論する

モデルのデプロイ

サーバーレス API モデルは、従量課金制を使用してデプロイできるモデルです。例として、Phi-3、Llama-2、Command R、Mistral Large などがあります。サーバーレス API モデルの場合、モデルの微調整を選択しない限り、推論に対してのみ課金されます。

モデル ID を取得する

Azure Machine Learning SDK を使用してサーバーレス API モデルをデプロイできますが、まず、モデルカタログを参照し、デプロイに必要なモデル ID を取得しましょう。

AI Studio にサインインし、[ホーム] ページに移動します。
左側のサイドバーから [モデルカタログ] を選択します。
[デプロイオプション] フィルターで、[サーバーレス API] を選択します。
モデルを選択します。
選択したモデルの詳細ページからモデル ID をコピーします。次のようになります。azureml://registries/azureml-cohere/models/Cohere-command-r-plus/versions/3

Azure Machine Learning SDK をインストールする

次に、Azure Machine Learning SDK をインストールする必要があります。ターミナルで次のコマンドを実行します。

pip install azure-ai-ml
pip install azure-identity

サーバーレス API モデルをデプロイする

まず、Azure AI に対して認証する必要があります。

from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential
from azure.ai.ml.entities import MarketplaceSubscription, ServerlessEndpoint

# You can find your credential information in project settings.
client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id="your subscription name goes here",
    resource_group_name="your resource group name goes here",
    workspace_name="your project name goes here",
)

次に、前に見つけたモデル ID を参照しましょう。

# You can find the model ID on the model catalog.
model_id="azureml://registries/azureml-meta/models/Llama-2-70b-chat/versions/18"

サードパーティのモデルプロバイダーのサーバーレス API モデルでは、モデルを使用するために Azure Marketplace サブスクリプションが必要です。マーケットプレースサブスクリプションを作成しましょう。

Note

Microsoft からサーバーレス API モデル (Phi-3 など) をデプロイしている場合は、このパートをスキップできます。

# You can customize the subscription name.
subscription_name="Meta-Llama-2-70b-chat" 

marketplace_subscription = MarketplaceSubscription(
    model_id=model_id,
    name=subscription_name,
)

marketplace_subscription = client.marketplace_subscriptions.begin_create_or_update(
    marketplace_subscription
).result()

最後に、サーバーレスエンドポイントを作成しましょう。


endpoint_name="Meta-Llama-2-70b-chat-qwerty" # Your endpoint name must be unique

serverless_endpoint = ServerlessEndpoint(
    name=endpoint_name,
    model_id=model_id
)

created_endpoint = client.serverless_endpoints.begin_create_or_update(
    serverless_endpoint
).result()

サーバーレス API エンドポイントとキーを取得する

endpoint_keys = client.serverless_endpoints.get_keys(endpoint_name)
print(endpoint_keys.primary_key)
print(endpoint_keys.secondary_key)

デプロイを推論する

推論するには、使用しているさまざまなモデルの種類と SDK に特化したコードを使用する必要があります。コードサンプルは、Azure/azureml-examples サンプルリポジトリにあります。

コードを使用してマネージドコンピューティングデプロイをデプロイおよび推論する

モデルのデプロイ

AI Studio モデルカタログには 1,600 を超えるモデルが用意されており、これらのモデルをデプロイする最も一般的な方法は、マネージドオンラインデプロイとも呼ばれるマネージドコンピューティングデプロイオプションを使用することです。

モデル ID を取得する

Azure Machine Learning SDK を使用してマネージドコンピューティングモデルをデプロイできますが、まず、モデルカタログを参照し、デプロイに必要なモデル ID を取得しましょう。

AI Studio にサインインし、[ホーム] ページに移動します。
左側のサイドバーから [モデルカタログ] を選択します。
[デプロイオプション] フィルターで、[マネージドコンピューティング] を選択します。
モデルを選択します。
選択したモデルの詳細ページからモデル ID をコピーします。次のようになります。azureml://registries/azureml/models/deepset-roberta-base-squad2/versions/16

Azure Machine Learning SDK をインストールする

この手順では、Azure Machine Learning SDK をインストールする必要があります。

pip install azure-ai-ml
pip install azure-identity

モデルをデプロイする

まず、Azure AI に対して認証する必要があります。

from azure.ai.ml import MLClient
from azure.identity import InteractiveBrowserCredential

client = MLClient(
    credential=InteractiveBrowserCredential,
    subscription_id="your subscription name goes here",
    resource_group_name="your resource group name goes here",
    workspace_name="your project name goes here",
)

モデルをデプロイしましょう。

マネージドコンピューティングデプロイオプションでは、モデルデプロイの前にエンドポイントを作成する必要があります。エンドポイントは、複数のモデルデプロイを格納できるコンテナーと考えることができます。エンドポイント名はリージョン内で一意である必要があるため、この例ではタイムスタンプを使用して一意のエンドポイント名を作成します。

import time, sys
from azure.ai.ml.entities import (
    ManagedOnlineEndpoint,
    ManagedOnlineDeployment,
    ProbeSettings,
)

# Make the endpoint name unique
timestamp = int(time.time())
online_endpoint_name = "customize your endpoint name here" + str(timestamp)

# Create an online endpoint
endpoint = ManagedOnlineEndpoint(
    name=online_endpoint_name,
    auth_mode="key",
)
workspace_ml_client.begin_create_or_update(endpoint).wait()

デプロイを作成します。モデル ID はモデルカタログで確認できます。

model_name = "azureml://registries/azureml/models/deepset-roberta-base-squad2/versions/16" 

demo_deployment = ManagedOnlineDeployment(
    name="demo",
    endpoint_name=online_endpoint_name,
    model=model_name,
    instance_type="Standard_DS3_v2",
    instance_count=2,
    liveness_probe=ProbeSettings(
        failure_threshold=30,
        success_threshold=1,
        timeout=2,
        period=10,
        initial_delay=1000,
    ),
    readiness_probe=ProbeSettings(
        failure_threshold=10,
        success_threshold=1,
        timeout=10,
        period=10,
        initial_delay=1000,
    ),
)
workspace_ml_client.online_deployments.begin_create_or_update(demo_deployment).wait()
endpoint.traffic = {"demo": 100}
workspace_ml_client.begin_create_or_update(endpoint).result()

デプロイを推論する

推論をテストするには、サンプル json データが必要です。次の例を参考にして、sample_score.json を作成します。

{
  "inputs": {
    "question": [
      "Where do I live?",
      "Where do I live?",
      "What's my name?",
      "Which name is also used to describe the Amazon rainforest in English?"
    ],
    "context": [
      "My name is Wolfgang and I live in Berlin",
      "My name is Sarah and I live in London",
      "My name is Clara and I live in Berkeley.",
      "The Amazon rainforest (Portuguese: Floresta Amaz\u00f4nica or Amaz\u00f4nia; Spanish: Selva Amaz\u00f3nica, Amazon\u00eda or usually Amazonia; French: For\u00eat amazonienne; Dutch: Amazoneregenwoud), also known in English as Amazonia or the Amazon Jungle, is a moist broadleaf forest that covers most of the Amazon basin of South America. This basin encompasses 7,000,000 square kilometres (2,700,000 sq mi), of which 5,500,000 square kilometres (2,100,000 sq mi) are covered by the rainforest. This region includes territory belonging to nine nations. The majority of the forest is contained within Brazil, with 60% of the rainforest, followed by Peru with 13%, Colombia with 10%, and with minor amounts in Venezuela, Ecuador, Bolivia, Guyana, Suriname and French Guiana. States or departments in four nations contain \"Amazonas\" in their names. The Amazon represents over half of the planet's remaining rainforests, and comprises the largest and most biodiverse tract of tropical rainforest in the world, with an estimated 390 billion individual trees divided into 16,000 species."
    ]
  }
}

sample_score.json を使用して推論してみましょう。サンプル json ファイルを保存した場所に基づいて場所を変更します。

scoring_file = "./sample_score.json" 
response = workspace_ml_client.online_endpoints.invoke(
    endpoint_name=online_endpoint_name,
    deployment_name="demo",
    request_file=scoring_file,
)
response_json = json.loads(response)
print(json.dumps(response_json, indent=2))

デプロイエンドポイントを削除する

AI Studio でデプロイを削除するには、デプロイの詳細ページの上部パネルにある [削除] ボタンを選択します。

クォータの考慮事項

リアルタイムエンドポイントを使用した推論のデプロイと実行には、リージョンごとにサブスクリプションに割り当てられている仮想マシン (VM) コアクォータを使用します。 AI Studio にサインアップすると、リージョンで使用可能な複数の VM ファミリに対する既定の VM クォータを受け取ります。クォータ制限に達するまで、デプロイを作成し続けることができます。その後は、クォータの引き上げを要求できます。

次のステップ

AI Studio でできることについて、詳細を確認します
Azure AI の FAQ の記事で、よくあるご質問とその回答を確認します

次の方法で共有

Azure AI Studio を使用して大規模言語モデルをデプロイする方法

コードを使用してサーバーレス API モデルをデプロイして推論する

モデルのデプロイ

モデル ID を取得する

Azure Machine Learning SDK をインストールする

サーバーレス API モデルをデプロイする

サーバーレス API エンドポイントとキーを取得する

デプロイを推論する

コードを使用してマネージドコンピューティングデプロイをデプロイおよび推論する

モデルのデプロイ

モデル ID を取得する

Azure Machine Learning SDK をインストールする

モデルをデプロイする

デプロイを推論する

デプロイエンドポイントを削除する

クォータの考慮事項

次のステップ

その他のリソース

次の方法で共有

Azure AI Studio を使用して大規模言語モデルをデプロイする方法

コードを使用してサーバーレス API モデルをデプロイして推論する

モデルのデプロイ

モデル ID を取得する

Azure Machine Learning SDK をインストールする

サーバーレス API モデルをデプロイする

サーバーレス API エンドポイントとキーを取得する

デプロイを推論する

コードを使用してマネージド コンピューティング デプロイをデプロイおよび推論する

モデルのデプロイ

モデル ID を取得する

Azure Machine Learning SDK をインストールする

モデルをデプロイする

デプロイを推論する

デプロイ エンドポイントを削除する

クォータの考慮事項

次のステップ

その他のリソース

コードを使用してマネージドコンピューティングデプロイをデプロイおよび推論する

デプロイエンドポイントを削除する