Train models with PyTorch in Microsoft Fabric

Article
10/15/2024

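This article describes how to train a PyTorch model and track its iterations. The PyTorch machine learning framework is based on the Torch library. PyTorch is commonly used for computer vision and natural language processing applications.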

Prerequisites

Install PyTorch and Torchvision in your notebook. You can install or upgrade these libraries in your environment with the following command:

pip install torch torchvision
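
After the installation, it can help to confirm which versions the notebook environment actually sees. The following is a minimal sketch that uses only the Python standard library; the helper name `installed_version` is hypothetical and isn't part of PyTorch or Fabric.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# Report the state of the two libraries this article depends on.
for name in ("torch", "torchvision"):
    found = installed_version(name)
    print(f"{name}: {found if found is not None else 'not installed'}")
```

If either library is reported as not installed, rerun the install command before you continue.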

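Set up the machine learning experiment

You can create a machine learning experiment with the MLflow API. The MLflow set_experiment() function creates a new machine learning experiment named sample-pytorch if one doesn't already exist.

Run the following code in your notebook to create the experiment: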

import mlflow

mlflow.set_experiment("sample-pytorch")

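Train and evaluate a PyTorch model

After you set up the experiment, load the Modified National Institute of Standards and Technology (MNIST) dataset. Generate the training and test datasets, and then define and train the model.

Run the following code in your notebook to train the PyTorch model: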

import os
import torch
import torch.nn as nn
from torch.autograd import Variable
import torchvision.datasets as dset
import torchvision.transforms as transforms
import torch.nn.functional as F
import torch.optim as optim

# Load the MNIST dataset
root = "/tmp/mnist"
# Create the data directory (and any missing parents) if it doesn't exist
os.makedirs(root, exist_ok=True)

trans = transforms.Compose(
    [transforms.ToTensor(), transforms.Normalize((0.5,), (1.0,))]
)

# If the data doesn't exist, download the MNIST dataset
train_set = dset.MNIST(root=root, train=True, transform=trans, download=True)
test_set = dset.MNIST(root=root, train=False, transform=trans, download=True)

batch_size = 100

train_loader = torch.utils.data.DataLoader(
    dataset=train_set, batch_size=batch_size, shuffle=True
)
test_loader = torch.utils.data.DataLoader(
    dataset=test_set, batch_size=batch_size, shuffle=False
) 

print("==>>> total training batch number: {}".format(len(train_loader)))
print("==>>> total testing batch number: {}".format(len(test_loader)))

# Define the network
class LeNet(nn.Module):
    def __init__(self):
        super(LeNet, self).__init__()
        self.conv1 = nn.Conv2d(1, 20, 5, 1)
        self.conv2 = nn.Conv2d(20, 50, 5, 1)
        self.fc1 = nn.Linear(4 * 4 * 50, 500)
        self.fc2 = nn.Linear(500, 10)

    def forward(self, x): 
        x = F.relu(self.conv1(x))
        x = F.max_pool2d(x, 2, 2)
        x = F.relu(self.conv2(x))
        x = F.max_pool2d(x, 2, 2)
        x = x.view(-1, 4 * 4 * 50)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return x

    def name(self):
        return "LeNet"

# Train the model
model = LeNet()

optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

criterion = nn.CrossEntropyLoss()

for epoch in range(1):
    # Model training
    model.train()
    ave_loss = 0
    for batch_idx, (x, target) in enumerate(train_loader):
        optimizer.zero_grad()
        out = model(x)
        loss = criterion(out, target)
        # Running mean of the loss over the batches seen so far
        ave_loss = (ave_loss * batch_idx + loss.item()) / (batch_idx + 1)
        loss.backward()
        optimizer.step()
        if (batch_idx + 1) % 100 == 0 or (batch_idx + 1) == len(train_loader):
            print(
                "==>>> epoch: {}, batch index: {}, train loss: {:.6f}".format(
                    epoch, batch_idx + 1, ave_loss
                )
            )
    # Model testing
    model.eval()
    correct_cnt, total_cnt, ave_loss = 0, 0, 0
    # Gradients aren't needed for evaluation. torch.no_grad() replaces the
    # removed Variable(..., volatile=True) idiom.
    with torch.no_grad():
        for batch_idx, (x, target) in enumerate(test_loader):
            out = model(x)
            loss = criterion(out, target)
            _, pred_label = torch.max(out, 1)
            total_cnt += x.size(0)
            correct_cnt += (pred_label == target).sum().item()
            ave_loss = (ave_loss * batch_idx + loss.item()) / (batch_idx + 1)

            if (batch_idx + 1) % 100 == 0 or (batch_idx + 1) == len(test_loader):
                print(
                    "==>>> epoch: {}, batch index: {}, test loss: {:.6f}, acc: {:.3f}".format(
                        epoch, batch_idx + 1, ave_loss, correct_cnt / total_cnt
                    )
                )

torch.save(model.state_dict(), model.name())
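
The input size of the first fully connected layer, 4 * 4 * 50, follows from how each convolution and pooling step shrinks a 28 x 28 MNIST image. The sketch below retraces that arithmetic in plain Python. It assumes no padding, which matches the Conv2d and max_pool2d calls in the network, and the helper name `conv_out` is hypothetical.

```python
def conv_out(size, kernel, stride=1):
    """Output width/height of a convolution or pooling layer without padding."""
    return (size - kernel) // stride + 1


size = 28                                   # MNIST images are 28 x 28 pixels
size = conv_out(size, kernel=5)             # conv1: 5 x 5 kernel, stride 1 -> 24
size = conv_out(size, kernel=2, stride=2)   # max pool: 2 x 2, stride 2 -> 12
size = conv_out(size, kernel=5)             # conv2: 5 x 5 kernel, stride 1 -> 8
size = conv_out(size, kernel=2, stride=2)   # max pool: 2 x 2, stride 2 -> 4

channels = 50                               # output channels of conv2
flattened = size * size * channels
print(size, flattened)  # prints "4 800"
```

If you change a kernel size or add padding, the value passed to the first Linear layer and to view() has to change accordingly.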

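Log the model with MLflow

The next task starts an MLflow run and tracks the results in the machine learning experiment. The sample code registers a new model named sample-pytorch. It creates a run with the specified parameters and logs the run in the sample-pytorch experiment.

Run the following code in your notebook to log the model: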

with mlflow.start_run() as run:
    print("log pytorch model:")
    mlflow.pytorch.log_model(
        model, "pytorch-model", registered_model_name="sample-pytorch"
    )

    model_uri = f"runs:/{run.info.run_id}/pytorch-model"
    print(f"Model saved in run {run.info.run_id}")
    print(f"Model URI: {model_uri}")
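
The model URI built above has the form runs:/<run ID>/<artifact path>, where the artifact path is the second argument passed to log_model. As a rough illustration of that layout, the sketch below builds and splits such a URI with plain string handling. The helper names and the run ID are made up for the example and aren't part of the MLflow API.

```python
def build_runs_uri(run_id, artifact_path):
    """Build a runs:/ URI for an artifact logged under the given run."""
    return f"runs:/{run_id}/{artifact_path}"


def split_runs_uri(uri):
    """Split a runs:/ URI into (run_id, artifact_path)."""
    prefix = "runs:/"
    if not uri.startswith(prefix):
        raise ValueError(f"not a runs:/ URI: {uri!r}")
    run_id, _, artifact_path = uri[len(prefix):].partition("/")
    return run_id, artifact_path


# The run ID below is invented; a real one comes from run.info.run_id.
example_uri = build_runs_uri("0123456789abcdef", "pytorch-model")
print(example_uri)
print(split_runs_uri(example_uri))
```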

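Load and evaluate the model

After you save the model, you can load it for inference.

Run the following code in your notebook to load the model and run inference: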

# Inference with loading the logged model
loaded_model = mlflow.pytorch.load_model(model_uri)
print(type(loaded_model))

loaded_model.eval()
correct_cnt, total_cnt, ave_loss = 0, 0, 0
# Gradients aren't needed for inference. torch.no_grad() replaces the
# removed Variable(..., volatile=True) idiom.
with torch.no_grad():
    for batch_idx, (x, target) in enumerate(test_loader):
        out = loaded_model(x)
        loss = criterion(out, target)
        _, pred_label = torch.max(out, 1)
        total_cnt += x.size(0)
        correct_cnt += (pred_label == target).sum().item()
        ave_loss = (ave_loss * batch_idx + loss.item()) / (batch_idx + 1)

        if (batch_idx + 1) % 100 == 0 or (batch_idx + 1) == len(test_loader):
            print(
                "==>>> batch index: {}, test loss: {:.6f}, acc: {:.3f}".format(
                    batch_idx + 1, ave_loss, correct_cnt / total_cnt
                )
            )
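
The evaluation loop updates ave_loss incrementally instead of storing every batch loss. As a small check that this update yields the ordinary arithmetic mean, the sketch below applies the same formula to a few made-up loss values. The function name `update_running_mean` is hypothetical.

```python
import math


def update_running_mean(mean, index, value):
    """Fold the value at a zero-based index into the mean of all earlier values."""
    return (mean * index + value) / (index + 1)


losses = [0.9, 0.7, 0.5, 0.3]  # made-up batch losses for illustration
running = 0.0
for batch_idx, loss in enumerate(losses):
    running = update_running_mean(running, batch_idx, loss)

# The incremental result agrees with the mean computed in one pass.
print(math.isclose(running, sum(losses) / len(losses)))
```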

