Good day.
Thank you for sharing your architecture.
Here are the answers to your queries.
- How does my on-premise SQL database receive the results of the AutoML batch job?
  Prediction results are normally stored in the same datastore unless you explicitly specify otherwise. You can add an Azure Data Factory (ADF) activity to save the prediction results back to SQL.
- How is the job scheduled, and how is data passed to the AutoML cluster for batching?
  You can use a cron schedule to trigger the pipeline at certain hours (see "Schedule Jobs"). Please note that AutoML needs its data in MLTable format, so you need to convert the data to MLTable beforehand. I am not sure whether the batch size can be set on the AutoML side, but it is possible via the scoring script in batch endpoint jobs. Sample code that processes mini-batches:
import glob
import os
from os.path import basename
from typing import List

import pandas as pd
import torch
import torchvision

from mnist_classifier import MnistClassifier


def init():
    global model
    global device

    # AZUREML_MODEL_DIR is an environment variable created during deployment.
    # It is the path to the model folder.
    model_path = os.environ["AZUREML_MODEL_DIR"]
    model_file = glob.glob(f"{model_path}/*/*.pt")[-1]

    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

    model = MnistClassifier()
    # map_location lets weights saved on a GPU load on a CPU-only node.
    model.load_state_dict(torch.load(model_file, map_location=device))
    # Move the model to the same device as the inputs.
    model.to(device)
    model.eval()


def run(mini_batch: List[str]) -> pd.DataFrame:
    print(f"Executing run method over batch of {len(mini_batch)} files.")

    results = []
    with torch.no_grad():
        for image_path in mini_batch:
            image_data = torchvision.io.read_image(image_path).float()
            batch_data = image_data.expand(1, -1, -1, -1)
            batch_input = batch_data.to(device)

            # Perform inference
            predict_logits = model(batch_input)

            # Compute probabilities, classes and labels
            predictions = torch.nn.Softmax(dim=-1)(predict_logits)
            predicted_prob, predicted_class = torch.max(predictions, dim=-1)

            results.append(
                {
                    "file": basename(image_path),
                    # .cpu() is required before .numpy() when running on a GPU
                    "class": predicted_class.cpu().numpy()[0],
                    "probability": predicted_prob.cpu().numpy()[0],
                }
            )

    return pd.DataFrame(results)
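For the MLTable conversion mentioned above, here is a minimal sketch of one way to wrap an existing CSV file in an MLTable definition. It only writes the `MLTable` YAML file next to the data, using the standard library. The helper name `write_mltable`, the file name `train.csv`, and the chosen delimiter and encoding are assumptions for illustration, so please check them against your own data and the current MLTable schema before using this.

```python
import tempfile
from pathlib import Path

# Assumed MLTable definition for a single comma-delimited CSV with a header row.
# Adjust delimiter/encoding/header to match your actual data.
MLTABLE_TEMPLATE = """\
paths:
  - file: ./{csv_name}
transformations:
  - read_delimited:
      delimiter: ','
      encoding: utf8
      header: all_files_same_headers
"""


def write_mltable(folder: Path, csv_name: str) -> Path:
    """Write an MLTable file into `folder` that points at `csv_name`.

    Hypothetical helper: the CSV must already exist in `folder`.
    """
    csv_path = folder / csv_name
    if not csv_path.is_file():
        raise FileNotFoundError(f"Expected data file not found: {csv_path}")

    # The definition file must be named exactly "MLTable" (no extension).
    mltable_path = folder / "MLTable"
    mltable_path.write_text(
        MLTABLE_TEMPLATE.format(csv_name=csv_name), encoding="utf-8"
    )
    return mltable_path


if __name__ == "__main__":
    # Demo in a temporary folder with a tiny placeholder CSV.
    demo_dir = Path(tempfile.mkdtemp())
    (demo_dir / "train.csv").write_text("feature,label\n1,0\n", encoding="utf-8")
    print(write_mltable(demo_dir, "train.csv").read_text(encoding="utf-8"))
```

The folder that contains both the CSV and the `MLTable` file is then what you would register or pass as the `mltable` input of the AutoML job.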
Thank you.