Packaging and deploying ML models

Model release is the process of packaging the latest (and ideally best performing) model from the training pipeline and promoting it through to the production environment.

Model packaging options

Model processing pipeline for generating a Docker image

After a data scientist has selected a model through experimentation, it's time to deploy it to an environment. Although it is possible to deploy a model and its artifacts directly to an environment, a better practice is to build a Docker image containing the model artifacts and then run containers based on that image. Packaging the model in a Docker image gives more flexibility to validate it, including security scanning, smoke testing, and publishing the image to a container registry.
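
As an illustration of what goes into such an image, here is a minimal sketch of a scoring entry script following the `init()`/`run()` convention used by the Azure ML inference server; the `model.pkl` artifact name, the use of scikit-learn/joblib, and the input schema are assumptions.

```python
# score.py - minimal scoring entry script baked into the model's Docker image.
# Assumes a scikit-learn model serialized as model.pkl; names and schema are illustrative.
import json
import os

import joblib

model = None


def init():
    """Called once when the container starts: load the model artifact."""
    global model
    # AZUREML_MODEL_DIR points to the folder where the model artifacts are mounted.
    model_path = os.path.join(os.environ.get("AZUREML_MODEL_DIR", "."), "model.pkl")
    model = joblib.load(model_path)


def run(raw_data):
    """Called for each scoring request: parse the payload and return predictions."""
    data = json.loads(raw_data)["data"]
    predictions = model.predict(data)
    return {"predictions": predictions.tolist()}
```

A smoke test can then start a container from the image locally and post a known payload to the scoring endpoint before the image is published to the container registry.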

Model promotion using shared registries

Currently in preview, Machine Learning registries for MLOps in Azure ML allow you to register a model once and easily retrieve it across multiple workspaces (including workspaces in different subscriptions). Instead of copying a model into the workspace of each deployment environment, each of those workspaces can refer to the same registry. Then, as a model is tested, its tags are updated to reflect its status. Models ready for production can be deployed directly from the shared registry to the production environment.
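
A minimal sketch of this promotion flow with the Azure ML Python SDK v2 (`azure-ai-ml`) follows; the registry, model, workspace, and endpoint names are assumptions, and the sketch assumes the managed online endpoint already exists in the production workspace.

```python
# promote_via_registry.py - sketch: register a model in a shared registry and
# deploy it from there to a production workspace. All resource names are illustrative.
from azure.identity import DefaultAzureCredential
from azure.ai.ml import MLClient
from azure.ai.ml.entities import Model, ManagedOnlineDeployment

credential = DefaultAzureCredential()

# Client scoped to the shared registry rather than to a single workspace.
registry_client = MLClient(credential=credential, registry_name="mlops-shared-registry")

# Register the candidate model once, in the shared registry.
model = registry_client.models.create_or_update(
    Model(
        name="fraud-model",
        path="./artifacts/model",      # local folder with the trained model files
        description="Candidate model from the latest training run",
        tags={"status": "staging"},    # updated as the model passes tests
    )
)

# Any workspace (dev, test, prod) can deploy by referencing the registry URI
# instead of keeping its own copy of the model.
workspace_client = MLClient(
    credential=credential,
    subscription_id="<subscription-id>",
    resource_group_name="<resource-group>",
    workspace_name="prod-workspace",
)
deployment = ManagedOnlineDeployment(
    name="blue",
    endpoint_name="fraud-endpoint",    # assumed to exist already
    model=f"azureml://registries/mlops-shared-registry/models/fraud-model/versions/{model.version}",
    instance_type="Standard_DS3_v2",
    instance_count=1,
)
workspace_client.online_deployments.begin_create_or_update(deployment)
```

Because the deployment references an `azureml://registries/...` URI, the production workspace never needs its own copy of the model; promotion becomes a matter of updating tags and pointing the deployment at the chosen version.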

Model release example: Finding the best performing AutoML model during the build process

The following example illustrates how to deploy a solution that uses Azure ML's AutoML functionality to:

  • look for the best performing model
  • determine whether more data has been labeled and a new training run has been initiated
  • download the best performing model for serving

Performing unit and smoke tests is recommended to ensure any new model performs as expected.
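
Below is a minimal sketch of the "find the best model and download it" step, assuming the AutoML child runs are logged to the workspace's MLflow tracking server, that the tracking URI is already configured, and that the primary metric is `AUC_weighted`; the experiment name, metric, and artifact path are assumptions.

```python
# find_best_model.py - sketch: locate the best finished AutoML run and download its model.
# Assumes the MLflow tracking URI points at the Azure ML workspace and the primary
# metric is AUC_weighted; experiment name and paths are illustrative.
import mlflow
from mlflow.tracking import MlflowClient

EXPERIMENT_NAME = "automl-video-classification"  # hypothetical experiment name

client = MlflowClient()
experiment = client.get_experiment_by_name(EXPERIMENT_NAME)

# Order finished runs by the primary metric and take the best one.
best_run = client.search_runs(
    experiment_ids=[experiment.experiment_id],
    filter_string="attributes.status = 'FINISHED'",
    order_by=["metrics.AUC_weighted DESC"],
    max_results=1,
)[0]

# Download the model artifacts so they can be smoke tested and packaged for serving.
local_path = mlflow.artifacts.download_artifacts(
    run_id=best_run.info.run_id,
    artifact_path="outputs/model",
    dst_path="./downloaded_model",
)
print(f"Best run {best_run.info.run_id} downloaded to {local_path}")
```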

This example is part of a larger end-to-end solution: "Building a custom video search experience using Azure Video Indexer for Media, Azure Machine Learning and Azure AI Search".

Refer to the Azure ML AutoML containerized API for an example of automated model release.

Model release example: How to manage a Form Recognizer (FR) model

  • When you submit a request to train a custom model, FR generates a unique model id to refer to that training attempt. FR expects you to track this model id and pass it whenever you need to use or manage the trained custom model. In addition to the model id, FR tracks when the model was trained, the training duration, and the status of the model (for example, ready for inference or pending training).

  • As a user, you need to create a custom solution to track your model ids. Because the model id is an opaque UUID, it is beneficial to track additional information alongside it, for example a description of the model, readable labels, an audit trail of how the model came to exist, and previous versions of the model (see the sketch after this list). Note that FR does not support retraining an existing model; it generates a new model with a new model id for each training request, so each version of your model has a different model id to track.

  • FR has a maximum number of models it can persist at any given time. Training attempts count towards this limit regardless of whether training succeeds. You can identify your FR instance limit via the management APIs. Depending on your use case, you may need to manage which models to keep in your FR instance and which to archive or remove. It can be beneficial to persist recent models so that you can roll back to an earlier model if needed; for archived models, the audit trail can persist the information required to retrain them. If your use case justifiably requires persisting a large number of models, the model limit can be increased by contacting Azure customer support; evaluate the capacity you need before engaging with support.
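
As a starting point for that bookkeeping, here is a minimal sketch using the `azure-ai-formrecognizer` Python SDK to check the instance's custom model quota and enumerate existing models so their ids and metadata can be copied into your own tracking store; the endpoint, key, and the external tracking store itself are assumptions.

```python
# fr_model_inventory.py - sketch: inspect Form Recognizer model quota and custom models.
# The endpoint/key values are placeholders; where the inventory is recorded is up to
# your own tracking solution.
from azure.ai.formrecognizer import DocumentModelAdministrationClient
from azure.core.credentials import AzureKeyCredential

endpoint = "https://<your-fr-resource>.cognitiveservices.azure.com/"
credential = AzureKeyCredential("<your-fr-key>")

admin_client = DocumentModelAdministrationClient(endpoint, credential)

# How close the instance is to its custom model limit.
details = admin_client.get_resource_details()
print(
    f"Custom models: {details.custom_document_models.count}"
    f" of {details.custom_document_models.limit}"
)

# Enumerate existing models so their ids and metadata can be written to your own
# tracking solution (description, readable labels, audit trail, previous versions).
for summary in admin_client.list_document_models():
    print(summary.model_id, summary.created_on, summary.description)

# When the limit is approached, archived or obsolete models can be removed explicitly.
# admin_client.delete_document_model("<model-id-to-remove>")
```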

For assets and guidance on Form Recognizer model management and more, refer to the Playbook for Knowledge Extraction For Forms Accelerators & Examples.