What are web services in Machine Learning Server?

Important

This content is being retired and may not be updated in the future. The support for Machine Learning Server will end on July 1, 2022. For more information, see What's happening to Machine Learning Server?

Applies to: Machine Learning Server

In Machine Learning Server, a web service is an R or Python code execution on the operationalization compute node.

Data scientists can deploy R and Python code and models as web services into Machine Learning Server to give other users a chance to use their code and predictive models. Once hosted there, these web services are exposed and available for consumption.

Web services can be consumed directly in R or Python, programmatically using REST APIs, or via Swagger-generated client libraries. They can be consumed synchronously, in real-time, or in batch mode. They can also be deployed from one platform and consumed on another.
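As a rough illustration of the REST route, the sketch below builds (but does not send) a request for one synchronous call to a service. The host, the service name and version, the token, the input names, and the `/api/<name>/<version>` path are all assumptions made for the example; the Swagger file generated for your own service is the authority on the exact route and payload.

```python
import json
import urllib.request
from urllib.parse import quote


def build_consume_request(host, name, version, inputs, token):
    """Build a POST request for one synchronous call to a web service.

    The URL layout and the bearer-token header are assumptions for illustration.
    """
    url = f"{host.rstrip('/')}/api/{quote(name, safe='')}/{quote(version, safe='')}"
    body = json.dumps(inputs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


# Hypothetical values; nothing is sent over the network here.
request = build_consume_request(
    "http://localhost:12800",
    "TransmissionService",
    "v1.0.0",
    {"hp": 120, "wt": 2.8},
    token="<access-token>",
)
```

Passing `request` to `urllib.request.urlopen` would perform the call against a real server; that step is left out so the sketch runs without one.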

Web services facilitate the consumption and integration of the operationalized models and code they contain. Once you've built a predictive model, in many cases the next step is to operationalize the model, that is, to generate predictions from the pre-trained model on demand. In this scenario, where new data often become available one row at a time, latency becomes the critical metric: the service must return the single prediction (or score) as quickly as possible.

Each web service is uniquely defined by its name and version. You can use the functions in the mrsdeploy R package or the azureml-model-management-sdk Python package to gain access to a service's lifecycle from an R or Python script.

Requirement! Before you can deploy and work with web services, you must have access to a Machine Learning Server instance configured to host web services.

There are two types of web services: standard and real-time.

Standard web services

These web services offer fast execution and scoring of arbitrary Python or R code and models. They can contain code, models, and model assets. They can also take specific inputs and provide specific outputs for those users who are integrating the services inside their applications.

Standard web services, like all web services, are identified by their name and version. They are further defined by their Python or R code, their models, and any necessary model assets. When deploying a standard web service, you should also define the required inputs and any outputs that application developers use to integrate the service in their applications.
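To make the pieces of a standard web service concrete, here is a minimal local sketch of what such a definition carries: a name, a version, a code function, and declared inputs and outputs. `StandardServiceSpec` and `manual_transmission` are hypothetical names invented for this example; they are not part of mrsdeploy or azureml-model-management-sdk, and nothing is deployed.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class StandardServiceSpec:
    """Hypothetical local stand-in for what a standard service declares."""

    name: str
    version: str
    code_fn: Callable[..., dict]
    inputs: Dict[str, type]
    outputs: Dict[str, type]

    def call(self, **kwargs):
        # Reject calls whose argument names differ from the declared inputs.
        if set(kwargs) != set(self.inputs):
            raise TypeError(f"expected inputs {sorted(self.inputs)}")
        # Reject values that do not have the declared type.
        for key, expected in self.inputs.items():
            if not isinstance(kwargs[key], expected):
                raise TypeError(f"input {key!r} must be {expected.__name__}")
        result = self.code_fn(**kwargs)
        # Expose only the declared outputs to the caller.
        return {key: result[key] for key in self.outputs}


def manual_transmission(hp, wt):
    # Toy scoring rule used only for illustration.
    score = 1.0 if hp / wt > 50 else 0.0
    return {"answer": score, "debug": "not exposed"}


spec = StandardServiceSpec(
    name="TransmissionService",
    version="v1.0.0",
    code_fn=manual_transmission,
    inputs={"hp": float, "wt": float},
    outputs={"answer": float},
)
result = spec.call(hp=200.0, wt=2.5)
```

Declaring inputs and outputs up front is what lets an application developer call the service without reading the code behind it.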

See a standard web service deployment example: R | Python

Real-time web services

Real-time web services do not support arbitrary code and only accept models created with the supported functions from packages installed with the product. See the following sections for the list of supported functions by language and package.

Real-time web services offer even lower latency, so they produce results faster and can score more models in parallel. The performance gain comes from the fact that these web services do not depend on an interpreter at consumption time, even though the services use the objects created by the model. As a result, fewer resources and less time are spent spinning up a session for each call. Additionally, the model is loaded only once in the compute node and can then be scored multiple times.
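The "load once, score many times" idea can be sketched in a few lines. This is only an analogy written in Python: an actual real-time service does not run an interpreter at consumption time, and the dictionary used as a "model", along with the names `load_model` and `score`, are assumptions for the example.

```python
import pickle
from functools import lru_cache

load_calls = 0  # counts how often the model is actually deserialized


@lru_cache(maxsize=None)
def load_model(serialized: bytes):
    """Deserialize a model once; later calls reuse the cached object."""
    global load_calls
    load_calls += 1
    return pickle.loads(serialized)


def score(serialized: bytes, rows):
    model = load_model(serialized)  # cache hit after the first call
    return [model["intercept"] + model["slope"] * x for x in rows]


# Toy "model": a dict of linear coefficients, serialized up front.
blob = pickle.dumps({"intercept": 1.0, "slope": 2.0})
first = score(blob, [0.0, 1.0])
second = score(blob, [2.0])
```

After both scoring calls, `load_calls` is still 1: the cost of loading is paid once rather than per request.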

For real-time services, you do not need to specify:

  • inputs and outputs (dataframes are assumed)
  • code (only serialized models are supported)

See real-time web service deployment examples: R | Python

Supported R functions for real time

Real-time web services accept a model object created with one of these supported functions:

R package      Supported functions
RevoScaleR     rxBTrees, rxDTree, rxDForest, rxLogit, rxLinMod
MicrosoftML    Machine learning and transform tasks: rxFastTrees, rxFastForest, rxLogisticRegression, rxOneClassSvm, rxNeuralNet, rxFastLinear, featurizeText, concat, categorical, categoricalHash, selectFeatures, featurizeImage, getSentiment, loadImage, resizeImage, extractPixels, selectColumns, and dropColumns

While mlTransform featurization is supported in real-time scoring, R transforms are not supported. Instead, use sp_execute_external_script.

There are additional restrictions on the input dataframe format for microsoftml models:

  1. The dataframe must have the same number of columns as the formula specified for the model.

  2. The columns of the dataframe must be in the exact same order as in the formula specified for the model.

  3. The columns must be of the same data type as the training data. Type casting is not possible.
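The three restrictions above lend themselves to a pre-flight check before data is sent for scoring. The following is a minimal sketch that works on plain lists of column names and type names rather than on a real dataframe; the function name, the argument layout, and the sample columns are assumptions for illustration.

```python
def check_input_frame(formula_columns, training_types, frame_columns, frame_types):
    """Return a list of violations of the three input restrictions."""
    problems = []
    # 1. Same number of columns as the formula.
    if len(frame_columns) != len(formula_columns):
        problems.append("column count differs from the formula")
    # 2. Same columns in the same order as the formula.
    elif list(frame_columns) != list(formula_columns):
        problems.append("column names or order differ from the formula")
    # 3. Same data type as the training data (no casting is attempted).
    for name, dtype in zip(frame_columns, frame_types):
        expected = training_types.get(name)
        if expected is not None and dtype != expected:
            problems.append(f"column {name!r} is {dtype}, expected {expected}")
    return problems


formula = ["age", "income"]
types = {"age": "float64", "income": "float64"}

ok = check_input_frame(formula, types, ["age", "income"], ["float64", "float64"])
bad = check_input_frame(formula, types, ["income", "age"], ["float64", "int64"])
```

Here `ok` is empty, while `bad` reports both the reordered columns and the `int64` column that would need casting.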

Supported Python functions for real time

Python package    Supported functions
revoscalepy       rx_btrees, rx_dforest, rx_dtree, rx_logit, rx_lin_mod
microsoftml       Machine learning and transform tasks: categorical, categorical_hash, concat, extract_pixels, featurize_text, featurize_image, get_sentiment, rx_fast_trees, rx_fast_forest, rx_fast_linear, rx_logistic_regression, rx_neural_network, rx_oneclass_svm, load_image, resize_image, select_columns, and drop_columns

See the preceding input dataframe format restrictions.

Versioning

Every time a web service is published, a version is assigned to the web service. Versioning enables users to better manage the release of their web services and helps the people consuming your service to find it easily.

At publish time, specify an alphanumeric string that is meaningful to those users who consume the service. For example, you could use '2.0', 'v1.0.0', 'v1.0.0-alpha', or 'test-1'. Meaningful versions are helpful when you intend to share services with others. We highly recommend a consistent and meaningful versioning convention across your organization or team such as semantic versioning. Learn more about semantic versioning here: http://semver.org/.

If you do not specify a version, a globally unique identifier (GUID) is automatically assigned. GUIDs are long, which makes them harder to remember and use.
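The versioning behavior described above can be imitated locally as follows. This is a sketch, not the server's implementation: the function name `resolve_version` and the validation pattern (letters, digits, dots, and hyphens) are assumptions chosen so that the example strings from this section pass.

```python
import re
import uuid

# Assumed pattern: starts with a letter or digit, then letters, digits, '.' or '-'.
VERSION_PATTERN = re.compile(r"^[A-Za-z0-9][A-Za-z0-9.\-]*$")


def resolve_version(version=None):
    """Return the given version, or a GUID string when none is supplied."""
    if version is None:
        return str(uuid.uuid4())  # long and unique, but hard to remember
    if not VERSION_PATTERN.match(version):
        raise ValueError(f"not a usable version string: {version!r}")
    return version


named = resolve_version("v1.0.0-alpha")
generated = resolve_version()
```

Comparing `named` with `generated` shows why a meaningful version is easier to share: the first is a short label, the second is a 36-character identifier.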

Who consumes web services

After a web service has been published, authenticated users can consume that web service on various platforms and in various languages. You can consume directly in R or Python, using APIs, or in your preferred language via Swagger.

You can make it easy for others to find your web services by providing them with the name and version of the web service. Typical consumers include:

  • Data scientists who want to explore and consume the services directly in R and in Python.

  • Quality engineers who want to bring the models in these web services into validation and monitoring cycles.

  • Application developers who want to call and integrate a web service into their applications. Developers can generate client libraries for integration using the Swagger-based JSON file generated during service deployment. Read "How to integrate web services and authentication into your application" for more details. Services can also be consumed using the RESTful APIs that provide direct programmatic access to a service's lifecycle.

How are web services consumed

Web services can be consumed using one of these approaches:

Approach              Description
Request Response      The service is consumed directly using a single synchronous consumption call. Learn how in R | in Python
Asynchronous Batch    Users send a single asynchronous request to the server, which in turn makes multiple service calls on their behalf. Learn how in R
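The difference between the two approaches can be sketched locally with a thread pool standing in for the server. The `consume` function and its scoring rule are hypothetical, and no real service is called; the point is only that a request-response call blocks for one result, while a batch returns a handle immediately and fans out into one call per record.

```python
from concurrent.futures import ThreadPoolExecutor


def consume(record):
    """Stand-in for one synchronous service call (hypothetical scoring rule)."""
    return {"answer": record["hp"] / record["wt"]}


executor = ThreadPoolExecutor(max_workers=2)


def start_batch(records):
    """Return immediately with a handle; the per-record calls run in the background."""
    return executor.submit(lambda: [consume(r) for r in records])


records = [{"hp": 100.0, "wt": 2.0}, {"hp": 90.0, "wt": 3.0}]

# Request-response: one call, one result, the caller waits for it.
single = consume(records[0])

# Asynchronous batch: one submission, many calls made on the caller's behalf.
handle = start_batch(records)
results = handle.result()  # collect the results once the batch has finished

executor.shutdown()
```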

Permissions

By default, any authenticated Machine Learning Server user can:

  • Publish a new service
  • Update and delete web services they have published
  • Retrieve any web service object for consumption
  • Retrieve a list of any or all web services

Destructive tasks, such as deleting a web service, are available only to the user who initially created the service. However, your administrator can also assign role-based authorization to further control the permissions around web services. When you list services, you can see your role for each one of them.
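The default rules above can be summarized as a small decision function. This sketch models only the defaults; role-based authorization configured by an administrator is not represented, and the action names and the `is_allowed` helper are assumptions for the example.

```python
def is_allowed(user, action, service_owner=None):
    """Apply the default permission rules for an authenticated user."""
    if user is None:
        return False  # unauthenticated callers get nothing
    if action in {"publish", "get", "list"}:
        return True  # open to any authenticated user
    if action in {"update", "delete"}:
        return user == service_owner  # only the user who published the service
    return False  # unknown actions are denied


checks = {
    "alice publishes": is_allowed("alice", "publish"),
    "bob gets alice's service": is_allowed("bob", "get", service_owner="alice"),
    "bob deletes alice's service": is_allowed("bob", "delete", service_owner="alice"),
    "alice deletes her service": is_allowed("alice", "delete", service_owner="alice"),
}
```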
