Real-time scoring with sp_rxPredict in SQL Server
Applies to: SQL Server 2016 (13.x) and later versions
Learn how to perform real-time scoring with the sp_rxPredict system stored procedure in SQL Server for high-performance predictions or scores in forecasting workloads.
Real-time scoring with sp_rxPredict
is language-agnostic and executes with no dependencies on the R or Python runtimes in Machine Learning Services. Using a model created and trained using Microsoft functions and serialized to a binary format in SQL Server, you can use real-time scoring to generate predicted outcomes on new data inputs on SQL Server instances that do not have the R or Python add-on installed.
Real-time scoring is supported on specific model types based on functions in RevoScaleR or MicrosoftML in R, or revoscalepy or microsoftml in Python. It uses native C++ libraries to generate scores based on user input provided to a machine learning model stored in a special binary format.
Because a trained model can be used for scoring without having to call an external language runtime in Machine Learning Services, the overhead of multiple processes is reduced.
Real-time scoring is a multi-step process:
- You enable the stored procedure that does scoring on a per-database basis.
- You load the pre-trained model in binary format.
- You provide new input data to be scored, either tabular or single rows, as input to the model.
- To generate scores, call the sp_rxPredict stored procedure.
The model must be trained in advance using one of the supported rx algorithms. For details, see Supported algorithms for
sp_rxPredict
.Serialize the model using rxSerialize for R or rx_serialize_model for Python. These serialization functions have been optimized to support fast scoring.
Save the model to the database engine instance from which you want to call it. This instance is not required to have the R or Python runtime extension.
Opomba
Real-time scoring is currently optimized for fast predictions on smaller data sets, ranging from a few rows to hundreds of thousands of rows. On big datasets, using rxPredict might be faster.
Enable this feature for each database that you want to use for scoring. The server administrator should run the command-line utility, RegisterRExt.exe, which is included with the RevoScaleR package.
Svarilo
In order for real-time scoring to work, SQL CLR functionality needs to be enabled in the instance and the database needs to be marked trustworthy. When you run the script, these actions are performed for you. However, consider carefully the additional security implications before doing this.
Open an elevated command prompt, and navigate to the folder where RegisterRExt.exe is located. The following path can be used in a default installation:
<SQLInstancePath>\R_SERVICES\library\RevoScaleR\rxLibs\x64\
Run the following command, substituting the name of your instance and the target database where you want to enable the extended stored procedures:
RegisterRExt.exe /installRts [/instance:name] /database:databasename
For example, to add the extended stored procedure to the CLRPredict database on the default instance, type:
RegisterRExt.exe /installRts /database:CLRPRedict
The instance name is optional if the database is on the default instance. If you're using a named instance, specify the instance name.
RegisterRExt.exe creates the following objects:
- Trusted assemblies
- The stored procedure
sp_rxPredict
- A new database role,
rxpredict_users
. The database administrator can use this role to grant permission to users who use the real-time scoring functionality.
Add any users who need to run
sp_rxPredict
to the new role.
Opomba
In SQL Server 2017 and later, additional security measures are in place to prevent problems with CLR integration. These measures impose additional restrictions on the use of this stored procedure as well.
To disable real-time scoring functionality, open an elevated command prompt, and run the following command: RegisterRExt.exe /uninstallrts /database:<database_name> [/instance:name]
This example describes the steps required to prepare and save a model for real-time prediction, and provides an example in R of how to call the function from T-SQL.
The binary format required by sp_rxPredict is the same as the format required to use the PREDICT function. Therefore, in your R code, include a call to rxSerializeModel, and be sure to specify realtimeScoringOnly = TRUE
, as in this example:
model <- rxSerializeModel(model.name, realtimeScoringOnly = TRUE)
You call sp_rxPredict
as you would any other stored procedure. In the current release, the stored procedure takes only two parameters: @model for the model in binary format, and @inputData for the data to use in scoring, defined as a valid SQL query.
Because the binary format is the same as that used by the PREDICT function, you can use the models and data table from the preceding example.
DECLARE @irismodel varbinary(max)
SELECT @irismodel = [native_model_object] from [ml_models]
WHERE model_name = 'iris.dtree'
AND model_version = 'v1'
EXEC sp_rxPredict
@model = @irismodel,
@inputData = N'SELECT * FROM iris_rx_data'
Opomba
The call to sp_rxPredict
fails if the input data for scoring does not include columns that match the requirements of the model. Currently, only the following .NET data types are supported: double, float, short, ushort, long, ulong and string.
Therefore, you might need to filter out unsupported types in your input data before using it for real-time scoring.
For information about corresponding SQL types, see SQL-CLR Type Mapping or Mapping CLR Parameter Data.