PREDICT (Transact-SQL)
Applies to: SQL Server 2017 (14.x) and later Azure SQL Database Azure SQL Managed Instance Azure Synapse Analytics
Generates a predicted value or scores based on a stored model. For more information, see Native scoring using the PREDICT T-SQL function.
Select a product
In the following row, select the product name you're interested in, and only that product's information is displayed.
* SQL Server *
* SQL Database *
* SQL Managed Instance *
* Azure Synapse
Analytics *
Syntax
PREDICT
(
MODEL = @model | model_literal,
DATA = object AS <table_alias>
[, RUNTIME = ONNX ]
)
WITH ( <result_set_definition> )
<result_set_definition> ::=
{
{ column_name
data_type
[ COLLATE collation_name ]
[ NULL | NOT NULL ]
}
[,...n ]
}
MODEL = @model | model_literal
PREDICT
(
MODEL = <model_object>,
DATA = object AS <table_alias>
[, RUNTIME = ONNX ]
)
WITH ( <result_set_definition> )
<result_set_definition> ::=
{
{ column_name
data_type
[ COLLATE collation_name ]
[ NULL | NOT NULL ]
}
[,...n ]
}
<model_object> ::=
{
model_literal
| model_variable
| ( scalar_subquery )
}
Arguments
MODEL
The MODEL
parameter is used to specify the model used for scoring or prediction. The model is specified as a variable or a literal or a scalar expression.
PREDICT
supports models trained using the RevoScaleR and revoscalepy packages.
The MODEL
parameter is used to specify the model used for scoring or prediction. The model is specified as a variable or a literal or a scalar expression.
In Azure SQL Managed Instance, PREDICT
supports models in Open Neural Network Exchange (ONNX) format or models trained using the RevoScaleR and revoscalepy packages.
Important
Support for ONNX in PREDICT
is in preview in Azure SQL Managed Instance.
The MODEL
parameter is used to specify the model used for scoring or prediction. The model is specified as a variable or a literal or a scalar expression or a scalar subquery.
In Azure Synapse Analytics, PREDICT
supports models in Open Neural Network Exchange (ONNX) format.
DATA
The DATA parameter is used to specify the data used for scoring or prediction. Data is specified in the form of a table source in the query. Table source can be a table, table alias, CTE alias, view, or table-valued function.
RUNTIME = ONNX
Important
The RUNTIME = ONNX
argument is only available in Azure SQL Edge, Azure Synapse Analytics, and is in preview in Azure SQL Managed Instance.
Indicates the machine learning engine used for model execution. The RUNTIME
parameter value is always ONNX
. The parameter is required for Azure SQL Edge and Azure Synapse Analytics. On Azure SQL Managed Instance (in Preview), the parameter is optional and only used when using ONNX models.
WITH ( <result_set_definition> )
The WITH clause is used to specify the schema of the output returned by the PREDICT
function.
In addition to the columns returned by the PREDICT
function itself, all the columns that are part of the data input are available for use in the query.
Return values
No predefined schema is available; the contents of the model is not validated and the returned column values are not validated either.
- The
PREDICT
function passes through columns as input. - The
PREDICT
function also generates new columns, but the number of columns and their data types depends on the type of model that was used for prediction.
Any error messages related to the data, the model, or the column format are returned by the underlying prediction function associated with the model.
Remarks
The PREDICT
function is supported in all editions of SQL Server 2017 or later, on Windows and Linux. Machine Learning Services does not need to be enabled to use PREDICT
.
Supported algorithms
The model that you use must have been created using one of the supported algorithms from the RevoScaleR or revoscalepy packages. For a list of currently supported models, see Native scoring using the PREDICT T-SQL function.
Algorithms that can be converted to ONNX model format are supported.
Algorithms that can be converted to ONNX model format and models that you have created using one of the supported algorithms from the RevoScaleR or revoscalepy packages are supported. For a list of currently supported algorithms in RevoScaleR and revoscalepy, see Native scoring using the PREDICT T-SQL function.
Permissions
No permissions are required for PREDICT
; however, the user needs EXECUTE
permission on the database, and permission to query any data that is used as inputs. The user must also be able to read the model from a table, if the model has been stored in a table.
Examples
The following examples demonstrate the syntax for calling PREDICT
.
Using PREDICT in a FROM clause
This example references the PREDICT
function in the FROM
clause of a SELECT
statement:
SELECT d.*, p.Score
FROM PREDICT(MODEL = @model,
DATA = dbo.mytable AS d) WITH (Score FLOAT) AS p;
DECLARE @model VARBINARY(max) = (SELECT test_model FROM scoring_model WHERE model_id = 1);
SELECT d.*, p.Score
FROM PREDICT(MODEL = @model,
DATA = dbo.mytable AS d, RUNTIME = ONNX) WITH (Score FLOAT) AS p;
The alias d specified for table source in the DATA
parameter is used to reference the columns belonging to dbo.mytable
. The alias p specified for the PREDICT
function is used to reference the columns returned by the PREDICT
function.
- The model is stored as
varbinary(max)
column in table calledModels
. Additional information such asID
anddescription
is saved in the table to identify the model. - The alias d specified for table source in the
DATA
parameter is used to reference the columns belonging todbo.mytable
. The input data column names should match the name of inputs for the model. - The alias p specified for the
PREDICT
function is used to reference the predicted column returned by thePREDICT
function. The column name should have the same name as the output name for the model. - All input data columns and the predicted columns are available to display in the SELECT statement.
The preceding example query can be rewritten to create a view by specifying MODEL
as a scalar subquery:
CREATE VIEW predictions
AS
SELECT d.*, p.Score
FROM PREDICT(MODEL = (SELECT test_model FROM scoring_model WHERE model_id = 1),
DATA = dbo.mytable AS d, RUNTIME = ONNX) WITH (Score FLOAT) AS p;
Combining PREDICT with an INSERT statement
A common use case for prediction is to generate a score for input data, and then insert the predicted values into a table. The following example assumes the calling application uses a stored procedure to insert a row containing the predicted value into a table:
DECLARE @model VARBINARY(max) = (SELECT model FROM scoring_model WHERE model_name = 'ScoringModelV1');
INSERT INTO loan_applications (c1, c2, c3, c4, score)
SELECT d.c1, d.c2, d.c3, d.c4, p.score
FROM PREDICT(MODEL = @model, DATA = dbo.mytable AS d) WITH(score FLOAT) AS p;
DECLARE @model VARBINARY(max) = (SELECT model FROM scoring_model WHERE model_name = 'ScoringModelV1');
INSERT INTO loan_applications (c1, c2, c3, c4, score)
SELECT d.c1, d.c2, d.c3, d.c4, p.score
FROM PREDICT(MODEL = @model, DATA = dbo.mytable AS d, RUNTIME = ONNX) WITH(score FLOAT) AS p;
- The results of
PREDICT
are stored in a table calledPredictionResults
. - The model is stored as
varbinary(max)
column in table calledModels
. Additional information such as ID and description can be saved in the table to identify the model. - The alias d specified for table source in the
DATA
parameter is used to reference the columns indbo.mytable
. The input data column names should match the name of inputs for the model. - The alias p specified for the
PREDICT
function is used to reference the predicted column returned by thePREDICT
function. The column name should have the same name as the output name for the model. - All input columns and the predicted column are available to display in the SELECT statement.
Next steps
Learn more about related concepts in the following articles: