predict_onnx_fl()

08/12/2024

Applies to: ✅ Microsoft Fabric ✅ Azure Data Explorer ✅ Azure Monitor ✅ Microsoft Sentinel

The function predict_onnx_fl() is a user-defined function (UDF) that predicts using an existing trained machine learning model. The model must already be converted to ONNX format, serialized to a string, and saved in a standard table.
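As a rough illustration of the string serialization, the sketch below hex-encodes a model's bytes and decodes them again. The function definition decodes the stored string with binascii.unhexlify, so hex encoding is assumed here; the helper names and the placeholder bytes are hypothetical and stand in for a real ONNX model.

```python
import binascii


def serialize_model(model_bytes: bytes) -> str:
    """Hex-encode raw model bytes so they can be stored in a string column."""
    return binascii.hexlify(model_bytes).decode("ascii")


def deserialize_model(model_str: str) -> bytes:
    """Reverse the encoding, as binascii.unhexlify(smodel) does inside the function."""
    return binascii.unhexlify(model_str)


# Placeholder bytes (assumption): a real workflow would use the serialized ONNX model.
fake_model_bytes = b"\x08\x07\x12\x04demo"

model_str = serialize_model(fake_model_bytes)
restored = deserialize_model(model_str)
print(model_str)
```

The resulting string is what would go in the model column of the models table, next to the model name and training timestamp.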

Prerequisites

The Python plugin must be enabled on the cluster. This is required for the inline Python used in the function.

The Python plugin must be enabled on the database. This is required for the inline Python used in the function.

Syntax

T | invoke predict_onnx_fl(models_tbl, model_name, features_cols, pred_col)

Learn more about syntax conventions.

Parameters

Name	Type	Required	Description
models_tbl	`string`	✔️	The name of the table that contains all serialized models. The table must have the following columns: `name` (the model name), `timestamp` (time of model training), and `model` (string representation of the serialized model).
model_name	`string`	✔️	The name of the specific model to use.
features_cols	`dynamic`	✔️	An array containing the names of the feature columns that the model uses for prediction.
pred_col	`string`	✔️	The name of the column that stores the predictions.
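Because the models table carries a timestamp column, several versions of a model can share one name. The function definition resolves this with `top 1 by timestamp desc`. The sketch below mirrors that selection in plain Python; the rows, the model strings, and the helper name are made up for illustration.

```python
from datetime import datetime
from typing import Optional

# Hypothetical rows shaped like the models table: name, timestamp, model.
models_tbl = [
    {"name": "ONNX-Occupancy", "timestamp": datetime(2024, 1, 5), "model": "aa01"},
    {"name": "ONNX-Occupancy", "timestamp": datetime(2024, 3, 9), "model": "bb02"},
    {"name": "Other-Model", "timestamp": datetime(2024, 6, 1), "model": "cc03"},
]


def latest_model(rows: list, model_name: str) -> Optional[str]:
    """Return the model string of the most recently trained row with this name."""
    matches = [row for row in rows if row["name"] == model_name]
    if not matches:
        return None  # no model stored under this name
    return max(matches, key=lambda row: row["timestamp"])["model"]


selected = latest_model(models_tbl, "ONNX-Occupancy")
print(selected)
```

Retraining a model and appending it under the same name is therefore enough for later calls to pick up the newer version.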

Function definition

You can define the function by either embedding its code as a query-defined function, or creating it as a stored function in your database, as follows:

Query-defined

Define the function using the following let statement. No permissions are required.

Important

A let statement can't run on its own. It must be followed by a tabular expression statement. To run a working example of predict_onnx_fl(), see Example.

let predict_onnx_fl=(samples:(*), models_tbl:(name:string, timestamp:datetime, model:string), model_name:string, features_cols:dynamic, pred_col:string)
{
    let model_str = toscalar(models_tbl | where name == model_name | top 1 by timestamp desc | project model);
    let kwargs = bag_pack('smodel', model_str, 'features_cols', features_cols, 'pred_col', pred_col);
    let code = ```if 1:
    
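    import binascii
    import onnxruntime as rt
    
    # Arguments passed from KQL through the kwargs property bag (exposed here as kargs)
    smodel = kargs["smodel"]
    features_cols = kargs["features_cols"]
    pred_col = kargs["pred_col"]
    
    # Decode the hex string stored in the models table back into the raw model bytes
    bmodel = binascii.unhexlify(smodel)
    
    # Load the model and look up the names of its first input and first output
    sess = rt.InferenceSession(bmodel)
    input_name = sess.get_inputs()[0].name
    label_name = sess.get_outputs()[0].name
    
    # Score the feature columns as float32 and keep the first output
    df1 = df[features_cols]
    predictions = sess.run([label_name], {input_name: df1.values.astype(np.float32)})[0]
    
    # Return the input rows with the predictions written to pred_col
    result = df
    result[pred_col] = pd.DataFrame(predictions, columns=[pred_col])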
    
    ```;
    samples | evaluate python(typeof(*), code, kwargs)
};
// Write your query to use the function here.

Stored

Define the stored function once using the following .create-or-alter function command. Database User permissions are required.

Important

You must run this code to create the function before you can use the function as shown in the Example.

.create-or-alter function with (folder = "Packages\\ML", docstring = "Predict using ONNX model")
predict_onnx_fl(samples:(*), models_tbl:(name:string, timestamp:datetime, model:string), model_name:string, features_cols:dynamic, pred_col:string)
{
    let model_str = toscalar(models_tbl | where name == model_name | top 1 by timestamp desc | project model);
    let kwargs = bag_pack('smodel', model_str, 'features_cols', features_cols, 'pred_col', pred_col);
    let code = ```if 1:
    
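    import binascii
    import onnxruntime as rt
    
    # Arguments passed from KQL through the kwargs property bag (exposed here as kargs)
    smodel = kargs["smodel"]
    features_cols = kargs["features_cols"]
    pred_col = kargs["pred_col"]
    
    # Decode the hex string stored in the models table back into the raw model bytes
    bmodel = binascii.unhexlify(smodel)
    
    # Load the model and look up the names of its first input and first output
    sess = rt.InferenceSession(bmodel)
    input_name = sess.get_inputs()[0].name
    label_name = sess.get_outputs()[0].name
    
    # Score the feature columns as float32 and keep the first output
    df1 = df[features_cols]
    predictions = sess.run([label_name], {input_name: df1.values.astype(np.float32)})[0]
    
    # Return the input rows with the predictions written to pred_col
    result = df
    result[pred_col] = pd.DataFrame(predictions, columns=[pred_col])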
    
    ```;
    samples | evaluate python(typeof(*), code, kwargs)
}

Example

The following example uses the invoke operator to run the function.

Query-defined

To use a query-defined function, invoke it after the embedded function definition.

let predict_onnx_fl=(samples:(*), models_tbl:(name:string, timestamp:datetime, model:string), model_name:string, features_cols:dynamic, pred_col:string)
{
    let model_str = toscalar(models_tbl | where name == model_name | top 1 by timestamp desc | project model);
    let kwargs = bag_pack('smodel', model_str, 'features_cols', features_cols, 'pred_col', pred_col);
    let code = ```if 1:
    
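    import binascii
    import onnxruntime as rt
    
    # Arguments passed from KQL through the kwargs property bag (exposed here as kargs)
    smodel = kargs["smodel"]
    features_cols = kargs["features_cols"]
    pred_col = kargs["pred_col"]
    
    # Decode the hex string stored in the models table back into the raw model bytes
    bmodel = binascii.unhexlify(smodel)
    
    # Load the model and look up the names of its first input and first output
    sess = rt.InferenceSession(bmodel)
    input_name = sess.get_inputs()[0].name
    label_name = sess.get_outputs()[0].name
    
    # Score the feature columns as float32 and keep the first output
    df1 = df[features_cols]
    predictions = sess.run([label_name], {input_name: df1.values.astype(np.float32)})[0]
    
    # Return the input rows with the predictions written to pred_col
    result = df
    result[pred_col] = pd.DataFrame(predictions, columns=[pred_col])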
    
    ```;
    samples | evaluate python(typeof(*), code, kwargs)
};
//
// Predicts room occupancy from sensor measurements, and calculates the confusion matrix
//
// Occupancy Detection is an open dataset from UCI Repository at https://archive.ics.uci.edu/ml/datasets/Occupancy+Detection+
// It contains experimental data for binary classification of room occupancy from Temperature, Humidity, Light, and CO2.
// Ground-truth labels were obtained from time-stamped pictures that were taken every minute
//
OccupancyDetection 
| where Test == 1
| extend pred_Occupancy=bool(0)
| invoke predict_onnx_fl(ML_Models, 'ONNX-Occupancy', pack_array('Temperature', 'Humidity', 'Light', 'CO2', 'HumidityRatio'), 'pred_Occupancy')
| summarize n=count() by Occupancy, pred_Occupancy

Stored

Important

For this example to run successfully, you must first run the Function definition code to store the function.

//
// Predicts room occupancy from sensor measurements, and calculates the confusion matrix
//
// Occupancy Detection is an open dataset from UCI Repository at https://archive.ics.uci.edu/ml/datasets/Occupancy+Detection+
// It contains experimental data for binary classification of room occupancy from Temperature, Humidity, Light, and CO2.
// Ground-truth labels were obtained from time-stamped pictures that were taken every minute
//
OccupancyDetection 
| where Test == 1
| extend pred_Occupancy=bool(0)
| invoke predict_onnx_fl(ML_Models, 'ONNX-Occupancy', pack_array('Temperature', 'Humidity', 'Light', 'CO2', 'HumidityRatio'), 'pred_Occupancy')
| summarize n=count() by Occupancy, pred_Occupancy

Output

Occupancy	pred_Occupancy	n
TRUE	TRUE	3006
FALSE	TRUE	112
TRUE	FALSE	15
FALSE	FALSE	9284
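The output is a confusion matrix, so the usual classification metrics can be derived from its four counts. The sketch below does that arithmetic in Python, treating an occupied room as the positive class. The counts are copied from the table above; the choice of metrics is illustrative and not part of the function.

```python
# Counts from the Output table, keyed by (Occupancy, pred_Occupancy).
counts = {
    (True, True): 3006,
    (False, True): 112,
    (True, False): 15,
    (False, False): 9284,
}

tp = counts[(True, True)]    # occupied, predicted occupied
fp = counts[(False, True)]   # empty, predicted occupied
fn = counts[(True, False)]   # occupied, predicted empty
tn = counts[(False, False)]  # empty, predicted empty
total = tp + fp + fn + tn

accuracy = (tp + tn) / total
precision = tp / (tp + fp)
recall = tp / (tp + fn)

print(f"accuracy={accuracy:.4f} precision={precision:.4f} recall={recall:.4f}")
```

With these counts, false positives (112) outnumber false negatives (15), so precision comes out lower than recall.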
