sqlrutils (R package in SQL Server Machine Learning Services)
Applies to: SQL Server 2016 (13.x) and later versions
sqlrutils is an R package from Microsoft that provides a mechanism for R users to put their R scripts into a T-SQL stored procedure, register that stored procedure with a database, and run the stored procedure from an R development environment. The package is included in SQL Server Machine Learning Services and SQL Server 2016 R Services.
By converting your R code to run within a single stored procedure, you can make more effective use of SQL Server R Services, which requires that R script be embedded as a parameter to sp_execute_external_script. The sqlrutils package helps you build this embedded R script and set related parameters appropriately.
The sqlrutils package performs these tasks:
- Saves the generated T-SQL script as a string inside an R data structure
- Optionally, generate a .sql file for the T-SQL script, which you can edit or run to create a stored procedure
- Registers the newly created stored procedure with the SQL Server instance from your R development environment
You can also execute the stored procedure from an R environment, by passing well-formed parameters and processing the results. Or, you can use the stored procedure from SQL Server to support common database integration scenarios such as ETL, model training, and high-volume scoring.
If you intend to run the stored procedure from an R environment by calling the executeStoredProcedure function, you must use an ODBC 3.8 provider, such as ODBC Driver 13 for SQL Server.
Full reference documentation
The sqlrutils package is distributed in multiple Microsoft products, but usage is the same whether you get the package in SQL Server or another product. Should any product-specific behaviors exist, discrepancies will be noted in the function help page.
The following section provides an overview of the functions that you can call from the sqlrutils package to develop a stored procedure containing embedded R code. For details of the parameters for each method or function, see the R help for the package:
|Execute a SQL stored procedure.
|Get a list of input parameters to the stored procedure.
|Defines the source of data in SQL Server that will be used in the R data frame. You specify the name of the data.frame in which to store the input data, and a query to get the data, or a default value. Only simple SELECT queries are supported.
|Defines a single input parameter that will be embedded in the T-SQL script. You must provide the name of the parameter and its R data type.
|Generates an intermediate data object that is needed if your R function returns a list that contains a data.frame. The OutputData object is used to store the name of a single data.frame obtained from the list.
|Generates an intermediate data object that is needed if your R function returns a list. The OutputParameter object stores the name and data type of a single member of the list, assuming that member is not a data frame.
|Register the stored procedure with a database.
|Assign a query to an input data parameter of the stored procedure.
|Assign a value to an input parameter of the stored procedure.
|A stored procedure object.
How to use sqlrutils
The sqlrutils package functions must run on a computer having SQL Server Machine Learning with R. If you are working on a client workstation, set a remote compute context to shift execution to SQL Server. The workflow for using this package includes the following steps:
- Define stored procedure parameters (inputs, outputs, or both)
- Generate and register the stored procedure
- Execute the stored procedure
In an R session, load sqlrutils from the command line by typing
You can load this package on computer that does not have SQL Server (for example, on an R Client instance) if you change the compute context to SQL Server and execute the code in that compute context.
Define stored procedure parameters and inputs
StoredProcedure is the main constructor used to build the stored procedure. This constructor generates a SQL Server Stored Procedure object, and optionally creates a text file containing a query that can be used to generate the stored procedure using a T-SQL command.
Optionally, the StoredProcedure function can also register the stored procedure with the specified instance and database.
funcargument to specify a valid R function. All the variables that the function uses must be defined either inside the function or be provided as input parameters. These parameters can include a maximum of one data frame.
The R function must return either a data frame, a named list, or a NULL. If the function returns a list, the list can contain a maximum of one data.frame.
Use the argument
spNameto specify the name of the stored procedure you want to create.
You can pass in optional input and output parameters, using the objects created by these helper functions:
filePathto provide the path and name of a .sql file to create. You can run this file on the SQL Server instance to generate the stored procedure using T-SQL.
To define the server and database where the stored procedure will be saved, use the arguments
To get a list of the InputData and InputParameter objects that were used to create a specific StoredProcedure object, call
To register the stored procedure with the specified database, use
The stored procedure object typically does not have any data or values associated with it, unless a default value was specified. Data is not retrieved until the stored procedure is executed.
Specify inputs and execute
setInputDataQueryto assign a query to an InputParameter object. For example, if you have created a stored procedure object in R, you can use
setInputDataQueryto pass arguments to the StoredProcedure function in order to execute the stored procedure with the desired inputs.
setInputValueto assign specific values to a parameter stored as an InputParameter object. You then pass the parameter object and its value assignment to the StoredProcedure function to execute the stored procedure with the set values.
executeStoredProcedureto execute a stored procedure defined as an StoredProcedure object. Call this function only when executing a stored procedure from R code. Do not use it when running the stored procedure from SQL Server using T-SQL.
The executeStoredProcedure function requires an ODBC 3.8 provider, such as ODBC Driver 13 for SQL Server.