Use RevoScaleR to install R packages
Applies to: SQL Server 2016 (13.x) SQL Server 2017 (14.x)
This article describes how to use RevoScaleR (version 9.0.1 and later) functions to install R packages on SQL Server with Machine Learning Services or R Services. The RevoScaleR functions can be used by remote, non-administrators to install packages on SQL Server without direct access to the server.
RevoScaleR functions for package management
The following table describes the functions used for R package installation and management.
Function | Description |
---|---|
rxSqlLibPaths | Determine the path of the instance library on the remote SQL Server. |
rxFindPackage | Gets the path for one or more packages on the remote SQL Server. |
rxInstallPackages | Call this function from a remote R client to install packages in a SQL Server compute context, either from a specified repository, or by reading locally saved zipped packages. This function checks for dependencies and ensures that any related packages can be installed to SQL Server, just like R package installation in the local compute context. To use this option, you must have enabled package management on the server and database. Both client and server environments must have the same version of RevoScaleR. |
rxInstalledPackages | Gets a list of packages installed in the specified compute context. |
rxSyncPackages | Copy information about a package library between the file system and database, for the specified compute context. |
rxRemovePackages | Removes packages from a specified compute context. It also computes dependencies and ensures that packages that are no longer used by other packages on SQL Server are removed, to free up resources. |
Prerequisites
Remote management enabled on SQL Server. For more information, see Enable remote R package management on SQL Server.
RevoScaleR versions are the same on both client and server environments. For more information, see Get R package information.
You have permission to connect to the server and a database, and to run R commands. You must be a member of a database role that allows you to install packages on the specified instance and database.
Packages in shared scope can be installed by users belonging to the
rpkgs-shared
role in a specified database. All users in this role can uninstall shared packages.Packages in private scope can be installed by any user belonging to the
rpkgs-private
role in a database. However, users can see and uninstall only their own packages.Database owners can work with shared or private packages.
Client connections
Important
The support for Machine Learning Server (previously known as R Server) ended on July 1, 2022. For more information, see What's happening to Machine Learning Server?
A client workstation can be Microsoft R Client or a Microsoft Machine Learning Server (data scientists often use the free developer edition) on the same network.
When calling package management functions from a remote R client, you must create a compute context object first, using the RxInSqlServer function. Thereafter, for each package management function that you use, pass the compute context as an argument.
User identity is typically specified when setting the compute context. If you don't specify a user name and password when you create the compute context, the identity of the user running the R code is used.
From an R command line, define a connection string to the instance and database.
Use the RxInSqlServer constructor to define a SQL Server compute context, using the connection string.
sqlcc <- RxInSqlServer(connectionString = myConnString, shareDir = sqlShareDir, wait = sqlWait, consoleOutput = sqlConsoleOutput)
Create a list of the packages you want to install and save the list in a string variable.
packageList <- c("e1071", "mice")
Call rxInstallPackages and pass the compute context and the string variable containing the package names.
rxInstallPackages(pkgs = packageList, verbose = TRUE, computeContext = sqlcc)
If dependent packages are required, they are also installed, assuming an internet connection is available on the client.
Packages are installed using the credentials of the user making the connection, in the default scope for that user.
Call package management functions in stored procedures
You can run package management functions inside sp_execute_external_script
. When you do so, the function is executed using the security context of the stored procedure caller.
Examples
This section provides examples of how to use these functions from a remote client when connecting to a SQL Server instance or database as the compute context.
For all examples, you must provide either a connection string, or a compute context, which requires a connection string. This example provides one way to create a compute context for SQL Server:
instance_name <- "computer-name/instance-name";
database_name <- "TestDB";
sqlWait= TRUE;
sqlConsoleOutput <- TRUE;
connString <- paste("Driver=SQL Server;Server=", instance_name, ";Database=", database_name, ";Trusted_Connection=true;", sep="");
sqlcc <- RxInSqlServer(connectionString = connString, wait = sqlWait, consoleOutput = sqlConsoleOutput, numTasks = 4);
Depending on where the server is located, and the security model, you might need to provide a domain and subnet specification in the connection string, or use a SQL login. For example:
connStr <- "Driver=SQL Server;Server=myserver.financeweb.contoso.com;Database=Finance;Uid=RUser1;Pwd=RUserPassword"
Get package path on a remote SQL Server compute context
This example gets the path for the RevoScaleR package on the compute context, sqlcc
.
sqlPackagePaths <- rxFindPackage(package = "RevoScaleR", computeContext = sqlcc)
print(sqlPackagePaths)
Results
"C:/Program Files/Microsoft SQL Server/MSSQL14.MSSQLSERVER/R_SERVICES/library/RevoScaleR"
Tip
If you have enabled the option to see SQL console output, you might get status messages from the function that precedes the print
statement. After you have finished testing your code, set consoleOutput
to FALSE in the compute context constructor to eliminate messages.
Get locations for multiple packages
The following example gets the paths for the RevoScaleR and lattice packages, on the compute context, sqlcc
. To get information about multiple packages, pass a string vector containing the package names.
packagePaths <- rxFindPackage(package = c("RevoScaleR", "lattice"), computeContext = sqlcc)
print(packagePaths)
Get package versions on a remote compute context
Run this command from an R console to get the build number and version numbers for packages installed on the compute context, sqlServer.
sqlPackages <- rxInstalledPackages(fields = c("Package", "Version", "Built"), computeContext = sqlServer)
Install a package on SQL Server
This example installs the forecast package and its dependencies into the compute context.
pkgs <- c("forecast")
rxInstallPackages(pkgs = pkgs, verbose = TRUE, scope = "private", computeContext = sqlcc)
Remove a package from SQL Server
This example removes the forecast package and its dependencies from the compute context.
pkgs <- c("forecast")
rxRemovePackages(pkgs = pkgs, verbose = TRUE, scope = "private", computeContext = sqlcc)
Synchronize packages between database and file system
The following example checks the database TestDB, and determines whether all packages are installed in the file system. If some packages are missing, they are installed in the file system.
# Instantiate the compute context
connectionString <- "Driver=SQL Server;Server=myServer;Database=TestDB;Trusted_Connection=True;"
computeContext <- RxInSqlServer(connectionString = connectionString )
# Synchronize the packages in the file system for all scopes and users
rxSyncPackages(computeContext=computeContext, verbose=TRUE)
Package synchronization works on a per database and per user basis. For more information, see R package synchronization for SQL Server.
Use a stored procedure to list packages in SQL Server
Run this command from Management Studio or another tool that supports T-SQL, to get a list of installed packages on the current instance, using rxInstalledPackages
in a stored procedure.
EXEC sp_execute_external_script
@language=N'R',
@script=N'
myPackages <- rxInstalledPackages();
OutputDataSet <- as.data.frame(myPackages);
'
The rxSqlLibPaths
function can be used to determine the active library used by SQL Server Machine Learning Services. This script can return only the library path for the current server.
declare @instance_name nvarchar(100) = @@SERVERNAME, @database_name nvarchar(128) = db_name();
exec sp_execute_external_script
@language = N'R',
@script = N'
connStr <- paste("Driver=SQL Server;Server=", instance_name, ";Database=", database_name, ";Trusted_Connection=true;", sep="");
.libPaths(rxSqlLibPaths(connStr));
print(.libPaths());
',
@input_data_1 = N'',
@params = N'@instance_name nvarchar(100), @database_name nvarchar(128)',
@instance_name = @instance_name,
@database_name = @database_name;