Getting Started - Administrators
Applies to: DeployR 8.x (See comparison between 8.x and 9.x)
Looking to deploy with Machine Learning Server? Start here.
This guide is for system administrators of DeployR, the R Integration Server. If you are responsible for creating or maintaining an evaluation or a production deployment of the DeployR server, then this guide is for you.
As an administrator, your key responsibilities are to ensure the DeployR server is properly provisioned and configured to meet the demands of your user community. In this context, the following policies are of central importance:
- Server security policies, which include user authentication and authorization
- Server R package management policies
- Server runtime policies, which affect availability, scalability, and throughput
Whenever your policies fail to deliver the expected runtime behavior or performance, you'll need to troubleshoot your deployment. For that we provide diagnostic tools and numerous recommendations.
But first, you must install the server. Comprehensive installation guides are available here.
For a general introduction to DeployR, read the About DeployR document.
User access to the DeployR server and the services offered on its API is entirely under your control as the server administrator. The principal method of granting access to DeployR is to use the Administration Console to create user accounts for each member of your community.
The creation of user accounts establishes a trust relationship between your user community and the DeployR server. Your users can then supply simple password credentials in order to verify their identity to the server. You can grant additional permissions on a user-by-user basis to expose further functionality and data on the server.
However, basic user authentication and authorization are only one small part of the full set of DeployR security features, which includes full HTTPS/SSL encryption support, IP filters, and password format and auto-locking policies. DeployR Enterprise also offers seamless integration with popular enterprise security solutions.
The full set of DeployR security features available to you as a system administrator is detailed in this Security guide.
R Package Policies
The primary function of the DeployR server is to support the execution of R code on behalf of client applications. One of your key objectives as a DeployR administrator is to ensure a reliable, consistent execution environment for that code.
The R code developed and deployed by data scientists within your community will frequently depend on one or more R packages. Those R packages may be hosted on CRAN, MRAN, GitHub, in your own local CRAN repository or elsewhere.
Making sure that these R package dependencies are available to the code executing on the DeployR server requires active participation from you, the administrator. There are two R package management policies you can adopt for your deployment, which are detailed in this R Package Management guide.
The DeployR server supports a wide range of runtime policies that affect many aspects of the server runtime environment. As an administrator, you can select the preferred policies that best reflect the needs of your user community.
- Global settings such as server name, default server boundary, and so on
- Project persistence policies governing resource usage (sizes and autosave)
- Runtime policies governing authenticated, asynchronous, and anonymous operations
- Runtime policies governing concurrent operation limits, file upload limits, and event stream access
The full set of general policy options available with the Server Policies tab is detailed in this online help topic.
The DeployR product consists of a number of software components that combine to deliver the full capabilities of the R Integration Server: the server, the grid and the database. Each component can be configured for High Availability (HA) in order to deliver a robust, reliable runtime environment.
For a discussion of the available server, grid, and database HA policy options, see the DeployR High Availability Guide.
Scalability & Throughput
In the context of a discussion on DeployR server runtime policies, the topics of scalability and throughput are closely related. Some of the most common questions that arise when planning the configuration and provisioning of the DeployR server and grid are:
- How many users can I support?
- How many grid nodes do I need?
- What throughput can I expect?
The answers to these questions ultimately depend on the configuration and size of the server and grid resources allocated to your deployment.
For detailed information and recommendations on tuning the server and grid for optimal throughput, read the DeployR Scale & Throughput Guide.
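As a rough starting point for the questions above, a back-of-the-envelope estimate can be made with Little's law (throughput = concurrency / average latency). The sketch below is only illustrative: the function names, the notion of "slots per node", and the sample numbers are assumptions made for this example, not DeployR settings or measured results. Real capacity depends on your workload and should be confirmed against the Scale & Throughput Guide.

```python
import math


def estimate_throughput(grid_nodes: int, slots_per_node: int,
                        avg_exec_seconds: float) -> float:
    """Upper bound on R executions per second for a grid.

    Assumes every slot is busy all the time (Little's law:
    throughput = concurrent executions / average execution time).
    """
    if avg_exec_seconds <= 0:
        raise ValueError("avg_exec_seconds must be positive")
    return grid_nodes * slots_per_node / avg_exec_seconds


def nodes_needed(target_per_second: float, slots_per_node: int,
                 avg_exec_seconds: float) -> int:
    """Smallest number of grid nodes that can sustain the target rate."""
    if slots_per_node <= 0:
        raise ValueError("slots_per_node must be positive")
    required_slots = target_per_second * avg_exec_seconds
    return math.ceil(required_slots / slots_per_node)


if __name__ == "__main__":
    # Hypothetical numbers: 4 nodes, 20 slots each, 2-second scripts.
    print(estimate_throughput(4, 20, 2.0))   # 40.0 executions per second
    print(nodes_needed(25, 20, 2.0))         # 3 nodes
```

The estimate is an upper bound: it ignores queuing, network overhead, and uneven script durations, so treat the result as a first guess to refine with load testing.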
Big Data Policies
There may be times when your DeployR user community needs access to genuinely large data files, or big data.
When such files are stored in the DeployR Repository or at any network-accessible URI, the R code executing on the DeployR server can load the desired file data on demand. However, physically moving big data is expensive in terms of both bandwidth and throughput.
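To get a feel for that cost, the following sketch computes the minimum time needed to move a file across a network link. The function name and the sample figures are assumptions chosen for illustration. The calculation ignores protocol overhead and contention, so real transfers will be slower.

```python
def transfer_seconds(size_gb: float, link_gbps: float) -> float:
    """Best-case seconds to move size_gb gigabytes over a link_gbps link."""
    if link_gbps <= 0:
        raise ValueError("link_gbps must be positive")
    size_gigabits = size_gb * 8  # 1 byte = 8 bits
    return size_gigabits / link_gbps


if __name__ == "__main__":
    # Hypothetical case: a 50 GB data file over a 1 Gbps link.
    print(transfer_seconds(50, 1))  # 400.0 seconds, before any overhead
```

If every execution that needs the file pays this cost, throughput drops accordingly, which is why avoiding the copy matters.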
To alleviate this overhead, the DeployR server supports a set of NFS-mounted directories dedicated to managing large data files. We refer to these directories as 'big data' external directories. As an administrator, you can enable this service by:
- Configuring the big data directories within your deployment.
- Informing your DeployR users that they must use the R function deployrExternal in their R code to reference big data files within these directories.
For the complete configuration and usage documentation, read the guide "Managing External Directories for Big Data".
There is no doubt that, as an administrator, you've experienced failures with servers, networks, and systems. Likewise, your chosen runtime policies may sometimes fail to deliver the runtime behavior or performance needed by your community of users.
When those failures occur in the DeployR environment, we recommend you first turn to the DeployR diagnostic testing tool to attempt to identify the underlying cause of the problem.
Beyond the diagnostics tool, the Troubleshooting documentation offers suggestions and recommendations for common problems with known solutions.
This section provides a quick summary of useful links for administrators working with DeployR.
Use the table of contents to find all of the guides and documentation needed by the administrator.
- About DeployR
- Installation & Configuration
- R Package Management
- Scale & Throughput
- Diagnostic Testing & Troubleshooting