DeployR Server and Grid High Availability

11/10/2017

Applies to: DeployR 8.x (See comparison between 8.x and 9.x)

Looking to deploy with Machine Learning Server? Start here.

Server High Availability

When discussing high availability in the context of server systems, clustered Java Web applications are a common architectural solution. However, full distributed session management across the servers in a cluster depends on serializable session state, and the DeployR server maintains a non-serializable socket connection for each R session. As a result, this kind of clustering is not possible with DeployR.

There are, however, alternative deployment options that can improve server availability and overall uptime. The following sections discuss the relative merits of these options.

Primary-Standby Deployment

Although simple, this deployment model can be highly effective. Your application makes use of a primary server and a standby server that share a single instance of the DeployR database. By default, your application talks only to the primary server. If the primary server becomes unresponsive at runtime, your application detects this condition and redirects all traffic to the standby server.

This deployment model is supported with a SQL Server or PostgreSQL database (not the default H2 database) for DeployR 8.0.5, or with a MongoDB database for DeployR 8.0.0.

The following steps detail how to configure this type of deployment:

DeployR Database Setup
- Install a single, standalone instance of the DeployR database.
- This single database instance is used by the primary and standby servers, as detailed next.
Primary Server Setup
- Install a dedicated DeployR server and designate it as your primary.
- Configure the database endpoint for the primary using deployr.groovy.
- Set the primary deployr.server.policy.override.web.context property using deployr.groovy.
Standby Server Setup
- Install a dedicated DeployR server and designate it as your standby.
- Configure the database endpoint for the standby using deployr.groovy.
- Set the standby deployr.server.policy.override.web.context property using deployr.groovy.
DeployR Grid Setup
- Using the administration console of either the primary or the standby server, add details for each grid node you have provisioned for your deployment.

At this point, your Primary-Standby Deployment is fully configured. Because the primary and standby servers share a single instance of the DeployR database, they also share:

- A single DeployR repository where all R scripts, models, etc. are managed.
- A single DeployR grid.

While this can be an effective deployment model, take careful note of the following:

- Failover is not automatic. Your application must detect when the primary becomes unresponsive and then actively take steps to reroute all traffic to the standby.
- A single grid means zombie tasks (initiated by a primary that then fails) can leak grid resources (memory and/or processor cycles) that cannot be reclaimed by the standby when it takes over active management of the grid. This type of resource impairment can lead to performance degradation over time.
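
The application-side failover described above can be sketched as follows. This is a minimal illustration, not part of DeployR or its client libraries: the FailoverClient class, the http_probe helper, and the server URLs in the usage note are all hypothetical names, and the health check is assumed to be a plain HTTP GET against a URL of your choosing.

```python
import urllib.request
from typing import Callable


def http_probe(url: str, timeout: float = 2.0) -> bool:
    """Assumed health check: any network error, timeout, or HTTP error
    status counts as unresponsive."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except OSError:
        return False


class FailoverClient:
    """Sends traffic to the primary until it stops responding, then to the standby."""

    def __init__(self, primary: str, standby: str,
                 is_responsive: Callable[[str], bool] = http_probe):
        self.servers = [primary, standby]
        self.is_responsive = is_responsive
        self.active = primary

    def endpoint(self) -> str:
        # Keep using the active server while it responds. Only when it
        # fails do we look for another responsive server.
        if not self.is_responsive(self.active):
            for url in self.servers:
                if self.is_responsive(url):
                    self.active = url
                    break
            else:
                raise RuntimeError("no DeployR server is responsive")
        return self.active
```

An application would call endpoint() before each request, for example FailoverClient("http://primary.example", "http://standby.example").endpoint(). In this sketch the client stays on the standby once it has failed over, rather than switching back as soon as the primary recovers; whether to fail back is a design choice left to the application.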

A potentially better deployment model would isolate failures so that resource impairment caused by zombie tasks does not continue to affect the overall availability or responsiveness of the system. This is the motivation for the next deployment model, Load Balanced Deployment.

Load Balanced Deployment

While multiple DeployR servers cannot participate in a classic cluster configuration with automatic failover, it is possible to deploy multiple servers behind an HTTP load balancer to distribute load and even adapt to server failures at runtime.

The following steps detail how to configure this type of deployment:

Server Setup * N
- Install two or more DeployR servers.
- Each server should be configured to use its own dedicated database.
- Each server should be configured to use its own dedicated set of grid nodes.
Load Balancer Setup
- Install your preferred HTTP load balancer.
- Configure the load balancer to enforce sticky HTTP sessions.
- Register the IP endpoint of each DeployR server with your load balancer.
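
To illustrate what sticky sessions mean for routing, here is a small sketch of one way a balancer could pin each session to a single server. It uses rendezvous (highest-random-weight) hashing, which is a technique chosen for this example; many HTTP load balancers implement stickiness with a cookie instead. The route function and the server names are hypothetical.

```python
import hashlib
from typing import Collection, Sequence


def route(session_id: str, servers: Sequence[str],
          unhealthy: Collection[str] = ()) -> str:
    """Map a session to one server, the same one for as long as it stays healthy."""
    candidates = [s for s in servers if s not in unhealthy]
    if not candidates:
        raise RuntimeError("no healthy DeployR server available")

    def score(server: str) -> int:
        # Stable hash of the (session, server) pair.
        digest = hashlib.sha256(f"{session_id}|{server}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    # The highest-scoring healthy server wins. Dropping a failed server
    # only remaps the sessions that were pinned to it.
    return max(candidates, key=score)
```

Because each R session lives on exactly one DeployR server, a session that is remapped after a server failure starts fresh on its new server; the sketch only keeps the unaffected sessions where they were.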

At this point, your Load Balanced Deployment is fully configured. In the event of a server failure, all other servers in the deployment continue functioning unaffected. However, because each server in your deployment operates independently of the others, there is one important implication:

- Your R scripts, models, etc. must be manually replicated across the DeployR repositories associated with each of your servers.

Grid High Availability

The DeployR grid is designed to deliver high availability at runtime, supporting both redundancy and failover. The server monitors the health of all resources on the grid at runtime. If an individual slot on the grid, or even an entire grid node, becomes unresponsive, the server automatically contracts the grid, bypassing the failing resources. If failing resources become responsive again, the server automatically expands the grid to include them.

In this way, the DeployR grid can be said to be self-healing, capable of automatically adjusting to system failures and recoveries at runtime, ensuring maximum uptime and responsiveness.
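
The contract-and-expand behavior can be pictured with a toy model. The sketch below is not DeployR's implementation; the GridView class, the slot names, and the injected is_responsive check are assumptions made for illustration only.

```python
from typing import Callable, Dict, List


class GridView:
    """Tracks which grid slots are currently considered usable."""

    def __init__(self, slots: List[str]):
        self.healthy: Dict[str, bool] = {slot: True for slot in slots}

    def refresh(self, is_responsive: Callable[[str], bool]) -> None:
        # Probe every slot, including ones previously marked as failing,
        # so that recovered slots rejoin the grid.
        for slot in self.healthy:
            self.healthy[slot] = is_responsive(slot)

    def active_slots(self) -> List[str]:
        # Work is scheduled only on slots that passed the last probe.
        return [slot for slot, ok in self.healthy.items() if ok]
```

If every slot on one node fails its probe, active_slots() shrinks to the slots on the remaining nodes; once the node answers again, the next refresh() restores them.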

Note (RBroker High Availability): The RBroker Framework includes a fault-tolerant implementation of the PooledTaskBroker. Tasks executing on the new PooledTaskBroker that fail due to grid failures at runtime are automatically detected and re-executed without requiring client application intervention.
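
The general idea of detecting a grid failure and re-executing the task can be sketched as below. This is not the RBroker API: GridFailure and run_with_retry are hypothetical names, and the retry limit is an assumption of the example rather than a documented PooledTaskBroker setting.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


class GridFailure(Exception):
    """Hypothetical error standing in for the failure of a grid resource."""


def run_with_retry(task: Callable[[], T], max_attempts: int = 3) -> T:
    """Run a task, re-executing it when it fails because of a grid failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return task()
        except GridFailure:
            # Re-execute on grid failures only; give up after the last attempt.
            if attempt == max_attempts:
                raise
    raise ValueError("max_attempts must be at least 1")
```

Errors other than GridFailure propagate immediately in this sketch, since a task that fails for reasons unrelated to the grid would be expected to fail again on re-execution.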
