Scaling out an Analysis Services Solution
There are often situations when an Analysis Services database administrator (dba) wants to improve the query response time for an increasing number of end-users. There are two ways to accomplish this goal: adding more power to the existing server (scale up), or distributing the load among several small servers (scale out).
Scaling up a solution is usually limited by the point where the existing hardware can no longer be expanded or upgraded. Perhaps the existing motherboard cannot accept the new version of CPUs, or the physical address space for memory has been reached. On the other hand, scaling out a solution is more flexible and its limitations are overcome more easily. If the number of servers in the Network Load Balancer (NLB) has reached the maximum limit, an additional NLB can be added to the solution and the servers distributed among the multiple NLBs.
This document describes a theoretical architecture for scaling out an Analysis Services solution.
Scenario
An Analysis Services dba must provide the end-users of an Analysis Services solution with a better response time to queries, yet with the minimum daily downtime window for refreshing the data. From the original 80 users, the number of users has doubled during the last month and it is expected that the number will double again in the next six months. From the seventh month onward, user expansion is expected to grow by 4 percent monthly. The Analysis Services database size is currently at 80GB and grows by 6GB monthly. The database currently holds data for the last 12 months and is expected to maintain the history of the last 3 fiscal years plus the current one. The average processing time is 2 ½ hours, and the downtime window is limited to ½ hour.
Alternatives
After you read the scenario, it might seem that the only solution is to scale up the server. This will provide service that has no downtime, but slow performance during processing time. However, there are currently 160 users, and this number will double to 320 in the next six months. After that it will continue to grow at 13-16 users per month for an unknown time. At that rate, the number of users will double again between the 18th and 19th months of stable growth. Given this situation, it will be difficult to size the correct hardware and justify a budget request for equipment that will be used at under 50 percent of its capacity for the next 12 months.
Fortunately, scaling out this solution is possible in SQL Server 2008 Analysis Services with the Read-Only Database capability.
Scale-Out Architecture
This architecture is designed with two elements:
A Physical Layout with the goal of maximizing the end-user throughput.
An Operations Framework with the goal of minimizing the downtime.
Physical Layout
The solution consists of three main components:
The processing environment
The Storage Area Network (SAN)
The Data Access environment
The first component, processing, is where the data is updated and processed by using a segment of the SAN. The second component, SAN, is where the data is kept for both processing and data access environments. The third component, data access, is where the data is made available to end-users.
The Processing Environment
The processing environment is formed by one server, a connection to the SAN, and a logical SAN volume to hold the Analysis Services data.
The Storage Area Network (SAN)
This solution consists of two independent 'SAN logical volumes': one for the Processing Environment and the other for the Data Access Environment.
The SAN is the set of devices that provide the physical storage for the multidimensional databases. It enables high-speed connections between the servers and the storage that include shared storage, clusters, and data recovery mechanisms.
In this document, a 'SAN logical volume' defines the unit of storage that is perceived by the operating system as one physical drive.
The Data Access Environment
The Data Access environment is formed by multiple servers, usually starting with three, that share the same SAN logical volume. Users connect to the data access servers through a NLB device that routes all incoming requests by using a loadbalancing algorithm.
Variants to the Physical Layout
If necessary, you might use the following variants to achieve better performance in the solution.
Processing Environment
On some occasions, you might use two processing servers: one for the relational databases and another to hold the Analysis Services databases.
In addition, multiple logical volumes can be defined to independently hold the relational databases and the multidimensional databases in the SAN.
Data Access Environment
Two or more NLBs are defined as part of the solution with a minimum of three data access servers per NLB device.
Operations Framework
The operation of the solution is divided into three phases:
Data Processing.
Downtime Window
Reset Data Processing
Data Processing
In this phase, the multidimensional database is updated and processed. as soon as it is the content of the multidimensional database is ready to be sent, the Data Access Environment processing the data for transfer. This process is composed of the following steps:
Detach the Analysis Services database from the Data Processing server.
Take offline the logical SAN volume that holds the Analysis Services database.
Downtime Window
In this phase, the content of the updated database is swapped with the content of the original database.
Set the NLBs to reject any incoming requests.
Detach the Analysis Services databases from each Data Access server.
Take offline the logical SAN volume that holds the Analysis Services database from each Data Access server.
Using SAN commands, swap the logical SAN volumes between the Processing Environment and the Data Access Environment.
Bring online, as a read-only device, the logical SAN volume that holds the Analysis Services database for each Data Access server.
Attach the Analysis Services database, in ReadOnly mode, to each Data Access server.
Set the NLBs to accept any incoming request.
Reset Data Processing
In this phase, the content of the old logical SAN volume is updated and brought online in the Processing Environment.
Using SAN commands, mirror the logical SAN volume in Data Access to the Processing Environment logical SAN.
Bring online, as a read/write device, the logical SAN volume that holds the Analysis Services database for the Processing Environment.
Attach the Analysis Services database, in ReadWrite mode, to the Processing Environment server.